Claude Fable 5 Is Now Live on TrueFoundry AI Gateway

Built for Speed: ~10ms Latency, Even Under Load
Blazingly fast way to build, track and deploy your models!
- Handles 350+ RPS on just 1 vCPU — no tuning needed
- Production-ready with full enterprise support
Claude Fable 5 Is Now Live on TrueFoundry AI Gateway
Anthropic launched Claude Fable 5 on June 9, 2026 - its most capable generally available model to date, and the first "Mythos-class" model offered to the public. Starting today, you can call Fable 5 through the TrueFoundry AI Gateway using the same unified, OpenAI-compatible API you already use for every other model - with full governance, cost controls, and automatic fallbacks built in.
No new SDK. No separate integration. Point your existing requests at your TrueFoundry gateway and set the model to Claude Fable 5.
Try Claude Fable 5 on the TrueFoundry AI Gateway →
What is Claude Fable 5?
Fable 5 sits a tier above Anthropic's Opus class. It's built for ambitious, long-running, asynchronous work — large code migrations, multi-day agentic sessions, deep analytical research - the kind of tasks where models previously lost the thread. A few headline facts from Anthropic's launch:
- State-of-the-art coding. Fable 5 scores 80.3% on SWE-Bench Pro, versus 69.2% for Claude Opus 4.8 and 58.6% for GPT-5.5.
- Built for long horizons. Stripe reported that Fable 5 completed a codebase-wide migration in a 50-million-line Ruby codebase in a single day — work that would have taken a full team over two months.
- 1M-token context window, with text, image, and file inputs and reasoning support.
- Pricing: $10 per million input tokens and $50 per million output tokens, with a 90% prompt-caching discount on input.
Why access Fable 5 through TrueFoundry
Calling a brand-new frontier model directly is easy - until you need to run it in production across a team. That's where a gateway earns its place:
- One API, every model. Swap Fable 5 in (or back to Opus 4.8, GPT-5.5, or any other model) by changing a single string. No per-provider rewrites.
- Cost governance. Fable 5 is a premium model at $10/$50 per million tokens. With TrueFoundry you set budgets, rate limits, and per-team virtual keys so spend stays visible and controlled from day one.
- Automatic fallbacks and load balancing. Launch-day demand for Fable 5 is expected to be very high. Route overflow or failures to a backup model automatically so your app never goes dark.
- Observability. Token usage, latency, and cost per request, per model, per team — in one place.
- No lock-in. Run Fable 5 via Anthropic, Amazon Bedrock, Google Vertex AI, or Microsoft Foundry through the same gateway.
Get started in minutes
Step 1 - Add Claude Fable 5 to your gateway
In the TrueFoundry AI Gateway, open your connected Anthropic provider under Models → Setup Anthropic account and manage models. The setup walks through three steps — Configure Account → Models Selection → Access Control. On the Models Selection screen, search for claude-fable-5 and check it (you'll see it listed at $10 input / $50 output per 1M tokens, alongside the rest of the Claude lineup like claude-opus-4-6, claude-opus-4-5, and claude-haiku-4-5). Then use Access Control to decide which teams and virtual keys can use it — so a premium model like Fable 5 is governed from the moment it goes live.

Step 2 - Call Fable 5 with the OpenAI-compatible snippet
Open the Playground, pick claude-fable-5, and grab the ready-made Usage code snippet - TrueFoundry generates it for OpenAI, LangChain, Node.js, cURL, LlamaIndex, CrewAI, Pydantic AI, and more, in both streaming and non-streaming modes. The OpenAI Python version looks like this:

A few things worth noticing in that snippet — and why the gateway is doing more than proxying a request:
- Base URL is just
https://gateway.truefoundry.ai. Every model you've enabled is reachable from this one endpoint. - The model ID is namespaced —
test-anthropic/claude-fable-5(<your-provider-account>/<model>). Swap totest-anthropic/claude-opus-4-6and you've switched models with a one-line change; everything else stays identical. X-TFY-LOGGING-CONFIGturns on request logging andX-TFY-METADATAlets you tag each call (team, feature, environment) so spend and usage are attributable per request.- It's standard
client.chat.completions.create— drop it into any existing OpenAI-compatible codebase as-is.
From here you can layer on caching, guardrails, and fallback policies from the same gateway dashboard — without touching your application code.
Start using Claude Fable 5 today
Fable 5 is the model you reach for when the task is too big, too long, or too complex for anything else. Run it through TrueFoundry and you get frontier capability with production-grade control from the first request.
TrueFoundry AI Gateway delivers ~3–4 ms latency, handles 350+ RPS on 1 vCPU, scales horizontally with ease, and is production-ready, while LiteLLM suffers from high latency, struggles beyond moderate RPS, lacks built-in scaling, and is best for light or prototype workloads.
The fastest way to build, govern and scale your AI




























.webp)



