overloaded_error in streaming flows. A common failure mode is receiving an HTTP 200 response whose early stream events carry an error payload.
TrueFoundry behavior
For streaming requests routed through routing rules/virtual models, TrueFoundry AI Gateway waits until it sees the first non-empty stream chunk.
- If that first meaningful chunk indicates an Anthropic overloaded_error, the gateway marks the attempt as failed and falls back to the next eligible target.
- If a normal first meaningful chunk arrives, streaming continues on the same target.
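The first-chunk check described above can be illustrated with a minimal Python sketch. This is not TrueFoundry's implementation. The names `TargetOverloaded`, `is_overloaded`, `open_stream`, and `stream_with_fallback` are hypothetical, each target is modeled as a plain callable returning an iterable of JSON strings, and the payload shape is assumed to follow Anthropic's error object (`{"type": "error", "error": {"type": "overloaded_error", ...}}`). How the gateway treats a stream with no meaningful chunk at all is not stated here; the sketch simply returns an empty stream in that case.

```python
import itertools
import json
from typing import Callable, Iterable, Iterator, Sequence


class TargetOverloaded(Exception):
    """Hypothetical error: the first meaningful chunk was an overloaded_error."""


def is_overloaded(chunk: str) -> bool:
    """Return True if a chunk is an Anthropic-style overloaded_error payload."""
    try:
        payload = json.loads(chunk)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    return (
        payload.get("type") == "error"
        and isinstance(error, dict)
        and error.get("type") == "overloaded_error"
    )


def open_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Peek at the first non-empty chunk before committing to this target."""
    iterator = iter(chunks)
    for first in iterator:
        if first.strip():
            break  # found the first meaningful chunk
    else:
        return iter(())  # assumption: no meaningful chunk means an empty stream
    if is_overloaded(first):
        raise TargetOverloaded(first)
    # Re-attach the peeked chunk so the caller sees the full stream.
    return itertools.chain([first], iterator)


def stream_with_fallback(
    targets: Sequence[Callable[[], Iterable[str]]],
) -> Iterator[str]:
    """Try each target in order; fall back only on a first-chunk overload."""
    last_error = None
    for start in targets:
        try:
            return open_stream(start())
        except TargetOverloaded as exc:
            last_error = exc  # mark the attempt as failed, try the next target
    raise last_error or RuntimeError("no targets configured")
```

Because the check only inspects the first meaningful chunk, an error that appears later in the stream is passed through to the caller unchanged in this sketch.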
Why this matters
In streaming mode, some provider-side overload failures can appear as an error payload in early stream events instead of a non-2xx HTTP status. The first-chunk check ensures these requests fail fast and route to the next eligible target.
Practical recommendation
- If you use Anthropic with streaming in TrueFoundry, configure fallback targets in your Routing Config or Virtual Models.
- Keep fallback chains short and provider-diverse (for example, Anthropic as primary and OpenAI or Bedrock as secondary).
- Continue using retries and rate limits to reduce pressure during provider-side overload windows.
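To make the interplay between retries and a short fallback chain concrete, here is a generic Python sketch. It is an illustration only, not TrueFoundry configuration or gateway code. `Overloaded` and `call_with_retries_and_fallback` are hypothetical names, and the retry count and backoff values are arbitrary assumptions.

```python
import random
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class Overloaded(Exception):
    """Hypothetical marker for a provider-side overload failure."""


def call_with_retries_and_fallback(
    targets: Sequence[Callable[[], T]],
    retries_per_target: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry each target a bounded number of times, then move down the chain."""
    last_error = None
    for call in targets:
        for attempt in range(retries_per_target + 1):
            try:
                return call()
            except Overloaded as exc:
                last_error = exc
                if attempt < retries_per_target:
                    # Exponential backoff with full jitter, so retries do not
                    # pile onto a provider that is already overloaded.
                    sleep(random.uniform(0, base_delay * 2 ** attempt))
    if last_error is None:
        raise ValueError("no targets configured")
    raise last_error
```

With a two-target chain and one retry per target, an overloaded primary is attempted twice before the secondary is tried, which bounds the worst-case latency added by the chain.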