Agent Harness works with every model provider available through TrueFoundry AI Gateway — OpenAI, Anthropic, Azure OpenAI, Google Vertex, AWS Bedrock, Databricks, Together AI, self-hosted models, and more. You never paste API keys into an agent definition. The AI Gateway already holds provider credentials, enforces access policies, and routes traffic. Agent Harness inherits all of this.Documentation Index
Fetch the complete documentation index at: https://www.truefoundry.com/llms.txt
Use this file to discover all available pages before exploring further.
Selecting a model
In the agent builder, click Select a Model to pick from any model enabled for you in the AI Gateway. Switching models is a one-click change — no code edits, no new credentials.
Gateway-managed model access
In other harness products (Claude Managed Agents, LangSmith Managed Deep Agents), you supply provider API keys when creating an agent or register them per-workspace. In TrueFoundry, model access is managed once at the AI Gateway layer and agents simply reference model names.| Concern | How TrueFoundry handles it |
|---|---|
| Provider credentials | Stored in AI Gateway. Agents never see raw keys. |
| Who can use which models | RBAC — assign model access to teams, users, or virtual accounts. |
| Spend control | Per-user and per-team budgets and rate limits enforced at gateway. |
| Guardrails | Content policies, PII filters, and custom guardrails applied before/after model calls. |
| Observability | Every model call traced with cost, latency, tokens, and user attribution. |
| Model swapping | Change the model in the agent builder — no code changes, no new keys. |
Why model choice matters for agents
Agent workloads are different from single-turn chat. They involve multiple tool calls, retries, long context, and structured outputs. Model selection directly affects:- Reliability of tool calling and structured output
- Latency across multi-step tasks
- Total run cost over many turns
- Accuracy on planning and complex reasoning
Common model choices
| Category | Common choices | Best for |
|---|---|---|
| Balanced general-purpose | claude-sonnet, gpt-4o, gemini-2.5-pro | Most production assistants and tool-driven workflows |
| Cost-optimized high-volume | gpt-4o-mini, gemini-2.5-flash, claude-haiku | High request volume, triage, lightweight actions |
| High-reasoning depth | claude-opus, o3/o4 class models | Complex planning, multi-step analysis, difficult tool chains |
| Open/self-hosted | Llama, Qwen, Mistral via supported providers | Data residency, cost control, private deployment |
Available model names depend on your configured provider accounts in AI Gateway. See Supported Providers and Model Discovery.
Virtual model routing for agents
Virtual models let you assign one logical model name to multiple backing models with load balancing, failover, and policy-based routing. Agent Harness support for virtual model selection enables:- Cost optimization — route simpler agent steps to cheaper models automatically
- Quality optimization — route complex reasoning steps to stronger models
- Resilience — cross-provider failover if one provider is degraded
- Stable configuration — agent references one name, routing evolves centrally
Coming soon — automatic model selection. On the roadmap, you’ll be able to attach a virtual model containing a set of models to an agent and let Agent Harness pick the best fit for each step at runtime. The harness will weigh task complexity, expected latency, and cost to route every turn to the most suitable model in the set — improving end-to-end cost and latency without any manual routing rules.