Agent Harness works with every model provider available through TrueFoundry AI Gateway: OpenAI, Anthropic, Azure OpenAI, Google Vertex, AWS Bedrock, Databricks, Together AI, self-hosted models, and more. You never paste API keys into an agent definition. The AI Gateway already holds provider credentials, enforces access policies, and routes traffic. Agent Harness inherits all of this.
Gateway-managed model access
In other harness products (Claude Managed Agents, LangSmith Managed Deep Agents), you supply provider API keys when creating an agent or register them per-workspace. In TrueFoundry, model access is managed once at the AI Gateway layer and agents simply reference model names.

| Concern | How TrueFoundry handles it |
|---|---|
| Provider credentials | Stored in AI Gateway. Agents never see raw keys. |
| Who can use which models | RBAC — assign model access to teams, users, or virtual accounts. |
| Spend control | Per-user and per-team budgets and rate limits enforced at gateway. |
| Guardrails | Content policies, PII filters, and custom guardrails applied before/after model calls. |
| Observability | Every model call traced with cost, latency, tokens, and user attribution. |
| Model swapping | Change the model in the agent builder — no code changes, no new keys. |
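
The separation described in the table can be pictured with a small toy model. This is an illustrative sketch only, not TrueFoundry's implementation or API: `AgentDefinition`, `ToyGateway`, and the model, team, and credential values are all hypothetical. The point it shows is that the agent definition carries a model name and nothing else, while credentials and access rules live on the gateway side.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentDefinition:
    """Hypothetical agent definition: a model name, no provider key."""
    name: str
    model: str


@dataclass
class ToyGateway:
    """Toy stand-in for a gateway that owns credentials and access rules."""
    # model name -> provider credential (never handed to agents)
    _credentials: dict = field(default_factory=dict)
    # model name -> set of teams allowed to call that model
    _access: dict = field(default_factory=dict)

    def register_model(self, model: str, credential: str, teams: set) -> None:
        self._credentials[model] = credential
        self._access[model] = set(teams)

    def authorize(self, agent: AgentDefinition, team: str) -> bool:
        """Return True if the team may call the model the agent references."""
        return team in self._access.get(agent.model, set())


gateway = ToyGateway()
gateway.register_model("example-model", credential="sk-placeholder", teams={"support"})

agent = AgentDefinition(name="triage-bot", model="example-model")
print(gateway.authorize(agent, team="support"))    # True
print(gateway.authorize(agent, team="marketing"))  # False
```

Swapping the model in this sketch means changing one string on the agent; the credential and the access rule stay where they are.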
Why model choice matters for agents
Agent workloads are different from single-turn chat. They involve multiple tool calls, retries, long context, and structured outputs. Model selection directly affects:

- Reliability of tool calling and structured output
- Latency across multi-step tasks
- Total run cost over many turns
- Accuracy on planning and complex reasoning
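
To make the "total run cost over many turns" point concrete, here is a rough back-of-the-envelope estimate. The prices and token counts are made-up placeholders, not real provider pricing, and the sketch assumes each turn resends the full conversation so far, which is common in agent loops but not universal.

```python
def estimate_run_cost(
    turns: int,
    base_prompt_tokens: int,
    tokens_added_per_turn: int,
    output_tokens_per_turn: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Estimate the cost of one agent run, in the same currency as the prices."""
    total_input = 0
    total_output = 0
    for turn in range(turns):
        # Context grows each turn as tool results and replies accumulate.
        total_input += base_prompt_tokens + turn * tokens_added_per_turn
        total_output += output_tokens_per_turn
    input_cost = (total_input / 1000) * input_price_per_1k
    output_cost = (total_output / 1000) * output_price_per_1k
    return input_cost + output_cost


# Placeholder numbers chosen only to keep the arithmetic easy to follow.
single_turn = estimate_run_cost(1, 2000, 1000, 500, 1.0, 2.0)
ten_turns = estimate_run_cost(10, 2000, 1000, 500, 1.0, 2.0)
print(single_turn)  # 3.0
print(ten_turns)    # 75.0
```

Under these assumptions, a ten-turn run costs 25 times as much as a single turn rather than 10 times, because the input context is re-billed on every turn.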
Common model choices
| Category | Common choices | Best for |
|---|---|---|
| Balanced general-purpose | claude-sonnet, gpt-4o, gemini-2.5-pro | Most production assistants and tool-driven workflows |
| Cost-optimized high-volume | gpt-4o-mini, gemini-2.5-flash, claude-haiku | High request volume, triage, lightweight actions |
| High-reasoning depth | claude-opus, o3/o4 class models | Complex planning, multi-step analysis, difficult tool chains |
| Open/self-hosted | Llama, Qwen, Mistral via supported providers | Data residency, cost control, private deployment |
Available model names depend on your configured provider accounts in AI Gateway. See Supported Providers and Model Discovery.
No vendor lock-in
Unlike Claude Managed Agents (Anthropic models only), Agent Harness is provider-agnostic:

- Use any model enabled in your AI Gateway
- Switch models without rewriting agent logic or changing credentials
- Run the same agent on different models for A/B testing or cost optimization
- Apply identical governance regardless of provider
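
One way to run the same agent on different models for an A/B test is to assign each run to a model deterministically. The sketch below is a generic hashing approach, not a TrueFoundry feature; `pick_model`, the run IDs, and the model names are hypothetical, and the weights are assumed to be integer percentages that sum to 100.

```python
import hashlib


def pick_model(run_id: str, variants: dict) -> str:
    """Map a run ID to a model name, stable across calls, according to weights."""
    digest = hashlib.sha256(run_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    threshold = 0
    for model, weight in variants.items():
        threshold += weight
        if bucket < threshold:
            return model
    raise ValueError("variant weights must sum to 100")


# Hypothetical split: 80% of runs on one model, 20% on a candidate.
variants = {"model-a": 80, "model-b": 20}
for run_id in ("run-001", "run-002", "run-003"):
    print(run_id, pick_model(run_id, variants))
```

Because the assignment depends only on the run ID, a retried run lands on the same model, which keeps the comparison between variants clean.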
Virtual model routing for agents
Virtual models let you assign one logical model name to multiple backing models with load balancing, failover, and policy-based routing. Agent Harness support for virtual model selection enables:

- Cost optimization: route simpler agent steps to cheaper models automatically
- Quality optimization: route complex reasoning steps to stronger models
- Resilience: cross-provider failover if one provider is degraded
- Stable configuration: agent references one name, routing evolves centrally
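
The resilience behavior above can be sketched as a priority-ordered failover loop. This is a conceptual illustration, not how the AI Gateway is implemented or configured: `call_virtual_model`, `ProviderUnavailable`, and the two simulated backends are hypothetical, and real routing would also cover load balancing and policy rules.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised by a backend when its provider is degraded (hypothetical)."""


def call_virtual_model(
    prompt: str,
    backends: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) backend in priority order; return (name, reply)."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            # Record the failure and fall through to the next backend.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def degraded_backend(prompt: str) -> str:
    raise ProviderUnavailable("simulated outage")


def healthy_backend(prompt: str) -> str:
    return f"echo: {prompt}"


served_by, reply = call_virtual_model(
    "hello",
    [("primary-model", degraded_backend), ("fallback-model", healthy_backend)],
)
print(served_by, reply)  # fallback-model echo: hello
```

The caller only ever asks for one logical model; which backend actually answered is a routing detail, which is what lets the routing change centrally without touching the agent.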
Recommended rollout
- Start with one strong general-purpose model per agent.
- Track run traces, cost, and latency in production.
- Introduce virtual-model routing for optimization.
- Keep model decisions in gateway policy, not hard-coded in application logic.