Models - TrueFoundry Docs

Agent Harness works with every model provider available through TrueFoundry AI Gateway — OpenAI, Anthropic, Azure OpenAI, Google Vertex, AWS Bedrock, Databricks, Together AI, self-hosted models, and more. You never paste API keys into an agent definition. The AI Gateway already holds provider credentials, enforces access policies, and routes traffic. Agent Harness inherits all of this.

Gateway-managed model access

In other harness products (Claude Managed Agents, LangSmith Managed Deep Agents), you supply provider API keys when creating an agent or register them per workspace. In TrueFoundry, model access is managed once at the AI Gateway layer, and agents simply reference model names.

Concern	How TrueFoundry handles it
Provider credentials	Stored in AI Gateway. Agents never see raw keys.
Who can use which models	RBAC — assign model access to teams, users, or virtual accounts.
Spend control	Per-user and per-team budgets and rate limits enforced at the gateway.
Guardrails	Content policies, PII filters, and custom guardrails applied before/after model calls.
Observability	Every model call traced with cost, latency, tokens, and user attribution.
Model swapping	Change the model in the agent builder — no code changes, no new keys.

Because governance lives in the gateway, platform teams can update policies, add new providers, or rotate credentials without touching any agent configuration.
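
The split of responsibilities described above can be illustrated with a toy, in-memory model. This is only a sketch: `ToyGateway`, `AgentDefinition`, and every field name here are hypothetical and do not reflect the TrueFoundry API. It shows an agent definition that carries a model name and nothing else, while the gateway holds the credential and decides who may call the model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentDefinition:
    """Hypothetical agent definition: references a model by name, holds no keys."""
    name: str
    model: str


@dataclass
class ToyGateway:
    """Hypothetical stand-in for a gateway; not the TrueFoundry API."""
    credentials: dict = field(default_factory=dict)  # model name -> provider key
    access: dict = field(default_factory=dict)       # model name -> allowed teams

    def authorize(self, team: str, agent: AgentDefinition) -> bool:
        """Return True if the team is allowed to call the agent's model."""
        return (
            agent.model in self.credentials
            and team in self.access.get(agent.model, set())
        )

    def rotate_credential(self, model: str, new_key: str) -> None:
        """Rotate a provider key centrally; agent definitions are untouched."""
        self.credentials[model] = new_key


gateway = ToyGateway(
    credentials={"gpt-4o": "placeholder-key-1"},  # placeholder, not a real key
    access={"gpt-4o": {"support-team"}},
)
agent = AgentDefinition(name="triage-bot", model="gpt-4o")
```

In this sketch, calling `gateway.rotate_credential("gpt-4o", "placeholder-key-2")` changes nothing on `agent`, which mirrors the point that credential rotation does not require touching agent configuration.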

Why model choice matters for agents

Agent workloads differ from single-turn chat: a single run can involve multiple tool calls, retries, long context, and structured outputs. Model selection therefore directly affects:

Reliability of tool calling and structured output
Latency across multi-step tasks
Total run cost over many turns
Accuracy on planning and complex reasoning
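
To make the "total run cost over many turns" point concrete, here is a rough estimator. It is a sketch under stated assumptions: every turn resends the full accumulated context as input, prompt caching is ignored, and the prices are placeholders rather than real provider rates. The function and parameter names are illustrative only.

```python
def estimate_run_cost(
    turns: int,
    base_input_tokens: int,
    tokens_added_per_turn: int,
    output_tokens_per_turn: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate the cost of one agent run under a simple resend-everything model."""
    total_input = 0
    context = base_input_tokens
    for _ in range(turns):
        total_input += context            # the whole context is billed as input
        context += tokens_added_per_turn  # tool results and replies accumulate
    total_output = turns * output_tokens_per_turn
    return (
        total_input * input_price_per_mtok
        + total_output * output_price_per_mtok
    ) / 1_000_000


# Placeholder prices (USD per million tokens), not real provider rates.
single_turn = estimate_run_cost(1, 2_000, 1_000, 500, 3.0, 15.0)
ten_turns = estimate_run_cost(10, 2_000, 1_000, 500, 3.0, 15.0)
```

With these placeholder numbers, the ten-turn run costs about 20 times the single-turn call rather than 10 times, because the input context grows on every turn. That growth is why per-token price differences between models compound in agent workloads.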

Common model choices

Category	Common choices	Best for
Balanced general-purpose	`claude-sonnet`, `gpt-4o`, `gemini-2.5-pro`	Most production assistants and tool-driven workflows
Cost-optimized high-volume	`gpt-4o-mini`, `gemini-2.5-flash`, `claude-haiku`	High request volume, triage, lightweight actions
High-reasoning depth	`claude-opus`, `o3`/`o4`-class models	Complex planning, multi-step analysis, difficult tool chains
Open/self-hosted	Llama, Qwen, Mistral via supported providers	Data residency, cost control, private deployment

Available model names depend on your configured provider accounts in AI Gateway. See Supported Providers and Model Discovery.

No vendor lock-in

Unlike Claude Managed Agents (Anthropic models only), Agent Harness is provider-agnostic:

Use any model enabled in your AI Gateway
Switch models without rewriting agent logic or changing credentials
Run the same agent on different models for A/B testing or cost optimization
Apply identical governance regardless of provider
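
As an illustration of running the same agent on different models for A/B testing, the sketch below assigns each run to a model variant deterministically by hashing a run identifier. The function name, the run ID format, and the 50/50 split are assumptions made for this example; in practice the split could equally live in gateway routing rather than in application code.

```python
import hashlib


def assign_model(run_id: str, variants: dict[str, float]) -> str:
    """Map a run ID to a model name; `variants` maps model -> traffic share."""
    if not variants:
        raise ValueError("at least one variant is required")
    digest = hashlib.sha256(run_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    cumulative = 0.0
    chosen = None
    for model, share in variants.items():
        chosen = model
        cumulative += share
        if bucket < cumulative:
            break
    return chosen  # falls back to the last variant if shares sum below 1


# Hypothetical experiment: same agent, two models, equal traffic.
experiment = {"claude-sonnet": 0.5, "gpt-4o": 0.5}
model_for_run = assign_model("run-0001", experiment)
```

Because the assignment depends only on the run ID, a retried run lands on the same model, which keeps per-variant cost and latency comparisons clean.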

Virtual model routing for agents

Virtual models let you map one logical model name to multiple backing models, with load balancing, failover, and policy-based routing. When an agent selects a virtual model in Agent Harness, it gains:

Cost optimization — route simpler agent steps to cheaper models automatically
Quality optimization — route complex reasoning steps to stronger models
Resilience — cross-provider failover if one provider is degraded
Stable configuration — agent references one name, routing evolves centrally

See Virtual Models for routing strategies.
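
The idea of one logical name resolving to several backing models can be sketched as weighted selection with failover. This is a conceptual toy, not how AI Gateway implements routing: the `Target` class, the `resolve` function, the virtual name `agent-default`, and the weights are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Target:
    """One backing model behind a virtual model name (hypothetical shape)."""
    model: str
    weight: float
    healthy: bool = True


def resolve(targets: list[Target], rng: random.Random) -> str:
    """Pick a healthy backing model by weight; fail if none is available."""
    candidates = [t for t in targets if t.healthy and t.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy backing model")
    weights = [t.weight for t in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0].model


# The agent only ever references "agent-default"; the mapping changes centrally.
VIRTUAL_MODELS = {
    "agent-default": [
        Target(model="claude-sonnet", weight=0.7),
        Target(model="gpt-4o", weight=0.3),
    ],
}
```

Marking the first target unhealthy sends all traffic to the second without any change to the name the agent references, which is the resilience and stable-configuration behavior listed above.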

Recommended rollout

Start with one strong general-purpose model per agent.
Track run traces, cost, and latency in production.
Introduce virtual-model routing for optimization.
Keep model decisions in gateway policy, not hard-coded in application logic.
