Skip to main content

Documentation Index

Fetch the complete documentation index at: https://www.truefoundry.com/llms.txt

Use this file to discover all available pages before exploring further.

Agent Harness works with every model provider available through TrueFoundry AI Gateway — OpenAI, Anthropic, Azure OpenAI, Google Vertex, AWS Bedrock, Databricks, Together AI, self-hosted models, and more. You never paste API keys into an agent definition. The AI Gateway already holds provider credentials, enforces access policies, and routes traffic. Agent Harness inherits all of this.

Gateway-managed model access

In other harness products (Claude Managed Agents, LangSmith Managed Deep Agents), you supply provider API keys when creating an agent or register them per-workspace. In TrueFoundry, model access is managed once at the AI Gateway layer and agents simply reference model names.
ConcernHow TrueFoundry handles it
Provider credentialsStored in AI Gateway. Agents never see raw keys.
Who can use which modelsRBAC — assign model access to teams, users, or virtual accounts.
Spend controlPer-user and per-team budgets and rate limits enforced at gateway.
GuardrailsContent policies, PII filters, and custom guardrails applied before/after model calls.
ObservabilityEvery model call traced with cost, latency, tokens, and user attribution.
Model swappingChange the model in the agent builder — no code changes, no new keys.
Because governance lives in the gateway, platform teams can update policies, add new providers, or rotate credentials without touching any agent configuration.

Why model choice matters for agents

Agent workloads are different from single-turn chat. They involve multiple tool calls, retries, long context, and structured outputs. Model selection directly affects:
  • Reliability of tool calling and structured output
  • Latency across multi-step tasks
  • Total run cost over many turns
  • Accuracy on planning and complex reasoning

Common model choices

CategoryCommon choicesBest for
Balanced general-purposeclaude-sonnet, gpt-4o, gemini-2.5-proMost production assistants and tool-driven workflows
Cost-optimized high-volumegpt-4o-mini, gemini-2.5-flash, claude-haikuHigh request volume, triage, lightweight actions
High-reasoning depthclaude-opus, o3/o4 class modelsComplex planning, multi-step analysis, difficult tool chains
Open/self-hostedLlama, Qwen, Mistral via supported providersData residency, cost control, private deployment
Available model names depend on your configured provider accounts in AI Gateway. See Supported Providers and Model Discovery.

No vendor lock-in

Unlike Claude Managed Agents (Anthropic models only), Agent Harness is provider-agnostic:
  • Use any model enabled in your AI Gateway
  • Switch models without rewriting agent logic or changing credentials
  • Run the same agent on different models for A/B testing or cost optimization
  • Apply identical governance regardless of provider

Virtual model routing for agents

Virtual models let you assign one logical model name to multiple backing models with load balancing, failover, and policy-based routing. Agent Harness support for virtual model selection enables:
  • Cost optimization — route simpler agent steps to cheaper models automatically
  • Quality optimization — route complex reasoning steps to stronger models
  • Resilience — cross-provider failover if one provider is degraded
  • Stable configuration — agent references one name, routing evolves centrally
See Virtual Models for routing strategies.
  1. Start with one strong general-purpose model per agent.
  2. Track run traces, cost, and latency in production.
  3. Introduce virtual-model routing for optimization.
  4. Keep model decisions in gateway policy, not hard-coded in application logic.