Most organizations need to separate dev, staging, and prod for the AI Gateway so that experimental work, model evaluations, and production traffic do not interfere with each other. The right setup depends on:
  • Which deployment mode you are on — see Deployment Options.
  • Whether dev / staging / prod use the same models and providers, or different ones.
  • How much isolation you want between environments (data, users, access, blast radius).
This guide walks through the recommended approach for each deployment mode.

Quick recommendation

| Deployment mode | Recommended setup |
| --- | --- |
| SaaS (TrueFoundry-hosted), same models across envs | One tenant + separate virtual accounts for dev / staging / prod, scoped with RBAC. |
| SaaS (TrueFoundry-hosted), different models or stronger isolation | Separate tenants per environment (e.g. company-dev, company-staging, company-prod). |
| Self-hosted Gateway Plane (TrueFoundry-hosted Control Plane) | One gateway plane is fine for most cases. Use two gateway planes (dev + prod) if dev and prod are in different VPCs and you don’t want to expose the same gateway to both environments. |
| Self-hosted Control Plane | Two control planes (non-prod + prod) is the recommended default. It’s the safest setup because you can test TrueFoundry upgrades on non-prod before applying them to prod. Run a single control plane only if you want lower operational overhead and can accept upgrading prod directly. |
Across all options, use virtual accounts for application traffic and Personal Access Tokens (PATs) for individual user traffic — never share tokens across environments.

SaaS (TrueFoundry-hosted Control Plane and Gateway)

On SaaS, infrastructure is fully managed by TrueFoundry. Environment separation is achieved either through virtual accounts inside one tenant or by creating multiple tenants.
Recommended when models are the same across environments. Use this when dev, staging, and prod use the same set of models, providers, and credentials, and you mostly need to separate which application (or environment) is calling the gateway for cost tracking, rate limiting, and observability.

What you set up
1. Create one virtual account per environment per application

For each application or service, create one virtual account per environment, for example checkout-dev, checkout-staging, checkout-prod. Apps in each environment use the corresponding virtual account token to call the gateway. See Manage Virtual Accounts.
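
The naming convention above can be sketched in a few lines. This is only an illustration: the helper names and the `APP_ENV` variable are hypothetical and not part of TrueFoundry; the sketch simply derives the `<app>-<env>` name that an application would look up its token under.

```python
import os

ENVIRONMENTS = ("dev", "staging", "prod")


def virtual_account_name(app: str, env: str) -> str:
    """Return a name such as 'checkout-dev' for a known environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{app}-{env}"


def current_virtual_account(app: str) -> str:
    """Pick the virtual account for the environment this process runs in.

    APP_ENV is a hypothetical variable name; it defaults to 'dev' here.
    """
    return virtual_account_name(app, os.environ.get("APP_ENV", "dev"))


names = [virtual_account_name("checkout", env) for env in ENVIRONMENTS]
```

Rejecting unknown environment names up front means a typo cannot silently map a deployment to the wrong account.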
2. Tag virtual accounts and use teams as owners

Set the owner team on each virtual account (e.g. checkout-team) so team members can see logs / metrics for that environment in the gateway. Add a metadata.environment value (dev / staging / prod) on the virtual account so you can filter request logs and metrics by environment.
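
To show why the environment tag is useful, here is a minimal sketch of filtering and aggregating by `metadata.environment`. The record shape and the `tokens` field are invented for illustration and are not the gateway's actual log schema; in practice this filtering would happen in the gateway UI or wherever you analyze exported logs.

```python
from collections import defaultdict

# Invented sample records; the gateway's real log schema may differ.
REQUEST_LOGS = [
    {"virtual_account": "checkout-dev", "metadata": {"environment": "dev"}, "tokens": 120},
    {"virtual_account": "checkout-prod", "metadata": {"environment": "prod"}, "tokens": 900},
    {"virtual_account": "search-prod", "metadata": {"environment": "prod"}, "tokens": 400},
]


def logs_for_environment(logs, environment):
    """Keep only the records tagged with the given environment."""
    return [r for r in logs if r.get("metadata", {}).get("environment") == environment]


def tokens_by_environment(logs):
    """Sum token usage per environment tag; untagged records are grouped together."""
    totals = defaultdict(int)
    for record in logs:
        env = record.get("metadata", {}).get("environment", "untagged")
        totals[env] += record["tokens"]
    return dict(totals)
```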
3. Apply environment-specific policies

Attach different rate-limiting and budget-limiting policies to dev / staging / prod virtual accounts — for example, lower limits and stricter budgets in dev, higher in prod. See Rate Limiting and Budget Limiting.
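
As a rough sketch of the example in this step (lower limits in dev, higher in prod), the snippet below models per-environment limits and checks that they increase toward prod. The `EnvPolicy` type, its field names, and all numbers are hypothetical; actual policies are defined through the gateway's rate-limiting and budget-limiting configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvPolicy:
    requests_per_minute: int   # hypothetical field name
    monthly_budget_usd: int    # hypothetical field name


# Illustrative values only.
POLICIES = {
    "dev": EnvPolicy(requests_per_minute=60, monthly_budget_usd=100),
    "staging": EnvPolicy(requests_per_minute=300, monthly_budget_usd=500),
    "prod": EnvPolicy(requests_per_minute=3000, monthly_budget_usd=10000),
}


def limits_increase_toward_prod(policies, order=("dev", "staging", "prod")) -> bool:
    """Return True if each environment's limits are at least those of the previous one."""
    return all(
        policies[a].requests_per_minute <= policies[b].requests_per_minute
        and policies[a].monthly_budget_usd <= policies[b].monthly_budget_usd
        for a, b in zip(order, order[1:])
    )
```

A check like this could run in CI against whatever policy definitions you keep in Git, so a dev budget is not accidentally set above the prod one.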
4. Control access using RBAC

Decide who can use which environment by granting provider account access (for models) and virtual account access at the team / user level. Typical pattern:
  • All developers can use *-dev virtual accounts.
  • Only the on-call / SRE team can use *-prod virtual accounts.
  • Tenant admins can manage everything.
See Access Control and Manage User Roles & Permissions.
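
The typical pattern above can be expressed as glob rules. The following is a sketch for reasoning about the pattern, not TrueFoundry's RBAC model: the team names and the `ACCESS_RULES` table are hypothetical, and giving tenant admins a catch-all rule is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical team names mapped to the virtual account patterns they may use.
ACCESS_RULES = {
    "developers": ["*-dev"],
    "sre-oncall": ["*-prod"],
    "tenant-admins": ["*"],  # assumption: "manage everything" modeled as a catch-all
}


def can_use(team: str, virtual_account: str) -> bool:
    """Return True if any of the team's patterns matches the virtual account name."""
    return any(
        fnmatchcase(virtual_account, pattern)
        for pattern in ACCESS_RULES.get(team, [])
    )
```

For example, `can_use("developers", "checkout-prod")` is false under these rules, while `can_use("sre-oncall", "checkout-prod")` is true. Teams that are not listed get no access.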
This option keeps operational overhead minimal: one set of model integrations, one set of users, one place to look for logs. Teams just switch tokens between environments.

Self-hosted Gateway Plane (Control Plane on SaaS)

In this mode, the Gateway Plane runs in your own infrastructure while the Control Plane is hosted by TrueFoundry. You control how many gateway planes to deploy.
Recommended default. Run a single gateway plane that serves dev, staging, and prod traffic. Use virtual accounts and policies (the One tenant option above) to separate environments.

When to choose this
  • Dev / staging / prod live in the same VPC or are reachable from a single shared network.
  • You want to minimize hosting cost and operational overhead.
  • Same models and providers are used across environments.
The gateway plane is stateless and horizontally scalable, so a single fleet can handle traffic for all environments. Combine this with one tenant + per-environment virtual accounts on the Control Plane.

Self-hosted Control Plane

When you self-host the Control Plane (and Gateway Plane) on your own infrastructure, you have three options. We recommend running two control planes — one for non-prod (dev / staging) and one for prod — so that you can validate TrueFoundry upgrades on non-prod before rolling them out to prod. This is the safest setup for production traffic. If your team prefers lower operational overhead and can accept upgrading prod directly, you can run a single control plane and separate environments inside it using virtual accounts or tenants.
We recommend running at most two control planes — a non-prod (dev / staging) and a prod control plane. More than two typically doubles operational burden — backups, upgrades, monitoring, certificates, secrets — without proportional benefit.
Recommended default: safest setup for production. Host two control planes, one for non-prod (dev + staging) and one for prod, each with its own database, blob storage, gateway plane, and Helm release.

Why this is the recommended default
  • You can test TrueFoundry upgrades on the non-prod control plane and validate against your non-prod gateway before touching prod.
  • You get full infrastructure isolation between non-prod and prod (separate clusters, databases, blob storage, failure domains).
  • You can roll out config changes (new models, policies, guardrails) to non-prod first and promote to prod after validation.
What you set up
1. Deploy a non-prod control plane (dev + staging)

Install one TrueFoundry control plane in your non-prod cluster, following Deploy Control Plane and Gateway Plane. Use this control plane for both dev and staging, since they typically share most config and infra needs. Inside this control plane, separate dev and staging using virtual accounts (see One CP, one tenant) or tenants (see One CP, multiple tenants).
2. Deploy a prod control plane

Install a second TrueFoundry control plane in your prod cluster on its own database, blob storage, and domain (e.g. app.company.com for prod and app-nonprod.company.com for non-prod).
3. Bake control-plane upgrades on non-prod first

When upgrading TrueFoundry, upgrade the non-prod control plane first, validate against the non-prod gateway, and only then upgrade the prod control plane. This gives you a safe rollout window for new releases.
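
The ordering described here amounts to a simple gate: upgrade non-prod, validate, and continue to prod only if validation passes. The sketch below captures that control flow under stated assumptions: `upgrade` and `validate` are hypothetical callables you would supply (for example, wrappers around your own Helm and smoke-test tooling), not TrueFoundry APIs.

```python
from typing import Callable, Sequence


def staged_upgrade(
    upgrade: Callable[[str], None],
    validate: Callable[[str], bool],
    order: Sequence[str] = ("non-prod", "prod"),
) -> list[str]:
    """Upgrade control planes in order, stopping at the first failed validation.

    Returns the control planes that were upgraded.
    """
    upgraded = []
    for control_plane in order:
        upgrade(control_plane)
        upgraded.append(control_plane)
        if not validate(control_plane):
            break  # do not continue to the next control plane
    return upgraded
```

If validation fails on non-prod, the function returns before prod is touched, which is the point of keeping two control planes.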
4. Keep configuration in sync via GitOps

Manage models, virtual accounts, policies, and access rules through GitOps so the same source-of-truth can be applied to both control planes (with environment-specific overrides).
Pick this option if you have a specialized deployment setup (custom networking, air-gapped environment, strict change-management) or have the platform-team capacity to operate two stacks. It’s also the right choice when prod uptime requires upgrade testing before any change.
Do not run more than two control planes. A third control plane (e.g. a dedicated staging control plane) typically doubles operational burden — backups, upgrades, monitoring, certificates, secrets — without proportional benefit. Use a separate tenant or virtual account for staging inside the non-prod control plane instead.

Choosing between options

Start with Two control planes unless your team explicitly wants to consolidate to a single stack. Use this matrix to decide:
| Requirement | Two control planes (recommended) | One CP, one tenant | One CP, multiple tenants |
| --- | --- | --- | --- |
| Test control-plane upgrades on non-prod first | ✅ Best fit | | |
| Infrastructure isolation (DB, storage, cluster) | ✅ Best fit | | |
| Same models across envs | | ✅ Best fit | |
| Different models / API keys per env | | ⚠️ Possible via separate provider accounts | ✅ Best fit |
| Separate users / SSO per env | | | |
| Operational overhead | Highest | Lowest | Medium |

Cross-cutting recommendations

These apply regardless of which option you pick.
Store models, virtual accounts, policies, guardrails, and access rules as YAML in Git. Apply the same definitions to dev / staging / prod with environment-specific overrides using tfy apply. This avoids drift and makes promotion (dev → staging → prod) auditable. See Setup GitOps and Using tfy apply.
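
One way to think about "same definitions with environment-specific overrides" is a recursive merge of a shared base with a small per-environment patch. The sketch below shows that idea in plain Python. The field names are hypothetical and do not reflect the TrueFoundry manifest schema; writing the result out as YAML and applying it with tfy apply is left out.

```python
from copy import deepcopy


def merge_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied; nested dicts are merged recursively."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical shared definition and per-environment patches.
BASE = {
    "name": "checkout",
    "rate_limit": {"requests_per_minute": 60, "burst": 10},
    "guardrails": [],
}
OVERRIDES = {
    "dev": {},
    "prod": {
        "rate_limit": {"requests_per_minute": 3000},
        "guardrails": ["pii-redaction"],
    },
}

prod_config = merge_overrides(BASE, OVERRIDES["prod"])
```

Keeping the overrides small makes the diff between environments easy to review, which supports the auditability goal mentioned above.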
For each environment, configure secret store sync so the dev / staging / prod tokens are written to distinct paths in your secret manager (e.g. AWS Secrets Manager, HashiCorp Vault). Apps read tokens from the secret manager, not from the UI. Combine with auto-rotation for prod.
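
A minimal sketch of the "distinct paths" idea follows. The path layout, the `ai-gateway` prefix, and the `fetch_secret` callable are all assumptions made for illustration; in a real setup `fetch_secret` would wrap your secret manager's client, and the paths would match whatever your secret store sync is configured to write.

```python
from typing import Callable

ENVIRONMENTS = ("dev", "staging", "prod")


def token_secret_path(app: str, env: str, prefix: str = "ai-gateway") -> str:
    """Build a per-environment secret path (hypothetical layout)."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{prefix}/{env}/{app}/virtual-account-token"


def load_token(fetch_secret: Callable[[str], str], app: str, env: str) -> str:
    """Read the token for one app and environment through the supplied fetcher."""
    return fetch_secret(token_secret_path(app, env))


# In-memory stand-in for a secret manager, for demonstration only.
fake_store = {
    token_secret_path("checkout", env): f"token-for-{env}" for env in ENVIRONMENTS
}
prod_token = load_token(fake_store.__getitem__, "checkout", "prod")
```

Because the environment is part of the path, access policies in the secret manager can be scoped per environment as well.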
Rate limits, budget limits, guardrails, and data-access rules should be progressively stricter from dev → staging → prod. Reserve aggressive guardrails (PII redaction, content moderation, prompt injection) for prod traffic; keep dev permissive so engineers can iterate.
Use Data Access rules so only on-call / SRE / security teams can read prod request logs, while broader teams can see dev / staging data. This is especially important when running everything in a single tenant.