Most organizations need to separate dev, staging, and prod for the AI Gateway so that experimental work, model evaluations, and production traffic do not interfere with each other. The right setup depends on your deployment mode:
SaaS (TrueFoundry-hosted), same models across environments: one tenant plus separate virtual accounts for dev / staging / prod, scoped with RBAC.
SaaS (TrueFoundry-hosted), different models or stronger isolation: separate tenants per environment (e.g. company-dev, company-staging, company-prod).
Self-hosted Gateway Plane (TrueFoundry-hosted Control Plane): one gateway plane is fine for most cases. Use two gateway planes (dev + prod) if dev and prod are in different VPCs and you don’t want to expose the same gateway to both environments.
Self-hosted Control Plane: two control planes (non-prod + prod) are the recommended default. This is the safest setup because you can test TrueFoundry upgrades on non-prod before applying them to prod. Run a single control plane only if you want lower operational overhead and can accept upgrading prod directly.
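The recommendations above can be read as a small decision function. The following Python sketch encodes them for illustration only; the function name, the mode strings, and the boolean flags are assumptions made for this example and are not part of any TrueFoundry API.

```python
def recommend_setup(
    mode: str,
    *,
    same_models: bool = True,
    needs_strong_isolation: bool = False,
    separate_vpcs: bool = False,
    accept_direct_prod_upgrades: bool = False,
) -> str:
    """Return the setup this guide recommends for a deployment mode.

    The mode names and flags are hypothetical labels for the options
    described in this guide.
    """
    if mode == "saas":
        if same_models and not needs_strong_isolation:
            return "one tenant + per-environment virtual accounts"
        return "separate tenants per environment"
    if mode == "self-hosted-gateway-plane":
        return "two gateway planes" if separate_vpcs else "one gateway plane"
    if mode == "self-hosted-control-plane":
        if accept_direct_prod_upgrades:
            return "one control plane"
        return "two control planes (non-prod + prod)"
    raise ValueError(f"unknown deployment mode: {mode!r}")
```

For example, `recommend_setup("self-hosted-gateway-plane", separate_vpcs=True)` returns `"two gateway planes"`.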
SaaS (TrueFoundry-hosted Control Plane and Gateway)
On SaaS, infrastructure is fully managed by TrueFoundry. Environment separation is achieved either through virtual accounts inside one tenant or by creating multiple tenants.
One tenant
Multiple tenants
Recommended when models are the same across environments.
Use this when dev, staging, and prod use the same set of models, providers, and credentials, and you mostly need to separate which application (or environment) is calling the gateway for cost tracking, rate limiting, and observability.
What you set up
1
Create one virtual account per environment per application
For each application or service, create one virtual account per environment, for example checkout-dev, checkout-staging, checkout-prod. Apps in each environment use the corresponding virtual account token to call the gateway. See Manage Virtual Accounts.
2
Tag virtual accounts and use teams as owners
Set the owner team on each virtual account (e.g. checkout-team) so team members can see logs / metrics for that environment in the gateway. Add a metadata.environment value (dev / staging / prod) on the virtual account so you can filter request logs and metrics by environment.
3
Apply environment-specific policies
Attach different rate-limiting and budget-limiting policies to dev / staging / prod virtual accounts — for example, lower limits and stricter budgets in dev, higher in prod. See Rate Limiting and Budget Limiting.
4
Control access using RBAC
Decide who can use which environment by granting provider account access (for models) and virtual account access at the team / user level. Typical pattern:
All developers can use *-dev virtual accounts.
Only the on-call / SRE team can use *-prod virtual accounts.
This option keeps operational overhead minimal: one set of model integrations, one set of users, one place to look for logs. Teams just switch tokens between environments.
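The naming, tagging, and access pattern from the steps above can be sketched in a few lines of Python. This is only a model of the convention, not a call into the gateway: the dictionary layout, the team names `developers` and `sre-oncall`, and the helper names are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

ENVIRONMENTS = ("dev", "staging", "prod")


def virtual_accounts_for(app: str, owner_team: str) -> list[dict]:
    """Describe one virtual account per environment for an application."""
    return [
        {
            "name": f"{app}-{env}",
            "owner_team": owner_team,
            # Environment tag used to filter request logs and metrics.
            "metadata": {"environment": env},
        }
        for env in ENVIRONMENTS
    ]


# Hypothetical access rules: team -> virtual account name patterns.
ACCESS_RULES = {
    "developers": ["*-dev"],
    "sre-oncall": ["*-prod"],
}


def can_use(team: str, virtual_account: str) -> bool:
    """Return True if any of the team's patterns matches the account name."""
    patterns = ACCESS_RULES.get(team, [])
    return any(fnmatchcase(virtual_account, p) for p in patterns)
```

With these rules, `can_use("developers", "checkout-dev")` is true while `can_use("developers", "checkout-prod")` is false.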
Recommended when models differ per environment or you need strong isolation.
Use this when dev / staging / prod need different models or providers, when you want separate user lists, SSO, or billing per environment, or when compliance requires complete isolation.
On SaaS you can create multiple tenants, each with its own URL. Each tenant has its own:
Users, teams, virtual accounts, and SSO configuration.
Provider accounts, models, MCP servers, and guardrails.
Request logs, metrics, and policies.
What you set up
1
Request additional tenants
Reach out to TrueFoundry to provision the additional tenants (e.g. company-dev and company-staging alongside an existing company-prod).
2
Configure SSO and users per tenant
For each tenant, configure SSO and decide on a user provisioning mode (SCIM, JIT, or invite-only). Most teams give all developers access to dev / staging and a smaller group access to prod.
3
Set up provider accounts and models per tenant
Add provider accounts and models in each tenant. You can use different API keys per environment (e.g. a sandbox OpenAI key in dev, the production key in prod) by creating separate provider accounts in each tenant.
4
Use environment-specific virtual accounts inside each tenant
Within each tenant, follow Setup AI Gateway in Org to create teams, virtual accounts, and policies as usual.
Tenants are fully isolated: users, virtual accounts, models, and request logs are not shared across them. An engineer who needs both dev and prod must therefore be added to each tenant separately, so plan access deliberately.
Self-hosted Gateway Plane (TrueFoundry-hosted Control Plane)
In this mode, the Gateway Plane runs in your own infrastructure while the Control Plane is hosted by TrueFoundry. You control how many gateway planes to deploy.
One gateway plane
Two gateway planes
Recommended default.
Run a single gateway plane that serves dev, staging, and prod traffic. Use virtual accounts and policies (the One tenant option above) to separate environments.
When to choose this
Dev / staging / prod live in the same VPC or are reachable from a single shared network.
You want to minimize hosting cost and operational overhead.
Same models and providers are used across environments.
The gateway plane is stateless and horizontally scalable, so a single fleet can handle traffic for all environments. Combine this with one tenant + per-environment virtual accounts on the Control Plane.
Recommended when dev and prod live in separate VPCs.
Run two gateway planes, one in your dev VPC and one in your prod VPC, both connected to the same TrueFoundry-hosted Control Plane.
When to choose this
Dev and prod must stay in separate VPCs (network isolation, different cloud accounts, regulatory boundaries).
You want production traffic on isolated nodes / scaling profile, separate from dev experiments.
You want to roll out gateway version upgrades to dev first and let them run there for a while before upgrading prod.
What you set up
1
Deploy a gateway plane in each VPC
Install the gateway-plane Helm chart in each environment’s cluster. Both gateways point to the same TrueFoundry-hosted Control Plane. See Deploy Gateway Plane.
2
Use distinct gateway endpoints
Expose each gateway on its own hostname (e.g. gateway-dev.company.com and gateway.company.com). Apps in each environment call their own gateway endpoint.
3
Separate environments inside the Control Plane
Use either the One tenant or Multiple tenants approach from the SaaS section above, depending on your isolation needs. Models, policies, and access rules are still configured centrally on the SaaS Control Plane.
Two gateway planes share the same configuration source-of-truth (the Control Plane), so you don’t have to duplicate model integrations or access rules.
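With two gateway planes, each application has to call the endpoint that belongs to its own environment. A minimal Python sketch of that lookup follows; the hostnames are the examples used above, while the `https://` scheme, the mapping, and the function name are assumptions for illustration.

```python
# Example hostnames from this guide; the https scheme is an assumption.
GATEWAY_ENDPOINTS = {
    "dev": "https://gateway-dev.company.com",
    "prod": "https://gateway.company.com",
}


def gateway_base_url(environment: str) -> str:
    """Return the gateway endpoint for an environment.

    Fails loudly for unknown environments so that a misconfigured app
    never silently falls back to the wrong gateway.
    """
    try:
        return GATEWAY_ENDPOINTS[environment]
    except KeyError:
        known = ", ".join(sorted(GATEWAY_ENDPOINTS))
        raise ValueError(
            f"no gateway configured for {environment!r}; known: {known}"
        ) from None
```

Raising on an unknown environment is a deliberate choice here: defaulting to the prod endpoint would defeat the network isolation that motivates running two gateway planes.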
Self-hosted Control Plane
When you self-host the Control Plane (and Gateway Plane) on your own infrastructure, you have three options.
We recommend running two control planes, one for non-prod (dev / staging) and one for prod, so that you can validate TrueFoundry upgrades on non-prod before rolling them out to prod. This is the safest setup for production traffic.
If your team prefers lower operational overhead and can accept upgrading prod directly, you can run a single control plane and separate environments inside it using virtual accounts or tenants.
We recommend running at most two control planes — a non-prod (dev / staging) and a prod control plane. More than two typically doubles operational burden — backups, upgrades, monitoring, certificates, secrets — without proportional benefit.
Two control planes
One CP, one tenant
One CP, multiple tenants
Recommended default: the safest setup for production.
Host two control planes, one for non-prod (dev + staging) and one for prod, each with its own database, blob storage, gateway plane, and Helm release.
Why this is the recommended default
You can test TrueFoundry upgrades on the non-prod control plane and validate against your non-prod gateway before touching prod.
You get full infrastructure isolation between non-prod and prod (separate clusters, databases, blob storage, failure domains).
You can roll out config changes (new models, policies, guardrails) to non-prod first and promote to prod after validation.
What you set up
1
Deploy a non-prod control plane (dev + staging)
Install one TrueFoundry control plane in your non-prod cluster, following Deploy Control Plane and Gateway Plane. Use this control plane for both dev and staging, since they typically share most config and infra needs.
Inside this control plane, separate dev and staging using virtual accounts (see One CP, one tenant) or tenants (see One CP, multiple tenants).
2
Deploy a prod control plane
Install a second TrueFoundry control plane in your prod cluster on its own database, blob storage, and domain (e.g. app.company.com for prod and app-nonprod.company.com for non-prod).
3
Bake control-plane upgrades on non-prod first
When upgrading TrueFoundry, upgrade the non-prod control plane first, validate against the non-prod gateway, and only then upgrade the prod control plane. This gives you a safe rollout window for new releases.
4
Keep configuration in sync via GitOps
Manage models, virtual accounts, policies, and access rules through GitOps so the same source-of-truth can be applied to both control planes (with environment-specific overrides).
Pick this option if you have a specialized deployment setup (custom networking, air-gapped environment, strict change-management) or if your platform team has the capacity to operate two stacks. It’s also the right choice when prod uptime requires upgrade testing before any change.
Do not run more than two control planes. A third control plane (e.g. a dedicated staging control plane) typically doubles operational burden — backups, upgrades, monitoring, certificates, secrets — without proportional benefit. Use a separate tenant or virtual account for staging inside the non-prod control plane instead.
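The upgrade order described above (non-prod first, validate, then prod) can be expressed as a small gate that a release checklist or pipeline could consult. The Python sketch below is illustrative only: it assumes dotted numeric version strings, and the function names and return values are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a dotted numeric version such as 'v1.4.0' (assumed format)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def next_upgrade_step(
    target: str, *, nonprod: str, prod: str, nonprod_validated: bool
) -> str:
    """Return the next action that keeps prod from moving ahead of non-prod."""
    t = parse_version(target)
    if parse_version(prod) >= t:
        return "nothing to do"
    if parse_version(nonprod) < t:
        return "upgrade non-prod"
    if not nonprod_validated:
        return "validate non-prod"
    return "upgrade prod"
```

The gate only ever returns `"upgrade prod"` once non-prod is already on the target version and has been marked as validated.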
Use this when you want minimal operational overhead and can upgrade prod directly.
Host a single control plane and run dev / staging / prod inside the same tenant using virtual accounts, teams, and RBAC.
When to choose this
Same models and providers across environments.
You want one place to manage users, SSO, models, policies, and observability.
Smaller platform team, comfortable upgrading the only control plane in place — usually after the SaaS release has been live for a couple of weeks.
Create dev / staging / prod virtual accounts for each application (e.g. checkout-dev, checkout-staging, checkout-prod) and tag them with environment metadata.
3
Apply RBAC, rate limits, and budgets per environment
Use Access Control, Rate Limiting, and Budget Limiting so that dev tokens have looser limits, prod tokens have stricter ones, and access to prod is restricted to a small group.
4
(Optional) Run separate gateway planes per environment
If dev and prod live in different VPCs, deploy two gateway planes connected to the same control plane (same as the Two gateway planes option in the section above).
Use this when models or users differ per environment but you want a single control plane.
Host a single control plane in multi-tenant mode and create one tenant per environment, for example dev, staging, prod, hosted on dev.<base_domain>, staging.<base_domain>, and prod.<base_domain>.
When to choose this
Dev / staging / prod use different models, providers, or API keys and you want them fully isolated.
You want separate user lists, SSO settings, virtual accounts, and request logs per environment.
You want tenant-level isolation but only have ops capacity to run one control-plane stack.
From the admin dashboard at <base_domain>/admin/, create the dev, staging, and prod tenants. Each tenant gets its own subdomain and admin invite.
3
Configure each tenant independently
For each tenant, configure SSO, users, teams, virtual accounts, provider accounts, and models — using Setup AI Gateway in Org as a checklist.
Tenants on the same control plane share infrastructure (Postgres, blob storage, NATS, gateway pods) but data is logically isolated per tenant. If you need full infrastructure isolation between non-prod and prod, or you want to bake control-plane upgrades on non-prod first, choose Two control planes instead.
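The subdomain convention for tenants on a single control plane can be written down as a tiny helper, which is handy when generating DNS or ingress entries. This Python sketch assumes the `<tenant>.<base_domain>` layout described above; the function names and the example domain `ai.example.com` are hypothetical.

```python
def tenant_hosts(
    base_domain: str, tenants: tuple[str, ...] = ("dev", "staging", "prod")
) -> dict[str, str]:
    """Map each tenant name to its hostname, <tenant>.<base_domain>."""
    return {tenant: f"{tenant}.{base_domain}" for tenant in tenants}


def admin_dashboard(base_domain: str) -> str:
    """Location of the admin dashboard, <base_domain>/admin/."""
    return f"{base_domain}/admin/"
```

For a hypothetical base domain, `tenant_hosts("ai.example.com")["staging"]` evaluates to `"staging.ai.example.com"`.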
Use GitOps to manage configuration across environments
Store models, virtual accounts, policies, guardrails, and access rules as YAML in Git. Apply the same definitions to dev / staging / prod with environment-specific overrides using tfy apply. This avoids drift and makes promotion (dev → staging → prod) auditable. See Setup GitOps and Using tfy apply.
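The idea of "same definitions with environment-specific overrides" can be sketched as a recursive merge of a base definition with a per-environment overlay. The Python below is a generic illustration, not the manifest schema used by tfy apply: every field name (`name`, `metadata`, `guardrails`) and the `render` helper are assumptions.

```python
from copy import deepcopy


def apply_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides merged in recursively."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested mappings key by key instead of replacing them.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical shared definition and per-environment overlays.
BASE = {
    "name": "checkout",
    "metadata": {"owner_team": "checkout-team"},
    "guardrails": [],
}
OVERRIDES = {
    "dev": {"metadata": {"environment": "dev"}},
    "prod": {
        "metadata": {"environment": "prod"},
        "guardrails": ["pii-redaction"],
    },
}


def render(environment: str) -> dict:
    """Build the effective definition for one environment."""
    return apply_overrides(BASE, OVERRIDES[environment])
```

Because the merge works on copies, the shared base is never mutated, so rendering prod cannot leak settings into dev.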
Sync virtual account tokens to your secret manager
For each environment, configure secret store sync so the dev / staging / prod tokens are written to distinct paths in your secret manager (e.g. AWS Secrets Manager, HashiCorp Vault). Apps read tokens from the secret manager, not from the UI. Combine with auto-rotation for prod.
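Keeping tokens on distinct paths is easier to enforce when the path is derived from the environment rather than typed by hand. The following Python sketch shows one possible layout; the `ai-gateway/<environment>/<app>/virtual-account-token` structure and the function name are assumptions, not a layout prescribed by TrueFoundry or by any secret manager.

```python
ENVIRONMENTS = ("dev", "staging", "prod")


def token_secret_path(app: str, environment: str, prefix: str = "ai-gateway") -> str:
    """Build the secret-manager path holding an app's virtual account token.

    The path layout is a hypothetical convention chosen for this example.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{prefix}/{environment}/{app}/virtual-account-token"
```

An application would resolve its path once at startup, for example `token_secret_path("checkout", "prod")`, and then read the token from the secret manager with whatever client it already uses.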
Apply tighter policies in prod
Rate limits, budget limits, guardrails, and data-access rules should be progressively stricter from dev → staging → prod. Reserve aggressive guardrails (PII redaction, content moderation, prompt injection) for prod traffic; keep dev permissive so engineers can iterate.
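One way to keep this ordering from drifting is a check that each environment enforces at least the guardrails of the environment before it. The Python sketch below is illustrative: the guardrail identifiers are hypothetical names for the examples mentioned above, and leaving staging as permissive as dev is an assumption made for this example.

```python
ORDER = ("dev", "staging", "prod")

# Hypothetical guardrail identifiers per environment.
GUARDRAILS = {
    "dev": set(),
    "staging": set(),  # assumption: staging kept as permissive as dev here
    "prod": {"pii-redaction", "content-moderation", "prompt-injection"},
}


def is_progressively_stricter(
    guardrails: dict[str, set[str]], order: tuple[str, ...] = ORDER
) -> bool:
    """True if every environment includes all guardrails of the previous one."""
    return all(
        guardrails[earlier] <= guardrails[later]
        for earlier, later in zip(order, order[1:])
    )
```

A configuration in which dev enforces a guardrail that prod lacks fails the check, which usually signals a copy-paste or promotion mistake.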
Restrict who can read prod data
Use Data Access rules so only on-call / SRE / security teams can read prod request logs, while broader teams can see dev / staging data. This is especially important when running everything in a single tenant.