Set Up Dev, Staging, and Prod Environments

Most organizations need to separate dev, staging, and prod for the AI Gateway so that experimental work, model evaluations, and production traffic do not interfere with each other. The right setup depends on:

Which deployment mode you are on — see Deployment Options.
Whether dev / staging / prod use the same models and providers, or different ones.
How much isolation you want between environments (data, users, access, blast radius).

This guide walks through the recommended approach for each deployment mode.

Quick recommendation

Deployment mode	Recommended setup
SaaS (TrueFoundry-hosted), same models across envs	One tenant + separate virtual accounts for dev / staging / prod, scoped with RBAC.
SaaS (TrueFoundry-hosted), different models or stronger isolation	Separate tenants per environment (e.g. `company-dev`, `company-staging`, `company-prod`).
Self-hosted Gateway Plane (TrueFoundry-hosted Control Plane)	One gateway plane is fine for most cases. Use two gateway planes (dev + prod) if dev and prod are in different VPCs and you don’t want to expose the same gateway to both environments.
Self-hosted Control Plane	Two control planes (non-prod + prod) is the recommended default — it’s the safest setup because you can test TrueFoundry upgrades on non-prod before applying them to prod. Run a single control plane only if you want lower operational overhead and can accept upgrading prod directly.

Across all options, use virtual accounts for application traffic and Personal Access Tokens (PATs) for individual user traffic — never share tokens across environments.

SaaS (TrueFoundry-hosted Control Plane and Gateway)

On SaaS, infrastructure is fully managed by TrueFoundry. Environment separation is achieved either through virtual accounts inside one tenant or by creating multiple tenants.

One tenant
Multiple tenants

One tenant

Recommended when models are the same across environments. Use this when dev, staging, and prod use the same set of models, providers, and credentials, and you mostly need to separate which application (or environment) is calling the gateway for cost tracking, rate limiting, and observability.

What you set up

Create one virtual account per environment per application

For each application or service, create one virtual account per environment, for example checkout-dev, checkout-staging, checkout-prod. Apps in each environment use the corresponding virtual account token to call the gateway. See Manage Virtual Accounts.

Tag virtual accounts and use teams as owners

Set the owner team on each virtual account (e.g. checkout-team) so team members can see logs / metrics for that environment in the gateway. Add a metadata.environment value (dev / staging / prod) on the virtual account so you can filter request logs and metrics by environment.
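The naming and tagging convention above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys (`name`, `owner_team`, `metadata`) are hypothetical and do not represent the gateway's actual virtual account schema.

```python
# Sketch: one virtual account definition per application per environment.
# The field names are illustrative assumptions, not a TrueFoundry schema.
ENVIRONMENTS = ("dev", "staging", "prod")


def virtual_accounts(app: str, owner_team: str) -> list[dict]:
    """Return one hypothetical virtual account definition per environment."""
    return [
        {
            "name": f"{app}-{env}",
            "owner_team": owner_team,
            # Tag used later to filter request logs and metrics by environment.
            "metadata": {"environment": env},
        }
        for env in ENVIRONMENTS
    ]


accounts = virtual_accounts("checkout", "checkout-team")
account_names = [account["name"] for account in accounts]
```

Generating the definitions from a single list of environments keeps names and tags consistent across applications.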

Apply environment-specific policies

Attach different rate-limiting and budget-limiting policies to dev / staging / prod virtual accounts — for example, lower limits and stricter budgets in dev, higher in prod. See Rate Limiting and Budget Limiting.
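As a rough sketch of the example in the paragraph above (lower limits in dev, higher in prod), the snippet below models per-environment limits as plain data and checks that they grow from dev to prod. The field names and every number are placeholders chosen for illustration; real policies are configured through Rate Limiting and Budget Limiting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvPolicy:
    requests_per_minute: int  # hypothetical rate limit
    monthly_budget_usd: int   # hypothetical budget cap


# Placeholder values only; pick limits that match your own traffic and spend.
POLICIES = {
    "dev": EnvPolicy(requests_per_minute=60, monthly_budget_usd=200),
    "staging": EnvPolicy(requests_per_minute=300, monthly_budget_usd=1000),
    "prod": EnvPolicy(requests_per_minute=3000, monthly_budget_usd=20000),
}


def limits_grow_towards_prod(policies: dict[str, EnvPolicy]) -> bool:
    """True if each environment's limits are no higher than the next one's."""
    order = ["dev", "staging", "prod"]
    return all(
        policies[a].requests_per_minute <= policies[b].requests_per_minute
        and policies[a].monthly_budget_usd <= policies[b].monthly_budget_usd
        for a, b in zip(order, order[1:])
    )


policies_are_ordered = limits_grow_towards_prod(POLICIES)
```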

Control access using RBAC

Decide who can use which environment by granting provider account access (for models) and virtual account access at the team / user level. Typical pattern:

All developers can use *-dev virtual accounts.
Only the on-call / SRE team can use *-prod virtual accounts.
Tenant admins can manage everything.

See Access Control and Manage User Roles & Permissions.
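The typical pattern above can be expressed as a small pattern-matching check. This is a sketch of the access logic only, with hypothetical team names; actual permissions are granted through the gateway's RBAC settings rather than application code.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping of team -> virtual account name patterns it may use.
ACCESS_RULES = {
    "developers": ["*-dev"],
    "sre-on-call": ["*-prod"],
    "tenant-admins": ["*"],
}


def can_use(team: str, virtual_account: str) -> bool:
    """Return True if the team matches any pattern for this virtual account."""
    patterns = ACCESS_RULES.get(team, [])
    return any(fnmatchcase(virtual_account, pattern) for pattern in patterns)
```

For example, `can_use("developers", "checkout-prod")` evaluates to `False`, while `can_use("sre-on-call", "checkout-prod")` evaluates to `True`.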

This option keeps operational overhead minimal: one set of model integrations, one set of users, one place to look for logs. Teams just switch tokens between environments.

Multiple tenants

Recommended when models differ per environment or you need strong isolation. Use this when dev / staging / prod need different models or providers, when you want separate user lists, SSO, or billing per environment, or when compliance requires complete isolation.

On SaaS you can create multiple tenants, each with its own URL, for example:

company-dev.truefoundry.cloud
company-staging.truefoundry.cloud
company-prod.truefoundry.cloud

Each tenant has its own:

Users, teams, virtual accounts, and SSO configuration.
Provider accounts, models, MCP servers, and guardrails.
Request logs, metrics, and policies.

What you set up

Request additional tenants

Reach out to TrueFoundry to provision the additional tenants (e.g. company-dev and company-staging alongside an existing company-prod).

Configure SSO and users per tenant

For each tenant, configure SSO and decide on a user provisioning mode (SCIM, JIT, or invite-only). Most teams give all developers access to dev / staging and a smaller group access to prod.

Set up provider accounts and models per tenant

Add provider accounts and models in each tenant. You can use different API keys per environment (e.g. a sandbox OpenAI key in dev, the production key in prod) by creating separate provider accounts in each tenant.

Use environment-specific virtual accounts inside each tenant

Within each tenant, follow Setup AI Gateway in Org to create teams, virtual accounts, and policies as usual.

Tenants are fully isolated: users, virtual accounts, models, and request logs are not shared across them. An engineer who needs both dev and prod must be added to each tenant separately, so plan access deliberately.

Self-hosted Gateway Plane (Control Plane on SaaS)

In this mode, the Gateway Plane runs in your own infrastructure while the Control Plane is hosted by TrueFoundry. You control how many gateway planes to deploy.

One gateway plane
Two gateway planes

One gateway plane

Recommended default. Run a single gateway plane that serves dev, staging, and prod traffic. Use virtual accounts and policies (the One tenant option above) to separate environments.

When to choose this

Dev / staging / prod live in the same VPC or are reachable from a single shared network.
You want to minimize hosting cost and operational overhead.
Same models and providers are used across environments.

The gateway plane is stateless and horizontally scalable, so a single fleet can handle traffic for all environments. Combine this with one tenant + per-environment virtual accounts on the Control Plane.

Two gateway planes

Recommended when dev and prod live in separate VPCs. Run two gateway planes, one in your dev VPC and one in your prod VPC, both connected to the same TrueFoundry-hosted Control Plane.

When to choose this

Dev and prod must stay in separate VPCs (network isolation, different cloud accounts, regulatory boundaries).
You want production traffic on isolated nodes with their own scaling profile, separate from dev experiments.
You want to roll out gateway version upgrades to dev first and let them bake before upgrading prod.

What you set up

Deploy a gateway plane in each VPC

Install the gateway-plane Helm chart in each environment’s cluster. Both gateways point to the same TrueFoundry-hosted Control Plane. See Deploy Gateway Plane.

Use distinct gateway endpoints

Expose each gateway on its own hostname (e.g. gateway-dev.company.com and gateway.company.com). Apps in each environment call their own gateway endpoint.
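On the application side, choosing the right endpoint can be as simple as a lookup keyed by environment. The sketch below reuses the example hostnames from the paragraph above; the `https` scheme and the function name are assumptions made for illustration.

```python
# Example hostnames from the text above; replace with your own.
GATEWAY_HOSTS = {
    "dev": "gateway-dev.company.com",
    "prod": "gateway.company.com",
}


def gateway_base_url(env: str) -> str:
    """Return the gateway URL for an environment, failing loudly if unknown."""
    try:
        host = GATEWAY_HOSTS[env]
    except KeyError:
        raise ValueError(f"no gateway configured for environment {env!r}") from None
    return f"https://{host}"  # scheme is an assumption


prod_url = gateway_base_url("prod")
```

Raising on an unknown environment, instead of silently falling back to a default, helps keep dev traffic from reaching the prod gateway by mistake.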

Separate environments inside the Control Plane

Use either the One tenant or Multiple tenants approach from the SaaS section above, depending on your isolation needs. Models, policies, and access rules are still configured centrally on the SaaS Control Plane.

Both gateway planes share the same configuration source of truth (the Control Plane), so you don’t have to duplicate model integrations or access rules.

Self-hosted Control Plane

When you self-host the Control Plane (and Gateway Plane) on your own infrastructure, you have three options. We recommend running two control planes — one for non-prod (dev / staging) and one for prod — so that you can validate TrueFoundry upgrades on non-prod before rolling them out to prod. This is the safest setup for production traffic. If your team prefers lower operational overhead and can accept upgrading prod directly, you can run a single control plane and separate environments inside it using virtual accounts or tenants.

We recommend running at most two control planes: one for non-prod (dev / staging) and one for prod. Each additional control plane is another full stack to operate (backups, upgrades, monitoring, certificates, secrets), usually without proportional benefit.

Two control planes
One CP, one tenant
One CP, multiple tenants

Two control planes

Recommended default and the safest setup for production. Host two control planes, one for non-prod (dev + staging) and one for prod, each with its own database, blob storage, gateway plane, and Helm release.

Why this is the recommended default

You can test TrueFoundry upgrades on the non-prod control plane and validate against your non-prod gateway before touching prod.
You get full infrastructure isolation between non-prod and prod (separate clusters, databases, blob storage, failure domains).
You can roll out config changes (new models, policies, guardrails) to non-prod first and promote to prod after validation.

What you set up

Deploy a non-prod control plane (dev + staging)

Install one TrueFoundry control plane in your non-prod cluster, following Deploy Control Plane and Gateway Plane. Use this control plane for both dev and staging, since they typically share most config and infra needs. Inside this control plane, separate dev and staging using virtual accounts (see One CP, one tenant) or tenants (see One CP, multiple tenants).

Deploy a prod control plane

Install a second TrueFoundry control plane in your prod cluster on its own database, blob storage, and domain (e.g. app.company.com for prod and app-nonprod.company.com for non-prod).

Bake control-plane upgrades on non-prod first

When upgrading TrueFoundry, upgrade the non-prod control plane first, validate against the non-prod gateway, and only then upgrade the prod control plane. This gives you a safe rollout window for new releases.
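The ordering described above amounts to a simple gate: prod is upgraded only after non-prod has been upgraded and validated. The sketch below captures that control flow with caller-supplied callables; `upgrade` and `validate` are hypothetical hooks standing in for whatever performs your Helm upgrade and smoke tests, and `"X.Y.Z"` is a placeholder version.

```python
from typing import Callable


def staged_upgrade(
    version: str,
    upgrade: Callable[[str, str], None],
    validate: Callable[[str], bool],
) -> list[str]:
    """Upgrade non-prod first; touch prod only if non-prod validates."""
    upgraded = []
    upgrade("non-prod", version)
    upgraded.append("non-prod")
    if not validate("non-prod"):
        return upgraded  # stop here: prod stays on the old version
    upgrade("prod", version)
    upgraded.append("prod")
    return upgraded


# Stub hooks that only record calls, to show the order of operations.
calls = []
result = staged_upgrade(
    "X.Y.Z",
    upgrade=lambda control_plane, v: calls.append((control_plane, v)),
    validate=lambda control_plane: True,
)
```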

Keep configuration in sync via GitOps

Manage models, virtual accounts, policies, and access rules through GitOps so the same source-of-truth can be applied to both control planes (with environment-specific overrides).

Pick this option if you have a specialized deployment setup (custom networking, air-gapped environment, strict change-management) or the platform-team capacity to operate two stacks. It’s also the right choice when prod uptime requirements call for testing upgrades before any change reaches prod.

Do not run more than two control planes. A third control plane (e.g. a dedicated staging control plane) adds another full stack to operate (backups, upgrades, monitoring, certificates, secrets) without proportional benefit. Use a separate tenant or virtual account for staging inside the non-prod control plane instead.

One CP, one tenant

Use this when you want minimal operational overhead and can upgrade prod directly. Host a single control plane and run dev / staging / prod inside the same tenant using virtual accounts, teams, and RBAC.

When to choose this

Same models and providers across environments.
You want one place to manage users, SSO, models, policies, and observability.
Smaller platform team, comfortable upgrading the only control plane in place — usually after the SaaS release has been live for a couple of weeks.

What you set up

Deploy a single self-hosted control plane

Install one TrueFoundry control plane following Deploy Control Plane and Gateway Plane.

Create per-environment virtual accounts

Create dev / staging / prod virtual accounts for each application (e.g. checkout-dev, checkout-staging, checkout-prod) and tag them with environment metadata.

Apply RBAC, rate limits, and budgets per environment

Use Access Control, Rate Limiting, and Budget Limiting so that dev tokens have looser limits, prod tokens have stricter ones, and access to prod is restricted to a small group.

(Optional) Run separate gateway planes per environment

If dev and prod live in different VPCs, deploy two gateway planes connected to the same control plane (same as the Two gateway planes option in the section above).

One CP, multiple tenants

Use this when models or users differ per environment but you want a single control plane. Host a single control plane in multi-tenant mode and create one tenant per environment, for example dev, staging, and prod, hosted on dev.<base_domain>, staging.<base_domain>, and prod.<base_domain>.

When to choose this

Dev / staging / prod use different models, providers, or API keys and you want them fully isolated.
You want separate user lists, SSO settings, virtual accounts, and request logs per environment.
You want tenant-level isolation but only have ops capacity to run one control-plane stack.

What you set up

Enable multi-tenancy on your control plane

Follow Manage Multi-Tenancy on the Control Plane to enable multi-tenant mode and pick a base domain (e.g. app.company.com).

Create one tenant per environment

From the admin dashboard at <base_domain>/admin/, create the dev, staging, and prod tenants. Each tenant gets its own subdomain and admin invite.

Configure each tenant independently

For each tenant, configure SSO, users, teams, virtual accounts, provider accounts, and models — using Setup AI Gateway in Org as a checklist.

Tenants on the same control plane share infrastructure (Postgres, blob storage, NATS, gateway pods) but data is logically isolated per tenant. If you need full infrastructure isolation between non-prod and prod, or you want to bake control-plane upgrades on non-prod first, choose Two control planes instead.

Choosing between options

Start with Two control planes unless your team explicitly wants to consolidate to a single stack. Use this matrix to decide:

Requirement	Two control planes (recommended)	One CP, one tenant	One CP, multiple tenants
Test control-plane upgrades on non-prod first	✅ Best fit	❌	❌
Infrastructure isolation (DB, storage, cluster)	✅ Best fit	❌	❌
Same models across envs	✅	✅ Best fit	✅
Different models / API keys per env	✅	⚠️ Possible via separate provider accounts	✅ Best fit
Separate users / SSO per env	✅	❌	✅
Operational overhead	Highest	Lowest	Medium
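One way to read the matrix is as a short decision function. The sketch below is an interpretation of the table and of the "start with Two control planes" guidance, not an official selection tool; the parameter names are made up for illustration.

```python
def choose_control_plane_setup(
    *,
    single_stack: bool,
    needs_upgrade_testing: bool = False,
    needs_infra_isolation: bool = False,
    different_models_or_keys: bool = False,
    separate_users_or_sso: bool = False,
) -> str:
    """Map requirements to one of the three self-hosted options."""
    # Only two control planes allow upgrade testing and infrastructure isolation.
    if not single_stack or needs_upgrade_testing or needs_infra_isolation:
        return "Two control planes"
    # Within a single stack, per-environment models or users point to tenants.
    if different_models_or_keys or separate_users_or_sso:
        return "One CP, multiple tenants"
    return "One CP, one tenant"
```

The function defaults to two control planes unless the caller explicitly opts into a single stack, mirroring the recommendation above the table.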

Cross-cutting recommendations

These apply regardless of which option you pick.

Use GitOps to manage configuration across environments

Store models, virtual accounts, policies, guardrails, and access rules as YAML in Git. Apply the same definitions to dev / staging / prod with environment-specific overrides using tfy apply. This avoids drift and makes promotion (dev → staging → prod) auditable. See Setup GitOps and Using tfy apply.
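To illustrate the idea of shared definitions with environment-specific overrides, the sketch below deep-merges a base definition with a prod override before it would be written out and applied. The merge helper and all keys are hypothetical; this is not a description of how tfy apply itself handles overrides.

```python
import copy


def merge_overrides(base: dict, overrides: dict) -> dict:
    """Return a new dict: base with overrides applied, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical shared definition and prod-only differences.
BASE = {
    "name": "checkout",
    "provider": {"name": "openai", "api_key_ref": "openai-sandbox"},
    "guardrails": [],
}
PROD_OVERRIDES = {
    "provider": {"api_key_ref": "openai-production"},
    "guardrails": ["pii-redaction"],
}

prod_config = merge_overrides(BASE, PROD_OVERRIDES)
```

Keeping overrides small makes the difference between environments easy to review in a pull request.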

Sync virtual account tokens to your secret manager

For each environment, configure secret store sync so the dev / staging / prod tokens are written to distinct paths in your secret manager (e.g. AWS Secrets Manager, HashiCorp Vault). Apps read tokens from the secret manager, not from the UI. Combine with auto-rotation for prod.
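A consistent path layout makes it hard for an app to pick up another environment's token by accident. The helper below is a minimal sketch of one such layout; the `ai-gateway` prefix and the path structure are assumptions, so adapt them to the conventions of your secret manager.

```python
VALID_ENVIRONMENTS = {"dev", "staging", "prod"}


def token_secret_path(app: str, env: str, prefix: str = "ai-gateway") -> str:
    """Build a distinct secret path per application and environment."""
    if env not in VALID_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    # Hypothetical layout: <prefix>/<env>/<app>/virtual-account-token
    return f"{prefix}/{env}/{app}/virtual-account-token"


paths = {env: token_secret_path("checkout", env) for env in sorted(VALID_ENVIRONMENTS)}
```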

Apply tighter policies in prod

Rate limits, budget limits, guardrails, and data-access rules should be progressively stricter from dev → staging → prod. Reserve aggressive guardrails (PII redaction, content moderation, prompt injection) for prod traffic; keep dev permissive so engineers can iterate.

Restrict who can read prod data

Use Data Access rules so only on-call / SRE / security teams can read prod request logs, while broader teams can see dev / staging data. This is especially important when running everything in a single tenant.
