Adding Models
This section explains the steps to add Google Vertex AI models and configure the required access controls.Navigate to Google Vertex Models in AI Gateway
From the TrueFoundry dashboard, navigate to 
AI Gateway > Models and select Google Vertex.
Add Google Vertex Account and Authentication
Give a unique name to your Google Vertex account. This will be used to refer to the models later. Add collaborators to your account, this will give access to the account to other users/teams. Learn more about access control here.

Get Google Vertex Authentication Details
Get Google Vertex Authentication Details
Required IAM RoleThe Google Cloud identity used by the gateway (a service account, whether referenced by a key, by GKE Workload Identity, or by Workload Identity Federation) must have the PrerequisitesThis produces a JSON file with To set up GKE Workload Identity, follow the official Google Cloud documentation: Configure Workload Identity on GKE.When adding the Vertex AI provider account in TrueFoundry, leave the authentication section empty — the gateway will automatically pick up GKE Workload Identity credentials via Application Default Credentials (ADC).
Vertex AI User role (roles/aiplatform.user), which includes the aiplatform.endpoints.predict permission required by the gateway.The gateway supports three authentication methods. Pick the one that matches your deployment.1. Using Service Account JSON KeyThis method works for all deployment types (GKE, EKS, AKS, on-premises, or the SaaS Gateway).- Generate a Service Account JSON key by following the official Google Cloud documentation here.
- The service account must have the
Vertex AI Userrole. - When adding the provider account in TrueFoundry, select Service account key file as the authentication type and paste the JSON key into the Service account key JSON field (or store it as a secret and reference it).
Workload Identity Federation is the recommended approach for production deployments running outside of GKE. It eliminates long-lived service account keys while supporting any Kubernetes environment, and it also works on the SaaS version of the AI Gateway.
- A Google Cloud project with Vertex AI enabled.
- A Workload Identity Pool and Provider configured in Google Cloud IAM. Follow the official guide: Configure Workload Identity Federation with Kubernetes.
- A Google Cloud IAM service account that the federated identity can impersonate, with the
Vertex AI Userrole (roles/aiplatform.user) granted on the project. - The Kubernetes service account used by the gateway must have permission to issue
TokenRequestresources for itself. The TrueFoundry-provided Helm chart configures this RBAC automatically.
gcloud CLI to generate the credential configuration file:"type": "external_account" describing the identity pool, audience, and STS token-exchange endpoints. It is not a private key.Configure in TrueFoundryWhen adding or editing the Vertex AI provider account:- Select Workload Identity Federation file as the authentication type.
- Paste the contents of the generated
credential-config.jsoninto the Key file content field, or store it as a secret and reference it.
GKE Workload Identity is GKE-specific. Pods using the configured KSA authenticate as the associated GSA when accessing Google Cloud APIs, with no extra configuration on the gateway side.
GCP Workload Identity (GKE ADC) does not work on the SaaS version of the Gateway, and it only works when the gateway runs inside a GKE cluster. For all other environments, use Workload Identity Federation or a Service Account JSON key.

Configure Project ID and Region
Provide your Google Cloud Project ID and a default Region for all models under this account. You can override the region for individual models later.Project ID
Region
- You can find your Project ID in the top-right corner of your Google Cloud Console.

- Specify a default region for all models under this account. You can override this region for individual models later.
Add Models
You can either select available models from the list or add them manually by clicking
+ Add Model. When adding a model manually, the Model ID format depends on the provider.Adding Google (Gemini) Models
Adding Google (Gemini) Models
Select a Gemini model from the list or add it manually.
- Model ID Format:
google/<vertex-model-id> - Example:
google/gemini-1.5-pro

Adding Anthropic Models
Adding Anthropic Models
Select a Claude model from the list or add it manually.
- Model ID Format:
anthropic/<vertex-model-id> - Example:
anthropic/claude-3-5-sonnet-v2@20241022

Adding Mistral AI Models
Adding Mistral AI Models
Select a Mistral model from the list or add it manually.
- Model ID Format:
mistralai/<vertex-model-id> - Example:
mistralai/mistral-large-2411@001

When adding any model manually, you can specify a Region to override the default one set at the account level.
Inference
After adding the models, you can perform inference using an OpenAI-compatible API via the Playground or by integrating it with your own application.
FAQs
Do I need to add multiple provider accounts for different regions?
Do I need to add multiple provider accounts for different regions?
No. You can set a default region at the account level and override it for each individual model if needed. This allows you to use models from different regions with a single provider account.
Which authentication method should I choose?
Which authentication method should I choose?
- Service Account JSON Key — Works everywhere (any cloud, on-prem, SaaS Gateway). Simplest to set up, but requires you to manage and rotate a long-lived secret.
- Workload Identity Federation — Recommended for production. Keyless, works on any Kubernetes cluster (EKS, AKS, GKE, on-prem) and on the SaaS Gateway. Requires a one-time setup of a Workload Identity Pool in Google Cloud.
- GCP Workload Identity (GKE) — Only available when the self-hosted gateway runs inside a GKE cluster. Keyless and zero-config on the gateway side, but does not work on the SaaS Gateway or outside of GKE.
| Service Account Key | Workload Identity Federation | GCP Workload Identity (GKE) | |
|---|---|---|---|
| Works on GKE | Yes | Yes | Yes |
| Works on EKS / AKS / on-prem | Yes | Yes | No |
| Works on SaaS Gateway | Yes | Yes | No |
| Key management required | Yes | No | No |
| Requires credential JSON in TrueFoundry | Yes (service account key) | Yes (external_account config) | No (leave empty) |
What is the difference between GCP Workload Identity and Workload Identity Federation?
What is the difference between GCP Workload Identity and Workload Identity Federation?
Both are keyless authentication mechanisms, but they target different environments.GCP Workload Identity is a GKE-only feature. The GKE metadata server automatically maps a Kubernetes service account to a Google Cloud IAM service account. The gateway picks this up through Application Default Credentials (ADC) when no auth data is configured. It does not work on the SaaS Gateway or outside of GKE.Workload Identity Federation is a broader Google Cloud feature that works across any Kubernetes cluster (EKS, AKS, on-prem, and GKE) and on the SaaS Gateway. It requires you to provide an
external_account credential configuration JSON (generated via gcloud iam workload-identity-pools create-cred-config). The gateway exchanges a short-lived Kubernetes service account token for a Google Cloud access token through Google’s Security Token Service.When should I use Gemini vs Vertex AI? What's the difference?
When should I use Gemini vs Vertex AI? What's the difference?
Gemini is generally recommended for individual developers and prototyping use cases, while Vertex AI is recommended for production and enterprise use cases.Vertex AI offers everything available in the Gemini API and more, including:
- More secure auth using service accounts instead of API keys
- A Model Garden that includes multiple third-party models
- Access to provisioned throughput