Claude Code Proxy: Route Claude, GPT-5 & Gemini Through TrueFoundry AI Gateway

Introduction

Claude Code is the most powerful AI coding assistant available today. Engineers who adopt it rarely go back. But when dozens or hundreds of engineers start using it at the same time, a new problem appears: Claude Code, by default, talks directly to Anthropic's API. Every developer authenticates with their own key, uses Anthropic models exclusively, and generates API spend that is completely invisible to the platform team until the monthly invoice arrives.

A Claude Code proxy is the answer. By pointing Claude Code at a proxy endpoint instead of directly at Anthropic, you gain a centralized control point for every model call across your entire engineering organization: visibility into who is spending what, the ability to enforce budget caps before they're exceeded, access to models from any provider - GPT-5, Gemini 2.5 Pro, Llama via Bedrock - through the same interface Claude Code already knows, and the ability to deploy gateway configuration once and have it apply to all developers without touching individual machines.

TrueFoundry AI Gateway is an enterprise-grade Claude Code proxy. It is a drop-in Anthropic-compatible endpoint that Claude Code connects to with a single environment variable change. Once connected, every Claude Code request flows through the gateway, giving you observability, cost controls, multi-model routing, and enterprise security policies that apply to the whole organization, not just the developers who remember to configure them.

This guide explains exactly what a Claude Code proxy does, why TrueFoundry AI Gateway is the right one for enterprise engineering teams, and how to configure it, including for both standard API key and Claude Max subscription flows.

What Is a Claude Code Proxy?

Claude Code ships with a single configuration knob for changing its backend: the ANTHROPIC_BASE_URL environment variable. When set, Claude Code sends all its API requests (messages, model calls, streaming responses) to that URL instead of to https://api.anthropic.com.

That one variable is the foundation of every Claude Code proxy. A proxy is any server that:

Accepts Anthropic-format API requests from Claude Code
Adds controls, routing, or observability at the proxy layer
Forwards requests to the actual model provider (Anthropic, OpenAI, Google, Bedrock, on-prem)
Returns responses back to Claude Code in the format it expects

The simplest possible proxy is a reverse proxy with logging. The most sophisticated is an enterprise AI gateway that handles authentication, budget enforcement, model routing across providers, semantic caching, guardrails, and full audit trails - all transparently, with no changes to how Claude Code behaves for the developer.

Why do teams build or adopt a Claude Code proxy?

Cost control: Multiple developers using Claude Code with individual Anthropic keys generate spend that is invisible until month-end. A proxy intercepts every request and enforces per-developer daily limits before costs exceed budget.
Multi-model access: Claude Code's interface is powerful, but Claude models are not always the best or most cost-effective choice for every task. A proxy lets you route haiku-tier tasks to GPT-4o-mini or Gemini Flash, and opus-tier tasks to the best available model without any client-side changes.
Enterprise security: Direct API keys on developer laptops are a security liability. A proxy centralizes credentials: developers authenticate to the proxy, and the proxy holds provider keys. No Anthropic key ever needs to live on a developer machine.
Team-wide governance: Individual developers can configure their own ANTHROPIC_BASE_URL. But enforcing it across an entire team requires a centralized deployment mechanism - MDM, server-managed settings, or a shared project .claude/settings.json checked into version control.
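
The credential-centralization point can be made concrete with a minimal sketch. It assumes the developer's gateway key arrives as a bearer token in the Authorization header and that the upstream provider is Anthropic's API, which takes its key in the x-api-key header. The VIRTUAL_KEYS table, the key values, and the swap_credentials function are hypothetical names used for illustration, not TrueFoundry's implementation.

```python
# Hypothetical table mapping gateway-issued keys to developer identities.
VIRTUAL_KEYS = {"tfy-key-alice": "alice@example.com"}


def swap_credentials(headers: dict[str, str], provider_key: str) -> tuple[str, dict[str, str]]:
    """Authenticate the developer, then replace their key with the provider key."""
    lowered = {name.lower(): value for name, value in headers.items()}
    token = lowered.get("authorization", "").removeprefix("Bearer ").strip()
    user = VIRTUAL_KEYS.get(token)
    if user is None:
        raise PermissionError("unknown gateway key")
    # Drop client-supplied credentials so they never reach the provider.
    upstream = {k: v for k, v in lowered.items() if k not in {"authorization", "x-api-key"}}
    upstream["x-api-key"] = provider_key  # the provider key lives only on the proxy
    return user, upstream
```

The returned user identity is what a proxy would attach to logs and budget counters; the provider key never appears in anything the developer's machine sends or receives.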

Why TrueFoundry AI Gateway Is the Right Claude Code Proxy

There are three ways to proxy Claude Code: build your own, use a simple reverse proxy, or use a purpose-built AI gateway. Building your own means owning the maintenance, security, and reliability of a production API gateway. A simple reverse proxy adds logging but none of the controls. TrueFoundry AI Gateway gives you everything an enterprise engineering team actually needs without building or maintaining it.

TrueFoundry AI Gateway is a unified proxy layer between Claude Code and your model providers. It accepts the same Anthropic API format that Claude Code already speaks, so Claude Code never needs to know it's talking to a gateway rather than directly to Anthropic. Behind the gateway, you can connect any provider: Anthropic direct, AWS Bedrock, Google Vertex AI, Azure OpenAI, OpenAI, or your own on-prem models.

Here is what Claude Code actually sees:

Claude Code  →  ANTHROPIC_BASE_URL (TrueFoundry Gateway)  →  Anthropic / OpenAI / Gemini / Bedrock / On-prem

Every Claude Code request that flows through TrueFoundry gains, automatically:

Capability	What It Does for Claude Code Users	TrueFoundry Feature
Multi-provider model access	Use GPT-5, Gemini 2.5 Pro, Llama, or on-prem models through the same Claude Code interface	Virtual Models
Per-developer budget limits	Blocks requests when daily or monthly spend caps are hit — before cost overruns, not after	Budget Limiting
Rate limiting	Throttle per-developer, per-team, or per-environment request rates	Rate Limiting
Cost attribution	Dashboard showing exactly which developer, team, and model drove every dollar of spend	Analytics
RBAC and virtual keys	No Anthropic API keys on developer machines — team members authenticate with TrueFoundry keys scoped to their access level	Access Control
Automatic failover	If Anthropic hits a rate limit or outage, the gateway silently retries on the next configured provider	Load Balancing & Fallbacks
Guardrails	PII detection, prompt injection protection, and custom content policies applied before requests reach the model	Guardrails
Full audit trail	Every request logged with user, model, token count, cost, and latency — exportable via OpenTelemetry	OpenTelemetry Export

~3–4ms p95 gateway overhead, 350+ RPS on a single vCPU. At Claude Code response times (seconds, not milliseconds), the gateway adds no perceptible latency.

Step 1: Point Claude Code at TrueFoundry AI Gateway

The core configuration is a single environment variable:

export ANTHROPIC_BASE_URL="https://<your-truefoundry-gateway-url>"

For persistent configuration - which is what you want for production use - edit Claude Code's settings.json. Two paths are supported:

Global (applies to all projects): ~/.claude/settings.json
Project-specific (checked into version control): .claude/settings.json in your project directory

Standard API Key Configuration

Use this when developers authenticate with a TrueFoundry API key (the recommended enterprise pattern, with no Anthropic keys on developer machines):

{
  "env": {
    "ANTHROPIC_BASE_URL": "{GATEWAY_BASE_URL}",
    "ANTHROPIC_AUTH_TOKEN": "your-truefoundry-api-key",
    "ANTHROPIC_MODEL": "anthropic/claude-4-sonnet-20250514",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "anthropic/claude-4-opus-20250514",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "anthropic/claude-4-sonnet-20250514",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "anthropic/claude-3-5-haiku-20241022",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1",
    "ANTHROPIC_CUSTOM_HEADERS": "x-tfy-anthropic-beta: context-management-2025-06-27"
  }
}

What each field does:

ANTHROPIC_BASE_URL — redirects all Claude Code requests to TrueFoundry
ANTHROPIC_AUTH_TOKEN — TrueFoundry API key; authenticates the developer to the gateway (replaces Anthropic API key)
ANTHROPIC_MODEL — the default model for Claude Code sessions
ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL, ANTHROPIC_DEFAULT_HAIKU_MODEL — map Claude Code's built-in model aliases (/model opus, /model sonnet, /model haiku) to your TrueFoundry-configured models
CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS — disables experimental Claude Code features for stable gateway behavior
ANTHROPIC_CUSTOM_HEADERS — forwards the x-tfy-anthropic-beta header to Anthropic for beta features like context management
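
For teams that script developer onboarding, the fields above can be merged into an existing settings.json rather than overwriting it. The following is a minimal sketch under that assumption; apply_gateway_env is a hypothetical helper, and the only behavior it relies on is that Claude Code reads variables from the "env" block shown earlier.

```python
import json
from pathlib import Path


def apply_gateway_env(settings_path: Path, gateway_env: dict[str, str]) -> dict:
    """Merge gateway variables into the "env" block of a settings.json file."""
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    env = settings.setdefault("env", {})
    env.update(gateway_env)  # gateway values win; unrelated keys are preserved
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

A typical call would pass Path.home() / ".claude" / "settings.json" together with a dictionary holding ANTHROPIC_BASE_URL and the model alias variables.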

Important: Claude Code detects model capabilities (extended thinking, ToolSearch, beta tool blocks) by string-matching the model ID. Make sure ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL, and ANTHROPIC_DEFAULT_HAIKU_MODEL contain a recognizable Anthropic model ID like claude-opus-4-7, claude-sonnet-4-6, or claude-haiku-4-5. If you are using TrueFoundry virtual models, make sure the display name includes the underlying model ID (for example, your-account/claude-haiku-4-5) so that the string match succeeds.
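
A pre-flight check can catch alias values that are unlikely to survive that string match. The exact patterns Claude Code matches on are not specified here, so the regular expression below is an assumed heuristic (the ID mentions "claude" followed by a tier name), and both function names are hypothetical.

```python
import re

# Assumed heuristic, not Claude Code's actual matching logic.
_CLAUDE_ID = re.compile(r"claude.*(opus|sonnet|haiku)", re.IGNORECASE)

ALIAS_VARS = (
    "ANTHROPIC_DEFAULT_OPUS_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
)


def looks_like_claude_model(model_id: str) -> bool:
    """True when the ID contains "claude" followed by a known tier name."""
    return _CLAUDE_ID.search(model_id) is not None


def unrecognized_aliases(env: dict[str, str]) -> list[str]:
    """Return the alias variables whose value does not look like a Claude model ID."""
    return [name for name in ALIAS_VARS if name in env and not looks_like_claude_model(env[name])]
```

An alias that deliberately points at a non-Anthropic model will be reported by this check; in that case the report is a reminder that Claude-specific capability detection may not apply to that slot.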

Claude Code Max Subscription Configuration

If your team uses a Claude Code Max subscription, Claude Code reserves the Authorization header for Anthropic account authentication. Pass x-tfy-api-key through ANTHROPIC_CUSTOM_HEADERS instead:

{
  "env": {
    "ANTHROPIC_BASE_URL": "{GATEWAY_BASE_URL}",
    "ANTHROPIC_CUSTOM_HEADERS": "x-tfy-api-key: your-truefoundry-api-key\nX-TFY-LOGGING-CONFIG: {\"enabled\": true}",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "anthropic/claude-4-opus-20250514",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "anthropic/claude-4-sonnet-20250514",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "anthropic/claude-3-5-haiku-20241022"
  }
}

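Why this pattern works for Max users:

You keep your Anthropic Max subscription for Claude Code session authentication, so the Authorization header passes through to Anthropic unchanged.
TrueFoundry authenticates separately through x-tfy-api-key: the gateway governs the requests, and Anthropic handles billing through your subscription.
You get centralized governance (visibility, quotas, RBAC, logging, guardrails) without changing day-to-day Claude Code workflows.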
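
The ANTHROPIC_CUSTOM_HEADERS value in the configuration above packs several headers into one string, one "Name: value" pair per line. A small helper can build that string and avoid hand-escaping JSON; this is a sketch, and format_custom_headers is a hypothetical name.

```python
import json


def format_custom_headers(headers: dict[str, object]) -> str:
    """Render headers as newline-separated "Name: value" pairs."""
    lines = []
    for name, value in headers.items():
        # Non-string values (such as the logging config) are JSON-encoded.
        text = value if isinstance(value, str) else json.dumps(value)
        lines.append(f"{name}: {text}")
    return "\n".join(lines)


value = format_custom_headers({
    "x-tfy-api-key": "your-truefoundry-api-key",
    "X-TFY-LOGGING-CONFIG": {"enabled": True},
})
```

Writing the result into settings.json with json.dumps takes care of escaping the newline and the inner quotes.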

See the TrueFoundry Claude Code documentation for the complete integration guide, and the Claude Code Max integration page for the Max subscription variant.

Step 2: Use GPT-5, Gemini, and Any Other Model Through Claude Code

This is where a Claude Code proxy goes from convenient to transformative. Once Claude Code routes through TrueFoundry, you can access any model from any provider, not just Anthropic. Add a provider account (OpenAI, Google Vertex AI, AWS Bedrock, Azure OpenAI, xAI, or your own on-prem deployment) in the TrueFoundry gateway dashboard, and its models become available on the same gateway endpoint.

Point Claude Code Aliases at Non-Anthropic Models

To use GPT-5 in Claude Code's "opus" slot (the most capable model tier), update the model alias:

{
  "env": {
    "ANTHROPIC_BASE_URL": "{GATEWAY_BASE_URL}",
    "ANTHROPIC_AUTH_TOKEN": "your-truefoundry-api-key",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "openai-main/gpt-5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "anthropic/claude-4-sonnet-20250514",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "google-vertex/gemini-2.5-flash",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
  }
}

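With this configuration:

/model opus → GPT-5 (for complex architecture and planning tasks)
/model sonnet → Claude Sonnet 4 (for standard coding tasks)
/model haiku → Gemini 2.5 Flash (for fast, lightweight tasks such as email validation and quick lookups)

The developer experience is identical. Developers still use /model opus or --model haiku. They do not need to know which provider sits behind each alias, and they do not manage OpenAI or Google credentials.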

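Use Virtual Models for Advanced Routing

TrueFoundry's Virtual Models let you create a single model identifier that routes requests across multiple providers using weight-based, priority-based, or latency-based routing. Point a Claude Code model alias at a virtual model and the gateway handles the routing logic transparently.

Example: priority-based fallback across providers

If your primary Anthropic account hits a rate limit, requests fall back automatically to Claude on Bedrock and then to GPT-4o, without developers noticing.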

routing_config:
  type: priority-based-routing
  load_balance_targets:
    - target: anthropic-main/claude-sonnet-4-20250514
      priority: 0
      fallback_status_codes: ["429", "500", "502", "503"]
    - target: bedrock-main/claude-sonnet-4-20250514
      priority: 1
      fallback_status_codes: ["429", "500"]
    - target: openai-main/gpt-4o
      priority: 2
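
The fallback behavior this configuration describes can be sketched in a few lines. This is an illustration of priority-based routing in general, not the gateway's code: the send callable stands in for a real provider call and returns an HTTP status, status codes are written as integers rather than the strings used in the YAML, and a target with no fallback_status_codes is assumed to never fall back.

```python
from typing import Callable

# Mirrors the routing_config above; an empty set means "never fall back".
TARGETS = [
    {"target": "anthropic-main/claude-sonnet-4-20250514", "priority": 0, "fallback": {429, 500, 502, 503}},
    {"target": "bedrock-main/claude-sonnet-4-20250514", "priority": 1, "fallback": {429, 500}},
    {"target": "openai-main/gpt-4o", "priority": 2, "fallback": set()},
]


def route(send: Callable[[str], int]) -> tuple[str, int]:
    """Try targets in priority order; return the target that answered and its status."""
    ordered = sorted(TARGETS, key=lambda t: t["priority"])
    for entry in ordered:
        status = send(entry["target"])
        is_last = entry is ordered[-1]
        # Stop when the status is not a configured fallback trigger,
        # or when there is no lower-priority target left to try.
        if status not in entry["fallback"] or is_last:
            return entry["target"], status
    raise RuntimeError("no targets configured")
```

Note that a 502 from the Bedrock target is returned to the caller in this sketch, because 502 is not in that target's fallback list.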

Example: weight-based A/B evaluation

Canary a new model on 10% of Claude Code traffic before committing the whole team:

routing_config:
  type: weight-based-routing
  load_balance_targets:
    - target: anthropic-main/claude-4-sonnet-20250514
      weight: 90
    - target: openai-main/gpt-5
      weight: 10

Then point Claude Code's sonnet alias at this virtual model. 10% of Claude Code's sonnet requests go to GPT-5, and you can compare the results using the full cost and quality metrics in the gateway dashboard.
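
To see what a 90/10 split looks like in practice, the weighted selection can be simulated locally. This sketch uses Python's random.choices to pick a target per request; it assumes each request is routed independently, which may not match how the gateway actually distributes traffic.

```python
import random

# Weights copied from the weight-based routing example.
WEIGHTS = {
    "anthropic-main/claude-4-sonnet-20250514": 90,
    "openai-main/gpt-5": 10,
}


def pick_target(rng: random.Random) -> str:
    """Choose one target with probability proportional to its weight."""
    targets = list(WEIGHTS)
    return rng.choices(targets, weights=[WEIGHTS[t] for t in targets], k=1)[0]


rng = random.Random(0)  # seeded so the simulation is repeatable
counts = {target: 0 for target in WEIGHTS}
for _ in range(10_000):
    counts[pick_target(rng)] += 1
print(counts)
```

Over 10,000 simulated requests the GPT-5 share lands close to, but not exactly at, 10%. Small teams should expect the same kind of noise in real canary traffic and size the evaluation window accordingly.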

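Step 3: Enterprise Controls Applied to Every Claude Code Request

Once Claude Code routes through TrueFoundry, every request inherits enterprise-grade governance, not because developers configure it, but because it is enforced at the gateway layer.

Budget Limits: Prevent Cost Overruns Before They Happen

TrueFoundry's hierarchical budget limits take effect before tokens are consumed, not after the month-end invoice arrives. Rules can be stacked and combined.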

Order	Rule ID	Filter	Budget	Per
1	`senior-eng-budget`	Subjects: `team:senior-engineers`	$50/day	User
2	`default-dev-budget`	(matches all)	$10/day	User
3	`opus-monthly-cap`	Models: `anthropic-main/claude-4-opus`	$1000/month	Shared

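Senior engineers get $50 per day. Everyone else defaults to $10 per day. Organization-wide Opus usage is capped at $1000 per month, so the organization-level model budget is never exceeded even when every developer stays within their individual limit.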

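Rate Limiting: Protect On-Prem and Controlled Environments

Figure: TrueFoundry AI Gateway interface showing how to configure rate limiting rules through the Config tab.

Rate limiting at the gateway addresses three scenarios specific to Claude Code:

CI pipelines: Claude Code running in CI should be rate limited independently of interactive developer sessions. A test suite that invokes Claude Code for code review should not exhaust the same quota as a developer's active coding session.
Development vs. production models: Metadata-scoped rate limits let you route environment: dev requests to cheaper models and cap their request rate, without affecting production.
On-prem GPU protection: If you run on-prem models as the primary target for Claude Code, rate limit the on-prem endpoint and burst automatically to cloud APIs when capacity is saturated.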

# Limit Claude Code in CI to 500 requests/day on GPT-4
- id: ci-pipeline-limit
  when:
    models: ['openai-main/gpt-4']
    metadata:
      environment: ci
  limit_to: 500
  unit: requests_per_day
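
A requests_per_day limit such as the one above can be modeled as a fixed-window counter. The sketch below is one simple way to implement such a limit, assuming windows aligned to UTC days; the gateway may use a different algorithm (for example a sliding window), and DailyLimiter is a hypothetical class.

```python
import time
from typing import Optional


class DailyLimiter:
    """Fixed-window counter: at most `limit` requests per UTC day per key."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts: dict[tuple[str, int], int] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        window = (key, int(now // 86_400))  # day index since the Unix epoch
        used = self.counts.get(window, 0)
        if used >= self.limit:
            return False
        self.counts[window] = used + 1
        return True


# One counter per (model, environment) pair, matching the rule's filters.
ci_limiter = DailyLimiter(limit=500)
allowed = ci_limiter.allow("openai-main/gpt-4|ci")
```

Keying the counter on the model and the environment metadata is what keeps CI traffic from consuming the quota of interactive sessions.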

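Cost Attribution: Know Exactly Who Spends How Much on What

Every Claude Code request processed by TrueFoundry is automatically attributed to the authenticated user. The analytics dashboard breaks down cost by developer, team, model, and date, and can be filtered by any metadata tag you pass through the X-TFY-METADATA header.

For teams that use project-based cost allocation, tag Claude Code requests with project_id or feature metadata and every request maps to the right cost center automatically: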

{
  "env": {
    "ANTHROPIC_CUSTOM_HEADERS": "X-TFY-METADATA: {\"team\": \"platform\", \"project_id\": \"infra-2026\"}"
  }
}
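
The cost-center mapping amounts to grouping request costs by a metadata tag. The sketch below shows that aggregation on a few made-up records; the record shape and the cost_by_tag helper are hypothetical and do not reflect the gateway's export format.

```python
from collections import defaultdict

# Hypothetical request records, shaped like rows from a gateway export.
records = [
    {"user": "alice", "cost_usd": 0.42, "metadata": {"team": "platform", "project_id": "infra-2026"}},
    {"user": "bob", "cost_usd": 0.10, "metadata": {"team": "platform", "project_id": "infra-2026"}},
    {"user": "carol", "cost_usd": 0.25, "metadata": {"team": "web"}},
]


def cost_by_tag(rows: list[dict], tag: str) -> dict[str, float]:
    """Sum cost per value of a metadata tag; rows without the tag go to "untagged"."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["metadata"].get(tag, "untagged")] += row["cost_usd"]
    return dict(totals)


by_project = cost_by_tag(records, "project_id")
```

The "untagged" bucket is worth watching: it shows how much spend comes from machines that have not picked up the metadata header yet.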

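All traces are exported through OpenTelemetry to Grafana, Datadog, Splunk, or your existing observability stack.

Step 4: Roll Out to the Entire Engineering Team

A single developer's settings.json is easy. Applying a consistent proxy configuration to every developer in the organization requires a rollout strategy. TrueFoundry supports three approaches.

Option A: MDM-Pushed Managed Settings (Recommended for Enterprises)

Push a managed-settings.json file to every corporate device through your MDM solution (Jamf, Kandji, Mosyle, Intune) and lock it against changes at the OS level. This is Claude Code's endpoint-managed settings approach.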

{
  "model": "sonnet",
  "availableModels": ["sonnet", "haiku"],
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-gateway.internal.corp",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "anthropic/claude-4-opus-20250514",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "anthropic/claude-4-sonnet-20250514",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "anthropic/claude-3-5-haiku-20241022"
  }
}

System-level paths:

macOS: /Library/Application Support/ClaudeCode/managed-settings.json
Linux: /etc/claude-code/managed-settings.json
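
A deployment or audit script needs to pick the right path per operating system. The sketch below covers only the two platforms listed above and raises for anything else instead of guessing; managed_settings_path is a hypothetical helper.

```python
import platform
from pathlib import Path

# Only the two locations listed above; other platforms are intentionally absent.
MANAGED_SETTINGS = {
    "Darwin": Path("/Library/Application Support/ClaudeCode/managed-settings.json"),
    "Linux": Path("/etc/claude-code/managed-settings.json"),
}


def managed_settings_path(system: str = "") -> Path:
    """Return the managed-settings.json location for the given (or current) OS."""
    system = system or platform.system()  # "Darwin" on macOS, "Linux" on Linux
    try:
        return MANAGED_SETTINGS[system]
    except KeyError:
        raise ValueError(f"no managed-settings path listed for {system!r}") from None
```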

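This configuration is tamper-resistant, applies immediately at startup without depending on the network, and requires no action from developers. Every machine that receives the MDM profile is proxied through TrueFoundry automatically.

Option B: Server-Managed Settings via the Anthropic Admin Console

Configure settings centrally in the Claude admin console (Admin Settings → Claude Code → Managed Settings). When developers authenticate with their organization credentials, the settings are delivered from Anthropic's servers. No file deployment is required.

This approach requires no MDM infrastructure and works on BYOD machines. Settings are delivered at authentication time and are harder for users to override.

Option C: Project-Level settings.json Under Version Control

Commit .claude/settings.json to the root of each repository. Developers who clone the repository and run Claude Code in that directory automatically use the project settings, including the TrueFoundry gateway URL and model configuration.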

# Check into your monorepo or template repository
.claude/settings.json

This is the lowest-friction option for teams with a standardized repository structure. New developers inherit the proxy configuration the moment they clone.
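
Because the project file lives in version control, a CI job can verify it on every commit. The sketch below checks that the file exists and points at the expected gateway; it also flags a committed ANTHROPIC_AUTH_TOKEN, which is an assumed policy added here (credentials generally should not be checked in) rather than something the gateway requires. check_project_settings is a hypothetical helper.

```python
import json
from pathlib import Path


def check_project_settings(repo_root: Path, expected_base_url: str) -> list[str]:
    """Return a list of problems; an empty list means the repo is configured."""
    path = repo_root / ".claude" / "settings.json"
    if not path.is_file():
        return [f"{path} is missing"]
    try:
        data = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    env = data.get("env", {}) if isinstance(data, dict) else {}
    problems = []
    if env.get("ANTHROPIC_BASE_URL") != expected_base_url:
        problems.append("ANTHROPIC_BASE_URL does not point at the gateway")
    if "ANTHROPIC_AUTH_TOKEN" in env:
        # Assumed policy: per-developer keys belong in user-level settings.
        problems.append("ANTHROPIC_AUTH_TOKEN must not be committed")
    return problems
```

Under this policy the repository carries only the gateway URL and model aliases, while each developer's key stays in their own ~/.claude/settings.json or environment.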

Step 5: VS Code Extension and Claude Agent SDK

VS Code Extension

Once the CLI is configured, the Claude Code VS Code extension works seamlessly with TrueFoundry. The extension does not work standalone; install and configure the Claude Code CLI first.

# macOS/Linux: Launch VS Code from terminal to inherit shell environment
code .

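The extension uses the CLI configuration (base URL, API key, model aliases) automatically. No separate configuration is needed.

Note for macOS/Linux: GUI applications do not inherit shell environment variables by default. Launch VS Code from a terminal where Claude Code is configured so that the extension reliably picks up ANTHROPIC_BASE_URL.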

Claude Agent SDK

The Claude Agent SDK (the successor to the Claude Code SDK) uses your existing .claude/settings.json to route through TrueFoundry. To load the gateway settings programmatically, pass setting_sources=["project"]:

import asyncio

from claude_agent_sdk import ClaudeAgentOptions, ResultMessage, query


async def main() -> None:
    options = ClaudeAgentOptions(
        setting_sources=["project"],  # Loads .claude/settings.json with TrueFoundry config
        max_turns=5,
        allowed_tools=["Read", "Grep", "Glob"],
    )
    async for message in query(
        prompt="Analyze my codebase for security vulnerabilities",
        options=options,
    ):
        # The run ends with a ResultMessage that carries the final result text
        if isinstance(message, ResultMessage):
            print(message.result)


asyncio.run(main())

Every TrueFoundry configuration, including Anthropic direct, AWS Bedrock, and Google Vertex AI, works exactly the same way with the Agent SDK.

DIY Claude Code Proxy vs. TrueFoundry AI Gateway

Capability	DIY Reverse Proxy	TrueFoundry AI Gateway
Setup time	Days to weeks	Minutes — one env var change
Multi-provider routing	Custom build required	Built-in: Anthropic, OpenAI, Gemini, Bedrock, Azure, on-prem
Per-developer budget limits	Not included	Hierarchical, configurable
Cost attribution dashboard	Custom build required	Built-in with OTEL export
Automatic failover	Custom retry logic per request	Gateway-level, configurable per provider
Guardrails (PII, injection)	Not included	Built-in
RBAC / virtual accounts	Custom build required	Built-in with SSO/SCIM
Semantic caching	Not included	Built-in
Ongoing maintenance	Your team owns it	TrueFoundry-managed (SaaS) or self-hosted
Deployment modes	Self-hosted only	SaaS, hybrid, or fully self-hosted VPC
