
Key Features
Unified API Interface
Call 1000+ LLMs using a single endpoint with unified API interface
API Keys Management
Generate and manage API keys for users/applications
Multimodal Inputs
Support for text, image, and audio inputs across compatible models
Access Control
Fine-grained access control and permissions management
Rate Limiting
Control Models Usage with flexible rate limiting policies per user/model/application
Load Balancing
Distribute requests across multiple model instances based on weight, latency or cost metrics.
Budget Limiting
Control spending and enforce cost limits for users, teams, and models
Guardrails
Content filtering and safety checks to ensure
Observability & Metrics
Opentelemetry compliant metrics and logging for all requests.
Prompt Playground
Centralized prompt playground with versioning and management system
Batch Predictions
Process multiple requests efficiently with batch processing
MCP Registry
Deploy and manage your own MCP servers with TrueFoundry AI Gateway.
Centralized Authn/Authz for all MCP Servers
One API key to access all MCP servers and their tools.
Virtual MCP Servers
Create virtual MCP servers combining specific tools from multiple MCP servers.
Agent Playground
Test Agents by adding tools and models from Playground
Build Agents with unified API for all MCP servers
Connect to MCP Servers with a single API in the gateway.
Rate Limiting and Observability for Tools
Coming Soon
Supported Model Providers
We integrate with 1000+ LLMs through the following providers.















Supported APIs
The following accordions show which features are supported for each provider across different endpoints:Legend:
- ✅ Supported by Provider and Truefoundry
- ❌ Provider by provider, but not by Truefoundry
- - Provider does not support this feature
Chat Completion (/chat/completions)
Chat Completion (/chat/completions)
| Provider | Stream | Non Stream | Tools | JSON Mode | Schema Mode | Prompt Caching | Reasoning | Structured Output |
|---|---|---|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |
| Azure OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |
| Anthropic | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | - |
| Bedrock | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | - |
| Vertex | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ | - |
| Cohere | ✅ | ✅ | ✅ | ✅ | ✅ | - | ✅ | - |
| Gemini | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |
| Groq | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |
| AI21 | ✅ | ✅ | - | ✅ | - | - | - | - |
| Cerebras | ✅ | ✅ | - | ✅ | - | - | ✅ | - |
| SambaNova | ✅ | ✅ | - | ✅ | - | - | ✅ | - |
| Perplexity-AI | ✅ | ✅ | - | ✅ | - | - | ✅ | ✅ |
| Together-AI | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | ✅ |
| xAI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DeepInfra | ✅ | ✅ | ✅ | ✅ | - | ✅ | ✅ | - |
Embedding (/embeddings)
Embedding (/embeddings)
| Provider | String | List of String |
|---|---|---|
| OpenAI | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ |
| Anthropic | - | - |
| Bedrock | ✅ | ✅ |
| Vertex | ✅ | ✅ |
| Cohere | ✅ | ✅ |
| Gemini | - | - |
| Groq | - | - |
| SambaNova | ❌ | ❌ |
| Together-AI | ✅ | ✅ |
| xAI | - | - |
| DeepInfra | ❌ | ❌ |
Image Generation (/images/generations)
Image Generation (/images/generations)
| Provider | Generate |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ✅ |
| Bedrock | ✅ |
| Vertex | ✅ |
| Anthropic | - |
| Cohere | - |
| Gemini | ❌ |
| Groq | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Image Edit (/images/edits)
Image Edit (/images/edits)
| Provider | Edit |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ✅ |
| Bedrock | ✅ |
| Vertex | ✅ |
| Anthropic | - |
| Cohere | - |
| Gemini | ❌ |
| Groq | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Image Variation (/images/variations)
Image Variation (/images/variations)
| Provider | Variation |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | - |
| Bedrock | ✅ |
| Vertex | - |
| Anthropic | - |
| Cohere | - |
| Gemini | ❌ |
| Groq | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Audio Transcription (/audio/transcriptions)
Audio Transcription (/audio/transcriptions)
| Provider | Transcription |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ✅ |
| Anthropic | - |
| Bedrock | - |
| Vertex | ❌ |
| Cohere | - |
| Gemini | ❌ |
| Groq | ✅ |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Audio Translation (/audio/translations)
Audio Translation (/audio/translations)
| Provider | Translation |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ✅ |
| Anthropic | - |
| Bedrock | - |
| Vertex | ❌ |
| Cohere | - |
| Gemini | ❌ |
| Groq | ✅ |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Text To Speech (/audio/speech)
Text To Speech (/audio/speech)
| Provider | Text To Speech |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ✅ |
| Anthropic | - |
| Bedrock | - |
| Vertex | ❌ |
| Cohere | - |
| Gemini | ❌ |
| Groq | ❌ |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Rerank (/rerank)
Rerank (/rerank)
| Provider | Rerank |
|---|---|
| OpenAI | - |
| Azure OpenAI | - |
| Anthropic | - |
| Bedrock | ✅ |
| Vertex | - |
| Cohere | ✅ |
| Gemini | - |
| Groq | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Batch (/batches)
Batch (/batches)
| Provider | Batch |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ❌ |
| Anthropic | ❌ |
| Bedrock | ✅ |
| Vertex | ✅ |
| Cohere | ❌ |
| Gemini | ❌ |
| Groq | ✅ |
| Cerebras | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Fine Tune
Fine Tune
| Provider | Fine Tune |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | - |
| Anthropic | - |
| Bedrock | ❌ |
| Vertex | ✅ |
| Cohere | ❌ |
| Gemini | - |
| Groq | ❌ |
| Cerebras | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Files (/files)
Files (/files)
| Provider | Files |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ❌ |
| Anthropic | ❌ |
| Bedrock | ✅ |
| Vertex | ✅ |
| Cohere | ❌ |
| Gemini | ❌ |
| Groq | ✅ |
| Cerebras | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Moderation (/moderations)
Moderation (/moderations)
| Provider | Moderation |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | - |
| Anthropic | - |
| Bedrock | - |
| Vertex | - |
| Cohere | ❌ |
| Gemini | - |
| Groq | ✅ |
| Cerebras | - |
| Together-AI | ❌ |
| xAI | - |
| DeepInfra | ❌ |
Model Response (/responses)
Model Response (/responses)
| Provider | Model Response |
|---|---|
| OpenAI | ✅ |
| Azure OpenAI | ✅ |
| Anthropic | - |
| Bedrock | - |
| Vertex | - |
| Cohere | - |
| Gemini | - |
| Groq | ❌ |
| Cerebras | - |
| Together-AI | - |
| xAI | - |
| DeepInfra | - |
Completion (/completions)
Completion (/completions)
| Provider | Completion |
|---|---|
| OpenAI | - |
| Azure OpenAI | - |
| Anthropic | - |
| Bedrock | - |
| Vertex | - |
| Cohere | - |
| Gemini | - |
| Groq | - |
| Cerebras | ❌ |
| Together-AI | ✅ |
| xAI | - |
| DeepInfra | ✅ |
Ecosystem & Integrations
Discover how TrueFoundry connects with your favorite AI frameworks and tools to streamline your ML development workflow.- AI Frameworks
- Coding Assistants
- Agent Builder
- Guardrails
- Others
Deployment Options
The Truefoundry AI Gateway can either be used as a SaaS offering or deployed on-premise.- SaaS Offering: You can directly use the gateway as a SaaS offering by signing up on our website, you can find the instructions here.
- Enterprise Deployment for enterprise security and control. You can deploy the gateway in your cloud or on-premise. You can find the architecture and deployment instructions here.
Frequently Asked Questions
What's the performance impact of using the gateway?
What's the performance impact of using the gateway?
The latency overhead is minimal, typically less than 5ms. Our benchmarks show enterprise-grade performance that scales with your needs. Our SaaS offering is hosted in multiple regions across the world to ensure low latency and high availability. You can also deploy the gateway on-premise or on any cloud provider in your region which
is closer to your users.
is closer to your users.

Can I deploy the gateway on-premise?
Can I deploy the gateway on-premise?
Yes, the AI Gateway supports on-premise deployments on any infrastructure or cloud provider, giving you complete control over your AI operations.
How do I integrate my self-hosted models?
How do I integrate my self-hosted models?
You can easily integrate any OpenAI-compatible self-hosted model. Check our self-hosted models guide for detailed instructions.
Can I use the gateway without the full MLOps platform?
Can I use the gateway without the full MLOps platform?
Yes, The AI Gateway can be used as a standalone solution. You can use the full MLOps platform if you’re using features like model deployment(traditional models and LLMs), model training, llm fine-tuning or training/data-processing workflows.





























