Working with Multi Modal

The API supports several media types: images, audio, video, and PDF.

Supported Providers: OpenAI, Bedrock, Anthropic, Google Vertex, Google Gemini

Send images as part of your chat completion requests using either URLs or base64 encoding:

Using Image URLs

from openai import OpenAI

client = OpenAI(
    api_key="your_truefoundry_api_key",
    base_url="{GATEWAY_BASE_URL}"
)

response = client.chat.completions.create(
    model="openai-main/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg"
                    }
                }
            ]
        }
    ]
)

Using Base64 Encoded Images

import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="openai-main/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{encode_image('image.jpeg')}"
                    }
                }
            ]
        }
    ]
)
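
The base64 example above hardcodes `image/jpeg` in the data URI. When the image type varies, the MIME type in the URI should match the file. The following is a minimal sketch of one way to do that with Python's standard `mimetypes` module; `image_data_uri` and `image_part` are hypothetical helper names, not part of the gateway or the OpenAI SDK.

```python
import base64
import mimetypes


def image_data_uri(filename: str, data: bytes) -> str:
    """Build a data URI for an image, guessing the MIME type from the filename."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {filename!r}")
    encoded = base64.b64encode(data).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def image_part(filename: str, data: bytes) -> dict:
    """Wrap the data URI in the image_url content part used in the examples above."""
    return {"type": "image_url", "image_url": {"url": image_data_uri(filename, data)}}
```

The returned dictionary can be placed in the `content` list of a user message, next to the text part.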

Supported Providers: OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, xAI

The detail parameter in the image_url object controls the resolution at which images are processed. This lets you balance response quality against latency and cost.

Supported Values: low, high, auto

Example Usage

import base64

from openai import OpenAI

API_KEY = "your_truefoundry_api_key"
BASE_URL = "{GATEWAY_BASE_URL}"

# Read and encode the image as base64
with open("test-img.png", "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL
)

response = client.chat.completions.create(
    model="test-123/gemini-3-pro-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{base64_image}",
                        "detail": "low"  # Options: "low", "high", "auto"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message)

For Google Gemini and Vertex AI providers, the detail parameter is automatically translated to the mediaResolution parameter:
  • "low" → MEDIA_RESOLUTION_LOW (64 tokens)
  • "high" → MEDIA_RESOLUTION_HIGH (256+ tokens with scaling)
  • "auto" or omitted → No explicit media resolution (model decides)
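
The mapping listed above can be written out as a small lookup. This is purely illustrative: the gateway performs the translation itself, and `DETAIL_TO_MEDIA_RESOLUTION` and `media_resolution_for` are hypothetical names used only to make the three cases concrete.

```python
from typing import Optional

# Mirrors the list above; None means no explicit media resolution is sent
# and the model decides.
DETAIL_TO_MEDIA_RESOLUTION = {
    "low": "MEDIA_RESOLUTION_LOW",
    "high": "MEDIA_RESOLUTION_HIGH",
    "auto": None,
}


def media_resolution_for(detail: Optional[str]) -> Optional[str]:
    """Return the mediaResolution value for a detail setting, or None if unset."""
    if detail is None:
        # An omitted detail behaves like "auto".
        return None
    if detail not in DETAIL_TO_MEDIA_RESOLUTION:
        raise ValueError(f"Unsupported detail value: {detail!r}")
    return DETAIL_TO_MEDIA_RESOLUTION[detail]
```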

Supported Models: Google Gemini models (Gemini 2.0 Flash, etc.)

Send audio files in supported formats (MP3, WAV, etc.). Audio input is currently supported for Google Gemini models:

Using Audio URLs

response = client.chat.completions.create(
    model="internal-google/gemini-2-0-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe this audio"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/audio.wav",
                        "mime_type": "audio/wav" # required for gemini models
                    }
                }
            ]
        }
    ]
)

Using Base64 Encoded Audio

import base64

def encode_audio(audio_path):
    with open(audio_path, "rb") as audio_file:
        return base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="internal-google/gemini-2-0-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe this audio"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:audio/wav;base64,{encode_audio('audio.wav')}"
                    }
                }
            ]
        }
    ]
)
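
The audio examples above reuse the `image_url` content part and add a `mime_type` field. To avoid repeating that nested dictionary, the part can be built by a small helper. This is a sketch that only mirrors the request shape shown in the examples; `media_part` and `user_message` are hypothetical names.

```python
def media_part(url: str, mime_type: str) -> dict:
    """Build a media content part in the shape used by the examples above."""
    return {
        "type": "image_url",
        "image_url": {
            "url": url,
            "mime_type": mime_type,  # the examples mark this as required for Gemini models
        },
    }


def user_message(prompt: str, *parts: dict) -> dict:
    """Combine a text prompt with one or more media parts into a user message."""
    return {
        "role": "user",
        "content": [{"type": "text", "text": prompt}, *parts],
    }
```

For example, `user_message("Transcribe this audio", media_part("https://example.com/audio.wav", "audio/wav"))` produces the same message as the URL example above and can be passed in the `messages` list.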

Supported Models: Google Gemini models (Gemini 2.0 Flash, etc.)

Video processing is natively supported for Google Gemini models:

Using Video URLs

response = client.chat.completions.create(
    model="internal-google/gemini-2-0-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what's happening in this video"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://www.youtube.com/watch?v=example",
                        "mime_type": "video/mp4" # required for gemini models
                    }
                }
            ]
        }
    ]
)

Using Base64 Encoded Video

import base64

def encode_video(video_path):
    with open(video_path, "rb") as video_file:
        return base64.b64encode(video_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="internal-google/gemini-2-0-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what's happening in this video"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:video/mp4;base64,{encode_video('video.mp4')}",
                        "mime_type": "video/mp4" # required for gemini models
                    }
                }
            ]
        }
    ]
)

Supported Providers: OpenAI, Bedrock, Anthropic, Google Vertex, Google Gemini

PDF document processing allows models to analyze and extract information from PDF files:

Using Base64 Encoded PDF

from openai import OpenAI

client = OpenAI(
    api_key="your_truefoundry_api_key",
    base_url="{GATEWAY_BASE_URL}"
)

import base64

# Read the PDF and encode its bytes as base64 text
with open("sample.pdf", "rb") as pdf_file:
    file_data = base64.b64encode(pdf_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="tfy-ai-anthropic/claude-4-sonnet-20250514",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "what's the data in the file"},
                {
                    "type": "file",
                    "file": {
                        "filename": "sample.pdf",
                        "file_data": f"data:application/pdf;base64,{file_data}",
                    }
                },
            ]
        }
    ]
)

print(response.choices[0].message.content)
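
The `file` content part above can also be built from raw bytes by a helper, which keeps the encoding step and the data URI in one place. This is a minimal sketch, assuming the request shape shown above; `pdf_part` is a hypothetical name, and the header check is an optional safeguard added here, not something the gateway requires.

```python
import base64


def pdf_part(filename: str, data: bytes) -> dict:
    """Build a file content part carrying a base64-encoded PDF."""
    # PDF files begin with the marker "%PDF-"; reject anything else early.
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"{filename!r} does not look like a PDF file")
    encoded = base64.b64encode(data).decode("utf-8")
    return {
        "type": "file",
        "file": {
            "filename": filename,
            "file_data": f"data:application/pdf;base64,{encoded}",
        },
    }
```

The result can be appended to the `content` list of a user message in place of the inline dictionary.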

Vision

TrueFoundry supports vision models from all integrated providers as they become available. These models can analyze and interpret images alongside text, enabling multimodal AI applications.
Provider | Models
OpenAI | gpt-4-vision-preview, gpt-4o, gpt-4o-mini
Anthropic | claude-3-sonnet, claude-3-haiku, claude-3-opus, claude-3.5-sonnet, claude-3.5-haiku, claude-4-opus, claude-4-sonnet, claude-3-7-sonnet
Gemini | gemini-1.0-pro-vision, gemini-1.5-flash, gemini-1.5-flash-8b, gemini-1.5-pro, gemini-2.5-pro, gemini-2.5-flash
AWS Bedrock | anthropic.claude-3-5-sonnet, anthropic.claude-3-5-haiku, anthropic.claude-3-5-sonnet-20240620-v1:0
Azure OpenAI | gpt-4-vision-preview, gpt-4o, gpt-4o-mini
xAI | grok-2-vision-1212

Using Vision Models with OpenAI SDK

from openai import OpenAI

client = OpenAI(
    api_key="your_truefoundry_api_key",
    base_url="{GATEWAY_BASE_URL}"
)

response = client.chat.completions.create(
    model="openai-main/gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message)