What is an AI Gateway? Core Concepts and Guide

September 11, 2024

Introduction

As organizations adopt LLMs across workflows, managing access, performance, cost, and compliance becomes a critical challenge. AI Gateways solve this problem by acting as a centralized control plane for all LLM usage, standardizing how teams query, monitor, and scale models in production. They unify multiple providers (like OpenAI, Anthropic, Mistral, and open-source LLMs) under a single API, enforce authentication policies, track usage, and enable cost attribution. TrueFoundry’s AI Gateway is one such enterprise-grade solution designed for modern GenAI applications, offering observability, rate limiting, prompt versioning, and more, helping businesses deploy AI reliably, securely, and at scale.

What is an AI Gateway?

An AI Gateway is an abstraction layer that unifies access to multiple Large Language Models (LLMs) through a single API interface. It provides a consistent, secure, and optimized way to interact with models across providers such as OpenAI, Anthropic, Cohere, Together.ai, or open-source models like Mistral and LLaMA 2 deployed on your own infrastructure.

At its core, an AI Gateway handles the heavy lifting of integrating, routing, authenticating, and monitoring LLM usage across different endpoints. Instead of dealing with multiple SDKs, authentication tokens, rate limits, and pricing models, teams can route all model requests through the Gateway. This streamlines development and enables governance at scale.
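In practice, routing through the gateway usually means pointing an existing OpenAI-compatible client at the gateway's endpoint instead of at a provider. The sketch below assumes a gateway that exposes the OpenAI API format, as TrueFoundry's does; the base URL, API key, and model identifier are placeholders, not real values.

```python
# pip install openai
from openai import OpenAI

# One client for every provider: point the OpenAI SDK at the gateway
# rather than at api.openai.com. The base URL and key are hypothetical.
client = OpenAI(
    base_url="https://your-gateway.example.com/v1",
    api_key="YOUR_GATEWAY_KEY",  # a gateway-issued key, not a provider key
)

response = client.chat.completions.create(
    model="openai/gpt-4o",  # provider-qualified name; exact convention varies by gateway
    messages=[{"role": "user", "content": "Explain AI gateways in one sentence."}],
)
print(response.choices[0].message.content)
```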

TrueFoundry’s AI Gateway is built for enterprise-grade performance and observability. It allows teams to:

  • Route requests to the best model based on latency, cost, or use case
  • Automatically retry failed calls and cache responses to save costs
  • Define per-user or per-team rate limits and quotas
  • Track usage metrics, latencies, and cost at granular levels
  • Enforce fine-grained access control through API keys or tokens
  • Version prompts for consistent and reproducible outputs
  • Capture and monitor input/output data for debugging and improvement

In addition, the Gateway supports streaming and non-streaming modes, tool calling (function calling), prompt templating, and tagging for team-level cost breakdowns. With built-in observability, TrueFoundry enables tracking of not just latency and token usage but also user-specific access, traffic trends, and per-endpoint performance.
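For instance, switching between non-streaming and streaming modes is a one-flag change with the client from the sketch above; the gateway proxies the provider's stream back to the caller.

```python
# Streaming through the gateway, reusing the `client` from the earlier sketch
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Draft a short product update."}],
    stream=True,  # tokens arrive incrementally instead of as one final response
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```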

As LLM usage grows across teams, use cases, and environments, an AI Gateway becomes the foundation for operationalizing generative AI in production. It provides control, visibility, and optimization across the entire lifecycle of LLM interactions.

Key Features of an AI Gateway

An AI Gateway brings a structured and scalable approach to managing LLM usage across teams and environments. Below are the key features that make it essential for modern GenAI workflows:

Unified Access: AI Gateways offer a single API interface to access multiple LLMs across vendors like OpenAI, Anthropic, or in-house models. This eliminates the need to manage individual APIs, SDKs, or keys for each provider.

Authentication and Authorization: AI Gateways enforce secure access through centralized key management. Developers receive scoped API keys while root keys remain protected, integrated with secret managers like AWS SSM, Google Secret Manager, or Azure Key Vault.

Role-Based Access Control (RBAC): Ensures that only authorized users can access specific models or actions, aligning with enterprise security standards.

Performance Monitoring: Track latency, error rates, and token throughput for each model endpoint. This helps detect issues early, optimize routing, and maintain SLAs.

Usage Analytics: Detailed logs and dashboards show who used which model, when, and how, offering transparency across projects and enabling cost attribution per user, team, or feature.

Cost Management: Gateways track token-level usage and associate costs with users, teams, or endpoints. This provides clear visibility into spend patterns and helps prevent cost overruns.

API Integrations: Support for external APIs and tools such as evaluation pipelines, prompt guardrails, or vector databases enables seamless integration with broader AI/ML ecosystems.

Custom Model Support: Users can bring their own fine-tuned or proprietary models into the Gateway, routing traffic alongside commercial models.

Caching: Store and reuse identical or similar LLM responses to save tokens and reduce latency.

Routing and Fallbacks: Intelligent request routing based on latency, cost, or reliability. Includes fallback mechanisms and auto-retries to improve resiliency.

Rate Limiting and Load Balancing: Supports user-level quotas, rate limiting, and load balancing across model providers for optimal throughput and stability.
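To make the rate-limiting idea concrete, here is a minimal client-side token-bucket sketch. It only illustrates the concept; a production gateway enforces these limits server-side, per key, user, or team.

```python
import time

class TokenBucket:
    """Toy token bucket: refill at a fixed rate, spend one token per request."""

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_per_sec=5, capacity=10)  # ~5 req/s with bursts of 10
if bucket.allow():
    print("forward the request to the model provider")
else:
    print("reject with HTTP 429 (rate limit exceeded)")
```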

How to Evaluate an AI Gateway

Evaluating an AI Gateway requires a comprehensive assessment of its capabilities across access control, model integration, observability, and cost governance. A robust AI Gateway should simplify model usage while ensuring scalability, performance, and security for production-grade applications.

Authentication and Authorization

A strong AI Gateway centralizes API key management by issuing individual keys to each user or service while safeguarding root keys using secret managers like AWS SSM, Google Secret Manager, or Azure Key Vault.
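As one illustration of keeping root credentials out of application code, a service can read its gateway key from AWS SSM Parameter Store at startup rather than hardcoding it; the parameter name below is a made-up example.

```python
# pip install boto3
import boto3

# Fetch the gateway key from AWS SSM Parameter Store (a SecureString
# parameter). "/prod/ai-gateway/api-key" is a hypothetical name.
ssm = boto3.client("ssm")
param = ssm.get_parameter(Name="/prod/ai-gateway/api-key", WithDecryption=True)
gateway_api_key = param["Parameter"]["Value"]
```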

TrueFoundry’s Gateway allows administrators to manage fine-grained access to all integrated models, whether self-hosted or third-party, via a unified admin interface. Access control configurations are tracked in versioned YAML files, ensuring auditability and compliance.

Unified API and Code Generation

The AI Gateway should offer a standardized interface for interacting with multiple models. TrueFoundry follows the OpenAI request-response format, making it compatible with LangChain and OpenAI SDKs. Developers can switch between models without modifying their code. TrueFoundry also provides auto-generated code snippets for different providers and programming languages, simplifying integration.
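Concretely, because requests follow the OpenAI format, swapping providers reduces to changing the model string. The identifiers below are illustrative, and `client` is the gateway-pointed client from the earlier sketch.

```python
# Same request, three providers: only the model string changes.
for model in ["openai/gpt-4o", "anthropic/claude-3-5-sonnet", "mistralai/mistral-large"]:
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in five words."}],
    )
    print(model, "->", reply.choices[0].message.content)
```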

Model Selection

TrueFoundry supports three key routes for model access: third-party providers (like OpenAI, Cohere, AWS Bedrock, and Anthropic), self-hosted open-source models (deployed via Hugging Face or custom infrastructure), and TrueFoundry-hosted models shared across clients. This flexibility enables teams to mix and match models based on use case, budget, or latency requirements.

Performance Monitoring

To ensure reliability, the Gateway should monitor latency, error rates, throughput, and inference failures. TrueFoundry captures key metrics like request latency, token throughput, and inference failure rates, making it easy to identify performance bottlenecks through real-time dashboards.

Usage Analytics

Understanding how, when, and by whom models are used is critical for governance. TrueFoundry logs detailed request and response activity, token consumption, and cost per model. These insights help teams manage workloads and optimize usage patterns.

Cost Management

The Gateway should log costs from all model interactions, whether hosted internally or through commercial APIs. TrueFoundry provides full visibility into model usage costs across users, teams, and projects. Integrated dashboards allow organizations to track spend, configure alerts, and apply rate limits or budget caps to control overages.
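As a rough sketch of per-request cost attribution, spend can be derived from the token counts the gateway records on each response. The per-1K-token prices below are hypothetical; real dashboards use current provider price sheets.

```python
# Hypothetical per-1K-token prices (USD); substitute real pricing.
PRICES = {
    "openai/gpt-4o": {"input": 0.0025, "output": 0.0100},
}

def request_cost(model: str, usage) -> float:
    """Estimate spend from the token usage attached to a response."""
    p = PRICES[model]
    return (usage.prompt_tokens / 1000) * p["input"] \
         + (usage.completion_tokens / 1000) * p["output"]

# `response` is the completion from the earlier sketch.
print(f"${request_cost('openai/gpt-4o', response.usage):.6f}")
```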

Advanced Features of an AI Gateway

Advanced features in an AI Gateway determine how effectively it can operate in real-world, production-scale environments. TrueFoundry’s AI Gateway brings a rich set of capabilities that optimize performance, improve reliability, and seamlessly integrate with broader systems, making it enterprise-ready from day one.

Model Caching

Caching helps reduce latency and save costs by avoiding redundant model calls. TrueFoundry supports both exact match caching (for identical prompts) and semantic caching (for similar meaning queries), which enhances speed without compromising on relevance. You can configure cache expiration policies and manually invalidate outdated entries when needed. This ensures that the gateway serves fast, accurate, and up-to-date responses.

  • Caching Modes Supported: Exact Match and Semantic Caching, with configurable expiry and invalidation.
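A minimal sketch of the exact-match case: key the cache on a hash of the model and messages, and skip the model call on a hit. Semantic caching would instead embed the prompt and look for cached entries above a similarity threshold.

```python
import hashlib
import json

cache: dict[str, str] = {}  # in production, a shared store like Redis with TTLs

def cache_key(model: str, messages: list) -> str:
    """Exact-match key: hash of the model plus the normalized messages."""
    blob = json.dumps([model, messages], sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def cached_completion(model: str, messages: list) -> str:
    key = cache_key(model, messages)
    if key in cache:
        return cache[key]  # cache hit: no tokens spent, near-zero latency
    reply = client.chat.completions.create(model=model, messages=messages)
    text = reply.choices[0].message.content
    cache[key] = text
    return text
```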

Intelligent Routing and Reliability

For production-critical applications, the gateway automatically routes traffic to alternative models if the primary one fails, ensuring uninterrupted service. Automatic retries help recover from transient errors without user intervention. Built-in rate limiting helps enforce quotas and prevent overuse, while load balancing distributes traffic across multiple models or providers to maintain optimal throughput and minimize latency.

  • Routing Enhancements: Fallbacks, auto-retries, rate limiting, and load balancing.
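The underlying pattern looks roughly like this sketch: try each model in priority order and retry transient failures with backoff. The gateway runs equivalent logic server-side, so callers never write it themselves.

```python
import time

def complete_with_fallback(messages,
                           models=("openai/gpt-4o", "anthropic/claude-3-5-sonnet"),
                           retries=2):
    """Try each model in order; back off and retry on transient errors."""
    for model in models:
        for attempt in range(retries):
            try:
                return client.chat.completions.create(model=model, messages=messages)
            except Exception:
                time.sleep(2 ** attempt)  # simple exponential backoff
    raise RuntimeError("all models and retries exhausted")
```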

Tool Calling (Simulated Function Invocation)

TrueFoundry’s Gateway supports tool calling by simulating interactions with external APIs. While the actual function is not executed by the gateway, the model can return structured outputs representing the intended tool call. This is ideal for building workflows where LLMs need to decide when and how to invoke tools, enabling developers to design and test these behaviors safely.

  • Tool Simulation: Structured output for modeled API/function calls, without actual execution.
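In the OpenAI-style format the gateway follows, a tool-calling request declares a JSON schema and gets back a structured, unexecuted call. The `get_weather` tool below is hypothetical.

```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool; the gateway never runs it
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

reply = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# The model returns the intended call as structured output; executing it
# (and feeding the result back) is the application's responsibility.
for call in reply.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```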

Multimodal Support

Modern applications often involve more than just text. The Gateway supports multimodal inputs such as text and images within the same request, which unlocks use cases like document Q&A, visual search, or customer support enriched with screenshots or product photos. This makes the AI Gateway suitable for both traditional NLP and next-gen AI applications that require context from multiple data formats.

  • Multimodal Inputs: Combine text, images, and structured data in a single request.
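In the OpenAI-compatible format, a multimodal request simply mixes text and image parts within one message; the model name and image URL below are placeholders.

```python
reply = client.chat.completions.create(
    model="openai/gpt-4o",  # any vision-capable model exposed by the gateway
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What product is shown in this screenshot?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
)
print(reply.choices[0].message.content)
```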

API Integrations and Ecosystem Connectivity

TrueFoundry enables deep integration with your existing stack. You can plug in observability tools like Prometheus and Grafana for real-time monitoring, implement safety layers using Guardrails AI or NeMo Guardrails, and evaluate model quality continuously using Arize or MLflow. This connected ecosystem ensures that your AI system is not just performant, but also safe, transparent, and continuously improving.

  • Ecosystem Integration: Monitoring, guardrails, and evaluation frameworks built in.

Benefits of an AI Gateway

An AI Gateway delivers significant operational, financial, and engineering advantages for organizations integrating large language models (LLMs) into their products and workflows. It acts as a control plane for AI consumption, providing a consistent interface, enforcing security, and optimizing performance at scale.

Centralized Access and Governance

When multiple teams or applications need to interact with different LLM providers, managing individual keys, tokens, and access rights becomes complex. An AI Gateway centralizes access control, enabling role-based permissions, audit logging, and secure key management.

Example: A global enterprise deploying AI features across marketing, product, and support teams uses an AI Gateway to assign scoped API keys and restrict each team’s access to specific models, reducing the risk of accidental misuse or data leakage.

Cost Transparency and Budget Control

LLMs can become a significant operational cost, especially with growing usage across teams. AI Gateways provide fine-grained cost tracking by user, team, or project. This visibility helps organizations manage budgets, identify inefficiencies, and introduce chargeback models where appropriate.

Example: A SaaS company offering AI-powered features to its customers monitors usage via the gateway and uses the data to implement tiered pricing based on actual token consumption.

Seamless Model Switching and Abstraction

The unified API layer allows organizations to swap LLMs or providers without modifying application code. This makes it easier to test new models, negotiate better pricing, or shift from commercial to open-source deployments.

Example: A startup initially using a commercial LLM transitions to a fine-tuned open-source model for data privacy and cost savings, without changing their codebase, thanks to the gateway abstraction.

Improved Reliability and Resilience

Gateways offer built-in fallbacks, automatic retries, caching, and load balancing to ensure uninterrupted service and consistent performance, even under load or during provider outages.

Example: A high-traffic chatbot system handles sudden traffic spikes by dynamically routing requests across multiple providers while falling back to cached responses when needed.

Compliance and Observability

For regulated industries, the ability to track and audit model usage is critical. AI Gateways integrate with monitoring, logging, and security tooling to meet compliance standards and internal governance policies.

Example: A healthcare company logs every request and response through the gateway, enabling complete traceability for audit purposes while maintaining data access boundaries.

Conclusion

As organizations scale their use of large language models, the need for a secure, reliable, and efficient interface becomes critical. An AI Gateway serves as that foundational layer, abstracting away the complexity of managing multiple providers, enforcing access controls, tracking costs, and ensuring performance at scale. It empowers teams to experiment, deploy, and monitor LLM-powered applications with confidence and control.

Whether you're building internal copilots, customer-facing chat interfaces, or multimodal AI workflows, an AI Gateway helps standardize infrastructure while remaining flexible enough to support evolving model ecosystems. Features like caching, routing, cost attribution, and tool calling further extend its value for enterprise-grade deployments.

In a rapidly changing AI landscape, adopting an AI Gateway is not just a convenience; it’s a strategic investment in operational maturity, observability, and long-term scalability.
