Setup

Create a Custom Endpoint provider account
From the TrueFoundry dashboard, go to AI Gateway → Models, open the Custom Endpoints tab, and click Add Custom Endpoint.
In the wizard, complete Configure Account: Name (required), optional Endpoint Type (None, Azure Speech Service, or Other), and optional Header Auth at the account level so integrations can inherit the same default upstream authentication. Use Continue to Endpoints when ready.
To turn the account into a load-balanced pool, expand Advanced and set Routing Type to Weight or Priority, then enter a Slug. The slug becomes the second URL segment for the aggregated endpoint (/proxy-api/<account-name>/<slug>/<upstream-path>). Leave Routing Type as None for the standard per-endpoint flow. See Load balancing across endpoints.
Add an integration (endpoint)
On the Endpoints wizard step, add or edit an endpoint integration (Add Endpoint from the list view follows the same flow).
Set Display Name and Base URL (the upstream origin, which must not end with a trailing slash; see configuration reference). Optionally enable Custom Headers, Header Auth (per-endpoint upstream credentials), or TLS Settings.
If the account has Routing Type set, a Load Balancing Config group appears on each endpoint with Weight (weight mode) or Priority (priority mode), plus Fallback Status Codes and Fallback Candidate. See Per-endpoint load balancing fields.

Make a request
Call the gateway using the URL pattern described under Endpoint structure below. Replace GATEWAY_BASE_URL, providerAccountName, and endpointName with your values, and append any path that should be joined to the integration Base URL.
Callers only need a TrueFoundry API key for the gateway. Upstream authentication is applied by the gateway from your provider account and integration settings; you do not pass upstream secrets from client code.
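As a minimal sketch of such a call, the Python snippet below builds the request with the standard library. The gateway host, account name, endpoint name, upstream path, and API key are placeholder values chosen for illustration, and the helper name build_gateway_request is hypothetical; the URL layout follows the /proxy-api/<account>/<endpoint>/<upstream-path> shape used on this page.

```python
import urllib.parse
import urllib.request


def build_gateway_request(gateway_base_url, provider_account_name, endpoint_name,
                          upstream_path, api_key, query=None, data=None, method="GET"):
    """Build (but do not send) a request to a Custom Endpoint through the gateway."""
    # Encode the endpoint name in case it contains characters the client must escape.
    endpoint_segment = urllib.parse.quote(endpoint_name, safe="")
    url = "/".join([
        gateway_base_url.rstrip("/"),
        "proxy-api",
        provider_account_name,
        endpoint_segment,
        upstream_path.lstrip("/"),
    ])
    if query:
        # Query parameters are forwarded to the upstream unchanged.
        url += "?" + urllib.parse.urlencode(query)
    # Only the TrueFoundry API key is sent; upstream credentials are added by the gateway.
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Placeholder values for illustration only.
request = build_gateway_request(
    "https://gateway.example.com", "my-account", "orders-api",
    "v1/orders", api_key="tfy-api-key-placeholder", query={"limit": "10"},
)
print(request.full_url)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the URL easy to inspect before any traffic reaches the gateway.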

The wizard also includes an Access Control step after Endpoints (who can use this provider account), consistent with other AI Gateway providers — see Gateway access control.
The same manifest fields are available through the TrueFoundry CLI (tfy apply) if you configure provider accounts without the dashboard.
Endpoint structure
Requests use this URL shape (query parameters are forwarded unchanged):
{GATEWAY_BASE_URL}/proxy-api/{providerAccountName}/{endpointName}/{upstream-path}?{query-params}
| Segment | Meaning |
|---|---|
| providerAccountName | Name of your Custom Endpoint provider account (lowercase letters, digits, hyphens; 3–32 characters) |
| endpointName | Integration Display Name used in the URL (letters, digits, -, _, .; no spaces; 2–62 characters; cannot start with a digit) |
| upstream-path | Path appended to the integration’s Base URL |
| query-params | Optional; passed through to the upstream as-is |
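The naming rules in the table can be checked on the client before a name is submitted. The regular expressions below are one reading of those rules, not the product's own validators: the exact patterns enforced by TrueFoundry may be stricter (for example, about the first character of an account name), and the function names are hypothetical.

```python
import re

# Lowercase letters, digits, hyphens; 3 to 32 characters.
ACCOUNT_NAME_RE = re.compile(r"[a-z0-9-]{3,32}")

# Letters, digits, "-", "_", "."; 2 to 62 characters; first character is not a digit.
ENDPOINT_NAME_RE = re.compile(r"[A-Za-z_.-][A-Za-z0-9_.-]{1,61}")


def is_valid_account_name(name):
    """Return True if the name fits the provider account rules in the table."""
    return ACCOUNT_NAME_RE.fullmatch(name) is not None


def is_valid_endpoint_name(name):
    """Return True if the name fits the endpoint (Display Name) rules in the table."""
    return ENDPOINT_NAME_RE.fullmatch(name) is not None


for candidate in ["orders-api", "1orders", "has space"]:
    print(candidate, is_valid_endpoint_name(candidate))
```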
Provider account and integration names must satisfy the patterns above when created. If your HTTP client requires encoding for certain characters in the path segment, URL-encode endpointName accordingly.
Authentication
Gateway authentication: Same as other AI Gateway routes. Send Authorization: Bearer <TrueFoundry API key> (or your deployment’s documented gateway auth).
Upstream authentication: Configure Header Auth (header name/value pairs used as upstream credentials) and optional Custom Headers on the provider account and/or each integration. The gateway adds these to the proxied request; callers never see upstream keys.
Upstream services that expect Bearer or HTTP Basic credentials still use Header Auth: you store the exact header name and value the upstream expects (often Authorization). Examples:
Bearer token
| Header | Value |
|---|---|
| Authorization | Bearer <your-upstream-access-token> |
HTTP Basic
Use an Authorization header whose value is Basic, followed by a space and the Base64 encoding of username:password (UTF-8), with no newline inside that string.
- Build the string username:password (colon between user and password).
- Base64-encode it (standard Base64, padding as needed).
- Set Header Auth to Authorization=Basic <encoded-result>.
Paste Basic <encoded-result> into Header Auth in the UI.
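The steps above can be sketched in Python with the standard library. The username and password are placeholders, and the function name is hypothetical; the printed string is what you would store as the Authorization value in Header Auth.

```python
import base64


def basic_auth_header_value(username, password):
    """Return the value for an Authorization header using HTTP Basic credentials."""
    # Join user and password with a colon and encode the result as UTF-8 bytes.
    credentials = f"{username}:{password}".encode("utf-8")
    # b64encode produces standard, padded Base64 with no embedded newlines.
    encoded = base64.b64encode(credentials).decode("ascii")
    return f"Basic {encoded}"


print(basic_auth_header_value("user", "pass"))  # prints "Basic dXNlcjpwYXNz"
```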
API key in a custom header (e.g. Azure Speech)
| Header | Value |
|---|---|
| Ocp-Apim-Subscription-Key | <subscription-key> |
Load balancing across endpoints
Pool multiple endpoints under the same provider account behind a single aggregated URL to raise aggregate concurrency or to run a primary/backup topology. Typical uses include scaling out replicas of an internal API, fanning out across multiple regional deployments of the same HTTP service, or combining several Azure Speech subscriptions to lift the per-subscription request limit.
How it works
- Set routing_type (weight or priority) and a slug on the provider account, and add a loadbalancing_config to each endpoint integration.
- Call the pool at {GATEWAY_BASE_URL}/proxy-api/{providerAccountName}/{slug}/{upstream-path}. The slug replaces the single endpointName segment.
- Per-endpoint URLs continue to work alongside the slug URL, which is useful for testing one upstream in isolation.
- For each request, the gateway picks an endpoint, proxies the call, and on a response whose status is listed in fallback_status_codes (or on a network error) retries the next eligible endpoint. Repeated failures cool an endpoint down across requests automatically.
- Access is checked at the provider account level in aggregated mode; per-endpoint authorized_subjects lists are not consulted.
The dashboard’s “usage code snippet” helper is only generated for individual endpoints, not for the load-balanced slug endpoint. Build the request URL yourself using the template below; gateway authentication and upstream auth injection work exactly like the per-endpoint flow.
Template:
{GATEWAY_BASE_URL}/proxy-api/{providerAccountName}/{slug}/{upstream-path}
- {slug} is the account’s Slug (not an endpoint’s Display Name).
- {upstream-path} is appended to the chosen endpoint’s base_url, identical to the per-endpoint flow.
- You can copy the snippet from any individual endpoint as a starting point and replace the endpoint name segment with the slug.
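Swapping the endpoint name segment for the slug can also be done programmatically. The sketch below assumes the URL contains a literal proxy-api segment followed by the account name and the endpoint name, as in the template; the host, account, endpoint, and slug values are placeholders, and to_pool_url is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def to_pool_url(endpoint_url, slug):
    """Rewrite a per-endpoint gateway URL so it targets the account's slug instead."""
    parts = urlsplit(endpoint_url)
    segments = parts.path.split("/")
    # Raises ValueError if the URL is not a proxy-api URL.
    marker = segments.index("proxy-api")
    # Expected layout after the marker: <providerAccountName>/<endpointName>/<upstream-path...>
    if len(segments) <= marker + 2:
        raise ValueError("URL has no endpoint name segment to replace")
    segments[marker + 2] = slug
    # Scheme, host, query string, and fragment are kept as they were.
    return urlunsplit(parts._replace(path="/".join(segments)))


per_endpoint = "https://gateway.example.com/proxy-api/my-account/replica-a/v1/orders?limit=10"
print(to_pool_url(per_endpoint, "orders-pool"))
```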
Configuration structure
The following YAML shows the complete shape of a load-balanced custom endpoint provider account. The same fields are available in the dashboard form editor.
Per-endpoint load balancing fields
| Field | Type | Description |
|---|---|---|
| weight | int (0-100) | Traffic share, weight mode only. Weights across all endpoints must sum to 100. |
| priority | int (≥ 0) | Priority, priority mode only. Lower number = higher priority; 0 is highest. |
| fallback_status_codes | string[] | Upstream HTTP statuses that trigger a fallback to the next endpoint. Default: ["401", "403", "404", "408", "429", "500", "502", "503"]. |
| fallback_candidate | bool | If false, the endpoint never receives fallback traffic from another endpoint; it is only used when picked as the primary. Default: true. |
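The constraints on these fields, together with the validation rules stated in this section, can be checked before a manifest is applied. The sketch below is an illustration, not the gateway's validator: it assumes the account is a plain dict, and the endpoints key is a hypothetical name for the list of integrations, while routing_type, slug, loadbalancing_config, weight, and priority are the field names used on this page.

```python
def validate_pool(account):
    """Return a list of rule violations; an empty list means the stated rules pass."""
    routing_type = account.get("routing_type")
    if routing_type is None:
        # Without routing_type, slug and loadbalancing_config are ignored.
        return []
    if routing_type not in ("weight", "priority"):
        return [f"unknown routing_type: {routing_type!r}"]

    errors = []
    endpoints = account.get("endpoints", [])  # hypothetical key name
    if not account.get("slug"):
        errors.append("routing_type requires a slug")
    if len(endpoints) < 2:
        errors.append("routing_type requires at least 2 endpoints")

    configs = [ep.get("loadbalancing_config") or {} for ep in endpoints]
    if routing_type == "weight":
        if any(cfg.get("weight") is None for cfg in configs):
            errors.append("every endpoint needs a weight")
        elif sum(cfg["weight"] for cfg in configs) != 100:
            errors.append("weights must sum to 100")
    elif any(cfg.get("priority") is None for cfg in configs):
        errors.append("every endpoint needs a priority")
    return errors


pool = {
    "routing_type": "weight",
    "slug": "orders-pool",
    "endpoints": [
        {"name": "replica-a", "loadbalancing_config": {"weight": 70}},
        {"name": "replica-b", "loadbalancing_config": {"weight": 20}},
    ],
}
print(validate_pool(pool))  # prints ['weights must sum to 100']
```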
Validation rules: setting routing_type requires slug and at least 2 endpoints. In weight mode, every endpoint needs a weight and the sum across endpoints must equal 100. In priority mode, every endpoint needs a priority. Without routing_type, slug and loadbalancing_config are ignored.
Weight-based routing
Distributes requests across endpoints in proportion to their weight. Best for spreading load across multiple equivalent backends to raise aggregate concurrency.
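To make "in proportion to their weight" concrete, the sketch below simulates a weighted random pick with Python's standard library. This illustrates the general technique only; how the gateway actually selects an endpoint is not specified here, and the endpoint names and weights are placeholders.

```python
import random
from collections import Counter


def pick_weighted(endpoints, rng=random):
    """Pick one endpoint name at random, with probability proportional to its weight."""
    names = [name for name, _ in endpoints]
    weights = [weight for _, weight in endpoints]
    return rng.choices(names, weights=weights, k=1)[0]


# A seeded generator keeps the simulation repeatable.
rng = random.Random(0)
pool = [("east", 70), ("west", 30)]
counts = Counter(pick_weighted(pool, rng) for _ in range(10_000))
print(counts)
```

Over many requests, roughly 70% of picks land on east and 30% on west, which is the behaviour the weights describe.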
Example: pool multiple Azure Speech subscriptions for higher throughput
A single Azure Speech subscription has a fixed per-resource concurrency. Combining several subscriptions or regions behind one slug endpoint multiplies the headroom and lets the gateway fail over on 429.
The pool is called at {GATEWAY_BASE_URL}/proxy-api/azure-speech-pool/tts/v1, with the same SSML body and X-Microsoft-OutputFormat header as the single-endpoint example in Use cases; only the URL changes.
Priority-based routing
Routes every request to the highest-priority healthy endpoint (0 is highest) and falls back to the next on failure. Best for primary/backup topologies.
For example, with two endpoints named primary and backup, requests go to primary while it is healthy; when it returns a fallback status code or is in cooldown, the gateway tries backup.
Fallback and health
- fallback_status_codes: Upstream statuses that cause the gateway to stop on the current endpoint and try the next eligible one in the pool. Statuses outside this list propagate to the caller immediately. Default: ["401", "403", "404", "408", "429", "500", "502", "503"].
- fallback_candidate: When false, the endpoint is excluded from receiving fallback traffic from other endpoints; it is only used when selected as its own primary by the routing strategy.
- Network errors (timeouts, connection failures) always roll over to the next endpoint regardless of fallback_status_codes.
- Automatic cooldown: Repeated failures (401, 403, 429, 5xx) within a short rolling window mark an endpoint unhealthy. Healthy endpoints are tried first; if every endpoint is in cooldown the gateway still tries them as a last resort. Recovery is automatic once errors age out of the window.
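The rules above can be traced with a small simulation of priority routing. It is a sketch of the described behaviour, not the gateway's code: send is a hypothetical callable that returns an upstream status code or raises on a network error, and cooled_down is a plain set of endpoint names standing in for the gateway's health tracking.

```python
DEFAULT_FALLBACK_STATUS_CODES = ["401", "403", "404", "408", "429", "500", "502", "503"]


def call_with_fallback(endpoints, send, cooled_down=frozenset()):
    """Try endpoints in order and return (endpoint_name, status, attempts)."""
    # Healthy endpoints come first; within each group, lower priority number wins.
    ordered = sorted(endpoints, key=lambda ep: (ep["name"] in cooled_down, ep["priority"]))
    attempts = 0
    last = None
    for index, ep in enumerate(ordered):
        # Endpoints with fallback_candidate=False are only used as the first choice.
        if index > 0 and not ep.get("fallback_candidate", True):
            continue
        attempts += 1
        try:
            status = send(ep["name"])
        except (ConnectionError, TimeoutError):
            # Network errors always roll over to the next endpoint.
            last = (ep["name"], None, attempts)
            continue
        last = (ep["name"], status, attempts)
        codes = ep.get("fallback_status_codes", DEFAULT_FALLBACK_STATUS_CODES)
        if str(status) not in codes:
            # Success, or a status outside the list, goes straight back to the caller.
            return last
    # Every eligible endpoint failed; report the last attempt.
    return last


def fake_send(name):
    """Stand-in upstream: the primary is rate limited, the backup answers normally."""
    return 429 if name == "primary" else 200


pool = [{"name": "backup", "priority": 1}, {"name": "primary", "priority": 0}]
print(call_with_fallback(pool, fake_send))  # prints ('backup', 200, 2)
```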
Use cases
Configuration reference
Provider account (provider-account/custom-endpoint)
| Field | Description |
|---|---|
| Name | Unique account identifier (lowercase letters, digits, hyphens; 3–32 characters); used in the gateway URL as providerAccountName |
| Endpoint Type | Optional: azure-speech-service or other (used for tracking / defaults in the product) |
| Routing Type | Optional: weight or priority. Turns the account into a load-balanced pool. Leave unset for the standard per-endpoint flow. See Load balancing across endpoints |
| Slug | Required when Routing Type is set. Pattern: letters, digits, -, _, .; 2–62 characters; cannot start with a digit. Becomes the second URL segment for the aggregated endpoint (/proxy-api/<account>/<slug>/<upstream-path>) |
| Header Auth | Optional default upstream auth: header-based credentials (type: header plus a Headers map), applied when an integration has no Header Auth of its own |
| Collaborators | Who can manage or use this account — see Gateway access control |
Integration (endpoint)
| Field | Description |
|---|---|
| Display Name | Identifies the endpoint in the UI and in the URL as endpointName (pattern: letters, digits, -, _, .; no spaces; cannot start with a digit) |
| Base URL | HTTPS (or HTTP) origin without a trailing slash; the gateway appends upstream-path from the request |
| Custom Headers | Optional key/value headers merged into every upstream request (often under Advanced) |
| Header Auth | Optional upstream credentials as header key/value pairs; when set, replaces provider-account Header Auth for this integration |
| TLS Settings | Optional Reject Unauthorized toggle and optional Custom CA Certificates text for upstream TLS verification |
| Load Balancing Config | Required on every endpoint when the parent account has Routing Type set. See Per-endpoint load balancing fields for the sub-field reference |
| Access Control | Optional authorized_subjects for per-subject allow lists — commonly edited via API or manifest rather than the default UI. Not consulted when calling the account’s aggregated slug URL (access is checked at the provider-account level) |
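The interaction between account-level and integration-level Header Auth described in the two tables can be sketched as follows. The override rule (integration Header Auth replaces the account's) comes from the tables; letting Header Auth win over a Custom Header with the same name is an assumption made for this illustration, and the function name is hypothetical.

```python
def effective_upstream_headers(account_header_auth, integration_header_auth, custom_headers):
    """Compute the headers a pooled or single endpoint would add to the upstream request."""
    # Integration Header Auth, when set, replaces the provider-account default.
    auth = integration_header_auth if integration_header_auth else account_header_auth
    headers = dict(custom_headers or {})
    # Assumption: on a name clash, the auth header takes precedence over a custom header.
    headers.update(auth or {})
    return headers


account_auth = {"Authorization": "Bearer <account-default-token>"}
integration_auth = {"Ocp-Apim-Subscription-Key": "<subscription-key>"}
custom = {"X-Team": "speech"}

print(effective_upstream_headers(account_auth, None, custom))
print(effective_upstream_headers(account_auth, integration_auth, custom))
```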
Tracing
All traffic through Custom Endpoints is traced with span type CustomEndpoint on the gateway trace / root span. To browse requests in the dashboard, see Request Logging. For general tracing concepts (traces, spans, attributes), see the LLM tracing overview.
Aggregated (load-balanced) requests carry extra attributes on the same span so you can see which pooled endpoint served each request and how often fallback occurred:
| Attribute | Meaning |
|---|---|
| custom_proxy_routing_type | weight or priority |
| custom_proxy_slug | The account slug used in the request URL |
| custom_proxy_endpoint_name | The endpoint that actually served the request |
| custom_proxy_target_was_cooled_down | true if the request had to use an endpoint in cooldown (all targets unhealthy) |
| loadbalance_target_attempt_count | Number of endpoints attempted before success or final failure |
| loadbalance_first_target | First endpoint tried for this request |
| loadbalance_final_target | Endpoint that produced the final response |
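These attributes make it possible to summarize pool behaviour from exported traces. The sketch below assumes each span's attributes have already been exported as a Python dict with an integer attempt count and a boolean cooldown flag; the export mechanism and the sample values are hypothetical, while the attribute names are the ones in the table.

```python
from collections import Counter


def summarize_spans(spans):
    """Summarize which endpoints served traffic and how often fallback occurred."""
    served_by = Counter(span["custom_proxy_endpoint_name"] for span in spans)
    # More than one attempt means at least one fallback happened for that request.
    fallbacks = sum(1 for span in spans if span["loadbalance_target_attempt_count"] > 1)
    cooled = sum(1 for span in spans if span.get("custom_proxy_target_was_cooled_down"))
    total = len(spans)
    return {
        "served_by": dict(served_by),
        "fallback_rate": fallbacks / total if total else 0.0,
        "cooled_down_requests": cooled,
    }


# Made-up sample data for illustration.
sample_spans = [
    {"custom_proxy_endpoint_name": "east", "loadbalance_target_attempt_count": 1},
    {"custom_proxy_endpoint_name": "east", "loadbalance_target_attempt_count": 1},
    {"custom_proxy_endpoint_name": "west", "loadbalance_target_attempt_count": 2},
    {"custom_proxy_endpoint_name": "west", "loadbalance_target_attempt_count": 1,
     "custom_proxy_target_was_cooled_down": True},
]
print(summarize_spans(sample_spans))
```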