
Exporting LLM Gateway Traces to Traceloop with OpenTelemetry


TrueFoundry AI Gateway exports OpenTelemetry traces to Traceloop over OTLP/HTTP using the https://api.traceloop.com/v1/traces endpoint and a Bearer token in the Authorization header. Every LLM request that passes through the gateway produces a span tree that lands in the Traceloop dashboard without any changes to application code or deployment topology.

This post covers the trace generation path inside the TrueFoundry AI Gateway and how Traceloop ingests and surfaces that data. It also describes the configuration surface and the data privacy controls available at the gateway level.

How the Gateway Generates Traces

The TrueFoundry AI Gateway is built on the Hono framework and runs as a stateless pod handling over 250 requests per second on a single vCPU with approximately 3 ms of added latency per request. The gateway operates in a split architecture where a control plane manages configuration and one or more gateway pods process inference traffic.

When a request arrives, the gateway executes the following sequence in the hot path:

  1. JWT validated against public keys cached in memory (downloaded once from the IdP and refreshed via NATS)
  2. Authorization checked against an in-memory user-to-model map kept current by NATS pub/sub
  3. Model identifier resolved to a physical provider endpoint via Virtual Model routing logic running in memory
  4. Request translated from OpenAI-compatible format to the target provider format via an adapter layer
  5. Request forwarded to the provider and the response streamed back to the client

None of these steps make external calls except the provider call itself. Rate limiting runs the Sliding Window Token Bucket algorithm against in-memory state. Guardrail evaluation (when configured) runs concurrently with the model call for input checks and sequentially for output checks.
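The in-memory rate-limiting step above can be sketched as a minimal sliding-window limiter. This is an illustrative TypeScript sketch, not TrueFoundry's implementation; the class and method names are invented, and the gateway's actual Sliding Window Token Bucket also incorporates token-bucket accounting not shown here.

```typescript
// Minimal in-memory sliding-window rate limiter sketch.
// All names are illustrative, not TrueFoundry internals.
class SlidingWindowLimiter {
  private hits: number[] = []; // timestamps (ms) of admitted requests

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request is admitted under the limit.
  tryAcquire(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    this.hits = this.hits.filter((t) => t > cutoff);
    if (this.hits.length >= this.limit) return false;
    this.hits.push(now);
    return true;
  }
}

// Usage: admit at most 2 requests per 1000 ms window.
const limiter = new SlidingWindowLimiter(2, 1000);
```

Because the state is a plain in-process array, admission decisions require no external calls, which is consistent with the hot-path constraint described above.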

After the request completes, the gateway publishes the span tree asynchronously to NATS. The OTEL exporter reads from this async path and forwards spans to the configured external endpoint. Because the export path is fully decoupled from the request path, a slow or unreachable OTEL backend never adds latency to the client and never causes a request to fail. If Traceloop is unreachable, spans are dropped at the exporter and logged internally. TrueFoundry's own internal trace storage is unaffected because export is additive.
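The drop-on-failure export behavior can be sketched as follows. Assumptions: a `send` callback stands in for the OTLP/HTTP POST, and the caller stands in for the NATS consumer; all names are illustrative.

```typescript
// Decoupled exporter sketch: runs off the request path, so a failing
// backend can only ever cost dropped spans, never a failed request.
type SpanBatch = Record<string, unknown>[];

class DecoupledExporter {
  exported = 0;
  dropped = 0;

  // `send` stands in for the OTLP/HTTP POST to the backend.
  constructor(private send: (batch: SpanBatch) => void) {}

  // Called by the async consumer (NATS in the real gateway).
  export(batch: SpanBatch): void {
    try {
      this.send(batch);
      this.exported += 1;
    } catch {
      // Backend unreachable: drop the batch and log internally;
      // the error never propagates back toward the request path.
      this.dropped += 1;
    }
  }
}
```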

The gateway generates spans across five stages: the inbound HTTP handler, authentication, model resolution, the outbound provider call, and streaming response assembly. Each span carries a consistent set of attributes.

| Span Attribute | Description |
| --- | --- |
| tfy.input | Full request body sent to the LLM provider |
| tfy.output | Full response body returned by the LLM provider |
| tfy.input_short_hand | Condensed input summary with flags for file, image, and audio content |
| tfy.span_type | Operation type: ChatCompletion, AgentResponse, or MCPGateway |
| tfy.data_routing.destination | Target model or Virtual Model identifier |
| tfy.request.created_by_subject | Identity of the requesting user |
| service.name | Always set to tfy-llm-gateway |
| gen_ai.usage.prompt_tokens | Input token count for the request |
| gen_ai.usage.completion_tokens | Output token count for the response |
| gen_ai.request.model | Model name resolved at routing time |
| gen_ai.system | Provider system identifier (openai, anthropic, etc.) |

The gen_ai.* attributes follow the OpenTelemetry Semantic Conventions for Generative AI Systems. This means the trace data arriving in Traceloop is structurally identical to what any OpenLLMetry-instrumented application would produce.
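As a concrete illustration, a single chat-completion span exported by the gateway might carry attributes like the following. The attribute keys come from the table above; the values, including the destination identifier format, are invented for the example.

```typescript
// Illustrative attribute set for one gateway span, combining the
// OpenTelemetry GenAI semantic conventions with TrueFoundry's tfy.* keys.
const spanAttributes = {
  "service.name": "tfy-llm-gateway",
  "tfy.span_type": "ChatCompletion",
  "tfy.data_routing.destination": "openai-main/gpt-4o", // hypothetical format
  "gen_ai.system": "openai",
  "gen_ai.request.model": "gpt-4o",
  "gen_ai.usage.prompt_tokens": 512,
  "gen_ai.usage.completion_tokens": 128,
};
```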

What Traceloop Does with the Data

Traceloop is an LLM observability platform built on OpenLLMetry, its open-source OpenTelemetry instrumentation layer. Traceloop's backend accepts OTLP/HTTP trace data and indexes it for the Traceloop dashboard. The platform is trace-native: metrics such as token usage, latency, and cost are computed from span attributes rather than from a separate OTLP metrics stream. This is why configuring only the Traces Exporter in TrueFoundry is sufficient; there is no /v1/metrics endpoint in Traceloop's ingestion surface.
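A minimal sketch of what trace-native metrics mean in practice: token totals per model can be derived purely from gen_ai.* span attributes, with no metrics pipeline involved. The helper below is hypothetical, not Traceloop code.

```typescript
// Aggregate token usage per model directly from span attributes,
// the way a trace-native backend can compute metrics from traces alone.
interface GenAiSpan {
  "gen_ai.request.model": string;
  "gen_ai.usage.prompt_tokens": number;
  "gen_ai.usage.completion_tokens": number;
}

function tokensPerModel(spans: GenAiSpan[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of spans) {
    const model = s["gen_ai.request.model"];
    const tokens =
      s["gen_ai.usage.prompt_tokens"] + s["gen_ai.usage.completion_tokens"];
    totals.set(model, (totals.get(model) ?? 0) + tokens);
  }
  return totals;
}
```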

Traceloop organizes data around three core abstractions. Traces are the top-level unit and correspond directly to an LLM request or an agentic workflow. Spans within a trace represent individual operations (an LLM call, a tool invocation, a retrieval step). Environments map to deployment stages, and each environment has its own API key, allowing Development, Staging, and Production traces to remain isolated in the dashboard.

The Traceloop dashboard surfaces token usage over time, latency distributions, error rates, and model breakdowns directly from gen_ai.* span attributes. Because TrueFoundry populates these attributes on every span, the dashboard is fully populated without any SDK instrumentation in the application layer.


Traceloop also supports prompt versioning and regression testing pipelines, but those features operate at the application SDK level and are outside the scope of this integration. The gateway-level integration covers the full observability surface: every request that passes through TrueFoundry produces a trace in Traceloop regardless of which LLM provider or model is called.

The Integration Surface

The connection between TrueFoundry and Traceloop is a single OTLP/HTTP POST to https://api.traceloop.com/v1/traces carrying Proto-encoded span batches. Authentication is a Bearer token in the Authorization header. The token is a Traceloop API key scoped to a specific environment.
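That request shape can be sketched in TypeScript. The endpoint and Authorization header come from this post; the helper name and the application/x-protobuf content type (the standard OTLP/HTTP binary-Proto media type) are filled in here as assumptions.

```typescript
// Sketch of the export request the gateway issues: an OTLP/HTTP POST
// to Traceloop carrying a Proto-encoded span batch. Built, not sent.
function buildExportRequest(apiKey: string, protoBody: Uint8Array) {
  return {
    url: "https://api.traceloop.com/v1/traces",
    method: "POST",
    headers: {
      // Binary protobuf encoding per the OTLP/HTTP specification.
      "Content-Type": "application/x-protobuf",
      // Traceloop API key, Bearer prefix included as a literal string.
      Authorization: `Bearer ${apiKey}`,
    },
    body: protoBody,
  };
}
```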

TrueFoundry exposes this configuration under AI Gateway → Controls → Settings → OTEL Config. The Otel Traces Exporter section accepts the following fields.

| Field | Value |
| --- | --- |
| Protocol | HTTP Configuration |
| Endpoint | https://api.traceloop.com/v1/traces |
| Encoding | Proto |
| Header Key | Authorization |
| Header Value | Bearer <your-traceloop-api-key> |

The endpoint must include the full /v1/traces path; TrueFoundry's exporter does not auto-append signal paths. This differs from the OTel Collector's otlphttp exporter, which appends the signal path to the base URL automatically. Both resolve to the same destination.

Traceloop API keys are generated per environment from the Environments page in the Traceloop dashboard. A key is displayed only once, at creation time. The key value is passed in the header as Bearer <key>, with the Bearer prefix included as a literal string.

| Traceloop Environment | Recommended TrueFoundry Usage |
| --- | --- |
| Development | Non-production gateway instances or internal test traffic |
| Staging | Pre-production gateway with realistic model traffic |
| Production | Production gateway instances with live user traffic |

Data Privacy Controls

The gateway provides an Exclude Request Data toggle in the OTEL Config section. When enabled, the exporter strips tfy.input, tfy.output, and tfy.input_short_hand from every span before forwarding to Traceloop. The remaining span attributes (token counts, model names, latency, and routing metadata) are unaffected. This toggle is appropriate when prompts or completions contain user PII or proprietary content that should not leave the cluster boundary.
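The stripping behavior can be sketched as a simple attribute filter. The excluded keys come from this post; the function itself is hypothetical, not gateway code.

```typescript
// Sketch of the Exclude Request Data behavior: drop payload-bearing
// attributes before export while leaving all metadata intact.
const EXCLUDED_KEYS = ["tfy.input", "tfy.output", "tfy.input_short_hand"];

function scrubSpanAttributes(
  attrs: Record<string, unknown>
): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(attrs).filter(([key]) => !EXCLUDED_KEYS.includes(key))
  );
}
```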

The Additional Resource Attributes field allows appending custom key-value pairs to every exported span. This is useful for environment tagging, cost-center attribution, and multi-tenant filtering within a single Traceloop environment.
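A sketch of how such resource attributes might be merged onto every exported span. The keys, values, and conflict-resolution choice (explicit span attributes win over appended resource attributes) are all assumptions for illustration, not documented gateway behavior.

```typescript
// Hypothetical merge of configured Additional Resource Attributes
// onto each exported span's attribute set.
function withResourceAttributes(
  resourceAttrs: Record<string, unknown>,
  spanAttrs: Record<string, unknown>
): Record<string, unknown> {
  // Span attributes take precedence on key collisions (an assumption).
  return { ...resourceAttrs, ...spanAttrs };
}

// Example configuration for tenant filtering and environment tagging.
const resourceAttrs = {
  "deployment.environment": "production",
  "tenant.id": "team-a",
};
```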

Architecture Summary

Every LLM request through the TrueFoundry AI Gateway produces a span tree covering authentication, routing, the provider call, and the response. After the request completes, the gateway publishes this span tree to NATS asynchronously. The OTEL exporter reads from NATS and POSTs Proto-encoded batches to https://api.traceloop.com/v1/traces with a Bearer token. Traceloop indexes the spans and surfaces token usage, latency, and model breakdowns in its dashboard from the gen_ai.* attributes on each span.

No sidecars are required. No changes to application code are required. No OpenLLMetry SDK needs to be added to services calling the gateway. The integration operates entirely at the gateway layer and covers 100% of traffic passing through it regardless of the calling application's instrumentation state.

The architectural property that makes this clean is the async NATS publish. Because span export is decoupled from the request path, the integration adds zero latency to inference calls and introduces no availability dependency on Traceloop. The gateway processes requests at full throughput whether or not Traceloop is reachable.
