Blank white background with no objects or features visible.

Rejoignez notre écosystème VAR & VAD — assurez la gouvernance de l'IA d'entreprise pour les LLM, MCP et Agents. Read →

Middleware integration with TrueFoundry AI Gateway

Par Rishiraj Dutta Gupta

Mis à jour : May 22, 2026

Middleware integration with TrueFoundry AI Gateway

As organizations scale their AI applications, knowing what your models are doing in production is as important as getting them running in the first place. Engineers need visibility into every inference request   latency, token usage, model behavior, finish reason   but connecting observability tooling to every model and provider means complex, repetitive instrumentation work for each integration.

The big question: how do you get full-stack visibility across all the models your teams are using without custom engineering for each one?

At Middleware, the goal is to make observability as easy as it is powerful. That's why we're thrilled to announce the integration of Middleware with the TrueFoundry AI Gateway. This integration gives your organization complete visibility into every AI inference request   correlated with infrastructure metrics, application traces, and logs   all from a single, centralized platform, helping ensure your AI operations are transparent and under control.

The Power of the TrueFoundry AI Gateway

The TrueFoundry AI Gateway is a powerful way for developers and platform teams to manage, monitor, and scale their AI applications. It brings together unified access to hundreds of large language models, smart routing, and centralized policy enforcement  all in one place. A single gateway pod handles 250+ requests per second while adding approximately 3 ms of latency, making it production-grade from day one.

As AI adoption accelerates, the real challenge isn't accessing models   it's managing the complexity that follows. Multiple providers, evolving APIs, and strict compliance requirements can quickly slow teams down. The TrueFoundry AI Gateway brings order to this complexity, serving as the control plane for enterprise AI. It unifies access, enforces policy, and delivers OpenTelemetry-compliant observability across every model and environment   without requiring any changes to the applications calling the gateway.

Middleware: Full-Stack Observability Built on OpenTelemetry

Middleware is a full-stack observability platform built on OpenTelemetry as its core instrumentation standard. It accepts traces, logs, infrastructure metrics, and real user monitoring data through OTEL Collector, storing them in a single correlated data layer that gives engineering teams a complete picture of their systems in one place.

What sets Middleware apart is what it does after a trace arrives. Rather than storing spans in isolation, Middleware correlates them with infrastructure signals from the host or cluster where the service runs. An engineer investigating a latency spike in a gateway span can navigate directly from the trace view to CPU and memory metrics for that pod   without switching dashboards. Middleware also builds a live service topology map from incoming span data, making every instrumented service visible as a node in the service map with latency and error rate computed automatically from its spans.

Better Together: A Seamless Integration for Complete Visibility

The integration of Middleware and the TrueFoundry AI Gateway simplifies and strengthens your AI observability. This combination makes it easy to bake production-grade visibility right into your AI workflow, ensuring your systems are observable from the moment of deployment.

With this integrated solution, every inference request that passes through the TrueFoundry AI Gateway automatically generates a structured set of OpenTelemetry spans. Those spans carry prompt content, completion content, token counts, model name, latency, and finish reason as queryable attributes   then flow asynchronously to Middleware over OTLP/HTTP. Middleware ingests them alongside the rest of your infrastructure telemetry, making gateway traffic immediately visible as a first-class service in the topology map and APM views alongside the application services that call it.

For full control over sensitive data, the TrueFoundry gateway's Exclude Request Data toggle strips prompt and completion content from span attributes before export. Token counts, latency, and model metadata are retained regardless, so you keep complete operational visibility without exposing user inputs to external systems. For organizations with strict network egress requirements, the gateway exporter can also be pointed at a self-managed OpenTelemetry Collector that forwards to Middleware   requiring no changes other than the endpoint URL.

How the Middleware and TrueFoundry Integration Works

Middleware and TrueFoundry AI Gateway Integration

Middleware and the TrueFoundry AI Gateway work together to deliver observability without adding complexity to your inference path.

How the Trace Flow Works

  1. Your application sends an inference request to the TrueFoundry AI Gateway. The gateway handles authentication, model resolution, and routing entirely in memory   no external calls happen in the critical path.

  2. The gateway forwards the request to the configured LLM provider   the only external call in the request path   and returns the response to your application immediately.

  3. After the response is delivered, the gateway asynchronously publishes the full trace event to an internal NATS bus. Export happens entirely outside the request path, so inference latency is never affected by OTEL endpoint availability or slowness.

  4. A dedicated OTEL exporter process reads from the NATS bus, serializes the spans as a protobuf-encoded OTLP/HTTP payload, and sends them to your Middleware tenant endpoint at https://<your-domain>.middleware.io:443/v1/traces with your Middleware API key in the Authorization header.

  5. Middleware receives the payload at its OTLP ingest layer and stores the spans in its correlated telemetry backend, where they are immediately queryable alongside logs, infrastructure metrics, and APM data for the rest of your stack.

Configuration is just as easy. Navigate to AI Engineering → Settings → OTEL Config in the TrueFoundry dashboard, enter your Middleware tenant endpoint and API key, set the protocol to HTTP with protobuf encoding, and you're ready to go.

Get Started with Full-Stack AI Observability

AI observability does not have to mean complex instrumentation work. With Middleware integrated into the TrueFoundry AI Gateway, your entire inference traffic becomes visible   correlated with infrastructure signals, filterable by model name or token count, and mapped into a live service topology   from the moment the configuration is saved. It's complete, production-grade observability that is easy to set up, more like flipping a switch than a custom engineering project.

To learn more, visit the Middleware documentation and the TrueFoundry integration reference to see how straightforward it is to get full-stack visibility into your AI applications.

Ready to get started? Connect your TrueFoundry gateway to Middleware today and turn every inference request into a structured, queryable observability event.

Le moyen le plus rapide de créer, de gérer et de faire évoluer votre IA

INSCRIVEZ-VOUS
Table des matières

Gouvernez, déployez et suivez l'IA dans votre propre infrastructure

Réservez un séjour de 30 minutes avec notre Expert en IA

Réservez une démo

Le moyen le plus rapide de créer, de gérer et de faire évoluer votre IA

Démo du livre

Découvrez-en plus

October 5, 2023
|
5 min de lecture

<Webinar>Vitrine GenAI pour les entreprises

May 23, 2024
|
5 min de lecture

Que sont les intégrations vectorielles ? — Un guide complet 2024

May 22, 2024
|
5 min de lecture

Qu'est-ce que l'indexation vectorielle ? - Un guide complet 2024

Best Fine Tuning Tools for Model Training
May 3, 2024
|
5 min de lecture

Les 6 meilleurs outils de réglage pour la formation des modèles en 2026

May 22, 2026
|
5 min de lecture

Nous nous sommes entraînés comme un mille-pattes pour pouvoir construire comme un seul homme

Aucun article n'a été trouvé.
May 22, 2026
|
5 min de lecture

Middleware integration with TrueFoundry AI Gateway

Outils LLM
Ingénierie et produits
Terminologie LLM
May 21, 2026
|
5 min de lecture

Introducing Skills Registry: Reusable Agent Skills for Production AI Systems

Aucun article n'a été trouvé.
May 21, 2026
|
5 min de lecture

Stdio vs Streamable HTTP for MCP: What changes when you move from local development to enterprise deployment

Aucun article n'a été trouvé.
Aucun article n'a été trouvé.

Blogs récents

Black left pointing arrow symbol on white background, directional indicator.
Black left pointing arrow symbol on white background, directional indicator.
Faites un rapide tour d'horizon des produits
Commencer la visite guidée du produit
Visite guidée du produit