Pillar Security Integration with TrueFoundry

We are excited to announce our partnership with Pillar Security that brings adaptive runtime guardrails directly into the path of AI agent and LLM traffic.
Teams routing model and agent traffic through TrueFoundry's AI Gateway can now connect Pillar Security as a first-class guardrail provider, gaining real-time scanning and policy enforcement across prompts, responses, tool calls, and MCP interactions in production. The integration runs at the four guardrail hooks exposed by the gateway and requires no changes to agent or application code.
This post covers the architecture of the integration: how the TrueFoundry AI Gateway executes guardrails at runtime, how Pillar's scanner pipeline plugs into that execution model, and how teams configure rules that target specific models, MCP servers, and user populations.
Why enterprise agentic AI needs two layers
TrueFoundry provides the control layer for production AI systems. Through the AI Gateway, teams centralize model routing, key management, access control, observability, and governance across LLMs, tools, and MCP-connected workflows. Every request flows through a single proxy layer where identity is verified, rate limits are enforced, and traces are captured.
Pillar Security provides the runtime security layer. Its detection models scan prompts and responses for jailbreak attempts, prompt injection, PII and PCI data, secrets, toxic language, and invisible-character attacks. Pillar's adaptive guardrails are informed by red-teaming exercises run against the same application and tune themselves to the agent's defined business purpose, reducing false positives.
Together, the two solutions give teams a clean production architecture: TrueFoundry handles deployment, routing, and operational control, while Pillar handles runtime inspection, threat detection, and policy enforcement. Pillar Security is supported as a first-class guardrail provider inside the TrueFoundry gateway, with hooks at llm_input_guardrails, llm_output_guardrails, mcp_tool_pre_invoke_guardrails, and mcp_tool_post_invoke_guardrails.
The gap in production agent deployments
Most teams building AI agents focus on getting deployment and reliability right. The agent has to call the right tools, manage context across long conversations, handle retries, and scale across users. That work is necessary, but it does not answer the runtime security question.
Security in many agentic AI deployments stops at the perimeter. Platform access controls, MCP server allowlists, tool-level permissions, and scoped credentials for downstream systems are all in place. Those controls matter, but they leave the runtime path uninspected.
The questions the perimeter cannot answer: what is the agent actually doing once it starts executing, which tools is it calling, in what sequence, and with what data? If a prompt injection slips in through retrieved context, an MCP server response, or an external API result, the perimeter has no visibility into whether the agent is about to act on it.
Runtime guardrails on the gateway path
The architectural idea behind this integration is direct. If all model, tool, and MCP traffic already flows through the gateway, then the gateway is the right place to apply runtime security. With Pillar connected to the TrueFoundry AI Gateway, teams can enforce guardrails on the same path where agent traffic is already being routed and governed. Evaluation happens on live traffic, not on traces reviewed after execution.
Pillar Argus runs as the adaptive runtime layer. Pillar applies its scanners to every interaction in production, so platform and security teams can monitor, evaluate, and enforce policy on agent behavior while it is happening. The scanner output includes a session identifier, a flagged boolean, the per-category triggers, and an evidence array with the offending text and its position in the input.
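As a sketch, a flagged scan result could look like the following. The field names (session_id, flagged, scanners, evidence) come from the response format described above; the concrete values and the exact shape of each entry are illustrative:

```json
{
  "session_id": "f3b2c8e1-example",
  "flagged": true,
  "scanners": {
    "prompt_injection": true,
    "pii": false
  },
  "evidence": [
    {
      "category": "prompt_injection",
      "text": "ignore previous instructions",
      "start": 42,
      "end": 70
    }
  ]
}
```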
Pillar exposes the following detection categories at runtime:
- Jailbreak detection identifies attempts to bypass the model's safety training.
- Prompt injection detection covers both direct and indirect injection through retrieved context or tool output.
- PII and PCI detection covers over forty categories of personal and payment card data and supports masking before the data reaches the model.
- Secret detection identifies API keys, tokens, and credentials in either prompts or model output.
- Content moderation and toxic language detection cover unsafe and policy-violating content.
- Invisible character detection catches concealed Unicode payloads used to smuggle instructions past human review.
For agentic systems, Pillar evaluates not just a single prompt-and-response pair but the tool invocations, MCP requests, and multi-step execution context. Many agentic AI failures emerge across the full chain of actions, not in any single model call.
How the gateway executes guardrails
The TrueFoundry AI Gateway runs on the Hono framework; a single gateway pod handles 250+ requests per second on 1 vCPU and 1 GB RAM with approximately 3 ms of added latency. Gateway pods are stateless and CPU-bound, and scale horizontally to tens of thousands of RPS through additional pods. The control plane and gateway plane are split: configuration, including guardrail rules, model definitions, and rate limits, lives in the control plane and syncs to gateway pods through NATS. The actual request path stays in memory, with no external calls beyond the LLM provider.
Guardrails execute at four discrete hooks in the request lifecycle.
llm_input_guardrails intercepts a prompt before it reaches the model. The gateway sends the input payload to Pillar first; if Pillar returns flagged: true for any configured scanner, the request is blocked and the LLM is never called. The input guardrail call runs concurrently with the model request to optimize time to first token, and the model call is cancelled immediately on a violation to avoid incurring provider cost.
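The concurrent-scan-with-cancellation pattern can be sketched in a few lines of asyncio. This is a minimal illustration, not the gateway's implementation; the two stand-in coroutines simulate the Pillar scan and the provider call:

```python
import asyncio

async def pillar_input_scan(payload: str) -> dict:
    """Stand-in for the Pillar input scan (the real call is an HTTP POST)."""
    await asyncio.sleep(0.01)
    flagged = "ignore previous instructions" in payload.lower()
    return {"flagged": flagged}

async def llm_call(payload: str) -> str:
    """Stand-in for the request to the LLM provider."""
    await asyncio.sleep(0.05)
    return f"model response to: {payload}"

async def handle_request(payload: str) -> str:
    # Fire the guardrail scan and the model call at the same time,
    # so the scan does not add to time-to-first-token.
    scan_task = asyncio.create_task(pillar_input_scan(payload))
    model_task = asyncio.create_task(llm_call(payload))
    verdict = await scan_task
    if verdict["flagged"]:
        # Cancel the in-flight model call so no provider cost is incurred.
        model_task.cancel()
        raise PermissionError("request blocked by input guardrail")
    return await model_task
```

Because the scan typically finishes before the model responds, a clean request pays essentially no extra latency, while a flagged one never completes the provider call.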
llm_output_guardrails fires after the LLM has responded but before the response is returned to the caller. Output guardrails are sequential: the gateway waits for the model output and submits it to Pillar for scanning before delivering it to the client. This is the enforcement point for catching PII leakage, secret exposure, toxic generation, and any unsafe content the model produced.
mcp_tool_pre_invoke_guardrails fires before a tool is executed by the agent. Pillar evaluates the tool name and the arguments and the calling context. If the arguments contain sensitive data or indicate off-scope resource access the tool invocation is blocked before any real-world action occurs.
mcp_tool_post_invoke_guardrails fires after the tool returns its result and before that result is passed back into the agent reasoning loop. This is the enforcement point for detecting indirect prompt injection in tool output, credential leakage from MCP servers, and PII returned by upstream APIs. Stopping it here keeps the agent from acting on poisoned context.
Each hook supports three enforcement strategies. Enforce blocks on a violation or on a guardrail service error. Enforce But Ignore On Error blocks on a violation but allows the request to proceed if the guardrail service itself is unreachable. Audit logs the verdict and never blocks. Each guardrail also supports two operation modes: Validate mode produces a block-or-pass decision, while Mutate mode allows the guardrail service to modify content in flight, which is how Pillar's masking capability is wired in. Mask mode is configured on the Pillar side and surfaces redacted values for matched PII and secrets before the prompt reaches the model.
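The three strategies reduce to a small decision function. This sketch is ours, not gateway code; the enum names mirror the strategy names above:

```python
from enum import Enum

class Strategy(Enum):
    ENFORCE = "enforce"
    ENFORCE_BUT_IGNORE_ON_ERROR = "enforce_but_ignore_on_error"
    AUDIT = "audit"

def should_block(strategy: Strategy, flagged: bool, service_error: bool) -> bool:
    """Decide whether to block a request under the given strategy."""
    if strategy is Strategy.AUDIT:
        return False  # audit mode logs the verdict but never blocks
    if service_error:
        # only strict enforcement blocks when the guardrail service is down
        return strategy is Strategy.ENFORCE
    return flagged
```

The asymmetry to notice: Enforce treats a guardrail outage as a block (fail closed), while Enforce But Ignore On Error fails open on outages and blocks only on actual violations.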
The integration surface
Pillar is configured in the TrueFoundry control plane as a guardrail integration with two pieces of input. The first is the API key issued by the Pillar console. The second is the scanner configuration that selects which detection categories should run for this integration.
| Field | Value |
| --- | --- |
| Provider | Pillar Security |
| Endpoint | https://api.pillar.security/api/v1/integrations/truefoundry |
| Authentication | Bearer token via PILLAR_API_KEY |
| Scanners | jailbreak, prompt_injection, pii, secret, toxic_language, content_moderation, invisible_character |
| Operation modes | Validate and Mutate |
| Response format | { session_id, flagged, scanners, evidence } |
Once the integration is registered, the gateway exposes it as a selector that can be referenced from any guardrail rule. Rules are configured through a YAML rules block. Each rule uses a when block with two conditions: target matches on model, mcpServers, mcpTools, or request metadata, and subjects matches on user or team identity with in and not_in operators. The rule then declares which guardrail integrations to run on which of the four hooks.
A baseline rule that runs Pillar on input and output for an OpenAI model used by all teams looks like this:
```yaml
name: guardrails-control
type: gateway-guardrails-config
rules:
  - id: pillar-baseline
    when:
      target:
        operator: or
        conditions:
          model:
            values:
              - openai-main/gpt-4o
            condition: in
      subjects:
        operator: and
        conditions:
          in:
            - team:everyone
    llm_input_guardrails:
      - pillar/pillar-default-profile
    llm_output_guardrails:
      - pillar/pillar-default-profile
    mcp_tool_pre_invoke_guardrails: []
    mcp_tool_post_invoke_guardrails: []
```
A second rule that adds Pillar scanning around an MCP server used by an agent team would target the MCP server and apply the integration on the pre- and post-tool-invocation hooks. All matching rules are evaluated together and their guardrail sets are unioned per hook: two rules that both target llm_input_guardrails will both run on the input.
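A sketch of such a rule, following the same structure as the baseline above. The MCP server name (github-mcp), team name, and rule id are illustrative placeholders, not values from the integration:

```yaml
- id: pillar-mcp-agents
  when:
    target:
      operator: or
      conditions:
        mcpServers:
          values:
            - github-mcp
          condition: in
    subjects:
      operator: and
      conditions:
        in:
          - team:agent-platform
  llm_input_guardrails: []
  llm_output_guardrails: []
  mcp_tool_pre_invoke_guardrails:
    - pillar/pillar-default-profile
  mcp_tool_post_invoke_guardrails:
    - pillar/pillar-default-profile
```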
Per-request overrides are supported through the X-TFY-GUARDRAILS header. The header carries a JSON object specifying guardrail selectors for any combination of the four hooks, letting application teams pin a stricter or more permissive policy for a specific call without modifying the global config.
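Building the override header is just JSON serialization. In this sketch, the override object shape (a key per hook mapping to a list of selectors) is our assumption based on the rule syntax above:

```python
import json

# Hypothetical per-request override: run the Pillar profile on the input
# hook only for this call; the globally configured rules still govern
# the other hooks.
override = {
    "llm_input_guardrails": ["pillar/pillar-default-profile"],
}

headers = {
    "X-TFY-GUARDRAILS": json.dumps(override),  # JSON object, keyed by hook
}
```

The header value then travels with a single request, so an application team can tighten policy for one sensitive endpoint without a control-plane change.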
Every guardrail decision is captured in the request trace. The span includes the hook that fired, the integration selector, the verdict, the latency of the guardrail call, and the evidence returned by Pillar. Traces are emitted asynchronously through NATS and exported via OTEL to whichever observability backend the team has configured. Pillar's dashboard surfaces the same events from its side, with full attack transcripts and category breakdowns for compliance review.
Architecture Summary
End to end, the request flow looks like this. A client sends a chat completion or agent request to the gateway. The gateway authenticates the caller against cached IdP keys and resolves the model identifier through Virtual Model routing. Matching guardrail rules are evaluated in memory, and the input payload is dispatched to Pillar concurrently with the model call. If Pillar flags the input, the model call is cancelled and a structured error is returned. If the input is clean, the model response is awaited and submitted to Pillar's output scanners before delivery. For agent traffic, the same logic applies to each MCP tool invocation and to each tool response before it re-enters the agent context. Every step is captured in a trace span with the guardrail verdict attached.
Nothing else has to change in the application. There is no SDK to install on the client, no sidecar to deploy alongside the agent, and no per-service security middleware to maintain. The gateway is already in the request path, and Pillar attaches to that path through its API. Existing OpenAI-compatible client code keeps working without modification.
The architectural principle that makes this clean is consolidation of policy enforcement at the gateway layer. When model traffic, tool traffic, and MCP traffic all converge on a single proxy, guardrails configured at that proxy apply uniformly across every model, every team, and every agent without per-application code. Pillar's scanners run inline at the same point, and the gateway's hook model gives Pillar access to the four enforcement points where runtime decisions actually matter.
Get Started
Learn more about the TrueFoundry AI Gateway and the Pillar Security platform. Connect Pillar in the TrueFoundry guardrails configuration and reference the integration selector from any rule that targets your models or MCP servers.