TrueFoundry AI Gateway: FIPS Compliance on AWS & Azure Gov

Speed, Security, Sovereignty: The FIPS Compliant AI Gateway

In the public sector today, we are witnessing a collision. On one side, we have the Unstoppable Force of Generative AI. Agency leaders know that Large Language Models (LLMs) can reduce document processing times from days to seconds. They see the potential for massive efficiency gains.

On the other side sits the Immovable Object: Compliance. Specifically, the Federal Information Processing Standards (FIPS) requirements. These aren't just red tape; they are the non-negotiable laws of physics for government data.

The common belief is that you must choose: Speed or Security. You can either have a modern, agile AI stack that breaks the rules, or a compliant, "safe" stack that is years behind the curve.

We disagree. You don't have to choose. You just need the right architecture, which we deliver through the TrueFoundry AI Gateway. We call it the "On-Prem Cloud" Strategy.

Why Compliance Isn't Optional

Before we talk about the solution, let’s be diplomatic but direct about the problem. Why do we need FIPS? Why can't we just use the standard API keys for OpenAI or Anthropic?

Executive Brief: The FIPS Mandate and the "Secret" Problem

First, the constraint itself needs a precise definition.

FIPS (Federal Information Processing Standards), specifically FIPS 140-3 (https://csrc.nist.gov/pubs/fips/140-3/final), is the official U.S. government standard for cryptographic modules. It does not simply ask, "Is your data encrypted?" It asks a far more rigorous question: "Has the specific cryptographic module performing the encryption been tested by a NIST-accredited laboratory and validated under the Cryptographic Module Validation Program (CMVP)?"

For government agencies, this is non-negotiable. If data—or the secrets protecting that data—is handled by a non-validated module (like standard OpenSSL found in most commercial software), it is effectively considered "plaintext" in the eyes of an auditor.

The Conflict with Modern AI: Custody of Secrets

The intersection of FIPS and Generative AI creates a critical vulnerability around API keys. Hosted LLMs (like GPT-4 or Claude 3.5) are accessed with long-lived secrets, API keys, that grant access to your agency's data and budget.

The SaaS Risk: In a standard SaaS deployment, you upload these high-value API keys to a vendor's cloud. You lose custody. If that vendor stores them in a standard database that relies on non-validated encryption, you have effectively exposed your credentials to an uncleared environment.
The On-Prem Advantage: By deploying TrueFoundry "On-Prem," you regain sovereignty. Your API keys are stored in your own AWS Secrets Manager or Azure Key Vault, both of which can be backed by FIPS-validated cryptographic modules. The AI Gateway retrieves them programmatically only for the moment they are needed to authenticate a request. The stored keys never leave your FIPS-validated boundary, and they are never visible to the software vendor.
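
To make the custody model concrete, here is a minimal Python sketch of reading a provider key from AWS Secrets Manager at request time. It is not TrueFoundry's implementation. The helper names, the secret name, and the JSON layout of the secret are assumptions made for illustration; `get_secret_value` and the `use_fips_endpoint` option are standard boto3/botocore features.

```python
import json
from typing import Any, Protocol


class SecretsClient(Protocol):
    """The single method this sketch needs; boto3's Secrets Manager client has it."""

    def get_secret_value(self, *, SecretId: str) -> dict[str, Any]: ...


def make_fips_client(region: str = "us-gov-west-1") -> SecretsClient:
    """Build a Secrets Manager client that prefers FIPS endpoints (requires boto3)."""
    import boto3  # imported lazily so the rest of the sketch runs without boto3
    from botocore.config import Config

    return boto3.client(
        "secretsmanager",
        region_name=region,
        config=Config(use_fips_endpoint=True),
    )


def fetch_provider_key(client: SecretsClient, secret_id: str, field: str) -> str:
    """Read one provider key on demand; nothing is cached, logged, or written to disk.

    Assumes (hypothetically) that the secret is a JSON object such as
    {"openai": "...", "anthropic": "..."}.
    """
    response = client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return secret[field]
```

In a real deployment the call would look like `fetch_provider_key(make_fips_client(), "llm/providers", "openai")`, with the secret name chosen by your own team. Keeping the client behind a small interface also makes the lookup easy to test without touching AWS.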

The Shadow AI Consequence: When agencies fail to provide a compliant architecture for these keys, teams are forced to go rogue.

The Samsung Incident: In 2023, well-meaning engineers at Samsung pasted proprietary code into the public version of ChatGPT to optimize it. They didn't "hack" anything; they just tried to be efficient. The result? That sensitive IP left the company's control and ended up on a third party's servers.
The Equifax Lesson: Major breaches often happen not because encryption was missing, but because it was implemented or maintained poorly (weak keys, expired certificates). FIPS narrows this class of failure by mandating validated cryptographic modules.

The takeaway: If you don't give your teams a secure, compliant way to use AI, they will find an insecure way to do it.

The Solution: TrueFoundry "On-Prem" in the Cloud

TrueFoundry is an AI Gateway, a control plane that manages your LLM interactions. It brings frontier-class capabilities like model routing, caching, and cost tracking.

Now, let's address the elephant in the room: TrueFoundry's software itself is not FIPS 140-3 validated. It holds robust commercial attestations like SOC 2 Type II and HIPAA, which show it is mature and secure for enterprise use. But it does not carry the FedRAMP High authorization required for defense workloads.

So, how do we use it in a government environment?

We use the "Fortress Strategy."

We deploy TrueFoundry’s Data Plane as a self-hosted ("On-Prem") workload inside your existing government cloud environment: AWS GovCloud, Microsoft Azure Government, or Google Public Sector on GCP. The rest of this post uses AWS GovCloud to illustrate, but the same principle applies to Azure and GCP.

The Tank (Infrastructure): AWS GovCloud provides the FIPS-validated armor. It handles the physical security and the cryptographic heavy lifting.
The Engine (TrueFoundry): The AI Gateway provides speed and intelligence.

By nesting the application inside the secure infrastructure, we achieve compliance through inheritance.

Architecture Deep Dive: The Fortress

How do we isolate the non-FIPS software inside a FIPS-compliant shell? We treat the TrueFoundry Gateway as a "Black Box" protected by AWS services.

Fig 1: Overall Conceptual Model

The Acronym Decoder (Why we built it this way)

FIPS-Enabled ALB (Application Load Balancer): This is our "Bouncer." We configure the ALB with a FIPS security policy, so TLS is negotiated only with cipher suites implemented in FIPS 140-3 (or, for older validations, FIPS 140-2) validated cryptographic modules. The TLS connection terminates here. This means the "crypto handshake" is handled by AWS's validated modules, not by the TrueFoundry container. The application effectively "inherits" this compliance for ingress.
Air-Gapped VPC: The Gateway lives in a private subnet with no direct route to the internet. It can only "speak" when spoken to by the ALB, or "whisper" out to an allowlist of LLM provider endpoints through a NAT Gateway locked down by strict egress rules.
WORM Storage (Write Once, Read Many): We route audit logs to Amazon S3 with Object Lock enabled. This creates a legally defensible audit trail that satisfies compliance officers: once a log is written, it cannot be overwritten or deleted until its retention period expires.
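
As a rough illustration of the WORM piece, the Python sketch below builds the arguments for an S3 `put_object` call that writes one audit record under Object Lock in compliance mode. The function name, bucket name, key layout, and record fields are hypothetical; `ObjectLockMode` and `ObjectLockRetainUntilDate` are real `put_object` parameters, and the bucket is assumed to have been created with Object Lock enabled.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def build_worm_put_request(
    bucket: str,
    record: dict[str, Any],
    retain_days: int,
    now: Optional[datetime] = None,
) -> dict[str, Any]:
    """Return keyword arguments for s3.put_object that lock the record for retain_days."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical key layout: one object per request, grouped by UTC date.
    key = f"audit/{now:%Y/%m/%d}/{record['request_id']}.json"
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": json.dumps(record, sort_keys=True).encode("utf-8"),
        # COMPLIANCE mode: the retention date cannot be shortened or removed,
        # even by the account root user, until it has passed.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
    }
```

With a boto3 S3 client in hand, the write itself would be `s3.put_object(**build_worm_put_request("agency-audit-logs", record, retain_days=2555))`, where the bucket name and retention period are placeholders for whatever your records policy requires.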

User Journey: "Safe Speed" with Alex

Architecture diagrams are great for engineers, but let’s look at how this changes the daily reality for Alex, a Senior Analyst. This workflow demonstrates how the "Fortress" handles a real-world task while protecting the agency from mistakes.

The Mission: Alex has a vendor proposal containing Controlled Unclassified Information (CUI) and potential PII. He needs a summary in 20 minutes.

Fig 2: User workflow with Merits by TrueFoundry

Phase 1: Active Protection (The "Safety Net")

Alex pastes the text into the TrueFoundry UI. He doesn't notice that page 4 contains a vendor's Tax ID.

The Interception: As Alex hits enter, the TrueFoundry Guardrails scan the input instantly.
The Action: The system detects the Tax ID pattern. It doesn't just block the request; it surgically redacts the sensitive numbers.
The Result: The prompt that actually travels to the LLM is safe. Alex gets a notification: "Tax ID Found! Redacting..." He is protected from an accidental leak.
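
The redact-rather-than-block behavior can be sketched in a few lines of Python. This is an illustrative pattern match, not the guardrail engine itself: it assumes the Tax ID is a US Employer Identification Number written as two digits, a hyphen, and seven digits, and the function name and placeholder text are made up for the example.

```python
import re

# US Employer Identification Number layout: two digits, a hyphen, seven digits.
EIN_PATTERN = re.compile(r"\b\d{2}-\d{7}\b")


def redact_tax_ids(text: str, placeholder: str = "[REDACTED-TAX-ID]") -> tuple[str, int]:
    """Replace every EIN-shaped token and report how many were replaced."""
    # re.subn returns (new_text, number_of_substitutions).
    return EIN_PATTERN.subn(placeholder, text)


if __name__ == "__main__":
    prompt = "Summarize the proposal from Acme LLC (EIN 12-3456789)."
    safe_prompt, hits = redact_tax_ids(prompt)
    if hits:
        print(f"Tax ID found, redacted {hits} value(s)")
    print(safe_prompt)
```

The count returned alongside the text is what would drive both the user notification and the redaction event in the audit log. A production guardrail would need more than one regular expression, since identifiers also appear without hyphens or split across lines.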

Phase 2: Model Agnosticism (The "Pivot")

The system routes the redacted prompt to Llama 3 on Bedrock. The summary comes back "mediocre."

The Switch: Alex doesn't need to call IT. He selects "Claude 3.5 (Azure)" from the dropdown menu and hits "Regenerate."
The Routing: TrueFoundry automatically reroutes the request to a completely different cloud provider. The complexity of authenticating with Azure vs. AWS is hidden from Alex. He just gets a better answer.
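
One way to picture the switch is a lookup table that maps the name Alex sees in the dropdown to a provider, a model identifier, and the location of that provider's credential. The Python sketch below is a simplified assumption about how such a table could look; the provider labels, model identifiers, and secret names are placeholders, not real configuration values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str   # which cloud serves the model (placeholder label)
    model_id: str   # provider-specific model identifier (placeholder value)
    secret_id: str  # where that provider's credential is stored (placeholder)


# Hypothetical routing table keyed by the display name shown in the UI.
ROUTES = {
    "Llama 3 (Bedrock)": Route("aws-bedrock", "llama-3-placeholder", "llm/bedrock"),
    "Claude 3.5 (Azure)": Route("azure", "claude-3-5-placeholder", "llm/azure"),
}


def resolve_route(display_name: str) -> Route:
    """Translate a dropdown choice into everything the gateway needs to call it."""
    try:
        return ROUTES[display_name]
    except KeyError:
        raise ValueError(f"no route configured for {display_name!r}") from None
```

Because the user only ever picks a display name, adding or swapping a provider becomes a change to this table rather than a change to anything Alex touches.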

Phase 3: Cost & Audit (The "Paper Trail")

Once Alex gets his "Perfect Summary," two background processes trigger:

Caching: The answer is saved. If a colleague asks the same question tomorrow, they get the answer instantly at $0.00 in model cost.
Audit Log: The system logs the entire interaction, including the redaction event and the cost ($0.42), and writes it to S3 WORM storage for permanent record keeping, where Alex's manager can review it.
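
The caching step can be sketched as an exact-match cache keyed by a hash of the model name and the redacted prompt. This Python example is an assumption about one simple way to do it, with hypothetical class and function names; it stores answers in memory and does not attempt semantic matching.

```python
import hashlib
import json
from typing import Callable, Optional


class ExactMatchCache:
    """In-memory cache keyed by SHA-256 of (model, prompt)."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str) -> Optional[str]:
        return self._store.get(self._key(model, prompt))

    def put(self, model: str, prompt: str, answer: str) -> None:
        self._store[self._key(model, prompt)] = answer


def answer(
    cache: ExactMatchCache,
    model: str,
    prompt: str,
    call_model: Callable[[str, str], tuple[str, float]],
) -> tuple[str, float, bool]:
    """Return (text, cost_usd, cache_hit); a hit skips the model call and costs 0.0."""
    cached = cache.get(model, prompt)
    if cached is not None:
        return cached, 0.0, True
    text, cost_usd = call_model(model, prompt)
    cache.put(model, prompt, text)
    return text, cost_usd, False
```

The cost and the cache-hit flag returned here are the kind of fields that would be copied into the audit record for each request. Including the model name in the key means a regenerated answer from a different model is never served from another model's cache entry.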

Conclusion: A Force Multiplier

TrueFoundry’s "On-Prem" approach allows government agencies to have their cake and eat it too.

By nesting the TrueFoundry Data Plane inside AWS GovCloud, you create a system that is:

Sovereign: Your data never leaves your control without permission.
Agile: You can switch models (OpenAI, Anthropic, Llama) instantly as technology evolves.
Compliant: You leverage the existing FIPS validations of AWS to protect the application.

This isn't just about checking a box on a compliance form. It's about empowering people like Alex to do their jobs safely, efficiently, and without fear of becoming the next headline.

