How FloQast governs enterprise AI at scale with TrueFoundry

FloQast is an accounting transformation platform used by finance and accounting teams to manage the financial close, reconciliations, and compliance. As the company embedded large language models across its product, from transaction matching and variance analysis to journal-entry assistance and reconciliation copilots, it needed a way to run AI in production that met the bar a financial-software company is held to: strict data governance, regional data residency, security, and reliability. This case study explores how FloQast used the TrueFoundry AI Gateway to centralize, secure, and scale LLM and agent infrastructure across its organization.

The challenge: production AI under financial-grade governance

FloQast operates in a regulated, security-conscious domain. Building AI features wasn't just a matter of calling a model API—it required control over where data lived, who could access which capabilities, and how every request was monitored and secured.

Three requirements stood out. First, data residency and governance: prompts and traces had to be stored in FloQast's own infrastructure and kept within specific regions (US, EU, and APAC), with requests funneled through region-specific gateways to the matching model-provider endpoints. Second, centralized control: as more teams shipped AI features against multiple model providers, FloQast needed a single control point to govern inference, manage MCP servers and agents, and maintain observability and cost visibility. Third, security and least-privilege access: as a financial company, FloQast needed guardrails against threats like SQL injection and the ability to scope access tightly, so that each API key maps only to the specific LLM capabilities a team actually needs.

“Making sure the actual location of where our prompts and traces are stored complies with whether it's the EU, APAC, or US region, the ability to have that easily funneled through specific gateways, and have the storage be in S3 locations tied to those three regions, was a huge benefit for TrueFoundry.”

— Colin Sidberry, Software Engineer, FloQast
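The residency requirement can be pictured as a lookup from a tenant's region to a gateway endpoint and a trace-storage location. The following is a minimal sketch, not FloQast's or TrueFoundry's actual configuration: the URLs, bucket names, and the `RegionRoute` and `route_for` names are all hypothetical. The one design point it illustrates is failing closed, so a request with an unknown region is rejected rather than silently sent to a default region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionRoute:
    gateway_url: str   # hypothetical region-specific gateway endpoint
    trace_bucket: str  # hypothetical S3 location for prompts and traces


# Hypothetical mapping; the three regions come from the case study,
# the URLs and bucket names are placeholders.
REGION_ROUTES = {
    "us": RegionRoute("https://gateway.us.example.com", "s3://example-traces-us"),
    "eu": RegionRoute("https://gateway.eu.example.com", "s3://example-traces-eu"),
    "apac": RegionRoute("https://gateway.apac.example.com", "s3://example-traces-apac"),
}


def route_for(region: str) -> RegionRoute:
    """Return the gateway and trace storage for a region, failing closed."""
    key = region.strip().lower()
    if key not in REGION_ROUTES:
        # No default region: an unknown region must never leak data elsewhere.
        raise ValueError(f"no gateway configured for region {region!r}")
    return REGION_ROUTES[key]
```

With a table like this, the calling code never picks an endpoint directly; it asks for the route by region and uses whatever comes back.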

The solution: one gateway for inference, MCP, guardrails, and routing

FloQast standardized on the TrueFoundry AI Gateway as the control plane for all of its AI traffic. What stood out to the team was how much capability came built in.

“You don't realize how much is baked into the product until you keep digging. You find additional layers under every corner you turn, and you realize how powerful TrueFoundry is out of the box: managing MCP servers, agents, the prompts you're using, the playground in the UI so you can test without setting up your own infrastructure, plus the LLM inference. It's a lot packed into one.”

When asked to name the most important capabilities for the FloQast team, Colin pointed to a few.

1. The LLM proxy as the heart of the stack

The unified inference layer is what everything else feeds into. Adding a new model provider (Anthropic, OpenAI, Google Gemini, or others) is fast and simple. In the last 90 days alone, FloQast routed traffic across six providers (Anthropic, OpenAI, AWS Bedrock, and Google Vertex among them), with Anthropic's Claude Sonnet 4.6 carrying the majority of volume across its US and EU endpoints. The gateway processed roughly 15.1 billion input tokens and 285 million output tokens across 3.9M model requests.

2. Per-feature segmentation for cost visibility

FloQast layers virtual accounts on top of the proxy to segment inference by the specific product feature being rolled out. That makes it easy to track spend per feature and see where the LLM budget is going: which features are most used and where the team is getting the most value.

“We can easily have another proxy on top so we can segment LLM inference by the actual feature we're rolling out. It's an easy way to track pricing per feature, so we have visibility into the most-used feature and where we get the most out of our LLM budget.”

— Colin Sidberry, Software Engineer, FloQast

This shows up directly in the data: distinct virtual accounts (transform, FDM, JEM, variance, and reconciliation workloads) each carry hundreds of thousands of requests, with total inference spend tracked at roughly $25K over the period, fully attributable by feature.
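Per-feature attribution of this kind amounts to grouping request-level cost records by virtual account. The sketch below assumes a simple record shape with `virtual_account` and `cost_usd` fields; that shape, the `spend_by_feature` name, and the sample costs are illustrative and are not FloQast data or a TrueFoundry export format.

```python
from collections import defaultdict


def spend_by_feature(records):
    """Sum cost per virtual account, highest spend first."""
    totals = defaultdict(float)
    for record in records:
        totals[record["virtual_account"]] += record["cost_usd"]
    # Sort descending so the most expensive feature is listed first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Made-up sample records, for illustration only.
sample = [
    {"virtual_account": "transform", "cost_usd": 0.50},
    {"virtual_account": "variance", "cost_usd": 1.25},
    {"virtual_account": "transform", "cost_usd": 0.25},
    {"virtual_account": "variance", "cost_usd": 0.25},
]

for feature, cost in spend_by_feature(sample).items():
    print(f"{feature}: ${cost:.2f}")
```

Because every request is tagged with a virtual account at the gateway, this grouping needs no changes in the applications themselves.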

3. Guardrails for security and compliance

FloQast runs a suite of security guardrails through the gateway: PII detection, secrets detection, a SQL sanitizer for injection attacks, prompt-injection defense, content moderation, and code linting. Over 90 days the gateway ran more than 80K guardrail checks while adding only about 53ms of average latency, flagging risky requests without slowing the product down.

“The ease of standing up guardrails so we can run against SQL injection attacks and other safety guardrails we want for best practices when it comes to security. And for software compliance, because we're a financial company, the ability to scope access down to least-privilege across your API keys, so you can tie certain keys to certain LLM functionalities instead of everybody having full access. You're only giving them what they need.”

— Colin Sidberry, Software Engineer, FloQast
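Least-privilege key scoping can be thought of as an allowlist check performed before any request is forwarded. The sketch below is a generic illustration and does not reflect TrueFoundry's API or FloQast's policies: `KeyScope`, `is_allowed`, and the model name `example-chat-model` are hypothetical, and only the MCP server name `jem` is borrowed from the case study.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class KeyScope:
    """What a single API key may touch (hypothetical structure)."""
    allowed_models: FrozenSet[str] = frozenset()
    allowed_mcp_servers: FrozenSet[str] = frozenset()


def is_allowed(scope: KeyScope,
               model: Optional[str] = None,
               mcp_server: Optional[str] = None) -> bool:
    """Deny by default; allow only what the key's scope explicitly lists."""
    if model is None and mcp_server is None:
        return False  # nothing requested, nothing granted
    if model is not None and model not in scope.allowed_models:
        return False
    if mcp_server is not None and mcp_server not in scope.allowed_mcp_servers:
        return False
    return True


# A key for a journal-entry feature: one model, one MCP server.
JEM_KEY = KeyScope(
    allowed_models=frozenset({"example-chat-model"}),
    allowed_mcp_servers=frozenset({"jem"}),
)
```

Here `is_allowed(JEM_KEY, mcp_server="fdm")` returns `False`, because the key was never granted that server; anything not explicitly listed is refused.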

MCP and agent management

Beyond raw inference, FloQast uses the gateway to manage its MCP (Model Context Protocol) servers, the tools its AI agents call to read financial data, retrieve rules, pull table data, and work with journal entries. The MCP gateway handled 139K tool calls across servers like playbook, transform, FDM, and JEM, at an average latency of 389ms, giving FloQast a single governed surface for agent tool use alongside model calls.

Support that closes the loop across time zones

FloQast highlighted the support experience as a differentiator. The team works with TrueFoundry through a dedicated Slack channel and, more recently, an AI support assistant that answers within minutes, which is valuable for a team that isn't always in the same time zone.

“It's been really great working with the TrueFoundry support team. We have a dedicated Slack channel, the team is really responsive, and the AI system you released a couple of months ago has been really helpful: it responds within minutes, the accuracy of the information has been solid, and the support team follows up soon after to check if there are any additional questions.”

— Colin Sidberry, Software Engineer, FloQast

Results: reliable, governed AI at scale

By centralizing on TrueFoundry, FloQast turned a complex multi-provider, multi-region AI footprint into a single governed platform.

Over the most recent 90-day window, the gateway processed more than 4 million requests across chat completions, responses, embeddings, and MCP calls, maintaining a 98.9% success rate on model traffic. Intelligent routing directed 3.73M requests through virtual-model routing rules with a routing failure rate near 0.04%. Crucially, all of this ran with prompts and traces stored inside FloQast's own infrastructure, region by region, preserving the data governance posture a financial-software company requires while giving the team one place for observability, cost data, and security best practices.
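To put those percentages in absolute terms, the short calculation below converts them into implied request counts. It is a back-of-the-envelope sketch: the inputs are the rounded figures quoted in this case study, and pairing the 98.9% success rate with the 3.9M model requests is an assumption, so the outputs are approximations rather than reported numbers.

```python
def implied_count(total_requests: int, rate_percent: float) -> int:
    """Convert a percentage rate into an approximate number of requests."""
    return round(total_requests * rate_percent / 100)


# Rounded inputs taken from the case study text.
routed_requests = 3_730_000      # requests sent through virtual-model routing
routing_failure_rate = 0.04      # percent, quoted as "near 0.04%"
model_requests = 3_900_000       # assumed base for the model success rate
model_success_rate = 98.9        # percent

routing_failures = implied_count(routed_requests, routing_failure_rate)
model_failures = implied_count(model_requests, 100 - model_success_rate)

print(f"implied routing failures: {routing_failures:,}")
print(f"implied unsuccessful model requests: {model_failures:,}")
```

On these inputs the routing layer accounts for roughly 1,500 failures out of 3.73M routed requests, against roughly 43,000 unsuccessful model requests overall.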

“Most definitely, it's been a huge game changer. Reliability was our key concern when looking for a gateway. The fact that everything can be stored within our own infrastructure, so we keep our own data-governance rules for traces and prompts, plus a centralized point for observability and pricing data and coverage on security best practices: it's been really helpful.”

— Colin Sidberry, Software Engineer, FloQast

Key takeaways for teams adopting an AI gateway

Govern data residency from day one. For regulated industries, where prompts and traces are stored, and in which region, is a first-class requirement rather than an afterthought. Routing inference through region-specific gateways into provider endpoints let FloQast keep data in the right place automatically.

Centralize inference, then segment it. A single proxy for all model providers, with per-feature virtual accounts layered on top, gave FloQast both the simplicity of one integration and the granularity to attribute cost and usage to individual features.

Treat guardrails and least-privilege access as default infrastructure. Running PII, secrets, SQL-injection, and prompt-injection guardrails at the gateway, and scoping API keys to specific capabilities, builds security into every request rather than bolting it on per application.

Invest in onboarding early. With so much capability available, Colin's advice to other engineers is to sync deeply with the TrueFoundry team up front to set things up correctly and take full advantage of what's built in.

“There's a lot packed under the hood, so it's easy to stand things up the wrong way. What was really helpful was doing a deep dive with the team early, walking through what we were trying to do and getting a knowledge share on how to better implement it. If I could go back and do that first, it would have made everything easier.”

— Colin Sidberry, Software Engineer, FloQast

Conclusion

FloQast's experience shows that running AI in a regulated, security-first domain doesn't have to mean building governance infrastructure from scratch. By standardizing on TrueFoundry's AI Gateway for inference, MCP and agent management, guardrails, and routing, with data kept in its own cloud, FloQast scaled to millions of governed requests across six model providers while preserving the reliability, data residency, and least-privilege security its customers expect.
