Aviva Credito is a Mexico-based lender focused on expanding access to credit. To reach customers that traditional banks and fully online fintechs struggle to serve, Aviva operates small physical kiosks supported by an automated, tablet-first onboarding experience, building trust while reducing fraud risk.
As Aviva’s AI initiatives grew from computer vision models to production-grade chatbots and document verification workflows, the team faced two recurring challenges: (1) deploying and operating LLM services without requiring deep Kubernetes expertise, and (2) managing multiple LLM providers with consistent observability, cost control, and agility.
By using TrueFoundry’s Deployment and AI Gateway, Aviva empowered every ML/AI engineer to ship production services independently, gained unified observability across Azure and GCP model providers, and created a scalable foundation for safety and agentic workflows.
Aviva’s mission is to increase access to credit for underserved communities in Mexico. Its model pairs a physical presence (small kiosks staffed by a single employee) with a fully automated, tablet-driven process, delivering the best of both worlds: the high trust and lower fraud of an in-person touchpoint with the speed of automation.
Aviva’s first major inflection point came from a practical need: deploying an LLM to recognize Mexico’s INE identity cards. The ML team could build and fine-tune the model, but shipping it reliably required an operational path they didn’t yet have. Early attempts ranged from manual VM-based deployments (slow and error-prone) to managed services that either lacked GPU support or failed to deliver quickly.
TrueFoundry’s deployment experience changed that: clear logs and observability sidecars surfaced the root cause behind a failing container, allowing the team to fix the image and successfully deploy in under an hour.
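For illustration, the sketch below shows the general shape of such a containerized inference service: a single HTTP endpoint wrapping a document-recognition model. The framework choice, route name, and model stub are assumptions for this example, not Aviva’s actual implementation.

```python
"""Minimal sketch of a containerized document-verification service.
The model and field names are hypothetical stand-ins."""
from dataclasses import dataclass

from fastapi import FastAPI, UploadFile

app = FastAPI()


@dataclass
class VerificationResult:
    is_valid: bool
    confidence: float


def recognize_ine(image_bytes: bytes) -> VerificationResult:
    # Stand-in for the fine-tuned INE recognition model; a real service
    # would load weights once at startup and run (GPU) inference here.
    return VerificationResult(is_valid=len(image_bytes) > 0, confidence=0.0)


@app.post("/verify")
async def verify(document: UploadFile) -> dict:
    image_bytes = await document.read()
    result = recognize_ine(image_bytes)
    return {"is_valid": result.is_valid, "confidence": result.confidence}
```

Packaged into a container image, a service of this shape is what the platform deploys; the clear logs and observability sidecars mentioned above are what made a failing container like this debuggable in practice.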
By centralizing all LLM traffic through TrueFoundry’s AI Gateway, Aviva gained end-to-end visibility and control across a rapidly scaling, multi-cloud AI stack. Over a 90-day period, the team managed nearly half a million production requests and over 1.8B input tokens with predictable cost, measurable reliability, and significantly improved engineering velocity. The Gateway enabled rapid detection of cost and latency anomalies, model-level routing and failover without application changes, and a shared abstraction that allowed engineers to deploy, upgrade, and operate LLM-powered services independently.
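As a concrete picture of that shared abstraction, the hedged sketch below shows what a service’s LLM call might look like when all traffic goes through the gateway’s OpenAI-compatible endpoint. The base URL, environment variable, and model alias are illustrative assumptions, not Aviva’s actual configuration.

```python
# Sketch of application code routing LLM traffic through an AI gateway
# via an OpenAI-compatible API. The base URL, API-key variable, and
# model alias below are hypothetical placeholders.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/api/inference/openai",  # hypothetical gateway URL
    api_key=os.environ["GATEWAY_API_KEY"],
)

# The gateway resolves the model alias to a provider (e.g. Azure or GCP)
# and can reroute to a fallback without any change to this code.
response = client.chat.completions.create(
    model="azure/gpt-4o",  # hypothetical provider/model alias
    messages=[{"role": "user", "content": "Classify this support ticket..."}],
)
print(response.choices[0].message.content)
```

Because every service calls the same endpoint and the gateway owns provider credentials, routing, and failover, swapping providers or adding a fallback model becomes a gateway configuration change rather than a code change across services.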