What is Braintrust pricing for the Pro plan in 2026?

Braintrust pricing for the Pro plan is $249 per month. It includes 5 GB of processed data, 50,000 scores, 30-day retention, three fixed roles, Google SSO, and priority email support. Overage pricing applies beyond included limits, with extra processed data and additional scores billed separately.

Does Braintrust offer a free tier, and what does it include?

Yes, Braintrust offers a free Starter tier with no credit card requirement. It includes 1 GB of processed data, 10,000 scores, 14-day retention, and one Owner role. It is useful for individual developers, early eval workflows, and small teams testing Braintrust before moving to paid usage.

What features require the Braintrust Enterprise plan and why?

Braintrust Enterprise is required for custom RBAC, SAML and OIDC SSO, signed BAA coverage, SOC 2 needs, self-hosting, guaranteed SLAs, custom retention, and premium support. These features matter because large organizations need stronger access control, deployment flexibility, audit readiness, and legal terms before production adoption.

Does Braintrust pricing include HIPAA compliance and a BAA at any tier?

Braintrust provides a signed Business Associate Agreement under the Enterprise plan only. Starter and Pro tiers do not include BAA coverage, so teams handling protected health information should not rely on them. For healthcare use cases, contract requirements matter more than usage volume.

What is the Braintrust cost for teams that exceed included usage limits?

The Braintrust cost rises when teams exceed included data or score limits. On Pro, additional processed data costs $3 per GB, and extra scores cost $1.50 per thousand. A team logging 10 GB and 100,000 scores in a month would pay $339: the $249 base fee, $15 for the extra 5 GB, and $75 for the extra 50,000 scores.

Braintrust Pricing 2026: Plans, Costs, and What Is Missing

What does Braintrust pricing not cover for enterprise teams?

While Braintrust provides strong capabilities for AI evaluation, observability, and trace analysis, it is not designed to govern AI requests before they reach a model. Enterprise teams often require additional controls such as inference-layer access management, hard token budgets, VPC-native policy enforcement, and MCP tool governance. These capabilities help organizations manage security, costs, and compliance proactively, complementing Braintrust’s strengths in monitoring model performance and quality after execution.


Most teams adopt Braintrust with a clear goal. They want to know whether an LLM output is useful, safe, and consistent. They also want to catch regressions before a prompt change reaches production.

That dual focus on evals and observability is where Braintrust performs well. It helps teams score responses, trace requests, compare experiments, and review metrics across model behavior. For individual developers, small teams, and AI-native product teams, that can create real workflow discipline.

Braintrust pricing becomes more complex when usage grows. The headline looks simple: Starter is free, Pro is $249 per month, and Enterprise is custom. The real decision depends on usage limits, overage charges, governance controls, retention needs, and enterprise requirements.

Large organizations should read the tiers carefully. Custom RBAC, SAML SSO, a signed BAA, SOC 2 support, and self-hosting sit at the Enterprise level. That means the real Braintrust cost is often shaped by compliance needs, not only volume.

The sections below explain each plan, the metered charges that apply beyond limits, and the governance gap evaluation tools usually leave open. They also explain where TrueFoundry can complement Braintrust when teams need inference control before the model call runs.

Braintrust Pricing Plans: What Each Tier Includes

Braintrust restructured its plans in March 2026 into three tiers. The tiers differ less in raw usage than in the controls each one unlocks. Higher tiers do raise the included data and score limits, but each step up also unlocks a different set of governance features.

Plan	Base Pricing	Included Usage	Governance and Security	Best Fit
Starter	$0	1 GB processed data, 10,000 scores	Owner role, basic access	Individual developers and early experiments
Pro	$249/month	5 GB processed data, 50,000 scores	Three fixed roles, Google SSO	Growing teams and production evaluations
Enterprise	Custom	Custom limits	Custom RBAC, SAML/OIDC, BAA, SOC 2, self-hosting	Large organizations with compliance needs

Higher tiers raise data and score limits, while also gating stronger controls. This is important because evaluation platforms often become operational systems. Teams rely on them for dashboards, alerts, datasets, playground workflows, and release confidence.


Starter

The Starter tier costs nothing and does not require a credit card. It includes 1 GB of processed trace data per month, 10,000 scores, and 14-day log retention. That is useful for one developer testing an evaluation workflow or a smaller project in early experimentation.

The limitation is the permission model. Starter gives teams one Owner role, without deeper separation between editors and read-only users. Manual grading is also capped at one human-review scorer per project, which limits collaboration once review workflows expand.

When teams cross the included limits, metered billing applies. Starter overages cost $4 per GB of processed data and $2.50 per thousand scores. That makes Starter useful for learning the platform, although careful tracking matters once experiments become recurring workflows.

Pro

Pro costs $249 per month and is the tier most growing teams evaluate first. It includes 5 GB of processed data, 50,000 scores, and 30-day retention. It also adds three fixed roles: owner, engineer, and viewer.

Google SSO comes with Pro, along with team features such as custom charts, dataset snapshots, environments, annotations, and expanded review workflows. Pro also supports unlimited human-review scorers, which is a meaningful upgrade from Starter’s single-scorer limit.

Overage pricing is lower than Starter. Additional processed data costs $3 per GB, while extra scores cost $1.50 per thousand. Pro works well when teams need evaluation discipline, stronger transparency, and recurring model-quality workflows without Enterprise procurement.

Enterprise

Enterprise is a custom annual contract based on usage and requirements. This is where Braintrust places the controls many security and compliance teams consider mandatory. These include custom RBAC, SAML and OIDC SSO, custom retention, domain mappings, export automations, SLAs, and self-hosting.

Enterprise also supports HIPAA-related BAA requirements and SOC 2 needs. It is the right tier when evaluation data includes sensitive prompts, customer outputs, regulated records, or internal production traces. The drawback is budget uncertainty, as pricing depends on negotiation.

The Enterprise tier is also relevant for teams with strict deployment needs. If evaluation data cannot remain in Braintrust’s hosted environment, self-hosting becomes part of the Enterprise discussion. That shifts pricing from a published plan to a conversation with the vendor.

Figure: Braintrust pricing tiers comparison table for 2026

What Braintrust Pricing Actually Costs at Scale

Sticker prices are only the starting point. Two factors drive actual cost: usage that exceeds included limits and controls that only appear at higher tiers. Together, they make Braintrust pricing a planning exercise, not a simple monthly subscription decision.

Usage-Based Charges Apply Beyond Plan Limits

Cross the included data or score allotment, and metered billing stacks on top of the base fee. Picture a Pro team logging 10 GB of trace data and running 100,000 scores in a month. The base $249 includes 5 GB and 50,000 scores.

The extra 5 GB costs $15 at $3 per GB. The next 50,000 scores cost $75 at $1.50 per thousand. That brings the estimated monthly cost to $339 before considering any broader Enterprise requirements.

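At this volume, the overage is manageable. At ten times that volume (100 GB and 1,000,000 scores), the same Pro rates would add $285 for data and $1,425 for scores, so overage rather than the $249 base fee would drive the bill. Heavier scoring and longer retention needs can shift attention even further from the base fee. For scale planning, teams should forecast traces, scores, storage behavior, datasets, and review frequency.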
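The overage arithmetic above can be sketched as a small calculator. This is an illustrative sketch, not Braintrust's billing logic: the `Plan` and `monthly_cost` names are hypothetical, the rates are the ones quoted in this article, and the code assumes overage is prorated linearly, not rounded up to whole units.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Plan:
    base_fee: Decimal             # monthly base price in USD
    included_gb: Decimal          # processed data included per month
    included_scores: int          # scores included per month
    per_extra_gb: Decimal         # overage rate per additional GB
    per_extra_1k_scores: Decimal  # overage rate per additional 1,000 scores


# Figures as quoted in this article; check the current pricing page before relying on them.
STARTER = Plan(Decimal("0"), Decimal("1"), 10_000, Decimal("4"), Decimal("2.50"))
PRO = Plan(Decimal("249"), Decimal("5"), 50_000, Decimal("3"), Decimal("1.50"))


def monthly_cost(plan: Plan, gb: Decimal, scores: int) -> Decimal:
    """Estimate one month's bill: base fee plus metered overage."""
    extra_gb = max(gb - plan.included_gb, Decimal("0"))
    extra_scores = max(scores - plan.included_scores, 0)
    # Assumption: linear proration, no rounding up to whole GB or whole thousands.
    data_overage = extra_gb * plan.per_extra_gb
    score_overage = Decimal(extra_scores) / Decimal(1000) * plan.per_extra_1k_scores
    return plan.base_fee + data_overage + score_overage


# The worked example from the text: Pro, 10 GB, 100,000 scores.
print(monthly_cost(PRO, Decimal("10"), 100_000))
# The same team at ten times the volume.
print(monthly_cost(PRO, Decimal("100"), 1_000_000))
```

Running the same function over a few forecast scenarios makes it easier to see where metered charges overtake the base fee.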


RBAC and SSO Are Enterprise-Only

The exact wording matters because the headline can oversimplify. Pro is not without access control. It includes three fixed roles and Google sign-on, which can be enough for many teams.

What Pro does not offer is custom RBAC or SAML/OIDC SSO. These are the controls that map access to Okta, Azure AD, or other enterprise identity systems. For access reviews, fixed roles and Google sign-on may not satisfy enterprise security teams.

That distinction affects the true Braintrust cost. A team may sit well within Pro usage limits and still need Enterprise because access governance requires SAML or custom roles. In this case, security requirements determine the tier before usage does.

Self-Hosting Requires Enterprise Tier

Running Braintrust inside your own cloud is an Enterprise capability. Starter and Pro use Braintrust’s hosted environment, meaning traces and evaluation data are processed outside the customer’s infrastructure boundary.

Braintrust’s self-hosted option separates customer-controlled data infrastructure from platform management. It is designed for teams that need stronger data control without operating the full platform alone. Even then, self-hosting still requires Enterprise procurement.

For regulated teams, this matters more than sticker price. If prompts, outputs, or evaluation traces cannot leave the organization’s boundary, Starter and Pro may be unsuitable. There is no intermediate tier between hosted Pro and negotiated Enterprise for that requirement.

HIPAA BAA Is Enterprise-Only

A signed Business Associate Agreement is available only through Enterprise. A BAA is required when a vendor handles protected health information under HIPAA. Without that contract, teams should not evaluate clinical or PHI-related model outputs on Starter or Pro.

SOC 2 and advanced compliance terms follow a similar pattern. The deciding factor becomes contract coverage, not only monthly volume. A healthcare, insurance, or clinical AI team may need Enterprise even when usage remains modest.
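The pattern in this section, where a single requirement can force a tier regardless of volume, can be sketched as a lookup. The requirement labels below are hypothetical names chosen for illustration; the tier each one maps to follows the plan descriptions in this article and should be confirmed with the vendor.

```python
from enum import IntEnum


class Tier(IntEnum):
    STARTER = 0
    PRO = 1
    ENTERPRISE = 2


# Hypothetical requirement labels mapped to the lowest tier that covers them,
# based on the plan descriptions in this article.
MIN_TIER_FOR = {
    "fixed_roles": Tier.PRO,
    "google_sso": Tier.PRO,
    "retention_30_days": Tier.PRO,
    "custom_rbac": Tier.ENTERPRISE,
    "saml_oidc_sso": Tier.ENTERPRISE,
    "signed_baa": Tier.ENTERPRISE,
    "self_hosting": Tier.ENTERPRISE,
    "custom_retention": Tier.ENTERPRISE,
    "guaranteed_sla": Tier.ENTERPRISE,
}


def minimum_tier(requirements: set[str]) -> Tier:
    """Return the lowest tier that satisfies every listed requirement."""
    unknown = requirements - set(MIN_TIER_FOR)
    if unknown:
        raise ValueError(f"unknown requirement(s): {sorted(unknown)}")
    # The strictest single requirement decides the tier, independent of usage.
    return max((MIN_TIER_FOR[r] for r in requirements), default=Tier.STARTER)


# A team well inside Pro usage limits that also needs a signed BAA.
print(minimum_tier({"google_sso", "signed_baa"}).name)
```

The `max` over requirements mirrors the point made above: security and contract needs can determine the tier before usage does.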

When Braintrust Pricing Makes Sense and When It Does Not

Braintrust is strongest when the job is clearly about model-quality evaluation. It gives teams a structured place to run evals, inspect traces, compare experiments, and find regressions. Company size matters less than the type of workflow being managed.

It earns its keep when:

Teams are tuning model quality and need scoring, trace inspection, and regression checks.
Pro usage stays within the included data and score limits.
Three built-in roles are enough for current access needs.
The goal is to evaluate outputs, not control inference.
The evaluation team needs repeatable datasets, metrics, and playground workflows.
Governance before inference already has a separate owner.

It stops paying off cleanly when:

Compliance drives the purchase and requires Enterprise from day one.
A signed BAA, SOC 2 evidence, or SAML SSO is mandatory.
Teams need hard budgets before inference requests execute.
Sensitive evaluation data cannot leave the customer environment.
Agents need tool-level governance before actions run.
The team needs inference-layer access control, not post-response review.

The practical takeaway is simple. Braintrust works well when teams need evaluation and observability discipline. It becomes less complete when buyers expect it to control what models, agents, or tools are allowed to do before execution.

Figure: Braintrust pricing coverage versus enterprise governance gaps at each tier

What Braintrust Pricing Does Not Cover for Enterprise Teams

Set the tiers aside for a moment, as some enterprise needs fall outside Braintrust’s core purpose. Braintrust monitors and evaluates what happens after a model responds. It is not designed to govern every request before inference executes.

Four capabilities sit on the other side of that line:

Access controls at the inference layer: Teams need to decide which services, roles, or users can call each model. That decision must happen before inference, not after the response returns.
Per-team token budgets with hard limits: A dashboard can show overspending after the fact. A gateway budget can stop a runaway agent before the money is spent. Teams can review broader gateway cost planning before choosing how to control inference spend.
VPC-native inference governance: Some enterprises need policy enforcement on the request path inside their own cloud. This prevents prompts and responses from being exposed to a vendor environment for inspection.
MCP tool governance: Agent tools need controls on which tools can run and which identities they use. This area has drawn more scrutiny as MCP security research has expanded.

This does not make Braintrust weak. It means Braintrust has a defined role in the AI stack. It helps teams measure quality and catch regressions, while inference governance requires a separate request-path control layer.

The same boundary applies to Braintrust’s storage architecture. Braintrust describes its storage layer as a database designed for AI trace data at scale. That supports observability and querying, but enforcement on the request path still has to happen before the model executes.
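To make the difference between a dashboard and a hard limit concrete, here is a minimal sketch of a per-team token budget that is checked before a model call runs. The `TeamTokenBudget` and `BudgetExceeded` names are hypothetical and do not represent any vendor's API. The sketch keeps counters in memory and is not thread-safe; a real gateway would need shared, atomic counters.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push a team past its hard token limit."""


class TeamTokenBudget:
    def __init__(self, limits: dict[str, int]) -> None:
        self._limits = dict(limits)
        self._used = {team: 0 for team in limits}

    def reserve(self, team: str, tokens: int) -> int:
        """Reserve tokens before the model call; return the remaining budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if team not in self._limits:
            # Assumption: teams without a configured budget are denied by default.
            raise BudgetExceeded(f"no budget configured for team {team!r}")
        if self._used[team] + tokens > self._limits[team]:
            # Reject before any spend happens; usage is left unchanged.
            raise BudgetExceeded(f"team {team!r} would exceed its token limit")
        self._used[team] += tokens
        return self._limits[team] - self._used[team]


budget = TeamTokenBudget({"search": 1_000})
print(budget.reserve("search", 600))  # first request fits within the limit
try:
    budget.reserve("search", 500)     # second request would exceed it
except BudgetExceeded as exc:
    print(f"blocked before inference: {exc}")
```

The key design choice is that the check runs on the request path, so a rejected call never reaches the model and never incurs cost.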


TrueFoundry as a Complement or Alternative to Braintrust

TrueFoundry and Braintrust occupy different layers of the AI stack. Braintrust sits after inference and helps teams evaluate outputs, compare scores, and catch regressions. TrueFoundry sits before inference and governs whether a request should run, which model it can reach, and how that action is logged.

Architecture diagram showing TrueFoundry governing inference before the request and Braintrust evaluating output afterward

Teams that need both layers can run them together. Braintrust can continue handling evals and observability after the response returns. TrueFoundry can manage the request path through a governed AI gateway, where access policies, budgets, and audit logs apply before inference begins.

This distinction matters when AI workloads move from testing to production. Evaluation helps teams understand output quality, while governance helps teams control exposure, cost, and access before the model call happens.

TrueFoundry is relevant when teams need:

Request-path control: Enforce identity, access, routing, and policy checks before inference executes.
Budget enforcement: Apply model, team, user, or workflow limits before costs accumulate.
Private deployment: Keep prompts, responses, logs, and metadata inside the customer’s cloud boundary.
Audit-ready records: Tie model calls to user identity, cost, latency, and policy outcomes.
Agent workflow control: Govern agent behavior, tool access, circuit breakers, and runtime limits where needed.
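As a rough illustration of request-path control combined with audit-ready records, the sketch below decides whether a call may run and records the outcome either way. It is not TrueFoundry's API: `MODEL_ACCESS`, `authorize`, `AuditRecord`, and the team and model names are all hypothetical, and a real gateway would also capture cost and latency after the call completes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str  # UTC, ISO 8601
    user: str
    team: str
    model: str
    allowed: bool
    reason: str


# Hypothetical policy: which teams may call which models.
MODEL_ACCESS = {
    "support": {"small-model"},
    "research": {"small-model", "large-model"},
}


def authorize(user: str, team: str, model: str) -> AuditRecord:
    """Decide before inference whether the call may run, and record the outcome."""
    allowed = model in MODEL_ACCESS.get(team, set())  # unknown teams are denied
    reason = "team policy allows model" if allowed else "model not permitted for team"
    return AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        user=user,
        team=team,
        model=model,
        allowed=allowed,
        reason=reason,
    )


record = authorize("ana", "support", "large-model")
if not record.allowed:
    print(f"denied before inference: {record.reason}")
```

Denied requests produce a record as well, which is what lets an access review reconstruct who attempted which call and why it was blocked.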

TrueFoundry can also serve as the observability layer for teams that want fewer systems. It records model calls, usage, costs, and agent actions with structured metadata. Those logs can remain inside the customer’s VPC and connect with existing monitoring tools.

The practical choice is straightforward. Braintrust remains useful when the primary need is output evaluation and regression tracking. TrueFoundry becomes the stronger layer when teams need inference governance, hard budgets, private deployment, and compliance-ready audit trails.

If you want to see VPC-native inference governance and per-team cost controls in action on your own workloads, you can book a demo with us.

