The Gateway Guardrail Metrics Query API lets you query guardrail evaluations: which guardrail ran on which entity (input or output), what the outcome was (pass, fail, error), and how long it took. Results can be returned as a distribution (aggregated) or as a timeseries, with filtering and grouping on fields such as guardrail name, outcome, and entity scope.
This page covers datasource: "guardrailMetrics". For other datasources, see the sibling pages for Model, MCP, Cache, Routing, and Agent metrics.

Access control

  • Tenant admins: Can query metrics for the entire organization (tenant-wide).
  • Users: Can query their own data and their teams’ data.
  • Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.
The server applies RBAC automatically; callers don’t pass any RBAC fields.

Contents

  • Overview: Authentication, quick start, and API reference
  • Filtering: Filter operators, fields, and combinations
  • Distribution examples: Aggregated (distribution) query examples
  • Timeseries examples: Time-bucketed (timeseries) query examples
  • Response format: Response JSON structure and error responses

Authentication

You need to authenticate with your TrueFoundry API key. You can use either a Personal Access Token (PAT) or Virtual Account Token (VAT).
To generate an API key:
  1. Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
  2. Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)
For detailed authentication setup, see our Authentication guide.

Quick Start

Distribution query

Counts and latency percentiles per guardrail and outcome, restricted to input-scope evaluations:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "guardrailMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "guardrailName"},
            {"type": "p50", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["guardrailName", "guardrailResult"],
        "filters": [
            {"fieldName": "appliedOnEntityScope", "operator": "IN", "value": ["input"]}
        ]
    }
)

print(response.json())

Timeseries query

Hourly counts and p99 latency per guardrail:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "guardrailMetrics",
        "type": "timeseries",
        "interval": "1 hour",
        "aggregations": [
            {"type": "count", "column": "guardrailName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["guardrailName"]
    }
)

print(response.json())

API reference

Endpoint

POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query
Send a JSON request body to this endpoint with the headers Authorization: Bearer <your_api_key> and Content-Type: application/json.
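The URL and headers can be assembled once and reused across queries. The sketch below is one way to do that; the helper name build_query_request and the host gateway.example.com are hypothetical, and prepending https:// to a bare host is an assumption made for convenience.

```python
def build_query_request(control_plane_url: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Return the metrics query URL and the headers the endpoint expects."""
    host = control_plane_url.strip().rstrip("/")
    # Assumption: if only a host name is given, default to HTTPS.
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    url = f"{host}/api/svc/v1/llm-gateway/metrics/query"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers


# Hypothetical host; substitute your own control plane URL and API key.
url, headers = build_query_request("gateway.example.com", "<your_api_key>")
```

The returned pair can be passed straight to requests.post(url, headers=headers, json=payload), as in the Quick Start examples.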

Request parameters

startTs (string, required)
ISO 8601 timestamp marking the inclusive lower bound of the query window.
endTs (string, required)
ISO 8601 timestamp marking the exclusive upper bound of the query window.
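Because startTs is inclusive and endTs is exclusive, back-to-back windows can share a boundary without double counting. A minimal sketch for producing a window in the same format as the Quick Start examples (UTC, millisecond precision, Z suffix) follows; the helper names to_iso_millis and query_window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_iso_millis(dt: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 in UTC with milliseconds."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def query_window(end: datetime, hours: int) -> dict:
    """Build startTs/endTs covering the `hours` before `end` (end is exclusive)."""
    start = end - timedelta(hours=hours)
    return {"startTs": to_iso_millis(start), "endTs": to_iso_millis(end)}


# The 24 hours before midnight UTC on 2026-04-22.
window = query_window(datetime(2026, 4, 22, tzinfo=timezone.utc), hours=24)
```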
datasource (string, required)
The data source to query. Use "guardrailMetrics" for Gateway guardrail metrics.
type (string, required)
The type of query to execute:
  • "distribution": returns aggregated rows (one row per groupBy combination).
  • "timeseries": returns time-bucketed rows. Requires interval.
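The rule that timeseries queries need an interval can be checked client-side before sending a request. The following is a sketch only; build_query is a hypothetical helper, and attaching interval only for timeseries queries is a choice made here, since this page does not say how the server treats interval on a distribution query.

```python
from typing import Optional

QUERY_TYPES = ("distribution", "timeseries")


def build_query(start_ts: str, end_ts: str, query_type: str,
                interval: Optional[str] = None) -> dict:
    """Assemble the required fields of a guardrailMetrics query body."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"type must be one of {QUERY_TYPES}, got {query_type!r}")
    if query_type == "timeseries" and not interval:
        raise ValueError('"timeseries" queries require an interval')
    payload = {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "guardrailMetrics",
        "type": query_type,
    }
    # Assumption: interval is only sent for timeseries queries.
    if query_type == "timeseries":
        payload["interval"] = interval
    return payload


hourly = build_query("2026-04-21T00:00:00.000Z", "2026-04-22T00:00:00.000Z",
                     "timeseries", interval="1 hour")
```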
aggregations (array)
Array of { type, column } objects describing the aggregations to compute. When omitted, only the implicit total = COUNT(*) is returned.
Supported aggregation types:
  • sum: Sum of values
  • count: Non-null count of the column
  • countDistinct: Distinct count
  • min: Minimum value
  • max: Maximum value
  • avg: Average
  • p5, p10, p25, p50, p75, p90, p95, p99, p999: Percentiles (approximate)
  • rateSum, rateAvg, rateMin, rateMax: Rates normalised by the interval in seconds (timeseries only)
  • ratePerMinute: Value divided by the interval in minutes (timeseries only)
Supported aggregation columns:
  • latencyMs: Supports all scalar and all percentile aggregation types. Guardrail evaluation latency in ms.
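The type list above can be turned into a small client-side check that catches typos and rate aggregations used outside a timeseries query. This is a sketch under the assumption that the listed names are exhaustive; the aggregation helper and the set names are hypothetical, and the server remains the authority on what it accepts.

```python
SCALAR_TYPES = {"sum", "count", "countDistinct", "min", "max", "avg"}
PERCENTILE_TYPES = {"p5", "p10", "p25", "p50", "p75", "p90", "p95", "p99", "p999"}
TIMESERIES_ONLY_TYPES = {"rateSum", "rateAvg", "rateMin", "rateMax", "ratePerMinute"}


def aggregation(agg_type: str, column: str, query_type: str) -> dict:
    """Build one { type, column } aggregation object after basic validation."""
    known = SCALAR_TYPES | PERCENTILE_TYPES | TIMESERIES_ONLY_TYPES
    if agg_type not in known:
        raise ValueError(f"unknown aggregation type: {agg_type!r}")
    if agg_type in TIMESERIES_ONLY_TYPES and query_type != "timeseries":
        raise ValueError(f"{agg_type} is only available for timeseries queries")
    return {"type": agg_type, "column": column}


# Average and tail latency of guardrail evaluations for a distribution query.
latency_summary = [
    aggregation(t, "latencyMs", "distribution") for t in ("avg", "p50", "p95", "p99")
]
```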
groupBy (array)
Array of field names to group results by. Custom metadata keys are supported with a metadata. prefix.
Available group-by fields:
  • guardrailName: The configured guardrail’s name
  • appliedOnEntityScope: Where the guardrail ran (e.g. input, output)
  • guardrailResult: Outcome (e.g. pass, fail, error)
  • userEmail: Group by user (response key: createdBySubjectSlug)
  • virtualaccount: Group by virtual account (response key: createdBySubjectSlug)
  • team: Unnests the Teams array
  • createdBySubjectType: Distinguishes user vs virtualaccount
  • metadata.<key>: Group by a custom metadata key
When groupBy contains userEmail (without virtualaccount), the server auto-injects WHERE CreatedBySubjectType = 'user'. virtualaccount alone auto-injects 'virtualaccount'. When both appear, scope it yourself with createdBySubjectType if needed.
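The auto-injection rule can be mirrored locally to predict which subject type a given groupBy will be restricted to. The function below is an illustrative sketch of the rule as described, not the server's implementation; implied_subject_type is a hypothetical name, and metadata.environment is a made-up metadata key.

```python
from typing import Optional


def implied_subject_type(group_by: list[str]) -> Optional[str]:
    """Return the CreatedBySubjectType the server is described as auto-injecting.

    None means no restriction is injected (neither field, or both, are present).
    """
    has_user = "userEmail" in group_by
    has_virtual_account = "virtualaccount" in group_by
    if has_user and not has_virtual_account:
        return "user"
    if has_virtual_account and not has_user:
        return "virtualaccount"
    return None


# "metadata.environment" is a hypothetical custom metadata key.
group_by = ["guardrailName", "userEmail", "metadata.environment"]
restriction = implied_subject_type(group_by)
```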
filters (array)
Array of filter objects, AND-combined. See Filtering for the full operator reference and the per-field allow-list.
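To illustrate what AND-combined means, the sketch below builds two filters in the shape used by the Quick Start example and evaluates them against sample rows locally. It is an illustration only: in_filter and matches_all are hypothetical helpers, the rows are invented, only the IN operator is handled, and filtering on guardrailResult is an assumption that should be checked against the allow-list on the Filtering page.

```python
def in_filter(field_name: str, values: list[str]) -> dict:
    """Build a filter object using the IN operator."""
    return {"fieldName": field_name, "operator": "IN", "value": list(values)}


def matches_all(row: dict, filters: list[dict]) -> bool:
    """Local model of AND semantics: a row is kept only if every filter matches."""
    return all(row.get(f["fieldName"]) in f["value"] for f in filters)


filters = [
    in_filter("appliedOnEntityScope", ["input"]),
    # Assumption: guardrailResult is on the filter allow-list.
    in_filter("guardrailResult", ["fail", "error"]),
]

# Invented rows, keyed by field name, to show which ones survive both filters.
rows = [
    {"appliedOnEntityScope": "input", "guardrailResult": "fail"},
    {"appliedOnEntityScope": "input", "guardrailResult": "pass"},
    {"appliedOnEntityScope": "output", "guardrailResult": "fail"},
]
kept = [row for row in rows if matches_all(row, filters)]
```

Only the first row is kept, because it is the only one that satisfies both conditions at once.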
interval (string)
Required for timeseries queries. Bucket size as <positive integer> <unit>, where <unit> is one of second, minute, hour, day, week, month, year (with or without a trailing s). Examples: "30 second", "5 minute", "1 hour", "1 day". Compound expressions like "1 hour 30 minute" are rejected.
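The interval format is simple enough to check with a regular expression before sending a request. The pattern below is a sketch of the format as described; it assumes a single space between the number and the unit and rejects leading zeros, which may be stricter than what the server accepts. The name is_valid_interval is hypothetical.

```python
import re

# <positive integer> <unit>, with an optional trailing "s" on the unit.
_INTERVAL_RE = re.compile(r"[1-9]\d* (second|minute|hour|day|week|month|year)s?")


def is_valid_interval(value: str) -> bool:
    """Return True if value looks like a single '<positive integer> <unit>' interval."""
    return _INTERVAL_RE.fullmatch(value) is not None


checks = {
    candidate: is_valid_interval(candidate)
    for candidate in ("30 second", "5 minutes", "1 hour", "1 hour 30 minute", "0 hour")
}
```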
intervalInSeconds (number, deprecated)
Deprecated alias for interval. Accepts a positive integer number of seconds. Prefer interval in new code. If both are provided, interval wins.
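Existing request bodies that still use intervalInSeconds can be rewritten mechanically. The sketch below is one possible migration helper, not part of the API: migrate_interval is a hypothetical name, and expressing the value as "<n> second" relies on the interval format described above. It keeps the documented precedence by leaving an existing interval untouched.

```python
def migrate_interval(payload: dict) -> dict:
    """Return a copy of payload that uses interval instead of intervalInSeconds."""
    migrated = {k: v for k, v in payload.items() if k != "intervalInSeconds"}
    # interval wins when both are present, so only convert when it is missing.
    if "interval" not in migrated and "intervalInSeconds" in payload:
        seconds = payload["intervalInSeconds"]
        if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds <= 0:
            raise ValueError("intervalInSeconds must be a positive integer")
        migrated["interval"] = f"{seconds} second"
    return migrated


legacy = {"type": "timeseries", "intervalInSeconds": 3600}
updated = migrate_interval(legacy)
```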