The Gateway Agent Metrics Query API lets you query agent-framework invocations and their outcomes: which agent ran, on which framework, on which server type, and how long it took. Results can be returned as a distribution (aggregated) or as a timeseries, with filtering and grouping available for both.
This page covers datasource: "agentMetrics". For other datasources, see the sibling pages for Model, MCP, Guardrail, Cache, and Routing metrics.

Access control

  • Tenant admins: Can query metrics for the entire organization (tenant-wide).
  • Users: Can query their own data and their teams’ data.
  • Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.
The server applies RBAC automatically; callers don’t pass any RBAC fields.

Contents

| Section | Description |
| --- | --- |
| Overview | Authentication, quick start, and API reference |
| Filtering | Filter operators, fields, and combinations |
| Distribution examples | Aggregated (distribution) query examples |
| Timeseries examples | Time-bucketed (timeseries) query examples |
| Response format | Response JSON structure and error responses |

Authentication

You need to authenticate with your TrueFoundry API key. You can use either a Personal Access Token (PAT) or Virtual Account Token (VAT).
To generate an API key:
  1. Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
  2. Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)
For detailed authentication setup, see our Authentication guide.

Quick Start

By default, agent metrics include every Gateway request, not just agent invocations. Rows that didn’t go through an agent will have agentName, agentFramework, and agentServerType set to null and surface in groupBy output as null buckets. The IS_NULL operator is not supported on these three fields; to scope a query to a known set of agents, filter with {"fieldName": "agentName", "operator": "IN", "value": ["<your-agent>"]} or use STRING_* operators.
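As a minimal sketch of the scoping advice above, the helper below appends an IN filter on agentName to an existing request body. The helper name scope_to_agents and the agent names are placeholders chosen for illustration; they are not part of the API.

```python
def scope_to_agents(payload: dict, agent_names: list) -> dict:
    """Return a copy of payload with an IN filter on agentName appended."""
    if not agent_names:
        raise ValueError("provide at least one agent name")
    scoped = dict(payload)
    # Copy the existing filters so the caller's payload is left untouched.
    scoped["filters"] = list(payload.get("filters", [])) + [
        {"fieldName": "agentName", "operator": "IN", "value": list(agent_names)}
    ]
    return scoped


base = {
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "groupBy": ["agentName"],
}
# "support-agent" and "billing-agent" are hypothetical agent names.
scoped = scope_to_agents(base, ["support-agent", "billing-agent"])
```

Because filters are AND-combined, the appended filter narrows whatever filters the payload already carries.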

Distribution query

p50 and p99 latency per agent:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "agentMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "p50", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["agentName"]
    }
)

print(response.json())

Timeseries query

Hourly p99 latency per agent, restricted to failed invocations:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "agentMetrics",
        "type": "timeseries",
        "interval": "1 hour",
        "aggregations": [
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["agentName"],
        "filters": [
            {"fieldName": "isFailure", "operator": "EQUAL", "value": True}
        ]
    }
)

print(response.json())

API reference

Endpoint

POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query
Post JSON to this endpoint with Authorization: Bearer <your_api_key> and Content-Type: application/json.
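The request described above can be wrapped in a small helper. This is a sketch only: it uses the standard library's urllib.request instead of requests so that it has no third-party dependency, and the names build_request and query_metrics are assumptions made for illustration.

```python
import json
import urllib.request


def build_request(control_plane_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a metrics query."""
    url = f"https://{control_plane_url}/api/svc/v1/llm-gateway/metrics/query"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query_metrics(control_plane_url: str, api_key: str, payload: dict, timeout: float = 30.0):
    """Send the query and return the decoded JSON body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_request(control_plane_url, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the URL, headers, and body easy to inspect without touching the network.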

Request parameters

startTs (string, required)
ISO 8601 timestamp marking the inclusive lower bound of the query window.
endTs (string, required)
ISO 8601 timestamp marking the exclusive upper bound of the query window.
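A minimal sketch of producing startTs and endTs in the same shape as the Quick Start examples (UTC, millisecond precision, Z suffix). The helper names to_iso_millis and window are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def to_iso_millis(dt: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    text = dt.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def window(end: datetime, hours: int) -> dict:
    """Return startTs/endTs covering the `hours` before `end` (end itself is excluded)."""
    return {
        "startTs": to_iso_millis(end - timedelta(hours=hours)),
        "endTs": to_iso_millis(end),
    }


day = window(datetime(2026, 4, 22, tzinfo=timezone.utc), 24)
# day == {"startTs": "2026-04-21T00:00:00.000Z", "endTs": "2026-04-22T00:00:00.000Z"}
```

Since startTs is inclusive and endTs is exclusive, consecutive windows can share a boundary timestamp without counting any row twice.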
datasource (string, required)
The data source to query. Use "agentMetrics" for Gateway agent metrics.
type (string, required)
The type of query to execute:
  • "distribution": returns aggregated rows (one row per groupBy combination).
  • "timeseries": returns time-bucketed rows. Requires interval.
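The two query types and the interval requirement can be checked on the client before sending a request. The sketch below is an assumption-laden convenience, not the server's validation logic; check_query_type is a hypothetical name.

```python
def check_query_type(payload: dict) -> None:
    """Raise ValueError if the type/interval combination looks invalid."""
    query_type = payload.get("type")
    if query_type not in ("distribution", "timeseries"):
        raise ValueError(f"unsupported type: {query_type!r}")
    # Assumption: the deprecated intervalInSeconds alias also satisfies the requirement.
    has_interval = "interval" in payload or "intervalInSeconds" in payload
    if query_type == "timeseries" and not has_interval:
        raise ValueError("timeseries queries require an interval")


check_query_type({"type": "distribution"})
check_query_type({"type": "timeseries", "interval": "1 hour"})
```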
aggregations (array)
Array of { type, column } objects describing the aggregations to compute. When omitted, only the implicit total = COUNT(*) is returned.

Supported aggregation types

| Type | Description |
| --- | --- |
| sum | Sum of values |
| count | Non-null count of the column |
| countDistinct | Distinct count |
| min | Minimum value |
| max | Maximum value |
| avg | Average |
| p5, p10, p25, p50, p75, p90, p95, p99, p999 | Percentiles (approximate) |
| rateSum, rateAvg, rateMin, rateMax | Rates normalised by the interval in seconds (timeseries only) |
| ratePerMinute | Value divided by the interval in minutes (timeseries only) |

Supported aggregation columns

| Column | Supported aggregation types | Notes |
| --- | --- | --- |
| latencyMs | all scalar and all percentiles | Agent invocation latency in ms |
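A short sketch of building the aggregations array for latencyMs, restricted to the scalar and percentile types that the column table lists. The rate types are deliberately left out because the table does not list them for latencyMs; latency_aggregations is a hypothetical helper name.

```python
SCALAR = {"sum", "count", "countDistinct", "min", "max", "avg"}
PERCENTILES = {"p5", "p10", "p25", "p50", "p75", "p90", "p95", "p99", "p999"}


def latency_aggregations(*agg_types: str) -> list:
    """Return { type, column } objects for latencyMs, rejecting unlisted types."""
    allowed = SCALAR | PERCENTILES
    unknown = [t for t in agg_types if t not in allowed]
    if unknown:
        raise ValueError(f"not listed for latencyMs: {unknown}")
    return [{"type": t, "column": "latencyMs"} for t in agg_types]


aggregations = latency_aggregations("avg", "p50", "p95", "p99")
```

The resulting list can be passed directly as the aggregations field of a distribution or timeseries request.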
groupBy (array)
Array of field names to group results by. Custom metadata keys are supported with a metadata. prefix.

Available group-by fields

| Field | Notes |
| --- | --- |
| agentName | Agent identifier |
| agentFramework | Framework, e.g. langgraph, crewai |
| agentServerType | Transport, e.g. sse, streamable-http |
| isFailure | Boolean: true for failed invocations. GroupBy is permitted but rarely useful (only two values); filtering is the primary use. |
| httpStatusCode | Raw HTTP status code (NULL for rows without one). Frontends typically render NULL as “unknown” and numerics as HTTP: <code>. |
| userEmail | Group by user (response key: createdBySubjectSlug) |
| virtualaccount | Group by virtual account (response key: createdBySubjectSlug) |
| team | Unnests the Teams array |
| createdBySubjectType | Distinguishes user vs virtualaccount |
| metadata.<key> | Group by a custom metadata key |
When groupBy contains userEmail but not virtualaccount, the server auto-injects WHERE CreatedBySubjectType = 'user'. When it contains virtualaccount but not userEmail, the server auto-injects WHERE CreatedBySubjectType = 'virtualaccount' instead. When both appear, scope the query yourself with createdBySubjectType if needed.
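The auto-scoping rule can be restated as a small client-side function, which is handy for predicting which rows a query will cover. This mirrors the rule as described here and is not the server's code; implied_subject_type is a hypothetical name.

```python
from typing import Optional


def implied_subject_type(group_by: list) -> Optional[str]:
    """Return the subject type the server is described as injecting, or None."""
    has_user = "userEmail" in group_by
    has_virtual_account = "virtualaccount" in group_by
    if has_user and not has_virtual_account:
        return "user"
    if has_virtual_account and not has_user:
        return "virtualaccount"
    # Neither field, or both: no subject type is implied by the rule above.
    return None


only_users = implied_subject_type(["agentName", "userEmail"])
both = implied_subject_type(["userEmail", "virtualaccount"])
```

A None result with both fields present is the case where an explicit createdBySubjectType scope may be worth adding.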
filters (array)
Array of filter objects, AND-combined. See Filtering for the full operator reference and the per-field allow-list.
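To illustrate what AND-combined means, the sketch below evaluates a filter list against sample rows locally. It is only a model of the semantics: it handles just the EQUAL and IN operators that appear on this page, and the function name matches and the sample rows are invented for the example.

```python
def matches(row: dict, filters: list) -> bool:
    """Return True only when the row satisfies every filter (AND semantics)."""
    def satisfies(flt: dict) -> bool:
        actual = row.get(flt["fieldName"])
        if flt["operator"] == "EQUAL":
            return actual == flt["value"]
        if flt["operator"] == "IN":
            return actual in flt["value"]
        raise ValueError(f"operator not modelled in this sketch: {flt['operator']}")

    return all(satisfies(flt) for flt in filters)


filters = [
    {"fieldName": "agentName", "operator": "IN", "value": ["support-agent"]},
    {"fieldName": "isFailure", "operator": "EQUAL", "value": True},
]
# Hypothetical rows, used only to show which ones pass both filters.
rows = [
    {"agentName": "support-agent", "isFailure": True},
    {"agentName": "support-agent", "isFailure": False},
    {"agentName": None, "isFailure": True},
]
kept = [row for row in rows if matches(row, filters)]
```

Only the first row is kept, because it is the only one that passes both filters at once.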
interval (string)
Required for timeseries queries. Bucket size as <positive integer> <unit>, where <unit> is one of second, minute, hour, day, week, month, year (with or without a trailing s). Examples: "30 second", "5 minute", "1 hour", "1 day". Compound expressions like "1 hour 30 minute" are rejected.
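The interval grammar above maps naturally onto a regular expression. The sketch below is intentionally strict: it assumes exactly one space between the number and the unit and no leading zeros, neither of which is confirmed by this page. is_valid_interval is a hypothetical helper name.

```python
import re

# <positive integer> <unit>, with an optional trailing "s" on the unit.
_INTERVAL = re.compile(r"([1-9]\d*) (second|minute|hour|day|week|month|year)s?")


def is_valid_interval(text: str) -> bool:
    """Return True if text is a single '<positive integer> <unit>' expression."""
    return _INTERVAL.fullmatch(text) is not None


examples = ["30 second", "5 minutes", "1 hour", "1 hour 30 minute", "0 day"]
results = {text: is_valid_interval(text) for text in examples}
```

Using fullmatch rather than match is what rejects compound expressions such as "1 hour 30 minute".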
intervalInSeconds (number, deprecated)
Deprecated alias for interval. Accepts a positive integer number of seconds. Prefer interval in new code. If both are provided, interval wins.
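Older request bodies that still use intervalInSeconds can be rewritten to the interval form. The sketch below assumes that "<n> second" produces the same buckets as intervalInSeconds = n, which follows from the interval grammar but is not stated explicitly here; migrate_interval is a hypothetical helper name.

```python
def migrate_interval(payload: dict) -> dict:
    """Return a copy of payload that uses interval instead of intervalInSeconds."""
    migrated = dict(payload)
    seconds = migrated.pop("intervalInSeconds", None)
    # If interval is already present it takes precedence, matching the rule above.
    if "interval" in migrated or seconds is None:
        return migrated
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("intervalInSeconds must be a positive integer")
    migrated["interval"] = f"{seconds} second"
    return migrated


legacy = {"type": "timeseries", "intervalInSeconds": 3600}
modern = migrate_interval(legacy)
```

The helper emits seconds rather than converting to larger units, so it makes no assumption about how the server aligns hour, day, or week buckets.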