API Access to Agent Metrics

The Gateway Agent Metrics Query API lets you query agent-framework invocations and their outcomes: which agent ran, on which framework, on which server type, and how long it took. Results can be returned as a distribution (aggregated) or as a timeseries, with filtering and grouping applied to either.

This page covers datasource: "agentMetrics". For other datasources, see the sibling pages for Model, MCP, Guardrail, Cache, and Routing metrics.

Access control

Tenant admins: Can query metrics for the entire organization (tenant-wide).
Users: Can query their own data and their teams’ data.
Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.

The server applies RBAC automatically; callers don’t pass any RBAC fields.

Contents

Section	Description
Overview	Authentication, quick start, and API reference
Filtering	Filter operators, fields, and combinations
Distribution examples	Aggregated (distribution) query examples
Timeseries examples	Time-bucketed (timeseries) query examples
Response format	Response JSON structure and error responses

Authentication

You need to authenticate with your TrueFoundry API key. You can use either a Personal Access Token (PAT) or Virtual Account Token (VAT).

Get your API key

To generate an API key:

Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)

For detailed authentication setup, see our Authentication guide.

Quick Start

By default, agent metrics include every Gateway request, not just agent invocations. Rows that didn’t go through an agent will have agentName, agentFramework, and agentServerType set to null and surface in groupBy output as null buckets. The IS_NULL operator is not supported on these three fields; to scope a query to a known set of agents, filter with {"fieldName": "agentName", "operator": "IN", "value": ["<your-agent>"]} or use STRING_* operators.
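As a minimal sketch of the scoping advice above, the helper below builds a distribution payload restricted to a named set of agents. The function name and the agent names are hypothetical; only the payload keys, the field name, and the IN operator come from this page.

```python
def scoped_agent_query(start_ts, end_ts, agent_names):
    """Build a distribution payload limited to the given agents.

    Hypothetical helper: only the payload keys come from this page.
    """
    if not agent_names:
        raise ValueError("agent_names must not be empty")
    return {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "agentMetrics",
        "type": "distribution",
        "groupBy": ["agentName"],
        # IN keeps rows whose agentName is in the list, which should
        # exclude the null buckets from non-agent Gateway requests.
        "filters": [
            {"fieldName": "agentName", "operator": "IN", "value": list(agent_names)}
        ],
    }


payload = scoped_agent_query(
    "2026-04-21T00:00:00.000Z",
    "2026-04-22T00:00:00.000Z",
    ["support-bot", "triage-agent"],  # placeholder agent names
)
```

The resulting dictionary can be passed as the json argument of the requests.post calls shown in the examples that follow.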

Distribution query

p50 and p99 latency per agent:

import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "agentMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "p50", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["agentName"]
    }
)

print(response.json())

Timeseries query

Hourly p99 latency per agent, restricted to failed invocations:

import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "agentMetrics",
        "type": "timeseries",
        "interval": "1 hour",
        "aggregations": [
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["agentName"],
        "filters": [
            {"fieldName": "isFailure", "operator": "EQUAL", "value": True}
        ]
    }
)

print(response.json())

API reference

Endpoint

POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query

Post JSON to this endpoint with Authorization: Bearer <your_api_key> and Content-Type: application/json.
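The sketch below assembles the same request using only the Python standard library, which can be handy where requests is not installed. It builds the request object without sending it. The helper name and the example.invalid host are placeholders, not part of the API.

```python
import json
import urllib.request


def build_metrics_request(control_plane_url, api_key, payload):
    """Return an unsent POST request for the metrics query endpoint."""
    url = f"https://{control_plane_url}/api/svc/v1/llm-gateway/metrics/query"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_metrics_request(
    "example.invalid",   # placeholder host, replace with your control plane URL
    "<your_api_key>",
    {
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "agentMetrics",
        "type": "distribution",
        "groupBy": ["agentName"],
    },
)
```

To send it, pass req to urllib.request.urlopen and decode the response body as JSON.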

Request parameters

startTs (string, required)

ISO 8601 timestamp marking the inclusive lower bound of the query window.

endTs (string, required)

ISO 8601 timestamp marking the exclusive upper bound of the query window.
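Because the lower bound is inclusive and the upper bound is exclusive, consecutive windows can share a boundary timestamp without counting any row twice. The sketch below generates day-sized windows in the millisecond UTC format used by the examples on this page. Both helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def to_api_timestamp(dt):
    """Format an aware datetime as UTC ISO 8601 with millisecond precision."""
    dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


def daily_windows(first_day, days):
    """Yield (startTs, endTs) pairs for consecutive UTC days."""
    for i in range(days):
        start = first_day + timedelta(days=i)
        # The end of one window is exactly the start of the next.
        yield to_api_timestamp(start), to_api_timestamp(start + timedelta(days=1))


windows = list(daily_windows(datetime(2026, 4, 21, tzinfo=timezone.utc), 2))
# windows[0] == ("2026-04-21T00:00:00.000Z", "2026-04-22T00:00:00.000Z")
# windows[1] == ("2026-04-22T00:00:00.000Z", "2026-04-23T00:00:00.000Z")
```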

datasource (string, required)

The data source to query. Use "agentMetrics" for Gateway agent metrics.

type (string, required)

The type of query to execute:

"distribution": returns aggregated rows (one row per groupBy combination).
"timeseries": returns time-bucketed rows. Requires interval.
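A client-side pre-check for the two query types might look like the sketch below. It is an assumption-laden convenience, not the server's validation: it only looks for the interval key and ignores the deprecated seconds-based alias.

```python
VALID_TYPES = {"distribution", "timeseries"}


def check_query_type(payload):
    """Return a list of problems with the type/interval combination."""
    problems = []
    query_type = payload.get("type")
    if query_type not in VALID_TYPES:
        problems.append(f"unknown type: {query_type!r}")
    elif query_type == "timeseries" and "interval" not in payload:
        # Timeseries rows are time-bucketed, so a bucket size is needed.
        problems.append("timeseries queries require interval")
    return problems


print(check_query_type({"type": "timeseries"}))
print(check_query_type({"type": "timeseries", "interval": "1 hour"}))
```

Running this before sending a request catches the most common mistake (a timeseries query without a bucket size) without a round trip to the server.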

aggregations (array)

Array of { type, column } objects describing the aggregations to compute. When omitted, only the implicit total = COUNT(*) is returned.

Supported aggregation types

Type	Description
`sum`	Sum of values
`count`	Non-null count of the column
`countDistinct`	Distinct count
`min`	Minimum value
`max`	Maximum value
`avg`	Average
`p5`, `p10`, `p25`, `p50`, `p75`, `p90`, `p95`, `p99`, `p999`	Percentiles (approximate)
`rateSum`, `rateAvg`, `rateMin`, `rateMax`	Rates normalised by the interval in seconds (timeseries only)
`ratePerMinute`	Value divided by the interval in minutes (timeseries only)

Supported aggregation columns

Column	Supported aggregation types	Notes
`latencyMs`	all scalar and all percentiles	Agent invocation latency in ms
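The tables above can be mirrored in a small builder that rejects unknown aggregation types and rate aggregations outside timeseries queries. This is a hypothetical helper; it does not check column compatibility beyond what the caller passes, and the column table on this page lists only scalar and percentile types for latencyMs.

```python
SCALAR = {"sum", "count", "countDistinct", "min", "max", "avg"}
PERCENTILES = {"p5", "p10", "p25", "p50", "p75", "p90", "p95", "p99", "p999"}
TIMESERIES_ONLY = {"rateSum", "rateAvg", "rateMin", "rateMax", "ratePerMinute"}


def aggregation(agg_type, query_type, column):
    """Return a { type, column } object after basic client-side checks."""
    if agg_type in TIMESERIES_ONLY:
        if query_type != "timeseries":
            raise ValueError(f"{agg_type} is only valid for timeseries queries")
    elif agg_type not in SCALAR | PERCENTILES:
        raise ValueError(f"unsupported aggregation type: {agg_type}")
    return {"type": agg_type, "column": column}


aggregations = [
    aggregation("p50", "distribution", "latencyMs"),
    aggregation("p99", "distribution", "latencyMs"),
]
```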

groupBy (array)

Array of field names to group results by. Custom metadata keys are supported with a metadata. prefix.

Available group-by fields

Field	Notes
`agentName`	Agent identifier
`agentFramework`	Framework, e.g. `langgraph`, `crewai`
`agentServerType`	Transport, e.g. `sse`, `streamable-http`
`isFailure`	Boolean: `true` for failed invocations. GroupBy is permitted but rarely useful (only two values); filtering is the primary use.
`httpStatusCode`	Raw HTTP status code (NULL for rows without one). Frontends typically render NULL as “unknown” and numerics as `HTTP: <code>`.
`userEmail`	Group by user (response key: `createdBySubjectSlug`)
`virtualaccount`	Group by virtual account (response key: `createdBySubjectSlug`)
`team`	Unnests the `Teams` array
`createdBySubjectType`	Distinguishes `user` vs `virtualaccount`
`metadata.<key>`	Group by a custom metadata key

When groupBy contains userEmail (without virtualaccount), the server auto-injects WHERE CreatedBySubjectType = 'user'. virtualaccount alone auto-injects 'virtualaccount'. When both appear, scope it yourself with createdBySubjectType if needed.
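The auto-injection rule can be modelled client-side to predict which subject type a query will be scoped to. The function below is only a sketch of the behaviour described in the paragraph above; the server applies the rule itself and nothing here is sent to it.

```python
def implied_subject_type(group_by):
    """Return the CreatedBySubjectType the server is described as injecting.

    Returns None when no auto-scoping applies (neither field is present, or both are).
    """
    has_user = "userEmail" in group_by
    has_virtual_account = "virtualaccount" in group_by
    if has_user and not has_virtual_account:
        return "user"
    if has_virtual_account and not has_user:
        return "virtualaccount"
    return None


print(implied_subject_type(["agentName", "userEmail"]))
print(implied_subject_type(["userEmail", "virtualaccount"]))
```

A None result for a groupBy that contains both fields is the case where you may want to add your own createdBySubjectType scoping.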

filters (array)

Array of filter objects, AND-combined. See Filtering for the full operator reference and the per-field allow-list.

interval (string)

Required for timeseries queries. Bucket size as <positive integer> <unit>, where <unit> is one of second, minute, hour, day, week, month, year (with or without a trailing s). Examples: "30 second", "5 minute", "1 hour", "1 day". Compound expressions like "1 hour 30 minute" are rejected.
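The interval grammar can be approximated with a regular expression for client-side checks. This sketch assumes exactly one space between the number and the unit and no leading zeros; the server may be more or less lenient on both points.

```python
import re

_UNITS = "second|minute|hour|day|week|month|year"
# One positive integer, one space, one unit with an optional trailing "s".
_INTERVAL_RE = re.compile(rf"([1-9][0-9]*) ({_UNITS})s?")


def is_valid_interval(value):
    """Return True if value looks like '<positive integer> <unit>'."""
    return isinstance(value, str) and _INTERVAL_RE.fullmatch(value) is not None


for candidate in ["30 second", "5 minutes", "1 hour 30 minute", "0 hour"]:
    print(candidate, is_valid_interval(candidate))
```

Using fullmatch rather than match is what rejects compound expressions, since the whole string must consist of a single number and unit.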

number

deprecated

Deprecated alias for interval. Accepts a positive integer number of seconds. Prefer interval in new code. If both are provided, interval wins.
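When migrating older payloads, the seconds value can be rewritten as an interval string. The sketch below takes the legacy key name as an argument rather than assuming it, and the legacySeconds key in the usage lines is a made-up placeholder. It keeps an existing interval untouched, matching the precedence rule above.

```python
def interval_from_seconds(seconds):
    """Convert a positive whole number of seconds to an interval string."""
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    return f"{seconds} second"


def migrate_interval(payload, legacy_key):
    """Return a copy of payload that uses interval instead of legacy_key."""
    migrated = dict(payload)
    legacy = migrated.pop(legacy_key, None)
    # interval wins when both are present, so only fill it in when missing.
    if "interval" not in migrated and legacy is not None:
        migrated["interval"] = interval_from_seconds(legacy)
    return migrated


old = {"type": "timeseries", "legacySeconds": 300}  # placeholder key name
print(migrate_interval(old, "legacySeconds"))
```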
