
Distribution queries

Distribution queries return aggregated snapshots of agent invocations over a time window. Every example below posts JSON to:
POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query
with Authorization: Bearer <your_api_key> and Content-Type: application/json. To keep the snippets short, only the JSON body is shown; the wrapper is identical to the Overview Quick Start.
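For readers without the Quick Start at hand, here is a minimal sketch of that wrapper using only Python's standard library. The helper name `build_metrics_request` and its parameters are illustrative assumptions, not part of the Gateway API; only the path and the two headers come from the text above. The function builds the request without sending it.

```python
import json
import urllib.request


def build_metrics_request(control_plane_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a metrics query.

    Hypothetical helper: `control_plane_url` is the host that replaces
    {your_control_plane_url}, `api_key` replaces <your_api_key>.
    """
    url = f"https://{control_plane_url}/api/svc/v1/llm-gateway/metrics/query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # JSON body as bytes
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the query; the shape of the response is not covered here.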
Agent metrics include every Gateway request by default. Rows that didn’t go through an agent will have agentName, agentFramework, and agentServerType set to null and show up as null buckets in groupBy output. IS_NULL is not supported on these three fields; to scope to specific known agents, use agentName IN [...] or one of the STRING_* operators.
Counts grouped by agent framework, useful for capacity planning across langgraph, crewai, etc.:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "groupBy": ["agentFramework"]
}
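The startTs and endTs values in these examples are UTC timestamps with millisecond precision and a trailing Z. One way to produce a body like the one above from Python datetime objects is sketched below; `to_ts` and `distribution_body` are hypothetical helper names, and the sketch assumes timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def to_ts(moment: datetime) -> str:
    """Format an aware datetime as e.g. 2026-04-21T00:00:00.000Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def distribution_body(start: datetime, end: datetime, group_by: list[str]) -> dict:
    """Assemble a count-only distribution query body (hypothetical helper)."""
    return {
        "startTs": to_ts(start),
        "endTs": to_ts(end),
        "datasource": "agentMetrics",
        "type": "distribution",
        "groupBy": group_by,
    }


# One-day window starting 2026-04-21 UTC, grouped by framework.
start = datetime(2026, 4, 21, tzinfo=timezone.utc)
body = distribution_body(start, start + timedelta(days=1), ["agentFramework"])
```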
Counts grouped by agent and outcome (isFailure), which surfaces flaky agents:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "groupBy": ["agentName", "isFailure"]
}
p50, p90, and p99 latency grouped by framework:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "p50", "column": "latencyMs"},
        {"type": "p90", "column": "latencyMs"},
        {"type": "p99", "column": "latencyMs"}
    ],
    "groupBy": ["agentFramework"]
}
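The aggregations list above repeats the same shape for each percentile, so it can be generated. The sketch below uses a hypothetical helper, `percentile_aggregations`, and deliberately accepts only the three aggregation types shown on this page, since other percentile types are not confirmed here.

```python
# Aggregation types that appear in the examples on this page.
KNOWN_PERCENTILES = ("p50", "p90", "p99")


def percentile_aggregations(column: str, types=KNOWN_PERCENTILES) -> list[dict]:
    """Build the "aggregations" list for one column (hypothetical helper)."""
    for agg_type in types:
        if agg_type not in KNOWN_PERCENTILES:
            raise ValueError(f"aggregation type not shown in these docs: {agg_type}")
    return [{"type": agg_type, "column": column} for agg_type in types]


aggregations = percentile_aggregations("latencyMs")
```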
Counts grouped by raw httpStatusCode; rows without a status code show up as null (rendered as “unknown” in dashboards):
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "groupBy": ["httpStatusCode"]
}
Cross-tab of transport (agentServerType) vs framework (agentFramework):
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "groupBy": ["agentServerType", "agentFramework"]
}
p99 latency per agent, restricted to failed invocations:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "p99", "column": "latencyMs"}
    ],
    "groupBy": ["agentName"],
    "filters": [
        {"fieldName": "isFailure", "operator": "EQUAL", "value": true}
    ]
}
Volume of invocations slower than 30 seconds, grouped by agent:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "count", "column": "agentName"}
    ],
    "groupBy": ["agentName"],
    "filters": [
        {"fieldName": "latencyMs", "operator": "GREATER_THAN", "value": 30000}
    ]
}
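Because latencyMs is in milliseconds, a threshold stated in seconds has to be converted before it goes into the filter. A small sketch of that conversion follows; `slower_than` is a hypothetical helper name, and the filter shape is copied from the example above.

```python
def slower_than(seconds: float) -> dict:
    """Build a latencyMs filter from a threshold in seconds (hypothetical helper)."""
    # round() avoids float artifacts such as 0.29 * 1000 == 289.99999999999994
    return {
        "fieldName": "latencyMs",
        "operator": "GREATER_THAN",
        "value": round(seconds * 1000),
    }


filters = [slower_than(30)]
```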
Use agentName IN [...] to restrict the query to specific agents you care about:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "agentMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "p99", "column": "latencyMs"}
    ],
    "groupBy": ["agentName"],
    "filters": [
        {"fieldName": "agentName", "operator": "IN", "value": ["support-bot", "research-agent"]}
    ]
}
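Since IS_NULL is not supported on agentName, an IN filter only makes sense with concrete agent names. The sketch below builds that filter and rejects empty or null entries before the request is sent; `agent_in_filter` is a hypothetical helper, and the validation is a client-side convenience rather than documented Gateway behavior.

```python
def agent_in_filter(agent_names: list[str]) -> dict:
    """Build an agentName IN filter from known agent names (hypothetical helper)."""
    names = list(dict.fromkeys(agent_names))  # de-duplicate, keep first-seen order
    if not names or any(not isinstance(name, str) or not name for name in names):
        raise ValueError("agentName IN needs a non-empty list of non-empty strings")
    return {"fieldName": "agentName", "operator": "IN", "value": names}


agent_filter = agent_in_filter(["support-bot", "research-agent", "support-bot"])
```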