
Distribution queries

Distribution queries return aggregated snapshots of cache metrics over a time window. Every example below posts JSON to:
POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query
with the headers Authorization: Bearer <your_api_key> and Content-Type: application/json. To keep the snippets short, only the JSON body is shown; the surrounding request is identical to the one in the Overview Quick Start.
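As a reminder of what that wrapper might look like, here is a minimal sketch using only the Python standard library. The names `build_query_request`, `run_query`, `CONTROL_PLANE_URL`, and `API_KEY` are hypothetical and not part of the API; the placeholder URL and key must be replaced with your own values. The Overview Quick Start remains the reference for the actual wrapper.

```python
import json
import urllib.request

# Placeholder values (assumptions): substitute your own control plane URL and API key.
CONTROL_PLANE_URL = "https://example.invalid"
API_KEY = "your_api_key"

QUERY_PATH = "/api/svc/v1/llm-gateway/metrics/query"


def build_query_request(body, base_url=CONTROL_PLANE_URL, api_key=API_KEY):
    """Wrap a query body in a POST request carrying the two required headers."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + QUERY_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(body, base_url=CONTROL_PLANE_URL, api_key=API_KEY):
    """Send the request and decode the JSON response.

    The response shape is not described in this section, so it is
    returned as-is without further interpretation.
    """
    request = build_query_request(body, base_url, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating `build_query_request` from `run_query` makes it possible to inspect the URL, headers, and payload without touching the network.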
The examples below restrict results to underlying (non-virtual) models with the filter {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}. For virtual-model-only cache stats, set value to false and, where a query groups by model, replace groupBy: ["modelName"] with groupBy: ["virtualModel"].
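The switch to virtual-model-only stats can be applied mechanically to any of the bodies on this page. The sketch below is one way to do it; `to_virtual_model_query` is a hypothetical helper name, and the function only performs the two substitutions described above (the filter value and the `groupBy` column).

```python
import copy


def to_virtual_model_query(body):
    """Return a copy of `body` that targets virtual models instead of underlying models."""
    flipped = copy.deepcopy(body)  # leave the caller's dict untouched

    # Flip the IS_NULL filter on virtualModelName from true to false.
    for flt in flipped.get("filters", []):
        if flt.get("fieldName") == "virtualModelName" and flt.get("operator") == "IS_NULL":
            flt["value"] = False

    # Swap the modelName grouping for virtualModel; other columns are kept as-is.
    flipped["groupBy"] = [
        "virtualModel" if column == "modelName" else column
        for column in flipped.get("groupBy", [])
    ]
    return flipped
```

Bodies that group by something other than `modelName` (for example `cacheNamespace`) only have their filter flipped.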
Sum of cacheReadInputTokens per namespace. This shows which namespaces serve the most cached input tokens:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "sum", "column": "cacheReadInputTokens"}
    ],
    "groupBy": ["cacheNamespace"],
    "filters": [
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
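All examples on this page use a fixed one-day UTC window written with millisecond precision and a `Z` suffix. A small sketch for generating `startTs` and `endTs` in that same format is shown below; `format_ts` and `day_window` are hypothetical helper names, and the sketch assumes you pass timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def format_ts(moment):
    """Format an aware datetime as UTC, e.g. 2026-04-21T00:00:00.000Z."""
    utc = moment.astimezone(timezone.utc)
    milliseconds = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{milliseconds:03d}Z"


def day_window(day):
    """Return startTs/endTs covering the UTC calendar day that contains `day`."""
    start = day.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return {
        "startTs": format_ts(start),
        "endTs": format_ts(start + timedelta(days=1)),
    }
```

The returned dict can be merged into a query body, for example `{**day_window(some_day), "datasource": "cacheMetrics", ...}`.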
p50, p90, and p99 lookup latency grouped by cache type:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "p50", "column": "cacheLookupLatencyMs"},
        {"type": "p90", "column": "cacheLookupLatencyMs"},
        {"type": "p99", "column": "cacheLookupLatencyMs"}
    ],
    "groupBy": ["cacheType"],
    "filters": [
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
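Because every body on this page shares the same skeleton, it can help to build it from a function rather than copy it. The following is a sketch only: `distribution_body` and `percentiles` are hypothetical helpers, and the field names and aggregation types are taken directly from the examples here rather than from a full schema.

```python
# Filter used by every example on this page to exclude virtual models.
UNDERLYING_MODELS_ONLY = {
    "fieldName": "virtualModelName",
    "operator": "IS_NULL",
    "value": True,
}


def percentiles(column, levels=("p50", "p90", "p99")):
    """Build one aggregation entry per percentile level for a column."""
    return [{"type": level, "column": column} for level in levels]


def distribution_body(start_ts, end_ts, aggregations, group_by, extra_filters=()):
    """Assemble a distribution query body against the cacheMetrics datasource."""
    return {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "cacheMetrics",
        "type": "distribution",
        "aggregations": list(aggregations),
        "groupBy": list(group_by),
        # Extra filters come first, followed by the shared model filter.
        "filters": [*extra_filters, dict(UNDERLYING_MODELS_ONLY)],
    }


latency_body = distribution_body(
    "2026-04-21T00:00:00.000Z",
    "2026-04-22T00:00:00.000Z",
    aggregations=percentiles("cacheLookupLatencyMs"),
    group_by=["cacheType"],
)
```

Here `latency_body` is equivalent to the lookup-latency example above.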
Sum of potentialCostSavings per underlying model:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "sum", "column": "potentialCostSavings"}
    ],
    "groupBy": ["modelName"],
    "filters": [
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
Restrict to a specific cache type (here, semantic) and break savings and cache-read tokens down by model:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "sum", "column": "potentialCostSavings"},
        {"type": "sum", "column": "cacheReadInputTokens"}
    ],
    "groupBy": ["modelName"],
    "filters": [
        {"fieldName": "cacheType", "operator": "IN", "value": ["semantic"]},
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
Use STRING_STARTS_WITH on cacheNamespace to match a namespace prefix, which is handy when prod and staging share a cache type:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "sum", "column": "potentialCostSavings"}
    ],
    "groupBy": ["cacheNamespace"],
    "filters": [
        {"fieldName": "cacheNamespace", "operator": "STRING_STARTS_WITH", "value": "prod-"},
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
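The three filter operators used on this page (`IS_NULL`, `IN`, and `STRING_STARTS_WITH`) all share the same `fieldName` / `operator` / `value` shape, so small constructor functions can reduce typos. This is a sketch with hypothetical helper names; it covers only the operators that appear in these examples, and the API may support others.

```python
def is_null(field, value=True):
    """IS_NULL filter; pass value=False to require the field to be set."""
    return {"fieldName": field, "operator": "IS_NULL", "value": value}


def is_in(field, values):
    """IN filter matching any of the given values."""
    return {"fieldName": field, "operator": "IN", "value": list(values)}


def starts_with(field, prefix):
    """STRING_STARTS_WITH filter matching a string prefix."""
    return {"fieldName": field, "operator": "STRING_STARTS_WITH", "value": prefix}


# Filters from the namespace-prefix example above.
prod_filters = [
    starts_with("cacheNamespace", "prod-"),
    is_null("virtualModelName"),
]
```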
Group by cacheLookupStatus to see hits vs misses per cache type:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [],
    "groupBy": ["cacheType", "cacheLookupStatus"],
    "filters": [
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
Compare cache-creation tokens to cache-read tokens per namespace:
json={
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "sum", "column": "cacheCreationInputTokens"},
        {"type": "sum", "column": "cacheReadInputTokens"}
    ],
    "groupBy": ["cacheNamespace"],
    "filters": [
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}
    ]
}
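Once the two sums are available per namespace, a read-to-creation ratio is a convenient single number for the comparison. The response format is not shown in this section, so the row shape below is purely an assumption for illustration: each row is taken to be a dict with hypothetical keys `cacheNamespace`, `cacheCreationInputTokens`, and `cacheReadInputTokens`. Adapt the key names to whatever the API actually returns.

```python
def read_to_creation_ratio(rows):
    """Map each namespace to read tokens divided by creation tokens.

    `rows` is assumed (not confirmed by the API docs) to be a list of dicts
    keyed by cacheNamespace, cacheCreationInputTokens, and cacheReadInputTokens.
    Namespaces with zero creation tokens map to None to avoid dividing by zero.
    """
    ratios = {}
    for row in rows:
        created = row["cacheCreationInputTokens"]
        read = row["cacheReadInputTokens"]
        ratios[row["cacheNamespace"]] = read / created if created else None
    return ratios
```

Under these assumptions, a ratio above 1 means a namespace's cached tokens were read more often than they were written during the window.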