Timeseries queries
Timeseries queries return time-bucketed cache metrics over a window. Every timeseries query must include interval (or the deprecated intervalInSeconds). Each example below posts JSON to:
POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query
with the headers Authorization: Bearer <your_api_key> and Content-Type: application/json. To keep the snippets short, only the JSON body is shown; the request wrapper is identical to the one in the Overview Quick Start.
Every example below includes the filter {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": true}, which restricts the series to requests with no virtual model name. Set value to false to get virtual-model-only series instead.
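For readers who prefer to see the wrapper spelled out, here is a minimal Python sketch of the request described above. It uses only the standard library and builds the request without sending it. The function name build_metrics_request, the host gateway.example.com, and the key YOUR_API_KEY are placeholders chosen for illustration, not values from this API.

```python
import json
import urllib.request


def build_metrics_request(control_plane_url, api_key, body):
    """Build (but do not send) the POST request for a metrics query."""
    url = f"https://{control_plane_url}/api/svc/v1/llm-gateway/metrics/query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# JSON body of a timeseries query; field names are taken from this page.
body = {
    "startTs": "2026-04-21T00:00:00.000Z",
    "endTs": "2026-04-22T00:00:00.000Z",
    "datasource": "cacheMetrics",
    "type": "timeseries",
    "interval": "1 hour",
    "aggregations": [{"type": "sum", "column": "potentialCostSavings"}],
    "filters": [
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": True}
    ],
}

# Placeholder host and key (assumptions); substitute your own values.
request = build_metrics_request("gateway.example.com", "YOUR_API_KEY", body)
```

To send the request, pass it to urllib.request.urlopen and decode the response as JSON. The response shape is not described on this page, so the sketch stops before that step.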
Hourly savings by namespace
Track cost savings per namespace over time:
{
  "startTs": "2026-04-21T00:00:00.000Z",
  "endTs": "2026-04-22T00:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "1 hour",
  "aggregations": [
    { "type": "sum", "column": "potentialCostSavings" }
  ],
  "groupBy": ["cacheNamespace"],
  "filters": [
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
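Because the bodies on this page differ only in a handful of fields, a small builder can cut down on copy-paste errors. The following is a sketch in Python, not part of the API; timeseries_body is a hypothetical helper name. It rejects a missing interval up front, since every timeseries query must include one.

```python
def timeseries_body(start_ts, end_ts, interval, aggregations,
                    group_by=None, filters=None):
    """Assemble a cacheMetrics timeseries query body as a plain dict."""
    if not interval:
        # Timeseries queries must carry an interval such as "1 hour".
        raise ValueError("timeseries queries require an interval")
    body = {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "cacheMetrics",
        "type": "timeseries",
        "interval": interval,
        "aggregations": list(aggregations),
    }
    # groupBy and filters are only emitted when the caller supplies them.
    if group_by:
        body["groupBy"] = list(group_by)
    if filters:
        body["filters"] = list(filters)
    return body


hourly_savings = timeseries_body(
    "2026-04-21T00:00:00.000Z",
    "2026-04-22T00:00:00.000Z",
    "1 hour",
    [{"type": "sum", "column": "potentialCostSavings"}],
    group_by=["cacheNamespace"],
    filters=[
        {"fieldName": "virtualModelName", "operator": "IS_NULL", "value": True}
    ],
)
```

Serialized with json.dumps, hourly_savings reproduces the hourly savings body shown above.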
Hourly p99 lookup latency by cache type
Watch for regressions in cache lookups:
{
  "startTs": "2026-04-21T00:00:00.000Z",
  "endTs": "2026-04-22T00:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "1 hour",
  "aggregations": [
    { "type": "p99", "column": "cacheLookupLatencyMs" }
  ],
  "groupBy": ["cacheType"],
  "filters": [
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
Hourly tokens read from cache
Track cache read volume per cache type:
{
  "startTs": "2026-04-21T00:00:00.000Z",
  "endTs": "2026-04-22T00:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "1 hour",
  "aggregations": [
    { "type": "sum", "column": "cacheReadInputTokens" }
  ],
  "groupBy": ["cacheType"],
  "filters": [
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
Hourly volume by lookup status
Volume by cacheLookupStatus over time:
{
  "startTs": "2026-04-21T00:00:00.000Z",
  "endTs": "2026-04-22T00:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "1 hour",
  "aggregations": [],
  "groupBy": ["cacheLookupStatus"],
  "filters": [
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
Daily savings over a week
Daily cost savings across a 7-day window:
{
  "startTs": "2026-04-14T00:00:00.000Z",
  "endTs": "2026-04-21T00:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "1 day",
  "aggregations": [
    { "type": "sum", "column": "potentialCostSavings" }
  ],
  "filters": [
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
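The startTs and endTs values for a trailing window like this one can be computed rather than typed by hand. The sketch below is one way to do it in Python; daily_window is a hypothetical helper, and aligning both ends to UTC midnight is an assumption that mirrors the timestamps in the example, not a documented requirement.

```python
from datetime import date, datetime, timedelta, timezone


def daily_window(end_day, days=7):
    """Return (startTs, endTs) strings for the `days` days before end_day."""
    # Anchor the end of the window at UTC midnight of end_day.
    end = datetime(end_day.year, end_day.month, end_day.day, tzinfo=timezone.utc)
    start = end - timedelta(days=days)
    # Same millisecond-precision UTC format used by the examples on this page.
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"
    return start.strftime(fmt), end.strftime(fmt)


start_ts, end_ts = daily_window(date(2026, 4, 21))
print(start_ts)  # 2026-04-14T00:00:00.000Z
print(end_ts)    # 2026-04-21T00:00:00.000Z
```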
5-minute lookup latency in an incident window
Fine-grained breakdown to investigate a regression:
{
  "startTs": "2026-04-21T14:00:00.000Z",
  "endTs": "2026-04-21T16:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "5 minute",
  "aggregations": [
    { "type": "p99", "column": "cacheLookupLatencyMs" }
  ],
  "groupBy": ["cacheType"],
  "filters": [
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
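When choosing an interval it helps to know how many buckets a window can produce. The Python sketch below estimates that number per group. It assumes interval strings have the form "<count> <unit>" with the units seen on this page (minute, hour, day); other units the API may accept are not handled. Whether the API returns empty buckets is not stated here, so treat the result as an upper bound.

```python
import math
from datetime import datetime, timedelta

# Units that appear in the examples on this page (assumed, not exhaustive).
UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_interval(interval):
    """Turn a string such as '5 minute' into a timedelta."""
    count, unit = interval.split()
    return timedelta(seconds=int(count) * UNIT_SECONDS[unit])


def max_buckets(start_ts, end_ts, interval):
    """Upper bound on the number of time buckets per group in the window."""
    start = datetime.strptime(start_ts, TS_FORMAT)
    end = datetime.strptime(end_ts, TS_FORMAT)
    return math.ceil((end - start) / parse_interval(interval))


print(max_buckets("2026-04-21T14:00:00.000Z",
                  "2026-04-21T16:00:00.000Z",
                  "5 minute"))  # 24
```

The two-hour incident window at 5-minute resolution yields at most 24 points per cacheType, compared with 2 at the hourly interval used elsewhere on this page.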
Hourly savings for semantic cache only
Combine a cache-type filter with a namespace breakdown:
{
  "startTs": "2026-04-21T00:00:00.000Z",
  "endTs": "2026-04-22T00:00:00.000Z",
  "datasource": "cacheMetrics",
  "type": "timeseries",
  "interval": "1 hour",
  "aggregations": [
    { "type": "sum", "column": "potentialCostSavings" }
  ],
  "groupBy": ["cacheNamespace"],
  "filters": [
    { "fieldName": "cacheType", "operator": "IN", "value": ["semantic"] },
    { "fieldName": "virtualModelName", "operator": "IS_NULL", "value": true }
  ]
}
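The filters array above can be assembled from small helpers when queries are generated in code. This Python sketch covers only the two operators that appear on this page, IN and IS_NULL; the helper names in_filter and is_null_filter are hypothetical. The example's intent ("semantic cache only") suggests that multiple filters must all match, but that is an inference from this page rather than a documented rule.

```python
def in_filter(field_name, values):
    """Filter that matches rows whose field is one of `values`."""
    return {"fieldName": field_name, "operator": "IN", "value": list(values)}


def is_null_filter(field_name, is_null=True):
    """Filter on whether a field is null (True) or not null (False)."""
    return {"fieldName": field_name, "operator": "IS_NULL", "value": is_null}


# Same filters as the semantic-cache example above.
filters = [
    in_filter("cacheType", ["semantic"]),
    is_null_filter("virtualModelName"),
]
```

Passing is_null=False to is_null_filter gives the virtual-model-only variant of the same query.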