POST /api/svc/v1/llm-gateway/metrics/query
curl --request POST \
  --url https://{controlPlaneURL}/api/svc/v1/llm-gateway/metrics/query \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "startTs": "2025-01-21T00:00:00.000Z",
  "endTs": "2025-01-22T00:00:00.000Z",
  "datasource": "modelMetrics",
  "type": "distribution",
  "aggregations": [
    {
      "type": "count",
      "column": "modelName"
    },
    {
      "type": "sum",
      "column": "inputTokens"
    },
    {
      "type": "sum",
      "column": "outputTokens"
    },
    {
      "type": "p99",
      "column": "latencyMs"
    },
    {
      "type": "sum",
      "column": "costInUSD"
    }
  ],
  "groupBy": [
    "modelName"
  ],
  "filters": [
    {
      "fieldName": "virtualModelName",
      "operator": "IS_NULL",
      "value": true
    }
  ]
}
'
{
  "data": {
    "dataPoints": [
      {
        "modelName": "gpt-4o",
        "total": 1240,
        "countModelName": 1240,
        "sumInputTokens": 125000,
        "sumOutputTokens": 45000,
        "p99LatencyMs": 2450.5,
        "sumCostInUSD": 8.42
      },
      {
        "modelName": "gpt-3.5-turbo",
        "total": 860,
        "countModelName": 860,
        "sumInputTokens": 89000,
        "sumOutputTokens": 32000,
        "p99LatencyMs": 1820.3,
        "sumCostInUSD": 1.78
      }
    ]
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
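
As a rough illustration of the header described above, the following Python sketch builds (but does not send) the same POST request as the curl example, using only the standard library. The helper name build_metrics_request, the host https://example.invalid, and the token value are placeholders introduced here, not part of the API.

```python
import json
import urllib.request


def build_metrics_request(control_plane_url: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the metrics query endpoint."""
    url = f"{control_plane_url.rstrip('/')}/api/svc/v1/llm-gateway/metrics/query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Bearer scheme, as described in the Authorizations section.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token, for illustration only.
request = build_metrics_request(
    "https://example.invalid",
    "my-token",
    {
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "modelMetrics",
        "type": "distribution",
    },
)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.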

Body

application/json
startTs
string<date-time>
required

Inclusive lower bound of the query window as an ISO 8601 timestamp.

Example:

"2025-01-21T00:00:00.000Z"

endTs
string<date-time>
required

Exclusive upper bound of the query window as an ISO 8601 timestamp.

Example:

"2025-01-22T00:00:00.000Z"

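Because startTs is inclusive and endTs is exclusive, consecutive windows can share a boundary timestamp without counting any row twice. The sketch below shows one way to produce such a window in Python; the helper names to_iso_millis and day_window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_iso_millis(moment: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def day_window(day_start: datetime) -> dict:
    """Return startTs/endTs covering the half-open range [day_start, day_start + 1 day)."""
    return {
        "startTs": to_iso_millis(day_start),
        "endTs": to_iso_millis(day_start + timedelta(days=1)),
    }


window = day_window(datetime(2025, 1, 21, tzinfo=timezone.utc))
print(window)
# {'startTs': '2025-01-21T00:00:00.000Z', 'endTs': '2025-01-22T00:00:00.000Z'}
```
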
datasource
enum<string>
required

Which Gateway data source to query.

Available options:
modelMetrics,
mcpMetrics,
guardrailMetrics,
cacheMetrics,
configMetrics,
agentMetrics
type
enum<string>
required

distribution returns one aggregated row per groupBy combination. timeseries returns one row per bucket per groupBy combination and requires interval.

Available options:
distribution,
timeseries
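
The rule that timeseries requires interval can be checked on the client before a request is sent. The following is a minimal sketch, assuming a hypothetical build_query helper; the enum values are copied from this page, and how the server treats an interval sent with a distribution query is not stated here.

```python
DATASOURCES = (
    "modelMetrics", "mcpMetrics", "guardrailMetrics",
    "cacheMetrics", "configMetrics", "agentMetrics",
)
QUERY_TYPES = ("distribution", "timeseries")


def build_query(start_ts, end_ts, datasource, query_type, interval=None, **optional):
    """Assemble a request body, enforcing the documented enum and interval rules."""
    if datasource not in DATASOURCES:
        raise ValueError(f"unknown datasource: {datasource!r}")
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown type: {query_type!r}")
    if query_type == "timeseries" and not interval:
        raise ValueError("timeseries queries require an interval")
    body = {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": datasource,
        "type": query_type,
    }
    if interval:
        # Passed through as given; server behavior for distribution + interval is not documented here.
        body["interval"] = interval
    body.update(optional)  # e.g. aggregations, groupBy, filters
    return body


hourly = build_query(
    "2025-01-21T00:00:00.000Z",
    "2025-01-22T00:00:00.000Z",
    "modelMetrics",
    "timeseries",
    interval="1 hour",
    groupBy=["modelName"],
)
```
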
aggregations
MetricsAggregation · object[]

Aggregations to compute. When omitted, only the implicit total = COUNT(*) is returned.

groupBy
string[]

Field names to group results by. Custom metadata keys are supported with a metadata. prefix (for example, "metadata.environment"). Allowed field names depend on the datasource.

filters
(MetricsFieldFilter · object | MetricsMetadataFilter · object)[]

Filters that narrow the rows feeding each aggregation. All filters are AND-combined; OR groups are not supported. Use fieldName for standard datasource fields and metadataKey for custom request-metadata keys.
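
A small Python sketch of assembling a filters array follows. The field filter mirrors the request example on this page. The metadata filter is an assumption: this page does not show its full shape, so the sketch assumes it takes the same operator and value keys and a bare key name such as "environment". All helper names are hypothetical.

```python
def field_filter(field_name: str, operator: str, value) -> dict:
    """Filter on a standard datasource field (shape taken from the request example)."""
    return {"fieldName": field_name, "operator": operator, "value": value}


def metadata_filter(metadata_key: str, operator: str, value) -> dict:
    """Filter on a custom metadata key.

    Assumption: same operator/value shape as a field filter, with a bare key name.
    """
    return {"metadataKey": metadata_key, "operator": operator, "value": value}


def filter_kind(filter_obj: dict) -> str:
    """Check that a filter names exactly one of fieldName or metadataKey."""
    has_field = "fieldName" in filter_obj
    has_metadata = "metadataKey" in filter_obj
    if has_field == has_metadata:
        raise ValueError("a filter needs exactly one of fieldName or metadataKey")
    return "field" if has_field else "metadata"


# Every entry must match for a row to be counted (AND semantics).
filters = [
    field_filter("virtualModelName", "IS_NULL", True),
    metadata_filter("environment", "IS_NULL", True),
]
kinds = [filter_kind(f) for f in filters]
```

Since there is no OR-group support, an either/or condition would likely need separate requests whose results are combined on the client.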

interval
string

Required for type: "timeseries". Bucket size as <positive integer> <unit> where <unit> is one of second, minute, hour, day, week, month, year (with or without a trailing s). Compound expressions like "1 hour 30 minute" are rejected.

Example:

"1 hour"

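The interval format above can be approximated with a regular expression for a quick client-side check. This is a sketch, not the server's actual validation: it assumes a single space between the number and the unit and rejects leading zeros, so it may be stricter than the API.

```python
import re

# <positive integer> <unit>, where the unit may carry a trailing "s".
INTERVAL_PATTERN = re.compile(
    r"[1-9]\d* (second|minute|hour|day|week|month|year)s?"
)


def is_valid_interval(interval: str) -> bool:
    """Return True if the whole string is one <positive integer> <unit> expression."""
    return INTERVAL_PATTERN.fullmatch(interval) is not None


checks = {
    "1 hour": is_valid_interval("1 hour"),
    "15 minutes": is_valid_interval("15 minutes"),
    "1 hour 30 minute": is_valid_interval("1 hour 30 minute"),  # compound, rejected
    "0 day": is_valid_interval("0 day"),  # not a positive integer
}
```
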
intervalInSeconds
integer
deprecated

Deprecated alias for interval, in seconds (for example, 3600 for hourly). Prefer interval in new code. If both are provided, interval wins.
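
Older request bodies that still send intervalInSeconds can be migrated on the client. The sketch below follows the precedence rule stated above (interval wins) and assumes that an integer N is equivalent to the string "N seconds"; the helper name migrate_interval is hypothetical.

```python
def migrate_interval(body: dict) -> dict:
    """Return a copy of the body that uses interval instead of intervalInSeconds."""
    migrated = {k: v for k, v in body.items() if k != "intervalInSeconds"}
    seconds = body.get("intervalInSeconds")
    # interval takes precedence when both are present, matching the documented rule.
    if "interval" not in migrated and seconds is not None:
        # Assumption: N seconds as an integer equals the string "N seconds".
        migrated["interval"] = f"{seconds} seconds"
    return migrated


legacy = migrate_interval({"type": "timeseries", "intervalInSeconds": 3600})
both = migrate_interval({"type": "timeseries", "interval": "1 hour", "intervalInSeconds": 60})
```
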

Response

Aggregated metrics for the requested datasource. Every row always contains total (implicit COUNT(*)), one key per requested aggregation (named <type><Column> in camelCase, for example sumInputTokens or p99LatencyMs), and one key per groupBy entry. Timeseries responses additionally include startTimestamp and endTimestamp as ISO 8601 timestamp strings (for example "2026-04-21T00:00:00.000Z"); endTimestamp equals the next bucket's startTimestamp. If groupBy is empty or omitted, the response collapses to a single row (or one row per timeseries bucket).

data
object
required
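
The <type><Column> naming rule from the Response description can be reproduced on the client to predict which keys each row will carry. The sketch below is checked against the first row of the example response on this page; the helper names are hypothetical, and the key used for a metadata. groupBy entry is not shown here, so it is passed through unchanged as an assumption.

```python
def aggregation_key(agg_type: str, column: str) -> str:
    """Join type and column in camelCase, e.g. ("sum", "inputTokens") -> "sumInputTokens"."""
    return agg_type + column[:1].upper() + column[1:]


def expected_row_keys(aggregations: list, group_by: list) -> set:
    """Keys expected in each row: total, one per aggregation, one per groupBy entry."""
    keys = {"total"}
    keys.update(aggregation_key(a["type"], a["column"]) for a in aggregations)
    keys.update(group_by)  # assumption: groupBy names appear unchanged as row keys
    return keys


aggregations = [
    {"type": "count", "column": "modelName"},
    {"type": "sum", "column": "inputTokens"},
    {"type": "sum", "column": "outputTokens"},
    {"type": "p99", "column": "latencyMs"},
    {"type": "sum", "column": "costInUSD"},
]
# First data point from the example response above.
sample_row = {
    "modelName": "gpt-4o",
    "total": 1240,
    "countModelName": 1240,
    "sumInputTokens": 125000,
    "sumOutputTokens": 45000,
    "p99LatencyMs": 2450.5,
    "sumCostInUSD": 8.42,
}
keys_match = set(sample_row) == expected_row_keys(aggregations, ["modelName"])
```
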