The Gateway MCP Metrics Query API provides a flexible way to query MCP server and tool usage, latency, errors, and traffic patterns across your tenant. You can retrieve either distribution (aggregated) or timeseries MCP metrics with powerful filtering and grouping capabilities.
This page covers the "mcpMetrics" datasource. For querying model and virtual-model metrics, see API Access to Model Metrics.

Access control

  • Tenant admins: Can query metrics for the entire organization (tenant-wide).
  • Users: Can query their own data and their teams’ data.
  • Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.

Authentication

You need to authenticate with your TrueFoundry API key. You can use either a Personal Access Token (PAT) or Virtual Account Token (VAT).
To generate an API key:
  1. Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
  2. Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)
For detailed authentication setup, see our Authentication guide.
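
Every example on this page sends the same two headers. As an illustrative convenience (not part of any TrueFoundry SDK), they can be built with a small helper; the key may be either a PAT or a VAT:

```python
def auth_headers(api_key: str) -> dict:
    """Build the headers used by every metrics query on this page.

    api_key may be a Personal Access Token (PAT) or a
    Virtual Account Token (VAT).
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```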

Quick Start

By default, the API returns metrics across all MCP methods including non-tool calls (initialize, tools/list, prompts/list, etc.). To analyze tool-call usage specifically, filter with {"fieldName": "method", "operator": "IN", "value": ["tools/call"]} and optionally group by toolName.

Distribution Query

Get aggregated MCP metrics distribution — top MCP servers by request count and p99 latency:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "avg", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"]
    }
)

print(response.json())

Timeseries Query

Get MCP tool-call metrics over time with hourly intervals, including latency percentiles:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["toolName"],
        "filters": [
            {"fieldName": "method", "operator": "IN", "value": ["tools/call"]}
        ],
        "intervalInSeconds": 3600
    }
)

print(response.json())

API Reference

Endpoint

POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query

Request Parameters

startTs
string
required
ISO 8601 timestamp for the start of the data range (e.g., "2025-01-21T00:00:00.000Z")
endTs
string
required
ISO 8601 timestamp for the end of the data range (e.g., "2025-01-22T00:00:00.000Z")
datasource
string
required
The data source to query. Use "mcpMetrics" for gateway MCP server and tool metrics.
type
string
required
The type of query to execute:
  • "distribution" - Returns aggregated metrics
  • "timeseries" - Returns metrics over time intervals
aggregations
array
Array of aggregation objects. Each aggregation specifies:
  • type - The aggregation type
  • column - The column to aggregate on
Supported aggregation types:
  • count - Count of records
  • countDistinct - Count of unique values
  • sum - Sum of values
  • avg - Average of values
  • min - Minimum value
  • max - Maximum value
  • p50 - 50th percentile (median)
  • p75 - 75th percentile
  • p90 - 90th percentile
  • p99 - 99th percentile
"aggregations": [
    {"type": "count", "column": "mcpServerName"},
    {"type": "avg", "column": "latencyMs"},
    {"type": "p99", "column": "latencyMs"}
]
Supported columns for aggregation:
  • latencyMs - Total MCP request latency in milliseconds (useful: sum, avg, min, max, p50, p75, p90, p99)
  • mcpServerName - MCP server name (useful: count, countDistinct)
  • method - JSON-RPC method, e.g. tools/call, tools/list, initialize (useful: count, countDistinct)
  • toolName - Tool invoked; relevant only for tools/call (useful: count, countDistinct)
Token, cost, and time-to-first-token aggregations (inputTokens, outputTokens, costInUSD, timeToFirstTokenMs, interTokenLatencyMs, timePerOutputTokenLatencyMs) do not apply to MCP metrics — these are model-only signals. Use API Access to Model Metrics if you need them.
groupBy
array
Array of fields to group the metrics by. Available options:
  • mcpServerName - Group by MCP server name
  • method - Group by JSON-RPC method
  • toolName - Group by tool name (most useful with a method = "tools/call" filter)
  • userEmail - Group by user email
  • virtualaccount - Group by virtual account
  • team - Group by team (unnests the Teams array)
  • createdBySubjectType - Subject type (e.g. user, virtualaccount)
  • metadata.<key> - Group by a custom metadata key (e.g., metadata.environment)
"groupBy": ["mcpServerName", "method", "metadata.environment"]
filters
array
Array of filter objects to narrow down the results. See Filtering for details.
intervalInSeconds
number
Required for timeseries queries. The time interval in seconds for grouping data points. Common values:
  • 60 - 1 minute intervals
  • 300 - 5 minute intervals
  • 1800 - 30 minute intervals
  • 3600 - 1 hour intervals
  • 86400 - 1 day intervals
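
The interval determines how many data points a timeseries query returns per group. A quick estimate (exact counts depend on how the API aligns interval boundaries, which this page does not specify) is the range duration divided by the interval:

```python
from datetime import datetime

def bucket_count(start_ts: str, end_ts: str, interval_s: int) -> int:
    """Estimate the number of timeseries data points per group."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    start = datetime.strptime(start_ts.replace("Z", "+0000"), fmt)
    end = datetime.strptime(end_ts.replace("Z", "+0000"), fmt)
    return int((end - start).total_seconds() // interval_s)

# A 24-hour range at 1-hour intervals yields ~24 buckets per group
bucket_count("2025-01-21T00:00:00.000Z", "2025-01-22T00:00:00.000Z", 3600)
```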

Filtering

Filters allow you to narrow down your query results. The API supports different filter operators depending on the field type.

Filter Structure

For standard fields, use fieldName:
{
    "fieldName": "mcpServerName",
    "operator": "IN",
    "value": ["github-mcp", "atlassian-mcp"]
}

Filterable Fields

  • mcpServerName (string) - Name of the MCP server
  • method (string) - JSON-RPC method invoked (e.g., tools/call, tools/list, initialize)
  • toolName (string) - Name of the tool called (relevant only for tools/call)
  • userEmail (string) - Email of the user making the request
  • virtualAccount (string) - The virtual account name
  • team (array) - Teams associated with the request
  • latencyMs (number) - Request latency in milliseconds
  • conversationID (string) - The conversation identifier

Filter Operators

String Field Operators

  • EQUAL - Exact match (e.g., "alice@example.com")
  • NOT_EQUAL - Not equal to value (e.g., "bot@example.com")
  • IN - Match any value in the list (e.g., ["github-mcp", "atlassian-mcp"])
  • NOT_IN - Exclude values in the list (e.g., ["internal-debug-tool"])
  • STRING_CONTAINS - Contains substring (e.g., "mcp")
  • STRING_STARTS_WITH - Starts with prefix (e.g., "github-")
  • STRING_ENDS_WITH - Ends with suffix (e.g., "-mcp")
EQUAL and NOT_EQUAL are supported on userEmail, virtualAccount, and conversationID only. The MCP-specific string fields mcpServerName, method, and toolName support only IN, NOT_IN, STRING_CONTAINS, STRING_STARTS_WITH, STRING_ENDS_WITH.
# Filter for tool-call traffic only
{
    "fieldName": "method",
    "operator": "IN",
    "value": ["tools/call"]
}

# Filter for specific MCP servers
{
    "fieldName": "mcpServerName",
    "operator": "IN",
    "value": ["github-mcp", "atlassian-mcp"]
}

# Filter server names containing "mcp"
{
    "fieldName": "mcpServerName",
    "operator": "STRING_CONTAINS",
    "value": "mcp"
}

# Filter server names starting with "github-"
{
    "fieldName": "mcpServerName",
    "operator": "STRING_STARTS_WITH",
    "value": "github-"
}

Numeric Field Operators

  • GREATER_THAN - Greater than value (e.g., 1000)
  • LESS_THAN - Less than value (e.g., 5000)
  • GREATER_THAN_EQUAL - Greater than or equal to value (e.g., 100)
  • LESS_THAN_EQUAL - Less than or equal to value (e.g., 1000)
  • BETWEEN - Between two values, inclusive (e.g., [500, 5000])
# Filter for high-latency MCP requests
{
    "fieldName": "latencyMs",
    "operator": "GREATER_THAN",
    "value": 1000
}

# Filter for latency within a range
{
    "fieldName": "latencyMs",
    "operator": "BETWEEN",
    "value": [500, 5000]
}

Array Field Operators (Teams)

  • ARRAY_HAS_ANY - Match if the array contains any of the values (e.g., ["team-alpha", "team-beta"])
  • ARRAY_HAS_NONE - Match if the array contains none of the values (e.g., ["excluded-team"])
# Filter for specific teams
{
    "fieldName": "team",
    "operator": "ARRAY_HAS_ANY",
    "value": ["team-alpha", "team-beta"]
}

# Exclude specific teams
{
    "fieldName": "team",
    "operator": "ARRAY_HAS_NONE",
    "value": ["excluded-team"]
}

Combining Multiple Filters

You can combine multiple filters in a single query. All filters are applied with AND logic:
{
    "startTs": "2025-01-21T00:00:00.000Z",
    "endTs": "2025-01-22T00:00:00.000Z",
    "datasource": "mcpMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "count", "column": "toolName"},
        {"type": "p99", "column": "latencyMs"}
    ],
    "filters": [
        {
            "fieldName": "method",
            "operator": "IN",
            "value": ["tools/call"]
        },
        {
            "fieldName": "mcpServerName",
            "operator": "IN",
            "value": ["github-mcp", "atlassian-mcp"]
        },
        {
            "fieldName": "latencyMs",
            "operator": "LESS_THAN",
            "value": 5000
        },
        {
            "fieldName": "team",
            "operator": "ARRAY_HAS_ANY",
            "value": ["team-alpha"]
        },
        {
            "metadataKey": "environment",
            "operator": "IN",
            "value": ["production"]
        }
    ],
    "groupBy": ["mcpServerName", "toolName"]
}
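
All queries against this endpoint share the same body shape, so a small builder function (a hypothetical helper for illustration, not part of any SDK) can cut down on repetition when issuing many queries:

```python
def build_mcp_query(start_ts, end_ts, query_type, aggregations,
                    group_by=None, filters=None, interval_s=None):
    """Assemble a metrics query body for the mcpMetrics datasource."""
    body = {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "mcpMetrics",
        "type": query_type,
        "aggregations": aggregations,
    }
    if group_by:
        body["groupBy"] = group_by
    if filters:
        body["filters"] = filters
    if interval_s is not None:
        body["intervalInSeconds"] = interval_s  # required for timeseries
    return body
```

The result is passed as the `json=` argument of `requests.post`, exactly as in the examples below.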

Query Examples

Distribution Queries

Get total request counts grouped by MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName"]
    }
)
Count tools/call traffic grouped by tool name. The method filter is required because toolName is only populated for tool-call requests:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "toolName"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            }
        ],
        "groupBy": ["toolName"]
    }
)
Break down the JSON-RPC method mix per MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "method"}
        ],
        "groupBy": ["mcpServerName", "method"]
    }
)
Get p50, p90, and p99 latency percentiles per MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "p50", "column": "latencyMs"},
            {"type": "p90", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"]
    }
)
Get p50, p90, and p99 latency percentiles per tool (filtered to tools/call):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "p50", "column": "latencyMs"},
            {"type": "p90", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            }
        ],
        "groupBy": ["toolName"]
    }
)
Group by server, team, and user simultaneously to see who is hitting which MCP:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName", "team", "userEmail"]
    }
)
Group by server and a custom metadata key (e.g., environment):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName", "metadata.environment"]
    }
)
Restrict the query to a known list of MCP servers:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "mcpServerName",
                "operator": "IN",
                "value": ["github-mcp", "atlassian-mcp", "slack-mcp"]
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Find MCP requests that exceeded a latency threshold:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "avg", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "latencyMs",
                "operator": "GREATER_THAN",
                "value": 1000
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Find MCP requests whose latency falls within a specific window:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "latencyMs",
                "operator": "BETWEEN",
                "value": [500, 5000]
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Restrict the query to one or more teams using array operators:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "team",
                "operator": "ARRAY_HAS_ANY",
                "value": ["team-alpha", "team-beta"]
            }
        ],
        "groupBy": ["team", "mcpServerName"]
    }
)
Filter by a custom metadata key/value (e.g., environment):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "metadataKey": "environment",
                "operator": "IN",
                "value": ["production"]
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Count tool calls while excluding internal/debug tools using NOT_IN:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "toolName"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            },
            {
                "fieldName": "toolName",
                "operator": "NOT_IN",
                "value": ["internal-debug", "healthcheck", "echo"]
            }
        ],
        "groupBy": ["toolName"]
    }
)
Combine method, server, latency range, and team filters in a single query:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            },
            {
                "fieldName": "mcpServerName",
                "operator": "IN",
                "value": ["github-mcp", "atlassian-mcp"]
            },
            {
                "fieldName": "latencyMs",
                "operator": "BETWEEN",
                "value": [100, 10000]
            },
            {
                "fieldName": "team",
                "operator": "ARRAY_HAS_ANY",
                "value": ["team-alpha"]
            }
        ],
        "groupBy": ["mcpServerName", "toolName"]
    }
)
Surface the overall traffic mix between initialize, tools/list, tools/call, etc. by grouping on method only:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "method"}
        ],
        "groupBy": ["method"]
    }
)

Timeseries Queries

Get hourly request counts across all MCP traffic:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "intervalInSeconds": 3600
    }
)
Get fine-grained MCP traffic with 5-minute buckets over a 6-hour window:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-21T06:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "intervalInSeconds": 300
    }
)
Get hourly request counts grouped by MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName"],
        "intervalInSeconds": 3600
    }
)
Get hourly tool-call counts grouped by tool name (filtered to tools/call):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "toolName"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            }
        ],
        "groupBy": ["toolName"],
        "intervalInSeconds": 3600
    }
)
Track p99 latency per MCP server hour-by-hour to spot regressions:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"],
        "intervalInSeconds": 3600
    }
)
Hourly counts grouped by server, restricted to a known list of MCPs:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "mcpServerName",
                "operator": "IN",
                "value": ["github-mcp", "atlassian-mcp"]
            }
        ],
        "groupBy": ["mcpServerName"],
        "intervalInSeconds": 3600
    }
)
Hourly counts grouped by team using array filters:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "team",
                "operator": "ARRAY_HAS_ANY",
                "value": ["team-alpha", "team-beta"]
            }
        ],
        "groupBy": ["team"],
        "intervalInSeconds": 3600
    }
)
Daily MCP traffic over a 7-day window:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-14T00:00:00.000Z",
        "endTs": "2025-01-21T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "intervalInSeconds": 86400
    }
)
Hourly breakdown of JSON-RPC method traffic over time:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "method"}
        ],
        "groupBy": ["method"],
        "intervalInSeconds": 3600
    }
)
Hourly p99 tool-call latency per tool, restricted to slow requests (latencyMs > 100):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            },
            {
                "fieldName": "latencyMs",
                "operator": "GREATER_THAN",
                "value": 100
            }
        ],
        "groupBy": ["toolName"],
        "intervalInSeconds": 3600
    }
)

Response Format

The API returns metrics data in JSON format. Aggregation results are returned with keys in camelCase format: {aggregationType}{ColumnName} where the column name is capitalized (e.g., countMcpServerName, avgLatencyMs, p99LatencyMs, countToolName).
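
Following that naming pattern, the result key for any aggregation can be derived programmatically, which is handy when reading responses generically:

```python
def result_key(agg_type: str, column: str) -> str:
    """Result key for an aggregation: type + column with its first
    letter capitalized, e.g. ("count", "mcpServerName") ->
    "countMcpServerName"."""
    return agg_type + column[0].upper() + column[1:]
```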

Distribution Response

{
    "data": [
        {
            "mcpServerName": "github-mcp",
            "countMcpServerName": 1240,
            "avgLatencyMs": 312.5,
            "p99LatencyMs": 1820.0
        },
        {
            "mcpServerName": "atlassian-mcp",
            "countMcpServerName": 860,
            "avgLatencyMs": 415.8,
            "p99LatencyMs": 2310.4
        }
    ]
}
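
This page documents no server-side sort parameter, so rankings such as "top servers by request count" are computed client-side from the distribution rows (the sample data here mirrors the response above):

```python
def top_by(rows, key, n=10):
    """Return the n rows with the largest value for the given result key."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]

rows = [
    {"mcpServerName": "github-mcp", "countMcpServerName": 1240},
    {"mcpServerName": "atlassian-mcp", "countMcpServerName": 860},
]
top_by(rows, "countMcpServerName", n=1)  # github-mcp first
```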

Timeseries Response

{
    "data": [
        {
            "timestamp": "2025-01-21T00:00:00.000Z",
            "toolName": "search_issues",
            "countToolName": 25,
            "p99LatencyMs": 2100.5
        },
        {
            "timestamp": "2025-01-21T01:00:00.000Z",
            "toolName": "search_issues",
            "countToolName": 30,
            "p99LatencyMs": 2350.2
        }
    ]
}
If the groupBy array is empty, the API returns a summarized overview of all MCP requests within the specified time range.
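
For plotting or further analysis, the flat timeseries rows are often pivoted into one series per group. A minimal sketch, using the response shape shown above:

```python
from collections import defaultdict

def series_by_group(rows, group_field, value_key):
    """Pivot timeseries rows into {group: [(timestamp, value), ...]}."""
    series = defaultdict(list)
    for row in rows:
        series[row[group_field]].append((row["timestamp"], row[value_key]))
    return dict(series)

data = [
    {"timestamp": "2025-01-21T00:00:00.000Z", "toolName": "search_issues",
     "countToolName": 25, "p99LatencyMs": 2100.5},
    {"timestamp": "2025-01-21T01:00:00.000Z", "toolName": "search_issues",
     "countToolName": 30, "p99LatencyMs": 2350.2},
]
series_by_group(data, "toolName", "p99LatencyMs")
# {"search_issues": [("2025-01-21T00:00:00.000Z", 2100.5),
#                    ("2025-01-21T01:00:00.000Z", 2350.2)]}
```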