The Gateway MCP Metrics Query API provides a flexible way to query MCP server and tool usage, latency, errors, and traffic patterns across your tenant. You can retrieve either distribution (aggregated) or timeseries MCP metrics with powerful filtering and grouping capabilities.
This page covers the "mcpMetrics" datasource. For querying model and virtual-model metrics, see API Access to Model Metrics.

Access control

  • Tenant admins: Can query metrics for the entire organization (tenant-wide).
  • Users: Can query their own data and their teams’ data.
  • Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.

Authentication

You need to authenticate with your TrueFoundry API key. You can use either a Personal Access Token (PAT) or Virtual Account Token (VAT).
To generate an API key:
  1. Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
  2. Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)
For detailed authentication setup, see our Authentication guide.
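
Every example on this page sends the same two headers. As an illustrative convenience (not part of any TrueFoundry SDK), they can be built with a small helper; the key may be either a PAT or a VAT:

```python
def auth_headers(api_key: str) -> dict:
    """Build the headers used by every metrics query on this page.

    api_key may be a Personal Access Token (PAT) or a
    Virtual Account Token (VAT).
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```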

Quick Start

By default, the API returns metrics across all MCP methods including non-tool calls (initialize, tools/list, prompts/list, etc.). To analyze tool-call usage specifically, filter with {"fieldName": "method", "operator": "IN", "value": ["tools/call"]} and optionally group by toolName.

Distribution Query

Get aggregated MCP metrics distribution — top MCP servers by request count and p99 latency:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "avg", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"]
    }
)

print(response.json())

Timeseries Query

Get MCP tool-call metrics over time with hourly intervals, including latency percentiles:
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["toolName"],
        "filters": [
            {"fieldName": "method", "operator": "IN", "value": ["tools/call"]}
        ],
        "intervalInSeconds": 3600
    }
)

print(response.json())

API Reference

Endpoint

POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query

Request Parameters

startTs
string
required
ISO 8601 timestamp for the start of the data range (e.g., "2025-01-21T00:00:00.000Z")
endTs
string
required
ISO 8601 timestamp for the end of the data range (e.g., "2025-01-22T00:00:00.000Z")
datasource
string
required
The data source to query. Use "mcpMetrics" for gateway MCP server and tool metrics.
type
string
required
The type of query to execute:
  • "distribution" - Returns aggregated metrics
  • "timeseries" - Returns metrics over time intervals
aggregations
array
Array of aggregation objects. Each aggregation specifies:
  • type - The aggregation type
  • column - The column to aggregate on
Supported aggregation types:
  • count - Count of records
  • countDistinct - Count of unique values
  • sum - Sum of values
  • avg - Average of values
  • min - Minimum value
  • max - Maximum value
  • p50 - 50th percentile (median)
  • p75 - 75th percentile
  • p90 - 90th percentile
  • p99 - 99th percentile
"aggregations": [
    {"type": "count", "column": "mcpServerName"},
    {"type": "avg", "column": "latencyMs"},
    {"type": "p99", "column": "latencyMs"}
]
Supported columns for aggregation:
  • latencyMs - Total MCP request latency in milliseconds (useful: sum, avg, min, max, p50, p75, p90, p99)
  • mcpServerName - MCP server name (useful: count, countDistinct)
  • method - JSON-RPC method, e.g. tools/call, tools/list, initialize (useful: count, countDistinct)
  • toolName - Tool invoked; relevant only for tools/call (useful: count, countDistinct)
Token, cost, and time-to-first-token aggregations (inputTokens, outputTokens, costInUSD, timeToFirstTokenMs, interTokenLatencyMs, timePerOutputTokenLatencyMs) do not apply to MCP metrics — these are model-only signals. Use API Access to Model Metrics if you need them.
groupBy
array
Array of fields to group the metrics by. Available options:
  • mcpServerName - Group by MCP server name
  • method - Group by JSON-RPC method
  • toolName - Group by tool name (most useful with a method = "tools/call" filter)
  • userEmail - Group by user email
  • virtualaccount - Group by virtual account
  • team - Group by team (unnests the Teams array)
  • createdBySubjectType - Subject type (e.g. user, virtualaccount)
  • metadata.<key> - Group by a custom metadata key (e.g., metadata.environment)
"groupBy": ["mcpServerName", "method", "metadata.environment"]
filters
array
Array of filter objects to narrow down the results. See Filtering for details.
intervalInSeconds
number
Required for timeseries queries. The time interval in seconds for grouping data points. Common values:
  • 60 - 1 minute intervals
  • 300 - 5 minute intervals
  • 1800 - 30 minute intervals
  • 3600 - 1 hour intervals
  • 86400 - 1 day intervals
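
The interval determines how many data points a timeseries query returns per group. A quick estimate (exact counts depend on how the API aligns interval boundaries, which this page does not specify) is the range duration divided by the interval:

```python
from datetime import datetime

def bucket_count(start_ts: str, end_ts: str, interval_s: int) -> int:
    """Estimate the number of timeseries data points per group."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    start = datetime.strptime(start_ts.replace("Z", "+0000"), fmt)
    end = datetime.strptime(end_ts.replace("Z", "+0000"), fmt)
    return int((end - start).total_seconds() // interval_s)

# A 24-hour range at 1-hour intervals yields ~24 buckets per group
bucket_count("2025-01-21T00:00:00.000Z", "2025-01-22T00:00:00.000Z", 3600)
```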

Filtering

Filters allow you to narrow down your query results. The API supports different filter operators depending on the field type.

Filter Structure

For standard fields, use fieldName:
{
    "fieldName": "mcpServerName",
    "operator": "IN",
    "value": ["github-mcp", "atlassian-mcp"]
}

Filterable Fields

  • mcpServerName (string) - Name of the MCP server
  • method (string) - JSON-RPC method invoked (e.g., tools/call, tools/list, initialize)
  • toolName (string) - Name of the tool called (relevant only for tools/call)
  • userEmail (string) - Email of the user making the request
  • virtualAccount (string) - The virtual account name
  • team (array) - Teams associated with the request
  • latencyMs (number) - Request latency in milliseconds
  • conversationID (string) - The conversation identifier

Filter Operators

String Field Operators

  • EQUAL - Exact match (e.g., "alice@example.com")
  • NOT_EQUAL - Not equal to value (e.g., "bot@example.com")
  • IN - Match any value in the list (e.g., ["github-mcp", "atlassian-mcp"])
  • NOT_IN - Exclude values in the list (e.g., ["internal-debug-tool"])
  • STRING_CONTAINS - Contains substring (e.g., "mcp")
  • STRING_STARTS_WITH - Starts with prefix (e.g., "github-")
  • STRING_ENDS_WITH - Ends with suffix (e.g., "-mcp")
EQUAL and NOT_EQUAL are supported on userEmail, virtualAccount, and conversationID only. The MCP-specific string fields mcpServerName, method, and toolName support only IN, NOT_IN, STRING_CONTAINS, STRING_STARTS_WITH, STRING_ENDS_WITH.
# Filter for tool-call traffic only
{
    "fieldName": "method",
    "operator": "IN",
    "value": ["tools/call"]
}

# Filter for specific MCP servers
{
    "fieldName": "mcpServerName",
    "operator": "IN",
    "value": ["github-mcp", "atlassian-mcp"]
}

# Filter server names containing "mcp"
{
    "fieldName": "mcpServerName",
    "operator": "STRING_CONTAINS",
    "value": "mcp"
}

# Filter server names starting with "github-"
{
    "fieldName": "mcpServerName",
    "operator": "STRING_STARTS_WITH",
    "value": "github-"
}

Numeric Field Operators

  • GREATER_THAN - Greater than value (e.g., 1000)
  • LESS_THAN - Less than value (e.g., 5000)
  • GREATER_THAN_EQUAL - Greater than or equal to value (e.g., 100)
  • LESS_THAN_EQUAL - Less than or equal to value (e.g., 1000)
  • BETWEEN - Between two values, inclusive (e.g., [500, 5000])
# Filter for high-latency MCP requests
{
    "fieldName": "latencyMs",
    "operator": "GREATER_THAN",
    "value": 1000
}

# Filter for latency within a range
{
    "fieldName": "latencyMs",
    "operator": "BETWEEN",
    "value": [500, 5000]
}

Array Field Operators (Teams)

  • ARRAY_HAS_ANY - Match if the array contains any of the values (e.g., ["team-alpha", "team-beta"])
  • ARRAY_HAS_NONE - Match if the array contains none of the values (e.g., ["excluded-team"])
# Filter for specific teams
{
    "fieldName": "team",
    "operator": "ARRAY_HAS_ANY",
    "value": ["team-alpha", "team-beta"]
}

# Exclude specific teams
{
    "fieldName": "team",
    "operator": "ARRAY_HAS_NONE",
    "value": ["excluded-team"]
}

Combining Multiple Filters

You can combine multiple filters in a single query. All filters are applied with AND logic:
{
    "startTs": "2025-01-21T00:00:00.000Z",
    "endTs": "2025-01-22T00:00:00.000Z",
    "datasource": "mcpMetrics",
    "type": "distribution",
    "aggregations": [
        {"type": "count", "column": "toolName"},
        {"type": "p99", "column": "latencyMs"}
    ],
    "filters": [
        {
            "fieldName": "method",
            "operator": "IN",
            "value": ["tools/call"]
        },
        {
            "fieldName": "mcpServerName",
            "operator": "IN",
            "value": ["github-mcp", "atlassian-mcp"]
        },
        {
            "fieldName": "latencyMs",
            "operator": "LESS_THAN",
            "value": 5000
        },
        {
            "fieldName": "team",
            "operator": "ARRAY_HAS_ANY",
            "value": ["team-alpha"]
        },
        {
            "metadataKey": "environment",
            "operator": "IN",
            "value": ["production"]
        }
    ],
    "groupBy": ["mcpServerName", "toolName"]
}
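
All queries against this endpoint share the same body shape, so a small builder function (a hypothetical helper for illustration, not part of any SDK) can cut down on repetition when issuing many queries:

```python
def build_mcp_query(start_ts, end_ts, query_type, aggregations,
                    group_by=None, filters=None, interval_s=None):
    """Assemble a metrics query body for the mcpMetrics datasource."""
    body = {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "mcpMetrics",
        "type": query_type,
        "aggregations": aggregations,
    }
    if group_by:
        body["groupBy"] = group_by
    if filters:
        body["filters"] = filters
    if interval_s is not None:
        body["intervalInSeconds"] = interval_s  # required for timeseries
    return body
```

The result is passed as the `json=` argument of `requests.post`, exactly as in the examples below.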

Query Examples

Distribution Queries

Get total request counts grouped by MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName"]
    }
)
Count tools/call traffic grouped by tool name. The method filter is required because toolName is only populated for tool-call requests:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "toolName"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            }
        ],
        "groupBy": ["toolName"]
    }
)
Break down the JSON-RPC method mix per MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "method"}
        ],
        "groupBy": ["mcpServerName", "method"]
    }
)
Get p50, p90, and p99 latency percentiles per MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "p50", "column": "latencyMs"},
            {"type": "p90", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"]
    }
)
Get p50, p90, and p99 latency percentiles per tool (filtered to tools/call):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "p50", "column": "latencyMs"},
            {"type": "p90", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            }
        ],
        "groupBy": ["toolName"]
    }
)
Group by server, team, and user simultaneously to see who is hitting which MCP:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName", "team", "userEmail"]
    }
)
Group by server and a custom metadata key (e.g., environment):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName", "metadata.environment"]
    }
)
Restrict the query to a known list of MCP servers:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "mcpServerName",
                "operator": "IN",
                "value": ["github-mcp", "atlassian-mcp", "slack-mcp"]
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Find MCP requests that exceeded a latency threshold:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "avg", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "latencyMs",
                "operator": "GREATER_THAN",
                "value": 1000
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Find MCP requests whose latency falls within a specific window:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "latencyMs",
                "operator": "BETWEEN",
                "value": [500, 5000]
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Restrict the query to one or more teams using array operators:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "team",
                "operator": "ARRAY_HAS_ANY",
                "value": ["team-alpha", "team-beta"]
            }
        ],
        "groupBy": ["team", "mcpServerName"]
    }
)
Filter by a custom metadata key/value (e.g., environment):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "metadataKey": "environment",
                "operator": "IN",
                "value": ["production"]
            }
        ],
        "groupBy": ["mcpServerName"]
    }
)
Count tool calls while excluding internal/debug tools using NOT_IN:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "toolName"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            },
            {
                "fieldName": "toolName",
                "operator": "NOT_IN",
                "value": ["internal-debug", "healthcheck", "echo"]
            }
        ],
        "groupBy": ["toolName"]
    }
)
Combine method, server, latency range, and team filters in a single query:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            },
            {
                "fieldName": "mcpServerName",
                "operator": "IN",
                "value": ["github-mcp", "atlassian-mcp"]
            },
            {
                "fieldName": "latencyMs",
                "operator": "BETWEEN",
                "value": [100, 10000]
            },
            {
                "fieldName": "team",
                "operator": "ARRAY_HAS_ANY",
                "value": ["team-alpha"]
            }
        ],
        "groupBy": ["mcpServerName", "toolName"]
    }
)
Surface the overall traffic mix between initialize, tools/list, tools/call, etc. by grouping on method only:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "method"}
        ],
        "groupBy": ["method"]
    }
)

Timeseries Queries

Get hourly request counts across all MCP traffic:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "intervalInSeconds": 3600
    }
)
Get fine-grained MCP traffic with 5-minute buckets over a 6-hour window:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-21T06:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "intervalInSeconds": 300
    }
)
Get hourly request counts grouped by MCP server:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "groupBy": ["mcpServerName"],
        "intervalInSeconds": 3600
    }
)
Get hourly tool-call counts grouped by tool name (filtered to tools/call):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "toolName"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            }
        ],
        "groupBy": ["toolName"],
        "intervalInSeconds": 3600
    }
)
Track p99 latency per MCP server hour-by-hour to spot regressions:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"],
        "intervalInSeconds": 3600
    }
)
Hourly counts grouped by server, restricted to a known list of MCPs:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "mcpServerName",
                "operator": "IN",
                "value": ["github-mcp", "atlassian-mcp"]
            }
        ],
        "groupBy": ["mcpServerName"],
        "intervalInSeconds": 3600
    }
)
Hourly counts grouped by team using array filters:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "filters": [
            {
                "fieldName": "team",
                "operator": "ARRAY_HAS_ANY",
                "value": ["team-alpha", "team-beta"]
            }
        ],
        "groupBy": ["team"],
        "intervalInSeconds": 3600
    }
)
Daily MCP traffic over a 7-day window:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-14T00:00:00.000Z",
        "endTs": "2025-01-21T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"}
        ],
        "intervalInSeconds": 86400
    }
)
Hourly breakdown of JSON-RPC method traffic over time:
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "method"}
        ],
        "groupBy": ["method"],
        "intervalInSeconds": 3600
    }
)
Hourly p99 tool-call latency per tool, restricted to slow requests (latencyMs > 100):
response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2025-01-21T00:00:00.000Z",
        "endTs": "2025-01-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "filters": [
            {
                "fieldName": "method",
                "operator": "EQUAL",
                "value": "tools/call"
            },
            {
                "fieldName": "latencyMs",
                "operator": "GREATER_THAN",
                "value": 100
            }
        ],
        "groupBy": ["toolName"],
        "intervalInSeconds": 3600
    }
)

Response Format

The API returns metrics data in JSON format. Aggregation results are returned with keys in camelCase format: {aggregationType}{ColumnName} where the column name is capitalized (e.g., countMcpServerName, avgLatencyMs, p99LatencyMs, countToolName).
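
Following that naming pattern, the result key for any aggregation can be derived programmatically, which is handy when reading responses generically:

```python
def result_key(agg_type: str, column: str) -> str:
    """Result key for an aggregation: type + column with its first
    letter capitalized, e.g. ("count", "mcpServerName") ->
    "countMcpServerName"."""
    return agg_type + column[0].upper() + column[1:]
```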

Distribution Response

{
    "data": [
        {
            "mcpServerName": "github-mcp",
            "countMcpServerName": 1240,
            "avgLatencyMs": 312.5,
            "p99LatencyMs": 1820.0
        },
        {
            "mcpServerName": "atlassian-mcp",
            "countMcpServerName": 860,
            "avgLatencyMs": 415.8,
            "p99LatencyMs": 2310.4
        }
    ]
}
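
This page documents no server-side sort parameter, so rankings such as "top servers by request count" are computed client-side from the distribution rows (the sample data here mirrors the response above):

```python
def top_by(rows, key, n=10):
    """Return the n rows with the largest value for the given result key."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]

rows = [
    {"mcpServerName": "github-mcp", "countMcpServerName": 1240},
    {"mcpServerName": "atlassian-mcp", "countMcpServerName": 860},
]
top_by(rows, "countMcpServerName", n=1)  # github-mcp first
```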

Timeseries Response

{
    "data": [
        {
            "timestamp": "2025-01-21T00:00:00.000Z",
            "toolName": "search_issues",
            "countToolName": 25,
            "p99LatencyMs": 2100.5
        },
        {
            "timestamp": "2025-01-21T01:00:00.000Z",
            "toolName": "search_issues",
            "countToolName": 30,
            "p99LatencyMs": 2350.2
        }
    ]
}
If the groupBy array is empty, the API returns a summarized overview of all MCP requests within the specified time range.
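
For plotting or further analysis, the flat timeseries rows are often pivoted into one series per group. A minimal sketch, using the response shape shown above:

```python
from collections import defaultdict

def series_by_group(rows, group_field, value_key):
    """Pivot timeseries rows into {group: [(timestamp, value), ...]}."""
    series = defaultdict(list)
    for row in rows:
        series[row[group_field]].append((row["timestamp"], row[value_key]))
    return dict(series)

data = [
    {"timestamp": "2025-01-21T00:00:00.000Z", "toolName": "search_issues",
     "countToolName": 25, "p99LatencyMs": 2100.5},
    {"timestamp": "2025-01-21T01:00:00.000Z", "toolName": "search_issues",
     "countToolName": 30, "p99LatencyMs": 2350.2},
]
series_by_group(data, "toolName", "p99LatencyMs")
# {"search_issues": [("2025-01-21T00:00:00.000Z", 2100.5),
#                    ("2025-01-21T01:00:00.000Z", 2350.2)]}
```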