# API Access to MCP Metrics

> Query Gateway MCP server and tool metrics for traffic and latency analytics via API.

The **Gateway MCP Metrics Query API** lets you query Gateway MCP server and tool metrics: methods invoked, tools called, latency, errors, and traffic mix. Results come back either as a **distribution** (one aggregated row per group) or as a **timeseries** (rows bucketed by time), and you can narrow and group them with `filters` and `groupBy`.

<Info>
  This page covers `datasource: "mcpMetrics"`. For other datasources, see the sibling pages for [Model](/docs/ai-gateway/fetch-model-metrics), [Guardrail](/docs/ai-gateway/fetch-guardrail-metrics), [Cache](/docs/ai-gateway/fetch-cache-metrics), [Routing](/docs/ai-gateway/fetch-routing-metrics), and [Agent](/docs/ai-gateway/fetch-agent-metrics) metrics.
</Info>

### Access control

* **Tenant admins:** Can query metrics for the entire organization (tenant-wide).
* **Users:** Can query their own data and their teams' data.
* **Virtual accounts:** Can query their own data and their teams' data; with tenant-admin permissions, they can access tenant-wide data.

The server applies RBAC automatically; callers don't pass any RBAC fields.

## Contents

| Section                                                                           | Description                                    |
| --------------------------------------------------------------------------------- | ---------------------------------------------- |
| [Overview](/docs/ai-gateway/fetch-mcp-metrics)                                    | Authentication, quick start, and API reference |
| [Filtering](/docs/ai-gateway/fetch-mcp-metrics-filtering)                         | Filter operators, fields, and combinations     |
| [Distribution examples](/docs/ai-gateway/fetch-mcp-metrics-examples-distribution) | Aggregated (distribution) query examples       |
| [Timeseries examples](/docs/ai-gateway/fetch-mcp-metrics-examples-timeseries)     | Time-bucketed (timeseries) query examples      |
| [Response format](/docs/ai-gateway/fetch-mcp-metrics-response)                    | Response JSON structure and error responses    |

## Authentication

Authenticate with a TrueFoundry API key, which can be either a **Personal Access Token (PAT)** or a **Virtual Account Token (VAT)**.

<Accordion title="Get your API key">
  To generate an API key:

  1. **Personal Access Token (PAT)**: Go to Access → Personal Access Tokens in your TrueFoundry dashboard
  2. **Virtual Account Token (VAT)**: Go to Access → Virtual Account Tokens (requires admin permissions)

  For detailed authentication setup, see our [Authentication guide](/docs/ai-gateway/authentication).
</Accordion>
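
One way to keep the token out of source code is to read it from an environment variable. The sketch below assumes a variable named `TFY_API_KEY`, which is a hypothetical name chosen for illustration and not one the platform defines. A PAT and a VAT are sent the same way, as a bearer token.

```python
import os


def build_headers(env_var: str = "TFY_API_KEY") -> dict:
    """Return request headers using a token read from the environment.

    `TFY_API_KEY` is a hypothetical variable name used only in this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to a PAT or VAT before querying metrics")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the `headers` argument of `requests.post`.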

## Quick Start

<Warning>
  By default, the API returns metrics across **all MCP methods** including non-tool calls (`initialize`, `tools/list`, `prompts/list`, etc.). To analyse tool-call usage specifically, filter with `{"fieldName": "method", "operator": "IN", "value": ["tools/call"]}` and optionally group by `toolName`.
</Warning>
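
Because filters are AND-combined, the `tools/call` filter can be appended to any existing query body without disturbing the filters already present. A minimal sketch of such a helper follows; `tool_calls_only` is a hypothetical name, not part of any TrueFoundry SDK.

```python
import copy

# The filter quoted in the warning above.
TOOL_CALL_FILTER = {"fieldName": "method", "operator": "IN", "value": ["tools/call"]}


def tool_calls_only(payload: dict) -> dict:
    """Return a copy of `payload` restricted to `tools/call` traffic."""
    scoped = copy.deepcopy(payload)  # leave the caller's payload untouched
    filters = scoped.setdefault("filters", [])
    if TOOL_CALL_FILTER not in filters:  # avoid adding the same filter twice
        filters.append(copy.deepcopy(TOOL_CALL_FILTER))
    return scoped
```

For example, `tool_calls_only({"datasource": "mcpMetrics", "type": "distribution", "groupBy": ["toolName"]})` yields a body that counts only tool calls, grouped by tool.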

<Note>
  Token, cost, and token-latency aggregations (`inputTokens`, `outputTokens`, `costInUSD`, `timeToFirstTokenMs`, `interTokenLatencyMs`, `timePerOutputTokenLatencyMs`) **do not apply** to MCP metrics because they are model-only signals. Use the [Model Metrics](/docs/ai-gateway/fetch-model-metrics) datasource if you need them.
</Note>

### Distribution query

Top MCP servers by request count and p99 latency:

```python
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "distribution",
        "aggregations": [
            {"type": "count", "column": "mcpServerName"},
            {"type": "avg", "column": "latencyMs"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["mcpServerName"]
    }
)

print(response.json())
```

### Timeseries query

Hourly tool-call counts and p99 latency by tool:

```python
import requests

response = requests.post(
    "https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query",
    headers={
        "Authorization": "Bearer <your_api_key>",
        "Content-Type": "application/json"
    },
    json={
        "startTs": "2026-04-21T00:00:00.000Z",
        "endTs": "2026-04-22T00:00:00.000Z",
        "datasource": "mcpMetrics",
        "type": "timeseries",
        "interval": "1 hour",
        "aggregations": [
            {"type": "count", "column": "toolName"},
            {"type": "p99", "column": "latencyMs"}
        ],
        "groupBy": ["toolName"],
        "filters": [
            {"fieldName": "method", "operator": "IN", "value": ["tools/call"]}
        ]
    }
)

print(response.json())
```

## API reference

### Endpoint

```
POST https://{your_control_plane_url}/api/svc/v1/llm-gateway/metrics/query
```

Send a JSON request body to this endpoint with the headers `Authorization: Bearer <your_api_key>` and `Content-Type: application/json`.
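
If `requests` is not available, the same call can be made with the Python standard library. The sketch below only builds the request object; `build_request` is a hypothetical helper, and `control_plane_url` is assumed to be a bare host name without a scheme.

```python
import json
import urllib.request


def build_request(control_plane_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a metrics query."""
    url = f"https://{control_plane_url}/api/svc/v1/llm-gateway/metrics/query"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # JSON request body
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending it requires network access and a valid key:
# with urllib.request.urlopen(build_request(host, key, payload), timeout=30) as resp:
#     result = json.load(resp)
```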

### Request parameters

<ParamField path="startTs" type="string" required>
  ISO 8601 timestamp marking the **inclusive** lower bound of the query window.
</ParamField>

<ParamField path="endTs" type="string" required>
  ISO 8601 timestamp marking the **exclusive** upper bound of the query window.
</ParamField>
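
Since `startTs` is inclusive and `endTs` is exclusive, back-to-back windows can share a boundary without counting any request twice. The helper below is a minimal sketch that produces an hour-aligned window in the same millisecond format used in the Quick Start examples; the function names are hypothetical, and aligning to the hour is a choice made for this example rather than an API requirement.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def to_iso_millis(ts: datetime) -> str:
    """Format a datetime as UTC ISO 8601 with millisecond precision."""
    utc = ts.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def last_n_hours(n: int, now: Optional[datetime] = None) -> dict:
    """Return startTs/endTs for the n full hours that ended before `now`."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    end = now.replace(minute=0, second=0, microsecond=0)  # exclusive upper bound
    start = end - timedelta(hours=n)                      # inclusive lower bound
    return {"startTs": to_iso_millis(start), "endTs": to_iso_millis(end)}
```

The returned dictionary can be merged into a request body, for example `{**last_n_hours(24), "datasource": "mcpMetrics", "type": "distribution"}`.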

<ParamField path="datasource" type="string" required>
  The data source to query. Use `"mcpMetrics"` for Gateway MCP metrics.
</ParamField>

<ParamField path="type" type="string" required>
  The type of query to execute:

  * `"distribution"`: returns aggregated rows (one row per `groupBy` combination).
  * `"timeseries"`: returns time-bucketed rows. Requires `interval`.
</ParamField>
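
The rule that a timeseries query needs an interval can be checked on the client before the request is sent. This is a minimal sketch of a hypothetical `build_query` helper; the server remains the authority on validation, and the error messages here are invented for illustration.

```python
from typing import Optional


def build_query(query_type: str, start_ts: str, end_ts: str,
                interval: Optional[str] = None, **extra) -> dict:
    """Assemble an mcpMetrics request body, rejecting a timeseries without an interval."""
    if query_type not in ("distribution", "timeseries"):
        raise ValueError(f"unknown query type: {query_type!r}")
    # The deprecated `intervalInSeconds` alias also satisfies the requirement.
    has_interval = interval is not None or "intervalInSeconds" in extra
    if query_type == "timeseries" and not has_interval:
        raise ValueError("timeseries queries require `interval`")
    payload = {
        "startTs": start_ts,
        "endTs": end_ts,
        "datasource": "mcpMetrics",
        "type": query_type,
    }
    if interval is not None:
        payload["interval"] = interval
    payload.update(extra)  # aggregations, groupBy, filters, ...
    return payload
```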

<ParamField path="aggregations" type="array">
  Array of `{ type, column }` objects describing the aggregations to compute. When omitted, only the implicit `total = COUNT(*)` is returned.

  **Supported aggregation types**

  | Type                                                          | Description                                                   |
  | ------------------------------------------------------------- | ------------------------------------------------------------- |
  | `sum`                                                         | Sum of values                                                 |
  | `count`                                                       | Non-null count of the column                                  |
  | `countDistinct`                                               | Distinct count                                                |
  | `min`                                                         | Minimum value                                                 |
  | `max`                                                         | Maximum value                                                 |
  | `avg`                                                         | Average                                                       |
  | `p5`, `p10`, `p25`, `p50`, `p75`, `p90`, `p95`, `p99`, `p999` | Percentiles (approximate)                                     |
  | `rateSum`, `rateAvg`, `rateMin`, `rateMax`                    | Rates normalised by the interval in seconds (timeseries only) |
  | `ratePerMinute`                                               | Value divided by the interval in minutes (timeseries only)    |

  **Supported aggregation columns**

  | Column          | Supported aggregation types    | Notes                                           |
  | --------------- | ------------------------------ | ----------------------------------------------- |
  | `latencyMs`     | all scalar and all percentiles | Total MCP request latency in ms                 |
  | `mcpServerName` | `count`, `countDistinct`       | Useful for "how many servers were called?"      |
  | `method`        | `count`, `countDistinct`       | Useful for traffic-mix breakdowns               |
  | `toolName`      | `count`, `countDistinct`       | Useful when filtered to `method = "tools/call"` |
</ParamField>
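
The two tables above can be turned into a small client-side check. The sketch below is an assumption-laden reading of them: it allows only `count` and `countDistinct` on the name columns, treats the rate types as timeseries-only, and accepts every listed type on `latencyMs` because the tables do not say whether rate types apply there. `check_aggregation` is a hypothetical helper.

```python
PERCENTILES = {"p5", "p10", "p25", "p50", "p75", "p90", "p95", "p99", "p999"}
SCALARS = {"sum", "count", "countDistinct", "min", "max", "avg"}
RATES = {"rateSum", "rateAvg", "rateMin", "rateMax", "ratePerMinute"}
NAME_COLUMNS = {"mcpServerName", "method", "toolName"}
KNOWN_COLUMNS = NAME_COLUMNS | {"latencyMs"}


def check_aggregation(agg: dict, query_type: str) -> list:
    """Return a list of problems found in one aggregation (empty means it looks valid)."""
    problems = []
    agg_type, column = agg.get("type"), agg.get("column")
    if agg_type not in SCALARS | PERCENTILES | RATES:
        problems.append(f"unknown aggregation type: {agg_type!r}")
    if column not in KNOWN_COLUMNS:
        problems.append(f"unknown aggregation column: {column!r}")
    if column in NAME_COLUMNS and agg_type not in {"count", "countDistinct"}:
        problems.append(f"{column} supports only count and countDistinct")
    if agg_type in RATES and query_type != "timeseries":
        problems.append(f"{agg_type} is available for timeseries queries only")
    return problems
```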

<ParamField path="groupBy" type="array">
  Array of field names to group results by. Custom metadata keys are supported with a `metadata.` prefix.

  **Available group-by fields**

  | Field                  | Notes                                                                   |
  | ---------------------- | ----------------------------------------------------------------------- |
  | `mcpServerName`        | The MCP server name                                                     |
  | `method`               | JSON-RPC method invoked (e.g. `tools/call`, `tools/list`, `initialize`) |
  | `toolName`             | Tool invoked (populated only for `tools/call`)                          |
  | `userEmail`            | Group by user (response key: `createdBySubjectSlug`)                    |
  | `virtualaccount`       | Group by virtual account (response key: `createdBySubjectSlug`)         |
  | `team`                 | Unnests the `Teams` array                                               |
  | `createdBySubjectType` | Distinguishes `user` vs `virtualaccount`                                |
  | `metadata.<key>`       | Group by a custom metadata key                                          |

  <Note>
    When `groupBy` contains `userEmail` but not `virtualaccount`, the server auto-injects `WHERE CreatedBySubjectType = 'user'`. When it contains `virtualaccount` but not `userEmail`, the server auto-injects `WHERE CreatedBySubjectType = 'virtualaccount'`. When both appear, scope the query yourself with `createdBySubjectType` if needed.
  </Note>
</ParamField>
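
A short sketch of building a `groupBy` list that mixes built-in fields with custom metadata keys. The `group_by` helper is hypothetical, and `environment` in the usage note is an invented metadata key used only for illustration.

```python
def group_by(*fields: str, metadata_keys=()) -> list:
    """Build a groupBy list, adding the `metadata.` prefix to custom keys when missing."""
    prefixed = [
        key if key.startswith("metadata.") else f"metadata.{key}"
        for key in metadata_keys
    ]
    return list(fields) + prefixed
```

For example, `group_by("mcpServerName", "team", metadata_keys=["environment"])` returns `["mcpServerName", "team", "metadata.environment"]`.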

<ParamField path="filters" type="array">
  Array of filter objects, AND-combined. See [Filtering](/docs/ai-gateway/fetch-mcp-metrics-filtering) for the full operator reference and the per-field allow-list.
</ParamField>

<ParamField path="interval" type="string">
  **Required for timeseries queries.** Bucket size as `<positive integer> <unit>`, where `<unit>` is one of `second`, `minute`, `hour`, `day`, `week`, `month`, `year` (with or without a trailing `s`). Examples: `"30 second"`, `"5 minute"`, `"1 hour"`, `"1 day"`. Compound expressions like `"1 hour 30 minute"` are rejected.
</ParamField>
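
The interval grammar is simple enough to validate before sending a request. The sketch below is stricter than the server may be: it assumes a single space between the number and the unit, lowercase unit names, and no leading zeros. `parse_interval` is a hypothetical helper.

```python
import re

# <positive integer> <unit>, with an optional trailing "s" on the unit.
_INTERVAL_RE = re.compile(r"([1-9]\d*) (second|minute|hour|day|week|month|year)s?")


def parse_interval(interval: str) -> tuple:
    """Return (count, unit) for a valid interval string, or raise ValueError."""
    match = _INTERVAL_RE.fullmatch(interval)
    if match is None:
        raise ValueError(f"invalid interval: {interval!r}")
    return int(match.group(1)), match.group(2)
```

Under these assumptions, `"5 minutes"` parses to `(5, "minute")`, while the compound expression `"1 hour 30 minute"` raises `ValueError`.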

<ParamField path="intervalInSeconds" type="number" deprecated>
  **Deprecated alias for `interval`.** Accepts a positive integer number of seconds. Prefer `interval` in new code. If both are provided, `interval` takes precedence.
</ParamField>
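
When migrating callers from `intervalInSeconds` to `interval`, a second count can be rewritten as an interval string. This is a minimal sketch with a hypothetical helper name; it uses only fixed-length units up to `day`, and it assumes the server buckets, say, `"1 hour"` the same way as `3600` seconds, which is worth verifying against real responses.

```python
# Fixed-length units, largest first.
_UNITS = (("day", 86400), ("hour", 3600), ("minute", 60), ("second", 1))


def seconds_to_interval(seconds: int) -> str:
    """Convert a positive whole number of seconds to '<n> <unit>' using the largest exact unit."""
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    for unit, size in _UNITS:
        if seconds % size == 0:
            return f"{seconds // size} {unit}"
    raise AssertionError("unreachable: every integer is a multiple of 1 second")
```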
