GET /api/svc/v1/metrics/cluster/{id}/charts/{chart}
Get metric data for a cluster chart
curl --request GET \
  --url https://{controlPlaneURL}/api/svc/v1/metrics/cluster/{id}/charts/{chart} \
  --header 'Authorization: Bearer <token>'
{
  "graph": {
    "name": "<string>",
    "graphLines": [
      {
        "name": "<string>",
        "values": [
          {
            "timestamp": "2026-05-20T12:00:00.000Z",
            "value": 123
          }
        ],
        "queryType": "<string>",
        "totalCount": 123,
        "aggregateValue": {}
      }
    ],
    "displayName": "<string>",
    "description": "<string>",
    "aggregateValue": 123,
    "aggregateUnit": "% success rate",
    "unit": "cores",
    "queryTypes": [
      "<string>"
    ],
    "thresholds": [
      {
        "mode": "absolute",
        "name": "<string>",
        "value": 123
      }
    ]
  },
  "stepSize": 123
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

chart
enum<string>
required

Chart identifier returned by list_cluster_metric_charts.

Available options:
cpuUsage,
quantileOverTimeCpuUsage,
cpuThrottling,
memoryUsage,
networkBytes,
diskThroughput,
pvcUsage,
workbenchHomeDirectoryUsage,
requestVolume,
requestDuration,
successfulRuns,
failedRuns,
averageDuration,
maximumDuration,
dcgmGpuUtilization,
dcgmGpuMemoryUsed,
dcgmGpuTemperature,
dcgmGpuPowerUsage,
asyncServiceMessagesInProcess,
asyncServiceMessagesProcessed,
asyncServiceProcessingTime,
asyncServiceOutputMessagePublishTime,
asyncServiceProcessingFailures,
asyncServiceOutputMessagePublishFailures,
asyncServiceInputMessageFetchFailuresRate,
asyncServiceInputMessageFetchAckFailuresRate,
asyncServiceProcessingTimeHistogramMs,
asyncServiceInputLatencyNs,
asyncServiceKedaScalerMetricValue,
kedaScalerMetricValue,
rateOfInputTokensPerUser,
rateOfEmbeddingInputTokensPerUser,
rateOfOutputTokensPerUser,
modelInferenceRequestLatency,
rateOfModelInferenceRequest,
rateOfModelInferenceError,
rateOfModelInferenceErrorByPageType,
rateOfModelInferenceErrorPercentage,
embeddingModelInferenceRequestLatency,
costOfInference,
modelInferenceRequestPerResponseCode,
inputTokensConsumed,
outputTokensConsumed,
configAffectedRequests,
uniqueActiveUsers,
timeToFirstToken,
interTokenLatency,
perOutputTokenLatency,
modelUsageByUserOrVirtualAccount,
virtualModelUsageByUserOrVirtualAccount,
modelErrorDistribution,
podStatus,
clusterNodes,
clusterPods,
clusterCpuUsage,
clusterMemoryUsage,
clusterGpuNodes,
clusterGpuCount,
clusterGpuMemory,
clusterVolumeUsage,
clusterNetworkBytes,
sparkJobCpuUsageForExecutors,
sparkJobMemoryUsageForExecutors,
sparkJobCpuUsageForDriver,
sparkJobMemoryUsageForDriver,
sparkJobNoOfExecutors,
rateLimitingConfig,
loadBalancingConfig,
fallbackConfig,
guardrailsConfig,
budgetConfig,
rateLimitRPS,
rateLimitBlocked,
loadBalanceRPS,
loadBalanceAllModelFailed,
loadBalanceModelDistribution,
ratelimitRuleIdNominatedVsAppliedDistribution,
budgetRuleIdNominatedVsAppliedDistribution,
ratelimitBlockedDistributionByCreatedBySubject,
budgetBlockedDistributionByCreatedBySubject,
budgetRPS,
budgetBlocked,
containerRestarts,
probeFailures,
imagePullTime,
rateOfRequestsPerMcpServer,
latencyPerMcpServer,
mcpServerErrors,
mcpMethodCallsByMcpServer,
requestFailuresByMcpServer,
requestFailuresByMcpServerDistribution,
rateOfRequestsPerTool,
latencyPerTool,
toolErrors,
requestFailuresByTool,
latencySummaryPerTool,
requestsPerTool,
mcpServerUsageBreakdown,
mcpServerErrorBreakdown,
rateOfRequestsPerMcpServerPerUser,
latencyPerMcpServerPerUser,
mcpServerErrorsPerUser,
toolUsageBreakdown,
toolErrorBreakdown,
rateOfRequestsPerToolPerUser,
latencyPerToolPerUser,
toolErrorsPerUser,
requestFailuresPerTool,
requestFailuresSummaryPerTool,
mcpErrorBreakdown,
rateOfRequestsPerGuardrail,
requestsPerSecondByResult,
latencyPerGuardrail,
flaggedRateOfRequestsPerGuardrail,
mutatedRateOfRequestsPerGuardrail,
inputRequestsBreakdownPerGuardrail,
outputRequestsBreakdownPerGuardrail,
flagAndMutationResultsBreakdownPerUser,
guardrailErrors,
latencySummaryPerGuardrail,
requestBudgetLimited,
topNUsersByUsage,
topNVirtualAccountsByUsage,
topNModelsByUsage,
topNModelProvidersByUsage,
topNMcpServersByUsage,
topNToolsByUsage,
topNUsersByMcp,
topNVirtualAccountsByMcp,
incomingRequestsByType,
guardrailsOverview,
errorBreakdown,
errorBreakdownDetailed,
mcpErrorBreakdownDetailed,
meterLlmRequests,
meterMcpRequests,
meterTotalCost,
guardrailResultsBreakdownByPage,
guardrailBlockedByGuardrailsByPage,
guardrailMutatedByGuardrailsByPage,
guardrailUsageByPage,
routingRuleTargetModelDistribution,
routingDecisionsPerSecond,
routingFailureRate,
routingRuleUsageRate,
rateLimitChecksRate,
rateLimitExceededRate,
rateLimitResultBreakdown,
budgetLimitChecksRate,
budgetLimitExceededRate,
budgetLimitResultBreakdown,
routingRuleUsageRateByUser,
routingFailureRateByUser,
routingRuleUsageBreakdown,
rateLimitChecksRateByUser,
rateLimitExceededRateByUser,
rateLimitRuleUsageBreakdown,
rateLimitRuleExceededBreakdown,
rateLimitCheckResultBreakdown,
budgetLimitChecksRateByUser,
budgetLimitExceededRateByUser,
budgetLimitRuleUsageBreakdown,
budgetLimitRuleExceededBreakdown,
budgetLimitCheckResultBreakdown,
cacheEligibleRequests,
cacheHitPercentage,
cacheCostSavings,
cacheErrors,
cacheLookupLatency,
providerCacheUsagePercentage,
requestsPerSecondAgent,
requestLatencyAgent,
agentFailureRateByErrorType,
agentRequestFailureRate,
agentRequestFailuresBreakdown,
agentUsageBreakdown,
agentErrorBreakdown

id
string
required

Unique identifier of the cluster.

Query Parameters

startTs
number
required

Start of the query window, in milliseconds since the Unix epoch.

Example:

"1735201111814"

endTs
number
required

End of the query window, in milliseconds since the Unix epoch.

Example:

"1735204711814"

utcOffsetSeconds
number
required

Caller UTC offset in seconds.

params
string
required

JSON-encoded string describing the chart configuration. Call GET /metrics/cluster/{id}/charts (tool: list_cluster_metric_charts) to get available charts and pass graphs[*].params from that response unchanged.
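The four query parameters above are all required, so a complete request URL carries each of them. The following is a minimal Python sketch of how such a URL might be assembled, not an official client. The helper names (`to_epoch_ms`, `build_chart_url`), the host `example.invalid`, the cluster id, and the `"{}"` value for `params` are illustrative assumptions. In practice `params` should be the string taken unchanged from the chart list response.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    # Integer division of timedeltas avoids float rounding errors.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def build_chart_url(control_plane_url, cluster_id, chart,
                    start_ts, end_ts, utc_offset_seconds, params):
    """Return the GET URL for one cluster chart (hypothetical helper)."""
    base = control_plane_url.rstrip("/")
    path = "/api/svc/v1/metrics/cluster/{}/charts/{}".format(
        quote(str(cluster_id), safe=""), quote(chart, safe=""))
    query = urlencode({
        "startTs": start_ts,
        "endTs": end_ts,
        "utcOffsetSeconds": utc_offset_seconds,
        "params": params,  # JSON-encoded string, passed through as given
    })
    return f"{base}{path}?{query}"


# A one-hour window that matches the example timestamps shown above.
end = datetime(2024, 12, 26, 9, 18, 31, 814000, tzinfo=timezone.utc)
start = end - timedelta(hours=1)

url = build_chart_url(
    "https://example.invalid",  # assumption: placeholder control plane URL
    "my-cluster-id",            # assumption: placeholder cluster id
    "clusterCpuUsage",          # one of the documented chart options
    to_epoch_ms(start),
    to_epoch_ms(end),
    0,                          # caller's UTC offset in seconds
    "{}",                       # assumption: placeholder for graphs[*].params
)
print(url)
```

The resulting URL can be sent with any HTTP client, together with the `Authorization: Bearer <token>` header described under Authorizations.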

Response

Returns the metric data for the cluster chart.

graph
MetricsGraph · object
required

Chart data payload.

stepSize
number
required

Query step size in seconds.
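To show how the response might be consumed, the sketch below flattens `graph.graphLines[*].values` into simple tuples. It assumes the response has the shape of the example payload at the top of this page. The function name `iter_points` and the sample values (`"cpu"`, `60`, and so on) are illustrative, not part of the API.

```python
import json

# Trimmed sample shaped like the documented response; the values are made up.
sample_body = """
{
  "graph": {
    "name": "clusterCpuUsage",
    "unit": "cores",
    "graphLines": [
      {
        "name": "cpu",
        "values": [
          {"timestamp": "2026-05-20T12:00:00.000Z", "value": 123},
          {"timestamp": "2026-05-20T12:01:00.000Z", "value": 125}
        ]
      }
    ]
  },
  "stepSize": 60
}
"""


def iter_points(payload):
    """Yield (line name, timestamp, value) for every point in the chart."""
    for line in payload["graph"].get("graphLines", []):
        for point in line.get("values", []):
            yield line["name"], point["timestamp"], point["value"]


payload = json.loads(sample_body)
points = list(iter_points(payload))

for name, timestamp, value in points:
    print(f"{name} {timestamp} {value} {payload['graph']['unit']}")
```

Timestamps are kept as strings here. Parsing them into datetimes is left to the caller, since this page does not state the exact timestamp format beyond the single example.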