Understanding Span Attributes

Each span you query from the LLM Gateway records key details about the request and the model that served it. Knowing what each attribute means helps you analyze and debug usage.

Core Span Attributes

| Attribute | Description |
| --- | --- |
| tfy.span_type | Type of span; one of the values listed below this table |
| tfy.input | Complete input data sent to the model, MCP server, guardrail, etc. |
| tfy.output | Complete output response from the model, MCP server, guardrail, etc. |
| tfy.input_short_hand | Abbreviated version of the input for display purposes |
| tfy.error_message | Error message if the request failed |
| tfy.prompt_version_fqn | Fully qualified name (FQN) of the prompt version used (if applicable) |
| tfy.prompt_variables | Variables used in prompt templating |
| tfy.triggered_guardrail_fqns | List of guardrails that were triggered during the request |

Possible values of tfy.span_type:
• "ChatCompletion" - Complete chat request lifecycle
• "Completion" - Text completion requests without chat context
• "MCP" - Model Context Protocol server interactions and tool calls
• "Rerank" - Document reranking operations for search relevance
• "Embedding" - Vector embedding generation operations
• "Model" - Actual LLM model inference processing
• "AgentResponse" - Multi-tool agent orchestration workflows
• "Guardrail" - Safety, compliance, and content validation checks
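
As a rough illustration of how the core attributes can be combined, the sketch below groups error messages by span type. It assumes the spans have already been fetched and are available as plain Python dictionaries keyed by attribute name; that representation, the helper name `errors_by_span_type`, and the sample data are assumptions made for this example, not part of the gateway API.

```python
from collections import defaultdict


def errors_by_span_type(spans):
    """Group tfy.error_message values by tfy.span_type for failed spans."""
    grouped = defaultdict(list)
    for span in spans:
        message = span.get("tfy.error_message")
        # Assumption: successful spans carry no (or an empty) error message.
        if message:
            grouped[span.get("tfy.span_type", "unknown")].append(message)
    return dict(grouped)


# Hypothetical sample spans, not real gateway output.
sample_spans = [
    {"tfy.span_type": "ChatCompletion", "tfy.input_short_hand": "Summarize..."},
    {"tfy.span_type": "MCP", "tfy.error_message": "tool timed out"},
    {"tfy.span_type": "Model", "tfy.error_message": "rate limit exceeded"},
    {"tfy.span_type": "Model", "tfy.error_message": "context length exceeded"},
]

print(errors_by_span_type(sample_spans))
```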

Request Context Attributes

| Attribute | Description |
| --- | --- |
| tfy.request.model_name | Name of the model that was requested |
| tfy.request.created_by_subject | Subject (user/service account) that made the request |
| tfy.request.created_by_subject_teams | Teams associated with the requesting subject |
| tfy.request.metadata | Additional metadata associated with the request (e.g., {'foo': 'bar'}) |
| tfy.request.conversation_id | Unique identifier for the conversation (if part of a chat) |
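
The request context attributes lend themselves to filtering. The following minimal sketch selects spans whose metadata contains a given key/value pair. It assumes spans are plain dictionaries and that `tfy.request.metadata` arrives as a dictionary (it may be serialized differently in practice); `spans_with_metadata` and the sample values are hypothetical.

```python
def spans_with_metadata(spans, key, value):
    """Return the spans whose tfy.request.metadata maps `key` to `value`."""
    matches = []
    for span in spans:
        # Assumption: metadata is a dict; anything else is skipped.
        metadata = span.get("tfy.request.metadata") or {}
        if isinstance(metadata, dict) and metadata.get(key) == value:
            matches.append(span)
    return matches


# Hypothetical sample spans, not real gateway output.
sample_spans = [
    {"tfy.request.model_name": "model-a", "tfy.request.metadata": {"foo": "bar"}},
    {"tfy.request.model_name": "model-b", "tfy.request.metadata": {"foo": "baz"}},
    {"tfy.request.model_name": "model-c"},
]

for span in spans_with_metadata(sample_spans, "foo", "bar"):
    print(span["tfy.request.model_name"])
```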

Model Attributes

| Attribute | Description |
| --- | --- |
| tfy.model.id | Unique identifier of the model |
| tfy.model.name | Display name of the model |
| tfy.model.fqn | Fully qualified name of the model |
| tfy.model.request_url | URL endpoint used for the model request |
| tfy.model.streaming | Whether the request used streaming mode |
| tfy.model.request_type | Type of request (e.g., "ChatCompletion", "Completion", "Embedding", "Rerank", "AgentResponse", "MCPGateway", "CreateModelResponse") |
| tfy.model.metric.cache_read_input_tokens | Number of input tokens served from the cache, billed at a lower cache read rate instead of the standard input rate |
| tfy.model.metric.cache_creation_input_tokens | Number of input tokens written to the cache, billed at a higher cache write rate to cover the cost of storage |
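
To see how much prompt caching is being used per model, the two cache token attributes can be summed by `tfy.model.name`. This is only a sketch: it assumes spans are plain dictionaries and that missing token attributes mean zero; the helper name and sample model names are made up for illustration.

```python
from collections import defaultdict


def cache_tokens_by_model(spans):
    """Sum cache read/creation input tokens per tfy.model.name."""
    totals = defaultdict(lambda: {"cache_read": 0, "cache_creation": 0})
    for span in spans:
        name = span.get("tfy.model.name")
        if name is None:
            continue  # span carries no model attributes
        # Assumption: a missing token attribute is treated as zero.
        totals[name]["cache_read"] += (
            span.get("tfy.model.metric.cache_read_input_tokens") or 0
        )
        totals[name]["cache_creation"] += (
            span.get("tfy.model.metric.cache_creation_input_tokens") or 0
        )
    return dict(totals)


# Hypothetical sample spans, not real gateway output.
sample_spans = [
    {"tfy.model.name": "model-a", "tfy.model.metric.cache_creation_input_tokens": 1200},
    {"tfy.model.name": "model-a", "tfy.model.metric.cache_read_input_tokens": 1200},
    {"tfy.model.name": "model-b"},
    {"tfy.span_type": "Guardrail"},
]

print(cache_tokens_by_model(sample_spans))
```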

Model Performance Metrics

| Attribute | Description |
| --- | --- |
| tfy.model.metric.time_to_first_token_in_ms | Time taken to receive the first token (streaming) |
| tfy.model.metric.latency_in_ms | Total request latency in milliseconds |
| tfy.model.metric.input_tokens | Number of tokens in the model input |
| tfy.model.metric.output_tokens | Number of tokens in the model output |
| tfy.model.metric.cost_in_usd | Cost of the request in USD |
| tfy.model.metric.inter_token_latency_in_ms | Average latency between tokens (streaming) |
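
The performance metrics can be rolled up into a small summary. The sketch below computes mean latency, overall output throughput, and total cost for a batch of spans. It assumes spans are plain dictionaries with numeric metric values; `summarize_performance` and the sample numbers are illustrative only.

```python
def summarize_performance(spans):
    """Aggregate latency, throughput, and cost over spans that report latency."""
    timed = [s for s in spans if s.get("tfy.model.metric.latency_in_ms")]
    if not timed:
        return None  # nothing to summarize

    total_latency_ms = sum(s["tfy.model.metric.latency_in_ms"] for s in timed)
    total_output = sum(s.get("tfy.model.metric.output_tokens") or 0 for s in timed)
    total_cost = sum(s.get("tfy.model.metric.cost_in_usd") or 0.0 for s in timed)

    return {
        "requests": len(timed),
        "mean_latency_ms": total_latency_ms / len(timed),
        # Output tokens divided by total latency converted to seconds.
        "output_tokens_per_second": total_output / (total_latency_ms / 1000),
        "total_cost_usd": total_cost,
    }


# Hypothetical sample spans, not real gateway output.
sample_spans = [
    {
        "tfy.model.metric.latency_in_ms": 1000,
        "tfy.model.metric.output_tokens": 100,
        "tfy.model.metric.cost_in_usd": 0.5,
    },
    {
        "tfy.model.metric.latency_in_ms": 3000,
        "tfy.model.metric.output_tokens": 300,
        "tfy.model.metric.cost_in_usd": 0.25,
    },
    {"tfy.span_type": "Guardrail"},  # no model metrics; ignored
]

print(summarize_performance(sample_spans))
```

Throughput here is measured against total request latency, so it includes time spent before the first token; for streaming requests, `tfy.model.metric.inter_token_latency_in_ms` gives a view that excludes that initial wait.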

Load Balancing Attributes

| Attribute | Description |
| --- | --- |
| applied_loadbalance_rule_ids | IDs of load balancing rules that were applied (e.g., ['gpt-4-dev-load']) |

Budget Control Attributes

| Attribute | Description |
| --- | --- |
| applied_budget_rule_ids | IDs of budget rules that were applied to this request (e.g., ['virtualaccount1-monthly-budget']) |

Rate Limiting Attributes

| Attribute | Description |
| --- | --- |
| applied_ratelimit_rule_ids | IDs of all rate limiting rules that were applied (e.g., ['virtualaccount1-daily-ratelimit']) |
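
The load balancing, budget, and rate limiting attributes all hold lists of rule IDs, so one helper can count how often each rule was applied. This sketch assumes spans are plain dictionaries and that each attribute, when present, is a list of strings; `count_applied_rules` is a hypothetical name.

```python
from collections import Counter

RULE_ATTRIBUTES = (
    "applied_loadbalance_rule_ids",
    "applied_budget_rule_ids",
    "applied_ratelimit_rule_ids",
)


def count_applied_rules(spans):
    """Count rule ID occurrences per rule attribute across all spans."""
    counts = {attr: Counter() for attr in RULE_ATTRIBUTES}
    for span in spans:
        for attr in RULE_ATTRIBUTES:
            # Assumption: a missing attribute means no rule of that kind applied.
            counts[attr].update(span.get(attr) or [])
    return counts


# Sample spans reuse the example rule IDs from the tables above.
sample_spans = [
    {
        "applied_loadbalance_rule_ids": ["gpt-4-dev-load"],
        "applied_ratelimit_rule_ids": ["virtualaccount1-daily-ratelimit"],
    },
    {
        "applied_loadbalance_rule_ids": ["gpt-4-dev-load"],
        "applied_budget_rule_ids": ["virtualaccount1-monthly-budget"],
    },
    {},
]

for attr, counter in count_applied_rules(sample_spans).items():
    print(attr, dict(counter))
```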

MCP (Model Context Protocol) Server Attributes

| Attribute | Description |
| --- | --- |
| tfy.mcp_server.id | Unique identifier of the MCP server |
| tfy.mcp_server.name | Display name of the MCP server |
| tfy.mcp_server.url | URL endpoint of the MCP server |
| tfy.mcp_server.fqn | Fully qualified name of the MCP server |
| tfy.mcp_server.server_name | Internal name of the MCP server |
| tfy.mcp_server.method | MCP method that was called |
| tfy.mcp_server.primitive_name | Name of the MCP primitive used |
| tfy.mcp_server.error_code | Error code if the MCP call failed |
| tfy.mcp_server.is_tool_call_execution_error | Whether the error was from tool call execution |

MCP Server Metrics

| Attribute | Description |
| --- | --- |
| tfy.mcp_server.metric.latency_in_ms | Latency of the MCP server call in milliseconds |
| tfy.mcp_server.metric.number_of_tools | Number of tools available in the MCP server |
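
When debugging MCP spans, it can help to separate failures raised while executing a tool from other failed MCP calls. The sketch below does this with `tfy.mcp_server.is_tool_call_execution_error` and `tfy.mcp_server.error_code`. The dictionary span representation, the three category labels, and the sample error code are assumptions chosen for illustration.

```python
from collections import Counter


def classify_mcp_span(span):
    """Label an MCP span as ok, a tool execution error, or another MCP error."""
    if span.get("tfy.mcp_server.is_tool_call_execution_error"):
        return "tool_execution_error"
    # Assumption: error_code is only set when the MCP call failed.
    if span.get("tfy.mcp_server.error_code") is not None:
        return "mcp_error"
    return "ok"


# Hypothetical sample spans, not real gateway output.
sample_spans = [
    {"tfy.mcp_server.name": "search", "tfy.mcp_server.metric.latency_in_ms": 120},
    {
        "tfy.mcp_server.name": "search",
        "tfy.mcp_server.error_code": 1,
        "tfy.mcp_server.is_tool_call_execution_error": True,
    },
    {"tfy.mcp_server.name": "files", "tfy.mcp_server.error_code": 2},
]

print(Counter(classify_mcp_span(s) for s in sample_spans))
```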

Guardrail Attributes

| Attribute | Description |
| --- | --- |
| tfy.guardrail.id | Unique identifier of the guardrail |
| tfy.guardrail.name | Display name of the guardrail |
| tfy.guardrail.fqn | Fully qualified name of the guardrail |
| tfy.guardrail.result | Result of the guardrail check (e.g., 'pass', 'mutate', 'flag') |
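
A quick way to see how each guardrail behaves is to tally `tfy.guardrail.result` per guardrail. The following sketch assumes spans are plain dictionaries; the helper name and the guardrail FQNs in the sample data are invented for this example.

```python
from collections import Counter


def guardrail_result_counts(spans):
    """Count (guardrail FQN, result) pairs across guardrail spans."""
    counts = Counter()
    for span in spans:
        result = span.get("tfy.guardrail.result")
        if result is None:
            continue  # not a guardrail span, or no result recorded
        counts[(span.get("tfy.guardrail.fqn", "unknown"), result)] += 1
    return counts


# Hypothetical sample spans; the FQNs are made up.
sample_spans = [
    {"tfy.guardrail.fqn": "example/pii-check", "tfy.guardrail.result": "pass"},
    {"tfy.guardrail.fqn": "example/pii-check", "tfy.guardrail.result": "mutate"},
    {"tfy.guardrail.fqn": "example/pii-check", "tfy.guardrail.result": "pass"},
    {"tfy.guardrail.fqn": "example/toxicity", "tfy.guardrail.result": "flag"},
    {"tfy.span_type": "Model"},
]

for (fqn, result), count in sorted(guardrail_result_counts(sample_spans).items()):
    print(fqn, result, count)
```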

Guardrail Applied Entity Attributes

| Attribute | Description |
| --- | --- |
| tfy.guardrail.applied_on_entity.type | Type of entity the guardrail was applied to |
| tfy.guardrail.applied_on_entity.id | ID of the entity |
| tfy.guardrail.applied_on_entity.name | Name of the entity |
| tfy.guardrail.applied_on_entity.fqn | FQN of the entity |
| tfy.guardrail.applied_on_entity.scope | Scope of the entity |

Guardrail Metrics

| Attribute | Description |
| --- | --- |
| tfy.guardrail.metric.latency_in_ms | Time taken for the guardrail check in milliseconds |

HTTP Response Attributes

| Attribute | Description |
| --- | --- |
| http.response.status_code | HTTP status code of the response |
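
For a coarse health check, spans can be bucketed by HTTP status class (2xx, 4xx, 5xx, and so on). This sketch assumes spans are plain dictionaries and that `http.response.status_code` is an integer or a numeric string; `status_class_counts` is a hypothetical helper.

```python
from collections import Counter


def status_class_counts(spans):
    """Count spans per HTTP status class, e.g. '2xx' or '5xx'."""
    counts = Counter()
    for span in spans:
        code = span.get("http.response.status_code")
        if code is None:
            continue  # span has no HTTP response attribute
        counts[f"{int(code) // 100}xx"] += 1
    return counts


# Hypothetical sample spans, not real gateway output.
sample_spans = [
    {"http.response.status_code": 200},
    {"http.response.status_code": 200},
    {"http.response.status_code": 429},
    {"http.response.status_code": "503"},
    {"tfy.span_type": "Guardrail"},
]

print(dict(status_class_counts(sample_spans)))
```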
