Why Code Mode?
When an agent uses tools the standard way, every tool call is a separate round-trip with the model: the agent calls one MCP tool, the full JSON response enters the conversation, the model reasons over that JSON, then it calls the next tool, copying IDs and computing counts in prose along the way. For tasks that aggregate tool output or chain several calls together, this approach is slow, fills the context window with intermediate JSON the user doesn’t care about, and is prone to small errors like miscounts and mistyped IDs.

Take a common task: “How many open PRs does each contributor have on this repo?” The user only cares about a small summary table, but the underlying GitHub tool returns a full record per PR: title, labels, reviewers, timestamps, and more. Without Code Mode, every one of those records lands in context and the model has to count from prose. With Code Mode, the agent calls the same tool inside a script, runs Counter on the author logins, and prints only the table.
What is Code Mode?
Code Mode collapses tool round-trips into a single script. Using the Sandbox, the agent writes Python that calls MCP tools through an in-sandbox MCP client, processes the responses in code, and prints only what the user actually needs. The harness also exposes helpers for the agent to discover available tools and inspect their output schemas before writing the script. A Code Mode script reads naturally: the agent picks a tool, calls it, and prints only the fields it cares about.

When does the Agent use Code Mode?
The harness picks Code Mode when the task fits one of two patterns where running the work in code is materially better than reasoning over raw JSON in chat:
- Aggregate or format tool output: counts, group-bys, sums, filters, or formatted tables over the response of a single tool call.
- Chain tool calls: one tool’s output feeds another tool’s input, and the intermediate JSON is plumbing the user doesn’t need to see.
Aggregate or format tool output
When a task asks for a count, group-by, sum, filter, or formatted table over the response of a tool call, doing the math in code is far more reliable than asking the model to do it from prose, and it keeps raw records out of context entirely.

Example: The user wants open PRs on a repo grouped by contributor, with a count per author. The user only cares about a small summary table, but the tool response contains a full record per PR with titles, labels, and metadata.

| | With Code Mode | Without Code Mode |
|---|---|---|
| Approach | Call the tool inside a script, group with Counter on author login, print only the summary table. | Full tool response enters the conversation. The model reads every PR and produces counts per author in natural language. |
| Cost | Only the summary table reaches the model — minimal tokens. | Dozens of PR records (titles, labels, metadata) sit in context even though the user only asked for counts — context bloat. |
| Reliability | Counts are computed by code, so they are exact. | The model can miscount PRs, merge two usernames, or invent a contributor that wasn’t in the response — group-by and count tasks are especially hallucination-prone when done from prose. |
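
The With Code Mode approach in the table above can be sketched roughly as follows. This page does not show the in-sandbox MCP client API, so `call_tool`, the tool name `list_pull_requests`, its arguments, and the `user.login` record shape are all assumptions made for illustration. The stub returns sample records so the sketch runs on its own.

```python
from collections import Counter


def call_tool(name, arguments):
    """Hypothetical stand-in for the in-sandbox MCP client (not the real API).

    Returns sample PR records; a real tool response would carry many more
    fields per PR (labels, reviewers, timestamps, and so on).
    """
    authors = ["alice-dev"] * 3 + ["bob-eng"] * 2 + ["carol-docs", "deb-docs"]
    return [
        {"number": i, "title": f"PR {i}", "state": "open", "user": {"login": a}}
        for i, a in enumerate(authors, start=1)
    ]


def open_prs_per_author(owner, repo):
    # One tool call; the full response stays inside the script.
    prs = call_tool(
        "list_pull_requests", {"owner": owner, "repo": repo, "state": "open"}
    )
    # Counting happens in code rather than in the model's prose.
    return Counter(pr["user"]["login"] for pr in prs)


def format_table(counts):
    # Sort by count (descending), then by login, so the output is deterministic.
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    lines = ["| Author | Open PRs |", "|---|---|"]
    lines += [f"| {author} | {n} |" for author, n in rows]
    return "\n".join(lines)


counts = open_prs_per_author("truefoundry", "docs-mintlify")
# Only this small table is printed, so only it would reach the conversation.
print(format_table(counts))
```

With the stub data, the script prints a four-row table and nothing else; the per-PR records never leave the script.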
User
How many open PRs does each contributor have on truefoundry/docs-mintlify?

Assistant
Let me check the output schema for the GitHub list-pull-requests tool, then count PRs per author in one script.

The Agent calls get_tool_output_schema so it knows the shape of the tool response before writing Code Mode scripts. With a defined outputSchema, the model can read and transform fields in code safely instead of guessing keys or types from raw JSON.

Tool call: get_tool_output_schema

Assistant
Tool call: sandbox

Assistant
On truefoundry/docs-mintlify there are 7 open PRs across 4 contributors: alice-dev (3), bob-eng (2), carol-docs (1), and deb-docs (1).

Chain tool calls
When one tool’s output feeds the input of another (for example, resolving an entity’s ID before fetching its details), Code Mode lets the agent do both calls in a single script. The intermediate response never enters the conversation, so there is no chance of mistyping IDs and no extra model turn between tool calls.

Example: The user wants to know which metrics are available for app my-app in the tfy-test-cluster:test-ws workspace. To answer this, the agent has to first resolve the app to its internal id and workspaceId, then use those values to fetch the chart list.

| | With Code Mode | Without Code Mode |
|---|---|---|
| Approach | Run list_applications and list_app_metric_charts in one script. The script reads id and workspaceId from the first response and passes them straight into the second call, then prints only the chart table. | Call the first tool, wait for a model turn, copy IDs out of the JSON into the second tool call, then wait for another turn. |
| Cost | Intermediate response stays in the sandbox — only the final chart table reaches context. | Full applications response (pagination, every app field) sits in the conversation before the second call runs — context bloat. |
| Latency | One model turn covers both tool calls. | Each tool hop needs another model call in between — extra latency. |
| Reliability | IDs flow inside the script — no copy-paste. | The model can typo an applicationId or workspaceId, pick the wrong app from a long list, or drop a required field so the second call fails — hallucination risk. |
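
A minimal sketch of the chained script described in the table above, under stated assumptions: `call_tool` is a hypothetical stand-in for the in-sandbox MCP client, and the response keys (`data`, `charts`, `name`), the `workspaceFqn` argument, and the sample IDs are invented for illustration. Only the tool names and the `id`, `workspaceId`, and `applicationId` fields come from this page.

```python
def call_tool(name, arguments):
    """Hypothetical stand-in for the in-sandbox MCP client (not the real API)."""
    if name == "list_applications":
        # Stub response; a real one would include pagination and more fields.
        return {
            "data": [
                {"id": "app-123", "name": "my-app", "workspaceId": "ws-456"},
                {"id": "app-789", "name": "other-app", "workspaceId": "ws-456"},
            ]
        }
    if name == "list_app_metric_charts":
        # The stub only answers for the IDs returned by the first call.
        if arguments.get("applicationId") != "app-123":
            raise LookupError("unknown application")
        if arguments.get("workspaceId") != "ws-456":
            raise LookupError("unknown workspace")
        return {"charts": [{"name": "cpu_usage"}, {"name": "memory_usage"}]}
    raise ValueError(f"unknown tool: {name}")


def list_metric_charts(app_name, workspace_fqn):
    # Step 1: resolve the app to its internal id and workspaceId.
    apps = call_tool("list_applications", {"workspaceFqn": workspace_fqn})
    matches = [a for a in apps["data"] if a["name"] == app_name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one app named {app_name!r}, found {len(matches)}"
        )
    app = matches[0]

    # Step 2: pass the IDs straight into the second call, with no copy-paste.
    charts = call_tool(
        "list_app_metric_charts",
        {"applicationId": app["id"], "workspaceId": app["workspaceId"]},
    )
    return [c["name"] for c in charts["charts"]]


chart_names = list_metric_charts("my-app", "tfy-test-cluster:test-ws")
# Only the chart names are printed; the applications response stays in the script.
print("\n".join(chart_names))
```

Failing loudly when the app name matches zero or several applications is a design choice in this sketch: it avoids silently picking the wrong app from a long list.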
User
What metrics are available for app my-app in the tfy-test-cluster:test-ws workspace?

Assistant
I need schemas for list_applications and list_app_metric_charts, then I will resolve the app and list its charts in one script.

The Agent calls get_tool_output_schema so it knows the shape of the tool response before writing Code Mode scripts. With a defined outputSchema, the model can read and transform fields in code safely instead of guessing keys or types from raw JSON.

Tool call: get_tool_output_schema
Tool call: get_tool_output_schema

Assistant
Tool call: sandbox

Assistant
my-app in tfy-test-cluster:test-ws has 12 metric charts (CPU, memory, network, request volume, and others). Query step is 1800s.
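
Both transcripts start with get_tool_output_schema. As a rough illustration of why a declared outputSchema helps, the sketch below walks a JSON Schema fragment and lists the field paths a script can rely on. The format that get_tool_output_schema actually returns is not shown on this page, so the schema here (including the `pull_requests` key) is a hypothetical example, and `field_paths` is an illustrative helper rather than part of the harness.

```python
def field_paths(schema, prefix=""):
    """Yield (dotted_path, type) for each leaf field in a JSON Schema fragment."""
    kind = schema.get("type")
    if kind == "object" and "properties" in schema:
        for key, sub in schema["properties"].items():
            yield from field_paths(sub, f"{prefix}.{key}" if prefix else key)
    elif kind == "array" and "items" in schema:
        # "[]" marks "each element of this array".
        yield from field_paths(schema["items"], f"{prefix}[]")
    else:
        yield prefix, kind


# Hypothetical output schema for a list-pull-requests style tool.
EXAMPLE_OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "pull_requests": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "number": {"type": "integer"},
                    "title": {"type": "string"},
                    "user": {
                        "type": "object",
                        "properties": {"login": {"type": "string"}},
                    },
                },
            },
        }
    },
}

paths = dict(field_paths(EXAMPLE_OUTPUT_SCHEMA))
for path, kind in paths.items():
    print(f"{path}: {kind}")
```

Knowing that a path such as `pull_requests[].user.login` exists and is a string lets the script index it directly instead of guessing keys from a raw response.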