Handling Large Tool Response - TrueFoundry Docs

Why

An MCP server tool call may return a large amount of data, which can quickly fill the available context window.
The agent may not be able to control how much data an MCP server returns in a tool response by modifying the tool arguments alone. For example, fetching a single pull request can still return a very large description field.

How it works

Configuration

TrueFoundry Agent Harness applies offloading automatically; no setup is required. Two complementary thresholds govern when tool responses are written to the sandbox instead of staying in context.

Single tool response that's too large

If an individual tool response exceeds the per-call threshold, the full result is written to a file in the sandbox and replaced in context with a short preview plus the file path. The per-call threshold depends on the agent context:

Agent context	Per-call threshold
Root agent	6,000 tokens
Subagent	12,000 tokens
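
The per-call rule can be sketched roughly as follows. This is a minimal illustration, not the harness's actual implementation: the 4-characters-per-token estimate, the function names (`estimate_tokens`, `make_preview`, `maybe_offload`), the file naming, and the wording of the returned message are all assumptions. Only the two threshold values come from the table above.

```python
import os
import tempfile

# Thresholds from the table above; everything else in this sketch is assumed.
PER_CALL_THRESHOLD = {"root": 6_000, "subagent": 12_000}


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly 4 characters per token."""
    return (len(text) + 3) // 4


def make_preview(text: str, edge: int = 100) -> str:
    """Keep the first and last `edge` characters and elide the middle."""
    if len(text) <= 2 * edge:
        return text
    return f"{text[:edge]} ... {text[-edge:]}"


def maybe_offload(response: str, agent_kind: str, out_dir: str) -> str:
    """Return what stays in context: the response itself, or a preview plus path."""
    if estimate_tokens(response) <= PER_CALL_THRESHOLD[agent_kind]:
        return response  # small enough to stay inline
    # Hypothetical file naming; the real harness chooses its own path.
    fd, path = tempfile.mkstemp(prefix="tool-", suffix="-output", dir=out_dir)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(response)
    return f"Full result saved to {path}\nPreview: {make_preview(response)}"
```

Under these assumptions, a response of about 7,500 estimated tokens would be offloaded for a root agent but stay inline for a subagent.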

Subagents have a higher threshold because they run in isolated context, so a single larger payload is less likely to crowd out their working memory.

Parallel tool calls returning together

When the agent fires several tool calls in parallel and their responses come back together, no single response may be over the per-call threshold but their combined size can still flood the context window. In this case the harness offloads responses one at a time, starting with the largest, until the combined tool-call content drops below 6,000 tokens. The smallest responses stay inline and the largest end up on disk.

In every case the offloaded content remains accessible: the agent can read or grep the saved file from the sandbox whenever it needs the original data.
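
The largest-first rule for parallel calls can be illustrated with a short sketch. It only decides which responses to offload; the token estimate, the function name `pick_offloads`, and the call-id keys are hypothetical, and the sketch ignores the small cost of the preview that replaces each offloaded response.

```python
COMBINED_LIMIT = 6_000  # combined budget stated in the text above


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly 4 characters per token."""
    return (len(text) + 3) // 4


def pick_offloads(responses: dict[str, str], limit: int = COMBINED_LIMIT) -> list[str]:
    """Return call ids to offload, largest first, until the rest drop below `limit`."""
    sizes = {call_id: estimate_tokens(body) for call_id, body in responses.items()}
    remaining = sum(sizes.values())  # tokens still held inline
    offloaded: list[str] = []
    for call_id in sorted(sizes, key=sizes.get, reverse=True):
        if remaining < limit:
            break  # what is left inline now fits the budget
        offloaded.append(call_id)
        remaining -= sizes[call_id]
    return offloaded
```

For example, with three responses estimated at 4,000, 3,000, and 1,000 tokens, only the 4,000-token response would be offloaded under these assumptions, because the remaining 4,000 tokens are already below the limit.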

These thresholds are fixed defaults today. Making them configurable per agent, per tool, or per subagent is on the roadmap.

Example

User

Can you please list the mcp servers I have access to

Assistant

Let me get more information about the user.

get_me

Tool Call

{}

Tool Response

{
  "id": "cm1fa35mt009e0trd11v5831p",
  "subjectSlug": "john@example.com",
  ...
}

Assistant

Let me list the MCP servers you have access to.

Tool response blocked — list_mcp_servers returned too much data. The full result was not sent to the model; it was saved to disk and only a preview was returned.

list_mcp_servers

Tool Call

{
  "createdBySubjectSlug": "john@example.com"
}

Tool Response (blocked) — Full result saved to /tmp/tool-id-xyz-output

Use sandbox to read parts of the file or extract data.
Use sandbox-schema-infer to understand the schema

Preview (first and last 100 chars):
{"data":[{"id":"v3sgnimki67gd1do5vkcv9wz","name":"test-realtime-global","fqn":"truefoundry:google-ve ... 51.458Z","updatedAt":"2026-05-19T13:23:46.269Z","createdBy":"john@truefoundry.com"}],"total":180}

Assistant

Let me get a preview of the data and print the schema.

sandbox

Tool Call

head -c 200 /tmp/tool-id-xyz-output && echo "==" && genson /tmp/tool-id-xyz-output

Tool Response

{"$schema":"http://json-schema.org/schema#","type":"object","properties":{"data":{"type":"array","items":{"type":"object","properties":{"id":{"type":"string"},"name":{"type":"string"},"createdBySubject":{"type":"object","properties":{"subjectSlug":{"type":"string"}},"required":["subjectSlug"]}},"required":["id","name","createdBySubject"]}},"total":{"type":"integer"}},"required":["data","total"]}

Assistant

Now I understand the schema of the MCP servers output file. Let me write code to parse the MCP server names.

sandbox

Tool Call

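cat << 'EOF' > /tmp/parse_mcp_servers.py
import json

# Load the offloaded tool response that was saved to disk.
with open('/tmp/tool-id-xyz-output', 'r') as f:
    data = json.load(f)

# Print the name of every MCP server in the "data" array.
for item in data['data']:
    print(item['name'])
EOF
python3 /tmp/parse_mcp_servers.py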

Tool Response

test-realtime-global
slack-mcp
google-calendar-mcp
...

Assistant

Hey John, here are the MCP servers you have access to:

test-realtime-global
slack-mcp
google-calendar-mcp
…
