This guide explains how to integrate Guardrails AI Hub validators with TrueFoundry AI Gateway as input and output guardrails. The integration runs Guardrails Hub validators inside a small wrapper service that you deploy on TrueFoundry, and the gateway invokes the wrapper through its Custom Guardrail interface. All v1 validators run locally in the wrapper pod, so there is no LLM round-trip per request and steady-state latency is under 100 ms.
Source repository: truefoundry/integrations-custom-guardrails/integrations/guardrails-ai/. It contains the Dockerfile, deploy script, validator configuration, and tests referenced below.

What is Guardrails AI?

Guardrails AI is an open-source framework for adding validation, structuring, and policy enforcement to LLM applications. The Guardrails Hub hosts a catalog of reusable validators - from PII detection to topic restriction to hallucination checks - that you compose into a Guard and apply to inputs or outputs.

Key Features of Guardrails AI on TrueFoundry

  1. PII detection (email, phone, SSN, credit card, IBAN, IP, passport, driver's license) on inbound user messages and outbound assistant responses via DetectPII.
  2. Secrets detection (AWS keys, OpenAI tokens, GitHub tokens, JWT, private keys) via SecretsPresent.
  3. Toxic-language detection via the Unitary classifier (ToxicLanguage).
  4. Profanity filter on assistant output via ProfanityFree.
  5. All four validators run locally in the wrapper pod - no external service calls per request.
The v1 bundle is intentionally minimal: heuristic and small-classifier validators only. Heavier validators (hallucination detection, provenance checks) are available via the Hub but require LLM calls and re-introduce per-request latency. See Customizing the Validator Bundle below.

Architecture

The gateway dispatches the input rail call and the model call in parallel for low time-to-first-token. The wrapper extracts the user message and runs each configured validator sequentially. The first validator to raise a ValidationError determines the verdict, and later validators are not run. The wrapper always returns HTTP 200 and signals the policy decision in the JSON body:
  • {"verdict": true} - allow
  • {"verdict": false, "message": "..."} - block
On a block, the gateway cancels the in-flight model call. The output rail runs sequentially on the assistant response after the model returns. See Custom guardrail response contract for the underlying protocol.
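The sequential, first-failure-wins flow described above can be sketched roughly as follows. This is an illustration only, not the wrapper's actual code: `run_rail`, the stand-in `ValidationError` class, and the two toy validators are hypothetical names, and the real wrapper runs Guardrails Hub validators instead.

```python
from typing import Callable, Iterable


class ValidationError(Exception):
    """Stand-in for the error a validator raises (hypothetical for this sketch)."""


def run_rail(text: str, validators: Iterable[Callable[[str], None]]) -> dict:
    """Run validators in order; the first one that raises decides the verdict."""
    for validate in validators:
        try:
            validate(text)
        except ValidationError as exc:
            # A policy decision is still an HTTP 200 body, with verdict false.
            return {"verdict": False, "message": str(exc)}
    return {"verdict": True}


def no_at_sign(text: str) -> None:
    # Toy check standing in for a real PII validator.
    if "@" in text:
        raise ValidationError("toy-pii: '@' found")


def no_shouting(text: str) -> None:
    # Toy check standing in for a real toxicity validator.
    if text.isupper():
        raise ValidationError("toy-tone: all caps")
```

Calling `run_rail("A@B", [no_at_sign, no_shouting])` reports only the first failure, which mirrors the statement that the first validator to fail determines the verdict.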

Prerequisites

Before integrating Guardrails AI with TrueFoundry, ensure you have:
  • A TrueFoundry workspace you can deploy services into.
  • A Guardrails Hub API token from hub.guardrailsai.com/keys. The free tier is sufficient.
  • The model FQN you want to protect (e.g. openai-main/gpt-4o-mini).
  • A cluster with a configured base host (visible at Integrations → Clusters → <cluster>).

Integration Steps

1. Clone the wrapper repository

Clone the integration repo and switch to the Guardrails AI folder:
git clone https://github.com/truefoundry/integrations-custom-guardrails
cd integrations-custom-guardrails/integrations/guardrails-ai
2. Configure environment variables

Copy .env.example to .env and fill in the values. The last two entries reference TrueFoundry secrets that you create in the next step; copy their FQNs from Platform → Secrets once the secrets exist.
.env
# Runtime + build-time tokens
GUARDRAILS_TOKEN=<your Hub API token>
WRAPPER_API_KEY=<generate with `python -c "import secrets; print(secrets.token_urlsafe(32))"`>

# Deploy-time only
TFY_WORKSPACE_FQN=<cluster>:<workspace>
TFY_PUBLIC_HOST=ml.<cluster>.truefoundry.cloud
TFY_PUBLIC_PATH=/guardrails-ai-tfy

WRAPPER_API_KEY_SECRET_FQN=tfy-secret://<workspace>/guardrails-ai-tfy/wrapper-api-key
GUARDRAILS_TOKEN_SECRET_FQN=tfy-secret://<workspace>/guardrails-ai-tfy/guardrails-token
Generate WRAPPER_API_KEY with python -c "import secrets; print(secrets.token_urlsafe(32))". The gateway will send this value as Authorization: Bearer … when calling the wrapper.
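As a rough sketch of how a service could generate such a key and check the bearer header, assuming a constant-time comparison is wanted: `new_wrapper_api_key` and `is_authorized` are hypothetical helper names, and the wrapper's real check may differ.

```python
import hmac
import secrets
from typing import Optional


def new_wrapper_api_key() -> str:
    # Same call as the one-liner above: 32 random bytes, URL-safe base64 text.
    return secrets.token_urlsafe(32)


def is_authorized(authorization_header: Optional[str], expected_key: str) -> bool:
    """Check an `Authorization: Bearer <key>` header value in constant time."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), expected_key.encode())
```

Note that this sketch does not strip whitespace from the token, so a trailing space in the configured value makes the comparison fail.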
3. Create two TrueFoundry secrets

Navigate to Platform → Secrets and create a Secret Group named guardrails-ai-tfy with two secrets:
| Secret Name | Value |
| --- | --- |
| guardrails-token | Your Hub API token. Consumed at Docker build time to install validators. |
| wrapper-api-key | The same random string you put in .env as WRAPPER_API_KEY. |
Copy each secret’s FQN and confirm the entries in .env (WRAPPER_API_KEY_SECRET_FQN, GUARDRAILS_TOKEN_SECRET_FQN) match.
4. Deploy the wrapper service

Install the TrueFoundry CLI, log in, and deploy:
pip install -U truefoundry
tfy login
python deploy.py --wait
The first build is slow (~5 min) because the Dockerfile pulls HuggingFace classifier weights for ToxicLanguage at build time. Subsequent builds use TrueFoundry’s image layer cache and are much faster. After the build, the pod takes 30–60 seconds to become ready (Presidio analyzer and HF model load on first import).
Verify the service is healthy:
curl -s https://ml.<cluster>.truefoundry.cloud/guardrails-ai-tfy/health
# {"status":"ok"}
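Because the pod needs 30 to 60 seconds to become ready, a scripted deploy may want to poll the health endpoint rather than call it once. A minimal sketch, assuming the `{"status":"ok"}` body shown above; `fetch_json` and `wait_until_healthy` are hypothetical helpers, and the attempt count and delay are arbitrary defaults.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(
    url: str,
    fetch: Callable[[str], dict] = fetch_json,
    attempts: int = 12,
    delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health endpoint until it reports {"status": "ok"}."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            # Connection refused, HTTP error, or non-JSON body: not ready yet.
            pass
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a live service.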
5. Register the Custom Guardrail Configs in TrueFoundry

Navigate to AI Gateway → Guardrails → + Add New Guardrails Group.
  1. Group name: guardrails-ai
  2. Description (optional): Guardrails AI Hub: PII, secrets, toxicity, profanity
  3. Click + Add Guardrail Config → Custom Guardrail Config seven times - one per guardrail. Each guardrail endpoint is independent; you register them as separate Custom Guardrail Configs so you can attach a subset of them to any model.
For each guardrail, use the same template:
| Field | Value |
| --- | --- |
| Name | guardrails-ai-<validator>-<direction> (e.g. guardrails-ai-detect-pii-input) |
| Operation | Validate |
| URL | https://ml.<cluster>.truefoundry.cloud/guardrails-ai-tfy/<validator>-<direction> |
| Auth Data | Custom Bearer Auth, token = the wrapper-api-key secret value |
| Headers | (empty) |
| Config | {} |
| Enforcing Strategy | Enforce But Ignore On Error (recommended) |
The seven guardrails to register:
| Validator | Direction | Name | URL suffix |
| --- | --- | --- | --- |
| DetectPII | Input Guardrail | guardrails-ai-detect-pii-input | /detect-pii-input |
| DetectPII | Output Guardrail | guardrails-ai-detect-pii-output | /detect-pii-output |
| SecretsPresent | Input Guardrail | guardrails-ai-secrets-present-input | /secrets-present-input |
| SecretsPresent | Output Guardrail | guardrails-ai-secrets-present-output | /secrets-present-output |
| ToxicLanguage | Input Guardrail | guardrails-ai-toxic-language-input | /toxic-language-input |
| ToxicLanguage | Output Guardrail | guardrails-ai-toxic-language-output | /toxic-language-output |
| ProfanityFree | Output Guardrail | guardrails-ai-profanity-free-output | /profanity-free-output |
Save the group.
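The seven names and URLs above follow one pattern, so they can be generated rather than typed by hand before pasting into the form. A small sketch; `RAILS` and `guardrail_configs` are hypothetical names, and the base URL is whatever host and path you deployed the wrapper under.

```python
from typing import Dict, List, Tuple

# Validator slug -> directions it is exposed for in the v1 bundle.
RAILS: Dict[str, Tuple[str, ...]] = {
    "detect-pii": ("input", "output"),
    "secrets-present": ("input", "output"),
    "toxic-language": ("input", "output"),
    "profanity-free": ("output",),
}


def guardrail_configs(base_url: str, group: str = "guardrails-ai") -> List[dict]:
    """Build the name, URL, and selector for each rail endpoint."""
    base = base_url.rstrip("/")
    configs = []
    for slug, directions in RAILS.items():
        for direction in directions:
            name = f"{group}-{slug}-{direction}"
            configs.append(
                {
                    "name": name,
                    "url": f"{base}/{slug}-{direction}",
                    # Selector format used in the X-TFY-GUARDRAILS header.
                    "selector": f"{group}/{name}",
                }
            )
    return configs
```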
The wrapper signals guardrail decisions via {"verdict": true | false} on HTTP 200; real failures (validator load error, wrapper crash) come back as HTTP 5xx. With Enforce But Ignore On Error, transient outages pass through while real policy decisions still block. Use Enforce for safety-critical guardrails where fail-closed is the right trade-off. See Custom guardrail response contract and Enforcing Strategy.
Figure: TrueFoundry Custom Guardrail configuration form populated for the Guardrails AI DetectPII input guardrail, with Custom Bearer Auth, Validate operation, Enforce strategy, Request target, and the wrapper detect-pii-input URL.
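The difference between the two enforcing strategies can be sketched as a small decision function. This is one reading of the prose above, not the gateway's implementation: the function name and the strategy identifiers `"enforce"` and `"enforce_but_ignore_on_error"` are hypothetical.

```python
from typing import Optional


def gateway_decision(strategy: str, http_status: int, body: Optional[dict]) -> str:
    """Return "allow" or "block" for one guardrail call (illustrative only)."""
    if strategy not in ("enforce", "enforce_but_ignore_on_error"):
        raise ValueError(f"unknown strategy: {strategy}")
    if http_status == 200 and isinstance(body, dict) and "verdict" in body:
        # A real policy decision: both strategies honor it.
        return "allow" if body["verdict"] else "block"
    # Anything else (5xx, malformed body) is treated as a guardrail error.
    if strategy == "enforce":
        return "block"  # fail closed
    return "allow"  # fail open on errors only
```

Under this reading, the two strategies differ only on the error path, which is why Enforce But Ignore On Error still blocks on `{"verdict": false}`.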
6. Apply the guardrail to traffic

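There are two ways to route requests through the rails. Pick based on whether you want every call to a model protected, or per-call opt-in.
Model-level (every call protected): navigate to AI Gateway → Models → <model> → Guardrails tab → attach the guardrails-ai group → Save. Every caller of this model now passes through the rails.
Per-call opt-in: send the X-TFY-GUARDRAILS request header naming the input and output guardrails to apply, as the test calls in the next step do.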
7. Test end-to-end

Issue two test calls through the gateway - one that should succeed and one that should be blocked:
GW=https://gateway.truefoundry.ai
TFY_KEY=<your TFY API key>
MODEL=openai-main/gpt-4o-mini

# Should succeed with a normal completion
curl -s "$GW/chat/completions" \
  -H "Authorization: Bearer $TFY_KEY" -H "Content-Type: application/json" \
  -H 'X-TFY-GUARDRAILS: {"llm_input_guardrails":["guardrails-ai/guardrails-ai-detect-pii-input"],"llm_output_guardrails":["guardrails-ai/guardrails-ai-detect-pii-output"]}' \
  -d "{\"model\":\"$MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"What is the capital of France?\"}]}"

# Should be blocked: guardrail_checks_failed (PII detected)
curl -s "$GW/chat/completions" \
  -H "Authorization: Bearer $TFY_KEY" -H "Content-Type: application/json" \
  -H 'X-TFY-GUARDRAILS: {"llm_input_guardrails":["guardrails-ai/guardrails-ai-detect-pii-input"]}' \
  -d "{\"model\":\"$MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"My email is jane.doe@example.com and my SSN is 123-45-6789\"}]}"
A successful block returns:
{
  "status": "failure",
  "message": "Input Guardrail checks failed for integrations: [guardrails-ai/guardrails-ai-detect-pii-input] ...",
  "error": { "type": "guardrail_checks_failed", "code": "400" },
  "guardrail_checks": {
    "input_guardrails": [{
      "guardrail_integration": "guardrails-ai/guardrails-ai-detect-pii-input",
      "result": "failed",
      "data": {
        "verdict": false,
        "explanation": "DetectPII (input): Validation failed for field with errors: ...",
        "guardrailUrl": "https://..."
      }
    }]
  }
}
The blocking validator’s message is preserved in guardrail_checks.input_guardrails[0].data.explanation.
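A client that wants to surface the block reason could pull it out of the failure body along these lines. A minimal sketch based on the response shape shown above; `blocked_explanations` is a hypothetical helper, and the `output_guardrails` key is an assumption made by symmetry with `input_guardrails`.

```python
import json
from typing import List


def blocked_explanations(response_body: str) -> List[str]:
    """Collect "<integration>: <explanation>" for each failed guardrail check."""
    data = json.loads(response_body)
    if data.get("error", {}).get("type") != "guardrail_checks_failed":
        return []
    checks = data.get("guardrail_checks", {})
    found = []
    # "output_guardrails" is assumed here; only "input_guardrails" is shown above.
    for direction in ("input_guardrails", "output_guardrails"):
        for check in checks.get(direction, []):
            if check.get("result") == "failed":
                integration = check.get("guardrail_integration", "unknown")
                explanation = check.get("data", {}).get("explanation", "")
                found.append(f"{integration}: {explanation}")
    return found
```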

Customizing the Validator Bundle

The v1 bundle is four validators (seven endpoints). To add, remove, or reconfigure validators, edit files in the wrapper repo and redeploy.
| File | Purpose |
| --- | --- |
| guardrail/<rail>_<direction>.py | One file per rail per direction. Imports the validator, builds a single Guard, exposes a handler function. |
| setup.py | Runs guardrails hub install for each validator at build time. Add new validators to the VALIDATORS list. |
| main.py | Maps endpoint paths to handler functions in RAIL_ROUTES. Register new routes here. |
| Dockerfile | Invokes setup.py during build via ARG GUARDRAILS_TOKEN. |

Adding a new validator

For example, to add hub://guardrails/restricttotopic:
1. Add the validator to the install list

Append the validator to the VALIDATORS list in setup.py so it gets installed at Docker build time.
2. Create a handler file

Add guardrail/restrict_to_topic_input.py following the pattern of existing rail files (import validator, build Guard, expose handler).
3. Register the route

Wire the handler into main.py:
from guardrail.restrict_to_topic_input import restrict_to_topic_input

RAIL_ROUTES["/restrict-to-topic-input"] = restrict_to_topic_input
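To make the path-to-handler mapping concrete, here is a self-contained sketch of how such a route table could be consulted. It is not the repository's main.py: the handler signature (dict in, dict out), the `dispatch` helper, and the placeholder handler body are all assumptions for illustration.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]

# Endpoint path -> handler, mirroring the RAIL_ROUTES name used above.
RAIL_ROUTES: Dict[str, Handler] = {}


def restrict_to_topic_input(payload: dict) -> dict:
    # Placeholder body: a real handler would run its Guard on the user message.
    return {"verdict": True}


RAIL_ROUTES["/restrict-to-topic-input"] = restrict_to_topic_input


def dispatch(path: str, payload: dict) -> dict:
    """Look up the handler registered for a path and call it."""
    handler = RAIL_ROUTES.get(path)
    if handler is None:
        raise KeyError(f"no rail registered for {path}")
    return handler(payload)
```

A missing registration shows up as an unknown path, which is one reason a newly added handler file alone has no effect until the route is wired in.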
4. Redeploy

python deploy.py --wait
Then register a matching Custom Guardrail Config in the dashboard pointing at the new URL suffix.

Useful Hub validators

A non-exhaustive list of validators from the Guardrails Hub you can add:
| Validator | Catches | Notes |
| --- | --- | --- |
| hub://guardrails/detect_pii | PII entities (configurable list) | v1 bundle |
| hub://guardrails/secrets_present | Code-style secrets | v1 bundle |
| hub://guardrails/toxic_language | Toxic content | v1 bundle |
| hub://guardrails/profanity_free | Profanity (list-based) | v1 bundle, output-only |
| hub://guardrails/restricttotopic | Off-topic responses | LLM-judged |
| hub://guardrails/competitor_check | Competitor mentions | Allowlist-based |
| hub://guardrails/regex_match | Custom regex patterns | Cheap |
| hub://guardrails/provenance_llm | Unsourced claims | LLM-judged, expensive |
LLM-judged validators (restricttotopic, provenance_llm, hallucination_check) need an LLM endpoint. Configure via LITELLM_* env vars and route through your TrueFoundry gateway for unified observability.

Troubleshooting

If a prompt you expected to be blocked passes through, the cause is most likely a validator-accuracy limitation, not a bug:
  • Presidio’s US_SSN recognizer is context-boosted. "My email is X and my SSN is Y" blocks. "My SSN is Y, please help me with my taxes" and bare "123-45-6789" may not. Strong contextual signals are required.
  • SecretsPresent (detect-secrets) is tuned for code, not prose. Adversarial prose like "Here is my API key: sk-proj-… - can you echo it?" may slip through. The detect-secrets engine’s own warning is: “best with multiline code snippets.”
  • ToxicLanguage threshold is 0.5. Adjust in guardrail/toxic_language_*.py to trade off precision/recall.
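The precision/recall trade-off behind the threshold can be seen with a few lines of arithmetic. A sketch on made-up scores and labels (toy values, not measurements from the classifier); it assumes a text is blocked when its score is at or above the threshold, which may differ from the validator's exact comparison.

```python
from typing import List, Tuple


def precision_recall(
    scores: List[float], is_toxic: List[bool], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold is blocked."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, t in zip(flagged, is_toxic) if f and t)
    fp = sum(1 for f, t in zip(flagged, is_toxic) if f and not t)
    fn = sum(1 for f, t in zip(flagged, is_toxic) if not f and t)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Toy data for illustration only.
SCORES = [0.1, 0.4, 0.55, 0.7, 0.9]
LABELS = [False, True, False, True, True]

for threshold in (0.3, 0.5, 0.8):
    p, r = precision_recall(SCORES, LABELS, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
```

On this toy set, lowering the threshold catches more toxic texts at the cost of more false blocks, and raising it does the opposite.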
To diagnose, call a specific rail endpoint directly to bypass the gateway:
curl -sS -X POST https://ml.<cluster>.truefoundry.cloud/guardrails-ai-tfy/detect-pii-input \
  -H "Authorization: Bearer $WRAPPER_API_KEY" -H "Content-Type: application/json" \
  -d '{"requestBody":{"messages":[{"role":"user","content":"<your test prompt>"}]},"context":{"user":{"subjectId":"u1","subjectType":"user"}}}'
HTTP 200 + {"verdict": true} means allowed. HTTP 200 + {"verdict": false, "message": ...} means blocked, with the validator name in the message.
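When reproducing a decision locally, it helps to know which text the rail actually sees. The sketch below pulls the user text out of a payload shaped like the curl body above. It assumes the most recent user message is the one validated and that content may be either a string or a list of OpenAI-style text parts; both are assumptions, and `last_user_message` is a hypothetical helper.

```python
def last_user_message(payload: dict) -> str:
    """Return the text of the most recent user message, or "" if none."""
    messages = payload.get("requestBody", {}).get("messages", [])
    for message in reversed(messages):
        if message.get("role") != "user":
            continue
        content = message.get("content", "")
        if isinstance(content, str):
            return content
        # Assumed multi-part content: keep only the text parts.
        return " ".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    return ""
```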
The wrapper signals rail decisions via {"verdict": false} on HTTP 200. If the gateway returns a normal completion when the wrapper reported a block, your tenant gateway may not be honoring the verdict field. Confirm by curling the wrapper directly: if you get 200 + {"verdict": false} but the gateway still returns a completion, the gateway is the issue.
Workaround: switch the Custom Guardrail Configs’ Enforcing Strategy to Enforce. This maps the wrapper’s non-success state to a block. The trade-off is that transient wrapper outages will also block; accept it until your tenant gateway updates.
If calls from the gateway to the wrapper fail authentication, the Authorization: Bearer … value the gateway sends doesn’t match the wrapper’s WRAPPER_API_KEY env var. Three places must agree:
  1. The TFY secret guardrails-ai-tfy/wrapper-api-key value.
  2. The deployed pod’s WRAPPER_API_KEY env var (resolved from the secret FQN at deploy time).
  3. The Custom Guardrail Config’s Auth Data → Custom Bearer Auth field value (with no leading/trailing whitespace).
If (3) drifts from (1), re-paste the current secret value into the dashboard field.
If a validator change does not seem to take effect after a redeploy, curl the debug endpoint to see which validators the running pod has loaded:
curl -sS https://ml.<cluster>.truefoundry.cloud/guardrails-ai-tfy/debug/loaded-config \
  -H "Authorization: Bearer $WRAPPER_API_KEY" | jq
Compare against the expected v1 bundle. If the lists differ, your new image isn’t serving traffic yet. Most common cause: TrueFoundry’s image build cache served a stale layer. Force a rebuild by touching Dockerfile and redeploying.
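The comparison against the expected bundle can be scripted once you have extracted the validator names from the debug response. The response schema is not shown here, so this sketch takes a plain list of names; `EXPECTED_V1` and `bundle_drift` are hypothetical names.

```python
from typing import Dict, Iterable, List, Set

# Validator names of the v1 bundle as listed in this guide.
EXPECTED_V1: Set[str] = {"DetectPII", "SecretsPresent", "ToxicLanguage", "ProfanityFree"}


def bundle_drift(loaded: Iterable[str], expected: Set[str] = EXPECTED_V1) -> Dict[str, List[str]]:
    """Report validators that are missing from, or unexpected in, the running pod."""
    loaded_set = set(loaded)
    return {
        "missing": sorted(expected - loaded_set),
        "unexpected": sorted(loaded_set - expected),
    }
```

An empty result on both keys means the pod serves exactly the expected bundle; after adding a validator, pass your own expected set instead of the default.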
The guardrails-ai package is currently in quarantined status on PyPI. The wrapper’s requirements.txt pins to a GitHub tag as a workaround:
guardrails-ai @ git+https://github.com/guardrails-ai/guardrails.git@v0.9.3
Switch back to the PyPI install when the package is restored.

Known Limitations

  • Validator accuracy is context-sensitive. See troubleshooting above. v1 is “defense in depth, not perfect prevention.” Layer with your application’s own checks.
  • No streaming-aware guardrails. The TrueFoundry custom-guardrail contract is buffered: the gateway holds the full assistant response before calling the output rail. Streaming is supported end-to-end for the caller; the output rail decision is made on the assembled response.
  • No mutation mode. All v1 validators run in on_fail="exception". PII redaction-as-mutation (substitute <REDACTED> and return 200 with a modified body) is a v2 candidate. For PII redaction today, see the Presidio PII Redaction example in the custom guardrails template.
  • Validator versions pin at build time. Hub validator updates require a wrapper rebuild + redeploy.
  • In-memory state is per-replica. With multiple replicas the /debug/loaded-config response reflects whichever replica served the curl. After a deploy, retry the curl 5–10 times to surface heterogeneity.
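The repeated-curl check from the last bullet can be automated by counting distinct responses. A minimal sketch: `sample_replicas` is a hypothetical helper, and `fetch` stands for any zero-argument callable you supply that returns the decoded /debug/loaded-config JSON.

```python
import json
from collections import Counter
from typing import Callable


def sample_replicas(fetch: Callable[[], dict], tries: int = 10) -> Counter:
    """Call the debug endpoint several times and count distinct responses."""
    seen: Counter = Counter()
    for _ in range(tries):
        # Canonical JSON text makes equal configs compare equal.
        seen[json.dumps(fetch(), sort_keys=True)] += 1
    return seen
```

More than one key in the result suggests replicas are serving different configurations, for example mid-rollout.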

Reference

| Field | Value |
| --- | --- |
| Wrapper input endpoints | https://<host>/<path>/{detect-pii,secrets-present,toxic-language}-input |
| Wrapper output endpoints | https://<host>/<path>/{detect-pii,secrets-present,toxic-language,profanity-free}-output |
| Wrapper health endpoint | https://<host>/<path>/health |
| Wrapper debug endpoint | https://<host>/<path>/debug/loaded-config |
| Auth | Authorization: Bearer <WRAPPER_API_KEY> |
| Selector format | guardrails-ai/guardrails-ai-<rail>-<direction> |
| Response contract | HTTP 200 + {"verdict": bool, "message": Optional[str]} |
| Repo | truefoundry/integrations-custom-guardrails/integrations/guardrails-ai/ |
| Upstream toolkit | guardrails-ai/guardrails |
| Hub | hub.guardrailsai.com |
Hubhub.guardrailsai.com