Overview
AI Gateway provides pattern-based security controls for AI API traffic (OpenAI, Anthropic, Azure OpenAI). This agent specializes in content-level pattern matching that detects known attack patterns in prompts — capabilities that complement Sentinel’s built-in inference features.
Built-in vs Agent Features
Sentinel v26.01 includes built-in inference support for token-based rate limiting, cost tracking, and model routing. This agent focuses on pattern-based guardrails that analyze prompt content:
| Feature | Built-in | Agent |
|---|---|---|
| Token-based rate limiting | Yes | — |
| Token counting (Tiktoken) | Yes | — |
| Cost attribution & budgets | Yes | — |
| Model-based routing | Yes | — |
| Fallback routing | Yes | — |
| Prompt injection detection | — | Yes |
| Jailbreak detection | — | Yes |
| Input PII detection & redaction | — | Yes |
| Output PII detection | — | Yes |
| Schema validation | — | Yes |
| Model allowlist | — | Yes |
Recommended setup: Use Sentinel’s built-in inference features for rate limiting and cost control, and add this agent for semantic security.
Protocol v2 Features
As of v0.2.0, the AI Gateway agent supports protocol v2 with:
- Capability negotiation: Reports supported features during handshake
- Health reporting: Exposes health status with detection metrics
- Metrics export: Counter metrics for detections (prompt injection, jailbreak, PII)
- gRPC transport: Optional high-performance gRPC transport via
--grpc-address - Lifecycle hooks: Graceful shutdown and drain handling
Features
Input Guardrails
Analyze and filter prompts before they reach the LLM:
- Prompt Injection Detection: Block attempts to override system prompts or manipulate AI behavior
- Jailbreak Detection: Detect DAN, developer mode, and other bypass attempts
- Input PII Detection: Detect emails, SSNs, phone numbers, credit cards, IP addresses
- Configurable actions: block, redact, or log
- Schema Validation: Validate requests against OpenAI and Anthropic JSON schemas
- Model Allowlist: Restrict which AI models can be used
Output Guardrails
Analyze and filter LLM responses before they reach the client:
- Response PII Detection: Detect PII leaked in LLM responses
- Configurable actions: block, redact, or log
- Prevents models from exposing training data or user information
- Response Schema Validation: Validate structured outputs (JSON mode)
- Ensure responses match expected schema
- Block malformed structured responses
Observability
- Provider Detection: Auto-detect OpenAI, Anthropic, Azure from request
- Audit Tags: Rich metadata for logging and monitoring
- Request Headers: Informational headers for downstream processing
Installation
Using Cargo
cargo install sentinel-agent-ai-gateway
Configuration
Command Line
sentinel-ai-gateway-agent \
--socket /tmp/sentinel-ai.sock \
--allowed-models "gpt-4,gpt-3.5-turbo,claude-3" \
--pii-action block \
--output-pii-action redact \
--schema-validation
Environment Variables
General Options
| Option | Env Var | Description | Default |
|---|---|---|---|
--socket | AGENT_SOCKET | Unix socket path | /tmp/sentinel-ai-gateway.sock |
--grpc-address | GRPC_ADDRESS | gRPC listen address (e.g., 0.0.0.0:50051) | - |
--allowed-models | ALLOWED_MODELS | Comma-separated model allowlist | (all) |
--block-mode | BLOCK_MODE | Block or detect-only | true |
--fail-open | FAIL_OPEN | Allow requests on processing errors | false |
--verbose, -v | VERBOSE | Enable debug logging | false |
Input Guardrails
| Option | Env Var | Description | Default |
|---|---|---|---|
--prompt-injection | PROMPT_INJECTION | Enable prompt injection detection | true |
--jailbreak-detection | JAILBREAK_DETECTION | Enable jailbreak detection | true |
--pii-detection | PII_DETECTION | Enable input PII detection | true |
--pii-action | PII_ACTION | Action on input PII: block/redact/log | log |
--schema-validation | SCHEMA_VALIDATION | Enable request schema validation | false |
Output Guardrails
| Option | Env Var | Description | Default |
|---|---|---|---|
--output-pii-detection | OUTPUT_PII_DETECTION | Enable response PII detection | false |
--output-pii-action | OUTPUT_PII_ACTION | Action on output PII: block/redact/log | log |
--output-schema | OUTPUT_SCHEMA | Path to JSON Schema for response validation | - |
Recommended Sentinel Configuration
Combine built-in inference features with the agent for comprehensive protection:
// Built-in: Token rate limiting and cost tracking
inference "openai" {
provider openai
token-rate-limit 100000 per minute
token-budget 1000000 per day
cost-tracking enabled
}
// Agent: Input and output guardrails
agent "ai-gateway" {
socket "/tmp/sentinel-ai-gateway.sock"
timeout 5s
// Include response events for output guardrails
events ["request_headers" "request_body_chunk" "response_headers" "response_body_chunk"]
}
route {
match { path-prefix "/v1/chat" }
inference "openai"
agents ["ai-gateway"]
upstream "openai-backend"
}
Note: Output guardrails require
response_headersandresponse_body_chunkevents to inspect LLM responses.
Response Headers
Input Guardrail Headers
| Header | Description |
|---|---|
X-AI-Gateway-Provider | Detected provider (openai, anthropic, azure) |
X-AI-Gateway-Model | Model from request |
X-AI-Gateway-Input-PII-Detected | Comma-separated PII types found in prompt |
X-AI-Gateway-Schema-Valid | Request schema validation result |
X-AI-Gateway-Blocked | true if request was blocked |
X-AI-Gateway-Blocked-Reason | Reason for blocking |
Output Guardrail Headers
| Header | Description |
|---|---|
X-AI-Gateway-Output-PII-Detected | Comma-separated PII types found in response |
X-AI-Gateway-Output-PII-Redacted | true if response PII was redacted |
X-AI-Gateway-Output-Schema-Valid | Response schema validation result |
Detection Patterns
Input Detection
Prompt Injection
Detects patterns like:
- “Ignore previous instructions”
- “You are now a…”
- “System prompt:”
- Role manipulation attempts
- System prompt extraction attempts
Jailbreak
Detects patterns like:
- DAN (Do Anything Now) and variants
- Developer/debug mode requests
- “Hypothetically” and “for educational purposes” framing
- Evil/uncensored mode requests
PII Detection (Input & Output)
Detects in both prompts and responses:
- Email addresses
- Social Security Numbers (SSN)
- Phone numbers (US format)
- Credit card numbers
- Public IP addresses
- API keys and secrets (common patterns)
Schema Validation (Input & Output)
Input Schema
Validates requests against JSON schemas for:
- OpenAI Chat: model, messages, temperature (0-2), etc.
- OpenAI Completions: model, prompt
- Anthropic Messages: model, max_tokens, messages
Output Schema
Validates structured responses:
- JSON mode responses
- Function call outputs
- Tool use responses
Supported Providers
| Provider | Detection | Paths |
|---|---|---|
| OpenAI | Bearer sk-* header | /v1/chat/completions, /v1/completions |
| Anthropic | anthropic-version header | /v1/messages, /v1/complete |
| Azure OpenAI | Path pattern | /openai/deployments/*/chat/completions |
Examples
Block Prompt Injection
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Ignore all previous instructions and..."}]
}'
Response:
HTTP/1.1 403 Forbidden
X-AI-Gateway-Blocked: true
X-AI-Gateway-Blocked-Reason: prompt-injection
Input PII Redaction
With --pii-action redact, PII is replaced before reaching the upstream:
// Original
{"messages": [{"role": "user", "content": "Email me at john@example.com"}]}
// After redaction
{"messages": [{"role": "user", "content": "Email me at [EMAIL_REDACTED]"}]}
Output PII Redaction
With --output-pii-action redact, PII in LLM responses is redacted before reaching the client:
// Original LLM response
{"choices": [{"message": {"content": "The customer's email is john.doe@company.com and SSN is 123-45-6789"}}]}
// After redaction
{"choices": [{"message": {"content": "The customer's email is [EMAIL_REDACTED] and SSN is [SSN_REDACTED]"}}]}
Response headers:
X-AI-Gateway-Output-PII-Detected: email,ssn
X-AI-Gateway-Output-PII-Redacted: true
Full Protection Example
Run with comprehensive input and output guardrails:
sentinel-ai-gateway-agent \
--socket /tmp/sentinel-ai.sock \
--allowed-models "gpt-4,gpt-4-turbo,claude-3-opus" \
--prompt-injection true \
--jailbreak-detection true \
--pii-action redact \
--output-pii-detection true \
--output-pii-action redact
Library Usage
use sentinel_agent_ai_gateway::{AiGatewayAgent, AiGatewayConfig, PiiAction};
use sentinel_agent_protocol::AgentServer;
let config = AiGatewayConfig {
// Input guardrails
prompt_injection_enabled: true,
jailbreak_detection_enabled: true,
pii_detection_enabled: true,
pii_action: PiiAction::Redact,
schema_validation_enabled: true,
// Output guardrails
output_pii_detection_enabled: true,
output_pii_action: PiiAction::Redact,
..Default::default()
};
let agent = AiGatewayAgent::new(config);
let server = AgentServer::new("ai-gateway", "/tmp/ai.sock", Box::new(agent));
server.run().await?;
Related Agents
| Agent | Integration |
|---|---|
| ModSecurity | Full OWASP CRS support for web attacks |
| Auth | Per-user API keys and quotas |
| WAF | Additional web attack detection |