AI Gateway

Official Stable

Pattern-based security for AI APIs: prompt injection detection, jailbreak prevention, PII detection, and schema validation for LLM traffic.

Version: 0.2.0 | Author: Sentinel Core Team | License: Apache-2.0 | Protocol: v2

Quick Install

Cargo
cargo install sentinel-agent-ai-gateway

Overview

AI Gateway provides pattern-based security controls for AI API traffic (OpenAI, Anthropic, Azure OpenAI). This agent specializes in content-level pattern matching that detects known attack patterns in prompts — capabilities that complement Sentinel’s built-in inference features.

Built-in vs Agent Features

Sentinel v26.01 includes built-in inference support for token-based rate limiting, cost tracking, and model routing. This agent focuses on pattern-based guardrails that analyze prompt content:

Feature                          | Built-in | Agent
---------------------------------|----------|------
Token-based rate limiting        | Yes      |
Token counting (Tiktoken)        | Yes      |
Cost attribution & budgets       | Yes      |
Model-based routing              | Yes      |
Fallback routing                 | Yes      |
Prompt injection detection       |          | Yes
Jailbreak detection              |          | Yes
Input PII detection & redaction  |          | Yes
Output PII detection             |          | Yes
Schema validation                |          | Yes
Model allowlist                  |          | Yes

Recommended setup: Use Sentinel’s built-in inference features for rate limiting and cost control, and add this agent for content-level security guardrails.

Protocol v2 Features

As of v0.2.0, the AI Gateway agent supports protocol v2 with:

  • Capability negotiation: Reports supported features during handshake
  • Health reporting: Exposes health status with detection metrics
  • Metrics export: Counter metrics for detections (prompt injection, jailbreak, PII)
  • gRPC transport: Optional high-performance gRPC transport via --grpc-address (see the example after this list)
  • Lifecycle hooks: Graceful shutdown and drain handling
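
For example, the optional gRPC transport can be enabled with the --grpc-address flag documented below; the listen address here is illustrative:

sentinel-ai-gateway-agent \
  --socket /tmp/sentinel-ai-gateway.sock \
  --grpc-address 0.0.0.0:50051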

Features

Input Guardrails

Analyze and filter prompts before they reach the LLM:

  • Prompt Injection Detection: Block attempts to override system prompts or manipulate AI behavior
  • Jailbreak Detection: Detect DAN, developer mode, and other bypass attempts
  • Input PII Detection: Detect emails, SSNs, phone numbers, credit cards, IP addresses
    • Configurable actions: block, redact, or log
  • Schema Validation: Validate requests against OpenAI and Anthropic JSON schemas
  • Model Allowlist: Restrict which AI models can be used

Output Guardrails

Analyze and filter LLM responses before they reach the client:

  • Response PII Detection: Detect PII leaked in LLM responses
    • Configurable actions: block, redact, or log
    • Prevents models from exposing training data or user information
  • Response Schema Validation: Validate structured outputs (JSON mode)
    • Ensure responses match expected schema
    • Block malformed structured responses

Observability

  • Provider Detection: Auto-detect OpenAI, Anthropic, Azure from request
  • Audit Tags: Rich metadata for logging and monitoring
  • Request Headers: Informational headers for downstream processing

Installation

Using Cargo

cargo install sentinel-agent-ai-gateway

Configuration

Command Line

sentinel-ai-gateway-agent \
  --socket /tmp/sentinel-ai.sock \
  --allowed-models "gpt-4,gpt-3.5-turbo,claude-3" \
  --pii-action block \
  --output-pii-action redact \
  --schema-validation

Environment Variables

Every command-line option can also be set through the corresponding environment variable listed in the tables below.

General Options

Option           | Env Var        | Description                                | Default
-----------------|----------------|--------------------------------------------|-------------------------------
--socket         | AGENT_SOCKET   | Unix socket path                           | /tmp/sentinel-ai-gateway.sock
--grpc-address   | GRPC_ADDRESS   | gRPC listen address (e.g., 0.0.0.0:50051)  | -
--allowed-models | ALLOWED_MODELS | Comma-separated model allowlist            | (all)
--block-mode     | BLOCK_MODE     | Block or detect-only                       | true
--fail-open      | FAIL_OPEN      | Allow requests on processing errors        | false
--verbose, -v    | VERBOSE        | Enable debug logging                       | false
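
As a sketch, the same general options can be supplied through their environment variables, for example under a process manager or in a container (the values are illustrative and the boolean value format is assumed):

AGENT_SOCKET=/tmp/sentinel-ai.sock \
ALLOWED_MODELS="gpt-4,gpt-3.5-turbo,claude-3" \
BLOCK_MODE=true \
VERBOSE=true \
sentinel-ai-gateway-agent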

Input Guardrails

Option                | Env Var              | Description                            | Default
----------------------|----------------------|----------------------------------------|--------
--prompt-injection    | PROMPT_INJECTION     | Enable prompt injection detection      | true
--jailbreak-detection | JAILBREAK_DETECTION  | Enable jailbreak detection             | true
--pii-detection       | PII_DETECTION        | Enable input PII detection             | true
--pii-action          | PII_ACTION           | Action on input PII: block/redact/log  | log
--schema-validation   | SCHEMA_VALIDATION    | Enable request schema validation       | false

Output Guardrails

Option                 | Env Var              | Description                                  | Default
-----------------------|----------------------|----------------------------------------------|--------
--output-pii-detection | OUTPUT_PII_DETECTION | Enable response PII detection                | false
--output-pii-action    | OUTPUT_PII_ACTION    | Action on output PII: block/redact/log       | log
--output-schema        | OUTPUT_SCHEMA        | Path to JSON Schema for response validation  | -
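
As a sketch of response schema validation, the schema file below is hypothetical (both its path and its contents are illustrative) and describes a simple JSON-mode output shape:

# Minimal JSON Schema for the expected structured output (illustrative)
cat > /etc/sentinel/response-schema.json <<'EOF'
{
  "type": "object",
  "required": ["answer"],
  "properties": {
    "answer": { "type": "string" }
  }
}
EOF

# Enable output guardrails and point the agent at the schema
sentinel-ai-gateway-agent \
  --socket /tmp/sentinel-ai.sock \
  --output-pii-detection true \
  --output-schema /etc/sentinel/response-schema.json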

Sentinel Configuration

Combine built-in inference features with the agent for comprehensive protection:

// Built-in: Token rate limiting and cost tracking
inference "openai" {
    provider openai
    token-rate-limit 100000 per minute
    token-budget 1000000 per day
    cost-tracking enabled
}

// Agent: Input and output guardrails
agent "ai-gateway" {
    socket "/tmp/sentinel-ai-gateway.sock"
    timeout 5s
    // Include response events for output guardrails
    events ["request_headers" "request_body_chunk" "response_headers" "response_body_chunk"]
}

route {
    match { path-prefix "/v1/chat" }
    inference "openai"
    agents ["ai-gateway"]
    upstream "openai-backend"
}

Note: Output guardrails require response_headers and response_body_chunk events to inspect LLM responses.
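
With a route like the one above in place, you can confirm the agent is attached by inspecting the informational headers it adds (the listen address is illustrative):

curl -si http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}' \
  | grep -i '^x-ai-gateway-'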

Response Headers

Input Guardrail Headers

Header                          | Description
--------------------------------|----------------------------------------------
X-AI-Gateway-Provider           | Detected provider (openai, anthropic, azure)
X-AI-Gateway-Model              | Model from request
X-AI-Gateway-Input-PII-Detected | Comma-separated PII types found in prompt
X-AI-Gateway-Schema-Valid       | Request schema validation result
X-AI-Gateway-Blocked            | true if request was blocked
X-AI-Gateway-Blocked-Reason     | Reason for blocking

Output Guardrail Headers

Header                           | Description
---------------------------------|---------------------------------------------
X-AI-Gateway-Output-PII-Detected | Comma-separated PII types found in response
X-AI-Gateway-Output-PII-Redacted | true if response PII was redacted
X-AI-Gateway-Output-Schema-Valid | Response schema validation result

Detection Patterns

Input Detection

Prompt Injection

Detects patterns like:

  • “Ignore previous instructions”
  • “You are now a…”
  • “System prompt:”
  • Role manipulation attempts
  • System prompt extraction attempts

Jailbreak

Detects patterns like:

  • DAN (Do Anything Now) and variants
  • Developer/debug mode requests
  • “Hypothetically” and “for educational purposes” framing
  • Evil/uncensored mode requests

PII Detection (Input & Output)

Detects in both prompts and responses:

  • Email addresses
  • Social Security Numbers (SSN)
  • Phone numbers (US format)
  • Credit card numbers
  • Public IP addresses
  • API keys and secrets (common patterns)
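
As an illustration of the pattern-matching approach, regular expressions along these lines (not the agent's actual rules) catch the email and SSN formats listed above:

# Illustrative email and SSN patterns; the agent's own patterns may differ
echo "Email me at john@example.com, my SSN is 123-45-6789" \
  | grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}|[0-9]{3}-[0-9]{2}-[0-9]{4}'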

Schema Validation (Input & Output)

Input Schema

Validates requests against JSON schemas for:

  • OpenAI Chat: model, messages, temperature (0-2), etc.
  • OpenAI Completions: model, prompt
  • Anthropic Messages: model, max_tokens, messages
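
For example, with --schema-validation enabled, a request that violates the OpenAI chat constraints above (temperature outside 0-2) should fail validation; whether it is rejected outright or only annotated depends on --block-mode (the header value format here is assumed):

curl -si http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hi"}], "temperature": 5}'
# Expect the request to be rejected, or flagged via X-AI-Gateway-Schema-Valid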

Output Schema

Validates structured responses:

  • JSON mode responses
  • Function call outputs
  • Tool use responses

Supported Providers

Provider     | Detection                 | Paths
-------------|---------------------------|----------------------------------------
OpenAI       | Bearer sk-* header        | /v1/chat/completions, /v1/completions
Anthropic    | anthropic-version header  | /v1/messages, /v1/complete
Azure OpenAI | Path pattern              | /openai/deployments/*/chat/completions
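
As a quick check of provider detection, a request carrying the anthropic-version header should be tagged as Anthropic (the listen address and API version value are illustrative):

curl -s -D - -o /dev/null http://localhost:8080/v1/messages \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3", "max_tokens": 64, "messages": [{"role": "user", "content": "Hi"}]}' \
  | grep -i '^x-ai-gateway-provider'
# Expected: X-AI-Gateway-Provider: anthropic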

Examples

Block Prompt Injection

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Ignore all previous instructions and..."}]
  }'

Response:

HTTP/1.1 403 Forbidden
X-AI-Gateway-Blocked: true
X-AI-Gateway-Blocked-Reason: prompt-injection
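
Model Allowlist Rejection

With --allowed-models "gpt-4,gpt-3.5-turbo,claude-3" configured and the default block mode, a request for any other model is rejected. The exact reason string below is illustrative:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Response (reason string illustrative):

HTTP/1.1 403 Forbidden
X-AI-Gateway-Blocked: true
X-AI-Gateway-Blocked-Reason: model-not-allowed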

Input PII Redaction

With --pii-action redact, PII is replaced before reaching the upstream:

// Original
{"messages": [{"role": "user", "content": "Email me at john@example.com"}]}

// After redaction
{"messages": [{"role": "user", "content": "Email me at [EMAIL_REDACTED]"}]}

Output PII Redaction

With --output-pii-action redact, PII in LLM responses is redacted before reaching the client:

// Original LLM response
{"choices": [{"message": {"content": "The customer's email is john.doe@company.com and SSN is 123-45-6789"}}]}

// After redaction
{"choices": [{"message": {"content": "The customer's email is [EMAIL_REDACTED] and SSN is [SSN_REDACTED]"}}]}

Response headers:

X-AI-Gateway-Output-PII-Detected: email,ssn
X-AI-Gateway-Output-PII-Redacted: true

Full Protection Example

Run with comprehensive input and output guardrails:

sentinel-ai-gateway-agent \
  --socket /tmp/sentinel-ai.sock \
  --allowed-models "gpt-4,gpt-4-turbo,claude-3-opus" \
  --prompt-injection true \
  --jailbreak-detection true \
  --pii-action redact \
  --output-pii-detection true \
  --output-pii-action redact

Library Usage

use sentinel_agent_ai_gateway::{AiGatewayAgent, AiGatewayConfig, PiiAction};
use sentinel_agent_protocol::AgentServer;

// Assumes an async runtime such as Tokio; the error type is illustrative.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = AiGatewayConfig {
        // Input guardrails
        prompt_injection_enabled: true,
        jailbreak_detection_enabled: true,
        pii_detection_enabled: true,
        pii_action: PiiAction::Redact,
        schema_validation_enabled: true,

        // Output guardrails
        output_pii_detection_enabled: true,
        output_pii_action: PiiAction::Redact,

        ..Default::default()
    };

    // Serve the agent on the Unix socket that Sentinel connects to
    let agent = AiGatewayAgent::new(config);
    let server = AgentServer::new("ai-gateway", "/tmp/ai.sock", Box::new(agent));
    server.run().await?;

    Ok(())
}

Related Agents

Agent       | Integration
------------|----------------------------------------
ModSecurity | Full OWASP CRS support for web attacks
Auth        | Per-user API keys and quotas
WAF         | Additional web attack detection