Overview
This Sentinel agent protects sensitive data in API traffic through tokenization, format-preserving encryption, and pattern-based masking. It is designed for organizations that need to meet GDPR, PCI DSS, and HIPAA requirements.
The agent intercepts request and response bodies, detects sensitive fields using configured paths or automatic pattern matching, and applies reversible or irreversible masking transformations.
Features
- Reversible Tokenization: Replace sensitive values with UUID tokens, detokenize on response
- Format-Preserving Encryption: Encrypt credit cards and SSNs while maintaining format (all-digits output)
- Pattern Detection: Automatic detection of credit cards (Luhn), SSNs, emails, phone numbers
- Content Type Support: JSON, XML, and form-urlencoded bodies
- Header Masking: Mask sensitive headers (Authorization, API keys)
- Character Masking: Partially mask values (e.g., 4111****1111)
- Per-Request Token Lifecycle: Tokens scoped to request correlation ID with automatic cleanup
- Configurable Actions: Tokenize, FPE, mask, redact, or hash
Installation
Using Cargo
cargo install sentinel-data-masking-agent
From Source
git clone https://github.com/raskell-io/sentinel
cd sentinel/agents/data-masking
cargo build --release
Configuration
Create a JSON configuration file or pass configuration through Sentinel’s agent config:
{
"store": {
"type": "memory",
"ttl_seconds": 300,
"max_entries": 100000
},
"fields": [
{
"path": "$.payment.card_number",
"action": {
"type": "fpe",
"alphabet": "credit_card"
},
"direction": "both"
},
{
"path": "$.user.ssn",
"action": {
"type": "tokenize",
"format": "uuid"
},
"direction": "both"
},
{
"path": "$.user.email",
"action": {
"type": "mask",
"char": "*",
"preserve_start": 2,
"preserve_end": 0
},
"direction": "response"
}
],
"headers": [
{
"name": "Authorization",
"action": {
"type": "redact",
"replacement": "[REDACTED]"
},
"direction": "request"
}
],
"patterns": {
"builtins": {
"credit_card": true,
"ssn": true,
"email": false,
"phone": false
},
"custom": [
{
"name": "api_key",
"regex": "sk_[a-zA-Z0-9]{24,}",
"action": {
"type": "redact",
"replacement": "sk_[REDACTED]"
}
}
]
},
"fpe": {
"key_env": "DATA_MASKING_FPE_KEY"
},
"buffering": {
"max_buffer_bytes": 10485760
}
}
Sentinel Configuration
Add to your Sentinel proxy configuration:
agents {
data-masking socket="/tmp/data-masking-agent.sock" {
events "request_headers" "request_body" "response_headers" "response_body" "request_complete"
timeout_ms 5000
max_request_body_bytes 10485760
max_response_body_bytes 10485760
failure_mode "fail_open"
config {
// Agent-specific config passed as JSON
}
}
}
Masking Actions
Tokenization (Reversible)
Replace values with UUID tokens. The original value is stored in memory and restored on the response path.
{
"path": "$.user.ssn",
"action": {
"type": "tokenize",
"format": "uuid"
}
}
Input: {"user": {"ssn": "123-45-6789"}}
Masked: {"user": {"ssn": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}}
Response: Original value restored automatically
Token formats:
- uuid - Standard UUID v4
- prefixed - Custom prefix (e.g., tok_abc123...)
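The round trip described above can be sketched as follows. This is a minimal illustration, not the agent's implementation: `TokenStore`, `tokenize`, and `detokenize` are hypothetical names, and a counter-based `tok_` token stands in for UUID v4 generation so the example needs only the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store (assumption: the agent's real types are not
/// shown in this document).
#[derive(Default)]
pub struct TokenStore {
    // correlation ID -> (token -> original value)
    by_request: HashMap<String, HashMap<String, String>>,
    next_id: u64,
}

impl TokenStore {
    /// Request path: swap the value for a token and remember the mapping.
    pub fn tokenize(&mut self, correlation_id: &str, value: &str) -> String {
        // Assumption: a counter replaces UUID v4 generation to stay std-only.
        self.next_id += 1;
        let token = format!("tok_{:016x}", self.next_id);
        self.by_request
            .entry(correlation_id.to_string())
            .or_default()
            .insert(token.clone(), value.to_string());
        token
    }

    /// Response path: restore the original value, but only for tokens that
    /// were issued under the same correlation ID.
    pub fn detokenize(&self, correlation_id: &str, token: &str) -> Option<&str> {
        self.by_request
            .get(correlation_id)?
            .get(token)
            .map(String::as_str)
    }
}
```

In this sketch a token issued for one request resolves to `None` when looked up under a different correlation ID, which mirrors the per-request scoping described in this document.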
Format-Preserving Encryption (Reversible)
Encrypt values while preserving their format. Useful for credit cards and SSNs where downstream systems expect specific formats.
{
"path": "$.payment.card_number",
"action": {
"type": "fpe",
"alphabet": "credit_card"
}
}
Input: {"payment": {"card_number": "4111111111111111"}}
Masked: {"payment": {"card_number": "8472619305847261"}}
Requires the DATA_MASKING_FPE_KEY environment variable to be set to a 32-byte key encoded as 64 hex characters.
Alphabets:
- digits - 0-9
- alphanumeric - 0-9, a-z, A-Z
- alphanumeric_lower - 0-9, a-z
- credit_card - 16 digits
- ssn - 9 digits
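The key requirement (64 hex characters decoding to 32 bytes) can be checked before the agent starts. The sketch below shows one way to validate such a key using only the standard library. `parse_fpe_key` and `load_fpe_key` are hypothetical helper names, not part of the agent, and the encryption step itself is deliberately left out.

```rust
/// Decode a 64-character hex string into a 32-byte key.
/// Hypothetical helper for illustration; not the agent's API.
pub fn parse_fpe_key(hex: &str) -> Result<[u8; 32], String> {
    if !hex.is_ascii() || hex.len() != 64 {
        return Err(format!("expected 64 hex characters, got {} bytes", hex.len()));
    }
    let mut key = [0u8; 32];
    // Each pair of hex digits becomes one key byte.
    for (i, pair) in hex.as_bytes().chunks(2).enumerate() {
        let hi = (pair[0] as char).to_digit(16).ok_or("invalid hex digit")?;
        let lo = (pair[1] as char).to_digit(16).ok_or("invalid hex digit")?;
        key[i] = (hi * 16 + lo) as u8;
    }
    Ok(key)
}

/// Read the key from the environment variable named in this document.
pub fn load_fpe_key() -> Result<[u8; 32], String> {
    let hex = std::env::var("DATA_MASKING_FPE_KEY")
        .map_err(|e| format!("DATA_MASKING_FPE_KEY: {}", e))?;
    parse_fpe_key(hex.trim())
}
```

A key of the right shape can be produced with `openssl rand -hex 32`, which prints 64 hex characters.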
Character Masking (Irreversible)
Partially mask values while preserving some characters for identification.
{
"path": "$.payment.card_number",
"action": {
"type": "mask",
"char": "*",
"preserve_start": 4,
"preserve_end": 4
}
}
Input: 4111111111111111
Output: 4111********1111
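The effect of `char`, `preserve_start`, and `preserve_end` can be sketched as below. `mask_value` is a hypothetical function written for illustration. How the agent treats values shorter than the preserved ranges is not stated here, so the sketch assumes such values are masked in full.

```rust
/// Mask every character except the first `preserve_start` and the last
/// `preserve_end`. Hypothetical helper; not the agent's API.
pub fn mask_value(
    value: &str,
    mask_char: char,
    preserve_start: usize,
    preserve_end: usize,
) -> String {
    let chars: Vec<char> = value.chars().collect();
    let len = chars.len();
    // Assumption: a value too short to keep both ends is masked entirely,
    // so nothing leaks from short inputs.
    if preserve_start + preserve_end >= len {
        return mask_char.to_string().repeat(len);
    }
    chars
        .iter()
        .enumerate()
        .map(|(i, &c)| {
            if i < preserve_start || i >= len - preserve_end {
                c
            } else {
                mask_char
            }
        })
        .collect()
}
```

With these definitions, `mask_value("4111111111111111", '*', 4, 4)` yields `4111********1111`, matching the example above.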
Redaction (Irreversible)
Replace the entire value with a fixed string.
{
"path": "$.secrets.api_key",
"action": {
"type": "redact",
"replacement": "[REDACTED]"
}
}
Hashing (Irreversible)
Replace the value with its SHA-256 hash, which allows records to be correlated without exposing the original value.
{
"path": "$.user.email",
"action": {
"type": "hash",
"algorithm": "sha256",
"truncate": 16
}
}
Pattern Detection
The agent can automatically detect and mask sensitive data using built-in patterns:
Built-in Patterns
| Pattern | Detection | Default Action |
|---|---|---|
| credit_card | Regex + Luhn validation | Mask, preserve first 4 and last 4 |
| ssn | XXX-XX-XXXX or 9 digits | Mask, preserve last 4 |
| email | Standard email regex | Mask, preserve first 2 |
| phone | US phone formats | Mask, preserve last 4 |
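The Luhn step listed for credit_card filters out digit runs that merely look like card numbers. A minimal sketch of the checksum follows. `luhn_valid` is a hypothetical name, and the 13 to 19 digit length bound and the tolerance for spaces and dashes are assumptions rather than documented agent behavior.

```rust
/// Return true if `candidate` passes the Luhn checksum.
/// Hypothetical helper; not the agent's API.
pub fn luhn_valid(candidate: &str) -> bool {
    // Assumption: spaces and dashes are separators; any other non-digit
    // makes the candidate invalid (collect yields None, then an empty Vec).
    let digits: Vec<u32> = candidate
        .chars()
        .filter(|&c| c != ' ' && c != '-')
        .map(|c| c.to_digit(10))
        .collect::<Option<Vec<u32>>>()
        .unwrap_or_default();
    // Assumption: accept typical card number lengths only.
    if digits.len() < 13 || digits.len() > 19 {
        return false;
    }
    // Starting from the rightmost digit, double every second digit and
    // subtract 9 from any result above 9, then sum everything.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}
```

The test number used throughout this document, 4111111111111111, passes this check, while changing its last digit to 2 makes it fail.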
Custom Patterns
{
"patterns": {
"custom": [
{
"name": "aws_access_key",
"regex": "AKIA[0-9A-Z]{16}",
"action": {
"type": "redact",
"replacement": "[AWS_KEY]"
}
}
]
}
}
Direction Control
Control when masking is applied:
- request - Only mask on request path (to upstream)
- response - Only mask on response path (to client)
- both - Mask on request, unmask on response (for tokenization/FPE)
{
"path": "$.user.ssn",
"action": { "type": "tokenize", "format": "uuid" },
"direction": "both"
}
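The three direction values can be read as a small decision table. The sketch below encodes that table for reversible actions (tokenization and FPE). The type and function names are hypothetical, and how `both` behaves for irreversible actions is not covered by this document, so the sketch does not model it.

```rust
/// Configured `direction` of a field rule (hypothetical type).
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Direction {
    Request,
    Response,
    Both,
}

/// Which side of the exchange is currently being processed (hypothetical type).
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Phase {
    Request,
    Response,
}

/// What to do with a matching field in the current phase.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Operation {
    Mask,
    Unmask,
    Skip,
}

/// Decision table for reversible actions, as described in the list above.
pub fn operation_for(direction: Direction, phase: Phase) -> Operation {
    match (direction, phase) {
        (Direction::Request, Phase::Request) => Operation::Mask,
        (Direction::Response, Phase::Response) => Operation::Mask,
        (Direction::Both, Phase::Request) => Operation::Mask,
        (Direction::Both, Phase::Response) => Operation::Unmask,
        // A request-only rule is skipped on responses, and vice versa.
        _ => Operation::Skip,
    }
}
```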
Content Types
The agent supports multiple content types:
JSON
Use JSONPath-like syntax:
- $.user.email - Exact path
- ssn - Match field name anywhere in document
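The difference between the two path forms can be illustrated with a small matcher. This is a sketch under assumptions: `rule_matches` is a hypothetical function, the field's location is given as the list of object keys from the document root, and array indexing is not handled.

```rust
/// Decide whether a configured path applies to a field.
///
/// `rule` is either `$.a.b.c` (exact path from the root) or a bare field
/// name (matches a field with that name at any depth).
/// `path` holds the object keys from the root down to the field.
/// Hypothetical helper; not the agent's API.
pub fn rule_matches(rule: &str, path: &[&str]) -> bool {
    match rule.strip_prefix("$.") {
        // Exact form: every segment must line up with the field's path.
        Some(exact) => exact.split('.').eq(path.iter().copied()),
        // Bare form: only the final key has to equal the field name.
        None => path.last().map_or(false, |last| *last == rule),
    }
}
```

Under this reading, `$.user.email` matches only a top-level `user.email`, while `ssn` matches an `ssn` key nested anywhere in the document.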
XML
Use simple XPath:
- /root/user/email - Exact path
Form Data
Use field names directly:
- email - Form field name
- credit_card - Form field name
CLI Options
sentinel-data-masking-agent [OPTIONS]
Options:
-s, --socket <PATH> Unix socket path [default: /tmp/data-masking-agent.sock]
-g, --grpc <ADDR> gRPC address (e.g., "0.0.0.0:50051")
-c, --config <PATH> Configuration file (JSON)
-l, --log-level <LEVEL> Log level [default: info]
-h, --help Print help
-V, --version Print version
Environment Variables
| Variable | Description |
|---|---|
| DATA_MASKING_FPE_KEY | 64 hex character key for format-preserving encryption |
| DATA_MASKING_SOCKET | Default socket path |
| DATA_MASKING_LOG_LEVEL | Log level (trace, debug, info, warn, error) |
Usage Examples
PCI DSS Compliance
Protect credit card data while maintaining format for downstream validation:
{
"fields": [
{
"path": "$.payment.card_number",
"action": { "type": "fpe", "alphabet": "credit_card" },
"direction": "both"
},
{
"path": "$.payment.cvv",
"action": { "type": "redact", "replacement": "***" },
"direction": "request"
}
],
"patterns": {
"builtins": { "credit_card": true }
}
}
GDPR Data Minimization
Tokenize personal data so upstream services only see tokens:
{
"fields": [
{
"path": "$.user.email",
"action": { "type": "tokenize", "format": "uuid" },
"direction": "both"
},
{
"path": "$.user.phone",
"action": { "type": "tokenize", "format": "uuid" },
"direction": "both"
},
{
"path": "$.user.address",
"action": { "type": "tokenize", "format": "uuid" },
"direction": "both"
}
]
}
Logging-Safe Responses
Mask sensitive data in responses before they reach logging systems:
{
"fields": [
{
"path": "ssn",
"action": {
"type": "mask",
"char": "*",
"preserve_start": 0,
"preserve_end": 4
},
"direction": "response"
}
],
"headers": [
{
"name": "Set-Cookie",
"action": { "type": "redact", "replacement": "[COOKIE]" },
"direction": "response"
}
]
}
XML API Protection
{
"fields": [
{
"path": "/Envelope/Body/GetUserResponse/SSN",
"path_type": "xpath",
"action": { "type": "tokenize", "format": "uuid" },
"direction": "both"
}
]
}
Token Store
The in-memory token store provides:
- Per-request scoping: Tokens are associated with correlation IDs
- Automatic cleanup: Tokens removed when request completes
- TTL expiration: Background cleanup of stale tokens (default: 5 minutes)
- Capacity limits: LRU eviction when max entries reached
{
"store": {
"type": "memory",
"ttl_seconds": 300,
"max_entries": 100000
}
}
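The lifecycle rules listed above (per-request scoping, cleanup on completion, TTL expiry) can be sketched with nested maps. `ScopedStore` and its methods are hypothetical names, the current time is passed in explicitly to keep the example deterministic, and LRU eviction at `max_entries` is omitted for brevity.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry {
    original: String,
    created: Instant,
}

/// Hypothetical store illustrating token lifecycle; not the agent's API.
pub struct ScopedStore {
    ttl: Duration,
    // correlation ID -> (token -> entry)
    requests: HashMap<String, HashMap<String, Entry>>,
}

impl ScopedStore {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, requests: HashMap::new() }
    }

    pub fn insert(&mut self, correlation_id: &str, token: &str, original: &str, now: Instant) {
        let entry = Entry { original: original.to_string(), created: now };
        self.requests
            .entry(correlation_id.to_string())
            .or_default()
            .insert(token.to_string(), entry);
    }

    /// Look up a token; entries older than the TTL are treated as missing.
    pub fn get(&self, correlation_id: &str, token: &str, now: Instant) -> Option<&str> {
        let entry = self.requests.get(correlation_id)?.get(token)?;
        if now.duration_since(entry.created) < self.ttl {
            Some(entry.original.as_str())
        } else {
            None
        }
    }

    /// Automatic cleanup: drop every token owned by a finished request.
    pub fn complete_request(&mut self, correlation_id: &str) {
        self.requests.remove(correlation_id);
    }

    /// Background cleanup: drop tokens whose request never completed.
    pub fn purge_expired(&mut self, now: Instant) {
        let ttl = self.ttl;
        for tokens in self.requests.values_mut() {
            tokens.retain(|_, entry| now.duration_since(entry.created) < ttl);
        }
        self.requests.retain(|_, tokens| !tokens.is_empty());
    }

    /// Total number of stored tokens across all requests.
    pub fn len(&self) -> usize {
        self.requests.values().map(|tokens| tokens.len()).sum()
    }
}
```

In this sketch, `complete_request` corresponds to the cleanup triggered when a request finishes, and `purge_expired` stands in for the periodic sweep that bounds memory when that cleanup never runs.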
Best Practices
- Use tokenization for reversible masking - Prefer tokenization over FPE when format preservation isn’t required
- Set appropriate TTLs - Match token TTL to your request timeout settings
- Enable pattern detection judiciously - Only enable patterns you need to reduce false positives
- Use direction control - Apply masking only where needed (request vs response)
- Set FPE key securely - Use environment variables or secrets management, never hardcode
- Monitor buffer sizes - Adjust max_buffer_bytes based on your payload sizes
- Test with production-like data - Validate patterns against real data before deployment
Limitations
- Body masking requires buffering the complete body (not streaming)
- FPE key must be 32 bytes (64 hex characters)
- XML support uses simple XPath (no complex expressions)
- Pattern detection may have false positives with certain data formats