Content Scanner Agent
Protocol v2 Features
As of v0.2.0, the Content Scanner agent supports protocol v2 with:
- Capability negotiation: Reports supported features during handshake
- Health reporting: Exposes health status with draining awareness
- Metrics export: Counter metrics for scans, blocks, errors, and bytes processed
- gRPC transport: High-performance gRPC transport via --grpc-address
- Lifecycle hooks: Graceful shutdown and drain handling
Overview
The Content Scanner agent scans uploaded files and request bodies for malware using the ClamAV daemon (clamd). It protects against malicious file uploads by integrating with the industry-standard ClamAV antivirus engine.
Features
| Feature | Description |
|---|---|
| ClamAV Integration | Connects to clamd via Unix socket using INSTREAM protocol |
| Content-Type Filtering | Only scan specific content types using glob patterns |
| Path Exclusions | Skip scanning for health checks and static paths |
| Method Filtering | Configure which HTTP methods to scan (POST, PUT, PATCH) |
| Size Limits | Skip scanning for bodies exceeding configured size |
| Fail-Open/Closed | Configurable behavior when ClamAV is unavailable |
| Scan Metrics | Headers include scan time and detection status |
Installation
Using Bundle (Recommended)
The easiest way to install this agent is via the Sentinel bundle command:
# Install just this agent
sentinel bundle install content-scanner
# Or install all available agents
sentinel bundle install --all
The bundle command automatically downloads the correct binary for your platform and places it in ~/.sentinel/agents/.
From Source
git clone https://github.com/raskell-io/sentinel-agent-content-scanner
cd sentinel-agent-content-scanner
cargo build --release
ClamAV Setup
The agent requires ClamAV daemon (clamd) to be running:
# macOS (Homebrew)
brew install clamav
freshclam && clamd
# Ubuntu/Debian
sudo apt-get install clamav-daemon
sudo systemctl start clamav-daemon
# RHEL/CentOS
sudo yum install clamav-server clamav-update
sudo freshclam
sudo systemctl start clamd@scan
Configuration
Create a config.yaml file:
settings:
  enabled: true
  fail_action: allow        # allow or block when ClamAV unavailable
  log_detections: true
  log_clean: false
body:
  max_size: 52428800        # 50MB max body to scan
  content_types:            # Only scan these content types (empty = all)
    - "application/octet-stream"
    - "application/zip"
    - "application/x-zip-compressed"
    - "application/gzip"
    - "application/pdf"
    - "application/msword"
    - "application/vnd.openxmlformats-officedocument.*"
    - "multipart/form-data"
clamd:
  enabled: true
  socket_path: "/var/run/clamav/clamd.ctl"
  timeout_ms: 30000         # 30 second scan timeout
  chunk_size: 65536         # 64KB chunks to clamd
skip_paths:
  - "/health"
  - "/ready"
  - "/metrics"
scan_methods:
  - "POST"
  - "PUT"
  - "PATCH"
Configuration Reference
Settings
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Master enable/disable switch |
| fail_action | string | "allow" | Action when ClamAV unavailable: allow or block |
| log_detections | bool | true | Log malware detections |
| log_clean | bool | false | Log clean scan results |
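How a scan outcome combines with fail_action can be modelled in a few lines. This is an illustrative Python sketch, not the agent's code: the function name `decide` and the outcome labels "clean", "infected", and "unavailable" are hypothetical.

```python
def decide(outcome: str, fail_action: str = "allow") -> str:
    """Return "allow" or "block" for one request.

    outcome     -- hypothetical label: "clean", "infected", or "unavailable"
    fail_action -- the fail_action setting: "allow" (fail-open) or "block" (fail-closed)
    """
    if fail_action not in ("allow", "block"):
        raise ValueError(f"unknown fail_action: {fail_action!r}")
    if outcome == "infected":
        return "block"       # detections are blocked whatever fail_action says
    if outcome == "clean":
        return "allow"
    if outcome == "unavailable":
        return fail_action   # only here does fail_action matter
    raise ValueError(f"unknown outcome: {outcome!r}")
```

With the default `fail_action: allow`, an outage of clamd lets uploads through unscanned; `block` trades availability for safety.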
Body Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| max_size | int | 52428800 | Maximum body size to scan (bytes, 50MB default) |
| content_types | list | [] | Content types to scan (empty = all) |
ClamAV Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable ClamAV scanning |
| socket_path | string | /var/run/clamav/clamd.ctl | Path to clamd Unix socket |
| timeout_ms | int | 30000 | Scan timeout in milliseconds |
| chunk_size | int | 65536 | Chunk size for streaming to clamd |
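To show what chunk_size and socket_path control, here is a minimal Python sketch of clamd's INSTREAM framing: the command `zINSTREAM\0`, then each chunk prefixed with its length as a 4-byte big-endian integer, then a zero-length chunk to end the stream. The helper names (`instream_frames`, `parse_reply`, `scan`) are made up for this example and the agent's own implementation may differ.

```python
import socket
import struct


def instream_frames(data: bytes, chunk_size: int = 65536):
    """Yield the byte frames of one INSTREAM session for `data`."""
    yield b"zINSTREAM\0"
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # 4-byte unsigned length in network byte order, then the chunk itself.
        yield struct.pack("!I", len(chunk)) + chunk
    yield struct.pack("!I", 0)  # zero-length chunk terminates the stream


def parse_reply(reply: bytes):
    """Return None for a clean stream, or the signature name if one was found."""
    text = reply.rstrip(b"\0\r\n").decode("utf-8", "replace")
    _, _, status = text.partition(": ")
    if status == "OK":
        return None
    if status.endswith(" FOUND"):
        return status[: -len(" FOUND")]
    raise RuntimeError(f"unexpected clamd reply: {text!r}")


def scan(data: bytes, socket_path: str = "/var/run/clamav/clamd.ctl",
         timeout_s: float = 30.0, chunk_size: int = 65536):
    """Stream `data` to clamd over its Unix socket and parse the verdict."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout_s)
        sock.connect(socket_path)
        for frame in instream_frames(data, chunk_size):
            sock.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            part = sock.recv(4096)
            if not part:
                break
            reply += part
    return parse_reply(reply)
```

Calling `scan(b"Hello World")` against a running clamd should return `None` for clean data. Note that clamd enforces its own StreamMaxLength limit, so a max_size above that limit cannot be scanned in full.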
Path and Method Filtering
| Field | Type | Default | Description |
|---|---|---|---|
| skip_paths | list | [] | Paths to skip scanning (prefix match) |
| scan_methods | list | ["POST", "PUT", "PATCH"] | HTTP methods to scan |
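The two filters can be pictured with a short Python sketch. The function `should_scan` is hypothetical, and the case-insensitive method comparison is an assumption rather than documented behavior.

```python
def should_scan(method, path, skip_paths=(), scan_methods=("POST", "PUT", "PATCH")):
    """Return True when a request passes both the method and the path filter."""
    # Assumption: methods are compared case-insensitively.
    if method.upper() not in {m.upper() for m in scan_methods}:
        return False
    # Prefix match: "/health" also skips "/healthz" and "/health/live".
    return not any(path.startswith(prefix) for prefix in skip_paths)
```

Because matching is by prefix, a short entry such as "/api" would exclude every path beneath it, so keep skip_paths entries as specific as possible.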
Response Headers
| Header | Value | Description |
|---|---|---|
| x-content-scanned | "true" | Body was scanned successfully |
| x-scan-time-ms | "123" | Scan duration in milliseconds |
| x-malware-detected | "true" | Malware was detected (blocked) |
| x-malware-name | "Eicar-Test-Signature" | Name of detected malware |
| x-scan-skipped | "size-exceeded" | Reason scan was skipped |
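A service or test harness that can see these headers could summarize them as below. This is a hedged Python sketch: the function `classify` and its return strings are invented for illustration, and only the header names and values come from the table above.

```python
def classify(headers: dict) -> str:
    """Summarize the scanner headers of one request as a short status string."""
    # Header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    if h.get("x-malware-detected") == "true":
        return f"infected: {h.get('x-malware-name', 'unknown')}"
    if "x-scan-skipped" in h:
        return f"skipped: {h['x-scan-skipped']}"
    if h.get("x-content-scanned") == "true":
        return f"clean ({h.get('x-scan-time-ms', '?')} ms)"
    return "not scanned"
```

For example, `classify({"X-Scan-Skipped": "size-exceeded"})` reports a body that exceeded max_size and was passed through unscanned.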
Content-Type Matching
The agent supports flexible content-type patterns:
| Pattern | Matches |
|---|---|
| application/json | Exact match only |
| application/* | Any application type |
| application/vnd.* | Vendor-specific types like application/vnd.ms-excel |
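The matching rules in the table can be approximated with Python's standard `fnmatch` module. This is a sketch under assumptions: the function `content_type_matches` is hypothetical, stripping parameters such as `; boundary=...` before matching is assumed, and `fnmatch` also treats `?` and `[...]` as wildcards, which the agent's glob dialect may not.

```python
from fnmatch import fnmatchcase


def content_type_matches(header_value: str, patterns) -> bool:
    """Return True if the Content-Type header matches any configured pattern."""
    if not patterns:
        return True  # an empty content_types list means "scan everything"
    # Assumption: drop parameters (charset, boundary) and compare in lowercase.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return any(fnmatchcase(media_type, pattern.lower()) for pattern in patterns)
```

Under these assumptions, `multipart/form-data; boundary=abc` matches the plain pattern `multipart/form-data`, while `application/json` does not match `application/json5`.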
Usage
Start the Agent
./sentinel-agent-content-scanner -c config.yaml -s /tmp/content-scanner.sock
CLI Options
| Option | Description |
|---|---|
| -c, --config | Path to configuration file (default: config.yaml) |
| --grpc-address | gRPC listen address (e.g., 0.0.0.0:50051) |
| --example-config | Print example configuration and exit |
| --validate | Validate configuration and exit |
Sentinel Configuration
Add the agent to your Sentinel route:
route "/api/upload" {
    agents "content-scanner" {
        socket "/tmp/content-scanner.sock"
        timeout_ms 35000
        fail_mode "open"
        phases "request_body"
    }
    upstream "upload-service"
}
Testing
Test with the EICAR standard antivirus test string:
# Clean file (should return 200)
echo "Hello World" | curl -X POST \
-H "Content-Type: application/octet-stream" \
-d @- http://localhost:8080/upload
# EICAR test file (should return 403)
echo 'X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*' \
| curl -X POST \
-H "Content-Type: application/octet-stream" \
-d @- http://localhost:8080/upload
Use Cases
File Upload Protection
Scan all uploaded files for malware:
body:
  content_types:
    - "application/octet-stream"
    - "application/zip"
    - "multipart/form-data"
scan_methods:
  - "POST"
  - "PUT"
Document Scanning
Scan office documents and PDFs:
body:
  content_types:
    - "application/pdf"
    - "application/msword"
    - "application/vnd.openxmlformats-officedocument.*"
    - "application/vnd.ms-excel"
    - "application/vnd.ms-powerpoint"
API Payload Scanning
Scan all API payloads with fail-closed mode:
settings:
  fail_action: block
body:
  max_size: 10485760   # 10MB for API payloads
  content_types: []    # Scan all content types
Performance Considerations
- Body Size Limits: Set appropriate max_size to avoid scanning very large files
- Timeouts: Ensure timeout_ms is sufficient for your expected file sizes
- Chunk Size: Default 64KB is optimal for most use cases
- Skip Paths: Exclude health check endpoints to reduce overhead