Observability

Sentinel exposes three observability signals: logs, metrics, and distributed traces. All three are configured in the observability block.

Basic Configuration

observability {
    logging {
        level "info"
        format "json"

        access-log {
            enabled #true
            file "/var/log/sentinel/access.log"
            format "json"
        }

        error-log {
            enabled #true
            file "/var/log/sentinel/error.log"
            level "warn"
        }

        audit-log {
            enabled #true
            file "/var/log/sentinel/audit.log"
            log-blocked #true
            log-agent-decisions #true
            log-waf-events #true
        }
    }

    metrics {
        enabled #true
        address "0.0.0.0:9090"
        path "/metrics"
    }

    tracing {
        backend "otlp" {
            endpoint "http://jaeger:4317"
        }
        sampling-rate 0.01
        service-name "sentinel"
    }
}

Logging

Application Logging

Configure the main application log output:

observability {
    logging {
        level "info"           // Log level
        format "json"          // Log format
        timestamps #true       // Include timestamps
        file "/var/log/sentinel/app.log"  // Optional file path
    }
}

Log Levels

| Level | Description |
|-------|-------------|
| trace | Very detailed debugging |
| debug | Debugging information |
| info | Informational messages (default) |
| warn | Warnings |
| error | Errors only |

Log Formats

| Format | Description |
|--------|-------------|
| json | Structured JSON (default, recommended for production) |
| pretty | Human-readable format |

Access Log

HTTP request/response logging:

observability {
    logging {
        access-log {
            enabled #true
            file "/var/log/sentinel/access.log"
            format "json"
            buffer-size 8192
            include-trace-id #true
        }
    }
}

Access Log Options

| Option | Default | Description |
|--------|---------|-------------|
| enabled | true | Enable access logging |
| file | /var/log/sentinel/access.log | Log file path |
| format | json | Log format (json, combined, custom) |
| buffer-size | 8192 | Write buffer size |
| include-trace-id | true | Include trace ID in logs |

Access Log Fields (JSON format)

{
  "timestamp": "2024-01-15T10:30:45.123Z",
  "trace_id": "2Kj8mNpQ3xR",
  "method": "GET",
  "path": "/api/v1/users",
  "status": 200,
  "duration_ms": 45,
  "bytes_sent": 1234,
  "client_ip": "192.168.1.100",
  "user_agent": "Mozilla/5.0...",
  "upstream": "api-backend",
  "upstream_addr": "10.0.1.5:8080",
  "upstream_duration_ms": 42,
  "cache_status": "HIT",
  "route_id": "api-users"
}
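
Structured entries like the one above lend themselves to simple scripted analysis. The following is a minimal sketch, not part of Sentinel: it assumes the access log holds one JSON object per line with the field names shown above, and `summarize` and the sample entries are hypothetical names and data chosen for illustration.

```python
import json
from collections import Counter

def summarize(lines):
    """Count responses per status code and compute the mean proxy overhead.

    Overhead is taken as duration_ms - upstream_duration_ms. Entries that
    lack either field are counted by status but skipped for the overhead.
    """
    statuses = Counter()
    overheads = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        statuses[entry["status"]] += 1
        if "duration_ms" in entry and "upstream_duration_ms" in entry:
            overheads.append(entry["duration_ms"] - entry["upstream_duration_ms"])
    mean_overhead = sum(overheads) / len(overheads) if overheads else None
    return statuses, mean_overhead

# Hypothetical sample entries using a subset of the fields shown above.
sample = [
    '{"status": 200, "duration_ms": 45, "upstream_duration_ms": 42}',
    '{"status": 200, "duration_ms": 12, "upstream_duration_ms": 11}',
    '{"status": 502, "duration_ms": 30, "upstream_duration_ms": 28}',
]
statuses, mean_overhead = summarize(sample)
print(dict(statuses), mean_overhead)
```

In practice the same function could be fed an open file handle on the access log, since iterating over a file yields one line at a time.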

Error Log

Error and warning logging:

observability {
    logging {
        error-log {
            enabled #true
            file "/var/log/sentinel/error.log"
            level "warn"
            buffer-size 8192
        }
    }
}

Error Log Options

| Option | Default | Description |
|--------|---------|-------------|
| enabled | true | Enable error logging |
| file | /var/log/sentinel/error.log | Log file path |
| level | warn | Minimum level to log |
| buffer-size | 8192 | Write buffer size |

Audit Log

Security-focused logging for compliance and forensics:

observability {
    logging {
        audit-log {
            enabled #true
            file "/var/log/sentinel/audit.log"
            buffer-size 8192
            log-blocked #true
            log-agent-decisions #true
            log-waf-events #true
        }
    }
}

Audit Log Options

| Option | Default | Description |
|--------|---------|-------------|
| enabled | true | Enable audit logging |
| file | /var/log/sentinel/audit.log | Log file path |
| buffer-size | 8192 | Write buffer size |
| log-blocked | true | Log blocked requests |
| log-agent-decisions | true | Log agent allow/deny decisions |
| log-waf-events | true | Log WAF rule matches |

Audit Log Events

{
  "timestamp": "2024-01-15T10:30:45.123Z",
  "event_type": "request_blocked",
  "trace_id": "2Kj8mNpQ3xR",
  "client_ip": "192.168.1.100",
  "method": "POST",
  "path": "/api/v1/admin",
  "reason": "rate_limit_exceeded",
  "rule_id": "rate-limit-api",
  "action": "block",
  "metadata": {
    "limit": 100,
    "current": 101
  }
}
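
For forensics it is often useful to group blocked requests by reason. The sketch below is one possible approach, assuming the audit log is written as one JSON object per line with the fields shown above. The function name `blocked_by_reason` and the second sample event (with the placeholder type `other_event`) are hypothetical.

```python
import json
from collections import Counter

def blocked_by_reason(lines):
    """Count request_blocked audit events per reason, ignoring other events."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        if event.get("event_type") == "request_blocked":
            counts[event.get("reason", "unknown")] += 1
    return counts

# First event mirrors the example above; the second is a made-up placeholder.
sample = [
    '{"event_type": "request_blocked", "reason": "rate_limit_exceeded", '
    '"rule_id": "rate-limit-api", "action": "block"}',
    '{"event_type": "other_event", "reason": "not_counted"}',
]
counts = blocked_by_reason(sample)
print(dict(counts))
```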

Metrics

Prometheus-compatible metrics endpoint:

observability {
    metrics {
        enabled #true
        address "0.0.0.0:9090"
        path "/metrics"
        high-cardinality #false
    }
}

Metrics Options

| Option | Default | Description |
|--------|---------|-------------|
| enabled | true | Enable metrics endpoint |
| address | 0.0.0.0:9090 | Metrics server address |
| path | /metrics | Metrics endpoint path |
| high-cardinality | false | Include high-cardinality labels |

Available Metrics

Request Metrics

| Metric | Type | Description |
|--------|------|-------------|
| sentinel_requests_total | Counter | Total requests by route, method, status |
| sentinel_request_duration_seconds | Histogram | Request latency distribution |
| sentinel_request_size_bytes | Histogram | Request body size |
| sentinel_response_size_bytes | Histogram | Response body size |
| sentinel_active_requests | Gauge | Currently active requests |

Upstream Metrics

| Metric | Type | Description |
|--------|------|-------------|
| sentinel_upstream_requests_total | Counter | Requests to upstreams |
| sentinel_upstream_duration_seconds | Histogram | Upstream latency |
| sentinel_upstream_healthy_backends | Gauge | Healthy backends per upstream |
| sentinel_upstream_connections | Gauge | Active upstream connections |
| sentinel_upstream_retries_total | Counter | Retry attempts |

Cache Metrics

| Metric | Type | Description |
|--------|------|-------------|
| sentinel_cache_hits_total | Counter | Cache hits |
| sentinel_cache_misses_total | Counter | Cache misses |
| sentinel_cache_size_bytes | Gauge | Current cache size |
| sentinel_cache_entries | Gauge | Number of cached entries |
| sentinel_cache_evictions_total | Counter | Cache evictions |
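
The hit and miss counters combine into a cache hit ratio, hits / (hits + misses). The sketch below illustrates that calculation on a sample of Prometheus text exposition output. It is a simplified parser written for this example, not a Sentinel or Prometheus API: it assumes sample lines carry no trailing timestamp, and the `route` label and the values are hypothetical.

```python
def sum_metric(exposition_text, name):
    """Sum the values of every series of `name` in Prometheus text format."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token (no timestamps assumed).
        series, _, value = line.rpartition(" ")
        metric_name = series.split("{", 1)[0]
        if metric_name == name:
            total += float(value)
    return total

def cache_hit_ratio(exposition_text):
    """Return hits / (hits + misses), or None if there was no cache traffic."""
    hits = sum_metric(exposition_text, "sentinel_cache_hits_total")
    misses = sum_metric(exposition_text, "sentinel_cache_misses_total")
    lookups = hits + misses
    return hits / lookups if lookups else None

# Hypothetical scrape output; label names and values are illustrative only.
sample = """\
# TYPE sentinel_cache_hits_total counter
sentinel_cache_hits_total{route="api-users"} 900
sentinel_cache_hits_total{route="static"} 300
# TYPE sentinel_cache_misses_total counter
sentinel_cache_misses_total{route="api-users"} 100
sentinel_cache_misses_total{route="static"} 300
"""
ratio = cache_hit_ratio(sample)
print(ratio)
```

Because counters accumulate from process start, this gives a lifetime ratio. For a ratio over a recent window, the same formula is usually applied to rates of the two counters in the monitoring system.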

Rate Limiting Metrics

| Metric | Type | Description |
|--------|------|-------------|
| sentinel_rate_limit_hits_total | Counter | Rate limit triggers |
| sentinel_rate_limit_allowed_total | Counter | Allowed requests |
| sentinel_rate_limit_delayed_total | Counter | Delayed requests |

Agent Metrics

| Metric | Type | Description |
|--------|------|-------------|
| sentinel_agent_requests_total | Counter | Agent call count |
| sentinel_agent_duration_seconds | Histogram | Agent call latency |
| sentinel_agent_errors_total | Counter | Agent errors |
| sentinel_agent_timeouts_total | Counter | Agent timeouts |
| sentinel_agent_circuit_breaker_state | Gauge | Circuit breaker state (0=closed, 1=open, 2=half-open) |
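
Since the circuit breaker gauge encodes its state as a number, dashboards and scripts usually translate it back to a name. A minimal sketch of that mapping, using the 0/1/2 encoding listed above, follows. The names `BreakerState` and `describe_breaker` are hypothetical helpers for this example.

```python
from enum import IntEnum

class BreakerState(IntEnum):
    """Numeric encoding of sentinel_agent_circuit_breaker_state."""
    CLOSED = 0
    OPEN = 1
    HALF_OPEN = 2

def describe_breaker(value):
    """Translate a gauge value into a state name, or 'unknown' if unmapped."""
    try:
        return BreakerState(int(value)).name.lower().replace("_", "-")
    except ValueError:
        return "unknown"

for raw in (0, 1.0, 2, 7):
    print(raw, describe_breaker(raw))
```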

Connection Metrics

| Metric | Type | Description |
|--------|------|-------------|
| sentinel_connections_total | Counter | Total connections |
| sentinel_active_connections | Gauge | Current connections |
| sentinel_connection_duration_seconds | Histogram | Connection lifetime |
| sentinel_tls_handshake_duration_seconds | Histogram | TLS handshake time |

Prometheus Scrape Config

scrape_configs:
  - job_name: 'sentinel'
    static_configs:
      - targets: ['sentinel:9090']
    scrape_interval: 15s
    metrics_path: /metrics

Distributed Tracing

OpenTelemetry-compatible distributed tracing:

observability {
    tracing {
        backend "otlp" {
            endpoint "http://jaeger:4317"
        }
        sampling-rate 0.01     // 1% of requests
        service-name "sentinel"
    }
}

Tracing Backends

OTLP (OpenTelemetry Protocol)

tracing {
    backend "otlp" {
        endpoint "http://otel-collector:4317"
    }
}

Jaeger

tracing {
    backend "jaeger" {
        endpoint "http://jaeger:14268/api/traces"
    }
}

Zipkin

tracing {
    backend "zipkin" {
        endpoint "http://zipkin:9411/api/v2/spans"
    }
}

Tracing Options

| Option | Default | Description |
|--------|---------|-------------|
| sampling-rate | 0.01 | Fraction of requests to trace (0.0-1.0) |
| service-name | sentinel | Service name in traces |
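
To make the meaning of sampling-rate concrete, the sketch below shows one common way a fractional sampling decision can be made: hash the trace ID into the range [0, 1) and trace the request when the result falls below the configured rate. This illustrates the general technique only and is not a description of Sentinel's sampler. The function name `should_sample` is hypothetical.

```python
import hashlib

def should_sample(trace_id, sampling_rate):
    """Decide deterministically whether a trace ID falls inside the sample.

    The same trace ID always yields the same decision, so every component
    applying this rule to a given request would agree on whether to trace it.
    """
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sampling_rate

# With a rate of 0.01, roughly 1 in 100 IDs should be selected.
ids = [f"trace-{i}" for i in range(20000)]
sampled = sum(should_sample(t, 0.01) for t in ids)
print(f"sampled {sampled} of {len(ids)}")
```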

Trace Context

Sentinel propagates the W3C Trace Context headers, along with a header carrying its own trace ID:

| Header | Description |
|--------|-------------|
| traceparent | W3C trace context parent |
| tracestate | W3C trace context state |
| X-Request-ID | Sentinel’s trace ID |
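
For services behind Sentinel that need to read the propagated context, the traceparent header defined by the W3C Trace Context specification has four dash-separated hex fields: version, trace ID, parent ID, and flags. The following is a minimal sketch of a parser for the version 00 layout. `TraceParent` and `parse_traceparent` are hypothetical names, and the sample header is the example value from the W3C specification.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)

class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool

def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Parse a traceparent header value; return None if it is invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of the flags field is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))

ctx = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(ctx)
```

This sketch rejects anything that does not match the four-field layout exactly, which is stricter than the specification's forward-compatibility rules for future versions.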

Span Attributes

Each request span includes:

| Attribute | Description |
|-----------|-------------|
| http.method | HTTP method |
| http.url | Request URL |
| http.status_code | Response status |
| http.route | Route ID |
| sentinel.upstream | Upstream name |
| sentinel.cache_status | Cache hit/miss |

Trace ID Format

Configure the format for request trace IDs:

server {
    trace-id-format "tinyflake"   // or "uuid"
}

| Format | Example | Description |
|--------|---------|-------------|
| tinyflake | 2Kj8mNpQ3xR | 11-char Base58, operator-friendly (default) |
| uuid | 550e8400-e29b-41d4-a716-446655440000 | 36-char UUID v4 |
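
Tools that consume Sentinel logs may need to tell the two ID formats apart. The sketch below classifies an ID by shape alone, following the table above. It assumes tinyflake IDs use the Bitcoin-style Base58 alphabet (no `0`, `O`, `I`, or `l`), which is an assumption rather than something stated here, and `classify_trace_id` is a hypothetical helper name.

```python
import re
import uuid

# Assumed alphabet: Bitcoin-style Base58, which omits 0, O, I and l.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
_TINYFLAKE = re.compile("[%s]{11}" % BASE58_ALPHABET)

def classify_trace_id(value):
    """Return 'tinyflake', 'uuid', or 'unknown' based on the ID's shape."""
    if _TINYFLAKE.fullmatch(value):
        return "tinyflake"
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return "unknown"
    # Accept only the canonical 36-character hyphenated form of a v4 UUID.
    if str(parsed) == value.lower() and parsed.version == 4:
        return "uuid"
    return "unknown"

for candidate in ("2Kj8mNpQ3xR", "550e8400-e29b-41d4-a716-446655440000", "not-an-id"):
    print(candidate, classify_trace_id(candidate))
```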

Complete Example

server {
    worker-threads 0
    trace-id-format "tinyflake"
}

observability {
    logging {
        level "info"
        format "json"

        access-log {
            enabled #true
            file "/var/log/sentinel/access.log"
            format "json"
            include-trace-id #true
        }

        error-log {
            enabled #true
            file "/var/log/sentinel/error.log"
            level "warn"
        }

        audit-log {
            enabled #true
            file "/var/log/sentinel/audit.log"
            log-blocked #true
            log-agent-decisions #true
        }
    }

    metrics {
        enabled #true
        address "0.0.0.0:9090"
        path "/metrics"
    }

    tracing {
        backend "otlp" {
            endpoint "http://otel-collector:4317"
        }
        sampling-rate 0.05
        service-name "sentinel-prod"
    }
}

Log Rotation

Sentinel logs are designed for external rotation. Use logrotate or similar:

/var/log/sentinel/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}

Best Practices

  1. Use JSON logging in production: Enables log aggregation and analysis
  2. Set appropriate log levels: info for production, debug for troubleshooting
  3. Enable audit logging: Often required for security compliance, and useful for forensics
  4. Configure sampling for tracing: 1-5% is typical for production
  5. Use separate log files: Easier rotation and analysis
  6. Monitor metrics endpoints: Set up alerting on error rates and latencies

Next Steps