Upstreams

The upstreams block defines backend server pools. Each upstream contains one or more targets and supports load balancing, health checks, and connection pooling.

Basic Configuration

upstreams {
    upstream "backend" {
        targets {
            target { address "10.0.1.1:8080" }
            target { address "10.0.1.2:8080" }
            target { address "10.0.1.3:8080" }
        }
        load-balancing "round_robin"
    }
}

Targets

Target Definition

targets {
    target {
        address "10.0.1.1:8080"
        weight 3
        max-requests 1000
    }
    target {
        address "10.0.1.2:8080"
        weight 2
    }
    target {
        address "10.0.1.3:8080"
        weight 1
    }
}

Option       | Default  | Description
address      | Required | Target address (host:port)
weight       | 1        | Weight for weighted load balancing
max-requests | None     | Maximum concurrent requests to this target

Target Metadata

target {
    address "10.0.1.1:8080"
    metadata {
        "zone" "us-east-1a"
        "version" "v2.1.0"
    }
}

Metadata is available for custom load balancing decisions and observability.

Load Balancing

upstream "backend" {
    load-balancing "round_robin"
}

Algorithms

Algorithm            | Description
round_robin          | Sequential rotation through targets (default)
least_connections    | Route to target with fewest active connections
weighted_least_conn  | Weighted least connections (connection/weight ratio)
random               | Random target selection
ip_hash              | Consistent routing based on client IP
weighted             | Weighted random selection
consistent_hash      | Consistent hashing for cache-friendly routing
maglev               | Google's Maglev consistent hashing (minimal disruption)
power_of_two_choices | Pick best of two random targets
adaptive             | Dynamic selection based on response times
peak_ewma            | Latency-based selection using exponential moving average
locality_aware       | Zone-aware routing with fallback strategies
deterministic_subset | Subset of backends per proxy (large clusters)
least_tokens_queued  | Token-based selection for LLM workloads

Round Robin

upstream "backend" {
    load-balancing "round_robin"
}

Simple sequential rotation. Good for homogeneous backends.
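
As a rough illustration, the rotation amounts to a shared counter advanced on every request (illustrative Rust, not Sentinel's implementation):

use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative only: a shared counter advanced on every request.
struct RoundRobin {
    counter: AtomicUsize,
}

impl RoundRobin {
    fn new() -> Self {
        RoundRobin { counter: AtomicUsize::new(0) }
    }

    fn pick<'a>(&self, targets: &'a [String]) -> Option<&'a String> {
        if targets.is_empty() {
            return None;
        }
        // Modulo wraps the counter around the target list.
        let i = self.counter.fetch_add(1, Ordering::Relaxed) % targets.len();
        targets.get(i)
    }
}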

Weighted

system {
    worker-threads 0
}

listeners {
    listener "http" {
        address "0.0.0.0:8080"
        protocol "http"
    }
}

routes {
    route "default" {
        matches { path-prefix "/" }
        upstream "backend"
    }
}

upstreams {
    upstream "backend" {
        targets {
            target { address "10.0.1.1:8080" weight=3 }
            target { address "10.0.1.2:8080" weight=2 }
            target { address "10.0.1.3:8080" weight=1 }
        }
        load-balancing "weighted"
    }
}

Traffic distributed proportionally to weights. Use for:

  • Different server capacities
  • Gradual rollouts
  • A/B testing

Least Connections

upstream "backend" {
    load-balancing "least_connections"
}

Routes to the target with the fewest active connections. Best for:

  • Varying request durations
  • Long-running connections
  • Heterogeneous workloads

IP Hash

upstream "backend" {
    load-balancing "ip_hash"
}

Consistent routing based on client IP. Provides session affinity without cookies.

Note: Clients behind a shared NAT will all route to the same target.
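
The routing idea reduces to hashing the client IP onto the target list. The sketch below is illustrative Rust and ignores Sentinel's actual hash function and unhealthy-target handling:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;

// Illustrative: the same client IP always maps to the same index
// while the target list is unchanged.
fn pick_by_ip(client_ip: IpAddr, targets: &[String]) -> Option<&String> {
    if targets.is_empty() {
        return None;
    }
    let mut hasher = DefaultHasher::new();
    client_ip.hash(&mut hasher);
    let i = (hasher.finish() as usize) % targets.len();
    targets.get(i)
}

A plain modulo like this remaps many clients whenever the target list changes; consistent_hash and maglev below avoid most of that remapping.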

Consistent Hash

upstream "backend" {
    load-balancing "consistent_hash"
}

Consistent hashing minimizes redistribution when targets are added/removed. Ideal for:

  • Caching layers
  • Stateful backends
  • Maintaining locality
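
A minimal hash-ring sketch, assuming virtual nodes per target to even out distribution (illustrative Rust; Sentinel's ring layout and hash function may differ):

use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn hash_of<T: Hash>(value: T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

// Hash ring: each target owns several points ("virtual nodes") on the ring.
struct Ring {
    points: BTreeMap<u64, String>, // hash position -> target address
}

impl Ring {
    fn new(targets: &[String], vnodes: usize) -> Self {
        let mut points = BTreeMap::new();
        for t in targets {
            for v in 0..vnodes {
                points.insert(hash_of((t, v)), t.clone());
            }
        }
        Ring { points }
    }

    // Walk clockwise from the key's hash to the first target point,
    // wrapping around to the start of the ring if necessary.
    fn pick(&self, key: &str) -> Option<&String> {
        let h = hash_of(key);
        self.points
            .range(h..)
            .next()
            .or_else(|| self.points.iter().next())
            .map(|(_, t)| t)
    }
}

Because only the points owned by an added or removed target move, most keys keep their existing assignment when the pool changes.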

Power of Two Choices

upstream "backend" {
    load-balancing "power_of_two_choices"
}

Randomly selects two targets and routes to the one with fewer active connections (sketched below). Provides:

  • Near-optimal load distribution
  • O(1) selection time
  • Better than pure random
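
A minimal sketch of the selection step; the two random indices are drawn by the caller so the example stays dependency-free:

struct Target {
    address: String,
    active_connections: usize,
}

// Illustrative only. `i` and `j` are two indices drawn uniformly at random
// by the caller (any random source works).
fn pick_p2c(targets: &[Target], i: usize, j: usize) -> Option<&Target> {
    let a = targets.get(i % targets.len().max(1))?;
    let b = targets.get(j % targets.len().max(1))?;
    // Keep whichever candidate currently has fewer active connections.
    Some(if a.active_connections <= b.active_connections { a } else { b })
}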

Adaptive

upstream "backend" {
    load-balancing "adaptive"
}

Dynamically adjusts routing based on observed response times and error rates. The adaptive balancer continuously learns from request outcomes and automatically routes traffic away from slow or failing backends.

How It Works

  1. Weight Adjustment: Each target starts with its configured weight. The balancer adjusts effective weights based on performance:

    • Targets with high error rates have their weights reduced
    • Targets with high latency have their weights reduced
    • Healthy, fast targets recover their weights over time
  2. EWMA Smoothing: Error rates and latencies use Exponentially Weighted Moving Averages to smooth out transient spikes and focus on sustained trends.

  3. Circuit Breaker Integration: Targets with consecutive failures are temporarily removed from rotation, then gradually reintroduced.

  4. Latency Feedback: Every request reports its latency back to the balancer, enabling real-time performance awareness.

Selection Algorithm

For each request, the adaptive balancer:

  1. Calculates a score for each healthy target: score = weight / (1 + connections + error_penalty + latency_penalty)
  2. Uses weighted random selection based on scores
  3. Targets with better performance get proportionally more traffic
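
The scoring and weighted-random pick above can be sketched as follows; how the penalties are derived from the EWMAs is simplified here, and the uniform random roll is supplied by the caller:

struct TargetStats {
    weight: f64,
    active_connections: f64,
    error_penalty: f64,   // derived from the EWMA error rate
    latency_penalty: f64, // derived from the EWMA latency vs. threshold
}

// score = weight / (1 + connections + error_penalty + latency_penalty)
fn score(t: &TargetStats) -> f64 {
    t.weight / (1.0 + t.active_connections + t.error_penalty + t.latency_penalty)
}

// Weighted random pick over scores. `roll` is a uniform random number in
// [0, 1) drawn by the caller, so the sketch stays dependency-free.
fn pick_adaptive(targets: &[TargetStats], roll: f64) -> Option<usize> {
    let total: f64 = targets.iter().map(score).sum();
    if total <= 0.0 {
        return None;
    }
    let mut threshold = roll * total;
    for (i, t) in targets.iter().enumerate() {
        threshold -= score(t);
        if threshold <= 0.0 {
            return Some(i);
        }
    }
    Some(targets.len() - 1) // guard against floating-point rounding
}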

Default Thresholds

Parameter           | Default | Description
Error threshold     | 5%      | Error rate that triggers weight penalty
Latency threshold   | 500ms   | p99 latency that triggers penalty
Min weight ratio    | 10%     | Minimum weight (fraction of original)
Max weight ratio    | 200%    | Maximum weight (fraction of original)
Adjustment interval | 10s     | How often weights are recalculated
Min requests        | 100     | Minimum requests before adjusting

When to Use Adaptive

Best for:

  • Heterogeneous backends with varying performance
  • Services with unpredictable load patterns
  • Environments where backend health fluctuates
  • Gradual degradation scenarios

Consider alternatives when:

  • All backends have identical performance (use round_robin)
  • Session affinity is required (use ip_hash or consistent_hash)
  • You need deterministic routing (use weighted)

Example: API with Variable Backend Performance

upstream "api" {
    targets {
        target { address "api-1.internal:8080" weight=100 }
        target { address "api-2.internal:8080" weight=100 }
        target { address "api-3.internal:8080" weight=100 }
    }
    load-balancing "adaptive"
    health-check {
        type "http" {
            path "/health"
            expected-status 200
        }
        interval-secs 5
        unhealthy-threshold 3
    }
}

If api-2 starts responding slowly, traffic automatically shifts to api-1 and api-3. When api-2 recovers, it gradually receives more traffic again.

Maglev

upstream "cache-cluster" {
    load-balancing "maglev"
}

Google’s Maglev consistent hashing algorithm provides O(1) lookup with minimal disruption when backends are added or removed. Uses a permutation-based lookup table for fast, consistent routing.

How It Works

  1. Lookup Table: Builds a 65,537-entry lookup table mapping hash values to backends
  2. Permutation Sequences: Each backend generates a unique permutation for table population
  3. Minimal Disruption: When backends change, only ~1/N keys are remapped (N = number of backends)
  4. Hash Key Sources: Can hash on client IP, header value, cookie, or request path
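
A compact sketch of the table build and lookup, with the (offset, skip) permutation parameters derived from a generic hasher rather than the paper's two hash functions (illustrative Rust only):

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const TABLE_SIZE: usize = 65_537; // prime table size, as described above

fn hash_of<T: Hash>(value: T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

// Build the lookup table: each backend walks its own permutation of slots
// and claims the next free one, round-robin, until every slot is owned.
fn build_table(backends: &[String]) -> Vec<usize> {
    assert!(!backends.is_empty());
    let n = backends.len();
    // (offset, skip) per backend; skip in 1..TABLE_SIZE keeps the permutation
    // covering every slot because TABLE_SIZE is prime.
    let params: Vec<(usize, usize)> = backends
        .iter()
        .map(|b| {
            let offset = hash_of((b, "offset")) as usize % TABLE_SIZE;
            let skip = hash_of((b, "skip")) as usize % (TABLE_SIZE - 1) + 1;
            (offset, skip)
        })
        .collect();
    let mut table = vec![usize::MAX; TABLE_SIZE];
    let mut next = vec![0usize; n];
    let mut filled = 0;
    while filled < TABLE_SIZE {
        for i in 0..n {
            // Advance backend i along its permutation to its next free slot.
            loop {
                let slot = (params[i].0 + next[i] * params[i].1) % TABLE_SIZE;
                next[i] += 1;
                if table[slot] == usize::MAX {
                    table[slot] = i;
                    filled += 1;
                    break;
                }
            }
            if filled == TABLE_SIZE {
                break;
            }
        }
    }
    table
}

// O(1) lookup: hash the key (client IP, header, cookie, or path) into a slot.
fn pick<'a>(table: &[usize], backends: &'a [String], key: &str) -> &'a String {
    &backends[table[hash_of(key) as usize % TABLE_SIZE]]
}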

When to Use Maglev

Best for:

  • Large cache clusters requiring consistent routing
  • Services where session affinity matters
  • Minimizing cache invalidation during scaling
  • High-throughput systems needing O(1) selection

Comparison with consistent_hash:

  • Maglev: Better load distribution, O(1) lookup, more memory
  • Consistent Hash: Ring-based, O(log N) lookup, less memory

Peak EWMA

upstream "api" {
    load-balancing "peak_ewma"
}

Twitter Finagle’s Peak EWMA (Exponentially Weighted Moving Average) algorithm tracks latency and selects backends with the lowest predicted completion time.

How It Works

  1. EWMA Tracking: Maintains exponentially weighted moving average of each backend’s latency
  2. Peak Detection: Uses the maximum of EWMA and recent latency to quickly detect spikes
  3. Load Penalty: Penalizes backends with active connections
  4. Decay Time: Old latency observations decay over time (default: 10 seconds)

Selection Algorithm

For each request, Peak EWMA:

  1. Calculates load_score = peak_latency × (1 + active_connections × penalty)
  2. Selects the backend with the lowest load score
  3. Reports actual latency after request completes
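
A minimal sketch of the scoring and pick, with the EWMA update shown alongside (illustrative Rust; the decay handling in Sentinel is more involved):

struct Backend {
    peak_latency_ms: f64,    // max(EWMA latency, most recent observation)
    active_connections: f64,
}

// EWMA update applied after each request; `alpha` controls the decay rate.
fn update_ewma(current: f64, sample_ms: f64, alpha: f64) -> f64 {
    alpha * sample_ms + (1.0 - alpha) * current
}

// load_score = peak_latency × (1 + active_connections × penalty)
fn load_score(b: &Backend, penalty: f64) -> f64 {
    b.peak_latency_ms * (1.0 + b.active_connections * penalty)
}

// Pick the backend with the lowest predicted completion cost.
fn pick(backends: &[Backend], penalty: f64) -> Option<usize> {
    backends
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| {
            load_score(a, penalty)
                .partial_cmp(&load_score(b, penalty))
                .unwrap_or(std::cmp::Ordering::Equal)
        })
        .map(|(i, _)| i)
}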

When to Use Peak EWMA

Best for:

  • Heterogeneous backends with varying performance
  • Latency-sensitive applications
  • Backends with unpredictable response times
  • Services where slow backends should be avoided

Consider alternatives when:

  • All backends have identical performance (use round_robin)
  • Session affinity is required (use maglev or consistent_hash)

Locality-Aware

upstream "global-api" {
    load-balancing "locality_aware"
}

Prefers targets in the same zone or region as the proxy, falling back to other zones when local targets are unavailable.

Zone Configuration

Zones can be specified in target metadata or parsed from addresses:

targets {
    target {
        address "10.0.1.1:8080"
        metadata { "zone" "us-east-1a" }
    }
    target {
        address "10.0.1.2:8080"
        metadata { "zone" "us-east-1b" }
    }
    target {
        address "10.0.2.1:8080"
        metadata { "zone" "us-west-2a" }
    }
}

Fallback Strategies

When no local targets are healthy:

Strategy    | Behavior
round_robin | Round-robin across all healthy targets (default)
random      | Random selection from all healthy targets
fail_local  | Return error if no local targets available
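
Putting zone preference and fallback together, a minimal sketch (illustrative Rust; assumes the zone comes from target metadata):

struct Target {
    address: String,
    zone: Option<String>, // from metadata, e.g. "us-east-1a"
    healthy: bool,
}

// Keep healthy same-zone targets if any exist, otherwise widen to all
// healthy targets; the configured fallback strategy then picks among the
// returned slice. A fail_local strategy would return an empty list instead
// of widening.
fn candidates<'a>(targets: &'a [Target], local_zone: &str) -> Vec<&'a Target> {
    let local: Vec<&Target> = targets
        .iter()
        .filter(|t| t.healthy && t.zone.as_deref() == Some(local_zone))
        .collect();
    if !local.is_empty() {
        return local;
    }
    targets.iter().filter(|t| t.healthy).collect()
}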

When to Use Locality-Aware

Best for:

  • Multi-region deployments
  • Minimizing cross-zone latency
  • Reducing data transfer costs
  • Geographic data residency requirements

Deterministic Subsetting

upstream "large-cluster" {
    load-balancing "deterministic_subset"
}

For very large clusters (1000+ backends), limits each proxy instance to a deterministic subset of backends, reducing connection overhead while ensuring even distribution.

How It Works

  1. Subset Selection: Each proxy uses a consistent hash to select its subset
  2. Deterministic: Same proxy ID always selects the same subset
  3. Even Distribution: Across all proxies, each backend receives roughly equal traffic
  4. Subset Size: Default 10 backends per proxy (configurable)
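
A sketch of the subset computation: rank backends by a hash keyed on the proxy ID and take the first subset-size entries (illustrative Rust, not Sentinel's exact permutation):

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// The same proxy ID always produces the same subset; different proxy IDs
// produce different (roughly evenly spread) subsets.
fn subset(backends: &[String], proxy_id: &str, subset_size: usize) -> Vec<String> {
    let mut ranked: Vec<(u64, &String)> = backends
        .iter()
        .map(|b| {
            let mut h = DefaultHasher::new();
            (proxy_id, b).hash(&mut h);
            (h.finish(), b)
        })
        .collect();
    // Sorting by the per-proxy hash acts as a deterministic pseudo-shuffle.
    ranked.sort_by_key(|(rank, _)| *rank);
    ranked
        .into_iter()
        .take(subset_size)
        .map(|(_, b)| b.clone())
        .collect()
}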

When to Use Deterministic Subsetting

Best for:

  • Very large backend pools (1000+ targets)
  • Reducing connection overhead
  • Limiting memory usage per proxy
  • Services where full-mesh connectivity is impractical

Trade-offs:

  • Each proxy only sees a subset of backends
  • Subset changes when proxy restarts with different ID
  • Less effective with small backend pools

Weighted Least Connections

upstream "mixed-capacity" {
    load-balancing "weighted_least_conn"
}

Combines weight with connection counting. Selects the backend with the lowest ratio of active connections to weight.

Selection Algorithm

score = active_connections / weight

A backend with weight 200 and 10 connections (score: 0.05) is preferred over a backend with weight 100 and 6 connections (score: 0.06).
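
The selection reduces to picking the minimum connections/weight ratio (illustrative Rust):

struct Target {
    address: String,
    weight: f64,
    active_connections: f64,
}

// Lowest active_connections / weight wins.
fn pick_weighted_least_conn(targets: &[Target]) -> Option<&Target> {
    targets.iter().min_by(|a, b| {
        let sa = a.active_connections / a.weight;
        let sb = b.active_connections / b.weight;
        sa.partial_cmp(&sb).unwrap_or(std::cmp::Ordering::Equal)
    })
}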

Example: Mixed Capacity Backends

system {
    worker-threads 0
}

listeners {
    listener "http" {
        address "0.0.0.0:8080"
        protocol "http"
    }
}

routes {
    route "default" {
        matches { path-prefix "/" }
        upstream "mixed-capacity"
    }
}

upstreams {
    // Large server can handle 2x traffic, medium is standard, small is half capacity
    upstream "mixed-capacity" {
        targets {
            target { address "large-server:8080" weight=200 }
            target { address "medium-server:8080" weight=100 }
            target { address "small-server:8080" weight=50 }
        }
        load-balancing "weighted_least_conn"
    }
}

When to Use Weighted Least Connections

Best for:

  • Heterogeneous backend capacities
  • Mixed old/new hardware
  • Gradual capacity scaling
  • Long-running requests with varying backend power

Comparison with least_connections:

  • least_connections: Ignores weight, pure connection count
  • weighted_least_conn: Accounts for backend capacity via weight

Least Tokens Queued

upstream "llm-backend" {
    load-balancing "least_tokens_queued"
}

Specialized algorithm for LLM/inference workloads. Selects the backend with the fewest estimated tokens currently being processed.

How It Works

  1. Token Estimation: Parses request body to estimate input tokens
  2. Queue Tracking: Tracks estimated tokens queued per backend
  3. Selection: Routes to backend with lowest token queue
  4. Completion Tracking: Updates queue when requests complete
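
A rough sketch of the queue tracking; the token estimator here is a crude characters-per-token heuristic standing in for real request-body parsing:

use std::collections::HashMap;
use std::sync::Mutex;

// Illustrative tracker of estimated tokens in flight per backend.
struct TokenQueues {
    queued: Mutex<HashMap<String, u64>>, // backend address -> estimated tokens
}

impl TokenQueues {
    // Rough input-token estimate: ~4 characters per token (heuristic only).
    fn estimate_tokens(body: &str) -> u64 {
        (body.len() as u64 / 4).max(1)
    }

    // Route to the backend with the fewest estimated queued tokens.
    fn pick(&self, backends: &[String], body: &str) -> Option<String> {
        let mut queued = self.queued.lock().unwrap();
        let chosen = backends
            .iter()
            .min_by_key(|b| *queued.get(*b).unwrap_or(&0))?
            .clone();
        // Account for this request until it completes.
        *queued.entry(chosen.clone()).or_insert(0) += Self::estimate_tokens(body);
        Some(chosen)
    }

    // Called when the request finishes to release its estimated tokens.
    fn complete(&self, backend: &str, body: &str) {
        let mut queued = self.queued.lock().unwrap();
        if let Some(v) = queued.get_mut(backend) {
            *v = v.saturating_sub(Self::estimate_tokens(body));
        }
    }
}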

When to Use Least Tokens Queued

Best for:

  • LLM inference backends (OpenAI-compatible APIs)
  • Services where request cost varies by input size
  • GPU-bound workloads with token-based processing
  • Balancing across heterogeneous inference hardware

Health Checks

HTTP Health Check

upstream "backend" {
    health-check {
        type "http" {
            path "/health"
            expected-status 200
            host "backend.internal"  // Optional Host header
        }
        interval-secs 10
        timeout-secs 5
        healthy-threshold 2
        unhealthy-threshold 3
    }
}

Option              | Default  | Description
type                | Required | Check type (http, tcp, grpc)
interval-secs       | 10       | Time between checks
timeout-secs        | 5        | Check timeout
healthy-threshold   | 2        | Successes to mark healthy
unhealthy-threshold | 3        | Failures to mark unhealthy

TCP Health Check

upstream "database" {
    health-check {
        type "tcp"
        interval-secs 5
        timeout-secs 2
    }
}

Simple TCP connection check. Use for non-HTTP services.

gRPC Health Check

upstream "grpc-service" {
    health-check {
        type "grpc" {
            service "my.service.Name"
        }
        interval-secs 10
        timeout-secs 5
    }
}

Uses the standard gRPC Health Checking Protocol (grpc.health.v1.Health).

Service Name

The service field specifies which service to check:

  • Empty string "": Checks overall server health
  • Service name: Checks health of a specific service (e.g., "my.package.MyService")

// Check overall server health
type "grpc" {
    service ""
}

// Check specific service
type "grpc" {
    service "myapp.UserService"
}

Response Handling

Status             | Result
SERVING            | Healthy
NOT_SERVING        | Unhealthy
UNKNOWN            | Unhealthy
SERVICE_UNKNOWN    | Unhealthy
Connection failure | Unhealthy

Example: gRPC Microservices

upstream "user-service" {
    targets {
        target { address "user-svc-1.internal:50051" }
        target { address "user-svc-2.internal:50051" }
    }
    load-balancing "least_connections"
    health-check {
        type "grpc" {
            service "user.UserService"
        }
        interval-secs 5
        timeout-secs 3
        healthy-threshold 2
        unhealthy-threshold 2
    }
}

Health Check Behavior

When a target fails health checks:

  1. Target marked unhealthy after unhealthy-threshold failures
  2. Traffic stops routing to unhealthy target
  3. Health checks continue at interval-secs
  4. Target marked healthy after healthy-threshold successes
  5. Traffic resumes to recovered target
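
The threshold logic can be sketched as a small state machine driven by consecutive successes and failures (illustrative Rust):

// Illustrative per-target health state with the thresholds described above.
struct HealthState {
    healthy: bool,
    consecutive_successes: u32,
    consecutive_failures: u32,
}

impl HealthState {
    fn record(&mut self, check_passed: bool, healthy_threshold: u32, unhealthy_threshold: u32) {
        if check_passed {
            self.consecutive_successes += 1;
            self.consecutive_failures = 0;
            // Flips back to healthy only after enough consecutive successes.
            if !self.healthy && self.consecutive_successes >= healthy_threshold {
                self.healthy = true;
            }
        } else {
            self.consecutive_failures += 1;
            self.consecutive_successes = 0;
            // Flips to unhealthy only after enough consecutive failures.
            if self.healthy && self.consecutive_failures >= unhealthy_threshold {
                self.healthy = false;
            }
        }
    }
}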

Connection Pool

upstream "backend" {
    connection-pool {
        max-connections 100
        max-idle 20
        idle-timeout-secs 60
        max-lifetime-secs 3600
    }
}

Option            | Default | Description
max-connections   | 100     | Maximum connections per target
max-idle          | 20      | Maximum idle connections to keep
idle-timeout-secs | 60      | Close idle connections after this many seconds
max-lifetime-secs | None    | Maximum connection lifetime

Connection Pool Sizing

Scenario               | max-connections | max-idle
Low traffic            | 20-50           | 5-10
Medium traffic         | 100             | 20
High traffic           | 500+            | 50+
Long-lived connections | 50              | 10

Guidelines:

  • max-connections = expected peak RPS × average request duration
  • max-idle = 20-30% of max-connections
  • Set max-lifetime-secs if backends have connection limits
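
For example, under the first guideline a backend pool expected to peak around 500 requests per second with a 200 ms average request duration needs on the order of 500 × 0.2 = 100 connections, with max-idle around 20-30. These numbers are illustrative, not prescriptive.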

Timeouts

upstream "backend" {
    timeouts {
        connect-secs 10
        request-secs 60
        read-secs 30
        write-secs 30
    }
}

Timeout      | Default | Description
connect-secs | 10      | TCP connection timeout
request-secs | 60      | Total request timeout
read-secs    | 30      | Read timeout (response)
write-secs   | 30      | Write timeout (request body)

Timeout Recommendations

Service Type | connect | request | read | write
Fast API     | 5       | 30      | 15   | 15
Standard API | 10      | 60      | 30   | 30
Slow/batch   | 10      | 300     | 120  | 60
File upload  | 10      | 600     | 30   | 300

Upstream TLS

Basic TLS to Upstream

upstream "secure-backend" {
    targets {
        target { address "backend.internal:443" }
    }
    tls {
        sni "backend.internal"
    }
}

mTLS to Upstream

upstream "mtls-backend" {
    targets {
        target { address "secure.internal:443" }
    }
    tls {
        sni "secure.internal"
        client-cert "/etc/sentinel/certs/client.crt"
        client-key "/etc/sentinel/certs/client.key"
        ca-cert "/etc/sentinel/certs/backend-ca.crt"
    }
}

TLS Options

Option               | Description
sni                  | Server Name Indication hostname
client-cert          | Client certificate for mTLS
client-key           | Client private key for mTLS
ca-cert              | CA certificate to verify upstream
insecure-skip-verify | Skip certificate verification (testing only)

Warning: Never use insecure-skip-verify in production.

Complete Examples

Multi-tier Application

upstreams {
    // Web tier
    upstream "web" {
        targets {
            target { address "web-1.internal:8080" weight=2 }
            target { address "web-2.internal:8080" weight=2 }
            target { address "web-3.internal:8080" weight=1 }
        }
        load-balancing "weighted"
        health-check {
            type "http" {
                path "/health"
                expected-status 200
            }
            interval-secs 10
            timeout-secs 5
        }
        connection-pool {
            max-connections 200
            max-idle 50
        }
    }

    // API tier
    upstream "api" {
        targets {
            target { address "api-1.internal:8080" }
            target { address "api-2.internal:8080" }
        }
        load-balancing "least_connections"
        health-check {
            type "http" {
                path "/api/health"
                expected-status 200
            }
            interval-secs 5
            unhealthy-threshold 2
        }
        timeouts {
            connect-secs 5
            request-secs 30
        }
    }

    // Cache tier
    upstream "cache" {
        targets {
            target { address "cache-1.internal:6379" }
            target { address "cache-2.internal:6379" }
            target { address "cache-3.internal:6379" }
        }
        load-balancing "consistent_hash"
        health-check {
            type "tcp"
            interval-secs 5
            timeout-secs 2
        }
    }
}

Blue-Green Deployment

upstreams {
    // Blue environment (current)
    upstream "api-blue" {
        targets {
            target { address "api-blue-1.internal:8080" }
            target { address "api-blue-2.internal:8080" }
        }
    }

    // Green environment (new version)
    upstream "api-green" {
        targets {
            target { address "api-green-1.internal:8080" }
            target { address "api-green-2.internal:8080" }
        }
    }

    // Canary routing (90% blue, 10% green)
    upstream "api-canary" {
        targets {
            target { address "api-blue-1.internal:8080" weight=45 }
            target { address "api-blue-2.internal:8080" weight=45 }
            target { address "api-green-1.internal:8080" weight=5 }
            target { address "api-green-2.internal:8080" weight=5 }
        }
        load-balancing "weighted"
    }
}

Secure Internal Service

upstreams {
    upstream "payment-service" {
        targets {
            target { address "payment.internal:443" }
        }
        tls {
            sni "payment.internal"
            client-cert "/etc/sentinel/certs/sentinel-client.crt"
            client-key "/etc/sentinel/certs/sentinel-client.key"
            ca-cert "/etc/sentinel/certs/internal-ca.crt"
        }
        health-check {
            type "http" {
                path "/health"
                expected-status 200
                host "payment.internal"
            }
            interval-secs 10
        }
        timeouts {
            connect-secs 5
            request-secs 30
        }
        connection-pool {
            max-connections 50
            max-idle 10
            max-lifetime-secs 300
        }
    }
}

Default Values

Setting                           | Default
load-balancing                    | round_robin
target.weight                     | 1
health-check.interval-secs        | 10
health-check.timeout-secs         | 5
health-check.healthy-threshold    | 2
health-check.unhealthy-threshold  | 3
connection-pool.max-connections   | 100
connection-pool.max-idle          | 20
connection-pool.idle-timeout-secs | 60
timeouts.connect-secs             | 10
timeouts.request-secs             | 60
timeouts.read-secs                | 30
timeouts.write-secs               | 30

Service Discovery

Instead of static targets, upstreams can discover backends dynamically from external sources.

DNS Discovery

upstream "api" {
    discovery "dns" {
        hostname "api.internal.example.com"
        port 8080
        refresh-interval 30
    }
}

Resolves A/AAAA records and uses all IPs as targets.

Option           | Default  | Description
hostname         | Required | DNS name to resolve
port             | Required | Port for all discovered backends
refresh-interval | 30       | Seconds between DNS lookups

Consul Discovery

system {
    worker-threads 0
}

listeners {
    listener "http" {
        address "0.0.0.0:8080"
        protocol "http"
    }
}

routes {
    route "default" {
        matches { path-prefix "/" }
        upstream "backend"
    }
}

upstreams {
    upstream "backend" {
        discovery "consul" {
            address "http://consul.internal:8500"
            service "backend-api"
            datacenter "dc1"
            only-passing #true
            refresh-interval 10
            tag "production"
        }
    }
}

Discovers backends from Consul’s service catalog.

Option           | Default  | Description
address          | Required | Consul HTTP API address
service          | Required | Service name in Consul
datacenter       | None     | Consul datacenter
only-passing     | true     | Only return healthy services
refresh-interval | 10       | Seconds between queries
tag              | None     | Filter by service tag

Kubernetes Discovery

Discover backends from Kubernetes Endpoints. Supports both in-cluster and kubeconfig authentication.

In-Cluster Configuration

When running inside Kubernetes, Sentinel automatically uses the pod’s service account:

upstream "k8s-backend" {
    discovery "kubernetes" {
        namespace "production"
        service "api-server"
        port-name "http"
        refresh-interval 10
    }
}

Kubeconfig File

For running outside the cluster or with custom credentials:

upstream "k8s-backend" {
    discovery "kubernetes" {
        namespace "default"
        service "my-service"
        port-name "http"
        refresh-interval 10
        kubeconfig "~/.kube/config"
    }
}

Option           | Default  | Description
namespace        | Required | Kubernetes namespace
service          | Required | Service name
port-name        | None     | Named port to use (uses first port if omitted)
refresh-interval | 10       | Seconds between endpoint queries
kubeconfig       | None     | Path to kubeconfig file (uses in-cluster if omitted)

Kubeconfig Authentication Methods

Sentinel supports multiple authentication methods from kubeconfig:

Token Authentication:

users:
- name: my-user
  user:
    token: eyJhbGciOiJSUzI1NiIs...

Client Certificate:

users:
- name: my-user
  user:
    client-certificate-data: LS0tLS1C...
    client-key-data: LS0tLS1C...

Exec-based (e.g., AWS EKS):

users:
- name: eks-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: aws
      args:
        - eks
        - get-token
        - --cluster-name
        - my-cluster

Feature Flag

Kubernetes discovery with kubeconfig requires the kubernetes feature:

cargo build --features kubernetes

File-based Discovery

Discover backends from a simple text file. The file is watched for changes and backends are reloaded automatically.

upstream "api" {
    discovery "file" {
        path "/etc/sentinel/backends/api-servers.txt"
        watch-interval 5
    }
}

Option         | Default  | Description
path           | Required | Path to the backends file
watch-interval | 5        | Seconds between file modification checks

File Format

One backend per line with optional weight parameter:

# Backend servers for API cluster
# Updated: 2026-01-11

10.0.1.1:8080
10.0.1.2:8080 weight=2
10.0.1.3:8080 weight=3

# Standby server (lower weight)
10.0.1.4:8080 weight=1

Format rules:

  • Lines starting with # are comments
  • Empty lines are ignored
  • Format: host:port or host:port weight=N
  • Hostnames are resolved via DNS
  • Default weight is 1 if not specified
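
A minimal parser for this format, for illustration (comments and blank lines skipped, optional weight=N parsed, default weight 1); this is not Sentinel's actual loader:

// Returns (address, weight) pairs from the backends file contents.
fn parse_backends(contents: &str) -> Vec<(String, u32)> {
    contents
        .lines()
        .map(str::trim)
        // Skip blank lines and # comments.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(|line| {
            let mut parts = line.split_whitespace();
            let address = parts.next().unwrap_or_default().to_string();
            // Optional "weight=N"; weight defaults to 1.
            let weight: u32 = parts
                .find_map(|p| p.strip_prefix("weight=").and_then(|w| w.parse().ok()))
                .unwrap_or(1);
            (address, weight)
        })
        .collect()
}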

Use Cases

External configuration management:

// Backends managed by Ansible/Puppet/Chef
upstream "backend" {
    discovery "file" {
        path "/etc/sentinel/backends/managed-by-ansible.txt"
        watch-interval 10
    }
}

Integration with custom scripts:

#!/bin/bash
# update-backends.sh - Run by cron or external system
consul catalog nodes -service=api | \
    awk '{print $2":8080"}' > /etc/sentinel/backends/api.txt
upstream "api" {
    discovery "file" {
        path "/etc/sentinel/backends/api.txt"
        watch-interval 5
    }
}

Manual failover control:

# Primary datacenter
10.0.1.1:8080 weight=10
10.0.1.2:8080 weight=10

# DR datacenter (uncomment during failover)
# 10.0.2.1:8080 weight=10
# 10.0.2.2:8080 weight=10

Hot Reload Behavior

File-based discovery automatically detects changes:

  1. Modification check: File modification time is checked every watch-interval seconds
  2. Reload trigger: When file is modified, backends are re-read
  3. Graceful update: New backends are added, removed backends drain connections
  4. Cache fallback: If file becomes temporarily unavailable, last known backends are used

File Permissions

Ensure Sentinel can read the backends file:

# Create directory
sudo mkdir -p /etc/sentinel/backends
sudo chown sentinel:sentinel /etc/sentinel/backends

# Create backends file
echo "10.0.1.1:8080" | sudo tee /etc/sentinel/backends/api.txt
sudo chmod 644 /etc/sentinel/backends/api.txt

Static Discovery

Explicitly define backends; this is the default behavior when a targets block is used:

upstream "backend" {
    discovery "static" {
        backends "10.0.1.1:8080" "10.0.1.2:8080" "10.0.1.3:8080"
    }
}

Discovery with Health Checks

Discovery works with health checks. Unhealthy discovered backends are temporarily removed:

upstream "api" {
    discovery "dns" {
        hostname "api.example.com"
        port 8080
        refresh-interval 30
    }
    health-check {
        type "http" {
            path "/health"
            expected-status 200
        }
        interval-secs 10
        unhealthy-threshold 3
    }
}

Discovery Caching

All discovery methods cache results and fall back to cached backends on failure:

  • DNS resolution fails → use last known IPs
  • Consul unavailable → use last known services
  • Kubernetes API error → use last known endpoints
  • File unreadable → use last known backends

This ensures resilience during control plane outages.

Monitoring Upstream Health

Check upstream status via the admin endpoint:

curl http://localhost:9090/admin/upstreams

Response:

{
  "upstreams": {
    "backend": {
      "targets": [
        {"address": "10.0.1.1:8080", "healthy": true, "connections": 45},
        {"address": "10.0.1.2:8080", "healthy": true, "connections": 42},
        {"address": "10.0.1.3:8080", "healthy": false, "connections": 0}
      ]
    }
  }
}

Next Steps

  • Limits - Request limits and performance tuning
  • Routes - Routing to upstreams