

Overview

Sockudo is designed for high-performance real-time communication. This guide covers configuration options and best practices for tuning Sockudo to handle high concurrent connection loads efficiently.

Connection Pool Optimization

Database Connection Pools

Connection pooling is critical for production deployments that use external databases.

Environment Variables:
# Global pooling settings (applies to all SQL backends)
DATABASE_POOLING_ENABLED=true
DATABASE_POOL_MIN=2
DATABASE_POOL_MAX=10

# MySQL-specific overrides (takes precedence when set)
DATABASE_MYSQL_POOL_MIN=4
DATABASE_MYSQL_POOL_MAX=32

# PostgreSQL-specific overrides
DATABASE_POSTGRES_POOL_MIN=2
DATABASE_POSTGRES_POOL_MAX=16
Configuration File:
{
  "database": {
    "pooling_enabled": true,
    "pool_min": 2,
    "pool_max": 10
  }
}
DynamoDB uses the AWS SDK client which manages its own connection behavior. Pool settings do not apply to DynamoDB.

Redis Connection Pools

# Redis connection pool size
REDIS_CONNECTION_POOL_SIZE=10
Recommended Pool Sizes by Deployment Size:
| Deployment Size | Database Pool Max | Redis Pool Size |
|---|---|---|
| Small (1-1K connections) | 5-10 | 5 |
| Medium (1K-10K connections) | 10-20 | 10 |
| Large (10K-50K connections) | 20-40 | 20 |
| Extra Large (50K+ connections) | 40-100 | 50 |
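The sizing table can be expressed as a small helper that picks the upper bound of each tier. This is an illustrative sketch only; the function name and return shape are not part of Sockudo:

```python
def recommended_pools(peak_connections: int) -> dict:
    """Map expected peak WebSocket connections to the pool sizes
    from the table above (upper bound of each range)."""
    tiers = [
        (1_000, {"db_pool_max": 10, "redis_pool_size": 5}),    # Small
        (10_000, {"db_pool_max": 20, "redis_pool_size": 10}),  # Medium
        (50_000, {"db_pool_max": 40, "redis_pool_size": 20}),  # Large
    ]
    for limit, sizes in tiers:
        if peak_connections <= limit:
            return sizes
    return {"db_pool_max": 100, "redis_pool_size": 50}         # Extra Large

print(recommended_pools(8_000))
```

Treat these as starting points; tune from pool-utilization metrics rather than connection counts alone.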

WebSocket Buffer Configuration

Sockudo uses bounded buffers to protect against slow consumers that can’t keep up with message delivery.

Buffer Limit Modes

Mode 1: Message Count Only (Default - Fastest)
WEBSOCKET_MAX_MESSAGES=1000
WEBSOCKET_MAX_BYTES=none
WEBSOCKET_DISCONNECT_ON_BUFFER_FULL=true
{
  "websocket": {
    "max_messages": 1000,
    "max_bytes": null,
    "disconnect_on_buffer_full": true
  }
}
Mode 2: Byte Size Only (Precise Memory Control)
WEBSOCKET_MAX_MESSAGES=none
WEBSOCKET_MAX_BYTES=1048576  # 1MB
WEBSOCKET_DISCONNECT_ON_BUFFER_FULL=true
{
  "websocket": {
    "max_messages": null,
    "max_bytes": 1048576,
    "disconnect_on_buffer_full": true
  }
}
Mode 3: Both Limits (Most Precise)
WEBSOCKET_MAX_MESSAGES=1000
WEBSOCKET_MAX_BYTES=1048576
WEBSOCKET_DISCONNECT_ON_BUFFER_FULL=true

Buffer Behavior

  • When disconnect_on_buffer_full: true → Connection is closed with error code 4100
  • When disconnect_on_buffer_full: false → New messages are dropped silently (logged as warning)
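The buffer policy above can be sketched as follows. This is a minimal illustration of the limit checks and the disconnect-vs-drop decision, not Sockudo's actual implementation:

```python
from collections import deque

class SendBuffer:
    """Illustrative sketch of the bounded send-buffer policy:
    optional message-count limit, optional byte limit, and a
    configurable response when either limit is hit."""

    def __init__(self, max_messages=None, max_bytes=None,
                 disconnect_on_buffer_full=True):
        self.queue = deque()
        self.bytes_used = 0
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self.disconnect = disconnect_on_buffer_full

    def push(self, payload: bytes) -> str:
        over_count = (self.max_messages is not None
                      and len(self.queue) >= self.max_messages)
        over_bytes = (self.max_bytes is not None
                      and self.bytes_used + len(payload) > self.max_bytes)
        if over_count or over_bytes:
            # close with code 4100, or drop silently (logged as a warning)
            return "close-4100" if self.disconnect else "dropped"
        self.queue.append(payload)
        self.bytes_used += len(payload)
        return "queued"
```

With `disconnect_on_buffer_full: false`, a consumer that stalls simply stops receiving new messages; with `true`, it is cut off so its buffer memory can be reclaimed.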

Performance Characteristics

| Mode | Overhead | Memory Control |
|---|---|---|
| Message-only | Zero (uses bounded channel) | Approximate |
| Byte-only | ~1-2ns per message | Precise |
| Both | Atomic counter + channel check | Most precise |

Memory Estimation

  • Message-only mode: ~1-2KB per message (typical)
  • Byte-only mode: Exact memory limit (e.g., 1MB = 1MB max)
  • 10,000 connections with 1MB byte limit: ~10GB worst case
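The worst-case figure follows directly from multiplying the per-connection byte limit by the connection count:

```python
# Worst-case send-buffer memory for byte-limited buffers
connections = 10_000
max_bytes = 1_048_576                 # 1MB limit per connection
worst_case = connections * max_bytes  # every buffer completely full
print(round(worst_case / 1e9, 2))     # ≈ 10.49 GB
```

In practice most buffers stay near empty, so actual usage is far lower; the worst case only matters when many consumers stall simultaneously.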

Cleanup Queue Configuration

The async cleanup queue processes WebSocket disconnections in the background to prevent mass disconnections from blocking new connections.

Configuration Options

# Enable async cleanup (recommended)
CLEANUP_ASYNC_ENABLED=true

# Fallback to sync if queue fails
CLEANUP_FALLBACK_TO_SYNC=true

# Queue buffer size per worker
CLEANUP_QUEUE_BUFFER_SIZE=10000

# Tasks processed per batch per worker
CLEANUP_BATCH_SIZE=25

# Max wait time to fill batch (milliseconds)
CLEANUP_BATCH_TIMEOUT_MS=50

# Number of cleanup worker threads ("auto" or specific number)
CLEANUP_WORKER_THREADS=auto

# Retry attempts before giving up
CLEANUP_MAX_RETRY_ATTEMPTS=2
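The interaction between `CLEANUP_BATCH_SIZE` and `CLEANUP_BATCH_TIMEOUT_MS` can be sketched with a simple batching loop: a worker drains up to a full batch, but flushes a partial batch rather than wait indefinitely. This is an illustrative sketch (with a simplified per-wait timeout), not Sockudo's implementation:

```python
import asyncio

async def drain_batch(queue, batch_size=25, batch_timeout_ms=50):
    """Collect one batch: up to batch_size tasks, waiting at most
    batch_timeout_ms between tasks before flushing a partial batch."""
    batch = [await queue.get()]              # block until the first task arrives
    timeout = batch_timeout_ms / 1000
    while len(batch) < batch_size:
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout=timeout))
        except asyncio.TimeoutError:
            break                            # flush whatever we have
    return batch

async def demo():
    q = asyncio.Queue()
    for i in range(7):                       # 7 queued disconnect tasks
        q.put_nowait(i)
    return await drain_batch(q, batch_size=5, batch_timeout_ms=10)

print(asyncio.run(demo()))
```

Larger batches amortize per-batch overhead during mass disconnections; smaller timeouts keep cleanup latency low when disconnects trickle in.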

Deployment Scenarios

Use Case: Development, testing, small production instances
{
  "cleanup": {
    "async_enabled": true,
    "queue_buffer_size": 500,
    "batch_size": 10,
    "batch_timeout_ms": 100,
    "worker_threads": 1,
    "max_retry_attempts": 1
  }
}
  • Memory Usage: ~300KB queue buffer
  • CPU Impact: Minimal (1 worker)
  • Latency: 100ms max cleanup delay
Use Case: High concurrent connection loads (>10K connections)
{
  "cleanup": {
    "async_enabled": true,
    "queue_buffer_size": 10000,
    "batch_size": 100,
    "batch_timeout_ms": 25,
    "worker_threads": 2,
    "max_retry_attempts": 3
  }
}
  • Memory Usage: ~6MB per worker (total: ~12MB with 2 workers)
  • CPU Impact: Moderate (2 workers)
  • Latency: 25ms max cleanup delay
Use Case: Massive scale deployments (>50K connections)
{
  "cleanup": {
    "async_enabled": true,
    "queue_buffer_size": 50000,
    "batch_size": 500,
    "batch_timeout_ms": 10,
    "worker_threads": 4,
    "max_retry_attempts": 3
  }
}
  • Memory Usage: ~30MB per worker (total: ~120MB with 4 workers)
  • CPU Impact: High (4 workers)
  • Latency: 10ms max cleanup delay
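The memory figures across the three scenarios are consistent with a roughly fixed per-task cost (about 600 bytes per queued cleanup task in these estimates):

```python
# (queue_buffer_size, estimated buffer memory in bytes) per scenario
scenarios = {
    "small": (500, 300 * 1024),        # ~300KB
    "standard": (10_000, 6 * 1024**2), # ~6MB per worker
    "large": (50_000, 30 * 1024**2),   # ~30MB per worker
}
for name, (slots, mem) in scenarios.items():
    print(name, mem // slots)          # roughly 600 bytes per task
```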

Worker Threads Scaling

The worker_threads setting supports:
  • Fixed number: Specify exact worker count (e.g., 2)
  • Auto-detection: Use "auto" to scale based on CPU cores
When using "auto", the system uses 25% of available CPU cores (minimum 1, maximum 4):
  • 1-7 CPUs → 1 worker
  • 8-11 CPUs → 2 workers
  • 12-15 CPUs → 3 workers
  • 16+ CPUs → 4 workers
All configuration values (except worker_threads) are applied per worker, not as total system capacity.
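The "auto" mapping above is equivalent to taking 25% of the core count and clamping it to the 1-4 range:

```python
def auto_worker_threads(cpu_cores: int) -> int:
    """25% of available CPU cores, clamped to 1..4,
    matching the "auto" mapping described above."""
    return max(1, min(4, cpu_cores // 4))

for cores in (1, 7, 8, 11, 12, 15, 16, 32):
    print(cores, "->", auto_worker_threads(cores))
```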

Adapter Performance Tuning

Redis/Redis Cluster

# Queue processing concurrency
QUEUE_REDIS_CONCURRENCY=5

# Redis Cluster queue concurrency
REDIS_CLUSTER_QUEUE_CONCURRENCY=5

# Key prefix for namespace isolation
DATABASE_REDIS_KEY_PREFIX=sockudo:

NATS Configuration

# NATS servers (comma-separated)
NATS_SERVERS=nats://nats1:4222,nats://nats2:4222

# Connection timeouts (milliseconds)
NATS_CONNECTION_TIMEOUT_MS=5000
NATS_REQUEST_TIMEOUT_MS=5000

Socket Counting

Socket counting has performance overhead. Disable if you don’t need the get_sockets_count API.
# Disable socket counting for better performance
ADAPTER_ENABLE_SOCKET_COUNTING=false
When disabled, get_sockets_count returns 0 to avoid the overhead of tracking connection counts.

CPU Scaling Considerations

Worker Thread Auto-Scaling

Sockudo automatically scales cleanup workers based on available CPU:
# Use auto-detection (recommended)
CLEANUP_WORKER_THREADS=auto
This allocates 25% of CPU cores to cleanup, leaving 75% for main WebSocket processing.

Manual CPU Allocation

For fine-grained control:
# 4 CPU cores: Allocate 1 worker manually
CLEANUP_WORKER_THREADS=1

# 16 CPU cores: Allocate 4 workers manually
CLEANUP_WORKER_THREADS=4

Cache Configuration

# Cache driver selection
CACHE_DRIVER=redis  # Options: memory, redis, redis-cluster, none

# Cache TTL (seconds)
CACHE_TTL_SECONDS=300

# Memory cache settings
CACHE_CLEANUP_INTERVAL=60
CACHE_MAX_CAPACITY=10000
Configuration File:
{
  "cache": {
    "driver": "redis",
    "memory": {
      "ttl": 300,
      "cleanup_interval": 60,
      "max_capacity": 10000
    }
  }
}
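The memory-cache settings (`ttl`, `max_capacity`) behave as in this minimal sketch: entries expire after the TTL and the cache is bounded by capacity. The eviction strategy shown (oldest insert first) is a simplification for illustration, not necessarily what Sockudo uses:

```python
import time

class MemoryCache:
    """Minimal sketch of per-entry TTL plus a max_capacity bound."""

    def __init__(self, ttl=300, max_capacity=10_000, clock=time.monotonic):
        self.ttl = ttl
        self.max_capacity = max_capacity
        self.clock = clock
        self.store = {}                       # key -> (expires_at, value)

    def set(self, key, value):
        if len(self.store) >= self.max_capacity and key not in self.store:
            self.store.pop(next(iter(self.store)))  # evict oldest insert
        self.store[key] = (self.clock() + self.ttl, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.store[key]               # lazy expiry on read
            return None
        return value
```

For multi-node deployments, prefer the `redis` driver so all nodes share one cache rather than each holding an independent in-memory copy.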

Rate Limiting Configuration

# Enable rate limiting
RATE_LIMITER_ENABLED=true

# Rate limiter backend
RATE_LIMITER_DRIVER=redis

# API rate limiting
RATE_LIMITER_API_MAX_REQUESTS=100
RATE_LIMITER_API_WINDOW_SECONDS=60

# WebSocket rate limiting
RATE_LIMITER_WS_MAX_REQUESTS=20
RATE_LIMITER_WS_WINDOW_SECONDS=60
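The `max_requests`/`window_seconds` pair describes a windowed counter per client. A fixed-window version can be sketched as below; this is illustrative only, and note that the `redis` driver exists precisely so these counters are shared across nodes instead of kept per-process:

```python
class FixedWindowLimiter:
    """Sketch of windowed rate limiting: allow at most max_requests
    per window_seconds, tracked per client key."""

    def __init__(self, max_requests=100, window_seconds=60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.counters = {}                    # (key, window_index) -> count

    def allow(self, key: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        bucket = (key, window)
        count = self.counters.get(bucket, 0)
        if count >= self.max_requests:
            return False                      # over the limit this window
        self.counters[bucket] = count + 1
        return True
```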

Performance Monitoring

Prometheus Metrics

Sockudo exposes metrics at /metrics (port 9601 by default):
METRICS_ENABLED=true
METRICS_HOST=0.0.0.0
METRICS_PORT=9601
METRICS_PROMETHEUS_PREFIX=sockudo_

Key Metrics to Monitor

  • sockudo_websocket_connections_total - Total active connections
  • sockudo_messages_received_total - Incoming message rate
  • sockudo_messages_sent_total - Outgoing message rate
  • sockudo_cleanup_queue_size - Cleanup queue depth
  • sockudo_adapter_operations_duration_seconds - Adapter operation latency

Quick Reference Table

Configuration by Server Size

| Server Spec | queue_buffer_size | batch_size | batch_timeout_ms | worker_threads | pool_max |
|---|---|---|---|---|---|
| 1vCPU/1GB | 500 | 10 | 100 | 1 | 5 |
| 2vCPU/2GB | 5000 | 25 | 50 | auto (1) | 10 |
| 4vCPU/4GB | 10000 | 100 | 25 | 2 | 20 |
| 8vCPU/8GB | 50000 | 500 | 10 | 4 | 40 |

Environment Variables Quick Reference

# Connection Pools
DATABASE_POOLING_ENABLED=true
DATABASE_POOL_MIN=2
DATABASE_POOL_MAX=10
REDIS_CONNECTION_POOL_SIZE=10

# WebSocket Buffers
WEBSOCKET_MAX_MESSAGES=1000
WEBSOCKET_MAX_BYTES=1048576
WEBSOCKET_DISCONNECT_ON_BUFFER_FULL=true

# Cleanup Queue
CLEANUP_ASYNC_ENABLED=true
CLEANUP_QUEUE_BUFFER_SIZE=10000
CLEANUP_BATCH_SIZE=25
CLEANUP_BATCH_TIMEOUT_MS=50
CLEANUP_WORKER_THREADS=auto

# Performance
ADAPTER_ENABLE_SOCKET_COUNTING=false
QUEUE_REDIS_CONCURRENCY=5
CACHE_TTL_SECONDS=300

Best Practices

  1. Start Conservative: Begin with standard deployment settings and tune based on metrics
  2. Monitor Actively: Watch queue health and connection latency during initial deployment
  3. Test Load: Run mass disconnection tests before production
  4. Use Auto-Scaling: Let CLEANUP_WORKER_THREADS=auto handle CPU allocation
  5. Profile Regularly: Use Prometheus metrics to identify bottlenecks
  6. Disable Unused Features: Turn off socket counting if not needed
  7. Use Redis for Scale: Switch to Redis/Redis Cluster for multi-node deployments

Next Steps