Skip to content
This repository was archived by the owner on Nov 4, 2025. It is now read-only.

Latest commit

 

History

History
1541 lines (1177 loc) · 36.1 KB

File metadata and controls

1541 lines (1177 loc) · 36.1 KB

Protocol Layer Design: Application Protocols over Low-Level Networking

This document outlines the design of a unified protocol abstraction layer that enables both server and client operations over our high-performance networking stack (io_uring, DPDK, RDMA, AF_XDP, etc.).

Table of Contents

  1. Architecture Overview
  2. Dedicated I/O Workers Pattern
  3. Protocol Trait Design
  4. Transport Abstraction
  5. Protocol Implementations
  6. Connection Management
  7. Integration with Worker Pool
  8. Request/Response Patterns
  9. Performance Considerations
  10. Security & Authentication

Architecture Overview

Layered Architecture

┌─────────────────────────────────────────────────────────────┐
│                   Application Workers                        │
│              (Business Logic, Data Processing)               │
└─────────────────────────────────────────────────────────────┘
                            ▲ │
                            │ │ Messages (Application Protocol)
                            │ ▼
┌─────────────────────────────────────────────────────────────┐
│                  Dedicated I/O Workers                       │
│         ┌──────────────┐         ┌──────────────┐           │
│         │ Server I/O   │         │ Client I/O   │           │
│         │   Workers    │         │   Workers    │           │
│         └──────────────┘         └──────────────┘           │
└─────────────────────────────────────────────────────────────┘
                            ▲ │
                            │ │ Protocol Layer (HTTP, gRPC, etc.)
                            │ ▼
┌─────────────────────────────────────────────────────────────┐
│              Transport Abstraction Layer                     │
│    (Unified interface over io_uring/DPDK/RDMA/AF_XDP)      │
└─────────────────────────────────────────────────────────────┘
                            ▲ │
                            │ │ Raw bytes
                            │ ▼
┌─────────────────────────────────────────────────────────────┐
│                Low-Level Networking Stack                    │
│   io_uring │ DPDK │ RDMA │ AF_XDP │ Raw Sockets │ eBPF     │
└─────────────────────────────────────────────────────────────┘

Key Design Principles

  1. Separation of Concerns: Application workers focus on business logic, I/O workers handle networking
  2. Protocol Abstraction: Common traits for all application protocols (HTTP, gRPC, Redis, PostgreSQL, etc.)
  3. Transport Independence: Protocols work over any transport (io_uring, DPDK, RDMA)
  4. Zero-Copy: Minimize data copying between layers
  5. Bidirectional: Same infrastructure for server (inbound) and client (outbound)
  6. Composable: Mix different protocols in same application

Dedicated I/O Workers Pattern

Concept

Instead of each application worker doing its own I/O, we create specialized I/O workers that:

  • Handle all network operations (both server and client)
  • Run on dedicated CPU cores
  • Use high-performance networking stack
  • Communicate with application workers via message passing

Benefits

Performance:

  • CPU affinity: I/O workers pinned to specific cores
  • Cache locality: Network buffers stay in I/O worker cache
  • Reduced context switching
  • Better utilization of hardware acceleration

Isolation:

  • Network failures don't crash application workers
  • Circuit breakers at I/O layer
  • Rate limiting centralized
  • Security boundary

Scalability:

  • Independent scaling of I/O and application workers
  • Multiple I/O workers per transport
  • Load balancing across I/O workers

Flexibility:

  • Swap transports without changing application
  • Protocol versioning isolated
  • A/B testing protocols
  • Multi-datacenter routing

Worker Types

1. Server I/O Workers (Inbound)

  • Accept incoming connections
  • Parse requests (HTTP, gRPC, custom protocols)
  • Route to application workers
  • Send responses back to clients

2. Client I/O Workers (Outbound)

  • Establish connections to external systems
  • Send requests (HTTP APIs, databases, message queues)
  • Receive responses
  • Route responses to application workers

3. Hybrid I/O Workers

  • Handle both inbound and outbound
  • Useful for proxy/gateway scenarios
  • Request forwarding

CPU Core Allocation

Typical Server Setup (64-core machine):

Cores 0-3:    Server I/O Workers (inbound traffic)
Cores 4-7:    Client I/O Workers (outbound traffic)
Cores 8-63:   Application Workers (business logic)
Cores 62-63:  System/Monitoring

Considerations:

  • NUMA-aware allocation
  • Avoid hyperthreads for I/O workers (use physical cores)
  • Reserve cores for kernel (IRQ handling)
  • Isolate using isolcpus or cgroups

Protocol Trait Design

Core Traits Hierarchy

1. Transport Trait (Low-Level)

Abstraction over all networking backends:

Trait: Transport

  • Unified interface over io_uring, DPDK, RDMA, AF_XDP
  • Raw byte send/receive
  • Connection lifecycle
  • Zero-copy buffer management

Operations:

  • send(bytes): Send raw bytes
  • recv(): Receive raw bytes
  • send_vectored(iovec): Scatter-gather send
  • recv_vectored(iovec): Scatter-gather receive
  • zero_copy_send(buffer): Zero-copy transmission
  • close(): Close connection
  • poll(): Non-blocking check for data

Implementations:

  • IoUringTransport: io_uring backend
  • DpdkTransport: DPDK backend
  • RdmaTransport: RDMA backend
  • AfXdpTransport: AF_XDP backend
  • RawSocketTransport: Raw sockets backend
  • MioTransport: mio/epoll backend (fallback)

Connection Types:

  • Stream: TCP-like (connection-oriented)
  • Datagram: UDP-like (connectionless)
  • SeqPacket: Ordered, reliable, message-oriented

2. Protocol Trait (Application-Level)

Abstraction for application protocols:

Trait: Protocol

  • Parse/serialize application messages
  • Handle protocol-specific logic
  • Transport-independent

Operations:

  • parse_request(bytes) -> Request: Deserialize request
  • serialize_request(request) -> bytes: Serialize request
  • parse_response(bytes) -> Response: Deserialize response
  • serialize_response(response) -> bytes: Serialize response
  • protocol_name(): Identifier (e.g., "HTTP/1.1", "gRPC")
  • default_port(): Standard port

Metadata:

  • is_binary(): Binary vs text protocol
  • supports_streaming(): Streaming capability
  • requires_tls(): Security requirements
  • multiplexing_support(): Connection multiplexing (HTTP/2, QUIC)

3. Server Protocol Trait

Server-specific protocol operations:

Trait: ServerProtocol: Protocol

  • Accept and handle incoming requests
  • Generate responses

Operations:

  • handle_connection(transport): Handle new connection
  • process_request(request) -> response: Application logic hook
  • error_response(error) -> response: Error handling
  • health_check_response(): Health endpoint
  • connection_timeout(): Idle connection timeout
  • max_request_size(): Request size limits

Lifecycle Hooks:

  • on_connect(connection_info): New connection
  • on_disconnect(connection_info): Connection closed
  • on_error(error): Error occurred
  • on_timeout(): Request timeout

State Management:

  • connection_state(): Per-connection state
  • session_management(): Session handling
  • rate_limiting(): Per-connection rate limits

4. Client Protocol Trait

Client-specific protocol operations:

Trait: ClientProtocol: Protocol

  • Establish outbound connections
  • Send requests and receive responses

Operations:

  • connect(address) -> connection: Establish connection
  • send_request(request): Send request
  • recv_response() -> response: Receive response
  • call(request) -> response: Synchronous request-response
  • async_call(request) -> future<response>: Async request-response

Connection Management:

  • connection_pool(): Connection pooling
  • max_connections(): Pool size
  • connection_timeout(): Connect timeout
  • request_timeout(): Request timeout
  • retry_policy(): Retry configuration

Reliability:

  • circuit_breaker(): Circuit breaker state
  • backoff_strategy(): Exponential backoff
  • health_check(): Connection health
  • failover_strategy(): Failover logic

5. Bidirectional Protocol Trait

For protocols that are both client and server:

Trait: BidirectionalProtocol: ServerProtocol + ClientProtocol

  • Full-duplex communication
  • Peer-to-peer scenarios

Use Cases:

  • WebSocket (both send and receive)
  • gRPC bidirectional streaming
  • Database replication protocols
  • P2P protocols

Protocol State Machine

All protocols implement a state machine:

States:

  1. Disconnected: No connection
  2. Connecting: Connection establishment in progress
  3. Connected: Connection established, ready
  4. Authenticating: Authentication/handshake in progress
  5. Ready: Authenticated, can send/receive
  6. Closing: Graceful shutdown in progress
  7. Closed: Connection closed
  8. Error: Error state

Transitions:

  • State transitions triggered by events
  • Thread-safe state management
  • Observers for state changes
  • Metrics on state transitions

Transport Abstraction

Unified Transport Interface

The transport layer provides a consistent API across all networking backends.

Transport Properties

Connection Properties:

  • local_address(): Local endpoint
  • remote_address(): Remote endpoint
  • connection_id(): Unique identifier
  • protocol(): Transport protocol (TCP, UDP, RDMA, etc.)

Performance Properties:

  • mtu(): Maximum transmission unit
  • bandwidth(): Available bandwidth
  • latency(): Round-trip time
  • buffer_size(): Send/receive buffer sizes

Capability Properties:

  • supports_zero_copy(): Zero-copy capability
  • supports_vectored_io(): Scatter-gather
  • supports_tls(): TLS offload
  • supports_checksumming(): Hardware checksum offload

Transport Selection

Automatic Selection:

Decision factors:
1. Hardware availability (RDMA NIC, DPDK-capable NIC)
2. Destination (local, same datacenter, remote)
3. Protocol requirements (latency-sensitive, throughput-oriented)
4. Security requirements (TLS, encryption)
5. Configuration preferences

Selection Algorithm:

  1. Ultra-low latency (< 10µs): RDMA
  2. High packet rate (> 10Mpps): DPDK or AF_XDP
  3. Efficient I/O (< 100µs): io_uring
  4. Standard (compatibility): mio/epoll
  5. Fallback: Standard sockets

Manual Override:

  • Configuration file
  • Environment variables
  • Runtime API
  • Per-connection selection

Protocol Implementations

HTTP Protocol

HTTP/1.1

Server Implementation:

  • Request parsing: Method, path, headers, body
  • Response generation: Status, headers, body
  • Keep-alive support
  • Chunked transfer encoding
  • Connection pipelining

Client Implementation:

  • Request building: Method, URL, headers, body
  • Response parsing: Status, headers, body
  • Connection pooling
  • Keep-alive
  • Automatic retries

Features:

  • Header compression (not built-in)
  • Body streaming
  • Multipart forms
  • WebSocket upgrade

Performance:

  • Zero-copy body transfers
  • Header caching
  • Connection reuse
  • Minimal allocations

HTTP/2

Server Implementation:

  • Binary framing
  • Stream multiplexing
  • Server push
  • Header compression (HPACK)
  • Flow control

Client Implementation:

  • Stream management
  • Priority handling
  • Header compression
  • Connection pooling (one connection per host)

Features:

  • Full multiplexing
  • Request prioritization
  • Server push support
  • Better compression

Performance:

  • Single connection per host
  • Reduced head-of-line blocking
  • Better bandwidth utilization
  • Lower latency

HTTP/3 (QUIC)

Server Implementation:

  • UDP-based transport
  • QUIC connection management
  • Stream multiplexing
  • 0-RTT connection establishment

Client Implementation:

  • QUIC client
  • Connection migration
  • 0-RTT resumption

Features:

  • No head-of-line blocking
  • Connection migration
  • Fast connection establishment
  • Built-in encryption

Challenges:

  • Requires UDP support in transport
  • Userspace congestion control
  • More complex implementation

gRPC Protocol

Overview:

  • HTTP/2 based
  • Protocol Buffers
  • Bidirectional streaming

Server Implementation:

  • Service definition
  • Method dispatch
  • Streaming handlers
  • Interceptors/middleware

Client Implementation:

  • Stub generation
  • Channel management
  • Load balancing
  • Retry/deadline handling

Streaming Types:

  1. Unary: Single request, single response
  2. Server streaming: Single request, stream responses
  3. Client streaming: Stream requests, single response
  4. Bidirectional: Both stream

Features:

  • Type-safe APIs
  • Efficient binary protocol
  • Good tooling
  • Language interop

Performance:

  • HTTP/2 multiplexing
  • Protobuf efficiency
  • Streaming large datasets
  • Connection reuse

Redis Protocol (RESP)

Overview:

  • Simple text-based protocol
  • Request-response pattern
  • Pipelining support

Server Implementation:

  • RESP parser
  • Command dispatch
  • Pipelining
  • Pub/sub

Client Implementation:

  • Command builder
  • Response parser
  • Connection pooling
  • Pipelining

Features:

  • Simple implementation
  • Human-readable (debugging)
  • Fast parsing
  • Multiplexing (pub/sub)

Performance:

  • Minimal overhead
  • Pipelining for throughput
  • Connection pooling

PostgreSQL Protocol

Overview:

  • Binary protocol
  • Extended query protocol
  • Prepared statements

Server Implementation:

  • Authentication (SASL, MD5)
  • Query parsing
  • Result set encoding
  • Transaction management

Client Implementation:

  • Connection establishment
  • Authentication
  • Query execution
  • Result parsing
  • Prepared statements
  • Connection pooling

Features:

  • Binary format (efficient)
  • Prepared statements (SQL injection protection)
  • Transactions
  • COPY protocol (bulk data)

Performance:

  • Binary encoding (faster than text)
  • Prepared statement caching
  • Connection pooling
  • Batch operations

Kafka Protocol

Overview:

  • Binary protocol
  • Producer/consumer pattern
  • Topic partitioning

Client Implementation (Producer):

  • Topic discovery
  • Partition assignment
  • Message batching
  • Compression
  • Acknowledgment handling

Client Implementation (Consumer):

  • Group coordination
  • Partition assignment
  • Offset management
  • Message fetching

Features:

  • High throughput
  • Durability
  • Ordering guarantees
  • Consumer groups

Performance:

  • Batching
  • Compression
  • Zero-copy (sendfile)
  • Partition parallelism

Custom Binary Protocols

Design Considerations:

  • Message framing (length-prefixed, delimiter-based)
  • Serialization format (Protobuf, MessagePack, Bincode, CBOR)
  • Versioning (protocol version negotiation)
  • Error handling
  • Flow control

Best Practices:

  • Length-prefix messages (avoid scanning)
  • Binary encoding (efficiency)
  • Type-safe serialization (Serde)
  • Version field in header
  • Checksum/CRC for corruption detection

Connection Management

Connection Pooling

Goals:

  • Reuse connections (amortize setup cost)
  • Limit concurrent connections
  • Health checking
  • Fair distribution

Pool Types:

1. Fixed-Size Pool

  • Pre-allocated connections
  • Block when exhausted
  • Predictable resource usage
  • Simple implementation

2. Dynamic Pool

  • Grow/shrink based on demand
  • Min/max limits
  • Connection timeout
  • More complex

3. Per-Worker Pool

  • Each I/O worker has own pool
  • No contention between workers
  • More total connections
  • Better locality

4. Shared Pool

  • All workers share pool
  • Fewer total connections
  • Requires synchronization
  • Better utilization

Pool Management:

  • Health checks (periodic ping)
  • Idle timeout (close unused)
  • Max lifetime (rotate connections)
  • Connection validation
  • Metrics (pool size, wait time)

Circuit Breaker Integration

States:

  1. Closed: Normal operation
  2. Open: Fail fast (don't try)
  3. Half-Open: Testing recovery

Per-Destination Circuit Breakers:

  • Track failures per endpoint
  • Independent breakers
  • Prevent cascading failures

Failure Detection:

  • Connection timeouts
  • Request timeouts
  • Error responses (5xx)
  • Network errors

Recovery:

  • Exponential backoff
  • Test requests
  • Gradual ramp-up

Service Discovery

Integration Points:

  • DNS (traditional)
  • Consul
  • etcd
  • Kubernetes services
  • Custom registry

Resolution:

  • Periodic refresh
  • Push-based updates
  • Health-based filtering
  • Load balancing

Integration with Worker Pool

Message Flow: Inbound Request

1. Client → Network → Server I/O Worker
   - I/O worker receives bytes
   - Parse protocol (HTTP request)
   - Create message envelope

2. Server I/O Worker → Application Worker
   - Route based on partitioning
   - Send via high-performance channel
   - Include connection context

3. Application Worker processes request
   - Business logic
   - Database queries (via Client I/O Workers)
   - Computation

4. Application Worker → Server I/O Worker
   - Send response message
   - Include connection ID

5. Server I/O Worker → Network → Client
   - Serialize response
   - Send over transport
   - Close or keep-alive connection

Message Flow: Outbound Request

1. Application Worker → Client I/O Worker
   - Send request message
   - Specify destination
   - Include callback/correlation ID

2. Client I/O Worker → Network → External Service
   - Get connection from pool
   - Serialize request
   - Send via transport
   - Wait for response (async)

3. External Service → Network → Client I/O Worker
   - Receive response bytes
   - Parse protocol
   - Create response message

4. Client I/O Worker → Application Worker
   - Route using correlation ID
   - Send via channel
   - Return connection to pool

Worker Communication Patterns

1. Request-Reply Pattern

Synchronous (blocking):

  • Application worker sends request
  • Blocks waiting for response
  • Simple but limits throughput

Asynchronous (non-blocking):

  • Application worker sends request with correlation ID
  • Continues processing
  • I/O worker replies with correlation ID
  • Application worker matches response

2. Fire-and-Forget Pattern

Use Case: Logging, metrics, events

  • Application worker sends message
  • No response expected
  • I/O worker handles delivery
  • Best effort or reliable

3. Streaming Pattern

Server Streaming:

  • Single request
  • Multiple responses
  • I/O worker buffers/controls flow

Client Streaming:

  • Multiple requests
  • Single response
  • Application worker sends stream
  • I/O worker batches

Bidirectional:

  • Both directions stream
  • WebSocket, gRPC bidirectional
  • Complex coordination

4. Aggregation Pattern

Fan-Out/Fan-In:

  • Application worker makes multiple requests
  • Different destinations
  • I/O workers handle concurrently
  • Aggregate responses

Correlation and Context

Correlation ID:

  • Unique per request-response pair
  • Generated by application worker
  • Included in messages
  • Used for routing responses

Request Context:

  • Trace ID (distributed tracing)
  • User/session ID
  • Priority
  • Deadline
  • Metadata (headers, auth)

Context Propagation:

  • Passed through all layers
  • Preserved in I/O workers
  • Included in external requests
  • Logged for debugging

Request/Response Patterns

Timeout Handling

Levels:

  1. Connection Timeout: How long to wait for connection
  2. Request Timeout: Total request duration
  3. Idle Timeout: Inactivity timeout
  4. Keep-Alive Timeout: Connection reuse timeout

Timeout Implementation:

  • Timer wheels (efficient)
  • Per-request deadlines
  • Timeout propagation
  • Partial results

Timeout Response:

  • Return error to application worker
  • Circuit breaker integration
  • Retry decision
  • Logging/metrics

Retry Logic

Retry Strategies:

  1. Fixed Delay: Wait fixed time between retries
  2. Exponential Backoff: Increasing delays
  3. Jittered Backoff: Add randomness (thundering herd)
  4. Adaptive: Based on observed latency

Retry Conditions:

  • Network errors (connection refused, timeout)
  • Transient errors (503, 429)
  • Idempotent operations only
  • Not for non-idempotent (POST)

Retry Limits:

  • Max retry count
  • Max total time
  • Budget-based (time/cost)

Retry Context:

  • Track retry attempts
  • Log for debugging
  • Metrics per retry
  • Circuit breaker interaction

Backpressure

Producer Side (Application Workers):

  • Slow down if I/O workers overwhelmed
  • Queue depth monitoring
  • Reject or queue requests

I/O Worker Side:

  • Signal backpressure to application workers
  • Flow control at protocol level (HTTP/2, gRPC)
  • Buffer management

Network Side:

  • TCP flow control
  • RDMA flow control
  • Application-level flow control

Performance Considerations

Zero-Copy Optimization

Strategies:

  1. Direct Buffer Sharing: Share buffers between layers
  2. Memory Mapping: mmap for large data
  3. sendfile/splice: Kernel zero-copy
  4. RDMA: Hardware zero-copy
  5. io_uring: Zero-copy send/recv

Buffer Management:

  • Pre-allocated buffer pools
  • Fixed-size buffers (avoid fragmentation)
  • Reference counting (shared buffers)
  • Arena allocation

Batching

Request Batching:

  • Combine multiple small requests
  • Reduce per-request overhead
  • Vectored I/O (sendmsg with iovec)

Response Batching:

  • Buffer multiple responses
  • Flush on timeout or size
  • Amortize send overhead

Benefits:

  • Fewer syscalls
  • Better cache utilization
  • Higher throughput
  • Lower CPU usage

Trade-offs:

  • Increased latency
  • Memory usage
  • Complexity

CPU Affinity and NUMA

I/O Worker Placement:

  • Pin to cores near NIC
  • Same NUMA node as NIC
  • Avoid core migration

Application Worker Placement:

  • Distribute across NUMA nodes
  • Keep data local
  • Minimize cross-node traffic

Memory Allocation:

  • Allocate on same NUMA node
  • Use numa_alloc_onnode()
  • Monitor cross-node access

Lock-Free Communication

Between Workers:

  • Use lock-free channels (crossbeam)
  • SPSC when possible (fastest)
  • MPSC for fan-in
  • MPMC when necessary

Within I/O Workers:

  • Lock-free data structures
  • Per-connection state isolated
  • Atomic operations for shared state

Security & Authentication

TLS Integration

Server-Side TLS:

  • Certificate management
  • SNI (Server Name Indication)
  • ALPN (Application-Layer Protocol Negotiation)
  • Session resumption
  • Client certificates (mTLS)

Client-Side TLS:

  • Certificate validation
  • CA certificate bundle
  • Hostname verification
  • Certificate pinning
  • Custom validators

TLS Offload:

  • Hardware acceleration (QAT)
  • TLS termination at load balancer
  • End-to-end encryption

Performance:

  • Session reuse (reduce handshakes)
  • TLS 1.3 (faster handshake)
  • Hardware acceleration
  • Connection pooling

Authentication Protocols

HTTP Authentication:

  • Basic auth (base64 encoded)
  • Bearer tokens (JWT)
  • OAuth 2.0
  • API keys

Database Authentication:

  • Username/password
  • SASL (PostgreSQL)
  • IAM (AWS RDS)
  • Certificate-based

gRPC Authentication:

  • Metadata (headers)
  • TLS certificates
  • JWT tokens
  • Interceptors

Custom Authentication:

  • Protocol-specific handshake
  • Challenge-response
  • Token exchange
  • Session management

Authorization

Levels:

  1. Connection-level: Who can connect?
  2. Operation-level: What operations allowed?
  3. Resource-level: What data accessible?

Integration:

  • Policy enforcement in I/O workers
  • Context propagation to application workers
  • Centralized policy service
  • Caching for performance

Implementation Strategy

Phase 1: Foundation (Month 1-2)

Deliverables:

  1. Transport trait definition
  2. Protocol trait hierarchy
  3. Basic transport implementations (io_uring, mio)
  4. Simple protocol (HTTP/1.1 client)
  5. Connection pooling

Validation:

  • Unit tests for traits
  • Integration tests (echo server/client)
  • Benchmark transport overhead
  • Verify zero-copy paths

Phase 2: Core Protocols (Month 3-4)

Deliverables:

  1. HTTP/1.1 server and client (full)
  2. HTTP/2 support
  3. gRPC protocol
  4. Redis protocol (RESP)
  5. Circuit breaker integration

Validation:

  • HTTP benchmark vs nginx/hyper
  • gRPC benchmark vs tonic
  • Redis benchmark vs official client
  • Interop tests with real servers

Phase 3: Advanced Transports (Month 5-6)

Deliverables:

  1. DPDK transport
  2. RDMA transport
  3. AF_XDP transport
  4. Transport auto-selection
  5. Performance tuning

Validation:

  • Benchmark each transport
  • Compare latency/throughput
  • Test failover scenarios
  • Production load testing

Phase 4: Enterprise Features (Month 7-9)

Deliverables:

  1. PostgreSQL protocol
  2. Kafka protocol
  3. TLS integration (rustls)
  4. Service discovery
  5. Advanced monitoring

Validation:

  • Real-world application integration
  • Security audit
  • Performance validation
  • Documentation

Phase 5: Optimization (Month 10-12)

Deliverables:

  1. Zero-copy optimizations
  2. NUMA-aware allocation
  3. Hardware acceleration (QAT for TLS)
  4. Protocol-specific optimizations
  5. Profiling and tuning

Validation:

  • Performance benchmarks
  • Comparison with best-in-class
  • Real production workloads
  • Scalability testing

Protocol Feature Matrix

Protocol Server Client Streaming Multiplexing Zero-Copy TLS Complexity
HTTP/1.1 Partial No Low
HTTP/2 Medium
HTTP/3 Required High
gRPC Medium
Redis (RESP) Pub/Sub Pipelining Optional Low
PostgreSQL Cursors No High
MySQL No No Medium
MongoDB Cursors No Medium
Kafka N/A Partitions High
RabbitMQ (AMQP) Channels High
WebSocket No Medium
Custom Binary Custom Variable

Transport Support Matrix

Transport Latency Throughput Complexity Maturity Use Case
io_uring ~10µs 10M ops/s Medium Mature General purpose
DPDK ~5µs 20M pps High Very Mature Packet processing
RDMA ~1µs 100 Gbps High Mature Ultra-low latency
AF_XDP ~5µs 10M pps High Maturing Packet processing
Raw Sockets ~20µs 1M ops/s Medium Mature Custom protocols
eBPF/XDP ~2µs 24M pps Very High Maturing Filtering/routing
mio/epoll ~50µs 100K ops/s Low Very Mature Fallback

Design Patterns

Pattern 1: Protocol Gateway

Use Case: Translate between protocols

Example: HTTP REST API → gRPC backend

Flow:
1. Server I/O Worker: Receive HTTP request
2. Application Worker: Translate REST → gRPC
3. Client I/O Worker: Send gRPC request
4. Client I/O Worker: Receive gRPC response
5. Application Worker: Translate gRPC → REST
6. Server I/O Worker: Send HTTP response

Benefits:

  • Protocol translation isolated
  • Both sides use optimal protocol
  • Can cache/optimize translations

Pattern 2: Request Aggregation

Use Case: Fan-out/fan-in for parallel requests

Example: Dashboard fetching from multiple services

Flow:
1. Server I/O Worker: Receive single HTTP request
2. Application Worker: Split into N sub-requests
3. Client I/O Worker: Send N requests (different destinations)
4. Client I/O Worker: Receive N responses
5. Application Worker: Aggregate results
6. Server I/O Worker: Send single HTTP response

Benefits:

  • Parallel execution
  • Lower latency than sequential
  • Circuit breaker per destination

Pattern 3: Protocol Multiplexing

Use Case: Multiple logical channels over one connection

Example: HTTP/2 streams, gRPC

Mechanism:
- Single physical connection
- Multiple logical streams/channels
- Stream IDs for demultiplexing
- Independent flow control per stream

Benefits:

  • Fewer connections (resource efficiency)
  • Reduced latency (no connection setup)
  • Better bandwidth utilization

Pattern 4: Connection Pooling Per Destination

Use Case: Reuse connections to frequently accessed services

Implementation:
- Client I/O Worker maintains pools
- One pool per destination
- Health checking
- Automatic scaling

Benefits:

  • Amortize connection setup
  • Predictable performance
  • Resource limits

Pattern 5: Transparent Retry/Circuit Breaking

Use Case: Reliability for external calls

Implementation:
- Client I/O Worker implements retry logic
- Circuit breaker per destination
- Transparent to application workers

Benefits:

  • Centralized reliability patterns
  • Application logic stays simple
  • Consistent behavior across protocols

Monitoring and Observability

Metrics to Collect

Per I/O Worker:

  • Active connections
  • Requests/sec
  • Bytes sent/received
  • Errors/sec
  • Latency (p50, p95, p99, p999)
  • Queue depth

Per Protocol:

  • Protocol-specific metrics (HTTP status codes, gRPC methods)
  • Parse errors
  • Protocol version distribution

Per Connection:

  • Connection lifetime
  • Requests per connection
  • Keep-alive success rate
  • Timeout rate

Per Transport:

  • Transport-specific metrics (RDMA operations, DPDK packet drops)
  • Hardware counters
  • Buffer utilization

Tracing Integration

Distributed Tracing:

  • Propagate trace context through all layers
  • Span per I/O operation
  • Parent-child relationships

Trace Context:

  • W3C Trace Context standard
  • Baggage for metadata
  • Sampling decisions

Instrumentation Points:

  • Request arrival at I/O worker
  • Message to application worker
  • Application worker processing
  • Outbound request to external system
  • Response received
  • Response sent to client

Logging

Structured Logging:

  • JSON format
  • Consistent fields (worker_id, connection_id, protocol, etc.)
  • Log levels appropriate

Key Events to Log:

  • Connection establishment/close
  • Protocol errors
  • Timeout events
  • Circuit breaker state changes
  • Slow requests (above threshold)
  • Authentication failures

Testing Strategy

Unit Tests

Transport Layer:

  • Send/receive correctness
  • Error handling
  • Buffer management
  • Zero-copy paths

Protocol Layer:

  • Parse/serialize correctness
  • Edge cases (malformed input)
  • State machine transitions
  • Error responses

Connection Management:

  • Pool behavior (acquire/release)
  • Health checking
  • Timeout handling
  • Circuit breaker logic

Integration Tests

End-to-End:

  • Real protocol implementations
  • Against real servers (HTTP, Redis, PostgreSQL)
  • Interoperability validation

Performance:

  • Latency measurements
  • Throughput benchmarks
  • Scalability tests
  • Resource usage

Reliability:

  • Failure injection
  • Timeout scenarios
  • Circuit breaker activation
  • Retry logic validation

Chaos Testing

Network Failures:

  • Packet loss
  • Latency injection
  • Connection drops
  • DNS failures

Service Failures:

  • Backend crashes
  • Slow responses
  • Error responses
  • Partial failures

Validation:

  • System remains stable
  • Circuit breakers activate
  • Retries work correctly
  • No cascading failures

Example Use Cases

Use Case 1: API Gateway

Scenario: High-performance HTTP API gateway

Architecture:

  • Server I/O Workers: Accept HTTP requests (HTTP/2)
  • Application Workers: Routing logic, authentication, rate limiting
  • Client I/O Workers: Forward to backend services (gRPC)

Transport:

  • Inbound: io_uring or DPDK (high connection count)
  • Outbound: io_uring (gRPC to backends)

Features:

  • TLS termination
  • JWT validation
  • Circuit breakers per backend
  • Request/response transformation
  • Load balancing

Use Case 2: Database Proxy

Scenario: Connection pooling proxy for PostgreSQL

Architecture:

  • Server I/O Workers: Accept PostgreSQL protocol connections
  • Application Workers: Query routing, caching, rewriting
  • Client I/O Workers: Connection pool to real databases

Transport:

  • Both: io_uring (efficient I/O)

Features:

  • Connection pooling (reduce DB connections)
  • Query caching
  • Read/write splitting
  • Circuit breaker for DB failures
  • Connection migration

Use Case 3: Message Queue Consumer

Scenario: High-throughput Kafka consumer

Architecture:

  • Client I/O Workers: Consume from Kafka topics
  • Application Workers: Message processing
  • Client I/O Workers: Write results (HTTP API, database)

Transport:

  • Kafka: io_uring (high throughput)
  • Outputs: io_uring

Features:

  • Parallel partition consumption
  • Offset management
  • Backpressure handling
  • Error handling/DLQ
  • At-least-once or exactly-once

Use Case 4: Proxy/Load Balancer

Scenario: Layer 7 load balancer

Architecture:

  • Server I/O Workers: Accept connections (HTTP, TLS)
  • Application Workers: Minimal (routing only)
  • Client I/O Workers: Forward to backends

Transport:

  • Inbound: DPDK or AF_XDP (ultra-high packet rate)
  • Outbound: io_uring (backend connections)

Features:

  • Connection pooling to backends
  • Health checking
  • Load balancing algorithms
  • TLS termination/passthrough
  • Zero-copy where possible

Use Case 5: Real-Time Analytics

Scenario: Stream processing for real-time analytics

Architecture:

  • Client I/O Workers: Ingest from sources (Kafka, HTTP streaming)
  • Application Workers: Processing, aggregation, windowing
  • Client I/O Workers: Write to outputs (database, Kafka)

Transport:

  • All: io_uring (balanced performance)

Features:

  • Streaming protocols
  • Backpressure management
  • State checkpointing
  • Exactly-once processing
  • Low latency

Conclusion

This protocol layer design provides:

Unified Abstraction: Common traits for all protocols
Transport Independence: Works over io_uring, DPDK, RDMA, etc.
Bidirectional: Server and client in same framework
High Performance: Zero-copy, lock-free, NUMA-aware
Reliable: Circuit breakers, retries, timeouts
Observable: Metrics, tracing, logging
Flexible: Multiple protocols, transports, patterns
Production-Ready: Security, authentication, monitoring

Next Steps

  1. Implement core traits (Transport, Protocol)
  2. Build reference implementation (HTTP/1.1 over io_uring)
  3. Validate with benchmarks (compare to hyper, nginx)
  4. Add more protocols (HTTP/2, gRPC, Redis)
  5. Integrate with worker pool (dedicated I/O workers)
  6. Add advanced transports (DPDK, RDMA)
  7. Production hardening (monitoring, testing, docs)

Document Version: 1.0
Last Updated: October 31, 2025
Status: Design Complete - Ready for Implementation