Backend Engineer working on high-concurrency distributed systems and AI infrastructure
Built systems sustaining ~850 req/sec with sub-500ms latency under real-world load
Open to remote opportunities and global collaboration
- High-concurrency backend systems (500–850+ req/sec under load)
- Distributed AI systems (RAG, LLM orchestration, vector search)
- Event-driven architectures (Kafka, async pipelines)
- API-first platforms for integrations and automation
- Low-latency systems optimized using caching, batching, and routing
Production-grade distributed backend system designed to handle real-world AI workloads at scale.
- ~850 req/sec throughput (load-tested)
- 500+ concurrent requests with stable latency
- 100K+ documents processed
- Async FastAPI services (stateless, horizontally scalable)
- Redis distributed caching
- FAISS vector index for semantic retrieval
- Kafka-based event pipelines
- Multi-LLM routing layer with fallback handling
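The routing layer with fallback handling can be sketched as a priority list of providers tried in order, moving to the next on timeout or failure. This is a minimal stdlib-only illustration: the provider functions are hypothetical stand-ins for real LLM SDK calls, and the 2-second timeout is an assumed value.

```python
import asyncio

# Hypothetical providers; in production these would wrap real LLM client SDKs.
async def call_primary(prompt: str) -> str:
    raise TimeoutError("primary provider unavailable")  # simulate an outage

async def call_fallback(prompt: str) -> str:
    return f"fallback answer for: {prompt}"

async def route(prompt: str, providers) -> str:
    """Try providers in priority order; fall through to the next on failure."""
    last_exc = None
    for provider in providers:
        try:
            # Bound each attempt so a slow provider cannot stall the request.
            return await asyncio.wait_for(provider(prompt), timeout=2.0)
        except (TimeoutError, asyncio.TimeoutError) as exc:
            last_exc = exc  # record the error and try the next provider
    raise RuntimeError("all providers failed") from last_exc

result = asyncio.run(route("hello", [call_primary, call_fallback]))
```

Here the primary provider fails, so the request is transparently served by the fallback; callers never see the intermediate error.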
- Stateless architecture for horizontal scaling
- Async pipelines for non-blocking execution
- Cache-first design to reduce latency and cost
- Backpressure handling for stability
- API-first approach for extensibility
- ~40% latency reduction
- ~30% cost reduction
- Stable under sustained and burst traffic
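The cache-first design noted above checks the cache before doing any expensive work (an LLM call or vector search), which is where latency and cost savings come from. A minimal sketch, using an in-process dict with a TTL as a stand-in for Redis; the key names and 60-second TTL are illustrative assumptions:

```python
import time

cache: dict[str, tuple[float, str]] = {}  # in-memory stand-in for Redis
TTL = 60.0  # seconds; assumed expiry window

calls = 0  # counts how often we fall through to the expensive path

def expensive_lookup(key: str) -> str:
    global calls
    calls += 1
    return f"value:{key}"  # stand-in for an LLM call or vector search

def get(key: str) -> str:
    entry = cache.get(key)
    if entry is not None and time.monotonic() - entry[0] < TTL:
        return entry[1]  # cache hit: skip the expensive call entirely
    value = expensive_lookup(key)
    cache[key] = (time.monotonic(), value)  # populate on miss
    return value

a = get("doc:1")  # miss: computes and caches
b = get("doc:1")  # hit: served from cache, no second computation
```

With Redis the same pattern uses `GET`/`SETEX`, and the cache is shared across stateless service instances rather than per-process.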
```mermaid
graph TD
    Client --> CDN
    CDN --> LB[Load Balancer]
    LB --> API[FastAPI Gateway]
    API --> Cache[Redis Cache]
    API --> Workers[Async Workers]
    Workers --> VectorDB[FAISS Index]
    VectorDB --> LLM[LLM Providers]
    LLM --> Response
```
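The retrieval step in the diagram — workers querying the FAISS index for semantically similar documents — reduces to nearest-neighbour search over embedding vectors. A toy brute-force sketch in pure Python (FAISS does the same ranking over millions of vectors with optimized index structures; the document names and vectors here are fabricated examples):

```python
import math

# Toy embeddings; in the real system these come from an embedding model
# and are stored in a FAISS index rather than a dict.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def search(query_vec, k=2):
    """Rank all documents by cosine similarity and return the top k."""
    scored = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return scored[:k]

top = search([1.0, 0.1, 0.0])  # closest to doc_a, then doc_b
```

The retrieved documents are then passed as context to the LLM routing layer, which is the core of the RAG flow shown above.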
- Distributed systems and system design
- High-throughput backend engineering
- AI infrastructure and LLM systems
- Performance optimization and reliability
- Backend Engineer (Distributed Systems)
- AI Infrastructure
- High-Concurrency Systems