Backend System Design for AI
A comprehensive collection of handwritten notes on backend system design concepts for AI applications.
Download the notes: https://drive.google.com/drive/folders/1HeiTfb70as7mTTbolUF_78RXx68L5zGT?usp=drive_link
Networking, Web & Security
System Design & Architecture
5.1. System Design — preview
5.2. Monolith vs Microservices — preview
5.3. Scaling, Load Balancing, Capacity Estimation, Consistent Hashing — preview
5.4. API Design, Patterns, GraphQL, gRPC, Streaming APIs for LLMs — preview
5.5. Message Queues, Kafka, RabbitMQ, Pub/Sub, Event-Driven Architecture, DB as Queues — preview
5.6. Eventual Consistency, AI System Reliability, Model Fallback Strategy — preview
1.1. Networking and Web Fundamentals
Networking and Web Fundamentals
What Happens When You Enter google.com
DNS
DNS Records
1.2. WebSockets + Authentication & Security
WebSockets
Authentication and Security
Sessions
Hashing and Salting
1.3. Token Based Auth & Rate Limiting
Token Based Auth
Access Control List and Rule Engine
Rate Limiting
Distributed Rate Limiting
1.4. Prompt Injection & PII Masking
API Keys
Prompt Injection
PII Masking Implementation
1.5. MCP Tool Authorisation
MCP Tool Authorisation
Checklist Before Executing a Tool
2.1. Database Fundamentals
Database and Storage Fundamentals
SQL vs NoSQL
NoSQL Types
Transactions
2.2. Indexes, Query Optimisation, Normalisation & Denormalisation
Database Optimisation
Indexing Strategies
Query Optimisation
Normalisation vs Denormalisation
2.3. Bloom Filter & Location Based Databases
Bloom Filters
Location Based Databases
2.4. Sharding & Replication
Distributed Databases
Types of Sharding
Replication
2.5. DB Migration, Connection Pooling & NoSQL Optimisation
NoSQL Optimisation
DB Migration
Connection Pooling
3. Vector DB, Metadata Filtering & Hybrid Search
AI Specific Storage
Why Indexing
Embedding Storage
Metadata Filtering
Hybrid Search
RAG and Document Retrieval
4.1. Caching, Types of Caching, Redis and CDN
Caching
Caching Strategies
Redis
Content Delivery Network
4.2. Distributed Caching, Cache Replacement Policies, Thrashing, AI Based Caching
Distributed Caching
Cache Replacement Policies
Cache Thrashing
AI Response Caching
5.1. System Design
System Design
Trade-offs and Limitations
5.2. Monolith vs Microservices
Monolith vs Microservices
Microservices Architecture
Monolith to Microservice Migration
5.3. Scaling, Load Balancing, Capacity Estimation, Consistent Hashing
Scaling
Real System Example
Capacity Estimation
Load Balancing
Consistent Hashing
Virtual Nodes
Scaling LLM Workloads
Cost Based Scaling
5.4. API Design, Patterns, GraphQL, gRPC, Streaming APIs for LLMs
REST API Design
Best Practices in REST API Design
GraphQL
gRPC
Asynchronous APIs
API Gateway Pattern
Backend for Frontend
Streaming APIs for LLMs
5.5. Message Queues, Kafka, RabbitMQ, Pub/Sub, Event-Driven Architecture, DB as Queues
Message Queues
Kafka
RabbitMQ
Publish Subscribe Model
Event-Driven Architecture
Database as Queue
AI Task Queues
5.6. Eventual Consistency, AI System Reliability, Model Fallback Strategy
Eventual Consistency
AI System Reliability
Model Fallback Strategies