"What separates your weekend project from Netflix or Uber? It's not just more servers or code, it's the blueprint. It's system design. : )"
Welcome to the most comprehensive, hands-on system design course that takes you from zero to hero!
- Episode 1: System Design Fundamentals ✓
- Episode 2: Monolith vs Microservices ✓
- Episode 3: Functional vs Non-Functional Requirements ✓
- Episode 4: Horizontal vs Vertical Scaling ✓
- Episode 5: Stateless vs Stateful Systems ✓
- Episode 6: Load Balancing ✓
- Episode 7: -
- And so on...
Watch the Video | Read the Notes | View Presentation
- What is System Design and why it matters
- High-Level Design (HLD) vs Low-Level Design (LLD)
- Real example: Designing a URL Shortener
- Hands-on: Build your first system architecture
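
To make the URL shortener example concrete, here is a minimal sketch of one common approach: give each long URL a numeric ID and encode that ID in base 62. The episode may use a different scheme. The `UrlShortener` class and its in-memory list are hypothetical stand-ins for a real service and database.

```python
import string

# 62 URL-safe characters: 0-9, then a-z, then A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n):
    """Turn a non-negative integer ID into a short base-62 code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code):
    """Invert encode_base62: map a short code back to its integer ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


class UrlShortener:
    """Hypothetical in-memory store; a real system would use a database."""

    def __init__(self):
        self._urls = []  # list index doubles as the numeric ID

    def shorten(self, long_url):
        self._urls.append(long_url)
        return encode_base62(len(self._urls) - 1)

    def resolve(self, code):
        return self._urls[decode_base62(code)]
```

With this alphabet, `encode_base62(62)` returns `"10"`. Seven base-62 characters cover 62^7 (roughly 3.5 trillion) IDs, which is why short codes stay short even at large scale.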

System Design = Software Architecture Blueprint
├── High-Level Design (HLD) - The Big Picture
│   ├── Major Components & Services
│   ├── Technology Stack Decisions
│   ├── Data Flow Architecture
│   └── Third-party Integrations
└── Low-Level Design (LLD) - The Details
    ├── Classes, Methods & Data Structures
    ├── Database Schemas & Relationships
    ├── Algorithms & Implementation Logic
    └── Error Handling & Edge Cases

Watch the Video | Read the Notes
- What monolithic architecture is and when to use it
- What microservices architecture is and its benefits
- Real-world examples: Netflix's evolution and Uber's architecture
- Practical decision framework for choosing between approaches

Architecture Patterns Comparison
├── Monolithic Architecture
│   ├── Single Codebase & Deployable Unit
│   ├── Shared Resources & Database
│   ├── Advantages: Simple, Fast, Easy to Debug
│   └── Challenges: Scalability, Technology Lock-in
└── Microservices Architecture
    ├── Independent Services & Databases
    ├── Distributed System Architecture
    ├── Advantages: Scalability, Flexibility, Fault Isolation
    └── Challenges: Complexity, Operational Overhead

Watch the Video | Read the Notes
- What requirements are and why they're critical to system design
- The difference between functional and non-functional requirements
- How to identify and document both requirement types
- Real-world example: Online bookstore requirements breakdown
- The requirements elicitation process
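
Availability targets such as 99.9% and 99.99% are easier to reason about as a downtime budget. The sketch below does that arithmetic, assuming a 365-day year; the function name is illustrative and not part of the course material.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def allowed_downtime_seconds(availability_percent):
    """Seconds per year a system may be down and still meet the target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def allowed_downtime_hours(availability_percent):
    """Same budget as above, expressed in hours."""
    return allowed_downtime_seconds(availability_percent) / 3600
```

Under these assumptions, 99.9% leaves about 8.76 hours of downtime per year, while 99.99% leaves about 52.6 minutes. Each extra nine cuts the budget by a factor of ten.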

Requirements = Foundation of System Design
├── Functional Requirements (WHAT the system does)
│   ├── User Actions & Features
│   ├── System Operations & Business Logic
│   ├── Data Processing & Integrations
│   └── Example: User can create account, add to cart, checkout
└── Non-Functional Requirements (HOW WELL it performs)
    ├── Performance: Response time, load time
    ├── Scalability: Concurrent users, data growth
    ├── Availability: Uptime (99.9%, 99.99%)
    ├── Security: Encryption, authentication, compliance
    ├── Usability: User experience, accessibility
    ├── Maintainability: Code quality, integration time
    └── Portability: Cross-platform, deployment flexibility

Watch the Video | Read the Notes
- What scalability means across three dimensions (load, data, compute)
- Vertical scaling: Making one machine more powerful
- Horizontal scaling: Distributed systems engineering
- Real-world examples: Netflix's evolution and AWS instances
- Decision matrix and practical frameworks for choosing the right approach
- Monitoring, metrics, and autoscaling strategies
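
As an illustration of reactive autoscaling, the sketch below applies the proportional rule used by target-tracking autoscalers such as the Kubernetes Horizontal Pod Autoscaler: scale the replica count by the ratio of the observed metric to its target. The CPU figures and the replica bounds are hypothetical defaults chosen for the example.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu,
                     min_replicas=1, max_replicas=10):
    """Return the replica count that should bring average CPU near target.

    current_cpu and target_cpu are average utilization percentages.
    """
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    # Proportional rule: more load per replica means more replicas.
    raw = math.ceil(current_replicas * current_cpu / target_cpu)
    # Clamp so a metric spike cannot scale without bound.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas running at 90% CPU against a 60% target yields 6 replicas, and the same 4 replicas at 30% CPU yields 2. Real autoscalers typically add cooldown periods and tolerance bands to avoid flapping.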

Scaling Strategies Comparison
├── Vertical Scaling (Scale Up)
│   ├── Upgrade CPU, RAM, Storage on single machine
│   ├── AWS Example: r6i.large → r6i.24xlarge (48x the vCPUs and memory)
│   ├── Advantages: Simple, no code changes, ACID consistency
│   └── Challenges: Physical limits, single point of failure, cost
├── Horizontal Scaling (Scale Out)
│   ├── Add more servers, distribute load
│   ├── Requires: Stateless architecture, load balancers
│   ├── Advantages: Near-unlimited scale, fault tolerance, flexibility
│   └── Challenges: CAP theorem, network latency, complexity
└── Hybrid Approach (Best of Both)
    ├── Vertical for databases, horizontal for app servers
    ├── Netflix: 1000+ microservices, 300M+ users
    └── Autoscaling: Reactive, predictive, serverless

Watch the Video | Read the Notes
- What "state" means in software systems (memory and session data)
- Stateless systems: Vending machine analogy and REST APIs
- Stateful systems: Bank teller analogy and session management
- Hybrid architecture: Stateless app tier + external state stores
- Real-world examples: Netflix, Amazon, and WhatsApp architectures
- Decision framework for choosing the right approach
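
To show what it means for a stateless request to carry its own context, here is a minimal sketch of a signed token that any server can verify without looking up a session. It is a simplified stand-in for a format like JWT, not a production scheme. The `SECRET` value and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical; real systems load this from a secret store


def issue_token(claims):
    """Serialize claims and append an HMAC so servers can verify them later."""
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + signature


def verify_token(token):
    """Return the claims if the signature is valid, otherwise None."""
    payload, sep, signature = token.rpartition(".")
    if not sep:
        return None  # no separator, so this is not one of our tokens
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

Because verification needs only the shared secret, any server instance can handle any request, which is what makes stateless servers interchangeable. The trade-off is that every request carries the token, so requests are slightly larger.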

State Management Strategies
├── Stateless Systems (Amnesia Design)
│   ├── No server memory between requests
│   ├── Every request includes full context (tokens, auth)
│   ├── Advantages: Perfect clones, easy scaling, fault tolerance
│   └── Challenges: Chattier requests, external state needed
├── Stateful Systems (Memory Design)
│   ├── Server remembers session context
│   ├── Requires: Sticky sessions, session storage
│   ├── Advantages: Efficient, fast (in-memory), simple client
│   └── Challenges: Sticky sessions, fragile, scaling hard
└── Hybrid Architecture (Modern Approach)
    ├── Stateless application servers
    ├── Centralized state in Redis/DynamoDB/Cassandra
    ├── Netflix: Stateless microservices + Cassandra
    ├── Amazon: Stateless servers + DynamoDB carts
    └── WhatsApp: Stateful connections for real-time (2B users)

Watch the Video | Read the Notes
- What load balancing is and its critical role in distributed systems
- Primary objectives: Scalability and High Availability
- 9 load balancing algorithms and when to use each
- Health monitoring: L4 (TCP) vs L7 (HTTP) checks
- Session persistence strategies (IP Hash vs Cookie-Based)
- Real-world example: Netflix's multi-layer architecture
- Load balancer types: Hardware, Software, and Cloud
- L4 vs L7 load balancing and their trade-offs
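
Three of the algorithms covered in this episode (Round Robin, Least Connections, and IP Hash) are small enough to sketch directly. The class and function names below are illustrative, and the code ignores health checks and concurrency, which a real load balancer must handle.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection closes so the count stays accurate.
        self.active[server] -= 1


def ip_hash(servers, client_ip):
    """Map a client IP to the same server every time (sticky sessions)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Round Robin needs no per-request state beyond its position in the rotation, while Least Connections must track open connections, which is the "state-aware" cost. Note that this simple modulo-based `ip_hash` remaps most clients whenever the server list changes; consistent hashing is the usual remedy.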

Load Balancing Strategies
├── Primary Objectives
│   ├── Scalability: Horizontal scaling with commodity servers
│   └── High Availability: 99.99% uptime, health checks, failover
├── Load Balancing Algorithms (9 total)
│   ├── Round Robin: Simple, zero overhead, default
│   ├── Weighted Round Robin: Heterogeneous hardware capacity
│   ├── Least Connections: Dynamic, state-aware
│   ├── Weighted Least Connections: Best of both worlds
│   ├── Least Response Time: Latency + connections
│   ├── Resource-Based: CPU/memory monitoring with agents
│   ├── Geographic (GSLB): DNS-based, multi-region
│   ├── IP Hash: Sticky sessions (L4)
│   └── Cookie-Based: Sticky sessions (L7)
├── Session Persistence
│   ├── IP Hash: Simple but NAT/proxy issues
│   └── Cookie-Based: Robust L7 solution
├── Real-World Architecture
│   ├── Netflix: GSLB → AWS ELB → Zuul → Microservices
│   ├── 300M subscribers, 1000+ microservices
│   └── Path-based routing (/play, /browse)
├── Load Balancer Types
│   ├── Hardware: F5 BIG-IP, specialized silicon
│   ├── Software: HAProxy, NGINX (flexible, cheap)
│   └── Cloud: AWS ALB/NLB (managed, auto-scaling)
└── L4 vs L7 Load Balancing
    ├── L4: IP/port level, fast (<1ms), simple
    └── L7: Content-based routing, SSL termination, microservices

We love contributions! Here's how you can help make this course even better:
- Report bugs or issues
- Suggest new topics or improvements
- Improve documentation
- Create better diagrams
- Add more examples
- YouTube: Subscribe for new episodes
- LinkedIn: Connect with Harsh
This project is licensed under the MIT License - see the LICENSE file for details.