Distribute traffic across servers
Load balancing distributes incoming requests across multiple servers so that no single server is overwhelmed. Think of it as a traffic cop directing cars into different lanes: users don't notice which server handles their request, they just get fast responses. It is essential for high-traffic apps, redundancy, and zero-downtime deploys. Common strategies: Round Robin (servers take requests in a fixed rotation), Least Connections (each request goes to the server with the fewest active connections), IP Hash (the client's IP address is hashed, so the same client keeps reaching the same server as long as the server pool doesn't change).
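To make the three strategies concrete, here is a minimal Python sketch of how each one might pick a server. It is an illustration under stated assumptions, not how any particular load balancer is implemented: the server names, the class names, and the choice of SHA-256 for hashing are all made up for this example.

```python
import hashlib
from itertools import cycle

# Hypothetical server names, used only for illustration.
SERVERS = ["app-1", "app-2", "app-3"]


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Send each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a request finishes.
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    """Map a client IP to a server; the same IP gives the same result."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


if __name__ == "__main__":
    rr = RoundRobin(SERVERS)
    print([rr.pick() for _ in range(4)])

    lc = LeastConnections(SERVERS)
    first = lc.pick()
    lc.release(first)  # the first request finished, so that server is idle again
    print(lc.pick())

    print(ip_hash("203.0.113.7", SERVERS) == ip_hash("203.0.113.7", SERVERS))
```

Note that this simple modulo-based IP hash reassigns many clients whenever a server is added or removed, so the "same user, same server" property only holds while the server list stays the same.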
Use load balancing when one server can't handle all the traffic, when you need redundancy (if one server dies, the others keep working), or when you want zero-downtime deploys (update servers one at a time while the rest keep serving requests). Start simple with platforms like Vercel or Netlify, which handle it automatically. Add an explicit load balancer (AWS ALB, nginx) once you run multiple backend servers yourself.
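The redundancy and one-at-a-time deploy ideas above can be sketched the same way. The `Pool` class, the `rolling_deploy` function, and the `deploy` callback below are hypothetical names for this example. A real load balancer would typically also run health checks and wait for in-flight requests to finish before updating a server.

```python
class Pool:
    """Round-robin pool that skips servers taken out of rotation."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.in_rotation = set(servers)
        self._next = 0

    def remove(self, server):
        # Stop sending new requests to this server (failed or being updated).
        self.in_rotation.discard(server)

    def restore(self, server):
        self.in_rotation.add(server)

    def pick(self):
        # Try each server at most once, starting from the rotation position.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.in_rotation:
                return server
        raise RuntimeError("no servers in rotation")


def rolling_deploy(pool, deploy):
    """Update one server at a time so the others keep serving traffic."""
    for server in pool.servers:
        pool.remove(server)
        deploy(server)  # hypothetical deploy step supplied by the caller
        pool.restore(server)


if __name__ == "__main__":
    pool = Pool(["app-1", "app-2", "app-3"])

    pool.remove("app-2")  # simulate app-2 going down
    print([pool.pick() for _ in range(4)])  # traffic continues on the others
    pool.restore("app-2")

    rolling_deploy(pool, lambda server: print("updating", server))
```

If `deploy` raises an exception, the server stays out of rotation, which is usually the safer default for a server left in an unknown state.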
System Design Patterns