
What is Horizontal Scaling?

Horizontal scaling (scaling out) means adding more servers to handle increased load: 1 server → 10 servers → 100 servers. It is the opposite of vertical scaling (scaling up), which means upgrading a single server to be bigger or faster. Horizontal scaling is how cloud apps such as Netflix and Facebook handle massive scale. It requires a load balancer to distribute traffic across the servers. It is more complex than vertical scaling, but capacity is no longer capped by the largest single machine you can buy. Most startups start vertical and go horizontal when needed.
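To make the load balancer's role concrete, here is a minimal sketch of round-robin distribution, one of the simplest balancing strategies. The class name `RoundRobinBalancer` and the server names are hypothetical, chosen for illustration only. Real load balancers also handle health checks, connection draining, and weighting.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in rotating order (illustrative only)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def next_server(self):
        # Each call returns the next server in the rotation,
        # wrapping around to the first one after the last.
        return next(self._rotation)


# Hypothetical server names, not from any real deployment.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
# ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Scaling out then amounts to giving the balancer a longer server list: each server receives a smaller share of the same traffic.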

When Should You Use This?

Scale horizontally when you've maxed out vertical scaling (you are already on the biggest server available), when you need redundancy (if one server dies, the others continue), or when traffic is unpredictable (servers are added and removed automatically as load changes). Cloud platforms (AWS, Vercel) make horizontal scaling easy with auto-scaling. Start with vertical scaling: it's simpler, and it is enough until you need high availability or massive scale.
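The auto-scaling decision can be sketched as a proportional rule: size the fleet so that the average load per server lands near a target. This is a simplified illustration, not any specific platform's algorithm. The function name `desired_servers`, its parameters, and the example numbers are all assumptions.

```python
import math


def desired_servers(current, metric_value, target_value,
                    min_servers=1, max_servers=100):
    """Return a server count that moves the average metric toward the target.

    current       -- servers running now
    metric_value  -- observed average per-server metric (e.g. CPU %)
    target_value  -- the level we want that metric to sit at
    """
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value > 0")
    # If servers run at 90% and the target is 60%, we need 1.5x as many.
    raw = math.ceil(current * metric_value / target_value)
    # Clamp so a metric spike cannot scale the fleet without bound.
    return max(min_servers, min(max_servers, raw))


print(desired_servers(10, 90, 60))                   # 15 (scale out)
print(desired_servers(10, 20, 60))                   # 4  (scale in)
print(desired_servers(10, 600, 60, max_servers=50))  # 50 (capped)
```

The upper bound guards against runaway cost, and the lower bound keeps enough servers running for redundancy.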

Common Mistakes to Avoid

  • Premature horizontal scaling: vertical is simpler, so start there
  • Stateful servers: sessions stored on individual servers break once requests are load balanced across several servers, so use an external session store
  • No auto-scaling: manually adding servers is slow, so automate it
  • Wrong metric: scaling only on CPU can miss a memory bottleneck, so scale on the resource that actually runs out first
  • Over-provisioning: running 100 servers when 10 would do wastes money
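The stateful-server mistake is easier to see in code. The sketch below keeps sessions in a shared store so that any server can answer any request. The classes `SessionStore` and `AppServer` are hypothetical, and a plain in-memory dict stands in for a real external store (a database or cache reachable by every server).

```python
import uuid


class SessionStore:
    """Stand-in for an external session store shared by all servers."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, data):
        self._data[session_id] = dict(data)

    def load(self, session_id):
        return self._data.get(session_id)


class AppServer:
    """A stateless app server: it holds no session data of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self._store.save(session_id, {"user": user})
        return session_id

    def whoami(self, session_id):
        session = self._store.load(session_id)
        return session["user"] if session else None


shared = SessionStore()
server_a = AppServer("app-1", shared)
server_b = AppServer("app-2", shared)

# The login lands on one server, the next request on another.
sid = server_a.login("alice")
print(server_b.whoami(sid))  # alice
```

If each `AppServer` kept its own private dict instead, the second call would return `None` whenever the load balancer routed the request to a different server than the one that handled the login.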

Real-World Examples

  • AWS Auto Scaling: automatically adds servers when traffic spikes
  • Vercel: horizontally scales Next.js apps automatically
  • E-commerce: on Black Friday, 10 servers → 100 servers → back to 10
  • SaaS apps: start with 1-2 servers, scale to hundreds as the user base grows

Category

System Design Patterns

Tags

scaling, horizontal-scaling, cloud, infrastructure, auto-scaling
