API rate limiting restricts how many requests a user or IP address can make within a time window. Typical rules are "100 requests per minute" or "1000 requests per hour." It prevents abuse (DDoS, scraping), protects infrastructure, and enforces pricing tiers (for example, free tier: 100 req/day; paid: unlimited). When a client exceeds its limit, return HTTP 429 "Too Many Requests." Rate limiting is essential for public APIs. Common algorithms: Token Bucket, Leaky Bucket, Fixed Window.
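As an illustration of one of the algorithms named above, here is a minimal Token Bucket sketch in Python. The class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this example, not part of any particular library. The bucket holds up to `capacity` tokens, refills at `refill_rate` tokens per second, and each request spends one token. A rule like "100 requests per minute" would roughly correspond to `capacity=100` and `refill_rate=100/60`.

```python
import time


class TokenBucket:
    """Hypothetical in-memory token bucket for a single client."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)        # maximum burst size
        self.refill_rate = float(refill_rate)  # tokens added per second
        self.clock = clock                     # injectable for testing
        self.tokens = self.capacity            # start with a full bucket
        self.last_refill = self.clock()

    def allow(self, cost=1.0):
        """Return True if the request may proceed, False if it is limited."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would typically keep one bucket per user or IP and respond with 429 whenever `allow()` returns False. This sketch is single-process and not thread-safe; a shared store would be needed across multiple servers.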
Implement rate limiting on every public API, and in particular when you need to enforce free/paid tier limits, prevent abuse or scraping, or protect infrastructure from overload. Internal APIs benefit too: a limit keeps a buggy client from taking down the system. Prefer rate limiting at a CDN or API gateway (Cloudflare, AWS API Gateway) before rolling your own.
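If you do end up rolling your own, tier enforcement with a Fixed Window counter might look like the sketch below. Everything named here is hypothetical: the `TIER_LIMITS` table (mirroring the free: 100 req/day, paid: unlimited example), the `FixedWindowLimiter` class, and the `(status, headers)` return shape. The `Retry-After` header is a standard HTTP header that is often sent alongside a 429 response.

```python
import math
import time

# Hypothetical tier table: requests allowed per window; None means unlimited.
TIER_LIMITS = {"free": 100, "paid": None}
WINDOW_SECONDS = 24 * 60 * 60  # one-day fixed window


class FixedWindowLimiter:
    """Hypothetical in-memory fixed-window counter keyed by user ID."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.counters = {}  # user_id -> (window_index, request_count)

    def check(self, user_id, tier):
        """Return (status_code, headers) for one incoming request."""
        limit = TIER_LIMITS[tier]
        if limit is None:
            return 200, {}  # unlimited tier: nothing to count
        now = self.clock()
        window = int(now // WINDOW_SECONDS)
        prev_window, count = self.counters.get(user_id, (window, 0))
        if prev_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= limit:
            # Seconds until the current window ends and the counter resets.
            retry_after = math.ceil((window + 1) * WINDOW_SECONDS - now)
            return 429, {"Retry-After": str(max(1, retry_after))}
        self.counters[user_id] = (window, count + 1)
        return 200, {}
```

Fixed windows are simple but allow bursts at window boundaries (a client can spend its full quota just before and just after a reset), which is one reason a gateway-provided limiter is usually the safer first choice.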
System Design Patterns
Limit requests per user