API Rate Limiting Algorithms Explained

Token bucket, leaky bucket, and sliding window explained with practical trade-offs for production APIs.


Rate limiting protects your API from abuse and ensures fair resource allocation. This guide compares the most common algorithms and their trade-offs.

Why Rate Limit

Rate limiting serves multiple purposes:

  • Prevent DoS attacks
  • Ensure fair usage among clients
  • Control infrastructure costs
  • Maintain service quality during traffic spikes

Token Bucket

The token bucket algorithm allows short bursts while enforcing an average rate over time.

How It Works

  1. Bucket holds tokens (e.g., 100 tokens)
  2. Tokens replenish at fixed rate (e.g., 10/second)
  3. Each request consumes 1 token
  4. If no tokens available, request is rejected

class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;     // maximum tokens the bucket can hold
    this.tokens = capacity;       // start full so an initial burst is allowed
    this.refillRate = refillRate; // tokens added per second
    this.lastRefill = Date.now();
  }

  // Try to take one token; returns true if the request may proceed.
  consume() {
    this.refill();
    if (this.tokens >= 1) { // tokens can be fractional after refill, so check for a whole token
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Add tokens for the time elapsed since the last refill, capped at capacity.
  refill() {
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillRate);
    this.lastRefill = now;
  }
}
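
To see the burst-then-throttle behavior, drive the bucket faster than it refills. A minimal sketch using the numbers from the example above (the loop and names are illustrative, not part of any API):

// 100-token capacity, refilling at 10 tokens/second.
const bucket = new TokenBucket(100, 10);

// A burst of 105 requests arrives at once: the first 100 drain the
// bucket and succeed, the last 5 are rejected until tokens refill.
let allowed = 0;
for (let i = 0; i < 105; i++) {
  if (bucket.consume()) allowed++;
}
console.log(`${allowed} of 105 requests allowed`); // 100 of 105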

Best For

APIs that need to handle occasional bursts while maintaining average rate.

Leaky Bucket

Requests enter a queue and are processed at a constant rate, smoothing bursts into a steady stream.

Implementation

class LeakyBucket {
  constructor(capacity, leakRate, processRequest) {
    this.capacity = capacity;             // maximum queued requests
    this.queue = [];
    this.processRequest = processRequest; // handler invoked as requests leak out

    // Drain the queue at a steady leakRate requests per second.
    this.timer = setInterval(() => this.leak(), 1000 / leakRate);
  }

  // Enqueue a request; returns false if the bucket is full.
  add(request) {
    if (this.queue.length < this.capacity) {
      this.queue.push(request);
      return true;
    }
    return false; // queue full, request is shed
  }

  // Process exactly one queued request per tick.
  leak() {
    if (this.queue.length > 0) {
      this.processRequest(this.queue.shift());
    }
  }
}
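
A minimal sketch of driving the queue; the logging handler stands in for whatever actually serves the request:

// Queue up to 5 requests, draining at 2 per second.
const bucket = new LeakyBucket(5, 2, req => console.log('processed request', req.id));

// 8 requests arrive in a burst; 5 queue up and drain at the steady
// rate, the remaining 3 are shed immediately.
for (let i = 1; i <= 8; i++) {
  if (!bucket.add({ id: i })) {
    console.log(`request ${i} rejected: queue full`);
  }
}

The interval keeps running for the lifetime of the bucket; call clearInterval(bucket.timer) when shutting down.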

Best For

Smoothing out traffic spikes and maintaining predictable output rate.

Sliding Window

Track the timestamp of every request in a rolling window (the sliding window log approach) for precise rate limiting.


class SlidingWindow {
  constructor(limit, windowMs) {
    this.limit = limit;       // max requests per window
    this.windowMs = windowMs; // window length in milliseconds
    this.requests = [];       // timestamps of recent requests
  }

  // Returns true if the request fits within the rolling window.
  allow() {
    const now = Date.now();
    const windowStart = now - this.windowMs;

    // Drop timestamps that have aged out of the window.
    this.requests = this.requests.filter(t => t > windowStart);

    if (this.requests.length < this.limit) {
      this.requests.push(now);
      return true;
    }
    return false;
  }
}
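
A short sketch of the rolling-window check (the values are illustrative):

// At most 3 requests per rolling one-second window.
const limiter = new SlidingWindow(3, 1000);

// Four back-to-back requests: the first three are allowed, the fourth
// is rejected until one of the earlier timestamps ages out.
for (let i = 1; i <= 4; i++) {
  console.log(`request ${i}: ${limiter.allow() ? 'allowed' : 'rejected'}`);
}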

Best For

Precise rate limiting without the boundary problem of fixed windows, where a client straddling two windows can briefly send up to twice the limit.

Algorithm Comparison

Algorithm        Memory  Bursts    Complexity
Token Bucket     O(1)    Allowed   Low
Leaky Bucket     O(n)    Smoothed  Medium
Sliding Window   O(n)    Allowed   Medium

Here n is the queue length (leaky bucket) or the number of timestamps tracked per window (sliding window).

Recommendations

  • Token Bucket: Most APIs (good balance)
  • Leaky Bucket: When steady output rate is critical
  • Sliding Window: When precision matters most

Summary

Choose token bucket for most use cases as it balances simplicity with flexibility. Use leaky bucket when you need predictable processing rates. Implement sliding window when you need precise control without edge cases. All algorithms can be distributed using Redis for multi-server deployments.
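
As one example of that last point, the sliding window maps naturally onto a Redis sorted set. This is a minimal sketch assuming the ioredis client and a per-client key scheme; both are assumptions, not something this guide prescribes:

const Redis = require('ioredis');
const redis = new Redis(); // assumes Redis reachable on localhost:6379

// Shared sliding-window check: timestamps live in a sorted set keyed per
// client, so every server instance counts against the same window.
async function allowRequest(clientId, limit, windowMs) {
  const now = Date.now();
  const key = `ratelimit:${clientId}`; // hypothetical key scheme

  const results = await redis
    .multi()
    .zremrangebyscore(key, 0, now - windowMs)  // drop timestamps outside the window
    .zadd(key, now, `${now}:${Math.random()}`) // record this request (unique member)
    .zcard(key)                                // count requests now in the window
    .pexpire(key, windowMs)                    // let idle keys expire
    .exec();

  const count = results[2][1]; // result of zcard
  return count <= limit;       // note: rejected requests still occupy the window in this sketch
}

Under heavy contention the same steps are often moved into a Lua script so the check-and-record runs atomically on the Redis side.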