How to Implement Rate Limiting in Node.js (2026)
Rate limiting prevents abuse, protects your API, and ensures fair usage. Here's how to implement it properly in Node.js — from simple middleware to production-grade Redis-based solutions.
Why Rate Limit?
- Prevent abuse: Stop bots and scrapers from hammering your API
- Protect infrastructure: Avoid overwhelming your database or servers
- Fair usage: Ensure all users get reasonable access
- Cost control: Prevent runaway API costs from a single user
Approach 1: Express Rate Limit (Simplest)
Install it and wire up a limiter in a few lines:
npm install express-rate-limit
import express from 'express'
import rateLimit from 'express-rate-limit'

const app = express()

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per IP per window
  standardHeaders: true, // send standard RateLimit-* headers
  legacyHeaders: false, // disable the older X-RateLimit-* headers
  message: { error: 'Too many requests, please try again later.' },
})

app.use('/api/', limiter)
Pros: Zero config, works immediately.
Cons: In-memory only, so counts reset on server restart and aren't shared across multiple instances.
Best for: Side projects, prototypes, single-server deployments
Approach 2: Redis-Based (Production)
For multi-server deployments, use Redis as the rate limit store:
npm install express-rate-limit rate-limit-redis ioredis
import rateLimit from 'express-rate-limit'
import RedisStore from 'rate-limit-redis'
import Redis from 'ioredis'
const redis = new Redis(process.env.REDIS_URL)

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  store: new RedisStore({
    sendCommand: (...args) => redis.call(...args),
  }),
})
app.use('/api/', limiter)
Pros: Works across multiple servers, persists across restarts.
Cons: Requires Redis infrastructure.
Best for: Production APIs with multiple instances
Approach 3: Upstash Rate Limit (Serverless)
For serverless (Vercel, Cloudflare Workers, AWS Lambda):
npm install @upstash/ratelimit @upstash/redis
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '10 s'), // 10 requests per 10 seconds
  analytics: true,
})
// In your API handler
export async function POST(request) {
  // x-forwarded-for can be a comma-separated chain; the left-most entry is the client
  const ip = (request.headers.get('x-forwarded-for') ?? '127.0.0.1').split(',')[0].trim()
  const { success, limit, remaining, reset } = await ratelimit.limit(ip)

  if (!success) {
    return new Response('Rate limited', {
      status: 429,
      headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': remaining.toString(),
        'X-RateLimit-Reset': reset.toString(),
      },
    })
  }

  // Handle the request normally
}
Pros: Serverless-native, no infrastructure to manage, built-in analytics.
Cons: External dependency, costs money at scale.
Best for: Serverless APIs on Vercel, Cloudflare, AWS Lambda
Approach 4: Custom Token Bucket
For full control, implement the token bucket algorithm:
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity // max tokens the bucket can hold
    this.tokens = capacity // current token count
    this.refillRate = refillRate // tokens added per second
    this.lastRefill = Date.now()
  }

  consume(tokens = 1) {
    this.refill()
    if (this.tokens >= tokens) {
      this.tokens -= tokens
      return true // allowed
    }
    return false // rate limited
  }

  refill() {
    const now = Date.now()
    const elapsed = (now - this.lastRefill) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate)
    this.lastRefill = now
  }
}
// Usage
const buckets = new Map()

function rateLimitMiddleware(req, res, next) {
  const key = req.ip
  if (!buckets.has(key)) {
    buckets.set(key, new TokenBucket(10, 1)) // 10 tokens, refills at 1/second
  }
  const bucket = buckets.get(key)
  if (bucket.consume()) {
    next()
  } else {
    res.status(429).json({ error: 'Rate limited' })
  }
}
Pros: Full control, no dependencies.
Cons: In-memory only, and you must clean up old buckets yourself.
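That cleanup can be a periodic sweep that evicts buckets idle past a TTL, so the per-IP Map above doesn't grow without bound. A minimal sketch, assuming each bucket exposes lastRefill as in the TokenBucket class above (pruneBuckets and the TTL are illustrative names/values):

```javascript
const BUCKET_TTL_MS = 10 * 60 * 1000 // drop buckets idle for 10+ minutes

function pruneBuckets(map, now = Date.now()) {
  for (const [key, bucket] of map) {
    // lastRefill doubles as a last-seen timestamp (updated on every consume)
    if (now - bucket.lastRefill > BUCKET_TTL_MS) map.delete(key)
  }
}

// Sweep once a minute; unref() keeps the timer from holding the process open
setInterval(() => pruneBuckets(buckets), 60 * 1000).unref()
```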
Rate Limiting Strategies
Per-IP Rate Limiting
Most common. Limit by client IP address.
const limiter = rateLimit({
  keyGenerator: (req) => req.ip,
  max: 100,
  windowMs: 15 * 60 * 1000,
})
Problem: Users behind NAT or a VPN share a single IP, and forwarded-IP headers can be spoofed unless you trust only your own proxy.
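Behind a reverse proxy, the client address arrives in X-Forwarded-For, which may hold a comma-separated chain of hops. A sketch of a defensive key helper (clientKey is an illustrative name; only trust the header when your own proxy sets or overwrites it):

```javascript
// Sketch: derive a rate-limit key from X-Forwarded-For. The header may hold a
// chain ("client, proxy1, proxy2"); the left-most entry is the original client.
function clientKey(headers, fallback = 'unknown') {
  const xff = headers['x-forwarded-for']
  if (!xff) return fallback
  return xff.split(',')[0].trim()
}
```

In Express, also call app.set('trust proxy', 1) when exactly one proxy hop sits in front of you, so req.ip resolves to the client rather than the proxy.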
Per-User Rate Limiting
Better for authenticated APIs. Limit by user ID or API key.
const limiter = rateLimit({
  keyGenerator: (req) => req.user?.id || req.ip,
  max: 1000, // authenticated users get a higher allowance
  windowMs: 60 * 60 * 1000, // per hour
})
Per-Endpoint Rate Limiting
Different limits for different endpoints.
const authLimiter = rateLimit({ windowMs: 15 * 60 * 1000, max: 5 })
const apiLimiter = rateLimit({ windowMs: 60 * 1000, max: 60 })
app.use('/api/auth/login', authLimiter)
app.use('/api/', apiLimiter)
Login endpoints should have stricter limits (prevents brute force).
Tiered Rate Limiting
Different limits based on subscription tier.
// Create one limiter per tier at startup: building a limiter inside the request
// handler would give every request a fresh in-memory store, so nothing would
// ever be limited
const makeLimiter = (max) =>
  rateLimit({
    max,
    windowMs: 60 * 60 * 1000,
    keyGenerator: (req) => req.user?.id || req.ip,
  })

const tierLimiters = {
  free: makeLimiter(100),
  pro: makeLimiter(1000),
  enterprise: makeLimiter(10000),
}

function tieredLimiter(req, res, next) {
  const tier = req.user?.tier || 'free'
  ;(tierLimiters[tier] || tierLimiters.free)(req, res, next)
}
Response Headers
Always include rate limit headers so clients can self-regulate:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1710265200
Retry-After: 60
express-rate-limit sets these automatically: legacyHeaders: true emits the X-RateLimit-* form shown above, while standardHeaders: true emits the newer draft-standard RateLimit-* equivalents.
Common Patterns
Separate Limits for Read vs Write
app.get('/api/*', rateLimit({ max: 1000, windowMs: 60000 }))
app.post('/api/*', rateLimit({ max: 100, windowMs: 60000 }))
Burst Allowance
Allow short bursts but enforce long-term limits:
const burstLimiter = rateLimit({ max: 20, windowMs: 1000 }) // 20/second burst
const sustainedLimiter = rateLimit({ max: 1000, windowMs: 60000 }) // 1000/minute sustained
app.use('/api/', burstLimiter, sustainedLimiter)
Webhook-Friendly Limits
Webhooks from services like Stripe send bursts. Whitelist known IPs or use higher limits:
app.use('/webhooks/', rateLimit({ max: 500, windowMs: 60000 }))
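For the whitelist approach, express-rate-limit accepts a skip predicate, so trusted senders bypass the limiter entirely. A sketch, where the IPs are placeholders for your provider's published webhook source addresses:

```javascript
// Placeholder IPs — substitute your provider's published webhook source ranges
const WEBHOOK_ALLOWLIST = new Set(['192.0.2.10', '192.0.2.11'])

// Pass as the `skip` option of express-rate-limit:
// rateLimit({ windowMs: 60000, max: 500, skip: skipTrustedWebhooks })
const skipTrustedWebhooks = (req) => WEBHOOK_ALLOWLIST.has(req.ip)
```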
Testing Rate Limits
Use curl or ab to test:
# Quick test
for i in {1..20}; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/api/test; done
# Load test with Apache Bench
ab -n 200 -c 10 http://localhost:3000/api/test
FAQ
What status code should I return?
429 Too Many Requests. Always. Include a Retry-After header.
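Libraries handle this for you; for a hand-rolled limiter, a minimal Express-style responder might look like this sketch (sendRateLimited is an illustrative name):

```javascript
// Sketch: reply 429 with a Retry-After header and a machine-readable retry hint
function sendRateLimited(res, retryAfterSeconds = 60) {
  res.set('Retry-After', String(retryAfterSeconds))
  res.status(429).json({
    error: 'Too many requests',
    retryAfter: retryAfterSeconds, // seconds until the client may retry
  })
}
```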
Should I rate limit all endpoints?
Yes, but with different limits. Auth endpoints: strict (5-10/15min). Read APIs: generous (1000/hour). Write APIs: moderate (100/hour).
What about API gateways?
If you use AWS API Gateway, Cloudflare, or nginx, rate limit there instead of in Node. It's more efficient and blocks traffic before it reaches your app.
How do I handle rate limited users gracefully?
Return a clear error message with retry timing. Good clients will back off automatically.
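On the client side, honoring Retry-After might look like this sketch (fetchWithBackoff and the retry cap are illustrative; assumes a global fetch as in Node 18+ and browsers; the last response is returned as-is if retries run out):

```javascript
// Sketch: retry 429 responses after the server's Retry-After delay (in seconds)
async function fetchWithBackoff(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, options)
    if (res.status !== 429 || attempt === maxRetries) return res
    const retryAfter = Number(res.headers.get('retry-after')) || 1
    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000))
  }
}
```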
Bottom Line
Start with express-rate-limit for simple apps. Move to Redis-based rate limiting when you scale to multiple servers. Use Upstash for serverless. Always rate limit authentication endpoints aggressively and return proper headers so clients can self-regulate.