
How to Implement Rate Limiting in Node.js (2026)

Rate limiting prevents abuse, protects your API, and ensures fair usage. Here's how to implement it properly in Node.js — from simple middleware to production-grade Redis-based solutions.

Why Rate Limit?

  • Prevent abuse: Stop bots and scrapers from hammering your API
  • Protect infrastructure: Avoid overwhelming your database or servers
  • Fair usage: Ensure all users get reasonable access
  • Cost control: Prevent runaway API costs from a single user

Approach 1: Express Rate Limit (Simplest)

Install and configure in a few lines:

npm install express-rate-limit
import rateLimit from 'express-rate-limit'

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,                  // 100 requests per window
  standardHeaders: true,     // Return rate limit info in headers
  legacyHeaders: false,
  message: { error: 'Too many requests, please try again later.' },
})

app.use('/api/', limiter)

Pros: Zero config, works immediately
Cons: In-memory only; resets on server restart and doesn't work with multiple instances

Best for: Side projects, prototypes, single-server deployments

Approach 2: Redis-Based (Production)

For multi-server deployments, use Redis as the rate limit store:

npm install express-rate-limit rate-limit-redis ioredis
import rateLimit from 'express-rate-limit'
import RedisStore from 'rate-limit-redis'
import Redis from 'ioredis'

const redis = new Redis(process.env.REDIS_URL)

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  store: new RedisStore({
    sendCommand: (...args) => redis.call(...args),
  }),
})

app.use('/api/', limiter)

Pros: Works across multiple servers, persists across restarts
Cons: Requires Redis infrastructure

Best for: Production APIs with multiple instances

Approach 3: Upstash Rate Limit (Serverless)

For serverless (Vercel, Cloudflare Workers, AWS Lambda):

npm install @upstash/ratelimit @upstash/redis
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '10 s'), // 10 requests per 10 seconds
  analytics: true,
})

// In your API handler
export async function POST(request) {
  const ip = request.headers.get('x-forwarded-for') ?? '127.0.0.1'
  const { success, limit, remaining, reset } = await ratelimit.limit(ip)
  
  if (!success) {
    return new Response('Rate limited', {
      status: 429,
      headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': remaining.toString(),
        'X-RateLimit-Reset': reset.toString(),
      },
    })
  }
  
  // Handle request normally
}

Pros: Serverless-native, no infrastructure management, built-in analytics
Cons: External dependency, costs money at scale

Best for: Serverless APIs on Vercel, Cloudflare, AWS Lambda

Approach 4: Custom Token Bucket

For full control, implement the token bucket algorithm:

class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity      // Max tokens
    this.tokens = capacity        // Current tokens
    this.refillRate = refillRate  // Tokens added per second
    this.lastRefill = Date.now()
  }

  consume(tokens = 1) {
    this.refill()
    
    if (this.tokens >= tokens) {
      this.tokens -= tokens
      return true // Allowed
    }
    
    return false // Rate limited
  }

  refill() {
    const now = Date.now()
    const elapsed = (now - this.lastRefill) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate)
    this.lastRefill = now
  }
}

// Usage
const buckets = new Map()

function rateLimitMiddleware(req, res, next) {
  const key = req.ip
  
  if (!buckets.has(key)) {
    buckets.set(key, new TokenBucket(10, 1)) // 10 tokens, 1/second refill
  }
  
  const bucket = buckets.get(key)
  
  if (bucket.consume()) {
    next()
  } else {
    res.status(429).json({ error: 'Rate limited' })
  }
}

Pros: Full control, no dependencies
Cons: In-memory only; you must clean up stale buckets yourself
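The cleanup problem can be handled with a periodic sweep. This is a sketch, not part of any library: `sweepBuckets` and `MAX_IDLE_MS` are illustrative names, and it assumes each bucket records a `lastRefill` timestamp in milliseconds, as the TokenBucket class above does.

```javascript
// Illustrative sketch: evict buckets that haven't been used recently.
// Assumes each bucket records lastRefill (ms timestamp) like TokenBucket.
const MAX_IDLE_MS = 10 * 60 * 1000 // evict after 10 minutes of inactivity

function sweepBuckets(buckets, now = Date.now()) {
  for (const [key, bucket] of buckets) {
    if (now - bucket.lastRefill > MAX_IDLE_MS) {
      buckets.delete(key) // safe: Map iteration tolerates deletion
    }
  }
}

// Run the sweep once a minute; unref() lets the process exit normally.
// setInterval(() => sweepBuckets(buckets), 60 * 1000).unref()
```

Without a sweep like this, the Map grows by one entry per unique IP forever, which is a slow memory leak on any public endpoint.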

Rate Limiting Strategies

Per-IP Rate Limiting

Most common. Limit by client IP address (this is also express-rate-limit's default key).

const limiter = rateLimit({
  keyGenerator: (req) => req.ip,
  max: 100,
  windowMs: 15 * 60 * 1000,
})

Problem: Users behind NAT/VPN share IPs. Proxies can spoof IPs.
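Behind a load balancer, the spoofing problem mostly comes down to how X-Forwarded-For is read. In Express the usual fix is `app.set('trust proxy', 1)` so `req.ip` resolves to the real client; the standalone helper below sketches the parsing idea (`clientIpFromForwarded` is a hypothetical name, not a library function).

```javascript
// Sketch only: derive a rate-limit key from X-Forwarded-For when you
// control the proxy chain in front of your app.
function clientIpFromForwarded(forwardedFor, fallback = '0.0.0.0') {
  if (!forwardedFor) return fallback
  // Header format is "client, proxy1, proxy2". The leftmost entry is the
  // original client, but it is attacker-controlled unless a proxy you
  // trust appended the chain.
  return forwardedFor.split(',')[0].trim()
}
```

The key point: only trust this header if your own proxy sets or sanitizes it; otherwise anyone can rotate identities by sending a fake header.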

Per-User Rate Limiting

Better for authenticated APIs. Limit by user ID or API key.

const limiter = rateLimit({
  keyGenerator: (req) => req.user?.id || req.ip,
  max: 1000,  // Authenticated users get more
  windowMs: 60 * 60 * 1000, // Per hour
})

Per-Endpoint Rate Limiting

Different limits for different endpoints.

const authLimiter = rateLimit({ windowMs: 15 * 60 * 1000, max: 5 })
const apiLimiter = rateLimit({ windowMs: 60 * 1000, max: 60 })

app.use('/api/auth/login', authLimiter)
app.use('/api/', apiLimiter)

Login endpoints should have stricter limits (prevents brute force).
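One refinement worth knowing for login routes: express-rate-limit's `skipSuccessfulRequests` option counts only failed attempts (responses with status 400 or above), so legitimate users who log in successfully don't burn through the small budget. A hedged sketch:

```javascript
import rateLimit from 'express-rate-limit'

// Sketch: brute-force limiter for the login route. With
// skipSuccessfulRequests, only failed attempts (status >= 400) count,
// so normal logins don't consume the 5-request budget.
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  skipSuccessfulRequests: true,
  message: { error: 'Too many failed login attempts, try again later.' },
})

app.use('/api/auth/login', loginLimiter)
```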

Tiered Rate Limiting

Different limits based on subscription tier.

const tierLimits = { free: 100, pro: 1000, enterprise: 10000 }

// Build one limiter per tier at startup. Creating a new limiter inside
// the request handler would reset its in-memory counters on every
// request, so the limit would never actually be enforced.
const tierLimiters = Object.fromEntries(
  Object.entries(tierLimits).map(([tier, max]) => [
    tier,
    rateLimit({
      max,
      windowMs: 60 * 60 * 1000,
      keyGenerator: (req) => req.user?.id || req.ip,
    }),
  ])
)

function tieredLimiter(req, res, next) {
  const tier = req.user?.tier || 'free'
  tierLimiters[tier](req, res, next)
}

Response Headers

Always include rate limit headers so clients can self-regulate:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1710265200
Retry-After: 60

express-rate-limit handles these automatically: standardHeaders: true sends the draft-standard RateLimit-* headers, while legacyHeaders: true sends the older X-RateLimit-* form shown above.
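If you build 429 responses by hand, the Retry-After value can be derived from the same reset timestamp. `retryAfterSeconds` below is an illustrative helper, assuming the reset is a Unix epoch in seconds as in the headers above:

```javascript
// Illustrative: convert a reset epoch (in seconds) into a Retry-After
// value, clamped at zero so a reset in the past doesn't go negative.
function retryAfterSeconds(resetEpochSeconds, nowMs = Date.now()) {
  return Math.max(0, Math.ceil(resetEpochSeconds - nowMs / 1000))
}
```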

Common Patterns

Separate Limits for Read vs Write

app.get('/api/*', rateLimit({ max: 1000, windowMs: 60000 }))
app.post('/api/*', rateLimit({ max: 100, windowMs: 60000 }))

Burst Allowance

Allow short bursts but enforce long-term limits:

const burstLimiter = rateLimit({ max: 20, windowMs: 1000 })    // 20/second burst
const sustainedLimiter = rateLimit({ max: 1000, windowMs: 60000 }) // 1000/minute sustained

app.use('/api/', burstLimiter, sustainedLimiter)

Webhook-Friendly Limits

Webhooks from services like Stripe send bursts. Whitelist known IPs or use higher limits:

app.use('/webhooks/', rateLimit({ max: 500, windowMs: 60000 }))

Testing Rate Limits

Use curl or ab to test:

# Quick test
for i in {1..20}; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/api/test; done

# Load test with Apache Bench
ab -n 200 -c 10 http://localhost:3000/api/test

FAQ

What status code should I return?

429 Too Many Requests. Always. Include a Retry-After header.

Should I rate limit all endpoints?

Yes, but with different limits. Auth endpoints: strict (5-10/15min). Read APIs: generous (1000/hour). Write APIs: moderate (100/hour).

What about API gateways?

If you use AWS API Gateway, Cloudflare, or nginx, rate limit there instead of in Node. It's more efficient and blocks traffic before it reaches your app.

How do I handle rate limited users gracefully?

Return a clear error message with retry timing. Good clients will back off automatically.
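The client side of that backoff can be sketched like this. `fetchWithRetry` and `maxRetries` are illustrative names; the sketch assumes a global fetch (Node 18+) and a numeric Retry-After header, ignoring the HTTP-date form the header also permits.

```javascript
// Sketch of a client that honors 429 + Retry-After before retrying.
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, options)
    if (res.status !== 429) return res
    // Retry-After is assumed to be seconds; fall back to 1s if absent.
    const seconds = Number(res.headers.get('Retry-After')) || 1
    await new Promise((resolve) => setTimeout(resolve, seconds * 1000))
  }
  throw new Error('Rate limited: retries exhausted')
}
```

A production client would also add jitter so that many throttled clients don't all retry at the same instant.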

Bottom Line

Start with express-rate-limit for simple apps. Move to Redis-based rate limiting when you scale to multiple servers. Use Upstash for serverless. Always rate limit authentication endpoints aggressively and return proper headers so clients can self-regulate.
