API Performance Optimization

API performance optimization is critical for ensuring your Express application remains responsive, scalable, and resilient under load. This guide covers essential techniques for improving API performance and handling large-scale operations efficiently.

Rate limiting restricts how many requests a client can make to your API within a specific time window. This prevents abuse, protects against DoS attacks, and ensures fair resource distribution among clients.

The most common approach uses the express-rate-limit middleware:

// src/middleware/rateLimiter.js
const rateLimit = require('express-rate-limit');

// Basic rate limiter - applies to all requests
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
  standardHeaders: true, // Return rate limit info in the `RateLimit-*` headers
  legacyHeaders: false, // Disable the `X-RateLimit-*` headers
  message: {
    status: 429,
    message: 'Too many requests, please try again later.'
  }
});

// More restrictive limiter for authentication endpoints
const authLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 5, // limit each IP to 5 login attempts per hour
  message: {
    status: 429,
    message: 'Too many login attempts, please try again after an hour'
  }
});

module.exports = {
  globalLimiter,
  authLimiter
};
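
Defining the limiters is only half the job; they still have to be mounted on the app. A minimal wiring sketch (the src/app.js location and the /api/auth prefix are assumptions for illustration):

// src/app.js
const express = require('express');
const { globalLimiter, authLimiter } = require('./middleware/rateLimiter');

const app = express();

// Apply the global limiter to every request
app.use(globalLimiter);

// Apply the stricter limiter only to authentication routes
app.use('/api/auth', authLimiter);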

Use case: Different endpoints need different rate limits based on their computational cost or business importance. Expensive operations (such as file uploads) warrant stricter limits, while public read-only endpoints can tolerate more traffic.

// Different limits for different routes
app.get('/api/public',
  rateLimit({
    windowMs: 15 * 60 * 1000,
    max: 200 // More generous for public routes
  }),
  publicController.getData
);

app.get('/api/premium',
  rateLimit({
    windowMs: 15 * 60 * 1000,
    max: 600 // Higher limit for premium users
  }),
  premiumController.getData
);

Pagination is essential when dealing with large datasets to avoid performance issues and excessive resource consumption.

// src/services/productService.js
const { PrismaClient } = require('@prisma/client');
const prisma = new PrismaClient();

exports.getProducts = async (page = 1, limit = 10) => {
  page = Number(page);
  limit = Number(limit);
  const skip = (page - 1) * limit;

  // Execute both queries in parallel
  const [products, totalCount] = await Promise.all([
    prisma.product.findMany({
      skip,
      take: limit,
      orderBy: { createdAt: 'desc' }
    }),
    prisma.product.count()
  ]);

  // Calculate pagination metadata
  const totalPages = Math.ceil(totalCount / limit);

  return {
    data: products,
    pagination: {
      page,
      limit,
      totalItems: totalCount,
      totalPages
    }
  };
};
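
A thin controller can expose this service while clamping user-supplied query parameters so a client cannot request an unbounded page size. A sketch (the controller path and the bounds of 1-100 are assumptions):

// src/controllers/productController.js
const productService = require('../services/productService');

exports.getProducts = async (req, res, next) => {
  try {
    // Clamp user input to sane bounds before it reaches the database
    const page = Math.max(1, parseInt(req.query.page, 10) || 1);
    const limit = Math.min(100, Math.max(1, parseInt(req.query.limit, 10) || 10));

    const result = await productService.getProducts(page, limit);
    res.status(200).json(result);
  } catch (error) {
    next(error);
  }
};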

While the example above demonstrates offset-based pagination, cursor-based pagination is often more efficient for large datasets. Instead of using skip and take, cursor-based pagination uses a reference point (usually an ID or unique timestamp) and retrieves records after that point. Both SQL and NoSQL databases support this pattern with slightly different implementations.
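
A minimal cursor-based sketch with Prisma, continuing the product model above (ordering by id and the nextCursor response shape are illustrative choices):

// Cursor-based pagination: no skip over earlier rows, so cost stays
// constant no matter how deep the client pages
exports.getProductsByCursor = async (cursor, limit = 10) => {
  const products = await prisma.product.findMany({
    take: Number(limit),
    ...(cursor && {
      skip: 1, // skip the cursor record itself
      cursor: { id: cursor }
    }),
    orderBy: { id: 'desc' }
  });

  // The last record's id becomes the cursor for the next page
  const nextCursor = products.length === Number(limit)
    ? products[products.length - 1].id
    : null;

  return { data: products, nextCursor };
};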

Caching dramatically improves API performance by serving previously computed results without repeating expensive operations.

Redis is an in-memory data store perfect for caching in distributed systems:

// middleware/cacheMiddleware.js
const { StatusCodes } = require('http-status-codes');
const redisClient = require('../config/redis');

const checkCache = (keyFn) => {
  return async (req, res, next) => {
    const key = keyFn(req);
    try {
      const cachedData = await redisClient.get(key);
      if (cachedData) {
        const parsedData = JSON.parse(cachedData);
        return res.status(StatusCodes.OK).json({
          message: parsedData.message,
          data: parsedData.data,
          statusCode: parsedData.statusCode,
        });
      }
    } catch (error) {
      // On a cache failure, fall through to the handler instead of failing the request
      console.error('Cache read error:', error);
    }
    next();
  };
};

module.exports = checkCache;
// config/redis.js
const { createClient } = require('redis');

const redisClient = createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379'
});

redisClient.on('error', (err) => console.error('Redis Client Error', err));

// Connect to Redis
(async () => {
  try {
    await redisClient.connect();
    console.log('Connected to Redis successfully');
  } catch (error) {
    console.error('Redis connection error:', error);
    // Application can continue without Redis
  }
})();

module.exports = redisClient;
// routes/productRoutes.js
const express = require('express');
const router = express.Router();
const productController = require('../controllers/productController');
const checkCache = require('../middleware/cacheMiddleware');

// Cache product list by category
router.get('/products/category/:categoryId',
  checkCache((req) => `products:category:${req.params.categoryId}`),
  productController.getProductsByCategory
);

// Cache individual product lookups
router.get('/products/:id',
  checkCache((req) => `products:${req.params.id}`),
  productController.getProductById
);

module.exports = router;
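
The middleware above only reads from the cache; something still has to populate it. A sketch of the handler writing its response into Redis with a TTL (the key format matches the routes above; productService.getProductById and the one-hour TTL are assumptions):

// controllers/productController.js (cache-write sketch)
const redisClient = require('../config/redis');
const productService = require('../services/productService'); // assumed service

exports.getProductById = async (req, res, next) => {
  try {
    const product = await productService.getProductById(req.params.id);
    const payload = { message: 'OK', data: product, statusCode: 200 };

    // Store the response for an hour; later requests are served by checkCache
    await redisClient.set(`products:${req.params.id}`, JSON.stringify(payload), {
      EX: 3600
    });

    res.status(200).json(payload);
  } catch (error) {
    next(error);
  }
};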
A few guidelines for deciding what to cache:

  1. Cache read-heavy resources that don’t change frequently
  2. Don’t cache user-specific data that changes frequently
  3. Use short TTLs for data that changes moderately often
  4. Monitor cache hit ratios to ensure your caching strategy is effective

Not all operations need to be performed synchronously within the request-response cycle. Moving time-consuming tasks to background processes improves API responsiveness.

Why we need this: Imagine your user registration takes 3 seconds because you’re sending a welcome email. That’s 3 seconds of waiting for something that could happen in the background. Bull queues let you respond instantly while handling the email separately.

// src/queues/index.js
const Queue = require('bull');

// Create queues with shared configuration
const createQueue = (name) => {
  return new Queue(name, {
    redis: {
      host: process.env.REDIS_HOST || 'localhost',
      port: process.env.REDIS_PORT || 6379,
      password: process.env.REDIS_PASSWORD || undefined
    },
    defaultJobOptions: {
      attempts: 3, // Sometimes emails fail - network issues, server down, etc.
      backoff: {
        type: 'exponential',
        delay: 2000 // Wait 2s, then 4s, then 8s between retries
      },
      removeOnComplete: true // Redis can get cluttered with old jobs
    }
  });
};

// Separate queues for different job types - this way image processing
// won't block email sending if it gets backed up
const emailQueue = createQueue('email-processing');
const pdfQueue = createQueue('pdf-generation');
const imageQueue = createQueue('image-processing');

// Monitor what's happening - crucial for debugging production issues
const setupQueueEvents = (queue) => {
  queue.on('completed', job => {
    console.log(`✅ ${queue.name} job ${job.id} completed`);
  });
  queue.on('failed', (job, err) => {
    console.error(`❌ ${queue.name} job ${job.id} failed:`, err.message);
    // In production, you'd send this to your error tracking service
  });
  queue.on('error', (error) => {
    console.error(`🚨 ${queue.name} queue error:`, error);
  });
};

[emailQueue, pdfQueue, imageQueue].forEach(setupQueueEvents);

module.exports = {
  emailQueue,
  pdfQueue,
  imageQueue
};
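
With the queues defined, the registration flow can enqueue the email and respond immediately, while a separate processor does the slow work. A sketch under assumed names (sendWelcomeEmail and the job payload shape are illustrative):

// src/queues/emailProcessor.js
const { emailQueue } = require('./index');
const { sendWelcomeEmail } = require('../services/emailService'); // assumed helper

// Worker: runs outside the request-response cycle, with the retry
// and backoff behavior configured above
emailQueue.process(async (job) => {
  const { to, name } = job.data;
  await sendWelcomeEmail(to, name);
});

// In the registration controller: enqueue and respond right away
// await emailQueue.add({ to: user.email, name: user.name });
// res.status(201).json({ message: 'Account created' });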
Beyond the techniques above, a few quick wins:

  1. Use compression for response payloads

    const compression = require('compression');
    app.use(compression());
  2. Implement connection pooling for databases (see the sketch after this list)

  3. Use streams for large files instead of buffering them in memory

    const fs = require('fs');
    app.get('/api/reports/large-csv', (req, res) => {
      res.setHeader('Content-Type', 'text/csv');
      const fileStream = fs.createReadStream('path/to/large-report.csv');
      fileStream.pipe(res);
    });
  4. Implement proper error handling to prevent resource leaks
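
For item 2, a minimal pooling sketch using the pg driver; the pool size and environment variable are assumptions to adapt (Prisma, used earlier, manages its own pool internally):

// src/db/pool.js - share one pool instead of opening a connection per request
const { Pool } = require('pg');

const pool = new Pool({
  connectionString: process.env.DATABASE_URL, // assumed env var
  max: 10, // cap on concurrent connections; tune to your database's limits
  idleTimeoutMillis: 30000 // close connections idle for 30s
});

module.exports = pool;

// Usage: pool.query() borrows a connection and returns it automatically
// const { rows } = await pool.query('SELECT * FROM products WHERE id = $1', [id]);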

N+1 Query Problem

  • Problem: Executing a database query for each item in a collection
  • Solution: Use eager loading or batch queries
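
Sticking with the Prisma models from earlier, a sketch of the difference (the category relation is assumed):

// N+1: one extra query per product
// for (const product of products) {
//   product.category = await prisma.category.findUnique({
//     where: { id: product.categoryId }
//   });
// }

// Eager loading: related categories fetched alongside the products
const products = await prisma.product.findMany({
  include: { category: true }
});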

Memory Leaks

  • Problem: Objects that aren’t garbage collected
  • Solution: Use proper cleanup in event listeners and timers
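
In an Express app this often shows up as per-request timers or listeners that outlive the request; a sketch of the fix:

app.get('/api/stream', (req, res) => {
  // Leak risk: this interval keeps running (and keeps `res` alive)
  // even after the client disconnects
  const timer = setInterval(() => res.write('ping\n'), 1000);

  // Cleanup: stop the timer when the connection closes
  req.on('close', () => clearInterval(timer));
});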

Blocking the Event Loop

  • Problem: Long-running synchronous operations
  • Solution: Use asynchronous APIs and background processing
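
A classic example is synchronous hashing: the sync call stalls every concurrent request for its full duration, while the async variant runs on libuv's thread pool (password and salt stand in for your inputs):

const crypto = require('crypto');

// Blocks the event loop for the entire hash computation:
// const hash = crypto.pbkdf2Sync(password, salt, 100000, 64, 'sha512');

// Non-blocking: the work happens off the event loop
crypto.pbkdf2(password, salt, 100000, 64, 'sha512', (err, derivedKey) => {
  if (err) throw err;
  // continue with derivedKey...
});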

Excessive Logging

  • Problem: Logging everything, including sensitive info
  • Solution: Use appropriate log levels and sampling
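
For example, with a logger like winston you can gate verbosity by level so debug detail never reaches production logs (the LOG_LEVEL convention is an assumption; req.body and order are placeholders):

const winston = require('winston');

const logger = winston.createLogger({
  // 'info' in production keeps debug noise (and its cost) out;
  // set LOG_LEVEL=debug locally when you need the detail
  level: process.env.LOG_LEVEL || 'info',
  transports: [new winston.transports.Console()]
});

// Dropped entirely when the level is 'info' or above
logger.debug(`Request body: ${JSON.stringify(req.body)}`);

// Log identifiers, not whole objects that may contain sensitive fields
logger.info(`Order ${order.id} created`);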