Load Balancing
How load balancers distribute traffic across servers to ensure high availability and optimal performance.
What is Load Balancing?
A load balancer distributes incoming network traffic across multiple servers to ensure no single server bears too much load. It acts as a traffic cop sitting in front of your servers.
```
                ┌──→ Server 1
Client ──→ LB ──┼──→ Server 2
                └──→ Server 3
```
Why Load Balancing?
Without a load balancer, a single server handles all traffic, creating a single point of failure and a performance bottleneck.
Load Balancing Algorithms
1. Round Robin
Requests are distributed sequentially across servers.
| Pros | Cons |
|---|---|
| Simple to implement | Doesn't account for server capacity |
| Even distribution | Slow servers get same load as fast ones |
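Round robin can be sketched in a few lines of Python (the server names here are placeholders, not part of any real deployment):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distribute requests sequentially: each server gets one in turn."""
    def __init__(self, servers):
        self._cycle = cycle(servers)  # infinite iterator over the server list

    def next_server(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["server1", "server2", "server3"])
picks = [lb.next_server() for _ in range(4)]
# picks == ["server1", "server2", "server3", "server1"]
```

Note that this naive version treats every server identically, which is exactly the weakness the table above describes.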
2. Weighted Round Robin
Each server gets a weight proportional to its capacity. A server with weight 3 gets three times the requests of a server with weight 1.
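One simple way to implement the weighting is to repeat each server in the rotation `weight` times. A minimal sketch (weights and names are illustrative):

```python
class WeightedRoundRobin:
    """Expand each server into the rotation `weight` times."""
    def __init__(self, weights):
        # weights: dict mapping server name -> integer weight
        self._order = [s for s, w in weights.items() for _ in range(w)]
        self._i = 0

    def next_server(self):
        server = self._order[self._i % len(self._order)]
        self._i += 1
        return server

lb = WeightedRoundRobin({"big-server": 3, "small-server": 1})
picks = [lb.next_server() for _ in range(4)]
# picks == ["big-server", "big-server", "big-server", "small-server"]
```

Production balancers typically use a smoothed variant that interleaves servers rather than sending bursts to the heavy one, but the expansion approach shows the core idea.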
3. Least Connections
Routes to the server with the fewest active connections. Best for long-lived connections like WebSockets.
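Least connections only needs a counter per server. A sketch, assuming the balancer is told when connections open and close:

```python
class LeastConnections:
    """Route each new connection to the server with the fewest active ones."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        # min() breaks ties by dict insertion order
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1

lb = LeastConnections(["s1", "s2"])
a = lb.acquire()   # "s1" (tie, first in order)
b = lb.acquire()   # "s2"
lb.release(a)      # s1's long-lived connection ends
c = lb.acquire()   # "s1" again: it now has fewer active connections
```

This is why the algorithm suits long-lived connections: the counters reflect real load that round robin would ignore.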
4. IP Hash
Uses a hash of the client's IP to determine which server gets the request. Ensures the same client always reaches the same server, which is useful for session affinity.
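The mapping is just "hash, then modulo". A sketch using MD5 for a stable hash (the IP and server names are made up):

```python
import hashlib

def ip_hash_pick(client_ip, servers):
    """Deterministically map a client IP to one server in the list."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["s1", "s2", "s3"]
choice = ip_hash_pick("203.0.113.7", servers)
# The same IP always maps to the same server:
assert choice == ip_hash_pick("203.0.113.7", servers)
```

One caveat worth mentioning: with plain modulo hashing, adding or removing a server remaps most clients, which is why systems that care about affinity under churn often use consistent hashing instead.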
Choosing an Algorithm
- Stateless services → Round Robin or Least Connections
- Stateful sessions → IP Hash (but prefer stateless + external session store)
- Mixed server capacity → Weighted Round Robin
Types of Load Balancers
Layer 4 (Transport Layer)
Operates at TCP/UDP level. Routes based on IP address and port. Very fast because it doesn't inspect packet contents.
Layer 7 (Application Layer)
Operates at HTTP level. Can route based on URL path, headers, cookies, or request content. More flexible but slightly slower.
Health Checks
Load balancers continuously check if servers are healthy:
- Active: Periodically sends requests to a health endpoint
- Passive: Monitors response codes from real traffic
A typical health check endpoint response:

```json
{
  "status": "healthy",
  "uptime": "72h",
  "db_connected": true,
  "cache_connected": true
}
```
Common Pitfall
Don't make health checks too aggressive. If your health check interval is 1 second and timeout is 500ms, a brief network hiccup can cause unnecessary server removals.
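One common mitigation is a failure threshold: only remove a server after several consecutive failed checks. A sketch (the threshold value and URL scheme are assumptions, not a standard):

```python
import urllib.request

def is_healthy(url, timeout=2.0):
    """Active check: one GET against the health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

class HealthTracker:
    """Require N consecutive failures before removing a server,
    so a single network hiccup doesn't eject a healthy backend."""
    def __init__(self, fail_threshold=3):
        self.fail_threshold = fail_threshold
        self.failures = {}

    def record(self, server, healthy):
        # A success resets the streak; a failure extends it.
        self.failures[server] = 0 if healthy else self.failures.get(server, 0) + 1
        return self.failures[server] < self.fail_threshold  # still in rotation?

tracker = HealthTracker(fail_threshold=3)
tracker.record("s1", False)  # True: 1 failure, still in rotation
tracker.record("s1", False)  # True: 2 failures
tracker.record("s1", False)  # False: threshold hit, remove from rotation
tracker.record("s1", True)   # True: recovered, streak reset
```

Pairing a threshold like this with a sensible interval-to-timeout ratio avoids the flapping described above.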
Popular Load Balancers
- NGINX – great for L7, widely used as a reverse proxy
- HAProxy – high-performance L4/L7
- AWS ALB/NLB – managed cloud solutions (ALB is L7, NLB is L4)
- Envoy – modern, microservices-focused
In System Design Interviews
When discussing load balancing in interviews:
- Place load balancers between every critical tier (client → web server, web server → app server, app server → database)
- Mention redundant load balancers (active-passive) to avoid a SPOF
- Discuss which algorithm and why
- Consider geographic load balancing (DNS-based) for global systems