Load Balancing

How load balancers distribute traffic across servers to ensure high availability and optimal performance.

3 min read · 2026-03-22 · easy · load-balancing · networking · availability

What is Load Balancing?

A load balancer distributes incoming network traffic across multiple servers to ensure no single server bears too much load. It acts as a traffic cop sitting in front of your servers.

                    β”Œβ”€β”€β”€ Server 1
Client β†’ LB β”€β”€β”€β”€β”€β”€β”œβ”€β”€β”€ Server 2
                    └─── Server 3

Why Load Balancing?

Without a load balancer, a single server handles all traffic β€” creating a single point of failure and a performance bottleneck.

Load Balancing Algorithms

1. Round Robin

Requests are distributed sequentially across servers.

Pros                  | Cons
Simple to implement   | Doesn't account for server capacity
Even distribution     | Slow servers get the same load as fast ones
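
As a sketch (server names are invented placeholders), round robin just hands out servers in a fixed, repeating order:

```python
class RoundRobin:
    """Distribute requests sequentially across the pool."""

    def __init__(self, servers):
        self.servers = servers
        self.index = 0

    def next_server(self):
        server = self.servers[self.index]
        # Wrap around to the first server after the last one
        self.index = (self.index + 1) % len(self.servers)
        return server


lb = RoundRobin(["server-1", "server-2", "server-3"])
print([lb.next_server() for _ in range(4)])
# -> ['server-1', 'server-2', 'server-3', 'server-1']
```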

2. Weighted Round Robin

Each server gets a weight proportional to its capacity. A server with weight 3 gets three times the requests of a server with weight 1.
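
A naive sketch (weights and names are illustrative) simply repeats each server in the rotation according to its weight:

```python
class WeightedRoundRobin:
    """Round robin where each server appears in the rotation
    as many times as its weight."""

    def __init__(self, weights):
        # weights: dict of server -> weight, e.g. {"big": 3, "small": 1}
        self.rotation = [s for s, w in weights.items() for _ in range(w)]
        self.index = 0

    def next_server(self):
        server = self.rotation[self.index]
        self.index = (self.index + 1) % len(self.rotation)
        return server


lb = WeightedRoundRobin({"big": 3, "small": 1})
print([lb.next_server() for _ in range(4)])
# -> ['big', 'big', 'big', 'small']
```

Production balancers usually interleave the turns more evenly (NGINX implements a "smooth" weighted round robin) rather than batching one server's turns together as this sketch does.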

3. Least Connections

Routes to the server with the fewest active connections. Best for long-lived connections like WebSockets.
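
A minimal sketch of the bookkeeping (method names are invented): track open connections per server, route to the minimum, and decrement when a connection closes:

```python
class LeastConnections:
    """Route each new connection to the server with the fewest open ones."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        # Pick the server with the fewest active connections
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when the client disconnects
        self.active[server] -= 1


lb = LeastConnections(["s1", "s2"])
first = lb.acquire()    # 's1'
second = lb.acquire()   # 's2'
lb.release(second)
third = lb.acquire()    # 's2' again: it now has the fewest connections
```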

4. IP Hash

Uses a hash of the client's IP to determine which server gets the request. Ensures the same client always reaches the same server β€” useful for session affinity.
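
A sketch of the idea (function name is invented): hash the IP and take it modulo the pool size. Note that `hashlib` is used rather than Python's built-in `hash()`, which is salted per process and would break affinity across balancer restarts:

```python
import hashlib


def server_for(client_ip, servers):
    """Deterministically map a client IP to one server in the pool."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


pool = ["server-1", "server-2", "server-3"]
assert server_for("203.0.113.7", pool) == server_for("203.0.113.7", pool)
```

One caveat: with plain modulo hashing, adding or removing a server remaps most clients to different servers; consistent hashing is the usual mitigation when that matters.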

Choosing an Algorithm

  • Stateless services β†’ Round Robin or Least Connections
  • Stateful sessions β†’ IP Hash (but prefer stateless + external session store)
  • Mixed server capacity β†’ Weighted Round Robin

Types of Load Balancers

Layer 4 (Transport Layer)

Operates at TCP/UDP level. Routes based on IP address and port. Very fast because it doesn't inspect packet contents.

Layer 7 (Application Layer)

Operates at HTTP level. Can route based on URL path, headers, cookies, or request content. More flexible but slightly slower.
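
To illustrate the kind of routing only an L7 balancer can do, here is a minimal NGINX-style config sketch (upstream names and addresses are invented) that routes by URL path:

```nginx
# Two backend pools, chosen by request path (addresses are illustrative)
upstream api_pool    { server 10.0.0.1; server 10.0.0.2; }
upstream static_pool { server 10.0.0.3; }

server {
    listen 80;

    location /api/    { proxy_pass http://api_pool; }
    location /static/ { proxy_pass http://static_pool; }
}
```

An L4 balancer sees only IPs and ports, so a split like this is impossible there: it cannot know the request path without parsing HTTP.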

Health Checks

Load balancers continuously check if servers are healthy:

  • Active: Periodically sends requests to a health endpoint
  • Passive: Monitors response codes from real traffic
// Typical health check endpoint response
{
  "status": "healthy",
  "uptime": "72h",
  "db_connected": true,
  "cache_connected": true
}
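
An active check can be as simple as the following sketch (URL and timeout are illustrative): request the health endpoint and treat anything but a 200, or any connection error, as unhealthy.

```python
import urllib.request


def is_healthy(url, timeout=2.0):
    """Active health check: GET the endpoint; non-200 responses and
    connection errors/timeouts both count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, timeouts, refused connections
        return False


# A balancer would run this on a schedule for each backend, e.g.:
# is_healthy("http://10.0.0.1/health")
```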

Common Pitfall

Don't make health checks too aggressive. If your health check interval is 1 second and timeout is 500ms, a brief network hiccup can cause unnecessary server removals.
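
One common safeguard is to eject a server only after several consecutive failures, sketched below (the threshold of 3 is illustrative):

```python
class FailureTracker:
    """Keep a backend in rotation until it fails N checks in a row,
    so a single network hiccup doesn't remove a healthy server."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, check_passed):
        if check_passed:
            self.consecutive_failures = 0  # any success resets the count
        else:
            self.consecutive_failures += 1
        # True while the server should stay in rotation
        return self.consecutive_failures < self.threshold


t = FailureTracker(threshold=3)
print(t.record(False), t.record(False), t.record(False))
# -> True True False
```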

Popular Load Balancers

  • NGINX β€” Great for L7, widely used as reverse proxy
  • HAProxy β€” High performance L4/L7
  • AWS ALB/NLB β€” Managed cloud solutions
  • Envoy β€” Modern, microservices-focused

In System Design Interviews

When discussing load balancing in interviews:

  1. Place load balancers between every critical tier (client β†’ web server, web server β†’ app, app β†’ db)
  2. Mention redundant load balancers (active-passive) to avoid SPOF
  3. Discuss which algorithm and why
  4. Consider geographic load balancing (DNS-based) for global systems
