Summary
A load balancer distributes incoming requests across a pool of servers to prevent any single node from becoming a bottleneck. L4 balancers route by IP address and TCP/UDP port without inspecting content; L7 balancers read HTTP headers, paths, and cookies to make smarter routing decisions. Algorithm choice and health checks determine how evenly load spreads.
A load balancer sits in front of a server pool and forwards each incoming connection or request to one of the available backends. It tracks server health, removes unhealthy nodes, and re-adds them when they recover. Without it, a single server is both a performance ceiling and a single point of failure.
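The remove-and-re-add behaviour described above can be sketched as a small state machine. This is a minimal illustration, not any particular balancer's implementation: the `Backend` and `Pool` classes and the `fail_threshold` and `rise_threshold` values are assumptions chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool = True
    consecutive_failures: int = 0
    consecutive_successes: int = 0


@dataclass
class Pool:
    backends: list[Backend]
    fail_threshold: int = 3  # assumed: failed probes in a row before eviction
    rise_threshold: int = 2  # assumed: passed probes in a row before re-admission

    def record_probe(self, name: str, ok: bool) -> None:
        """Update one backend's health state from a single probe result."""
        b = next(b for b in self.backends if b.name == name)
        if ok:
            b.consecutive_failures = 0
            b.consecutive_successes += 1
            if not b.healthy and b.consecutive_successes >= self.rise_threshold:
                b.healthy = True  # recovered: put it back in rotation
        else:
            b.consecutive_successes = 0
            b.consecutive_failures += 1
            if b.healthy and b.consecutive_failures >= self.fail_threshold:
                b.healthy = False  # unhealthy: stop sending it traffic

    def available(self) -> list[str]:
        """Names of the backends currently eligible for traffic."""
        return [b.name for b in self.backends if b.healthy]


pool = Pool([Backend("a"), Backend("b")])
for _ in range(3):
    pool.record_probe("b", ok=False)
print(pool.available())  # ['a']
for _ in range(2):
    pool.record_probe("b", ok=True)
print(pool.available())  # ['a', 'b']
```

Requiring several consecutive results in each direction is one common way to keep a single dropped probe from flapping a node in and out of the pool.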
L4 vs L7 load balancing
| Aspect | L4 (transport layer) | L7 (application layer) |
|---|---|---|
| Routing basis | IP address + TCP/UDP port only | HTTP headers, URL path, cookies, host |
| Content inspection | None; payload is an opaque byte stream | Full HTTP body and headers visible |
| TLS termination | Usually pass-through (TLS reaches backend) | Usually terminates TLS; backends often see plain HTTP |
| Sticky sessions | Source-IP hash only | Cookie-based or header-based affinity |
| Performance cost | Very low; no parsing | Higher; parses each request |
| Typical tools | AWS NLB, HAProxy TCP mode | AWS ALB, nginx, Caddy, Envoy |
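To make the "routing basis" row concrete, the sketch below contrasts what each layer can key on. The pool names, the `static.example.com` host, and the `/api/` prefix are hypothetical, and the CRC32 hash is only a stand-in for whatever hash a real balancer uses.

```python
import zlib

# Hypothetical backend pools, for illustration only.
API_POOL = ["api-1", "api-2"]
STATIC_POOL = ["static-1"]
DEFAULT_POOL = ["web-1", "web-2"]


def route_l4(src_ip: str, src_port: int, backends: list[str]) -> str:
    """L4: only connection metadata is visible, so hash it to pick a backend."""
    key = f"{src_ip}:{src_port}".encode()
    return backends[zlib.crc32(key) % len(backends)]


def route_l7(host: str, path: str) -> list[str]:
    """L7: the parsed HTTP request is visible, so choose a pool by host and path."""
    if host == "static.example.com":
        return STATIC_POOL
    if path.startswith("/api/"):
        return API_POOL
    return DEFAULT_POOL


print(route_l7("www.example.com", "/api/users"))   # ['api-1', 'api-2']
print(route_l7("static.example.com", "/logo.png"))  # ['static-1']
print(route_l4("203.0.113.7", 51432, DEFAULT_POOL) in DEFAULT_POOL)  # True
```

The L4 function never sees a host or path, which is why path-based routing and cookie affinity are only possible at L7.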
Routing algorithms
- **Round-robin** — each request goes to the next server in a fixed cycle; simple, works when servers are equally capable.
- **Weighted round-robin** — servers get a share of traffic proportional to their weight; use when instance sizes differ.
- **Least-connections** — routes to the server with fewest open connections; beats round-robin for long-lived requests like uploads.
- **IP hash** — hashes client IP to pin the same backend; breaks when the server pool changes.
- **Power of two choices (P2C)** — picks two servers at random, routes to the less loaded; near-optimal with low overhead.
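The five algorithms above each fit in a few lines. This is a simplified sketch under assumed inputs: a list of server names, a `weights` dict, and an `open_conns` dict of current connection counts. Production balancers use more refined variants, for example weighted round-robin schemes that interleave servers instead of sending bursts to one.

```python
import itertools
import random
import zlib


def round_robin(servers):
    """Return an iterator that yields servers in a fixed repeating cycle."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive weighted cycle: each server appears `weight` times per round."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(open_conns):
    """Pick the server with the fewest open connections."""
    return min(open_conns, key=open_conns.get)


def ip_hash(client_ip, servers):
    """Pin a client IP to one server; the mapping shifts if len(servers) changes."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


def power_of_two_choices(open_conns, rng=random):
    """Sample two distinct servers at random and keep the less loaded one."""
    a, b = rng.sample(list(open_conns), 2)
    return a if open_conns[a] <= open_conns[b] else b


rr = round_robin(["a", "b", "c"])
print([next(rr) for _ in range(4)])  # ['a', 'b', 'c', 'a']

wrr = weighted_round_robin({"big": 2, "small": 1})
print([next(wrr) for _ in range(6)])  # ['big', 'big', 'small', 'big', 'big', 'small']

print(least_connections({"a": 5, "b": 2, "c": 7}))  # b
```

With `open_conns = {"a": 5, "b": 2, "c": 7}`, `power_of_two_choices` can return `a` or `b` but never `c`, because the most loaded server loses every pairwise comparison. That is the intuition behind P2C: it avoids the worst choice without scanning the whole pool.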
WebSockets and stateful apps need sticky sessions
A WebSocket connection is long-lived and bound to a single server node. If a reconnect lands on a different node, that node has no record of the client, so the session starts from scratch or fails. Use cookie-based affinity on an L7 balancer. The same applies to Server-Sent Events (SSE) streams and in-memory session stores.
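Cookie-based affinity can be sketched as follows. The `StickyBalancer` class and the `lb_backend` cookie name are hypothetical, and the cookie stores the raw backend name only to keep the example short; real balancers typically use an opaque or encoded value.

```python
import itertools

AFFINITY_COOKIE = "lb_backend"  # hypothetical cookie name


class StickyBalancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self._cycle = itertools.cycle(self.backends)

    def route(self, cookies):
        """Return (backend, cookies_to_set) for one incoming request."""
        pinned = cookies.get(AFFINITY_COOKIE)
        if pinned in self.backends:
            # Returning client: honour the existing pin.
            return pinned, {}
        # New client, or the pinned node left the pool: pick one and pin it.
        backend = next(self._cycle)
        return backend, {AFFINITY_COOKIE: backend}


lb = StickyBalancer(["ws-1", "ws-2"])
backend, to_set = lb.route({})
print(backend, to_set)  # ws-1 {'lb_backend': 'ws-1'}

# A reconnect that carries the cookie lands on the same node.
print(lb.route({"lb_backend": "ws-1"}))  # ('ws-1', {})
```

If the pinned node has been removed from the pool, the client is re-pinned to another backend, and any state held only in the old node's memory is lost.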
Interview angle
Interviewers check whether you can pick L4 vs L7 by the routing signal you need, and whether you know why WebSockets break without sticky sessions. Name a concrete algorithm trade-off.
Soundbite: "L7 for path-based routing and cookie affinity; L4 when you need raw TCP throughput with no parsing cost."
Key terms
- L4 load balancer
  - Routes by IP/port without inspecting HTTP content; low overhead, typically no TLS termination.
- L7 load balancer
  - Inspects HTTP headers, paths, and cookies to route requests; typically terminates TLS.
- Sticky session
  - Affinity rule that pins a client to the same backend node across requests.
- Health check
  - Periodic probe (HTTP GET or TCP connect) used to detect unhealthy backends so they can be removed from the pool.
- Least-connections
  - Algorithm routing each new request to the backend with the fewest open connections.