Load Balancing (for Frontends)

Spread traffic across servers; keep WebSocket apps sticky.

By Abas Turabli · Reviewed

Summary

A load balancer distributes incoming requests across a pool of servers to prevent any single node from becoming a bottleneck. L4 balancers route by TCP/UDP without inspecting content; L7 balancers read HTTP headers, paths, and cookies to make smarter routing decisions. Algorithm choice and health checks determine how evenly load spreads.

A load balancer sits in front of a server pool and forwards each incoming connection or request to one of the available backends. It tracks server health, removes unhealthy nodes, and re-adds them when they recover. Without it, a single server is both a performance ceiling and a single point of failure.
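
The eject-and-recover behavior described above can be sketched as a small pool that counts consecutive probe results. This is a minimal illustration, not any particular balancer's implementation: the `Backend` and `Pool` names and the `fall` and `rise` thresholds are assumptions made for the example, and the probe is passed in as a plain function so the sketch stays free of network calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Backend:
    name: str
    healthy: bool = True
    failures: int = 0   # consecutive failed probes
    successes: int = 0  # consecutive passing probes


@dataclass
class Pool:
    backends: list[Backend]
    fall: int = 2  # assumed threshold: failed probes before a node is ejected
    rise: int = 2  # assumed threshold: passing probes before it is re-added

    def run_checks(self, probe: Callable[[Backend], bool]) -> None:
        """Run one round of probes; a real balancer would do this on a timer."""
        for b in self.backends:
            if probe(b):
                b.failures = 0
                b.successes += 1
                if not b.healthy and b.successes >= self.rise:
                    b.healthy = True   # node recovered: back into rotation
            else:
                b.successes = 0
                b.failures += 1
                if b.healthy and b.failures >= self.fall:
                    b.healthy = False  # node failed: out of rotation

    def available(self) -> list[str]:
        """Names of the backends currently eligible for traffic."""
        return [b.name for b in self.backends if b.healthy]
```

Requiring several consecutive results in each direction is one common way to avoid flapping on a single slow response; the exact thresholds here are arbitrary.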

L4 vs L7 load balancing

| Aspect | L4 (transport layer) | L7 (application layer) |
| --- | --- | --- |
| Routing basis | IP address + TCP/UDP port only | HTTP headers, URL path, cookies, host |
| Content inspection | None; payload is an opaque byte stream | Full HTTP body and headers visible |
| TLS termination | Pass-through (TLS reaches backend) | Terminates TLS; backends see plain HTTP |
| Sticky sessions | IP-hash only | Cookie-based or header-based affinity |
| Performance cost | Very low; no parsing | Higher; parses each request |
| Typical tools | AWS NLB, HAProxy TCP mode | AWS ALB, nginx, Caddy, Envoy |
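
To make the difference in routing signal concrete, the sketch below contrasts the two decisions. It is illustrative only: the pool names, the `/api/` prefix rule, and the `backend` cookie are assumptions for the example, not the behavior of any listed tool.

```python
import hashlib

API_POOL = ["api-1", "api-2"]   # hypothetical backend names
STATIC_POOL = ["static-1"]
ALL_BACKENDS = API_POOL + STATIC_POOL


def _stable_index(key: str, n: int) -> int:
    """Map a string to a stable index in range(n)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def route_l4(src_ip: str, src_port: int, dst_port: int) -> str:
    """L4: only addresses and ports are visible; the payload stays opaque."""
    key = f"{src_ip}:{src_port}->{dst_port}"
    return ALL_BACKENDS[_stable_index(key, len(ALL_BACKENDS))]


def route_l7(path: str, cookies: dict[str, str]) -> str:
    """L7: the request is parsed, so the path and cookies can drive routing."""
    pinned = cookies.get("backend")       # assumed affinity cookie
    if pinned in ALL_BACKENDS:
        return pinned
    pool = API_POOL if path.startswith("/api/") else STATIC_POOL
    return pool[_stable_index(path, len(pool))]
```

Note that `route_l4` cannot send `/api/` traffic to a dedicated pool because it never sees a path; that is the practical meaning of "routing basis" in the table.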

Routing algorithms

  • **Round-robin**: each request goes to the next server in a fixed cycle; simple, and works when servers are equally capable.
  • **Weighted round-robin**: servers get a share of traffic proportional to their weight; use when instance sizes differ.
  • **Least-connections**: routes to the server with the fewest open connections; beats round-robin for long-lived requests like uploads.
  • **IP hash**: hashes the client IP to pin it to the same backend; with plain modulo hashing, most clients are remapped when the pool size changes.
  • **Power of two choices (P2C)**: picks two servers at random and routes to the less loaded one; near-optimal with low overhead.

Health checks run on a timer, and nodes that fail them leave the pool until they recover. A sticky cookie pins a client to the same server regardless of the routing algorithm.
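
The five algorithms listed above can each be sketched in a few lines. These are simplified illustrations under stated assumptions: connection counts are passed in as a plain dict, the weighted variant uses naive expansion rather than the smoother interleaving production balancers tend to use, and all function names are made up for the example.

```python
import hashlib
import itertools
import random


def round_robin(servers):
    """Return an iterator that yields servers in a fixed repeating cycle."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive weighted cycle: a server with weight 3 appears 3 times per cycle."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(open_conns):
    """Pick the server with the fewest open connections."""
    return min(open_conns, key=open_conns.get)


def ip_hash(client_ip, servers):
    """Pin a client IP to one server via a stable hash modulo the pool size."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def power_of_two_choices(open_conns, rng=random):
    """Sample two distinct servers and keep the less loaded one.

    Needs at least two servers in open_conns.
    """
    a, b = rng.sample(list(open_conns), 2)
    return a if open_conns[a] <= open_conns[b] else b


# Usage: the same pool viewed through different algorithms.
conns = {"s1": 5, "s2": 2, "s3": 9}
rr = round_robin(list(conns))
first_four = [next(rr) for _ in range(4)]   # s1, s2, s3, then s1 again
least_loaded = least_connections(conns)     # s2
```

P2C never returns the most loaded server when loads are distinct, because any pair it samples contains a less loaded alternative; that is where much of its benefit comes from despite looking at only two nodes.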

WebSockets and stateful apps need sticky sessions

A WebSocket connection is long-lived and bound to the one server node that accepted it. If a reconnect lands on a different node, that node has no record of the client, so the session starts from scratch or fails. Use cookie-based affinity on an L7 balancer. The same applies to Server-Sent Events (SSE) streams and in-memory session stores.
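
Cookie-based affinity can be sketched as follows. This is a simplified model, not a real balancer's API: the `StickyBalancer` class and the `lb_backend` cookie name are assumptions for the example, and the cookie stores the backend name in the clear, whereas real deployments usually use an opaque or signed value.

```python
import itertools


class StickyBalancer:
    COOKIE = "lb_backend"  # hypothetical cookie name

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = itertools.cycle(self.backends)  # fallback: round-robin

    def route(self, cookies):
        """Return (backend, set_cookie).

        set_cookie is a dict to send back to the client, or None when the
        client already carries a valid affinity cookie.
        """
        pinned = cookies.get(self.COOKIE)
        if pinned in self.backends:
            return pinned, None                 # honor existing affinity
        backend = next(self._next)              # first visit or stale cookie
        return backend, {self.COOKIE: backend}


# Usage: a client connects, drops, and reconnects with its cookie.
lb = StickyBalancer(["node-a", "node-b"])
first_backend, cookie = lb.route({})
reconnect_backend, _ = lb.route(cookie)
```

Because the reconnect carries the cookie, it lands on the node that already holds the client's in-memory state. A cookie naming a node that has left the pool falls through to a fresh assignment.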

Interview angle

Interviewers check whether you can pick L4 vs L7 by the routing signal you need, and whether you know why WebSockets break without sticky sessions. Name a concrete algorithm trade-off.
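
One trade-off that is easy to demonstrate is IP hash's sensitivity to pool changes. The sketch below assumes plain modulo hashing over synthetic client IPs (both are assumptions for illustration) and measures how many clients land on a different bucket when the pool grows from four servers to five.

```python
import hashlib


def bucket(client_ip: str, pool_size: int) -> int:
    """Stable hash of the client IP, reduced modulo the pool size."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return int.from_bytes(digest[:8], "big") % pool_size


def remapped_fraction(ips, old_size: int, new_size: int) -> float:
    """Fraction of clients whose bucket changes when the pool is resized."""
    moved = sum(bucket(ip, old_size) != bucket(ip, new_size) for ip in ips)
    return moved / len(ips)


# Synthetic client addresses, purely for the demonstration.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(2000)]
fraction = remapped_fraction(ips, 4, 5)
```

With a uniform hash, a client keeps its bucket only when the hash gives the same remainder modulo 4 and modulo 5, which happens for about one value in five. So roughly 80% of clients are expected to move after adding a single server. Consistent hashing is the usual way to reduce that churn.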

Soundbite: "L7 for path-based routing and cookie affinity; L4 when you need raw TCP throughput with no parsing cost."

Key terms

  • **L4 load balancer**: routes by IP/port without inspecting HTTP content; low overhead, no TLS termination.
  • **L7 load balancer**: inspects HTTP headers, paths, and cookies to route requests; terminates TLS.
  • **Sticky session**: affinity rule that pins a client to the same backend node across requests.
  • **Health check**: periodic probe (HTTP GET or TCP connect) that removes unhealthy backends from the pool.
  • **Least-connections**: algorithm routing each new request to the backend with the fewest open connections.
