Context

Reliability, scalability, fault tolerance. You know these concepts are important and you want your system to follow these principles. As a result, you’ve deployed your backend services to run with multiple replicas. You have servers that:

Accept traffic from browser-based clients on the internet
Accept traffic from other services
Persist data in a SQL database

Problem

Uh oh! All your start-of-journey traffic from the browser is routed to one server instance. Your server is unable to handle the load and becomes unresponsive. Data is not propagated throughout the rest of your system.

Solution

We will need to find some way to distribute incoming network traffic across multiple servers. We can use a Load Balancer (LB) to achieve this.

Load Balancers are found in nearly every distributed system, where they spread traffic across a cluster of servers to improve availability and reliability. A Load Balancer also tracks the health of each resource (server, website, database) it distributes traffic to. If a resource is not in a healthy state, the Load Balancer will not send traffic to it.
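
The health-tracking behaviour described above can be sketched in a few lines. This is only an illustration: the `Backend` record, its `healthy` flag, and the server names are hypothetical, and a real Load Balancer would update the flag from periodic health checks.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical record for one resource behind the load balancer."""
    name: str
    healthy: bool = True  # assumed to be updated by a separate health check


def routable(backends: list[Backend]) -> list[Backend]:
    """Return only the backends that are currently marked healthy."""
    return [b for b in backends if b.healthy]


pool = [Backend("app-1"), Backend("app-2", healthy=False), Backend("app-3")]
print([b.name for b in routable(pool)])  # ['app-1', 'app-3']
```

Whichever balancing algorithm is used would then choose only among the backends returned by `routable`.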

Typically a Load Balancer is placed at the boundary between incoming traffic from the public internet and a system's private network. This would solve the initial problem encountered above. However, we can and should take this a step further.

To fully maximize availability and reliability, we balance the load at each layer of the system:

Between the browser client and the web server
Between each server and the servers it communicates with in your private network
Between each server and the data-persistence layer

Resulting Context

Instead of a single device performing a lot of work, load balancing has several devices performing a little bit of work.
Users experience faster, uninterrupted service.
Systems experience less downtime and higher throughput.
Smart load balancers can provide extra benefits, like predictive analytics that identify traffic bottlenecks before they happen.

The following sections take a deeper dive into load balancers.

What can we load-balance on?

Requests can be load-balanced by a variety of methods. The most common are:

Host-based: Distributes traffic based on the hostname of the request.
Path-based: Distributes traffic based on the URL path of the request.
Content-based: Distributes traffic based on the full content of the request.
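
As a rough sketch of host-based and path-based routing, the snippet below maps a request URL to a pool of servers. The hostnames, path prefixes, pool names, and the rule that path matches win over host matches are all assumptions made for illustration. Content-based routing is left out because it depends on the request body.

```python
from urllib.parse import urlsplit

# Hypothetical routing tables; every hostname, prefix, and pool name is made up.
HOST_ROUTES = {"api.example.com": "api-pool", "www.example.com": "web-pool"}
PATH_ROUTES = {"/images": "static-pool", "/checkout": "payments-pool"}


def pick_pool(url: str, default: str = "web-pool") -> str:
    """Choose a server pool: path-prefix rules first, then hostname rules."""
    parts = urlsplit(url)
    for prefix, pool in PATH_ROUTES.items():
        # Match the prefix itself or anything nested under it.
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            return pool
    return HOST_ROUTES.get(parts.hostname or "", default)


print(pick_pool("https://www.example.com/images/logo.png"))  # static-pool
print(pick_pool("https://api.example.com/v1/users"))         # api-pool
```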

Where do load balancers operate?

Network

Load balancers at the Network layer operate at layer 4 in the OSI model (transport). Here routing is performed based on networking information such as IP addresses and TCP/UDP ports, without inspecting the content of the request.

Application

Load balancers at the Application layer operate at layer 7 in the OSI model. At this layer, Load Balancers can inspect and analyze requests in their entirety and route based on content.
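
To make the difference between the two layers concrete, here is a small sketch of what each kind of Load Balancer has available when it makes a routing decision. The `Connection` and `HttpRequest` types, the routing rules, and the pool names are hypothetical and only meant to contrast the inputs.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """What a layer 4 balancer sees: addresses and ports only."""
    src_ip: str
    src_port: int
    dst_port: int


@dataclass
class HttpRequest:
    """What a layer 7 balancer can also see: the parsed request."""
    conn: Connection
    host: str
    path: str


def route_l4(conn: Connection) -> str:
    # Decision uses transport-level fields only (here: the destination port).
    return "tls-pool" if conn.dst_port == 443 else "plain-pool"


def route_l7(req: HttpRequest) -> str:
    # Decision can use request content, such as the URL path.
    return "api-pool" if req.path.startswith("/api/") else "web-pool"


conn = Connection("203.0.113.7", 51000, 443)
print(route_l4(conn))                                                # tls-pool
print(route_l7(HttpRequest(conn, "www.example.com", "/api/orders")))  # api-pool
```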

Load Balancing Algorithms

Least Connection Method: Directs traffic to the server with the fewest active connections.

Least Response Time Method: Directs traffic to the server with the fewest active connections and the lowest average response time.
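
The two connection-based rules above, one that looks only at active connections and one that also considers average response time, might look like the following. The `Server` record and its numbers are hypothetical, and treating response time as a tie-breaker is just one possible way to combine the two measurements.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical snapshot of one server's current load."""
    name: str
    active_connections: int
    avg_response_ms: float


def pick_fewest_connections(servers: list[Server]) -> Server:
    """Pick the server with the fewest active connections."""
    return min(servers, key=lambda s: s.active_connections)


def pick_fewest_connections_fastest(servers: list[Server]) -> Server:
    """Assumption: compare connections first, response time breaks ties."""
    return min(servers, key=lambda s: (s.active_connections, s.avg_response_ms))


servers = [Server("a", 4, 120.0), Server("b", 2, 300.0), Server("c", 2, 80.0)]
print(pick_fewest_connections(servers).name)          # b (first of the tied servers)
print(pick_fewest_connections_fastest(servers).name)  # c
```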

Least Bandwidth Method: Directs traffic to the server that is currently serving the least amount of traffic. Typically measured in megabits per second (Mbps).
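
A minimal sketch of the same idea, assuming the Load Balancer already has a current Mbps reading per server (the readings and server names here are invented):

```python
def least_bandwidth(mbps_by_server: dict[str, float]) -> str:
    """Return the name of the server currently serving the least traffic."""
    return min(mbps_by_server, key=mbps_by_server.get)


# Hypothetical current throughput per server, in megabits per second.
current_mbps = {"a": 420.5, "b": 97.2, "c": 310.0}
print(least_bandwidth(current_mbps))  # b
```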

Round Robin Method: This method cycles through a list of servers and sends each new request to the next server. When it reaches the end of the list, it starts over at the beginning. It is most useful when the servers are of equal specification and there are not many persistent connections.
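
Cycling through the list can be sketched with Python's `itertools.cycle`. The server names are placeholders.

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
next_server = cycle(servers)           # restarts at the beginning after the last server

order = [next(next_server) for _ in range(5)]
print(order)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```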

Weighted Round Robin Method: A more intelligent Round Robin. Each server is assigned a weight indicating its processing capacity. Servers with higher weights receive new connections before those with lower weights, and they receive a larger share of connections overall.
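
One naive way to sketch this is to repeat each server in the rotation as many times as its weight. The weights and names below are made up, and real load balancers often interleave the picks more evenly than this simple expansion does.

```python
from itertools import cycle

# Hypothetical weights: "big" has three times the capacity of "small".
weights = {"big": 3, "small": 1}

# Expand each server into the schedule once per unit of weight.
schedule = [name for name, weight in weights.items() for _ in range(weight)]
picker = cycle(schedule)

picks = [next(picker) for _ in range(8)]
print(picks.count("big"), picks.count("small"))  # 6 2
```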

IP Hash: A hash of the client's IP address determines which server receives the request, so the same client is consistently routed to the same server as long as the server pool does not change.
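
A minimal sketch of IP hashing is shown below. The choice of SHA-256 and the simple modulo step are assumptions for illustration. A stable hash is used rather than Python's built-in `hash()`, which is randomized per process for strings.

```python
import hashlib


def pick_by_ip(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to a server using a stable hash and a modulo."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
first = pick_by_ip("203.0.113.7", servers)
second = pick_by_ip("203.0.113.7", servers)
print(first == second)  # True
```

With plain modulo hashing, adding or removing a server changes the mapping for many clients at once.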

Redundancy

Load balancers can be a single point of failure. To combat this we can deploy load balancers in a cluster. Load balancers in a cluster should be aware of the health of other load balancers and be able to take over traffic where needed.
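
One way to picture this is an active/passive pair that watches each other's heartbeats. The sketch below assumes a hypothetical `LoadBalancerNode` record, a priority-ordered list, and a three-second timeout. It is not how any particular product implements failover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadBalancerNode:
    """Hypothetical load balancer instance in a cluster."""
    name: str
    last_heartbeat: float  # time of the last heartbeat received, in seconds


def active_node(nodes: list[LoadBalancerNode], now: float,
                timeout: float = 3.0) -> Optional[LoadBalancerNode]:
    """Return the first node, in priority order, whose heartbeat is recent."""
    for node in nodes:
        if now - node.last_heartbeat <= timeout:
            return node
    return None  # no node in the cluster is responding


cluster = [LoadBalancerNode("lb-primary", last_heartbeat=90.0),
           LoadBalancerNode("lb-standby", last_heartbeat=99.5)]
print(active_node(cluster, now=100.0).name)  # lb-standby
```

Here the primary last reported 10 seconds ago, so the standby takes over the traffic.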

Load balancers in the real world

Examples of commonly used load balancers include:
