Server load balancing architectures, Part 1: Transport-level load balancing

High scalability and availability for server farms

Availability and scalability

Server load balancing distributes service requests across a group of real servers and makes those servers look like a single big server to the clients. Often dozens of real servers are behind a URL that implements a single virtual service.

How does this work? In a widely used server load balancing architecture, the incoming request is directed to a dedicated server load balancer that is transparent to the client. Based on parameters such as availability or current server load, the load balancer decides which server should handle the request and forwards it to the selected server. To provide the load balancing algorithm with the required input data, the load balancer also retrieves information about the servers' health and load to verify that they can respond to traffic. Figure 1 illustrates this classic load balancer architecture.

Figure 1. Classic load balancer architecture (load dispatcher)
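
To make the dispatch step concrete, here is a minimal sketch of the selection logic such a load balancer might use. It is illustrative only: the RealServer type, its isAlive() method, and the round-robin policy are assumptions for this example, not part of any particular product.

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical view of a real server as seen by the dispatcher.
    interface RealServer {
        boolean isAlive();   // health state, updated by a separate monitor
    }

    public class LoadDispatcher {
        private final List<RealServer> servers;
        private final AtomicInteger next = new AtomicInteger();

        public LoadDispatcher(List<RealServer> servers) {
            this.servers = servers;
        }

        // Walk the server list in round-robin order and return the first
        // healthy candidate; the client never sees this selection happen.
        public RealServer select() {
            for (int i = 0; i < servers.size(); i++) {
                int index = Math.floorMod(next.getAndIncrement(), servers.size());
                RealServer candidate = servers.get(index);
                if (candidate.isAlive()) {
                    return candidate;
                }
            }
            throw new IllegalStateException("no healthy server available");
        }
    }

A production balancer would typically weigh current server load rather than simply rotating, but the structure -- pick a server, forward the request, stay invisible to the client -- is the same.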

The load-dispatcher architecture illustrated in Figure 1 is just one of several approaches. To decide which load balancing solution is the best for your infrastructure, you need to consider availability and scalability.

Availability is determined by uptime -- the time between failures. (Downtime is the time needed to detect a failure, repair it, perform any required recovery, and restart tasks.) During uptime the system must respond to each request within a predetermined, well-defined time; if that time is exceeded, the client perceives it as a server malfunction. High availability is essentially redundancy in the system: if one server fails, the others take over the failed server's load transparently, so the failure of an individual server is invisible to the client.
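
Expressed as a ratio, availability is uptime divided by total time. With illustrative numbers only: a farm that averages 1,000 hours between failures and one hour of downtime per failure is available 1,000 / 1,001 of the time, or roughly 99.9 percent.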

Scalability means that the system can serve a single client as well as thousands of simultaneous clients while still meeting quality-of-service requirements such as response time. Under an increased load, a highly scalable system can increase throughput almost linearly in proportion to the added hardware resources.
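
As a rough illustration (numbers invented for the example): if one server sustains 500 requests per second within the target response time, a near-linearly scalable farm of four such servers should sustain close to 2,000.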

In the scenario in Figure 1, high scalability is achieved by distributing incoming requests across the servers. If the load increases, additional servers can be added, as long as the load balancer itself does not become the bottleneck. To achieve high availability, the load balancer must monitor the servers to avoid forwarding requests to overloaded or dead servers. Furthermore, the load balancer itself must be redundant too; I'll discuss this point later in this article.
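
The monitoring just described can be as simple as a periodic TCP connect to each server. The sketch below assumes a MonitoredServer type with an address and a writable alive flag; both names are hypothetical, and real balancers probe much more (response times, application-level checks, and so on).

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;
    import java.util.List;

    // Hypothetical handle to a real server that the monitor updates.
    interface MonitoredServer {
        InetSocketAddress address();
        void setAlive(boolean alive);
    }

    public class HealthMonitor implements Runnable {
        private final List<MonitoredServer> servers;

        public HealthMonitor(List<MonitoredServer> servers) {
            this.servers = servers;
        }

        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                for (MonitoredServer server : servers) {
                    server.setAlive(ping(server.address()));
                }
                try {
                    Thread.sleep(5000);   // probe interval in milliseconds
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }

        // A TCP connection established within one second counts as alive.
        private boolean ping(InetSocketAddress address) {
            try (Socket socket = new Socket()) {
                socket.connect(address, 1000);
                return true;
            } catch (IOException e) {
                return false;
            }
        }
    }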

Server load balancing techniques

In general, server load balancing solutions are of two main types:

  • Transport-level load balancing -- such as the DNS-based approach (see the sketch after this list) or TCP/IP-level load balancing -- acts independently of the application payload.
  • Application-level load balancing uses the application payload to make load balancing decisions.
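
The DNS-based approach needs no dedicated balancer at all: the DNS server simply returns several A records for a single name. The following sketch shows how a Java client sees those records; www.example.com is a placeholder and would have to be replaced by a name that actually carries multiple addresses.

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    public class DnsRoundRobinDemo {
        public static void main(String[] args) throws UnknownHostException {
            // All A records registered for the name are returned; which one
            // a client tries first varies, spreading requests across the
            // real servers behind the single logical name.
            InetAddress[] addresses = InetAddress.getAllByName("www.example.com");
            for (InetAddress address : addresses) {
                System.out.println(address.getHostAddress());
            }
        }
    }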

Load balancing solutions can be further classified into software-based load balancers and hardware-based load balancers. Hardware-based load balancers are specialized hardware boxes that include application-specific integrated circuits (ASICs) customized for a particular use. ASICs enable high-speed forwarding of network traffic without the overhead of a general-purpose operating system. Hardware-based load balancers are often used for transport-level load balancing. In general, hardware-based load balancers are faster than software-based solutions. Their drawback is their cost.
