Server load balancing architectures, Part 1: Transport-level load balancing

High scalability and availability for server farms

In contrast to hardware load balancers, software-based load balancers run on standard operating systems and standard hardware components such as PCs. Software-based solutions run either within a dedicated load-balancer hardware node, as in Figure 1, or directly in the application.

DNS-based load balancing

DNS-based load balancing represents one of the early server load balancing approaches. The Internet's domain name system (DNS) associates IP addresses with a host name. If you type a host name (as part of the URL) into your browser, the browser requests that the DNS server resolve the host name to an IP address.

The DNS-based approach is based on the fact that DNS allows multiple IP addresses (real servers) to be assigned to one host name, as shown in the DNS lookup example in Listing 1.

Listing 1. Example DNS lookup

>nslookup amazon.com
Server:   ns.box
Address:  192.168.1.1

Name:       amazon.com
Addresses:  72.21.203.1, 72.21.210.11, 72.21.206.5

If the DNS server implements a round-robin approach, the order of the IP addresses for a given host changes after each DNS response. Usually clients such as browsers try to connect to the first address returned from a DNS query. The result is that responses to multiple clients are distributed among the servers. In contrast to the server load balancing architecture in Figure 1, no intermediate load balancer hardware node is required.
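In Java, this multiple-address resolution is directly observable through the standard InetAddress API. The following minimal sketch (the host name is just an example) retrieves all addresses registered for a host, in the order the resolver returns them:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsLookupDemo {
    public static void main(String[] args) throws UnknownHostException {
        // Resolve all IP addresses registered for the host name.
        // A round-robin DNS server may rotate this order between queries.
        InetAddress[] addresses = InetAddress.getAllByName("amazon.com");
        for (InetAddress address : addresses) {
            System.out.println(address.getHostAddress());
        }
        // Clients typically connect to the first address in the list.
    }
}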

DNS is an efficient solution for global server load balancing, where load must be distributed between data centers at different locations. DNS-based global server load balancing is often combined with other server load balancing solutions to distribute the load within a dedicated data center.

Although easy to implement, the DNS approach has serious drawbacks. To reduce the number of DNS queries, clients tend to cache the resolved addresses. If a server becomes unavailable, the client cache as well as the DNS server will continue to hold the dead server's address. For this reason, the DNS approach does little to provide high availability.
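For Java clients, at least the client-side caching is tunable: the JVM caches successful lookups according to the networkaddress.cache.ttl security property. A small sketch (the 30-second value is an arbitrary example) lowers the cache lifetime so that dead addresses are re-resolved sooner:

import java.security.Security;

public class DnsCacheConfig {
    public static void main(String[] args) {
        // Cache successful DNS lookups for at most 30 seconds.
        Security.setProperty("networkaddress.cache.ttl", "30");
        // Do not cache failed lookups at all.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }
}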

TCP/IP server load balancing

TCP/IP server load balancers operate on low-level layer switching. A popular software-based low-level server load balancer is the Linux Virtual Server (LVS). The real servers appear to the outside world as a single "virtual" server. The incoming requests on a TCP connection are forwarded to the real servers by the load balancer, which runs a Linux kernel patched to include IP Virtual Server (IPVS) code.

To ensure high availability, in most cases a pair of load balancer nodes are set up, with one load balancer node in passive mode. If a load balancer fails, the heartbeat program that runs on both load balancers activates the passive load balancer node and initiates the takeover of the Virtual IP address (VIP). While the heartbeat is responsible for managing the failover between the load balancers, simple send/expect scripts are used to monitor the health of the real servers.
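The send/expect principle is easy to reproduce. The following Java sketch (host, port, and expected status are hypothetical; production LVS setups typically use shell scripts or tools such as ldirectord) opens a TCP connection to a real server, sends a minimal HTTP request, and expects a 200 status line in return:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetSocketAddress;
import java.net.Socket;

public class SendExpectCheck {

    // Returns true if the real server answers the "send" with the expected status.
    static boolean isHealthy(String host, int port) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 2000); // 2s connect timeout
            socket.setSoTimeout(2000);                               // 2s read timeout
            Writer out = new OutputStreamWriter(socket.getOutputStream());
            out.write("GET / HTTP/1.0\r\n\r\n");                     // send
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));
            String statusLine = in.readLine();
            return statusLine != null && statusLine.contains("200"); // expect
        } catch (Exception e) {
            return false; // refused or timed out: treat the server as down
        }
    }

    public static void main(String[] args) {
        System.out.println(isHealthy("10.0.0.1", 80)); // hypothetical real server
    }
}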

Transparency to the client is achieved by using a VIP that is assigned to the load balancer. When a client issues a request, the requested host name is first translated into the VIP. On receiving the request packet, the load balancer decides which real server should handle it and rewrites the packet's target IP address to the Real IP (RIP) of that server. LVS supports several scheduling algorithms for distributing requests to the real servers; it is often set up to use round-robin scheduling, similar to DNS-based load balancing. With LVS, the load balancing decision is made on the TCP level (Layer 4 of the OSI Reference Model).

After receiving the request packet, the real server handles it and returns the response packet. To force the response packet to be returned through the load balancer, the real server uses the VIP as its default response route. When the load balancer receives the response packet, it rewrites the packet's source IP to the VIP (OSI Model Layer 3). This LVS routing mode is called Network Address Translation (NAT) routing. Figure 2 shows an LVS implementation that uses NAT routing.

Figure 2. LVS implemented with NAT routing
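As a rough illustration, an LVS NAT setup like the one in Figure 2 can be configured with the ipvsadm tool along these lines (the VIP 192.168.1.100 and the RIPs 10.0.0.1/10.0.0.2 are hypothetical):

# Add a virtual TCP service on the VIP, scheduled round-robin (-s rr)
ipvsadm -A -t 192.168.1.100:80 -s rr

# Add two real servers in NAT (masquerading) mode (-m)
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.1:80 -m
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.2:80 -m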

LVS also supports other routing modes, such as Direct Server Return. In this case the real server sends the response packet directly to the client. To make this possible, the VIP must be assigned to all the real servers, too. It is important that the servers' VIP remain unresolvable to the network (for example, by suppressing ARP replies for it); otherwise, the load balancer becomes unreachable. When the load balancer receives a request packet, it rewrites the MAC address (OSI Model Layer 2) of the request instead of the IP address. The real server receives the request packet and processes it. Based on the request's source IP address, the response packet is then sent directly to the client, bypassing the load balancer. For Web traffic this approach can reduce the balancer's workload dramatically, because typically many more response packets are transferred than request packets: requesting a Web page often takes only one IP packet, while several response IP packets are needed to transfer the page.
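In ipvsadm terms, Direct Server Return corresponds to the gatewaying mode (-g), and each real server additionally carries the VIP on a non-ARPing interface. A sketch, again with hypothetical addresses:

# On the load balancer: add the real server in direct-routing mode (-g)
ipvsadm -A -t 192.168.1.100:80 -s rr
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.1:80 -g

# On each real server: hold the VIP on the loopback device and suppress
# ARP replies for it, so that only the load balancer answers for the VIP
ip addr add 192.168.1.100/32 dev lo
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2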

Caching

Low-level server load balancer solutions such as LVS reach their limit if application-level caching or application-session support is required. Caching is an important scalability principle for avoiding expensive operations that fetch the same data repeatedly. A cache is a temporary store that holds redundant data resulting from a previous data-fetch operation. The value of a cache depends on the cost to retrieve the data versus the hit rate and required cache size.
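That trade-off can be made concrete. With a hit rate h, a cost per cache hit, and a cost per fetch from the origin, the expected cost per lookup is h * hitCost + (1 - h) * missCost. A tiny sketch with made-up numbers:

public class CacheValue {
    public static void main(String[] args) {
        double hitRate  = 0.90;  // 90% of lookups served from the cache (assumed)
        double hitCost  = 1.0;   // e.g., 1 ms for a cache hit (assumed)
        double missCost = 50.0;  // e.g., 50 ms for a database fetch (assumed)
        // Expected cost per lookup: h * hitCost + (1 - h) * missCost
        double expected = hitRate * hitCost + (1 - hitRate) * missCost;
        System.out.printf("Expected cost per lookup: %.1f ms%n", expected); // 5.9 ms
    }
}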

Depending on the load balancer's scheduling algorithm, the requests of a user session are handled by different servers. If a cache is used on the server side, such straying requests become a problem. One approach to handling this is to place the cache in a global space. memcached is a popular distributed cache solution that provides a large cache across multiple machines. It is a partitioned, distributed cache that uses consistent hashing to determine the cache server (daemon) for a given cache entry. Based on the cache key's hash code, the client library always maps the same hash code to the same cache server address. This address is then used to store the cache entry. Figure 3 illustrates this caching approach.

Figure 3. Load balancer architecture enhanced by a partitioned, distributed cache
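The key-to-server mapping can be sketched as a simple hash ring. The sketch below is not memcached's actual implementation (real clients place multiple virtual points per server and use stronger hash functions), but it shows the core idea: each key is served by the first server at or after the key's position on the ring.

import java.util.SortedMap;
import java.util.TreeMap;

public class HashRing {
    // Ring positions (hash values) mapped to server addresses.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addServer(String address) {
        ring.put(address.hashCode(), address);
    }

    // A key maps to the first server at or after its hash on the ring,
    // wrapping around to the lowest position if necessary.
    String serverFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(key.hashCode());
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addServer("cache1:11211"); // hypothetical memcached daemons
        ring.addServer("cache2:11211");
        System.out.println(ring.serverFor("session-4711")); // same key, same server
    }
}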

Listing 2 uses spymemcached, a memcached client written in Java, to cache HttpResponse messages across multiple machines. The spymemcached library implements the required client logic I just described.
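As a simplified illustration of that pattern (with hypothetical daemon addresses and cache key, and a plain String standing in for the response message), a look-aside cache backed by spymemcached might look like this:

import java.util.concurrent.TimeUnit;
import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class ResponseCacheDemo {
    public static void main(String[] args) throws Exception {
        // Connect to two memcached daemons; the client library hashes
        // each key to one of them.
        MemcachedClient client = new MemcachedClient(
                AddrUtil.getAddresses("cache1:11211 cache2:11211"));

        String key = "response:/products/4711";  // hypothetical cache key
        Object cached = client.get(key);         // one network round trip
        if (cached == null) {
            String response = renderPage();      // the expensive operation
            client.set(key, 300, response);      // cache for 300 seconds
            cached = response;
        }
        System.out.println(cached);
        client.shutdown(10, TimeUnit.SECONDS);
    }

    static String renderPage() {
        return "<html>...</html>"; // stands in for real page rendering
    }
}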
