Server load balancing architectures, Part 1: Transport-level load balancing

High scalability and availability for server farms

In contrast to hardware load balancers, software-based load balancers run on standard operating systems and standard hardware components such as PCs. Software-based solutions run either within a dedicated load balancer hardware node, as in Figure 1, or directly in the application.

DNS-based load balancing

DNS-based load balancing represents one of the early server load balancing approaches. The Internet's domain name system (DNS) associates IP addresses with a host name. If you type a host name (as part of the URL) into your browser, the browser requests that the DNS server resolve the host name to an IP address.

The DNS-based approach is based on the fact that DNS allows multiple IP addresses (real servers) to be assigned to one host name, as shown in the DNS lookup example in Listing 1.

Listing 1. Example DNS lookup

>nslookup amazon.com
Server:   ns.box
Address:  192.168.1.1

Name:       amazon.com
Addresses:  72.21.203.1, 72.21.210.11, 72.21.206.5

If the DNS server implements a round-robin approach, the order of the IP addresses for a given host changes after each DNS response. Usually clients such as browsers try to connect to the first address returned from a DNS query. The result is that responses to multiple clients are distributed among the servers. In contrast to the server load balancing architecture in Figure 1, no intermediate load balancer hardware node is required.
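
The rotation described above can be modeled in a few lines of Java. This is only a toy sketch of a round-robin DNS server, not a real resolver: the class name RoundRobinDns and its resolve() method are hypothetical, and the addresses are the ones from Listing 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model of a round-robin DNS server (hypothetical class, not a real resolver). */
final class RoundRobinDns {
    private final List<String> addresses;
    private int offset = 0;

    RoundRobinDns(List<String> addresses) {
        this.addresses = new ArrayList<>(addresses);
    }

    /** Returns all addresses for the host, rotated by one position per response. */
    synchronized List<String> resolve() {
        List<String> response = new ArrayList<>(addresses);
        Collections.rotate(response, -offset);
        offset = (offset + 1) % addresses.size();
        return response;
    }

    public static void main(String[] args) {
        RoundRobinDns dns = new RoundRobinDns(
                Arrays.asList("72.21.203.1", "72.21.210.11", "72.21.206.5"));
        for (int client = 1; client <= 4; client++) {
            // A typical client connects to the first address in the response.
            System.out.println("client " + client + " -> " + dns.resolve().get(0));
        }
    }
}
```

Because each response starts with a different address, successive clients end up on different real servers, and the fourth client wraps around to the first server again.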

DNS is an efficient solution for global server load balancing, where load must be distributed between data centers at different locations. Often the DNS-based global server load balancing is combined with other server load balancing solutions to distribute the load within a dedicated data center.

Although easy to implement, the DNS approach has serious drawbacks. To reduce the number of DNS queries, clients tend to cache the query results. If a server becomes unavailable, the client cache as well as the DNS server will continue to hold the dead server's address. For this reason, the DNS approach does little to implement high availability.

TCP/IP server load balancing

TCP/IP server load balancers operate as low-level switches on the network and transport layers. A popular software-based low-level server load balancer is the Linux Virtual Server (LVS). The real servers appear to the outside world as a single "virtual" server. The incoming requests on a TCP connection are forwarded to the real servers by the load balancer, which runs a Linux kernel patched to include IP Virtual Server (IPVS) code.

To ensure high availability, in most cases a pair of load balancer nodes is set up, with one load balancer node in passive mode. If the active load balancer fails, the heartbeat program that runs on both load balancers activates the passive load balancer node and initiates the takeover of the Virtual IP address (VIP). While the heartbeat is responsible for managing the failover between the load balancers, simple send/expect scripts are used to monitor the health of the real servers.

Transparency to the client is achieved by using a VIP that is assigned to the load balancer. When the client issues a request, the requested host name is first translated into the VIP. When it receives the request packet, the load balancer decides which real server should handle the request packet. The target IP address of the request packet is rewritten into the Real IP (RIP) of the real server. LVS supports several scheduling algorithms for distributing requests to the real servers. It is often set up to use round-robin scheduling, similar to DNS-based load balancing. With LVS, the load balancing decision is made on the TCP level (Layer 4 of the OSI Reference Model).

After receiving the request packet, the real server handles it and returns the response packet. To force the response packet to be returned through the load balancer, the real server uses the VIP as its default response route. If the load balancer receives the response packet, the source IP of the response packet is rewritten with the VIP (OSI Model Layer 3). This LVS routing mode is called Network Address Translation (NAT) routing. Figure 2 shows an LVS implementation that uses NAT routing.

Figure 2. LVS implemented with NAT routing
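
The two address rewrites of NAT routing can be illustrated with a small simulation. The sketch below is not LVS code: the class NatRoutingSketch, its Packet type, and the example IP addresses are all hypothetical, and it schedules every request packet round-robin, whereas a real Layer 4 balancer keeps all packets of one TCP connection on the same real server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified simulation of NAT routing; hypothetical names, not LVS code. */
final class NatRoutingSketch {

    /** Minimal stand-in for an IP packet: only the two addresses matter here. */
    static final class Packet {
        final String sourceIp;
        final String targetIp;

        Packet(String sourceIp, String targetIp) {
            this.sourceIp = sourceIp;
            this.targetIp = targetIp;
        }

        @Override
        public String toString() {
            return sourceIp + " -> " + targetIp;
        }
    }

    private final String vip;
    private final List<String> realIps;
    private int next = 0;

    NatRoutingSketch(String vip, List<String> realIps) {
        this.vip = vip;
        this.realIps = new ArrayList<>(realIps);
    }

    /** Request path: choose a real server round-robin and rewrite the target IP to its RIP. */
    Packet forwardRequest(Packet request) {
        String rip = realIps.get(next);
        next = (next + 1) % realIps.size();
        return new Packet(request.sourceIp, rip);
    }

    /** Response path: rewrite the source IP from the RIP back to the VIP. */
    Packet forwardResponse(Packet response) {
        return new Packet(vip, response.targetIp);
    }

    public static void main(String[] args) {
        // All addresses below are made-up documentation addresses.
        NatRoutingSketch balancer = new NatRoutingSketch(
                "203.0.113.10", Arrays.asList("10.0.0.1", "10.0.0.2"));
        Packet request = new Packet("198.51.100.7", "203.0.113.10");
        Packet forwarded = balancer.forwardRequest(request);
        // The real server answers the client address it saw in the request.
        Packet response = new Packet(forwarded.targetIp, forwarded.sourceIp);
        System.out.println("request:   " + request);
        System.out.println("forwarded: " + forwarded);
        System.out.println("response:  " + balancer.forwardResponse(response));
    }
}
```

The client only ever sees the VIP: the RIP appears in the forwarded request and in the raw response, but never in a packet that crosses the load balancer toward the client.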

LVS also supports other routing modes such as Direct Server Return. In this case the response packet is sent directly to the client by the real server. To do this, the VIP must be assigned to all real servers, too. It is important to make the server's VIP unresolvable to the network; otherwise, the load balancer becomes unreachable. If the load balancer receives a request packet, the MAC address (OSI Model Layer 2) of the request is rewritten instead of the IP address. The real server receives the request packet and processes it. Based on the source IP address, the response packet is sent to the client directly, bypassing the load balancer. For Web traffic this approach can reduce the balancer workload dramatically. Typically, many more response packets are transferred than request packets. For instance, if you request a Web page, often only one IP packet is sent. If a larger Web page is requested, several response IP packets are required to transfer the requested page.
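
For comparison, the Direct Server Return path can be sketched the same way. Again, this is a simplified simulation rather than LVS code: the class DirectServerReturnSketch, its Frame type, and all MAC and IP addresses are made up for illustration.

```java
/** Simplified simulation of Direct Server Return; hypothetical names, not LVS code. */
final class DirectServerReturnSketch {

    /** Minimal stand-in for an Ethernet frame that carries an IP packet. */
    static final class Frame {
        final String targetMac;
        final String sourceIp;
        final String targetIp;

        Frame(String targetMac, String sourceIp, String targetIp) {
            this.targetMac = targetMac;
            this.sourceIp = sourceIp;
            this.targetIp = targetIp;
        }

        @Override
        public String toString() {
            return "[to MAC " + targetMac + "] " + sourceIp + " -> " + targetIp;
        }
    }

    /** Load balancer: rewrite only the MAC address; the IP addresses stay untouched. */
    static Frame balance(Frame request, String realServerMac) {
        return new Frame(realServerMac, request.sourceIp, request.targetIp);
    }

    /**
     * Real server: it owns the VIP too, so it answers with the VIP as source IP
     * and sends the response toward the client, bypassing the load balancer.
     */
    static Frame respond(Frame request, String nextHopMac) {
        return new Frame(nextHopMac, request.targetIp, request.sourceIp);
    }

    public static void main(String[] args) {
        // All addresses below are made up for the example.
        Frame request = new Frame("00:00:5e:00:53:01", "198.51.100.7", "203.0.113.10");
        Frame forwarded = balance(request, "00:00:5e:00:53:21");
        Frame response = respond(forwarded, "00:00:5e:00:53:ff");
        System.out.println("request:   " + request);
        System.out.println("forwarded: " + forwarded);
        System.out.println("response:  " + response);
    }
}
```

Unlike the NAT case, the forwarded request still carries the VIP as its target IP, which is why the real server can answer the client directly without any rewrite on the way back.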

Caching

Low-level server load balancer solutions such as LVS reach their limit if application-level caching or application-session support is required. Caching is an important scalability principle for avoiding expensive operations that fetch the same data repeatedly. A cache is a temporary store that holds redundant data resulting from a previous data-fetch operation. The value of a cache depends on the cost to retrieve the data versus the hit rate and required cache size.
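
To make the relationship between hit rate and cache size concrete, here is a minimal sketch of a size-bounded, in-process cache that counts hits and misses. The class CountingLruCache and its loader function are hypothetical; the least-recently-used eviction policy is an assumption chosen for the example, built on the standard LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Size-bounded cache with LRU eviction that tracks its hit rate (hypothetical example class). */
final class CountingLruCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private long hits;
    private long misses;

    CountingLruCache(final int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, or runs the expensive loader on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key); // the expensive fetch the cache tries to avoid
        entries.put(key, value);
        return value;
    }

    synchronized double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        CountingLruCache<String, String> cache =
                new CountingLruCache<>(2, key -> "page:" + key);
        for (String key : new String[] {"a", "b", "a", "c", "b"}) {
            cache.get(key);
        }
        System.out.println("hit rate: " + cache.hitRate());
    }
}
```

With room for only two entries, the request sequence in main evicts "b" before it is requested again, so only one of the five lookups is a hit. A larger cache would raise the hit rate at the price of more memory, which is the trade-off described above.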

Depending on the load balancer's scheduling algorithm, the requests of a single user session may be handled by different servers. If each server keeps its own cache, these straying requests become a problem: a request that lands on a different server does not find the entries cached by the previous one. One approach to handle this is to place the cache in a global space. memcached is a popular distributed cache solution that provides a large cache across multiple machines. It is a partitioned, distributed cache that uses consistent hashing to determine the cache server (daemon) for a given cache entry. Based on the cache key's hash code, the client library always maps the same hash code to the same cache server address. This address is then used to store the cache entry. Figure 3 illustrates this caching approach.

Figure 3. Load balancer architecture enhanced by a partitioned, distributed cache
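
The key-to-server mapping can be sketched as a hash ring. The code below is a minimal illustration of consistent hashing, not the implementation used by memcached clients such as spymemcached: the class ConsistentHashRing, the MD5-based hash, the 100 ring points per server, and the server addresses are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Minimal consistent-hash ring (hypothetical class; hash and point count are assumptions). */
final class ConsistentHashRing {
    private static final int POINTS_PER_SERVER = 100; // assumed value for this sketch
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one cache server is required");
        }
        // Each server is placed on the ring at many points to even out the distribution.
        for (String server : servers) {
            for (int i = 0; i < POINTS_PER_SERVER; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    /** Maps a cache key to the first server point at or after the key's hash, wrapping around. */
    String serverFor(String cacheKey) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(cacheKey));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Uses the first 8 bytes of an MD5 digest as a 64-bit ring position. */
    private static long hash(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            long value = 0;
            for (int i = 0; i < 8; i++) {
                value = (value << 8) | (digest[i] & 0xFF);
            }
            return value;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical cache server addresses.
        ConsistentHashRing ring = new ConsistentHashRing(
                Arrays.asList("cache1:11211", "cache2:11211", "cache3:11211"));
        for (String key : new String[] {"/index.html", "/cart", "/products/42"}) {
            System.out.println(key + " -> " + ring.serverFor(key));
        }
    }
}
```

Every client that builds the ring from the same server list maps a given key to the same server, so it does not matter which application server handles a request. If one cache server is removed, only the keys that lived on that server move; all other keys keep their server.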

Listing 2 uses spymemcached, a memcached client written in Java, to cache HttpResponse messages across multiple machines. The spymemcached library implements the required client logic I just described.
