Server load balancing architectures, Part 2: Application-level load balancing

Operating with application knowledge

The transport-level server load balancing architectures described in the first half of this article are more than adequate for many Web sites, but more complex and dynamic sites can't depend on them. Applications that rely on cache or session data must be able to handle a sequence of requests from the same client accurately and efficiently. In this follow-up to his introduction to server load balancing, Gregor Roth discusses various application-level load balancing architectures, helping you decide which one will best meet the business requirements of your Web site.

The first half of this article describes transport-level server load balancing solutions, such as TCP/IP-based load balancers, and analyzes their benefits and disadvantages. Load balancing on the TCP/IP level spreads incoming TCP connections over the real servers in a server farm. It is sufficient in most cases, especially for static Web sites. However, support for dynamic Web sites often requires higher-level load balancing techniques. For instance, if the server-side application must deal with caching or application session data, effective support for client affinity becomes an important consideration. Here in Part 2, I'll discuss techniques for implementing server load balancing at the application level to address the needs of many dynamic Web sites.

Intermediate server load balancers

In contrast to low-level load balancing solutions, application-level server load balancing operates with application knowledge. One popular load-balancing architecture, shown in Figure 1, includes both an application-level load balancer and a transport-level load balancer.

Figure 1. Load balancing on transport and application levels

The application-level load balancer appears to the transport-level load balancer as a normal server. Incoming TCP connections are forwarded to the application-level load balancer. When it receives an application-level request, it determines the target server on the basis of application-level data and forwards the request to that server.

Listing 1 shows an application-level load balancer that uses an HTTP request parameter to decide which back-end server to use. In contrast to a transport-level load balancer, it makes its routing decision based on the application-level HTTP request, and the unit of forwarding is an HTTP request. Similarly to the memcached approach I discussed in Part 1, this solution uses a "hash key"-based partitioning algorithm to determine the server to use. Attributes such as a user ID or session ID are often used as the partitioning key. As a result, the same server instance always handles the same user; the user's client is affine, or "sticky," to the server. For this reason the server can make use of the local HttpRequest cache I discussed in Part 1.

Listing 1. Intermediate application-level load balancer

class LoadBalancerHandler implements IHttpRequestHandler, ILifeCycle {
   private final List<InetSocketAddress> servers = new ArrayList<InetSocketAddress>();
   private HttpClient httpClient;

   /*
    * this class does not implement server monitoring or healthiness checks
    */

   public LoadBalancerHandler(InetSocketAddress... srvs) {
      servers.addAll(Arrays.asList(srvs));
   }

   public void onInit() {
      httpClient = new HttpClient();
      httpClient.setAutoHandleCookies(false);
   }


   public void onDestroy() throws IOException {
      httpClient.close();
   }

   public void onRequest(final IHttpExchange exchange) throws IOException {
      IHttpRequest request = exchange.getRequest();

      // determine the business server based on the id's hashcode
      Integer customerId = request.getRequiredIntParameter("id");
      int idx = customerId.hashCode() % servers.size();
      if (idx < 0) {
         idx *= -1;
      }

      // retrieve the business server address and update the Request-URL of the request
      InetSocketAddress server = servers.get(idx);
      URL url = request.getRequestUrl();
      URL newUrl = new URL(url.getProtocol(), server.getHostName(), server.getPort(), url.getFile());
      request.setRequestUrl(newUrl);

      // proxy header handling (remove hop-by-hop headers, ...)
      // ...


      // create a response handler to forward the response to the caller
      IHttpResponseHandler respHdl = new IHttpResponseHandler() {

         @Execution(Execution.NONTHREADED)
         public void onResponse(IHttpResponse response) throws IOException {
            exchange.send(response);
         }

         @Execution(Execution.NONTHREADED)
         public void onException(IOException ioe) throws IOException {
            exchange.sendError(ioe);
         }
      };

      // forward the request in an asynchronous way by passing over the response handler
      httpClient.send(request, respHdl);
   }
}



class LoadBalancer {

   public static void main(String[] args) throws Exception {
      InetSocketAddress[] srvs = new InetSocketAddress[] { new InetSocketAddress("srv1", 8030), new InetSocketAddress("srv2", 8030)};
      HttpServer loadBalancer = new HttpServer(8080, new LoadBalancerHandler(srvs));
      loadBalancer.run();
   }
}

In Listing 1, the LoadBalancerHandler reads the HTTP id request parameter and computes its hash code. (Going beyond this simple example, in some cases load balancers must read part of the HTTP body to retrieve the information the balancing algorithm requires.) The request is then forwarded, based on the result of the modulo operation, by the HttpClient object. The HttpClient also pools and reuses (persistent) connections to the servers for performance reasons. The response is handled asynchronously through an HttpResponseHandler. This non-blocking, asynchronous approach minimizes the load balancer's resource requirements; for instance, no thread is tied up for the duration of a call. For a more detailed explanation of asynchronous, non-blocking HTTP programming, read my article "Asynchronous HTTP and Comet architectures."
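As an aside, `Math.floorMod` offers a more compact way to keep the partitioning index non-negative than the explicit sign correction in Listing 1. The following minimal sketch shows the same "hash key"-based partitioning in isolation; the `Partitioner` class name and the server address strings are illustrative, not part of the listings:

```java
import java.util.List;

// Illustrative sketch of "hash key"-based partitioning (class and server
// names are assumptions, not taken from the article's listings)
class Partitioner {
   private final List<String> servers;

   Partitioner(List<String> servers) {
      this.servers = servers;
   }

   // Math.floorMod always returns a non-negative slot for a positive
   // divisor, so no separate negative-index correction is needed
   String serverFor(Object key) {
      int slot = Math.floorMod(key.hashCode(), servers.size());
      return servers.get(slot);
   }
}
```

Because the slot depends only on the key's hash code and the (fixed) server count, the same key always maps to the same server, which is exactly the stickiness property the local cache relies on.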

Another intermediate application-level server load balancing technique is cookie injection. In this case the load balancer checks whether the request contains a specific load balancing cookie. If the cookie is not found, a server is selected using a distribution algorithm such as round-robin, and a load balancing session cookie is added to the response before it is sent. When the browser receives the session cookie, it stores the cookie in temporary memory and does not retain it after the browser is closed. The browser then adds the cookie to all subsequent requests in that session, so every request reaching the load balancer carries it. By storing the server slot as the cookie value, the load balancer can determine which server is responsible for this request (in this browser session). Listing 2 implements a load balancer based on cookie injection.

Listing 2. Cookie-injection based application-level load balancer

class CookieBasedLoadBalancerHandler implements IHttpRequestHandler, ILifeCycle {
   private final List<InetSocketAddress> servers = new ArrayList<InetSocketAddress>();
   private int serverIdx = 0;
   private HttpClient httpClient;

   /*
    * this class does not implement server monitoring or healthiness checks
    */

   public CookieBasedLoadBalancerHandler(InetSocketAddress... realServers) {
      servers.addAll(Arrays.asList(realServers));
   }

   public void onInit() {
      httpClient = new HttpClient();
      httpClient.setAutoHandleCookies(false);
   }

   public void onDestroy() throws IOException {
      httpClient.close();
   }

   public void onRequest(final IHttpExchange exchange) throws IOException {
      IHttpRequest request = exchange.getRequest();


      IHttpResponseHandler respHdl = null;
      InetSocketAddress serverAddr = null;

      // check if the request contains the LB_SLOT cookie
      cl : for (String cookieHeader : request.getHeaderList("Cookie")) {
         for (String cookie : cookieHeader.split(";")) {
            // trim to cope with the blank after ";" in the Cookie header
            String[] kvp = cookie.trim().split("=");
            if (kvp[0].startsWith("LB_SLOT")) {
               int slot = Integer.parseInt(kvp[1]);
               serverAddr = servers.get(slot);
               break cl;
            }
         }
      }

      // the request does not contain the LB_SLOT cookie -> select a server
      if (serverAddr == null) {
         final int slot = nextServerSlot();
         serverAddr = servers.get(slot);

         respHdl = new IHttpResponseHandler() {

            @Execution(Execution.NONTHREADED)
            public void onResponse(IHttpResponse response) throws IOException {
               // set the LB_SLOT cookie
               response.setHeader("Set-Cookie", "LB_SLOT=" + slot + ";Path=/");
               exchange.send(response);
            }

            @Execution(Execution.NONTHREADED)
            public void onException(IOException ioe) throws IOException {
               exchange.sendError(ioe);
            }
         };

      } else {
         respHdl = new IHttpResponseHandler() {

            @Execution(Execution.NONTHREADED)
            public void onResponse(IHttpResponse response) throws IOException {
               exchange.send(response);
            }

            @Execution(Execution.NONTHREADED)
            public void onException(IOException ioe) throws IOException {
               exchange.sendError(ioe);
            }
         };
      }

      // update the Request-URL of the request
      URL url = request.getRequestUrl();
      URL newUrl = new URL(url.getProtocol(), serverAddr.getHostName(), serverAddr.getPort(), url.getFile());
      request.setRequestUrl(newUrl);

      // proxy header handling (remove hop-by-hop headers, ...)
      // ...

      // forward the request
      httpClient.send(request, respHdl);
   }

   // get the next slot by using the round-robin approach
   private synchronized int nextServerSlot() {
      serverIdx++;
      if (serverIdx >= servers.size()) {
         serverIdx = 0;
      }
      return serverIdx;
   }
}


class LoadBalancer {

   public static void main(String[] args) throws Exception {
      InetSocketAddress[] srvs = new InetSocketAddress[] { new InetSocketAddress("srv1", 8030), new InetSocketAddress("srv2", 8030)};
      CookieBasedLoadBalancerHandler hdl = new CookieBasedLoadBalancerHandler(srvs);
      HttpServer loadBalancer = new HttpServer(8080, hdl);
      loadBalancer.run();
   }
}

Unfortunately, the cookie-injection approach works only if the browser accepts cookies. If the user deactivates cookies, the client loses its stickiness.

In general, the drawback of intermediate application-level load balancer solutions is that they require an additional node or process. Solutions that integrate a transport-level and an application-level server load balancer solve this problem but are often very expensive, and the flexibility gained by accessing application-level data is limited.

HTTP redirect-based server load balancer

One way to avoid additional network hops is to make use of the HTTP redirect directive. With the help of the redirect directive, the server reroutes a client to another location. Instead of returning the requested object, the server returns a redirect response such as 303 See Other. The client recognizes the new location and reissues the request. Figure 2 shows this architecture.

Figure 2. HTTP redirect-based application-level load balancing

Listing 3 implements an HTTP redirect-based application-level load balancer. The load balancer in Listing 3 doesn't forward the request. Instead, it sends a redirect status code, which contains an alternate location. According to the HTTP specification, the client repeats the request by using the alternate location. If the client uses the alternate location for further requests, the traffic goes to that server directly. No extra network hops are required.
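The redirect mechanics are simple enough to sketch independently of the article's HTTP library. The following illustration, an assumption-based sketch rather than Listing 3 itself, uses only the JDK's built-in com.sun.net.httpserver; the server addresses reuse the placeholder names from the earlier listings:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

// Illustrative sketch of a redirect-based load balancer using the JDK's
// built-in HTTP server (server addresses are placeholders)
class RedirectBalancer {
   static final String[] SERVERS = { "http://srv1:8030", "http://srv2:8030" };
   private static int next = 0;

   // round-robin slot selection, analogous to Listing 2
   private static synchronized int nextSlot() {
      int slot = next;
      next = (next + 1) % SERVERS.length;
      return slot;
   }

   static HttpServer start(int port) throws IOException {
      HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
      server.createContext("/", exchange -> {
         // instead of forwarding the request, answer with 303 See Other
         // and an alternate location on one of the real servers
         String target = SERVERS[nextSlot()] + exchange.getRequestURI();
         exchange.getResponseHeaders().set("Location", target);
         exchange.sendResponseHeaders(303, -1);   // -1 = no response body
         exchange.close();
      });
      server.start();
      return server;
   }

   public static void main(String[] args) throws IOException {
      start(8080);
   }
}
```

The load balancer never touches the response payload; it only issues a small redirect, after which the client talks to the selected server directly.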
