Jan 27, 2009 12:00 AM PT

Open source Java projects: Terracotta

A unique approach to clustering conquers scalability and fail-over

By distributing application load among multiple redundant servers, clustering maintains performance and keeps users blissfully unaware of single-server failures. In this Open source Java projects installment, Steven Haines introduces Terracotta, an enterprise Java clustering solution. Find out why Terracotta, unlike traditional clustering solutions, doesn't make you sacrifice an iota of reliability in the name of performance. Level: Intermediate

Terracotta is an open source solution for enterprise Java clustering that boasts near linear scalability and 100 percent reliability. Terracotta supports standard HTTP session clustering in Apache Tomcat and Oracle WebLogic, as well as open source projects such as Struts, Spring, and Hibernate. I'll start by explaining what's unique about Terracotta clustering. Then, after taking you through installation, I'll show you how to configure Terracotta to cluster a sample Web application that uses HTTP sessions, and how to deploy Terracotta clustering in a production environment.

Clustering solves two fundamental problems for mission-critical applications: scalability and fail-over. Scalability measures how well an application can maintain its performance under increasing load. Clustering addresses scalability by letting you distribute the load among several physical servers or server instances. Theoretically perfect scalability is linear: as new servers are added to a cluster, each server adds support for a constant number of users. For example, if one server can support 500 users, then two servers can support 1,000 users and three servers can support 1,500.
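The arithmetic behind linear scalability can be sketched in a few lines of Java. The class and method names here are hypothetical, and the "measured" figure in main is a made-up number used only to show how far a real cluster might fall short of the ideal.

```java
// Hypothetical helper: capacity of a cluster that scales perfectly linearly.
final class LinearScalability {
    /** Users supported when every server adds the same constant capacity. */
    static int idealCapacity(int servers, int usersPerServer) {
        return servers * usersPerServer;
    }

    /** Fraction of the ideal capacity that a measured cluster actually delivers. */
    static double scalingEfficiency(int servers, int usersPerServer, int measuredUsers) {
        return (double) measuredUsers / idealCapacity(servers, usersPerServer);
    }

    public static void main(String[] args) {
        for (int servers = 1; servers <= 3; servers++) {
            System.out.println(servers + " server(s): " + idealCapacity(servers, 500) + " users");
        }
        // A made-up measurement: three servers that sustain only 1,350 users.
        System.out.println("efficiency: " + scalingEfficiency(3, 500, 1350));
    }
}
```

An efficiency of 1.0 corresponds to the perfectly linear case described above; anything lower reflects overhead such as the replication costs discussed later.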

Performance vs. scalability

The concepts of performance and scalability are often interwoven, but they're distinct. Performance measures whether an application can respond to a request within its defined service-level agreement (SLA). Scalability measures how well an application can maintain its performance under increasing load. Horizontal clustering distributes load across multiple machines; you can think of it as "scaling out." You can think of vertical clustering as "scaling up," meaning that load is distributed to multiple application server instances running on the same physical server. Vertical clustering can sometimes better utilize all of a server's resources than a single JVM instance.

The other side of clustering is fail-over in the event of server failure. Successful fail-over makes outages transparent to users while maintaining their state within the application. It requires a strategy for replicating a user's state to one or more secondary servers and then, if the first server goes down, redirecting all subsequent requests to the secondary server(s). Deciding how, when, and where to send the data are fundamental challenges to implementing this strategy effectively.

Serialization -- the process of converting a Java object to a stream of bytes -- is the traditional approach to how the data is sent. Application servers typically identify whether a change has been made to stateful objects, serialize those objects, and send them to the replicated servers. This strategy is inefficient because the serialization process is "all or nothing." In many applications, such as those powered by portals, stateful information can be measured in megabytes. Even if a user changes only a single byte, such as changing a preference from "true" to "false," the application server must construct a serialized version of a potentially multi-megabyte object and send all of that data across the network to its replicated servers. This inefficiency hinders linear scalability.
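The "all or nothing" cost is easy to observe with standard Java serialization. The following is a minimal sketch, assuming a hypothetical UserState class that holds about 2 MB of data plus one boolean preference; it is not taken from any application server's replication code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationCost {
    /** Hypothetical session state: a large payload plus one small preference flag. */
    static final class UserState implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte[] portalData = new byte[2 * 1024 * 1024]; // about 2 MB of state
        boolean showTips = true;                              // the single-byte preference
    }

    /** Serializes the whole object graph and returns the number of bytes produced. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        UserState state = new UserState();
        int before = serializedSize(state);
        state.showTips = false; // change a single preference
        int after = serializedSize(state);
        System.out.println("before=" + before + " bytes, after=" + after + " bytes");
    }
}
```

Both sizes come out slightly above 2 MB, even though only one boolean differs between the two snapshots. That full payload is what a serialization-based replication scheme would have to put on the network for each change.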

To overcome the limitations of Java serialization, Terracotta uses bytecode instrumentation (BCI) in the Terracotta client to identify the exact properties within stateful objects that change and then replicate only those properties across the cluster.
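To make the idea of replicating only changed properties concrete, here is a hand-written sketch in which the setters themselves record what changed. This is only an analogy: Terracotta obtains the equivalent information through instrumentation rather than through code you write, and the Preferences class, its fields, and drainChanges are all hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldDeltaSketch {
    /** Hypothetical stateful object whose setters record which fields changed. */
    static final class Preferences {
        private String theme = "light";
        private boolean showTips = true;
        private final Map<String, Object> pendingChanges = new LinkedHashMap<>();

        void setTheme(String theme) {
            this.theme = theme;
            pendingChanges.put("theme", theme);
        }

        void setShowTips(boolean showTips) {
            this.showTips = showTips;
            pendingChanges.put("showTips", showTips);
        }

        /** Returns the fields changed since the last call and clears the record. */
        Map<String, Object> drainChanges() {
            Map<String, Object> delta = new LinkedHashMap<>(pendingChanges);
            pendingChanges.clear();
            return delta;
        }
    }

    public static void main(String[] args) {
        Preferences prefs = new Preferences();
        prefs.setShowTips(false);
        // Only the changed field would need to cross the network.
        System.out.println(prefs.drainChanges());
    }
}
```

The delta contains a single entry for showTips, however large the rest of the object is, which is the property that makes field-level replication cheaper than serializing the whole object.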

Bytecode instrumentation

Bytecode instrumentation is a process through which an application's behavior can be modified at runtime. Bytecode is the format that Java source code is compiled to and that the JVM knows how to interpret. The JVM provides "hooks" through which a process can examine and modify the bytecode of classes as they are loaded, before the application uses them. In the performance-monitoring space, this feature is exploited to mark the time that a method starts and the time that it ends in order to measure its response time. Terracotta uses BCI to intercept changes made to objects so that it can identify those changes and send them to the Terracotta server.
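One of the standard JVM hooks is the java.lang.instrument API. The sketch below is a minimal agent that only records which classes pass through the hook and leaves their bytecode untouched; it is not Terracotta's instrumentation. The LoggingAgent name and the com/geekcap/ prefix are assumptions for illustration, and a real agent would also need to be packaged in a JAR whose manifest names it as the Premain-Class and be loaded with the -javaagent option.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class LoggingAgent implements ClassFileTransformer {
    private final String packagePrefix; // internal form, for example "com/geekcap/"
    private final List<String> seen = new CopyOnWriteArrayList<>();

    LoggingAgent(String packagePrefix) {
        this.packagePrefix = packagePrefix;
    }

    /** Invoked by the JVM before the application's main method when loaded via -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingAgent("com/geekcap/"));
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (className != null && className.startsWith(packagePrefix)) {
            seen.add(className); // a real tool would rewrite classfileBuffer here
        }
        return null; // null tells the JVM to keep the original bytecode
    }

    List<String> seenClasses() {
        return seen;
    }

    public static void main(String[] args) {
        // Calls the hook directly, without a JVM agent, just to show its behavior.
        LoggingAgent agent = new LoggingAgent("com/geekcap/");
        agent.transform(null, "com/geekcap/terracottaexample/SessionServlet", null, null, new byte[0]);
        agent.transform(null, "java/lang/String", null, null, new byte[0]);
        System.out.println(agent.seenClasses());
    }
}
```

A tool that actually modifies behavior would return a rewritten byte array from transform instead of null, typically produced with a bytecode library.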

The other fail-over challenge is determining when and where to send clustered information. In traditional solutions, data is typically sent to each replicated server as soon as the user's request is completed. The level of redundancy has a direct impact on clustering performance. In an ideal world, any application server can fail over to any other server. But because this would require network communication to all other servers in the cluster, the performance cost is prohibitive when you have more than a few servers. Most application administrators opt instead for a lower level of reliability that balances performance. For example, they might define at most two secondary servers to which a server can fail over. The idea is that the chance of two or three servers going down simultaneously is low, and the performance overhead of replicating stateful information to those servers is manageable.

Terracotta, in contrast, replicates data to multiple servers without compromising reliability. It does so by introducing a new server that hosts all stateful information. When an application makes changes to a stateful object, those changes are sent to the Terracotta server. Then if another server needs access to that data, it is injected into that server on demand. Thus, all servers in the cluster have access to the same data, but the data is pushed to an individual server only when it is accessed. So the overhead required to replicate stateful information to 100 servers is the same overhead required to replicate to one server. Furthermore, the Terracotta server -- not a server in the cluster -- is the one receiving the request, so the overhead is much less than even replicating to one additional clustered server. And the Terracotta server can be clustered itself to support the redundancy of your stateful objects in case it ever goes down.
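The hub-style data flow described above can be modeled with plain Java objects. This is a deliberately simplified sketch: StateServer and AppServer are hypothetical stand-ins, there is no networking, local caching, or fail-over, and the counters exist only to show how many messages the hub sees.

```java
import java.util.HashMap;
import java.util.Map;

final class CentralStateSketch {
    /** Stand-in for the central server that holds all stateful data. */
    static final class StateServer {
        private final Map<String, String> data = new HashMap<>();
        int writesReceived;
        int readsServed;

        void write(String key, String value) {
            data.put(key, value);
            writesReceived++;
        }

        String read(String key) {
            readsServed++;
            return data.get(key);
        }
    }

    /** Stand-in for one clustered application server. */
    static final class AppServer {
        private final StateServer hub;

        AppServer(StateServer hub) {
            this.hub = hub;
        }

        void update(String key, String value) {
            hub.write(key, value); // one message per change
        }

        String lookup(String key) {
            return hub.read(key); // fetched only when this server needs it
        }
    }

    public static void main(String[] args) {
        StateServer hub = new StateServer();
        AppServer[] cluster = new AppServer[100];
        for (int i = 0; i < cluster.length; i++) {
            cluster[i] = new AppServer(hub);
        }

        cluster[0].update("name", "Steven");      // one write, regardless of cluster size
        String seen = cluster[57].lookup("name"); // pulled on demand by the server that asks
        System.out.println(seen + " writes=" + hub.writesReceived + " reads=" + hub.readsServed);
    }
}
```

In this model a change costs one write to the hub whether the cluster has two servers or 100, whereas replicating to every peer would cost one message per peer.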

Downloading and installing Terracotta

You can download Terracotta from the project site. The download is free (as is the source code), but you need to register first. At the time of this writing, the latest version is 2.7.2. Pick the version for your operating system (either Windows or "All Platforms"); for this example I picked the latter. Decompress the file somewhere locally, such as C:\terracotta-2.7.2. On my Linux box I installed it in my home directory: /home/shaines/lib/terracotta-2.7.2.

Open source licenses

Each of the open source Java projects covered in this series is subject to a license, which you should understand before integrating the project with your own projects. Terracotta is subject to the Terracotta Public License.

The installation creates the following directories:

  • bin contains all of Terracotta's executables, including the files you'll use to start and stop the Terracotta server.
  • config-examples contains sample configurations for clustering Plain Old Java Objects (POJOs), Tomcat, Spring, and WebLogic.
  • docs contains Terracotta's documentation (which comprises HTML documents that link to the Terracotta Web site) and the XML configuration file reference.
  • lib contains the Terracotta compiled JAR files and all of its dependencies.
  • modules contains prebuilt JAR files for integrating with several technologies and projects, such as the Commons collections, Spring, Struts, and GlassFish.
  • samples contains examples that demonstrate how to cluster POJOs, RIFE applications, HTTP sessions, and Spring applications.
  • schema contains the XML schema for the Terracotta configuration file and accompanying documentation.
  • tools contains tools to help you build your Terracotta configuration file. This includes the Sessions Configurator, which helps you configure Terracotta for HTTP session clustering.
  • vendor contains external applications. In this version the only application it includes is a version of Tomcat that is configured to work with Terracotta. The Sessions Configurator uses it to help you build HTTP session clustering.

The welcome.sh or welcome.bat file, found in the root of the Terracotta installation, executes a Java Swing application that is a launching pad for exploring POJO and Spring examples. It also launches the Sessions Configurator to help you build a configuration file for HTTP session clustering. It is a good starting point for familiarizing yourself with Terracotta.

Configuring and using Terracotta

Applications can store stateful information in different ways, such as by maintaining data in a cache, persisting data to a database and accessing that database through an object-relational mapping tool such as Hibernate, or by using HTTP Sessions. Terracotta makes it easy to implement any of these solutions, but for the purposes of this example I'll assume that you want to configure and use Terracotta to cluster a Web application's sessions.

The sample application

The sample application is purposely simple. It contains:

  • A servlet that manages a single variable called name in an HttpSession. If name's value is null, then the user is forwarded to a JSP file that prompts for a name. If name has a non-null value, then the user is forwarded to a JSP file that greets the user by name.
  • A JSP file that prompts the user for his or her name
  • A JSP file that displays the user's name

The point of this example is to illustrate how to configure Terracotta to cluster an HttpSession object. If you can cluster the single String that the example is storing in the HttpSession, you can just as easily cluster a complex object graph stored in an HttpSession.

Listing 1 shows the source code for the SessionServlet.

Listing 1. SessionServlet.java

package com.geekcap.terracottaexample;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class SessionServlet extends HttpServlet
{
  @Override
  public void service( HttpServletRequest req, HttpServletResponse res )
  {
    try
    {
      // See if there is a name in the request
      String name = req.getParameter( "name" );

      // Load the session, creating it if necessary
      HttpSession session = req.getSession();

      if( name != null )
      {
        // Put the name into the session, overwriting any previous value
        session.setAttribute( "name", name );
      }
      else
      {
        // See if we already have it in the session
        name = ( String )session.getAttribute( "name" );
        if( name == null )
        {
          // Forward the user to the name entry form and stop here;
          // a second forward on the same response would fail
          req.getRequestDispatcher( "/enterYourName.jsp" ).forward( req, res );
          return;
        }
      }

      // We have a name in the session so forward the user to the hello page
      req.getRequestDispatcher( "/hello.jsp" ).forward( req, res );
    }
    catch( Exception e )
    {
      e.printStackTrace();
    }
  }
}

The servlet in Listing 1 first checks to see if the request has a name in it. If it does, then it either sets or overwrites the name in the session. If it does not, then it checks to see if there is a name in the session. If there is, then it forwards the user to the hello.jsp file; otherwise it forwards the user to the enterYourName.jsp file. It is a contrived and superficial application, but it demonstrates how to store a String in a session.
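The decision logic just described can be isolated from the servlet API and exercised on its own. The sketch below assumes a plain Map as a stand-in for the HttpSession; the SessionFlowSketch class and chooseView method are hypothetical names, not part of the sample application.

```java
import java.util.HashMap;
import java.util.Map;

final class SessionFlowSketch {
    /** Returns the JSP to forward to, updating the session map as the servlet would. */
    static String chooseView(String requestName, Map<String, Object> session) {
        if (requestName != null) {
            session.put("name", requestName); // set or overwrite the stored name
            return "/hello.jsp";
        }
        // No name in the request: fall back to whatever the session already holds
        return session.get("name") != null ? "/hello.jsp" : "/enterYourName.jsp";
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<>();
        System.out.println(chooseView(null, session));     // first visit, no name yet
        System.out.println(chooseView("Steven", session)); // form submitted
        System.out.println(chooseView(null, session));     // later visit, name remembered
    }
}
```

Separating the routing decision from the forwarding call also makes it clear that each request results in exactly one forward.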

Listing 2 shows the source code for the enterYourName.jsp file.

Listing 2. enterYourName.jsp

<%@ page language="java" contentType="text/html; charset=UTF-8"
    pageEncoding="UTF-8"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
         <title>Hello, Terracotta!</title>
   </head>
<body>
      <form action="SessionServlet" method="post">
Enter your name: <input type="text" name="name"/><input type="submit">
      </form>
   </body>
</html>

Listing 2's enterYourName.jsp constructs a form that posts the user's name to the SessionServlet URL (which the web.xml file in Listing 4 maps to the SessionServlet class).

Listing 3 shows the source code for the hello.jsp file.

Listing 3. hello.jsp

<%@ page language="java" contentType="text/html; charset=UTF-8"
    pageEncoding="UTF-8"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Hello, Terracotta</title>
</head>
<body>
Hello, <%= session.getAttribute( "name" ) %>
<p>
<form action="SessionServlet" method="post">
Update your name: <input type="text" name="name"/><input type="submit">
</form>
</body>
</html>

Listing 3's hello.jsp file loads the name attribute from the HttpSession and displays it in a "Hello, NAME" greeting. It also provides the same form as the enterYourName.jsp page, which allows the user to update his or her name.

Finally, Listing 4 shows the Web deployment descriptor for this application.

Listing 4. web.xml

<?xml version="1.0" encoding="UTF-8"?>
<web-app id="WebApp_ID" version="2.4" xmlns="http://java.sun.com/xml/ns/j2ee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd">
   <display-name>TerracottaSessionClusteringExample</display-name>
   <servlet>
      <description>
      </description>
      <display-name>SessionServlet</display-name>
      <servlet-name>SessionServlet</servlet-name>
      <servlet-class>com.geekcap.terracottaexample.SessionServlet</servlet-class>
   </servlet>
   <servlet-mapping>
      <servlet-name>SessionServlet</servlet-name>
      <url-pattern>/SessionServlet</url-pattern>
   </servlet-mapping>
   <welcome-file-list>
      <welcome-file>index.html</welcome-file>
   </welcome-file-list>
</web-app>

The web.xml file defines the SessionServlet and maps requests for /SessionServlet to the SessionServlet.

Package these files into a WAR file named TerracottaExample.war with the following structure:

enterYourName.jsp
hello.jsp
WEB-INF/web.xml
WEB-INF/classes/com/geekcap/terracottaexample/SessionServlet.class
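Before importing the WAR, it can be worth confirming that the archive really has this layout. The following is a minimal sketch using the JDK's zip classes; WarLayoutCheck and its methods are hypothetical helpers, and writePlaceholderArchive creates an archive of empty entries purely so the check can be demonstrated without a real build.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class WarLayoutCheck {
    /** The entries listed in the article's WAR structure. */
    static final List<String> REQUIRED = Arrays.asList(
            "enterYourName.jsp",
            "hello.jsp",
            "WEB-INF/web.xml",
            "WEB-INF/classes/com/geekcap/terracottaexample/SessionServlet.class");

    /** Returns the required entries that are absent from the archive. */
    static List<String> missingEntries(Path war) {
        try (ZipFile zip = new ZipFile(war.toFile())) {
            List<String> missing = new ArrayList<>();
            for (String name : REQUIRED) {
                if (zip.getEntry(name) == null) {
                    missing.add(name);
                }
            }
            return missing;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a temporary archive of empty placeholder entries (at least one name required). */
    static Path writePlaceholderArchive(List<String> entryNames) {
        try {
            Path file = Files.createTempFile("TerracottaExample", ".war");
            file.toFile().deleteOnExit();
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(file))) {
                for (String name : entryNames) {
                    out.putNextEntry(new ZipEntry(name));
                    out.closeEntry();
                }
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path incomplete = writePlaceholderArchive(Arrays.asList("hello.jsp", "WEB-INF/web.xml"));
        System.out.println("missing: " + missingEntries(incomplete));
    }
}
```

To check a real build, pass the path of your own TerracottaExample.war to missingEntries; an empty list means all four entries are present.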

Using the Terracotta Sessions Configurator

To test the sample application and build a Terracotta configuration file, you'll use the Sessions Configurator. Launch the welcome application (by executing either welcome.bat or welcome.sh), select the Sessions tab (shown in Figure 1), and choose Terracotta Sessions Configurator.

Figure 1. Launching the Terracotta Sessions Configurator

When the Sessions Configurator starts, click the Import button. Or, if you close the initial dialog box, you can choose File -> Import webapp. Navigate to your WAR file and click OK, as shown in Figure 2.

Figure 2. Opening a WAR application

The Control tab, shown in Figure 3, shows your application configured to run in two Tomcat instances.

Figure 3. Two instances of the Terracotta example application

The Terracotta Sessions enabled checkbox configures Tomcat to cluster Tomcat sessions. A simple way to see the effects of the clustering is to launch the application with this checkbox disabled and hit both servers, then launch the application with the checkbox enabled and hit both servers. Give it a try:

  1. Disable the checkbox.
  2. Click Start all.
  3. Hit both servers, which are at these URLs:
    • http://localhost:9081/TerracottaExample/SessionServlet
    • http://localhost:9082/TerracottaExample/SessionServlet
  4. Click Stop all.
  5. Enable the checkbox.
  6. Repeat steps 2 through 4.

You should notice that when you hit the individual servers with Terracotta Sessions disabled, the servers know nothing about each other. In fact, because browsers do not scope cookies by port, the two instances overwrite each other's session cookie, and the results are unpredictable. When you enable Terracotta Sessions and then enter your name into one server, the other server is immediately aware of your name. And when you change your name on one server, the change is immediately reflected on the other server.

Figure 4 shows screen shots from both of my Tomcat servers showing the same data.

Figure 4. Both Tomcat instances showing the same data

To see the data being clustered, click on the Monitor tab of the Sessions Configurator and log in using the default values: localhost on port 9520.

Figure 5 shows the Terracotta Server Monitor, in which I have navigated to the name attribute's value in the HttpSession.

Figure 5. Terracotta server monitor

Depending on your needs, you can modify the rules that Terracotta uses to determine what to cluster. In Figure 6, I've replaced the rule of clustering *..* with javax.servlet.http.HttpSession, so that Terracotta will cluster only the HttpSession object.

Figure 6. Configuring the clustering rules

Once you are satisfied that your clustering is working, choose File -> Export Configuration. This generates a configuration file that tells Terracotta which classes should be clustered; you'll point Tomcat to this file on startup.

Production deployment

To integrate Terracotta with Tomcat:

  1. Start the Terracotta Server by launching start-tc-server.bat or start-tc-server.sh from the Terracotta bin folder.
  2. Update the catalina.bat or catalina.sh script in the Tomcat bin folder to load Terracotta and point it to your configuration file:

    Unix/Linux:

    TC_INSTALL_DIR=<path_to_local_Terracotta_home>
    TC_CONFIG_PATH=<path_to_local_tc-config.xml>
    . $TC_INSTALL_DIR/bin/dso-env.sh -q
    export JAVA_OPTS="$TC_JAVA_OPTS $JAVA_OPTS"
    

    Windows:

    set TC_INSTALL_DIR=<path_to_local_terracotta_home>
    set TC_CONFIG_PATH=<path_to_local_tc-config.xml>
    call %TC_INSTALL_DIR%\bin\dso-env.bat -q
    set JAVA_OPTS=%TC_JAVA_OPTS% %JAVA_OPTS%
    
  3. Start the Tomcat servers using the startup.bat or startup.sh script.
  4. Hit your application on both servers and observe that the session is clustered.
  5. Optionally, open the Monitor tab in the Sessions Configurator and find your clustered object value.

In conclusion

Terracotta's strategy is to bypass Java serialization, identifying only the components that change in stateful objects and then persisting those changes to the Terracotta server. Then when stateful objects are requested, the Terracotta client obtains the objects from the Terracotta server and injects them into the application. Bypassing Java serialization and hosting all data in a central server rather than sending data to all replicated servers aids in obtaining linear scalability. Configuring all application servers to connect to the Terracotta server and then injecting data as it is requested ensures reliability. Linear scalability and 100% reliability, all freely available as open source code -- who could ask for more?

Session clustering is but one of Terracotta's capabilities. You can use the technology to accomplish many things, including caching. For example, if you're building a new application and want to share data among multiple servers -- such as by using a distributed cache -- then Terracotta is an excellent solution: you can configure Terracotta to cluster a hashmap across all of your servers and store shared data in that hashmap.
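As a rough illustration of the shared-hashmap idea, here is a small cache written in plain Java. It is a single-JVM sketch only: the Terracotta configuration that would mark the map as clustered is not shown, and SharedCacheSketch, getOrLoad, and the loader function are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class SharedCacheSketch {
    // Hypothetical shared map. In a clustered deployment, this is the object
    // that would be configured for sharing; that configuration is not shown.
    private final Map<String, String> cache = new HashMap<>();

    /** Returns the cached value, loading and storing it on a miss. */
    synchronized String getOrLoad(String key, Function<String, String> loader) {
        String value = cache.get(key);
        if (value == null) {
            value = loader.apply(key); // for example, a database lookup
            cache.put(key, value);
        }
        return value;
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        SharedCacheSketch cache = new SharedCacheSketch();
        Function<String, String> slowLookup = key -> "value-for-" + key;
        System.out.println(cache.getOrLoad("user:42", slowLookup)); // miss: loader runs
        System.out.println(cache.getOrLoad("user:42", slowLookup)); // hit: served from the map
    }
}
```

The methods are synchronized so that concurrent callers never observe a half-updated map; how that locking translates to a cluster depends on configuration that is outside the scope of this sketch.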

Steven Haines is the founder and CEO of GeekCap, Inc., which provides technical e-learning courses for software developers. Previously he was the Java EE Domain Expert at Quest Software, defining software used to monitor the performance of various Java EE application servers. He is the author of Pro Java EE 5 Performance Management and Optimization, Java 2 Primer Plus, and Java 2 From Scratch. He is the Java host on InformIT.com and a Java Community Editor on InfoQ.com. Steven has taught Java at the University of California, Irvine and Learning Tree University.
