Write thread-safe servlets

Learn how to handle thread safety

If you write Web applications in Java, the servlet is your best friend. Whether you write JavaServer Pages (JSP) or plain servlets, you are both at the servlet's mercy and the lucky recipient of what the servlet has to offer. So let's look closer at the servlet.

As we know, when building a Website, we are primarily interested in two of the servlet's methods: doPost() and doGet(). The underlying method for both is service(). Later, I look closer at these methods and how they relate to thread safety, but first let's review the concept of a thread.

A thread is a single path of execution; in other words, an individual, sequential flow of control within a program. When we say that a program is multithreaded, we are not implying that the program runs two separate instances simultaneously (as if you concurrently executed the program twice from the command line). Rather, we are saying that the same instance (executed only once) spawns multiple threads that process this single instance of code. This means that more than one sequential flow of control runs through the same memory block.

So what do we mean by thread-safe, you ask? When multiple threads execute a single instance of a program and therefore share memory, multiple threads could possibly be attempting to read and write to the same place in memory. Let's look at an example. If we have a multithreaded program, we will have multiple threads processing the same instance (see Figure 1).

Figure 1. A multithreaded application

What happens when Thread-A examines variable instanceVar? Notice how Thread-B has just incremented instanceVar. The problem here is that Thread-A has written to instanceVar and does not expect that value to change unless Thread-A explicitly changes it. Unfortunately, Thread-B is thinking the same thing regarding itself; the only problem is they share the same variable. This issue is not unique to servlets. It is a common programming problem that arises only when an application is multithreaded. You are probably thinking, "Well, I didn't ask for multithreading. I just want a servlet!" And a servlet is what you have. Let me introduce you to our friend the servlet container.
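The Thread-A and Thread-B scenario can be reproduced outside any servlet. The following is a minimal sketch in plain Java; the class and method names are hypothetical and chosen only for illustration. Two threads increment one shared field, and the final total is often lower than expected because some updates are lost.

```java
// Hypothetical demo class; the names are illustrative and not part of any servlet API.
class SharedVariableDemo {
    // Shared by every thread that uses this instance, like instanceVar in Figure 1.
    private int instanceVar = 0;

    void increment() {
        instanceVar++; // read, add, write: another thread can slip in between these steps
    }

    int value() {
        return instanceVar;
    }

    // Runs two threads against ONE instance and returns the final value.
    static int runTwoThreads(int incrementsPerThread) {
        SharedVariableDemo shared = new SharedVariableDemo();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                shared.increment();
            }
        };
        Thread a = new Thread(work, "Thread-A");
        Thread b = new Thread(work, "Thread-B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared.value();
    }

    public static void main(String[] args) {
        int perThread = 100_000;
        System.out.println("expected " + (2 * perThread)
                + ", got " + runTwoThreads(perThread));
    }
}
```

The exact total varies from run to run, which is part of what makes this kind of bug hard to track down.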

Your servlet container is no dummy

A lot of magic happens between the Web browser's HTTP request and the code we write within the doGet() and doPost() methods. The servlet container makes this "magic" possible. Like any Java program, the servlet must run within a JVM, but for a Web application, we also have the complexity of handling HTTP requests—that's where the servlet container comes in. The servlet container is responsible for your servlets' creation, destruction, and execution; the sequence of these events is referred to as the servlet's lifecycle.

The servlet's lifecycle is an important topic, and thus, you will find it on Sun's Java certification exam. It is important primarily because so much of the servlet's lifecycle is outside the programmer's control. We do not worry a lot (for the most part) about how many of our servlet's instances exist at runtime. Nor are we generally concerned about memory utilization regarding the creation and destruction of our servlets. We can afford this lack of concern because the servlet container handles it for us (yes, more magic).

The servlet container not only handles the servlet's lifecycle, it also does a fine job at it. The servlet container is concerned about efficiency. It ensures that when servlets are created, they are utilized efficiently, and, yes, you guessed it, this includes multithreading. As Figure 2 illustrates, multiple threads simultaneously process your servlet.
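To make the idea concrete, here is a toy stand-in for a container, written in plain Java. It is only a sketch under simplifying assumptions, not how any real servlet container is implemented, and ToyContainer and Handler are hypothetical names. It creates one handler instance and lets a pool of worker threads call it for every request.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: a real container also handles HTTP parsing, lifecycle, and more.
class ToyContainer {
    // Hypothetical handler playing the role of a servlet.
    static class Handler {
        final Set<String> threadsSeen = ConcurrentHashMap.newKeySet();
        final AtomicInteger served = new AtomicInteger();

        void service() {
            threadsSeen.add(Thread.currentThread().getName());
            served.incrementAndGet();
        }
    }

    // Creates ONE handler, then serves every request on a pooled worker thread.
    static Handler serve(int requests, int workers) {
        Handler handler = new Handler();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < requests; i++) {
            pool.execute(handler::service);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handler;
    }

    public static void main(String[] args) {
        Handler h = serve(100, 4);
        System.out.println(h.served.get() + " requests, one handler instance, threads: "
                + h.threadsSeen);
    }
}
```

Every request reaches the same Handler object, so any field on that object is shared by all worker threads.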

Figure 2. A multithreaded servlet container.

Just imagine the performance problems we would experience with a Website as popular as Google or Amazon if the sites were not built using efficient, multithreaded processing. Though our Web application is probably not quite as popular as Google, it still would not be practical to build a site that required a servlet to be instantiated for each request. For that reason, I'm thankful that the servlet container handles the multithreading for me. Otherwise, most of us would have to change the way we designed our servlets.

Now that we are familiar with the perils of multithreaded applications and we know that all of our servlets are multithreaded, let's look at exactly when multithreading will be a problem for us.

Are you thread-safe?

Below is a simple servlet that is not thread-safe. Look closely, because at first glance, nothing appears wrong with it:

package threadSafety;
import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.*;
import java.math.*;
public class SimpleServlet extends HttpServlet
{
  //A variable that is NOT thread-safe!
  private int counter = 0;
  public void doGet(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException
  {
    doPost(req, resp);
  }
  public void doPost(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException
  {
    resp.getWriter().println("<HTML><BODY>");
    resp.getWriter().println(this + ": <br>");
    for (int c = 0; c < 10; c++)
    {
      resp.getWriter().println("Counter = " + counter + "<BR>");
      try
      {
        // Cast after multiplying; casting Math.random() alone always yields 0
        Thread.sleep((long) (Math.random() * 1000));
        counter++;
      }
      catch (InterruptedException exc) { }
    }
    resp.getWriter().println("</BODY></HTML>");
  }
}

The variable counter is an instance variable, called such because it is tied to the class instance. Because it is defined within the class definition, it belongs within that class instance. It's convenient to place our variables within this scope because it lives outside each of the class's methods and can be accessed at any time. The value is also retained between method calls. The problem here is that our servlet container is multithreaded and shares single instances of servlets for multiple requests. Does defining your variables as instance variables sound like a good idea now? Remember, only one place in memory is allocated for this variable, and it is shared between all threads that execute this same class instance.

Let's find out what happens when we execute this servlet simultaneously. We add a delay in processing by using the sleep() method. This method helps simulate more accurate behavior, as most requests differ in the amount of time required for processing. Of course, as is our luck as programmers, this also causes our problem to occur more often. This simple servlet increments counter so that each request should be able to display sequential values. We create simultaneous requests by using HTML frames; each frame's source is the same servlet:

<HTML>
  <BODY>
    <TABLE>
      <TR>
        <TD>
          <IFRAME src="/theWebapp/SimpleServlet" 
                  name="servlet1"
                  height="200%"> 
          </IFRAME>
        </TD>
      </TR>
      <TR>
        <TD>
          <IFRAME src="/theWebapp/SimpleServlet" 
                  name="servlet2" 
                  height="200%">
          </IFRAME>
        </TD>
      </TR>
      <TR>
        <TD>
          <IFRAME src="/theWebapp/SimpleServlet" 
                  name="servlet3" 
                  height="200%">
          </IFRAME>
        </TD>
      </TR>
    </TABLE>
  </BODY>
</HTML>

Our code, which is a non-thread-safe servlet, generates the following output:

ThreadSafety.SimpleServlet@1694eca:
Counter=0
Counter=2
Counter=4
Counter=6
Counter=9
Counter=11
Counter=13
Counter=15
Counter=17
Counter=19
ThreadSafety.SimpleServlet@1694eca:
Counter=0
Counter=1
Counter=3
Counter=5
Counter=7
Counter=8
Counter=10
Counter=12
Counter=14
Counter=16
ThreadSafety.SimpleServlet@1694eca:
Counter=18
Counter=20
Counter=22
Counter=23
Counter=24
Counter=25
Counter=26
Counter=27
Counter=28
Counter=29

As we can see in our output, we fail to get the results we desire. Notice that the value printed from the this reference is the same in all three frames. The text after the @ is the object's hash code in hexadecimal, which identifies the instance (it is often loosely described as the servlet's memory address). It tells us that only one servlet is instantiated to service all requests. The servlet tried its best to output sequential data, but because all threads share the memory allocated for counter, we managed to step on our own toes. We can see that the values are not always sequential, which is bad! What if that variable is being used to point at a user's private information? What if a user logs into their online banking system and on a particular page, that user sees someone else's banking information? This problem can manifest itself in many ways, most of which are difficult to identify, but the good news is that this problem is easily remedied. So let's take a look at our options.
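The text printed for this comes from Object.toString(), which returns the class name, an @ sign, and the object's hash code in hexadecimal. For a class that does not override hashCode(), that value identifies the instance but is not guaranteed to be a literal memory address. A small sketch, using a hypothetical IdentityDemo class, rebuilds the same string by hand:

```java
// Hypothetical class standing in for a servlet that does not override toString().
class IdentityDemo {
    // Rebuilds what Object.toString() returns by default.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object servletLike = new IdentityDemo();
        System.out.println(servletLike);           // default toString()
        System.out.println(describe(servletLike)); // same text, built by hand
    }
}
```

Seeing the same suffix in every frame is therefore good evidence that every request was handled by one and the same object.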

Your first defense: Avoidance

I have always said that the best way to fix problems is to avoid them altogether; in our case, this approach is best. When discussing thread safety, we are interested only in the variables that we both read and write to and that pertain to a particular Web conversation. If the variable is for read-only use or it is application-wide, then no harm results in sharing this memory space across all threads. For all other variable uses, we want to make sure that we either have synchronized access to the variable (more on this in a moment) or that we have a unique variable for each thread.

To ensure we have our own unique variable instance for each thread, we simply move the declaration of the variable from within the class to within the method using it. We have now changed our variable from an instance variable to a local variable. The difference is that, for each call to the method, a new variable is created; therefore, each thread has its own variable. Before, when the variable was an instance variable, the variable was shared for all threads processing that class instance. The following thread-safe code has a subtle, yet important, difference. Notice where the counter variable is declared!

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.*;
import java.math.*;
public class SimpleServlet extends HttpServlet
{
  public void doGet(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException
  {
    doPost(req, resp);
  }
  public void doPost(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException
  {
    //A variable that IS thread-safe!
    int counter = 0;
    resp.getWriter().println("<HTML><BODY>");
    resp.getWriter().println(this + ": <br>");
    for (int c = 0; c < 10; c++)
    {
      resp.getWriter().println("Counter = " + counter + "<BR>");
      try
      {
        // Cast after multiplying; casting Math.random() alone always yields 0
        Thread.sleep((long) (Math.random() * 1000));
        counter++;
      }
      catch (InterruptedException exc) { }
    }
    resp.getWriter().println("</BODY></HTML>");
  }
}

Move the variable declaration to within the doPost() method and test again. Notice a change in behavior? I know, you're thinking, "It can't be that easy," but usually it is. As you scramble to revisit your latest servlet code to check where you declared your variables, you may run into a small snag. As you move your variables from within the class definition to within the method, you may find that you were leveraging the scope of the variable and accessing it from within other methods. If you find yourself in this situation, you have a couple of choices. First, change the method interfaces and pass this variable (and any other shared variables) to each method requiring it. I highly recommend this approach. Explicitly passing your data elements from method to method is always best; it clarifies your intentions, documents each method's requirements, makes your code well structured, and offers many other benefits.
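As a rough illustration of passing data explicitly, the sketch below uses plain Java with hypothetical names (PassStateDemo, handleRequest, formatLine). The handleRequest() method stands in for doPost(): it keeps counter as a local variable and hands it to a helper as a parameter, so no field is shared between threads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a servlet whose state lives entirely in local variables.
class PassStateDemo {
    // The helper receives the counter explicitly instead of reading a shared field.
    static String formatLine(int counter) {
        return "Counter = " + counter;
    }

    // Plays the role of doPost(): each call, and so each thread, gets its own counter.
    static List<String> handleRequest() {
        List<String> lines = new ArrayList<>();
        int counter = 0; // local variable, created fresh on every call
        for (int c = 0; c < 10; c++) {
            lines.add(formatLine(counter));
            counter++;
        }
        return lines;
    }

    public static void main(String[] args) {
        // Three "requests" at once; each one still sees its own 0 through 9.
        for (int i = 0; i < 3; i++) {
            new Thread(() -> System.out.println(
                    Thread.currentThread().getName() + ": " + handleRequest())).start();
        }
    }
}
```

The helper's signature now documents exactly what it needs, which is the benefit described above.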

If you discover that you must share a variable between servlets and this variable is going to be read from and written to by multiple threads (and you are not storing it in a database), then you will require thread synchronization. Sorry, there is no way around it now.

Your second defense: Partial synchronization

Thread synchronization is an important technique to know, but not one you want to throw at a solution unless required. Anytime you synchronize blocks of code, you introduce bottlenecks into your system. When you synchronize a code block, you tell the JVM that only one thread may be within this synchronized block of code at a given moment. If we run a multithreaded application and a thread runs into a synchronized code block being executed by another thread, the second thread must wait until the first thread exits that block.
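The waiting behavior described above can be sketched in plain Java. The class below is a hypothetical example, not servlet code: every increment happens inside a synchronized block, so only one thread at a time can update the shared counter and no updates are lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; the lock object and method names are illustrative.
class SynchronizedCounterDemo {
    private final Object lock = new Object();
    private int counter = 0; // shared between all threads

    void increment() {
        synchronized (lock) { // only one thread may be inside this block at a time
            counter++;
        }
    }

    int value() {
        synchronized (lock) {
            return counter;
        }
    }

    // Starts the given number of threads against ONE instance and returns the total.
    static int runThreads(int threads, int incrementsPerThread) {
        SynchronizedCounterDemo shared = new SynchronizedCounterDemo();
        List<Thread> started = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread thread = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    shared.increment();
                }
            });
            started.add(thread);
            thread.start();
        }
        for (Thread thread : started) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.value();
    }

    public static void main(String[] args) {
        System.out.println("total = " + runThreads(4, 25_000));
    }
}
```

The total is now reliable, but each increment makes the other threads queue up, which is the bottleneck the text warns about.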

It is important to accurately identify which code block truly needs to be synchronized and to synchronize as little as possible. In our example, we assume that making our instance variable a local variable is not an option. Look at how we would synchronize the crucial block of code:
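A minimal sketch of the idea, in plain Java rather than a servlet and with hypothetical names, might look like the following. It is an illustration under those assumptions, not the article's own listing. Only the read-and-increment step is guarded; building the response text happens outside the lock, so threads wait as briefly as possible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical handler that must keep a shared counter.
class NarrowLockDemo {
    private final Object lock = new Object();
    private int counter = 0; // shared on purpose

    String handleRequest() {
        int mine;
        synchronized (lock) { // the crucial block: as small as it can be
            mine = counter;
            counter++;
        }
        // Slower work such as formatting or writing output stays outside the lock.
        return "Counter = " + mine;
    }

    // Fires the given number of concurrent "requests" at ONE handler instance.
    static List<String> serve(int requests) {
        NarrowLockDemo handler = new NarrowLockDemo();
        List<String> responses = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            Thread t = new Thread(() -> responses.add(handler.handleRequest()));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return responses;
    }

    public static void main(String[] args) {
        System.out.println(serve(5));
    }
}
```

Each request receives its own distinct value even though the counter is shared, because no two threads can read and increment it at the same moment.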
