Design for thread safety

Design tips on when and how to use synchronization, immutable objects, and thread-safe wrappers

Six months ago I began a series of articles about designing classes and objects. In this month's Design Techniques column, I'll continue that series by looking at design principles that concern thread safety. This article tells you what thread safety is, why you need it, when you need it, and how to go about getting it.

What is thread safety?

Thread safety simply means that the fields of an object or class always maintain a valid state, as observed by other objects and classes, even when used concurrently by multiple threads.

One of the first guidelines I proposed in this column (see "Designing object initialization") is that you should design classes such that objects maintain a valid state, from the beginning of their lifetimes to the end. If you follow this advice and create objects whose instance variables all are private and whose methods only make proper state transitions on those instance variables, you're in good shape in a single-threaded environment. But you may get into trouble when more threads come along.

Multiple threads can spell trouble for your object because often, while a method is in the process of executing, the state of your object can be temporarily invalid. When just one thread is invoking the object's methods, only one method at a time will ever be executing, and each method will be allowed to finish before another method is invoked. Thus, in a single-threaded environment, each method will be given a chance to make sure that any temporarily invalid state is changed into a valid state before the method returns.

Once you introduce multiple threads, however, the JVM may interrupt the thread executing one method while the object's instance variables are still in a temporarily invalid state. The JVM could then give a different thread a chance to execute, and that thread could call a method on the same object. All your hard work to make your instance variables private and your methods perform only valid state transformations will not be enough to prevent this second thread from observing the object in an invalid state.

Such an object would not be thread-safe, because in a multithreaded environment, the object could become corrupted or be observed to have an invalid state. A thread-safe object is one that always maintains a valid state, as observed by other classes and objects, even in a multithreaded environment.

Why worry about thread safety?

There are two big reasons you need to think about thread safety when you design classes and objects in Java:

  1. Support for multiple threads is built into the Java language and API

  2. All threads inside a Java virtual machine (JVM) share the same heap and method area

Because multithreading is built into Java, it is possible that any class you design eventually may be used concurrently by multiple threads. You needn't (and shouldn't) make every class you design thread-safe, because thread safety doesn't come for free. But you should at least think about thread safety every time you design a Java class. You'll find a discussion of the costs of thread safety and guidelines concerning when to make classes thread-safe later in this article.

Given the architecture of the JVM, you need only be concerned with instance and class variables when you worry about thread safety. Because all threads share the same heap, and the heap is where all instance variables are stored, multiple threads can attempt to use the same object's instance variables concurrently. Likewise, because all threads share the same method area, and the method area is where all class variables are stored, multiple threads can attempt to use the same class variables concurrently. When you do choose to make a class thread-safe, your goal is to guarantee the integrity -- in a multithreaded environment -- of instance and class variables declared in that class.

You needn't worry about multithreaded access to local variables, method parameters, and return values, because these variables reside on the Java stack. In the JVM, each thread is awarded its own Java stack. No thread can see or use any local variables, return values, or parameters belonging to another thread.

Given the structure of the JVM, local variables, method parameters, and return values are inherently "thread-safe." But instance variables and class variables will only be thread-safe if you design your class appropriately.
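To make the distinction concrete, here is a minimal sketch. The class SharedVsLocal and all of its members are hypothetical names invented for illustration; they do not come from the article's examples. The comments mark which variables are shared among threads and which are private to the calling thread.

```java
// Hypothetical illustration; SharedVsLocal is not part of the article's examples.
class SharedVsLocal {
    // Class variable: one copy in the method area, visible to every thread.
    private static int instancesCreated;

    // Instance variable: stored on the heap with the object, visible to
    // every thread that holds a reference to this object.
    private int total;

    SharedVsLocal() {
        instancesCreated++;   // shared state: needs protection if used concurrently
    }

    void add(int amount) {
        total += amount;      // shared state: needs protection if used concurrently
    }

    int getTotal() {
        return total;
    }

    // Touches only its parameter and local variables, which live on the
    // calling thread's own Java stack, so concurrent callers cannot interfere.
    static int sumTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }
}
```

Any number of threads can call sumTo() at once without coordination, whereas concurrent calls to add() on the same object would race on total.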

RGBColor #1: Ready for a single thread

As an example of a class that is not thread-safe, consider the RGBColor class, shown below. Instances of this class represent a color stored in three private instance variables: r, g, and b. Given the class shown below, an RGBColor object would begin its life in a valid state and would experience only valid-state transitions, from the beginning of its life to the end -- but only in a single-threaded environment.

// In file threads/ex1/RGBColor.java
// Instances of this class are NOT thread-safe.
public class RGBColor {
    private int r;
    private int g;
    private int b;
    public RGBColor(int r, int g, int b) {
        checkRGBVals(r, g, b);
        this.r = r;
        this.g = g;
        this.b = b;
    }
    public void setColor(int r, int g, int b) {
        checkRGBVals(r, g, b);
        this.r = r;
        this.g = g;
        this.b = b;
    }
    /**
    * returns color in an array of three ints: R, G, and B
    */
    public int[] getColor() {
        int[] retVal = new int[3];
        retVal[0] = r;
        retVal[1] = g;
        retVal[2] = b;
        return retVal;
    }
    public void invert() {
        r = 255 - r;
        g = 255 - g;
        b = 255 - b;
    }
    private static void checkRGBVals(int r, int g, int b) {
        if (r < 0 || r > 255 || g < 0 || g > 255 ||
            b < 0 || b > 255) {
            throw new IllegalArgumentException();
        }
    }
}

Because the three instance variables, ints r, g, and b, are private, the only way other classes and objects can access or influence the values of these variables is via RGBColor's constructor and methods. The design of the constructor and methods guarantees that:

  1. RGBColor's constructor will always give the variables proper initial values

  2. Methods setColor() and invert() will always perform valid state transformations on these variables

  3. Method getColor() will always return a valid view of these variables

Note that if bad data is passed to the constructor or the setColor() method, they will complete abruptly with an IllegalArgumentException. The checkRGBVals() method, which throws this exception, in effect defines what it means for an RGBColor object to be valid: the values of all three variables, r, g, and b, must be between 0 and 255, inclusive. In addition, in order to be valid, the color represented by these variables must be the most recent color either passed to the constructor or setColor() method, or produced by the invert() method.

If, in a single-threaded environment, you invoke setColor() and pass in blue, the RGBColor object will be blue when setColor() returns. If you then invoke getColor() on the same object, you'll get blue. In a single-threaded society, instances of this RGBColor class are well-behaved.

Throwing a concurrent wrench into the works

Unfortunately, this happy picture of a well-behaved RGBColor object can turn scary when other threads enter the picture. In a multithreaded environment, instances of the RGBColor class defined above are susceptible to two kinds of bad behavior: write/write conflicts and read/write conflicts.

Write/write conflicts

Imagine you have two threads, one thread named "red" and another named "blue." Both threads are trying to set the color of the same RGBColor object: The red thread is trying to set the color to red; the blue thread is trying to set the color to blue.

Both of these threads are trying to write to the same object's instance variables concurrently. If the thread scheduler interleaves these two threads in just the right way, the two threads will inadvertently interfere with each other, yielding a write/write conflict. In the process, the two threads will corrupt the object's state.

The Unsynchronized RGBColor applet

The following applet, named Unsynchronized RGBColor, demonstrates one sequence of events that could result in a corrupt RGBColor object. The red thread is innocently trying to set the color to red while the blue thread is innocently trying to set the color to blue. In the end, the RGBColor object represents neither red nor blue but the unsettling color, magenta.


To step through the sequence of events that lead to a corrupted RGBColor object, press the applet's Step button. Press Back to back up a step, and Reset to back up to the beginning. As you go, a line of text at the bottom of the applet will explain what's happening during each step.

For those of you who can't run the applet, here's a table that shows the sequence of events demonstrated by the applet. Each row shows the values of r, g, and b just before the statement on that row executes:

Thread  Statement                                r    g    b    Color
------  ---------------------------------------  ---  ---  ---  -------
none    object represents green                  0    255  0    green
blue    blue thread invokes setColor(0, 0, 255)  0    255  0    green
blue    checkRGBVals(0, 0, 255);                 0    255  0    green
blue    this.r = 0;                              0    255  0    green
blue    this.g = 0;                              0    255  0    green
blue    blue gets preempted                      0    0    0    black
red     red thread invokes setColor(255, 0, 0)   0    0    0    black
red     checkRGBVals(255, 0, 0);                 0    0    0    black
red     this.r = 255;                            0    0    0    black
red     this.g = 0;                              255  0    0    red
red     this.b = 0;                              255  0    0    red
red     red thread returns                       255  0    0    red
blue    later, blue thread continues             255  0    0    red
blue    this.b = 255;                            255  0    0    red
blue    blue thread returns                      255  0    255  magenta
none    object represents magenta                255  0    255  magenta

As you can see from this applet and table, the RGBColor is corrupted because the thread scheduler interrupts the blue thread while the object is still in a temporarily invalid state. When the red thread comes in and paints the object red, the blue thread is only partially finished painting the object blue. When the blue thread returns to finish the job, it inadvertently corrupts the object.
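The interleaving in the table can be replayed deterministically. The sketch below uses no real threads: a hypothetical class, WriteWriteReplay, simply performs the two threads' assignments in the order the scheduler happened to run them. It is an illustration of the sequence of events, not the applet's code.

```java
// Hypothetical single-threaded replay of the write/write interleaving.
// No real threads are involved; the assignments are issued in the order
// in which the table shows them executing.
class WriteWriteReplay {
    int r = 0, g = 255, b = 0;   // the object starts out green

    int[] replay() {
        // Blue thread: setColor(0, 0, 255), first two assignments only.
        r = 0;
        g = 0;
        // Blue is preempted here; the object is temporarily black (0, 0, 0).

        // Red thread: setColor(255, 0, 0) runs to completion.
        r = 255;
        g = 0;
        b = 0;

        // Blue thread resumes and performs its one remaining assignment.
        b = 255;

        return new int[] { r, g, b };
    }

    public static void main(String[] args) {
        int[] c = new WriteWriteReplay().replay();
        System.out.println(c[0] + ", " + c[1] + ", " + c[2]);   // prints "255, 0, 255"
    }
}
```

Neither thread asked for magenta, yet every individual assignment was legal. The corruption comes purely from the ordering.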

Read/write conflicts

Another kind of misbehavior that instances of this RGBColor class may exhibit in a multithreaded environment is the read/write conflict. This kind of conflict arises when an object's state is read and used while it is temporarily invalid because another thread has not finished its work.

For example, note that during the blue thread's execution of the setColor() method above, the object at one point finds itself in the temporarily invalid state of black. Here, black is a temporarily invalid state because:

  1. It is temporary: Eventually, the blue thread intends to set the color to blue.

  2. It is invalid: No one asked for a black RGBColor object. The blue thread is supposed to turn a green object into blue.

If, at the moment the object represents black, the blue thread is preempted by a thread that invokes getColor() on the same object, that second thread would observe the RGBColor object's value to be black.

Here's a table that shows a sequence of events that could lead to just such a read/write conflict. Each row shows the values of r, g, and b just before the statement on that row executes:

Thread  Statement                                r    g    b    Color
------  ---------------------------------------  ---  ---  ---  -------
none    object represents green                  0    255  0    green
blue    blue thread invokes setColor(0, 0, 255)  0    255  0    green
blue    checkRGBVals(0, 0, 255);                 0    255  0    green
blue    this.r = 0;                              0    255  0    green
blue    this.g = 0;                              0    255  0    green
blue    blue gets preempted                      0    0    0    black
red     red thread invokes getColor()            0    0    0    black
red     int[] retVal = new int[3];               0    0    0    black
red     retVal[0] = 0;                           0    0    0    black
red     retVal[1] = 0;                           0    0    0    black
red     retVal[2] = 0;                           0    0    0    black
red     return retVal;                           0    0    0    black
red     red thread returns black                 0    0    0    black
blue    later, blue thread continues             0    0    0    black
blue    this.b = 255;                            0    0    0    black
blue    blue thread returns                      0    0    255  blue
none    object represents blue                   0    0    255  blue

As you can see from this table, the trouble begins when the blue thread is interrupted when it has only partially finished painting the object blue. At this point the object is in a temporarily invalid state of black, which is exactly what the red thread sees when it invokes getColor() on the object.
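A real scheduler makes this interleaving a matter of luck, but it can be forced for demonstration purposes. The sketch below is an assumption-laden illustration, not code from this column: ReadWriteDemo and setColorWithPause() are hypothetical names, and two CountDownLatch objects stand in for the preemption by parking the writer between its second and third assignments.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class. Latches pause the writer at the point where the
// table shows the blue thread being preempted.
class ReadWriteDemo {
    private int r = 0, g = 255, b = 0;   // the object starts out green
    private final CountDownLatch halfway = new CountDownLatch(1);
    private final CountDownLatch resume = new CountDownLatch(1);

    // Unsynchronized setter with an artificial pause before the last write.
    void setColorWithPause(int r, int g, int b) throws InterruptedException {
        this.r = r;
        this.g = g;
        halfway.countDown();   // announce that the object is half-updated
        resume.await();        // stands in for "blue gets preempted"
        this.b = b;
    }

    int[] getColor() {
        return new int[] { r, g, b };
    }

    // Returns two arrays: what the reader observed, then the final state.
    static int[][] run() {
        ReadWriteDemo color = new ReadWriteDemo();
        Thread blue = new Thread(() -> {
            try {
                color.setColorWithPause(0, 0, 255);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            blue.start();
            color.halfway.await();           // wait until blue is mid-update
            int[] seen = color.getColor();   // the reading ("red") thread's view
            color.resume.countDown();        // let blue finish
            blue.join();
            return new int[][] { seen, color.getColor() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[][] result = run();
        System.out.println(Arrays.toString(result[0]));   // prints "[0, 0, 0]"
        System.out.println(Arrays.toString(result[1]));   // prints "[0, 0, 255]"
    }
}
```

The reader sees black even though the object was green before the call and blue after it. The latches only make the timing repeatable; without them the same outcome is possible but not guaranteed.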

Three ways to make an object thread-safe

There are basically three approaches you can take to make an object such as RGBColor thread-safe:

  1. Synchronize critical sections
  2. Make it immutable
  3. Use a thread-safe wrapper

Approach 1: Synchronizing the critical sections

The most straightforward way to correct the unruly behavior exhibited by objects such as RGBColor when placed in a multithreaded context is to synchronize the object's critical sections. An object's critical sections are those methods or blocks of code within methods that must be executed by only one thread at a time. Put another way, a critical section is a method or block of code that must be executed atomically, as a single, indivisible operation. By using Java's synchronized keyword, you can guarantee that only one thread at a time will ever execute the object's critical sections.
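As a rough sketch of what this can look like for the RGBColor class, the version below marks each method that reads or writes r, g, and b as synchronized. The name SyncRGBColor and the onlyWholeColorsSeen() stress check are hypothetical additions for illustration, and this is one possible arrangement rather than a definitive implementation.

```java
// A sketch only: SyncRGBColor is a hypothetical name. Every method that
// touches r, g, or b is a critical section guarded by the object's lock.
class SyncRGBColor {
    private int r;
    private int g;
    private int b;

    SyncRGBColor(int r, int g, int b) {
        checkRGBVals(r, g, b);
        this.r = r;
        this.g = g;
        this.b = b;
    }

    synchronized void setColor(int r, int g, int b) {
        checkRGBVals(r, g, b);
        this.r = r;
        this.g = g;
        this.b = b;
    }

    synchronized int[] getColor() {
        return new int[] { r, g, b };
    }

    synchronized void invert() {
        r = 255 - r;
        g = 255 - g;
        b = 255 - b;
    }

    private static void checkRGBVals(int r, int g, int b) {
        if (r < 0 || r > 255 || g < 0 || g > 255 || b < 0 || b > 255) {
            throw new IllegalArgumentException();
        }
    }

    // Hypothetical stress check: two writers and one reader share an object.
    // Returns true if the reader only ever saw pure red or pure blue.
    static boolean onlyWholeColorsSeen(int iterations) {
        SyncRGBColor color = new SyncRGBColor(255, 0, 0);
        Thread red = new Thread(() -> {
            for (int i = 0; i < iterations; i++) color.setColor(255, 0, 0);
        });
        Thread blue = new Thread(() -> {
            for (int i = 0; i < iterations; i++) color.setColor(0, 0, 255);
        });
        red.start();
        blue.start();
        boolean ok = true;
        for (int i = 0; i < iterations; i++) {
            int[] c = color.getColor();
            boolean isRed = c[0] == 255 && c[1] == 0 && c[2] == 0;
            boolean isBlue = c[0] == 0 && c[1] == 0 && c[2] == 255;
            ok &= isRed || isBlue;
        }
        try {
            red.join();
            blue.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok;
    }
}
```

Because setColor() and getColor() lock the same object, a reader can never run between two of a writer's assignments, so the half-painted states from the earlier tables are never observed.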

To take this approach to making your object thread-safe, you must follow two steps: you must make all relevant fields private, and you must identify and synchronize all the critical sections.

Step 1: Make fields private

Synchronization means that only one thread at a time will be able to execute a bit of code (a critical section). So even though it's fields you want to coordinate access to among multiple threads, Java's mechanism to do so actually coordinates access to code. This means that only if you make the data private will you be able to control access to that data by controlling access to the code that manipulates the data.
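The following sketch shows why the lock protects nothing when a field is not private. LeakyColor and writeWhileLocked() are hypothetical names made up for this illustration: one thread holds the object's lock while a second thread changes the field directly, without ever asking for the lock.

```java
// Hypothetical class: the field is deliberately public to show the problem.
class LeakyColor {
    public int r;   // not private, so any code can write it directly

    synchronized void setRed(int r) {   // guarded by this object's lock
        this.r = r;
    }

    // Holds the lock on a LeakyColor while another thread writes the field
    // directly, then returns the value the field ended up with.
    static int writeWhileLocked() {
        LeakyColor color = new LeakyColor();
        synchronized (color) {   // the calling thread now owns the lock
            // Direct field access needs no lock, so this write goes through.
            // Had the intruder called setRed() instead, it would block until
            // the lock was released, and the join() below would never return.
            Thread intruder = new Thread(() -> color.r = 99);
            intruder.start();
            try {
                intruder.join();   // completes even though the lock is held
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return color.r;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeWhileLocked());   // prints "99"
    }
}
```

Making r private would force every access through methods of the class, which is where synchronized can actually be applied.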
