In last month's column, I demonstrated a simple deadlock scenario using two nested synchronized
blocks that acquired the same two locks, but in a different order. (Please review last month's example if this isn't fresh in your mind.) This month, I'll take a look at a solution to this commonplace deadlock problem, presenting a roll-your-own exclusion semaphore class and a lock manager that supports the safe acquisition of multiple semaphores. Using these objects rather than the built-in synchronized
can save you hours of searching for unexpected deadlocks. (They don't solve every possible deadlock problem, of course, but are nonetheless pretty useful.)
When 'synchronized' isn't good enough
The nested-synchronized
-statements example from last month was -- admittedly -- contrived, but the multiple-lock situation comes up frequently in the real world. One common problem is the too-coarse, object-level granularity of the synchronized
keyword: there's only one monitor per object, and sometimes that's not enough.
Consider the following class, the methods of which can be broken up into three partitions. The methods in the first partition use only a subset of the fields in the class. The methods in the second partition use a non-overlapping subset; they share fields that are not used by methods in the first partition. The methods in the third partition use everything.
1| class Complicated // NOT thread safe 2| { 3| private long a, b; 4| private long x, y; 5| 6| // partition 1, functions use a and/or b 7| 8| public void use_a() { do_something_with(a); ); } 9| public void use_b() { do_something_with(b); ); } 10 public void use_a_and_b() { do_something_with(a+b); ); } 11| 12| // partition 2, functions use x and/or y 13| 14| public void use_x() { do_something_with(x); ); } 15| public void use_y() { do_something_with(y); ); } 16| public void use_x_and_y() { do_something_with(x+y); ); } 17| 18| // partition 3, functions use a, b, x, and y 19| 20| public void use_everything() { do_something_with( a + x ); } 21| public void use_everything_else(){ do_something_with( b + y ); } 22| }
As it stands, this code is a multithreading disaster. Nothing is synchronized and we have guaranteed race conditions. (A race condition occurs when two threads try to access the same object at the same time, and chance determines which one wins the "race." Programs shouldn't work by chance.) Synchronizing all the methods would fix the problem, but then you couldn't call a method in partition 1 (emphasized in the code above) simply because some thread was using a method from partition 2 above. Since these two partitions don't interact with each other, this solution imposes needless access restrictions on the methods of the class. If you're accessing any method in partition 3, though, you do want to lock out everything in the other two partitions. We really need two locks in this situation. One to lock partition-1 variables and another for partition-2 variables. The methods in partition 3 can then grab both locks.
So, roll your own
You can implement the two-lock strategy by synchronizing on something other than the containing object. Rather than synchronizing the methods, you can define two local variables to use as locks and synchronize on them:
1| class Complicated_and_thread_safe 2| { 3| private long a, b; 4| private long x, y; 5| 6| private Object ab_lock = new int[]; 7| private Object xy_lock = new int[]; 8| 9| // partition 1, functions use a and/or b 10| 11| public void use_a() { synchronized(ab_lock){ /*...*/ } } 12| public void use_b() { synchronized(ab_lock){ /*...*/ } } 13| public void use_a_and_b(){ synchronized(ab_lock){ /*...*/ } } 14| 15| // partition 2, functions use x and/or y 16| 17| public void use_x() { synchronized(xy_lock){ /*...*/ } } 18| public void use_y() { synchronized(xy_lock){ /*...*/ } } 19| public void use_x_and_y(){ synchronized(xy_lock){ /*...*/ } } 20| 21| // partition 3, functions use a, b, x, and y 22| 23| public void use_everything() 24| { synchronized(ab_lock) // grab both locks 25| { synchronized(xy_lock) 26| { /*...*/ 27| } 28| } 29| } 30| 31| public void use_everything_else() 32| { synchronized(ab_lock) 33| { synchronized(xy_lock) 34| { /*...*/ 35| } 36| } 37| } 38| }
I haven't synchronized the methods themselves in this example. (Remember, synchronizing a method is effectively the same thing as wrapping all the code in the method in a synchronized(this){...}
block.) Instead, I'm providing a unique lock for each partition (ab_lock
and xy_lock
) and then explicitly synchronizing on these individual locks.
Java associates locks with objects (instance of some class that extends Object
, so I can't use primitive-type variables as locks here. I don't want to spend unnecessary time calling constructors and doing other initialization operations on complicated objects, however. Consequently, the locks themselves are declared as the simplest possible Object
-- an array.
Arrays in Java are first-class objects: they implicitly extend the Object
class. (If you don't believe me, compile the following code:
1| public class foo 2| { static Object ref = new int[]{ 10 }; 3| 4| public static void main( String[] args ) 5| { System.out.println( ref.toString() ); 6| } 7| }
The class compiles just fine. Not only does the implicit cast from the array to Object
work (because Object
is a base class of all arrays), but the println()
correctly invokes the compiler-generated toString()
override (which prints absolutely nothing useful -- but you can't have everything). I've used a one-element array for my lock, rather than something like an Integer
, because arrays come into existence very efficiently. For example, there's no constructor call.
In the foregoing example, it's critical that methods that acquire both locks always acquire them in the same order, otherwise we end up in the Wilma-and-Betty deadlock scenario discussed last month. Acquiring multiple locks is a commonplace enough problem that some operating systems have system calls for this purpose. It would be nice to have an easy way to acquire multiple locks, in Java, without having to worry about the order-of-acquisition problem. The remainder of this month's column describes one way to do that.
Semaphores
Listing 1 shows the core interface I use for all my roll-your-own semaphore classes: the Semaphore
. It's in the com.holub.asynch package, as are all the thread-related classes and interfaces I'll be discussing. (I've also put all the code that appears in the listings into the "Goodies" section on my Web site; see Resources for the link.)
1| package com.holub.asynch; 2| 3| interface Semaphore 4| { 5| int id (); 6| void acquire(long timeout) throws InterruptedException; 7| void release(); 8| }
Listing 1: Semaphore.java
If you haven't worked much with semaphores, a semaphore has to have two properties in order to be useful:
It must be able to identify itself using a unique ID when passed an
id()
request. The current implementation uses a unique integer for the purpose.- You must be able to acquire and release the semaphore, though the semantic meaning of "acquire" and "release" can vary, depending on what sort of semaphore you're dealing with.
Managing the locks
Any semaphore that implements Semaphore
can be locked in groups using the Lock_manager
class shown in Listing 2. There are several things going on here. First of all, I've used the Arrays.sort()
method, one of the new JDK 1.2 data-structure facilities, to make life easier. The Arrays
class is a "Booch utility" -- a class that contains nothing but static methods. The java.lang.Math utility is another example. In fact, the Lock_manager
itself is a utility, since it's made up solely of static methods.
01| package com.holub.asynch; 02| 03| import com.holub.asynch.Semaphore; 04| import java.util.Arrays; 05| import java.util.Comparator; 06| 07| class Lock_manager 08| { 09| private static int[] id_lock = new int[1]; 10| private static int id_pool = 0; 11| 12| public static int new_id( ) 13| { 14| int id; 15| synchronized( id_lock ) { 16| id = id_pool++; } 17| return id; 18| } 19| 20| /** 21| * Every mutex is given a unique int ID when it's created. 22| * This function returns once all of the locks in the incoming 23| * array have been successfully acquired. Locks are always 24| * acquired in ascending order of ID to attempt to avoid 25| * deadlock situations. 26| * 27| * @param locks All of these locks must be acquired before 28| * acquire_multiple returns. 29| * @param timeout Maximum time to wait to acquire each 30| * lock (milliseconds). The total time for the multiple 31| * acquire operation could be (timeout * locks.length). 32| **/ 33| 34| public static void acquire_multiple( Semaphore[] locks, long timeout ) 35| throws InterruptedException 36| { try 37| { 38| Arrays.sort(locks, new Lock_comparator() ); 39| for( int i = 0; i < locks.length; ++i ) 40| locks[i].acquire( timeout ) ; 41| } 42| catch( Comparable.Not_comparable e ) 43| { 44| // shouldn't happen. Convert it to an uncaught 45| // exception to terminate the program. 46| 47| throw new Error( e.toString() ); 48| } 49| } 50| 51| private static class Lock_comparator implements Comparator 52| { public int compare(Object a, Object b) 53| throws Comparable.Not_comparable 54| { return ((Semaphore)a).id() - ((Semaphore)b).id(); 55| } 56| public boolean equals(Object obj) 57| { return obj instanceof Lock_comparator; 58| } 59| } 60| }
Listing 2: Lock_manager.java
"Booch utilities"
Digressing for a moment, in most object-oriented languages, utilities are kludges needed to get around design problems in the language itself or in existing libraries. The problem in Java is that neither arrays nor primitive types really are true objects. If they were, they would contain all the methods needed to manipulate them. For example, you would be able to find a cosine using:
int x = PI; x.cosine();
You'd also be able to extend int
, and so forth. In the case of arrays, you ought to be able to sort them by asking them to sort themselves, like this:
int array[] = new int[]; //... array.sort();
Unfortunately, Java isn't a "pure" object-oriented language in which everything is an object, so we have to make do with a utility class.
You can use the Arrays
utility as follows to sort an array of int
:
int array[] = new int[]; //... Arrays.sort( array );
The problem is complicated a bit by arrays of objects of some arbitrary class. How can a generic sort utility figure out the correct ordering? The Object
class contains an equals()
function, but we'd need a greater_than()
as well to do sorting.
The Strategy pattern
To the rescue comes the "Strategy" pattern. The notion is to pass into a method or object another object that encapsulates the strategy needed to do something. There are lots of examples of Strategy in Java. A java.awt.LayoutManager
, for example, encapsulates the strategy that a Frame
uses for laying itself out. You can change the layout simply by changing this strategy. You don't have to define a new kind of Frame
class using derivation or some other mechanism. This, in turn, makes Frame
much more "reusable" since the behavior of a Frame
can be changed without changing any of the code that comprises Frame
.
The Arrays
utility uses the Strategy pattern to define a comparison strategy. In Listing 2, the Lock_comparator
(at the bottom of the listing) implements the comparison strategy by implementing the new (to JDK 1.2) java.util.Comparator
interface and its compare()
method. (This works like C's strcmp()
: it returns a negative number, zero, or a positive number depending on whether a is less than, equal to, or greater than b.) An instance of the Comparator
is passed to the Arrays.sort()
method like this:
Arrays.sort(locks, new Lock_comparator() );
The Lock_comparator
object's compare()
method is called by Arrays.sort()
when it needs to compare objects. C programmers will recognize this approach as very similar to that of qsort()
, which is passed a pointer to a compare method -- another example of Strategy.
How Lock_manager
works
Now, let's look at what the Lock_manager
actually does. It starts out with the static new_id()
method that returns a unique int
every time it's called. This method will be called from the various objects that implement the Semaphore
interface to get the value they'll use for their ID.
The acquire
multiple method is used to safely acquire groups of semaphores. You pass it an array of objects that implement Semaphore
, it sorts the array by ID, and then acquires the semaphores one at a time in ID order. Consistent acquisition of multiple semaphores in ID order effectively eliminates the deadlock scenario we've been discussing.