JW Archives: How the Java virtual machine performs thread synchronization

Understanding threads, locks, monitors and more in Java bytecode

All Java programs are compiled into class files that contain bytecodes, the machine language of the Java virtual machine. In this JavaWorld classic, Bill Venners goes under the hood of the JVM to explain the mechanics of thread synchronization, including shared data, locks, monitors, and synchronized. Like many JW Archives, this article is as relevant for students of JVM internals today as it was when it was first published in July 1997.

This month's Under the Hood looks at thread synchronization in both the Java language and the Java virtual machine (JVM). This article is the last in the long series of bytecode articles I began last summer. It describes the only two opcodes directly related to thread synchronization, the opcodes used for entering and exiting monitors.

Threads and shared data

One of the strengths of the Java programming language is its support for multithreading at the language level. Much of this support centers on coordinating access to data shared among multiple threads.

About JW Archives

JavaWorld is one of the most influential resources in the world for Java developers, hosting a library of feature articles dating back to 1996.  JW Archives selects how-to articles that are as relevant to the interests of developers today as when they were first published, refreshing them for a contemporary readership.

The JVM organizes the data of a running Java application into several runtime data areas: one or more Java stacks, a heap, and a method area. For a backgrounder on these memory areas, see the first Under the Hood article: "The lean, mean virtual machine."

Inside the Java virtual machine, each thread is awarded a Java stack, which contains data no other thread can access, including the local variables, parameters, and return values of each method the thread has invoked. The data on the stack is limited to primitive types and object references. In the JVM, it is not possible to place the image of an actual object on the stack. All objects reside on the heap.

There is only one heap inside the JVM, and all threads share it. The heap contains nothing but objects. There is no way to place a solitary primitive type or object reference on the heap -- these things must be part of an object. Arrays reside on the heap, including arrays of primitive types, but in Java, arrays are objects too.

Besides the Java stack and the heap, the other place data may reside in the JVM is the method area, which contains all the class (or static) variables used by the program. The method area is similar to the stack in that it contains only primitive types and object references. Unlike the stack, however, the class variables in the method area are shared by all threads.

Object and class locks

As described above, two memory areas in the Java virtual machine contain data shared by all threads. These are:

  • The heap, which contains all objects
  • The method area, which contains all class variables

If multiple threads need to use the same objects or class variables concurrently, their access to the data must be properly managed. Otherwise, the program will have unpredictable behavior.

To coordinate shared data access among multiple threads, the Java virtual machine associates a lock with each object and class. A lock is like a privilege that only one thread can "possess" at any one time. If a thread wants to lock a particular object or class, it asks the JVM. At some point after the thread asks the JVM for a lock -- maybe very soon, maybe later, possibly never -- the JVM gives the lock to the thread. When the thread no longer needs the lock, it returns it to the JVM. If another thread has requested the same lock, the JVM passes the lock to that thread.

Class locks are actually implemented as object locks. When the JVM loads a class file, it creates an instance of class java.lang.Class. When you lock a class, you are actually locking that class's Class object.

Threads need not obtain a lock to access instance or class variables. If a thread does obtain a lock, however, no other thread can access the locked data until the thread that owns the lock releases it.

Monitors

The JVM uses locks in conjunction with monitors. A monitor is basically a guardian in that it watches over a sequence of code, making sure only one thread at a time executes the code.

Each monitor is associated with an object reference. When a thread arrives at the first instruction in a block of code that is under the watchful eye of a monitor, the thread must obtain a lock on the referenced object. The thread is not allowed to execute the code until it obtains the lock. Once it has obtained the lock, the thread enters the block of protected code.

When the thread leaves the block, no matter how it leaves the block, it releases the lock on the associated object.

Multiple locks

A single thread is allowed to lock the same object multiple times. For each object, the JVM maintains a count of the number of times the object has been locked. An unlocked object has a count of zero. When a thread acquires the lock for the first time, the count is incremented to one. Each time the thread acquires a lock on the same object, a count is incremented. Each time the thread releases the lock, the count is decremented. When the count reaches zero, the lock is released and made available to other threads.

Synchronized blocks

In Java language terminology, the coordination of multiple threads that must access shared data is called synchronization. The language provides two built-in ways to synchronize access to data: with synchronized statements or synchronized methods.

Synchronized statements

To create a synchronized statement, you use the synchronized keyword with an expression that evaluates to an object reference, as in the reverseOrder() method below:

class KitchenSync {
    private int[] intArray = new int[10];
    void reverseOrder() {
        synchronized (this) {
            int halfWay = intArray.length / 2;
            for (int i = 0; i < halfWay; ++i) {
                int upperIndex = intArray.length - 1 - i;
                int save = intArray[upperIndex];
                intArray[upperIndex] = intArray[i];
                intArray[i] = save;
            }
        }
    }
}

In the case above, the statements contained within the synchronized block will not be executed until a lock is acquired on the current object (this). If instead of a this reference, the expression yielded a reference to another object, the lock associated with that object would be acquired before the thread continued.

Two opcodes, monitorenter and monitorexit, are used for synchronization blocks within methods, as shown in the table below.

Table 1. Monitors

Opcode
Operand(s)
Description
monitorenter
none
pop objectref, acquire the lock associated with objectref
monitorexit
none
pop objectref, release the lock associated with objectref

When monitorenter is encountered by the Java virtual machine, it acquires the lock for the object referred to by objectref on the stack. If the thread already owns the lock for that object, a count is incremented. Each time monitorexit is executed for the thread on the object, the count is decremented. When the count reaches zero, the monitor is released.

Take a look at the bytecode sequence generated by the reverseOrder() method of the KitchenSync class.

Note that a catch clause ensures the locked object will be unlocked even if an exception is thrown from within the synchronized block. No matter how the synchronized block is exited, the object lock acquired when the thread entered the block definitely will be released.

Synchronized methods

To synchronize an entire method, you just include the synchronized keyword as one of the method qualifiers, as in:

class HeatSync {
    private int[] intArray = new int[10];
    synchronized void reverseOrder() {
        int halfWay = intArray.length / 2;
        for (int i = 0; i < halfWay; ++i) {
            int upperIndex = intArray.length - 1 - i;
            int save = intArray[upperIndex];
            intArray[upperIndex] = intArray[i];
            intArray[i] = save;
        }
    }
}

The JVM does not use any special opcodes to invoke or return from synchronized methods. When the JVM resolves the symbolic reference to a method, it determines whether the method is synchronized. If it is, the JVM acquires a lock before invoking the method. For an instance method, the JVM acquires the lock associated with the object upon which the method is being invoked. For a class method, it acquires the lock associated with the class to which the method belongs. After a synchronized method completes, whether it completes by returning or by throwing an exception, the lock is released.

Bill Venners is president of Artima, Inc., publisher of Artima Developer (www.artima.com). He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Active in the Jini Community since its inception, Bill led the Jini Community's ServiceUI project, whose ServiceUI API became the de facto standard way to associate user interfaces to Jini services. Bill is also the lead developer and designer of ScalaTest, an open source testing tool for Scala and Java developers, and coauthor with Martin Odersky and Lex Spoon of the book, Programming in Scala.

Learn more about this topic

Read more of Bill's Under the Hood columns on JavaWorld

More about threads and concurrency

Inside the Java Virtual Machine (Bill Venners, McGraw-Hill Companies, January 2000) is reprinted in select chapters by Artima.com.

Also check out the JavaWorld site map and search engine.