Programming Java threads in the real world, Part 1

A Java programmer's guide to threading architectures

All Java programs other than simple console-based applications are multithreaded, whether you like it or not. The problem is that the Abstract Windowing Toolkit (AWT) processes operating system (OS) events on its own thread, so your listener methods actually run on the AWT thread. These same listener methods typically access objects that are also accessed from the main thread. It may be tempting, at this point, to bury your head in the sand and pretend you don't have to worry about threading issues, but you can't usually get away with it. And, unfortunately, virtually none of the books on Java addresses threading issues in sufficient depth. (For a list of helpful books on the topic, see Resources.)
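To see how easily the problem sneaks in, consider the following sketch (the class and field names are invented for illustration): the constructor and main() run on the main thread, but actionPerformed() runs on the AWT event thread, so two threads end up touching the same count field. Nothing in the code announces that a second thread is involved -- it comes free with the AWT.

import java.awt.*;
import java.awt.event.*;

class two_threads
{
    private int count = 0;              // touched by both the main and AWT threads

    two_threads()
    {
        Frame  frame  = new Frame ("two threads");
        Button button = new Button("click me");

        button.addActionListener
        (   new ActionListener()
            {   public void actionPerformed( ActionEvent e )
                {   ++count;            // runs on the AWT thread
                }
            }
        );

        frame.add( button );
        frame.pack();
        frame.show();
    }

    public static void main( String[] args )
    {
        two_threads app = new two_threads();
        app.count = 0;                  // runs on the main thread
    }
}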

This article is the first in a series that will present real-world solutions to the problems of programming Java in a multithreaded environment. It's geared to Java programmers who understand the language-level stuff (the synchronized keyword and the various facilities of the Thread class), but want to learn how to use these language features effectively.

Platform dependence

Unfortunately, Java's promise of platform independence falls flat on its face in the threads arena. Though it's possible to write a platform-independent multithreaded Java program, you have to do it with your eyes open. This isn't really Java's fault; it's almost impossible to write a truly platform-independent threading system. (Doug Schmidt's ACE [Adaptive Communication Environment] framework is a good, though complex, attempt. See Resources for a link to his program.) So, before I can talk about hard-core Java-programming issues in subsequent installments, I have to discuss the difficulties introduced by the platforms on which the Java virtual machine (JVM) might run.

Atomic energy

The first OS-level concept that's important to understand is atomicity. An atomic operation cannot be interrupted by another thread. Java does define at least a few atomic operations. In particular, assignment to variables of any type except long or double is atomic. You don't have to worry about a thread preempting a method in the middle of the assignment. In practice, this means that you never have to synchronize a method that does nothing but return the value of (or assign a value to) a boolean or int instance variable. Similarly, a method that did a lot of computation using only local variables and arguments, and which assigned the results of that computation to an instance variable as the last thing it did, would not have to be synchronized. For example:

class some_class
{
    int some_field;

    void f( some_class arg )        // deliberately not synchronized
    {
        // Do lots of stuff here that uses local variables
        // and method arguments, but does not access
        // any fields of the class (or call any methods
        // that access any fields of the class).
        // ...
        int new_value = 0;          // stands in for the computed result

        some_field = new_value;     // do this last.
    }
}


On the other hand, when executing x=++y or x+=y, you could be preempted after the increment but before the assignment. To get atomicity in this situation, you'll need to use the keyword synchronized.
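For example, a counter whose increment must be atomic has to do the increment inside a synchronized method. Here's a minimal sketch (the class and method names are invented):

class counter
{
    private int value = 0;

    // ++value is a read-modify-write sequence; without synchronization,
    // another thread could preempt us between the read and the write,
    // and one of the two increments would be lost.
    synchronized int increment()
    {   return ++value;
    }

    synchronized int current_value()
    {   return value;
    }
}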

All this is important because the overhead of synchronization can be nontrivial, and can vary from OS to OS. The following program demonstrates the problem. Each loop repetitively calls a method that performs the same operations, but one of the methods (locking()) is synchronized and the other (not_locking()) isn't. Using the JDK "performance-pack" VM running under Windows NT 4, the program reports a 1.2-second difference in runtime between the two loops, or about 1.2 microseconds per call. This difference may not seem like much, but it represents a 7.25-percent increase in calling time. Of course, the percentage increase falls off as the method does more work, but a significant number of methods -- in my programs, at least -- are only a few lines of code.

import java.util.*;

class synch
{
    synchronized int locking    (int a, int b){return a + b;}
    int              not_locking(int a, int b){return a + b;}

    private static final int ITERATIONS = 1000000;

    static public void main(String[] args)
    {
        synch tester = new synch();

        // Time ITERATIONS calls to the synchronized method...
        double start = new Date().getTime();
        for(long i = ITERATIONS; --i >= 0 ;)
            tester.locking(0,0);
        double end = new Date().getTime();
        double locking_time = end - start;

        // ...and the same number of calls to the unsynchronized one.
        start = new Date().getTime();
        for(long i = ITERATIONS; --i >= 0 ;)
            tester.not_locking(0,0);
        end = new Date().getTime();
        double not_locking_time = end - start;

        double time_in_synchronization = locking_time - not_locking_time;

        System.out.println( "Time lost to synchronization (millis.): "
                        + time_in_synchronization );
        System.out.println( "Locking overhead per call: "
                        + (time_in_synchronization / ITERATIONS) );
        System.out.println(
            (time_in_synchronization / not_locking_time * 100.0) + "% increase" );
    }
}


Though the HotSpot VM is supposed to address the synchronization-overhead problem, HotSpot isn't a freebie -- you have to buy it. Unless you license and ship HotSpot with your app, there's no telling which VM will be on the target platform, and of course you want your program's execution speed to depend as little as possible on the VM that's executing it. Even if deadlock problems (which I'll discuss in the next installment of this series) didn't exist, the notion that you should "synchronize everything" is just plain wrong-headed.

Concurrency versus parallelism

The next OS-related issue (and the main problem when it comes to writing platform-independent Java) has to do with the notions of concurrency and parallelism. Concurrent multithreading systems give the appearance of several tasks executing at once, but these tasks are actually split up into chunks that share the processor with chunks from other tasks. In parallel systems, two tasks are actually performed simultaneously; parallelism requires a multiple-CPU system. The following figure illustrates the difference.

Unless it spends a lot of time blocked, waiting for I/O operations to complete, a program that uses multiple concurrent threads will often run slower than an equivalent single-threaded program, though it will usually be better organized than the single-threaded version. A program that uses multiple threads running in parallel on multiple processors will run much faster.
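You can see the effect with a quick experiment along the following lines (a rough sketch; the class name, method names, and iteration count are invented): time the same CPU-bound computation done in one thread and then split across two threads. On a single-CPU machine, the two-thread version usually takes at least as long as the single-threaded one; it pulls ahead only when the two threads actually run on separate processors.

import java.util.*;

class concurrency_demo
{
    static final long WORK = 50000000L;

    // Pure computation; nothing here blocks or waits for I/O.
    static double spin( long iterations )
    {
        double x = 0.0;
        for( long i = 0; i < iterations; ++i )
            x += Math.sqrt( i );
        return x;
    }

    public static void main( String[] args ) throws InterruptedException
    {
        double start = new Date().getTime();
        spin( WORK );
        System.out.println( "one thread:  "
                        + (new Date().getTime() - start) + " millis." );

        start = new Date().getTime();
        Thread first  = new Thread(){ public void run(){ spin( WORK/2 ); } };
        Thread second = new Thread(){ public void run(){ spin( WORK/2 ); } };
        first.start();
        second.start();
        first.join();               // wait for both halves to finish
        second.join();
        System.out.println( "two threads: "
                        + (new Date().getTime() - start) + " millis." );
    }
}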


Resources
  • For a great in-depth look at multithreading in general, and at its implementation both in and with Java in particular, this one's a must. It's required reading if you're using threads heavily: Doug Lea, Concurrent Programming in Java: Design Principles and Patterns (Addison-Wesley, 1997)
    http://java.sun.com/docs/books/cp/
  • For an intro-level book on Java threading that is less technical but more readable than Lea's effort, see Scott Oaks and Henry Wong, Java Threads (O'Reilly, 1997)
    http://www.oreilly.com/catalog/jthreads/
  • This book is good for those looking into the general subject of multithreading, but it doesn't have a Java slant: Bill Lewis and Daniel J. Berg, Threads Primer: A Guide to Multithreaded Programming (Prentice Hall/SunSoft Press, ISBN 0-13-443698-9)
    http://www.sun.com/books/books/Lewis/Lewis.html
  • Doug Schmidt's ACE framework is a good, though complex, attempt at a truly platform-independent threading system
    http://www.cs.wustl.edu/~schmidt/