Programming Java threads in the real world, Part 8

Threads in an object-oriented world, thread pools, implementing socket 'accept' loops

One of the biggest problems associated with thread use in an object-oriented environment is a conceptual one: though procedural programmers naturally think about the flow of control from function to function as the system works, object-oriented designers focus on the message flow within an individual scenario or use case. The traditional view of threading, however, concerns itself entirely with flow of control. As a consequence, object-oriented designers typically don't think about threads -- at least not until they get down to the very low-level implementation part of the design; rather, they think about two categories of messages: synchronous messages that don't return until they're done doing whatever they do, and asynchronous messages, which initiate some background operation and return immediately. This month's column (along with next month's) will address such issues by showing you how to reconcile these two points of view and implement object-oriented-style threading using Java's essentially procedural implementation of threads.

Java's implementation of threading, in fact, complicates matters by using a misleading metaphor for threads. You typically create a thread by deriving from Thread (or by handing a Runnable to a Thread object), which leads many a novice Java programmer to the erroneous belief that all the methods of the Thread derivative will run on that thread. In fact, a method of a Thread derivative is just like any other method: it runs on a thread only if called directly or indirectly from that thread's run() method. Objects do not run on threads; methods do.
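To make the distinction concrete, here is a minimal sketch in which the same method of a Thread derivative is called once directly and once from run(). The Worker and Worker_demo classes and the report() method are hypothetical names, not part of this column's code. Only the call made from run() executes on the new thread.

```java
// Hypothetical demo classes; the names are illustrative only.
class Worker extends Thread
{
    private volatile String ran_from_run;   // name of the thread that executed report() via run()

    Worker()
    {   super( "worker-thread" );
    }

    // An ordinary method: it executes on whatever thread calls it.
    String report()
    {   return Thread.currentThread().getName();
    }

    @Override public void run()
    {   ran_from_run = report();            // called from run(), so it runs on this thread
    }

    String ran_from_run()
    {   return ran_from_run;
    }
}

class Worker_demo
{
    public static void main( String[] args ) throws InterruptedException
    {   Worker w = new Worker();
        System.out.println( "direct call ran on:    " + w.report() );  // the calling thread
        w.start();
        w.join();                           // wait for run() to finish
        System.out.println( "call from run() ran on: " + w.ran_from_run() );
    }
}
```

The direct call reports the name of whatever thread invoked it, even though the method belongs to a Thread derivative.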

It's sometimes difficult to predict any control flow in an object-oriented system. In a procedural system, one procedure just calls another, beginning at a single starting point. If there are many threads, there are many paths through the code, but each path starts at a known place, and the control flow through a given thread is predictable (though sometimes not easily so). Object-oriented systems are another matter. Object-oriented systems tend to be networks of cooperating objects, communicating with one another via some message-passing system. The system's "main" method may well do nothing but create a bunch of objects, hook them up to each other, and then terminate. Flow of control is very difficult to predict in this system.

Synchronous vs. asynchronous messages

As I mentioned earlier, an object-oriented designer looks at the world in terms of objects and messages. Objects pass messages to each other, and the receipt of some message causes an appropriate message-handler -- a Java method -- to be executed. Most of these messages are synchronous: their handlers don't return until they're finished doing what they do. Other messages are asynchronous: the handler returns immediately, before the requested operation completes. Meanwhile, work is going on in the background to satisfy the original request. A good example of an asynchronous message in Java is Toolkit.getImage(), which initiates the process of fetching an image and then returns immediately, long before the actual image arrives.

The broad categories of messages (synchronous and asynchronous) can themselves be subdivided in various ways. For example, a balking message is one that can't even be initiated. Imagine that you could only open a limited number of database connections at a given moment, and that all the connections were in use. A message that required access to the database could "balk" if it couldn't get the connection. It isn't that it tried to do the operation and failed; rather, it couldn't even initiate the operation to give it a chance to fail.

Another variant on synchronous messages is a timeout. Rather than balking immediately, the method decides to wait for a predetermined amount of time for the resource to become available. If that time expires, the request will fail. (The operation probably never started, but if the operation indeed started, it certainly didn't complete successfully.) In Java, a read from a socket can time out in this way.
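Both variants can be sketched with a counting semaphore from java.util.concurrent, a package that postdates the code in this column. The Connection_pool class and its method names are hypothetical, and only the bookkeeping is modeled, not a real database connection.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical connection pool: only the permit counting is modeled.
class Connection_pool
{
    private final Semaphore permits;

    Connection_pool( int max_connections )
    {   permits = new Semaphore( max_connections );
    }

    // Balking: return false immediately if no connection is free.
    boolean try_connect()
    {   return permits.tryAcquire();
    }

    // Timeout: wait up to timeout_ms milliseconds for a connection, then give up.
    boolean try_connect( long timeout_ms ) throws InterruptedException
    {   return permits.tryAcquire( timeout_ms, TimeUnit.MILLISECONDS );
    }

    // Return a connection to the pool.
    void disconnect()
    {   permits.release();
    }
}
```

With a pool of one connection that is already in use, try_connect() balks at once, while try_connect(50) waits about 50 milliseconds before failing. Neither call ever starts the database operation itself.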

The problem isn't design; it's implementation. Designing asynchronous systems in an object-oriented way isn't particularly difficult. Object-oriented-design notations such as UML (the "Unified Modeling Language") can easily capture notions such as synchronous and asynchronous messages. Implementing these notions in the essentially procedural system mandated by the Java threading model is another matter, however.

The thread-per-method solution

Given an object-oriented design perspective -- a network of objects communicating via messages -- what's the best way to implement an asynchronous message? The most naive way, which is workable in simple situations, is for each asynchronous-message handler to spawn its own thread.

First, let's consider the following synchronous method, which flushes an internal buffer out to a file. (The Reader_writer lock was discussed last month.)

import com.holub.asynch.Reader_writer;
import java.io.*;
class Synchronous_flush
{
    private final OutputStream  out;
    private Reader_writer       lock = new Reader_writer();
    private byte[]              buffer;
    private int                 length;
    public Synchronous_flush( OutputStream out )
    {   this.out = out;
    }
    //...
    synchronized void flush( ) throws IOException
    {   try
        {   lock.request_write();
            out.write( buffer, 0, length );
            length = 0;
        }
        finally
        {   lock.write_accomplished();
        }
    }
}

This blocking version of flush() presents several problems. For one thing, flush() can block indefinitely while waiting to acquire the reader/writer lock. Moreover, if the OutputStream were a socket connection rather than a file, the write operation itself could take a long time. Finally, since flush() is synchronized, the entire object is locked while the flush is in progress, so any thread that tries to call any other synchronized method of Synchronous_flush will block until the flush() completes. This wait could turn into a nested-monitor-lockout situation should the lock not be released.

These problems can be solved by making flush() asynchronous; the flush() method should simply initiate the flush operation and then return immediately. Here's an initial (yet, as you'll soon see, not very successful) attempt:

import com.holub.asynch.Reader_writer;
import java.io.*;
class Asynchronous_flush
{
    private OutputStream    out;
    private Reader_writer   lock = new Reader_writer();
    private byte[]          buffer;
    private int             length;
    //...
    synchronized void flush( )
    {   new Thread()
        {   public void run()
            {   try
                {   lock.request_write();
                    out.write( buffer, 0, length );
                    length = 0;
                }
                catch( IOException e )
                {   // ignore it.
                }
                finally
                {   lock.write_accomplished();
                }
            }
        }.start();
    }
}

I've wrapped the former contents of the flush() method inside the run() method of an anonymous inner class that extends Thread. Now flush() does nothing but fire off the thread and return. This simple strategy can work for simple situations, but unfortunately it doesn't work here. Let's analyze the problems one at a time.

The main problem is that the write operation is no longer thread-safe. Simply synchronizing the flush() method locks the object only while we're in the flush() method, which isn't for very long. The actual write() operation is performed on its own thread long after flush() has returned, and the buffer may have been modified several times in the interim (or even worse, may be modified while the write is in progress). A possible solution to the synchronization problem is to make a copy of the buffer while we're synchronized, and then work from the copy when inside the (unsynchronized) auxiliary thread. The only time synchronization is necessary is while we're actually making the copy.

Because it's so easy, it would be nice if we could implement this strategy like this:

    synchronized void flush( )
    {   
        byte[] copy = buffer.clone();
        length = 0;
        new Thread()
        {   public void run()
            {   try
                {   lock.request_write();
                    out.write( copy, 0, length );
                }
                catch( IOException e )
                {   // ignore it.
                }
                finally
                {   lock.write_accomplished();
                }
            }
        }.start();
    }

But this code doesn't even compile. Remember that the inner-class object -- the anonymous Thread derivative -- exists long after the method returns. Consequently, the local variables of the method can't be used by the thread (unless they're final, which, in this case, they aren't) simply because they won't exist any more; they're destroyed when flush() returns. We can copy local variables into the thread object, however.
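The parenthetical remark about final locals points at one workaround, sketched here under the assumption that a snapshot of the buffer is all the thread needs. The Final_copy_flush class and its append() method are hypothetical. The reader/writer lock is left out to keep the sketch self-contained, and flush() returns the thread it starts only so that a caller can wait for it.

```java
import java.io.*;

// Hypothetical variant, not the column's Listing 1; the reader/writer lock is omitted.
class Final_copy_flush
{
    private final OutputStream  out;
    private byte[]              buffer = new byte[1024];
    private int                 length;

    Final_copy_flush( OutputStream out )
    {   this.out = out;
    }

    synchronized void append( byte b )      // no bounds checking in this sketch
    {   buffer[ length++ ] = b;
    }

    // Returns the started thread so that a caller can join() it if desired.
    synchronized Thread flush()
    {   // The values of final locals are copied into the inner-class
        // object, so they remain usable after flush() returns.
        final int    count = length;
        final byte[] copy  = new byte[ count ];
        System.arraycopy( buffer, 0, copy, 0, count );
        length = 0;

        Thread writer = new Thread()
        {   public void run()
            {   try
                {   out.write( copy, 0, count );
                }
                catch( IOException e )
                {   // Still ignored in this sketch.
                }
            }
        };
        writer.start();
        return writer;
    }
}
```

Note that the byte count has to be captured as well. Once the outer length field is reset to zero, the thread can no longer use it to find out how much of the copy to write.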

Listing 1 solves most of these problems by using the copy strategy I just discussed. The strange-looking thing on line 24 is an "instance initializer" for the inner class. Think of it syntactically as a static initializer that isn't static -- a sort-of metaconstructor. The code in the instance initializer is effectively copied into all constructors, including the compiler-generated "default" constructor, above any code specified in the constructor itself. That is, if you have both an instance initializer and a constructor, the code in the instance initializer executes first. (The one exception to this rule is that the instance initializer is not copied into any constructor that calls another constructor using the this(optional_args) syntax. This way the code in the instance initializer is executed only once.) The syntax is pretty ugly, but there it is.
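The ordering rules just described can be checked with a small sketch. The Init_order class below is hypothetical and exists only to record the order in which the pieces run, including the this(...) exception.

```java
// Hypothetical class used only to show initialization order.
class Init_order
{
    final StringBuilder log = new StringBuilder();

    {   log.append( "initializer;" );       // instance initializer
    }

    Init_order()
    {   this( "default" );                  // chains to the other constructor
        log.append( "no-arg;" );
    }

    Init_order( String name )
    {   log.append( "ctor(" + name + ");" );
    }
}

class Init_order_demo
{
    public static void main( String[] args )
    {   System.out.println( new Init_order().log );      // prints initializer;ctor(default);no-arg;
        System.out.println( new Init_order("x").log );   // prints initializer;ctor(x);
    }
}
```

The initializer's text appears exactly once in each log, even when one constructor chains to the other, and always ahead of the constructor bodies.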

An alternative to the solution in Listing 1 would be to encapsulate the code from the instance initializer in a nonstatic method, and then call it when initializing the field:

new Thread()
{   int     length; 
    byte[]  copy = init();
    private byte[] init()
    {   length        = Flush_example.this.length;
        byte[] copy   = new byte[length];
        System.arraycopy(Flush_example.this.buffer, 0, copy, 0, length);
        Flush_example.this.length = 0;
        return copy;
    }
    //...
}

This isn't much of an improvement over the instance initializer in clarity, and initializing length as a side effect of the init() call is particularly hideous. I've used arraycopy() rather than clone() because I didn't want to mess with the CloneNotSupportedException; checked exceptions are not, in general, allowed to propagate out of instance initializers.

Whatever method we use for initialization, the inner class's construction happens in the new expression on line 20 of Listing 1, while the outer-class object is locked, so the copy operation is thread-safe. The newly created thread then acquires the writer lock and writes to the file in its own good time, using the copy for this purpose.

Listing 1: Flush_example.java
01  import com.holub.asynch.Reader_writer;
02  import java.io.*;
03  
04  class Flush_example
05  {
06      public interface Flush_error_handler
07      {   void error( IOException e );
08      }
09  
10      private final OutputStream  out;
11      private Reader_writer       lock = new Reader_writer();
12      private byte[]              buffer;
13      private int                 length;
14  
15      public Flush_example( OutputStream out )
16      {   this.out = out;
17      }
18      //...
19      synchronized void flush( final Flush_error_handler handler )
20      {   new Thread()
21          {   int     length; 
22              byte[]  copy;
23  
24              {   length = Flush_example.this.length;
25                  copy   = new byte[length];
26                  System.arraycopy(Flush_example.this.buffer, 0, copy, 0, length);
27                  Flush_example.this.length = 0;
28              }
29  
30              public void run()
31              {   try
32                  {   lock.request_write();
33                      out.write( copy, 0, length );
34                  }
35                  catch( IOException e )
36                  {   handler.error(e);
37                  }
38                  finally
39                  {   lock.write_accomplished();
40                  }
41              }
42          }.start();
43      }
44  }

An exceptional problem

The next perplexing issue is what to do with the IOException. Back in the original version of the code, the exception propagated out of the flush() method to whoever called flush(). We can't do that here, because there's nobody to propagate it to: if you start backtracking down the call stack, you'll end up back in run(), but you didn't call run(); the system did when it fired up the thread. Simply ignoring the write error, as I've been doing, isn't a good strategy, because the caller never learns that the data wasn't written.
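The callback approach that Listing 1 takes with its Flush_error_handler can be sketched from the caller's side as follows. Everything here is hypothetical: the Propagation_demo class, its Error_handler interface (which merely mirrors Flush_error_handler), and the simulated failure that stands in for a real write error.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo: an error raised on a background thread reaches the
// caller only through the handler object the caller supplied.
class Propagation_demo
{
    interface Error_handler                 // mirrors Flush_error_handler
    {   void error( IOException e );
    }

    // Starts a thread whose "write" always fails, and reports the failure.
    static Thread failing_write( final Error_handler handler )
    {   Thread t = new Thread()
        {   public void run()
            {   try
                {   throw new IOException( "simulated write failure" );
                }
                catch( IOException e )
                {   handler.error( e );     // runs on the new thread, not the caller's
                }
            }
        };
        t.start();
        return t;
    }

    public static void main( String[] args ) throws InterruptedException
    {   final AtomicReference<IOException> seen = new AtomicReference<IOException>();
        Thread t = failing_write( new Error_handler()
        {   public void error( IOException e )
            {   seen.set( e );
            }
        } );
        t.join();                           // wait until the handler has had its chance to run
        System.out.println( "handler received: " + seen.get().getMessage() );
    }
}
```

Because the handler executes on the background thread, anything it touches has to be thread-safe, which is why this sketch stores the exception in an AtomicReference.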
