Master Merlin's new I/O classes

Squeeze maximum performance out of nonblocking I/O and memory-mapped buffers

With the recent public beta release of J2SE (Java 2 Platform, Standard Edition) 1.4 (code-named Merlin), Sun has once again unleashed scores of new classes, features, and interfaces on unsuspecting Java developers. Because J2SE 1.3 focused only on performance improvements, J2SE 1.4 incorporates two years' worth of feature enhancements. Also, as the first J2SE release defined by the Java Community Process (JCP), Merlin reflects a wider array of interests than previous JDK releases.

A few obvious additions in Merlin have received most of the press so far, including the XML parser, secure sockets extension, and 2D graphics enhancements. This article introduces an exciting new API that many have overlooked. The new I/O (input/output) packages finally address Java's long-standing shortcomings in high-performance, scalable I/O. The new I/O packages -- java.nio.* -- allow Java applications to handle thousands of open connections while still scaling and performing well. These packages introduce four key abstractions that work together to solve the problems of traditional Java I/O:

  1. A Buffer contains data in a linear sequence for reading or writing. A special buffer provides for memory-mapped file I/O.
  2. A Charset maps Unicode character strings to and from byte sequences. (Yes, this is Java's third shot at character conversion.)
  3. A Channel -- which can be a socket, a file, or a pipe -- represents a bidirectional communication path.
  4. A Selector multiplexes asynchronous I/O operations onto one or more threads.
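
As a first taste of the first two abstractions, here is a minimal sketch of a buffer being written and then read, and of a charset converting text to bytes and back. The class and method names are made up for illustration, and the sketch uses only java.nio.ByteBuffer, java.nio.CharBuffer, and java.nio.charset.Charset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical class name; not part of the article's own examples.
class BufferCharsetSketch {

    // Writes two bytes into a buffer, flips it, and reads them back.
    static String writeThenRead() {
        ByteBuffer buffer = ByteBuffer.allocate(16); // capacity 16, position 0
        buffer.put((byte) 'h').put((byte) 'i');      // writing advances the position to 2
        buffer.flip();                               // limit = 2, position = 0: ready to read
        StringBuilder out = new StringBuilder();
        while (buffer.hasRemaining()) {
            out.append((char) buffer.get());
        }
        return out.toString();
    }

    // Encodes text to bytes with the named charset, then decodes it back.
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        ByteBuffer bytes = charset.encode(text);  // the returned buffer is ready to read
        CharBuffer chars = charset.decode(bytes);
        return chars.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead());
        System.out.println(roundTrip("Merlin", "UTF-8"));
    }
}
```

The call to flip() is the step that switches a buffer from being written to being read; forgetting it is probably the most common mistake when first working with buffers.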

A quick review

Before diving into the new API's gory details, let's review Java I/O's old style. Imagine a basic network daemon. It needs to listen to a ServerSocket, accept incoming connections, and service each connection. Assume for this example that servicing a connection involves reading a request and sending a response. That resembles the way a Web server works. Figure 1 depicts the server's lifecycle. At each heavy black line, the I/O operation blocks -- that is, the operation call won't return until the operation completes.

Figure 1. Blocking points in a typical Java server

Let's take a closer look at each step.

Creating a ServerSocket is easy:

ServerSocket server = new ServerSocket(8001);


Accepting new connections is just as easy, but with a hidden catch:

Socket newConnection = server.accept();


The call to server.accept() blocks until the ServerSocket accepts an incoming network connection. That leaves the calling thread sitting for an indeterminate length of time. If this application has only one thread, it does a great impression of a system hang.
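
One way to see the blocking for yourself is to put a timeout on the server socket and watch accept() give up. The sketch below assumes a modern JDK (it uses try-with-resources, which postdates J2SE 1.4), and its class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical class name; a demonstration, not a server.
class BlockingAcceptSketch {

    // Returns true if accept() was still blocked when the timeout expired.
    static boolean acceptTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port, so no client knows to connect.
        try (ServerSocket server = new ServerSocket(0)) {
            server.setSoTimeout(timeoutMillis); // without this, accept() waits indefinitely
            try (Socket connection = server.accept()) {
                return false; // a client did connect
            } catch (SocketTimeoutException stillWaiting) {
                return true;  // nobody connected; the thread did nothing but wait
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("accept() timed out: " + acceptTimesOut(500));
    }
}
```

A timeout only bounds the wait. The calling thread still does nothing useful while it sits inside accept().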

Once the incoming connection has been accepted, the server can read a request from that socket, as shown in the code below. Don't worry about the Request object. It is a fiction invented to keep this example simple.

InputStream in = newConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
LineNumberReader lnr = new LineNumberReader(reader);
Request request = new Request();
while(!request.isComplete()) {
  String line = lnr.readLine();
  if (line == null) {
    break;  // readLine() returns null once the client closes the socket
  }
  request.addLine(line);
}

This harmless-looking chunk of code has several problems. Let's start with blocking. The call to lnr.readLine() eventually filters down to call SocketInputStream.read(). There, if data waits in the network buffer, the call immediately returns some data to the caller. If there isn't enough data buffered, then the call to read blocks until enough data is received or the other computer closes the socket. Because LineNumberReader asks for data in chunks (it extends BufferedReader), it might just sit around waiting to fill a buffer, even though the request is actually complete. The tail end of the request can sit in a buffer that LineNumberReader has not returned.

This code fragment also creates too much garbage, another big problem. LineNumberReader creates a buffer to hold the data it reads from the socket, but it also creates Strings to hold the same data. In fact, internally, it creates a StringBuffer. LineNumberReader reuses its own buffer, which helps a little. Nevertheless, all the Strings quickly become garbage.

Now it's time to send the response. It might look something like this (imagine that the Response object creates its stream by locating and opening a file):

Response response = request.generateResponse();
OutputStream out = newConnection.getOutputStream();
// Named responseIn so it does not clash with the request stream "in" above
InputStream responseIn = response.getInputStream();
int b;
while(-1 != (b = responseIn.read())) {
  out.write(b);
}
newConnection.close();

This code suffers from only two problems. Again, the read and write calls block. Writing one byte at a time to a socket is also slow, so the stream should be buffered. Of course, if the stream were buffered, then the buffers would create more garbage.
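
For comparison, here is a sketch of the same copy loop moving data a chunk at a time through a single reusable byte array. It is one possible shape rather than the article's own code: the class name, method name, and buffer size are illustrative, and UncheckedIOException assumes Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical class name; in-memory streams stand in for the file and the socket.
class ChunkedCopySketch {

    // Copies everything from in to out through one reusable buffer; returns the byte count.
    static long copy(InputStream in, OutputStream out, int bufferSize) {
        byte[] buffer = new byte[bufferSize]; // allocated once, reused for every chunk
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) { // still a blocking call
                out.write(buffer, 0, n);          // write only the bytes actually read
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4096);
        System.out.println(copied + " bytes copied");
    }
}
```

This cuts the number of calls dramatically, but each read and write still blocks, and every connection that copies this way needs its own buffer.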

You can see that even this simple example features two problems that won't go away: blocking and garbage.

The old way to break through blocks

The usual approach to dealing with blocking I/O in Java involves threads -- lots and lots of threads. You can simply create a pool of threads waiting to process requests, as shown in Figure 2.

Figure 2. Worker threads to handle requests

Threads allow a server to handle multiple connections, but they still cause trouble. First, threads are not cheap. Each has its own stack and receives some CPU allocation. As a practical matter, a JVM might create dozens or even a few hundred threads, but it should never create thousands of them.
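
The worker-thread arrangement can be sketched roughly as follows. This version leans on java.util.concurrent and lambdas, both of which arrived well after J2SE 1.4, so treat it as a modern rendering of the idea rather than period code. The class name, the upper-casing "service," and the requestOnce() helper are all invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical class name; a toy server that answers one request and shuts down.
class WorkerPoolSketch {

    // Services one connection: read a one-line request, reply in upper case.
    static void handle(Socket connection) {
        try (Socket c = connection) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8));
            String request = in.readLine(); // blocks only this worker thread
            if (request != null) {
                String response = request.toUpperCase(Locale.ROOT) + "\n";
                OutputStream out = c.getOutputStream();
                out.write(response.getBytes(StandardCharsets.UTF_8));
                out.flush();
            }
        } catch (IOException e) {
            // One failed connection should not take the worker down.
        }
    }

    // Starts a pooled server on a free loopback port, sends one request, returns the reply.
    static String requestOnce(String request, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                try {
                    while (true) {
                        Socket connection = server.accept(); // blocks the acceptor only
                        pool.execute(() -> handle(connection));
                    }
                } catch (IOException serverClosed) {
                    // accept() fails once the server socket is closed: stop accepting.
                }
            });
            acceptor.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                OutputStream out = client.getOutputStream();
                out.write((request + "\n").getBytes(StandardCharsets.UTF_8));
                out.flush();
                BufferedReader reply = new BufferedReader(
                        new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
                return reply.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(requestOnce("hello", 4));
    }
}
```

Note that the pool size caps the number of connections serviced at once. A slow client ties up a worker for as long as its readLine() call blocks.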

In a deeper sense, you don't need all those threads. They do not efficiently use the CPU. In a request-response server, each thread spends most of its time blocked on some I/O operation. These lazy threads offer an expensive approach to keeping track of each request's state in a state machine. The best solution would multiplex connections and threads so a thread could order some I/O work and go on to something productive, instead of just waiting for the I/O work to complete.
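
As a small preview of what that multiplexing looks like, the sketch below registers a nonblocking server channel with a selector and asks which channels are ready instead of blocking in accept(). It is only a hedged illustration: the class and method names are invented, and ServerSocketChannel.bind() and try-with-resources assume Java 7 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical class name; shows readiness selection only, not a full server.
class SelectorPreviewSketch {

    // Returns {ready keys before any client connects, ready keys after one connects}.
    static int[] readyCounts() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);                   // required before register()
            server.register(selector, SelectionKey.OP_ACCEPT); // ask to hear about new clients

            // Returns at once, so the thread stays free for other work.
            int before = selector.selectNow();

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                int after = selector.select(2000); // waits at most 2 seconds for a ready channel
                return new int[] { before, after };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = readyCounts();
        System.out.println("ready before: " + counts[0] + ", ready after: " + counts[1]);
    }
}
```

A real server would go on to accept the connection and register the new channel with the same selector, so one thread can watch many connections.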

New I/O, new abstractions

Now that we've reviewed the classic approach to Java I/O, let's look at how the new I/O abstractions work together to solve the problems we've seen with the traditional approach.
