Make room for JavaSpaces, Part 2

Build a compute server with JavaSpaces

I've just returned from the second Jini Community Summit, where Ken Arnold and I led a birds-of-a-feather session on JavaSpaces. This encouraging meeting highlighted a number of interesting JavaSpaces applications, including Bill Olivier's JavaSpaces-based email system at the University of Wales. Also featured was the Department of Defense's explorations of battlefield management software, in which battlefield resources, such as tanks, are represented in a space.

TEXTBOX: TEXTBOX_HEAD: Make room for JavaSpaces: Read the whole series!


In addition, there was significant interest in continuing to advance the JavaSpaces technology, and a number of areas were suggested for its improvement. One problem that the conference attendees wanted to confront was the painful process of setting up and running JavaSpaces (as well as Jini) for the first time -- probably the greatest barrier to using the technology. This problem is now being addressed in a new working group started by Ken Arnold, called Out of the Box (for more information, see the Resources section at the end of the article). Another area of community effort involves the development of helper or utility interfaces and classes that provide a valuable set of tools for new JavaSpaces developers. This article is based around one of those tools in particular: the compute server.

Basically, a compute server is an implementation of a powerful, all-purpose computing engine using a JavaSpace. Tasks are dropped into the space, picked up by processes, and then computed; the result is written back into the space and, at some point, retrieved. Beginning JavaSpaces programmers often ask how to implement such a system; in fact, the JavaSpaces Technology Kit ships with two sample compute servers, which perform computations for ray tracing and cryptography.

Compute servers are quite useful even for advanced programming. At the Jini Summit, it was decided that a common set of interfaces should be created to standardize JavaSpaces-based compute servers so that programmers could avoid reinventing the wheel every time they wanted to implement one. As a result, the community has created a working group whose goal is to build a specification and reference implementation of a compute-server architecture. One aim of this article is to kick off that work.

The compute server explained

A compute server provides a service that accepts tasks, computes them, and returns results. The server itself is responsible for computing the results and managing the resources that complete the job. Behind the scenes, the service might use multiple CPUs or special-purpose hardware to compute the tasks more quickly than a single-CPU machine could.

Historically, compute servers have been used to take advantage of the resources of large farms of processors in order to tackle computationally intensive problems -- ray tracing, weather modeling, cryptography, and so-called "grand challenge" computational problems. More recently, compute servers have begun to move into the mainstream, and have been put to work in a variety of environments, from the creation of financial models for Wall Street to the construction of large-scale order-entry and fulfillment centers that service the Web. All of these applications use the compute server to break a large computational problem into smaller problems, and then allow a distributed set of processors to solve these smaller problems in parallel.

Compute servers come in two forms: special-purpose and general-purpose. The SETI@home project is a good example of a special-purpose compute server. SETI@home makes use of the idle cycles of common desktop PCs to search for artificial signals of extraterrestrial origin within radio telescope data. The SETI compute server (which is formed by a group of PC users who have agreed to run the SETI@home screensaver on their PCs) is considered a special-purpose server because it only performs one type of computation: it searches for messages from outer space.

General-purpose compute servers allow you to compute any task -- you can add new tasks at will, and the server automatically incorporates their corresponding code. This is a powerful idea. For example, a corporation might create a compute server out of all the idle machines on its network. Computationally intensive problems can then be run across this compute server, making use of unused resources on the network as they become available. This lets the corporation more effectively use its existing installed base of hardware, thus reducing its overall hardware requirements. While you could do this with a special-purpose compute server, the administrative costs would be a nightmare; every time a new application needed to be run, it would have to be installed across the entire organization. With a general-purpose server, this happens automatically.

Many space-based compute servers have been built over the years. While I was part of the Linda group at Yale University, we built several such systems. The most sophisticated of these was a system called Piranha, built by classmate David Kaminsky. Piranha adaptively configured itself to the available resources on the network; as the name suggests, CPUs began feeding on the tasks that needed to be worked on as they became available. All our attempts at building compute servers were a bit unsatisfying, however, because the tasks had to be statically compiled into the server, thus limiting it to special purposes. JavaSpaces vastly improves this situation by moving code between processes -- you can move code into the compute server any time you need a new task computed. This is exceedingly easy with JavaSpaces; in fact, the properties of the space (inherited from Jini and RMI) allow it to transparently import new code into the compute server for you.

Compute servers can be built quickly with just a few lines of code. However, creating a reliable and robust compute server is a challenge that requires knowledge of most JavaSpaces and Jini APIs. In this article, I will explore many of the subtle aspects of JavaSpaces programming. My goal is to develop a basic compute server that can compute arbitrary tasks. Along the way, you will get a better feel for some of the JavaSpaces APIs that Susanne and I described in our last article.

Compute server design

Let's start with a big picture of the compute server and its relation to JavaSpaces. The typical space-based compute server looks like the figure below. A master process generates a number of tasks and writes them into a space. When we say task in this context, we mean an entry that both describes the specifics of a task and contains methods that perform the necessary computations. One or more worker processes monitor the space; these workers take tasks as they become available, compute them, and then write their results back into the space. Results are entries that contain data from the computation's output. For instance, in a ray-tracing compute server, each task would contain the ray-tracing code along with a few parameters that tell the compute process which region of the ray-traced image to work on. The result entry would contain the bytes that make up the ray-traced region.

A space-based compute server

(Illustration by James P. Dustin, Dustin Design)

There are some nice properties the contribute to compute servers' ubiquity in space-based systems. First of all, they scale naturally; in general, the more worker processes there are, the faster tasks will be computed. You can add or remove workers at runtime, and the compute server will continue as long as there is at least one worker to compute tasks. Second, compute servers balance loads well -- if one worker is running on a slower CPU, it will compute its tasks more slowly and thus complete fewer tasks overall, while faster machines will have higher task throughput. This model avoids situations in which slow machines becoming overwhelmed while fast machines starve for work; instead, the load is balanced because the workers perform tasks when they can.

In addition, the masters and workers in this model are uncoupled; the workers don't have to know anything about the master or the specifics of the tasks -- they just grab tasks and compute them, returning the results to the space. Likewise, a given master doesn't need to worry about who computes its tasks or how, but just throws tasks into a space and waits. The space itself makes this very easy, as it provides a shared and persistent object store where objects can be looked up associatively (for more details on these aspects of spaces, please refer to November's column). In other words, the space allows a worker to simply request any task, and receive a task entry when one exists in the space. Similarly, the master can ask for the computational results of its (and only its) tasks -- without needing to know the specifics of where these results came from.

Finally, the space provides a feature that makes this all work seamlessly: the ability to ship around executable content in objects. This is feasible because Java (and Jini's underlying RMI transport) makes underlying class loading possible. How does this work? Basically, when a master (or any other process, for that matter) writes an object into a space, it is serialized and transferred to that space. The serialized form of the object includes a codebase, which identifies the place where the class that created the object resides. When a process (in this case, a worker) takes a serialized object from the space, it then downloads the object from the appropriate codebase, assuming that its JVM has not downloaded the class already. This is a very powerful feature, because it lets your applications ship around new behaviors and have those behaviors automatically installed in remote processes. This capability is a great improvement over previous space-based compute servers, which required you to precompile the task code into the workers.

Implementing the compute server

For masters and workers to exchange task objects, they must agree on a common interface for those objects. There aren't many requirements for the task object (at least in this simple implementation); all that is really needed is a method that instructs the task to start computing.

The well known Command pattern is used for this purpose. The interface for the Command pattern looks like this:

public interface Command {
    public void execute();

This pattern, which consists of a single method (typically called execute), lets you decouple the object that invokes an operation from the object that can perform it. In this case, the object that will invoke the operation is a worker that has cycles to spare and time to compute a task. The object that knows how to perform the operation is the task itself. Different master processes are free to implement various execution methods, so that workers can perform a variety of computations for different masters. See Resources for pointers to information about Command and other patterns.

To make use of the Command interface in this compute server, you must augment it:

package javaworld.simplecompute;

import net.jini.core.entry.Entry;

public interface Command extends Entry { public Entry execute(); }

Note that I've added extends Entry to the interface definition and imported the Entry class from the net.jini.core.entry package. You may recall from Part 1 of our series that, for an object to be written into a space, it must be instantiated from a class that implements the Entry interface. I've also changed the return type of the execute method to return an object of type Entry. This provides a way for computations to place results back into the space. We'll discover how this is done as we take our next step: implementing the worker.

The worker

The worker is a straightforward concept: it continually looks for a task, takes it from the space, computes it, returns a result (if the task specifies that this is to be done), and then starts over. Here is the code:

package javaworld.simplecompute;

import jsbook.util.*;

import net.jini.core.entry.Entry; import; import;

public class Worker { JavaSpace space; public static void main(String[] args) { Worker worker = new Worker(); worker.startWork(); } public Worker() { space = SpaceAccessor.getSpace(); } public void startWork() { TaskEntry template = new TaskEntry(); for (;;) { try { TaskEntry task = (TaskEntry) space.take(template, null, Long.MAX_VALUE); Entry result = task.execute(); if (result != null) { space.write(result, null, 1000*60*10); } } catch (Exception e) { System.out.println("Task cancelled"); } } } }

1 2 3 Page 1
Page 1 of 3