Seymour Cray's Cray and Danny Hillis's Connection Machine -- these two types of supercomputers are what dreams are made of, at least to programmers who, like myself, hunger for infinite processing power. These Testarossas of the number-crunching world can simulate weather, car crashes and impacting anti-tank shells, miniature ecosystems and the global economy, the effects of imaginary pollution disasters, the internals of stars, and the detonation of an H-bomb.
Supercomputers are the ultimate kit for more than day-dreaming power-programmers. Superpower governments and giant multinationals rely heavily on them. Their astronomical costs have meant that few programmers have had the pleasure of even writing a single machine instruction for these ivory-tower instruments of knowledge.
But what does all this have to do with Java?
"Knock, knock. Can I borrow your CPU for a minute?"
Consider for a second your own PC: You're most likely staring at a Pentium-class machine, with 16 or more megabytes of RAM. Now look at your humble telephone socket: For many of us, this is your PC's connection to the Internet, linking your machine to
more of these same Pentium-class machines. From this perspective, the Internet metamorphoses itself as the
supercomputer in the world. And all that is necessary to harvest its awesome potential is to convince as many as possible of these Internet-connected machines to do some work for you, while you sit back, still staring at your diminutive PC.
But how do you get millions of Internet nodes to execute some code?
Enter the ubiquitous Web browser (which by now is likely to be Java-compatible). Most applets embedded in a Web page these days range in usefulness from the entertaining but excrutiatingly slow to the annoying crash-on-arrival type. Why not distribute some useful applets for a change, like ones tackling a massive computational problem whose solution could benefit the world, instead of the military or some profit-obsessed oil giant?
Raytracing as an example DAMPP application
Since I am hardly a nuclear physicist or a meteorologist or a world-renowned economist, I have chosen to tackle a subject most computer users and programmers will be more comfortable with: raytraced image generation. Raytracing is a technique that to date has produced the most life-like, photorealistic synthetic images to come out of machines. Some of the best, mind-blowing movie special effects in these past few years were done using raytracing, although most are generated using much cheaper (that is, faster) techniques that end up producing less realistic images or animation.
Although raytracing is not what this article is about, I shall give you a very brief summary of the technique: Computer generated images all have at their heart a mathematical description of the scene to be generated. Life-like images are produced from three-dimensional mathematical descriptions, where the scene or model is composed out of simple building blocks like flat polygons, spheres, cubes, cones, and lines. To obtain realistic images, the algorithm must model color and light and all its physical side-effects (shadows, reflections, transparancy, brightness, and darkness) as close to the real phenomenon as possible. This is where the cheaper image-rendering algorithms use shortcuts that trade image quality for speed of generation. Raytracing does the opposite. Its goal is maximum realism at the expense of quick rendering.
Raytracing models reality very closely indeed. It attempts to model rays of light that, after emanating from some light source(s), interact with the model and eventually end up hitting your eye's retina. Concretely, the raytracing algorithm is extremely simple, in principle: For each image pixel to be rendered, it traces an imaginary beam of light from your eye (the viewpoint), going through the pixel and intersecting at some point the computer model of the scene to be drawn. At such an intersection the algorithm determines the pixel's color (and intensity) by tracing more imaginary light beams, this time emanating from the light source(s) in the scene's model. Hence ray-tracing, a very elegant but incredibly compute-intensive solution. (It is essentially a brute-force approach.)
Raytracing has one characteristic that's quite relevant to parallel computation: Each pixel can be computed totally independently from all others. In parallel jargon, there are no data dependencies. This means raytracing can be easily adapted to a parallel distributed environment, which is exactly what I have done with a raytracer written in Java, by Frederico Inacio de Moraes (who kindly gave me permission to use his code for this project).
The DAMPP Design: JobMasters and WorkerApplets
Let us now forget about raytracing and design the system this article is all about: a distributed applet-based (massively) parallel processing (DAMPP) system.
We need applets to calculate some parts of a parallel computation and we need to somehow assemble all the produced results in one central location. That central location will be the WWW server that carries the Web page containing our applets. These millions of applets, working like ants toward a common goal, are in fact just clones. So from now on I will simply talk about the WorkerApplet. (Keep in mind, though, that hundreds or thousands of these cloned applets can be instantiated at any one time, working in parallel on a given problem.)
On the WWW server runs another server program (along with the HTTP server), which I call the JobMaster. It is the command and control center of a DAMPP system. See the DAMPP high-level protocol diagram for the big picture.
DAMPP high-level protocol
Here's the scenario of the two components (WorkerApplet and JobMaster) interacting with each other:
A user surfs the Web and hits the page containing the worker applet. The worker applet loads and starts running immediately. The user is not even aware of this fact since the applet is totally invisible; it does not boast flashy animations or a GUI. It is simply a harmless visitor process in the machine.
The first thing the applet does is call home to tell the JobMaster it arrived safe and sound, and is ready to do some work (Step 1 in the DAMPP high-level protocol diagram). The JobMaster responds by issuing the applet with a job "specification" (Step 2). This is a parcel of information that allows the applet to start working on a sub-part of the big problem being attacked (in our raytracing example application, each applet calculates individual image lines).
Since Web users don't surf the Web for days on end, the applet knows it's living on borrowed time. At any moment the user can decide that surfing real waves is more exciting than surfing cyberwaves, and so kill his browser along with our poor applet. Worker applets therefore are designed to be dispensible, thus conforming nicely to our current throw-away consumer society <sigh>. DAMPP is designed to cope with applets being killed prematurely; all it does is eventually re-issue the same job to another applet on another surfer's computer.
Whenever an applet does manage to complete its calculations, it contacts the JobMaster again to hand over its results (Step 3) - and then, like the male bee for which there is no more use, commits suicide.
If we switch our perspective to that of the JobMaster, we see two types of incoming communications: job requests and returned results. To simplify implementation and enhance performance, these two very different streams will use separate channels and both rely on UDP (User Datagram Protocol) for the underlying communication protocol. UDP is one of the TCP/IP protocols, sitting between IP and TCP as far as complexity and capabilities are concerned. I chose the UDP protocol over the more-frequently used TCP protocol for three reasons:
- UDP's connectionless communication mode is perfectly suited to our design
- UDP consumes less Internet bandwidth than TCP for the same amount of information transferred
- UDP is faster
UDP has one disadvantage: It does not guarantee delivery of information. (That is why TCP, which does guarantee delivery, is used more often.) But since the number of applet hosts on the Internet is, in all practicality, limitless, any lost results or any applet-server rendezvous gone wrong can be safely ignored. This is analoguous to your body not blinking an eyelid at the death of one of its own cells: Statistically and practically, it is a meaningless event. The design of the DAMPP computation engine can therefore rely, in a similar vein as our bodies, on massive built-in redundancy to achieve its goals with a very high degree of probability.
The overall picture is now clear: A swarm of applets passively and invisibly are being executed by client machines all over the world, contacting their home server to request jobs, executing those jobs and returning results to the server.
DAMPP implementation details
We are now ready to talk Java code!
The whole DAMPP system consists of just five classes: three classes for the applet and three for the server (one class is shared by both). These classes are, for the applet:
- RayTrace (which extends WorkerApplet, instead of java.applet.Applet)
for the server:
and shared by both:
At this proof-of-concept stage, the applet side already has been decomposed into a generic platform for parallel processing (WorkerApplet) and the application-specific code (RayTrace). Since this article is all about the distributed parallel processing, and not about raytracing, I shall not go into detail of how RayTrace works. So let's look at WorkerApplet's implementation instead.
WorkerApplet is a semi-abstract (or semi-concrete -- both views are valid) foundation class for any concrete applets wishing to be the parallel building block for a DAMPP parallel computation engine. Actually, WorkerApplet consists mostly of concrete parts; the only abstract part left to be implemented in subclasses is a
run() method to satisfy the standard Java
Runnable interface. This method ensures that your application applets execute as a thread in the client's browser, thus allowing other applets to execute concurrrently with the DAMPP applet.
WorkerApplet is an abstraction layer meant to hide all the DAMPP communication details between applet and server. To this end WorkerApplet provides two methods:
- byte requestJob()
- void returnResults(byte array)
These two methods correspond directly with steps 1 and 3 of the applet-server protocol shown in the DAMPP high-level protocol diagram.
requestJob() initially is used by an applet to find out what part of the larger problem it should tackle. The return type of the method is
byte which means that an unstructured array of bytes is returned. This is because WorkerApplet does not impose any restrictions on the type or structure of the job specification message issued by the server. In the case of the raytracer demonstration application, the job specification essentially consists of the scanline number (the y-coordinate) of the image scanline that the applet should generate. Other applications would use completely different job specification messages; this is left entirely to the specific application.
Once the calculations are done (and these should be designed to require no more than a couple of minutes at most, lest you risk loosing too many applets prematurely due to them being killed off by users exiting their browsers), the applet sends its results back to the server via method
returnResults(). This method also uses an unstructured byte array as an argument to allow any type of information to be returned. In the case of our raytracer, a complete scanline of 24-bit color pixels is returned each time.
Since the system relies on an exchange of UDP packets (which are analogous to telegrams), and not on TCP-style persistent connections (which are more analoguous to telephone connections), the server needs a reply address to be able to reply to the applet.
Job Request UDP packet structure