Need a good set of abstract data structures? ObjectSpace's JGL packs a punch!

With the Java Generic Library, you get a valuable package of data structures, algorithms, and helper classes

During the past year, Java programmers have been frustrated -- and quite legitimately -- with a lack of punch when it comes to abstract data structures in the java.lang and java.util packages. Well, fret no more: ObjectSpace Inc., developer of object-oriented software, is distributing a free, large, and well-designed package of generic data structures, algorithms, and helper classes. Called the Java Generic Library (JGL), this Java package (import COM.objectspace.jgl.*) is Java's timely and hard-hitting answer to C++'s Standard Template Library (STL).

Unlike other commercial offerings, like Thought Inc.'s Nutmeg library (reviewed by Doug Garrett in the October 1996 issue of Dr.Dobb's Journal), the JGL is free to anyone and is meant to be used for any purpose -- in the same spirit as Sun's Java programming language and Java development kit (JDK). Java developers accustomed to the JDK way of developing (that is, in a non-visual way) will be right at home with the Java Generic Library's distribution approach: the archive contains an excellent and comprehensive HTML-based user guide and API reference manual, pre-compiled .class files (the library itself), 170+ examples on how to use the classes, and, as rich icing on the cake, the complete source code to the entire library!

JGL: An overview

The problem domain JGL focuses on can be expressed in Niklaus Wirth's immortal formula: algorithms + data structures. Or rather, in this object-oriented era: data structures + algorithms.

On the data structures side, JGL makes available a set of Abstract Data Types (ADTs) that are derived from an abstract class called Container -- actually defined as a Java interface. Containers are further sub-categorized into the following (as shown also in Figure 1):

  • Sequences -- comprising arrays, single/double linked lists, and Dequeues (double-ended queues)
  • Sets -- hashing and ordered
  • Maps -- hashing and ordered
  • Queue
  • Stack

All sequences (containers in which element ordering plays a role) and sets (containers in which element ordering is unimportant) "descend" from their respective interfaces -- that is, Sequence and Set -- which inherit from interface Container. (Do not confuse the AWT Container class with JGL's Container: The two classes are unrelated.)

On the algorithms side, object-oriented inheritance is much less useful, so JGL's algorithms are simply grouped loosely into unrelated classes with names generally ending in -ing. Here are some of the main algorithm classes:

  • Sorting
  • Finding
  • Filtering
  • Counting
  • Replacing
  • Reversing
  • Applying
  • Transforming
  • Copying

Both groups of classes, data structures, and algorithms, can be coupled together with the help of a host of helper classes: iterator and function classes.

Iterator classes will be familiar to most experienced object-oriented programmers: They are abstract pointers to elements of abstract data structures. The JGL provides iterator classes not only for each of its Container classes but also for every native Java array variety and for java.util's Vector class. JGL lets you use native Java arrays and class Vector as first-class JGL citizens, via mediating classes called array adapters.

Function classes let you encapsulate "functions" (method calls, condition functions) as objects that you can then apply "in bulk" to containers in myriad different ways. For example, if you had a "convert to uppercase" function, you could convert every string in any container to uppercase, irrespective of whether the container is a linked list, a native Java array, a hashtable, or a set, and so on.

Now that I've given you a bird's-eye view of the JGL, let's swoop down and analyze these class groups in more detail.

Generic containers

Figure 1 shows the

Container

hierarchy and how it partially meshes with the java.util hierarchy. Note all JGL containers inherit from a single parent interface; thus all containers share a common subset of methods.

Java Generic Library
Figure 1. The JGL Container Hierarchy

This common subset of methods that all containers share are enumerated in the definition for interface Container, shown in Listing 1.

// Copyright(c) 1996,1997 ObjectSpace, Inc.

// Portions Copyright(c) 1995, 1996 Hewlett-Packard Company.

package COM.objectspace.jgl;

import java.util.Enumeration; import java.io.Serializable;

/* * Container is the interface that is implemented by all of the * Java Generic Library containers. * @version 2.0 * @author ObjectSpace, Inc. */

public interface Container extends Cloneable, Serializable { public Object add( Object object ); // Add an object to myself. public void clear(); // Remove all of my objects. public boolean isEmpty(); // Return true if I contain no objects. public Enumeration elements(); // Return an Enumeration of the components in this container public ForwardIterator start(); // Return an iterator positioned at my first item. public ForwardIterator finish(); // Return an iterator positioned immediately after my last item. public boolean equals( Object object ); // Return true if I'm equal to a specified object. public int size(); // Return the number of objects that I contain. public int maxSize(); // Return the maximum number of objects that I can contain. public Object clone(); // Return a shallow copy of myself. public String toString(); // Return a string that describes me. }

Listing 1. Container interface definition

At this abstraction level, the main methods of a Container are its add() method and the start() and finish() methods that return iterators. Further down the hierarchy, every concrete Container provides more specialized methods to actually retrieve objects from itself, in addition to more tailored object adding methods. As with the java.util containers, JGL containers can only contain objects, so if you need to store primitives, you may have to wrap them in their equivalent object forms (but check if an array adapter can't provide a simpler solution, indirectly wrapping the primitives at the higher level of an entire array instead).

Sets are very basic containers; their main methods are simply put() and get(). Sequences, on the other hand, give you pushFront()/popFront(), pushBack()/popBack(), front()/back(), at(), and put(), clearly reflecting their more flexible random access container nature.

Initially, JGL's Array class is probably the most approachable Container: It is the JGL equivalent of java.util.Vector. The exception, of course, that Array is a JGL Sequence (and hence also a Container) and can therefore easily be substituted by other Sequences, or Containers, if random access is not important.

ObjectSpace's JGL philosophy is not to compete with Sun's java.util package but to complement it, and JGL allows you to leave most of your Vector-based code as is. Nevertheless, you can use JGL to enhance it by providing an adapter class, VectorArray. This adapter wraps a Vector up in some JGL clothes, turning it into a Sequence, the far-reaching implications being that JGL algorithms can be applied to java.util.Vectors, as can JGL iterators. This simple fact alone should be reason enough to adopt JGL.

JGL has an even more powerful compatibility trick up its sleeve: Any native Java arrays can be adapted and turned into a JGL Sequence! To fit into the JGL scheme of things, you do not need to rewrite your code, which uses plain arrays: You just wrap some adapter classes around the arrays and let JGL loose on your existing code. Example 1 shows you how to use a JGL enumerator on the standard command-line arguments array that is passed to every Java application.

import jgl.ObjectArray; import java.util.Enumeration;

class Native {

public static void main (String[] args) {

ObjectArray cmdArgs; // an array adapter to adapt native obj arrays

cmdArgs = new ObjectArray(args); // wrap args array

// from now on, cmdArgs is a JGL-compatible array, so we can use // any of the Sequence/Container methods on it.

if ( ! cmdArgs.isEmpty() ) { Enumeration enum = cmdArgs.elements(); int i = 0; while ( enum.hasMoreElements() ) { String cmdLineArg = (String) enum.nextElement(); System.out.println( "Command line argument " + i + " = " + cmdLineArg); i++; } } } } // End of Class Native

Example 1: Turning the (native Strings) command-line arguments array into a JGL Container

Very much like Java's own java.lang wrapper classes (Boolean, Character, Float, and so on), JGL's adapter class nomenclature is perfectly regular (all classnames ending in -Array), so you do not need to look up the class you should use. An array of booleans simply requires a BooleanArray, an array of chars requires a CharArray, and so on.

If you are familiar with Java's java.util.Hashtable class, you will easily adopt JGL's Map classes. These take key-value pairs as their unit of storage, a key associating with a value as in a dictionary. You can use the generic add() method to add a key-value pair (that is, two objects) as a single object, courtesy of class Pair which simply glues two objects into a bigger object. Alternatively, you use put(Object key, Object value) to add entries as explicit pairs.

Of note is the ability of the map classes to handle one-to-one mappings, like java.util.Hashtable (HashMap and OrderedMap), and one-to-many mappings (HashMultiMap and OrderedMultiMap). For the *MultiMap classes that allow key duplicates, the list of matching values can not be returned via the usual single value-producing get(Object key). A more appropriate equalRange(key) method returns two iterators (that is, pointers) that together define a range of objects with identical keys.

While giving away as few implementation details as possible, the general online description for all these classes provides the time and space attributes for each ADT. A DList, for example (doubly linked list), comes with the following explanation: "Removing a single element is a constant time operation." These performance specs of course are vital to determine which of the JGL structures you should deploy. If you need fast adding of elements at an extremity, you should go for one of the Queue-type classes. If random access is additionally required, then Array is your best choice. If fast finding is critical, then you probably need a HashMap or HashSet, and so on.

Algorithm classEncapsulated algorithms
ApplyingforEach, inject
Comparingequal, lexicographicalCompare, median, mismatch
Copyingcopy, copyBackward
Countingcount, countIf
Fillingfill, fillN
Filteringreject, select, unique, uniqueCopy
FindingadjacentFind, detect, every, find, findIf, some
HashingorderedHash, unorderedHash
MinMaxmaxElement, minElement
SetOperations and OrderedSetOperationsincludes, setDifference, setIntersection, setSymmetricDifference, setUnion
PermutingnextPermutation, prevPermutation
Removingremove, removeIf, removeCopy, removeCopyIf
Replacingreplace, replaceIf, replaceCopy, replaceCopyIf
Reversingreverse, reverseCopy
Rotatingrotate, rotateCopy
ShufflingrandomShuffle
Sortingsort
SwappingiterSwap, swapRanges
Transformingcollect, transform

Table 1: Algorithm classes and their encapsulated algorithms provided by JGL

Algorithms

Complementing data structures are the algorithms that can optionally operate on them. JGL packs forty-plus different algorithms, as shown in Table 1 above, that can be applied to any of JGL's Containers. Herein lies JGL's phenomenal power and flexibility, as it means you can:

  • Find items in a deck
  • Swap items in singly linked list
  • Extract a subset of an array
  • Get the difference between a doubly linked list and a hashtable
  • Sort a stack frame
  • Rotate a queue, and so on, virtually ad infinitum

The combinations are almost endless. You may think they are far from infinite because the roughly 20 types of Containers, multiplied by roughly 40 algorithms, produces some 800 different combinations. In fact, this number ends up looking very small when you factor in the extra dimension of function objects that can customize most of the algorithms.

This distinctly non-object-oriented, orthogonal separation of algorithms from data structures is not done entirely consistently within JGL: For programmers' convenience, many Containers do encapsulate common algorithms applicable to them. The Set classes, for example, contain methods to calculate the union, difference, and intersection of two sets. But these set algorithms also are present in the SetOperations algorithms class -- meaning these set operations can be equally applied to any other two Containers! This hybrid approach proves that in the design of JGL, pragmatism was favored over philosophical purity.

Actual invocation of an algorithm is done via a static (class) method call. This means that you never instantiate one of the algorithm classes. See the examples later on in this article.

The Glue classes: functions and iterators

While you can use Containers purely on their own, or apply JGL algorithms to Containers without any other helper class lubricants, the inherent power of the two main groups of classes can only be fully realized when more classes are called upon -- the function and iterator classes.

Function classes

JGL's function class mechanism allows you to unleash the full potential of the JGL. Unfortunately, this is also JGL's most complex aspect.

Unary and binary functions

Theoretically, a function class is very simple: It either encapsulates a unary function or a binary function. Listing 2 shows the interfaces at the heart of every function class.

public interface UnaryFunction { Object execute( Object object ); // Return the result of } // executing with a single // Object.

public interface BinaryFunction { Object execute( Object first, Object second ); // Return the } // result of executing // with two Object arguments.

Listing 2: UnaryFunction and BinaryFunction interface definitions

All functions simply define a method called execute() that takes one argument (for unary functions) or two arguments (for binary functions) and that produces a single result. The types of the argument(s) and the return value is the most generic possible: Object. This is somewhat like void * in C or C++, although in Java, using Object is completely type-safe since you need to use the instanceof operator to determine an object's runtime type before you can cast to the real type. A typical example of a unary function would be a class (in other words, an execute() method) that takes Strings and returns their length. This function could then become the basis on which you sort a Container of Strings, based on their length instead of on their alphanumeric properties. Actually, JGL already contains such a unary function class; it is called LengthString.

JGL comes pre-bundled with the following UnaryFunction classes:

  • BinderFirst
  • BinderSecond
  • Hash
  • LengthString
  • NegateInteger
  • Print
  • SelectFirst
  • SelectSecond
  • ToString
  • UnaryCompose
  • UnaryPredicateFunction

JGL classes that implement the BinaryFunction interface are:

  • BinaryCompose
  • BinaryPredicateFunction
  • DividesInteger
  • MinusInteger
  • ModulusInteger
  • PlusInteger
  • PlusString
  • TimesInteger

Predicate functions

Very closely related to, but unfortunately not compatible with, these unary and binary functions are predicate functions, which also come in unary and binary flavors. Predicate functions produce a boolean result from operating on one or two argument objects (see Listing 3 for their interfaces). They are used to customize decision-making inside JGL algorithms and can be powerfully combined into composite predicates using logical operator (predicate) classes.

As an example use of predicate functions, consider the Replacing class of algorithms. The more flexible entry points let you specify your own condition for replacement, by passing the algorithm an instance of a UnaryPredicate. So instead of replacing objects when they equal some other object (the default behavior), you could force replacement when an object represents a negative number, or when a string is longer than a certain maximum length, or when an employee is on holiday.

public interface UnaryPredicate { boolean execute( Object object ); // Return the result of } // executing with a single // Object.

public interface BinaryPredicate { boolean execute( Object first, Object second ); // Return the result of } // executing with two // Object arguments.

Listing 3: UnaryPredicate and BinaryPredicate interface definitions

JGL classes that implement the UnaryPredicate interface are:

  • BinderFirstPredicate
  • BinderSecondPredicate
  • LogicalNot
  • NegativeInteger
  • PositiveInteger
  • UnaryComposePredicate
  • UnaryNot

Classes that implement the BinaryPredicate interface are:

  • BinaryComposePredicate
  • BinaryNot
  • EqualTo
  • GreaterEqualInteger
  • GreaterEqualString
  • GreaterInteger
  • GreaterString
  • HashComparitor
  • IdenticalTo
  • LessEqualInteger
  • LessEqualString
  • LessInteger
  • LessString
  • LogicalAnd
  • LogicalOr
  • NotEqualTo
  • NotIdenticalTo
  • SwappedBinaryPredicate

In my opinion, it is a shame the designers of JGL decided against the use of Boolean.TRUE and Boolean.FALSE objects as return values for the predicate functions (they return a primitive boolean instead). Such an approach would have meant predicate functions would just be special cases of the more generic Object-returning functions. The JGL team told me that my idea was originally used but found to be 40% slower when such predicate functions were deployed in algorithms where numerous comparisons are needed (sorting for example). It seems testing a native boolean is significantly faster than comparing two object references as in if (result == Boolean.TRUE).

Iterator classes

Every Container can produce iterator objects pointing to its elements via its

start()

and

finish()

methods. These then can be used by an application to iterate through all or part of the elements of a container, fetching them or modifying them. Again, the design of the JGL does not attempt to compete with established Java idioms like the use of the

Enumeration

interface and the

elements()

method to iterate through a JDK container. To iterate through all elements of a Vector, for example, you normally use:

Enumeration enum = myVector.elements();
while (enum.hasMoreElements() ) {
    aVectorElement = enum.nextElement(); // use aVectorElement
}

This Java idiom has been retained in JGL by forcing all Container classes to also include an elements() method which returns a plain Enumeration object. The problem with Java's standard enumerator is that it is a read-only iterator: You cannot use an Enumeration to change container elements. Another weakness is its rigid behavior: An Enumeration enumerates forwards only and sequentially produces every element of a container. You cannot skip elements. All these weaknesses are addressed by the more flexible JGL iterators, which provide get(), put(), and advance() methods. (See method dumpContents() in Listing 3.)

Don't let source code readability suffer!

The combined use of algorithm classes, function classes and, more particularly, predicate classes, has some severe source readability implications. Using algorithm classes generally makes all your iterations vanish from the spot where an algorithm's iteration (if any) "logically" occurs, to be replaced by a single method call. To make matters worse, the use of predicate functions results in hiding conditional statements from where they used to be. If you're not careful, you can turn a previously readable algorithm (expressed in time-honored and pretty-printed

while

s,

for

s,

if

s, and

else

s) into a highly cryptic sequential list of statements relying solely on generic programming-style assignments and method invocations.

While this novel generic programming approach, as embodied by JGL (and copied from C++'s STL), is very powerful and further helps us realize the lofty goal of code reuse, you absolutely must further encapsulate "heavy" generic programming sequences as methods with clear names, to be invoked by higher application logic. These methods will have to be documented much more thoroughly than vanilla logic, where conventional flow control constructs are self-documenting.

On this topic, it would be interesting to see how industry-standard code complexity metrics qualify code where conditionals and iteration are abstracted and encapsulated in classes and in their member methods. My guess is that the use of generic programming will need a reassessment of the algorithms used to calculate code complexity metrics like McCabe's popular cyclomatic complexity metric.

JGL examples

Determining the limits of what is and what isn't possible using JGL will take months (or more) of constant usage; the examples shown next only scratch the surface.

Example 2, FamilyTies.java, shows a code example demonstrating the interchangeable nature of JGL containers. Instances of an Array, HashSet, Queue, SList, DList, and a Deque all get filled with the same mix of objects, in exactly the same way.

Example 3, InCommon.java, shows how generic algorithms can be applied equally to different container types: The code uses the SetOperations class to calculate the intersection between an Array (which encapsulates a native character array) and a Deck. The intersection itself is stored in a doubly linked list. Note in particular how the array adapter additionally creates the illusion of managing objects, whereas the underlying native array contains just primitives.

Example 4, PowerSorts.java, shows how you can powerfully customize any of the JGL algorithms by passing additional function objects. Here I show how the Sorting class can be made to sort a list of e-mail addresses according to the address domains (in ascending or descending order), or sort a list of first names according to whether the name is used for men or women.

Some minor JGL criticism

My main (but still minor) criticism of the JGL is that its overall organization could be more hierarchical. Currently, the JGL is a single Java package containing a rather overwhelming number (100+) of classes and interfaces. Since nearly all classes fall into the very clearly delineated groups discussed in this article, it would be nice if these groups were structured into easy-to-remember Java subpackages -- for example, all Container classes would be grouped together in a jgl.containers package, and all algorithm classes in a jgl.algorithms package, for starters. This would mean extra explicit import statements at the beginning of your source files, but in my opinion, this is beneficial rather than a drag, as it helps to document the outside entities your programs rely on.

Another problem with the library is that if it remains a third-party Java library, any Web applets using the library will incur a rather painful time penalty when loading off the Web due to the large number of small classes JGL uses. This is because currently Java fetches each referenced class using a separate TCP session. It is to be hoped that Sun Microsystems will act swiftly and welcome the JGL into Java's inner sanctum and grant it "core" API status. (If Sun hasn't done this by the time you read this, maybe a bit of e-mail pressure could help things along.) If this happens, the JGL will reside on every Java-compatible client machine and no JGL classes will consume precious Internet bandwidth.

Conclusion

The benchmarks that accompany the JGL 2.0 distribution prove that the library's ADTs performance is very similar to that of the JDK structures: sometimes a bit faster, sometimes a bit slower. JGLs performance therefore, is not primarily to be found at run time, but rather during design and implementation, when its limitless flexibility will allow you to perform better. You will be able to tackle more complex software development faster, and without sacrificing any system flexibility.

ObjectSpace has implemented one of Java's most important third-party class libraries in existence to date, and is giving JGL away free to the explosively growing Java community. As with Java itself, the Java Generic Library borrows heavily from the C++ camp: It takes the best from C++'s STL, while leaving the C++ warts behind. Most C++ programmers today will know of their STL, but few are managing to exploit its potential. This is no surprise since the C++-STL combination is overly complex. On the other side of the fence Java now beckons with its JGL. Just a year and a half after Java's official launch, the language and its APIs remain delightfully simple, elegant, and powerful; ObjectSpace's JGL only manages to strengthen these rare and desirable characteristics.

Laurence Vanhelsuwé is an independent software engineer. Self-taught, he dropped out of college and immediately started a professional career writing arcade games. He has worked on such diverse technologies as X.25 WAN routers, virtual reality flight simulation, Postscript, and real-time digitized video-based traffic analysis. Laurence currently invests most of his time in learning the latest Java APIs and writing books about them. He thinks Java will revolutionize computer science by leveling the computing landscape into one pan-Java playing field.

Learn more about this topic

Join the discussion
Be the first to comment on this article. Our Commenting Policies