Need a good set of abstract data structures? ObjectSpace's JGL packs a punch!

With the Java Generic Library, you get a valuable package of data structures, algorithms, and helper classes

1 2 Page 2
Page 2 of 2

Theoretically, a function class is very simple: It either encapsulates a unary function or a binary function. Listing 2 shows the interfaces at the heart of every function class.

public interface UnaryFunction { Object execute( Object object ); // Return the result of } // executing with a single // Object.

public interface BinaryFunction { Object execute( Object first, Object second ); // Return the } // result of executing // with two Object arguments.

Listing 2: UnaryFunction and BinaryFunction interface definitions

All functions simply define a method called execute() that takes one argument (for unary functions) or two arguments (for binary functions) and that produces a single result. The types of the argument(s) and the return value is the most generic possible: Object. This is somewhat like void * in C or C++, although in Java, using Object is completely type-safe since you need to use the instanceof operator to determine an object's runtime type before you can cast to the real type. A typical example of a unary function would be a class (in other words, an execute() method) that takes Strings and returns their length. This function could then become the basis on which you sort a Container of Strings, based on their length instead of on their alphanumeric properties. Actually, JGL already contains such a unary function class; it is called LengthString.

JGL comes pre-bundled with the following UnaryFunction classes:

  • BinderFirst
  • BinderSecond
  • Hash
  • LengthString
  • NegateInteger
  • Print
  • SelectFirst
  • SelectSecond
  • ToString
  • UnaryCompose
  • UnaryPredicateFunction

JGL classes that implement the BinaryFunction interface are:

  • BinaryCompose
  • BinaryPredicateFunction
  • DividesInteger
  • MinusInteger
  • ModulusInteger
  • PlusInteger
  • PlusString
  • TimesInteger

Predicate functions

Very closely related to, but unfortunately not compatible with, these unary and binary functions are predicate functions, which also come in unary and binary flavors. Predicate functions produce a boolean result from operating on one or two argument objects (see Listing 3 for their interfaces). They are used to customize decision-making inside JGL algorithms and can be powerfully combined into composite predicates using logical operator (predicate) classes.

As an example use of predicate functions, consider the Replacing class of algorithms. The more flexible entry points let you specify your own condition for replacement, by passing the algorithm an instance of a UnaryPredicate. So instead of replacing objects when they equal some other object (the default behavior), you could force replacement when an object represents a negative number, or when a string is longer than a certain maximum length, or when an employee is on holiday.

public interface UnaryPredicate { boolean execute( Object object ); // Return the result of } // executing with a single // Object.

public interface BinaryPredicate { boolean execute( Object first, Object second ); // Return the result of } // executing with two // Object arguments.

Listing 3: UnaryPredicate and BinaryPredicate interface definitions

JGL classes that implement the UnaryPredicate interface are:

  • BinderFirstPredicate
  • BinderSecondPredicate
  • LogicalNot
  • NegativeInteger
  • PositiveInteger
  • UnaryComposePredicate
  • UnaryNot

Classes that implement the BinaryPredicate interface are:

  • BinaryComposePredicate
  • BinaryNot
  • EqualTo
  • GreaterEqualInteger
  • GreaterEqualString
  • GreaterInteger
  • GreaterString
  • HashComparitor
  • IdenticalTo
  • LessEqualInteger
  • LessEqualString
  • LessInteger
  • LessString
  • LogicalAnd
  • LogicalOr
  • NotEqualTo
  • NotIdenticalTo
  • SwappedBinaryPredicate

In my opinion, it is a shame the designers of JGL decided against the use of Boolean.TRUE and Boolean.FALSE objects as return values for the predicate functions (they return a primitive boolean instead). Such an approach would have meant predicate functions would just be special cases of the more generic Object-returning functions. The JGL team told me that my idea was originally used but found to be 40% slower when such predicate functions were deployed in algorithms where numerous comparisons are needed (sorting for example). It seems testing a native boolean is significantly faster than comparing two object references as in if (result == Boolean.TRUE).

Iterator classes

Every Container can produce iterator objects pointing to its elements via its

start()

and

finish()

methods. These then can be used by an application to iterate through all or part of the elements of a container, fetching them or modifying them. Again, the design of the JGL does not attempt to compete with established Java idioms like the use of the

Enumeration

interface and the

elements()

method to iterate through a JDK container. To iterate through all elements of a Vector, for example, you normally use:

Enumeration enum = myVector.elements();
while (enum.hasMoreElements() ) {
    aVectorElement = enum.nextElement(); // use aVectorElement
}

This Java idiom has been retained in JGL by forcing all Container classes to also include an elements() method which returns a plain Enumeration object. The problem with Java's standard enumerator is that it is a read-only iterator: You cannot use an Enumeration to change container elements. Another weakness is its rigid behavior: An Enumeration enumerates forwards only and sequentially produces every element of a container. You cannot skip elements. All these weaknesses are addressed by the more flexible JGL iterators, which provide get(), put(), and advance() methods. (See method dumpContents() in Listing 3.)

Don't let source code readability suffer!

The combined use of algorithm classes, function classes and, more particularly, predicate classes, has some severe source readability implications. Using algorithm classes generally makes all your iterations vanish from the spot where an algorithm's iteration (if any) "logically" occurs, to be replaced by a single method call. To make matters worse, the use of predicate functions results in hiding conditional statements from where they used to be. If you're not careful, you can turn a previously readable algorithm (expressed in time-honored and pretty-printed

while

s,

for

s,

if

s, and

else

s) into a highly cryptic sequential list of statements relying solely on generic programming-style assignments and method invocations.

While this novel generic programming approach, as embodied by JGL (and copied from C++'s STL), is very powerful and further helps us realize the lofty goal of code reuse, you absolutely must further encapsulate "heavy" generic programming sequences as methods with clear names, to be invoked by higher application logic. These methods will have to be documented much more thoroughly than vanilla logic, where conventional flow control constructs are self-documenting.

On this topic, it would be interesting to see how industry-standard code complexity metrics qualify code where conditionals and iteration are abstracted and encapsulated in classes and in their member methods. My guess is that the use of generic programming will need a reassessment of the algorithms used to calculate code complexity metrics like McCabe's popular cyclomatic complexity metric.

JGL examples

Determining the limits of what is and what isn't possible using JGL will take months (or more) of constant usage; the examples shown next only scratch the surface.

Example 2, FamilyTies.java, shows a code example demonstrating the interchangeable nature of JGL containers. Instances of an Array, HashSet, Queue, SList, DList, and a Deque all get filled with the same mix of objects, in exactly the same way.

Example 3, InCommon.java, shows how generic algorithms can be applied equally to different container types: The code uses the SetOperations class to calculate the intersection between an Array (which encapsulates a native character array) and a Deck. The intersection itself is stored in a doubly linked list. Note in particular how the array adapter additionally creates the illusion of managing objects, whereas the underlying native array contains just primitives.

Example 4, PowerSorts.java, shows how you can powerfully customize any of the JGL algorithms by passing additional function objects. Here I show how the Sorting class can be made to sort a list of e-mail addresses according to the address domains (in ascending or descending order), or sort a list of first names according to whether the name is used for men or women.

Some minor JGL criticism

My main (but still minor) criticism of the JGL is that its overall organization could be more hierarchical. Currently, the JGL is a single Java package containing a rather overwhelming number (100+) of classes and interfaces. Since nearly all classes fall into the very clearly delineated groups discussed in this article, it would be nice if these groups were structured into easy-to-remember Java subpackages -- for example, all Container classes would be grouped together in a jgl.containers package, and all algorithm classes in a jgl.algorithms package, for starters. This would mean extra explicit import statements at the beginning of your source files, but in my opinion, this is beneficial rather than a drag, as it helps to document the outside entities your programs rely on.

Another problem with the library is that if it remains a third-party Java library, any Web applets using the library will incur a rather painful time penalty when loading off the Web due to the large number of small classes JGL uses. This is because currently Java fetches each referenced class using a separate TCP session. It is to be hoped that Sun Microsystems will act swiftly and welcome the JGL into Java's inner sanctum and grant it "core" API status. (If Sun hasn't done this by the time you read this, maybe a bit of e-mail pressure could help things along.) If this happens, the JGL will reside on every Java-compatible client machine and no JGL classes will consume precious Internet bandwidth.

Conclusion

The benchmarks that accompany the JGL 2.0 distribution prove that the library's ADTs performance is very similar to that of the JDK structures: sometimes a bit faster, sometimes a bit slower. JGLs performance therefore, is not primarily to be found at run time, but rather during design and implementation, when its limitless flexibility will allow you to perform better. You will be able to tackle more complex software development faster, and without sacrificing any system flexibility.

ObjectSpace has implemented one of Java's most important third-party class libraries in existence to date, and is giving JGL away free to the explosively growing Java community. As with Java itself, the Java Generic Library borrows heavily from the C++ camp: It takes the best from C++'s STL, while leaving the C++ warts behind. Most C++ programmers today will know of their STL, but few are managing to exploit its potential. This is no surprise since the C++-STL combination is overly complex. On the other side of the fence Java now beckons with its JGL. Just a year and a half after Java's official launch, the language and its APIs remain delightfully simple, elegant, and powerful; ObjectSpace's JGL only manages to strengthen these rare and desirable characteristics.

Laurence Vanhelsuwé is an independent software engineer. Self-taught, he dropped out of college and immediately started a professional career writing arcade games. He has worked on such diverse technologies as X.25 WAN routers, virtual reality flight simulation, Postscript, and real-time digitized video-based traffic analysis. Laurence currently invests most of his time in learning the latest Java APIs and writing books about them. He thinks Java will revolutionize computer science by leveling the computing landscape into one pan-Java playing field.

Learn more about this topic

1 2 Page 2
Page 2 of 2