Data structures and algorithms: A comparative slice and dice -- er, review -- of 5 Java DS&A books

Laurence places 5 books on the chopping block. Find out which books avoid the cut

It's all very good and well being able to use the latest Java APIs (like Swing and InfoBus, for example), but if you don't rely on decent data structures and algorithms (abbreviated as DS&A from this point on) to build your applications around, chances are your end-product will not satisfy the most basic and universal system requirement: time- and space-wise efficiency.

While DS&A can be discussed in purely implementation language-independent terms, these intimately related topics have generally been brought to the programming and student masses in language-specific texts -- Pascal-flavor in the '70s, C-flavor in the '80s, and C++-flavor during most of the '90s. If there's any indicator that more clearly supports the rumors of Java replacing C++, it's the fact that these all-important books are now Java flavored.

Available at the time of writing were the following books that deal with algorithms, data structures and/or abstract data types (ADTs):

  • Abstract Data Types in Java by Michael S. Jenkins (McGraw-Hill)
  • An Introduction to Data Structures and Algorithms with Java by Glenn W. Rowe (Prentice Hall)
  • Data Structures and Algorithms in Java by Michael T. Goodrich and Roberto Tamassia (John Wiley)
  • Data Structures and Problem Solving Using Java by Mark Allen Weiss (Addison-Wesley)
  • Java Algorithms by Scott Robert Ladd (McGraw-Hill)

Here's a table to provide you with a bird's-eye view of each title's main characteristics. The table orders the books by price.

An Introduction to Data Structures and Algorithms with Java Java Algorithms Abstract Data Types in Java Data Structures and Problem Solving Using Java Data Structures and Algorithms in Java
Price (U.S.$)4.004.954.959.951.95

Pages, Chapters

(Appendices)

445, 15 (1)484, 15 (0)291, 12 (2)780, 23 (4)738, 16 (2)
IndexBasicGoodBasicGoodGood
GlossaryNoNoNoNoNo
CD-ROMNoYesYesNoYes*
Authors11112
Listings Density (lines/page)4941525143
Big-Oh Algorithm Analysis?PoorNoNoThoroughVery Thorough
Algorithm Animation Applets?SomeNoNoNoOn Web site
Focus on Abstract vs. Implementation?NoNoNo (!)Yes!Yes
JGL or STL Discussed?NoNoNoNoNo
Math SophisticationNoneUniversity levelNoneHigher secondary levelHigher secondary level
Suitable for University course?NoNoNoYesYes
Overall Score out of 1046688.5

* Strictly speaking Yes, but the CD-ROM content is unrelated to the book's content!

In the absence of tools to calculate the true cost-per-bit equivalent of a book, the Listings Density row gives you an idea of how dense or "aerated" the source listings are. Low lines/page values usually mean unreadable listings and a high page-fill factor, so the higher this value, the better.

The Big-Oh Algorithm Analysis? row indicates how thorough the text is when it comes to mathematically analyzing the runtime performance of a given algorithm.

The Algorithm Animation Applets? row indicates whether the authors grabbed the staring-in-your-face opportunity to enhance the text with animation applets that graphically show algorithms in action. (Judging by how few did grab the opportunity, I wonder if those authors ever saw the JDK's very own SortDemo demonstration applet?)

The Focus on Abstract vs. Implementation? row highlights whether the text draws a clear line between the definition of the abstract data type (ADT) and the near-infinite number of ways to implement that ADT.

The JGL or STL Discussed? row highlights whether another very obvious opportunity was grabbed: the opportunity to take an in-depth look at ObjectSpace's Java Generic Library (JGL), or, at least, give an overview of the philosophy and design behind JGL or JGL's C++ ancestor -- the Standard Template Library. (I didn't include a row that covers the new 1.2 JDK Collections API because this API hasn't been finalized yet, and in any case, none of the books even mention it.)

The Math Sophistication row should give you an idea of how strong your stomach will need to be when it comes time to wade through the respective authors' use of mathematics to analyze algorithms and/or prove theorems relating to a given algorithm.

An Introduction to Data Structures and Algorithms with Java by Glenn W. Rowe (Prentice Hall)

An Introduction to Data Structures and Algorithms with Java

For a book that's squarely aimed at university courses, I found Rowe's An Introduction to Data Structures and Algorithms with Java unacceptably weak. Because educational establishments should teach students sound practices based on conventionally accepted norms, it's disheartening to see a book that implants in students "knowledge" and "techniques" that range from questionable to downright counter-productive. My main criticism is that this book has no intellectual muscle to it: The author seems to think that computer science students cannot handle problems beyond a certain, unrealistically low, level of complexity. This is a fundamentally flawed assumption, and no way to groom the software engineers of tomorrow, who will have to tackle application complexity that will dwarf anything from the present and past.

One of the perceived mental obstacles that the author avoids like the plague is the (simple) math necessary to formally analyze an algorithm's running time characteristics. So classic Big-Oh reasoning goes out the window in favor of an experimental, quantitative approach that seeks to determine the running time Big-Oh signature of an algorithm by interpreting plots that map input size to runtime. The author does this very consciously, as he explains in his Preface:

The efficiency of the algorithm is then deduced from graphs of the results of these experiments. Although such an approach is certainly not rigorous and will probably horrify mathematicians, the author believes that it is a much clearer way of presenting the topic of efficiency to a non-mathematical audience. What is lost in rigour is recouped in understanding.

I wonder then, what Rowe's justification is behind the total lack of separation between an ADT and its possible implementations in his book. In fact, ADT isn't mentioned in the index or, as far as I could detect, neither is this pivotal concept discussed at any length in the text itself. Either formal algorithm analysis or a strong, consistent focus on the ADT concept form the legitimate backbone to any DS&A book. With neither, does this book at least pay attention to the need for accurate, readable, and squeaky-clean code? Not as far as I'm concerned, I'm afraid.

This book was published hot on the heels of its C++ flavor predecessor, Introduction to Data Structures and Algorithms with C++ (Prentice-Hall, 1997, also by Rowe). As a result, large chunks of text were taken verbatim from the earlier C++ text, but unfortunately (surprise, surprise), these do not apply or break down in the context of Java. (And I'm not even thinking about the omnipresent use of the term pointer when reference would be more accurate.) Then there are errors, such as the statement that "it is also necessary to initialize the individual elements of the [Object] array in the constructor. It was not necessary to do this in IntStack, since there we were dealing with primitive data types." Such phrases are comments about the author's Stack class constructor and prove that Rowe didn't understand how Java arrays work when he wrote that first sentence. When it comes to coding, standard Java naming conventions are consistently flouted and (logical) while loops are implemented with for( ; condition; modifier). A simple interface Comparable (to allow sorting of arbitrary items) is defined as:

public interface Comparable {
   public boolean lessThan(Comparable other);
   public boolean equalTo(Comparable other);
}

Why require an equalTo() when java.lang.Object already has an equals() with the right semantics? Worse, many of the implementations of data structures and algorithms are not reusable as-is because they're stuck, embedded in test harness code! Lastly, for a book that was launched in 1998, the author can't be tracking Java developments very closely; many of his exception-handling code uses the old-style (illegal by now) try-catch, which doesn't require blocks.

Java Algorithms by Scott Robert Ladd (McGraw-Hill)

Java Algorithms

This book can't possibly aim for the higher education market because its content thoroughly deviates from standard DS&A texts. Granted, the title only promises that the pages will focus on algorithms, which is a much broader subject matter than "merely" that of algorithms associated with classic data structures (many numerical algorithms, for example, have nothing to do with data structures). Anyway, from the overview I give below, you can judge for yourself which chapters can be considered the proverbial odd ones out, or even completely out of context for this type of book.

Chapter 1 "Sorting Things Out" hits the ground running by flying through coverage (and implementation) of Selection sort, Insertion sort, and Shell sort in just five pages. Almost the entire remainder of the chapter, nearly 20 pages, is devoted to the classic QuickSort. This heavily biased approach gave me the impression that the author judged QuickSort to be the only sorting algorithm worth writing about. It is also in these pages that you get to like or hate one particular aspect of the writing style that permeates this book: The author loves to use the first person at every opportunity ("I did", "I found", "My solution", "My requirements"). Indeed, the thesis of the book often comes across as "Hey, look at my cool library of classes." The unfortunate side effect of his style is that the presented material doesn't seem to have been crafted for, or targeted at, you, the reader, but rather amounts to documentation for some of the author's personal, but admittedly valuable, Java class library.

Chapter 2 "Mathematical Components" begins with "I programmed my first numerical applications in FORTRAN, on a DEC PDP-8 back in the late 1970s" and continues by showing classes for the manipulation of polynomials, complex numbers, complex number polynomials, and the Fast Fourier Transform (FFT). Before finally giving us the PolynomialFFT class, the author sprinkles two pages with mathematical formulas like this one:

I'm sure most JavaWorld readers don't get serious mathitis attacks at the mere sight of such a summation formula, but with N, I, and k left totally up to the reader to "reverse engineer" their meaning, I wondered what the point of it all was. In addition, the FFT is only touched upon to implement a faster-than-normal polynomial multiplier, but without even a hint at any possible applications.

Chapter 3 "Java Strings and Text" is mainly about the Boyer-Moore string searching algorithm, one of the fastest algorithms to do constant (that is, non-pattern) substring searches. (By the way, the shown implementation clearly illustrates why Boyer-Moore isn't used in Java's own String.indexOf(): Because Java relies on Unicode characters for its strings, and because Boyer-Moore relies on a look-up table indexed by character code, a Java implementation needs a LUT with 65536 entries, amounting to 256 kilobytes of table storage alone [using a non-optimized implementation].)

Chapter 4 "Layout Managers" presents a (vertical) VFlowLayout and a ClockLayout. Chapter 5 "High-Performance Containers" lists implementations for a stack, queue, dequeue ("double-ended" queues), heap, and priority queue. All of these implementations are based on Java's native arrays, but do not use array-growing techniques like java.util.Vector does; in other words, these containers have a fixed capacity (so it would be more accurate to state that these are fast containers, because their fixed capacity isn't exactly a high-performance feature!).

Chapter 6 "Tree Structures" discusses binary trees and common operations on binary trees, and presents an implementation. The issue of unbalanced trees leads to a further discussion of self-balancing red-black trees. As is often the case in this book, the text surrounding red-black trees is extremely terse, with most of the implementation left unexplained.

Chapter 7 "Finite State Machines and Evolving Software" combines two very different topics. After a very short introduction on state machines and the implementation of one, Ladd writes about a topic close to my heart: evolution and the use of natural selection principles in the design of adaptable (or learning) software systems. The reader then discovers why FSMs were briefly introduced earlier: Ladd uses an FSM to give a simulated ant a primitive "brain" that is tasked with learning how to follow a trail of food in a virtual environment. The learning is done across ant generations (not by the individual ants) by letting the same forces of evolution act on Ladd's ants as on real, flesh-and-chitin ants. The chapter ends with a brief intro on random numbers and code to make java.lang.Math.random() obsolete.

Chapter 8 "Matrices and Linear Algebra" develops classes to test and manipulate matrices, with a view to solving simultaneous linear equations. Gaussian elimination is briefly explained (and implemented), but the remainder of the chapter tackles the more mathematically advanced "LUP decomposition" approach (lower-upper-permutation).

Chapter 9 "Serialization" is simple and pure, delving into Java's serialization API as a preparation for the following chapters' reliance on this service.

1 2 3 Page
Recommended
Join the discussion
Be the first to comment on this article. Our Commenting Policies
See more