Data structures and algorithms: A comparative slice and dice -- er, review -- of 5 Java DS&A books

Laurence places 5 books on the chopping block. Find out which books avoid the cut

It's all very good and well being able to use the latest Java APIs (like Swing and InfoBus, for example), but if you don't rely on decent data structures and algorithms (abbreviated as DS&A from this point on) to build your applications around, chances are your end-product will not satisfy the most basic and universal system requirement: time- and space-wise efficiency.

While DS&A can be discussed in purely implementation language-independent terms, these intimately related topics have generally been brought to the programming and student masses in language-specific texts -- Pascal-flavor in the '70s, C-flavor in the '80s, and C++-flavor during most of the '90s. If there's any indicator that more clearly supports the rumors of Java replacing C++, it's the fact that these all-important books are now Java flavored.

Available at the time of writing were the following books that deal with algorithms, data structures and/or abstract data types (ADTs):

  • Abstract Data Types in Java by Michael S. Jenkins (McGraw-Hill)
  • An Introduction to Data Structures and Algorithms with Java by Glenn W. Rowe (Prentice Hall)
  • Data Structures and Algorithms in Java by Michael T. Goodrich and Roberto Tamassia (John Wiley)
  • Data Structures and Problem Solving Using Java by Mark Allen Weiss (Addison-Wesley)
  • Java Algorithms by Scott Robert Ladd (McGraw-Hill)

Here's a table to provide you with a bird's-eye view of each title's main characteristics. The table orders the books by price.

An Introduction to Data Structures and Algorithms with Java Java Algorithms Abstract Data Types in Java Data Structures and Problem Solving Using Java Data Structures and Algorithms in Java
Price (U.S.$)4.004.954.959.951.95

Pages, Chapters

(Appendices)

445, 15 (1)484, 15 (0)291, 12 (2)780, 23 (4)738, 16 (2)
IndexBasicGoodBasicGoodGood
GlossaryNoNoNoNoNo
CD-ROMNoYesYesNoYes*
Authors11112
Listings Density (lines/page)4941525143
Big-Oh Algorithm Analysis?PoorNoNoThoroughVery Thorough
Algorithm Animation Applets?SomeNoNoNoOn Web site
Focus on Abstract vs. Implementation?NoNoNo (!)Yes!Yes
JGL or STL Discussed?NoNoNoNoNo
Math SophisticationNoneUniversity levelNoneHigher secondary levelHigher secondary level
Suitable for University course?NoNoNoYesYes
Overall Score out of 1046688.5

* Strictly speaking Yes, but the CD-ROM content is unrelated to the book's content!

In the absence of tools to calculate the true cost-per-bit equivalent of a book, the Listings Density row gives you an idea of how dense or "aerated" the source listings are. Low lines/page values usually mean unreadable listings and a high page-fill factor, so the higher this value, the better.

The Big-Oh Algorithm Analysis? row indicates how thorough the text is when it comes to mathematically analyzing the runtime performance of a given algorithm.

The Algorithm Animation Applets? row indicates whether the authors grabbed the staring-in-your-face opportunity to enhance the text with animation applets that graphically show algorithms in action. (Judging by how few did grab the opportunity, I wonder if those authors ever saw the JDK's very own SortDemo demonstration applet?)

The Focus on Abstract vs. Implementation? row highlights whether the text draws a clear line between the definition of the abstract data type (ADT) and the near-infinite number of ways to implement that ADT.

The JGL or STL Discussed? row highlights whether another very obvious opportunity was grabbed: the opportunity to take an in-depth look at ObjectSpace's Java Generic Library (JGL), or, at least, give an overview of the philosophy and design behind JGL or JGL's C++ ancestor -- the Standard Template Library. (I didn't include a row that covers the new 1.2 JDK Collections API because this API hasn't been finalized yet, and in any case, none of the books even mention it.)

The Math Sophistication row should give you an idea of how strong your stomach will need to be when it comes time to wade through the respective authors' use of mathematics to analyze algorithms and/or prove theorems relating to a given algorithm.

An Introduction to Data Structures and Algorithms with Java by Glenn W. Rowe (Prentice Hall)

An Introduction to Data Structures and Algorithms with Java

For a book that's squarely aimed at university courses, I found Rowe's An Introduction to Data Structures and Algorithms with Java unacceptably weak. Because educational establishments should teach students sound practices based on conventionally accepted norms, it's disheartening to see a book that implants in students "knowledge" and "techniques" that range from questionable to downright counter-productive. My main criticism is that this book has no intellectual muscle to it: The author seems to think that computer science students cannot handle problems beyond a certain, unrealistically low, level of complexity. This is a fundamentally flawed assumption, and no way to groom the software engineers of tomorrow, who will have to tackle application complexity that will dwarf anything from the present and past.

One of the perceived mental obstacles that the author avoids like the plague is the (simple) math necessary to formally analyze an algorithm's running time characteristics. So classic Big-Oh reasoning goes out the window in favor of an experimental, quantitative approach that seeks to determine the running time Big-Oh signature of an algorithm by interpreting plots that map input size to runtime. The author does this very consciously, as he explains in his Preface:

The efficiency of the algorithm is then deduced from graphs of the results of these experiments. Although such an approach is certainly not rigorous and will probably horrify mathematicians, the author believes that it is a much clearer way of presenting the topic of efficiency to a non-mathematical audience. What is lost in rigour is recouped in understanding.

I wonder then, what Rowe's justification is behind the total lack of separation between an ADT and its possible implementations in his book. In fact, ADT isn't mentioned in the index or, as far as I could detect, neither is this pivotal concept discussed at any length in the text itself. Either formal algorithm analysis or a strong, consistent focus on the ADT concept form the legitimate backbone to any DS&A book. With neither, does this book at least pay attention to the need for accurate, readable, and squeaky-clean code? Not as far as I'm concerned, I'm afraid.

This book was published hot on the heels of its C++ flavor predecessor, Introduction to Data Structures and Algorithms with C++ (Prentice-Hall, 1997, also by Rowe). As a result, large chunks of text were taken verbatim from the earlier C++ text, but unfortunately (surprise, surprise), these do not apply or break down in the context of Java. (And I'm not even thinking about the omnipresent use of the term pointer when reference would be more accurate.) Then there are errors, such as the statement that "it is also necessary to initialize the individual elements of the [Object] array in the constructor. It was not necessary to do this in IntStack, since there we were dealing with primitive data types." Such phrases are comments about the author's Stack class constructor and prove that Rowe didn't understand how Java arrays work when he wrote that first sentence. When it comes to coding, standard Java naming conventions are consistently flouted and (logical) while loops are implemented with for( ; condition; modifier). A simple interface Comparable (to allow sorting of arbitrary items) is defined as:

public interface Comparable {
   public boolean lessThan(Comparable other);
   public boolean equalTo(Comparable other);
}

Why require an equalTo() when java.lang.Object already has an equals() with the right semantics? Worse, many of the implementations of data structures and algorithms are not reusable as-is because they're stuck, embedded in test harness code! Lastly, for a book that was launched in 1998, the author can't be tracking Java developments very closely; many of his exception-handling code uses the old-style (illegal by now) try-catch, which doesn't require blocks.

Java Algorithms by Scott Robert Ladd (McGraw-Hill)

Java Algorithms

This book can't possibly aim for the higher education market because its content thoroughly deviates from standard DS&A texts. Granted, the title only promises that the pages will focus on algorithms, which is a much broader subject matter than "merely" that of algorithms associated with classic data structures (many numerical algorithms, for example, have nothing to do with data structures). Anyway, from the overview I give below, you can judge for yourself which chapters can be considered the proverbial odd ones out, or even completely out of context for this type of book.

Chapter 1 "Sorting Things Out" hits the ground running by flying through coverage (and implementation) of Selection sort, Insertion sort, and Shell sort in just five pages. Almost the entire remainder of the chapter, nearly 20 pages, is devoted to the classic QuickSort. This heavily biased approach gave me the impression that the author judged QuickSort to be the only sorting algorithm worth writing about. It is also in these pages that you get to like or hate one particular aspect of the writing style that permeates this book: The author loves to use the first person at every opportunity ("I did", "I found", "My solution", "My requirements"). Indeed, the thesis of the book often comes across as "Hey, look at my cool library of classes." The unfortunate side effect of his style is that the presented material doesn't seem to have been crafted for, or targeted at, you, the reader, but rather amounts to documentation for some of the author's personal, but admittedly valuable, Java class library.

Chapter 2 "Mathematical Components" begins with "I programmed my first numerical applications in FORTRAN, on a DEC PDP-8 back in the late 1970s" and continues by showing classes for the manipulation of polynomials, complex numbers, complex number polynomials, and the Fast Fourier Transform (FFT). Before finally giving us the PolynomialFFT class, the author sprinkles two pages with mathematical formulas like this one:

I'm sure most JavaWorld readers don't get serious mathitis attacks at the mere sight of such a summation formula, but with N, I, and k left totally up to the reader to "reverse engineer" their meaning, I wondered what the point of it all was. In addition, the FFT is only touched upon to implement a faster-than-normal polynomial multiplier, but without even a hint at any possible applications.

Chapter 3 "Java Strings and Text" is mainly about the Boyer-Moore string searching algorithm, one of the fastest algorithms to do constant (that is, non-pattern) substring searches. (By the way, the shown implementation clearly illustrates why Boyer-Moore isn't used in Java's own String.indexOf(): Because Java relies on Unicode characters for its strings, and because Boyer-Moore relies on a look-up table indexed by character code, a Java implementation needs a LUT with 65536 entries, amounting to 256 kilobytes of table storage alone [using a non-optimized implementation].)

Chapter 4 "Layout Managers" presents a (vertical) VFlowLayout and a ClockLayout. Chapter 5 "High-Performance Containers" lists implementations for a stack, queue, dequeue ("double-ended" queues), heap, and priority queue. All of these implementations are based on Java's native arrays, but do not use array-growing techniques like java.util.Vector does; in other words, these containers have a fixed capacity (so it would be more accurate to state that these are fast containers, because their fixed capacity isn't exactly a high-performance feature!).

Chapter 6 "Tree Structures" discusses binary trees and common operations on binary trees, and presents an implementation. The issue of unbalanced trees leads to a further discussion of self-balancing red-black trees. As is often the case in this book, the text surrounding red-black trees is extremely terse, with most of the implementation left unexplained.

Chapter 7 "Finite State Machines and Evolving Software" combines two very different topics. After a very short introduction on state machines and the implementation of one, Ladd writes about a topic close to my heart: evolution and the use of natural selection principles in the design of adaptable (or learning) software systems. The reader then discovers why FSMs were briefly introduced earlier: Ladd uses an FSM to give a simulated ant a primitive "brain" that is tasked with learning how to follow a trail of food in a virtual environment. The learning is done across ant generations (not by the individual ants) by letting the same forces of evolution act on Ladd's ants as on real, flesh-and-chitin ants. The chapter ends with a brief intro on random numbers and code to make java.lang.Math.random() obsolete.

Chapter 8 "Matrices and Linear Algebra" develops classes to test and manipulate matrices, with a view to solving simultaneous linear equations. Gaussian elimination is briefly explained (and implemented), but the remainder of the chapter tackles the more mathematically advanced "LUP decomposition" approach (lower-upper-permutation).

Chapter 9 "Serialization" is simple and pure, delving into Java's serialization API as a preparation for the following chapters' reliance on this service.

Chapter 10 "Data Compression" presents an object-compression system, based on serializing objects to memory and then compressing the data stream using Huffman encoding. The brief explanation of the Huffman algorithm is made even more opaque by a missing figure that was meant to show one of the core data structures for the algorithm: a trie.

Chapter 11 "Serializing Objects in Random Access Files" builds on the two previous chapters to design and implement an ObjectDatabaseFile class, allowing objects to be stored, retrieved, and deleted as records, a la old-fashioned dBase DBF file. This chapter contains some very poorly typeset hexdumps.

Chapter 12 "Btree Indexed Databases" continues the database theme begun by its preceding chapter, devoting itself to developing Page, PageDatabaseFile, BTreePageFile, and BTreeDatabase classes. Together they allow you to write, delete, and read records specified by arbitrary object keys. Now, compared to complex number polynomial multiplication, that's what I call useful!

The remaining chapters, chapters 13 to 15 ("Where and When," "Stellar Cartography," and "A Star Map Plotter," respectively) are simply the author sharing his passion for astronomy. In his own words:

I've always liked including something "different" in my books -- code that is, perhaps, less general purpose and more interesting than the average algorithm. A recent project caused me to write code for an embedded navigation and astronomy system. Some of that code follows.

And follow it does, sprinkled with lots of stuff on longitudes, latitudes, the Julian calendar, sidereal time, ascension, and declination. If you love trigonometry and/or astronomy, these chapters will be right up your alley.

Abstract Data Types in Java by Michael S. Jenkins (McGraw-Hill)

Abstract Data Types in Java

Although the title of this book prominently mentions ADTs, it completely misses its mark because Jenkins does not clearly draw the crucial distinction between a data structure's abstract interface (its API) and the many possible implementations of that data structure. In fact, this book is all about implementations, and thoroughly ignores the enlightening (and powerful) world of abstract data structures.

Okay, so the title should have read Concrete Data Types in Java instead. Unfortunately, even with an accurate title the book still has a lot of problems, as you will soon see. Chapter 1 "Basic Concepts" starts off with a labored explanation and definition of ADTs, then goes off (on a tangent) to concentrate on Java's dual primitive and reference-based types, before finishing by explaining why we need ADTs. Chapter 2 "Error Handling and Exceptions" has nothing to do with AD: Ts, of course, but is this book's second chapter anyway.

Chapter 3 "Arrays, Vectors and Sorting" is where we get into the realm of classic DS&A stuff. Unfortunately, I repeatedly encountered sentences and explanations that left much to be desired. The section on the QuickSort algorithm, for example, doesn't bode well for the rest of the book; it is about as clear as mud (with the primitive figures not helping one iota).

Chapter 4 "Hash Tables" only confirms the fear that Jenkins is lacking in the rather elementary skill, for a technical author that is, of being able to explain things in a readable, accurate, and concise manner. His introduction on hashtables is of the same caliber as that on QuickSort. Furthermore, his implementation of a hashtable relies on linked lists without first explaining linked lists (in fact, linked lists are not examined until the next chapter).

Because this book is all about implementations, it was rather odd to have chapter 5 "Linked Lists" begin by showing an implementation based on java.util.Vector (a node-free implementation), followed by the classic node-based implementation that gives linked lists their name.

Chapter 7 "Stacks" repeatedly makes the equally weird statement that stacks are "specialized linked lists." This is just one of many examples of the author's complete deviation from established DS&A terminology and use of concepts.

The only redeeming material in Abstract Data Types in Java is its coverage of tree DS&A. The four tail-end chapters are all devoted to increasingly advanced chapters on trees: "Simple Trees," "Binary Trees," "Multi-way Trees," and "B-Trees." Here the explanations and figures become clear enough as to be usable, although too late to save this book.

SUBHEADData Structures and Problem Solving Using Java by Mark Allen Weiss (Addison-Wesley)

Data Structures and Problem Solving Using Java

This is another textbook targeted at university-level courses. And as one would expect, a lot of thought and care has gone into its pedagogical approach. Weiss really goes whole hog when separating ADT from implementation by leaving the implementations of covered ADTs to the tail-section of the book, "Part IV: Implementations" (starting on page 411, which is past the half-way mark for this book). Parts I, II, and III ("Tour of Java," "Algorithms and Building Blocks," and "Applications," respectively) attempt to instill the commendable habits of abstract thinking and reuse, staying clear of the classic computer science education flaw of letting students reinvent wheels all the time, starting from atoms.

Part I is the only significant section that I felt didn't belong in this otherwise high-quality book. Its chapters, chapters 1 through 4 ("Primitive Java," "References," "Objects and Classes," and "Inheritance," respectively) don't have much to do with the classically delineated subject of DS&A, being more an introduction to Java and object-oriented principles. Regardless, these first chapters set the routine for the book. Almost every chapter ends with the following: a summary, "Objects of the Game" (which amounts to a little end-of-chapter glossary), "On the Internet" (this book doesn't come with a CD-ROM, so all listings are available on a Web site associated with the book), exercises, and references.

Part II "Algorithms and Building Blocks" is really the logical beginning of Data Structures and Problem Solving Using Java. Its chapters cover all the abstract fundamentals. Chapter 5, "Algorithm Analysis," is an excellent introduction to Big-Oh analysis, with examples of O(n)-, O(n2)- and O(n3)-running algorithms. Possibly more enlightening still is an example problem for which the author shows increasingly sophisticated solutions that run in O(n3) down to O(n log n).

Chapter 6 "Data Structures" crams an introduction to all the mainstream AD: Ts (stacks, queues, linked lists, trees, binary trees, hashtables, and priority queues) by showing their respective Java interfaces and example applications.

Chapter 7 "Recursion" examines recursion as one of the fundamental algorithm design tools, and shows how divide-and-conquer, dynamic programming, and backtracking are all often-used techniques that rely on recursion.

Chapter 8 "Sorting Algorithms" tackles the various sorting algorithms (Shell, Merge and QuickSort), complete with detailed Big-Oh analysis. QuickSort is treated in extra detail (10+ pages).

Part III "Applications" is a mixed bag of example applications of DS&A. The examples range in complexity from a simple balanced-symbol checker (to analyze Java source files) to a human-beating Tic-Tac-Toe game that relies on trees and "Alpha-beta pruning."

Chapter 14 "Graphs and Paths" is the last chapter in part III and clearly doesn't belong here. The complexity, volume, and theoretical nature of graph DS&A material is such that this chapter should have been part of part V (see below).

Part IV "Implementations" consists of six chapters (15-20) which basically implement the fundamental data structures presented as ADTs in chapter 6. The chapters on trees (18 and 19, "Trees" and "Binary Search Trees") are very thorough and cover the usual bases: post-, in-, pre-, and level-order tree traversal; AVL-, red-black-, AA-, and B-Trees.

Finally, part V "Advanced Data Structures" consists of the chapters "Splay Trees," "Merging Priority Queues," and "The Disjoint Set Class."

The appendices amount to some 70 pages, but do not, in my opinion, add any value to the book. "Java Platforms," "Operators," "Some Library Routines," (dealing with java.lang, java.io, and java.util classes, nothing more exciting) and "Graphical User Interfaces" (talking about AWT and applets) tasted more like the kind of book filler that lesser publishers are all too happy to pad their books with, but rarely, in my experience anyway, does Addison-Wesley stoop to this level.

By the way, when a book's table of contents is so detailed as to need nearly 20 pages to present it (as is the case here), a contents-at-a-glance section would be a nice feature. Somehow I doubt publishers will take note, but I can try.

SUBHEADData Structures and Algorithms in Java by Michael T. Goodrich, Roberto Tamassia (John Wiley)

Data Structures and Algorithms in Java

Goodrich and Tamassia are excellent, highly knowledgeable writers who, with their Data Structures and Algorithms in Java, have created a first-class, university-level text. The writing is crystal clear, the explanations are rock solid, the chosen examples are enlightening, and the high-quality figures complement the text perfectly. The typesetting too, is of very high quality. The listings, for example, use syntax-coloring to thoroughly enhance source code (one unfortunate, but minor, mistake is to have chosen a proportional font for the code, sending originally perfectly aligned end-of-line comments out of vertical alignment). In short, this book has quality stamped all over it.

More importantly, the quality equally pervades the intellectual material between its (hardback) covers. The authors use a fairly consistent approach to present their exploration of classic, fundamental data structures and their associated algorithms: They define ADTs using plain, implementation language-independent descriptions, give pseudo-code for algorithms, then define Java interfaces to reflect the ADT definition, and last but not least, present one or more concrete Java implementations of the ADT. All along, the listings remain in their rightful spot as supporting material; the book's prose and many figures being its solid backbone (many computer books get their priorities backward by having the text in a supporting role for the listings).

The text goes to great lengths to use the correct terminology, as used by the academic DS&A community (who, let's face it, are way ahead of us commercial types when it comes to understanding DS&A). Key terms are consistently highlighted in bold, so you can just browse the pages and instantly know what the text is currently focusing on (Mark Weiss's book uses italic for the same purpose, but his text's italicized words easily blend in with the surrounding text, losing the intended effect).

Data Structures and Algorithms in Java isn't subdivided into parts, although the preface provides a guiding map of chapter dependencies, which groups chapters into three sets of increasingly advanced material:

  • Chapters 1 to 6: "Design Principles," "Analysis Tools," "Stacks, Queues and Linked Lists," "Sequences," "Trees," and "Priority Queues"

  • Chapters 7 to 11: "Dictionaries," "Sorting, Sets, and Selection," "Graphs," "Weighted Graphs," and "Strings"

  • Chapters 12 to 16: "Fundamental Techniques," "Balanced Search Trees," "Multi-Dimensional Search Trees," "Computational Geometry," and "Caches and Disks"

Expect the content of those chapters with unambiguous titles to be very thorough and accurate.

Very briefly, I'll just say a couple of words on those chapters with slightly ambiguous titles. "Design Principles" deals mainly with object-oriented design principles, and only to a much lesser extent DS&A design principles. "Analysis Tools" introduces the theory, methodology, and practice of algorithm running time analysis using the classic Big-Oh technique. "Fundamental Techniques" covers some of the recurring approaches in algorithm design, namely divide-and-conquer, amortization, dynamic programming, and the "greedy method."

Purchase these books from

Computer Literacy

Abstract Data Types in Java, by Michael S. Jenkins

(McGraw-Hill) - ISBN: 0079132707

An Introduction to Data Structures and Algorithms with Java, by Glenn W. Rowe

(Prentice Hall) - ISBN: 0138577498

Data Structures and Algorithms in Java, by Michael T. Goodrich and Roberto Tamassia

(John Wiley) - ISBN: 0471193089

Data Structures and Problem Solving Using Java, by Mark Allen Weiss

(Addison-Wesley) - ISBN: 0201549913

Java Algorithms, by Scott Robert Ladd

(McGraw-Hill) - ISBN: 0079136966

Mathematical reasoning and formulae are almost everywhere in this book. Nevertheless, using my own, admittedly limited, mathematical education (no university), I found the way the authors use (and explain) math easy to follow, provided I didn't let the concentration flag. Anyone with university-level math should have no problem with Goodrich and Tamassia's math.

Lastly, every chapter ends with a meaty exercises section, helpfully subdivided into "Reinforcement," "Creativity," and "Project" exercises.

Because I failed to find any faults in the text worth mentioning (but check the book's online errata!), I am left with just a single, minor criticism: Because quite a few algorithms and data structures are not given as complete Java implementations (many are left as exercises for the reader, who always has the pseudo-code at hand to realistically solve these exercises), I found the CD-ROM extremely disappointing (it contains nothing more than a trial version of Microsoft's J++).

"So, dear Java Book Reviewer, which book do you suggest I buy?"

At a 2 list price, the book I would enthusiastically suggest you go for isn't cheap. But for once, with Wiley's Data Structures and Algorithms in Java, price does reflect quality. A close second is Addison-Wesley's Data Structures and Problem Solving Using Java, which, by coincidence, also is the second most expensive. If academic texts really aren't your cup of tea, you might find any of the other books of some value. McGraw-Hill's book by Scott Robert Ladd (Java Algorithms) in particular contains lots of good and diverse code, although the text itself is poor.

Laurence Vanhelsuwé is an independent software engineer. Self-taught, he dropped out of college and immediately started a professional career writing arcade games. He has worked on such diverse technologies as X.25 WAN routers, virtual reality flight simulation, Postscript, and real-time digitized video-based traffic analysis.

Learn more about this topic

  • Read all of Laurence's other great reviews. http://www.javaworld.com/common/jw-ti-bookreviews.html
  • Objectspace Inc.'s home page for their Java Generic Library (JGL). If you don't care about DS&A internals, but just need some free, industrial grade data structures and algorithms to use in your projects, then this has to be your first stop on the Web. http://www.objectspace.com/jgl/
  • Gamelan's Java resource directory contains numerous DS&A related resources. Use their search engine to home in on the goodies. http://www.developer.com/directories/pages/dir.java.html
  • John Wiley & Sons's online support pages for Goodrich and Tamassia's Data Structures and Algorithms in Java http://www.wiley.com/college/cs2java/
  • Addison-Wesley's product catalog entry for Weiss's Data Structures and Problem Solving Using Java. http://cseng.awl.com/bookdetail.qry?ISBN=0-201-54991-3&ptype=0
  • Glenn Rowe's books page, on which you can find links to the source code for his An Introduction to Data Structures and Algorithms with Java. http://alife.mic.dundee.ac.uk/growe/Books/index.html
Join the discussion
Be the first to comment on this article. Our Commenting Policies
See more