Iterating over collections in Java

Use Java 8's forEach() method and Streams API to fine-tune and parallelize the behavior of Java iterators

Anytime you have a collection of things you will need some mechanism to systematically step though the items in that collection. As an everyday example, consider the television remote control, which lets us iterate through various television channels. Similarly, in the programming world, we need a mechanism to systematically iterate through a collection of software objects. Java includes various mechanisms for iteration, including index (for iterating over an array), cursor (for iterating over the results of a database query), enumeration (in early versions of Java), and iterator (in more recent versions of Java).

The Iterator pattern

An iterator is a mechanism that permits all elements of a collection to be accessed sequentially, with some operation being performed on each element. Essentually, an iterator provides a means of "looping" over an encapsulated collection of objects. Examples of using iterators include

  • Visit each file in a directory (aka folder) and display its name.
  • Visit each node in a graph and determine whether it is reachable from a given node.
  • Visit each customer in a queue (for instance, simulating a line in a bank) and find out how long he or she has been waiting.
  • Visit each node in a compiler's abstract syntax tree (which is produced by the parser) and perform semantic checking or code generation. (You could also use the Visitor pattern in this context.)

Certain principles hold for the use of iterators: In general, you should be able to have multiple traversals in progress at the same time; that is, an iterator should allow for the concept of nested looping. An iterator should also be nondestructive in the sense that the act of iteration should not, by itself, change the collection. Of course the operation being performed on the elements in a collection could possibly change some of the elements. It might also be possible for an iterator to support removing an element from a collection or inserting a new element at a particular point in the collection, but such changes should be explicit within the program and not a byproduct of the iteration. In some cases, you will also need to have iterators with different traversal methods; for instance, preorder and postorder traversal of a tree, or depth-first and breadth-first traversal of a graph.

Iterators and the Gang of Four design patterns

According to the Gang of Four (see below), the Iterator design pattern is a behavioral pattern, whose key idea is "to take the responsibility for access and traversal out of the list [ed. think collection] object and put it into an iterator object." This article isn't as much about the Iterator pattern as it is about how iterators are used in practice. To fully cover the pattern would require discussing how an iterator would be designed, participants (objects and classes) in the design, possible alternative designs, and tradeoffs of different design alternatives. I'd rather focus on how iterators are used in practice, but I'll point you to a few resources for investigating the Iterator pattern and design patterns generally:

  • Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley Professional, 1994) written by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (also known as the Gang of Four or simply GoF) is the definitive resource for learning about design patterns. Although the book was first published in 1994, it remains a classic, as evidenced by the fact that there have been more than 40 printings.
  • Bob Tarr, a lecturer at University of Maryland Baltimore County, has an excellent set of slides for his course on design patterns, including his introduction to the Iterator pattern.
  • David Geary's JavaWorld series Java Design Patterns introduces many of the Gang of Four design patterns, including the Singleton, Observer, and Composite patterns. Also on JavaWorld, Jeff Friesen's more recent three-part overview of design patterns includes a guide to the GoF patterns.

Active iterators vs passive iterators

There are two general approaches to implementing an iterator depending on who controls the iteration. For an active iterator (also known as explicit iterator or external iterator), the client controls the iteration in the sense that the client creates the iterator, tells it when to advance to the next element, tests to see if every element has been visited, and so on. This approach is common in languages like C++, and it is the approach that receives the most attention in the GoF book. Although iterators in Java have taken different forms, using an active iterator was essentially the only viable option prior to Java 8.

For a passive iterator (also known as an implicit iterator, internal iterator, or callback iterator), the iterator itself controls the iteration. The client essentially says to the iterator, "perform this operation on the elements in the collection." This approach is common in languages like LISP that provide anonymous functions or closures. With the release of Java 8, this approach to iteration is now a reasonable alternative for Java programmers.

To illustrate the various approaches to iteration in Java, I need an example of a collection and something that needs to be done with its elements. For the initial part of this article I'll use a collection of strings representing names of things. For each name in the collection, I will simply print its value to standard output. These basic ideas are easily extended to collections of more complicated objects (such as employees), and where the processing for each object is a little more involved (like giving each highly rated employee a 4.5 percent raise).

Iteration with the Enumeration class

In Java 1.0 and 1.1, the two primary collection classes were Vector and Hashtable, and the Iterator design pattern was implemented in a class called Enumeration. In retrospect this was a bad name for the class. Do not confuse the class Enumeration with the concept of enum types, which didn't appear until Java 5. Today both Vector and Hashtable are generic classes, but back then generics were not part of the Java language. The code to process a vector of strings using Enumeration would look something like Listing 1.

Listing 1. Using enumeration to iterate over a vector of strings


Vector names = new Vector();

// ... add some names to the collection

Enumeration e = names.elements();
while (e.hasMoreElements())
  {
    String name = (String) e.nextElement();
    System.out.println(name);
  }

Iteration with the Iterator class

Java 1.2 introduced the collection classes that we all know and love, and the Iterator design pattern was implemented in a class appropriately named Iterator. Because we didn't yet have generics in Java 1.2, casting an object returned from an Iterator was still necessary. For Java versions 1.2 through 1.4, iterating over a list of strings might resemble Listing 2.

Listing 2. Using an Iterator to iterate over a list of strings


List names = new LinkedList();

// ... add some names to the collection

Iterator i = names.iterator();
while (i.hasNext())
  {
    String name = (String) i.next();
    System.out.println(name);
  }

Iteration with generics and the enhanced for-loop

Java 5 gave us generics, the interface Iterable, and the enhanced for-loop. The enhanced for-loop is one of my all-time-favorite small additions to Java. The creation of the iterator and calls to its hasNext() and next() methods are not expressed explicitly in the code, but they still take place behind the scenes. Thus, even though the code is more compact, we are still using an active iterator. Using Java 5, our example would look something like what you see in Listing 3.

Listing 3. Using generics and the enhanced for-loop to iterate over a list of strings


List<String> names = new LinkedList<String>();

// ... add some names to the collection

for (String name : names)
    System.out.println(name);

Java 7 gave us the diamond operator, which reduces the verbosity of generics. Gone were the days of having to repeat the type used to instantiate the generic class after invoking the new operator! In Java 7 we could simplify the first line in Listing 3 above to the following:


List<String> names = new LinkedList<>();

Iteration with the forEach() method

Before delving into Java 8 iteration features, let's reflect on what's wrong with the code shown in the previous listings–which is, well, nothing really. There are millions of lines of Java code in currently deployed applications that use active iterators similar to those shown in my listings. Java 8 simply provides additional capabilities and new ways of performing iteration. For some scenarios, the new ways can be better.

The major new features in Java 8 center on lambda expressions, along with related features such as streams, method references, and functional interfaces. These new features in Java 8 allow us to seriously consider using passive iterators instead of the more conventional active iterators. In particular, the Iterable interface provides a passive iterator in the form of a default method called forEach().

A default method, another new feature in Java 8, is a method in an interface with a default implementation. In this case, the forEach() method is actually implemented using an active iterator in a manner similar to what you saw in Listing 3.

Collection classes that implement Iterable (for example, all list and set classes) now have a forEach() method. This method takes a single parameter that is a functional interface. Therefore the actual parameter passed to the forEach() method is a candidate for a lambda expression. Using the features of Java 8, our running example would evolve to the form shown in Listing 4.

Listing 4. Iteration in Java 8 using the forEach() method


List<String> names = new LinkedList<>();

// ... add some names to the collection

names.forEach(name -> System.out.println(name));

Note the difference between the passive iterator in Listing 4 and the active iterator in the previous three listings. In the first three listings, the loop structure controls the iteration, and during each pass through the loop, an object is retrieved from the list and then printed. In Listing 4, there is no explicit loop. We simply tell the forEach() method what to do with the objects in the list — in this case we simply print the object. Control of the iteration resides within the forEach() method.

Iteration with Java streams

Now let's consider doing something slightly more involved than simply printing the names in our list. Suppose, for example, that we want to count the number of names that begin with the letter A. We could implement the more complicated logic as part of the lambda expression, or we could use Java 8's new Stream API. Let's take the latter approach.

1 2 Page 1
Page 1 of 2