Java performance programming, Part 3: Managing collections

See how collections alternatives measure up in performance, and find out how to get the most out of each type

1 2 3 Page 3
Page 3 of 3

The IntArray version of the test uses a collection of int values based on the first type-specific collection example code, while IntBaseArray uses the version obtained by subclassing our ArrayBase class. Both of these are much slower than the simple array case, but there's not much performance difference between the two, showing that we haven't hurt performance by going to the common base class.

The IntegerArray test uses the example code for a type-specific collection of Integer values from this article, while ObjectArray uses code that is basically the same, but has been altered to use Object values instead. The latter case is essentially the same as ArrayList from the collections framework discussed in the first part of this article, requiring casting to convert returned reference values from Object to Integer.

These two versions of the test show the added overhead of working with objects rather than primitive values (even when only the added time needed to access a value from an object, and not any object-creation time, is included). Some JVMs do better than others at handling these cases, but they all experience a substantial performance drop when moving from primitive values to objects and a similar performance drop when the overhead of casting objects to a specific type is added in.

The final version of this test uses Integer values with our old friend Vector, adding the overhead of synchronization to each access. This is a topic we'll discuss in more detail next month, but for now it's interesting to see the extreme performance hit this causes with the standard JVMs.

Performance test 2: Add elements to a collection

Table 2 shows how these same approaches perform when elements are added to a collection. This test repeatedly fills a collection with values, discarding the existing set of values each time it begins with a new set.

 JRE 1.1.8 (Sun)JRE 1.1.8 (IBM)JRE 1.2.2 (Classic)JRE 1.2.2 (HotSpot 2.0 beta)
int[]

5

3

9

8

IntArray

17

15

19

25

IntBaseArray

17

17

19

23

IntegerArray

120

56

106

96

ObjectArray

125

63

106

96

Vector

184

60

131

104

Table 2. Performance in adding to a collection (typical time in seconds, 200,000 values, 200 iterations)

The performance here largely matches the results in the first set of tests, with the exception of the larger difference between the collections of primitive types and those of objects. This difference is due to the relatively long object-creation time, since the objects being added to the collections are created as they're added. It demonstrates the advantage of working with a collection of primitive types as opposed to objects when elements are continually being added and deleted from the collection.

Conclusion

The collections framework added in JDK 1.2 goes a long way toward eliminating the performance penalties of the earlier collections support. There are still some ways to improve performance further, though, in cases in which code executes frequently enough to make the effort worthwhile.

One simple approach (by far the fastest overall, for collections that are fixed in size after initial construction) is to convert your collection to a type-specific array. The collections framework includes direct support for doing just this, and with some of our examples we've seen how support for this can easily be added to the JDK 1.1.x java.util.Vector class as well.

Another alternative is to use a type-specific collection. This offers especially large performance gains when working with primitive values, since these cannot be used without object wrappers in the standard collections. With a little work on the class structure it's possible to eliminate most of the duplicated code you'd otherwise need for type-specific variations of a collection, so this can be a reasonable approach to take if you frequently work with collections of a specific type and need the best performance.

December's article showed the double-barrelled performance and clarity costs of casting in your code, and gave some structural techniques to use for avoiding it. In this article we've gone on to show the benefits of arrays and type-specific collections, which again keep your code cleaner and more efficient while eliminating possible runtime errors due to casting.

Next month, we'll take on the performance costs of threading in your programs -- investigating the costs associated with various types of operations and outlining some ways to minimize the price you pay for the convenience of multithreaded programming in Java.

Dennis Sosnoski is a software professional with over 25 years experience in developing for a wide range of platforms and application areas. An early adoptee of C++, he was just as quick to convert to using Java when it became available, and for the last three years it's been his preferred development platform. Dennis is the President and Lead Consultant of Sosnoski Software Solutions, Inc., a Java consulting and contract software development firm that's not a monopoly even though it's headquartered in Redmond, Washington.

Learn more about this topic

Related:
1 2 3 Page 3
Page 3 of 3