Java performance programming, Part 1: Smart object-management saves the day

Learn how to reduce program overhead and improve performance by controlling object creation and garbage collection

Objects are a powerful software engineering construct, and Java uses them extensively. In fact, it encourages the use of objects so much that developers sometimes forget the costs behind the construct. The result can be object churn, a program state in which most of your processor time is soaked up by repeatedly creating and then garbage collecting objects.

This is the first in a series of articles focused on performance issues in Java. In this series, we'll examine a number of areas in which Java performance can be less than ideal, and provide techniques for bypassing many of these performance roadblocks. Actual timing measurements will be used throughout to demonstrate the performance improvements possible with the right coding techniques.

This month, we'll take a look at the issue of object management in Java. Since Java is often in competition with C/C++ as a language choice for implementing applications, we'll start out by reviewing the differences in how these languages manage the allocation and deallocation of objects, and examine what impact these differences have on performance. In the remainder of the article, we'll look at three ways to reduce the amount of object churn in your Java programs.

Java memory management

Java's simplified memory management is one of the key features that appeals to developers with backgrounds in languages such as C/C++. In contrast with the explicit deallocation required by C/C++, Java lets you allocate objects as necessary and trust that they'll be reclaimed and recycled by the JVM when they're no longer needed. The work required to make this happen goes on behind the scenes, in the garbage collection process.

Garbage collection has been used for memory management in programming languages since the days of early Lisp systems in the 1960s. The basic principle of garbage collection is the same in all cases: identify objects that are no longer in use by the program, and recycle the memory used by these objects to create new ones.

JVMs generally use a reachability analysis to identify the objects that are in use, then recycle all the other objects. This starts with a base set of variables that the program uses directly, including object references in local or argument variables on the method call stack of every active thread, and in static variables of loaded classes. Each object that one of these variables references is added to the set of reachable objects. Next, each object that a member variable references in one of these objects is also added to the reachable set. This process continues until closure is obtained; in the end, every object referenced by any object in the reachable set is also in the reachable set. Any objects that are not in the reachable set are by definition not in use by the program, so they can safely be recycled.
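The closure described above can be sketched as a simple worklist traversal. The class below is our own illustration, not JVM internals: integer IDs stand in for objects, a map stands in for each object's outgoing references, and the roots stand in for stack and static variables.

```java
import java.util.*;

// Minimal sketch of the reachability closure a collector computes.
// Each "object" is an integer ID with a list of outgoing references.
public class ReachabilitySketch {
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs,
                                  Collection<Integer> roots) {
        // Start with the base set: everything directly referenced
        // from the roots (stack and static variables).
        Set<Integer> reached = new HashSet<>(roots);
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int obj = pending.pop();
            // Every object referenced by a reachable object is reachable.
            for (int ref : refs.getOrDefault(obj, List.of())) {
                if (reached.add(ref)) {
                    pending.push(ref);
                }
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        // Objects 1..5; 1 and 2 are roots, 5 is referenced by nothing live.
        Map<Integer, List<Integer>> refs = Map.of(
            1, List.of(3),
            3, List.of(4),
            5, List.of(4));   // 5 points at 4 but is itself unreachable
        Set<Integer> live = reachable(refs, List.of(1, 2));
        System.out.println(live);   // 5 is absent, so it can be recycled
    }
}
```

Everything outside the returned set is, by definition, safe to recycle -- even object 5, which holds a reference to a live object but is not itself referenced by anything reachable.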

The developer generally doesn't need to be directly involved in this garbage collection process. Objects drop out of the reachable set and become eligible for recycling as they're replaced with other objects, or as methods return and their variables are dropped from the calling thread's stack. The JVM runs garbage collection periodically, either when it can, because the program threads are waiting for some external event, or when it needs to, because it's run out of memory for creating new objects. Despite the automatic nature of the process, it's important to understand that it's going on, because it can be a significant part of the overhead of Java programs.

Besides the time overhead of garbage collection, there's also a significant space overhead for objects in Java. The JVM adds internal information to each allocated object to help in the garbage collection process. It also adds other information required by the Java language definition, which is needed in order to implement such features as the ability to synchronize on any object. When the storage used internally by the JVM for each object is included in the size of the object, small objects may be substantially larger than their C/C++ counterparts. Table 1 shows the user-accessible content size and actual object memory size measurements for several simple objects on various JVMs, illustrating the memory overhead added by the JVMs.

Table 1. Measured memory usage (bytes)
Class               Content     JRE 1.1.8  JRE 1.1.8  JRE 1.2.2  JRE 1.2.2
                    (bytes)     (Sun)      (IBM)      (Classic)  (HotSpot 2.0 beta)

java.lang.Object    0           26         31         28         18
java.lang.Integer   4           26         31         28         26
int[0]              4 (length)  26         31         28         26
java.lang.String    8 + 4       58         63         60         58
  (4 characters)

This space overhead is a per-object value, so the percentage of overhead decreases with larger objects. It can lead to some unpleasant surprises when you're working with large numbers of small objects, though -- a program juggling a million Integers, for example, ties up roughly 26 MB of heap for the wrapper objects alone (at 26 bytes each), before counting whatever collection holds them. That's enough to bring many systems to their knees!

Comparison with C/C++

For most operations, Java performance is now within a few percent of C/C++. The just-in-time (JIT) compilers included with most JVMs convert Java bytecodes to native code with impressive efficiency, and in the latest generation (represented by IBM's JVM and Sun's HotSpot) they're showing the potential to start beating C/C++ performance for computational (CPU-intensive) applications.

However, Java performance can suffer by comparison with C/C++ when many objects are being created and discarded. This is due to several factors, including initialization time for the added overhead information, garbage collection time, and structural differences between the languages. Table 2 shows the impact these factors can have on program performance, comparing C/C++ and Java versions of code repeatedly allocating and freeing arrays of byte values.

Table 2. Memory management performance (time in seconds)
Test                       JRE 1.1.8  JRE 1.1.8  JRE 1.2.2  JRE 1.2.2            C++
                           (Sun)      (IBM)      (Classic)  (HotSpot 2.0 beta)

Short-term Allocations     30         22         26         14                   9
  (7.5 M blocks, 331 MB)
Long-term Allocations      48         28         39         33                   13
  (7.6 M blocks, 334 MB)

For both short- and long-term allocations, the C++ program is considerably faster than the Java program running on any JVM. Short-term allocations have been one focus area for optimization in HotSpot, and the results show it: the HotSpot 2.0 beta used in this test comes closest to the C++ code, though its test time is still about 50 percent longer. For long-term allocations, the IBM JVM gives better performance than the HotSpot JVM, but both trail far behind the C++ code for this type of operation.

Even the relatively good performance of HotSpot on short-term allocations is not necessarily a cause for joy. In general, C++ programs tend to allocate short-lived objects on the stack, which gives lower overhead than the explicit allocation and deallocation used in this test. C++ also has a big advantage in the way it allocates composite objects, using a single block of memory for the combined entity. In Java, each object must be allocated as its own separate block.

We'll certainly see more performance improvements for object allocation as vendors continue to work on their VMs. Given the above advantages, though, it seems unlikely the performance will ever match C++ in this area.

Does this mean your Java programs are eternally doomed to sluggish performance? Not at all -- object creation and recycling is just one aspect of program performance, and, provided you're sensible about creating objects in heavily used code, it's easy to avoid the object churn cycle! In the remainder of this article we'll look at ways to keep your programs out of the churn by reducing unnecessary object creation.

Keep it primitive

Probably the easiest way to reduce object creation in your programs is by using primitive types in place of objects. This approach doesn't apply very often -- usually there's a good reason for making something an object in the first place, and just replacing it with a primitive type is not going to fill the same design function. In the cases where this technique does apply, though, it can eliminate a lot of overhead.

The primitive types in Java are boolean, byte, char, double, float, int, long, and short. When you create a variable of one of these types, there is no object creation overhead, and no garbage collection overhead when you're done using it. Instead, the JVM allocates the variable directly on the stack (if it's a local method variable) or within the memory used for the containing object (if it's a member variable).

Java defines wrappers for each of these primitive types, which can sometimes confuse Java novices. The wrapper classes represent immutable values of the corresponding primitive types. They allow you to treat values of a primitive type as objects, and are very useful when you need to work with generic values that may be of any type. For instance, the standard Java class libraries define the java.util.Vector, java.util.Stack, and java.util.Hashtable classes for working with object collections. Wrapper classes provide a way to use these utility classes with values of primitive types (not necessarily a good approach from the performance standpoint, for reasons we'll cover in the next article in this series, but a quick and easy way to handle some common needs).

Except for such special cases, you're best off avoiding the usage of the wrapper classes and staying with the base types. This avoids both the memory and performance overhead of object creation.
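The cost difference is easy to demonstrate with a small comparison of our own devising. The wrapper version here relies on autoboxing (a later language addition that performs the wrapping implicitly), so every addition allocates a fresh wrapper object, while the primitive version allocates nothing at all:

```java
// Hypothetical micro-benchmark contrasting a primitive accumulator
// with one that churns wrapper objects. Exact timings vary by JVM;
// the point is that the wrapper version allocates per iteration.
public class PrimitiveVsWrapper {
    static long sumPrimitive(int n) {
        long total = 0;                 // no object allocation at all
        for (int i = 0; i < n; i++) total += i;
        return total;
    }

    static long sumWrapper(int n) {
        Long total = 0L;                // each += boxes a new Long object
        for (int i = 0; i < n; i++) total += i;
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long t0 = System.nanoTime();
        long a = sumPrimitive(n);
        long t1 = System.nanoTime();
        long b = sumWrapper(n);
        long t2 = System.nanoTime();
        System.out.printf("primitive: %d ms, wrapper: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        System.out.println(a == b);     // same sum either way
    }
}
```

Both methods compute the same result; only the allocation behavior (and hence the garbage collection load) differs.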

Besides the actual wrapper types, other classes in the class libraries take values of primitive types and add a layer of semantics and behavior. Classes such as java.util.Date and java.awt.Point are examples of this type. If you're working with a large number of values of such types, you can avoid excessive object overhead by storing and passing values of the underlying primitive types, only converting the values into the full objects when necessary for use with methods in the class libraries. For instance, with the Point class you can access the internal int values directly, even combining them into a long so that a single value can be returned from a method call. The following code fragment illustrates this approach with a simple midpoint calculation:

    ...
    // Method working with long values representing Points; each long
    // contains an x position in the high 32 bits and a y position in
    // the low 32 bits.
    public long midpoint(long a, long b) {

        // Compute the average value in each axis.
        int x = (int) (((a >> 32) + (b >> 32)) / 2);
        int y = ((int) a + (int) b) / 2;

        // Return the combined value for the midpoint of the arguments.
        // The cast to long is needed before the shift (shifting an int
        // by 32 is a no-op), and masking y prevents sign extension from
        // corrupting the high-order bits.
        return ((long) x << 32) | (y & 0xFFFFFFFFL);
    }
    ...
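Here's a sketch of how such packed values might interoperate with java.awt.Point. The pack and unpack helpers (and the class name) are our own, not class library methods; Point objects are created only at the boundary where the libraries need them:

```java
import java.awt.Point;

// Sketch of converting between java.awt.Point and a packed long
// (x in the high 32 bits, y in the low 32 bits).
public class PointPacking {
    static long pack(Point p) {
        // Mask y to avoid sign extension clobbering the high bits.
        return ((long) p.x << 32) | (p.y & 0xFFFFFFFFL);
    }

    static Point unpack(long packed) {
        // Arithmetic shift restores x's sign; the cast truncates to y.
        return new Point((int) (packed >> 32), (int) packed);
    }

    public static void main(String[] args) {
        long a = pack(new Point(10, 20));
        long b = pack(new Point(30, 40));
        // Midpoint computed entirely on packed values...
        int x = (int) (((a >> 32) + (b >> 32)) / 2);
        int y = ((int) a + (int) b) / 2;
        long mid = ((long) x << 32) | (y & 0xFFFFFFFFL);
        // ...converted back to a Point only for display: (20, 30).
        System.out.println(unpack(mid));
    }
}
```

The intermediate calculations never allocate, so a long-running geometry loop creates no garbage at all.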

Reuse objects

Now let's consider another approach to reducing object churn: reusing objects. There are at least two major variations of this approach, depending on whether you dedicate the reused object for a particular use or use it for different purposes at different times. The first technique -- dedicated object reuse -- has the advantages of simplicity and speed, while the second -- the free pool approach -- allows more efficient use of the objects.

Dedicated object reuse

The simplest case of object reuse occurs when one or more helper objects are required for handling a frequently repeated task. We'll use date formatting as an example, since it's a function that occurs fairly often in a variety of applications. To just generate a default string representation of a passed date value (represented as a long, let's say, given the preceding discussion of using primitive types), we can do the following:

    ...
    // Get the default string for the time.
    long time = ...;
    String display = DateFormat.getDateInstance().format(new Date(time));
    ...

This is a simple statement that masks a lot of complexity and object creation behind the scenes. The call to DateFormat.getDateInstance() creates a new instance of SimpleDateFormat, which in turn creates a number of associated objects; the call to format then creates new StringBuffer and FieldPosition objects. The total memory allocation resulting from this one statement actually came out to about 2,400 bytes, when measured with JRE 1.2.2 on Windows 98.

Since the program only uses these objects (except the output string) during the execution of this statement, there's going to be a lot of object churning if this use and discard approach is implemented in frequently executed code. Dedicated object reuse offers a simple technique for eliminating this type of churn.

Owned objects

By using some one-time allocations, we can create a set of objects required for formatting, then reuse these dedicated objects as needed. This set of objects is then owned by the code which uses them. For example, if we apply this approach using instance variables, so that each instance of the containing class owns a unique copy of the objects, we'd have something like this:
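Here's a sketch of that pattern; the class and field names are our own invention. The formatter and its helper objects are allocated once, when the owning instance is constructed, and then reused on every call:

```java
import java.text.DateFormat;
import java.text.FieldPosition;
import java.util.Date;

// Owned-objects pattern for date formatting: all the helper objects
// are created once and reused, so repeated calls allocate almost
// nothing beyond the returned String.
public class TimeDisplay {
    private final DateFormat format = DateFormat.getDateInstance();
    private final Date date = new Date();
    private final StringBuffer buffer = new StringBuffer();
    private final FieldPosition field = new FieldPosition(0);

    public String display(long time) {
        date.setTime(time);      // reuse the Date rather than creating one
        buffer.setLength(0);     // clear leftover text from the last call
        return format.format(date, buffer, field).toString();
    }
}
```

Note that because the helper objects are shared state, an instance of this class is not safe for concurrent use; each thread (or each owning object) needs its own copy.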
