Java performance programming, Part 1: Smart object-management saves the day

Learn how to reduce program overhead and improve performance by controlling object creation and garbage collection

Objects are a powerful software engineering construct, and Java uses them extensively. In fact, it encourages the use of objects so much that developers sometimes forget the costs behind the construct. The result can be object churn, a program state in which most of your processor time is soaked up by repeatedly creating and then garbage collecting objects.

This is the first in a series of articles focused on performance issues in Java. In this series, we'll examine a number of areas in which Java performance can be less than ideal, and provide techniques for bypassing many of these performance roadblocks. Actual timing measurements will be used throughout to demonstrate the performance improvements possible with the right coding techniques.

This month, we'll take a look at the issue of object management in Java. Since Java is often in competition with C/C++ as a language choice for implementing applications, we'll start out by reviewing the differences in how these languages manage the allocation and deallocation of objects, and examine what impact these differences have on performance. In the remainder of the article, we'll look at three ways to reduce the amount of object churn in your Java programs.

Java memory management

Java's simplified memory management is one of the key features that appeals to developers with backgrounds in languages such as C/C++. In contrast with the explicit deallocation required by C/C++, Java lets you allocate objects as necessary and trust that they'll be reclaimed and recycled by the JVM when they're no longer needed. The work required to make this happen goes on behind the scenes, in the garbage collection process.

Garbage collection has been used for memory management in programming languages since the early Lisp implementations of the 1960s. The basic principle is the same in all cases: identify the objects that are no longer in use by the program, and recycle the memory they occupy so it can be used to create new objects.

JVMs generally use a reachability analysis to identify the objects that are in use, then recycle all the others. The analysis starts with a base set of variables the program can access directly: object references in local or argument variables on the method call stack of every active thread, and in static variables of loaded classes. Each object referenced by one of these variables is added to the set of reachable objects. Next, each object referenced by a member variable of an object already in the set is added as well. This process continues until closure is reached: every object referenced by any object in the reachable set is itself in the reachable set. Any object not in the reachable set is by definition not in use by the program, so it can safely be recycled.

The developer generally doesn't need to be directly involved in this garbage collection process. Objects drop out of the reachable set and become eligible for recycling as they're replaced with other objects, or as methods return and their variables are dropped from the calling thread's stack. The JVM runs garbage collection periodically, either when it can, because the program threads are waiting for some external event, or when it needs to, because it's run out of memory for creating new objects. Despite the automatic nature of the process, it's important to understand that it's going on, because it can be a significant part of the overhead of Java programs.
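
As a small illustration (process() here stands in for any method that uses the array, and is not from the original article), an object becomes eligible for collection as soon as the last reference to it disappears:

    void example() {
        // The first array is reachable through the local variable 'data'.
        byte[] data = new byte[1024];
        process(data);
        // Reassigning the variable drops the only reference to the first array,
        // so it becomes unreachable and can be reclaimed at the next collection.
        data = new byte[2048];
        process(data);
        // When the method returns, 'data' is dropped from the stack and the
        // second array becomes collectable too (assuming process() kept no
        // reference to it).
    }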

Besides the time overhead of garbage collection, there's also a significant space overhead for objects in Java. The JVM adds internal information to each allocated object to help in the garbage collection process. It also adds other information required by the Java language definition, which is needed in order to implement such features as the ability to synchronize on any object. When the storage used internally by the JVM for each object is included in the size of the object, small objects may be substantially larger than their C/C++ counterparts. Table 1 shows the user-accessible content size and actual object memory size measurements for several simple objects on various JVMs, illustrating the memory overhead added by the JVMs.

Table 1. Measured memory usage (bytes)
    Object type                       Content     JRE 1.1.8   JRE 1.1.8   JRE 1.2.2   JRE 1.2.2
                                      (bytes)     (Sun)       (IBM)       (Classic)   (HotSpot 2.0 beta)
    java.lang.Object                  0           26          31          28          18
    java.lang.Integer                 4           26          31          28          26
    int[0]                            4 (length)  26          31          28          26
    java.lang.String (4 characters)   8 + 4       58          63          60          58

This space overhead is a per-object value, so the percentage of overhead decreases for larger objects. It can lead to some unpleasant surprises when you're working with large numbers of small objects, though -- a program juggling a million Integers, for example, will bring most systems to their knees!
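
If you want to get a feel for these numbers on your own JVM, a rough measurement can be made by allocating a large batch of objects and dividing the change in heap usage by the count. This is only a sketch of the idea, not the harness used for Table 1, and the results are approximate at best:

        // Approximate the per-object memory cost of java.lang.Integer by
        // allocating a large batch and measuring the change in used heap.
        // System.gc() is only a request, so results vary by JVM; the holding
        // array itself also adds a few bytes per entry.
        Runtime rt = Runtime.getRuntime();
        Object[] hold = new Object[100000];
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();
        for (int i = 0; i < hold.length; i++) {
            hold[i] = new Integer(i);
        }
        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.println("Approximate bytes per Integer: " +
            (after - before) / hold.length);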

Comparison with C/C++

For most operations, Java performance is now within a few percent of C/C++. The just-in-time (JIT) compilers included with most JVMs convert Java byte codes to native code with amazing efficiency, and in the latest generation (represented by IBM's JVM and Sun's HotSpot) they're showing the potential to start beating C/C++ performance for computational (CPU intensive) applications.

However, Java performance can suffer by comparison with C/C++ when many objects are being created and discarded. This is due to several factors, including initialization time for the added overhead information, garbage collection time, and structural differences between the languages. Table 2 shows the impact these factors can have on program performance, comparing C/C++ and Java versions of code repeatedly allocating and freeing arrays of byte values.

Table 2. Memory management performance (time in seconds)
    Test case                                       JRE 1.1.8   JRE 1.1.8   JRE 1.2.2   JRE 1.2.2            C++
                                                    (Sun)       (IBM)       (Classic)   (HotSpot 2.0 beta)
    Short-term Allocations (7.5 M blocks, 331 MB)   30          22          26          14                   9
    Long-term Allocations (7.6 M blocks, 334 MB)    48          28          39          33                   13

For both short- and long-term allocations, the C++ program is considerably faster than the Java program on any of the JVMs. Short-term allocations have been one focus of optimization in HotSpot, and with the Server 2.0 beta used in this test it comes closest of any JVM to the C++ time, taking about 50 percent longer. For long-term allocations, the IBM JVM gives better performance than the HotSpot JVM, but both trail far behind the C++ code for this type of operation.
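
To give an idea of what such a test looks like, the Java side of the short-term allocation measurement can be sketched as follows. The block sizes and loop count here are illustrative assumptions, not the exact parameters behind Table 2; the key point is that each block becomes garbage as soon as the next pass through the loop overwrites the reference:

        // Time short-term allocations: each block is dropped on the next pass
        // through the loop, so it becomes garbage almost immediately.
        long start = System.currentTimeMillis();
        for (int i = 0; i < 7500000; i++) {
            byte[] block = new byte[(i % 88) + 1];  // block sizes are an assumption
            block[0] = (byte) i;                    // touch the block so it's actually used
        }
        System.out.println("Short-term allocation time: " +
            (System.currentTimeMillis() - start) + " ms");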

Even the relatively good performance of HotSpot on short-term allocations is not necessarily a cause for joy. C++ programs generally allocate short-lived objects on the stack, which carries even less overhead than the explicit allocation and deallocation used in this test. C++ also has a big advantage in the way it allocates composite objects, using a single block of memory for the combined entity; in Java, each object must be allocated as its own separate block.

We'll certainly see further improvements in object allocation as vendors continue to work on their JVMs. Given C++'s structural advantages, though, it seems unlikely that Java will ever match C++ performance in this area.

Does this mean your Java programs are eternally doomed to sluggish performance? Not at all -- object creation and recycling is just one aspect of program performance, and, providing you're sensible about creating objects in heavily used code, it's easy to avoid the object churn cycle! In the remainder of this article we'll look at ways to keep your programs out of the churn by reducing unnecessary object creation.

Keep it primitive

Probably the easiest way to reduce object creation in your programs is by using primitive types in place of objects. This approach doesn't apply very often -- usually there's a good reason for making something an object in the first place, and just replacing it with a primitive type is not going to fill the same design function. In the cases where this technique does apply, though, it can eliminate a lot of overhead.

The primitive types in Java are boolean, byte, char, double, float, int, long, and short. When you create a variable of one of these types, there is no object creation overhead, and no garbage collection overhead when you're done using it. Instead, the JVM allocates the variable directly on the stack (if it's a local method variable) or within the memory used for the containing object (if it's a member variable).

Java defines wrappers for each of these primitive types, which can sometimes confuse Java novices. The wrapper classes represent immutable values of the corresponding primitive types. They allow you to treat values of a primitive type as objects, and are very useful when you need to work with generic values that may be of any type. For instance, the standard Java class libraries define the java.util.Vector, java.util.Stack, and java.util.Hashtable classes for working with object collections. Wrapper classes provide a way to use these utility classes with values of primitive types (not necessarily a good approach from the performance standpoint, for reasons we'll cover in the next article in this series, but a quick and easy way to handle some common needs).
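
For example, storing an int value in a Vector means wrapping it in an Integer on the way in and casting and unwrapping it on the way out -- one extra object created for every value stored:

        // Each stored value costs an Integer allocation, plus a cast and an
        // intValue() call to get the primitive back out.
        Vector values = new Vector();
        values.addElement(new Integer(42));
        int first = ((Integer) values.elementAt(0)).intValue();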

Except for such special cases, you're best off avoiding the usage of the wrapper classes and staying with the base types. This avoids both the memory and performance overhead of object creation.

Besides the actual wrapper types, other classes in the class libraries take values of primitive types and add a layer of semantics and behavior. Classes such as java.util.Date and java.awt.Point are examples of this type. If you're working with a large number of values of such types, you can avoid excessive object overhead by storing and passing values of the underlying primitive types, only converting the values into the full objects when necessary for use with methods in the class libraries. For instance, with the Point class you can access the internal int values directly, even combining them into a long so that a single value can be returned from a method call. The following code fragment illustrates this approach with a simple midpoint calculation:

    ...
    // Method working with long values representing Points,
    // each long contains an x position in the high bits, a y position
    // in the low bits.
    public long midpoint(long a, long b) {
      
        // Compute the average value in each axis.
        int x = (int) (((a >> 32) + (b >> 32)) / 2);
        int y = ((int) a + (int) b) / 2;
        // Return combined value for midpoint of arguments (cast x to long before
        // shifting, and mask y to keep its sign bits out of the high half).
        return ((long) x << 32) | (y & 0xFFFFFFFFL);
    }
    ...
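
A brief usage sketch (pointA and pointB here are assumed to be existing java.awt.Point values, not part of the original example) shows how the packed values flow through this method, with a Point object created only when one is actually needed:

        // Pack each point into a long: x in the high 32 bits, y (masked to
        // strip sign extension) in the low 32 bits.
        long a = ((long) pointA.x << 32) | (pointA.y & 0xFFFFFFFFL);
        long b = ((long) pointB.x << 32) | (pointB.y & 0xFFFFFFFFL);
        long mid = midpoint(a, b);
        // Unpack only when a real Point object is required.
        Point midPoint = new Point((int) (mid >> 32), (int) mid);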

Reuse objects

Now let's consider another approach to reducing object churn: reusing objects. There are at least two major variations of this approach, depending on whether you dedicate the reused object for a particular use or use it for different purposes at different times. The first technique -- dedicated object reuse -- has the advantages of simplicity and speed, while the second -- the free pool approach -- allows more efficient use of the objects.

Dedicated object reuse

The simplest case of object reuse occurs when one or more helper objects are required for handling a frequently repeated task. We'll use date formatting as an example, since it's a function that occurs fairly often in a variety of applications. To just generate a default string representation of a passed date value (represented as a long, let's say, given the preceding discussion of using primitive types), we can do the following:

        ...
        // Get the default string for time.
        long time = ...;
        String display = DateFormat.getDateInstance().format(new Date(time));
        ...

This is a simple statement that masks a lot of complexity and object creation behind the scenes. The call to DateFormat.getDateInstance() creates a new instance of SimpleDateFormat, which in turn creates a number of associated objects; the call to format then creates new StringBuffer and FieldPosition objects. The total memory allocation resulting from this one statement actually came out to about 2,400 bytes, when measured with JRE 1.2.2 on Windows 98.

Since the program only uses these objects (except the output string) during the execution of this statement, there's going to be a lot of object churning if this use and discard approach is implemented in frequently executed code. Dedicated object reuse offers a simple technique for eliminating this type of churn.

Owned objects

By using some one-time allocations, we can create a set of objects required for formatting, then reuse these dedicated objects as needed. This set of objects is then owned by the code which uses them. For example, if we apply this approach using instance variables, so that each instance of the containing class owns a unique copy of the objects, we'd have something like this:

    ...
    // Allocate dedicated time formatting objects as member variables.
    private final Date convertDate = new Date();
    private final DateFormat convertFormat = DateFormat.getDateInstance();
    private final StringBuffer convertBuffer = new StringBuffer();
    private final FieldPosition convertField = new FieldPosition(0);
    ...
        // Get the default string for time.
        long time = ...;
        convertDate.setTime(time);
        convertBuffer.setLength(0);
        StringBuffer output =
            convertFormat.format(convertDate, convertBuffer, convertField);
        String display = output.toString();
        ...

This code is considerably longer than the original statement, but executes much more quickly because the only object constructed each time through is the output string. In a test run with 100,000 iterations, this code took only 8 seconds to execute, as opposed to 50 seconds for the original technique. This approach does incur some additional memory usage as the price of the speed advantage, since the objects used for the formatting are kept permanently allocated instead of being freed when not in use. But if the code is executed frequently, it's a very good trade-off.
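
If you want to reproduce this kind of comparison yourself, a simple timing harness along the following lines will do; this is an assumed sketch, not the article's actual test code:

        // Compare use-and-discard formatting against the owned-object version.
        int count = 100000;
        long time = System.currentTimeMillis();
        long start = System.currentTimeMillis();
        for (int i = 0; i < count; i++) {
            String display =
                DateFormat.getDateInstance().format(new Date(time));
        }
        System.out.println("Use and discard: " +
            (System.currentTimeMillis() - start) + " ms");
        start = System.currentTimeMillis();
        for (int i = 0; i < count; i++) {
            convertDate.setTime(time);
            convertBuffer.setLength(0);
            String display = convertFormat.format(convertDate,
                convertBuffer, convertField).toString();
        }
        System.out.println("Owned objects:   " +
            (System.currentTimeMillis() - start) + " ms");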

It's worth pointing out that the same technique applies if you have an inner loop within a method doing a large number of iterations. You may not need to have a set of the objects used within the loop permanently owned by the object containing the method, but you can still move the object allocations outside the loop so that they're only done once. In this case, our example code might look like this:

        // Allocate objects to be used inside loop.
        Date date = new Date();
        DateFormat formatter = DateFormat.getDateInstance();
        StringBuffer buffer = new StringBuffer();
        FieldPosition field = new FieldPosition(0);
        // Execute the loop.
        for (...) {
            // Get the default string for time.
            long time = ...;
            date.setTime(time);
            buffer.setLength(0);
            StringBuffer output = formatter.format(date, buffer, field);
            String display = output.toString();
        }

This dedicated object reuse technique often works especially well in combination with the approach of using primitive values in place of objects, as described above. A dedicated object can be initialized from the primitive values and passed to class library methods that expect an object of the original type. The use of the dedicated Date object above provides an example of this.

Multithreading owned objects

If we have a set of owned objects, and multiple threads can execute the code that uses the objects concurrently, we need to prevent conflicts between the different threads' usage of the objects. The easiest way to accomplish this is to designate one of the objects as the lock for the whole set, and enclose the code using the set of objects within a synchronized block on the lock object. This adds the overhead of a locking operation for each use of the owned objects, but the locking overhead is low in comparison to the object creation time.

In this case, suppose we made the convertDate object the lock. The code fragment using the objects would then need to be changed to the following:

        // Get usage of the owned objects.
        synchronized (convertDate) {
            // Get the default string for time.
            long time = ...;
            convertDate.setTime(time);
            convertBuffer.setLength(0);
            StringBuffer output = 
                convertFormat.format(convertDate, convertBuffer, convertField);
            String display = output.toString();
        }

If the code using the owned objects can only ever be executed by a single thread, there's no need to bother with the locking step. However, adding the locking can give some additional flexibility.

For instance, the example we've been following uses instance variables for the owned objects, so that there's one set of objects for each instance of the containing class. This approach makes sense when there's heavy use of the owned objects, or when the objects may need to be configured differently for each instance of the containing class. This would be the case, for example, if we wanted to have the format string for our date example specified to the constructor of the class, instead of always using the default format.

In cases where usage of the owned objects is not extremely heavy, and they don't need to be customized for each instance of the containing class, it may be better to have them owned by the class itself rather than have a copy for every instance of the class. To do this, just make the member variables static:

    ...
    // Allocate dedicated time formatting objects as static class variables.
    // (Synchronize on convertDate to use any or all of the objects.)
    private static final Date convertDate = new Date();
    private static final DateFormat convertFormat = DateFormat.getDateInstance();
    private static final StringBuffer convertBuffer = new StringBuffer();
    private static final FieldPosition convertField = new FieldPosition(0);
    ...
        // Get usage of the owned objects.
        synchronized (convertDate) {
            // Get the default string for time.
            long time = ...;
            convertDate.setTime(time);
            convertBuffer.setLength(0);
            StringBuffer output =
                convertFormat.format(convertDate, convertBuffer, convertField);
            String display = output.toString();
        }
        ...

This approach gives the speed advantage of dedicated object reuse while sharing the memory overhead across all instances of the class.

Pooled object reuse

Free object pools represent another form of object reuse. With the free pool approach, code using objects of the pooled type must track usage and explicitly return the objects to the free pool when usage is complete. The free pool keeps a collection of objects available for reuse, adding returned objects to the collection. When an object is needed, the free pool removes one from the available collection and reinitializes it, rather than constructing a new object. The free pool only constructs a new object of the pooled type when the available collection is empty.

The bookkeeping overhead of maintaining a collection of available objects limits the performance gain of this approach, but it can still be useful in circumstances where there's a lot of reuse of a particular object type. We'll look at alternative approaches to handling this bookkeeping and see how each performs in practice in order to get a better idea of when each approach might be useful.

Everybody into the pool

The basic code for constructing and managing a free pool can be implemented in a number of ways. The most flexible approach (though not necessarily a good one from the performance standpoint, as we'll see) uses a passed-in type for the objects held by the pool, and could be structured as follows:

import java.lang.*;
import java.util.*;
public class ObjectPool
{
    private final Class objectType;
    private final Vector freeStack;
    
    public ObjectPool(Class type) {
        objectType = type;
        freeStack = new Vector();
    }
    
    public ObjectPool(Class type, int size) {
        objectType = type;
        freeStack = new Vector(size);
    }
    
    public synchronized Object getInstance() {
        
        // Check if the pool is empty.
        if (freeStack.isEmpty()) {
            
            // Create a new object if so.
            try {
                return objectType.newInstance();
            } catch (InstantiationException ex) {}
            catch (IllegalAccessException ex) {}
            
            // Throw unchecked exception for error in pool configuration.
            throw new RuntimeException("exception creating new instance for pool");
            
        } else {
            
            // Remove object from end of free pool.
            Object result = freeStack.lastElement();
            freeStack.setSize(freeStack.size() - 1);
            return result;
        }
    }
    
    public synchronized void freeInstance(Object obj) {
        
        // Make sure the object is of the correct type.
        if (objectType.isInstance(obj)) {
            freeStack.addElement(obj);
        } else {
            throw new IllegalArgumentException("argument type invalid for pool");
        }
    }
}

This code uses a member java.util.Vector as a growable stack to implement the free pool. It requires that an object class be specified in the constructor (with an optional pool size estimate), and checks that only the proper type of objects are added to the pool. It also automatically creates and returns a new instance of the pooled object class if none are present in the pool.

To see how this might operate, suppose we're working with some graphics code that makes extensive use of coordinate rectangles. Using the base java.awt.Rectangle class makes sense, but constantly allocating objects of this type for short-term uses might add a lot of overhead. In this case, we could easily use the ObjectPool class to eliminate this allocation overhead:

    // Construct a shared pool of Rectangle objects.
    private static final ObjectPool rectanglePool = new ObjectPool(Rectangle.class);
    ...
        // Construct a rectangle to be passed.
        Rectangle rect = (Rectangle) rectanglePool.getInstance();
        rect.height = height;
        rect.width = width;
        rect.x = x;
        rect.y = y;
        ...
        // Return passed rectangle to pool.
        rectanglePool.freeInstance(rect);
        ...

This is convenient and easy to implement. Unfortunately, it's also slower than just allocating the Rectangle object directly! All the generic code (especially the extensive use of type casts), along with the synchronization of the Vector class, adds so much overhead to the bookkeeping that a test run with this type of pool actually took twice as long as directly allocating and discarding the objects.

The test was somewhat biased in favor of the allocate-and-discard approach, since it used short-lived Rectangle instances and a bare minimum of other objects in the program (thereby letting garbage collection run at its best), but it demonstrated that this generic object-pool approach gives minimal performance gains at best, and a loss in performance at worst. The generic approach can work fine for object pools controlling resources (such as database connections), but we need something faster for reducing allocation overhead.

A built-in pool

The generic ObjectPool approach tracks free objects, but adds so much overhead in the bookkeeping that it eliminates any advantage we might have from reusing objects rather than reallocating. As we'll discuss in more detail in the next article in this series, this can be a common problem with generic code -- it provides code reuse, but generally with a performance penalty. The solution to this type of overhead is to change to type-specific code, and that's what we'll look at for our next shot at an object pooling mechanism.

While we're at it, we can also eliminate another problem with the generic approach. With a generic pool, the code that retrieves an object must have access to the object's internal state variables so that the object can be reinitialized for reuse. If we want to hide state information in our objects, or make it immutable (so that it can't be modified once it's been initialized for a particular usage), we need a different approach.

We can solve both these problems by building the object pool into the actual object class, rather than making it a separate add-on. To see how this works, let's define our own immutable Rectangle equivalent with a built-in pool (which we'll make fixed size, for simplicity):

import java.awt.*;
import java.lang.*;
import java.util.*;
public class ImmutableRectangle
{
    private static final int FREE_POOL_SIZE = 40;  // Free pool capacity.
    // Pool owned by class.
    private static final ImmutableRectangle[] freeStack =
        new ImmutableRectangle[FREE_POOL_SIZE];
    private static int countFree;
    // Member variables for state.
    private int xValue;
    private int yValue;
    private int widthValue;
    private int heightValue;
    
    private ImmutableRectangle() {
    }
    
    public static synchronized ImmutableRectangle getInstance(
            int x, int y, int width, int height) {
        
        // Check if the pool is empty.
        ImmutableRectangle result;
        if (countFree == 0) {
            
            // Create a new object if so.
            result = new ImmutableRectangle();
            
        } else {
            
            // Remove object from end of free pool.
            result = freeStack[--countFree];
        }
        // Initialize the object to the specified state.
        result.xValue = x;
        result.yValue = y;
        result.widthValue = width;
        result.heightValue = height;
        return result;
    }
    public static ImmutableRectangle getInstance(int width, int height) {
        return getInstance(0, 0, width, height);
    }
    
    public static ImmutableRectangle getInstance(Point p, Dimension d) {
        return getInstance(p.x, p.y, d.width, d.height);
    }
    public static ImmutableRectangle getInstance() {
        return getInstance(0, 0, 0, 0);
    }
    public static synchronized void freeInstance(ImmutableRectangle rect) {
        if (countFree < FREE_POOL_SIZE) {
            freeStack[countFree++] = rect;
        }
    }
    public int getX() {
        return xValue;
    }
    public int getY() {
        return yValue;
    }
    public int getWidth() {
        return widthValue;
    }
    public int getHeight() {
        return heightValue;
    }
}

With this approach we can have objects that are reusable and, from the user's standpoint at least, immutable. We can also hide any state information we don't want to expose to users of the class. Even for objects this simple, the built-in pool gives somewhat better performance than allocating and garbage collecting new instances, and it improves performance substantially when used for composite objects.

If the pooled objects are only used by a single thread, a further performance enhancement is possible. The use of synchronized for the getInstance() and freeInstance() methods can be eliminated in this case, providing up to several times the performance of allocation and recycling. However, this approach needs to be used with caution because of the resulting lack of thread safety in the pool.
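
To illustrate that variation, here's a minimal sketch applying the same pattern to a hypothetical PointPool class (not from the article) that pools mutable java.awt.Point objects. Because each PointPool instance is meant to be owned and used by a single thread, getInstance() and freeInstance() skip locking entirely:

import java.awt.*;

// Sketch of an unsynchronized pool intended for use by one thread only.
public class PointPool
{
    private static final int POOL_SIZE = 40;    // Free pool capacity.
    private final Point[] freeStack = new Point[POOL_SIZE];
    private int countFree;

    public Point getInstance(int x, int y) {
        // Reuse a pooled instance if one is available, otherwise allocate.
        Point result = (countFree == 0) ? new Point() : freeStack[--countFree];
        result.x = x;
        result.y = y;
        return result;
    }

    public void freeInstance(Point p) {
        // Keep the returned object only if there's room in the pool.
        if (countFree < POOL_SIZE) {
            freeStack[countFree++] = p;
        }
    }
}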

Lifeguard on duty?

One complication of the free pool approach to object reuse: the code using the objects needs to return them to the pool when usage is complete. In some ways, this looks like a retreat to the C/C++ approach of explicit allocation and deallocation. Unlike in C/C++, though, we can pick the particular cases in which we want to implement this level of object management. In cases in which we're dealing with high-usage object types, a strategic withdrawal to explicit allocation and deallocation can yield a major benefit in terms of improved performance.

This type of free pool is also much more forgiving than C/C++. If there are some abnormal cases in which the objects never get returned to the pool, the only impact will be somewhat lower performance -- the objects will eventually be garbage collected when they're no longer in use, and we'll just need to allocate new ones for the pool at some point. In C/C++, objects which are not deallocated stay around as long as the program executes, causing the memory leaks that can plague C/C++ programs.
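
One simple way to make the return reliable is to wrap the usage in a try/finally block. The following sketch reuses the rectanglePool from the earlier example; drawHighlight() is a hypothetical method standing in for whatever work actually uses the rectangle, and x, y, width, and height are assumed to be in scope:

        Rectangle rect = (Rectangle) rectanglePool.getInstance();
        try {
            rect.x = x;
            rect.y = y;
            rect.width = width;
            rect.height = height;
            drawHighlight(rect);    // hypothetical method using the rectangle
        } finally {
            // Runs whether or not drawHighlight() throws, so the object
            // always makes it back into the pool.
            rectanglePool.freeInstance(rect);
        }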

Object pools are also often used for managing limited or costly resources, such as database connections. A number of past JavaWorld articles have covered this use in detail, and are linked in the Resources section below. Pools of this type are generally not so forgiving: if the resource is not properly freed, it may not be available for reuse without restarting the JVM.

Conclusion

In this article we've examined some of the issues surrounding object management in Java and provided techniques that can substantially reduce the volume of object creation and garbage collection in your programs. The timing results given in the article demonstrate the performance improvements possible with these techniques. These improvements come at the cost of a small increase in code complexity, but, in cases in which your programs are generating large numbers of objects, this increased complexity can have a big performance payoff.

As with all performance enhancement techniques, these need to be applied judiciously. Most of the code in any given application is not going to be used often enough for object creation to be a significant concern. Where it does become important is in code shown by design requirements or execution profiling to be heavily used. If performance is a concern, you need to look at the object-creation load this heavily used code generates and structure it to minimize the resulting object churn.

In future articles in this series, we'll look at other issues that can limit the performance of Java applications. The next article will take a look at the drawbacks of using type casts and illustrate coding techniques to reduce the amount of casting in your code. As a side benefit, this will also provide some useful utility classes for working with primitive types instead of the corresponding wrapper objects.

Dennis Sosnoski is a software professional with over 25 years experience in developing for a wide range of platforms and application areas. An early adopter of C++, he was just as quick to convert to using Java when it became available, and for the last three years it's been his preferred development platform. Dennis is the president and senior consultant of Sosnoski Software Solutions Inc., a Java consulting and contract software development firm based "In the Land of Redmond, where the Shadows lie." He's also a big fan of sci-fi and fantasy books.

Learn more about this topic

  • Benchmarks
  • Recent JavaWorld articles covering object pools for resource management
