Java performance programming, Part 2: The cost of casting

Reduce overhead and execution errors through type-safe code

For this second article in our series on Java performance, the focus shifts to casting -- what it is, what it costs, and how we can (sometimes) avoid it. This month, we start off with a quick review of the basics of classes, objects, and references, then follow up with a look at some hardcore performance figures (in a sidebar, so as not to offend the squeamish!) and guidelines on the types of operations that are most likely to give your Java Virtual Machine (JVM) indigestion. Finally, we finish off with an in-depth look at how we can avoid common class-structuring effects that can cause casting.

Object and reference types in Java

Last month, we discussed the basic distinction between primitive types and objects in Java. Both the number of primitive types and the relationships between them (particularly conversions between types) are fixed by the language definition. Objects, on the other hand, are of unlimited types and may be related to any number of other types.

Each class definition in a Java program defines a new type of object. This includes all the classes from the Java libraries, so any given program may be using hundreds or even thousands of different types of objects. A few of these types are specified by the Java language definition as having certain special usages or handling (such as the use of java.lang.StringBuffer for java.lang.String concatenation operations). Aside from these few exceptions, however, all the types are treated basically the same by the Java compiler and the JVM used to execute the program.

If a class definition does not specify (by means of the extends clause in the class definition header) another class as a parent or superclass, it implicitly extends the java.lang.Object class. This means that every class ultimately extends java.lang.Object, either directly or via a sequence of one or more levels of parent classes.

Objects themselves are always instances of classes, and an object's type is the class of which it's an instance. In Java, we never deal directly with objects, though; we work with references to objects. For example, the line:

    java.awt.Component myComponent;

does not create a java.awt.Component object; it creates a reference variable of type java.awt.Component. Even though references have types just as objects do, there is not a precise match between reference and object types -- a reference value may be null, an object of the same type as the reference, or an object of any subclass of (i.e., class descended from) the type of the reference. In this particular case, java.awt.Component is an abstract class, so we know that there can never be an object of the same type as our reference, but there can certainly be objects of subclasses of that reference type.
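A small sketch may help make the three possibilities concrete. It uses java.lang.Number, which is abstract just as java.awt.Component is, so that it runs without a display. The class name ReferenceTypesDemo and the describe helper are invented for this illustration only.

```java
// Illustrative sketch: ReferenceTypesDemo and describe() are hypothetical names.
class ReferenceTypesDemo {

    // Reports the class of the object a Number reference currently points to.
    static String describe(Number ref) {
        return (ref == null) ? "null" : ref.getClass().getName();
    }

    public static void main(String[] args) {
        Number value;                    // declares a reference; no object is created
        value = null;                    // a reference may hold null
        System.out.println(describe(value));   // prints "null"

        value = Integer.valueOf(42);     // or refer to an object of a subclass
        System.out.println(describe(value));   // prints "java.lang.Integer"

        value = Double.valueOf(1.5);     // any subclass of the reference type is allowed
        System.out.println(describe(value));   // prints "java.lang.Double"

        // value = new Number();         // would not compile: Number is abstract
    }
}
```

The reference type stays Number throughout; only the type of the referenced object changes.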

Polymorphism and casting

The type of a reference determines how the referenced object -- that is, the object that is the value of the reference -- can be used. For instance, in the example above, code using myComponent could invoke any of the methods defined by the class java.awt.Component, or any of its superclasses, on the referenced object.

However, the method actually executed by a call is determined not by the type of the reference itself, but rather by the type of the referenced object. This is the basic principle of polymorphism -- subclasses can override methods defined in the parent class in order to implement different behavior. In the case of our example variable, if the referenced object were actually an instance of java.awt.Button, the code executed by an addNotify() call (which creates the component's native peer) would be different from that executed if the referenced object were an instance of java.awt.Label, since each of these classes overrides the method.
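The dispatch rule can be seen in a few lines. This is only a sketch: Control, PushControl, and TextControl are made-up stand-ins for a base class and two subclasses, not AWT classes.

```java
// Illustrative sketch: all class and method names here are hypothetical.
class PolymorphismDemo {

    static class Control {
        String render() { return "generic control"; }
    }

    static class PushControl extends Control {
        @Override
        String render() { return "push button"; }
    }

    static class TextControl extends Control {
        @Override
        String render() { return "text label"; }
    }

    public static void main(String[] args) {
        // The reference type is Control in both cases ...
        Control control = new PushControl();
        System.out.println(control.render());   // prints "push button"

        // ... but the method that runs depends on the referenced object.
        control = new TextControl();
        System.out.println(control.render());   // prints "text label"
    }
}
```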

Besides class definitions, Java programs also use interface definitions. The difference between an interface and a class is that an interface only specifies a set of behaviors (and, in some cases, constants), while a class defines an implementation. Since interfaces do not define implementations, objects can never be instances of an interface. They can, however, be instances of classes that implement an interface. References can be of interface types, in which case the referenced objects may be instances of any class that implements the interface (either directly or through some ancestor class).
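As a rough sketch of interface-typed references, consider the following. The Resettable interface and both counter classes are hypothetical; BoundedCounter implements the interface only through its ancestor class.

```java
// Illustrative sketch: Resettable, Counter, and BoundedCounter are hypothetical.
class InterfaceDemo {

    // An interface specifies behavior only; it has no instances of its own.
    interface Resettable {
        void reset();
    }

    // Implements the interface directly.
    static class Counter implements Resettable {
        int count;
        void increment() { count++; }
        public void reset() { count = 0; }
    }

    // Implements the interface through its ancestor class Counter.
    static class BoundedCounter extends Counter {
        final int limit;
        BoundedCounter(int limit) { this.limit = limit; }
        @Override
        void increment() { if (count < limit) count++; }
    }

    // Works with any object whose class implements Resettable.
    static void resetAll(Resettable[] items) {
        for (Resettable item : items) {
            item.reset();
        }
    }

    public static void main(String[] args) {
        Counter plain = new Counter();
        BoundedCounter bounded = new BoundedCounter(2);
        plain.increment();
        bounded.increment();
        resetAll(new Resettable[] { plain, bounded });
        System.out.println(plain.count + " " + bounded.count);   // prints "0 0"
        // Resettable r = new Resettable();   // would not compile
    }
}
```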

Casting is used to convert between types -- between reference types in particular, for the type of casting operation in which we're interested here. Upcast operations (also called widening conversions in the Java Language Specification) convert a subclass reference to an ancestor class reference. This casting operation is normally automatic, since it's always safe and can be implemented directly by the compiler.

Downcast operations (also called narrowing conversions in the Java Language Specification) convert an ancestor class reference to a subclass reference. This casting operation creates execution overhead, since Java requires that the cast be checked at runtime to make sure that it's valid. If the referenced object is not an instance of either the target type for the cast or a subclass of that type, the attempted cast is not permitted and must throw a java.lang.ClassCastException.
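The two directions of casting can be sketched as follows. CastDemo and its helper method are names invented for this example; the library classes are the standard java.lang ones.

```java
// Illustrative sketch: CastDemo and canDowncastToInteger() are hypothetical names.
class CastDemo {

    // Attempts the downcast and reports whether the runtime check allowed it.
    static boolean canDowncastToInteger(Object ref) {
        try {
            Integer ignored = (Integer) ref;   // checked when the code runs
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Integer boxed = Integer.valueOf(7);

        // Upcast: implicit, always safe, no cast syntax needed.
        Number widened = boxed;

        // Downcast: explicit, and verified at run time.
        Integer narrowed = (Integer) widened;
        System.out.println(narrowed);                          // prints "7"

        Object text = "seven";
        System.out.println(canDowncastToInteger(widened));     // prints "true"
        System.out.println(canDowncastToInteger(text));        // prints "false"
        System.out.println(canDowncastToInteger(null));        // prints "true"
    }
}
```

Note that a null reference can be cast to any reference type, so the last call does not throw.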

The instanceof operator in Java allows you to determine whether or not a specific casting operation is permitted without actually attempting the operation. Since the performance cost of a check is much less than that of the exception generated by an unpermitted cast attempt, it's generally wise to use an instanceof test anytime you're not sure that the type of a reference is what you'd like it to be. Before doing so, however, you should make sure that you have a reasonable way of dealing with a reference of an unwanted type -- otherwise, you may as well just let the exception be thrown and handle it at a higher level in your code.
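A guarded cast might look like the sketch below. The intValueOr helper and its fallback parameter are assumptions made for the example; they represent one "reasonable way of dealing with a reference of an unwanted type," not the only one.

```java
// Illustrative sketch: InstanceofDemo and intValueOr() are hypothetical names.
class InstanceofDemo {

    // Returns the int value if ref is an Integer; otherwise returns the fallback.
    static int intValueOr(Object ref, int fallback) {
        if (ref instanceof Integer) {
            // The cast cannot fail here, because instanceof already succeeded.
            return ((Integer) ref).intValue();
        }
        // Reached for objects of other types and for null
        // (instanceof is always false for a null reference).
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(intValueOr(Integer.valueOf(5), -1));   // prints "5"
        System.out.println(intValueOr("five", -1));               // prints "-1"
        System.out.println(intValueOr(null, -1));                 // prints "-1"
    }
}
```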

Casting caution to the winds

Casting allows the use of generic programming in Java, where code is written to work with all objects of classes descended from some base class (often java.lang.Object, for utility classes). However, the use of casting causes a unique set of problems. In the next section we'll look at the impact on performance, but let's first consider the effect on the code itself. Here's a sample using the generic java.util.Vector collection class:

        private Vector someNumbers;
        ...
        public void doSomething() {
            ...
            int n = ...
            Integer number = (Integer) someNumbers.elementAt(n);
            ...
        }

This code presents potential problems in terms of clarity and maintainability. If someone other than the original developer were to modify the code at some point, he might reasonably think that he could add a java.lang.Double to the someNumbers collection, since java.lang.Double, like java.lang.Integer, is a subclass of java.lang.Number. Everything would compile fine if he tried this, but at some indeterminate point in execution he'd likely get a java.lang.ClassCastException thrown when the attempted cast to a java.lang.Integer was executed for his added value.
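The failure is easy to reproduce. In the sketch below, the class VectorCastDemo and the method sumAsIntegers are invented for illustration, and Vector<Object> stands in for the untyped Vector of the sample so that the code compiles without warnings on current compilers.

```java
import java.util.Vector;

// Illustrative sketch: VectorCastDemo and sumAsIntegers() are hypothetical names.
class VectorCastDemo {

    // Sums the elements, assuming (as the sample code does) that each is an Integer.
    static int sumAsIntegers(Vector<Object> someNumbers) {
        int total = 0;
        for (int n = 0; n < someNumbers.size(); n++) {
            Integer number = (Integer) someNumbers.elementAt(n);
            total += number.intValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Vector<Object> someNumbers = new Vector<Object>();
        someNumbers.addElement(Integer.valueOf(1));
        someNumbers.addElement(Integer.valueOf(2));
        System.out.println(sumAsIntegers(someNumbers));   // prints "3"

        // A later change adds a Double; the compiler has no objection.
        someNumbers.addElement(Double.valueOf(3.0));
        try {
            sumAsIntegers(someNumbers);
        } catch (ClassCastException e) {
            // The mistake surfaces only when the cast is executed.
            System.out.println("cast failed at run time");
        }
    }
}
```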

The problem here is that the use of casting bypasses the safety checks built into the Java compiler; the programmer ends up hunting for errors during execution, since the compiler won't catch them. This is not disastrous in and of itself, but this type of usage error often hides quite cleverly while you're testing your code, only to reveal itself when the program is put into production.

Not surprisingly, support for a technique that would allow the compiler to detect this type of usage error is one of the more heavily requested enhancements to Java. There's a project now in progress in the Java Community Process that's investigating adding just this support: project number JSR-000014, Add Generic Types to the Java Programming Language (see the Resources section below for more details). In the continuation of this article, coming next month, we'll look at this project in more detail and discuss both how it's likely to help and where it's likely to leave us wanting more.

The performance issue

It's long been recognized that casting can be detrimental to performance in Java, and that you can improve performance by minimizing casting in heavily used code. Method calls, especially calls through interfaces, are also often mentioned as potential performance bottlenecks. The current generation of JVMs has come a long way from its predecessors, though, and it's worth checking to see how well these principles hold up today.

For this article, I developed a series of tests to see how important these factors are to performance with current JVMs. The test results are summarized in two tables in the sidebar: Table 1 shows method call overhead, and Table 2 shows casting overhead. The full source code for the test program is also available online (see the Resources section below for more details).

To summarize these conclusions for readers who don't want to wade through the details in the tables, certain types of method calls and casts are still fairly expensive, in some cases taking nearly as long as a simple object allocation. Where possible, these types of operations should be avoided in code that needs to be optimized for performance.

In particular, calls to overridden methods (methods that are overridden in any loaded class, not just the actual class of the object) and calls through interfaces are considerably more costly than simple method calls. The HotSpot Server JVM 2.0 beta used in the test will even convert many simple method calls to inline code, avoiding any overhead for such operations. However, HotSpot shows the worst performance among the tested JVMs for overridden methods and calls through interfaces.

For casting (downcasting, of course), the tested JVMs generally keep the performance hit to a reasonable level. HotSpot does an exceptional job with this in most of the benchmark testing, and, as with the method calls, is in many simple cases able to almost completely eliminate the overhead of casting. For more complicated situations, such as casts followed by calls to overridden methods, all the tested JVMs show noticeable performance degradation.

The tested version of HotSpot also showed extremely poor performance when an object was cast to different reference types in succession (instead of always being cast to the same target type). This situation regularly arises in libraries such as Swing that use a deep hierarchy of classes.

In most cases, the overhead of both method calls and casting is small in comparison with the object-allocation times looked at in last month's article. However, these operations will often be used far more frequently than object allocations, so they can still be a significant source of performance problems.

In the remainder of this article, we'll discuss some specific techniques for reducing the need for casting in your code. Specifically, we'll look at how casting often arises from the way subclasses interact with base classes, and explore some techniques for eliminating this type of casting. Next month, in the second part of this look at casting, we'll consider another common cause of casting, the use of generic collections.

Base classes and casting

There are several common uses of casting in Java programs. For instance, casting is often used for the generic handling of some functionality in a base class that may be extended by a number of subclasses. The following code shows a somewhat contrived illustration of this usage:

    // simple base class with subclasses
    public abstract class BaseWidget {
        ...
    }
    public class SubWidget extends BaseWidget {
        ...
        public void doSubWidgetSomething() {
            ...
        }
    }
    ...
    // base class with subclasses, using the prior set of classes
    public abstract class BaseGorph {
        // the Widget associated with this Gorph
        private BaseWidget myWidget;
        ...
        // set the Widget associated with this Gorph (only allowed for subclasses)
        protected void setWidget(BaseWidget widget) {
           myWidget = widget;
        }
        // get the Widget associated with this Gorph
        public BaseWidget getWidget() {
           return myWidget;
        }
        ...
        // return a Gorph with some relation to this Gorph
        //  this will always be the same type as it's called on, but we can only
        //  return an instance of our base class
        //  (an abstract method is declared without a body)
        public abstract BaseGorph otherGorph();
    }
    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        // return a Gorph with some relation to this Gorph
        public BaseGorph otherGorph() {
            ...
        }
        ...
        public void anyMethod() {
            ...
            // set the Widget we're using
            SubWidget widget = ...
            setWidget(widget);
            ...
            // use our Widget
            ((SubWidget)getWidget()).doSubWidgetSomething();
            ...
            // use our otherGorph
            SubGorph other = (SubGorph) otherGorph();
            ...
        }
    }