Java performance programming, Part 2: The cost of casting

Reduce overhead and execution errors through type-safe code

For this second article in our series on Java performance, the focus shifts to casting -- what it is, what it costs, and how we can (sometimes) avoid it. This month, we start off with a quick review of the basics of classes, objects, and references, then follow up with a look at some hardcore performance figures (in a sidebar, so as not to offend the squeamish!) and guidelines on the types of operations that are most likely to give your Java Virtual Machine (JVM) indigestion. Finally, we finish off with an in-depth look at how we can avoid common class-structuring effects that can cause casting.

Object and reference types in Java

Last month, we discussed the basic distinction between primitive types and objects in Java. Both the number of primitive types and the relationships between them (particularly conversions between types) are fixed by the language definition. Objects, on the other hand, are of unlimited types and may be related to any number of other types.

Each class definition in a Java program defines a new type of object. This includes all the classes from the Java libraries, so any given program may be using hundreds or even thousands of different types of objects. A few of these types are specified by the Java language definition as having certain special usages or handling (such as the use of java.lang.StringBuffer for java.lang.String concatenation operations). Aside from these few exceptions, however, all the types are treated basically the same by the Java compiler and the JVM used to execute the program.

If a class definition does not specify (by means of the extends clause in the class definition header) another class as a parent or superclass, it implicitly extends the java.lang.Object class. This means that every class ultimately extends java.lang.Object, either directly or via a sequence of one or more levels of parent classes.

Objects themselves are always instances of classes, and an object's type is the class of which it's an instance. In Java, we never deal directly with objects, though; we work with references to objects. For example, the line:

    java.awt.Component myComponent;

does not create a java.awt.Component object; it creates a reference variable of type java.awt.Component. Even though references have types just as objects do, there is not a precise match between reference and object types -- the value of a reference may be null, an object of the same type as the reference, or an object of any subclass of (i.e., class descended from) the type of the reference. In this particular case, java.awt.Component is an abstract class, so we know that there can never be an object of the same type as our reference, but there can certainly be objects of subclasses of that reference type.
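As a minimal sketch of this distinction, the following uses the abstract class java.lang.Number in place of java.awt.Component so that it runs without a windowing system. The class ReferenceTypeDemo and its objectTypeOf() method are names made up for this illustration.

```java
// Hypothetical demo class: the reference type is always Number, but the
// referenced object (if any) is an instance of some concrete subclass.
class ReferenceTypeDemo {

    // Report the class of the object behind a Number reference.
    static String objectTypeOf(Number reference) {
        if (reference == null) {
            return "no object (null)";
        }
        return reference.getClass().getName();
    }

    public static void main(String[] args) {
        Number myNumber = null;              // a reference, but no object yet
        System.out.println(objectTypeOf(myNumber));

        myNumber = Integer.valueOf(3);       // now refers to an Integer object
        System.out.println(objectTypeOf(myNumber));

        myNumber = Double.valueOf(2.5);      // same reference type, Double object
        System.out.println(objectTypeOf(myNumber));
    }
}
```

Because Number is abstract, the reference can never refer to an object whose class is exactly Number, which mirrors the java.awt.Component case.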

Polymorphism and casting

The type of a reference determines how the referenced object -- that is, the object that is the value of the reference -- can be used. For instance, in the example above, code using myComponent could invoke any of the methods defined by the class java.awt.Component, or any of its superclasses, on the referenced object.

However, the method actually executed by a call is determined not by the type of the reference itself, but rather by the type of the referenced object. This is the basic principle of polymorphism -- subclasses can override methods defined in the parent class in order to implement different behavior. In the case of our example variable, if the referenced object were actually an instance of java.awt.Button, the code executed by an addNotify() call would be different from that executed if the referenced object were an instance of java.awt.Label, since each of these classes overrides the method to create its own kind of peer.
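The same principle can be seen in a small self-contained sketch. The classes Control, PushButton, and TextLabel are hypothetical stand-ins invented for this example (they are not AWT classes), chosen so that the code runs anywhere.

```java
// Hypothetical base class with a default implementation.
abstract class Control {
    String describe() {
        return "generic control";
    }
}

// Each subclass overrides describe() with its own behavior.
class PushButton extends Control {
    String describe() {
        return "push button";
    }
}

class TextLabel extends Control {
    String describe() {
        return "text label";
    }
}

class PolymorphismDemo {

    // The reference type here is Control, but the method that actually runs
    // is chosen by the class of the referenced object.
    static String describeVia(Control control) {
        return control.describe();
    }

    public static void main(String[] args) {
        Control myControl = new PushButton();
        System.out.println(describeVia(myControl));

        myControl = new TextLabel();
        System.out.println(describeVia(myControl));
    }
}
```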

Besides class definitions, Java programs also use interface definitions. The difference between an interface and a class is that an interface only specifies a set of behaviors (and, in some cases, constants), while a class defines an implementation. Since interfaces do not define implementations, objects can never be instances of an interface. They can, however, be instances of classes that implement an interface. References can be of interface types, in which case the referenced objects may be instances of any class that implements the interface (either directly or through some ancestor class).
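A short sketch may help here. The interface Describable and the classes implementing it are invented for this illustration; note that AnnualReport implements the interface only through its ancestor class Report.

```java
// Hypothetical interface: specifies behavior, no implementation.
interface Describable {
    String describe();
}

// Implements the interface directly.
class Report implements Describable {
    public String describe() {
        return "report";
    }
}

// Implements the interface through its ancestor class.
class AnnualReport extends Report {
}

// An unrelated class that also implements the interface.
class Invoice implements Describable {
    public String describe() {
        return "invoice";
    }
}

class InterfaceDemo {

    // References of the interface type can hold any implementing object.
    static String describeAll(Describable[] items) {
        String result = "";
        for (int i = 0; i < items.length; i++) {
            if (i > 0) {
                result += ", ";
            }
            result += items[i].describe();
        }
        return result;
    }

    public static void main(String[] args) {
        Describable[] items = { new Report(), new AnnualReport(), new Invoice() };
        System.out.println(describeAll(items));
    }
}
```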

Casting is used to convert between types -- between reference types in particular, for the type of casting operation in which we're interested here. Upcast operations (also called widening conversions in the Java Language Specification) convert a subclass reference to an ancestor class reference. This casting operation is normally automatic, since it's always safe and can be implemented directly by the compiler.

Downcast operations (also called narrowing conversions in the Java Language Specification) convert an ancestor class reference to a subclass reference. This casting operation creates execution overhead, since Java requires that the cast be checked at runtime to make sure that it's valid. If the referenced object is not an instance of either the target type for the cast or a subclass of that type, the attempted cast is not permitted and must throw a java.lang.ClassCastException.
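The two directions of casting can be sketched as follows. The classes Animal, Dog, and Cat are hypothetical, introduced only for this example.

```java
// Hypothetical class hierarchy for illustration.
class Animal {
}

class Dog extends Animal {
    String bark() {
        return "woof";
    }
}

class Cat extends Animal {
}

class CastDemo {

    // Attempt a downcast; the JVM checks it at runtime.
    static String tryBark(Animal animal) {
        try {
            Dog dog = (Dog) animal;      // downcast: checked at runtime
            return dog.bark();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        Dog rex = new Dog();
        Animal animal = rex;             // upcast: automatic, always safe
        System.out.println(tryBark(animal));

        animal = new Cat();              // also an Animal, but not a Dog
        System.out.println(tryBark(animal));
    }
}
```

The upcast needs no cast syntax at all, while the downcast compiles in both cases and fails only when the referenced object turns out to be a Cat.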

The instanceof operator in Java allows you to determine whether or not a specific casting operation is permitted without actually attempting the operation. Since the performance cost of a check is much less than that of the exception generated by an unpermitted cast attempt, it's generally wise to use an instanceof test anytime you're not sure that the type of a reference is what you'd like it to be. Before doing so, however, you should make sure that you have a reasonable way of dealing with a reference of an unwanted type -- otherwise, you may as well just let the exception be thrown and handle it at a higher level in your code.
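As a rough sketch of this guideline, the hypothetical method below uses instanceof to skip values of an unwanted type rather than letting a failed cast throw. Here "ignore anything that is not an Integer" is the assumed reasonable way of dealing with such values.

```java
class InstanceofDemo {

    // Sum only the Integer values in a mixed array, ignoring everything else.
    static int sumIntegers(Object[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            // instanceof is false for null, so null entries are skipped too
            if (values[i] instanceof Integer) {
                total += ((Integer) values[i]).intValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Object[] mixed = {
            Integer.valueOf(2), "text", Double.valueOf(1.5), Integer.valueOf(5), null
        };
        System.out.println(sumIntegers(mixed));
    }
}
```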

Casting caution to the winds

Casting allows the use of generic programming in Java, where code is written to work with all objects of classes descended from some base class (often java.lang.Object, for utility classes). However, the use of casting causes a unique set of problems. In the next section we'll look at the impact on performance, but let's first consider the effect on the code itself. Here's a sample using the generic java.util.Vector collection class:

        private Vector someNumbers;
        ...
        public void doSomething() {
            ...
            int n = ...
            Integer number = (Integer) someNumbers.elementAt(n);
            ...
        }

This code presents potential problems in terms of clarity and maintainability. If someone other than the original developer were to modify the code at some point, he might reasonably think that he could add a java.lang.Double to the someNumbers collection, since Double, like Integer, is a subclass of java.lang.Number. Everything would compile fine if he tried this, but at some indeterminate point in execution he'd likely get a java.lang.ClassCastException thrown when the attempted cast to a java.lang.Integer was executed for his added value.
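This failure mode can be reproduced in a few lines. The class NumberHolder and its methods are hypothetical, and the Vector is deliberately used without any element type, as in the fragment above (a current compiler may print a note about unchecked operations, but the code still compiles).

```java
// Hypothetical holder class wrapping an untyped Vector.
class NumberHolder {
    private final java.util.Vector someNumbers = new java.util.Vector();

    // Accepts any Number, which looks reasonable to a later maintainer.
    void add(Number number) {
        someNumbers.addElement(number);
    }

    // Assumes every element is an Integer; the cast is checked only at runtime.
    int intAt(int n) {
        Integer number = (Integer) someNumbers.elementAt(n);
        return number.intValue();
    }
}

class VectorCastDemo {
    public static void main(String[] args) {
        NumberHolder holder = new NumberHolder();
        holder.add(Integer.valueOf(42));
        holder.add(Double.valueOf(3.14));    // compiles without complaint

        System.out.println(holder.intAt(0)); // fine: element 0 is an Integer
        try {
            holder.intAt(1);                 // fails: element 1 is a Double
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for element 1");
        }
    }
}
```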

The problem here is that the use of casting bypasses the safety checks built into the Java compiler; the programmer ends up hunting for errors during execution, since the compiler won't catch them. This is not disastrous in and of itself, but this type of usage error often hides quite cleverly while you're testing your code, only to reveal itself when the program is put into production.

Not surprisingly, support for a technique that would allow the compiler to detect this type of usage error is one of the more heavily requested enhancements to Java. There's a project now in progress in the Java Community Process that's investigating adding just this support: project number JSR-000014, Add Generic Types to the Java Programming Language (see the Resources section below for more details). In the continuation of this article, coming next month, we'll look at this project in more detail and discuss both how it's likely to help and where it's likely to leave us wanting more.

The performance issue

It's long been recognized that casting can be detrimental to performance in Java, and that you can improve performance by minimizing casting in heavily used code. Method calls, especially calls through interfaces, are also often mentioned as potential performance bottlenecks. The current generation of JVMs has come a long way from its predecessors, though, and it's worth checking to see how well these principles hold up today.

For this article, I developed a series of tests to see how important these factors are to performance with current JVMs. The test results are summarized into two tables in the sidebar, Table 1 showing method call overhead and Table 2 casting overhead. The full source code for the test program is also available online (see the Resources section below for more details).

To summarize the conclusions for readers who don't want to wade through the details in the tables, certain types of method calls and casts are still fairly expensive, in some cases taking nearly as long as a simple object allocation. Where possible, these types of operations should be avoided in code that needs to be optimized for performance.

In particular, calls to overridden methods (methods that are overridden in any loaded class, not just the actual class of the object) and calls through interfaces are considerably more costly than simple method calls. The HotSpot Server JVM 2.0 beta used in the test will even convert many simple method calls to inline code, avoiding any overhead for such operations. However, HotSpot shows the worst performance among the tested JVMs for overridden methods and calls through interfaces.

For casting (downcasting, of course), the tested JVMs generally keep the performance hit to a reasonable level. HotSpot does an exceptional job with this in most of the benchmark testing, and, as with the method calls, is in many simple cases able to almost completely eliminate the overhead of casting. For more complicated situations, such as casts followed by calls to overridden methods, all the tested JVMs show noticeable performance degradation.

The tested version of HotSpot also showed extremely poor performance when an object was cast to different reference types in succession (instead of always being cast to the same target type). This situation regularly arises in libraries such as Swing that use a deep hierarchy of classes.

In most cases, the overhead of both method calls and casting is small in comparison with the object-allocation times looked at in last month's article. However, these operations will often be used far more frequently than object allocations, so they can still be a significant source of performance problems.
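For readers who want to try a comparison on their own JVM, here is a rough sketch of how cast overhead might be timed. It is not the test program used for the tables in the sidebar; the class CastTimingSketch, the array size, and the pass count are all assumptions made for illustration. Naive timing loops like this one are easily distorted by JIT compilation and other effects, so any numbers it prints should be treated as indicative at best.

```java
class CastTimingSketch {

    // Sum the values through Object references, downcasting each element.
    static long sumWithCast(Object[] values, int passes) {
        long total = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int i = 0; i < values.length; i++) {
                total += ((Integer) values[i]).intValue();
            }
        }
        return total;
    }

    // Sum the same values through Integer references, with no cast needed.
    static long sumWithoutCast(Integer[] values, int passes) {
        long total = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int i = 0; i < values.length; i++) {
                total += values[i].intValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Integer[] typed = new Integer[1000];
        for (int i = 0; i < typed.length; i++) {
            typed[i] = Integer.valueOf(i);
        }
        Object[] untyped = typed;    // same objects, seen through Object references
        int passes = 10000;

        // Warm-up runs so that both loops have a chance to be compiled.
        sumWithCast(untyped, passes);
        sumWithoutCast(typed, passes);

        long start = System.nanoTime();
        long castSum = sumWithCast(untyped, passes);
        long castNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long plainSum = sumWithoutCast(typed, passes);
        long plainNanos = System.nanoTime() - start;

        System.out.println("with cast:    " + castNanos + " ns (sum " + castSum + ")");
        System.out.println("without cast: " + plainNanos + " ns (sum " + plainSum + ")");
    }
}
```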

In the remainder of this article, we'll discuss some specific techniques for reducing the need for casting in your code. Specifically, we'll look at how casting often arises from the way subclasses interact with base classes, and explore some techniques for eliminating this type of casting. Next month, in the second part of this look at casting, we'll consider another common cause of casting, the use of generic collections.

Base classes and casting

There are several common uses of casting in Java programs. For instance, casting is often used for the generic handling of some functionality in a base class that may be extended by a number of subclasses. The following code shows a somewhat contrived illustration of this usage:

    // simple base class with subclasses
    public abstract class BaseWidget {
        ...
    }
    public class SubWidget extends BaseWidget {
        ...
        public void doSubWidgetSomething() {
            ...
        }
    }
    ...
    // base class with subclasses, using the prior set of classes
    public abstract class BaseGorph {
        // the Widget associated with this Gorph
        private BaseWidget myWidget;
        ...
        // set the Widget associated with this Gorph (only allowed for subclasses)
        protected void setWidget(BaseWidget widget) {
           myWidget = widget;
        }
        // get the Widget associated with this Gorph
        public BaseWidget getWidget() {
           return myWidget;
        }
        ...
        // return a Gorph with some relation to this Gorph
        //  this will always be the same type as it's called on, but we can only
        //  return an instance of our base class
        public abstract BaseGorph otherGorph();
    }
    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        // return a Gorph with some relation to this Gorph
        public BaseGorph otherGorph() {
            ...
        }
        ...
        public void anyMethod() {
            ...
            // set the Widget we're using
            SubWidget widget = ...
            setWidget(widget);
            ...
            // use our Widget
            ((SubWidget)getWidget()).doSubWidgetSomething();
            ...
            // use our otherGorph
            SubGorph other = (SubGorph) otherGorph();
            ...
        }
    }

This illustration shows related abstract base classes implementing some shared behavior, where specialized subclasses are used to extend the shared implementation for a specific instance. The base class for Gorph objects, BaseGorph, tracks the Widget object associated with each Gorph and implements some shared behavior using BaseWidget, the Widget base class.

With this approach, we need to use a cast when a Gorph subclass makes use of some specific feature of the Widget subclass with which it's associated, as shown in the doSubWidgetSomething() call within SubGorph. As a more general issue, this approach also requires that methods included in the base class definition, such as otherGorph(), return instances of the base class, even though we may have a design in which otherGorph() always returns a Gorph of the same type as the one on which it's called. This makes it necessary to cast the returned result, as shown in the otherGorph() call within SubGorph. We'll first consider ways around the Widget problem, then get back to the more general issue of casting results even when we know the type.

Redundant references

For the first type of casting operation there are some workarounds, though they can be messy. The most direct is to have each Gorph subclass keep an appropriately typed reference to its own Widget. The value of the reference in the base class is duplicated, except that proper typing is added:

    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        // the SubWidget associated with this SubGorph
        private SubWidget subWidget;
        ...
    }

This works, but it creates another problem: you must make sure that the two references -- one in the base class, one in the subclass -- are always updated together. We might want to handle this by overriding the base class's setWidget() method:

    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        // the SubWidget associated with this SubGorph
        private SubWidget subWidget;
        ...
        // set the SubWidget associated with this SubGorph
        protected void setWidget(SubWidget widget) {
            subWidget = widget;
            super.setWidget(widget);
        }
    }

Unfortunately, the setWidget() call defined in this manner does not actually override the base-class method, because the method signature is different -- this version takes a SubWidget parameter, while the base-class method by the same name takes a BaseWidget parameter. The net effect is that setWidget(SubWidget) is only defined for the SubGorph class, and if setWidget(BaseWidget) were used with a SubGorph instance, it would bypass this method and access the base-class method directly. We can avoid this problem with an actual override of the base-class method:
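The overload-versus-override problem can be demonstrated with a stripped-down, runnable version of these classes. The class OverloadDemo, its referencesInSync() helper, and the getSubWidget() accessor are additions made only for this sketch.

```java
abstract class BaseWidget {
}

class SubWidget extends BaseWidget {
}

abstract class BaseGorph {
    private BaseWidget myWidget;

    protected void setWidget(BaseWidget widget) {
        myWidget = widget;
    }

    public BaseWidget getWidget() {
        return myWidget;
    }
}

class SubGorph extends BaseGorph {
    private SubWidget subWidget;

    // An overload, not an override: the parameter type differs.
    protected void setWidget(SubWidget widget) {
        subWidget = widget;
        super.setWidget(widget);
    }

    // Accessor added for this sketch so the two references can be compared.
    public SubWidget getSubWidget() {
        return subWidget;
    }
}

class OverloadDemo {

    // True when the base-class and subclass references agree.
    static boolean referencesInSync(SubGorph gorph) {
        return gorph.getWidget() == gorph.getSubWidget();
    }

    public static void main(String[] args) {
        SubGorph gorph = new SubGorph();

        SubWidget first = new SubWidget();
        gorph.setWidget(first);      // argument typed SubWidget: subclass method
        System.out.println(referencesInSync(gorph));   // prints "true"

        BaseWidget second = new SubWidget();
        gorph.setWidget(second);     // argument typed BaseWidget: base-class method
        System.out.println(referencesInSync(gorph));   // prints "false"
    }
}
```

The second call is selected by the compiler from the declared type of its argument, so only the base-class reference is updated and the two references drift apart.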

    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        // the SubWidget associated with this SubGorph
        private SubWidget subWidget;
        ...
        // set the SubWidget associated with this SubGorph
        protected void setWidget(SubWidget widget) {
            subWidget = widget;
            super.setWidget(widget);
        }
        // override the base class method to make sure we don't miss any changes;
        //  will throw ClassCastException if called with the wrong type of object
        protected void setWidget(BaseWidget widget) {
            setWidget((SubWidget) widget);
        }
    }

This is a pretty messy solution for avoiding some casting, though. It's also a source of potential problems for multithreaded code, since the duplicated references will not be changed in an atomic operation. Solving this multithreading problem would require the use of synchronization, itself a more costly operation than casting (as we'll discuss in a future article in this series). In summary, this approach can be made to work, but it's not very adaptable.
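A runnable sketch of this version shows both effects: the two references stay in step, and a widget of the wrong type is rejected at runtime. OtherWidget, OverrideDemo, and the getSubWidget() accessor are hypothetical additions for the demonstration, and the sketch makes no attempt to address the multithreading issue.

```java
abstract class BaseWidget {
}

class SubWidget extends BaseWidget {
}

// Hypothetical second Widget subclass, used to provoke the failed cast.
class OtherWidget extends BaseWidget {
}

abstract class BaseGorph {
    private BaseWidget myWidget;

    protected void setWidget(BaseWidget widget) {
        myWidget = widget;
    }

    public BaseWidget getWidget() {
        return myWidget;
    }
}

class SubGorph extends BaseGorph {
    private SubWidget subWidget;

    // Type-specific setter: updates both references.
    protected void setWidget(SubWidget widget) {
        subWidget = widget;
        super.setWidget(widget);
    }

    // True override of the base-class method; forwards to the typed setter
    // and throws ClassCastException for any other kind of Widget.
    protected void setWidget(BaseWidget widget) {
        setWidget((SubWidget) widget);
    }

    public SubWidget getSubWidget() {
        return subWidget;
    }
}

class OverrideDemo {

    static boolean referencesInSync(SubGorph gorph) {
        return gorph.getWidget() == gorph.getSubWidget();
    }

    // True if the Gorph refuses the widget with a ClassCastException.
    static boolean rejects(SubGorph gorph, BaseWidget widget) {
        try {
            gorph.setWidget(widget);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SubGorph gorph = new SubGorph();
        BaseWidget widget = new SubWidget();
        gorph.setWidget(widget);     // goes through the override
        System.out.println(referencesInSync(gorph));
        System.out.println(rejects(gorph, new OtherWidget()));
    }
}
```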

Callouts to subclasses

An alternative approach to the use of redundant references is to simply avoid storing the reference in the base class at all. Instead, the base class can use callout methods to access the value from the subclass when needed:

    public abstract class BaseGorph {
        ...
        // get the Widget associated with this Gorph
        public abstract BaseWidget getBaseWidget();
        ...
    }
    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        // the SubWidget associated with this SubGorph
        private SubWidget subWidget;
        ...
        // get the BaseWidget associated with this SubGorph
        public BaseWidget getBaseWidget() {
            return subWidget;
        }
        // get the SubWidget associated with this SubGorph
        public SubWidget getSubWidget() {
            return subWidget;
        }
    }

This approach can lead to awkward class definitions when carried to extremes, but used in moderation it's a fairly clean solution. The main drawback is that it's strictly a one-level approach, applicable only with a single level of concrete subclasses. If you wanted to allow subclasses of SubGorph to have their own associated subclasses of SubWidget, you'd be confronted with the original problem again. Still, this situation can often be avoided in practice, and callouts have the advantage of simplicity while avoiding the problems of redundant references.
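Filled out into a runnable sketch, the callout approach might look like the following. The hasWidget() and setSubWidget() methods and the CalloutDemo class are assumptions added so that the example has something to execute.

```java
abstract class BaseWidget {
}

class SubWidget extends BaseWidget {
    String doSubWidgetSomething() {
        return "sub-widget work";
    }
}

abstract class BaseGorph {
    // Callout: the subclass supplies the Widget when the base class needs it.
    public abstract BaseWidget getBaseWidget();

    // Shared behavior (hypothetical) written purely in terms of BaseWidget.
    public boolean hasWidget() {
        return getBaseWidget() != null;
    }
}

class SubGorph extends BaseGorph {
    // The only stored reference, already of the specific type.
    private SubWidget subWidget;

    public void setSubWidget(SubWidget widget) {
        subWidget = widget;
    }

    public BaseWidget getBaseWidget() {
        return subWidget;
    }

    public SubWidget getSubWidget() {
        return subWidget;
    }

    public String anyMethod() {
        // No cast needed to reach the SubWidget-specific method.
        return getSubWidget().doSubWidgetSomething();
    }
}

class CalloutDemo {
    public static void main(String[] args) {
        SubGorph gorph = new SubGorph();
        gorph.setSubWidget(new SubWidget());
        System.out.println(gorph.hasWidget());
        System.out.println(gorph.anyMethod());
    }
}
```

Since there is only one stored reference, there is nothing to keep synchronized, which is the main attraction compared with the redundant-reference approach.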

Type-specific returns

The more general casting problem in our original code was that base class methods cannot return types specific to a subclass. One route around this problem -- currently only a theoretical route, as we'll discuss in a moment -- would be to define overrides for the base-class methods with type-specific returns, as can be done for virtual methods in C++:

    public abstract class BaseGorph {
        ...
        // return a Gorph with some relation to this Gorph
        //  this will always be the same type as it's called on, but we can only
        //  return an instance of our base class
        public abstract BaseGorph otherGorph();
    }
    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        ...
        // return a SubGorph with some relation to this SubGorph
        public SubGorph otherGorph() {
            ...
        }
    }

This would allow the base-class method to return the most specific type of which it is aware, while a subclass method could return an even more specific result. Since the result returned by the subclass method would always be an instance of the type returned by the base method, the contract defined by the method would still be valid. Within the subclass (or in other code specifically using an instance of the subclass), the more specific return value would be available without the need for a cast operation.

Unfortunately, the Java language definition does not currently allow this type of (intended) override. Bug 4144488 (see Resources) in the Java Developer Connection database addresses this issue with the specific example of the clone() method, one of the cases in which it would be especially useful to use type-specific overrides. It doesn't sound like we're likely to see this fixed any time soon, even though it would apparently only require a compiler change -- but adding your vote for it couldn't hurt!

Lacking the ability to use type-specific overrides, the only way to avoid casting the result in this situation is to provide a separate method specific to the subclass, implementing the method defined in the base class using this new specific method:

    // Gorph subclass using a Widget subclass
    public class SubGorph extends BaseGorph {
        ...
        // return a SubGorph with some relation to this SubGorph
        public SubGorph otherSubGorph() {
            ...
        }
        // implement the generic method defined in the base class
        public BaseGorph otherGorph() {
            return otherSubGorph();
        }
    }

This adds clutter and confusion, but at least allows for type-specific usage without the need for casting.
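A runnable sketch of this pattern follows. The id field, the "next id" relation between a Gorph and its other Gorph, and the TypeSpecificDemo class are all invented so that the elided method bodies have something concrete to do.

```java
abstract class BaseGorph {
    // Generic method: can only promise a BaseGorph.
    public abstract BaseGorph otherGorph();
}

class SubGorph extends BaseGorph {
    // Hypothetical state so that the related Gorph is distinguishable.
    private final int id;

    SubGorph(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }

    // Type-specific method: callers that know they have a SubGorph use this.
    public SubGorph otherSubGorph() {
        return new SubGorph(id + 1);
    }

    // Implement the generic method in terms of the specific one.
    public BaseGorph otherGorph() {
        return otherSubGorph();
    }
}

class TypeSpecificDemo {
    public static void main(String[] args) {
        SubGorph first = new SubGorph(1);

        SubGorph other = first.otherSubGorph();   // no cast required
        System.out.println(other.getId());

        BaseGorph generic = first.otherGorph();   // still works for generic code
        System.out.println(generic instanceof SubGorph);
    }
}
```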

Conclusion

So far we've covered the basics of object and reference types in Java, along with the related issues of casting and polymorphism. After a detailed look at the performance costs of some of these operations, we've also discussed ways of structuring code to reduce the need for casting in one situation that often encourages it -- the linkage between subclasses and base classes.

Hopefully, this discussion of different approaches to achieving this goal -- and the limitations of some of these approaches -- will get you to think about ways to eliminate some of the unnecessary casting in your code and clean up the structure at the same time. Properly applied, structural techniques that reduce casting produce code that not only performs better than the alternatives, but is also cleaner and less prone to runtime errors.

Next month, we'll look into casting as it relates to the different ways of handling collections in Java. Starting with the generic collections support included in the original Java class libraries, we'll discuss the enhancements provided with Java 2 and the most efficient and effective ways of using these standard implementations. We'll also discuss the changes that may be coming in this area with the proposed addition of support for generic types in Java. Finally, we'll look at using custom type-safe collections, especially as applied to primitive types where we can eliminate both casting and object-creation overhead.

Don't forget to check back then for the full scoop on this aspect of Java performance!

Dennis Sosnoski is a software professional with over 25 years of experience in developing for a wide range of platforms and application areas. An early adopter of C++, he was just as quick to convert to using Java when it became available, and for the last three years it's been his preferred development platform. Dennis is the president and senior consultant of Sosnoski Software Solutions Inc., a Java consulting and contract software development firm collocated with another well-known software company in Redmond, Wash.

Learn more about this topic
