Attack of the clones

Time and space considerations in four different approaches to implementing deep clone() methods

January 24, 2003

Q: What are the advantages and disadvantages of implementing deep cloning via Java serialization and a built-in Object.clone() method from a performance point of view?

A: Equipping classes in your application with correct clone() implementation is essential to many defensive programming patterns. Common examples include defensive copying of method parameters, cloning internal fields before returning them in getters, implementing immutability patterns, implementing containers with deep cloning semantics, and so on.

Even though the question mentions just two possibilities, there are at least four distinct approaches to clone() implementation. In this Java Q&A installment, I consider design and performance tradeoffs involved in all of them.

Because cloning is so customizable, this article's examples will not necessarily translate directly to your own code; however, the general conclusions we will reach should provide useful guidelines in any application design.

Note the following disclaimer: What exactly constitutes a deep clone is debatable. Even though two objects can safely share the same String reference viewed as data, they cannot if the same field is used as an instance-scoped object monitor (such as when calling Object.wait()/notify() on it) or if field instance identity (such as when using the == operator) is significant to the design. In the end, whether or not a field is shareable depends on the class design. For simplicity, I assume below that all fields are used as pure data.

Performance measurements setup

Let's jump right into some code. I use the following simple hierarchy of classes as my cloning guinea pig:

public class TestBaseClass
             implements Cloneable, Serializable
{
    public TestBaseClass (String dummy)
    {
        m_byte = (byte) 1;
        m_short = (short) 2;
        m_long = 3L;
        m_float = 4.0F;
        m_double = 5.0;
        m_char = '6';
        m_boolean = true;
        m_int = 16;
        m_string = "some string in TestBaseClass";
        
        m_ints = new int [m_int];
        for (int i = 0; i < m_ints.length; ++ i) m_ints [i] = m_int;
        
        m_strings = new String [m_int];
        m_strings [0] = m_string; // invariant: m_strings [0] == m_string
        for (int i = 1; i < m_strings.length; ++ i)
            m_strings [i] = new String (m_string);
    }
    public TestBaseClass (final TestBaseClass obj)
    {
        if (obj == null) throw new IllegalArgumentException ("null input: obj");
        
        // Copy all fields:
        
        m_byte = obj.m_byte;
        m_short = obj.m_short;
        m_long = obj.m_long;
        m_float = obj.m_float;
        m_double = obj.m_double;
        m_char = obj.m_char;
        m_boolean = obj.m_boolean;
        
        m_int = obj.m_int;
        m_string = obj.m_string;
        
        if (obj.m_ints != null) m_ints = (int []) obj.m_ints.clone ();
        if (obj.m_strings != null) m_strings = (String []) obj.m_strings.clone ();
    }
    
    // Cloneable:
    public Object clone ()
    {
        if (Main.OBJECT_CLONE)
        {
            try
            {
                // Chain shallow field work to Object.clone(): 
                final TestBaseClass clone = (TestBaseClass) super.clone ();
                
                // Set deep fields:
                if (m_ints != null)
                    clone.m_ints = (int []) m_ints.clone ();
                if (m_strings != null)
                    clone.m_strings = (String []) m_strings.clone ();
                
                return clone;
            }
            catch (CloneNotSupportedException e)
            {
                throw new InternalError (e.toString ());
            }
        }
        else if (Main.COPY_CONSTRUCTOR)
            return new TestBaseClass (this);
        else if (Main.SERIALIZATION)
            return SerializableClone.clone (this);
        else if (Main.REFLECTION)
            return ReflectiveClone.clone (this);
        else
            throw new RuntimeException ("select cloning method");
    }
    
    protected TestBaseClass () {} // accessible to subclasses only
    
    private byte m_byte;
    private short m_short;
    private long m_long;
    private float m_float;
    private double m_double;
    private char m_char;
    private boolean m_boolean;
    private int m_int;
    private int [] m_ints;
    private String m_string;
    private String [] m_strings; // invariant: m_strings [0] == m_string    
} // end of class
public final class TestClass extends TestBaseClass
             implements Cloneable, Serializable
{
    public TestClass (String dummy)
    {
        super (dummy);
        
        m_int = 4;
        
        m_object1 = new TestBaseClass (dummy);
        m_object2 = m_object1; // invariant: m_object1 == m_object2
    
        m_objects = new Object [m_int];
        for (int i = 0; i < m_objects.length; ++ i)
            m_objects [i] = new TestBaseClass (dummy);
    }
    public TestClass (final TestClass obj)
    {
        // Chain to super copy constructor:
        super (obj);
        
        // Copy all fields declared by this class:
        
        m_int = obj.m_int;
        
        if (obj.m_object1 != null)
            m_object1 = ((TestBaseClass) obj.m_object1).clone ();
        m_object2 = m_object1; // preserve the invariant
        
        if (obj.m_objects != null)
        {
            m_objects = new Object [obj.m_objects.length];
            for (int i = 0; i < m_objects.length; ++ i)
                m_objects [i] = ((TestBaseClass) obj.m_objects [i]).clone ();
        }
    }
    
    // Cloneable:
    public Object clone ()
    {
        if (Main.OBJECT_CLONE)
        {
            // Chain shallow field work to Object.clone():
            final TestClass clone = (TestClass) super.clone ();
            
            // Set only deep fields declared by this class:
            
            if (m_object1 != null)
                clone.m_object1 = ((TestBaseClass) m_object1).clone ();
            clone.m_object2 = clone.m_object1; // preserve the invariant
            
            if (m_objects != null)
            {
                clone.m_objects = (Object []) m_objects.clone ();
                for (int i = 0; i < m_objects.length; ++ i)
                    clone.m_objects [i] = ((TestBaseClass) m_objects [i]).clone ();
            }
            return clone;
        }
        else if (Main.COPY_CONSTRUCTOR)
            return new TestClass (this);
        else if (Main.SERIALIZATION)
            return SerializableClone.clone (this);
        else if (Main.REFLECTION)
            return ReflectiveClone.clone (this);
        else
            throw new RuntimeException ("select cloning method");
    }
        
    protected TestClass () {} // accessible to subclasses only
    private int m_int;        
    private Object m_object1, m_object2; // invariant: m_object1 == m_object2
    private Object [] m_objects;
} // End of class

TestBaseClass has several fields of primitive types as well as a String and a couple of array fields. TestClass both extends TestBaseClass and aggregates several instances of it. This setup allows us to see how inheritance, member object ownership, and data types can affect cloning design and performance.

In a previous Java Q&A article, I developed a simple timing library that comes in handy now. This code in class Main measures the cost of TestClass.clone():

        // Create an ITimer:
        final ITimer timer = TimerFactory.newTimer ();
        
        // JIT/hotspot warmup:
        // ...
        TestClass obj = new TestClass ();
        
        // Warm up clone():
        // ...
        final int repeats = 1000;
        
        timer.start ();
        // Note: the loop is unrolled 10 times
        for (int i = 0; i < repeats / 10; ++ i)
        {
            obj = (TestClass) obj.clone ();
            ... repeated 10 times ...
        }
        timer.stop ();
        
        final DecimalFormat format = new DecimalFormat ();
        format.setMinimumFractionDigits (3);
        format.setMaximumFractionDigits (3);
        
        System.out.println ("method duration: " +
            format.format (timer.getDuration () / repeats) + " ms");

I use the high-resolution timer supplied by TimerFactory with a loop that creates a moderate number of cloned objects. The elapsed time reading is reliable, and there is little interference from the garbage collector. Note how the obj variable continuously updates to avoid memory caching effects.

Also note how clone() is implemented in both classes. The implementation in each class is in fact four, selected one at a time using four conditional compilation constants in Main: OBJECT_CLONE, COPY_CONSTRUCTOR, SERIALIZATION, and REFLECTION. Recompile the entire object when changing the cloning approach.

Let's now examine each approach in detail.

Approach 1: Cloning by chaining to Object.clone()

This is perhaps the most classical approach. The steps involved are:

  • Declare your class to implement the Cloneable marker interface.
  • Provide a public clone override that always begins with a call to super.clone() followed by manual copying of all deep fields (i.e., mutable fields that are object references and cannot be shared between several instances of the parent class).
  • Declare your clone override not to throw any exceptions, including CloneNotSupportedException. To this effect, the clone() method in your hierarchy's first class that subclasses a non-Cloneable class will catch CloneNotSupportedException and wrap it into an InternalError.

Correct implementation of Cloneable easily deserves a separate article. Because my focus is on measuring performance, I will repeat the relevant points here and direct readers to existing references for further details (see Resources).

This traditional approach is particularly well suited to the presence of inheritance because the chain of super.clone() eventually calls the native java.lang.Object.clone() implementation. This is good for two reasons. First, this native method has the magic ability to always create an instance of the most derived class for the current object. That is, the result of super.clone() in TestBaseClass is an instance of TestClass when TestBaseClass.clone() is part of the chain of methods originating from TestClass.clone(). This makes it easy to implement the desirable x.clone().getClass() == x.getClass() invariant even in the presence of inheritance.

Second, if you examine the JVM sources, you will see that at the heart of java.lang.Object.clone() is the memcpy C function, usually implemented in very efficient assembly on a given platform; so I expect the method to act as a fast "bit-blasting" shallow clone implementation, replicating all shallow fields in one fell swoop. In many cases, the only remaining manual coding is done to deeply clone object reference fields that point to unshareable mutable objects.

Running the test with the OBJECT_CLONE variable set to true on a Windows 550-MHz machine with Sun Microsystems' JDK 1.4.1 produces:

clone implementation: Object.clone()
method duration: 0.033 ms

This is not bad for a class with multiple primitive and object reference fields. But for better insight, I must compare the result with other approaches below.

Despite its advantages, this approach is plagued with problems due to poor java.lang.Object.clone() design. It cannot be used for cloning final fields unless they can be copied shallowly. Creating smart, deeply cloning container classes is complicated by the fact that Cloneable is just a marker interface, and java.lang.Object.clone() is not public. Finally, cloning inner classes does not work due to problems with outer references. See articles by Mark Davis and Steve Ball in Resources for some of the earliest discussions about this topic.

Approach 2: Cloning via copy construction

This approach complements Approach 1. It involves these steps:

  • For every class X, provide a copy constructor with signature X(X x).

  • Chain to the base class's copy constructor in all but the first class in your hierarchy. You can chain to clone() or directly to the base copy constructor. The former choice is more polymorphic and works when the base copy constructor is private, and the latter sometimes avoids the small cost of casting clone()'s return value to a specific type.

  • Following the chaining call, set all class fields by copying them from the input parameter. For every object reference field, you decide individually whether to clone it deeply.

Setting COPY_CONSTRUCTOR to true and rerunning the test produces:

clone implementation: copy construction
method duration: 0.024 ms

This beats Approach 1. The result might not be surprising because the overhead of native method calls has increased and the cost of new object creation has decreased with increasing JDK versions. If you rerun the same tests in Sun's JDK 1.2.2, the situation favors Approach 1. Of course, performance depends on the relative mix of shallow and deep fields in the class hierarchy. Classes with many primitive type fields benefit more from Approach 1. Classes with a few mostly immutable fields work very efficiently with Approach 2, with a speed advantage at least 10 times greater than Approach 1.

1 2 3 Page
Recommended
Join the discussion
Be the first to comment on this article. Our Commenting Policies
See more