Sizeof for Java

Object sizing revisited

1 2 3 Page 2
Page 2 of 3
    public static IObjectProfileNode profile (Object obj)
    {
        final IdentityHashMap visited = new IdentityHashMap (); 
        
        final ObjectProfileNode root = createProfileTree (obj, visited,
                                                          CLASS_METADATA_CACHE);
        finishProfileTree (root);
        
        return root;  
    }
    
    private static ObjectProfileNode createProfileTree (Object obj,
                                                        IdentityHashMap visited,
                                                        Map metadataMap)
    {
        final ObjectProfileNode root = new ObjectProfileNode (null, obj, null);
        
        final LinkedList queue = new LinkedList ();
        
        queue.addFirst (root);
        visited.put (obj, root);
        
        final ClassAccessPrivilegedAction caAction =
            new ClassAccessPrivilegedAction ();
        final FieldAccessPrivilegedAction faAction =
            new FieldAccessPrivilegedAction ();
        
        while (! queue.isEmpty ())
        {
            final ObjectProfileNode node = (ObjectProfileNode) queue.removeFirst ();
            
            obj = node.m_obj;
            final Class objClass = obj.getClass ();
            
            if (objClass.isArray ())
            {           
                final int arrayLength = Array.getLength (obj);
                final Class componentType = objClass.getComponentType ();
                
                // Add shell pseudo-node:
                final AbstractShellProfileNode shell =
                    new ArrayShellProfileNode (node, objClass, arrayLength);
                shell.m_size = sizeofArrayShell (arrayLength, componentType);
                
                node.m_shell = shell;
                node.addFieldRef (shell);
                
                if (! componentType.isPrimitive ())
                {
                    // Traverse each array slot:
                    for (int i = 0; i < arrayLength; ++ i)
                    {
                        final Object ref = Array.get (obj, i);
                        
                        if (ref != null)
                        {
                            ObjectProfileNode child =
                                (ObjectProfileNode) visited.get (ref);
                            if (child != null)
                                ++ child.m_refcount;
                            else
                            {
                                child = new ObjectProfileNode (node, ref,
                                    new ArrayIndexLink (node.m_link, i));
                                node.addFieldRef (child);
                                
                                queue.addLast (child);
                                visited.put (ref, child);
                            }
                        }
                    }
                }
            }
            else // the object is of a non-array type
            {
                final ClassMetadata metadata =
                    getClassMetadata (objClass, metadataMap, caAction, faAction);
                final Field [] fields = metadata.m_refFields;
                
                // Add shell pseudo-node:
                final AbstractShellProfileNode shell =
                    new ObjectShellProfileNode (node,
                                                metadata.m_primitiveFieldCount,
                                                metadata.m_refFields.length);
                shell.m_size = metadata.m_shellSize;
                
                node.m_shell = shell;  
                node.addFieldRef (shell);
                
                // Traverse all non-null ref fields:
                for (int f = 0, fLimit = fields.length; f < fLimit; ++ f)
                {
                    final Field field = fields [f];
                    
                    final Object ref;
                    try // to get the field value: 
                    {
                        ref = field.get (obj);
                    }
                    catch (Exception e)
                    {
                        throw new RuntimeException ("cannot get field [" +
                            field.getName () + "] of class [" +
                            field.getDeclaringClass ().getName () +
                            "]: " + e.toString ());
                    }
                    
                    if (ref != null)
                    {
                        ObjectProfileNode child =
                            (ObjectProfileNode) visited.get (ref);
                        if (child != null)
                            ++ child.m_refcount;
                        else
                        {
                            child = new ObjectProfileNode (node, ref,
                                new ClassFieldLink (field));
                            node.addFieldRef (child);
                            
                            queue.addLast (child);
                            visited.put (ref, child);
                        }
                    }
                }
            }
        }
        
        return root;
    }
    private static void finishProfileTree (ObjectProfileNode node)
    {
        final LinkedList queue = new LinkedList ();
        IObjectProfileNode lastFinished = null;
        while (node != null)
        {
            // Note that an unfinished nonshell node has its child count
            // in m_size and m_children[0] is its shell node:
             
            if ((node.m_size == 1) || (lastFinished == node.m_children [1]))
            {
                node.finish ();
                lastFinished = node;
            }
            else
            {
                queue.addFirst (node);
                for (int i = 1; i < node.m_size; ++ i)
                {
                    final IObjectProfileNode child = node.m_children [i];
                    queue.addFirst (child);
                }
            }
            
            if (queue.isEmpty ())
                return;
            else              
                node = (ObjectProfileNode) queue.removeFirst ();
        }
    }

This code is a distant relative of the clone-via-reflection implementation used in a past Java Q&A, "Attack of the Clones." As before, it caches reflection metadata to improve performance and uses an identity hashmap to mark visited objects. The profile() method starts by spanning the original object graph with a tree of IObjectProfileNodes in a breadth-first traversal and finishes with a quick post-order traversal that totals and assigns all node sizes. profile() returns a IObjectProfileNode that is the generated spanning tree's root, and its size() is the entire graph's size.

Of course, profile()'s output is useful only if I have a good way to explore it. To this end, every IObjectProfileNode supports examination by a node visitor together with a node filter:

interface IObjectProfileNode
{
    interface INodeFilter
    {
        boolean accept (IObjectProfileNode node);
        
    } // End of nested interface
    interface INodeVisitor
    {
        /**
         * Pre-order visit.
         */
        void previsit (IObjectProfileNode node);
        
        /**
         * Post-order visit.
         */
        void postvisit (IObjectProfileNode node);
        
    } // End of nested interface
    boolean traverse (INodeFilter filter, INodeVisitor visitor);
    ...
    
} // End of interface

A node visitor gets a shot at doing something with a tree node only if the accompanying filter is null or if the filter accepts the node. For simplicity, the node's children are examined only if the node itself has been examined. Both pre- and post-order visits are supported. The size contributions from the java.lang.Object shell plus all primitive data fields are lumped together in a pseudo-node attached to every "real" node representing an object instance. Such shell nodes are accessible via IObjectProfileNode.shell() and also show up in the IObjectProfileNode.children() list: the idea is to be able to write data filters and visitors that consider primitive data overhead on equal footing with instantiable data types.

It is up to you how to implement filters and visitors. As a starting point, the ObjectProfileFilters class (see this article's download) offers several useful stock filters that help prune large object trees based on node size, node size relative to its parent's size, node size relative to the root object, and so on. The ObjectProfilerVisitors class contains the default visitor used by IObjectProfileNode.dump(), as well as a visitor that can create an XML dump for more sophisticated object browsing. It is also easy to turn a profile into a Swing TreeModel.

As an illustration, let's do a full dump of the two-string array object mentioned above:

public class Main
{
    public static void main (String [] args)
    {
        Object obj = new String [] {new String ("JavaWorld"),
                                    new String ("JavaWorld")};
        
        IObjectProfileNode profile = ObjectProfiler.profile (obj);
        
        System.out.println ("obj size = " + profile.size () + " bytes");
        System.out.println (profile.dump ());
    }
    
} // End of class

This code produces:

obj size = 106 bytes
  106 ->  : String[]
    58 (54.7%) -> [0] : String
      34 (32.1%) -> String#value : char[], refcount=2
        34 (32.1%) -> 
      24 (22.6%) -> 
    24 (22.6%) -> 
    24 (22.6%) -> [1] : String
      24 (22.6%) -> 

Indeed, as explained earlier, the internal character array (referenced by java.lang.String#value) is shared between both strings. Even though ObjectProfiler.profile() assigns the ownership of this array to the first discovered string, it notices that the array is shared (shown by refcount=2 next to it).

Simply sizeof()

ObjectProfiler.profile() creates a node graph whose size overhead is typically several times the size of the original object graph. If you are interested only in the root object size, you can use a faster and more efficient ObjectProfiler.sizeof() method (see this article's download for the actual code), implemented via a nonrecursive depth-first traversal.

More examples

Let's apply profile() and sizeof() machinery to a couple of quick examples.

Java Strings are notorious memory wasters because they are so ubiquitous and because common string usage patterns can be quite inefficient. As I am sure you know, the natural string concatenation operator usually results in Strings that are not compact. This code:

String obj = "Java" + new String ("World");

produces the following profile:

obj size = 80 bytes
  80 ->  : String
    56 (70%) -> String#value : char[]
      56 (70%) -> 
    24 (30%) -> 

The value character array holds 20 chars even though it only needs 9. Compare this with the result of "Java".concat ("World") or String obj = new String ("Java" + new String ("World")):

obj size = 58 bytes
  58 ->  : String
    34 (58.6%) -> String#value : char[]
      34 (58.6%) -> 
    24 (41.4%) -> 

Obviously, if you allocate many objects with string properties constructed via the concatenation operator or StringBuffer.toString() (both cases are actually quite related under the hood), you will improve memory consumption if you instead use concat() or the String copy constructor.

As a more esoteric example of pushing this theme further, this visitor/filter will examine an object and report all uncompacted Strings inside it:

        class StringInspector implements IObjectProfileNode.INodeFilter,
                                         IObjectProfileNode.INodeVisitor
        {
            public boolean accept (IObjectProfileNode node)
            {
                m_node = null;
                final Object obj = node.object ();
                if ((obj != null) && (node.parent () != null))
                {
                    final Object parentobj = node.parent ().object ();
                    if ((obj.getClass () == char [].class)
                        && (parentobj.getClass () == String.class))
                    {
                        int wasted = ((char []) obj).length -
                                      ((String) parentobj).length (); 
                        if (wasted > 0)
                        {
                            m_node = node.parent ();
                            m_wasted += m_nodeWasted = wasted;
                        }
                    }
                }
                return true;
            }
            
            public void previsit (IObjectProfileNode node)
            {
                if (m_node != null)
                    System.out.println (ObjectProfiler.pathName (m_node.path ())
                     + ": " + m_nodeWasted  + " bytes wasted");
            }
            
            public void postvisit (IObjectProfileNode node)
            {
                // Do nothing
            }
            
            int wasted ()
            {
                return 2 * m_wasted;
            }
            
            private IObjectProfileNode m_node;
            private int m_nodeWasted, m_wasted;
            
        }; // End of local class
        IObjectProfileNode profile = ObjectProfiler.profile (obj);
        StringInspector si = new StringInspector ();
        profile.traverse (si, si);
        System.out.println ("wasted " + si.wasted () + " bytes (out of " +
            profile.size () + ")");

For an example of using sizeof(), let's look at LinkedList() versus ArrayList(). This code:

        List obj = new LinkedList (); // or ArrayList
        for (int i = 0; i < 1000; ++ i) obj.add (null);
        
        IObjectProfileNode profile = ObjectProfiler.profile (obj);   
        System.out.println ("obj size = " + profile.size () + " bytes");

populates a list instance with 1,000 null references. The size of the resulting structure is thus the storage overhead of the list implementation. For LinkedList and ArrayList collection implementations, sizeof() reports 20,040 and 4,112 bytes, respectively. Even though ArrayList grows its internal capacity ahead of its size (such that almost 50 percent of capacity could be wasted at any moment; this is done to keep the amortized cost of insertion constant), its array-based design is far more memory efficient than LinkedList()'s doubly linked list implementation, which creates a 20-byte node to store each value. (This is not to say you should never use LinkedLists: they guarantee constant unamortized insertion performance, among other things.)

Limitations

ObjectProfiler's approach is not perfect. Besides the already explained ignorance of memory alignment, another obvious problem with it is that Java object instances can share nonstatic data, such as when instance fields point to global singletons and other shared content.

Consider DecimalFormat.getPercentInstance(). Even though it returns a new NumberFormat instance each time, all of them usually share the Locale.getDefault() singleton. So, even though sizeof(DecimalFormat.getPercentInstance()) reports 1,111 bytes each time, it is an over-estimate. This is really just another manifestation of the conceptual difficulties in defining the size measure for a Java object. In a situation like that, ObjectProfiler.sizedelta(Object base, Object obj) might be handy: this method traverses the object graph rooted at base and then profiles obj using the visited object set pre-populated during the first traversal. The result is effectively computed as the total size of data owned by obj that does not appear to be owned by base. In other words, it is the amount of memory needed to instantiate obj given that base exists already (shared data is effectively subtracted out).

sizedelta(DecimalFormat.getPercentInstance(), DecimalFormat.getPercentInstance()) reports that every subsequent format instance requires 741 bytes, only a few bytes off the more precise value of 752 bytes measured by Java Tip 130's Sizeof class and much better than the original sizeof() estimate.

Another type of data ObjectProfiler can't see is native memory allocations. The result of java.nio.ByteBuffer.allocate(1000) is a JVM heap-allocated structure of 1,059 bytes, but the result of ByteBuffer.allocateDirect(1000) appears to be just 140 bytes; that's because the real storage is allocated in native memory. This is when you give up pure Java and switch to JVM Profiler Interface (JVMPI)-based profilers.

A quite obscure example of the same problem is trying to size an instance of Throwable. ObjectProfiler.sizeof(new Throwable()) reports 20 bytes, in stark contrast with 272 bytes reported by Java Tip 130's Sizeof class. The reason is this hidden field in Throwable:

    private transient Object backtrace;

The JVM treats this field in a special way: it does not show up in reflective calls even though its definition can be seen in JDK sources. Obviously, the JVM uses this object property to store some 250 bytes of native data that supports stack tracing.

Finally, profiling objects that make use of java.lang.ref.* references can lead to confusing results (e.g., results that fluctuate between repeated sizeof() calls on the same object). This happens because weak references create extra concurrency in the application, and the sheer fact of traversing such an object graph might change the reachability status of weak referents. Furthermore, boldly going and poking inside the innards of a java.lang.ref.Reference the way ObjectProfiler does might not be what pure Java code is supposed to do. It is perhaps best to enhance the traversal code to sidestep all nonstrong reference objects (it is also not even clear if such data contributes to the root object's size in the first place).

Sizing it up

This article probably goes too far trying to build a pure Java object profiler. Still, my experience has been that a quick look at a large datastructure via a simple method like ObjectProfiler.profile() can highlight easy memory savings on the order of tens and hundreds of percent. This approach can complement commercial profilers that tend to present very shallow (not graph-based) views of happenings inside the JVM heap. If nothing else, looking inside an object graph can be quite educational.

Vladimir Roubtsov has programmed in a variety of languages for more than 14 years, including Java since 1995. Currently, he develops enterprise software as a senior engineer for Trilogy in Austin, Texas.
1 2 3 Page 2
Page 2 of 3