Newsletter sign-up
View all newsletters

Enterprise Java Newsletter
Stay up to date on the latest tutorials and Java community news posted on JavaWorld

Sponsored Links

Optimize with a SATA RAID Storage Solution
Range of capacities as low as $1250 per TB. Ideal if you currently rely on servers/disks/JBODs

Java Tip 130: Do you know your data size?

Don't pay the price for hidden class fields

  • Print
  • Feedback

Page 2 of 4

Note carefully the places where I invoke runGC(). You can edit the code between the heap1 and heap2 declarations to instantiate anything of interest.

Also note how Sizeof prints the object size: the transitive closure of data required by all count class instances, divided by count. For most classes, the result will be memory consumed by a single class instance, including all of its owned fields. That memory footprint value differs from data provided by many commercial profilers that report shallow memory footprints (for example, if an object has an int[] field, its memory consumption will appear separately).

The results

Let's apply this simple tool to a few classes, then see if the results match our expectations.

Note: The following results are based on Sun's JDK 1.3.1 for Windows. Due to what is and is not guaranteed by the Java language and JVM specifications, you cannot apply these specific results to other platforms or other Java implementations.

java.lang.Object

Well, the root of all objects just had to be my first case. For java.lang.Object, I get:

'before' heap: 510696, 'after' heap: 1310696
heap delta: 800000, {class java.lang.Object} size = 8 bytes


So, a plain Object takes 8 bytes; of course, no one should expect the size to be 0, as every instance must carry around fields that support base operations like equals(), hashCode(), wait()/notify(), and so on.

java.lang.Integer

My colleagues and I frequently wrap native ints into Integer instances so we can store them in Java collections. How much does it cost us in memory?

'before' heap: 510696, 'after' heap: 2110696
heap delta: 1600000, {class java.lang.Integer} size = 16 bytes


The 16-byte result is a little worse than I expected because an int value can fit into just 4 extra bytes. Using an Integer costs me a 300 percent memory overhead compared to when I can store the value as a primitive type.

java.lang.Long

Long should take more memory than Integer, but it does not:

'before' heap: 510696, 'after' heap: 2110696
heap delta: 1600000, {class java.lang.Long} size = 16 bytes


Clearly, actual object size on the heap is subject to low-level memory alignment done by a particular JVM implementation for a particular CPU type. It looks like a Long is 8 bytes of Object overhead, plus 8 bytes more for the actual long value. In contrast, Integer had an unused 4-byte hole, most likely because the JVM I use forces object alignment on an 8-byte word boundary.

Arrays

Playing with primitive type arrays proves instructive, partly to discover any hidden overhead and partly to justify another popular trick: wrapping primitive values in a size-1 array to use them as objects. By modifying Sizeof.main() to have a loop that increments the created array length on every iteration, I get for int arrays:

length: 0, {class [I} size = 16 bytes
length: 1, {class [I} size = 16 bytes
length: 2, {class [I} size = 24 bytes
length: 3, {class [I} size = 24 bytes
length: 4, {class [I} size = 32 bytes
length: 5, {class [I} size = 32 bytes
length: 6, {class [I} size = 40 bytes
length: 7, {class [I} size = 40 bytes
length: 8, {class [I} size = 48 bytes
length: 9, {class [I} size = 48 bytes
length: 10, {class [I} size = 56 bytes


and for char arrays:

length: 0, {class [C} size = 16 bytes
length: 1, {class [C} size = 16 bytes
length: 2, {class [C} size = 16 bytes
length: 3, {class [C} size = 24 bytes
length: 4, {class [C} size = 24 bytes
length: 5, {class [C} size = 24 bytes
length: 6, {class [C} size = 24 bytes
length: 7, {class [C} size = 32 bytes
length: 8, {class [C} size = 32 bytes
length: 9, {class [C} size = 32 bytes
length: 10, {class [C} size = 32 bytes


Above, the evidence of 8-byte alignment pops up again. Also, in addition to the inevitable Object 8-byte overhead, a primitive array adds another 8 bytes (out of which at least 4 bytes support the length field). And using int[1] appears to not offer any memory advantages over an Integer instance, except maybe as a mutable version of the same data.

Multidimensional arrays

Multidimensional arrays offer another surprise. Developers commonly employ constructs like int[dim1][dim2] in numerical and scientific computing. In an int[dim1][dim2] array instance, every nested int[dim2] array is an Object in its own right. Each adds the usual 16-byte array overhead. When I don't need a triangular or ragged array, that represents pure overhead. The impact grows when array dimensions greatly differ. For example, a int[128][2] instance takes 3,600 bytes. Compared to the 1,040 bytes an int[256] instance uses (which has the same capacity), 3,600 bytes represent a 246 percent overhead. In the extreme case of byte[256][1], the overhead factor is almost 19! Compare that to the C/C++ situation in which the same syntax does not add any storage overhead.

java.lang.String

Let's try an empty String, first constructed as new String():

'before' heap: 510696, 'after' heap: 4510696
heap delta: 4000000, {class java.lang.String} size = 40 bytes


The result proves quite depressing. An empty String takes 40 bytes—enough memory to fit 20 Java characters.

Before I try Strings with content, I need a helper method to create Strings guaranteed not to get interned. Merely using literals as in:

    object = "string with 20 chars";


will not work because all such object handles will end up pointing to the same String instance. The language specification dictates such behavior (see also the java.lang.String.intern() method). Therefore, to continue our memory snooping, try:

    public static String createString (final int length)
    {
        char [] result = new char [length];
        for (int i = 0; i < length; ++ i) result [i] = (char) i;
        
        return new String (result);
    }


After arming myself with this String creator method, I get the following results:

length: 0, {class java.lang.String} size = 40 bytes
length: 1, {class java.lang.String} size = 40 bytes
length: 2, {class java.lang.String} size = 40 bytes
length: 3, {class java.lang.String} size = 48 bytes
length: 4, {class java.lang.String} size = 48 bytes
length: 5, {class java.lang.String} size = 48 bytes
length: 6, {class java.lang.String} size = 48 bytes
length: 7, {class java.lang.String} size = 56 bytes
length: 8, {class java.lang.String} size = 56 bytes
length: 9, {class java.lang.String} size = 56 bytes
length: 10, {class java.lang.String} size = 56 bytes


The results clearly show that a String's memory growth tracks its internal char array's growth. However, the String class adds another 24 bytes of overhead. For a nonempty String of size 10 characters or less, the added overhead cost relative to useful payload (2 bytes for each char plus 4 bytes for the length), ranges from 100 to 400 percent.

  • Print
  • Feedback

Resources