December 26, 2003
Does Java have an operator like sizeof() in C?
A superficial answer is that Java does not provide anything like C's sizeof(). However, let's consider why a Java programmer might occasionally want it.
A C programmer manages most data structure memory allocations himself, and sizeof() is indispensable for knowing how large a memory block to allocate. Additionally, C memory allocators like malloc() do almost nothing as far as object initialization is concerned: a programmer must set all object fields that are pointers to further objects. But when all is said and coded, C/C++ memory allocation is quite efficient.
By comparison, Java object allocation and construction are tied together (it is impossible to use an allocated but uninitialized object instance). If a Java class defines fields that are references to further objects, it is also common to set them at construction time. Allocating a Java object therefore frequently allocates numerous interconnected object instances: an object graph. Coupled with automatic garbage collection, this is all too convenient and can make you feel like you never have to worry about Java memory allocation details.
Of course, this works only for simple Java applications. Compared with C/C++, equivalent Java data structures tend to occupy more physical memory. In enterprise software development, getting close to the maximum available virtual memory on today's 32-bit JVMs is a common scalability constraint. Thus, a Java programmer could benefit from sizeof() or something similar to keep an eye on whether his data structures are getting too large or contain memory bottlenecks. Fortunately, Java reflection allows you to write such a tool quite easily.
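As a first taste of what reflection offers here, the following minimal sketch collects every instance (non-static) field of a class, including inherited ones. The class and method names (InstanceFieldLister, instanceFields, Base, Derived) are hypothetical and chosen only for illustration. It shows just the field-walking step and makes no attempt to assign byte sizes to the fields.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class InstanceFieldLister {
    // Hypothetical sample classes used only to exercise the walker.
    static class Base { int id; static int counter; }
    static class Derived extends Base { long stamp; String name; }

    /** Collects the non-static fields declared by type and all of its superclasses. */
    static List<Field> instanceFields(Class<?> type) {
        List<Field> result = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                // Static fields belong to the class, not to each instance.
                if (!Modifier.isStatic(f.getModifiers())) {
                    result.add(f);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Field f : instanceFields(Derived.class)) {
            System.out.println(f.getDeclaringClass().getSimpleName()
                    + "." + f.getName() + " : " + f.getType().getSimpleName());
        }
    }
}
```

A real sizing tool would also have to follow reference fields into the objects they point to and account for each field's actual footprint, which this sketch deliberately leaves out.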
Before proceeding, I will dispense with some frequent but incorrect answers to this article's question.
Yes, a Java int is 32 bits in all JVMs and on all platforms, but this is only a language specification requirement for the programmer-perceivable width of this data type. Such an int is essentially an abstract data type and can be backed by, say, a 64-bit physical memory word on a 64-bit machine. The same goes for nonprimitive types: the Java language specification says nothing about how class fields should be aligned in physical memory, nor does it rule out implementing an array of booleans as a compact bitvector inside the JVM.
Measuring an object's size by serializing it does not work, because the serialization layout is only a remote reflection of the true in-memory layout.
One easy way to see it is by looking at how Strings get serialized: in memory every char is at least 2 bytes, but in serialized form Strings are UTF-8 encoded and so any ASCII content takes half as much space.
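The String observation is easy to check for yourself. The sketch below serializes an ASCII-only String with the standard ObjectOutputStream and prints the stream length next to the 2-bytes-per-char figure mentioned above. The class and helper names (SerializedSizeDemo, serializedSize, asciiString) are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class SerializedSizeDemo {
    /** Returns the number of bytes default Java serialization produces for obj. */
    static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the stream flushes its internal buffer into bytes.
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Builds a String of the given length that contains only the ASCII letter 'a'. */
    static String asciiString(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = asciiString(1000);
        System.out.println("chars at 2 bytes each: " + (s.length() * 2));
        System.out.println("serialized stream size: " + serializedSize(s));
    }
}
```

The serialized figure includes a few bytes of stream header and type tag, yet it still comes out at roughly half of the 2-bytes-per-char total, so the stream length says little about what the String costs on the heap.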
You might recollect "Java Tip 130: Do You Know Your Data Size?" that described a technique based on creating a large number of identical class instances and carefully measuring the resulting increase in the JVM used heap size. When applicable, this idea works very well, and I will in fact use it to bootstrap the alternate approach in this article.
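A rough sketch of that heap-delta idea follows. It is not the code from Java Tip 130, only an approximation of the technique under some assumptions: the names HeapDeltaSizer, usedHeap, and estimateObjectSize are hypothetical, and the sketch relies on Runtime.gc() actually triggering a collection, which the JVM does not guarantee.

```java
public class HeapDeltaSizer {
    /** Requests garbage collection a few times, then reports the used heap in bytes. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 4; i++) {
            rt.gc(); // only a request; the JVM may ignore it
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Estimates the heap footprint of one java.lang.Object by averaging over count instances. */
    static double estimateObjectSize(int count) {
        // Allocate the holder array first so its own size is excluded from the delta.
        Object[] keep = new Object[count];
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = new Object();
        }
        long after = usedHeap();
        // Reading the array here keeps all instances reachable during the second measurement.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("instances were not retained");
        }
        return (double) (after - before) / count;
    }

    public static void main(String[] args) {
        System.out.println("approximate bytes per Object: " + estimateObjectSize(200000));
    }
}
```

The result varies with the JVM, its settings, and the garbage collector in use, so treat it as an estimate rather than an exact size.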