Objects and arrays
A look at the bytecodes that deal with objects and arrays in the Java virtual machine
By Bill Venners, JavaWorld.com, 12/01/96
Welcome to another edition of Under The Hood. This column focuses on Java's underlying technologies. It aims to give developers a glimpse of the mechanisms that make
their Java programs run. This month's article takes a look at the bytecodes that deal with objects and arrays.
Object-oriented machine
The Java virtual machine (JVM) works with data in three forms: objects, object references, and primitive types. Objects reside
on the garbage-collected heap. Object references and primitive types reside either on the Java stack as local variables, on
the heap as instance variables of objects, or in the method area as class variables.
In the Java virtual machine, memory is allocated on the garbage-collected heap only as objects. There is no way to allocate
memory for a primitive type on the heap, except as part of an object. If you want to use a primitive type where an Object reference is needed, you can allocate a wrapper object for the type from the java.lang package. For example, there is an Integer class that wraps an int type with an object. Only object references and primitive types can reside on the Java stack as local variables. Objects
can never reside on the Java stack.
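The wrapper idea can be sketched in a few lines of Java. The class and method names here (WrapperDemo, boxAndUnbox) are invented for illustration; only java.lang.Integer comes from the text. The sketch uses Integer.valueOf, which was added to the library well after this article was written; at the time one would have written new Integer(primitive).

```java
// Hypothetical demo class, not part of the original article.
class WrapperDemo {
    // Wraps a primitive int in an Integer object, then unwraps it again.
    static int boxAndUnbox(int primitive) {
        // 'primitive' is a local variable, so it lives on the Java stack.
        Integer wrapped = Integer.valueOf(primitive); // the Integer object lives on the heap
        Object asObject = wrapped;                    // usable wherever an Object reference is needed
        return ((Integer) asObject).intValue();       // back to a primitive int
    }

    public static void main(String[] args) {
        System.out.println(boxAndUnbox(42));
    }
}
```

Only the references wrapped and asObject sit on the stack; the Integer they point to is a heap object.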
The architectural separation of objects and primitive types in the JVM is reflected in the Java programming language, in which
objects cannot be declared as local variables. Only object references can be declared as such. Upon declaration, an object
reference refers to nothing. Only after the reference has been explicitly initialized -- either with a reference to an existing
object or with a call to new -- does the reference refer to an actual object.
In the JVM instruction set, all objects are instantiated and accessed with the same set of opcodes, except for arrays. In
Java, arrays are full-fledged objects, and, like any other object in a Java program, are created dynamically. Array references
can be used anywhere a reference to type Object is called for, and any method of Object can be invoked on an array. Yet, in the Java virtual machine, arrays are handled with special bytecodes.
As with any other object, arrays cannot be declared as local variables; only array references can. Array objects themselves
always contain either an array of primitive types or an array of object references. If you declare an array of objects, you
get an array of object references. The objects themselves must be explicitly created with new and assigned to the elements of the array.
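A minimal sketch of the two points above: an array of objects starts out as an array of null references, and an array reference can stand in for an Object reference. The class and method names are assumptions made for this example.

```java
// Hypothetical demo class, not part of the original article.
class ArrayOfRefsDemo {
    // Counts the null elements of a freshly allocated String[].
    static int countNulls(int length) {
        String[] names = new String[length]; // allocates references only, no String objects
        int nulls = 0;
        for (String s : names) {
            if (s == null) {
                nulls++;
            }
        }
        return nulls;
    }

    // Treats an int[] as a plain Object and invokes Object methods on it.
    static boolean arrayIsObject() {
        int[] numbers = new int[3];
        Object o = numbers; // an array reference is accepted where Object is expected
        return o.getClass().isArray() && o.equals(numbers);
    }

    public static void main(String[] args) {
        System.out.println(countNulls(4));
        System.out.println(arrayIsObject());
    }
}
```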
Opcodes for objects
Instantiation of new objects is accomplished via the new opcode. Two one-byte operands follow the new opcode. These two bytes are combined to form a 16-bit index into the constant pool. The constant pool element at the specified
index gives information about the class of the new object. The JVM creates a new instance of that class on the heap and pushes
the reference to the new object onto the stack, as shown below.
| Opcode | Operand(s) | Description |
| new | indexbyte1, indexbyte2 | creates a new object on the heap, pushes reference |
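As a rough illustration, here is a Java fragment whose object creation compiles to the new opcode. The classes SimplePoint and NewDemo are hypothetical. The bytecode in the comment is what javac typically emits for such a statement; the constant pool indices (written #k and #m) vary from class file to class file, so treat the listing as a sketch and check it with javap -c.

```java
// Hypothetical classes, not part of the original article.
class SimplePoint {
    int x;
}

class NewDemo {
    static SimplePoint make() {
        // javac typically compiles the next statement to something like:
        //   new           #k   // class SimplePoint (16-bit constant pool index)
        //   dup                // duplicate the reference for the constructor call
        //   invokespecial #m   // SimplePoint.<init>()V
        //   astore_0           // store the reference in local variable 0
        SimplePoint p = new SimplePoint();
        return p;
    }

    public static void main(String[] args) {
        System.out.println(make() != null);
    }
}
```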
The next table shows the opcodes that put and get object fields. These opcodes, putfield and getfield, operate only on fields
that are instance variables. Static variables are accessed by putstatic and getstatic, which are described later. The putfield
and getfield instructions each take two one-byte operands. The operands are combined to form a 16-bit index into the constant
pool. The constant pool item at that index contains information about the type, size, and offset of the field. The object
reference is taken from the stack in both the putfield and getfield instructions. The putfield instruction takes the instance
variable value from the stack, and the getfield instruction pushes the retrieved instance variable value onto the stack.
Accessing instance variables
| Opcode | Operand(s) | Description |
| putfield | indexbyte1, indexbyte2 | set field, indicated by index, of object to value (both taken from stack) |
| getfield | indexbyte1, indexbyte2 | pushes field, indicated by index, of object (taken from stack) |
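A small sketch of instance variable access follows. FieldDemo is an invented class; the bytecode in the comments is the shape javac usually produces, with #k standing in for whatever constant pool index the compiler assigns.

```java
// Hypothetical demo class, not part of the original article.
class FieldDemo {
    int count; // instance variable

    void set(int value) {
        // roughly: aload_0 (push this), iload_1 (push value), putfield #k, return
        this.count = value;
    }

    int get() {
        // roughly: aload_0 (push this), getfield #k, ireturn
        return this.count;
    }

    public static void main(String[] args) {
        FieldDemo d = new FieldDemo();
        d.set(5);
        System.out.println(d.get());
    }
}
```

In both methods the object reference is pushed first, which matches the description above: putfield consumes the reference and the value, while getfield consumes only the reference.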
Class variables are accessed via the getstatic and putstatic opcodes, as shown in the table below. Both getstatic and putstatic
take two one-byte operands, which are combined by the JVM to form a 16-bit unsigned offset into the constant pool. The constant
pool item at that location gives information about one static field of a class. Because there is no particular object associated
with a static field, there is no object reference used by either getstatic or putstatic. The putstatic instruction takes the
value to assign from the stack. The getstatic instruction pushes the retrieved value onto the stack.
Accessing class variables
| Opcode | Operand(s) | Description |
| putstatic | indexbyte1, indexbyte2 | set static field, indicated by index, to value (taken from stack) |
| getstatic | indexbyte1, indexbyte2 | pushes static field, indicated by index, onto stack |
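The static counterpart can be sketched the same way. StaticDemo is a made-up class, and the bytecode in the comments is only the usual javac output; note that no object reference is pushed before either instruction.

```java
// Hypothetical demo class, not part of the original article.
class StaticDemo {
    static int total; // class variable

    static void setTotal(int value) {
        // roughly: iload_0 (push value), putstatic #k, return
        total = value;
    }

    static int getTotal() {
        // roughly: getstatic #k, ireturn
        return total;
    }

    public static void main(String[] args) {
        setTotal(9);
        System.out.println(getTotal());
    }
}
```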
The following opcodes check to see whether the object reference on the top of the stack refers to an instance of the class
or interface indexed by the operands following the opcode. The checkcast instruction throws ClassCastException if the object is not an instance of the specified class or interface. Otherwise, checkcast does nothing. The object reference
remains on the stack and execution is continued at the next instruction. This instruction ensures that casts are safe at run
time and forms part of the JVM's security blanket.
The instanceof instruction pops the object reference from the top of the stack and pushes true or false. If the object is
indeed an instance of the specified class or interface, then true is pushed onto the stack, otherwise, false is pushed onto
the stack. The instanceof instruction is used to implement the instanceof keyword of Java, which allows programmers to test whether an object is an instance of a particular class or interface.
| Opcode | Operand(s) | Description |
| checkcast | indexbyte1, indexbyte2 | Throws ClassCastException if objectref on stack cannot be cast to class at index |
| instanceof | indexbyte1, indexbyte2 | Pushes true if objectref on stack is an instanceof class at index, else pushes false |
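The difference between the two instructions shows up directly in Java source. The sketch below uses an invented class, CastDemo: a cast expression compiles to checkcast, and the instanceof keyword compiles to the instanceof instruction. It also exercises the null case, in which a cast succeeds but instanceof yields false.

```java
// Hypothetical demo class, not part of the original article.
class CastDemo {
    // Returns true if casting 'o' to String succeeds, false if it throws.
    static boolean castSucceeds(Object o) {
        try {
            String s = (String) o; // the cast compiles to checkcast
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    // Returns true if 'o' refers to a String.
    static boolean isString(Object o) {
        return o instanceof String; // compiles to the instanceof instruction
    }

    public static void main(String[] args) {
        System.out.println(castSucceeds("text"));
        System.out.println(castSucceeds(Integer.valueOf(1)));
        System.out.println(isString(null));
    }
}
```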
Opcodes for arrays
Instantiation of new arrays is accomplished via the newarray, anewarray, and multianewarray opcodes. The newarray opcode is
used to create arrays of primitive types, as opposed to arrays of object references. The particular primitive type is specified by a single
one-byte operand following the newarray opcode. The newarray instruction can create arrays for byte, short, char, int, long,
float, double, or boolean.
The anewarray instruction creates an array of object references. Two one-byte operands follow the anewarray opcode and are
combined to form a 16-bit index into the constant pool. A description of the class of object for which the array is to be
created is found in the constant pool at the specified index. This instruction allocates space for the array of object references
and initializes the references to null.
The multianewarray instruction is used to allocate multidimensional arrays. Multidimensional arrays are simply arrays of arrays, so they could also
be allocated with repeated use of the anewarray and newarray instructions. The multianewarray instruction simply compresses
the bytecodes needed to create multidimensional arrays into one instruction. Two one-byte operands follow the multianewarray
opcode and are combined to form a 16-bit index into the constant pool. A description of the class of object for which the
array is to be created is found in the constant pool at the specified index. Immediately following the two one-byte operands
that form the constant pool index is a one-byte operand that specifies the number of dimensions in this multidimensional array.
The sizes for each dimension are popped off the stack. This instruction allocates space for all arrays that are needed to
implement the multidimensional arrays.
| Opcode | Operand(s) | Description |
| newarray | atype | pops length, allocates new array of primitive types of type indicated by atype, pushes objectref of new array |
| anewarray | indexbyte1, indexbyte2 | pops length, allocates a new array of objects of class indicated by indexbyte1 and indexbyte2, pushes objectref of new array |
| multianewarray | indexbyte1, indexbyte2, dimensions | pops dimensions number of array lengths, allocates a new multidimensional array of class indicated by indexbyte1 and indexbyte2, pushes objectref of new array |
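The three array-creation opcodes correspond to three kinds of Java array expression. The following sketch, with an invented class ArrayCreationDemo, shows one of each, plus a hand-built equivalent of the multidimensional case. The opcode named in each comment is what javac normally selects, stated here as an expectation rather than a guarantee.

```java
// Hypothetical demo class, not part of the original article.
class ArrayCreationDemo {
    static int[] primitives() {
        return new int[4];       // newarray, atype = int
    }

    static String[] references() {
        return new String[4];    // anewarray for class java.lang.String
    }

    static int[][] grid() {
        return new int[2][3];    // multianewarray for class [[I, 2 dimensions
    }

    // Builds the same shape as grid() one dimension at a time.
    static int[][] gridByHand() {
        int[][] g = new int[2][];      // anewarray: two references to int[], all null
        for (int i = 0; i < g.length; i++) {
            g[i] = new int[3];         // newarray for each row
        }
        return g;
    }

    public static void main(String[] args) {
        System.out.println(grid()[1].length);
        System.out.println(gridByHand()[1].length);
    }
}
```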
The next table shows the instruction that pops an array reference off the top of the stack and pushes the length of that array.
| Opcode | Operand(s) | Description |
| arraylength | (none) | pops objectref of an array, pushes length of that array |
The following opcodes retrieve an element from an array. The array index and array reference are popped from the stack, and
the value at the specified index of the specified array is pushed back onto the stack.
Retrieving an array element
| Opcode | Operand(s) | Description |
| baload | (none) | pops index and arrayref of an array of bytes, pushes arrayref[index] |
| caload | (none) | pops index and arrayref of an array of chars, pushes arrayref[index] |
| saload | (none) | pops index and arrayref of an array of shorts, pushes arrayref[index] |
| iaload | (none) | pops index and arrayref of an array of ints, pushes arrayref[index] |
| laload | (none) | pops index and arrayref of an array of longs, pushes arrayref[index] |
| faload | (none) | pops index and arrayref of an array of floats, pushes arrayref[index] |
| daload | (none) | pops index and arrayref of an array of doubles, pushes arrayref[index] |
| aaload | (none) | pops index and arrayref of an array of objectrefs, pushes arrayref[index] |
The next table shows the opcodes that store a value into an array element. The value, index, and array reference are popped
from the top of the stack.
Storing to an array element
| Opcode | Operand(s) | Description |
| bastore | (none) | pops value, index, and arrayref of an array of bytes, assigns arrayref[index] = value |
| castore | (none) | pops value, index, and arrayref of an array of chars, assigns arrayref[index] = value |
| sastore | (none) | pops value, index, and arrayref of an array of shorts, assigns arrayref[index] = value |
| iastore | (none) | pops value, index, and arrayref of an array of ints, assigns arrayref[index] = value |
| lastore | (none) | pops value, index, and arrayref of an array of longs, assigns arrayref[index] = value |
| fastore | (none) | pops value, index, and arrayref of an array of floats, assigns arrayref[index] = value |
| dastore | (none) | pops value, index, and arrayref of an array of doubles, assigns arrayref[index] = value |
| aastore | (none) | pops value, index, and arrayref of an array of objectrefs, assigns arrayref[index] = value |
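To tie the length, load, and store opcodes together, here is a short sketch with an invented class, ElementAccessDemo. Each comment names the array opcode that javac would normally choose for the element type involved; that mapping is the expected one, not something this example verifies.

```java
// Hypothetical demo class, not part of the original article.
class ElementAccessDemo {
    // Fills an int[] with squares, then sums the elements.
    static int sumOfSquares(int n) {
        int[] squares = new int[n];
        for (int i = 0; i < squares.length; i++) { // arraylength
            squares[i] = i * i;                    // iastore
        }
        int sum = 0;
        for (int i = 0; i < squares.length; i++) { // arraylength
            sum += squares[i];                     // iaload
        }
        return sum;
    }

    // Replaces the first element of a String[] with its upper-case form.
    static String upperFirst(String[] words) {
        words[0] = words[0].toUpperCase();         // aaload to read, aastore to write
        return words[0];                           // aaload
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));
        System.out.println(upperFirst(new String[] { "abc" }));
    }
}
```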
Three-dimensional array: a Java virtual machine simulation
The applet below demonstrates a Java virtual machine executing a sequence of bytecodes. The bytecode sequence in the simulation
was generated by javac for the initAnArray() method of the class shown below:
class ArrayDemo {
    static void initAnArray() {
        int[][][] threeD = new int[5][4][3];
        for (int i = 0; i < 5; ++i) {
            for (int j = 0; j < 4; ++j) {
                for (int k = 0; k < 3; ++k) {
                    threeD[i][j][k] = i + j + k;
                }
            }
        }
    }
}
The bytecodes generated by javac for initAnArray() are shown below:
0 iconst_5 // Push constant int 5.
1 iconst_4 // Push constant int 4.
2 iconst_3 // Push constant int 3.
// Create a new multi-dimensional array using constant pool
// entry #2 as the class (which is [[[I, a 3D array of ints)
// with a dimension of 3.
3 multianewarray #2 dim #3 <Class [[[I>
7 astore_0 // Pop object ref into local variable 0: int threeD[][][] = new int[5][4][3];
8 iconst_0 // Push constant int 0.
9 istore_1 // Pop int into local variable 1: int i = 0;
10 goto 54 // Go to section of code that tests outer loop.
13 iconst_0 // Push constant int 0.
14 istore_2 // Pop int into local variable 2: int j = 0;
15 goto 46 // Go to section of code that tests middle loop.
18 iconst_0 // Push constant int 0.
19 istore_3 // Pop int into local variable 3: int k = 0;
20 goto 38 // Go to section of code that tests inner loop.
23 aload_0 // Push object ref from local variable 0.
24 iload_1 // Push int from local variable 1 (i).
25 aaload // Pop index and arrayref, push object ref at arrayref[index] (gets threeD[i]).
26 iload_2 // Push int from local variable 2 (j).
27 aaload // Pop index and arrayref, push object ref at arrayref[index] (gets threeD[i][j]).
28 iload_3 // Push int from local variable 3 (k).
// Now calculate the int that will be assigned to threeD[i][j][k]
29 iload_1 // Push int from local variable 1 (i).
30 iload_2 // Push int from local variable 2 (j).
31 iadd // Pop two ints, add them, push int result (i + j).
32 iload_3 // Push int from local variable 3 (k).
33 iadd // Pop two ints, add them, push int result (i + j + k).
34 iastore // Pop value, index, and arrayref; assign arrayref[index] = value: threeD[i][j][k] = i + j + k;
35 iinc 3 1 // Increment by 1 the int in local variable 3: ++k;
38 iload_3 // Push int from local variable 3 (k).
39 iconst_3 // Push constant int 3.
40 if_icmplt 23 // Pop right and left ints, jump if left < right: for (...; k < 3;...)
43 iinc 2 1 // Increment by 1 the int in local variable 2: ++j;
46 iload_2 // Push int from local variable 2 (j).
47 iconst_4 // Push constant int 4.
48 if_icmplt 18 // Pop right and left ints, jump if left < right: for (...; j < 4;...)
51 iinc 1 1 // Increment by 1 the int in local variable 1: ++i;
54 iload_1 // Push int from local variable 1 (i).
55 iconst_5 // Push constant int 5.
56 if_icmplt 13 // Pop right and left ints, jump if left < right: for (...; i < 5;...)
59 return
The initAnArray() method merely allocates and initializes a three-dimensional array. This simulation demonstrates how the Java virtual machine
handles multidimensional arrays. In response to the multianewarray instruction, which in this example requests the allocation
of a three-dimensional array, the JVM creates a tree of one-dimensional arrays. The reference returned by the multianewarray
instruction refers to the base one-dimensional array in the tree. In the initAnArray() method, the base array has five components -- threeD[0] through threeD[4]. Each component of the base array is itself a reference to a one-dimensional array of four components, accessed by threeD[0][0] through threeD[4][3]. The components of these five arrays are also references to arrays, each of which has three components. These components
are ints, the elements of this multidimensional array, and they are accessed by threeD[0][0][0] through threeD[4][3][2].
In response to the multianewarray instruction in the initAnArray() method, the Java virtual machine creates one array of arrays with five components, five arrays of arrays with four components each, and
twenty arrays of ints with three components each. The JVM allocates these 26 arrays on the heap, initializes their components such that they form a tree, and returns the
reference to the base array.
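The count of 26 arrays (1 + 5 + 20) can be checked with a short recursive walk over the tree. This is a sketch with an invented class and method name; it relies on the fact that every array whose components are themselves arrays is an instance of Object[], while the leaf int[] arrays are not.

```java
// Hypothetical demo class, not part of the original article.
class ArrayTreeDemo {
    // Counts every array object reachable from 'node', including 'node' itself.
    static int countArrays(Object node) {
        if (node == null || !node.getClass().isArray()) {
            return 0;
        }
        int count = 1; // the array itself
        if (node instanceof Object[]) {
            // An array of references: descend into each component.
            for (Object child : (Object[]) node) {
                count += countArrays(child);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countArrays(new int[5][4][3]));
    }
}
```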
To assign an int value to an element of the three-dimensional array, the JVM uses aaload to get a component of the base array. Then the JVM
uses aaload again on this component -- which is itself an array of arrays -- to get a component of the branch array. This
component is a reference to a leaf array of ints. Finally the JVM uses iastore to assign an int value to the element of the leaf array. The JVM uses multiple one-dimensional array accesses to accomplish operations on
multidimensional arrays.
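The same sequence of one-dimensional accesses can be written out by hand in Java. In the sketch below, the class name and the local variable names branch and leaf are invented to mirror the tree terminology used above; the opcode in each comment is the one javac would normally emit for that access.

```java
// Hypothetical demo class, not part of the original article.
class StepwiseAccessDemo {
    // Stores 'value' at threeD[i][j][k] one dimension at a time, then reads it back.
    static int storeAndRead(int[][][] threeD, int i, int j, int k, int value) {
        int[][] branch = threeD[i]; // aaload: component of the base array
        int[] leaf = branch[j];     // aaload: component of the branch array
        leaf[k] = value;            // iastore: element of the leaf array
        return threeD[i][j][k];     // aaload, aaload, iaload
    }

    public static void main(String[] args) {
        System.out.println(storeAndRead(new int[5][4][3], 4, 3, 2, 9));
    }
}
```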