Objects and arrays

A look at the bytecodes that deal with objects and arrays in the Java virtual machine

Welcome to another edition of Under The Hood. This column focuses on Java's underlying technologies. It aims to give developers a glimpse of the mechanisms that make their Java programs run. This month's article takes a look at the bytecodes that deal with objects and arrays.

Object-oriented machine

The Java virtual machine (JVM) works with data in three forms: objects, object references, and primitive types. Objects reside on the garbage-collected heap. Object references and primitive types reside either on the Java stack as local variables, on the heap as instance variables of objects, or in the method area as class variables.

In the Java virtual machine, memory is allocated on the garbage-collected heap only as objects. There is no way to allocate memory for a primitive type on the heap, except as part of an object. If you want to use a primitive type where an Object reference is needed, you can allocate a wrapper object for the type from the java.lang package. For example, there is an Integer class that wraps an int type with an object. Only object references and primitive types can reside on the Java stack as local variables. Objects can never reside on the Java stack.

The architectural separation of objects and primitive types in the JVM is reflected in the Java programming language, in which objects cannot be declared as local variables. Only object references can be declared as such. Upon declaration, an object reference refers to nothing. Only after the reference has been explicitly initialized -- either with a reference to an existing object or with a call to new -- does the reference refer to an actual object.
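
To make the distinction concrete, here is a minimal sketch in Java. The class and method names (ReferenceDemo, box, sameObject) are hypothetical and chosen only for illustration. It shows a primitive wrapped in a heap object, a reference that refers to nothing until it is initialized, and two references naming the same object.

```java
class ReferenceDemo {
    // Wraps a primitive int in a heap-allocated Integer so it can be
    // used where an Object reference is needed.
    static Object box(int value) {
        return Integer.valueOf(value);
    }

    // Assigning one reference to another copies the reference, not the object.
    static boolean sameObject() {
        StringBuilder first = new StringBuilder("heap");
        StringBuilder second = first;      // both now refer to one object
        second.append("!");                // visible through either reference
        return first == second && first.toString().equals("heap!");
    }

    public static void main(String[] args) {
        StringBuilder ref = null;          // declared, but refers to no object
        System.out.println(ref == null);   // prints true
        ref = new StringBuilder();         // now refers to an actual object
        System.out.println(box(42) instanceof Integer);
        System.out.println(sameObject());
    }
}
```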

In the JVM instruction set, all objects are instantiated and accessed with the same set of opcodes, except for arrays. In Java, arrays are full-fledged objects, and, like any other object in a Java program, are created dynamically. Array references can be used anywhere a reference to type Object is called for, and any method of Object can be invoked on an array. Yet, in the Java virtual machine, arrays are handled with special bytecodes.

As with any other object, arrays cannot be declared as local variables; only array references can. Array objects themselves always contain either an array of primitive types or an array of object references. If you declare an array of objects, you get an array of object references. The objects themselves must be explicitly created with new and assigned to the elements of the array.
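
The following sketch illustrates both points: an array reference can stand in for an Object reference, and an array of objects starts out as an array of null references. The class name ArrayObjectDemo and its methods are hypothetical.

```java
class ArrayObjectDemo {
    // An array reference can be used wherever an Object reference is expected,
    // and Object's methods can be invoked on it.
    static String primitiveArrayClassName() {
        int[] ints = new int[3];
        Object asObject = ints;
        return asObject.getClass().getName();   // "[I" is the JVM's name for int[]
    }

    // Declaring an array of objects allocates only the references.
    static boolean elementsStartNull() {
        String[] names = new String[2];          // two null references, no Strings yet
        boolean allNull = names[0] == null && names[1] == null;
        names[0] = new String("created explicitly with new");
        return allNull && names[0] != null;
    }

    public static void main(String[] args) {
        System.out.println(primitiveArrayClassName());
        System.out.println(elementsStartNull());
    }
}
```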

Opcodes for objects

Instantiation of new objects is accomplished via the new opcode. Two one-byte operands follow the new opcode. These two bytes are combined to form a 16-bit index into the constant pool. The constant pool element at the specified index gives information about the class of the new object. The JVM creates a new instance of that class on the heap and pushes the reference to the new object onto the stack, as shown below.

Object creation


Opcode  Operand(s)              Description
new     indexbyte1, indexbyte2  creates a new object on the heap, pushes reference
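
As a small illustration, the Java source below should produce a new instruction when compiled. The class name NewDemo is hypothetical, and the instruction sequence in the comment is what javac typically emits (the constant pool indices vary from class to class); you can check it on your own system by disassembling the class with javap -c.

```java
class NewDemo {
    static Object make() {
        // Typical javac output for this expression:
        //   new           <index of class java/lang/Object>  allocate, push reference
        //   dup                                              keep a copy for the caller
        //   invokespecial <index of Object.<init>>           run the constructor
        return new Object();
    }

    public static void main(String[] args) {
        // Each execution of new creates a distinct object on the heap.
        System.out.println(make() != make());
    }
}
```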


The next table shows the opcodes that put and get object fields. These opcodes, putfield and getfield, operate only on fields that are instance variables. Static variables are accessed by putstatic and getstatic, which are described later. The putfield and getfield instructions each take two one-byte operands. The operands are combined to form a 16-bit index into the constant pool. The constant pool item at that index contains information about the type, size, and offset of the field. The object reference is taken from the stack in both the putfield and getfield instructions. The putfield instruction takes the instance variable value from the stack, and the getfield instruction pushes the retrieved instance variable value onto the stack.

Accessing instance variables


Opcode    Operand(s)              Description
putfield  indexbyte1, indexbyte2  set field, indicated by index, of object to value (both taken from stack)
getfield  indexbyte1, indexbyte2  pushes field, indicated by index, of object (taken from stack)
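
A pair of accessor methods is about the simplest source that exercises these two opcodes. The Counter class below is a hypothetical example; the bytecodes in the comments are what javac typically generates for such methods.

```java
class Counter {
    private int count;            // an instance variable

    void set(int value) {
        // Typically: aload_0 (push this), iload_1 (push value), putfield
        this.count = value;
    }

    int get() {
        // Typically: aload_0 (push this), getfield, ireturn
        return this.count;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.set(7);
        System.out.println(c.get());   // prints 7
    }
}
```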


Class variables are accessed via the getstatic and putstatic opcodes, as shown in the table below. Both getstatic and putstatic take two one-byte operands, which are combined by the JVM to form a 16-bit unsigned index into the constant pool. The constant pool item at that index gives information about one static field of a class. Because there is no particular object associated with a static field, there is no object reference used by either getstatic or putstatic. The putstatic instruction takes the value to assign from the stack. The getstatic instruction pushes the retrieved value onto the stack.

Accessing class variables


Opcode     Operand(s)              Description
putstatic  indexbyte1, indexbyte2  set static field, indicated by index, to value (taken from stack)
getstatic  indexbyte1, indexbyte2  pushes static field, indicated by index
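
The static counterpart of an accessor pair shows the difference: no object reference is pushed before the field access. The Registry class is hypothetical, and the bytecodes in the comments are what javac typically generates.

```java
class Registry {
    static int total;             // a class variable

    static void setTotal(int value) {
        // Typically: iload_0 (push value), putstatic. No objectref is involved.
        total = value;
    }

    static int getTotal() {
        // Typically: getstatic, ireturn
        return total;
    }

    public static void main(String[] args) {
        setTotal(12);
        System.out.println(getTotal());   // prints 12
    }
}
```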


The following opcodes check to see whether the object reference on the top of the stack refers to an instance of the class or interface indexed by the operands following the opcode. The checkcast instruction throws ClassCastException if the object is not an instance of the specified class or interface. Otherwise, checkcast does nothing. The object reference remains on the stack and execution continues at the next instruction. This instruction ensures that casts are safe at run time and forms part of the JVM's security blanket.

The instanceof instruction pops the object reference from the top of the stack and pushes true or false, represented on the stack as the int 1 or 0. If the object is indeed an instance of the specified class or interface, then true is pushed onto the stack; otherwise, false is pushed. The instanceof instruction is used to implement the instanceof keyword of Java, which allows programmers to test whether an object is an instance of a particular class or interface.

Type checking


Opcode      Operand(s)              Description
checkcast   indexbyte1, indexbyte2  throws ClassCastException if objectref on stack cannot be cast to class at index
instanceof  indexbyte1, indexbyte2  pushes true if objectref on stack is an instanceof class at index, else pushes false
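
Here is a short sketch of the two checks as seen from Java source. The class TypeCheckDemo and its method names are hypothetical. A cast expression is where javac normally emits checkcast, and the instanceof keyword maps to the instanceof opcode. Note that a null reference passes a cast but is never an instance of anything.

```java
class TypeCheckDemo {
    // The instanceof keyword compiles to the instanceof opcode.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // A reference cast compiles to checkcast, which throws
    // ClassCastException when the object is not of the target type.
    static boolean castFails(Object o) {
        try {
            String s = (String) o;
            return false;                 // the cast was allowed
        } catch (ClassCastException e) {
            return true;                  // the cast was rejected at run time
        }
    }

    public static void main(String[] args) {
        System.out.println(isString("text"));                   // prints true
        System.out.println(isString(Integer.valueOf(1)));       // prints false
        System.out.println(castFails(Integer.valueOf(1)));      // prints true
    }
}
```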


Opcodes for arrays


Instantiation of new arrays is accomplished via the newarray, anewarray, and multianewarray opcodes. The newarray opcode is used to create arrays of primitive types. The particular primitive type is specified by a single one-byte operand following the newarray opcode. The newarray instruction can create arrays for byte, short, char, int, long, float, double, or boolean.

The anewarray instruction creates an array of object references. Two one-byte operands follow the anewarray opcode and are combined to form a 16-bit index into the constant pool. A description of the class of object for which the array is to be created is found in the constant pool at the specified index. This instruction allocates space for the array of object references and initializes the references to null.

The multianewarray instruction is used to allocate multidimensional arrays, which are simply arrays of arrays and could also be allocated with repeated use of the anewarray and newarray instructions. The multianewarray instruction simply compresses the bytecodes needed to create multidimensional arrays into one instruction. Two one-byte operands follow the multianewarray opcode and are combined to form a 16-bit index into the constant pool. A description of the class of object for which the array is to be created is found in the constant pool at the specified index. Immediately following the two one-byte operands that form the constant pool index is a one-byte operand that specifies the number of dimensions in this multidimensional array. The sizes for each dimension are popped off the stack. This instruction allocates space for all arrays that are needed to implement the multidimensional arrays.

Creating new arrays


Opcode          Operand(s)                          Description
newarray        atype                               pops length, allocates new array of primitive types of type indicated by atype, pushes objectref of new array
anewarray       indexbyte1, indexbyte2              pops length, allocates a new array of objects of class indicated by indexbyte1 and indexbyte2, pushes objectref of new array
multianewarray  indexbyte1, indexbyte2, dimensions  pops dimensions number of array lengths, allocates a new multidimensional array of class indicated by indexbyte1 and indexbyte2, pushes objectref of new array
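
The sketch below pairs each of the three opcodes with a Java expression that typically compiles to it, and builds the same two-dimensional array once with a single expression and once by hand. The class ArrayCreationDemo and its methods are hypothetical, and the opcode comments describe typical javac output rather than a guarantee.

```java
class ArrayCreationDemo {
    static int[] primitives(int n) {
        return new int[n];            // typically newarray with atype int
    }

    static String[] references(int n) {
        return new String[n];         // typically anewarray; elements start as null
    }

    static int[][] grid() {
        return new int[2][3];         // typically multianewarray with 2 dimensions
    }

    // The same shape built with repeated one-dimensional allocations.
    static int[][] gridByHand() {
        int[][] rows = new int[2][];  // typically anewarray: an array of int[] references
        for (int i = 0; i < rows.length; i++) {
            rows[i] = new int[3];     // typically newarray for each row
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(grid().length + " x " + grid()[0].length);
        System.out.println(gridByHand().length + " x " + gridByHand()[0].length);
    }
}
```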


The next table shows the instruction that pops an array reference off the top of the stack and pushes the length of that array.

Getting the array length


Opcode       Operand(s)  Description
arraylength  (none)      pops objectref of an array, pushes length of that array


The following opcodes retrieve an element from an array. The array index and array reference are popped from the stack, and the value at the specified index of the specified array is pushed back onto the stack.

Retrieving an array element


Opcode  Operand(s)  Description
baload  (none)      pops index and arrayref of an array of bytes, pushes arrayref[index]
caload  (none)      pops index and arrayref of an array of chars, pushes arrayref[index]
saload  (none)      pops index and arrayref of an array of shorts, pushes arrayref[index]
iaload  (none)      pops index and arrayref of an array of ints, pushes arrayref[index]
laload  (none)      pops index and arrayref of an array of longs, pushes arrayref[index]
faload  (none)      pops index and arrayref of an array of floats, pushes arrayref[index]
daload  (none)      pops index and arrayref of an array of doubles, pushes arrayref[index]
aaload  (none)      pops index and arrayref of an array of objectrefs, pushes arrayref[index]


The next table shows the opcodes that store a value into an array element. The value, index, and array reference are popped from the top of the stack.

Storing to an array element


Opcode   Operand(s)  Description
bastore  (none)      pops value, index, and arrayref of an array of bytes, assigns arrayref[index] = value
castore  (none)      pops value, index, and arrayref of an array of chars, assigns arrayref[index] = value
sastore  (none)      pops value, index, and arrayref of an array of shorts, assigns arrayref[index] = value
iastore  (none)      pops value, index, and arrayref of an array of ints, assigns arrayref[index] = value
lastore  (none)      pops value, index, and arrayref of an array of longs, assigns arrayref[index] = value
fastore  (none)      pops value, index, and arrayref of an array of floats, assigns arrayref[index] = value
dastore  (none)      pops value, index, and arrayref of an array of doubles, assigns arrayref[index] = value
aastore  (none)      pops value, index, and arrayref of an array of objectrefs, assigns arrayref[index] = value
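
To see how the element type selects the opcode, consider the hypothetical class below. The comments name the array opcode that javac typically emits for each access. The methods themselves are ordinary Java and behave the same whichever bytecodes a particular compiler chooses.

```java
class ArrayAccessDemo {
    // Stores the same char into every element.
    static void fill(char[] dst, char c) {
        for (int i = 0; i < dst.length; i++) {   // dst.length: arraylength
            dst[i] = c;                          // castore
        }
    }

    // Adds up the elements of an array of longs.
    static long sum(long[] src) {
        long total = 0;
        for (int i = 0; i < src.length; i++) {
            total += src[i];                     // laload
        }
        return total;
    }

    // Swaps two object references within an array.
    static void swap(Object[] a, int i, int j) {
        Object tmp = a[i];                       // aaload
        a[i] = a[j];                             // aaload, then aastore
        a[j] = tmp;                              // aastore
    }

    public static void main(String[] args) {
        char[] stars = new char[3];
        fill(stars, '*');
        System.out.println(new String(stars));           // prints ***
        System.out.println(sum(new long[] {1L, 2L, 3L})); // prints 6
    }
}
```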


Three-dimensional array: a Java virtual machine simulation


The simulation that accompanies this article demonstrates a Java virtual machine executing a sequence of bytecodes. The bytecode sequence in the simulation was generated by javac for the initAnArray() method of the class shown below:

class ArrayDemo {
    static void initAnArray() {
        int[][][] threeD = new int[5][4][3];
        for (int i = 0; i < 5; ++i) {
            for (int j = 0; j < 4; ++j) {
                for (int k = 0; k < 3; ++k) {
                    threeD[i][j][k] = i + j + k;
                }
            }
        }
    }
}


The bytecodes generated by javac for initAnArray() are shown below:

   
   0 iconst_5             // Push constant int 5.
   1 iconst_4             // Push constant int 4.
   2 iconst_3             // Push constant int 3.
                          // Create a new multi-dimensional array using constant pool
                          // entry #2 as the class (which is [[[I, a 3D array of ints)
                          // with 3 dimensions.
   3 multianewarray #2 dim #3 <Class [[[I>
   7 astore_0             // Pop object ref into local variable 0: int threeD[][][] = new int[5][4][3];
   8 iconst_0             // Push constant int 0.
   9 istore_1             // Pop int into local variable 1: int i = 0;
  10 goto 54              // Go to section of code that tests outer loop.
  13 iconst_0             // Push constant int 0.
  14 istore_2             // Pop int into local variable 2: int j = 0;
  15 goto 46              // Go to section of code that tests middle loop.
  18 iconst_0             // Push constant int 0.
  19 istore_3             // Pop int into local variable 3: int k = 0;
  20 goto 38              // Go to section of code that tests inner loop.
  23 aload_0              // Push object ref from local variable 0.
  24 iload_1              // Push int from local variable 1 (i).
  25 aaload               // Pop index and arrayref, push object ref at arrayref[index] (gets threeD[i]).
  26 iload_2              // Push int from local variable 2 (j).
  27 aaload               // Pop index and arrayref, push object ref at arrayref[index] (gets threeD[i][j]).
  28 iload_3              // Push int from local variable 3 (k).
                          // Now calculate the int that will be assigned to threeD[i][j][k]
  29 iload_1              // Push int from local variable 1 (i).
  30 iload_2              // Push int from local variable 2 (j).
  31 iadd                 // Pop two ints, add them, push int result (i + j).
  32 iload_3              // Push int from local variable 3 (k).
  33 iadd                 // Pop two ints, add them, push int result (i + j + k).
  34 iastore              // Pop value, index, and arrayref; assign arrayref[index] = value: threeD[i][j][k] = i + j + k;
  35 iinc 3 1             // Increment by 1 the int in local variable 3: ++k;
  38 iload_3              // Push int from local variable 3 (k).
  39 iconst_3             // Push constant int 3.
  40 if_icmplt 23         // Pop right and left ints, jump if left < right: for (...; k < 3;...)
  43 iinc 2 1             // Increment by 1 the int in local variable 2: ++j;
  46 iload_2              // Push int from local variable 2 (j).
  47 iconst_4             // Push constant int 4.
  48 if_icmplt 18         // Pop right and left ints, jump if left < right: for (...; j < 4;...)
  51 iinc 1 1             // Increment by 1 the int in local variable 1: ++i;
  54 iload_1              // Push int from local variable 1 (i).
  55 iconst_5             // Push constant int 5.
  56 if_icmplt 13         // Pop right and left ints, jump if left < right: for (...; i < 5;...)
  59 return


The initAnArray() method merely allocates and initializes a three-dimensional array. This simulation demonstrates how the Java virtual machine handles multidimensional arrays. In response to the multianewarray instruction, which in this example requests the allocation of a three-dimensional array, the JVM creates a tree of one-dimensional arrays. The reference returned by the multianewarray instruction refers to the base one-dimensional array in the tree. In the initAnArray() method, the base array has five components -- threeD[0] through threeD[4]. Each component of the base array is itself a reference to a one-dimensional array of four components, accessed by threeD[0][0] through threeD[4][3]. The components of these five arrays are also references to arrays, each of which has three components. These components are ints, the elements of this multidimensional array, and they are accessed by threeD[0][0][0] through threeD[4][3][2].

In response to the multianewarray instruction in the initAnArray() method, the Java virtual machine creates one five-component array of arrays, five four-component arrays of arrays, and twenty three-component arrays of ints. The JVM allocates these 26 arrays on the heap, initializes their components such that they form a tree, and returns the reference to the base array.
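
The count of 26 can be checked with a few lines of Java. The sketch below walks the tree of arrays and counts every array object it meets: 1 base array, plus 5 branch arrays, plus 5 x 4 = 20 leaf arrays. The class TreeCount and its countArrays() method are hypothetical helpers, and the walk assumes that no sub-array is shared between two parents, which holds for an array freshly created with new.

```java
class TreeCount {
    // Counts the array objects reachable from root, including root itself.
    static int countArrays(Object root) {
        if (root == null || !root.getClass().isArray()) {
            return 0;                          // not an array: nothing to count
        }
        int count = 1;                         // count this array
        if (!root.getClass().getComponentType().isPrimitive()) {
            // Components are references, so each may refer to another array.
            for (Object child : (Object[]) root) {
                count += countArrays(child);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][][] threeD = new int[5][4][3];
        System.out.println(countArrays(threeD));   // prints 26
    }
}
```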

To assign an int value to an element of the three-dimensional array, the JVM uses aaload to get a component of the base array. Then the JVM uses aaload again on this component -- which is itself an array of arrays -- to get a component of the branch array. This component is a reference to a leaf array of ints. Finally the JVM uses iastore to assign an int value to the element of the leaf array. The JVM uses multiple one-dimensional array accesses to accomplish operations on multidimensional arrays.
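
The same sequence of one-dimensional accesses can be written out by hand in Java source. In this sketch (the class StepwiseAccess and the local names branch and leaf are hypothetical), each statement in the innermost loop corresponds to one array access in the explanation above. A compiler will typically add extra loads and stores for the two local variables, so the bytecodes will not match the earlier listing exactly.

```java
class StepwiseAccess {
    static int[][][] build() {
        int[][][] threeD = new int[5][4][3];
        for (int i = 0; i < 5; ++i) {
            for (int j = 0; j < 4; ++j) {
                for (int k = 0; k < 3; ++k) {
                    int[][] branch = threeD[i];   // aaload: component of the base array
                    int[] leaf = branch[j];       // aaload: component of the branch array
                    leaf[k] = i + j + k;          // iastore: element of the leaf array
                }
            }
        }
        return threeD;
    }

    public static void main(String[] args) {
        int[][][] threeD = build();
        System.out.println(threeD[4][3][2]);      // prints 9
    }
}
```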
