
Control flow

With code samples, tables, and a Java virtual machine simulation, here's a look at the bytecodes of the Java virtual machine that deal with control flow

Java offers all the control-flow constructs that C++ programmers found endearing: if, if-else, while, do-while, for, and switch. (Java doesn't offer the goto, but that was never endearing, not to real C++ programmers anyway.)

Decisions, decisions: keep it simple

The simplest control-flow construct Java offers is the if statement. But in bytecodes, the if is not so simple. When a Java program is compiled, an if statement may be translated to any of a variety of opcodes. Each opcode pops one or two values from the top of the stack and does a comparison. The opcodes that pop only one value compare that value with zero. The opcodes that pop two values compare them with each other. If the comparison succeeds (success is defined differently by each individual opcode), the Java virtual machine (JVM) branches -- or jumps -- to the offset given as an operand to the comparison opcode. In this manner, the if opcodes give you many ways to make the Java virtual machine decide between two alternative paths of program flow.

All you ever wanted to know about the if opcode

One family of if opcodes performs integer comparisons against zero. When the JVM encounters one of these opcodes, it pops one int off the stack and compares it with zero.

Conditional branch: Integer comparison with zero


Opcode   Operand(s)                 Description
ifeq     branchbyte1, branchbyte2   pop int value, if value == 0, branch to offset
ifne     branchbyte1, branchbyte2   pop int value, if value != 0, branch to offset
iflt     branchbyte1, branchbyte2   pop int value, if value < 0, branch to offset
ifle     branchbyte1, branchbyte2   pop int value, if value <= 0, branch to offset
ifgt     branchbyte1, branchbyte2   pop int value, if value > 0, branch to offset
ifge     branchbyte1, branchbyte2   pop int value, if value >= 0, branch to offset
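To connect the table to source code, here is a minimal sketch. The class and method names are made up for illustration, and the bytecode in the comments is only the shape a compiler such as javac typically produces; running javap -c on the compiled class shows what your compiler actually emits. Note that a compiler usually emits the opposite test from the one in the source, so that the branch jumps around the body of the if.

```java
// Hypothetical example class; the names are illustrative only.
class ZeroCompareDemo {
    // For "if (x == 0)" a compiler typically emits the inverted test:
    //     iload_0
    //     ifne <skip>      // pop x; if x != 0, jump past the body
    //     ...body of the if...
    static String classify(int x) {
        if (x == 0) {
            return "zero";
        }
        // Likewise, "if (x < 0)" is typically compiled with ifge,
        // which skips the body when x >= 0.
        if (x < 0) {
            return "negative";
        }
        return "positive";
    }

    public static void main(String[] args) {
        System.out.println(classify(0));
        System.out.println(classify(-7));
        System.out.println(classify(42));
    }
}
```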


Another family of if opcodes pops two integers off the top of the stack and compares them against one another. The Java virtual machine branches if the comparison succeeds. Just before these opcodes are executed, value2 is on the top of the stack; value1 is just beneath value2.

Conditional branch: Comparison of two integers


Opcode      Operand(s)                 Description
if_icmpeq   branchbyte1, branchbyte2   pop int value2 and value1, if value1 == value2, branch to offset
if_icmpne   branchbyte1, branchbyte2   pop int value2 and value1, if value1 != value2, branch to offset
if_icmplt   branchbyte1, branchbyte2   pop int value2 and value1, if value1 < value2, branch to offset
if_icmple   branchbyte1, branchbyte2   pop int value2 and value1, if value1 <= value2, branch to offset
if_icmpgt   branchbyte1, branchbyte2   pop int value2 and value1, if value1 > value2, branch to offset
if_icmpge   branchbyte1, branchbyte2   pop int value2 and value1, if value1 >= value2, branch to offset
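The order of value1 and value2 is easy to get backwards, so here is a small simulation of it. This is only a sketch: the operand stack is modeled with a java.util.ArrayDeque, and the class and method names are invented for this example rather than taken from any real JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation of the operand-stack order used by if_icmplt.
class IcmpStackDemo {
    // Returns true when if_icmplt would take the branch.
    static boolean ifIcmplt(Deque<Integer> stack) {
        int value2 = stack.pop(); // value2 is on top of the stack
        int value1 = stack.pop(); // value1 is just beneath it
        return value1 < value2;
    }

    // Evaluates "a < b" by pushing the operands left to right.
    static boolean lessThan(int a, int b) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(a); // pushed first, so it becomes value1
        stack.push(b); // pushed second, so it becomes value2
        return ifIcmplt(stack);
    }

    public static void main(String[] args) {
        System.out.println(lessThan(3, 5)); // true
        System.out.println(lessThan(5, 3)); // false
    }
}
```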


The opcodes shown above operate on ints. These opcodes also are used for comparisons of types short, byte, and char -- the JVM always manipulates types smaller than int by first converting them to ints and then manipulating the ints.

A third family of opcodes takes care of comparisons of the other primitive types: long, float, and double. These opcodes don't cause a branch by themselves. Instead, each one pushes an int that represents the result of the comparison -- 0 for equal to, 1 for greater than, and -1 for less than. The compiler then follows it with one of the int comparison-with-zero opcodes introduced above to force the actual branch.

Comparison of longs, floats, and doubles


Opcode   Operand(s)   Description
lcmp     (none)       pop long value2 and value1, compare, push int result
fcmpg    (none)       pop float value2 and value1, compare, push int result
fcmpl    (none)       pop float value2 and value1, compare, push int result
dcmpg    (none)       pop double value2 and value1, compare, push int result
dcmpl    (none)       pop double value2 and value1, compare, push int result
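The two-step pattern for long comparisons can be sketched in plain Java. The lcmp method below is a hypothetical stand-in that mimics what the opcode pushes; it is not a real API, and the second step stands in for an int opcode such as iflt testing the pushed result against zero.

```java
// Hypothetical model of a long comparison: compare first, branch second.
class LcmpDemo {
    // Mimics lcmp: 1 if value1 > value2, 0 if equal, -1 if value1 < value2.
    static int lcmp(long value1, long value2) {
        if (value1 > value2) {
            return 1;
        }
        if (value1 == value2) {
            return 0;
        }
        return -1;
    }

    // "a < b" on longs: step 1 pushes an int result, step 2 tests that
    // int against zero, as iflt or ifge would.
    static boolean lessThan(long a, long b) {
        int result = lcmp(a, b);
        return result < 0;
    }

    public static void main(String[] args) {
        System.out.println(lcmp(5_000_000_000L, 1L)); // 1
        System.out.println(lessThan(1L, 5_000_000_000L)); // true
    }
}
```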


The two opcodes for float comparisons (fcmpg and fcmpl) differ only in how they handle NaN ("not a number"). In the Java virtual machine, the comparisons <, <=, >, >=, and == on floating-point numbers always fail if one of the values being compared is NaN. If neither value being compared is NaN, both the fcmpg and fcmpl instructions push a 0 if the values are equal, a 1 if value1 is greater than value2, and a -1 if value1 is less than value2. But if one or both of the values is NaN, the fcmpg instruction pushes a 1, whereas the fcmpl instruction pushes a -1. Because both of these opcodes are available, a compiler can always pick the one whose NaN result looks like an ordinary failed comparison, so the int opcode that follows takes the same path whether the comparison failed normally or because of a NaN. This is also true for the two opcodes that compare double values: dcmpg and dcmpl.
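To see why having both variants matters, here is a small model of them in Java. The fcmpg and fcmpl methods are hypothetical stand-ins written for this sketch, not a real API. The pairing shown (fcmpg for "less than," fcmpl for "greater than") is the choice that makes a NaN operand produce a failed comparison in both cases.

```java
// Hypothetical model of fcmpg and fcmpl and how a compiler can pair them
// with source-level comparisons.
class FcmpDemo {
    // Shared result when neither operand is NaN.
    private static int compareOrdered(float value1, float value2) {
        return value1 > value2 ? 1 : (value1 == value2 ? 0 : -1);
    }

    // Mimics fcmpg: a NaN operand yields 1.
    static int fcmpg(float value1, float value2) {
        if (Float.isNaN(value1) || Float.isNaN(value2)) {
            return 1;
        }
        return compareOrdered(value1, value2);
    }

    // Mimics fcmpl: a NaN operand yields -1.
    static int fcmpl(float value1, float value2) {
        if (Float.isNaN(value1) || Float.isNaN(value2)) {
            return -1;
        }
        return compareOrdered(value1, value2);
    }

    // "a < b" succeeds only on a negative result, so fcmpg's 1 for NaN
    // makes the comparison fail.
    static boolean lessThan(float a, float b) {
        return fcmpg(a, b) < 0;
    }

    // "a > b" succeeds only on a positive result, so fcmpl's -1 for NaN
    // makes the comparison fail.
    static boolean greaterThan(float a, float b) {
        return fcmpl(a, b) > 0;
    }

    public static void main(String[] args) {
        System.out.println(lessThan(Float.NaN, 1.0f));    // false
        System.out.println(greaterThan(Float.NaN, 1.0f)); // false
        System.out.println(lessThan(1.0f, 2.0f));         // true
    }
}
```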

A fourth family of if opcodes pops one object reference off the top of the stack and compares it with null. If the comparison succeeds, the JVM branches.

Conditional branch: object reference comparison with null


Opcode      Operand(s)                 Description
ifnull      branchbyte1, branchbyte2   pop reference value, if value == null, branch to offset
ifnonnull   branchbyte1, branchbyte2   pop reference value, if value != null, branch to offset


The last family of if opcodes pops two object references off the stack and compares them with each other. In this case, there are only two comparisons that make sense: "equals" and "not equals." If the references are equal, then they refer to the exact same object on the heap. If not, they refer to two different objects. As with all the other if opcodes, if the comparison succeeds, the JVM branches.

Conditional branch: Comparison of two object references


Opcode      Operand(s)                 Description
if_acmpeq   branchbyte1, branchbyte2   pop reference value2 and value1, if value1 == value2, branch to offset
if_acmpne   branchbyte1, branchbyte2   pop reference value2 and value1, if value1 != value2, branch to offset
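At the source level, these two opcodes correspond to == and != applied to references. The sketch below (with an invented class name) shows that this is a test of object identity, not of contents: two distinct String objects holding the same characters are not equal by reference, even though equals() reports them as equal.

```java
// Hypothetical example: reference comparison versus content comparison.
class ReferenceCompareDemo {
    // "a == b" on references is the kind of test if_acmpeq and if_acmpne
    // perform: do both references point to the same object on the heap?
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String first = new String("jvm");
        String second = new String("jvm"); // a different object, same contents

        System.out.println(sameObject(first, first));  // true
        System.out.println(sameObject(first, second)); // false
        System.out.println(first.equals(second));      // true
    }
}
```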


It's unconditional: goto opcodes

Those are all of the opcodes that cause the Java virtual machine to branch conditionally. One other family of opcodes, however, causes the JVM to branch unconditionally. Not surprisingly, these opcodes are called "goto." Although goto is a reserved word in the Java programming language, it can't be used in your programs: the language defines no goto statement, so code that uses the word won't compile. The reason goto is a reserved word is so that a mischievous programmer can't make a variable named "goto" in order to freak out their peers. But when you compile a Java program, the bytecodes generated will likely contain lots of goto instructions.

Unconditional branch


Opcode   Operand(s)                                           Description
goto     branchbyte1, branchbyte2                             branch to offset
goto_w   branchbyte1, branchbyte2, branchbyte3, branchbyte4   branch to offset
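The tables list the operands as individual bytes. As specified for the JVM instruction set, those bytes are combined, most significant byte first, into a signed offset that is applied relative to the address of the branch opcode itself: 16 bits for goto and the if opcodes, 32 bits for goto_w. Here is a minimal sketch of that arithmetic; the class and method names are invented for illustration, and each byte is passed as an int in the range 0 to 255.

```java
// Hypothetical helper showing how branch bytes combine into an offset.
class BranchOffsetDemo {
    // Two operand bytes form a signed 16-bit offset.
    static int shortOffset(int branchbyte1, int branchbyte2) {
        return (short) ((branchbyte1 << 8) | branchbyte2);
    }

    // Four operand bytes form a signed 32-bit offset (goto_w).
    static int wideOffset(int branchbyte1, int branchbyte2,
                          int branchbyte3, int branchbyte4) {
        return (branchbyte1 << 24) | (branchbyte2 << 16)
                | (branchbyte3 << 8) | branchbyte4;
    }

    public static void main(String[] args) {
        System.out.println(shortOffset(0x00, 0x0A));            // 10
        System.out.println(shortOffset(0xFF, 0xFB));            // -5
        System.out.println(wideOffset(0x00, 0x01, 0x00, 0x00)); // 65536
    }
}
```

A negative offset is a backward jump, which is how the goto at the bottom of a loop returns to the top.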


The above opcodes, which perform comparisons and both conditional and unconditional branches, are sufficient to express to a Java virtual machine the control flow indicated in Java source code by an if, if-else, while, do-while, or for statement. The above opcodes also could be used to express a switch statement, but the JVM's instruction set includes two opcodes specially designed for the switch statement: tableswitch and lookupswitch.
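As a rough illustration of how these pieces combine, consider the sketch below. The class is hypothetical, and the bytecode in the comments uses symbolic labels (ELSE, END) instead of real offsets; it shows a layout javac commonly produces, but other compilers and versions may arrange the code differently, so check with javap -c.

```java
// Hypothetical example; bytecode in comments is a typical shape only.
class IfElseLoopDemo {
    // An if-else needs one conditional branch and one goto:
    //     iload_0
    //     iload_1
    //     if_icmple ELSE   // inverted test: skip the "then" part if a <= b
    //     iload_0
    //     istore_2
    //     goto END         // jump over the "else" part
    // ELSE:
    //     iload_1
    //     istore_2
    // END:
    //     iload_2
    //     ireturn
    static int max(int a, int b) {
        int result;
        if (a > b) {
            result = a;
        } else {
            result = b;
        }
        return result;
    }

    // A while loop is commonly laid out as a conditional branch that
    // leaves the loop plus a goto at the bottom that jumps back to the test.
    static int sumTo(int n) {
        int sum = 0;
        int i = 1;
        while (i <= n) {
            sum += i;
            i++;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 9)); // 9
        System.out.println(sumTo(4));  // 10
    }
}
```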

The nitty gritty of tableswitch and lookupswitch

The tableswitch and lookupswitch instructions both include one default branch offset and a variable-length set of case value/branch offset pairs. Both instructions pop the key (the value of the expression in the parentheses immediately following the switch keyword) from the stack. The key is compared with all the case values. If a match is found, the branch offset associated with the case value is taken. If no match is found, the default branch offset is taken.

The difference between tableswitch and lookupswitch is in how they indicate the case values. The lookupswitch instruction is more general-purpose than tableswitch, but tableswitch is usually more efficient. Both opcodes are followed by zero to three bytes of padding -- enough so that the byte immediately following the padding starts at an address that is a multiple of four bytes from the beginning of the method. (These two instructions, by the way, are the only ones in the entire Java virtual machine instruction set that involve alignment on a greater than one-byte boundary.) For both instructions, the next four bytes after the padding are the default branch offset.
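The amount of padding follows directly from the alignment rule. Here is a minimal sketch of the arithmetic, assuming addresses are counted in bytes from the start of the method's bytecode; the class, method, and parameter names are invented for this example.

```java
// Hypothetical helper computing the padding after a switch opcode.
class SwitchPaddingDemo {
    // opcodeAddress is the position of the tableswitch or lookupswitch
    // opcode, counted from the start of the method's bytecode.
    static int paddingBytes(int opcodeAddress) {
        int firstByteAfterOpcode = opcodeAddress + 1;
        // Round up to the next multiple of four; zero if already aligned.
        return (4 - (firstByteAfterOpcode % 4)) % 4;
    }

    public static void main(String[] args) {
        for (int address = 0; address < 4; address++) {
            System.out.println("opcode at " + address + " -> "
                    + paddingBytes(address) + " padding byte(s)");
        }
    }
}
```

An opcode at address 3, for example, needs no padding, because the very next byte is at address 4.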

After the zero- to three-byte padding and the four-byte default branch offset, the lookupswitch opcode is followed by a four-byte value, npairs, which indicates the number of case value/branch offset pairs that will follow. The case value is an int; this highlights the fact that, at the bytecode level, a switch key is always an int, which is why the key expression in the source is an int, short, char, or byte. (Later versions of the language also accept types such as enum and String, which the compiler maps to int keys.) If you attempt to use a long, float, or double as a switch key, your program won't compile. The branch offset associated with each case value is another four-byte offset.

In the tableswitch instruction, the zero- to three-byte padding and the four-byte default branch offset are followed by low and high int values. The low and high values indicate the endpoints of a range of case values included in this tableswitch instruction. Following the low and high values are high - low + 1 branch offsets -- one branch offset for high, one for low, and one for each integer case value in between high and low. The branch offset for low immediately follows the high value.

Thus, when the Java virtual machine encounters a lookupswitch instruction, it must search the case values for the key until it finds a match or runs out of case values. (The pairs are stored sorted by case value, so an implementation can use a binary search instead of checking every one.) If it runs out of case values, it uses the default branch offset. On the other hand, when the JVM encounters a tableswitch instruction, it can simply check to see if the key is within the range defined by low and high. If not, it takes the default branch offset. If so, it just subtracts low from key to get an index into the list of branch offsets. In this manner, it can determine the appropriate branch offset without having to check each case value.
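The two lookup strategies can be sketched in a few lines of Java. This is an illustration only: the arrays stand in for the data that follows each opcode in the bytecode stream, all names are invented, and the lookupswitch version uses a plain linear scan, which is the simplest strategy and not necessarily what a real JVM does.

```java
// Hypothetical simulation of how a branch offset is selected.
class SwitchLookupDemo {
    // tableswitch: offsets[i] holds the branch offset for case value low + i.
    static int tableswitch(int key, int low, int high,
                           int[] offsets, int defaultOffset) {
        if (key < low || key > high) {
            return defaultOffset;
        }
        return offsets[key - low]; // direct index, no searching
    }

    // lookupswitch: caseValues[i] pairs with offsets[i].
    static int lookupswitch(int key, int[] caseValues,
                            int[] offsets, int defaultOffset) {
        for (int i = 0; i < caseValues.length; i++) {
            if (caseValues[i] == key) {
                return offsets[i];
            }
        }
        return defaultOffset;
    }

    public static void main(String[] args) {
        // Dense cases 10, 11, 12 suit tableswitch.
        int[] tableOffsets = {28, 36, 44};
        System.out.println(tableswitch(11, 10, 12, tableOffsets, 52)); // 36
        System.out.println(tableswitch(99, 10, 12, tableOffsets, 52)); // 52

        // Sparse cases 1, 1000, 1000000 suit lookupswitch.
        int[] caseValues = {1, 1000, 1000000};
        int[] lookupOffsets = {36, 44, 52};
        System.out.println(lookupswitch(1000, caseValues, lookupOffsets, 60)); // 44
        System.out.println(lookupswitch(5, caseValues, lookupOffsets, 60));    // 60
    }
}
```

The tableswitch version does the same amount of work no matter how many cases there are, at the price of one table slot for every integer between low and high.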

