Newsletter sign-up
View all newsletters

Enterprise Java Newsletter
Stay up to date on the latest tutorials and Java community news posted on JavaWorld

Sponsored Links

Optimize with a SATA RAID Storage Solution
Range of capacities as low as $1250 per TB. Ideal if you currently rely on servers/disks/JBODs

Bytecode basics

A first look at the bytecodes of the Java virtual machine

  • Print
  • Feedback
Welcome to another installment of "Under The Hood." This column gives Java developers a glimpse of what is going on beneath their running Java programs. This month's article takes an initial look at the bytecode instruction set of the Java virtual machine (JVM). The article covers primitive types operated upon by bytecodes, bytecodes that convert between types, and bytecodes that operate on the stack. Subsequent articles will discuss other members of the bytecode family.

The bytecode format

Bytecodes are the machine language of the Java virtual machine. When a JVM loads a class file, it gets one stream of bytecodes for each method in the class. The bytecodes streams are stored in the method area of the JVM. The bytecodes for a method are executed when that method is invoked during the course of running the program. They can be executed by intepretation, just-in-time compiling, or any other technique that was chosen by the designer of a particular JVM.

A method's bytecode stream is a sequence of instructions for the Java virtual machine. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the action to take. If more information is required before the JVM can take the action, that information is encoded into one or more operands that immediately follow the opcode.

Each type of opcode has a mnemonic. In the typical assembly language style, streams of Java bytecodes can be represented by their mnemonics followed by any operand values. For example, the following stream of bytecodes can be disassembled into mnemonics:

// Bytecode stream: 03 3b 84 00 01 1a 05 68 3b a7 ff f9
// Disassembly:
iconst_0      // 03
istore_0      // 3b
iinc 0, 1     // 84 00 01
iload_0       // 1a
iconst_2      // 05
imul          // 68
istore_0      // 3b
goto -7       // a7 ff f9


The bytecode instruction set was designed to be compact. All instructions, except two that deal with table jumping, are aligned on byte boundaries. The total number of opcodes is small enough so that opcodes occupy only one byte. This helps minimize the size of class files that may be traveling across networks before being loaded by a JVM. It also helps keep the size of the JVM implementation small.

All computation in the JVM centers on the stack. Because the JVM has no registers for storing abitrary values, everything must be pushed onto the stack before it can be used in a calculation. Bytecode instructions therefore operate primarily on the stack. For example, in the above bytecode sequence a local variable is multiplied by two by first pushing the local variable onto the stack with the iload_0 instruction, then pushing two onto the stack with iconst_2. After both integers have been pushed onto the stack, the imul instruction effectively pops the two integers off the stack, multiplies them, and pushes the result back onto the stack. The result is popped off the top of the stack and stored back to the local variable by the istore_0 instruction. The JVM was designed as a stack-based machine rather than a register-based machine to facilitate efficient implementation on register-poor architectures such as the Intel 486.

  • Print
  • Feedback

Resources
  • Previous Under The Hood articles:
  • The lean, mean virtual machine -- Gives an introduction to the Java virtual machine. Look here to see how the garbage collected heap fits in with the other parts of the JVM.
  • The Java class file lifestyle -- Gives an overview to the Java class file, the file format into which all Java programs are compiled.
  • Java's garbage-collected heap -- Gives an overview of garbage collection in general and the garbage-collected heap of the JVM in particular.