How the Java virtual machine handles method invocation and return

A look under the hood at the bytecodes used for invoking and returning from methods

This month's Under The Hood focuses on method invocation and return inside the Java virtual machine (JVM). It describes the four ways Java (and native) methods can be invoked, gives a code sample that illustrates the four ways, and covers the relevant bytecodes.

Method invocation

The Java programming language provides two basic kinds of methods: instance methods and class (or static) methods. The differences between these two kinds of methods are:

  1. Instance methods require an instance before they can be invoked, whereas class methods do not.
  2. Instance methods use dynamic (late) binding, whereas class methods use static (early) binding.

When the Java virtual machine invokes a class method, it selects the method to invoke based on the type of the object reference, which is always known at compile-time. On the other hand, when the virtual machine invokes an instance method, it selects the method to invoke based on the actual class of the object, which may only be known at run time.
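The two binding rules can be seen in a small sketch. The classes Animal, Cat, and BindingDemo are hypothetical names invented for this illustration; they do not appear elsewhere in this article. The sketch assumes a class method that is hidden in a subclass and an instance method that is overridden, both called through a reference whose declared type is the superclass.

```java
// Hypothetical classes, used only to illustrate static versus dynamic binding.
class Animal {
    // Class method: selected at compile time from the declared type of the reference.
    static String kind() { return "Animal"; }

    // Instance method: selected at run time from the actual class of the object.
    String speak() { return "..."; }
}

class Cat extends Animal {
    // Hides (does not override) Animal.kind().
    static String kind() { return "Cat"; }

    @Override
    String speak() { return "meow"; }
}

class BindingDemo {
    static String describe() {
        Animal a = new Cat();   // declared type Animal, actual class Cat
        // a.kind() is bound to Animal.kind(); a.speak() dispatches to Cat.speak().
        return a.kind() + "/" + a.speak();
    }

    public static void main(String[] args) {
        System.out.println(describe());   // prints Animal/meow
    }
}
```

Calling a class method through an object reference, as a.kind() does here, is legal but usually discouraged precisely because it hides the fact that the object's actual class plays no part in the selection.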

The JVM uses two different instructions, shown in the following table, to invoke these two different kinds of methods: invokevirtual for instance methods, and invokestatic for class methods.

Method invocation of invokevirtual and invokestatic
Opcode         Operand(s)              Description
invokevirtual  indexbyte1, indexbyte2  pop objectref and args, invoke method at constant pool index
invokestatic   indexbyte1, indexbyte2  pop args, invoke static method at constant pool index

Dynamic linking

Because Java programs are dynamically linked, references to methods initially are symbolic. All invoke instructions, such as invokevirtual and invokestatic, refer to a constant pool entry that initially contains a symbolic reference. (See my earlier column, "The Java class file lifestyle," for a description of constant pool.) The symbolic reference is a bundle of information that uniquely identifies a method, including the class name, method name, and method descriptor. (A method descriptor is the method's return type and the number and types of its arguments.) The first time the Java virtual machine encounters a particular invoke instruction, the symbolic reference must be resolved.

To resolve a symbolic reference, the JVM locates the method being referred to symbolically and replaces the symbolic reference with a direct reference. A direct reference, such as a pointer or offset, allows the virtual machine to invoke the method more quickly if the reference is ever used again in the future.

For example, upon encountering an invokevirtual instruction, the Java virtual machine forms an index into the constant pool of the current class from the indexbyte1 and indexbyte2 operands that follow the invokevirtual opcode. The constant pool entry contains a symbolic reference to the method to invoke. The process of resolving symbolic references in the constant pool is how the JVM performs dynamic linking.
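The resolve-once behavior described above can be sketched as a toy model. Everything here is an assumption made for illustration: ToyMethodRef and LinkingDemo are hypothetical names, reflection stands in for whatever lookup a real virtual machine performs, and a list of parameter types stands in for the method descriptor. The only detail taken from the class file format is that the constant pool index is formed from the two operand bytes, high byte first.

```java
// A toy model of one constant pool entry; real JVMs use their own internal structures.
final class ToyMethodRef {
    private final String className;
    private final String methodName;
    private final Class<?>[] parameterTypes;   // stands in for the method descriptor
    private java.lang.reflect.Method direct;   // null until the entry is resolved
    private int lookups = 0;                   // how many times the slow lookup ran

    ToyMethodRef(String className, String methodName, Class<?>... parameterTypes) {
        this.className = className;
        this.methodName = methodName;
        this.parameterTypes = parameterTypes.clone();
    }

    // The constant pool index is the two operand bytes combined, high byte first.
    static int poolIndex(int indexbyte1, int indexbyte2) {
        return (indexbyte1 << 8) | indexbyte2;
    }

    // Resolves the symbolic reference on first use, then reuses the direct reference.
    java.lang.reflect.Method resolve() {
        if (direct == null) {
            try {
                direct = Class.forName(className).getMethod(methodName, parameterTypes);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot resolve " + className + "." + methodName, e);
            }
            lookups++;
        }
        return direct;
    }

    int lookups() { return lookups; }
}

class LinkingDemo {
    // Invokes Math.abs(int) twice through the same entry and reports how often it was looked up.
    static int[] run() {
        ToyMethodRef ref = new ToyMethodRef("java.lang.Math", "abs", int.class);
        try {
            int first = (Integer) ref.resolve().invoke(null, -5);
            int second = (Integer) ref.resolve().invoke(null, -7);
            return new int[] { first, second, ref.lookups() };
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println(result[0] + " " + result[1] + " lookups=" + result[2]);
    }
}
```

The second call reuses the cached direct reference, which is the point of replacing the symbolic reference after the first resolution.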

Verification

During resolution, the JVM also performs several verification checks. These checks ensure that Java language rules are followed and that the invoke instruction is safe to execute. The virtual machine first makes sure the symbolically referenced method exists. If it exists, the virtual machine checks that the current class can legally access the method; if the method is private, for instance, it must be a member of the current class. If any of these checks fails, the Java virtual machine throws an exception.

Objectref and args

Once the method has been resolved, the Java virtual machine is ready to invoke it. If the method is an instance method, it must be invoked on an object. For every instance method invocation, the virtual machine expects a reference to the object (objectref) to be on the stack. In addition to objectref, the virtual machine expects the arguments (args) required by the method, if any, to be on the stack. If the method is a class method, only the args are on the stack. Class methods don't require an objectref because they aren't invoked on an object.

The objectref and args (or just args, in the case of a class method) must be pushed onto the calling method's operand stack by the bytecode instructions that precede the invoke instruction.

Pushing and popping the stack frame

To invoke a method, the Java virtual machine creates a new stack frame for the method. The stack frame contains space for the method's local variables, its operand stack, and any other information required by a particular virtual machine implementation. The sizes of the local variables and operand stack are calculated at compile time and placed into the class file, so the virtual machine knows just how much memory will be needed by the method's stack frame. When the JVM invokes a method, it creates a stack frame of the proper size for that method.

Adding a new frame onto the Java stack when a method is invoked is called "pushing" a stack frame; removing a frame when a method returns is called "popping" a stack frame. The Java stack is made up solely of these frames.
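One way to watch frames being pushed is to ask the current thread how many frames it reports at different call depths. This is only a rough sketch: FrameDepthDemo is a hypothetical class, and the count comes from Thread.getStackTrace(), which reports what the virtual machine chooses to expose (the API documentation allows a virtual machine to omit frames in some circumstances), so it is an observation aid rather than a view of the real frame layout.

```java
class FrameDepthDemo {
    // Number of frames the JVM reports for the current thread at this point.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // One method call between the caller and depth().
    static int oneCallDeep() {
        return depth();
    }

    // Two method calls between the caller and depth().
    static int twoCallsDeep() {
        return oneCallDeep();
    }

    public static void main(String[] args) {
        int shallow = oneCallDeep();
        int deep = twoCallsDeep();
        // Each additional invocation is expected to add one reported frame.
        System.out.println("extra frames: " + (deep - shallow));
    }
}
```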

Invoking a Java method

If the method is a Java method (not a native method), the Java virtual machine will push a new frame onto the current Java stack.

In the case of an instance method, the virtual machine pops the objectref and args from the operand stack of the calling method's stack frame. The JVM creates a new stack frame and places the objectref on the new stack frame as local variable 0, and the args as local variables 1, 2, and so on. The objectref is the implicit this pointer that is passed to any instance method.

For a class method, the virtual machine just pops the args from the operand stack of the calling method's frame and places them onto the new stack frame as local variables 0, 1, 2, and so on.

Once the objectref and args (or just the args, for a class method) have been placed into the local variables of the new frame, the virtual machine makes the new stack frame current and sets the program counter to point to the first instruction in the new method.
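The hand-off from the caller's operand stack to the new frame's local variables can be sketched with a toy model. ToyFrame and ToyInvoker are hypothetical names, not part of any JVM API, and the sketch assumes every value occupies exactly one local variable slot (it ignores the fact that long and double values take two).

```java
// A toy stack frame: local variables plus an operand stack. Hypothetical, not a JVM API.
final class ToyFrame {
    final Object[] locals;
    final java.util.ArrayDeque<Object> operandStack = new java.util.ArrayDeque<>();

    ToyFrame(int maxLocals) {
        this.locals = new Object[maxLocals];
    }
}

final class ToyInvoker {
    // Pops objectref (if any) and args from the caller and stores them in the callee's locals.
    static ToyFrame invoke(ToyFrame caller, int argCount, boolean instanceMethod, int maxLocals) {
        int slots = argCount + (instanceMethod ? 1 : 0);
        ToyFrame callee = new ToyFrame(maxLocals);
        // The caller pushed objectref first and the last argument last,
        // so popping fills the locals from the highest index down to 0.
        for (int i = slots - 1; i >= 0; i--) {
            callee.locals[i] = caller.operandStack.pop();
        }
        return callee;
    }

    // Simulates an instance method call with two int arguments.
    static Object[] demo() {
        ToyFrame caller = new ToyFrame(0);
        caller.operandStack.push("objectref");
        caller.operandStack.push(10);
        caller.operandStack.push(20);
        return invoke(caller, 2, true, 4).locals;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));   // prints [objectref, 10, 20, null]
    }
}
```

For a class method the same routine would be called with instanceMethod set to false, and the first argument would land in local variable 0.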

The JVM specification does not require a particular implementation for the Java stack. Frames could be allocated individually from a heap, or they could be taken from contiguous memory, or both. If two frames are contiguous, however, the virtual machine can just overlap them such that the top of the operand stack of one frame forms the bottom of the local variables of the next. In this scheme, the virtual machine need not copy objectref and args from one frame to another, because the two frames overlap. The operand stack word containing objectref in the calling method's frame would be the same memory location as local variable 0 of the new frame.

Invoking a native method

If the method being invoked is native, the Java virtual machine invokes it in an implementation-dependent manner. The virtual machine does not push a new stack frame onto the Java stack for the native method. At the point at which the thread enters the native method, it leaves the Java stack behind. When the native method returns, the Java stack once again will be used.

Other forms of method invocation

Although instance methods normally are invoked with invokevirtual, two other opcodes are used to invoke this kind of method in certain situations: invokespecial and invokeinterface.

Invokespecial is used in three situations in which an instance method must be invoked based on the type of the reference, not on the class of the object. The three situations are:

  1. invocation of instance initialization (<init>) methods
  2. invocation of private methods
  3. invocation of methods using the super keyword

Invokeinterface is used to invoke an instance method given a reference to an interface.

Method invocation of invokespecial and invokeinterface
Opcode           Operand(s)                        Description
invokespecial    indexbyte1, indexbyte2            pop objectref and args, invoke method at constant pool index
invokeinterface  indexbyte1, indexbyte2, count, 0  pop objectref and args, invoke method at constant pool index

The invokespecial instruction

Invokespecial differs from invokevirtual primarily in that invokespecial selects a method based on the type of the reference rather than the class of the object. In other words, it does static binding instead of dynamic binding. In each of the three situations where invokespecial is used, dynamic binding wouldn't yield the desired result.

invokespecial and <init>

The compiler places code for constructors and instance variable initializers into <init> methods, or instance initialization methods. A class gets one <init> method in the class file for each constructor in the source. If you don't explicitly declare a constructor in the source, the compiler will generate a default no-arg constructor for you. This default constructor also ends up as an <init> method in the class file. So just as every class will have at least one constructor, every class also will have at least one <init> method.

The <init> methods are called only when a new instance is created. At least one <init> method will be invoked for each class along the inheritance path of the newly created object, and multiple <init> methods could be invoked for any one class along that path.

Why is invokespecial used to invoke <init> methods? Because subclass <init> methods need to be able to invoke superclass <init> methods. This is how multiple <init> methods get invoked when an object is instantiated. The virtual machine invokes an <init> method declared in the object's class. That <init> method first invokes either another <init> method in the same class, or an <init> method in its superclass. This process continues all the way up to Object.

For example, consider this code:

class Dog {
}

class CockerSpaniel extends Dog {

    public static void main(String args[]) {
        CockerSpaniel bootsie = new CockerSpaniel();
    }
}

When you invoke main(), the virtual machine will allocate space for a new CockerSpaniel object, then invoke CockerSpaniel's default no-arg <init> method to initialize that space. That method will invoke Dog's <init> method, which will invoke Object's <init> method.

Because every class has at least one <init> method, it is common for classes to have <init> methods with identical signatures. (A method's signature is its name and the number and types of its arguments.) For example, the <init> methods for all three classes in the inheritance path for CockerSpaniel have the same signature. CockerSpaniel, Dog, and Object all contain a method named <init> that takes no arguments.

It would be impossible to invoke Dog's <init> method from CockerSpaniel's <init> method using invokevirtual, because invokevirtual would perform dynamic binding and invoke CockerSpaniel's <init> method. With invokespecial, however, Dog's <init> method can be invoked from CockerSpaniel's <init> method, because the type of the reference placed in the class file is Dog.
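The order in which the chained <init> methods run their own code can be observed with a small variation on the example. TracedDog and TracedCockerSpaniel are hypothetical stand-ins for the Dog and CockerSpaniel classes above, with explicit constructors added only so that each one can record when its body executes.

```java
class TracedDog {
    // Shared log of constructor bodies, in the order they ran.
    static final StringBuilder trace = new StringBuilder();

    TracedDog() {
        // The compiler inserts an implicit super() call here, invoking Object's <init>.
        trace.append("Dog;");
    }
}

class TracedCockerSpaniel extends TracedDog {
    TracedCockerSpaniel() {
        // The compiler inserts an implicit super() call here, invoking TracedDog's <init>.
        trace.append("CockerSpaniel;");
    }

    // Creates one object and returns the order in which constructor bodies executed.
    static String create() {
        trace.setLength(0);
        new TracedCockerSpaniel();
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(create());   // prints Dog;CockerSpaniel;
    }
}
```

The subclass's <init> method is invoked first, but because its first action is to invoke the superclass's <init> method, the superclass's body finishes before the subclass's body starts.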

invokespecial and private methods

In the case of private methods, it must be possible for a subclass to declare a method with the same signature as a private method in a superclass. For example, consider the following code in which interestingMethod() is declared as private in a superclass and with package access in a subclass:

class Superclass {

    private void interestingMethod() {
        System.out.println("Superclass's interesting method.");
    }

    void exampleMethod() {
        interestingMethod();
    }
}

class Subclass extends Superclass {

    void interestingMethod() {
        System.out.println("Subclass's interesting method.");
    }

    public static void main(String args[]) {
        Subclass me = new Subclass();
        me.exampleMethod();
    }
}

When you invoke main() in Subclass as defined above, it must print "Superclass's interesting method." If invokevirtual were used, it would print "Subclass's interesting method." Why? Because the virtual machine would choose the interestingMethod() to call based on the actual class of the object, which is Subclass, so it would use Subclass's interestingMethod(). On the other hand, with invokespecial the virtual machine will select the method based on the type of the reference, so Superclass's version of interestingMethod() will be invoked.

invokespecial and super

When invoking a method with the super keyword, as in super.someMethod(), you want the superclass's version of a method to be invoked -- even if the current class overrides the method. Once again, invokevirtual would invoke the current class's version, so it can't be used in this situation.
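A short sketch makes the contrast visible by putting an ordinary call and a super call side by side. BaseGreeter, LoudGreeter, LouderGreeter, and SuperDemo are hypothetical classes invented for this illustration.

```java
class BaseGreeter {
    String greet() { return "base"; }
}

class LoudGreeter extends BaseGreeter {
    @Override
    String greet() { return "loud"; }

    String both() {
        // greet() is dispatched on the object's actual class;
        // super.greet() always runs BaseGreeter's version.
        return greet() + "+" + super.greet();
    }
}

class LouderGreeter extends LoudGreeter {
    @Override
    String greet() { return "louder"; }
}

class SuperDemo {
    static String run() {
        return new LouderGreeter().both();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints louder+base
    }
}
```

Although both() is declared in LoudGreeter, the plain call reaches LouderGreeter's override, while the super call is fixed to the superclass of the class in which the call is written.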

The invokeinterface instruction

The invokeinterface opcode performs the same function as invokevirtual. The only difference is that invokeinterface is used when the reference is of an interface type.

To understand why a separate opcode is necessary for interface references, you must understand a bit about method tables. When the Java virtual machine loads a class file, it may create a method table for the class. (Whether or not a method table is actually created is the decision of each virtual machine designer; however, it is likely that commercial JVMs will create method tables.) A method table is just an array of direct references to the bytecodes for each instance method that can be invoked on an object, including methods inherited from superclasses.

The JVM uses a different opcode to invoke a method given an interface reference because it can't make as many assumptions about the method table offset as it can given a class reference. If the JVM has a class reference, it knows each method will always occupy the same position in the method table, independent of the actual class of the object. This is not true with an interface reference: The method could occupy different locations for different classes that implement the same interface.
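The difference in what can be assumed about table offsets can be sketched with a toy method table. This is an illustration under loud assumptions: ToyMethodTable and MethodTableDemo are hypothetical, the tables hold method names rather than direct references to bytecodes, methods inherited from Object are left out for brevity, and the table layouts are invented. Real virtual machines are free to organize this differently.

```java
// A toy method table: slot i holds the name of the method stored at offset i.
final class ToyMethodTable {
    private final String[] slots;

    ToyMethodTable(String... slots) {
        this.slots = slots.clone();
    }

    // Linear search, standing in for the lookup an interface call may need.
    int indexOf(String methodName) {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i].equals(methodName)) {
                return i;
            }
        }
        return -1;
    }
}

class MethodTableDemo {
    // Class hierarchy: the subclass table extends the superclass table,
    // so "area" keeps offset 0 in both.
    static final ToyMethodTable SHAPE = new ToyMethodTable("area");
    static final ToyMethodTable SQUARE = new ToyMethodTable("area", "side");

    // Two unrelated classes implementing the same hypothetical interface:
    // "print" ends up at a different offset in each table.
    static final ToyMethodTable REPORT = new ToyMethodTable("print", "title");
    static final ToyMethodTable INVOICE = new ToyMethodTable("total", "customer", "print");

    public static void main(String[] args) {
        System.out.println("area:  " + SHAPE.indexOf("area") + " and " + SQUARE.indexOf("area"));
        System.out.println("print: " + REPORT.indexOf("print") + " and " + INVOICE.indexOf("print"));
    }
}
```

With a class reference, an offset found once for "area" stays valid for every subclass. With an interface reference, the offset of "print" has to be found again for each implementing class.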

Invocation instructions and speed

As you might imagine, invoking a method given an interface reference is likely to be slower than invoking a method given a class reference. When the Java virtual machine encounters an invokevirtual instruction and resolves the symbolic reference to a direct reference to an instance method, that direct reference is likely to be an offset into a method table. From that point forward, the same offset can be used. For an invokeinterface instruction, however, the virtual machine will have to search through the method table every single time the instruction is encountered, because it can't assume the offset is the same as in previous invocations.

The fastest instructions will most likely be invokespecial and invokestatic, because methods invoked by these instructions are statically bound. When the JVM resolves the symbolic reference for these instructions and replaces it with a direct reference, that direct reference probably will include a pointer to the actual bytecodes.

Implementation dependence

All these predictions of speed are to some extent guesses, because individual designers of Java virtual machines can use any technique to speed things up; they are limited only by their imagination. The data structures and algorithms for resolving symbolic references and invoking methods are not part of the JVM specification. These decisions are left to the designers of each Java virtual machine implementation.

For example, the slowest kind of method to invoke traditionally has been the synchronized method, which takes about six times as long as a non-synchronized method in Sun's 1.1 Java virtual machine. Sun has claimed that its next-generation virtual machine will make synchronization "free"; in other words, it will invoke a synchronized method as fast as a non-synchronized one. Also, Sun's 1.1 virtual machine uses an "interface lookup table" to increase the execution speed of the invokeinterface instruction over that of its 1.0 virtual machine.

Examples of method invocation

The following code illustrates the various ways in which the Java virtual machine invokes methods. The code also shows which invocation opcode is used in each situation:

interface inYourFace {
    void interfaceMethod();
}

class itsABirdItsAPlaneItsSuperClass implements inYourFace {

    itsABirdItsAPlaneItsSuperClass(int i) {
        super();                        // invokespecial (of an <init>)
    }

    static void classMethod() {
    }

    void instanceMethod() {
    }

    final void finalInstanceMethod() {
    }

    public void interfaceMethod() {
    }
}

class subClass extends itsABirdItsAPlaneItsSuperClass {

    subClass() {
        this(0);                        // invokespecial (of an <init>)
    }

    subClass(int i) {
        super(i);                       // invokespecial (of an <init>)
    }

    private void privateMethod() {
    }

    void instanceMethod() {
    }

    final void anotherFinalInstanceMethod() {
    }

    void exampleInstanceMethod() {

        instanceMethod();               // invokevirtual
        super.instanceMethod();         // invokespecial

        privateMethod();                // invokespecial

        finalInstanceMethod();          // invokevirtual
        anotherFinalInstanceMethod();   // invokevirtual

        interfaceMethod();              // invokevirtual

        classMethod();                  // invokestatic
    }
}

class unrelatedClass {

    public static void main(String args[]) {

        subClass sc = new subClass();   // invokespecial (of an <init>)
        subClass.classMethod();         // invokestatic
        sc.classMethod();               // invokestatic
        sc.instanceMethod();            // invokevirtual
        sc.finalInstanceMethod();       // invokevirtual
        sc.interfaceMethod();           // invokevirtual

        inYourFace iyf = sc;
        iyf.interfaceMethod();          // invokeinterface
    }
}

Returning from methods

To return from a method, the JVM uses several opcodes, one for each type of return value. These opcodes do not take operands: If there is a return value, it must be on the operand stack. The return value is popped off the operand stack and pushed onto the operand stack of the calling method's stack frame. The current stack frame is popped, and the calling method's stack frame becomes current. The program counter is reset to the instruction following the instruction that invoked this method in the first place.

Returning from methods
Opcode   Operand(s)  Description
ireturn  none        pop int, push onto stack of calling method and return
lreturn  none        pop long, push onto stack of calling method and return
freturn  none        pop float, push onto stack of calling method and return
dreturn  none        pop double, push onto stack of calling method and return
areturn  none        pop object reference, push onto stack of calling method and return
return   none        return void

The ireturn instruction is used for methods that return int, boolean, char, byte, or short.
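The mapping from return type to return opcode can be laid out as a set of one-line methods. ReturnDemo is a hypothetical class; the opcode named in each comment is the one a Java compiler is expected to emit for that return type, which you can check by disassembling the compiled class.

```java
class ReturnDemo {
    static int     intValue()     { return 42; }     // ireturn
    static boolean booleanValue() { return true; }   // ireturn
    static char    charValue()    { return 'x'; }    // ireturn
    static byte    byteValue()    { return 7; }      // ireturn
    static short   shortValue()   { return 300; }    // ireturn
    static long    longValue()    { return 42L; }    // lreturn
    static float   floatValue()   { return 1.5f; }   // freturn
    static double  doubleValue()  { return 2.5; }    // dreturn
    static String  refValue()     { return "ok"; }   // areturn
    static void    nothing()      { }                // return

    public static void main(String[] args) {
        nothing();
        System.out.println(intValue() + " " + longValue() + " " + refValue());   // prints 42 42 ok
    }
}
```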

Conclusion

Although the subtle differences between the ways a JVM invokes methods can be a bit confusing, understanding these differences can help you understand the subtleties of the Java language. The main points to remember are:

  1. Instance methods are dynamically bound except for <init> methods, private methods, and methods invoked with the super keyword. In these three special cases, instance methods are statically bound.

  2. Class methods are always statically bound.

  3. Instance methods invoked with an interface reference may be slower than the same methods invoked with a class reference.

Next month

This month's article left out one detail about method invocation and return: synchronization. Next month, I'll describe how the Java virtual machine performs thread synchronization, including how it invokes and returns from synchronized methods.

Bill Venners has been writing software professionally for 12 years. Based in Silicon Valley, he provides software consulting and training services under the name Artima Software Company. Over the years he has developed software for the consumer electronics, education, semiconductor, and life insurance industries. He has programmed in many languages on many platforms: assembly language on various microprocessors, C on Unix, C++ on Windows, Java on the Web. He is the author of the book Inside the Java Virtual Machine, published by McGraw-Hill.

The small print: "How the Java Virtual Machine Handles Method Invocation and Return" Article Copyright (c) 1997 Bill Venners. All rights reserved.

Learn more about this topic

  • Previous Under The Hood articles:
  • The Lean, Mean Virtual Machine -- Gives an introduction to the Java virtual machine.
  • The Java Class File Lifestyle -- Gives an overview to the Java class file, the file format into which all Java programs are compiled.
  • Java's Garbage-Collected Heap -- Gives an overview of garbage collection in general and the garbage-collected heap of the Java virtual machine in particular.
  • Bytecode Basics -- Introduces the bytecodes of the Java virtual machine, and discusses primitive types, conversion operations, and stack operations in particular.
  • Floating Point Arithmetic -- Describes the Java virtual machine's floating-point support and the bytecodes that perform floating point operations.
  • Logic and Arithmetic -- Describes the Java virtual machine's support for logical and integer arithmetic, and the related bytecodes.
  • Objects and Arrays -- Describes how the Java virtual machine deals with objects and arrays, and discusses the relevant bytecodes.
  • Exceptions -- Describes how the Java virtual machine deals with exceptions, and discusses the relevant bytecodes.
  • Try-Finally -- Describes how the Java virtual machine implements try-finally clauses, and discusses the relevant bytecodes.
  • Control Flow -- Describes how the Java virtual machine implements control flow and discusses the relevant bytecodes.
  • The Architecture of Aglets -- Describes the inner workings of Aglets, IBM's autonomous Java-based software agent technology.
  • The Point of Aglets -- Analyzes the real-world utility of mobile agents such as Aglets, IBM's autonomous Java-based software agent technology.
