JVM performance optimization, Part 2: Compilers

Use the right Java compiler for your Java application

Compilation

A compiler, on the other hand, loads the entire code to be executed into the runtime. As it translates bytecode, it can look at the entire runtime context, or part of it, and decide how best to translate the code. Its decisions are based on analysis of code graphs, such as the different execution branches of instructions, and on runtime-context data.

When a bytecode sequence is translated into a machine-code instruction set and that instruction set can be optimized, the replacement instruction set (i.e., the optimized sequence) is stored in a structure called the code cache. The next time that bytecode is executed, the previously optimized code can be located immediately in the code cache and used for execution. In some cases a performance counter might kick in and override the previous optimization, in which case the compiler will run a new optimization sequence. The advantage of a code cache is that the resulting instruction set can be executed at once, with no need for interpretive lookups or compilation. This speeds up execution time, especially for Java applications where the same methods are called multiple times.
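The lookup-then-reuse behavior described above can be sketched as a toy model. This is only an illustration of the idea, not how any real JVM implements its code cache: the class `ToyCodeCache`, its methods, and the use of a lambda as a stand-in for compiled machine code are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

/** Toy model of a code cache. Hypothetical; real JVMs cache machine code, not lambdas. */
class ToyCodeCache {
    // Method name -> "compiled" code, modeled here as a Java lambda.
    private final Map<String, IntUnaryOperator> cache = new HashMap<>();
    private int compilations = 0;

    /** Stand-in for the expensive translate-and-optimize step. */
    private IntUnaryOperator compile(String methodName) {
        compilations++;
        if (methodName.equals("add7")) {
            return x -> x + 7;
        }
        throw new IllegalArgumentException("unknown method: " + methodName);
    }

    /** Runs the method, compiling only if no cached code exists yet. */
    int invoke(String methodName, int arg) {
        IntUnaryOperator code = cache.computeIfAbsent(methodName, this::compile);
        return code.applyAsInt(arg);
    }

    int compilations() {
        return compilations;
    }

    public static void main(String[] args) {
        ToyCodeCache jvm = new ToyCodeCache();
        for (int i = 0; i < 1000; i++) {
            jvm.invoke("add7", i);
        }
        // The compile step ran once; the other 999 calls hit the cache.
        System.out.println("compilations: " + jvm.compilations());
    }
}
```

The point of the sketch is that the cost of compilation is paid on the first call only, which is why repeated calls to the same method benefit most.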

Optimization

Along with dynamic compilation comes the opportunity to insert performance counters. The compiler might, for instance, insert a performance counter that is incremented every time a bytecode block (e.g., one corresponding to a specific method) is called. Compilers use data about how "hot" a given piece of bytecode is to determine where optimizations will have the most impact on the running application. Runtime profiling data enables the compiler to make a rich set of code-optimization decisions on the fly, further improving code-execution performance. As more refined code-profiling data becomes available, it can be used to make additional and better optimization decisions, such as how to better sequence instructions in the compiled-to language, whether to replace a set of instructions with a more efficient set, or even whether to eliminate redundant operations.
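A minimal sketch of the counting idea, assuming a simple per-method invocation counter and a fixed threshold. The class `ToyHotnessProfiler`, its method names, and the threshold value are hypothetical and chosen for illustration; real JVMs use their own counters and heuristics.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy invocation counter. Hypothetical; not the profiling scheme of any specific JVM. */
class ToyHotnessProfiler {
    private final int threshold; // number of calls after which a method counts as "hot"
    private final Map<String, Integer> counters = new HashMap<>();

    ToyHotnessProfiler(int threshold) {
        this.threshold = threshold;
    }

    /** Records one call and returns true exactly when the counter reaches the threshold. */
    boolean recordCall(String methodName) {
        int count = counters.merge(methodName, 1, Integer::sum);
        return count == threshold;
    }

    int count(String methodName) {
        return counters.getOrDefault(methodName, 0);
    }

    public static void main(String[] args) {
        ToyHotnessProfiler profiler = new ToyHotnessProfiler(3); // assumed threshold
        for (int call = 1; call <= 5; call++) {
            if (profiler.recordCall("add7")) {
                System.out.println("add7 became hot on call " + call);
            }
        }
    }
}
```

In this model the caller would hand a method to the optimizer at the moment `recordCall` returns true, so optimization effort is spent only on code that has proven to be called often.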

Example

Consider the Java code:

static int add7(int x) {
    return x + 7;
}

This could be statically compiled by javac to the bytecode:

iload_0
bipush 7
iadd
ireturn

When the method is called, the bytecode block is dynamically compiled to machine instructions. If a performance counter is present for the code block and hits a threshold, the block might also be optimized. The end result could look like the following machine instruction set for a given execution platform:

lea rax,[rdx+7]
ret
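To watch this happen on a HotSpot JVM, one option is to call the method often enough for it to become hot and run with the `-XX:+PrintCompilation` diagnostic flag. The sketch below wraps the article's `add7` in a hypothetical driver class, `Add7Demo`. The iteration count is an arbitrary assumption, and the log format, and whether `add7` appears on its own or is inlined into its caller, vary by JVM and version.

```java
/** Hypothetical driver that calls add7 often enough for a JIT compiler to notice it. */
class Add7Demo {
    static int add7(int x) {
        return x + 7;
    }

    /** Calls add7 repeatedly and sums the results so the calls cannot be discarded. */
    static long sumAdd7(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += add7(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Run with: java -XX:+PrintCompilation Add7Demo
        // and look for lines mentioning Add7Demo::add7 or Add7Demo::sumAdd7.
        System.out.println(sumAdd7(1_000_000));
    }
}
```

The bytecode that javac produced can be inspected separately with `javap -c Add7Demo` after compiling the class.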

Different compilers for different applications

Different Java applications have different needs. Long-running enterprise server-side applications could allow for more optimizations, while smaller client-side applications may need fast execution with minimal resource consumption. Let's consider three different compiler settings and their respective pros and cons.

Client-side compilers

A well-known optimizing compiler is C1, the compiler that is enabled through the -client JVM startup option. As the option name suggests, C1 is a client-side compiler. It is designed for client-side applications that have fewer resources available and are, in many cases, sensitive to application startup time. C1 uses performance counters for code profiling to enable simple, relatively unintrusive optimizations.
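To check which compiler a running JVM reports, the standard `java.lang.management` API can be queried. The class name `JitInfo` below is hypothetical; the reported compiler name depends entirely on the JVM vendor, version, and startup options, so no particular output should be expected.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reports what the running JVM says about its JIT compiler. */
class JitInfo {
    static String describeJit() {
        // Returns null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported";
        }
        StringBuilder sb = new StringBuilder("JIT compiler: ").append(jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            sb.append(", total compilation time: ")
              .append(jit.getTotalCompilationTime())
              .append(" ms");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("VM: " + System.getProperty("java.vm.name"));
        System.out.println(describeJit());
    }
}
```

Running the same class with and without a startup option such as -client is one way to see whether that option changes what the JVM reports.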
