Learn to speak Jamaican

Introducing Jamaica, a JVM macro assembler language

Most Java programmers have, at one time or another, wondered how the JVM works. Programming in Java bytecode offers considerable insight into the JVM and helps developers write better Java. The ability to produce bytecode at runtime is also a great asset that opens up new design options. Historically, language systems invented their own runtime systems; today, many want to target the JVM for a good reason: to leverage the freely available work of others who port and optimize the virtual machine on numerous platforms. For pure-Java system software, dynamically generated bytecode may deliver performance that is otherwise impossible. For example, suppose an RDBMS (relational database management system) supports functions in queries like this: SELECT GetLastName(name) FROM emp WHERE CalcAge(birthday) > 30. It is both possible and desirable for the database engine to generate and call an actual Java method, compiled to bytecode, rather than simply interpret the query.

A JVM is a simple stack-based CPU with no general-purpose programmable registers. Its instruction set is called bytecode. Code and data are organized in JVM classes (to which Java classes are mapped), but the JVM does not support all Java language features directly. The JVM Specification also defines many verification and security rules. Even so, programming at the bytecode level is error-prone and risky, as I witnessed firsthand while testing Jamaica. Bytecode programs also tend to be longer than assembly programs for other CPUs because JVM instructions mostly operate on the top of the stack. Jamaica tries to address some of these issues with a Java-like approach: it uses Java syntax to declare the class infrastructure, and symbolic names in instructions to refer to variables, fields, and labels. Moreover, Jamaica defines numerous macros for common patterns, making JVM assembly programs much easier to read and write.
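To make the stack-based model concrete, here is a minimal sketch of a toy stack machine written in plain Java. It is not the JVM and not part of Jamaica; the class name StackMachineSketch, the Insn type, and the helper methods are hypothetical, and only the opcode names are borrowed from real JVM mnemonics. It shows why such programs run long: every operand must be pushed before an instruction can consume it from the top of the stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stack machine for illustration only; not the real JVM or Jamaica. */
public class StackMachineSketch {

    /** Opcode names borrowed from JVM mnemonics for readability. */
    enum Op { BIPUSH, IADD, IMUL }

    /** One instruction: an opcode plus an operand used only by BIPUSH. */
    static final class Insn {
        final Op op;
        final int operand;

        Insn(Op op, int operand) {
            this.op = op;
            this.operand = operand;
        }
    }

    /** Builds an instruction that pushes a constant onto the stack. */
    static Insn push(int value) {
        return new Insn(Op.BIPUSH, value);
    }

    /** Builds an instruction that takes no operand. */
    static Insn simple(Op op) {
        return new Insn(op, 0);
    }

    /** Runs the program and returns the value left on top of the stack. */
    static int run(Insn... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Insn insn : program) {
            switch (insn.op) {
                case BIPUSH:
                    stack.push(insn.operand);
                    break;
                case IADD:
                    // Pops two values and pushes their sum.
                    stack.push(stack.pop() + stack.pop());
                    break;
                case IMUL:
                    // Pops two values and pushes their product.
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalStateException("Unknown op: " + insn.op);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (2 + 3) * 4 takes five instructions because there are no registers.
        int result = run(push(2), push(3), simple(Op.IADD),
                         push(4), simple(Op.IMUL));
        System.out.println(result); // prints 20
    }
}
```

The real JVM adds local variables, typed instructions, and verification on top of this basic model, which is part of what makes hand-written bytecode easy to get wrong.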

Because the JVM Specification does not define an assembly language, several efforts have been made to create one; the best known so far is Jasmin, and Jamaica is the latest. This article introduces the Jamaica language with many examples, details the more complicated instructions in the instruction set, and elaborates on all of Jamaica's macros. An equally important part of Jamaica is the underlying abstract API for creating JVM classes; the article also introduces this API and its close relationship with Jamaica. Assembly programming is closely tied to the CPU architecture, but this article does not cover the JVM architecture extensively. Finally, the article summarizes Jamaica's benefits and limitations.

Let's start by looking at an example:

public class CHelloWorld {
  public static void main(String[] args) {
    getstatic System.out PrintStream
    ldc "Hello, World!"
    invokevirtual PrintStream.println(String)void
  }
}


The code above looks quite familiar to Java programmers, except for the method body, where the executable code is written in the Jamaica bytecode instruction format. All class names are in Java format rather than JVM format (for example, java.io.PrintStream rather than java/io/PrintStream). As a "Jamaican convenience," classes in java.lang, java.io, and java.util are imported automatically, so you can use their names directly without package prefixes. If we use macros, we can reduce the code above to a single statement and easily do more:
