Secure your Java apps from end to end, Part 1

The foundation of Java security: Virtual machine and byte code security

Nobody was ever fired for writing insecure code. My slightly reworked version of the popular adage, "Nobody was ever fired for buying IBM," while not exactly true, is accurate enough to be alarming. Employers more concerned with hitting deadlines at Internet speed and employees more interested in adding bullet items to their resumes often push security out of the picture.

Consider another alarming phenomenon: When I talk with managers and engineers about security, I often discover that they operate under the misconception that they don't need to worry about security because "Java is secure." By accepting this faulty notion, engineers fail to acknowledge that in building Java apps, they must consider security in three different contexts: virtual machine security, application security, and network security. Java is secure out of the box in only one of those contexts: virtual machine security.

In this series of articles, I will attempt to correct this dangerous misconception. To do so, I will examine Java security from within each of the three contexts and illustrate how common security flaws cut across each. I will also describe defensive measures that you can employ to create more secure applications.

Three flavors of security

When Java made its public debut, developers, researchers, and journalists made a lot of noise about its security. In those early days, Java security meant byte code and virtual machine security. Since Java was mainly seen as a language for downloading small applications to be executed locally, the downloaded code's integrity and execution environment were of paramount importance. Security in that context meant installing a correctly functioning class loader and security manager, and verifying downloaded code.

In my years of building C and C++ applications I never worried about virtual machine security -- it stepped into the limelight with Java. When I thought about security, I worried about flaws in the application logic or implementation that compromised the application or the system on which the application ran. In C++, security in the application context involved limiting the scope of setuid code (in the Unix environment, setuid code runs as another user -- typically the superuser) and trying to avoid introducing buffer overruns and other types of stack nastiness.

The introduction of distributed applications brought another aspect of security to the table. As its name suggests, a distributed application is composed of many parts; each part might reside on its own machine and might communicate with other parts over a public network. A Web application is the stereotypical example. Security in a network context means authenticating and authorizing users and application components and encrypting communication channels.

Many developers do not see the differences between the contexts I have described above and assume that because Java is secure at the virtual machine level, applications are secure across the board. I hope to correct that belief. I'll begin by discussing virtual machine security.

The foundation: Virtual machine security

Virtual machine security, long the focus of developers' attention, is almost a nonissue today. The number of VM-related security flaws has dropped precipitously in the last two years. Surprises certainly lurk in the wings, especially as Java finds itself implemented on an immensely heterogeneous line of micro devices via Java 2 Platform, Micro Edition technology, but the worst has passed.

I was initially tempted to discuss virtual machine security only briefly before moving along to application and network security. I decided to give it equal time for two important reasons: First, excellent programming lessons lie hidden within the numerous flaws discovered over the past six years. Second, many security flaws operate across the three contexts that I described. To understand their behavior, you must be familiar with all three contexts, including JVM security.

If you examine the types of security vulnerabilities identified over the past six years (see Resources for the "official" list), you will find that they fall into a handful of categories. As far as virtual machine security is concerned, the two most important types of flaws revolve around the introduction of unverified and possibly illegal byte code, and the subversion of the Java type system. In exploits, the two are often used together.

The secret of unverified code

When the JVM loads a class file from a server across the network, it has no way of knowing whether or not the byte code is safe to execute. Safe byte code never instructs the virtual machine to perform operations that would leave the Java runtime in an inconsistent or invalid state.

Normally the Java compiler ensures that the byte code in the class files it creates is safe. It is possible, however, to create byte code by hand that attempts tasks Sun's Java compiler would never let you perform. The Java verifier examines all such application byte code and, using a set of structural checks and data-flow analysis, identifies code that doesn't play by the rules. Once byte code is verified, the virtual machine knows that it's safe to execute, at least as long as the verifier is functioning correctly.

Let's take a look at an example in order to better understand the role the verifier plays and to see why trouble reigns when it fails to function.

Consider the following class:

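  public class Test1
  {
    public static void main(String[] arstring)
    {
      // Stored in local variable 1.
      Float a = new Float(56.78);
      // Stored in local variable 2; never used by the source code.
      Integer b = new Integer(1234);
      // Prints the Float's value.
      System.out.println(a.toString());
    }
  }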

If you compile and run that class, the application will print the string 56.78 to the console. That is the value assigned to the instance of the Float class. We are going to modify one byte of the class's byte code and attempt to trick the virtual machine into invoking the toString() method on the Integer instance rather than on the Float instance (you can download the original and the modified class files from Resources).

Let's take a look at the disassembled output of the unmodified class file:

  Method void main(java.lang.String[])
     0 new #3 <Class java.lang.Float>
     3 dup
     4 ldc2_w #13 <Double 56.78>
     7 invokespecial #8 <Method java.lang.Float(double)>
    10 astore_1
    11 new #4 <Class java.lang.Integer>
    14 dup
    15 sipush 1234
    18 invokespecial #9 <Method java.lang.Integer(int)>
    21 astore_2
    22 getstatic #10 <Field java.io.PrintStream out>
    25 aload_1
    26 invokevirtual #12 <Method java.lang.String toString()>
    29 invokevirtual #11 <Method void println(java.lang.String)>
    32 return

The listing above contains the main() method's disassembled output. At byte code offset 25 in this method, the virtual machine loads a reference to an instance of Float that was created earlier (at offsets 0 to 10). That is the instruction we will modify. The disassembled output of the modified class is below:

  Method void main(java.lang.String[])
     0 new #3 <Class java.lang.Float>
     3 dup
     4 ldc2_w #13 <Double 56.78>
     7 invokespecial #8 <Method java.lang.Float(double)>
    10 astore_1
    11 new #4 <Class java.lang.Integer>
    14 dup
    15 sipush 1234
    18 invokespecial #9 <Method java.lang.Integer(int)>
    21 astore_2
    22 getstatic #10 <Field java.io.PrintStream out>
    25 aload_2
    26 invokevirtual #12 <Method java.lang.String toString()>
    29 invokevirtual #11 <Method void println(java.lang.String)>
    32 return

The class is identical byte for byte except for the instruction at offset 25, which loads a reference to an instance of the Integer class.
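To make the one-byte change concrete, here is a minimal sketch that transcribes main()'s code array from the first listing and patches offset 25. The opcode values come from the JVM instruction set (aload_1 is 0x2b, aload_2 is 0x2c). The class and method names are hypothetical. The sketch works on the method's code array only and does not show how to locate that array inside the class file, which requires parsing the class file format.

```java
// Hypothetical illustration: patches the code array of main(), not a real class file.
final class PatchSketch {
    static final int ALOAD_1 = 0x2b; // load reference from local variable 1
    static final int ALOAD_2 = 0x2c; // load reference from local variable 2

    // The 33 bytes of main()'s code, transcribed from the first listing.
    static byte[] originalCode() {
        int[] code = {
            0xbb, 0x00, 0x03, //  0 new #3
            0x59,             //  3 dup
            0x14, 0x00, 0x0d, //  4 ldc2_w #13
            0xb7, 0x00, 0x08, //  7 invokespecial #8
            0x4c,             // 10 astore_1
            0xbb, 0x00, 0x04, // 11 new #4
            0x59,             // 14 dup
            0x11, 0x04, 0xd2, // 15 sipush 1234
            0xb7, 0x00, 0x09, // 18 invokespecial #9
            0x4d,             // 21 astore_2
            0xb2, 0x00, 0x0a, // 22 getstatic #10
            0x2b,             // 25 aload_1
            0xb6, 0x00, 0x0c, // 26 invokevirtual #12
            0xb6, 0x00, 0x0b, // 29 invokevirtual #11
            0xb1              // 32 return
        };
        byte[] bytes = new byte[code.length];
        for (int i = 0; i < code.length; i++) {
            bytes[i] = (byte) code[i];
        }
        return bytes;
    }

    // Returns a copy of the code with aload_1 at the given offset replaced by aload_2.
    static byte[] patch(byte[] code, int offset) {
        if ((code[offset] & 0xff) != ALOAD_1) {
            throw new IllegalArgumentException("expected aload_1 at offset " + offset);
        }
        byte[] copy = code.clone();
        copy[offset] = (byte) ALOAD_2;
        return copy;
    }

    // Counts the positions at which two equal-length arrays differ.
    static int countDifferences(byte[] a, byte[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] original = originalCode();
        byte[] modified = patch(original, 25);
        System.out.println("bytes changed: " + countDifferences(original, modified));
    }
}
```

The two opcodes differ only in which local variable slot they read, which is why the patched method is still well-formed even though it is no longer type-safe.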

It is important to note that the result is still valid byte code, meaning the JVM can still execute the byte code without dumping core or heading off into space. The verifier, however, can tell that something has changed under the hood. On my system, I receive the following error when I try to load the class:

  Exception in thread "main" java.lang.VerifyError:
  (class: Test1, method: main signature: ([Ljava/lang/String;)V)
  Incompatible object argument for function call
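An application that loads classes by name can observe that rejection for itself. The following is a minimal sketch, assuming the class to test is on the class path. The LoadCheck class and its tryLoad() method are hypothetical names, and exactly when a virtual machine verifies a class is implementation dependent (I assume here that it happens no later than class initialization).

```java
// Hypothetical helper: reports whether a named class loads, is rejected, or is missing.
final class LoadCheck {
    static String tryLoad(String className) {
        try {
            // Class.forName(String) loads, links, and initializes the class;
            // verification is part of linking.
            Class.forName(className);
            return "loaded";
        } catch (VerifyError e) {
            // Thrown when the verifier rejects the class's byte code.
            return "rejected by verifier: " + e.getMessage();
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "Test1";
        System.out.println(name + ": " + tryLoad(name));
    }
}
```

Catching VerifyError is reasonable for reporting, but I would not treat it as recoverable: a class that fails verification should simply not be used.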

If you turn the verifier off, or if an attacker in the wild exploits a virtual machine flaw to slip the code past the verifier, the illegal and possibly subversive code runs. If I execute the command below, I receive the answer 1234, the value of the Integer instance.

  java -noverify Test1

That example is as innocuous as can be, but the potential for harm is real enough. Techniques like those presented above, in conjunction with a virtual machine flaw that allows unverified code to be executed, make type confusion exploits possible.

Type confusion

The notion of type is integral to the Java programming language. Every value has an associated type, and the JVM uses its knowledge of a value's type to determine the set of operations that can be performed on the value.

Consistent application of type information is essential for virtual machine security. A type confusion attack, enabled by the introduction of malicious and unverified code, attempts to undermine this foundation of Java security by fooling the JVM into thinking that a block of memory representing one class instance is really an instance of another class. If the attack is successful, the application can then manipulate the instance's state in ways the class's designer never intended. That assault is called a type confusion attack because the virtual machine becomes confused as to the type of the compromised class.
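For contrast, here is what happens when verified code tries to treat an Integer as a Float by legal means. This is a minimal sketch with hypothetical names (CastCheck, describe()). It does not perform a type confusion attack. It only shows the runtime cast check that such an attack has to get around.

```java
// Hypothetical illustration of the type check that verified code cannot skip.
final class CastCheck {
    static String describe(Object value) {
        try {
            // The cast compiles to a checkcast instruction, which the JVM
            // enforces at run time against the object's actual class.
            Float f = (Float) value;
            return "Float " + f;
        } catch (ClassCastException e) {
            return "refused: " + value.getClass().getName() + " is not a Float";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Float.valueOf(1.5f)));
        System.out.println(describe(Integer.valueOf(1234)));
    }
}
```

In verified code the mismatch surfaces as a ClassCastException, and the Integer's memory is never interpreted as a Float.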

If a class is properly verified, type confusion should never be possible. In the second listing above, the verifier caught the attempt and threw a VerifyError. As long as the verifier isn't turned off or bypassed, that guarantee holds.

Fortunately, the last flaw in the Java byte code verifier of which I am aware was discovered and fixed in late 1999. Given this fact, you might assume you aren't in immediate danger; however, you'd be assuming too much.

While flaws are becoming few and far between, ample opportunity remains for slipping unverified code into an application. Remember, you can manually turn verification off. In the last two months I've come across three major Java applications that instruct the user to turn verification off in certain circumstances. One of those applications even has a significant RMI (remote method invocation) component (as you will learn later in the series, RMI allows network class loading to occur at places in applications that you might not expect). If you can avoid it, don't turn verification off.
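If you want an application to notice when it has been launched without verification, one option is to inspect the virtual machine's startup arguments. The following is a minimal sketch that makes two assumptions: the java.lang.management API is available (it arrived in releases later than the ones discussed here), and the VM uses the flag spellings -noverify, -Xverify:none, and -XX:-BytecodeVerificationRemote. Other virtual machines may use different options, and the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup check; the flag spellings below are assumptions.
final class VerifyFlagCheck {
    // Returns true if any argument is a known verification-disabling flag.
    static boolean disablesVerification(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.equals("-noverify")
                    || arg.equals("-Xverify:none")
                    || arg.equals("-XX:-BytecodeVerificationRemote")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The arguments passed to the virtual machine itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (disablesVerification(jvmArgs)) {
            System.err.println("Warning: byte code verification appears to be disabled.");
        } else {
            System.out.println("No verification-disabling flag found.");
        }
    }
}
```

A check like this only catches the obvious case. It cannot detect a verifier that has been bypassed through a virtual machine flaw.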

Be assured

JVM security is an important facet of overall security. The discussion of unverified code and type confusion exploits should help you understand why. Without the assurance of properly verified downloaded code and a type system on which you can depend, secure computing of any kind is not possible.

Next month, I will explore another context of Java security: application security.

Todd Sundsted has been writing programs since computers became available in desktop models. Though originally interested in building distributed applications in C++, Todd moved on to the Java programming language when it became the obvious choice for that sort of thing. In addition to writing, Todd is cofounder and chief architect of PointFire, Inc.

Learn more about this topic
