Java's three types of portability

Find out about the types of portability Java supports, and how Microsoft could undermine the technology by subverting the one type that most threatens its hold on the desktop operating system market

Java has generated a lot of excitement in the programming community because it promises portable applications and applets. In fact, Java provides three distinct types of portability: source code portability, CPU architecture portability, and OS/GUI portability. The distinction matters, because only one of these types is a threat to Microsoft. Microsoft can be expected to undermine that one type of portability while embracing the other two -- all the while claiming to support Java. Understanding the three types of portability and how they work together is critical to understanding the threat to Microsoft, and Microsoft's possible responses.

Before jumping into details on each of these three types of portability, though, let's review a few fundamental terms.

Defining some terms

The following terms are used in this article:

Endianism
Endianism refers to the storage order of bytes in a multibyte quantity in a given CPU. For example, the unsigned short 256 (decimal) requires two bytes of storage: a 0x01 and 0x00. These two bytes can be stored in either order: 0x01, 0x00 or 0x00, 0x01. Endianism determines the order in which the two bytes are stored. For practical purposes, endianism usually matters only when CPUs of different endianism must share data.
Java
Java is several different technologies packaged together -- the Java programming language, the Java virtual machine (JVM), and the class libraries associated with the language. This article discusses all of these aspects.
Java virtual machine (JVM)
The JVM is an imaginary CPU for which most Java compilers emit code. Support for this imaginary CPU is what allows Java programs to run without being recompiled on different CPUs. Nothing in the Java programming language requires Java source code to be compiled into code for the JVM instead of into native object code. In fact, Asymetrix and Microsoft have announced Java compilers that emit native Microsoft Windows applications. (See the Resources section of this article for additional information.)
J-code
J-code is the output emitted by most Java compilers into the class files. J-code can be thought of as object code for the Java virtual machine.
Portability
Portability refers to the ability to run a program on different machines. Running a given program on different machines can require different amounts of work (for example, no work whatsoever, recompiling, or making small changes to the source code). When people refer to Java applications and applets as portable, they usually mean the applications and applets run on different types of machines with no changes (such as recompilation or tweaks to the source code).
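
To make the endianism definition above concrete, here is a minimal sketch that stores the 16-bit value 256 in both byte orders. It assumes a Java release that includes the java.nio package (added well after this article was written), and the class and method names (EndianDemo, store, hex) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Store a 16-bit value using the requested byte order and return the raw bytes.
    static byte[] store(short value, ByteOrder order) {
        return ByteBuffer.allocate(2).order(order).putShort(value).array();
    }

    // Format bytes as "0x01, 0x00" so the storage order is easy to read.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        short value = 256; // 0x0100, the example used in the definition
        System.out.println("big-endian:    " + hex(store(value, ByteOrder.BIG_ENDIAN)));    // 0x01, 0x00
        System.out.println("little-endian: " + hex(store(value, ByteOrder.LITTLE_ENDIAN))); // 0x00, 0x01
        // The native order depends on the CPU the program happens to run on.
        System.out.println("this machine:  " + ByteOrder.nativeOrder());
    }
}
```

The first two lines of output are the same on every machine; only the last line varies, which is exactly the detail that matters when CPUs of different endianism share data.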

Now that we have covered some essential terms, we'll explain each of the three types of Java portability.

Java as a language: source code portability

As a programming language, Java provides the simplest and most familiar form of portability -- source code portability. A given Java program should produce identical results regardless of the underlying CPU, operating system, or Java compiler. This idea is not new; languages such as C and C++ have provided the opportunity for this level of portability for many years. However, C and C++ also provide numerous opportunities to create non-portable code. Unless programs written in C and C++ are designed to be portable from the beginning, the ability to move to different machines is more theoretical than practical. C and C++ leave undefined details such as the size and endianism of atomic data types, the behavior of floating-point math, the value of uninitialized variables, and the behavior when freed memory is accessed.

In short, although the syntax of C and C++ is well defined, the semantics are not. This semantic looseness allows a single block of C or C++ source code to compile to programs that give different results when run on different CPUs, operating systems, compilers, and even on a single compiler/CPU/OS combination, depending on various compiler settings. (See the sidebar Syntax versus semantics for a discussion of the differences between semantics and syntax.)

Java is different. Java provides much more rigorous semantics and leaves less up to the implementer. Unlike C and C++, Java has defined sizes and endianism for the atomic types, as well as defined floating-point behavior.
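
A short sketch of what these guarantees look like in practice follows. The class name FixedSemantics and its methods are hypothetical, and the sketch assumes a Java release recent enough to have the SIZE constants and try-with-resources. It checks the width of the integer types, shows that integer overflow wraps the same way everywhere, that floating-point division by zero yields infinity instead of trapping, and that DataOutputStream always writes the high-order byte first.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FixedSemantics {
    // int is 32-bit two's complement on every platform, so overflow wraps around.
    static int overflow() {
        int max = Integer.MAX_VALUE;
        return max + 1; // Integer.MIN_VALUE
    }

    // IEEE 754 rules apply: dividing a positive double by zero gives +Infinity.
    static double divideByZero() {
        double zero = 0.0;
        return 1.0 / zero;
    }

    // DataOutputStream writes multibyte values high byte first, whatever the host CPU uses.
    static byte[] serialize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("short/int/long bits: " + Short.SIZE + "/" + Integer.SIZE + "/" + Long.SIZE); // 16/32/64
        System.out.println("MAX_VALUE + 1 = " + overflow());
        System.out.println("1.0 / 0.0 = " + divideByZero());
        for (byte b : serialize(0x01020304)) {
            System.out.print(b + " "); // 1 2 3 4
        }
        System.out.println();
    }
}
```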

Additionally, Java defines more behavior than C and C++. In Java, memory doesn't get freed until it can no longer be accessed, and the language doesn't have any uninitialized variables. All these features help to narrow the variation in the behavior of a Java program from platform to platform and implementation to implementation. Even without the JVM, programs written in the Java language can be expected to port (after recompiling) to different CPUs and operating systems much better than equivalent C or C++ programs.
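
The point about uninitialized variables can be illustrated with a small sketch (the class Defaults and its field names are hypothetical). Fields and array elements always start at a defined default value, and the compiler rejects any attempt to read a local variable before it has been assigned.

```java
class Defaults {
    // Fields receive defined default values before any constructor code runs.
    int count;      // 0
    double ratio;   // 0.0
    boolean ready;  // false
    String name;    // null

    public static void main(String[] args) {
        Defaults d = new Defaults();
        System.out.println(d.count + " " + d.ratio + " " + d.ready + " " + d.name);

        // Array elements are zeroed as well; there is no leftover garbage to read.
        int[] buffer = new int[4];
        System.out.println(buffer[3]); // 0

        // A local variable has no default, but reading it unassigned is a compile-time error:
        // int local;
        // System.out.println(local); // rejected: variable might not have been initialized
    }
}
```

Either way, a program never observes a value that depends on whatever happened to be in memory beforehand.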

Unfortunately, the features that make Java so portable have a downside. Java assumes a 32-bit machine with 8-bit bytes and IEEE 754 floating-point math. Machines that don't fit this model, including 8-bit microcontrollers and Cray supercomputers, can't run Java efficiently. For this reason, we should expect C and C++ to be used on more platforms than the Java language. We also should expect Java programs to port more easily than C or C++ programs between those platforms that do support both.

Java as a virtual machine: CPU portability

Most compilers produce object code that runs on one family of CPU (for example, the Intel x86 family). Even compilers that produce object code for several different CPU families (for example, x86, MIPS, and SPARC) only produce object code for one CPU type at a time; if you need object code for three different families of CPU, you must compile your source code three times.

The current Java compilers are different. Instead of producing output for each different CPU family on which the Java program is intended to run, the current Java compilers produce object code (called J-code) for a CPU that does not yet exist.

(Sun has announced a CPU that will execute J-code directly, but indicates the first samples of Java chips won't appear until the second half of this year; full production of such chips will begin next year. Sun Microelectronics' picoJavaI core technology will be at the heart of Sun's own microJava processor line, which will target network computers. Licensees such as LG Semicon, Toshiba Corp., and Rockwell Collins Inc. also plan to produce Java chips based on the picoJavaI core.)

For each real CPU on which Java programs are intended to run, a Java interpreter, or virtual machine, "executes" the J-code. This non-existent CPU allows the same object code to run on any CPU for which a Java interpreter exists.
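
The idea of an interpreter "executing" object code for an imaginary CPU can be sketched with a toy stack machine. This is only an illustration: the opcodes below (PUSH, ADD, MUL, HALT) and the class name ToyVm are invented for the example and are not the JVM's instruction set or the way any real Java interpreter is built.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyVm {
    // Hypothetical opcodes for this sketch; these are not real J-code instructions.
    static final int PUSH = 0; // push the next word of the program onto the stack
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    // Interpret the program one instruction at a time, as a software "CPU" would.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4 on the imaginary machine.
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program)); // prints 20
    }
}
```

The array named program plays the role of the object code: it never changes, and any machine that can run the interpreter loop can run it.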

Producing output for an imaginary CPU is not new with Java: The UCSD (University of California at San Diego) Pascal compilers produced P-code years ago; Limbo, a new programming language under development at Lucent Technologies, produces object code for an imaginary CPU; and Perl creates an intermediate program representation and executes this intermediate representation instead of creating native executable code. The Internet-savvy JVM distinguishes itself from these other virtual CPU implementations by intentionally being designed to allow the generation of provably safe, virus-free code. Prior to the Internet, there was no need for virtual machines to prove programs safe and virus-free. This safety feature, combined with a much better understanding of how to quickly execute programs for imaginary CPUs, has led to rapid, widespread acceptance of the JVM. Today, most major operating systems, including OS/2, MacOS, Windows 95/NT, and Novell NetWare, either have, or are expected to have, built-in support for J-code programs.

The JVM, being essentially an imaginary CPU, is independent of the source code language. The Java language can produce J-code. But so can Ada95. In fact, J-code-hosted interpreters have been written for several languages, including BASIC, Forth, Lisp, and Scheme, and it is almost certain that implementations of other languages will emit J-code in the future. Once the source code has been converted to J-code, the Java interpreter can't tell what programming language created the J-code it is executing. The result: portability between different CPUs.

The benefit of compiling programs (in any language) to J-code is that the same code runs on different families of CPUs. The downside is that J-code doesn't run as fast as native code. For most applications, this won't matter, but for the highest of high-end programs -- those needing every last percent of the CPU -- the performance cost of J-code will not be acceptable.

Java as a virtual OS and GUI: OS portability

Most Microsoft Windows programs written in C or C++ do not port easily to the Macintosh or Unix environments, even after recompiling. Even if the programmers take extra care to deal with the semantic weaknesses in C or C++, the port is difficult. This difficulty occurs even when the port to the non-Windows operating system takes place without changing CPUs. Why the difficulty?

After eliminating the semantic problems in C and C++ and the CPU porting problems, programmers still must deal with the different operating system and different GUI API calls.

Windows programs make very different calls to the operating system than Macintosh and Unix programs. These calls are critical to writing non-trivial programs, so until this portability problem is addressed, porting will remain difficult.

Java solves this problem by providing a set of library functions (contained in Java-supplied libraries such as awt, util, and lang) that talk to an imaginary OS and imaginary GUI. Just as the JVM presents a virtual CPU, the Java libraries present a virtual OS/GUI. Every Java implementation provides libraries implementing this virtual OS/GUI. Java programs that use these libraries to provide needed OS and GUI functionality port fairly easily.
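
As a small, non-GUI illustration of the virtual OS idea, the sketch below builds a file path through the standard java.io.File class instead of hard-coding a separator character. The class name PortablePaths, the join helper, and the sample path parts are assumptions made for the example; only File, File.separator, and System.getProperty are standard library calls.

```java
import java.io.File;

class PortablePaths {
    // Build a path from its parts and let the library choose the separator
    // that the underlying operating system expects.
    static String join(String first, String... rest) {
        File path = new File(first);
        for (String part : rest) {
            path = new File(path, part);
        }
        return path.getPath();
    }

    public static void main(String[] args) {
        // The program text is identical on every platform; only the output differs.
        System.out.println("OS:        " + System.getProperty("os.name"));
        System.out.println("separator: " + File.separator);
        System.out.println("path:      " + join("reports", "1997", "q1.txt"));
    }
}
```

The same source (and the same compiled class file) produces a native-looking path on each system because the platform-specific knowledge lives in the library, not in the program.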

Using a portability library instead of native OS/GUI calls is not a new idea. Products such as Visix Software's Galaxy and Protools Software's Zinc provide this capability for C and C++. Another approach, not followed by Java, is to pick a single OS/GUI as the master and provide wrapper libraries supporting this master OS/GUI on all machines to which you wish to port. The problem with the master OS/GUI approach is that the ported applications often look alien on the other machines. Macintosh users, for example, complained about a recent version of Microsoft Word for Macintosh because it looked and behaved like a Windows program, not like a Macintosh program. Unfortunately, the approach Java has taken has problems too.

Java has provided a least-common-denominator functionality in its OS/GUI libraries. Features available on only one OS/GUI, such as tabbed dialog boxes, were omitted. The advantage to this approach is that mapping the common functionality to the native OS/GUI is fairly easy and, with care, can provide applications that work as expected on most OSs/GUIs. The disadvantage is that there will be functionality available to native-mode applications that is unavailable to Java applications. Sometimes developers will be able to work around this by extending the AWT; other times they will not. In those cases where desired functionality is unattainable with workarounds, developers most likely will choose to write non-portable code.

Who cares about portability?

Three main constituencies care about portability: developers, end-users, and MIS departments.

Developers: Opportunities and threats loom large

Developers have a vested interest in creating portable software. On the upside, portable software allows them to support more platforms, which leads to a larger base of potential customers. However, the same portability that allows developers to target new markets also allows competitors to target their market.

In a nutshell, Java portability pushes the application software market away from segregated markets based on the various OSs and GUIs and toward one large market. In the current software market, for example, Microsoft is a force to be reckoned with in the Windows and Macintosh application software markets, but has almost no presence in the OS/2 and Unix markets. This partitioning allows companies in the OS/2 and Unix markets to disregard Microsoft as a competitor. Java makes it easier for these companies to compete in the Windows market, but also allows Microsoft easier entry into the OS/2 and Unix markets.

Users: The indirect beneficiaries of portability

Users don't care about portability, per se. If portability makes their lives easier and more pleasant, then they're all for it; if not, they're not. Portability does have some positive effects for users, but these are somewhat indirect. The positive effects:
