Java and the new Internet programming paradigm

An excerpt from Rise & Resurrection of the American Programmer

Page 3 of 5
  • Primitive data types: Java supports numeric data types (8-bit byte, 16-bit short, 32-bit int, and 64-bit long); there is no unsigned specifier for integer data types. The 8-bit byte data type replaces the old C/C++ char data type. Real numeric types are 32-bit float and 64-bit double; these types, and their arithmetic operations, are as defined by the IEEE 754 specification.

  • Character data types: Java's char data type is different from traditional C; it defines a 16-bit Unicode character, which defines character codes in the range of 0 to 65,535. The Unicode character set facilitates internationalization and localization of character codes, in keeping with the worldwide nature of the Internet.

  • Boolean data types: Java adds a boolean data type, which takes the value true or false. Unlike common programming practice in C, a Java boolean cannot be converted to any numeric type.

  • Arithmetic and relational operators: All of the familiar C and C++ arithmetic operations are available. In addition, Java adds the ">>>" operator to indicate an unsigned (logical) right shift; it uses the "+" operator for string concatenation.

  • Arrays: Java allows the declaration and allocation of arrays of any type, and the programmer can allocate arrays of arrays to achieve multidimensional arrays. To get the length of an array, Java provides a read-only length field on every array (written without parentheses, unlike the length() method of strings). Access to elements of an array can be accomplished with normal C-style indexing, but the Java run-time interpreter checks all array accesses to ensure that their indices are within the range of the array. Note that the familiar C concept of a pointer to an array of memory elements does not exist in Java; arbitrary pointer arithmetic has also disappeared, which means that programmers won't be writing code that marches right past the end of an array, trashing the contents of innocent areas of memory and causing program failures in various unpredictable ways.

  • Strings: The String class is for read-only objects, and the StringBuffer class provides for string objects that the programmer wishes to modify. Note that strings are Java language objects, not pseudoarrays of characters as they are in C; however, the Java compiler understands that a string of characters enclosed within double quotes is to be instantiated as a String object. As noted earlier, the "+" operator supports concatenation of strings; the length() accessor method can be used to obtain the number of characters in the string.

  • Multilevel breaks: Java has no goto statement, but it does provide break and continue statements that, combined with the notion of labeled blocks of code, give the programmer a mechanism to exit from multiple levels of nested loops. This is in contrast to C, where the break statement only allows the program to escape from the immediately enclosing loop or switch. Java's break label and continue label statements are equivalent to the last label and next label statements, respectively, in the Perl programming language used by many Internet application developers.

  • Memory management and garbage collection: Java does not support the malloc and free functions, with which C and C++ programmers have traditionally managed the allocation of memory within their programs. Java has a "new" operator that allocates memory for objects; the run-time system then keeps track of the object's status and automatically reclaims memory when objects are no longer in use. Because Java does not support or allow memory pointers, all references within a program to allocated storage (for example, to objects that have been created within the program) are made through symbolic references or "handles." The Java memory manager keeps track of references to existing objects, and when an object has no more references to it, it becomes a candidate for automatic garbage collection. The Java run-time system performs background garbage collection during idle periods on the user's workstation.

  • Integrated thread synchronization: Java provides multithreading support at the syntactic (language) level and also via support from the run-time system and thread objects.
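
Several of the features in this list can be seen side by side in a short program. What follows is only a minimal sketch: the class name LanguageTour and its method names are hypothetical, chosen for illustration, and the code assumes a current Java compiler rather than the original 1.0 release.

```java
class LanguageTour {

    // ">>>" shifts zero bits in from the left; ">>" would copy the sign bit.
    static int logicalShift(int value, int bits) {
        return value >>> bits;
    }

    // Every array access is range-checked at run time, so an out-of-range
    // index raises an exception instead of touching unrelated memory.
    static boolean readPastEnd(int[] data) {
        try {
            int unused = data[data.length]; // one element past the end
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // String objects are read-only; StringBuffer is the modifiable counterpart.
    static String greet(String name) {
        StringBuffer buffer = new StringBuffer("Hello");
        buffer.append(", ").append(name);
        return buffer.toString() + "!"; // "+" concatenates strings
    }

    // A labeled break leaves both loops at once.
    static int firstRowContaining(int[][] grid, int target) {
        int found = -1;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    found = row;
                    break search;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(logicalShift(-8, 1));   // 2147483644
        System.out.println(-8 >> 1);               // -4
        System.out.println(readPastEnd(new int[3]));
        System.out.println(greet("Java"));
        System.out.println(firstRowContaining(new int[][] {{1, 2}, {3, 4}}, 3));
    }
}
```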

Features removed from C++

The discussion above has already illustrated some of the features that Sun removed from C and C++ when it created Java. Here's a more complete list:

  • Typedefs, defines, and preprocessors have been eliminated: There is no typedef, no #define statement, and no preprocessor. As a result, there is no need for the header files one typically sees in C and C++. Instead, the definitions of other classes and their methods are included within the Java language source files. This is more than a cosmetic trick: In order to understand the typical C/C++ program, you must first read all of the related header files, #defines, and typedefs to understand the overall context of the program.

  • Structures and unions have been eliminated: Java achieves the same effect by allowing the programmer to declare a class with appropriate instance variables. C++ might have followed the same approach, but for obvious reasons, chose to maintain compatibility with C; the Java designers deliberately avoided committing themselves to C/C++ compatibility when they felt it was inappropriate to do so.

  • Functions have been eliminated: Anything the programmer can do with a function can be accomplished by defining a class and creating methods for the class. Thus, the Java designers have tried to eliminate the practice of creating "hybrid" mixtures of OO and procedural programming style.

  • Multiple inheritance is not supported: Only single inheritance is allowed. The desirable features of multiple inheritance are provided with interfaces in Java, which are conceptually similar to what's found in Objective-C. An interface is a definition of a set of methods that one or more classes will implement; it contains only the declarations of methods and constants.

  • Goto statements have been eliminated because the most legitimate uses of the goto have typically been to exit from the innermost part of a loop, and that facility has been provided with the break and continue statements.

  • Operator overloading has been eliminated, which means that there are no mechanisms for programmers to overload the standard arithmetic operators. Where this kind of familiar C/C++ activity needs to be carried out, it can be accomplished in a more straightforward way by declaring a class, appropriate instance variables, and appropriate methods to manipulate those variables.

  • Automatic coercions of data types have been eliminated on the premise that if the programmer wants to "coerce" a data element of one type into a different data type, it should be done explicitly with a "cast" operation. Thus, the fragment of code shown below:

    int sampleInt;
    double sampleFloat = 3.1415926535;
    sampleInt = sampleFloat;        // rejected: double to int without a cast

    would result in a compiler error because of the possible loss of precision. To accomplish this properly would require the programmer to write the code in the following fashion:

    int sampleInt;
    double sampleFloat = 3.1415926535;
    sampleInt = (int)sampleFloat;   // explicit cast; the fraction is discarded

  • Pointers and pointer arithmetic have been eliminated since they are one of the primary causes of bugs in programs. (Pointers in a data structure are roughly equivalent to goto statements in a control structure.) Because structures are gone, and arrays and strings are represented as objects, the need for pointers has largely disappeared.
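
As a rough illustration of how a class, an ordinary method, and an interface stand in for a struct, an overloaded operator, and multiple inheritance, consider the sketch below. The names Complex, Printable, plus, and describe are hypothetical and appear nowhere in the Java libraries.

```java
// A class with instance variables takes the place of a C struct.
final class Complex implements Printable {
    final double re;
    final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Without operator overloading, addition is an ordinary method.
    Complex plus(Complex other) {
        return new Complex(re + other.re, im + other.im);
    }

    // Required by the Printable interface declared below.
    public String describe() {
        return re + " + " + im + "i";
    }

    public static void main(String[] args) {
        Complex sum = new Complex(1.0, 2.0).plus(new Complex(3.0, 4.0));
        System.out.println(sum.describe()); // 4.0 + 6.0i
    }
}

// An interface only declares methods. A class extends at most one class
// but may implement any number of interfaces.
interface Printable {
    String describe();
}
```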

Object-oriented features

With the exception of primitive data types, everything in Java is an object -- and even the primitive types can be encapsulated within objects if necessary. Java supports four fundamental aspects of object-oriented technology:

  • Encapsulation: Instance variables and methods for a class are packaged together, thus providing modularity and information hiding.

  • Inheritance: New classes and behavior can be defined in terms of existing classes.

  • Polymorphism: The same message sent to different objects results in behavior that is dependent on the nature of the object receiving the message. If you send a "move" message to an "animal" object, you don't want to be concerned with the nature of the animal you're talking to; if it's a bird, it should be smart enough to carry out the "move" by flying, whereas a snake would respond to the same message by wriggling, a rabbit would hop, and so on.

  • Dynamic binding: As implied above, a programmer doesn't want to be required to specify, at coding time, the specific type of object to which a message is sent; the type resolution needs to be done at run time. This is especially important for Java, because objects (Java applets) can come from anywhere on the network and may have been developed by anyone.
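
The animal example can be written down directly. This is a minimal sketch with hypothetical class names (AnimalDemo, Animal, Bird, Snake, Rabbit); the point is that describeMoves never asks what kind of animal it holds, and the method that runs is selected at run time.

```java
class AnimalDemo {

    // The caller sends the same "move" message to every element; which
    // method body runs depends on the run-time class of each object.
    static String describeMoves(Animal[] animals) {
        String[] moves = new String[animals.length];
        for (int i = 0; i < animals.length; i++) {
            moves[i] = animals[i].move();
        }
        return String.join(", ", moves);
    }

    public static void main(String[] args) {
        Animal[] zoo = {new Bird(), new Snake(), new Rabbit()};
        System.out.println(describeMoves(zoo)); // flies, wriggles, hops
    }
}

class Animal {
    String move() {
        return "moves";
    }
}

class Bird extends Animal {
    @Override
    String move() {
        return "flies";
    }
}

class Snake extends Animal {
    @Override
    String move() {
        return "wriggles";
    }
}

class Rabbit extends Animal {
    @Override
    String move() {
        return "hops";
    }
}
```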

Java follows the C++ tradition of supporting "public," "private," and "protected" variables; the default (if the programmer doesn't specify one of these three) is "friendly," and indicates that the instance variables are accessible to all objects within the same package but inaccessible to objects outside the package. As in C++, the programmer can declare constructor methods that perform initialization when an object is instantiated from a class. But rather than the destructor method that one finds in C++, Java has a finalizer method; this does not require (and does not allow) the programmer to explicitly free the memory associated with the object when it's no longer used.

Like many object-oriented languages, Java supports class methods and class variables -- that is, methods and variables associated with the class as a whole, rather than with individual instances of the class. Class methods can't access instance variables directly, nor can they invoke instance methods, unless they are given a reference to a specific object. By definition, a class variable is local to the class itself, and there is only a single copy of the variable, which is shared by every object that the programmer instantiates from the class.
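
A small sketch may make the distinction concrete. The Ticket class and its members are hypothetical; in Java source, class variables and class methods are the ones marked with the static keyword.

```java
class Ticket {
    // Class variable: a single copy shared by every Ticket object.
    private static int issued = 0;

    // Instance variable: each Ticket has its own copy.
    private final int number;

    Ticket() {
        issued++;
        number = issued;
    }

    // Class method: there is no current object here, so it works only
    // with class variables unless it is handed a Ticket to inspect.
    static int issuedCount() {
        return issued;
    }

    // Instance method: it can read both kinds of variable.
    int number() {
        return number;
    }

    public static void main(String[] args) {
        Ticket first = new Ticket();
        Ticket second = new Ticket();
        System.out.println(first.number() + " " + second.number());
        System.out.println(Ticket.issuedCount());
    }
}
```

Note that issuedCount() is called on the class itself, not on any one ticket.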

Also, Java supports the concept of abstract classes and abstract methods. This allows the programmer to create a "generic" class, defined in the form of a template. By definition, the programmer cannot instantiate objects from the abstract class; objects can only be instantiated from a subclass of the abstract class, and it's only within those (concrete) subclasses that the defined methods can actually be used.
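
As a minimal sketch of the idea, with hypothetical names (Shape, Rectangle, Triangle, ShapeDemo): the abstract class fixes what every shape must offer, and only the concrete subclasses can be instantiated.

```java
class ShapeDemo {
    public static void main(String[] args) {
        Shape small = new Rectangle(2, 3);
        Shape large = new Triangle(4, 5);
        // "new Shape()" would be rejected by the compiler: Shape is abstract.
        System.out.println(small.area());            // 6.0
        System.out.println(large.largerThan(small)); // true
    }
}

abstract class Shape {
    // Declared here, implemented only by concrete subclasses.
    abstract double area();

    // An ordinary method in the abstract class may rely on area().
    boolean largerThan(Shape other) {
        return area() > other.area();
    }
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    double area() {
        return width * height;
    }
}

class Triangle extends Shape {
    private final double base;
    private final double height;

    Triangle(double base, double height) {
        this.base = base;
        this.height = height;
    }

    @Override
    double area() {
        return 0.5 * base * height;
    }
}
```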

Java comes with several libraries of utility classes and methods; these include:

  • java.lang: a collection of language types that are always imported into a compilation unit. This library contains the definition of Object (the root of the entire class hierarchy in Java) and Class, plus threads, exceptions, and wrappers for primitive data types, and various other fundamental classes.

  • java.io: roughly equivalent to the Standard I/O library in most Unix systems; it contains classes to access streams of data and random-access files. An associated library, java.net, provides support for sockets, Telnet interfaces, and URLs.

  • java.util: contains utility classes such as Dictionary, Hashtable, and Stack, as well as Date and Time classes.

  • java.awt: the Abstract Window Toolkit (AWT), which provides an abstract layer enabling the programmer to port Java applications from one windowing system to another. The library contains classes for basic interface components such as events, fonts, colors, buttons, and scrollbars.
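
To give a feel for these libraries, here is a minimal sketch that uses Stack and Hashtable from java.util together with the Integer wrapper from java.lang. The LibraryTour class and its methods are hypothetical, and the angle-bracket type parameters assume a later Java release than the one described here.

```java
import java.util.Hashtable;
import java.util.Stack;

class LibraryTour {

    // java.util.Stack is last-in, first-out, so popping reverses the input.
    static String reverse(String[] words) {
        Stack<String> stack = new Stack<String>();
        for (String word : words) {
            stack.push(word);
        }
        StringBuffer out = new StringBuffer();
        while (!stack.empty()) {
            out.append(stack.pop());
            if (!stack.empty()) {
                out.append(' ');
            }
        }
        return out.toString();
    }

    // Hashtable stores objects, so the int counts travel inside the
    // java.lang.Integer wrapper class.
    static int count(String[] words, String target) {
        Hashtable<String, Integer> counts = new Hashtable<String, Integer>();
        for (String word : words) {
            Integer seen = counts.get(word);
            int next = (seen == null) ? 1 : seen.intValue() + 1;
            counts.put(word, Integer.valueOf(next));
        }
        Integer result = counts.get(target);
        return (result == null) ? 0 : result.intValue();
    }

    public static void main(String[] args) {
        System.out.println(reverse(new String[] {"a", "b", "c"}));
        System.out.println(count(new String[] {"x", "y", "x"}, "x"));
    }
}
```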

Java multithreading

The Java language and run-time environment support the concept of multithreading, so that Java applets can operate concurrently to play music, run animations, download a text file from a server, and allow the user to scroll down a page. Multithreading is a practical way of obtaining fast, straightforward concurrency within a single process space (that is, taking into account that there may be programs other than Java and the Java-enabled Web browser running on the user's platform, and the user's operating system may also be attempting to coordinate background printing, spreadsheet recalculations, and various other processes).

The Java library provides a class that supports a collection of methods to start a thread, run a thread, stop a thread, and check on a thread's status. Java's approach is sometimes called "lightweight processes" or "execution contexts"; it's modeled after the Cedar/Mesa systems implemented by Xerox PARC, which in turn are based on a formal set of synchronization primitives developed by Professor C.A.R. Hoare in the mid-1970s.
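
In the standard library that class is java.lang.Thread. The sketch below, built around a hypothetical Summer class, starts a thread, waits for it with join(), and checks its status with isAlive(). It deliberately does not stop the thread from outside, since later Java releases discourage that.

```java
class Summer extends Thread {
    private final int limit;
    private long total;

    Summer(int limit) {
        this.limit = limit;
    }

    // run() holds the work; start() arranges for it to execute in a new thread.
    @Override
    public void run() {
        long sum = 0;
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        total = sum;
    }

    long total() {
        return total;
    }

    // Starts a worker, waits for it to finish, and returns its result.
    static long sumInBackground(int limit) {
        Summer worker = new Summer(limit);
        worker.start();
        try {
            worker.join(); // after join() returns, the worker's result is visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        if (worker.isAlive()) {
            throw new IllegalStateException("worker still running after join");
        }
        return worker.total();
    }

    public static void main(String[] args) {
        System.out.println(sumInBackground(100)); // 5050
    }
}
```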

Java's threads are preemptive in nature, and the threads can also be time sliced if the Java interpreter runs on a hardware/software platform that supports time slicing. For environments that don't support time slicing -- such as the Macintosh System 7.5 operating on the computer I used to write this book -- a thread retains control of the processor once it has begun unless a higher-priority thread interrupts. This means that for compute-bound threads, it behooves the programmer to use the yield() method at appropriate times and places in order to give other threads a chance to operate.

At the language level, methods within a class that are declared synchronized are not allowed to run concurrently; such methods run under the control of monitors to ensure that their instance variables remain in a consistent state. Every class and instantiated object has its own monitor that comes into play if necessary. When a synchronized method within a class is entered, it acquires a monitor on the current object; when the synchronized method returns (exits) by any means, its associated monitor is released and other synchronized methods within that object are then allowed to begin executing. Thus, if programmers want to ensure that the classes and methods within their Java programs are "thread safe," then any methods that have the ability to change the values of instance variables should be declared synchronized; this will ensure that only one method can change the state of an object at any time.
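
A minimal sketch of a thread-safe class along these lines follows; SafeCounter and countWithThreads are hypothetical names. Both methods that touch the instance variable are declared synchronized, so the increments from several threads cannot interleave.

```java
class SafeCounter {
    private int count = 0;

    // Entering a synchronized method acquires the monitor of this object;
    // returning from it (by any means) releases the monitor.
    synchronized void increment() {
        count++;
    }

    synchronized int value() {
        return count;
    }

    // Runs several threads against one counter and returns the final value.
    static int countWithThreads(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // 4000
    }
}
```

If the synchronized keyword were removed from increment(), the final value could fall short of 4000, because count++ is a read, an add, and a write that two threads can interleave.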

Security mechanisms in Java

Security is one of the big concerns on the Internet these days, and the designers of Java have taken it into account in their definition of the programming language, in the compiler, and in the run-time system. As we've already seen, the Java language has eliminated memory pointers -- which means that programmers can't forge pointers into memory, one of the typical sources of security breaches. In addition, the run-time binding of object structures to physical memory addresses means that a programmer can't infer the physical memory layout of a class simply by looking at its definition.

However, it's not enough to make a more restrictive language definition and enforce it with a compiler. The problem is that a Java-enabled Web browser is importing applets from anywhere on the network, including, potentially, applets that "look" like Java but might have been created by a bogus compiler or even hand-coded by a hacker. Thus, to ensure security, the Java run-time interpreter cannot trust the incoming code and must subject it to a verification process to ensure that:

  • The incoming code doesn't forge memory pointers.

  • The incoming code doesn't violate access restrictions.

  • The incoming code accesses objects in the manner that they have been declared (for example, OutputStream objects are always used as OutputStreams and never as anything else).

The complete process in the creation, compiling, loading, verification, and execution of a Java program is shown in Figure 2. The key point here is that the Java bytecode loader and bytecode verifier make no assumption about the primary source of the bytecode stream that the user wishes to execute; it may have come from the user's local file system (that is, on the user's own workstation), or it may have come from anywhere on the Internet.

Figure 2. Implementing security in Java

When the bytecode verifier is finished with its examination, the run-time interpreter can be sure that there will be no operand stack overflows or underflows, that all object field accesses will be legal (that is, appropriate usage of private, public, and protected methods and instance variables), and that the parameter types of all bytecode instructions are correct. As a result, the interpreter does not have to check for stack overflows or for correct operand types while the Java applet is executing -- and it can therefore run at full speed without compromising reliability.

While a Java applet is running, it may request that a class, or set of classes, be loaded into the user's computer -- and, of course, these new classes could also come from anywhere on the Internet. The new incoming code also has to be checked by the bytecode verifier, but there's one more level of protection that needs to be explained, involving "name spaces."

Most programmers understand this concept from their previous work, but for the nonprogrammers reading this, a brief analogy will help explain the concept. Suppose that you have a brother named Fred and that your spouse also has a brother named Fred. At the dinner table, your spouse casually says, "Oh, by the way, Fred called today to say hello." The question is: Which Fred? It depends on whether you're talking about the "name space" of your own family or that of your spouse. In the case of Java, when we refer to a class name, we need to know which name space it's associated with.
