Serialization and the JavaBeans Specification

The trick to controlling and -- when necessary -- preventing serialization

In last month's column, "Do it the 'Nescafé' way -- with freeze-dried JavaBeans," we discussed some of the reasons for, and applications of, freeze-drying JavaBeans into a persistent state. You will recall that serialization of an object is simply the encoding of its state (the values of its fields) in a structured way so that the object can be stored or transmitted as data and recreated at another place and time. (If you need an introduction to serialization in Java, see last month's column. This month we'll be diving right into coding examples, so you'll want to be prepared.)

First, we'll look at serialization of aggregate objects (not much of a feat, as you'll see). We've got a quick example of how to implement the Externalizable interface (for you control freaks out there). Then, we'll discuss how to keep sensitive information from being serialized at all. Finally, we'll finish up with some enlightening reader feedback on last month's column.

Serializing object structures

Last month, we saw that you can make a class serializable simply by adding implements java.io.Serializable to the class definition, because class java.io.ObjectOutputStream knows how to serialize any class descended from java.lang.Object (which means any class at all).

But what if your object contains references to other objects or is composed of other objects? No problem! The serialization mechanism automatically detects references to other objects. As long as the "sub-objects" are also serializable, ObjectOutputStream serializes them and includes them in the stream.

Let's look at a concrete example of this. In the following code example, we implement a TreeNode object. This object has internal fields of sToken_ (a string) and iType_ and iValue_ (integers). It also contains references to two other objects, tnLeft_ and tnRight_, which are references to the node's left and right subtrees. (This node class could be extended easily for use in an expression evaluator.)

import java.io.*;
// (java.lang is imported automatically, so it needs no import statement.)
// This is boring, but it gets the point across.
public class TreeNode
    extends java.lang.Object
    implements java.io.Serializable {
    protected int iType_;
    protected int iValue_;
    protected String sToken_ = "";
    protected TreeNode tnLeft_ = null;
    protected TreeNode tnRight_ = null;
    // Necessary to be a well-behaved bean.
    public TreeNode()
    {
        iType_ = iValue_ = -1;
    }
    // Explicit constructor
    public TreeNode(int iType, int iValue, String sToken,
                    TreeNode tnLeft, TreeNode tnRight)
    {
        iType_ = iType;
        iValue_ = iValue;
        sToken_ = sToken;
        tnLeft_ = tnLeft;
        tnRight_ = tnRight;
    }
    // Print me (indented) and all of my children
    public void print(String sIndent)
    {
        System.out.println(sIndent + "type:  " + iType_);
        System.out.println(sIndent + "value: " + iValue_);
        System.out.println(sIndent + "token: " + sToken_);
        System.out.println(sIndent + "left:");
        if (tnLeft_ != null) {
            tnLeft_.print(sIndent + "    ");
        } else {
            System.out.println(sIndent + "    (null)");
        }
        System.out.println(sIndent + "right:");
        if (tnRight_ != null) {
            tnRight_.print(sIndent + "    ");
        } else {
            System.out.println(sIndent + "    (null)");
        }
    }
    // Property accessors
    public void setToken(String sToken) { sToken_ = sToken; }
    public String getToken() { return sToken_; }
    public void setType(int iType) { iType_ = iType; }
    public int getType() { return iType_; }
    public void setValue(int iValue) { iValue_ = iValue; }
    public int getValue() { return iValue_; }
    public void setLeft(TreeNode tnLeft) { tnLeft_ = tnLeft; }
    public TreeNode getLeft() { return tnLeft_; }
    public void setRight(TreeNode tnRight) { tnRight_ = tnRight; }
    public TreeNode getRight() { return tnRight_; }
}

A TreeNode is created with token, type, and value, and is connected to left and right branches at construction time. The property accessors allow us to set and interrogate the properties, including the left and right branches. (The BeanBox won't show the branches as properties, since there's no PropertyEditor for them. For more on the BeanBox, see "The BeanBox: Sun's JavaBeans test container.")

Our test class creates a recursive tree structure of TreeNodes and writes it to a file. Here's the source for the test class, followed by a diagram of the structure it creates and serializes:

001 import java.io.*;
002 import java.beans.*;
003 // TreeNode is in the same (default) package, so no import is needed.
004 
005 public class StreamDemo {
006 
007     private static void Usage() throws java.io.IOException
008     {
009         System.out.println("Usage:\n\tStreamDemo w file\n\tStreamDemo r file\n\tStreamDemo i name");
010 
011         IOException ex = new IOException("ERROR");
012         throw ex;
013     }
014 
015     public static void main(String[] args)
016     {
017         System.out.println(args.length);
018 
019         try {
020             if (args.length <= 0)
021             {
022                 Usage();
023             }
024 
025             String cmd = args[0];
026 
027             if (cmd.compareTo("w") == 0)
028             {
029                 if (args.length != 2)
030                 {
031                     Usage();        // Unix anyone?
032                 }
033 
034                 TreeNode    tnLL = new TreeNode(4, 12, "Left Left",
035                                                     null, null);
036                 TreeNode    tnL = new TreeNode(2, 4, "Left", tnLL, null);
037                 TreeNode    tnR = new TreeNode(7, 9, "Right", null, null);
038                 TreeNode    tnRoot = new TreeNode(1, 2, "Root", tnL, tnR);
039 
040                 tnRoot.print("");
041 
042                 FileOutputStream f = new FileOutputStream(args[1]);
043                 ObjectOutputStream s = new ObjectOutputStream(f);
044 
045                 s.writeObject(tnRoot); 
046
047                 s.close();      // close() flushes the stream first
048             }
049  
050             else if (cmd.compareTo("r") == 0)
051             {
052                 if (args.length != 2)
053                 {
054                     Usage();
055                 }
056 
057                 FileInputStream f = new FileInputStream(args[1]);
058                 ObjectInputStream s = new ObjectInputStream(f);
059 
060                 System.out.println("Reading TreeNode:");
061 
062                 TreeNode tnRoot = (TreeNode) s.readObject(); 
063
064                 tnRoot.print("");
065             }
066 
067             else if (cmd.compareTo("i") == 0)
068             {
069                 if (args.length != 2)
070                 {
071                     Usage();
072                 }
073 
074                 // Given a name, look for "name.ser"
075                 Object theBean = Beans.instantiate(null, args[1]);
076                 String sName = theBean.getClass().getName();
077 
078                 if ( sName.compareTo("TreeNode") == 0 )
079                 {
080                     TreeNode tn = (TreeNode)theBean;
081                     tn.print("");
082                 }
083                 else
084                 {
085                     System.err.println("There was a bean in that file, " +
086                     "but it was a " + sName);
087                 }
088             }
089 
090             else {
091                 System.err.println("Unknown command " + cmd);
092                 Usage();
093             }
094 
095         }
096 
097         catch (IOException ex) {
098             System.out.println("IO Exception:");
099             System.out.println(ex.getMessage());
100             ex.printStackTrace();
101         }
102         catch (ClassNotFoundException ex) {
103             System.out.println("ClassNotFound Exception:");
104             System.out.println(ex.getMessage());
105             ex.printStackTrace();
106         }
107     }
108 }

The tree created by this code looks like this:

[Figure: Tree structure of TreeNodes code sample]

The test program lets you exercise the TreeNode class in one of three ways. Lines 42-45 create FileOutputStream f and then use f to create an ObjectOutputStream, upon which we then invoke writeObject(). The serialization "machinery" inside the ObjectOutputStream analyzes the object that's passed to it and serializes to the stream any fields it finds. If the ObjectOutputStream finds any non-null object references inside the TreeNode, it then calls writeObject() recursively to serialize those objects, as well. In our sample case, it finds tnLeft_ and tnRight_ in each TreeNode, and serializes them if they're non-null.

Now, the object serializer outputs only the fields, not the bytecodes, of an object. So how can the object run elsewhere if the bytecodes aren't in the .ser file? When an object is created from its serialized representation, the Java virtual machine (JVM) creating the instance of the object must either "know" about the class (that is, the class must already be loaded into the JVM), or the JVM must know where to get the class definition (using a class loader). The methods java.beans.Beans.instantiate() and java.io.ObjectInputStream.readObject() take care of all of the class file loading for you, under the hood. (You can control the loading of classes, but just how to do so is beyond our scope here.)

The next piece of code (lines 57-62) shows how to recreate the TreeNode tree: Just call java.io.ObjectInputStream.readObject() and typecast the result to the class you're expecting. Java's typecasting is checked at run time, so if you get something other than a TreeNode from readObject(), the cast throws a ClassCastException rather than handing you an object of the wrong type.
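To make the cast behavior concrete, here is a minimal sketch that round-trips a value through an in-memory stream and then asks for it back as a specific type. The class TypedRead and its methods toBytes() and readAs() are hypothetical names invented for this illustration; only the java.io classes and Class.cast() are real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper class; the names are ours, not part of any library.
final class TypedRead {

    // Serialize any Serializable value into an in-memory byte array.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Deserialize, then check the type before handing the object back.
    static <T> T readAs(byte[] data, Class<T> expected) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            Object obj = in.readObject();
            // Class.cast() acts like a plain cast: it throws
            // ClassCastException if obj is not an instance of expected.
            return expected.cast(obj);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] data = toBytes("a string, not a tree");
        System.out.println(readAs(data, String.class));
        try {
            readAs(data, Integer.class);   // wrong type on purpose
        } catch (ClassCastException ex) {
            System.out.println("Wrong type in the stream: " + ex.getMessage());
        }
    }
}
```

Note that the object has already been read successfully by the time the cast fails, so it's the cast, not readObject(), that reports the mismatch.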

The final important code snippet above (lines 74-82) uses the method java.beans.Beans.instantiate() to create the bean from the .ser file. This method is simply a higher-level interface to an ObjectInputStream. It lets you specify a class loader, so you have control over where your class files come from. Also, if the object that is loaded turns out to be an applet, this method initializes the applet by setting the applet's initial size, creating a context for the applet to run in, and calling the applet's init() method. See the documentation for java.beans.Beans.instantiate() for more on how this method works.
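As a rough sketch of the class-loader option, the following writes a small bean to a temporary directory as saved.ser and then asks Beans.instantiate() to find it through a URLClassLoader pointed at that directory. The bean class Counter, the file name saved.ser, and the method saveAndInstantiate() are all assumptions made up for this example; the column's own demo passes null as the loader and relies on the class path instead.

```java
import java.beans.Beans;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

final class InstantiateSketch {

    // Hypothetical bean, used only for this illustration.
    public static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        private int count;
        public Counter() { }
        public int getCount() { return count; }
        public void setCount(int count) { this.count = count; }
    }

    // Writes a Counter to <tempdir>/saved.ser, then reloads it by bean name.
    static Object saveAndInstantiate(int count) {
        Path dir = null;
        Path file = null;
        try {
            dir = Files.createTempDirectory("beans");
            file = dir.resolve("saved.ser");

            Counter bean = new Counter();
            bean.setCount(count);
            try (ObjectOutputStream out =
                     new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(bean);
            }

            // The loader decides where "saved.ser" is looked up; its parent
            // loader supplies the Counter class itself.
            URL[] searchPath = { dir.toUri().toURL() };
            try (URLClassLoader loader = new URLClassLoader(
                     searchPath, InstantiateSketch.class.getClassLoader())) {
                // Given the name "saved", instantiate() looks for "saved.ser".
                return Beans.instantiate(loader, "saved");
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        } finally {
            // Best-effort cleanup of the temporary file and directory.
            if (file != null) file.toFile().delete();
            if (dir != null) dir.toFile().delete();
        }
    }

    public static void main(String[] args) {
        Counter restored = (Counter) saveAndInstantiate(7);
        System.out.println("count = " + restored.getCount());
    }
}
```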

After all this explaining, the answer to the question "How do I make a complex structure of objects serializable?" is simple: Make sure every sub-object is serializable, and let Java handle the connections between the objects.
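What happens when a sub-object isn't serializable? The sketch below, built on two hypothetical classes (Car and Engine, invented for this example), shows one way to find out: try to write the object graph and watch for java.io.NotSerializableException, which ObjectOutputStream throws when it reaches an object whose class doesn't implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SubObjectCheck {

    // Hypothetical sub-object that does NOT implement Serializable.
    static class Engine {
        int horsepower = 90;
    }

    // Hypothetical container: serializable itself, but it holds an Engine.
    static class Car implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "demo";
        Engine engine = new Engine();
    }

    // Returns true if everything reachable from root can be written.
    static boolean canSerialize(Object root) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(root);
            return true;
        } catch (NotSerializableException ex) {
            // The message names the class that blocked serialization.
            System.out.println("Not serializable: " + ex.getMessage());
            return false;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Car()));       // false: Engine blocks it
        System.out.println(canSerialize("plain string"));  // true
    }
}
```

Making Engine implement Serializable is all it would take for the Car to go through.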

One final detail on serializing a complex structure: What if you had, say, a hundred references to the same object all throughout the structure? You might expect that the object would be serialized a hundred times, and when it was deserialized, you'd have a hundred instances of the same object in your structure, instead of just one. ObjectOutputStream is smarter than that, though. As it's serializing, it keeps track of the identity of each object, and if it's seen that object before, it inserts a special token into the output stream indicating which previously-seen object to use in that place. When ObjectInputStream receives one of these tokens, it hooks up the instance that's already created instead of creating a new one. This process ensures that you always get exactly the same structure you had when the object was serialized.
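You can check this identity-preserving behavior for yourself. In the sketch below (Label and roundTrip() are hypothetical names made up for the example), an array holds two references to one object plus a separate object with equal contents. After a round trip through ObjectOutputStream and ObjectInputStream, the first two slots still point at a single instance, while the third remains distinct.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SharedReferenceCheck {

    // Hypothetical serializable class for the demonstration.
    static class Label implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        Label(String text) { this.text = text; }
    }

    // Serialize the array to memory and read it straight back.
    static Object[] roundTrip(Object[] graph) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(graph);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Object[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Label shared = new Label("shared");
        Object[] copy = roundTrip(
            new Object[] { shared, shared, new Label("shared") });

        System.out.println(copy[0] == copy[1]);  // true: one instance, two references
        System.out.println(copy[0] == copy[2]);  // false: it was a separate object
        System.out.println(copy[0] == shared);   // false: the copy is a new object
    }
}
```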

Creating an Externalizable class

Often in Java documentation, you'll see a requirement that a class "implement either the Serializable or the Externalizable interface." There's seldom a description of the Externalizable interface. (In fact, it's not even very easy to find examples on the Internet of the Externalizable interface being used in Java code.)

The method ObjectOutputStream.defaultWriteObject() serializes the object in a distinct series of steps, defined in the section on ObjectOutputStream in the Serialization Specification (http://java.sun.com/products/jdk/1.1/docs/guide/serialization/spec/output.doc.html). ObjectOutputStream.defaultWriteObject() first writes a description of the object's class to the output stream so that the ObjectInputStream that will recreate the object knows what kind of object to create. Then, defaultWriteObject() introspects the object to find out what its fields are. Next, defaultWriteObject() finds the "highest" (in the inheritance tree) serializable class of the object, and writes all of its fields to the stream. (I'm leaving out a couple of features here for simplicity.) Finally, defaultWriteObject() goes down the inheritance tree, writing all of the fields for each derived subclass of that highest serializable class. This ensures that all fields of the object are written.

So, for example, if the object were an Ocelot, and its superclasses Animal and Mammal were serializable, defaultWriteObject() would write all serializable fields of Animal first, then of Mammal, and finally of Ocelot. (See the section "Serial killers" below for a description of serializable data fields.) defaultWriteObject() writes any data fields that are of native types (String, int, and so on), using the members of interface java.io.DataOutput (which ObjectOutputStream implements), and any data fields that are objects by calling itself recursively on the object.
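One way to watch the superclass-first order in action is to give each class in the hierarchy a private writeObject() method that records its name before calling defaultWriteObject(). That hook is a standard part of Java serialization, though this column hasn't introduced it yet. The class bodies, field names, and the writeOrder() helper below are assumptions invented for this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class WriteOrderSketch {

    // Records the order in which each class's fields are written.
    static final List<String> ORDER = new ArrayList<>();

    static class Animal implements Serializable {
        private static final long serialVersionUID = 1L;
        int legs = 4;
        private void writeObject(ObjectOutputStream out) throws IOException {
            ORDER.add("Animal");
            out.defaultWriteObject();   // writes Animal's own fields
        }
    }

    static class Mammal extends Animal {
        private static final long serialVersionUID = 1L;
        boolean furry = true;
        private void writeObject(ObjectOutputStream out) throws IOException {
            ORDER.add("Mammal");
            out.defaultWriteObject();   // writes Mammal's own fields
        }
    }

    static class Ocelot extends Mammal {
        private static final long serialVersionUID = 1L;
        String name = "ocelot";
        private void writeObject(ObjectOutputStream out) throws IOException {
            ORDER.add("Ocelot");
            out.defaultWriteObject();   // writes Ocelot's own fields
        }
    }

    // Serializes one Ocelot and reports which class was written when.
    static List<String> writeOrder() {
        ORDER.clear();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Ocelot());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return new ArrayList<>(ORDER);
    }

    public static void main(String[] args) {
        System.out.println(writeOrder());   // [Animal, Mammal, Ocelot]
    }
}
```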
