Serialization is the process of saving an object's state to a sequence of bytes; deserialization is the process of rebuilding those bytes into a live object. The Java Serialization API provides a standard mechanism for developers to handle object serialization. In this tip, you will see how to serialize an object, and why serialization is sometimes necessary. You'll learn about the serialization algorithm used in Java, and see an example that illustrates the serialized format of an object. By the time you're done, you should have a solid knowledge of how the serialization algorithm works and what entities are serialized as part of the object at a low level.
In today's world, a typical enterprise application will have multiple components and will be distributed across various systems and networks. In Java, everything is represented as objects; if two Java components want to communicate with each other, there needs to be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer the object between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.
Figure 1 shows a high-level view of client/server communication, where an object is transferred from the client to the server through serialization.
In order to serialize an object, you need to ensure that the class of the object implements the java.io.Serializable interface, as shown in Listing 1.
import java.io.Serializable;
class TestSerial implements Serializable {
    public byte version = 100;
    public byte count = 0;
}
In Listing 1, the only thing you had to do differently from creating a normal class is implement the java.io.Serializable interface. The Serializable interface is a marker interface; it declares no methods at all. It tells the serialization mechanism that the class can be serialized.
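To see what the marker interface buys you, the following sketch tries to serialize two otherwise identical classes, one that implements Serializable and one that does not. The names MarkerDemo, Marked, Unmarked, and canSerialize are invented for this illustration, and the stream writes to memory rather than to a file so that the example stays self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MarkerDemo {
    // Hypothetical class that opts in to serialization.
    static class Marked implements Serializable {
        byte version = 100;
    }

    // Hypothetical class with the same state but without the marker interface.
    static class Unmarked {
        byte version = 100;
    }

    // Returns true if writeObject() accepts the object, false if it rejects it.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when the object's class does not implement Serializable.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Marked:   " + canSerialize(new Marked()));
        System.out.println("Unmarked: " + canSerialize(new Unmarked()));
    }
}
```

The only difference between the two classes is the implements clause, yet writeObject() throws java.io.NotSerializableException for the second one.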
Now that you have made the class eligible for serialization, the next step is to actually serialize the object. That is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.
public static void main(String args[]) throws IOException {
    FileOutputStream fos = new FileOutputStream("temp.out");
    ObjectOutputStream oos = new ObjectOutputStream(fos);
    TestSerial ts = new TestSerial();
    // Writes the class description and field values of ts to temp.out
    oos.writeObject(ts);
    oos.flush();
    oos.close();
}
Listing 2 stores the state of the TestSerial object in a file called temp.out. oos.writeObject(ts); actually kicks off the serialization algorithm, which in turn writes the object to temp.out.
To re-create the object from the persistent file, you would employ the code in Listing 3.
public static void main(String args[]) throws IOException, ClassNotFoundException {
    FileInputStream fis = new FileInputStream("temp.out");
    ObjectInputStream oin = new ObjectInputStream(fis);
    // readObject() returns Object and can throw ClassNotFoundException
    TestSerial ts = (TestSerial) oin.readObject();
    oin.close();
    System.out.println("version="+ts.version);
}
In Listing 3, the object's restoration occurs with the oin.readObject() method call. This method call reads in the raw bytes that we previously persisted and creates a live object that is an exact replica of the original object graph. Because readObject() can read any serializable object, a cast to the correct type is required.
Executing this code will print version=100 on the standard output.
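The write and read steps can also be tried together without touching the file system. The sketch below is one possible way to do that, assuming a nested copy of TestSerial and two hypothetical helpers, toBytes and fromBytes, that are not part of the listings above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Mirrors TestSerial from Listing 1; nested so the sketch compiles on its own.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Serializes the object into a byte array instead of temp.out.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Rebuilds an object from the bytes; the caller casts to the expected type.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream oin = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return oin.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        TestSerial original = new TestSerial();
        original.count = 7; // change a field so the copy differs from a fresh instance
        TestSerial copy = (TestSerial) fromBytes(toBytes(original));
        System.out.println("version=" + copy.version + " count=" + copy.count);
        System.out.println("same instance? " + (copy == original));
    }
}
```

The copy carries the same field values as the original (version=100 and count=7 here), but it is a distinct object, so the identity check reports false.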
What does the serialized version of the object look like? Remember, the sample code in the previous section saved the serialized version of the TestSerial object into the file temp.out. Listing 4 shows the contents of temp.out, displayed in hexadecimal. (You need a hexadecimal editor to see the output in hexadecimal format.)
AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
70 00 64
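If no hexadecimal editor is at hand, a few lines of Java can produce a similar dump. This is a minimal sketch: HexDumpDemo, serialize, and toHex are names made up for the example, and the object is serialized to memory rather than to temp.out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class HexDumpDemo {
    // Mirrors TestSerial from Listing 1; nested so the sketch compiles on its own.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Serializes the object into memory and returns the raw stream bytes.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Formats bytes as two-digit uppercase hex values, 16 per line.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(i % 16 == 0 ? '\n' : ' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new TestSerial());
        System.out.println(toHex(bytes));
        System.out.println(bytes.length + " bytes");
    }
}
```

Because the class is nested here, the class name stored in the stream is HexDumpDemo$TestSerial, so the bytes after the first six (AC ED 00 05 73 72) will not match Listing 4 exactly.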
If you look again at the actual TestSerial object, you'll see that it has only two byte members, as shown in Listing 5.
public byte version = 100;
public byte count = 0;
The size of a byte variable is one byte, and hence the total size of the object (without the header) is two bytes. But if you look at the size of the serialized object in Listing 4, you'll see 51 bytes. Surprise! Where did the extra bytes come from, and what is their significance? They are introduced by the serialization algorithm, and are required in order to re-create the object. In the next section, you'll explore this algorithm in detail.
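One way to convince yourself that most of the extra bytes describe the class rather than the object is to write two instances to the same stream and compare sizes. The sketch below does that, assuming a nested copy of TestSerial and a hypothetical helper named streamSize. The second instance is written with a back-reference to the class description already in the stream, so it should add far fewer bytes than the first.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class OverheadDemo {
    // Mirrors TestSerial from Listing 1; nested so the sketch compiles on its own.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Returns the total stream size after writing the given number of new instances.
    static int streamSize(int instances) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            for (int i = 0; i < instances; i++) {
                oos.writeObject(new TestSerial());
            }
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException {
        int one = streamSize(1);
        int two = streamSize(2);
        System.out.println("one instance:  " + one + " bytes");
        System.out.println("two instances: " + two + " bytes");
        System.out.println("cost of the second instance: " + (two - one) + " bytes");
    }
}
```

The exact numbers depend on the class name, so none are quoted here; the point is only the gap between the first and the second instance.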
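By now, you should have a pretty good knowledge of how to serialize an object. But how does the process work under the hood? In general, the serialization algorithm does the following:
It writes out the description of the class associated with the instance.
It recursively writes out the description of each superclass until it reaches java.lang.Object.
Once the class descriptions are written, it writes the actual data associated with the instance, starting with the topmost superclass and working down to the most-derived class.
I've written a different example object for this section that will cover all possible cases. The new sample object to be serialized is shown in Listing 6.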
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Superclass: its description and data are written as part of SerialTest
class parent implements Serializable {
    int parentVersion = 10;
}

// Class of the object that SerialTest refers to
class contain implements Serializable {
    int containVersion = 11;
}

public class SerialTest extends parent implements Serializable {
    int version = 66;
    contain con = new contain();

    public int getVersion() {
        return version;
    }

    public static void main(String args[]) throws IOException {
        FileOutputStream fos = new FileOutputStream("temp.out");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        SerialTest st = new SerialTest();
        oos.writeObject(st);
        oos.flush();
        oos.close();
    }
}
This example is a straightforward one. It serializes an object of type SerialTest, which is derived from parent and holds a reference to an object of type contain. The serialized format of this object is shown in Listing 7.
AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
70 00 00 00 0B
Figure 2 offers a high-level look at the serialization algorithm for this scenario.
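The ordering itself can be modeled in a few lines. The sketch below is not the JDK's implementation; it only mimics the order in which class descriptions and instance data are visited. It assumes two hypothetical stand-in classes, Parent and Child, and uses ObjectStreamClass.lookup(), which returns null for a class that is not serializable, to decide where the walk stops.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class OrderDemo {
    // Hypothetical stand-ins for the parent and SerialTest classes of Listing 6.
    static class Parent implements Serializable {
        int parentVersion = 10;
    }

    static class Child extends Parent {
        int version = 66;
    }

    // Class descriptions: most-derived class first, then each serializable superclass.
    static List<String> descriptionOrder(Class<?> type) {
        List<String> order = new ArrayList<>();
        for (Class<?> c = type; c != null && ObjectStreamClass.lookup(c) != null; c = c.getSuperclass()) {
            order.add(c.getSimpleName());
        }
        return order;
    }

    // Instance data: the reverse order, topmost serializable superclass first.
    static List<String> dataOrder(Class<?> type) {
        List<String> order = new ArrayList<>(descriptionOrder(type));
        Collections.reverse(order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println("descriptions: " + descriptionOrder(Child.class));
        System.out.println("data:         " + dataOrder(Child.class));
        // The serializable fields of one class, with their type codes.
        for (ObjectStreamField f : ObjectStreamClass.lookup(Child.class).getFields()) {
            System.out.println(f.getTypeCode() + " " + f.getName());
        }
    }
}
```

For Child, the description order is Child then Parent, while the data order is Parent then Child, which matches the byte-by-byte walk that follows.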
Let's go through the serialized format of the object in detail and see what each byte represents. Begin with the serialization protocol information:
AC ED: STREAM_MAGIC. Specifies that this is a serialization protocol.
00 05: STREAM_VERSION. The serialization version.
0x73: TC_OBJECT. Specifies that this is a new Object.
The first step of the serialization algorithm is to write the description of the class associated with an instance. The example serializes an object of type SerialTest, so the algorithm starts by writing the description of the SerialTest class.
0x72: TC_CLASSDESC. Specifies that this is a new class.
00 0A: Length of the class name.
53 65 72 69 61 6C 54 65 73 74: SerialTest, the name of the class.
05 52 81 5A AC 66 02 F6: SerialVersionUID, the serial version identifier of this class.
0x02: Various flags. This particular flag says that the object supports serialization.
00 02: Number of fields in this class.
Next, the algorithm writes the field int version = 66;.
0x49: Field type code. 49 represents "I", which stands for int.
00 07: Length of the field name.
76 65 72 73 69 6F 6E: version, the name of the field.
And then the algorithm writes the next field, contain con = new contain();. This is an object, so after the field's type code and name it will write the canonical JVM signature of this field.
0x4C: Field type code. 4C represents "L", which stands for an object reference.
00 03: Length of the field name.
63 6F 6E: con, the name of the field.
0x74: TC_STRING. Represents a new string.
00 09: Length of the string.
4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the canonical JVM signature.
0x78: TC_ENDBLOCKDATA, the end of the optional block data for an object.
The next step of the algorithm is to write the description of the parent class, which is the immediate superclass of SerialTest.
0x72: TC_CLASSDESC. Specifies that this is a new class.
00 06: Length of the class name.
70 61 72 65 6E 74: parent, the name of the class.
0E DB D2 BD 85 EE 63 7A: SerialVersionUID, the serial version identifier of this class.
0x02: Various flags. This flag notes that the object supports serialization.
00 01: Number of fields in this class.
Now the algorithm will write the field description for the parent class. parent has one field, int parentVersion = 10;.
0x49: Field type code. 49 represents "I", which stands for int.
00 0D: Length of the field name.
70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the name of the field.
0x78: TC_ENDBLOCKDATA, the end of block data for this object.
0x70: TC_NULL, which represents the fact that there are no more superclasses because we have reached the top of the class hierarchy.
So far, the serialization algorithm has written the description of the class associated with the instance and all its superclasses. Next, it will write the actual data associated with the instance. It writes the parent class members first:
00 00 00 0A: 10, the value of parentVersion.
Then it moves on to SerialTest.
00 00 00 42: 66, the value of version.
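The same walk can be performed in code for a simpler case. The sketch below reads a serialized stream with a DataInputStream and checks each marker against the constants in java.io.ObjectStreamConstants. It is deliberately limited: it assumes a hypothetical class, Sample, with a single int field and no serializable superclass, and it gives up on object or array fields. StreamWalkDemo, expect, and walk are names invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

class StreamWalkDemo {
    // Hypothetical class with one primitive field and no serializable superclass.
    static class Sample implements Serializable {
        int version = 66;
    }

    // Serializes the object into memory and returns the raw stream bytes.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Consumes one type-code byte and fails if it is not the expected one.
    static void expect(DataInputStream in, byte expected) throws IOException {
        byte actual = in.readByte();
        if (actual != expected) {
            throw new IOException("unexpected type code 0x" + Integer.toHexString(actual & 0xFF));
        }
    }

    // Walks the stream in the order described above and returns a one-line summary.
    static String walk(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        StringBuilder out = new StringBuilder();

        // Protocol information: AC ED, then the stream version.
        if (in.readShort() != ObjectStreamConstants.STREAM_MAGIC) {
            throw new IOException("not a serialization stream");
        }
        out.append("streamVersion=").append(in.readShort());

        // New object, then the description of its class.
        expect(in, ObjectStreamConstants.TC_OBJECT);
        expect(in, ObjectStreamConstants.TC_CLASSDESC);
        out.append(" class=").append(in.readUTF()); // 2-byte length, then the name
        in.readLong();                              // serialVersionUID, skipped in this summary
        out.append(" flags=").append(in.readByte());
        int fieldCount = in.readShort();

        // Field descriptions: type code, then 2-byte length and field name.
        for (int i = 0; i < fieldCount; i++) {
            char typeCode = (char) in.readByte();
            if (typeCode == 'L' || typeCode == '[') {
                throw new IOException("object and array fields are outside this sketch");
            }
            out.append(" field=").append(typeCode).append(':').append(in.readUTF());
        }
        expect(in, ObjectStreamConstants.TC_ENDBLOCKDATA);
        expect(in, ObjectStreamConstants.TC_NULL); // no serializable superclass

        // Instance data: the value of the single int field.
        out.append(" value=").append(in.readInt());
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(walk(serialize(new Sample())));
    }
}
```

Handling a field such as con, or a superclass such as parent, would mean following TC_STRING and a nested TC_CLASSDESC just as the walkthrough does by hand; that is left out to keep the sketch short.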
The next few bytes are interesting. The algorithm needs to write the information about the contain object, shown in Listing 8.
contain con = new contain();
Remember, the serialization algorithm hasn't written the class description for the contain class yet. This is the opportunity to write this description.
0x73: TC_OBJECT, designating a new object.
0x72: TC_CLASSDESC.
00 07: Length of the class name.
63 6F 6E 74 61 69 6E: contain, the name of the class.
FC BB E6 0E FB CB 60 C7: SerialVersionUID, the serial version identifier of this class.
0x02: Various flags. This flag indicates that this class supports serialization.
00 01: Number of fields in this class.
Next, the algorithm must write the description for contain's only field, int containVersion = 11;.
0x49: Field type code. 49 represents "I", which stands for int.
00 0E: Length of the field name.
63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the name of the field.
0x78: TC_ENDBLOCKDATA.
Next, the serialization algorithm checks to see if contain has any parent classes. If it did, the algorithm would start writing that class; but in this case there is no superclass for contain, so the algorithm writes TC_NULL.
0x70: TC_NULL.
Finally, the algorithm writes the actual data associated with contain.
00 00 00 0B: 11, the value of containVersion.
In this tip, you have seen how to serialize an object, and learned how the serialization algorithm works in detail. I hope this article gives you more detail on what happens when you actually serialize an object.
Sathiskumar Palaniappan has more than four years of experience in the IT industry, and has been working with Java-related technologies for more than three years. Currently, he is working as a system software engineer at the Java Technology Center, IBM Labs. He also has experience in the telecom industry.
Excellent article!!!
You should strongly consider Information Cards (end-to-end, or in conjunction with something like LiveID or OpenID) if your scenario isn't one one where you feel comfortable with the security implications of "password reminder e-mails".
Excellent article
Hi javatips (whoever you are),
That was a fine rollicking article. I enjoyed and learned from every character of it.
Could you please share your references for this article?
A Good read. Thank you.
Ravi
Once more
Hi again,
Tx again for the good article. I jumped ahead and requested for the references, though they were right under the nose. Now that weary eyes have seen them, I have got them.
Thank you.
Ravi
Good Article...
Really Great Article
Really a Great Article
Thank You for sharing
Very good basic information for those of us trying to get a better handle on Java and serialization. It can seem so confusing, but you've made it possible to get a base to work from.
This is nice presentation on XML serialization.
I wonder how we can serialize the Composite Objects. For ex: If I have Department instance associated with Employee instance then Frameworks like JAXB or CASTOR are able to do right marshalling. But I am not seeing the same with XML serialization. Could you share some of your thoughts on this?
I agree, great stuff! Keep on going...
impressive
Good Article
Great insights about the Serialization.
Excellent Article!!!!
Very Very Useful article
Revealed?
This algorithm has been documented in Core Java in seven editions for almost 15 years, so "revealed" might be a rather strong term.
Like gravity existed before Newton discovered!
Well just because it's been documented doesn't mean that everyone has seen it or knew where to find it. It was nice of the poster to put this together for those who are interested and hadn't seen it yet.
Thank You !!
Thank you all for the impressive comments and appreciation.
I would like to share my another article on zero-copy..
http://www.ibm.com/developerworks/linux/library/j-zerocopy/
I hope this will also be very useful........
Thanks!!
Sathish
Really Superb!!!
I throng for this kind article... Its really very informative...
Excellent article and very well presented
Nothing new or insightfull
Nothing new gained from this article. Like Cay said, this has been documented for several years!
Agreed.
On the other hand, I think the point of this article is more to give a simpler version of the spec and encourage people to think deeper about how serialization works.
I think you hit this spot-on. Instead of blindly going through the motion and just accepting that things work is one thing. But if you dig a little deeper then you end up having a true appreciation for these things.
Maybe so, but the author here at least took the time to go into such detail and provide the images to boot. I think he deserves at least a small thanks. Surely this has been helpful to quite a few people who may not have seen it previously.
Nice article
Very simple and properly put . i would like to share this link in my blog
Private data members
Thanks for the article - there's just a question that comes to mind however. I'm assuming that the serialisation code is written in Java itself, but as such, how does it access private/protected members of classes - the reflection API (which I assume it uses) will blow up on trying to access private field values?
Good Article...!!
Thank you for giving this type of good article...
Jignesh Dhua
Excellent article
Dear Author,
Thank you for explaining the nuts and bolts of serialization. Its an excellent article
Write on deck ! very well explained
Really good article..very useful undoubtedly
Good article
Hi,
Really, it's a good article; I learned more about how the algorithm works.
But it talks only about the serialization of objects, and I couldn't see any points, with an example, on what the problem is if the object is not serialized.
I mean, why do we have to serialize, and what is the problem if I transfer or write the object without serializing it, and what would the output be? An example with and without implementing the Serializable interface, and with or without the transient declaration, would definitely add clarity to your article. Hope the point will be taken.