This month, I'd like to show you how to use four classes in the Java class library to package and compress your application's data. Please read on -- you'll be surprised at the difference in performance these four classes can make.
It may seem like ancient history now, but I remember a time when disk space was an expensive commodity -- 20 megabytes (MB) of storage cost upwards of 00! To stretch hard disk space as far as it would go, we compressed data with a number of different compression and archiving tools: DoubleSpace and PKZip spring to mind.
Even though times have changed, and disk space is now relatively inexpensive, the need to compress data hasn't disappeared. However, instead of expensive hard disk space, users are faced with expensive network bandwidth. For network-savvy languages like Java, this creates a real problem.
Remember patiently waiting for those first applets to load? It seemed to take forever to pull all the class files and associated data files across the network. Java's designers noticed this as well, and in version 1.1 they added the java.util.zip package to the Java class library to improve this situation. It provided a standard way for Java applications and applets to compress data.
The savings are impressive. The zip file algorithm reduces the size of class files by 25 to 40 percent and text files by 70 to 85 percent.
The java.util.zip package includes a number of classes that either compress or support the compression of data. I'll present four. The ZipEntry class represents an entry in a zip file. The ZipOutputStream class and the ZipInputStream class allow applications to write and read zip file data in stream format. The ZipFile class allows applications to read zip file data as a file.
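To make the division of labor among these classes concrete, here is a minimal sketch of a round trip through ZipOutputStream and ZipInputStream, done entirely in memory. The class name ZipRoundTrip, the helper methods, and the entry name are mine, chosen for illustration, and the code assumes a modern JDK (it uses try-with-resources, which the Java 1.1 release discussed here did not have).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipRoundTrip {

    // Writes a single entry (hypothetical helper) and returns the zip bytes.
    static byte[] zip(String name, byte[] data) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            out.putNextEntry(new ZipEntry(name)); // start a new entry
            out.write(data);                      // entry data, deflated by default
            out.closeEntry();                     // finish this entry
        }                                         // closing the stream writes the directory
        return buffer.toByteArray();
    }

    // Reads back the data of the first entry in the archive (hypothetical helper).
    static byte[] unzipFirst(byte[] archive) throws IOException {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry = in.getNextEntry();   // position the stream at the first entry
            if (entry == null) {
                throw new IOException("archive has no entries");
            }
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {  // read() returns -1 at the end of the entry
                result.write(chunk, 0, n);
            }
            return result.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "compress me, compress me, compress me"
                .getBytes(StandardCharsets.UTF_8);
        byte[] archive = zip("message.txt", original);
        byte[] restored = unzipFirst(archive);
        System.out.println("archive bytes: " + archive.length);
        System.out.println("restored text: " + new String(restored, StandardCharsets.UTF_8));
    }
}
```

The same pattern should work with file streams in place of the byte array streams; the in-memory version simply keeps the sketch self-contained.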
Let's begin with a handful of concepts.
It's important that you understand a little about the zip file format itself. Figure 1 illustrates the layout of a zip file. A zip file consists of zero or more zip entries -- one for each file stored in the zip file. Each entry contains initial header information followed by the compressed data making up the file. At the end of the zip file, after the last zip entry, is the directory. The directory contains information about each entry in the zip file: its name, its size, and the method used to compress it.

Figure 1. The zip file format
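One way to see the directory information for yourself is to open an archive with ZipFile and enumerate its entries. The sketch below is only an illustration: the class name ZipDirectoryListing, the helper methods, and the sample entry notes.txt are made up for this example, and the archive is written to a temporary file because ZipFile works on files rather than streams.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipDirectoryListing {

    // Creates a small one-entry archive in a temporary file (hypothetical helper).
    static Path writeSample() throws IOException {
        Path path = Files.createTempFile("sample", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(path))) {
            out.putNextEntry(new ZipEntry("notes.txt"));
            out.write("zip zip zip".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return path;
    }

    // Returns one line per entry: name, uncompressed size, and compression method.
    static List<String> describeEntries(Path zipPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String method = entry.getMethod() == ZipEntry.STORED ? "stored" : "deflated";
                lines.add(entry.getName() + " size=" + entry.getSize() + " method=" + method);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path sample = writeSample();
        try {
            for (String line : describeEntries(sample)) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(sample); // clean up the temporary archive
        }
    }
}
```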
Entries can be added to zip files in two different ways: they can be stored or they can be deflated. Stored entries are not compressed -- they are added to the zip file as-is. Deflated entries are compressed.
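The two methods can be compared side by side with a short sketch. Everything named here (the class StoredVersusDeflated, the helper zipWithMethod, the entry name payload.bin) is hypothetical. One detail worth knowing: when you write a stored entry through ZipOutputStream, you have to supply the entry's size and CRC-32 checksum yourself before calling putNextEntry, whereas for a deflated entry the stream fills those values in for you.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class StoredVersusDeflated {

    // Builds a one-entry archive using ZipEntry.STORED or ZipEntry.DEFLATED.
    static byte[] zipWithMethod(byte[] data, int method) throws IOException {
        ZipEntry entry = new ZipEntry("payload.bin");
        entry.setMethod(method);
        if (method == ZipEntry.STORED) {
            // Stored entries need their size and checksum set up front.
            CRC32 crc = new CRC32();
            crc.update(data);
            entry.setSize(data.length);
            entry.setCompressedSize(data.length);
            entry.setCrc(crc.getValue());
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            out.putNextEntry(entry);
            out.write(data);
            out.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        Arrays.fill(data, (byte) 'a'); // highly repetitive, so it deflates well

        byte[] stored = zipWithMethod(data, ZipEntry.STORED);
        byte[] deflated = zipWithMethod(data, ZipEntry.DEFLATED);

        System.out.println("original bytes: " + data.length);
        System.out.println("stored archive: " + stored.length);
        System.out.println("deflated archive: " + deflated.length);
    }
}
```

For repetitive input like this, the stored archive comes out slightly larger than the original data (the data plus the zip headers and directory), while the deflated archive is much smaller.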
A question springs to mind here: why choose store over deflate? It turns out that there are two reasons, one obvious and the other not so obvious (in fact, you might even find it counterintuitive).
First, it's faster to store and retrieve an entry if it's not deflated.