Modifying archives, Part 2: The Archive class

The Archive class allows you to write or modify stored archive files


Looking at the implementations of the concrete products, Archive_InputStream (Listing 2, line 330) adds only a small amount of functionality to the default behavior defined in InputStream. For the most part, the methods just chain to the methods of the wrapped InputStream object. The one exception is the close() method (Listing 2, line 364), which notifies the creating Archive object when the stream is closed by passing it a read_accomplished() message.

The Archive_InputStream demonstrates another design pattern: Decorator. A BufferedInputStream serves as an example of a decorator built into Java -- it wraps an InputStream, implementing the same interface as InputStream but slightly modifying the behaviors of a few of the InputStream methods (to add buffering). Decorators are effective ways to modify behavior by using interface, rather than implementation, inheritance.
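To make the Decorator idea concrete, here is a minimal sketch of a stream that chains everything to a wrapped InputStream and adds one behavior on close(). It is not the code from Listing 2: the class name NotifyingInputStream and the Runnable callback (a stand-in for the read_accomplished() message) are assumptions made for illustration, and it leans on java.io.FilterInputStream to do the delegation.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical decorator: FilterInputStream forwards every call to the
// wrapped stream; only close() gets extra behavior.
class NotifyingInputStream extends FilterInputStream {
    private final Runnable onClose;   // stand-in for read_accomplished()
    private boolean closed = false;

    NotifyingInputStream(InputStream wrapped, Runnable onClose) {
        super(wrapped);
        this.onClose = onClose;
    }

    @Override
    public void close() throws IOException {
        if (closed) return;           // notify the creator only once
        closed = true;
        try {
            super.close();            // chain to the wrapped stream
        } finally {
            onClose.run();            // tell the creator the read is done
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new NotifyingInputStream(
            new ByteArrayInputStream(new byte[] { 'h', 'i' }),
            () -> System.out.println("read accomplished"));
        int first = in.read();        // delegated to the wrapped stream
        in.close();                   // runs the callback
        System.out.println((char) first);
    }
}
```

The caller sees an ordinary InputStream; the notification rides along on close() without any change to the wrapped object.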

The Archive_OutputStream class

The Archive_OutputStream (Listing 2, line 370) class isn't a simple decorator that wraps an object and provides minor modifications to behavior, though. It's a full-blown class that implements all the relevant methods of the OutputStream base class in significant ways. Its close() override (Listing 2, line 406) does notify the creating Archive when it's closed (in the same way that Archive_InputStream does), but the Archive_OutputStream does a lot of additional work. In particular, most of the messy mechanics of talking to a .zip archive are buried in the methods of Archive_OutputStream.

The Archive_OutputStream(...) constructor (Listing 2, line 378) is passed a ZipEntry that represents the file to which it's writing. It puts that ZipEntry into the current temporary-file archive's stream (destination). Then, in the case of a DEFLATED (compressed) file, it just sets up to use the destination stream as the data sink. If the file isn't compressed, then a checksum must be computed, so it uses a DelayedOutputStream wrapped by a FastBufferedOutputStream (both discussed in Part 1 of this series) for the output stream. Characters written to this stream are buffered internally until 2K characters are written to the stream, at which time a temporary file is created and the characters are staged to the temporary file. In this way, you don't incur the overhead of creating an on-disk temporary file unless the file size is large enough to justify doing so.

The write(...) override (Listing 2, line 397) writes the characters to whichever data sink is in use (the actual zip file or the buffer), and it updates the CRC value as it does so.

Most of the real work goes on in the close() override (Listing 2, line 406). In the case of a DEFLATED file, the CRC isn't needed and the characters have already been written to the actual archive file, so you'll skip the big else clause, close the current zip-file entry, and notify the creating Archive object that the write operation is complete (by calling write_accomplished()).

In the case of an uncompressed file, close() updates the ZipEntry object's CRC and size fields; then, if the data hasn't been flushed to a temporary file yet, the buffer is extracted from the FastBufferedOutputStream and written to the real file. Otherwise, the temporary file is opened and its contents are transferred to the actual output file.
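The reason the uncompressed case needs buffering can be seen in a small sketch that uses only java.util.zip. For a STORED entry, ZipOutputStream expects the size and CRC-32 to be set on the ZipEntry before putNextEntry() is called, so the data has to be known first. The sketch below is a simplification, not the article's code: it holds the whole payload in a byte array rather than staging it through a DelayedOutputStream, and the names StoredEntrySketch, writeStored, and buildArchive are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class StoredEntrySketch {
    // Writes one uncompressed (STORED) entry. Size and CRC are computed
    // from the complete data and set before putNextEntry().
    static void writeStored(ZipOutputStream zip, String name, byte[] data)
            throws IOException {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);

        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(data.length);
        entry.setCompressedSize(data.length);  // stored: same as size
        entry.setCrc(crc.getValue());

        zip.putNextEntry(entry);
        zip.write(data, 0, data.length);
        zip.closeEntry();
    }

    // Builds an in-memory archive holding a single stored entry.
    static byte[] buildArchive(String name, byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            writeStored(zip, name, data);
        }
        return bytes.toByteArray();
    }
}
```

A DEFLATED entry has no such requirement, which is why that path can write straight through to the archive.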

The Archive class

That's all the really hard work. Meanwhile, the Archive class does two additional things: it manages the list of ZipEntry objects that represent the directory of the archive that we're modifying, and it handles the locking required to stop two threads from simultaneously accessing the same Archive object.

The zip entries are loaded up at the top of the Archive(String,boolean) constructor (Listing 2, line 51). After the load completes, the constructor creates a temporary file to hold the modified archive and wraps that temporary file in a ZipOutputStream (which in turn wraps a FastBufferedOutputStream). I use FastBufferedOutputStream rather than BufferedOutputStream to avoid the unnecessary synchronization overhead imposed by the fact that BufferedOutputStream's write() method is (inappropriately) declared as synchronized.

The remove(...) (Listing 2, line 102) method is trivial -- it just removes the ZipEntry corresponding to the desired file from the list of source files, thereby preventing that file from being copied to the destination archive when the Archive object is closed.

The output_stream_for method (Listing 2, line 124) is less trivial. Rather than being synchronized, this method acquires a roll-your-own Mutex object (the lock field declared on line 30; see Resources). I've done this rather than simply synchronizing output_stream_for() because I want the Archive object to be locked from the point at which the output stream is created until that output stream has closed. The output_stream_for() method will have returned long before the caller finishes with the stream. A roll-your-own Mutex lets me acquire the lock in the current method and release it from another method entirely [write_accomplished() (Listing 2, line 191), which is called from the returned Archive_OutputStream's close() method (Listing 2, line 406)].
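The acquire-in-one-method, release-in-another idea can be sketched with the standard library. The example below substitutes a one-permit java.util.concurrent.Semaphore for the com.holub.asynch.Mutex class, which is an assumption on my part and not what Listing 2 uses; the class LockedResource and its methods are likewise hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.Semaphore;

class LockedResource {
    // One-permit semaphore standing in for the article's Mutex. Unlike a
    // synchronized block, it can be released from a different method.
    private final Semaphore lock = new Semaphore(1);
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    OutputStream outputStream() throws InterruptedException {
        lock.acquire();               // still held after this method returns
        return new FilterOutputStream(sink) {
            private boolean closed = false;

            @Override
            public void close() throws IOException {
                if (closed) return;   // release the lock only once
                closed = true;
                flush();
                writeAccomplished();
            }
        };
    }

    // Called from the returned stream's close(), not from outputStream().
    private void writeAccomplished() { lock.release(); }

    boolean isLocked() { return lock.availablePermits() == 0; }

    byte[] contents() { return sink.toByteArray(); }
}
```

A second caller of outputStream() blocks in acquire() until the first caller closes its stream, which is the serialization the text describes.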

If you're unclear about how a Mutex works, go back and read "Programming Java Threads in the Real World, Part 4" (or get a copy of Taming Java Threads).

After acquiring the lock Mutex, the method then pulls the ZipEntry for the desired file out of the entries list (or manufactures a ZipEntry from scratch if the file is new). It then creates an Archive_OutputStream object that will handle output for that specific file. If this is an append -- rather than an overwrite -- request, output_stream_for() copies the old file contents to the output stream before returning the stream.

The input_stream_for(...) method (Listing 2, line 195) works in a similar way: it acquires the lock Mutex, which read_accomplished() (Listing 2, line 225) releases when the returned stream is closed.

The final method of interest is close() (Listing 2, line 229), which copies to the destination archive all files that haven't been removed from the entries list due to a previous remove() or output_stream_for() call. It then closes everything and gives the destination file the same name as the original archive. The revert() method (Listing 2, line 301) is just like close(), except that it destroys the destination archive rather than overwriting it.
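The copy-then-rename step can be approximated with the standard library alone. The sketch below is an assumption-laden simplification, not the close() method from Listing 2: the names RewriteSketch and rewriteWithout are hypothetical, every copied entry is recompressed with the default settings, and java.nio.file.Files.move (which postdates the article) does the final overwrite.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class RewriteSketch {
    // Copies every entry of 'archive' except those named in 'removed' into
    // a temporary archive, then moves the temporary file over the original.
    static void rewriteWithout(File archive, Set<String> removed)
            throws IOException {
        File temp = File.createTempFile("archive", ".tmp",
                archive.getAbsoluteFile().getParentFile());
        try (ZipFile source = new ZipFile(archive);
             ZipOutputStream dest =
                 new ZipOutputStream(new FileOutputStream(temp))) {
            byte[] buffer = new byte[4096];
            for (Enumeration<? extends ZipEntry> e = source.entries();
                 e.hasMoreElements();) {
                ZipEntry old = e.nextElement();
                if (removed.contains(old.getName())) continue;  // skip it
                dest.putNextEntry(new ZipEntry(old.getName()));
                try (InputStream in = source.getInputStream(old)) {
                    for (int n; (n = in.read(buffer)) != -1;) {
                        dest.write(buffer, 0, n);
                    }
                }
                dest.closeEntry();
            }
        }
        // Both archives are closed here, so the rename is safe to attempt.
        Files.move(temp.toPath(), archive.toPath(),
                   StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A revert-style operation would simply delete the temporary file instead of moving it over the original.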

The final 250 lines or so of the file are just a giant unit test that guarantees that everything works as expected. You'll find it all in Listing 2 below.

Listing 2: /src/com/holub/tools/Archive.java
   1: package com.holub.tools;
   2: 
   3: import java.io.*;
   4: import java.util.*;
   5: import java.util.zip.*;
   6: import com.holub.asynch.Mutex;
   7: import com.holub.io.FastBufferedOutputStream;
   8: import com.holub.io.DelayedOutputStream;
   9: 
  10: import com.holub.io.Std;        // for testing
  11: import com.holub.tools.Tester;  // for testing
  12: // import com.holub.tools.debug.D;  // for testing
  13: import com.holub.tools.D;           // for testing
  14: 
         
/**********************************

A class that simplifies reading from, writing to, and modifying jar/zip files. Sun's support for JAR files is dicey. It's not difficult to read them, but writing or updating is nigh on impossible. The only way to update a jar, for example, is to copy an existing jar into a new one, writing the changes as you copy, and then renaming the new file so that you overwrite the old one. If you have many updates to perform, then this is a time-consuming process, to say the least. The Archive works by creating a second jar file that holds the modified files; when you close the archive, the close method overwrites the original jar with the copy.

Note that the internal_path passed to the various methods of this class must be a fully formed path name (no "." or ".." components) that uses forward slashes as a path separator.

This class is thread safe, but access to the Archive is serialized: only one thread at a time can access the archive. You modify the Archive by requesting an output or input stream for a specific internal file. The returned InputStream or OutputStream must be closed before anyone else is granted access for read or write. In a multi-threaded scenario, the requesting threads will block until the Archive becomes available.

(c) 2000, Allen I. Holub.
You may not distribute this code except in binary form, 
incorporated into a Java .class file. You may use this code 
freely for personal purposes, but you 
may not incorporate it into any product (commercial, shareware, 
or free) without the express written permission 
of Allen I. Holub.
@author Allen I. Holub
*/
  15: public class Archive
  16: {
  17:   private File                source_file;
  18:   private DelayedOutputStream destination_stream;
  19: 
  20:   private ZipFile         source;
  21:   private ZipOutputStream destination; // Temporary file that holds
  22:                                          // modified archive. Overwrites
  23:                                          // source file on close.
  24: 
  25:   private int compression = ZipEntry.DEFLATED ;
  26: 
  27:   private boolean closed  = false;
  28:   private boolean archive_has_been_modified = false;
  29: 
  30:   private Mutex   lock    = new Mutex();   // Locks Archive while read
  31:                                              // or write is in progress.
  32:   private Map     entries = new HashMap(); // Zip entries in the
  33:                                              // source archive, indexed
  34:                                              // by name.
  35: 
  36:   private static final boolean running_under_windows =
  37:                     System.getProperty("os.name").startsWith("Windows");
  38: 
         
/** Alias for true, useful as self-documenting second argument to 
Archive(String,boolean)
*/
  39:   public static final boolean COMPRESSED   = true;
  40: 
         
/** Alias for false, useful as self-documenting second argument to 
Archive(String,boolean)
*/
  41:   public static final boolean UNCOMPRESSED = false;
  42: 
         
/** Alias for true, useful as self-documenting second argument to 
output_stream_for.
*/
  43:   public static final boolean APPEND = true;
  44: 
         
/** Alias for false, useful as self-documenting second argument to 
output_stream_for.
*/
  45:   public static final boolean OVERWRITE = false;
  46: 
         
/*****************************************************************
Thrown by all methods of this class (except the constructor) if you try to access a closed archive.
*/
  47:   public static class Closed extends RuntimeException
  48:     {   Closed(){ super("Tried to modify a closed Archive"); }
  49:     }
  50: 
         
/********************************
Create a new Archive object that represents the .zip or .jar file at the indicated path.
@param jar_file_path The path in the file system to the .jar or .zip file.
@param compress If true, new files written to the archive (as compared to modifications of existing files) are compressed, otherwise the data in the file is simply stored. The maximum possible compression level is used.
*/
  51:   public Archive( String jar_file_path, boolean compress )
  52:                                                 throws IOException
  53:     {   source_file = new File( jar_file_path );
  54:         try
  55:         {   source = new ZipFile( jar_file_path );
  56: 
  57:             // Transfer all the zip entries into local memory to make
  58:             // them easier to access and manipulate.
  59:             for(Enumeration e = source.entries(); e.hasMoreElements();)
  60:             {   
  61:                 ZipEntry current = (ZipEntry) e.nextElement();
  62:                 entries.put( current.getName(), current );
  63:             }
  64:         }
  65:         catch( Exception e )    // Assume file doesn't exist
  66:         {   source = null;      // Since the "entries" list will be
  67:         }                       // empty, "source" won't be used
  68: 
  69:         // The following constructor causes a temporary file to be created
  70:         // when the first write occurs on the output stream. If no
  71:         // writes happen, the file isn't created.
  72: 
  73:         destination_stream = new DelayedOutputStream(
  74:                                 source_file.getName(), ".tmp");
  75: 
  76:         destination        = new ZipOutputStream(
  77:                                 new FastBufferedOutputStream(
  78:                                                 destination_stream));
  79: 
  80:         destination.setLevel(9);    // compression level == max
  81: 
  82:         this.compression=compress ? ZipEntry.DEFLATED : ZipEntry.STORED;
  83:         destination.setMethod( compress ? ZipOutputStream.DEFLATED
  84:                                         : ZipOutputStream.STORED   );
  85:                                              
  86:     }
  87: 
         
/** Convenience method; creates a compressed archive
*/