Modify archives, Part 1

Supplement Java's util.zip package to make it easy to write or modify existing archives

I started out this month intending to write an article about caching class loaders. I wanted to create a class loader that a client-side application could use to automatically update itself from a server-side version every time the application ran. The idea was for the class loader to maintain a jar archive of the class files that made up the application, and to update that archive as needed during the class-loading process. I still intend to write the class loader article at some point in the future. But when I started writing the code, I quickly bogged down in implementing the archive-related piece. It turned out that the java.util.zip APIs weren't nearly as flexible or complete as I had thought, and I had to put a significant effort into a set of archive-maintenance classes before I could proceed with my class loader. Consequently, this article and Part 2 will present those classes; I'll return to the class loader afterwards.


Reading a jar from a URL

Java features a rich set of jar APIs. For example, you can read archives using a URL that takes the form jar:<url_of_jar>!/<path_within_jar>, where the !/ sequence separates the archive's own URL from the path of the file inside it. The following URL gets the source code for one of the classes in my threads package from an archive on my Web site:

jar:http://www.holub.com/taming.java.threads.zip!/src/com/holub/asynch/Mutex.java


The code in Listing 1 demonstrates how to access a jar file with a URL.

Listing 1. JarURL.java


import java.net.*;
import java.io.*;

public class JarURL
{
    public static void main(String[] args)
    {   new JarURL();
    }

    public JarURL()
    {
        try
        {
            String url_of_file  = "file:/tmp/foo.jar";
            String path_to_file = "tmp/foo.txt";

            // Read from the jar using a URL. The "!/" separates the
            // archive's URL from the path of the file within the archive.

            URL cache_url = new URL("jar:" + url_of_file + "!/" );

            URL file_url = new URL( cache_url, path_to_file );
            JarURLConnection connection =
                    (JarURLConnection)( file_url.openConnection() );

            connection.setDoInput(true);
            connection.setDoOutput(false);
            connection.connect();

            InputStream in = connection.getInputStream();
            BufferedReader reader =
                    new BufferedReader( new InputStreamReader(in) );
            System.out.println("Read " + reader.readLine() );
            reader.close();

            // Unfortunately, you can't write to the jar via a URL.
            // The following code does not work:
            //
            //  OutputStream out = connection.getOutputStream();
            //  PrintWriter writer = new PrintWriter( out );
            //  writer.println("Goodbye world");
            //  System.out.println("Wrote");
        }
        catch( MalformedURLException e ){   System.out.println(e);  }
        catch( IOException e )          {   System.out.println(e);  }
    }

    // Translates a fully qualified class name into a path within the jar.
    private String path_for(String class_name)
    {
        String path_name = class_name.replace('.', '/');
        return "/" + path_name ;
    }
}


Use the ZipFile and ZipEntry

Unfortunately, URL access to a jar file doesn't permit write operations, so I turned to the various classes in java.util.zip. The ZipFile class seemed particularly promising: you ask it for a list of ZipEntry objects that represent the files in the archive, and then ask each ZipEntry for information about the file it represents. The following code demonstrates the process by printing the names of all the files in my_file.zip:

ZipFile zip_file = new ZipFile( "my_file.zip" );
for( Enumeration<? extends ZipEntry> e = zip_file.entries(); e.hasMoreElements(); )
{
    ZipEntry entry = e.nextElement();
    System.out.println( entry.getName() );
}


You can request an InputStream to read the file associated with a given entry from the ZipFile:

InputStream in = zip_file.getInputStream(entry);


and then read from it in the normal way. So far so good; but, to my horror, I found that there is no getOutputStream() method available. It was back to the drawing board!
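
As a minimal sketch of the read side described above, the following class looks up a single entry by name and reads its first line. The class name ZipEntryReader, the method firstLine, and the assumption that the entry is UTF-8 text are all illustrative choices, not part of the article's code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipEntryReader
{
    /** Returns the first line of the named entry, or null if the entry is missing or empty. */
    public static String firstLine( String archivePath, String entryName ) throws IOException
    {
        try( ZipFile zipFile = new ZipFile( archivePath ) )
        {
            ZipEntry entry = zipFile.getEntry( entryName );   // null if there is no such entry
            if( entry == null )
                return null;

            // Assumption: the entry holds UTF-8 text.
            try( InputStream in = zipFile.getInputStream( entry );
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader( in, StandardCharsets.UTF_8 ) ) )
            {
                return reader.readLine();
            }
        }
    }

    public static void main( String[] args ) throws IOException
    {
        // Hypothetical file names, in the spirit of the examples above.
        System.out.println( firstLine( "my_file.zip", "tmp/foo.txt" ) );
    }
}
```

Note that ZipFile hands out input streams only; nothing in this sketch could be turned around to write back into the archive.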

Modify a jar file: The problem

More digging unearthed ZipOutputStream, but this class is far from easy to use. There are examples in Chan, Lee, and Kramer's Java Class Libraries book (see Resources), but they prove hideously complicated.

It turns out that the only way to modify an archive is to build a new archive from scratch, copying the old archive's contents into the new one and making any changes along the way. Java's archive classes give you no simple way to modify, replace, or add a file in an existing archive.

With that in mind, the not-so-basic drill is as follows:

  1. Get the ZipEntry objects for the existing archive.
  2. Create a temporary file to hold the new archive as it's being built.
  3. Wrap that temporary file with a FileOutputStream, and wrap that stream in a ZipOutputStream.
  4. To remove a file:
    • Remove its entry from the list of ZipEntry objects made in step 1.
  5. To replace a file in the archive:

    1. Remove the old ZipEntry from the list of entries.
    2. Make a new ZipEntry by copying relevant fields from the old one.
    3. Put the ZipEntry into the ZipOutputStream.
    4. Open an InputStream on the source of the file's contents (the new data for a replacement, or the file in the original archive for a straight copy).
    5. Copy the new contents of the file from the InputStream to the ZipOutputStream.
    6. Tell the ZipOutputStream that you're done with the entry.
    7. Close the InputStream.
  6. To add a file to the archive:

    • Follow the steps above for replacing a file, but just write the new bytes rather than transferring them.
  7. Once all modifications have been made, transfer the contents of the files represented by the ZipEntry objects that remain in the list created in Step 1 (that is, the files you haven't deleted or replaced). Use the process described earlier.
  8. Close the new and old archives, then rename the new one so that it has the same name as the old one.
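
The drill above can be sketched roughly as follows. This is one possible arrangement under stated assumptions, not the classes this series goes on to present: the class name ArchiveRewriter, the rewrite method, and its Set and Map parameters are hypothetical, every entry is written compressed (DEFLATED) so that no CRC needs to be computed up front, and only the name and modification time of copied entries are preserved.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ArchiveRewriter
{
    /**
     * Rebuilds the archive. Entries named in 'removed' are dropped, entries
     * named in 'replaced' are written from the supplied bytes (which also
     * covers additions), and everything else is copied from the old archive.
     */
    public static void rewrite( File archive, Set<String> removed,
                                Map<String, byte[]> replaced ) throws IOException
    {
        // Steps 2 and 3: temporary file, wrapped for zip output.
        File temp = File.createTempFile( "rewrite", ".tmp",
                                         archive.getAbsoluteFile().getParentFile() );

        try( ZipFile old = new ZipFile( archive );
             ZipOutputStream out = new ZipOutputStream( new FileOutputStream( temp ) ) )
        {
            // Steps 5 and 6: write replaced and newly added files.
            for( Map.Entry<String, byte[]> e : replaced.entrySet() )
            {
                out.putNextEntry( new ZipEntry( e.getKey() ) );
                out.write( e.getValue() );
                out.closeEntry();
            }

            // Steps 1, 4, and 7: copy every old entry that was neither removed nor replaced.
            byte[] buffer = new byte[8192];
            for( Enumeration<? extends ZipEntry> entries = old.entries(); entries.hasMoreElements(); )
            {
                ZipEntry source = entries.nextElement();
                String name = source.getName();
                if( removed.contains( name ) || replaced.containsKey( name ) )
                    continue;

                // A fresh entry rather than a clone, so that stale size fields
                // from the old archive can't conflict with the recompressed data.
                ZipEntry copy = new ZipEntry( name );
                copy.setTime( source.getTime() );
                out.putNextEntry( copy );
                try( InputStream in = old.getInputStream( source ) )
                {
                    for( int n; (n = in.read( buffer )) != -1; )
                        out.write( buffer, 0, n );
                }
                out.closeEntry();
            }
        }

        // Step 8: both archives are closed by now; give the new one the old name.
        Files.move( temp.toPath(), archive.toPath(), StandardCopyOption.REPLACE_EXISTING );
    }
}
```

A call such as ArchiveRewriter.rewrite(file, Collections.singleton("a.txt"), replacements) would drop a.txt and write whatever the replacements map holds. The sketch leaves the temporary file behind if an exception interrupts the copy; production code would want to clean that up.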


Ugh! (That's a technical term we Java programmers use.) To make matters worse, the requirements for writing a compressed (DEFLATED) file differ from those for writing an uncompressed (STORED) file. The ZipEntry for uncompressed files must be initialized with a CRC value and file size before it can be written to the ZipOutputStream. Since the ZipEntry must be written before the file contents, this means that you have to process the new data twice -- once to figure out the CRC and once to copy the bytes to the ZipOutputStream. Fortunately, the process isn't so brain-dead for a compressed file; you can give the ZipOutputStream a ZipEntry with uninitialized size and CRC fields, and the ZipOutputStream will modify the fields for you as it does the compression.
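
To make the difference concrete, here is a small sketch of both cases, assuming the data comes from a file on disk. The class name StoredEntryWriter and its methods are hypothetical. The STORED version reads the source file twice, once to compute the CRC and size and once to copy the bytes, while the DEFLATED version needs only one pass.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class StoredEntryWriter
{
    /** Writes 'source' as an uncompressed (STORED) entry; reads the file twice. */
    public static void writeStored( ZipOutputStream out, String name, File source )
                                                                throws IOException
    {
        byte[] buffer = new byte[8192];

        // Pass 1: compute the CRC and the size, which must be known up front.
        CRC32 crc = new CRC32();
        long size = 0;
        try( InputStream in = new FileInputStream( source ) )
        {
            for( int n; (n = in.read( buffer )) != -1; )
            {
                crc.update( buffer, 0, n );
                size += n;
            }
        }

        ZipEntry entry = new ZipEntry( name );
        entry.setMethod( ZipEntry.STORED );
        entry.setSize( size );
        entry.setCompressedSize( size );    // no compression, so the sizes match
        entry.setCrc( crc.getValue() );
        out.putNextEntry( entry );

        // Pass 2: copy the bytes.
        copy( source, out, buffer );
        out.closeEntry();
    }

    /** Writes 'source' as a compressed (DEFLATED) entry; one pass is enough. */
    public static void writeDeflated( ZipOutputStream out, String name, File source )
                                                                throws IOException
    {
        // DEFLATED is ZipOutputStream's default method; the stream fills in
        // the size and CRC fields itself as it compresses.
        out.putNextEntry( new ZipEntry( name ) );
        copy( source, out, new byte[8192] );
        out.closeEntry();
    }

    private static void copy( File source, ZipOutputStream out, byte[] buffer )
                                                                throws IOException
    {
        try( InputStream in = new FileInputStream( source ) )
        {
            for( int n; (n = in.read( buffer )) != -1; )
                out.write( buffer, 0, n );
        }
    }
}
```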

The double processing of the uncompressed file gave me substantial grief. First, I didn't want to read the file twice. Second, what if my program generated the file programmatically? I didn't want to generate the file contents twice. A temptingly easy strategy is to use a ByteArrayOutputStream -- transfer the file to one of these, extract the resulting buffer, and then process the buffer twice. The problem with this approach is the size of the runtime memory footprint. If I put a 1 MB file into my archive, I'll need at least 1 MB of memory for the underlying byte array. Even if I have this much memory available, the program's memory footprint could get so large that the operating system would start swapping the executable image to disk to allow other programs to run. A program can slow down by an order of magnitude (or more) once the virtual memory manager starts swapping to disk -- not a good outcome. On the other hand, most of the files that I would be processing in the class-loader application would be small -- a typical class file comprises a couple of KB or less. Nonetheless, it seemed to me that writing the class with the assumption that all files would be small was a bad idea. I wanted to write this class once and be done with it.
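
For reference, the temptingly easy ByteArrayOutputStream strategy looks roughly like the sketch below. It illustrates the approach the paragraph above argues against for large files, not a recommendation. The class name BufferedStoredEntry and the ContentWriter callback are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class BufferedStoredEntry
{
    /** Hypothetical callback that produces an entry's contents exactly once. */
    public interface ContentWriter
    {
        void writeTo( OutputStream out ) throws IOException;
    }

    public static void writeStored( ZipOutputStream out, String name,
                                    ContentWriter generator ) throws IOException
    {
        // Capture the generated contents in memory. The whole entry now lives
        // on the heap, and toByteArray() makes a second copy of it.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        generator.writeTo( buffer );
        byte[] contents = buffer.toByteArray();

        // First pass over the buffer: compute the CRC.
        CRC32 crc = new CRC32();
        crc.update( contents, 0, contents.length );

        ZipEntry entry = new ZipEntry( name );
        entry.setMethod( ZipEntry.STORED );
        entry.setSize( contents.length );
        entry.setCompressedSize( contents.length );
        entry.setCrc( crc.getValue() );

        // Second pass over the buffer: write the bytes.
        out.putNextEntry( entry );
        out.write( contents, 0, contents.length );
        out.closeEntry();
    }
}
```

The contents are generated only once, which solves the double-generation problem, but the memory cost grows with the size of the entry, which is exactly the footprint concern raised above.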

Resources