Modifying archives, Part 2: The Archive class

The Archive class allows you to write or modify stored archive files


Author's note: Before we get started on this month's article, I'd like to mention that my new book on Java threading, Taming Java Threads (APress, June 2000 (see Resources)), is finally out. The book shows you how to create production-quality multithreaded programs; it presents a full-blown industrial-strength threading library along with a lot of advice about threading pitfalls and good architecture. Much of the material in this book first appeared in JavaWorld as a nine-part series on threading (see Resources), though the material has been expanded considerably and the code has been cleaned up and expanded as well.


Modifying Archives

As I discussed in Part 1 of this series, the built-in Java archive classes contain no support for modifying an existing archive. They only let you build one from scratch. To modify an archive, you must copy it to another archive, performing the modifications along the way. Three classes are involved in the transfer:

  • ZipFile: Represents the file as a whole; you get ZipEntry objects that represent the archive's contents from here. The constructor takes the full path name of the .zip or .jar file as an argument.
  • ZipEntry: Essentially the directory entry for a file within the archive. You get an InputStream for a particular file within the archive by calling a_ZipFile_object.getInputStream(a_ZipEntry).
  • ZipOutputStream: An output stream that builds an archive. You can write ZipEntry objects onto this stream as well as the actual data (the ZipEntry object has to be written first, then the data). A ZipOutputStream is a standard java.io-style decorator used along the same lines as BufferedOutputStream. You pass an OutputStream representing the physical archive file to the ZipOutputStream as a constructor argument, and you write to the ZipOutputStream wrapper.
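As a rough illustration of how these three classes fit together, here is a minimal sketch. The class name ZipBasics and its method names are hypothetical, invented for this example, and the code assumes a modern JDK (try-with-resources and generics postdate the original article). It writes one entry followed by its data, lists the entries of an archive, and reads one file back through ZipFile.getInputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class ZipBasics {

    // Build a one-file archive: the ZipEntry goes in first, then the data.
    static void writeArchive(File archive, String name, String text) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(archive))) {
            out.putNextEntry(new ZipEntry(name));
            out.write(text.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
    }

    // Ask the ZipFile for its ZipEntry objects and collect their names.
    static List<String> entryNames(File archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    // Read one file's contents via a_ZipFile_object.getInputStream(a_ZipEntry).
    static String readEntry(File archive, String name) throws IOException {
        try (ZipFile zip = new ZipFile(archive)) {
            ZipEntry entry = zip.getEntry(name);
            if (entry == null) {
                throw new FileNotFoundException(name);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            }
        }
    }
}
```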


The general (but not so easy) process for modifying an archive is as follows:

  1. Get all the ZipEntry objects for the existing archive.
  2. Create a temporary file to hold the new archive as it's being built.
  3. Wrap that temporary file with a ZipOutputStream.
  4. To remove a file:
    • Remove its entry from the list of ZipEntry objects made in Step 1.
  5. To replace a file in the archive:
    1. Remove the old ZipEntry from the list of entries made in Step 1.
    2. Make a new ZipEntry by copying relevant fields from the old one.
    3. Put the new ZipEntry into the ZipOutputStream.
    4. Copy the new contents of the file to the ZipOutputStream.
    5. Tell the ZipOutputStream that you're done with the entry.
    6. Close the InputStream from which you read the new contents.
  6. To add a file to the archive:
    • It's just like replacing a file, but there's no ZipEntry in the old archive, so you have to create one from scratch.
  7. Once you've made all the modifications, transfer the contents of the files represented by the ZipEntry objects that remain in the list created in Step 1 (that is, the files you haven't deleted or replaced). To do this, you'll have to open an InputStream for each of the entries remaining in the list (by asking the ZipFile for an InputStream for a particular ZipEntry), then transfer bytes from that stream to the ZipOutputStream using the process described earlier.
  8. Close the new and old archives, then rename the new one to have the same name as the old one.
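The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Archive class this series develops: the class ArchiveRewrite, the method rewrite(), and its parameters are hypothetical names, new contents are passed as in-memory byte arrays rather than streams, and a modern JDK is assumed. Every entry is written with the ZipOutputStream's default DEFLATED method, so no CRC or size has to be computed in advance.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class ArchiveRewrite {

    // Removes the entries named in toRemove and adds or replaces those in toPut.
    static void rewrite(File archive, Set<String> toRemove, Map<String, byte[]> toPut)
            throws IOException {
        // Step 2: temporary file next to the original, so the final move stays local.
        File temp = File.createTempFile("archive", ".tmp",
                archive.getAbsoluteFile().getParentFile());

        // Step 3: wrap the temporary file with a ZipOutputStream.
        try (ZipFile old = new ZipFile(archive);
             ZipOutputStream out = new ZipOutputStream(new FileOutputStream(temp))) {

            // Step 1: collect the existing entries, keyed by name.
            Map<String, ZipEntry> remaining = new LinkedHashMap<>();
            Enumeration<? extends ZipEntry> entries = old.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                remaining.put(entry.getName(), entry);
            }

            // Step 4: removing a file just drops its entry from the list.
            remaining.keySet().removeAll(toRemove);

            // Steps 5 and 6: replace (old entry exists) or add (it does not).
            for (Map.Entry<String, byte[]> put : toPut.entrySet()) {
                ZipEntry previous = remaining.remove(put.getKey());
                ZipEntry fresh = new ZipEntry(put.getKey());
                if (previous != null) {
                    fresh.setComment(previous.getComment());
                }
                out.putNextEntry(fresh);
                out.write(put.getValue());
                out.closeEntry();
            }

            // Step 7: copy every entry that was neither removed nor replaced.
            byte[] buffer = new byte[8192];
            for (ZipEntry entry : remaining.values()) {
                // A fresh entry avoids carrying over the old compressed size,
                // which may not match once the data is recompressed.
                ZipEntry copy = new ZipEntry(entry.getName());
                if (entry.getTime() != -1) {
                    copy.setTime(entry.getTime());
                }
                copy.setComment(entry.getComment());
                out.putNextEntry(copy);
                try (InputStream in = old.getInputStream(entry)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
            }
        }

        // Step 8: both archives are closed now; give the new one the old name.
        Files.move(temp.toPath(), archive.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A call such as ArchiveRewrite.rewrite(file, Collections.singleton("a.txt"), replacements) would drop a.txt and add or overwrite whatever names the replacements map contains. The sketch leaves the temporary file behind if an exception is thrown partway through; production code would want to clean it up.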


To make matters worse, the requirements for writing a compressed (ZipEntry.DEFLATED) file differ from those for writing an uncompressed (ZipEntry.STORED) file. The ZipEntry for an uncompressed file must be initialized with a CRC value (a checksum) and the file size before it can be written to the ZipOutputStream. The checksum can be built using Java's CRC32 class (which is passed the bytes that comprise the file and provides a checksum once all the bytes have been passed in). The ZipEntry must be written before the file contents, however, so you have to process the data twice -- once to figure out the CRC and once again to copy the bytes to the ZipOutputStream. Fortunately, the process isn't so brain dead for a compressed file; you can give the ZipOutputStream a ZipEntry with uninitialized size and CRC fields, and the ZipOutputStream will fill in the fields for you as it does the compression.
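The difference between the two cases might look like the sketch below. The class StoredEntryWriter and its two methods are hypothetical names used only for illustration, and the data is assumed to be already in memory as a byte array, so the "two passes" are two walks over the same array; with a file on disk you would instead read the stream twice.

```java
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class StoredEntryWriter {

    // Uncompressed: size and CRC must be set before the entry is written.
    static void putStored(ZipOutputStream out, String name, byte[] data) throws IOException {
        // First pass over the data: compute the checksum.
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);

        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(data.length);
        entry.setCompressedSize(data.length); // stored data is not compressed
        entry.setCrc(crc.getValue());

        // Second pass: write the entry, then the bytes themselves.
        out.putNextEntry(entry);
        out.write(data, 0, data.length);
        out.closeEntry();
    }

    // Compressed: size and CRC are left unset; the stream computes them.
    static void putDeflated(ZipOutputStream out, String name, byte[] data) throws IOException {
        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.DEFLATED);
        out.putNextEntry(entry);
        out.write(data, 0, data.length);
        out.closeEntry();
    }
}
```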

Resources