Java Tutor is my platform for teaching about Java 7+ and JavaFX 2.0+, mainly via programming projects.
Java provides the java.util.zip package. According to this package’s description, you can use its types to read and write content in the standard Zip and GZip (GNU Zip) file formats, compress and decompress data via the DEFLATE compression algorithm that these formats use, and compute the CRC-32 and Adler-32 checksums of arbitrary input streams. This post introduces you to java.util.zip.
Note: Check out Wikipedia’s Cyclic redundancy check and Adler-32 entries to learn about CRC-32 and Adler-32, respectively.
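The checksum support mentioned above can be tried without touching any archive. The following is a minimal sketch, assuming a hypothetical ChecksumDemo class and helper method (neither is part of the original post); it relies only on the CRC32, Adler32, and Checksum types that java.util.zip provides.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical demo class; the name and helper are illustrative only.
class ChecksumDemo
{
   // Feeds all of data to the given checksum object and returns the result.
   static long checksum(Checksum cs, byte[] data)
   {
      cs.reset();
      cs.update(data, 0, data.length);
      return cs.getValue();
   }

   public static void main(String[] args)
   {
      byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
      // Both classes implement Checksum, so one helper serves both.
      System.out.printf("CRC-32:   %08X%n", checksum(new CRC32(), data));
      System.out.printf("Adler-32: %08X%n", checksum(new Adler32(), data));
   }
}
```

For the ASCII string 123456789, the CRC-32 result should be the widely cited check value CBF43926.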
The ZipEntry class represents a zip entry. You instantiate this class when creating entries to store in a new archive. You obtain instances of this class when extracting content from an existing archive.
ZipEntry declares the following constructors:
ZipEntry(String name) creates a new zip entry with the specified name. This constructor throws java.lang.NullPointerException when name is null, and java.lang.IllegalArgumentException when the string assigned to name is longer than 65535 bytes.
ZipEntry(ZipEntry ze) creates a new zip entry with fields taken from the specified zip entry.
Additionally, ZipEntry declares methods for creating or extracting entry information. Several methods are listed below:
String getComment() returns the entry’s comment string, or null when there is no comment string. A comment provides user-specific information associated with an entry.
long getCompressedSize() returns the size of the entry’s compressed data, or -1 when not known. The compressed size is the same as the uncompressed size when the entry data is stored without compression.
long getCrc() returns the CRC-32 checksum of the entry’s uncompressed data, or -1 when not known.
int getMethod() returns the compression method used to compress the entry’s data.
String getName() returns the entry’s name.
long getSize() returns the uncompressed size of the entry’s data, or -1 when not known.
boolean isDirectory() returns true when the entry describes a directory; otherwise returns false.
void setComment(String comment) sets the entry’s comment string to comment. A comment string is optional. When specified, the maximum length should be 65535 bytes; remaining bytes are truncated.
void setCompressedSize(long csize) sets the size of the entry’s compressed data to csize.
void setCrc(long crc) sets the CRC-32 checksum of the entry’s uncompressed data to crc. This method throws IllegalArgumentException when crc’s value is less than 0 or greater than 0xFFFFFFFF.
void setMethod(int method) sets the compression method to method. This method throws IllegalArgumentException when any value other than ZipEntry.DEFLATED (compress data file at a specific level) or ZipEntry.STORED (do not compress) is passed to method.
void setSize(long size) sets the uncompressed size of the entry’s data to size. This method throws IllegalArgumentException when size’s value is less than 0, or the value is greater than 0xFFFFFFFF and ZIP64 is not supported.
You will learn how to use this class in the following sections.
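As a quick preview of how the setter methods fit together, here is a minimal sketch that stores one entry without compression. The StoredEntryDemo class and its createArchive() method are hypothetical names chosen for illustration. The sketch assumes that a ZipEntry.STORED entry needs its size, compressed size, and CRC-32 set before it is handed to the output stream, which is why it computes the checksum first.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original post's code.
class StoredEntryDemo
{
   // Builds an in-memory archive holding one uncompressed (STORED) entry.
   static byte[] createArchive(String name, byte[] data) throws IOException
   {
      CRC32 crc = new CRC32();
      crc.update(data, 0, data.length);

      ZipEntry ze = new ZipEntry(name);
      ze.setMethod(ZipEntry.STORED);     // do not compress
      ze.setSize(data.length);           // uncompressed size
      ze.setCompressedSize(data.length); // same as size for STORED
      ze.setCrc(crc.getValue());         // checksum of uncompressed data
      ze.setComment("stored without compression");

      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      try (ZipOutputStream zos = new ZipOutputStream(baos))
      {
         zos.putNextEntry(ze);
         zos.write(data, 0, data.length);
         zos.closeEntry();
      }
      return baos.toByteArray();
   }

   public static void main(String[] args) throws IOException
   {
      byte[] data = "hello, zip".getBytes(StandardCharsets.UTF_8);
      System.out.println("archive bytes: "+
                         createArchive("hello.txt", data).length);
   }
}
```

The archive is written to a byte array only to keep the sketch self-contained; a FileOutputStream would work the same way.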
The ZipOutputStream class is used to create a new zip archive. ZipOutputStream is an output stream filter that writes files in the Zip format, and supports compressed and uncompressed entries.
Note: Use the GZIPOutputStream class to create a gzip archive and write files to this archive in the GZip format.
ZipOutputStream declares a pair of constructors for creating this output stream:
ZipOutputStream(OutputStream out) creates a zip output stream that writes its content to underlying stream out, and uses the UTF-8 java.nio.charset.Charset implementation to encode entry names and comments. This constructor has been present since java.util.zip was introduced in Java 1.1.
ZipOutputStream(OutputStream out, Charset charset) creates a zip output stream that writes its content to underlying stream out, and uses the specified Charset implementation to encode entry names and comments. This constructor was introduced in Java 7.
The following code fragment shows you how to instantiate ZipOutputStream with an underlying file output stream:
ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("archive.zip"));
ZipOutputStream also declares several methods, and inherits additional methods from its DeflaterOutputStream superclass. Of the various methods that are available, you minimally work with the
following:
void close() closes the zip output stream along with the underlying output stream. Because ZipOutputStream implements the java.lang.AutoCloseable interface, you can use
ZipOutputStream in the context of Java 7’s try-with-resources statement – you do not have to explicitly invoke close().
void closeEntry() closes the current zip entry and positions the stream for writing the next entry.
void putNextEntry(ZipEntry e) begins writing a new zip entry and positions the stream to the start of the entry data. The current entry is closed when still active (i.e., when closeEntry() was not invoked on the previous entry).
void write(byte[] b, int off, int len) writes len bytes starting at offset off from buffer b to the current zip entry. This method will block until all the bytes are written.
Each method throws java.io.IOException when a generic I/O error has occurred, and ZipException (which subclasses IOException) when a zip-specific I/O error has occurred.
Listing 1 presents the source code to a ZipCreate application that shows you how to minimally use ZipOutputStream and ZipEntry to create a zip file and store assorted files in this archive.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCreate
{
   public static void main(String[] args) throws IOException
   {
      if (args.length < 2)
      {
         System.err.println("usage: java ZipCreate zipfile infile1 "+
                            "infile2 ...");
         return;
      }
      try (ZipOutputStream zos =
              new ZipOutputStream(new FileOutputStream(args[0])))
      {
         byte[] buf = new byte[1024];
         for (String filename: args)
         {
            // Skip the archive's own name (the first argument).
            if (filename.equals(args[0]))
               continue;
            try (FileInputStream fis = new FileInputStream(filename))
            {
               // Start a new entry named after the input file.
               zos.putNextEntry(new ZipEntry(filename));
               // Copy the file's bytes into the current entry.
               int len;
               while ((len = fis.read(buf)) > 0)
                  zos.write(buf, 0, len);
               zos.closeEntry();
            }
         }
      }
   }
}
Listing 1: Creating a zip file and storing assorted files in that archive
Listing 1 is fairly straightforward. It first validates the number of command-line arguments, which must be at least two; the first argument is always the name of the zip file to be created. Assuming success, it then creates the zip output stream with an underlying file output stream to this file, and writes the contents of those files identified by successive arguments to the zip output stream.
The only part of this source code that might seem confusing is if (filename.equals(args[0])) continue;. This decision statement prevents the first command-line argument (the name of the zip archive) from being added to the archive, which doesn’t make sense. If permitted, a ZipException instance containing a “duplicate entry” message would be thrown.
ZipCreate is easy to use. For example, execute the following command line to add ZipCreate.java to an archive named a.zip:
java ZipCreate a.zip ZipCreate.java
You should not observe any output, but should observe a newly created file named a.zip. Unzip this file using your conventional unzip tool and you should observe an extracted ZipCreate.java file.
You cannot store duplicate files in an archive. For example, you will observe an exception message about a duplicate entry when you execute the following command line:
java ZipCreate a.zip ZipCreate.java ZipCreate.java
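The duplicate-entry behavior can also be reproduced in isolation. The following is a small sketch, assuming a hypothetical DuplicateEntryDemo class that writes to an in-memory stream; it calls putNextEntry() twice with the same name and reports the resulting ZipException message.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original post's code.
class DuplicateEntryDemo
{
   // Returns the exception message produced by adding name twice,
   // or null when no ZipException is thrown.
   static String addTwice(String name) throws IOException
   {
      try (ZipOutputStream zos =
              new ZipOutputStream(new ByteArrayOutputStream()))
      {
         zos.putNextEntry(new ZipEntry(name));
         zos.closeEntry();
         zos.putNextEntry(new ZipEntry(name)); // same name a second time
         zos.closeEntry();
         return null;
      }
      catch (ZipException ze)
      {
         return ze.getMessage();
      }
   }

   public static void main(String[] args) throws IOException
   {
      System.out.println(addTwice("ZipCreate.java"));
   }
}
```

Catching ZipException separately from IOException, as done here, is one way to report zip-specific problems without masking other I/O failures.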
ZipOutputStream offers more for you to play with. For example, you can use its void setLevel(int level) method to set the compression level for successive entries. Specify an integer argument from 0 through 9, where 0 indicates no compression and 9 indicates best compression – better compression results in slower performance. Alternatively, specify one of the Deflater class’s BEST_COMPRESSION, BEST_SPEED, or DEFAULT_COMPRESSION constants as an argument; DEFAULT_COMPRESSION is the level used when you never call setLevel().
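To see the effect of setLevel(), you can compare archive sizes for the same data at different levels. This is a rough sketch, assuming a hypothetical LevelDemo class and made-up, highly repetitive sample data; actual sizes depend on the input, so the program prints them rather than promising specific numbers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original post's code.
class LevelDemo
{
   // Writes data as a single entry at the given level and returns the
   // total size of the resulting in-memory archive.
   static int archiveSize(byte[] data, int level) throws IOException
   {
      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      try (ZipOutputStream zos = new ZipOutputStream(baos))
      {
         zos.setLevel(level); // applies to entries that follow
         zos.putNextEntry(new ZipEntry("data.txt"));
         zos.write(data, 0, data.length);
         zos.closeEntry();
      }
      return baos.size();
   }

   public static void main(String[] args) throws IOException
   {
      // Repetitive sample data compresses well.
      byte[] data = new byte[20000];
      for (int i = 0; i < data.length; i++)
         data[i] = (byte) ('a'+i%4);
      System.out.println("level 0: "+
                         archiveSize(data, Deflater.NO_COMPRESSION));
      System.out.println("level 1: "+
                         archiveSize(data, Deflater.BEST_SPEED));
      System.out.println("level 9: "+
                         archiveSize(data, Deflater.BEST_COMPRESSION));
   }
}
```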
The ZipInputStream class is used to access an existing zip archive. ZipInputStream is an input stream filter that reads files in the Zip format, and supports compressed and uncompressed entries.
Note: Use the GZIPInputStream class to access a gzip archive and read files from this archive in the GZip format.
ZipInputStream declares a pair of constructors for creating this input stream:
ZipInputStream(InputStream in) creates a zip input stream that reads its content from underlying stream in, and uses the UTF-8 Charset implementation to decode entry names. This constructor has been present since java.util.zip was introduced in Java 1.1.
ZipInputStream(InputStream in, Charset charset) creates a zip input stream that reads its content from underlying stream in, and uses the specified Charset implementation (when possible) to decode entry names. The charset argument is ignored for the entry being read when the language encoding bit of the entry’s general purpose bit flag is set, which indicates that the filename was encoded via UTF-8. See http://www.pkware.com/documents/casestudies/APPNOTE.TXT to
learn about this bit and flag. This constructor was introduced in Java 7.
The following code fragment shows you how to instantiate ZipInputStream with an underlying file input stream:
ZipInputStream zis = new ZipInputStream(new FileInputStream("archive.zip"));
ZipInputStream also declares several methods, and inherits additional methods from its InflaterInputStream superclass. Of the various methods that are available, you minimally work with the following:
void close() closes the zip input stream along with the underlying input stream, releasing any associated system resources. Because ZipInputStream implements the AutoCloseable interface, you can use ZipInputStream in the context of Java 7’s try-with-resources statement – you do not have to explicitly invoke
close().
void closeEntry() closes the current zip entry and positions the stream for reading the next entry.
ZipEntry getNextEntry() reads the next zip entry and positions the stream to the start of the entry data. This method returns null when there are no more entries.
int read(byte[] b, int off, int len) reads a maximum of len bytes from the current zip entry into buffer b starting at offset off. This method blocks until some input is available, and returns the number of bytes actually read, or -1 when the end of the entry has been reached.
Each method throws IOException when a generic I/O error has occurred, and (except for close()) ZipException when a zip-specific I/O error has occurred. Also, read() throws NullPointerException when b is null, and java.lang.IndexOutOfBoundsException when off is negative, len is negative, or len is greater than b.length - off.
Listing 2 presents the source code to a ZipAccess application that shows you how to minimally use ZipInputStream and ZipEntry to access a zip file and extract its entries.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipAccess
{
   public static void main(String[] args) throws IOException
   {
      if (args.length != 1)
      {
         System.err.println("usage: java ZipAccess zipfile");
         return;
      }
      try (ZipInputStream zis =
              new ZipInputStream(new FileInputStream(args[0])))
      {
         byte[] buffer = new byte[4096];
         ZipEntry ze;
         // getNextEntry() returns null after the last entry.
         while ((ze = zis.getNextEntry()) != null)
         {
            System.out.println("Extracting: "+ze);
            // Write the entry's data to a file named after the entry.
            try (FileOutputStream fos = new FileOutputStream(ze.getName()))
            {
               int numBytes;
               // read() returns -1 at the end of the current entry.
               while ((numBytes = zis.read(buffer, 0, buffer.length)) != -1)
                  fos.write(buffer, 0, numBytes);
            }
            zis.closeEntry();
         }
      }
   }
}
Listing 2: Accessing a zip file and extracting its entries
Listing 2 is fairly straightforward. It first validates the number of command-line arguments, which must be exactly one: the name of the zip file to be accessed. Assuming success, it then creates the zip input stream with an underlying file input stream to this file, and reads the contents of the various files that are stored in this archive.
ZipAccess is easy to use. For example, execute the following command line to extract ZipCreate.java from the previously created a.zip:
java ZipAccess a.zip
You should observe “Extracting: ZipCreate.java” as the single line of output, and also note the appearance of a ZipCreate.java file in the current directory.
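Because ZipOutputStream and ZipInputStream are stream filters, the same write and read loops work with any underlying stream, not just files. The following sketch, assuming a hypothetical RoundTripDemo class with zip() and unzip() helpers, round-trips a map of names to contents entirely in memory, which can be handy for experimenting without creating files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original post's code.
class RoundTripDemo
{
   // Writes each name/content pair as one entry of an in-memory archive.
   static byte[] zip(Map<String, byte[]> files) throws IOException
   {
      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      try (ZipOutputStream zos = new ZipOutputStream(baos))
      {
         for (Map.Entry<String, byte[]> file: files.entrySet())
         {
            zos.putNextEntry(new ZipEntry(file.getKey()));
            zos.write(file.getValue(), 0, file.getValue().length);
            zos.closeEntry();
         }
      }
      return baos.toByteArray();
   }

   // Reads every entry back, using the same loop shape as Listing 2.
   static Map<String, byte[]> unzip(byte[] archive) throws IOException
   {
      Map<String, byte[]> files = new LinkedHashMap<>();
      try (ZipInputStream zis =
              new ZipInputStream(new ByteArrayInputStream(archive)))
      {
         byte[] buffer = new byte[4096];
         ZipEntry ze;
         while ((ze = zis.getNextEntry()) != null)
         {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            int numBytes;
            while ((numBytes = zis.read(buffer, 0, buffer.length)) != -1)
               baos.write(buffer, 0, numBytes);
            files.put(ze.getName(), baos.toByteArray());
            zis.closeEntry();
         }
      }
      return files;
   }

   public static void main(String[] args) throws IOException
   {
      Map<String, byte[]> files = new LinkedHashMap<>();
      files.put("a.txt", "alpha".getBytes(StandardCharsets.UTF_8));
      files.put("b.txt", "beta".getBytes(StandardCharsets.UTF_8));
      for (Map.Entry<String, byte[]> file: unzip(zip(files)).entrySet())
         System.out.println(file.getKey()+": "+file.getValue().length+
                            " bytes");
   }
}
```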
ZipAccess reads zip entries from a zip archive but doesn’t display entry-specific information apart from the name. To address this shortcoming, I’ve created a GUI-based ZipList application that lists a zip archive’s entries in terms of names, compressed sizes, and uncompressed sizes. Figure 1 shows this application’s GUI.
Figure 1: ZipList lists zip archive entries in terms of names, compressed sizes, and uncompressed sizes.
ZipList’s GUI is based on JavaFX. I considered using Swing for this task, but (according to various blog posts that I’ve read) Swing is yesterday’s user interface technology and JavaFX is cooler. Listing 3 presents ZipList’s JavaFX 2 source code.
import java.io.FileInputStream;
import java.io.IOException;

import java.util.List;

import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import javafx.application.Application;
import javafx.application.Platform;

import javafx.collections.FXCollections;
import javafx.collections.ObservableList;

import javafx.scene.Scene;

import javafx.scene.control.TableColumn;
import javafx.scene.control.TableView;

import javafx.scene.control.cell.PropertyValueFactory;

import javafx.stage.Stage;

public class ZipList extends Application
{
   private String zipfile;

   @Override
   @SuppressWarnings("unchecked")
   public void start(final Stage primaryStage)
   {
      Application.Parameters params = getParameters();
      List<String> uparams = params.getUnnamed();
      zipfile = uparams.get(0);
      ObservableList<Entry> entries = FXCollections.observableArrayList();
      try (ZipInputStream zis =
              new ZipInputStream(new FileInputStream(zipfile)))
      {
         System.out.println("Gathering zip entries...");
         ZipEntry ze;
         while ((ze = zis.getNextEntry()) != null)
         {
            System.out.print(".");
            if (ze.isDirectory()) // Ignore directory-only entries stored in
               continue;          // archive.
            Entry entry = new Entry();
            entry.setName(ze.getName());
            entry.setCompressedSize(ze.getCompressedSize());
            entry.setSize(ze.getSize());
            entries.add(entry);
         }
      }
      catch (IOException ioe)
      {
         System.err.println("I/O error");
         Platform.exit();
         return; // Do not build the GUI after a failed read.
      }
      TableView<Entry> table = new TableView<>(entries);
      TableColumn<Entry, String> nameCol = new TableColumn<>("Name");
      nameCol.setCellValueFactory(new PropertyValueFactory<Entry,
                                                           String>("name"));
      nameCol.setPrefWidth(200.0);
      TableColumn<Entry, Long> compressedSizeCol;
      compressedSizeCol = new TableColumn<>("Compressed Size");
      compressedSizeCol.
         setCellValueFactory(new PropertyValueFactory<Entry,
                                                      Long>("compressedSize"));
      compressedSizeCol.setPrefWidth(150.0);
      TableColumn<Entry, Long> sizeCol;
      sizeCol = new TableColumn<>("Size");
      sizeCol.setCellValueFactory(new PropertyValueFactory<Entry,
                                                           Long>("size"));
      sizeCol.setPrefWidth(150.0);
      table.getColumns().setAll(nameCol, compressedSizeCol, sizeCol);
      Scene scene = new Scene(table, 500, 300);
      primaryStage.setScene(scene);
      primaryStage.setTitle("ZipList: "+zipfile);
      primaryStage.show();
   }

   public static void main(String[] args)
   {
      if (args.length != 1)
      {
         System.err.println("usage: java ZipList zipfile");
         return;
      }
      launch(args);
   }
}
Listing 3: Showing a zip archive’s contents via JavaFX
For brevity, I won’t explain how this code works. Instead, I recommend that you read my Practical JavaFX 2 series and study the JavaFX 2 API documentation on the various JavaFX types that appear in Listing 3.
ZipList refers to an Entry helper class that represents a zip entry as a sequence of JavaFX properties. Listing 4 presents Entry’s source code.
import javafx.beans.property.LongProperty;
import javafx.beans.property.SimpleLongProperty;
import javafx.beans.property.SimpleStringProperty;
import javafx.beans.property.StringProperty;

public class Entry
{
   private StringProperty name;

   public void setName(String value)
   {
      nameProperty().set(value);
   }

   public String getName()
   {
      return nameProperty().get();
   }

   public StringProperty nameProperty()
   {
      if (name == null)
         name = new SimpleStringProperty(this, "name");
      return name;
   }

   private LongProperty compressedSize;

   public void setCompressedSize(long value)
   {
      compressedSizeProperty().set(value);
   }

   public long getCompressedSize()
   {
      return compressedSizeProperty().get();
   }

   public LongProperty compressedSizeProperty()
   {
      if (compressedSize == null)
         compressedSize = new SimpleLongProperty(this, "compressedSize");
      return compressedSize;
   }

   private LongProperty size;

   public void setSize(long value)
   {
      sizeProperty().set(value);
   }

   public long getSize()
   {
      return sizeProperty().get();
   }

   public LongProperty sizeProperty()
   {
      if (size == null)
         size = new SimpleLongProperty(this, "size");
      return size;
   }
}
Listing 4: Representing an entry as a sequence of properties
JavaFX introduces a property pattern that includes naming conventions for implementing properties. Listing 4 leverages this pattern to introduce name, compressedSize, and size properties into Entry. The JavaFX documentation's Using JavaFX Properties and Binding document explains this pattern.
Assuming that the JavaFX 2.0 SDK is installed in C:\Program Files\Oracle\JavaFX 2.0 SDK, execute the following command line to compile ZipList.java:
javac -cp "c:\progra~1\oracle\javafx 2.0 sdk\rt\lib\jfxrt.jar";. ZipList.java
Execute a command line similar to that shown below to run this application:
java -cp "c:\progra~1\oracle\javafx 2.0 sdk\rt\lib\jfxrt.jar";. ZipList d:dev.zip
The java.util.zip package also contains a ZipFile class that, at first glance, seems to be an alias for ZipInputStream. As with ZipInputStream, you can use ZipFile to read a zip file’s entries. However, ZipFile has several differences that make it worth considering as an alternative:
ZipFile allows random access to zip entries via its ZipEntry getEntry(String name) method. Given a ZipEntry instance, you can call ZipFile’s InputStream getInputStream(ZipEntry entry) method to obtain an input stream for reading the entry’s content. ZipInputStream supports only sequential access to zip entries.
ZipFile internally caches zip entries for improved performance (especially on UNIX platforms). ZipInputStream does not cache entries.
ZipOutputStream declares a void setComment(String comment) method for associating a comment with the entire file, but ZipInputStream does not declare an equivalent String getComment() method to return this comment. ZipFile declares a String getComment() method that returns the file-specific comment.
ZipInputStream reads zip entries sequentially, and because comment information is stored at the end of the archive, ZipEntry’s String getComment() method always returns null for each entry. To obtain entry-specific comment information, invoke this method on an archive opened with ZipFile.
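The differences listed above can be observed side by side. The sketch below assumes a hypothetical ZipFileDemo class that first writes a small temporary archive (made-up sample data) and then inspects the same entry through ZipFile and through ZipInputStream. It also prints the entry size, on the assumption that an entry written by ZipOutputStream with the default DEFLATED method does not carry its size in the header that ZipInputStream reads first.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original post's code.
class ZipFileDemo
{
   // Creates a temporary archive with one commented entry and an
   // archive-level comment.
   static File createSample() throws IOException
   {
      File file = File.createTempFile("sample", ".zip");
      file.deleteOnExit();
      try (ZipOutputStream zos =
              new ZipOutputStream(new FileOutputStream(file)))
      {
         zos.setComment("archive comment");
         ZipEntry ze = new ZipEntry("a.txt");
         ze.setComment("entry comment");
         zos.putNextEntry(ze);
         byte[] data = "alpha".getBytes(StandardCharsets.UTF_8);
         zos.write(data, 0, data.length);
         zos.closeEntry();
      }
      return file;
   }

   // Describes an entry as name|comment|size.
   static String describe(ZipEntry ze)
   {
      return ze.getName()+"|"+ze.getComment()+"|"+ze.getSize();
   }

   // Random access by name, plus the archive-level comment.
   static String viaZipFile(File file, String name) throws IOException
   {
      try (ZipFile zf = new ZipFile(file))
      {
         return describe(zf.getEntry(name))+"|"+zf.getComment();
      }
   }

   // Sequential access: inspects the first entry before reading its data.
   static String viaStream(File file) throws IOException
   {
      try (ZipInputStream zis =
              new ZipInputStream(new FileInputStream(file)))
      {
         return describe(zis.getNextEntry());
      }
   }

   public static void main(String[] args) throws IOException
   {
      File file = createSample();
      System.out.println("ZipFile:        "+viaZipFile(file, "a.txt"));
      System.out.println("ZipInputStream: "+viaStream(file));
   }
}
```

When entry comments or sizes matter up front (as in a listing tool), ZipFile is usually the more convenient choice; ZipInputStream remains useful when the archive arrives as a stream rather than a file.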
You might be curious about two ZipFile constructors that declare a mode parameter of type int. The argument passed to mode is ZipFile.OPEN_READ or
ZipFile.OPEN_READ|ZipFile.OPEN_DELETE. The latter argument causes the underlying file to be deleted sometime between when it is opened and when it is closed.
This capability was introduced by Java 1.3 to solve a problem related to caching downloaded JAR files in the context of long-running server applications or Remote Method Invocation. The problem is discussed at http://docs.oracle.com/javase/7/docs/technotes/guides/lang/enhancements.html.
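The effect of ZipFile.OPEN_DELETE is easy to check with a throwaway file. This is a minimal sketch, assuming a hypothetical OpenDeleteDemo class and a temporary archive created just for the experiment; it uses the ZipFile(File file, int mode) constructor and reports whether the file still exists after the ZipFile has been closed.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original post's code.
class OpenDeleteDemo
{
   // Creates a temporary archive, opens it with OPEN_DELETE, reads one
   // entry name, and returns whether the file exists after close().
   static boolean existsAfterClose() throws IOException
   {
      File file = File.createTempFile("opendelete", ".zip");
      try (ZipOutputStream zos =
              new ZipOutputStream(new FileOutputStream(file)))
      {
         zos.putNextEntry(new ZipEntry("a.txt"));
         zos.write(new byte[] { 65 }, 0, 1);
         zos.closeEntry();
      }
      try (ZipFile zf =
              new ZipFile(file, ZipFile.OPEN_READ|ZipFile.OPEN_DELETE))
      {
         // The archive remains readable while the ZipFile is open.
         System.out.println("entry: "+zf.getEntry("a.txt").getName());
      }
      return file.exists();
   }

   public static void main(String[] args) throws IOException
   {
      System.out.println("exists after close: "+existsAfterClose());
   }
}
```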
Modify ZipCreate to support writing directories of files to a zip archive. Also use java.io.BufferedOutputStream to enhance performance.
Modify ZipAccess to support reading directories of files from a zip archive. Also use java.io.BufferedInputStream to enhance performance.
You can download this post's code and answers here. Code was developed and tested with JDK 7u2 on a Windows XP SP3 platform.
* * *
I welcome your input to this blog, and will write about relevant topics that you suggest. While waiting for the next blog post, check out my TutorTutor website to learn more about Java and other computer technologies (and that's just the beginning).
Learn more about the Java 7 language and many of its APIs by reading my book Beginning Java 7. You can obtain information about this book here and here.