Core Java Technologies Tech Tips

Controlling the Creation of ZIP/JAR Entries and Using printf with Custom Classes

 
In This Issue

Welcome to the Core Java Technologies Tech Tips for May 2007. Core Java Technologies Tech Tips provides tips and hints for using core Java technologies and APIs in the Java Platform, Standard Edition 6 (Java SE 6).

This issue provides tips for the following:

» Controlling the Creation of ZIP/JAR Entries
» Using printf with Custom Classes

These tips were developed using Java SE 6. You can download Java SE 6 from the Java SE Downloads page.

The author of this month's tips is John Zukowski, president and principal consultant of JZ Ventures, Inc.

Controlling the Creation of ZIP/JAR Entries

A handful of earlier tips have explored JAR files. Those tips described listing, reading, and updating archived content, but said little about how to create JAR or ZIP archives. Because the JAR-related classes are subclasses of the ZIP-related classes, this tip is more specifically about the ZipOutputStream class and the java.util.zip package.

Before digging into creating ZIP or JAR files, it is important to mention their purpose. ZIP files offer a packaging mechanism, allowing multiple files to be bundled together as one. Thus, when you need to download a group of files from the web, you can package them into one ZIP file for easier transport as a single file. The bundling can include additional information like directory hierarchy, thus preserving necessary paths for an application or series of resources once unbundled.

This tip will address three aspects of ZIP file usage: creating them, adding files to them, and compressing those added files.

First up is creating zip files, or more specifically zip streams. The ZipOutputStream offers a stream for compressing the outgoing bytes. There is a single constructor for ZipOutputStream, one that accepts another OutputStream:

public ZipOutputStream(OutputStream out)

If the constructor argument is the type FileOutputStream, then the compressed bytes written by the ZipOutputStream will be saved to a file. However, you aren't limited to using the ZipOutputStream with a file. You can also use the OutputStream that comes from a socket connection or some other non-file-oriented stream. Thus, for a file-oriented ZipOutputStream, the typical usage will look like this:

String path = "afile.zip"; 
FileOutputStream fos = new FileOutputStream(path);
ZipOutputStream zos = new ZipOutputStream(fos);

Once created, you don't just write bytes to a ZipOutputStream. Instead, you need to treat the output stream as a collection of components. Each component of a ZipOutputStream is paired with a ZipEntry. It is this ZipEntry that you create and then add to the ZipOutputStream before actually writing its contents to the stream.

String name = ...; 
ZipEntry entry = new ZipEntry(name);
zos.putNextEntry(entry);
zos.write(<< all the bytes for entry >>); 
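Because the constructor accepts any OutputStream, the same steps work without a file. The following is a minimal sketch, assuming an in-memory ByteArrayOutputStream as the target; the class name InMemoryZip and the entry names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class InMemoryZip {
    // Builds a two-entry archive in memory and returns its raw bytes.
    static byte[] build() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bytes);

        // Each entry is announced first, then its bytes are written.
        zos.putNextEntry(new ZipEntry("docs/hello.txt"));
        zos.write("Hello, ZIP".getBytes("UTF-8"));
        zos.closeEntry();

        zos.putNextEntry(new ZipEntry("docs/bye.txt"));
        zos.write("Goodbye, ZIP".getBytes("UTF-8"));
        zos.closeEntry();

        // Closing the stream finishes the archive.
        zos.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = build();
        System.out.println("Archive size in bytes: " + zip.length);
    }
}
```

The slash in the entry names is how the directory hierarchy mentioned earlier is recorded; no separate directory entry is required for this sketch.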

Each entry serves as a marker in the overall stream, where you'll find the bytes related to the entry in the archive file. After the ZIP file has been created, when you need to get the entry contents back, just open the file and ask for the related input stream:

ZipFile zip = new ZipFile("afile.zip");
ZipEntry entry = zip.getEntry("anEntry.name");
InputStream is = zip.getInputStream(entry);
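To tie the writing and reading halves together, here is a small round-trip sketch. It assumes a temporary file as the archive location, and the ReadEntryBack class and roundTrip method are hypothetical names used only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class ReadEntryBack {
    // Writes text into a one-entry archive, then reads the entry back.
    static String roundTrip(String entryName, String text) throws IOException {
        File tmp = File.createTempFile("tip", ".zip");
        tmp.deleteOnExit();

        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(tmp));
        try {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(text.getBytes("UTF-8"));
            zos.closeEntry();
        } finally {
            zos.close();
        }

        ZipFile zip = new ZipFile(tmp);
        try {
            ZipEntry entry = zip.getEntry(entryName);
            InputStream is = zip.getInputStream(entry);
            ByteArrayOutputStream copy = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = is.read(buffer)) != -1) {
                copy.write(buffer, 0, bytesRead);
            }
            is.close();
            return copy.toString("UTF-8");
        } finally {
            zip.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("anEntry.name", "some text"));
    }
}
```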

Now that you've seen how to create the zip file and add entries to that file, it is important to point out that the java.util.zip libraries offer some level of control for the added entries of the ZipOutputStream. First, the order you add entries to the ZipOutputStream is the order they are physically located in the .zip file. You can manipulate the enumeration of entries returned by the entries() method of ZipFile to produce a list in alphabetical or size order, but the entries are still stored in the order they were written to the output stream.
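One way to produce such an alphabetical listing is sketched below. It copies the names from entries() into a list and sorts the copy, leaving the archive itself untouched. The SortedEntries class, its method names, and the sample entry names are assumptions made for the example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class SortedEntries {
    // Returns the entry names sorted alphabetically.
    static List<String> sortedNames(File archive) throws IOException {
        ZipFile zip = new ZipFile(archive);
        try {
            List<String> names = new ArrayList<String>();
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            Collections.sort(names);
            return names;
        } finally {
            zip.close();
        }
    }

    // Writes a sample archive whose entries are deliberately out of order.
    static File sampleArchive() throws IOException {
        File tmp = File.createTempFile("order", ".zip");
        tmp.deleteOnExit();
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(tmp));
        String[] names = { "b.txt", "a.txt", "c.txt" };
        for (String name : names) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(name.getBytes("UTF-8"));
            zos.closeEntry();
        }
        zos.close();
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sortedNames(sampleArchive())); // [a.txt, b.txt, c.txt]
    }
}
```

Sorting by size would follow the same pattern, collecting the ZipEntry objects themselves and supplying a Comparator that compares getSize().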

Files added to a ZIP/JAR file are compressed individually. Unlike Microsoft CAB files, which compress the package as a whole, each file in a ZIP/JAR file is compressed, or left uncompressed, separately. Before adding a ZipEntry to the ZipOutputStream, you determine whether its associated bytes are compressed. The setMethod method of ZipEntry allows you to specify which of the two available storage methods to use. Use the STORED constant of ZipEntry to give you an uncompressed file and the DEFLATED setting for a compressed version. You cannot control how well a given entry compresses. That depends on the type of data in the associated entry. Straight text can be compressed to around 80% of its size quite easily, whereas MP3 or JPEG data, which is already compressed, will be compressed much less.
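The effect of the data type can be observed with a rough experiment. The sketch below writes a single DEFLATED entry to an in-memory archive and compares the result for repetitive text against pseudo-random bytes. The random bytes are only a stand-in for already-compressed media such as MP3 or JPEG data, and the CompressionComparison class and its helpers are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class CompressionComparison {
    // Size in bytes of a one-entry archive holding data, using DEFLATED.
    static int archiveSize(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bytes);
        ZipEntry entry = new ZipEntry("data.bin");
        entry.setMethod(ZipEntry.DEFLATED);
        zos.putNextEntry(entry);
        zos.write(data);
        zos.closeEntry();
        zos.close();
        return bytes.size();
    }

    // Highly repetitive text, which deflates very well.
    static byte[] repetitiveText(int repeats) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++) {
            sb.append("The quick brown fox jumps over the lazy dog. ");
        }
        return sb.toString().getBytes("UTF-8");
    }

    // Pseudo-random bytes, standing in for data that is already compressed.
    static byte[] randomBytes(int length) {
        byte[] data = new byte[length];
        new Random(42).nextBytes(data);
        return data;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = repetitiveText(200);
        byte[] random = randomBytes(text.length);
        System.out.println("Input size:       " + text.length);
        System.out.println("Text archive:     " + archiveSize(text));
        System.out.println("Random archive:   " + archiveSize(random));
    }
}
```

Real documents fall somewhere between these two extremes, so the numbers printed here should be read as bounds rather than typical ratios.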

While you might think it obvious that everything should be compressed, it does take time to compress and uncompress a file. If compressing is too costly at the point of creation, it may sometimes be better to just store the data of the whole file in a STORED format, which just stores the raw bytes. The same can be said of the time cost of uncompression. Of course, uncompressed files are larger, and you have to pay the cost with either higher disk space usage or more bandwidth when transferring the file. Keep in mind that you need to change the setting for each entry, not the ZipFile as a whole. However, it is more typical to compress or not compress a whole ZipFile, as opposed to using different settings for each entry.

There is one key thing you need to know if you use the STORED constant for the compression method: you must explicitly set certain attributes of the ZipEntry which are automatically set when the entry is compressed. These are the size, compressed size, and the checksum of the entry's input stream. Assuming an input file, the size and compressed size can just be the file size. To compute the checksum, use the CRC32 class in the java.util.zip package. You cannot just pass in 0 or -1 to ignore the checksum value; the CRC value will be used to validate your input when creating the ZIP and when reading from it later.

ZipEntry entry = new ZipEntry(name);
entry.setMethod(ZipEntry.STORED);
entry.setCompressedSize(file.length());
entry.setSize(file.length());
CRC32 crc = new CRC32();
crc.update(<< all the bytes for entry >>); 
entry.setCrc(crc.getValue());
zos.putNextEntry(entry);
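When the bytes are already in memory, the same steps collapse into a few lines. The following sketch assumes a byte array as the source; StoredEntryDemo and its two methods are hypothetical names. The second method shows what happens when the required attributes are left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

class StoredEntryDemo {
    // Builds a one-entry archive whose data is stored without compression.
    static byte[] storeOnly(String name, byte[] data) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(data);

        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(data.length);
        entry.setCompressedSize(data.length); // same as size: nothing is compressed
        entry.setCrc(crc.getValue());

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bytes);
        zos.putNextEntry(entry);
        zos.write(data);
        zos.closeEntry();
        zos.close();
        return bytes.toByteArray();
    }

    // Returns true if a STORED entry without size and CRC is rejected.
    static boolean rejectsMissingAttributes() throws IOException {
        ZipEntry entry = new ZipEntry("incomplete.bin");
        entry.setMethod(ZipEntry.STORED);
        ZipOutputStream zos = new ZipOutputStream(new ByteArrayOutputStream());
        try {
            zos.putNextEntry(entry);
            return false;
        } catch (ZipException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = storeOnly("raw.txt", "raw bytes".getBytes("UTF-8"));
        System.out.println("Archive size in bytes: " + zip.length);
        System.out.println("Rejected without attributes: " + rejectsMissingAttributes());
    }
}
```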

To demonstrate, the following program will combine a series of files using the STORED compression method. The first argument to the program will be the ZIP file to create. Remaining arguments represent the files to add. If the ZIP file to create already exists, the program will exit without modifying the file. If you add a non-existing file to the ZIP file, the program will skip the non-existing file, adding any remaining command line arguments to the created ZIP.

import java.util.zip.*;
import java.io.*;
public class ZipIt {
    public static void main(String args[]) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java ZipIt Zip.zip file1 file2 file3");
            System.exit(-1);
        }
        File zipFile = new File(args[0]);
        if (zipFile.exists()) {
            System.err.println("Zip file already exists, please try another");
            System.exit(-2);
        }
        FileOutputStream fos = new FileOutputStream(zipFile);
        ZipOutputStream zos = new ZipOutputStream(fos);
        int bytesRead;
        byte[] buffer = new byte[1024];
        CRC32 crc = new CRC32();
        for (int i=1, n=args.length; i < n; i++) {
            String name = args[i];
            File file = new File(name);
            if (!file.exists()) {
                System.err.println("Skipping: " + name);
                continue;
            }
            BufferedInputStream bis = new BufferedInputStream(
                new FileInputStream(file));
            crc.reset();
            while ((bytesRead = bis.read(buffer)) != -1) {
                crc.update(buffer, 0, bytesRead);
            }
            bis.close();
            // Reset to beginning of input stream
            bis = new BufferedInputStream(
                new FileInputStream(file));
            ZipEntry entry = new ZipEntry(name);
            entry.setMethod(ZipEntry.STORED);
            entry.setCompressedSize(file.length());
            entry.setSize(file.length());
            entry.setCrc(crc.getValue());
            zos.putNextEntry(entry);
            while ((bytesRead = bis.read(buffer)) != -1) {
                zos.write(buffer, 0, bytesRead);
            }
            bis.close();
        }
        zos.close();
    }
}

For more information on JAR files, including how to seal and version them, be sure to visit the Packing Programs in JAR Files lesson in The Java Tutorial.

Using printf with Custom Classes

J2SE 5.0 added the ability to format output using formatting strings like "%5.2f%n" to print a floating point number and a newline. An October 2004 tip titled Formatting Output with the New Formatter described this.

The Formattable interface is an important feature but wasn't part of the earlier tip. This interface is in the java.util package. When your class implements the Formattable interface, the Formatter class can use it to customize output formatting. You are no longer limited to what is printed by toString() for your class. By implementing the formatTo() method of Formattable, you can have your custom classes limit their output to a set width or precision, left or right justify the content, and even offer different output for different locales, just like the support for the predefined system data types.

The single formatTo() method of Formattable takes four arguments:

public void formatTo(Formatter formatter, int flags, int width, int precision)

The formatter argument represents the Formatter from which to get the locale and to which to send the output when done.

The flags parameter is a bitmask of the FormattableFlags set. The user can have a - flag to specify left justified (LEFT_JUSTIFY), an uppercase conversion character (%S instead of %s) for locale-sensitive uppercase (UPPERCASE), and # for using the alternate (ALTERNATE) formatting.

A width parameter represents the minimum output width, using spaces to fill the output if the displayed value is too short. The width value -1 means no minimum. If output is too short, output will be left justified if the flag is set. Otherwise, it is right justified.

A precision parameter specifies the maximum number of characters to output. If the output string is "1234567890" with a precision of 5 and a width of 10, the first five characters will be displayed, with the remaining five positions filled with spaces, defining a string of width 10. Having a precision of -1 means there is no limit.

A width or precision of -1 means no value was specified in the formatting string for that setting.
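To see which values actually reach formatTo(), it can help to write a throwaway class that simply reports its arguments. The sketch below does that; FormatProbe is a made-up name, and the class ignores the width and precision it receives rather than honoring them.

```java
import java.util.Formattable;
import java.util.FormattableFlags;
import java.util.Formatter;

class FormatProbe implements Formattable {
    // Reports the decoded flags, width, and precision instead of a real value.
    public void formatTo(Formatter formatter, int flags,
            int width, int precision) {
        boolean left = (flags & FormattableFlags.LEFT_JUSTIFY) != 0;
        boolean upper = (flags & FormattableFlags.UPPERCASE) != 0;
        boolean alt = (flags & FormattableFlags.ALTERNATE) != 0;
        formatter.format("left=%b upper=%b alt=%b width=%d precision=%d",
            left, upper, alt, width, precision);
    }

    public static void main(String[] args) {
        FormatProbe probe = new FormatProbe();
        // left=false upper=false alt=false width=-1 precision=-1
        System.out.println(String.format("%s", probe));
        // left=true upper=false alt=false width=10 precision=3
        System.out.println(String.format("%-10.3s", probe));
        // left=false upper=true alt=true width=-1 precision=-1
        System.out.println(String.format("%#S", probe));
    }
}
```

Note that the Formatter does no padding, truncation, or uppercasing on behalf of a Formattable; the flags are only passed along, and honoring them is up to the class.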

When creating a class to be used with printf and Formatter, you never call the formatTo() method yourself. Instead, you just implement the interface. Then, when your class is used with printf, the Formatter will call formatTo() for your class to find out how to display its value. To demonstrate, let us create a class that has both a short and a long name and that implements Formattable. Here's what the start of the class definition looks like. The class has only two properties, an empty implementation of Formattable, and its toString() method.

import java.util.Formattable;
import java.util.Formatter;
public class SomeObject implements Formattable {
    private String shortName;
    private String longName;
    public SomeObject(String shortName, String longName) {
        this.shortName = shortName;
        this.longName = longName;
    }
    public String getShortName() {
        return shortName;
    }
    public void setShortName(String shortName) {
        this.shortName = shortName;
    }
    public String getLongName() {
        return longName;
    }
    public void setLongName(String longName) {
        this.longName = longName;
    }
    public void formatTo(Formatter formatter, int flags,
        int width, int precision) {
    }
    public String toString() {
        return longName + " [" + shortName + "]";
    }
}

As it is now, printing the object with println() will display the long name, followed by the short name within square brackets as defined in the toString() method. Using the Formattable interface, you can improve the output. A better output will use the current property values and formattable flags. For this example, formatTo() will support the ALTERNATE and LEFT_JUSTIFY flags of FormattableFlags.

The first thing to do in formatTo() is to find out what to output. For SomeObject, the long name will be the default to display, and the short name will be used if the precision is less than 7 or if the ALTERNATE flag is set. Checking whether the ALTERNATE flag is set requires a typical bitwise flag check. Be careful with the -1 value for precision because that value means no limit, so test for the range 0 through 6 rather than simply for a value less than 7. Then, pick the starting string based upon the settings.

String name = longName;
boolean alternate = 
    (flags & FormattableFlags.ALTERNATE) == FormattableFlags.ALTERNATE;
alternate |= (precision >= 0 && precision < 7);
String out = (alternate ? shortName : name);

Once you have the starting string, you get to shorten it down if necessary, based on the precision passed in. If the precision is unlimited or the string fits, just use that for the output. If it doesn't fit, then you need to trim it down. Typically, if something doesn't fit, the last character is replaced by a *, which is done here.

StringBuilder sb = new StringBuilder();
if (precision == -1 || out.length() <= precision) {
    sb.append(out);
} else if (precision > 0) {
    // A precision of 0 leaves no room even for the '*' marker
    sb.append(out.substring(0, precision - 1)).append('*');
}

To demonstrate how to access the locale setting, the example here will reverse the output string for Chinese. More typically a translated starting string will be used based on the locale. For numeric output, the locale defines how decimals and commas appear within numbers.

if (formatter.locale().equals(Locale.CHINESE)) {
    sb.reverse();
}
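As a small illustration of the numeric case, the sketch below formats the same value under two locales using the standard %,.2f conversion. The LocaleNumbers class and its grouped method are hypothetical names used only here.

```java
import java.util.Locale;

class LocaleNumbers {
    // Formats value with grouping separators and two decimal places.
    static String grouped(Locale locale, double value) {
        return String.format(locale, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(grouped(Locale.US, 1234.5));      // 1,234.50
        System.out.println(grouped(Locale.GERMANY, 1234.5)); // 1.234,50
    }
}
```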

Now that the output string is within a StringBuilder buffer, you can fill up the output buffer based upon the desired width and justification setting. For each position still available within the desired width, add a space to the beginning or end based upon the justification formattable flag.

int len = sb.length();
if (len < width) {
    boolean leftJustified = (flags & FormattableFlags.LEFT_JUSTIFY) 
        == FormattableFlags.LEFT_JUSTIFY;
    for (int i = 0; i < width - len; i++) {
        if (leftJustified) {
            sb.append(' ');
        } else {
            sb.insert(0, ' ');
        }
    }
}

The last thing to do is to send the output buffer to the Formatter. That's done by sending the whole String to the format() method of formatter:

formatter.format(sb.toString());

Add in some test cases, and that gives you the whole class definition, shown here:

import java.util.Formattable;
import java.util.FormattableFlags;
import java.util.Formatter;
import java.util.Locale;
public class SomeObject implements Formattable {
    private String shortName;
    private String longName;
    public SomeObject(String shortName, String longName) {
        this.shortName = shortName;
        this.longName = longName;
    }
    public String getShortName() {
        return shortName;
    }
    public void setShortName(String shortName) {
        this.shortName = shortName;
    }
    public String getLongName() {
        return longName;
    }
    public void setLongName(String longName) {
        this.longName = longName;
    }
    public void formatTo(Formatter formatter, int flags,
            int width, int precision) {
        StringBuilder sb = new StringBuilder();
        String name = longName;
        boolean alternate = (flags & FormattableFlags.ALTERNATE)
            == FormattableFlags.ALTERNATE;
        alternate |= (precision >= 0 && precision < 7);
        String out = (alternate ? shortName : name);
        // Setup output string length based on precision
        if (precision == -1 || out.length() <= precision) {
            sb.append(out);
        } else if (precision > 0) {
            // A precision of 0 leaves no room even for the '*' marker
            sb.append(out.substring(0, precision - 1)).append('*');
        }
        if (formatter.locale().equals(Locale.CHINESE)) {
            sb.reverse();
        }
        // Setup output justification
        int len = sb.length();
        if (len < width) {
            boolean leftJustified =
                    (flags & FormattableFlags.LEFT_JUSTIFY) ==
                    FormattableFlags.LEFT_JUSTIFY;
            for (int i = 0; i < width - len; i++) {
                if (leftJustified) {
                    sb.append(' ');
                } else {
                    sb.insert(0, ' ');
                }
            }
        }
        formatter.format(sb.toString());
    }
    public String toString() {
        return longName + " [" + shortName + "]";
    }
    public static void main(String args[]) {
        SomeObject obj = new SomeObject("Short", "Somewhat longer name");
        System.out.printf(">%s<%n", obj);
        System.out.println(obj); // Regular obj.toString() call
        System.out.printf(">%#s<%n", obj);
        System.out.printf(">%.5s<%n", obj);
        System.out.printf(">%.8s<%n", obj);
        System.out.printf(">%-25s<%n", obj);
        System.out.printf(">%15.10s<%n", obj);
        System.out.printf(Locale.CHINESE, ">%15.10s<%n", obj);
    }
}

Running this program produces the following output:

>Somewhat longer name<
Somewhat longer name [Short]
>Short<
>Short<
>Somewha*<
>Somewhat longer name     <
>     Somewhat *<
>     * tahwemoS<

The test program creates a SomeObject with a short name of "Short" and a long name of "Somewhat longer name". The first line here prints out the object's long name with the %s setting. The second outputs the object via the more typical toString(). The third line uses the alternate form. The next line doesn't explicitly ask for the alternate short form, but because the precision is so small, displays it anyway. Next, a precision is specified that is long enough to not use the alternate format, but too short to display the whole long name. Thus, a "*" shows more characters are available. Next the longer name is displayed left justified. The final two show what happens when the width is wider than the precision, with one also showing the reversed "Chinese" version of the string.

That really is all there is to making your own classes work with printf. Whenever you want to display them, be sure to use a properly configured %s setting within the formatting string.

If you still have questions about using printf, be sure to visit the earlier tip mentioned at the start of this tip, titled Formatting Output with the New Formatter.

For more information on the Formattable interface, see the documentation for the interface.
