Take the sting out of SAX

Generate SAX parsers with XML Schemas

A Simple API for XML (SAX) parser is an invaluable tool for parsing XML, especially when the input files are too large to fit in main memory. A SAX parser also helps when you have a slow input stream, such as an Internet connection, and need to process bytes as soon as they arrive instead of waiting for the complete input. As a bonus, a well-designed SAX parser is generally faster than processing a DOM (Document Object Model) tree: it needs only one pass over the XML data, whereas the DOM approach needs two (one to build the tree and one to process it).

Unfortunately, a SAX parser can be difficult to develop because of its event-driven nature. In this article, I create a source code generator that will help you easily develop a SAX parser.

Note: I don't explain SAX in detail here; see Resources below for some excellent references.

SAX reviewed

SAX is a standard API that parses an XML input stream, like a file or network connection, and triggers events in an event-handler class. Many different SAX parser implementations are available for Java. In my examples here, I use Xerces from the Apache XML Project, one of the most popular parser implementations.
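To make the event flow concrete, here is a minimal sketch of a handler that simply records each event the parser triggers. It assumes the parser bundled with the JDK, obtained through the standard JAXP `SAXParserFactory`, rather than a separately installed Xerces; the class name `EventTrace` and its `trace` helper are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class (not from the article): records every SAX event as a short string.
class EventTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip whitespace-only text such as indentation between tags.
        String s = new String(ch, start, length).trim();
        if (!s.isEmpty()) {
            events.add("text:" + s);
        }
    }

    // Parses the given XML string and returns the recorded events.
    // Checked parser exceptions are wrapped so callers need no throws clause.
    static List<String> trace(String xml) {
        try {
            EventTrace handler = new EventTrace();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        trace("<name><first>John</first><last>Dole</last></name>")
                .forEach(System.out::println);
    }
}
```

Running `main` should list the start, text, and end events in document order. Note that the handler never sees the document as a whole; it sees only one event at a time.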

Listings 1 and 2 below show an XML file and a SAX event handler, respectively. (You can download all source code and examples for this article from Resources.)

Listing 1. Example XML

<company name="My Widgets Inc.">
  <employees>
    <employee>
      <name>
        <first>John</first>
        <last>Dole</last>
      </name>
      <office>1-50</office>
      <telephone>123456</telephone>
    </employee>
    <employee>
      <name>
        <first>Jane</first>
        <last>Dole</last>
      </name>
      <office>1-51</office>
      <telephone>123457</telephone>
    </employee>
  </employees>
</company>


Listing 2. SAX handler

    public void startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, Attributes attributes) throws SAXException
    {
        text.reset();
        
        if (qName.equals ("company"))
        {
            String name = attributes.getValue("name");
            String header = "Employee Listing For "+name;
            System.out.println (header);
            System.out.println ();
        }
        
    }
    public void endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName) throws SAXException
    {
        if (qName.equals ("first"))
        {
            firstName = getText();
        }
        if (qName.equals ("last"))
        {
            lastName = getText();
        }
        
        if (qName.equals ("office"))
        {
            office = getText();
        }
        
        if (qName.equals ("telephone"))
        {
            telephone = getText ();
        }
        
        if (qName.equals ("employee"))
        {
            System.out.println (office + "\t" + firstName + "\t" + lastName + "\t" + telephone);
        }
        
    }


The SAX handler above simply prints the XML file's data to standard output: a header line containing the company name, followed by one tab-delimited line per employee.
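Listing 2 shows only the two element callbacks. The following sketch fills in the pieces it relies on so that the handler can run on its own. Several details here are my assumptions rather than the article's downloadable source: the `text` buffer is taken to be a `java.io.CharArrayWriter` (because the listing calls `text.reset()`), `getText()` is taken to return the trimmed buffer contents, the class name `EmployeeListingHandler` and the `render` helper are hypothetical, and output is collected in a string instead of being printed directly so that it is easy to inspect.

```java
import java.io.CharArrayWriter;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical completion of Listing 2; fields and helpers are assumptions.
class EmployeeListingHandler extends DefaultHandler {
    // Assumed type: CharArrayWriter offers the reset() call used in Listing 2.
    private final CharArrayWriter text = new CharArrayWriter();
    private final StringBuilder report = new StringBuilder();
    private String firstName;
    private String lastName;
    private String office;
    private String telephone;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.reset(); // discard any text collected before this start tag
        if (qName.equals("company")) {
            report.append("Employee Listing For ")
                  .append(attributes.getValue("name"))
                  .append("\n\n");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one text run in several chunks, so accumulate.
        text.write(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "first":     firstName = getText(); break;
            case "last":      lastName = getText();  break;
            case "office":    office = getText();    break;
            case "telephone": telephone = getText(); break;
            case "employee":
                report.append(office).append('\t').append(firstName).append('\t')
                      .append(lastName).append('\t').append(telephone).append('\n');
                break;
            default:
                break;
        }
    }

    // Assumed helper: the text collected since the most recent start tag.
    private String getText() {
        return text.toString().trim();
    }

    // Parses the XML string and returns the report; wraps checked exceptions.
    static String render(String xml) {
        try {
            EmployeeListingHandler handler = new EmployeeListingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.report.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<company name=\"My Widgets Inc.\"><employees><employee>"
                + "<name><first>John</first><last>Dole</last></name>"
                + "<office>1-50</office><telephone>123456</telephone>"
                + "</employee></employees></company>";
        System.out.print(render(xml));
    }
}
```

The `characters()` callback is the part most easily overlooked: without it, nothing ever fills the buffer that `getText()` reads.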

As you can see from Listing 2, parsing even a simple XML file can produce a significant amount of source code. SAX's event-driven (as opposed to document-driven) nature also makes the source code difficult to maintain and debug because you must be constantly aware of the parser's state when writing SAX code. Writing a SAX parser for complex document definitions can prove even more demanding; see Resources for challenging real-life examples.
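One common way to manage that state is to keep a stack of the currently open elements, so that a callback can ask where in the document it is. The sketch below is an illustration under stated assumptions, not the approach this article develops: the class `PathTrackingHandler` is hypothetical, and so is the sample document, in which a `<name>` element appears under both `<company>` and `<employee>`. Only the employee names are collected.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: keeps a stack of open elements as explicit parser state.
class PathTrackingHandler extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    final List<String> employeeNames = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        open.push(qName);
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop(); // the stack top is now the parent of the element that just closed
        if (qName.equals("name") && "employee".equals(open.peek())) {
            employeeNames.add(text.toString().trim());
        }
    }

    // Parses the XML string and returns the names found directly under <employee>.
    static List<String> employeeNames(String xml) {
        try {
            PathTrackingHandler handler = new PathTrackingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.employeeNames;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document: <name> occurs in two different contexts.
        String xml = "<company><name>My Widgets Inc.</name>"
                + "<employee><name>John Dole</name></employee></company>";
        System.out.println(employeeNames(xml)); // prints [John Dole]
    }
}
```

Checking `qName` alone would have collected the company name as well. Every such context check is extra state the handler's author has to track by hand, and that burden grows with the complexity of the document definition.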

Resources
  • "Programming XML in Java," Mark Johnson (JavaWorld):