Recommended: Sing it, brah! 5 fabulous songs for developers
JW's Top 5
Optimize with a SATA RAID Storage Solution
Range of capacities as low as $1250 per TB. Ideal if you currently rely on servers/disks/JBODs
TEXTBOX:
TEXTBOX_HEAD: Mapping XML to Java: Read the whole series!
:END_TEXTBOX
However, using XML to build systems poses two challenges. First, while generating XML is a straightforward procedure, the inverse operation, using XML data from within a program, is not. Second, current XML technologies are easy to misapply, which can leave a programmer with a slow, memory-hungry system. Indeed, heavy memory requirements and slow speeds can prove problematic for systems that use XML as their primary data exchange format.
Some standard tools currently available for working with XML are better than others. The SAX API in particular has some important runtime features for performance-sensitive code. In this article, we will develop some patterns for applying the SAX API. You will be able to create fast XML-to-Java mapping code with a minimum memory footprint, even for fairly complex XML structures (with the exception of recursive structures).
In Part 2 of this series, we will cover applying the SAX API to recursive XML structures in which some of the XML elements represent lists of lists. We will also develop a class library that manages the navigational aspects of the SAX API. This library simplifies writing XML mapping code based on SAX.
Writing programs that use XML data is like writing a compiler. That is, most compilers convert source code into a runnable program in three steps. First, a lexer module groups characters into words or tokens that the compiler recognizes -- a process known as tokenizing. A second module, called the parser, analyzes groups of tokens in order to recognize legal language constructs. Last, a third module, the code generator, takes a set of legal language constructs and generates executable code. Sometimes, parsing and code generation are intermixed.
To use XML data in a Java program, we must undergo a similar process. First, we analyze every character in the XML text in order to recognize legal XML tokens such as start tags, attributes, end tags, and CDATA sections.
Second, we verify that the tokens form legal XML constructs. If an XML document consists entirely of legal constructs per the XML 1.0 specification, it is well-formed. At the most basic level, we need to make sure that, for instance, all of the tagging has matching opening and closing tags, and the attributes are properly structured in the opening tag.
Also, if a DTD is available, we have the option to make sure that the XML constructs found during parsing are legal in terms of the DTD, as well as being well-formed XML.