Whether you use DOM or SAX, mapping XML data into Java involves two activities: navigation and data collection. The two APIs differ in how they handle them: DOM separates navigation from data collection, while SAX merges the two.
Most of DOM's performance weakness stems from that separation. In the DOM programming model, keeping navigation apart from data collection seems natural, even required, but it is not, in fact, a runtime requirement. SAX pierces that illusion by merging navigation and data collection at runtime, at the cost of making its programming model less obvious.
With DOM, you first build the in-memory DOM tree and then navigate it to find the node that interests you. Once you've found the correct node, you collect its data. Navigation and data collection are thus conceptually separate steps. Unfortunately, as previously mentioned, building and holding the entire tree in memory carries a significant performance cost.
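To make the two DOM steps concrete, here is a minimal sketch using the standard JAXP and W3C DOM classes. The class name DomNavigateThenCollect and the element names customers, customer, and name are hypothetical, chosen only for illustration; they do not come from this series.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: returns the name of the first <customer>.
class DomNavigateThenCollect {

    static String firstCustomerName(String xml) {
        try {
            // Build the whole document tree in memory before doing anything else.
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Step 1, navigation: walk from the root to the node of interest.
            Element root = doc.getDocumentElement();
            NodeList customers = root.getElementsByTagName("customer");
            Element first = (Element) customers.item(0);
            if (first == null) {
                return null; // no <customer> element at all
            }
            Element name = (Element) first.getElementsByTagName("name").item(0);
            if (name == null) {
                return null; // the first customer has no <name>
            }

            // Step 2, data collection: read the value from the node we found.
            return name.getTextContent();
        } catch (Exception e) {
            // Wrapped so callers need not handle the checked parser exceptions.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Note that the entire tree exists before the first navigation call runs, even though only one text value is ever read from it.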
With SAX, it's more of a juggling game. You listen to SAX events to keep track of where you are -- a different form of navigation. When the SAX events have positioned you in just the right place, you collect data. One of the reasons that SAX hasn't dominated the XML APIs is that the navigational aspect of its programming model is not as intuitive as it is with DOM.
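The juggling might look something like the following sketch, which uses the standard SAX DefaultHandler. The handler class CustomerNameHandler, its flags, and the customer and name element names are assumptions made up for this illustration. Boolean flags record where the parser currently is, and data is collected only when the flags say the position is right.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <name> inside a <customer>.
class CustomerNameHandler extends DefaultHandler {
    final List<String> names = new ArrayList<>();

    // Navigation state: flags that track where the parser currently is.
    private boolean inCustomer;
    private boolean inName;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (qName.equals("customer")) {
            inCustomer = true;
        } else if (inCustomer && qName.equals("name")) {
            inName = true;
            text.setLength(0); // start a fresh buffer for this name
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Data collection happens only when navigation says we are in place.
        if (inName) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (inName && qName.equals("name")) {
            names.add(text.toString());
            inName = false;
        } else if (qName.equals("customer")) {
            inCustomer = false;
        }
    }

    static List<String> parse(String xml) {
        try {
            CustomerNameHandler handler = new CustomerNameHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

The navigation logic (the flags) and the collection logic (the buffer) are tangled together in the same callbacks, which is the part of the SAX model that tends to feel less intuitive than DOM. The flags also ignore a name element that appears outside any customer, a simple case of the ambiguous tag names that need special care.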
So wouldn't it be really cool if we could get the navigational and data-collection aspects of SAX into separate corners but keep the runtime performance advantages? Well, pay attention, because that's exactly what we will do. Nothing prevents us from separating navigation and data collection in the programming model during development while leaving them intermixed at runtime.
In Part 1, I went through some basic applications of SAX. I also mentioned that there were some situations that needed special attention, such as recursive data structures. To create a class library that separates out the navigational aspects of SAX in the programming model, we will need a general-purpose approach to dealing with navigation. That approach will have to deal with all the special cases, including ambiguous tag names and recursive data structures.
So, how do we do that?
The key to navigation in SAX is to keep track of where you are at all times during parsing. The most complicated case is keeping track of your position while receiving SAX events for a recursive data structure in the XML document. The conventional programming approach to processing recursive structures -- sometimes called walking the tree -- is to use either a stack data structure or recursive function calls. Unfortunately, we can't use recursive function calls in SAX because we have to return control to the XML parser after processing each SAX event. But we can use a stack data structure to keep track of SAX events.
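As a rough illustration of the stack idea, and not the class library this series goes on to build, the sketch below tracks position inside a recursive structure. The folder element, its name attribute, and the FolderPathHandler class are all hypothetical. Each start event pushes onto the stack, each end event pops, and the stack contents always describe the current position.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the full path of every nested <folder>.
class FolderPathHandler extends DefaultHandler {
    // One entry per <folder> that has been opened but not yet closed.
    private final Deque<String> openFolders = new ArrayDeque<>();
    final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (qName.equals("folder")) {
            String name = atts.getValue("name");
            // ArrayDeque rejects null, so substitute a placeholder.
            openFolders.addLast(name != null ? name : "(unnamed)");
            // The stack, read from bottom to top, is the current location.
            paths.add(String.join("/", openFolders));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("folder")) {
            openFolders.removeLast(); // leaving this folder: step back up
        }
    }

    static List<String> parse(String xml) {
        try {
            FolderPathHandler handler = new FolderPathHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.paths;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

For the input `<folder name="root"><folder name="a"><folder name="b"/></folder><folder name="c"/></folder>`, the handler records root, root/a, root/a/b, and root/c. Control returns to the parser after every event, yet the stack preserves exactly the state that recursive function calls would otherwise have kept on the call stack.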