Mapping XML to Java, Part 2

Create a class library that uses the SAX API to map XML documents to Java objects

As I mentioned in Part 1, one of the big problems programmers face when using the SAX API is conceptual. I'd like to address that issue again as the foundation for the reusable class library that we will develop in this article.

Regardless of whether you use DOM or SAX, mapping XML data into Java involves two activities: navigation and data collection. DOM and SAX differ in how they handle them. DOM separates navigation from data collection, while SAX merges the two.

Much of DOM's performance weakness stems from that separation. In the DOM programming model, separating navigation from data collection seems natural and even required, but it is not, in fact, a runtime requirement. SAX pierces that illusion by merging navigation and data collection at runtime, at the cost of a less obvious programming model.

Using DOM, you first create an in-memory DOM tree and then navigate it to find the node that interests you. Once you've found the correct node, you collect data. Navigation and data collection are two conceptually separate steps. Unfortunately, as previously mentioned, the in-memory DOM tree carries a significant performance cost.
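To make the two DOM steps concrete, here is a minimal sketch using the standard JAXP DOM API. The sample document, its element names (`customers`, `customer`, `name`), and the class name `DomCustomerNames` are assumptions made up for illustration; they do not come from this series.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomCustomerNames {

    // Hypothetical sample document; note the extra top-level <name>.
    static final String SAMPLE =
        "<customers>"
      + "<name>Acme customer list</name>"
      + "<customer><name>Ann</name></customer>"
      + "<customer><name>Bob</name></customer>"
      + "</customers>";

    static List<String> collectNames(String xml) {
        try {
            // The whole tree is built in memory before we look at any data.
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String> names = new ArrayList<>();

            // Step 1: navigation. Find the nodes that interest us.
            NodeList customers = doc.getElementsByTagName("customer");
            for (int i = 0; i < customers.getLength(); i++) {
                Element customer = (Element) customers.item(i);
                NodeList nameNodes = customer.getElementsByTagName("name");
                if (nameNodes.getLength() > 0) {
                    // Step 2: data collection from the node we navigated to.
                    names.add(nameNodes.item(0).getTextContent());
                }
            }
            return names;
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(collectNames(SAMPLE));
    }
}
```

Because the code navigates to each `customer` before asking for its `name`, the top-level `name` element is never collected. The price is that the entire document sits in memory first.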

With SAX, it's more of a juggling game. You listen to SAX events to keep track of where you are -- a different form of navigation. When the SAX events have positioned you in just the right place, you collect data. One of the reasons that SAX hasn't dominated the XML APIs is that the navigational aspect of its programming model is not as intuitive as it is with DOM.
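The juggling can be seen in a small handler that reads a hypothetical customer list. This is only a sketch under assumed names (`SaxCustomerNames`, the `customers`/`customer`/`name` elements); the SAX and JAXP calls themselves are the standard ones. Notice how the boolean flags that track position are tangled up with the code that collects text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCustomerNames extends DefaultHandler {

    // Hypothetical sample document; note the extra top-level <name>.
    static final String SAMPLE =
        "<customers>"
      + "<name>Acme customer list</name>"
      + "<customer><name>Ann</name></customer>"
      + "<customer><name>Bob</name></customer>"
      + "</customers>";

    private final List<String> names = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inCustomer; // navigation state
    private boolean inName;     // navigation state

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if (qName.equals("customer")) {
            inCustomer = true;
        } else if (inCustomer && qName.equals("name")) {
            inName = true;
            text.setLength(0); // start collecting a fresh value
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Collect only when the events have positioned us in the right place.
        if (inName) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (inName && qName.equals("name")) {
            names.add(text.toString());
            inName = false;
        } else if (qName.equals("customer")) {
            inCustomer = false;
        }
    }

    static List<String> collectNames(String xml) {
        try {
            SaxCustomerNames handler = new SaxCustomerNames();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `SaxCustomerNames.collectNames(SaxCustomerNames.SAMPLE)` yields the two customer names without ever building a tree, but every new element of interest means another flag and another branch in all three callbacks.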

So wouldn't it be really cool if we could get the navigational and data-collection aspects of SAX into separate corners but keep the runtime performance advantages? Well, pay attention, because that's exactly what we will do. Nothing stops us from separating navigation and data collection in the programming model during development while leaving them intermixed at runtime.

You are here

In Part 1, I went through some basic applications of SAX. I also mentioned that there were some situations that needed special attention, such as recursive data structures. To create a class library that separates out the navigational aspects of SAX in the programming model, we will need a general-purpose approach to dealing with navigation. That approach will have to deal with all the special cases, including ambiguous tag names and recursive data structures.

So, how do we do that?

The key to navigation in SAX is to keep track, at all times, of where you are during parsing. The most complicated navigational case is keeping track of where you are while receiving SAX events for recursive data structures in an XML document. The conventional programming approach to recursive structures -- sometimes called walking the tree -- is to use either a stack data structure or recursive function calls. Unfortunately, we can't use recursive function calls in SAX because we have to return control to the XML parser after processing each SAX event. But we can use a stack data structure to keep track of where the SAX events have taken us.
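As a rough illustration of the stack idea, and not the class library itself, the sketch below walks a recursive `folder` structure. The document layout, the `name` attribute, and the class name `FolderPathHandler` are all assumptions chosen for the example. Each start tag pushes onto the stack and each end tag pops, so the stack always describes the current position no matter how deep the nesting goes.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FolderPathHandler extends DefaultHandler {

    // Hypothetical recursive document: folders may contain folders.
    static final String SAMPLE =
        "<folder name=\"root\"><file>a.txt</file>"
      + "<folder name=\"sub\"><file>b.txt</file>"
      + "<folder name=\"deep\"><file>c.txt</file></folder>"
      + "</folder></folder>";

    // Names of the folders currently open, outermost first.
    private final Deque<String> openFolders = new ArrayDeque<>();
    private final List<String> paths = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inFile;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if (qName.equals("folder")) {
            openFolders.addLast(atts.getValue("name")); // push
        } else if (qName.equals("file")) {
            inFile = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inFile) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("file")) {
            // The stack tells us which folders enclose this file.
            paths.add(String.join("/", openFolders) + "/" + text);
            inFile = false;
        } else if (qName.equals("folder")) {
            openFolders.removeLast(); // pop; control returns to the parser
        }
    }

    static List<String> collectPaths(String xml) {
        try {
            FolderPathHandler handler = new FolderPathHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.paths;
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }
}
```

For the sample document, `collectPaths` returns `root/a.txt`, `root/sub/b.txt`, and `root/sub/deep/c.txt`. The handler never recurses; the explicit stack does the bookkeeping that recursive calls would otherwise do.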
