Cut, paste, split, and assemble XML documents with VTD-XML

VTD-XML eliminates the performance overhead associated with updating XML

My last JavaWorld article, "Simplify XML Processing with VTD-XML," looked at three important benefits of VTD-XML: performance, memory usage, and ease of use. VTD-XML makes XML applications not only easier to write, but also leaner and faster. Compared with the same applications written with the Document Object Model (DOM), XML applications written with VTD-XML are 10 times more responsive and are capable of serving 10 times the workload, while maintaining the same quality of service for proportionally bigger XML messages.

However, parsing is often only part of what needs to be done in an XML-based application; many applications change the XML data as well. Consider the following three use-cases involving XML content updates:

  1. An address-book application stores its data internally in XML format, and the user wants to update a contact's information (e.g., a phone number)
  2. An XML-enabled content switch inspects incoming and outgoing SOAP messages and selectively turns on the values of mustUnderstand attributes in the SOAP header
  3. An SOA (service-oriented architecture) security application canonicalizes a subset of the XML data for subsequent signing and encryption, then inserts the digest and ciphered text back into the XML payload

For those use-cases, the ability to update XML content efficiently contributes significantly to overall application performance. With VTD-XML's incremental update feature, the performance overhead normally associated with updating XML through DOM and SAX is eliminated, as I will illustrate in the example below.

The double whammy of updating XML with DOM or SAX

Unfortunately, DOM and SAX (Simple API for XML) tax application performance twice when applying changes to XML: first with parsing, which is notoriously slow and CPU intensive, and then, even worse, with the reserialization needed to generate the updated XML. Consider changing the text content of the following XML snippet from "red" to "blue."

 <color> red </color>  


Using DOM, you would have to go through the following three steps:

  1. Build the DOM tree
  2. Navigate to and then update the text node
  3. Write the updated structure back into XML
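The three steps above can be sketched with the JDK's built-in DOM and transformer APIs. This is a minimal illustration rather than the author's code: the class name `DomColorUpdate` and the helper `replaceRootText` are hypothetical, and the sketch assumes the text to change is the sole content of the root element, as in the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class DomColorUpdate {
    // Hypothetical helper: replaces the text content of the root element.
    static String replaceRootText(String xml, String newText) {
        try {
            // Step 1: build the DOM tree (parses the whole document).
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // Step 2: navigate to the node and update its text.
            doc.getDocumentElement().setTextContent(newText);

            // Step 3: write the whole structure back out as XML.
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so that callers need not declare checked exceptions.
            throw new IllegalStateException("DOM update failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(replaceRootText("<color> red </color>", " blue "));
    }
}
```

Even for this one-word change, every node is parsed into an object and then serialized again, which is the double cost described above.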

SAX and Pull are not even worth mentioning here, since neither lets you navigate the tree structure. As the XML file grows, writing out the updated XML (effectively, many string concatenations, buffer allocations, and encoding conversions) further degrades overall application performance, which is already constrained by slow parsing.

Notice that a human using a text editor can do the same task far more efficiently. To edit the XML snippet in the previous example, just open the file in a simple text editor such as Notepad, move the cursor to the start of the text node, replace "red" with "blue," and you're done! This time the update is "incremental," meaning it does not touch irrelevant parts of the document. And if we humans can edit XML like this, why can't XML parsers?
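The text-editor approach amounts to a byte-level splice: copy everything before the token, write the new content, then copy everything after it. The sketch below shows that idea with plain JDK classes only. It is not VTD-XML's API; the class `IncrementalSplice` and the helper `splice` are hypothetical names, and the token's offset is found with a simple string search that is valid only because this sample is pure ASCII. A real parser would supply the offset and length instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class IncrementalSplice {
    // Hypothetical helper: replaces doc[offset, offset + length) with
    // `replacement` and copies every other byte through unchanged.
    static byte[] splice(byte[] doc, int offset, int length, byte[] replacement) {
        ByteArrayOutputStream out =
            new ByteArrayOutputStream(doc.length - length + replacement.length);
        out.write(doc, 0, offset);                       // bytes before the token
        out.write(replacement, 0, replacement.length);   // the new content
        out.write(doc, offset + length,
                  doc.length - offset - length);         // bytes after the token
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String xml = "<color> red </color>";
        byte[] doc = xml.getBytes(StandardCharsets.UTF_8);

        // Assumption: for ASCII input, char offsets equal byte offsets.
        int offset = xml.indexOf("red");
        byte[] oldToken = "red".getBytes(StandardCharsets.UTF_8);
        byte[] newToken = "blue".getBytes(StandardCharsets.UTF_8);

        byte[] updated = splice(doc, offset, oldToken.length, newToken);
        System.out.println(new String(updated, StandardCharsets.UTF_8));
        // prints: <color> blue </color>
    }
}
```

Nothing outside the replaced token is parsed into objects or re-encoded; the untouched parts of the document are moved as raw bytes.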

VTD-XML enables incremental update

VTD-XML is the first XML parser engineered from the ground up to support incremental updates. In other words, VTD-XML not only parses XML blazingly fast, but also makes possible zero-overhead XML content updates, distancing itself further from DOM and SAX as an advanced and powerful XML parser.
