Implement complicated data transformations with SAX and XSLT

Standard Java API provides powerful tools for XML data transformations

I was once asked to help in a project that required a simple data transformer for converting raw bill data into different bill layouts. After receiving a brief introduction to the problem, I suggested using XSLT (Extensible Stylesheet Language Transformations).

When I dug deeper into the requirements, it turned out the problem was not as simple as I had first thought. The input data was manageable, but the data needed to perform the transformation simply could not be depicted with a set of static XSLT stylesheets. Part of the transformation data was dynamic and stored in two separate databases. In addition, to produce the bill layouts, the program had to perform relatively complex calculations on the input data using numbers fetched from the two databases. The XSLT solution was quietly forgotten.

The core problem in this case was that the data needed to direct the transformation was dynamic. In a perfect world, you would never face this issue. Preparation of the input data and the transformation itself would be clearly separated, so all the information needed for the transformation could easily be included in a single XSLT template. Unfortunately, we don't live in a perfect world, and the requirements of real-life projects are sometimes quite bizarre.

This article suggests one solution to the problem described above. I show by example how the power of SAX (Simple API for XML) can be harnessed to enhance the applicability of XSLT. In addition, I show how XSLT can be used even if neither the input data nor the desired output is XML.

Introduction to XSLT

XSLT is a programming language for transforming XML data. XSLT stylesheets can be applied to transform an XML document into another XML format or practically any other format. While XSLT may not be a simple language to learn, especially for those more familiar with Java-like languages, it is a powerful and flexible way to accomplish relatively complicated data transformations. If you are not familiar with XSLT, plenty of excellent tutorials are available. See, for instance, Chapter 17 of the XML Bible.
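As a rough illustration of what a static stylesheet can do, the following sketch applies an XSLT 1.0 stylesheet to a small XML document through the standard javax.xml.transform API and produces plain text. The bill format, the stylesheet, and the class name BillToText are invented for this example and do not come from the project described above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BillToText {
    // Hypothetical stylesheet: writes one "name: price" line per <item>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/bill'>"
      + "<xsl:for-each select='item'>"
      + "<xsl:value-of select='@name'/>: <xsl:value-of select='@price'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet over the given XML and returns the text result. */
    static String transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<bill>"
                   + "<item name='Coffee' price='2.50'/>"
                   + "<item name='Tea' price='1.80'/>"
                   + "</bill>";
        System.out.print(transform(xml));
    }
}
```

Everything the stylesheet needs is fixed when it is written. That is exactly the property that becomes a limitation once part of the transformation data lives elsewhere.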

Though XSLT is a great language, some tasks are difficult, or nearly impossible, to accomplish with it. Transformations that must calculate combinations of data fields taken from several elements of the input XML are usually possible, but often extremely difficult to write. If the data directing the transformation is itself dynamic, XSLT alone is not enough. XSLT templates are static in nature, and, while it may be possible to dynamically regenerate the templates, I can't imagine a situation in which this would be feasible. (If you have a different opinion, feel free to send me feedback.)

After experimenting with various ideas, I concluded that the easiest way to accomplish complicated transformations using XSLT was to manipulate the input XML before feeding it to the XSLT transformer. This may sound ridiculously complicated and inefficient, but it turns out that, with SAX, manipulating the XML data on the fly is quite easy.
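One way this idea could look in code is sketched below, under stated assumptions. A subclass of the standard org.xml.sax.helpers.XMLFilterImpl sits between the parser and the XSLT transformer and adds a computed attribute to each element before the stylesheet sees it. The element and attribute names, the PRICES map (which stands in for a database lookup), and the class names are all hypothetical. This is not the implementation from the project described above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class EnrichingPipeline {
    // Hypothetical stand-in for the dynamic data (unit prices in cents).
    static final Map<String, Integer> PRICES = Map.of("coffee", 250, "tea", 180);

    // Hypothetical stylesheet: it only formats; the arithmetic is done in Java.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/bill'>"
      + "<xsl:for-each select='item'>"
      + "<xsl:value-of select='@code'/>=<xsl:value-of select='@total'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Adds a computed "total" attribute to every item element. */
    static class TotalFilter extends XMLFilterImpl {
        TotalFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            if ("item".equals(qName)) {
                // Sketch only: assumes every item has a numeric quantity.
                int quantity = Integer.parseInt(atts.getValue("quantity"));
                int unitPrice = PRICES.getOrDefault(atts.getValue("code"), 0);
                AttributesImpl enriched = new AttributesImpl(atts);
                enriched.addAttribute("", "total", "total", "CDATA",
                                      String.valueOf(quantity * unitPrice));
                atts = enriched;
            }
            // Forward the (possibly modified) event down the pipeline.
            super.startElement(uri, localName, qName, atts);
        }
    }

    static String transform(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader filter = new TotalFilter(factory.newSAXParser().getXMLReader());

        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        // The transformer pulls its input through the filter, not from a file.
        transformer.transform(
            new SAXSource(filter, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(transform("<bill>"
            + "<item code='coffee' quantity='3'/>"
            + "<item code='tea' quantity='2'/>"
            + "</bill>"));
    }
}
```

The stylesheet stays static and small, while the dynamic lookup and the calculation live in ordinary Java code. No intermediate XML document is written out; the modified events flow straight into the transformer.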

SAX is an event-driven interface for parsing an XML document. When the SAX parser parses XML data, it generates "callback" notifications about the XML elements that the parser recognizes. For instance, when the parser encounters an XML start tag, it produces the callback event startElement. The name of the tag and other relevant information are passed as parameters of the callback. SAX is a good choice when efficient XML parsing is needed, because it processes the document as a stream of events rather than building the whole document in memory. For more information about SAX, see Sun's tutorial on JAXP. In this article, I use SAX to modify the flow of events before forwarding them to the XSLT transformer.
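To make the callback model concrete, here is a minimal sketch of a handler that simply records the events it receives. It extends the standard org.xml.sax.helpers.DefaultHandler. The class name CallbackLogger and the sample document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CallbackLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // Called once for every start tag; atts carries the tag's attributes.
        events.add("start " + qName + " (" + atts.getLength() + " attributes)");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one text run in several chunks; this sketch
        // records each non-blank chunk separately.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text " + text);
        }
    }

    /** Parses the XML string and returns the recorded callback events. */
    static List<String> parse(String xml) throws Exception {
        CallbackLogger handler = new CallbackLogger();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<bill><item name='Coffee'>2.50</item></bill>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

The handler never sees the document as a whole, only a sequence of calls in document order. That is what makes it practical to alter, drop, or add events on the way through.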

Resources