XML document processing in Java using XPath and XSLT

Discover how XPath and XSLT can significantly reduce the complexity of your Java code when handling XML documents

The Extensible Markup Language (XML) is certainly one of the hottest technologies at the moment. While the concept of markup languages is not new, XML seems especially attractive to Java and Internet programmers. The Java API for XML Parsing (JAXP; see Resources), having recently been defined through the Java Community Process, promises to provide a common interface for accessing XML documents. The W3C has defined the so-called Document Object Model (DOM), which provides a standard interface for working with an XML document in a tree hierarchy, whereas the Simple API for XML (SAX) lets a program parse an XML document sequentially, based on an event handling model. Both of these standards (SAX being a de facto standard) complement the JAXP. Together, these three APIs provide sufficient support for dealing with XML documents in Java, and numerous books on the market describe their use.
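
To make the difference between the two parsing models concrete, here is a minimal sketch that reads the same document once through DOM and once through SAX, both obtained via JAXP factories. The address-book document and the class and method names are hypothetical, invented for this illustration; the sketch also assumes a reasonably recent JDK (for example, `getTextContent()` comes from DOM Level 3).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DomVersusSax {
    // Hypothetical sample document, not taken from the article.
    static final String XML =
        "<addressbook>"
      + "<address><name>Jane Doe</name><city>Austin</city></address>"
      + "<address><name>John Roe</name><city>Boston</city></address>"
      + "</addressbook>";

    // DOM: the whole document is parsed into a tree, which we then navigate.
    static List<String> namesWithDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = doc.getElementsByTagName("name");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    // SAX: the parser reads sequentially and calls back on each event.
    static List<String> namesWithSax() throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <name>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(XML)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(namesWithDom());
        System.out.println(namesWithSax());
    }
}
```

Both methods return the same list of names. The DOM version is shorter but holds the entire tree in memory, while the SAX version has to track parser state by hand.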

This article introduces a way to handle XML documents that goes beyond the standard Java APIs for manipulating XML. We'll see that in many cases XPath and XSLT provide simpler, more elegant ways of solving application problems. In some simple samples, we will compare a pure Java/XML solution with one that utilizes XPath and/or XSLT.

Both XSLT and XPath are part of the Extensible Stylesheet Language (XSL) specification (see Resources). XSL consists of three parts: the XSL language specification itself, XSL Transformations (XSLT), and XML Path Language (XPath). XSL is a language for expressing stylesheets; it includes a vocabulary, Formatting Objects, that defines how XML documents can be formatted for presentation. XSLT specifies a vocabulary for transforming one XML document into another. You can consider XSLT to be XSL minus Formatting Objects. The XPath language addresses specific parts of XML documents; it was designed to be used from within an XSLT stylesheet, and XPointer builds on it as well.
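
As a small illustration of how XPath addresses parts of a document, the following sketch evaluates XPath expressions from Java. It relies on the `javax.xml.xpath` package that ships with later JDKs, which is an assumption on our part and not necessarily the API used elsewhere in this article; the sample document and the `XPathSketch` class are likewise hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Hypothetical sample document, not taken from the article.
    static final String XML =
        "<addressbook>"
      + "<address><name>Jane Doe</name><city>Austin</city></address>"
      + "<address><name>John Roe</name><city>Boston</city></address>"
      + "</addressbook>";

    // Evaluates an XPath 1.0 expression against the sample document
    // and returns the result converted to a string.
    static String evaluate(String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression,
                              new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception {
        // Select the name of the person whose city is Boston.
        System.out.println(evaluate("/addressbook/address[city='Boston']/name"));
        // Count all address elements anywhere in the document.
        System.out.println(evaluate("count(//address)"));
    }
}
```

A single expression replaces the loop, the element-name comparison, and the nested child lookup that an equivalent DOM traversal would need.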

For the purposes of this article, it is assumed that you are familiar with the basics of XML and XSLT, as well as the DOM APIs. (For information and tutorials on these topics, see Resources.)

Note: This article's code samples were compiled and tested with the Apache Xerces XML parser and the Apache Xalan XSL processor (see Resources).

The problem

Many articles and papers on XML describe it as the perfect vehicle for a good design practice in Web programming: the Model-View-Controller (MVC) pattern, or, in simpler terms, the separation of application data from presentation data. If the application data is formatted in XML, it can easily be bound to, say, HTML templates by using an XSL stylesheet, typically in a servlet or JavaServer Page (JSP).
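
The binding step can be sketched with the JAXP transformation API (`javax.xml.transform`). This is a minimal illustration under stated assumptions: the sample data, the stylesheet, and the `XmlToHtml` class are hypothetical, and a real servlet would write to the response stream where this sketch uses a `StringWriter`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlToHtml {
    // Application data (the model); hypothetical sample document.
    static final String XML =
        "<addressbook>"
      + "<address><name>Jane Doe</name><city>Austin</city></address>"
      + "<address><name>John Roe</name><city>Boston</city></address>"
      + "</addressbook>";

    // Presentation rules (the view); hypothetical stylesheet.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/addressbook'>"
      + "<ul><xsl:for-each select='address'>"
      + "<li><xsl:value-of select='name'/>, <xsl:value-of select='city'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the data and returns the generated HTML.
    static String render() throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(XML)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

The Java code knows nothing about the HTML layout; changing the presentation means editing only the stylesheet.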

But XML can do much more than merely help with model-view separation for an application's frontend. Components (for example, those developed using the EJB standard) are increasingly used to assemble applications, which enhances developer productivity. Component reusability can be improved by formatting the data that components deal with in a standard way. Indeed, we can expect to see more and more published components that use XML to describe their interfaces.

Resources
  • Recent XML articles in JavaWorld
  • XML help
  • Other valuable XML-related resources