XML document processing in Java using XPath and XSLT

Discover how XPath and XSLT can significantly reduce the complexity of your Java code when handling XML documents

Page 2 of 5

Because XML-formatted data is language-neutral, it remains usable when the client of a given application service is not known in advance, or when that client must not have any dependencies on the server. For example, in B2B environments, it may not be acceptable for two parties to depend on concrete Java object interfaces for their data exchange. New technologies like the Simple Object Access Protocol (SOAP) (see Resources) address these requirements.

All of these cases have one thing in common: data is stored in XML documents and needs to be manipulated by an application. For example, an application that uses various components from different vendors will most likely have to change the structure of the (XML) data to make it fit the needs of the application or adhere to a given standard.

Code written using the Java APIs mentioned above would certainly do this. Moreover, a growing number of tools can turn an XML document into a JavaBean and vice versa, which makes it easier to handle the data from within a Java program. However, in many cases the application, or at least a part of it, merely takes one or more XML documents as input and converts them into a different XML format as output. Using stylesheets in those cases is a viable alternative, as we will see later in this article.
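
To make the idea of converting one XML format into another with a stylesheet concrete, here is a minimal sketch. It assumes the JDK's javax.xml.transform API, which may not be the API the article itself uses later. The <people> and <contacts> formats, the stylesheet, and the class name StylesheetConversion are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StylesheetConversion {

    // Hypothetical stylesheet: maps <people>/<person>/<name> to
    // <contacts>/<contact>/<fullName>. Not taken from the article.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/people\">"
        + "<contacts><xsl:apply-templates select=\"person\"/></contacts>"
        + "</xsl:template>"
        + "<xsl:template match=\"person\">"
        + "<contact><fullName><xsl:value-of select=\"name\"/></fullName></contact>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given XSLT stylesheet to the given XML and returns the result.
    static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String input = "<people><person><name>John Smith</name></person></people>";
        System.out.println(transform(input, STYLESHEET));
    }
}
```

The Java code contains no knowledge of either format; the whole mapping lives in the stylesheet, so a change in the target format would only require a different stylesheet.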

Use XPath to locate nodes in an XML document

As stated above, the XPath language is used to locate certain parts of an XML document. It is designed to be used from within an XSLT stylesheet, but nothing keeps us from using it directly in our Java program to avoid lengthy iteration over a DOM element hierarchy. Indeed, we can let the XSLT/XPath processor do the work for us. Let's take a look at how this works.

Let us assume that we have an application scenario in which a source XML document is presented to the user (possibly after being processed by a stylesheet). The user makes updates to the data and, to save network bandwidth, sends only the updated records back to the application. The application looks for the XML fragment in the source document that needs to be updated and replaces it with the new data.

We will create a small sample to illustrate the various options. For this example, we assume that the application deals with address records in an address book. A sample address book document looks like this:

<addressbook>
   <address>
      <addressee>John Smith</addressee>
      <streetaddress>250 18th Ave SE</streetaddress>
      <city>Rochester</city>
      <state>MN</state>
      <postalCode>55902</postalCode>
   </address>
   <address>
      <addressee>Bill Morris</addressee>
      <streetaddress>1234 Center Lane NW</streetaddress>
      <city>St. Paul</city>
      <state>MN</state>
      <postalCode>55123</postalCode>
   </address>
</addressbook>
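
As a rough illustration of letting an XPath processor find a node instead of iterating over the DOM by hand, the following sketch looks up one <address> in the sample document by its <addressee>. It assumes the JDK's javax.xml.xpath API, which may differ from the XPath API the article goes on to use. The class name AddressLookup and its helper methods are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AddressLookup {

    // The sample address book from above, written as a single string.
    static final String ADDRESSBOOK =
          "<addressbook>"
        + "<address><addressee>John Smith</addressee>"
        + "<streetaddress>250 18th Ave SE</streetaddress>"
        + "<city>Rochester</city><state>MN</state>"
        + "<postalCode>55902</postalCode></address>"
        + "<address><addressee>Bill Morris</addressee>"
        + "<streetaddress>1234 Center Lane NW</streetaddress>"
        + "<city>St. Paul</city><state>MN</state>"
        + "<postalCode>55123</postalCode></address>"
        + "</addressbook>";

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the <address> whose <addressee> equals the given name,
    // or null if there is no such element.
    static Node findAddress(Document book, String addressee) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Pass the name as an XPath variable so that quotes in the
        // name cannot break the expression.
        xpath.setXPathVariableResolver(name ->
                "addressee".equals(name.getLocalPart()) ? addressee : null);
        return (Node) xpath.evaluate(
                "/addressbook/address[addressee = $addressee]",
                book, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Document book = parse(ADDRESSBOOK);
        Node address = findAddress(book, "Bill Morris");
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Evaluate a relative expression against the node we found.
        System.out.println(xpath.evaluate("city", address)); // prints: St. Paul
    }
}
```

The single expression /addressbook/address[addressee = $addressee] stands in for a hand-written loop over every <address> child and a comparison of its <addressee> text.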


The application (possibly, though not necessarily, a servlet) keeps an instance of the address book in memory as a DOM Document object. When the user changes an address, the application's front end sends back only the updated <address> element.

The <addressee> element is used to uniquely identify an address; it serves as the primary key. This would not make a lot of sense for a real application, but we do it here to keep things simple.
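
The update scenario described above can be sketched as follows: read the key from the incoming <address> element, locate the matching element in the in-memory document with XPath, and swap it for the new one. This is only a sketch under stated assumptions. It uses the JDK's javax.xml.xpath and DOM APIs rather than whatever API the article uses, and the class name AddressUpdate, its methods, and the new street address are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AddressUpdate {

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Replaces the <address> in 'book' that has the same <addressee> as
    // 'updated'. Returns false if the book has no matching address.
    static boolean replaceAddress(Document book, Element updated) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The <addressee> text acts as the primary key.
        String key = xpath.evaluate("addressee", updated);
        xpath.setXPathVariableResolver(name ->
                "key".equals(name.getLocalPart()) ? key : null);
        Node existing = (Node) xpath.evaluate(
                "/addressbook/address[addressee = $key]",
                book, XPathConstants.NODE);
        if (existing == null) {
            return false;
        }
        // The updated element belongs to another Document, so it has to
        // be imported before it can be inserted into the book.
        Node imported = book.importNode(updated, true);
        existing.getParentNode().replaceChild(imported, existing);
        return true;
    }

    public static void main(String[] args) throws Exception {
        Document book = parse(
              "<addressbook>"
            + "<address><addressee>John Smith</addressee>"
            + "<streetaddress>250 18th Ave SE</streetaddress></address>"
            + "<address><addressee>Bill Morris</addressee>"
            + "<streetaddress>1234 Center Lane NW</streetaddress></address>"
            + "</addressbook>");
        // Hypothetical update sent by the front end.
        Element updated = parse(
              "<address><addressee>Bill Morris</addressee>"
            + "<streetaddress>99 Elm Street</streetaddress></address>")
            .getDocumentElement();
        System.out.println(replaceAddress(book, updated));
    }
}
```

Because the lookup returns only the first match, this sketch relies on <addressee> being unique, which is exactly the simplification made in this example.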

Resources
  • Recent XML articles in JavaWorld
  • XML help
  • Other valuable XML-related resources