XSLT blooms with Java

Use Java in your stylesheets when XSLT won't do the trick

Have you ever been stumped by a difficult XML transformation issue that you couldn't solve with XSLT (Extensible Stylesheet Language Transformations) alone? Take, for example, a simple filter stylesheet that selects only those <purchase> nodes dated earlier than five days ago. You've heard that XSLT can filter XML documents, so you figure you'll solve this problem in no time. The first task is getting today's date from within a stylesheet, given that this information is not included in the original XML document. Unfortunately, you cannot complete this task with XSLT alone. In a situation such as this, you can simplify your XSLT code and solve the problem faster with a Java extension.

Many XSLT processors offer some type of extension mechanism; the specification explicitly provides for extension functions and extension elements, although it leaves their implementation to each processor. In the world of Java and XML, the most widely used XSLT processor is the open source Apache Xalan processor. Written in Java, Xalan allows for extensions in Java. Many developers find Xalan's extensibility powerful because it lets them use their Java skills from within the stylesheet context. Consider the way JSPs (JavaServer Pages), scriptlets, and custom tags add power to HTML. Xalan extensions add power to stylesheets in much the same way: by allowing Java developers access to their favorite tool, Java.

In this article, I will demonstrate how you can use Java from within an XSLT stylesheet. First, we will use Xalan's extensibility to instantiate and use existing classes within the JDK. Later, I'll show you how to write an XSLT extension function that takes a String argument and returns a DOM (Document Object Model) fragment to the stylesheet processor.

XSLT is important for J2EE (Java 2 Platform, Enterprise Edition) developers because styling XML documents has become a server-side operation. Also, JAXP (the Java API for XML Processing), which includes support for XSLT engines, has become part of the J2EE specification (J2EE 2.6.11). In its infancy, XSLT was intended to style XML on the client; however, most applications style the XML before sending it to the client. For J2EE developers, this means that the XSLT processor will most likely run within the app server.
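
To make the server-side scenario concrete, here is a minimal sketch of applying a stylesheet through JAXP. The class name JaxpTransformSketch, its constants, and the tiny stylesheet are hypothetical and exist only for illustration; the stylesheet uses no extensions, so it should run on whichever XSLT engine JAXP selects.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; names and sample data are illustrative only.
class JaxpTransformSketch {

    static final String SAMPLE_XML =
        "<article><title>Java May Be a Fad</title></article>";

    // A plain XSLT 1.0 stylesheet (no extensions) that prints the title as text.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:value-of select=\"article/title\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            // JAXP locates whichever XSLT engine is configured for the VM.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

In an application server, the same calls would typically sit behind a servlet or similar component, with the stylesheet loaded from a file rather than a string.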

Before you continue with this article, be warned that using Java extensions in your XSLT stylesheets will reduce their portability. While extensions are part of the XSLT specification, the way they are implemented is not. If your stylesheets will run on processors other than Xalan, such as Internet Explorer's stylesheet engine, you should avoid using extensions at all costs.

XSLT weaknesses

Because XSLT has some weak spots, XSLT extensions prove quite useful. I'm not saying that XSLT is bad; however, it just doesn't offer the best tool for processing everything in an XML document. Consider this section of XML:

<article>
  <text>XSLT isn't as easy to use as some would have you
    ...
  </text>
</article>

Suppose your boss asks you to modify a stylesheet so that it converts all instances of "isn't" to "is not" and localizes common labels. Certainly XSLT provides a mechanism to do something along these lines, right? Wrong. XSLT provides no easy way to replace the occurrence of a word or pattern within a string. The same goes for localization. That's not to say it can't be done with standard XSLT syntax. There are ways, but they are not nearly as easy as we would like. If you really want to write text manipulation functions using recursive templates, be my guest.

XSLT's main weakness is text processing, which seems reasonable since its purpose is to render XML. However, because XML content is entirely text, XSLT needs stronger text handling. Needless to say, stylesheet designers require some extensibility from time to time. With Xalan, Java provides this extensibility.
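
For comparison, the "isn't" replacement that is awkward in XSLT 1.0 is a one-liner in Java. The sketch below is only an illustration; the class and method names (TextFixSketch, expandIsnt) are hypothetical.

```java
// Hypothetical helper; shows the plain-Java version of the text replacement.
class TextFixSketch {

    // Replaces every literal occurrence of "isn't" with "is not".
    static String expandIsnt(String text) {
        return text.replace("isn't", "is not");
    }

    public static void main(String[] args) {
        System.out.println(expandIsnt("XSLT isn't as easy to use as some would have you believe"));
    }
}
```

A method like this is the kind of existing Java functionality that an extension mechanism lets a stylesheet reach.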

Use JDK classes within XSLT

You might be pleased to know that you don't have to write any Java code to take advantage of Xalan's extensibility. When you use Xalan, you can create and invoke methods on almost any Java object. Before using a Java class, you must declare a namespace for it in the stylesheet. This example declares "java" as a namespace for everything in or under the java package (for example, java.lang and java.util):

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:java="java" >

Now we need something to do. Let's start with a small XML document:

<article>
      <title>Java May Be a Fad</title>
      <author>J. Burke</author>
      <date>11/30/97</date>
</article>

You've been asked to style this XML so the title appears in uppercase. A developer new to XSLT would simply pop open an XSLT reference to look for a toUpper() function; however, she'd be disappointed to find that the reference lacks one. The translate() function is your best bet in standard XSLT, but I have an even better option: java.lang.String.toUpperCase(). To use this method, you need to instantiate a String object with the title contents. Here is how you can create a new String instance with the title element's contents:

<xsl:template match="title">
      <xsl:variable name="titleStr" select="java:lang.String.new(.)"/>

The name attribute specifies the handle to your new String instance. You invoke the constructor by first specifying the namespace along with the remaining path to the String class. As you might have noticed, String lacks a new() method. You use new() to construct a Java object in Xalan; it corresponds to Java's new keyword. The arguments given to new() determine the constructor version that will be called. Now that you have the title contents within a Java String object, you can use the toUpperCase() method, like so:

  <xsl:value-of select="java:toUpperCase($titleStr)"/>

This might look strange to you at first. When using Java methods on a particular instance, the first argument is the instance you want the method invoked on. Xalan uses Java reflection to provide this capability.
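
The "instance as first argument" convention is easier to picture with a small reflection sketch. This is not Xalan's actual code; it is a hypothetical helper (InstanceCallSketch.callNoArg) that only shows how a method name and a target object are enough to make the call.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of calling a method by name on a given instance.
class InstanceCallSketch {

    // Looks up a public no-argument method by name and invokes it on the target.
    static Object callNoArg(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot call " + methodName + " on " + target.getClass(), e);
        }
    }

    public static void main(String[] args) {
        // Mirrors java:lang.String.new(.) followed by java:toUpperCase($titleStr)
        String titleStr = new String("Java May Be a Fad");
        System.out.println(callNoArg(titleStr, "toUpperCase"));
    }
}
```

In plain Java you would of course just write titleStr.toUpperCase(); the reflective form only mirrors what a processor has to do when all it knows is a method name and an object.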

Below you'll find another trick. Here is how you might emit the date and time anywhere within your stylesheet using java.util.Date:

  <xsl:value-of select="java:util.Date.new()"/>
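
Once today's date is available, the "older than five days" test from the introduction becomes simple date arithmetic. The following is a hedged sketch in plain Java; the class and method names (PurchaseAgeSketch, isOlderThan) are hypothetical, and it uses java.time rather than java.util.Date for brevity.

```java
import java.time.LocalDate;

// Hypothetical helper for the "dated earlier than N days ago" filter.
class PurchaseAgeSketch {

    // True when purchaseDate falls strictly before (today minus the given days).
    static boolean isOlderThan(LocalDate purchaseDate, LocalDate today, int days) {
        LocalDate cutoff = today.minusDays(days);
        return purchaseDate.isBefore(cutoff);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        LocalDate purchase = today.minusDays(7);
        System.out.println(isOlderThan(purchase, today, 5));
    }
}
```

Passing today's date as a parameter, instead of reading the clock inside the method, keeps the logic easy to test.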

Here's something that will make the day of anyone required to localize a generic stylesheet between two or more languages. You can use java.util.ResourceBundle to localize literal text within a stylesheet. Since your XML has an author tag, you might want to print "Author:" next to the person's name.

One option is to create a separate stylesheet for each locale, i.e., one for English, another for Chinese, and so on. The problems inherent in this approach should be evident. Keeping multiple stylesheet versions consistent is time consuming. You also need to modify your application so that it chooses the correct stylesheet based on the user's locale.

Instead of duplicating the stylesheet for each language, you can take advantage of Java's localization features. Localizing with the help of a ResourceBundle proves a better approach. Within XSLT, load the ResourceBundle at the beginning of your stylesheets, like so:

<xsl:variable name="resources"
              select="java:util.ResourceBundle.getBundle('General')"/>

The ResourceBundle class expects to find a file called General.properties in your CLASSPATH. Once the bundle is created, it can be reused throughout the stylesheet. This example retrieves the author resource:

<xsl:value-of select="java:getString($resources,'author')"/>

Notice again the strange method signature. Normally, ResourceBundle.getString() takes only one argument; however, within XSLT you also need to specify the object on which you want to invoke the method.
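
In plain Java, the same lookup is a call to getString() on the bundle. The sketch below is an illustration only: to stay self-contained it builds the bundle from an in-memory string with PropertyResourceBundle instead of loading General.properties from the CLASSPATH, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper; stands in for ResourceBundle.getBundle('General').
class LabelLookupSketch {

    // Builds a bundle from properties text (assumption: replaces the file lookup).
    static ResourceBundle bundleFrom(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read properties text", e);
        }
    }

    // Equivalent of java:getString($resources, 'author') in the stylesheet.
    static String label(ResourceBundle resources, String key) {
        return resources.getString(key);
    }

    public static void main(String[] args) {
        ResourceBundle resources = bundleFrom("author=Author:");
        System.out.println(label(resources, "author"));
    }
}
```

With a real General.properties file, ResourceBundle.getBundle("General") would replace bundleFrom(), and locale-specific variants of the file would supply the translated labels.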

Write your own extensions

For some rare situations, you might need to write your own XSLT extension, in the form of either an extension function or an extension element. I will discuss creating an extension function, a concept fairly easy to grasp. Any Xalan extension function can take strings as input and return strings to the XSLT processor. Your extensions can also take NodeLists or Nodes as arguments and return these types to the XSLT processor. Using Nodes or NodeLists means you can add to the original XML document with an extension function, which is what we will do.

One type of text item encountered frequently is a date; it provides a great opportunity for a new XSLT extension. Our task is to style an article element so the date prints in the following format:

Friday, November 30, 2001

Can standard XSLT produce the date above? XSLT can finish most of the task. Determining the actual day of the week is the difficult part. One way to quickly solve that problem is to use the java.text.SimpleDateFormat class within an extension function to return a string formatted as we wish. But wait: notice that the day appears in bold text. This returns us to the initial problem. The reason we are even considering an extension function is that the original XML document failed to structure the date as a group of nodes. If our extension function returns a string, we will still find it difficult to style the day field differently from the rest of the date string. Here's a more useful format, at least from the perspective of an XSLT designer:

      <date>
            <month>11</month>
            <day>30</day>
            <year>2001</year>
      </date>

We now create an XSLT extension function, taking a string as an argument and returning an XML node in this format:

      <formatted-date>
            <month>November</month> 
            <day>30</day>
            <day-of-week>Friday</day-of-week>
            <year>2001</year>
      </formatted-date>

The class hosting our extension function doesn't implement or extend anything; we will call the class DateFormatter:

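public class DateFormatter {
    // The body uses checked exceptions (parsing, DOM setup), hence "throws Exception".
    public static Node format (String date) throws Exception { /* body discussed below */ }
}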

Wow, too easy, huh? There are absolutely no requirements placed on the type or interface of a Xalan extension function. Generally, most extension functions will take a String as an argument and return another String. Other common patterns are to send or receive org.w3c.dom.NodeLists or individual Nodes from an extension function, as we will do. See the Xalan documentation for details on how Java types convert to XSLT types.

In the code fragment above, the format() method's logic breaks into two parts. First, we need to parse the date string from the original XML document. Then we use some DOM programming techniques to create a Node and return it to the XSLT processor. The body of our format() method implementation reads:

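            // Needs java.text.*, java.util.*, javax.xml.parsers.DocumentBuilderFactory, org.w3c.dom.*
            // newDocumentBuilder() and parse() throw checked exceptions; format() must declare or catch them.
            Locale locale = Locale.getDefault();   // assumption: use the locale of the running VM
            Document doc = DocumentBuilderFactory.newInstance().
                        newDocumentBuilder().newDocument();
            Element dateNode = doc.createElement("formatted-date");
            SimpleDateFormat df = (SimpleDateFormat)
                        DateFormat.getDateInstance(DateFormat.SHORT, locale);
            df.setLenient(true);
            Date d = df.parse(date);
            // Reuse the formatter with a new pattern for each output field
            df.applyPattern("MMMM");
            addChild(dateNode, "month", df.format(d));
            df.applyPattern("d");
            addChild(dateNode, "day", df.format(d));
            df.applyPattern("EEEE");
            addChild(dateNode, "day-of-week", df.format(d));
            df.applyPattern("yyyy");
            addChild(dateNode, "year", df.format(d));
            return dateNode;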

dateNode will contain our formatted date values that we return to the stylesheet. Notice that we've used java.text.SimpleDateFormat to parse the date. This allows us to take full advantage of Java's date support, including its localization features. SimpleDateFormat handles the numeric date conversion and returns month and day names that match the locale of the VM running our application.

Remember: the primary purpose of an extension function is simply to allow us access to existing Java functionality; write as little code as possible. An extension function, like any Java method, can use other methods within the same class. To simplify the format() implementation, I moved repetitive code into a small utility method:

private static void addChild (Node parent, String name, String text) {
    Element child = parent.getOwnerDocument().createElement(name);
    child.appendChild(parent.getOwnerDocument().createTextNode(text));
    parent.appendChild(child);
}
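
Putting the pieces together, a complete version of the class might look like the sketch below. It is an illustration under stated assumptions, not the article's definitive source: the class is named DateFormatterSketch to keep it distinct, it uses an explicit "M/d/yy" input pattern instead of DateFormat.SHORT so the behavior does not depend on locale data, it wraps checked exceptions in an unchecked one, and it adds a two-argument overload so the locale can be chosen explicitly.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Package-private to keep the sketch compact; a class called from a
// stylesheet would normally be declared public, along with format().
class DateFormatterSketch {

    // One-argument form, matching a call such as date:format(.)
    static Node format(String date) {
        return format(date, Locale.getDefault());
    }

    // Parses a date such as 11/30/01 and returns a <formatted-date> element.
    static Node format(String date, Locale locale) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element dateNode = doc.createElement("formatted-date");

            // Assumption: input dates are month/day/two-digit-year.
            SimpleDateFormat df = new SimpleDateFormat("M/d/yy", locale);
            df.setLenient(true);
            Date d = df.parse(date);

            // Reuse the formatter with a new pattern for each output field.
            df.applyPattern("MMMM");
            addChild(dateNode, "month", df.format(d));
            df.applyPattern("d");
            addChild(dateNode, "day", df.format(d));
            df.applyPattern("EEEE");
            addChild(dateNode, "day-of-week", df.format(d));
            df.applyPattern("yyyy");
            addChild(dateNode, "year", df.format(d));
            return dateNode;
        } catch (ParserConfigurationException | ParseException e) {
            throw new IllegalArgumentException("Cannot format date: " + date, e);
        }
    }

    // Appends <name>text</name> to the parent element.
    private static void addChild(Node parent, String name, String text) {
        Element child = parent.getOwnerDocument().createElement(name);
        child.appendChild(parent.getOwnerDocument().createTextNode(text));
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        Node node = format("11/30/01", Locale.US);
        for (int i = 0; i < node.getChildNodes().getLength(); i++) {
            Node child = node.getChildNodes().item(i);
            System.out.println(child.getNodeName() + " = " + child.getTextContent());
        }
    }
}
```

Returning child elements rather than a single string is what lets the stylesheet style the day of the week differently from the rest of the date.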

Use DateFormatter within a stylesheet

Now that we have implemented an extension function, we can call it from within a stylesheet. Just as before, we need to declare a namespace for our extension function:

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:date="com.example.DateFormatter"
   xmlns:java="java" >

This time, the namespace is the fully qualified name of the class hosting the extension function. This is optional and depends on whether you'll be using other classes within the same package or just a single extension class. You can declare the fully qualified class name as the namespace, or declare a package as the namespace and name the class wherever the extension function is invoked. By specifying the fully qualified class name, we type less when we call the function.

To use the function, simply call it from within a select attribute, like so:

<xsl:template match="date">
<xsl:apply-templates select="date:format(.)"/>
</xsl:template>