Sometimes it seems you spend more time manipulating XML files than you do writing Java code, so it makes sense to have one or two XML wranglers in your toolbox. In this article, Laurent Bovet gets you started with XmlMerge, an open source tool that lets you use XPath declarations to merge and manipulate XML data from different sources.
As a Java developer you use XML every day in your build scripts, deployment descriptors, configuration files, object-relational mapping files and more. Creating all these XML files can be tedious, but it's not especially challenging. Manipulating or merging the data contained in such disparate files, however, can be difficult and time-consuming. You might prefer to use several files split into different modules, but find yourself limited to one large file because that is the only format the XML's intended consumer can understand. You might want to override particular elements in a large file, but find yourself replicating the file's entire contents instead. Maybe you just lack the time to create the XSL transformations (XSLT) that would make it easier to manipulate XML elements in your documents. Whatever the case, it seems nothing is ever as easy as it should be when it comes to merging the elements in your XML files.
In this article, I present an open source tool I created to resolve many of the common problems associated with merging and manipulating data from different XML documents. EL4J XmlMerge is a Java library under the LGPL license that makes it easier to merge elements from different XML sources. While XmlMerge is part of the EL4J framework, you can use it independently of EL4J. All you need to run the XmlMerge utility from your command line is JDK 1.5 or greater.
In the discussion that follows, you will learn how to use XmlMerge for a variety of common XML merging scenarios, including merging two XML files, merging XML file data from different sources to create a Spring Resource bean at runtime and combining XmlMerge and Ant to create an automated deployment descriptor at build time. I'll also show you how to use XPath declarations and built-in actions and matchers to specify the treatment of specific elements during an XML merge. I'll conclude with a look at XmlMerge's simple merging algorithm and suggest ways it could be extended for more specialized XML merging operations.
You can download XmlMerge now if you want to follow along with the examples.
Merging XML files
In Listing 1 you see the very common (and greatly simplified) example of two XML files that need to be merged.
Listing 1. Two XML files that need to be merged
Listing 2 shows the command-line input to merge these two files using the XmlMerge utility, followed by the resulting output.
Listing 2. The two XML files merged using XmlMerge
~ $ java -jar xmlmerge-full.jar file1.xml file2.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <a>
    <b />
    <c />
  </a>
  <d />
</root>
~ $
This first example of merging is very simple, but you may have noticed that the order in which the files are merged is important. If you switch the order, you can get different results. (Later in the article you'll see an example of what happens when you switch the order of two files to be merged.) To keep the two roles distinct, XmlMerge uses the term original for the first document and patch for the second one. This is easy to remember because the patch document is always merged into the original.
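To make the point about ordering concrete, here is a deliberately naive merge written against the JDK's DOM API only. It is not XmlMerge's algorithm: it is a sketch that assumes child elements are matched by tag name alone and that unmatched patch elements are simply appended to the original. The class name NaiveMerge, its methods and the two sample inputs are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; this is NOT how XmlMerge is implemented.
public class NaiveMerge {

    /** Merges the patch document into the original and returns the serialized result. */
    public static String merge(String originalXml, String patchXml) {
        try {
            Document original = parse(originalXml);
            Document patch = parse(patchXml);
            mergeChildren(original.getDocumentElement(), patch.getDocumentElement());
            return serialize(original);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not merge documents", e);
        }
    }

    // Assumption: elements match when they share a tag name under the same parent.
    private static void mergeChildren(Element original, Element patch) {
        for (Node n = patch.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // this sketch ignores text, comments and attributes
            }
            Element patchChild = (Element) n;
            Element match = findChild(original, patchChild.getTagName());
            if (match != null) {
                mergeChildren(match, patchChild); // descend into the matched pair
            } else {
                // importNode copies the subtree, so the patch document is left untouched
                original.appendChild(original.getOwnerDocument().importNode(patchChild, true));
            }
        }
    }

    // Returns the first direct child element with the given tag name, or null.
    private static Element findChild(Element parent, String tagName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && ((Element) n).getTagName().equals(tagName)) {
                return (Element) n;
            }
        }
        return null;
    }

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    private static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) {
        String first = "<root><a><b/></a></root>";
        String second = "<root><a><c/></a><d/></root>";
        System.out.println(merge(first, second)); // b ends up before c
        System.out.println(merge(second, first)); // c ends up before b
    }
}
```

Even with rules this simple, swapping the two arguments changes the order of the elements under a, which is why it helps to think of the first document as the original and the second as the patch.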
Merging XML files from different sources
You can call XmlMerge from anywhere in your Java code and use it to merge data from different sources into a new, useful document. In Listing 3, I've used it to merge a file from my application filesystem and the contents of a servlet request into a single document object model (DOM).
Listing 3. Merging client and server XML into a DOM
XmlMerge xmlMerge = new DefaultXmlMerge();
org.w3c.dom.Document doc = documentBuilder.parse(
    xmlMerge.merge(
        new FileInputStream("file1.xml"),
        servletRequest.getInputStream()));
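Listing 3 takes for granted that a documentBuilder is already in scope. The sketch below shows one way that part could be set up using only the JDK. To keep it self-contained, a byte array stands in for the merged stream that XmlMerge would return, and the class name DomFromStream is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: turns any XML input stream into a DOM document.
public class DomFromStream {

    public static Document parse(InputStream xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder documentBuilder = factory.newDocumentBuilder();
            return documentBuilder.parse(xml);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML stream", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the merged stream; in Listing 3 it comes from xmlMerge.merge(...)
        InputStream merged = new ByteArrayInputStream(
                "<root><a><b/><c/></a><d/></root>".getBytes(StandardCharsets.UTF_8));
        Document doc = parse(merged);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the helper only depends on java.io.InputStream, it does not care whether the bytes come from a file, a servlet request or a merge of the two.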
Creating Spring Framework resources at runtime
In some cases it is useful to combine XmlMerge and the Spring Framework. For example, the Spring Resource bean shown in Listing 4 was created at runtime by merging separate XML files into a single XML stream. You could then use the Resource bean to configure other resources for object-relational mapping, document generation and more.
Listing 4. A Spring Resource bean
<bean name="mergedResource"
      class="ch.elca.el4j.services.xmlmerge.springframework.XmlMergeResource">
  <property name="resources">
    <list>
      <bean class="org.springframework.core.io.ClassPathResource">
        <constructor-arg>
          <value>ch/elca/el4j/tests/xmlmerge/r1.xml</value>
        </constructor-arg>
      </bean>
      <bean class="org.springframework.core.io.ClassPathResource">
        <constructor-arg>
          <value>ch/elca/el4j/tests/xmlmerge/r2.xml</value>
        </constructor-arg>
      </bean>
    </list>
  </property>
  <property name="properties">
    <map>
      <entry key="action.default" value="COMPLETE"/>
      <entry key="XPath.path1" value="/root/a"/>
      <entry key="action.path1" value="MERGE"/>
    </map>
  </property>
</bean>
Generating an automated deployment descriptor at build time
You've probably used Ant to automate your builds. How about combining it with XmlMerge to generate an XML deployment descriptor at build time? Listing 5 shows the XmlMergeTask at work.
Listing 5. XmlMergeTask generates a deployment descriptor
<target name="test-task">
  <taskdef name="xmlmerge"
           classname="ch.elca.el4j.services.xmlmerge.anttask.XmlMergeTask"
           classpath="xmlmerge-full.jar"/>
  <xmlmerge dest="out.xml" conf="test.properties">
    <fileset dir="test">
      <include name="source*.xml"/>
    </fileset>
  </xmlmerge>
</target>
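The conf attribute in Listing 5 points at test.properties, whose contents are not shown here. Assuming that file accepts the same keys as the properties map in Listing 4 (an assumption, not something the listings confirm), the sketch below builds those three entries with java.util.Properties and renders them in properties-file form. The class name MergeConfSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

// Hypothetical sketch: reuses the keys from Listing 4 and assumes the Ant task's
// conf file understands the same ones.
public class MergeConfSketch {

    public static Properties mergeProperties() {
        Properties props = new Properties();
        props.setProperty("action.default", "COMPLETE"); // default action, as in Listing 4
        props.setProperty("XPath.path1", "/root/a");     // XPath selecting the elements to treat specially
        props.setProperty("action.path1", "MERGE");      // action applied to elements matched by path1
        return props;
    }

    /** Renders the properties in the text format used by .properties files. */
    public static String asPropertiesFile(Properties props) {
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // writes a timestamp comment followed by key=value lines
        } catch (IOException e) {
            throw new IllegalStateException("Writing to an in-memory buffer should not fail", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Redirect this output to test.properties to try it with the Ant target.
        System.out.print(asPropertiesFile(mergeProperties()));
    }
}
```

Note how path1 ties the two path entries together: the suffix after XPath. and action. is what pairs a selection with its action in Listing 4.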