XML merging made easy

Manipulate XML files using XPath declarations

Customizing XmlMerge

You can easily add your own actions and matchers to the many already built into XmlMerge. You can also create mapping functions that transform elements before they are written to the merged file (to modify element attributes, for example). To do so, you simply implement the corresponding interface, such as the Action interface shown in Listing 9.

Listing 9. The Action interface

public interface Action {

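    /**
     * Out of an original element and a second element provided by the patch
     * DOM, applies an operation and modifies the parent node of the result DOM.
     *
     * @param originalElement the element taken from the original document
     * @param patchElement the element taken from the patch document
     * @param outputParentElement the parent node in the result DOM that this action modifies
     */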
    void perform(Element originalElement, Element patchElement, Element outputParentElement);

}

You do not need to recompile the XmlMerge library to extend it. Simply add your action and matcher implementations to the classpath. Note that you should be familiar with JDOM if you want to extend XmlMerge, because it is the foundation of the XmlMerge library.
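As an illustration of what a custom action might look like, here is a minimal sketch. To stay runnable with only the JDK, it declares a local stand-in for the Action interface typed against org.w3c.dom.Element, not the Element class of the DOM library that XmlMerge is actually built on. The names CustomActionSketch, OverlayAttributesAction, and parse are hypothetical, and the sketch assumes that both the original and the patch element are present.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class CustomActionSketch {

    /** Local stand-in with the same shape as Listing 9 (assumption: JDK DOM types). */
    interface Action {
        void perform(Element originalElement, Element patchElement, Element outputParentElement);
    }

    /**
     * Hypothetical action: keeps the original element and its content,
     * then overlays the attributes found on the patch element.
     */
    static final class OverlayAttributesAction implements Action {
        @Override
        public void perform(Element originalElement, Element patchElement,
                            Element outputParentElement) {
            Document out = outputParentElement.getOwnerDocument();
            // Deep-copy the original element into the result document.
            Element merged = (Element) out.importNode(originalElement, true);
            // Patch attributes are added, or override original ones of the same name.
            NamedNodeMap patchAttrs = patchElement.getAttributes();
            for (int i = 0; i < patchAttrs.getLength(); i++) {
                Node attr = patchAttrs.item(i);
                merged.setAttribute(attr.getNodeName(), attr.getNodeValue());
            }
            outputParentElement.appendChild(merged);
        }
    }

    /** Helper for the demo: parses a string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        Element original = parse("<e id='2'><child/></e>");
        Element patch = parse("<e id='2' newAttr='3'/>");
        Element resultParent = parse("<root/>");
        new OverlayAttributesAction().perform(original, patch, resultParent);
        Element merged = (Element) resultParent.getFirstChild();
        System.out.println(merged.getTagName() + " newAttr=" + merged.getAttribute("newAttr"));
    }
}
```

The point of the sketch is the contract: an action receives one element from each source and is solely responsible for what, if anything, ends up under the output parent.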

Alternatives to XPath configuration

XPath configuration isn't your only option for customizing XmlMerge. You also can place inline attributes in the patch file to specify how elements will be treated (shown below), or use the Configurer interface to specify your own configuration model.

Listing 10 shows the results of an inline configuration, where attributes are placed in the patch document rather than specified in an external properties file.

Listing 10. Inline configuration

Original:

<root>
  <a>
    <b/>
  </a>
  <d/>
  <e id='1'/>
  <e id='2'/>
</root>

Patch:

<root xmlns:merge='http://xmlmerge.el4j.elca.ch'>
  <a merge:action='REPLACE'>hello</a>
  <c/>
  <d merge:action='DELETE'/>
  <e id='2' newAttr='3' merge:matcher='ID'/>
</root>

Result:

<root>
  <a>hello</a>
  <c/>
  <e id="1" />
  <e id="2" newAttr="3" />
</root>
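To make the inline approach more concrete, the following sketch reads the merge:action and merge:matcher attributes out of the patch document from Listing 10. It uses only the JDK's namespace-aware DOM parser and is not XmlMerge's own code; the class name InlineConfigSketch and the method inlineSettings are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class InlineConfigSketch {

    /** Namespace URI bound to the merge prefix in Listing 10. */
    static final String MERGE_NS = "http://xmlmerge.el4j.elca.ch";

    /** The patch document from Listing 10. */
    static final String PATCH =
        "<root xmlns:merge='http://xmlmerge.el4j.elca.ch'>"
        + "<a merge:action='REPLACE'>hello</a>"
        + "<c/>"
        + "<d merge:action='DELETE'/>"
        + "<e id='2' newAttr='3' merge:matcher='ID'/>"
        + "</root>";

    /** Lists every inline setting as "elementName key=value", in document order. */
    static List<String> inlineSettings(String patchXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(patchXml)));
            List<String> settings = new ArrayList<>();
            NodeList elements = doc.getElementsByTagName("*");
            for (int i = 0; i < elements.getLength(); i++) {
                Element e = (Element) elements.item(i);
                for (String key : new String[] {"action", "matcher"}) {
                    if (e.hasAttributeNS(MERGE_NS, key)) {
                        settings.add(e.getTagName() + " " + key + "="
                            + e.getAttributeNS(MERGE_NS, key));
                    }
                }
            }
            return settings;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse patch document", ex);
        }
    }

    public static void main(String[] args) {
        inlineSettings(PATCH).forEach(System.out::println);
    }
}
```

Because the settings live in their own namespace, they can be told apart from ordinary attributes such as id and newAttr, which is why none of the merge attributes appear in the result document.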

Using and extending XmlMerge

Since developing XmlMerge, I have used it in many kinds of projects and combined it with various other tools and frameworks. As demonstrated in this article, I've leveraged Spring Framework resources to merge iBatis configuration files on the fly. I also have combined XmlMerge with Ant tasks to merge web.xml deployment descriptors at build time. And I've used it to prepare variants of a base XML document for the purpose of unit-testing a semantic validation tool.

XmlMerge is meant to be useful out of the box for many of the common tasks required to merge data from XML documents. You also can extend it for other purposes, but for that you may need to extend the merging algorithm. Although the algorithm below is for merging two XML documents, XmlMerge can be used to merge any number of documents.

A simple algorithm for merging XML files

The hardest thing about merging XML files is specifying the expected results. For example, what would you expect as a default result of merging the two files shown in Listing 11?

Listing 11. Two files waiting to be merged

Original:

<a x="1"/>
<b x="2"/>
<a x="3"/>

Patch:

<b y="4"/>
<a y="5"/>
<b y="6"/>

Which of the merged elements would you expect the resulting XML file to begin and end with? Would you want to keep the elements in the same order or not? XmlMerge is based on a straightforward algorithm that traverses each element list only once and returns all elements in the order in which they first appeared. The following rules guarantee a predictable result every time you merge two XML files using XmlMerge:

  • Only one element in the result corresponds to each original element.
  • Only one element in the result corresponds to each patch element.
  • All elements corresponding to original elements appear in the same order in the result as they did in the original.
  • All elements corresponding to the patch elements appear in the same order in the result as they did in the patch.

Two cursors traverse the original document and the patch, looking for matching pairs. Every element encountered before a matching pair is found is added to the result as is. Here the patch cursor is advanced first, so merging the two files from Listing 11 produces the result shown in Listing 12.

Listing 12. Patch first

<b y="4" />
<a x="1" y="5" />
<b x="2" y="6" />
<a x="3" />

If you switched the order of the original and patch shown in Listing 11, you would obtain the following result instead:

Listing 13. Result of switching the order of the merge

<a x="1" />
<b y="4" x="2" />
<a y="5" x="3" />
<b y="6" />
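The cursor-based traversal described above can be sketched in a few lines of plain Java. This is one reading of the rules, not XmlMerge's actual implementation: elements are matched by tag name only, merged attributes are listed original-first, and the Elem class and merge method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MergeOrderSketch {

    /** Minimal element model: a tag name plus attributes in insertion order. */
    static final class Elem {
        final String name;
        final Map<String, String> attrs = new LinkedHashMap<>();

        Elem(String name, String... keyValues) {
            this.name = name;
            for (int i = 0; i + 1 < keyValues.length; i += 2) {
                attrs.put(keyValues[i], keyValues[i + 1]);
            }
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("<" + name);
            attrs.forEach((k, v) -> sb.append(' ').append(k).append("=\"").append(v).append('"'));
            return sb.append(" />").toString();
        }
    }

    /** The two element lists from Listing 11. */
    static final List<Elem> ORIGINAL = Arrays.asList(
        new Elem("a", "x", "1"), new Elem("b", "x", "2"), new Elem("a", "x", "3"));
    static final List<Elem> PATCH = Arrays.asList(
        new Elem("b", "y", "4"), new Elem("a", "y", "5"), new Elem("b", "y", "6"));

    /** Merges two sibling lists, advancing the patch cursor first. */
    static List<Elem> merge(List<Elem> original, List<Elem> patch) {
        List<Elem> result = new ArrayList<>();
        int p = 0; // patch cursor
        for (Elem orig : original) {
            // Advance through the patch looking for a partner for this original element.
            int match = p;
            while (match < patch.size() && !patch.get(match).name.equals(orig.name)) {
                match++;
            }
            if (match == patch.size()) {
                result.add(orig); // no partner left in the patch: copy as is
                continue;
            }
            // Patch elements passed over before the match are added as is.
            result.addAll(patch.subList(p, match));
            Elem merged = new Elem(orig.name);
            merged.attrs.putAll(orig.attrs);
            merged.attrs.putAll(patch.get(match).attrs);
            result.add(merged);
            p = match + 1;
        }
        // Whatever remains in the patch goes at the end.
        result.addAll(patch.subList(p, patch.size()));
        return result;
    }

    /** Renders one element per line. */
    static String render(List<Elem> elems) {
        StringBuilder sb = new StringBuilder();
        for (Elem e : elems) {
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(merge(ORIGINAL, PATCH)));
        System.out.println();
        System.out.println(render(merge(PATCH, ORIGINAL)));
    }
}
```

Each list is walked exactly once and neither list is ever reordered, so the four rules hold by construction. Calling merge(ORIGINAL, PATCH) yields the element order of Listing 12, and swapping the arguments yields that of Listing 13.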

Although it is broadly applicable, the default merging algorithm used by XmlMerge will not work for all use cases. For instance, you may want to extend the existing algorithm to handle advanced XML tree merging.

In conclusion

XmlMerge is not a cure-all for XML merging needs. It is a relatively simple tool that leverages JDOM and XPath declarations to ease the process of merging data from different XML files. It is easily combined with other development tools and frameworks (such as the Spring Framework and Ant) and can be used out of the box or customized for use in specialized projects. Because it is based on JDOM rather than SAX, XmlMerge builds a complete in-memory tree of each document and is therefore not optimized for performance or memory, both factors that may rule out its use in some development projects. XmlMerge is also a work in progress. Its built-in behavior for handling attributes is currently not as rich as its behavior for handling elements: attributes are simply merged or replaced. XmlMerge is intended to provide a structurally sound, fully extensible framework for merging and manipulating data from a wide variety of sources. See the Resources section to download XmlMerge and learn more about it.

About the author

Laurent Bovet is a software architect at ELCA, the leading Swiss IT company responsible for Java enterprise development frameworks, such as LEAF and EL4J. He has worked on numerous Java-based distributed systems and is the creator of the EL4J XmlMerge library.

Resources