Announcing EAXY: Making XML easier in Java

XML libraries in Java are a minefield. The amount of code required to read and manipulate XML is staggering, the risk of classpath problems between different libraries is substantial, and the handling of namespaces invites a lot of confusion and errors. The worst thing is that the situation doesn’t seem to be improving.

A colleague made me aware of the JOOX library some time back. It’s a very good attempt to patch these problems. I found a few shortcomings in JOOX that made me want to explore alternatives, and naturally I ended up writing my own library (as you do). I want the library to allow for Easy manipulation of XML, and in an episode of insufficient judgement, I named the library EAXY. It’s a really bad name, so I appreciate suggestions for improvement.

Here is what I set out to solve:

  • It should be easy to create fairly complex XML trees with Java code.
  • It should be straightforward and foolproof to use namespaces. (This is where JOOX failed me.)
  • It should be easy to read values out of the XML structure.
  • It should be easy to work with existing XML documents on the file system or classpath.
  • The library should prefer throwing an exception over silently failing.
  • As a bonus, I wanted to make it even easier to deal with (X)HTML by adding convenience functions for it.

1. Creating an XML document

An XML document is just a tree. How about aligning that tree with the Java syntax tree? For example, let’s say you wanted to programmatically construct some feedback on this article:

Element email = Xml.el("message",
        Xml.el("recipients",
            Xml.el("recipent",
                    Xml.attr("type", "email"),
                    Xml.attr("role", "to"),
                    Xml.text("mailto:johannes@brodwall.com")),
            Xml.el("recipent",
                    Xml.attr("type", "email"),
                    Xml.attr("role", "cc"),
                    Xml.text("mailto:contact@brodwall.com"))),
        Xml.el("subject", "EAXY feedback"),
        Xml.el("content", "I think this is an interesting library"));

Each element (Xml.el) has a tag name and can nest other elements, attributes (Xml.attr) or text (Xml.text). If the element contains only text, we don’t even need the call to Xml.text. The syntax is designed so that, with a static import of Xml.*, you can write code like this:

Element email = el("message",
        el("recipients",
            el("recipent",
                    attr("type", "email"),
                    attr("role", "to"),
                    text("mailto:johannes@brodwall.com")),
            el("recipent",
                    attr("type", "email"),
                    attr("role", "cc"),
                    text("mailto:contact@brodwall.com"))),
        el("subject", "EAXY feedback"),
        el("content", "I think this is an interesting library"));
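
To see why this style fits plain Java, here is a minimal sketch of how such a nested builder could be assembled from varargs and a shared marker interface. It is not EAXY's actual implementation: the class MiniXml, its Content interface and the toXml method are hypothetical names chosen for illustration, and the sketch ignores namespaces entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mini-DSL for illustration only; NOT the EAXY source code.
final class MiniXml {
    /** Marker for anything that may appear inside an element. */
    public interface Content {}

    public static final class Attr implements Content {
        final String name;
        final String value;
        Attr(String name, String value) { this.name = name; this.value = value; }
    }

    public static final class Text implements Content {
        final String value;
        Text(String value) { this.value = value; }
    }

    public static final class Element implements Content {
        final String tag;
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<Content> children = new ArrayList<>();

        Element(String tag, Content... contents) {
            this.tag = tag;
            // Attributes are split off; elements and text stay in document order.
            for (Content c : contents) {
                if (c instanceof Attr) {
                    Attr a = (Attr) c;
                    attrs.put(a.name, a.value);
                } else {
                    children.add(c);
                }
            }
        }

        /** Serializes this element and its subtree without indentation. */
        public String toXml() {
            StringBuilder sb = new StringBuilder("<").append(tag);
            attrs.forEach((k, v) ->
                    sb.append(' ').append(k).append("=\"").append(escape(v)).append('"'));
            if (children.isEmpty()) {
                return sb.append("/>").toString();
            }
            sb.append('>');
            for (Content c : children) {
                if (c instanceof Element) {
                    sb.append(((Element) c).toXml());
                } else if (c instanceof Text) {
                    sb.append(escape(((Text) c).value));
                }
            }
            return sb.append("</").append(tag).append('>').toString();
        }
    }

    // Factory methods intended for static import, mirroring the style in the article.
    public static Element el(String tag, Content... contents) { return new Element(tag, contents); }
    public static Element el(String tag, String text) { return new Element(tag, new Text(text)); }
    public static Attr attr(String name, String value) { return new Attr(name, value); }
    public static Text text(String value) { return new Text(value); }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

The second el overload is what lets el("subject", "EAXY feedback") skip the explicit text(...) call. Because the nesting of the Java expression mirrors the nesting of the document, a misplaced parenthesis tends to show up as a compile error rather than as a malformed tree.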

2. Reading XML

Reading XML with Java code can be a challenge. The DOM API makes it extremely wordy to do anything at all. You can use XPath, but it can be a bit too compact, and when you do something wrong, the result is simply an empty collection or a null value. I think we can improve on this.
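
To make the silent-failure point concrete, here is a small sketch that uses only the JDK's javax.xml.xpath API. The class name SilentXPath and the inline sample message are assumptions made for illustration. The sample reuses the misspelled "recipent" element from the message built earlier.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: shows that a mistyped XPath step yields an empty result, not an error.
final class SilentXPath {
    // Sample document with the element name misspelled as "recipent".
    static final String MESSAGE =
            "<message><recipients>"
            + "<recipent role=\"to\">mailto:johannes@brodwall.com</recipent>"
            + "</recipients></message>";

    /** Counts the nodes matched by the expression in the sample message. */
    static int countMatches(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(MESSAGE)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            // Only thrown for syntactically invalid expressions, not for non-matching ones.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Correctly spelled in the query, but absent from the document: prints 0.
        System.out.println(countMatches("/message/recipients/recipient"));
        // Matches the document's (misspelled) element name: prints 1.
        System.out.println(countMatches("/message/recipients/recipent"));
    }
}
```

Nothing in the first call hints that the path is wrong; the caller just gets zero nodes back and has to work out why.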

Consider the following:

System.out.println(email.find("recipients", "recipient").texts());

I step down the XML tree structure and get all the recipient email addresses of the previous message. But wait: running this code returns an empty list. EAXY allows us to avoid scratching our heads over this:

System.out.println(email.find("recipients", "recipient").check().texts());

Now I get the following exception:

org.eaxy.NonMatchingPathException: Can't find 
	{recipient} below [message, recipients].
	Actual elements: [Element{recipent}, Element{recipent}]

As you can see, we misspelled “recipent” in the message. Let’s get back to this problem later, but for now, let’s work around it to create something meaningful:

for (Element recipient : email.find("recipients", "recipent")) {
    if ("to".equals(recipient.attr("role"))) {
        System.out.println(recipient.text());
    }
}

Again, I think this is about as fluent as Java’s syntax allows.
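
The fail-fast behaviour of check() shown earlier in this section can be approximated over the plain DOM API. The following is a rough sketch under stated assumptions, not how EAXY implements it: StrictPath, findOrFail and parse are hypothetical names, Element here is org.w3c.dom.Element rather than EAXY's class, and the exception type and message format are only modelled loosely on the output above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: walks a path of child element names and fails loudly on a miss.
final class StrictPath {
    /** Returns the direct child elements of the given parent. */
    static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                result.add((Element) n);
            }
        }
        return result;
    }

    /** Follows the path below root; throws if any step matches nothing. */
    static List<Element> findOrFail(Element root, String... path) {
        List<Element> current = Collections.singletonList(root);
        List<String> walked = new ArrayList<>();
        walked.add(root.getTagName());
        for (String step : path) {
            List<Element> next = new ArrayList<>();
            List<String> actual = new ArrayList<>();
            for (Element parent : current) {
                for (Element child : childElements(parent)) {
                    actual.add(child.getTagName());
                    if (child.getTagName().equals(step)) {
                        next.add(child);
                    }
                }
            }
            if (next.isEmpty()) {
                // Report what was searched for, where, and what was actually there.
                throw new IllegalArgumentException("Can't find {" + step + "} below "
                        + walked + ". Actual elements: " + actual);
            }
            walked.add(step);
            current = next;
        }
        return current;
    }

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The design point is that the error message lists the element names that were actually present, which is what makes a one-letter typo visible at a glance.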

3. Validation and namespaces

So, we had a message where one of the element names was misspelled. If you have an XSD document for the XML you’re using, you can validate the document against it. However, as you come to expect from Java XML libraries, the act of performing this validation is well hidden behind complex APIs. So I’ve provided a little help:

Xml.validatorFromResource("mailmessage.xsd").validate(email);

This reads the mailmessage.xsd from the classpath, which is the most common use case for me.
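
For comparison, this is roughly what the same check looks like with the JDK's own javax.xml.validation API. It is a sketch under stated assumptions: the class XsdCheck and the inline schema are made up for illustration (the real mailmessage.xsd is not shown in this article), and the schema is deliberately tiny, allowing only a message with a list of recipient elements.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Illustration only: validates an XML string against a small, made-up schema.
final class XsdCheck {
    // Hypothetical schema; NOT the article's mailmessage.xsd.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"message\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"recipients\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"recipient\" type=\"xs:string\" maxOccurs=\"unbounded\"/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    /** Returns null when the XML is valid, otherwise the validator's error message. */
    static String validationError(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, the first validation error is thrown.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Passing a document whose child element is spelled "recipent" returns a non-null error message, while the correctly spelled version returns null. In a real project the schema would more likely be loaded from the classpath than from an inline string.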
