This article's example document does not have a URL as its namespace identifier; instead, it has a made-up URN. Though unusual, it helps to show that the namespace identifier is just that: an identifier. In a real application, our root element would probably read:
<order xmlns="http://www.mycompany.com/xml/myproject"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.mycompany.com/xml/myproject file:./test.xsd">
An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types
and attribute names. A namespace in XML is a bit like a package in Java. It groups a set of elements together. The type user in the urn:nonstandard:test namespace differs from a type user in any other namespace.
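The package analogy can be made concrete with javax.xml.namespace.QName, which treats a name as a pair of namespace URI and local part. This is only a small sketch: the class name NamespaceNames and the second namespace, urn:example:other, are made up for illustration.

```java
import javax.xml.namespace.QName;

class NamespaceNames {

    // Two names are the same only if both the namespace URI and the local part match.
    static boolean sameName(String ns1, String local1, String ns2, String local2) {
        return new QName(ns1, local1).equals(new QName(ns2, local2));
    }

    public static void main(String[] args) {
        QName ours = new QName("urn:nonstandard:test", "user");
        QName theirs = new QName("urn:example:other", "user"); // hypothetical namespace

        System.out.println(ours);                 // prints {urn:nonstandard:test}user
        System.out.println(ours.equals(theirs));  // prints false
        System.out.println(sameName("urn:nonstandard:test", "user",
                                    "urn:nonstandard:test", "user")); // prints true
    }
}
```

Just as two Java classes called User in different packages are unrelated, the two user names above never compare equal.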
Only one namespace can be the default—the others must be given a prefix. The xmlns attribute (which comes from the XML Namespaces Recommendation) defines the default namespace—i.e., the namespace for unprefixed
elements. The form xmlns:xsd defines the namespace for entries prefixed with xsd (xsd is commonly used for the schema prefix, but any prefix would do).
When defining a schema, we refer to our own types (Order, User, Product, etc.) and use types from the schema namespace (element, complexType, string, etc.). For this reason, we usually prefix the schema namespace. We could also prefix our types instead and use the schema
namespace unprefixed. The first part of our schema would then look like this:
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="urn:nonstandard:test"
        elementFormDefault="qualified"
        xmlns:ts="urn:nonstandard:test">
  <element name="order" type="ts:Order" />
  <complexType name="Order">
    <sequence>
      <element name="user" type="ts:User" minOccurs="1" maxOccurs="1" />
      <element name="products" type="ts:Products" minOccurs="1" maxOccurs="1" />
    </sequence>
  </complexType>
(...)
Prefixed names are called qualified names. They contain a single colon separating the name into a namespace prefix and a local part. The prefix, which is mapped to a URI reference, selects a namespace.
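To see how a parser splits a qualified name, we can ask a namespace-aware DOM parser for the three parts of an element's name. The sketch below is illustrative only; the class name QualifiedNameParts and its helper method are invented here, while the DOM calls (getPrefix, getLocalName, getNamespaceURI) are standard.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class QualifiedNameParts {

    // Parses a small document and describes its root element as
    // "prefix | local part | namespace URI".
    static String describeRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, prefixes are not resolved
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return root.getPrefix() + " | " + root.getLocalName() + " | " + root.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prefixed root: prints ts | order | urn:nonstandard:test
        System.out.println(describeRoot("<ts:order xmlns:ts='urn:nonstandard:test'/>"));
        // Default namespace, no prefix: prints null | order | urn:nonstandard:test
        System.out.println(describeRoot("<order xmlns='urn:nonstandard:test'/>"));
    }
}
```

Both documents put order in the same namespace; only the prefix differs, which is why the choice of prefix does not matter to a schema.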
When writing a schema, we define new elements and attributes. The targetNamespace attribute specifies the namespace these new elements will be a part of. An XML document that conforms to this schema will
import that namespace (via an xmlns or xmlns:prefix attribute).
The xmlns:xsi attribute simply imports a namespace and maps it to the xsi prefix. The namespace here is a special one: the XML Schema instance namespace. Every XML document that conforms to XML Schema
imports that namespace. The XMLSchema-instance schema declares only four attributes: type, nil, schemaLocation, noNamespaceSchemaLocation.
The schemaLocation attribute indicates where to find the schema to validate each namespace. The format is the namespace, a space, and the URL.
Several namespace/URL pairs can be listed, separated by whitespace. Since we are only interested in validating our namespace, we just declare
the location of the schema for urn:nonstandard:test—in this case, a file called test.xsd in the current directory (the schema we saved earlier). In a real application, the location would usually be a publicly accessible
URL. schemaLocation just provides a hint to the parser; if the parser is given a different schema by the code invoking it, it will use that schema,
not schemaLocation's.
If the XML document we want to validate comes in at the interface between our application and the external world, we will probably want to use our own copy of the schema for validation. For internal documents, trusting the document's header is probably okay.
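One way to validate against our own copy of the schema is the javax.xml.validation API that ships with J2SE 5. The sketch below is only an illustration under stated assumptions: the class name TrustedSchemaValidation is invented, and the inline one-element schema is a stand-in for test.xsd so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TrustedSchemaValidation {

    // Stand-in schema (an assumption); a real application would load its own test.xsd.
    static final String SCHEMA =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:nonstandard:test' elementFormDefault='qualified'>"
        + "<xsd:element name='order' type='xsd:string'/>"
        + "</xsd:schema>";

    // Returns true if the document validates against the schema supplied by our code.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            // With no error handler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The document's own hint points at a file that does not exist; the
        // schema given in code is the one we expect the validator to apply.
        String hinted = "<order xmlns='urn:nonstandard:test'"
            + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
            + " xsi:schemaLocation='urn:nonstandard:test file:./does-not-exist.xsd'>42</order>";
        System.out.println("hinted document valid: " + isValid(hinted));
        System.out.println("unknown root valid: " + isValid("<other xmlns='urn:nonstandard:test'/>"));
    }
}
```

Loading the schema in code means an incoming document cannot steer validation toward a schema of its own choosing.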
The targetNamespace is an attribute of the schema's root element, whereas schemaLocation belongs on the root element of the document being validated. An XML Schema document's root element (xsd:schema) typically includes at least:
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
A targetNamespace attribute
xmlns="sameAsTargetNamespace"
The elementFormDefault attribute indicates whether locally declared elements should be qualified (prefixed) or not. The following section describes
that attribute.
In our schema, the order entry is the only globally declared element. Every other element is local to a type. For example, user is local to the Order type. We could declare more global types and link to them. For example, the document's beginning could read:
(...)
<xsd:element name="order" type="Order" />
<xsd:element name="user" type="User" />
<xsd:element name="products" type="Products" />
<xsd:complexType name="Order">
  <xsd:all>
    <xsd:element ref="user" minOccurs="1" maxOccurs="1" />
    <xsd:element ref="products" minOccurs="1" maxOccurs="1" />
  </xsd:all>
</xsd:complexType>
(...)
In a DTD, all elements are global, which can make DTDs difficult to read. In a schema, declaring only the root element as global makes it easier to read.
A schema's root element can take the elementFormDefault attribute, which indicates whether locally declared elements should be qualified or unqualified. If elementFormDefault is unqualified (the default), our XML document will need to specify which namespace the global elements are in (remember our only global
element is the root node order), but not where the local elements are located. If elementFormDefault is unqualified, declaring a namespace for local elements will result in an error.
This document shows unqualified locally declared elements, which is how your documents will usually look:
<ts:order xmlns:ts="urn:nonstandard:test">
  <user>
    <!-- etc -->
  </user>
</ts:order>
This tells the parser that order is in the urn:nonstandard:test namespace and says nothing about user. Internally, order turns into urn:nonstandard:test:order, but user remains as is. It is not qualified by a namespace, but instead is assumed to be in the namespace of its first global parent—in
this case, order.
If, in the schema, we set elementFormDefault="qualified", we would have to do this:
<ts:order xmlns:ts="urn:nonstandard:test">
  <ts:user>
    <!-- etc -->
  </ts:user>
</ts:order>
In qualified mode, the parser does not assume anything about local elements—we must specify their namespaces too. Internally,
order becomes urn:nonstandard:test:order, as before, and user now becomes urn:nonstandard:test:user. The internal expansion of namespace prefixes is important, because it explains why the example below will only work if the schema is set as elementFormDefault='qualified':
<order xmlns="urn:nonstandard:test">
  <user>
    <!-- etc -->
  </user>
</order>
Here, we declare the default namespace as urn:nonstandard:test, so all elements are assumed to be in that namespace. It is an easier-to-read version of the example above, where we qualified everything.
If elementFormDefault had been left as unqualified (remember, that's the default) we would get the error:
error: cvc-complex-type.2.4.a: Invalid content was found starting with element 'user'. One of '{"":user, "":products}' is expected.
This error indicates that the parser was looking for an unqualified local element ("":user), but instead found an element qualified by the default namespace (urn:nonstandard:test:user).
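The effect of elementFormDefault can be checked by validating the same documents against two versions of a schema that differ only in that attribute. The following is a rough sketch rather than the article's schema: the class name ElementFormDefaultDemo is invented, and the schema is cut down to an order element containing a single user of type xsd:string.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ElementFormDefaultDemo {

    // Cut-down stand-in schema; 'form' is either "qualified" or "unqualified".
    static String schema(String form) {
        return "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
             + " xmlns='urn:nonstandard:test' targetNamespace='urn:nonstandard:test'"
             + " elementFormDefault='" + form + "'>"
             + "<xsd:element name='order' type='Order'/>"
             + "<xsd:complexType name='Order'><xsd:sequence>"
             + "<xsd:element name='user' type='xsd:string'/>" // local element
             + "</xsd:sequence></xsd:complexType>"
             + "</xsd:schema>";
    }

    // Validates xml against the schema built with the given elementFormDefault.
    static boolean isValid(String form, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.newSchema(new StreamSource(new StringReader(schema(form))))
                   .newValidator()
                   .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation errors surface as SAXException
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String defaultNs = "<order xmlns='urn:nonstandard:test'><user>u</user></order>";
        String prefixedRoot = "<ts:order xmlns:ts='urn:nonstandard:test'><user>u</user></ts:order>";

        System.out.println(isValid("qualified", defaultNs));      // prints true
        System.out.println(isValid("unqualified", defaultNs));    // prints false
        System.out.println(isValid("unqualified", prefixedRoot)); // prints true
        System.out.println(isValid("qualified", prefixedRoot));   // prints false
    }
}
```

The second case is the one that produces the cvc-complex-type.2.4.a error quoted above: the default namespace qualifies user, but the schema expects it unqualified.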
The schema element can also be given the attributeFormDefault attribute, which behaves exactly like elementFormDefault, but for attributes.
Now that we have an XML file and a schema we understand, let's validate the first against the second.
We will use the Java API for XML Processing to find a parser, which we will use to validate the XML, then W3C DOM (Document Object Model) to look at our document. JAXP is an API for finding a parser and has shipped as part of Java since version 1.4 (prior versions of Java must use a separate download). JAXP allows our code to be parser-independent. J2SE 5's default parser is Xerces 2.6.2. J2SE 1.4's default parser is Crimson (Crimson cannot validate XML Schema). Other parsers include Aelfred and Oracle's parser, XDK (short for XML Developer Kit).
If a different parser is on the classpath, JAXP will automatically use that parser. For example, if you include Oracle's parser
on the classpath, the DocumentBuilderFactory you get from DocumentBuilderFactory.newInstance() will be an Oracle implementation, instead of one based on Xerces, which is packaged into J2SE 5.
If you have J2SE 5, the code below should work as is. If you have 1.4, then download either Xerces or Oracle XDK, and make sure it is on your classpath. Both those parsers are XML Namespaces and XML Schema-aware. If you have an earlier version of Java, you'll also need to download JAXP.
This class takes an XML file as a command line argument, validates it using a parser obtained via JAXP, and prints the name of the XML document's root node:
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class XmlTester {

    public static final void main(String[] args) {
        if ( args.length != 1 ) {
            System.out.println("Usage: java XmlTester myFile.xml");
            System.exit(-1);
        }
        String xmlFile = args[0];
        try {
            new XmlTester(xmlFile);
        }
        catch (Exception e) {
            System.out.println( e.getClass().getName() +": "+ e.getMessage() );
        }
    }

    public XmlTester(String xmlFile) throws ParserConfigurationException, SAXException, IOException {
        // JAXP returns whichever parser implementation it finds
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("DocumentBuilderFactory: "+ factory.getClass().getName());

        // Schema validation only works on a namespace-aware parser
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        // Validate against XML Schema rather than a DTD
        factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");

        // Specify our own schema - this overrides the schemaLocation in the xml file
        //factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource", "file:./test.xsd");

        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler( new SimpleErrorHandler() );

        Document document = builder.parse(xmlFile);

        // getDocumentElement() returns the root element even when a comment or
        // DOCTYPE comes first, which getFirstChild() would return instead
        Node rootNode = document.getDocumentElement();
        System.out.println("Root node: "+ rootNode.getNodeName() );
    }
}
For the above code to work, we also need an error handler:
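A minimal sketch of such a handler, assuming all it needs to do is report each problem with its line number and stop on fatal errors, could look like the following. The method bodies and the describe helper are assumptions made for illustration; only the class name SimpleErrorHandler and the org.xml.sax.ErrorHandler interface come from the code above and the SAX API.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class SimpleErrorHandler implements ErrorHandler {

    // Formats a parse problem as "line N: message" (hypothetical helper).
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    // Warnings do not affect validity; just report them.
    public void warning(SAXParseException e) {
        System.out.println("Warning, " + describe(e));
    }

    // Validation errors (such as cvc-* messages) arrive here; report and continue
    // so that every problem in the document is listed.
    public void error(SAXParseException e) {
        System.out.println("Error, " + describe(e));
    }

    // Fatal errors mean the document is not well-formed; report and stop parsing.
    public void fatalError(SAXParseException e) throws SAXException {
        System.out.println("Fatal error, " + describe(e));
        throw e;
    }
}
```

Without a handler, many parsers report validation errors only as a console message or not at all, so registering one is what makes schema violations visible to our code.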