XML and Java: A powerful combination

Why JavaOne's session on 'XML in the Java Platform' drew the crowds

XML was a well-received new specification when the World Wide Web Consortium (W3C) first introduced it in November 1996. Now, combining it intrinsically with Java -- that is, making XML-specified code part of a Java program as well as encoding Java semantics (or behavior) into XML markup -- promises to deliver easier and more innovative application computing to the enterprise and beyond. According to Larry Cable and Mark Reinhold, senior staff engineers at Sun and presenters of the "XML in the Java Platform" technical session at the recent JavaOne Developer Conference, the main reason to pay attention to this technological marriage is twofold: XML-based syntax offers a flexible, standard, and robust solution for Java programming, and, conversely, Java applies a universal set of semantics to XML data.

Why XML?

Most information available in the electronic world isn't stored or presented in images, 3D graphics, movies, sound, or any other impressive multimedia format. It exists, instead, in the form of character-based text -- on the Web, in databases, and elsewhere -- and text is most likely here to stay. XML allows developers to contextualize and interpret their data within a standard structure, so that one set of XML-framed data can be combined with another without rebuilding the entire structure with each new addition or modification.

How Java fits into the picture

XML provides a universal syntax for Java semantics (behavior). Simply put, this means that a developer can create descriptions for different types of data, use Java programming code to make the data behave in various ways, and later reuse and modify those descriptions. Since XML and Java are both portable standards, combining the two technologies yields portable, reusable data and portable behavior.

The potential of XML and Java has not been fully tapped even when each is used alone; to combine the two is to enter largely uncharted territory. Right now, the two main application areas of XML in Java are presentation-oriented publishing and enterprise message-oriented middleware. Specifically, XML can be combined with Java to produce such applications as complex Web documents, dynamic publishing, e-commerce, enterprise application integration, and structured information management and retrieval.

The XML standard extension

The XML standard extension is the basic plumbing that translates XML syntax into Java. The technical and structural details of the plumbing are still being hashed out, but by the end of 1999 Java developers should be able to use the standard extension to build XML-oriented applications. The standard extension involves a number of components: a parser, namespace support in the parser, the Simple API for XML (SAX), and the Document Object Model (DOM).

A parser is a software module that reads an XML document, checks that it is well formed, and, if it is a validating parser, also checks the document against its document type definition (DTD). A namespace, defined in the W3C's Namespaces in XML recommendation, identifies a distinct set of XML markup names, so that elements and attributes from different vocabularies cannot be confused with one another. The major benefit of namespaces is that they allow multiple vocabularies -- different sets of markup that behave in different ways -- to be combined in a single document instance. To exploit this benefit, a parser must support namespaces. Sun Microsystems is still working on this support.
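To make the point about mixed vocabularies concrete, here is a minimal sketch of namespace-aware parsing. It assumes the javax.xml.parsers API that ships with current Java releases, which may differ from the draft extension discussed in this article. The two namespace URIs and the sample document are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical namespace URIs, invented for this example.
    static final String BOOK_NS = "http://example.com/ns/book";
    static final String PRICE_NS = "http://example.com/ns/price";

    // Two vocabularies both define <title>; the namespace tells them apart.
    static final String XML =
        "<b:book xmlns:b='" + BOOK_NS + "' xmlns:p='" + PRICE_NS + "'>"
      + "<b:title>XML and Java</b:title>"
      + "<p:title>List price</p:title>"
      + "</b:book>";

    // Counts elements with the given local name in the given namespace
    // ("*" matches every namespace).
    static int countElements(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace support is off by default
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements(XML, BOOK_NS, "title"));  // prints 1
        System.out.println(countElements(XML, PRICE_NS, "title")); // prints 1
        System.out.println(countElements(XML, "*", "title"));      // prints 2
    }
}
```

Without the call to setNamespaceAware(true), the parser would treat b:title and p:title as plain names, and a lookup by namespace would not work as intended.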

SAX is a freely available, platform- and language-neutral API for event-based XML parsing. Rather than building a model of the whole document, a SAX parser reports each piece of the document (the start of an element, a run of character data, the end of an element) to the application as it reads it. It therefore acts as a middleware layer that translates the data within an XML document into corresponding Java events.
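The event-based style is easiest to see in code. The sketch below assumes the org.xml.sax and javax.xml.parsers packages found in current Java releases. The class name, the event labels, and the sample document are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventDemo {
    // Records each parsing event as a short string, in the order the parser reports it.
    static List<String> collectEvents(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser is allowed to deliver one run of text in several calls.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        for (String event : collectEvents("<order><item>pen</item></order>")) {
            System.out.println(event);
        }
    }
}
```

Because the handler sees the document one event at a time, nothing forces the program to hold the whole document in memory; the application decides what to keep.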

DOM provides a standard tree-based data structure interface to an XML document, modeling the XML data as objects and allowing programs and scripts (a Java program, for example) to dynamically access, combine, and update the content and structure of the document.
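As a contrast with event-based parsing, the following sketch loads a whole document into a DOM tree, adds a node, and queries the result. It assumes the org.w3c.dom and javax.xml.parsers packages in current Java releases. The order and item element names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomTreeDemo {
    // Parses a document, appends one more <item> under the root,
    // and returns how many <item> elements the tree then holds.
    static int addItemAndCount(String xml, String newItemText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // Nodes are ordinary objects: create one, fill it, attach it.
            Element item = doc.createElement("item");
            item.setTextContent(newItemText);
            root.appendChild(item);

            return root.getElementsByTagName("item").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document with one item; one more is added.
        System.out.println(addItemAndCount("<order><item>pen</item></order>", "ink")); // prints 2
    }
}
```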

The public draft of the XML standard extension specification and an alpha release are due in the third quarter of this year; Sun plans to publish the final specification and release during the fourth quarter.

After these issues are ironed out, Sun is planning to consider support for the transformation language and stylesheets (Extensible Stylesheet Language -- XSL) and for the XML Query Language, which allows XML documents to be searched.

XML data binding standard extension

Sun and other XML-Java pioneers believe the XML standard extension alone isn't sufficient for using XML and Java together effectively. Why? Although XML can provide the syntax for data that Java uses and acts upon, the syntax of an XML message carries no meaning of its own: it says nothing about what a particular piece of data is or how it fits into the entire informational system. The data binding standard extension uses schemas, a companion specification to XML, which describe the specific structure and datatypes used by XML documents. Java programmers may relate to this analogy: an XML message follows an XML schema in much the same way that a Java object is an instance of a class. Schemas add meaning to XML documents and data by constraining their structure and content (thereby enabling automatic validation), and by describing the conceptual meaning so that a person, and not only a computer, can understand what is intended by the structure just by looking at it.
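The idea that a schema constrains both structure and datatype can be sketched with automatic validation. This example assumes the W3C XML Schema language and the javax.xml.validation API, both of which were finalized after this article was written, so it illustrates the concept and not the facility Sun describes here. The one-element schema is invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaCheckDemo {
    // Hypothetical schema: a <quantity> element whose content must be a positive integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='quantity' type='xs:positiveInteger'/>"
      + "</xs:schema>";

    // Returns true if the message conforms to the schema above.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the message violates the schema's structure or datatype
        } catch (IOException e) {
            throw new IllegalStateException("could not read the message", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<quantity>3</quantity>"));    // prints true
        System.out.println(isValid("<quantity>many</quantity>")); // prints false
    }
}
```

In terms of the class-and-instance analogy, the schema plays the role of the class, and each message that passes validation is a legitimate instance of it.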

Understanding schemas is just the beginning of data binding. According to Cable and Reinhold, it's just as important to map XML message components to objects (called unmarshaling) as it is to map objects back to XML (marshaling) in order to get the most out of the XML-Java structural and programmatic hybrid. Still in the wings are classes for XML message components that contain this marshaling and unmarshaling code. Cable and Reinhold feel that neither SAX nor DOM solves this problem.

Binding, however, does solve it, at least in theory. Binding compiles XML schemas into Java classes, allowing objects to be marshaled to and unmarshaled from XML messages at will. The generated classes contain both marshaling and unmarshaling code (thereby allowing for full error- and validity-checking) and component access methods (get and set), so that data mutators are automatically made consistent with the schema. Binding XML to Java programs therefore eliminates the need to write unmarshaling code by hand, as well as the possibility of entering invalid data. Sun engineers plan to add such a binding facility to the Java platform, although no specific date has been announced yet.
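Since no binding facility has been released, the class below is only a hand-written guess at the kind of code a binding compiler might generate. Every name in it (Item, sku, quantity, marshal, unmarshal) is hypothetical. It assumes a schema that declares an item element with a sku attribute and a positive-integer quantity child.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a generated class; not the output of any real tool.
class Item {
    private String sku;
    private int quantity;

    Item(String sku, int quantity) {
        setSku(sku);
        setQuantity(quantity);
    }

    String getSku() { return sku; }
    int getQuantity() { return quantity; }

    // The mutators enforce the (assumed) schema constraints.
    void setSku(String sku) {
        if (sku == null || !sku.matches("[A-Za-z0-9-]+")) {
            throw new IllegalArgumentException("sku must be letters, digits, or hyphens");
        }
        this.sku = sku;
    }

    void setQuantity(int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        this.quantity = quantity;
    }

    // Marshal: object to XML text. The sku pattern above rules out
    // characters that would need escaping.
    String marshal() {
        return "<item sku=\"" + sku + "\"><quantity>" + quantity + "</quantity></item>";
    }

    // Unmarshal: XML text to object. Invalid messages are rejected because
    // construction goes through the validating mutators.
    static Item unmarshal(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            String qty = root.getElementsByTagName("quantity").item(0).getTextContent().trim();
            return new Item(root.getAttribute("sku"), Integer.parseInt(qty));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid <item> message", e);
        }
    }

    public static void main(String[] args) {
        Item copy = Item.unmarshal(new Item("A-1", 2).marshal());
        System.out.println(copy.getSku() + " x " + copy.getQuantity()); // prints A-1 x 2
    }
}
```

The application code only touches get, set, marshal, and unmarshal; the parsing and the validity checks stay inside the class, which is the division of labor that binding promises.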

Try it now

You can download the binary and source code for Technology Release 2 of Java Project X, the code name for a Java-based package of XML technology services. The Project X package includes a fast XML parser with optional validation and an in-memory object model tree that supports the W3C DOM Level 1 recommendation. According to Sun's Java Project X FAQ (see Resources), Project X is a "high performance, modular, and extensible Java API for developing XML-oriented applications and services." Try it and find out if it's true.

Sun and the industry look ahead to XML

In addition to Sun, a number of technology developers are bridging XML and Java for their partners and customers. For example, Bluestone Software, which specializes in enterprise interaction management, supports the Java standard extension for XML in its Bluestone XML Suite. NetPost, which develops and provides cross-media publishing solutions using a Java component model, uses XML as a "comprehensive standard data representation" for all the data in its information system. Oracle has developed a number of products in this area, including the XML Parser for Java, the XML Class Generator, and the XSL Processor for Java.

It is worthwhile for Java developers to learn -- or at least explore the possibilities of -- XML. Engineers at Sun are already using XML in the Java 2 Platform. For example, XML is employed in the JavaHelp API to describe metainformation; it is used to describe the deployment descriptor in Enterprise JavaBeans 1.1; and it provides the syntactic basis for the Java 2 Enterprise Edition programming model by interpreting Enterprise JavaBeans into JavaServer Pages and vice versa. Sun also actively participates in the W3C and other open XML communities, such as XML.org and OASIS (the Organization for the Advancement of Structured Information Standards).

An 11-year veteran of the Internet and former Internet technology consultant, Mariva H. Aviram is an independent writer covering the high-tech industry. Mariva's published works include articles in c|net, JavaWorld, NetscapeWorld, and InfoWorld. Mariva is also the author of XML For Dummies Quick Reference and Palm Computing for Dummies Quick Reference (publication pending). For more information, visit http://www.mariva.com/.
