The Extensible Markup Language, or XML, has gained widespread popularity as a way to represent data in a portable, vendor-neutral, readable format. Many software vendors have announced "support for XML," usually meaning their products will produce or consume XML data.
XML is also being viewed as the lingua franca for data exchange between enterprises. It allows enterprises to agree on XML Document Type Definitions (DTDs) for the data being exchanged. These DTDs are independent of the database schemas the enterprises use internally.
Standards groups representing almost every human endeavor are agreeing on DTDs for exchanging data. One of many examples is the International Press Telecommunications Council (see Resources), which has defined an XML DTD that allows "news information to be transferred with markup and be easily transformed into an electronically publishable format." Such vertical market standards will allow diverse applications to exchange data in unforeseen ways.
But what good are portable, vendor-neutral data if you don't share and process them? To be useful, XML must be communicated between computers and processed on each of them. An application that does so is, in fact, a distributed application.
This article explores such distributed applications written in Java. I'll focus on the communication of XML between Java code running in different virtual machines.
The communication of XML
The XML specification, published by the World Wide Web Consortium, or W3C (see Resources), defines the syntax and semantics of the language. Before an XML document can be processed, it must be parsed. Given the complexity of XML's syntax and semantics, it would be regrettable if every Java class that needed to process XML had to parse the document itself. To solve this problem, the W3C has defined the Document Object Model (DOM) (see Resources). The DOM is an application programming interface (API) to XML data, with bindings for many programming languages, including Java. An XML parser produces a DOM representation of an XML document, and Java programs then access the XML data through the DOM API.
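To make the parser-to-DOM step concrete, here is a minimal sketch. It assumes the JAXP classes that ship with the JDK (javax.xml.parsers), which this article does not otherwise name; any DOM-producing parser would serve. The class name DomParseSketch and the sample order document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of any library.
class DomParseSketch {

    // Parses XML text and returns its DOM representation.
    // Checked parser exceptions are wrapped so callers stay simple.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample document, used only for this sketch.
        Document doc = parse("<order id=\"42\"><item>widget</item></order>");

        // From here on the code works with the DOM API, not with XML syntax.
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName());
        System.out.println(root.getAttribute("id"));
        System.out.println(root.getElementsByTagName("item").item(0).getTextContent());
    }
}
```

Only parse() touches the XML text. Everything after it navigates elements and attributes through the DOM interfaces, which is the division of labor the DOM is meant to provide.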
Figure 1 illustrates a simplified model of a Java distributed application that processes XML. The model is sufficient for the purpose of this article: to explore the communication of XML. The model assumes that some data are obtained from a data source such as a relational database. Some Java code processes the data and ultimately produces a DOM representation. This code is represented in Figure 1 as the processor.
The processor code passes the DOM representation of the XML data to the sender. The sender is Java code that communicates the XML data to the receiver. The receiver is Java code that receives the XML data, produces a DOM representation of the data, and passes it to another processor. In short, the sender and the receiver abstract the communication of the DOM representation of XML data.
The sender and the receiver do not run in the same Java virtual machine; they are connected by a distributed system infrastructure. There are several approaches to implementing the sender and the receiver.
Note that in the model in Figure 1, the sender is a client of the receiver: the sender passes the XML to the receiver. In another possible model, the receiver is the client; it requests the document from the sender. I will not explore the second model in this article because the issues involved in communicating XML are similar in both.
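The roles in this model can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the Sender and Receiver interfaces, the class name XmlPipelineSketch, and the helper methods are hypothetical, and a direct in-process method call stands in for the distributed system infrastructure that would connect the two sides in a real deployment. The sketch uses the JDK's JAXP classes to turn a DOM tree into XML text and back.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class; the names below are not from any library.
class XmlPipelineSketch {

    // Sender role: takes a DOM tree from a processor and communicates it.
    interface Sender {
        void send(Document doc);
    }

    // Receiver role: takes XML text and hands a DOM tree to a processor.
    interface Receiver {
        void receive(String xmlText);
    }

    // Serializes a DOM tree to XML text with an identity transform.
    static String toText(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize DOM", e);
        }
    }

    // Parses XML text back into a DOM tree.
    static Document toDom(String xmlText) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlText)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Receiver that rebuilds the DOM and passes it to the next processor.
    static Receiver receiverFor(Consumer<Document> processor) {
        return xmlText -> processor.accept(toDom(xmlText));
    }

    // Sender acting as a client of the receiver. The direct call is a
    // stand-in (an assumption of this sketch) for a real transport.
    static Sender senderTo(Receiver receiver) {
        return doc -> receiver.receive(toText(doc));
    }

    public static void main(String[] args) {
        Document original = toDom("<order id=\"42\"><item>widget</item></order>");
        Receiver receiver = receiverFor(doc ->
                System.out.println("received root: " + doc.getDocumentElement().getTagName()));
        senderTo(receiver).send(original);
    }
}
```

The processors on either side see only DOM trees. Swapping the direct call inside senderTo() for a real transport would not change them, which is the sense in which the sender and the receiver abstract the communication.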