

XML speeds along in standards land

Java's little brother has gained support from a number of competitors. Find out what this fresh-faced technology holds for you

The World Wide Web Consortium (W3C) was founded in 1994 to develop common protocols for the evolution of the World Wide Web. W3C is the international standards organization that brought you HTML. Currently, the W3C is reviewing, among other technologies, XML (eXtensible Markup Language) 1.0, a "meta-grammar" that allows for Web automation and data interchange across multiple platforms and applications.

So why should you, as a Java developer, be concerned with this emerging technology? Well, Java and XML complement each other. Java provides platform independence, XML provides application independence; Java gives the consumer a choice of platforms, XML gives the consumer a choice of applications. XML furthers the cause of Java by furthering the cause of consumer freedom.

Java provides a platform-independent coding environment, and XML provides a similar universality in how it expresses and formats data. In essence, XML provides a grammar that can be used to create self-describing data file formats. Thus Java can be viewed as the universal Virtual Machine, and XML can be viewed as the universal Virtual Document. Java is a perfect architecture- and vendor-neutral language for processing these architecture- and vendor-neutral documents.

Now I'm no XML expert, but I have been doing my homework -- reading the XML 1.0 specification, corresponding with members of the Working Group, and mulling over many of the XML FAQs and tutorials that are available. XML is chock-full of bewildering new acronyms and specialized language. My goal for this article is not to teach you XML -- you can study up on your own with one of several good tutorials (see Resources). Rather, my aim is to cut through as much of this language as possible and explain the significance of XML to you, the Java programmer.

Before I begin, I must acknowledge that the connections between Java and XML have been admirably elucidated by Jon Bosak in his seminal paper, "XML, Java, and the future of the Web." I simply aim to add my perspective on this emerging technology.

XML defined: the knit apparel analogy

XML is a simplified dialect of SGML (Standard Generalized Markup Language). For those of you unfamiliar with SGML, it is an international standard (ISO 8879) for describing the structure and content of documents in electronic form. XML simplifies SGML by capturing about 80 percent of SGML's functionality with only 20 percent of the complexity.

HTML, which is a description of the structure and content of a single type of document called a "Web page," is just one instance of what can be created with SGML. In other words, if HTML is a single knit sweater, SGML and XML are how-to books on knitting. By learning XML, you can create sweaters, socks, leg warmers, or any kind of knitted apparel you want!

As I noted earlier, XML currently is working its way through the W3C standards process. For more information on what this means for XML, see the sidebar W3C reviews XML.

Key characteristics

In my own (admittedly simplistic) reduction of the key characteristics of XML, I divide the capabilities offered into four categories. Simply put, XML is:
  • Structured
  • Self-describing
  • Extensible, and
  • Viewer adaptive


Let's look at each of these characteristics in more detail.

Structured
XML is an extremely structured language specification. Good XML can be both well-formed and valid. More on these features in a moment.

Like SGML, XML uses a DTD (document type definition) to define the syntax, grammar, and data structure of your documents. A DTD declares whether each of your elements is required, optional, or conditional (one of several alternatives) and whether an element must be empty. For each attribute, it declares whether the attribute is required, implied (optional), or supplied with a default value, and it can restrict the attribute to a fixed range of allowable values.
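To make those attribute rules concrete, here is a minimal Java sketch. It assumes the JAXP parser API that ships with current JDKs (it postdates this article), and the `book` document type, its attributes, and the class name are all hypothetical, invented for illustration. The DTD marks `isbn` as required, `edition` as implied, and gives `format` a default of `print`; the parser then supplies that default even though the document never mentions it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DtdDefaultsSketch {
    // Hypothetical document type; the DTD lives inside the DOCTYPE declaration.
    static final String BOOK =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE book [\n"
      + "  <!ELEMENT book (title, note?)>\n"      // title required, note optional
      + "  <!ELEMENT title (#PCDATA)>\n"
      + "  <!ELEMENT note EMPTY>\n"               // note must be an empty element
      + "  <!ATTLIST book isbn CDATA #REQUIRED\n"
      + "                 edition CDATA #IMPLIED\n"
      + "                 format (print|online) \"print\">\n"
      + "]>\n"
      + "<book isbn=\"0-000-00000-0\"><title>Knitting with XML</title></book>";

    // Parses the text and returns the named attribute of the root element
    // ("" when the attribute is absent and the DTD declares no default).
    static String rootAttribute(String xml, String name) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return root.getAttribute(name);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("isbn    = " + rootAttribute(BOOK, "isbn"));
        System.out.println("format  = " + rootAttribute(BOOK, "format"));  // default taken from the DTD
        System.out.println("edition = [" + rootAttribute(BOOK, "edition") + "]"); // implied and absent
    }
}
```

The point of the sketch is that the default lives in the DTD, not in the Java code, so every application that reads the document through a DTD-aware parser sees the same value.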

An XML parser checks two things. A document is well-formed if its markup obeys XML's syntax rules, such as every start tag having a matching, properly nested end tag; this check needs no DTD, and even one well-formedness error will prevent the entire document from being processed. A document is valid if, in addition, it conforms to its DTD in its entirety, with no variance allowed. A validating parser can take the DTD from declarations embedded in the document itself, from an external DTD referenced in the <!DOCTYPE> declaration (a declaration, not an HTML element), or from a combination of the two, while a nonvalidating parser can process a well-formed document without any DTD at all. Applications can layer further checks on top, using scripted business-logic rules or an externally defined set of processing instructions.
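The difference between well-formed and valid is easier to see in running code. The sketch below is only an illustration: it assumes the JAXP API bundled with current JDKs (which postdates this article), and the `note` document type and the `parses` helper are hypothetical names of my own. The same helper runs with validation switched off (a well-formedness check only) and switched on (the document must also match its DTD).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class XmlCheckSketch {
    // Hypothetical DTD, embedded in the document via the DOCTYPE declaration.
    static final String DTD =
        "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (to, body)>\n"
      + "  <!ELEMENT to (#PCDATA)>\n"
      + "  <!ELEMENT body (#PCDATA)>\n"
      + "]>\n";

    // Returns true if the text parses. With validate == false only
    // well-formedness is checked; with validate == true the document
    // must also conform to its DTD.
    static boolean parses(String xml, boolean validate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Treat validity errors as failures instead of logging and continuing.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String valid = DTD + "<note><to>Ann</to><body>Hi</body></note>";
        String wellFormedOnly = DTD + "<note><body>Hi</body></note>"; // <to> is missing
        String broken = "<note><to>Ann</note>";                       // <to> is never closed

        System.out.println(parses(valid, true));           // true
        System.out.println(parses(wellFormedOnly, false)); // true
        System.out.println(parses(wellFormedOnly, true));  // false
        System.out.println(parses(broken, false));         // false
    }
}
```

Note that the broken document fails even with validation off: a missing end tag is a well-formedness error, and no DTD is needed to detect it.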

So what do we Java hackers gain from a more rigid data structure? One significant benefit is the ease with which you can map a document's elements and attributes to database structures or object hierarchies. This gives you a reliable mechanism for passing documents between a client's viewer and a database, or for exporting data between two databases with a structured XML document as the intermediary. In other words, it gives us a reliable means of extracting information from documents (what we familiarly call parsing). Without well-formed documents, we would have to fall back on pattern matching to scan a poorly formed document for elements.
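Here is a rough sketch of that mapping in Java. Everything in it is assumed for illustration: the `order` and `item` element names, the `Order` and `LineItem` classes, and the use of the JAXP DOM API from current JDKs, which did not exist when this article was written. Because the document is well-formed, the code walks a tree of elements and attributes and never touches pattern matching.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OrderMappingSketch {
    // Hypothetical object hierarchy mirroring the document structure.
    static final class LineItem {
        final String sku;
        final int quantity;
        LineItem(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
    }

    static final class Order {
        final String id;
        final List<LineItem> items;
        Order(String id, List<LineItem> items) { this.id = id; this.items = items; }
    }

    // Maps <order id="..."><item sku="..." quantity="..."/>...</order> to an Order.
    static Order fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            List<LineItem> items = new ArrayList<>();
            NodeList nodes = root.getElementsByTagName("item");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element item = (Element) nodes.item(i);
                items.add(new LineItem(item.getAttribute("sku"),
                                       Integer.parseInt(item.getAttribute("quantity"))));
            }
            return new Order(root.getAttribute("id"), items);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not map document to an Order", e);
        }
    }

    public static void main(String[] args) {
        Order order = fromXml(
            "<order id=\"A1\"><item sku=\"X-9\" quantity=\"2\"/><item sku=\"Y-1\" quantity=\"5\"/></order>");
        for (LineItem item : order.items) {
            System.out.println(order.id + ": " + item.quantity + " x " + item.sku);
        }
    }
}
```

The same fields could just as easily be bound to the columns of a database row; the structured document is what makes either mapping mechanical.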

Another way of putting it is that the XML structure makes documents machine-readable. Enabling machines to read the Web allows for the automatic sharing of data among different companies through a standard format. Using a DTD, which describes the grammar of novel elements in a document, you can even connect different formats through a common description. For example, a medical document like a patient record might have allergies or blood pressure or other specialized data described as DTD-specified attributes.

This kind of sharing is ideal for EDI applications (Electronic Data Interchange) and supply-chain integration. One company, webMethods, is pioneering a Java example of this technique. Its Web Automation Toolkit is a 100% Pure Java way to integrate and aggregate Web-based data sources into applications of all kinds. You can download and try it free for 30 days (see Resources).

Self-describing
Another important value inherent in XML is the possibility of self-describing information. Although XML documents are not required to be self-describing (they are required only to be well-formed), descriptions add a level of power to Web automation and navigation. These descriptions are known as Metadata (data about data) and can contain such information about the document as security (who gets to read it), popularity, what the document is about, what language the document is in, who wrote it, or anything at all that describes the information. HTML has a facility for adding Metadata (the <META> tag), but the format for interchanging different Metadata attributes is poorly defined. For example, a site that uses the attribute "author" will not be able to share this with a site using the attribute "writer."
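The "author" versus "writer" problem is simple to reproduce. In this small Java sketch, the two sites, their values, and the `creatorOf` helper are all made up for illustration; each site's metadata is modeled as a plain map of attribute names to values, which is roughly all the <META> tag offers.

```java
import java.util.HashMap;
import java.util.Map;

class MetadataMismatchSketch {
    // A consumer that assumes every site names the creator "author".
    static String creatorOf(Map<String, String> metadata) {
        return metadata.get("author");
    }

    public static void main(String[] args) {
        Map<String, String> siteA = new HashMap<>();
        siteA.put("author", "Ann Lee");   // hypothetical site using "author"

        Map<String, String> siteB = new HashMap<>();
        siteB.put("writer", "Bo Kim");    // hypothetical site using "writer"

        System.out.println(creatorOf(siteA)); // Ann Lee
        System.out.println(creatorOf(siteB)); // null: same information, different name
    }
}
```

Nothing is wrong with either site's data; the lookup fails only because no shared description tells the consumer that the two names mean the same thing.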

Resources
  • W3C's XML 1.0 Proposed Recommendation http://www.w3.org/TR/PR-xml.html
  • Carl Davis's HTTP Explorers Client and Server http://homepage.interaccess.com/~cdavis/java/httpexplorer.html
  • XML Resources at finetuning.com http://www.finetuning.com/xml.html
  • Jon Bosak's "XML, Java, and the future of the Web" http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm
  • Web Interface Definition Language (WIDL) http://www.w3.org/TR/NOTE-widl.html
  • The HTTP Distribution and Replication Protocol http://www.w3.org/TR/NOTE-drp
  • The Open Software Description Format (OSD) http://www.w3.org/TR/NOTE-OSD.html
  • XML Enabled Mechanisms for Distributed Computing on the Web http://208.204.84.117/public/presentation/DocumationEast/
  • webMethods has been doing some serious XML experimentation. Check out its demo page http://www.webmethods.com/products/toolkit/userguide/demos.html
  • webMethods' WIDL White Paper, which was published in the W3C Journal http://webMethods.com/technology/automating.html
  • LT XML (version 0.9.5; release date August 21, 1997) http://www.ltg.ed.ac.uk/software/xml/
  • DataChannel's XML Viewer Applet http://208.204.84.117/XMLTreeViewer/deploy/index.html
  • A Proposal for XSL http://www.w3.org/TR/NOTE-XSL.html
  • CDF Submission http://www.w3.org/TR/NOTE-CDFsubmit.html
  • CSS2 Specification Release http://www.w3.org/TR/WD-CSS2/
  • Mathematical Markup Language http://www.w3.org/TR/WD-math/
  • Document Object Model (XML) Level 1 http://www.w3.org/TR/WD-DOM/level-one-xml-971209.html
  • An MCF Tutorial http://www.w3.org/TR/NOTE-MCF-XML/MCF-tutorial.html
  • Synchronized Multimedia Integration Language http://www.w3.org/TR/WD-smil
  • Parsers
  • An Introduction to XML Processing with Lark http://www.textuality.com/Lark/
  • Pax Syntactica http://208.204.84.117/XMLTreeViewer/
  • NXP - Norbert's XML Parser http://www.edu.uni-klu.ac.at/~nmikula/NXP/
  • Miko's previous articles
  • "The real future of Java" Where are we going with Java, and who is going to take us there?
  • "Demo or die! The quest for the killer app" An open call for keynote demo submissions for the 1998 JavaOne conference.
  • "Ultranet, the next network " Sun's Java Evangelist provides his unique perspective on the four stages of the growth of the network. Read this account of how a bleeding-edge federation of startup companies strives to reinvent the next-generation network.