Breaking news in XML

Despite the scanty turnout, the recent XTech 2000 show produced several important XML/Java-related announcements

As is the case in most interesting arguments, neither camp is entirely wrong. XML could benefit from simplification, most notably in the area of notations -- an SGML holdover for binary (multimedia) objects that is better handled with MIME standards.

On the other hand, anyone attempting to simplify the standard needs to ask a fundamental question: what is the minimal subset of XML we must keep so that the XML-based tools we have developed, and still need to develop, continue to work? For example, the RELAX schema specification, discussed in more detail below, uses attributes. So perhaps attributes really need to be retained. The RELAX presentation also implied the need for mixed content, so perhaps that is necessary as well.

One questioner pointed out that CDATA sections require a lot of parser code, which makes XML processing difficult. But without CDATA sections, how could you put a line drawing into XML? Would you be forced to use graphic-authoring tools? More importantly, since Extensible Stylesheet Language (XSL) uses CDATA sections for embedding processing scripts in a stylesheet, CDATA would appear to be necessary there as well.
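To make the line-drawing case concrete, here is a minimal sketch of a CDATA section carrying a small text drawing through a standard Java DOM parser. The `figure` element, the drawing, and the `CdataDemo` class are invented for illustration; the point is only that `<` and `&` survive untouched inside the section, where they would otherwise need escaping.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the element name and drawing are made up.
class CdataDemo {

    // Parses a document whose only content is a CDATA section and
    // returns the text the parser hands back for the root element.
    static String drawingText() {
        String xml =
            "<figure><![CDATA[\n" +
            "  +--<--+\n" +
            "  | a&b |\n" +
            "  +-->--+\n" +
            "]]></figure>";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // The CDATA content comes back verbatim, markup characters included.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("unexpected parse failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(drawingText());
    }
}
```

Without the CDATA wrapper, the same drawing would have to be written with `&lt;` and `&amp;` in place of the raw characters, which is exactly the authoring burden the question above is getting at.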

Similarly, without external-parsed entities, how would you reference material from another document and include it inline? To be fair, SML is targeted at a world of pure data, rather than at defining a general-purpose standard useful for both data and documents, as XML is. For that purpose, then, perhaps external references are not required. (The RELAX schema standard described below doesn't include them, either, so perhaps they really are more trouble than they are worth.)
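As a rough illustration of what an external parsed entity buys you, the sketch below declares one and lets a standard Java DOM parser pull its content inline. Everything named here is an assumption made for the example: the `book` and `para` elements, the `chapter.xml` system identifier, and the `ExternalEntityDemo` class. To keep the example self-contained, an `EntityResolver` supplies the "other document" from a string instead of reading a real file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; element names and chapter.xml are made up.
class ExternalEntityDemo {

    // Parses a document that references an external parsed entity and
    // returns the text of the root element after the entity is expanded.
    static String includedText() {
        String xml =
            "<!DOCTYPE book [\n" +
            "  <!ENTITY chapter SYSTEM \"chapter.xml\">\n" +
            "]>\n" +
            "<book>&chapter;</book>";
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Stand-in for the second document: serve its content from memory
            // rather than from disk. Returning null falls back to the default.
            builder.setEntityResolver((publicId, systemId) ->
                systemId != null && systemId.endsWith("chapter.xml")
                    ? new InputSource(new StringReader("<para>Included text</para>"))
                    : null);
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("unexpected parse failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(includedText());
    }
}
```

The parser, not the application, does the inclusion: by the time the DOM is built, the `para` element sits inside `book` as if it had been typed there. That convenience is also the extra machinery a pure-data subset would be dropping.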

So, while some simplification seems like a good idea, it is not clear that we know exactly which simplifications are in order. For the time being, then, we should probably let things sit -- right after we get rid of notations.

RELAX: Your schema is here

Members of the development community have been eagerly awaiting a schema standard they could sink their teeth into. Schemata perform serious data validation and are fundamental to the process of automatically generating Java classes for XML data. Consequently, the need for a schema standard is strong.

(For a more detailed discussion on the advantages of schemata over DTDs, see the Sidebar below.)

However, the hoped-for W3C XML Schema standard remains in development. The industry players who are developing the standard have a long list of must-have features. The eventual result, by all accounts, will turn out to be something of a monolith. It will do what everyone says they need it to do, but it's going to take a lot of complex code to do it, and it's taking quite a bit of time for it to take shape.

Meanwhile, a former member of the schema-standards team came up with a better way: the Regular Language description for XML, or RELAX. This Japanese standard is due to be submitted as a fast-track ISO proposal this summer. Makoto Murata, its author, took what appears in retrospect to be a simple idea: take the DTD, reformulate it in XML, take advantage of the structuring to provide context-sensitive definitions, and add the vitally important content validation.
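The phrase "content validation" is easier to appreciate with a concrete check. The sketch below is not RELAX: the standard JDK has no RELAX support, so it substitutes the W3C XML Schema validator from the `javax.xml.validation` package (a later addition to the Java platform) purely as a stand-in. The `quantity` element and the `ContentValidationDemo` class are invented for the example. A DTD could say only that `quantity` holds character data; a schema language can insist that the data be a positive integer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class. W3C XML Schema is used as a stand-in for RELAX.
class ContentValidationDemo {

    // A one-element schema: <quantity> must contain a positive integer.
    static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">" +
        "  <xs:element name=\"quantity\" type=\"xs:positiveInteger\"/>" +
        "</xs:schema>";

    // Returns true if the document passes validation, false if the
    // validator rejects it (or the schema itself fails to compile).
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("unexpected I/O failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<quantity>12</quantity>"));
        System.out.println(isValid("<quantity>twelve</quantity>"));
    }
}
```

Both documents are well formed, and both would satisfy a DTD that declares `quantity` as `(#PCDATA)`. Only the content check tells them apart, which is the kind of validation the paragraph above calls vitally important.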
