Integrating Excel and Word documents in Java applications has always been difficult. One option for reading and writing Word and Excel documents is to use an OLE 2 (Object Linking and Embedding) compound document reader like Jakarta POI (Poor Obfuscation Implementation) from the Apache Jakarta Project. POI does not yet have a mature direct API for Word access. Also, POI's underlying structure and architecture are complex and difficult to understand.
Another option for writing an Excel file is to write the output in CSV (comma-separated value) format. You can also read a CSV file from a Java application. The problem with CSV is that it does not allow you to embed formulas or formatting in your data. When reading data from a CSV file, you have the same limitations as when writing it, with the added hassle that the user must save her files in that format.
Another option is to use an appropriate JDBC (Java Database Connectivity) driver like the one offered by Vista Portal. Many commercial libraries are also available for reading and writing Excel documents. For reading and writing Word documents, the story is different. The main choices are POI or libraries like JavaBean word processing that are not pure Java solutions. Thankfully, with the new XML formats in Office 2003, reading and writing Excel and Word documents just became much easier.
SpreadsheetML and WordprocessingML are two new formats available in Office 2003. SpreadsheetML is an XML format for saving Excel documents, and WordprocessingML is the equivalent format for Word documents. Microsoft has published the schemas for these two formats (see Resources). To save a document as SpreadsheetML or WordprocessingML, all you have to do is select the .xml format in the Save As dialog of Excel or Word. Windows recognizes SpreadsheetML and WordprocessingML files as Excel or Word documents. So, if you open a SpreadsheetML or WordprocessingML document, Windows will open it with Excel or Word.
In this article, I introduce the basic tags of SpreadsheetML and WordprocessingML. I also provide examples of how to use these elements in a Java application. The article's first section introduces WordprocessingML and its basic elements. The second section presents SpreadsheetML and examples of how to use it. The third section recommends a few applications for WordprocessingML and SpreadsheetML and also presents some final thoughts.
Note: You can download this article's complete source from Resources.
WordprocessingML is an XML format for reading and writing Word documents. To keep with tradition, let's start with a Hello World example:
<?xml version="1.0"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body>
<w:p>
<w:r>
<w:t>Hello JavaWorld.</w:t>
<w:br/>
<w:t>This is a great reporting tool.</w:t>
</w:r>
</w:p>
</w:body>
</w:wordDocument>
Figure 1 shows the output of the XML code in Word.

Figure 1. Result of WordprocessingML code
The code above presents a Word document with the statement "Hello JavaWorld. This is a great reporting tool." Let's analyze the code's important elements:
<?mso-application progid="Word.Document"?> is a processing instruction that tells Windows to treat this XML file as a Word document. You must put it in the document for it to be recognized by Word and Windows.
The text-related elements for WordprocessingML have the prefix w.
<wordDocument> defines a Word document. You can see that w's namespace is defined here.
<body> is a container for text in the document.
<p> defines a paragraph.
<r> defines a run of other elements. Inside this tag, you should place tags like <t>, <br>, and so on. All elements inside an <r> element can share a common set of properties.
<t> is the element that encloses actual text.
<br> inserts breaks inside paragraphs. The <br> element has a type attribute, whose value can be page, column, or text-wrapping.
Those are some of WordprocessingML's basic text and text-structure elements. For more information on them, please refer to the appropriate schemas and manuals in Resources.
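To make the element roles above concrete, here is a minimal Java sketch that reads a WordprocessingML document with the JDK's DOM parser and collects the content of every <t> element. The class and method names (WordMlTextReader, readText) are hypothetical and not part of this article's download; the namespace URI is the one from the Hello World listing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class WordMlTextReader {
    // Namespace bound to the w prefix in the Hello World example.
    static final String W_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Returns the content of every w:t element, in document order. */
    static List<String> readText(String wordMl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wordMl)));
            NodeList textNodes = doc.getElementsByTagNameNS(W_NS, "t");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < textNodes.getLength(); i++) {
                result.add(textNodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse WordprocessingML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
                + "<?mso-application progid=\"Word.Document\"?>"
                + "<w:wordDocument xmlns:w=\"" + W_NS + "\">"
                + "<w:body><w:p><w:r>"
                + "<w:t>Hello JavaWorld.</w:t><w:br/>"
                + "<w:t>This is a great reporting tool.</w:t>"
                + "</w:r></w:p></w:body></w:wordDocument>";
        // Prints each text run on its own line.
        readText(xml).forEach(System.out::println);
    }
}
```

The sketch ignores <br> elements and paragraph boundaries; a real reader would probably walk the <p> and <r> elements to preserve structure.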
WordprocessingML also has numerous formatting elements. Usually, the text and formatting elements have a property element that you put inside them. A property element's name is the name of the element for which it is a property plus Pr. So, for example, <p> has a <pPr> property element. Inside a property element, you place <style> elements. In the style element, you define font, font size, bold, italic, underline, etc., for the property. Before you use a style, you must define one. Below is an example of how to define a paragraph style called JavaStyle:
<w:styles>
<w:style w:type="paragraph" w:styleId="JavaStyle" w:default="on"/>
</w:styles>
Below is an example of how to use JavaStyle for the properties of a paragraph:
<w:p>
<w:pPr>
<w:pStyle w:val="JavaStyle"/>
</w:pPr>
<w:r>
<w:t>Applying styles.</w:t>
</w:r>
</w:p>
So, to use a style, you must create it inside the <styles> element. The <styles> element is outside the <body> element but inside the <wordDocument> element. Then, reference a style through its ID in a property element.
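The placement rules just described can be sketched from Java with the JDK's StAX XMLStreamWriter. This is only an illustration under stated assumptions: the class and method names (StyledWordMlWriter, write) are hypothetical, the style definition is the bare one from the listing above (no fonts or sizes), and the output has not been checked against Word itself.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper for illustration only.
final class StyledWordMlWriter {
    static final String W_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Writes a one-paragraph document whose paragraph references styleId. */
    static String write(String styleId, String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument("1.0");
            // Processing instruction that marks the file as a Word document.
            xml.writeProcessingInstruction("mso-application", "progid=\"Word.Document\"");
            xml.writeStartElement("w", "wordDocument", W_NS);
            xml.writeNamespace("w", W_NS);

            // Style definitions: inside wordDocument, outside body.
            xml.writeStartElement("w", "styles", W_NS);
            xml.writeEmptyElement("w", "style", W_NS);
            xml.writeAttribute("w", W_NS, "type", "paragraph");
            xml.writeAttribute("w", W_NS, "styleId", styleId);
            xml.writeAttribute("w", W_NS, "default", "on");
            xml.writeEndElement(); // styles

            xml.writeStartElement("w", "body", W_NS);
            xml.writeStartElement("w", "p", W_NS);
            // Property element: references the style through its ID.
            xml.writeStartElement("w", "pPr", W_NS);
            xml.writeEmptyElement("w", "pStyle", W_NS);
            xml.writeAttribute("w", W_NS, "val", styleId);
            xml.writeEndElement(); // pPr
            xml.writeStartElement("w", "r", W_NS);
            xml.writeStartElement("w", "t", W_NS);
            xml.writeCharacters(text); // escapes <, > and & as needed
            xml.writeEndElement(); // t
            xml.writeEndElement(); // r
            xml.writeEndElement(); // p
            xml.writeEndElement(); // body
            xml.writeEndElement(); // wordDocument
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write WordprocessingML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("JavaStyle", "Applying styles."));
    }
}
```

A streaming writer is used here because the document is produced top to bottom in one pass; a DOM tree would work equally well if the document had to be modified after construction.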
Creating a Java program to output or read WordprocessingML is not difficult. Java has great XML support. I discuss an example of how to create a SpreadsheetML document with a Java program in the next section.
SpreadsheetML is an XML format for reading and writing Excel documents; it's similar to WordprocessingML. Let's look at a quick example:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
<Worksheet ss:Name="TestSheet">
<ss:Table>
<ss:Row ss:Index="1">
<ss:Cell>
<ss:Data ss:Type="Number">121</ss:Data>
</ss:Cell>
</ss:Row>
<ss:Row ss:Index="2">
<ss:Cell>
<ss:Data ss:Type="String">121 is 11*11</ss:Data>
</ss:Cell>
</ss:Row>
</ss:Table>
</Worksheet>
</Workbook>
Figure 2 shows the example's output.
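A workbook like the one above could be produced from Java along the following lines. This is a minimal sketch, not the article's downloadable source: the class and method names (SpreadsheetMlWriter, write) are hypothetical, each value is written to its own row in a single column, and the cell type is chosen with a simple assumption (Number for java.lang.Number instances, String for everything else).

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper for illustration only.
final class SpreadsheetMlWriter {
    static final String SS_NS = "urn:schemas-microsoft-com:office:spreadsheet";

    /** Writes one worksheet with one column; each value becomes a row. */
    static String write(String sheetName, List<?> columnValues) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument("1.0");
            // Processing instruction that marks the file as an Excel document.
            xml.writeProcessingInstruction("mso-application", "progid=\"Excel.Sheet\"");
            // Workbook and Worksheet use the default namespace, as in the listing.
            xml.writeStartElement("Workbook");
            xml.writeDefaultNamespace(SS_NS);
            xml.writeNamespace("ss", SS_NS);
            xml.writeStartElement("Worksheet");
            xml.writeAttribute("ss", SS_NS, "Name", sheetName);
            xml.writeStartElement("ss", "Table", SS_NS);

            int rowIndex = 1; // ss:Index is 1-based in the listing above
            for (Object value : columnValues) {
                xml.writeStartElement("ss", "Row", SS_NS);
                xml.writeAttribute("ss", SS_NS, "Index", String.valueOf(rowIndex++));
                xml.writeStartElement("ss", "Cell", SS_NS);
                xml.writeStartElement("ss", "Data", SS_NS);
                // Assumption: anything that is not a Number is written as text.
                String type = (value instanceof Number) ? "Number" : "String";
                xml.writeAttribute("ss", SS_NS, "Type", type);
                xml.writeCharacters(String.valueOf(value));
                xml.writeEndElement(); // Data
                xml.writeEndElement(); // Cell
                xml.writeEndElement(); // Row
            }

            xml.writeEndElement(); // Table
            xml.writeEndElement(); // Worksheet
            xml.writeEndElement(); // Workbook
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write SpreadsheetML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("TestSheet", Arrays.asList(121, "121 is 11*11")));
    }
}
```

Saving the returned string to a file with an .xml extension should let Windows hand it to Excel, as described earlier, although that step is not exercised here.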