Often it is useful for XML data to be presented as a spreadsheet. A typical spreadsheet (for example, a Microsoft Excel spreadsheet) consists of cells represented in a grid of rows and columns, containing textual data, numeric data, or formulas. An Excel spreadsheet defines some standard functions such as SUM and AVERAGE that you can specify in cells. The Apache Jakarta POI project provides the HSSF API to create an Excel spreadsheet from an XML document or to go the opposite way, parsing an Excel spreadsheet and converting to XML. The HSSF API has provisions for setting the layout, border settings, and fonts of an Excel document. In this article, you'll learn how to generate an example Excel spreadsheet by parsing an XML document and adding data from the XML document to a spreadsheet. Subsequently, you'll convert the Excel spreadsheet to an XML document.
The Jakarta POI HSSF API provides classes to create an Excel workbook and add spreadsheets to the workbook. With the POI API, the
HSSFWorkbook class represents a workbook, and you set the spreadsheet fonts, sheet order, and cell styles in the
HSSFWorkbook class. You can represent the spreadsheet using the
HSSFSheet class. Specifically, you set the sheet layout, including the column widths, margins, header, footer, and print setup using the
HSSFSheet class. You can represent a spreadsheet row using the
HSSFRow class, and you set the row height using the
HSSFRow class. The
HSSFCell class represents a cell in a spreadsheet row, and you set the cell style using the
HSSFCell class. The indexing of spreadsheets in a workbook, of rows in a spreadsheet, and of cells in a row is zero based. In this article, we'll show how to convert an example XML document to an Excel spreadsheet and then convert the spreadsheet to an XML document. Listing 1 shows the example document, incomestatements.xml.
Listing 1. incomestatements.xml
<?xml version="1.0" encoding="UTF-8"?> <incmstmts> <stmt> <year>2005</year> <revenue>11837</revenue> <costofrevenue>2239</costofrevenue> <researchdevelopment>1591</researchdevelopment> <salesmarketing>2689</salesmarketing> <generaladmin>661</generaladmin> <totaloperexpenses>7180</totaloperexpenses> <operincome>4657</operincome> <invincome>480</invincome> <incbeforetaxes>5137</incbeforetaxes> <taxes>1484</taxes> <netincome>3653</netincome> </stmt> <stmt> <year>2004</year> <revenue>10818</revenue> <costofrevenue>1875</costofrevenue> <researchdevelopment>1421</researchdevelopment> <salesmarketing>2122</salesmarketing> <generaladmin>651</generaladmin> <totaloperexpenses>6069</totaloperexpenses> <operincome>4749</operincome> <invincome>420</invincome> <incbeforetaxes>5169</incbeforetaxes> <taxes>1706</taxes> <netincome>3463</netincome> </stmt> <incmstmts>
Creating an Eclipse project
In this article, we create and parse an Excel spreadsheet using the Apache POI HSSF API. Before you can set up your project, you need to download Apache POI 2.5.1 and extract the zip file to an installation directory. You also need to download and install Java SE 5.0. (You can also use another version, such as 1.4 or 6.0.)
To compile and run the code examples, you will need an Eclipse project. You can download project Chapter11 from the Apress Website and import it into your Eclipse workspace by selecting File, then Import.
To compile and run the Apache POI code examples, you need some jar files in your project's Java build path; Figure 1 shows these JAR files. The JAR file required for an Apache POI application is poi-2.5.1-final-20040804.jar, which consists of the Apache POI API. You also need to set the Java Runtime Environment system library to JRE 5.0, as shown in Figure 1.
Figure 2 shows the project directory structure.
If you haven't got a copy of Excel handy, you can instead open the Excel spreadsheet generated from the example XML document using Excel Viewer.