Transform data into Web applications with Cocoon

Use Java to implement logic in Cocoon

You might have heard some of the recent buzz about Cocoon from the Apache Software Foundation. Now in its third year of development, Cocoon has gradually matured from a simple XSL (Extensible Stylesheet Language) transformation servlet into a full-blown Web application framework. Developed in Java, Cocoon typically runs as a servlet inside a servlet container like Apache Tomcat, though it can run in standalone mode as well.

Now don't be put off by the mention of XSL, because Cocoon has a great deal to interest the Java developer. In this article, we look at two ways of using Java to build business logic into your Cocoon-based Web applications. But first, let's start with a quick overview of Cocoon.

Cocoon is officially defined as an XML publishing engine, and while that description is technically correct, it does not do the product justice. The best way to understand Cocoon is to view it as a framework for generating, transforming, processing, and outputting data. Think of Cocoon as a machine that receives data from a wide variety of sources, applies various processing to it, and spits data out in the desired format. Figure 1 helps illustrate this view.

Figure 1. Cocoon big picture

We could also define Cocoon as a data flow machine. That is, when you use Cocoon, you define the data paths or flows that produce the pages that make up your Web application. Even a simple hello-world.html page has a data flow defined to serve the page.

I could spend this entire article discussing Cocoon's architecture, but it boils down to a few basic principles:

  • Cocoon handles all data internally as SAX (Simple API for XML) events; any non-XML input data converts to an XML representation.
  • Components called generators—because they generate SAX events—handle data input.
  • Components called serializers handle data output: they write the data out to the client (a browser, a file, and so on).
  • The developer combines generators, serializers, and other components to form processing flows called pipelines. All pipelines are defined in a file called the sitemap.
  • URI (Uniform Resource Identifier) patterns identify pipelines, but URIs are completely decoupled from physical resources.

This last point warrants some emphasis. In a traditional Web server, a URI generally maps to a physical resource. So, the URI http://localhost/index.html on an Apache server maps to an HTML file I create on my computer called index.html. In Cocoon, there is absolutely, I repeat, absolutely no inherent correlation between URIs and physical resources. While nothing prevents you from making a correlation, Cocoon does not require one. You are free to design the URI patterns for your application in a way that helps your users better navigate your site. On the back end, you can organize your file resources to facilitate administration and maintenance.

To better understand Cocoon's processing model, let's look at a sample pipeline. This rudimentary example defines a page called index.html. This pipeline lives in the sitemap, which is an XML file typically called sitemap.xmap:

    <map:match pattern="index.html">
      <map:generate type="file" src="content/mainfile.xml"/>
      <map:transform type="xslt" src="content/stylesheets/mainstyle.xsl"/>
      <map:serialize type="html"/>
    </map:match>

This pipeline has three steps. First, a generator component, the FileGenerator, reads data from an XML file called content/mainfile.xml. (The FileGenerator's actual definition is made earlier in the sitemap, at which time, the component is assigned a type attribute called file. All pipeline components in Cocoon are referenced by their type attributes.) Then a transformation is applied—in this case, a component called the TraxTransformer applies an XSL stylesheet to the incoming data. Finally, the HTMLSerializer writes data out to a browser client.
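To make the three steps concrete outside of Cocoon, here is a minimal standalone sketch in plain Java. It is not Cocoon code: it uses the JDK's TrAX API (javax.xml.transform), the API the TraxTransformer is named after, to read XML, apply a stylesheet, and serialize HTML. The class name PipelineSketch and the inline XML and XSL strings are assumptions made purely for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for a generate -> transform -> serialize pipeline.
class PipelineSketch {
    // Stand-in for content/mainfile.xml (made-up content).
    static final String XML = "<page><title>Hello</title></page>";

    // Stand-in for content/stylesheets/mainstyle.xsl (made-up stylesheet).
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"/page\">"
      + "<html><body><h1><xsl:value-of select=\"title\"/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml, String xsl) {
        try {
            // "Transform" step: compile the stylesheet.
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            // "Serialize" step: ask for HTML output without indentation.
            t.setOutputProperty(OutputKeys.METHOD, "html");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            // "Generate" step: the XML source is parsed and fed to the stylesheet.
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(XML, XSL));
    }
}
```

In Cocoon itself you never write this glue; the sitemap entry above declares the same flow and the framework wires the components together.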

Though simple, the example above is common. If you set up this pipeline and create the XML and XSL files, Cocoon will happily serve the file contents according to the XSL instructions when you point your browser to your Cocoon application's index.html.

You might be wondering how all this relates to Java development. Java is vital when it comes to Cocoon's processing layer, represented by Figure 1's middle box. This processing layer, the heart of any Cocoon-based application, is where you apply logic to do something intelligent with the input data and get your desired output. While plenty of situations will arise where your pipelines are as simple as the example above, when you get into serious Web application development, you will need to apply logic to your data flow.

In Cocoon, you can implement logic in four main ways:

  • Using components called transformers. As their name implies, they transform incoming data according to the rules they are given. The classic example is the TraxTransformer, which you can see in action in the pipeline above.
  • In the pipeline using various components that help choose the correct processing path based on various request/session/URI settings.
  • In the pipeline based on stock or custom Java-processing units called actions.
  • Using input files that mix Java and content—these are called Extensible Server Pages (XSPs).

This article covers this list's last two approaches: XSPs and actions. If you develop with Cocoon to any extent, you'll end up using them and probably liking them. Plus, you'll be happy to know that in both cases, you are essentially programming within a servlet context. More correctly, both components (in fact, all Cocoon components) have access to request, response, session, and context objects. Much of the logic you implement interacts with these objects in some way.

Let's look at XSPs and actions individually. If you haven't yet, you'll need to download the latest Cocoon distribution, now 2.0.3. Getting Cocoon up and running reaches beyond this article's scope. The easiest way to start is to go to the Cocoon Website, download the latest official release of Cocoon, install it in a servlet container like Apache Tomcat (4.0.4 is recommended), and start playing. Read the installation instructions carefully. The documentation on the Cocoon Website is getting better by the day, so bookmark it. The Cocoon users mailing list is also well stocked with people willing to help you get started. Note that if you follow the default installation instructions for Tomcat, you point your browser to http://localhost:8080/cocoon/index.html to access the Cocoon Web application. This article's examples assume you have this basic setup and a URI base of http://localhost:8080/cocoon.

eXtensible Server Pages

XSPs are an innovation of the Cocoon project. You can compare them to JSPs (JavaServer Pages) because they mix logic and content and can import functionality via taglib-like files called logicsheets. XSPs represent a pipeline's start; that is, they actually convert on the fly into generators that then produce the data for the rest of the pipeline to use.

Let's start with a simple example, called sample1.xsp:

<?xml version="1.0"?>
<xsp:page language="java" xmlns:xsp="">
 <xsp:logic>
  Date now = new Date();
  String msg = "Boo!";
 </xsp:logic>
 <content>
  <title>Welcome to Cocoon</title>
  This is an XSP. You can see how it contains both logic (inside the
  &lt;xsp:logic&gt; tags) and content. In the logic block above, we created
  a Date object whose value is <xsp:expr>now</xsp:expr>. Oh, we
  also had a special message for you: <xsp:expr>msg</xsp:expr>
 </content>
</xsp:page>

First note that this document's root tag is <xsp:page>. This tag defines the XSP's language (either Java or JavaScript) and lists the namespaces of all the logicsheets being used. (When you see logicsheet, think taglib; I'll discuss this concept more later.) Next comes an <xsp:logic> block in which we set two Java variables. These blocks (you can have more than one) can appear anywhere you need them and can contain all sorts of Java code. Finally, we have our content, starting with the root user tag, which, in our case, is <content>. Inside this content, we can access the variables we set at the beginning using the special tag <xsp:expr>.

Remember, an XSP is actually a generator. Cocoon turns it into a Java source file, compiles it, and then executes it. (If you want to see an XSP's complete Java source file, look under your servlet container's work directory. If you use Tomcat 4.0.4, for example, this file will typically be under a directory like $CATALINA_HOME/work/Standalone/localhost/cocoon/cocoon-files/org/apache/cocoon/www.) The XML data produced during execution then passes to the rest of the pipeline's components.
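To get a feel for what "an XSP is actually a generator" means, the following standalone sketch emits SAX events by hand for content similar to sample1.xsp. It is not the source that Cocoon generates, and it does not use any Cocoon classes; the class name GeneratorSketch and its methods are hypothetical. The JDK's TransformerHandler plays the role of the rest of the pipeline by turning the events back into text.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration of a component that produces SAX events.
class GeneratorSketch {
    // Emits <content><title>...</title>text</content> as SAX events.
    static void generate(ContentHandler out, String msg) throws SAXException {
        AttributesImpl none = new AttributesImpl();
        out.startDocument();
        out.startElement("", "content", "content", none);
        out.startElement("", "title", "title", none);
        text(out, "Welcome to Cocoon");
        out.endElement("", "title", "title");
        // The value of a Java variable becomes character data,
        // much as <xsp:expr>msg</xsp:expr> does in the XSP.
        text(out, "Special message: " + msg);
        out.endElement("", "content", "content");
        out.endDocument();
    }

    static void text(ContentHandler out, String s) throws SAXException {
        out.characters(s.toCharArray(), 0, s.length());
    }

    // Feeds the events into an identity serializer and returns the XML text.
    static String toXml(String msg) {
        try {
            SAXTransformerFactory f =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = f.newTransformerHandler();
            handler.getTransformer()
                   .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            handler.setResult(new StreamResult(w));
            generate(handler, msg);
            return w.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("Boo!"));
    }
}
```

The point of the sketch is only the shape of the work: logic runs in Java, and its results leave the component as a stream of SAX events rather than as a finished string.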

To see this XSP in action, let's create the following pipeline:

   <map:match pattern="*.xsp">
     <map:generate type="serverpages" src="examples/{1}.xsp"/>
     <map:serialize type="xml"/>
   </map:match>

Here, we use a special generator, the ServerPagesGenerator, to process our simple XSP. Rather than doing anything with the data, we simply return it to the client in raw XML form. Note the use of the special {1} variable reference: it stands for whatever the wildcard character * at the start of the pipeline matched. In other words, if we point our browser to sample1.xsp in our Web application, the value of {1} will be sample1, the part of the URI matched by *. This feature results in a more generic pipeline that will serve for the rest of this article's XSP examples.
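The wildcard substitution described above can be imitated in a few lines of plain Java. This is only a sketch of the idea, not Cocoon's matcher: the class WildcardSketch and its resolve method are hypothetical, and the assumption that a single * matches any run of characters other than a slash is mine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of wildcard matching with {n} substitution.
class WildcardSketch {
    /**
     * Returns src with each {n} replaced by the text matched by the n-th
     * wildcard in pattern, or null if uri does not match pattern.
     */
    static String resolve(String pattern, String uri, String src) {
        // Build a regex: each * becomes a capturing group, the rest is literal.
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append("([^/]*)"); // assumption: * does not cross a slash
            }
            regex.append(Pattern.quote(parts[i]));
        }
        Matcher m = Pattern.compile(regex.toString()).matcher(uri);
        if (!m.matches()) {
            return null;
        }
        String result = src;
        for (int g = 1; g <= m.groupCount(); g++) {
            result = result.replace("{" + g + "}", m.group(g));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints examples/sample1.xsp
        System.out.println(resolve("*.xsp", "sample1.xsp", "examples/{1}.xsp"));
    }
}
```

A request for sample1.xsp therefore resolves to examples/sample1.xsp, while a request such as index.html does not match this pattern at all and is left to other pipelines.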

With an XML-aware browser like Internet Explorer, we will see sample1.xsp's output shown in Figure 2.

Figure 2. XML output from sample1.xsp

Remember that XSPs, like most Cocoon components, have access to request, response, session, and context objects. These objects are actually Cocoon encapsulations of HttpServletRequest, HttpServletResponse, HttpSession, and ServletContext and are called Request, Response, Session, and Context, respectively. The Cocoon versions provide access to most of the methods in the official versions.

XSPs prove especially useful for retrieving data from a database. When you think about it, database data makes great XML data, since it is naturally organized into rows and columns. However, JDBC (Java Database Connectivity) does not make for economical code. XSPs, on the other hand, can make your life much easier when retrieving data, thanks to the ESQL logicsheet. Besides hiding the gory JDBC details, the ESQL logicsheet allows you to encapsulate your rows and columns with custom tags. You can implement nested queries and run database update commands as well.
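The idea that rows and columns map naturally onto XML can be sketched in plain Java without a database. The code below is not the ESQL logicsheet and does not touch JDBC; the class RowsToXmlSketch, the tag names resources and resource, and the sample row are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: wrap rows and columns in custom tags.
class RowsToXmlSketch {
    // Escapes the characters that are not allowed in XML character data.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Each map is one row; each map entry is one column (name -> value).
    static String toXml(String rootTag, String rowTag,
                        List<Map<String, String>> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(rootTag).append('>');
        for (Map<String, String> row : rows) {
            sb.append('<').append(rowTag).append('>');
            for (Map.Entry<String, String> col : row.entrySet()) {
                sb.append('<').append(col.getKey()).append('>')
                  .append(escape(col.getValue()))
                  .append("</").append(col.getKey()).append('>');
            }
            sb.append("</").append(rowTag).append('>');
        }
        sb.append("</").append(rootTag).append('>');
        return sb.toString();
    }

    // Builds one made-up row and renders it.
    static String demo() {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("ResourceName", "Tips & Tricks");
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row);
        return toXml("resources", "resource", rows);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the ESQL logicsheet, this kind of wrapping is expressed declaratively in the XSP instead of being coded by hand.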

Suppose we want to store a list of Cocoon resources in a database table. First, we define the table and then use an XSP to retrieve the rows found when a user searches by keyword. Later, we'll build a form to add new rows.

The table definition and rows are shown below. I use MySQL, so please make the appropriate DDL (data definition language) changes if you use a different database. If you haven't already done so, you must configure a connection pool in Cocoon for your database. See "Sidebar 1: Databases and Cocoon" for more information.

use test;
create table Resources (
ResourceURL   varchar(255) not null,
ResourceName  varchar(64) not null
);
insert into Resources values ('', 'Cocoon Home Page');
insert into Resources values ('', 'Cocoon portal - first look');
insert into Resources values ('', 'Cocoon 2.0 Tips and Tricks');
insert into Resources values ('', 'Deploying Cocoon 2 in JBoss');
insert into Resources values ('', 'Integrating Apache, Tomcat 3.2.x and Cocoon 2.0 on Windows');
insert into Resources values ('', 'Integrating Tomcat 4.0.x and Cocoon 2.0 on Unix');
insert into Resources values ('', 'Integrating Tomcat 4.0.x and Cocoon 2.0 on Windows');
insert into Resources values ('', 'Creating a Navigation Menu');
insert into Resources values ('', 'Your Guide to Apache Cocoon');
insert into Resources values ('', 'Logicsheet Development');

With our table built and Cocoon properly configured, we can write a simple XSP:
