Newsletter sign-up
View all newsletters

Enterprise Java Newsletter
Stay up to date on the latest tutorials and Java community news posted on JavaWorld

Sponsored Links

Optimize with a SATA RAID Storage Solution
Range of capacities as low as $1250 per TB. Ideal if you currently rely on servers/disks/JBODs

Transform data into Web applications with Cocoon

Use Java to implement logic in Cocoon

  • Print
  • Feedback

You might have heard some of the recent buzz about Cocoon from the Apache Software Foundation. Now in its third year of development, Cocoon has gradually matured from a simple XSL (Extensible Stylesheet Language) transformation servlet into a full-blown Web application framework. Developed in Java, Cocoon typically runs as a servlet inside a servlet container like Apache Tomcat, though it can run in standalone mode as well.

Now don't be put off by the mention of XSL, because Cocoon has a great deal to interest the Java developer. In this article, we look at two ways of using Java to build business logic into your Cocoon-based Web applications. But first, let's start with a quick overview of Cocoon.

Cocoon is officially defined as an XML publishing engine, and while technically correct, the description does not do the product justice. The best way to understand Cocoon is to view it as a framework for generating, transforming, processing, and outputting data. Think of Cocoon as a machine that receives data from a wide variety of sources, applies various processing to it, and spits data out in the desired format. Figure 1 helps illustrate this view.

Figure 1. Cocoon big picture

We could also define Cocoon as a data flow machine. That is, when you use Cocoon, you define the data paths or flows that produce the pages that make up your Web application. Even a simple hello-world.html page has a data flow defined to serve the page.

I could spend this entire article discussing Cocoon's architecture, but it boils down to a few basic principles:

  • Cocoon handles all data internally as SAX (Simple API for XML) events; any non-XML input data converts to an XML representation.
  • Components called generators—because they generate SAX events—handle data input.
  • Components called serializers handle data output—they write the data out to the client (that is, browser, file, and so on).
  • The developer combines generators, serializers, and other components to form processing flows called pipelines. All pipelines are defined in a file called the sitemap.
  • URI (Uniform Resource Identifier) patterns identify pipelines, but URIs are completely decoupled from physical resources.


This last point warrants some emphasis. In a traditional Web server, a URI generally maps to a physical resource. So, the URI http://localhost/index.html on an Apache server maps to an HTML file I create on my computer called index.html. In Cocoon, there is absolutely, I repeat, absolutely no inherent correlation between URIs and physical resources. While nothing prevents you from making a correlation, Cocoon does not require one. You are free to design the URI patterns for your application in a way that helps your users better navigate your site. On the back end, you can organize your file resources to facilitate administration and maintenance.

To better understand Cocoon's processing model, let's look at a sample pipeline. This rudimentary example defines a page called index.html. This pipeline lives in the sitemap, which is an XML file typically called sitemap.xmap:

    <map:match pattern="index.html">
      <map:generate type="file" src="content/mainfile.xml"/>
      <map:transform type="xslt" src="content/stylesheets/mainstyle.xsl"/>
      <map:serialize type="html"/>
    </map:match>


This pipeline has three steps. First, a generator component, the FileGenerator, reads data from an XML file called content/mainfile.xml. (The FileGenerator's actual definition is made earlier in the sitemap, at which time, the component is assigned a type attribute called file. All pipeline components in Cocoon are referenced by their type attributes.) Then a transformation is applied—in this case, a component called the TraxTransformer applies an XSL stylesheet to the incoming data. Finally, the HTMLSerializer writes data out to a browser client.

Though simple, the example above is common. If you set up this pipeline, create the XML and XSL files, Cocoon will happily serve the file contents according to the XSL instructions when you point your browser to your Cocoon application's index.html.

You might be wondering how all this relates to Java development. Java is vital when it comes to Cocoon's processing layer, represented by Figure 1's middle box. This processing layer, the heart of any Cocoon-based application, is where you apply logic to do something intelligent with the input data and get your desired output. While plenty of situations will arise where your pipelines are as simple as the example above, when you get into serious Web application development, you will need to apply logic to your data flow.

In Cocoon, you can implement logic in four main ways:

  • Using components called transformers: They do exactly what their name implies: they transform incoming data according to the rules they are given. The classic example is the TraxTransformer, which you can see in action in the pipeline above.
  • In the pipeline using various components that help choose the correct processing path based on various request/session/URI settings.
  • In the pipeline based on stock or custom Java-processing units called actions.
  • Using input files that mix Java and content—these are called Extensible Server Pages (XSPs).


This article covers this list's last two approaches: XSPs and actions. If you develop with Cocoon to any extent, you'll end up using them and probably liking them. Plus, you'll be happy to know that in both cases, you are essentially programming within a servlet context. More correctly, both components (in fact, all Cocoon components) have access to request, response, session, and context objects. Much of the logic you implement interacts with these objects in some way.

Let's look at XSPs and actions individually. If you haven't yet, you'll need to download the latest Cocoon distribution, now 2.0.3. Getting Cocoon up and running reaches beyond this article's scope. The easiest way to start is to go to http://xml.apache.org/cocoon/, download the latest official release of Cocoon, install it in a servlet container like Apache Tomcat (4.0.4 is recommended), and start playing. Read the installation instructions carefully. The documentation on the Cocoon Website is getting better by the day, so bookmark it. The Cocoon users mailing list is also well stocked with people willing to help you get started. Note that if you follow the default installation instructions for Tomcat, you point your browser to http://localhost:8080/cocoon/index.html to access the Cocoon Web application. This article's examples assume you have this basic setup and a URI base of http://localhost:8080/cocoon.

  • Print
  • Feedback

Resources