The JSP spec has had many incarnations, and different JSP products implement different, incompatible versions of the spec. I will use Tomcat, for the following reasons:
(For more information on Tomcat, see Resources.)
You are welcome to use any JSP engine you like, but configuring it is up to you! Be sure that the engine supports at least the JSP 1.0 spec; there were many changes between 0.91 and 1.0. The JSWDK (Java Server Web Development Kit) will work just fine.
When building a JSP-driven Website (also known as a Webapp), I prefer to put common functions, imports, constants, and variable declarations in a separate file called init.jsp, located in the source code for this article.
I then load that file into each JSP file using <%@include file="init.jsp"%>. The <%@include%> directive acts like the C language's #include, pulling in the text of the included file (here, init.jsp) and compiling it as if it were part of the including file (here, picture.jsp). By contrast, the <jsp:include> tag compiles the file as a separate JSP file and embeds a call to it in the compiled JSP.
When the JSP starts, the first thing it needs to do after initialization is find the XML file you want. How does it know which
of the many files you need? It reads the name from a CGI parameter. The user will invoke the JSP with the URL picture.jsp?file=summer99/alex-beach.pix (or by passing a file parameter through an HTML form).
However, when the JSP receives the parameter, you're still only halfway there. You still need to know where on the filesystem
the root directory lies. For example, on a Unix system, the actual file may be in the directory /home/alex/public_html/pictures/summer99/alex-beach.pix. JSPs do not have a concept of a current directory while executing, so you need to provide an absolute pathname to the java.io package.
The Servlet API provides a method to turn a URL path, relative to the current JSP or Servlet, into an absolute filesystem
path. The method ServletContext.getRealPath(String) does the trick. Every JSP has a ServletContext object called application, so the code would be:
String picturefile =
    application.getRealPath("/" + request.getParameter("file"));
or
String picturefile =
    getServletContext().getRealPath("/" + request.getParameter("file"));
which also works inside a servlet. (You must prepend a / because the method expects to be passed the results of request.getPathInfo().)
One important note: whenever you access local resources, be very careful to validate the incoming data. A hacker, or a careless
user, can send bogus data to hack your site. For instance, consider what would happen if the value file=../../../../etc/passwd were entered. The user could in this way read your server's password file.
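One way to guard against this kind of path traversal is to resolve the requested name against the picture root and reject anything that ends up outside it. The following is a minimal sketch under that assumption, not code from the article's source; the class SafePaths and the method resolveUnderRoot are hypothetical names.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of the article's source code.
class SafePaths {
    /**
     * Resolves a user-supplied relative name against root and returns the
     * canonical file, or throws if the result falls outside root.
     */
    static File resolveUnderRoot(File root, String requested) throws IOException {
        if (requested == null || requested.isEmpty()) {
            throw new IOException("No file requested");
        }
        File canonicalRoot = root.getCanonicalFile();
        File candidate = new File(canonicalRoot, requested).getCanonicalFile();

        // getCanonicalFile() collapses "." and ".." segments, so a prefix
        // check on the canonical paths detects an escape from the root.
        String rootPath = canonicalRoot.getPath();
        String rootPrefix = rootPath.endsWith(File.separator)
                ? rootPath
                : rootPath + File.separator;
        if (!candidate.getPath().startsWith(rootPrefix)) {
            throw new IOException("Path escapes picture root: " + requested);
        }
        return candidate;
    }

    public static void main(String[] args) throws IOException {
        File root = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(resolveUnderRoot(root, "summer99/alex-beach.pix"));
    }
}
```

In a JSP you would pass the directory returned by application.getRealPath("/") as the root and the file parameter as the requested name, and send an error page when the method throws.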
DOM stands for the Document Object Model. It is a standard API for browsing XML documents, developed by the World Wide Web Consortium (W3C). The interfaces are in
package org.w3c.dom and are documented at the W3C site (see Resources).
There are many DOM parser implementations available. I have chosen IBM's XML4J, but you can use any DOM parser. This is because the DOM is a set of interfaces, not classes -- and all DOM parsers must return objects that faithfully implement those interfaces.
Unfortunately, though standard, the DOM has two major flaws:
org.w3c.dom.Document object, the means of initializing the parser and loading the file itself is always parser specific.
The simple picture file described above is represented in the DOM by several objects in a tree structure.
Document Node
--> Element Node "picture"
--> Text Node "\n " (whitespace)
--> Element Node "title"
--> Text Node "Alex On The Beach"
--> Element Node "date"
--> ... etc.
To acquire the value Alex On The Beach you would have to make several method calls, walking the DOM tree. Further, the parser may choose to intersperse any number
of whitespace text nodes, through which you would have to loop and either ignore or concatenate (you can correct this by calling
the normalize() method). The parser may also include separate nodes for XML entities (like &amp;), CDATA nodes, or other element nodes (for instance, the <b>big</b> bear would turn into at least three nodes, one of which is a b element, containing a text node, containing the text big). There is no method in the DOM to simply say "get me the text value of the title element." In short, walking the DOM is
a bit cumbersome. (See the XPath section of this article for an alternative to DOM.)
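To make the whitespace problem concrete, the sketch below lists the direct children of the root element of a reduced picture document. It is an illustration only: it uses the JAXP parser bundled with the JDK rather than XML4J, and the class name DomWalk and the sample XML are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK's JAXP parser instead of XML4J.
class DomWalk {
    /** Lists the direct children of the root element, one label per node. */
    static List<String> describeChildren(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> labels = new ArrayList<>();
        Node root = doc.getDocumentElement();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                labels.add("element " + n.getNodeName());
            } else if (n.getNodeType() == Node.TEXT_NODE) {
                // Indentation between elements shows up as its own text node.
                boolean blank = n.getNodeValue().trim().isEmpty();
                labels.add(blank ? "text (whitespace)" : "text");
            } else {
                labels.add("other");
            }
        }
        return labels;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<picture>\n  <title>Alex On The Beach</title>\n</picture>";
        for (String label : describeChildren(xml)) {
            System.out.println(label);
        }
    }
}
```

Even for this two-line document, the title element is only one of several children of picture; the line breaks and indentation around it are reported as text nodes of their own.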
From a higher perspective, the problem with DOM is that the XML objects are not available directly as Java objects, but they must be accessed piecemeal via the DOM API. See my conclusion for a discussion of Java-XML Data Binding technology, which uses this straight-to-Java approach for accessing XML data.
I have written a small utility class, called DOMUtils, that contains static methods for performing common DOM tasks. For instance, to acquire the text content of the title child element of the root (picture) element, you would write the following code:
Document doc = DOMUtils.xml4jParse(picturefile);
Element nodeRoot = doc.getDocumentElement();
Node nodeTitle = DOMUtils.getChild(nodeRoot, "title");
String title = (nodeTitle == null) ? null : DOMUtils.getTextValue(nodeTitle);
Getting the values for the image subelements is equally straightforward:
Node nodeImage = DOMUtils.getChild(nodeRoot, "image");
Node nodeSrc = DOMUtils.getChild(nodeImage, "src");
String src = DOMUtils.getTextValue(nodeSrc);
And so on.
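The DOMUtils source is not reproduced on this page, so the following is only a guess at what helpers like getChild and getTextValue might look like. The class name DomHelpers is hypothetical, and parsing is done with the JDK's JAXP parser rather than XML4J.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical stand-in for DOMUtils; the real class may differ.
class DomHelpers {
    /** Parses an XML string with the JDK's JAXP parser (not XML4J). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the first child element with the given name, or null. */
    static Node getChild(Node parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return n;
            }
        }
        return null;
    }

    /** Concatenates the text and CDATA children of a node. */
    static String getTextValue(Node node) {
        StringBuilder text = new StringBuilder();
        for (Node n = node.getFirstChild(); n != null; n = n.getNextSibling()) {
            short type = n.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(n.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<picture>\n  <title>Alex On The Beach</title>\n</picture>");
        Node title = getChild(doc.getDocumentElement(), "title");
        System.out.println(getTextValue(title));
    }
}
```

Because getChild returns null for a missing element, callers need the same null check that the title example uses before asking for the text value.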
Once you have Java variables for each relevant element, all you must do is embed the variables inside your HTML markup, using standard JSP tags.
<table bgcolor="#FFFFFF" border="0" cellspacing="0" cellpadding="5">
  <tr>
    <td align="center" valign="center">
      <img src="<%=src%>" width="<%=width%>" height="<%=height%>" border="0" alt="<%=src%>"></td>
  </tr>
</table>
See the full source code for more details. The HTML output produced by the JSP file -- an HTML screenshot, if you will --
is in picture-dom.html.
All that code at the top of picture-dom.jsp, located in the source code, is unattractive. While you can put hundreds of lines of Java code inside a JSP, a cleaner approach
exists: you can use JSP JavaBeans to store significant amounts of Java code, while reserving the use of JSP scriptlet tags
(<% and %>) for control flow and minor variable manipulation inside the JSP page.
For prototyping purposes, it is generally easier to start a project by throwing all your Java code inside the JSP. Once you have a better idea of your needs, you can go back and extract the code and write some JavaBeans. The investment is higher, but so is the payoff in the long run, since your applications will be more modular. You can use the same beans in several different pages without the horror of copy-and-paste code reuse.
In our case, a clear candidate for a JSP JavaBean is the code that extracts String values from an XML file. You can define classes Picture, Image, and Thumbnails, representing the major elements in the XML file. These beans will have constructors or setter methods that take in a DOM
node or a filename from which to extract their values. You can browse the picturebeans package source directory from the source code file in Resources.
When looking through the source, be sure to notice the following:
List, in the DOM itself, or even in a database.
picturebeans. All JSP beans must be in a package; most JSP engines won't be able to find classes that are in the default package.
<%=picture.getCaption()%> instead of just <%=caption%>, since the values are stored in a bean rather than in local variables. However, if you want, you can define local variables
like String caption = picture.getCaption();. This is acceptable because it makes the code a little easier to read and understand.
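As a rough illustration of the bean approach, here is a minimal sketch of a picture bean that pulls its values out of a DOM node. It is not the article's picturebeans code: the class name PictureBean is hypothetical, only two fields are shown, and the class is left in the default package purely to keep the example short.

```java
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical bean; the classes in the article's picturebeans package may differ.
// In a real Webapp this class would be declared inside a package.
class PictureBean {
    private String title;
    private String caption;

    /** JavaBeans need a public no-argument constructor. */
    public PictureBean() { }

    /** Extracts the bean's values from the root element of a picture file. */
    public void setNode(Element root) {
        title = childText(root, "title");
        caption = childText(root, "caption");
    }

    public String getTitle() { return title; }

    public String getCaption() { return caption; }

    /** Returns the text of the first child element with this name, or null. */
    private static String childText(Element parent, String name) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && name.equals(child.getNodeName())) {
                StringBuilder text = new StringBuilder();
                NodeList parts = child.getChildNodes();
                for (int j = 0; j < parts.getLength(); j++) {
                    if (parts.item(j).getNodeType() == Node.TEXT_NODE) {
                        text.append(parts.item(j).getNodeValue());
                    }
                }
                return text.toString();
            }
        }
        return null;
    }
}
```

With a bean like this, the page itself only needs expressions such as <%=picture.getCaption()%>; all DOM handling stays in the Java class.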
You may have noticed that the output from my first JSP, picture-dom.html, used the full-sized source image file. Let's change the code slightly, so that instead of showing the full-sized image,
it shows a smaller, thumbnail version. I will use the list of thumbnail images stored in the XML data file.
Let's define a parameter, zoom, whose value determines which of the thumbnail images to display. Clicking on the thumbnail will show the full-sized raw
image source; clicking on a Zoom In or Zoom Out button will select the next or previous thumbnail in the list.
Since the Thumbnails object returns a java.util.List of Image objects, finding the right thumbnail couldn't be easier: just say (Image)picture.getThumbnails().get(i).
To build the Zoom In and Zoom Out links, you must generate a recursive reference to the same page, with different parameters.
For this, you use the request.getRequestURI() method. This only gives you the path to the servlet, with no parameters, so you can then tack on the parameters you want.
<%
  if (zoom < (thumbnails.size() - 1)) {
    out.print("<a href='" +
      request.getRequestURI() +
      "?file=" + request.getParameter("file") +
      "&zoom=" + (zoom+1) +
      "'>");
    out.print("Zoom In</a>");
  }
%>
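Because the file parameter comes straight from the request, it is safer to URL-encode it before writing it back into a link. The sketch below shows one way to build the Zoom In anchor with that encoding applied; the class ZoomLinks and the method zoomInLink are hypothetical names, not part of the article's source.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the article's source code.
class ZoomLinks {
    /**
     * Builds the "Zoom In" anchor, or returns an empty string when the
     * current zoom level already points at the last thumbnail.
     */
    static String zoomInLink(String requestUri, String file, int zoom, int thumbnailCount) {
        if (zoom >= thumbnailCount - 1) {
            return "";
        }
        return "<a href='" + requestUri
                + "?file=" + encode(file)
                + "&amp;zoom=" + (zoom + 1)   // & is written as &amp; inside HTML
                + "'>Zoom In</a>";
    }

    /** URL-encodes a query parameter value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(zoomInLink("/picture.jsp", "summer99/alex-beach.pix", 0, 3));
    }
}
```

In the JSP, the scriptlet would shrink to a single out.print call with request.getRequestURI(), the file parameter, zoom, and thumbnails.size() as arguments.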
Here is an HTML screenshot of the working JSP page.
The JSP spec defines the <jsp:useBean> tag for automatically instantiating and using JavaBeans from a JSP page. The useBean tag can always be replaced by embedded Java code, and that's what I've done here. For this reason, many people question the
need for the useBean and setProperty tags. The arguments in favor of these tags are:
useBean has a scope parameter that automatically determines whether the bean should be stored as a local variable, a session variable, or an
application attribute.
useBean initializes it if necessary, but fetches the variable if it already exists.
The equivalent useBean syntax for this application is:
<jsp:useBean id="picture" scope="request" class="picturebeans.DOMPicture">
<%
  Document doc = DOMUtils.xml4jParse(picturefile);
  Element nodeRoot = doc.getDocumentElement();
  nodeRoot.normalize();
  picture.setNode(nodeRoot);
%>
</jsp:useBean>
or, if you define a setFile(String) method inside DOMBean:
<jsp:useBean id="picture" scope="request" class="picturebeans.DOMPicture">
  <jsp:setProperty name="picture" property="file" value="<%=picturefile%>"/>
</jsp:useBean>
To overcome some of the difficulties of using the DOM APIs, I have created a class called XMLEntryList. This class implements the Java Collections interface java.util.List, as well as the get and put methods of java.util.Map, providing a more intuitive set of methods with which to traverse a simple XML tree structure. You can use the standard abstraction
of the Collections API to do things like acquire iterators and subviews. Each entry in an XMLEntryList has a key and a value, like a Map; the keys are the names of the child nodes, and the values are either Strings or child XMLEntryLists.
XMLEntryList is not meant to be a full replacement for the DOM; it cannot perform several DOM functions. However, it is a convenient wrapper
for performing basic getting, setting, and list-oriented functions on your XML data structure. For instance, to get the caption element of the picture node, you can say:
String caption = (String)picturelist.get("caption");
The value of the caption field has already been parsed and stored as a String.
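The idea of a list that also answers Map-style lookups can be sketched in a few lines. The class below is a much-reduced illustration and not the real XMLEntryList: the name EntryListSketch is hypothetical, it does not parse XML, and it implements only a small part of what the text describes.

```java
import java.util.AbstractList;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, much-reduced illustration; the real XMLEntryList may differ.
class EntryListSketch extends AbstractList<Map.Entry<String, Object>> {
    private final List<Map.Entry<String, Object>> entries = new ArrayList<>();

    /** Appends a key/value pair, keeping document order. */
    public void put(String key, Object value) {
        entries.add(new AbstractMap.SimpleEntry<String, Object>(key, value));
    }

    /** Map-style lookup: the value of the first entry with this key, or null. */
    public Object get(String key) {
        for (Map.Entry<String, Object> entry : entries) {
            if (entry.getKey().equals(key)) {
                return entry.getValue();
            }
        }
        return null;
    }

    /** List-style lookup by position. */
    @Override
    public Map.Entry<String, Object> get(int index) {
        return entries.get(index);
    }

    @Override
    public int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        EntryListSketch picture = new EntryListSketch();
        picture.put("title", "Alex On The Beach");
        System.out.println((String) picture.get("title"));
    }
}
```

Extending AbstractList supplies iterators and subList for free, which is the kind of Collections behavior the text refers to; a nested element would simply be stored as another EntryListSketch value.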
Whatever its advantages, parsing an XML file takes time. To improve the performance of XML-based applications, you must use
some sort of cache. This cache must store XML objects in memory based on the name of the file from which they came. If the
file has been modified in the time since the object was loaded, then the object must be reloaded. I have written a simple
implementation of this data structure, called CachedFS.java. You can feed CachedFS a callback function, using inner classes, that actually performs the XML parsing, transforming a file into an object. The cache
then stores that object in memory.
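The caching rule described above (keep the parsed object, reload it when the file's modification time changes) can be sketched as follows. This is not the CachedFS source: the class FileCacheSketch is a hypothetical, simplified stand-in, and only the Loader.process signature is borrowed from the article's own usage.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for CachedFS; the real class may differ.
class FileCacheSketch {
    /** Callback that turns the contents of a file into an object. */
    interface Loader {
        Object process(String path, InputStream in) throws IOException;
    }

    /** A cached object plus the file timestamp it was loaded at. */
    private static final class Entry {
        final long lastModified;
        final Object value;

        Entry(long lastModified, Object value) {
            this.lastModified = lastModified;
            this.value = value;
        }
    }

    private final File root;
    private final Loader loader;
    private final Map<String, Entry> entries = new HashMap<>();

    FileCacheSketch(File root, Loader loader) {
        this.root = root;
        this.loader = loader;
    }

    /** Returns the cached object for path, reloading it if the file changed. */
    synchronized Object get(String path) throws IOException {
        File file = new File(root, path);
        long modified = file.lastModified();
        Entry entry = entries.get(path);
        if (entry == null || entry.lastModified != modified) {
            try (InputStream in = new FileInputStream(file)) {
                entry = new Entry(modified, loader.process(path, in));
            }
            entries.put(path, entry);
        }
        return entry.value;
    }
}
```

The get method is synchronized because an application-scope object is shared by every request thread. The path argument is used as given, so it should be validated before it reaches the cache.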
Here is the code for creating a cache. This object has application scope, so future requests get to use the same object cache.
I will put this code in init.jsp, so that you don't need to copy and paste the initialization code into the other JSPs in the Webapp. In general, you should
define application-scope objects in a common location, so you don't end up with different initialization routines in different
areas.
<jsp:useBean id="cache" class="com.purpletech.io.CachedFS" scope="application">
<% cache.setRoot(application.getRealPath("/"));
   cache.setLoader( new CachedFS.Loader() {
       // load in a single Picture file
       public Object process(String path, InputStream in) throws IOException
       {
           try {
               Document doc = DOMUtils.xml4jParse
                   (new BufferedReader(new InputStreamReader(in)));
               Element nodeRoot = doc.getDocumentElement();
               nodeRoot.normalize();
               Picture picture = new DOMPicture(nodeRoot);
               return picture;
           }
           catch (XMLException e) {
               e.printStackTrace();
               throw new IOException(e.getMessage());
           }
       }
   });
%>
</jsp:useBean>
XPath is a simple syntax for locating nodes in an XML tree. It is easier to use than DOM, since instead of making method calls
each time you want to step to another node, you embed the entire path in a string -- for example, /picture/thumbnails/image[2]. The Resin product, by Caucho (see Resources), includes an XPath processor that you can use in your own apps. You can use the Caucho XPath object on its own, without
buying into the rest of the Resin framework.
Node verse = XPath.find("chapter/verse", node);
Resin also includes a scripting language, compatible with JavaScript, that allows easy access to XPath and XSL functionality from inside your JSP.
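For readers without Resin, the same path-in-a-string idea can be tried with the javax.xml.xpath package that ships with later JDKs. The sketch below uses that API instead of the Caucho XPath class; the class name XPathSketch and the sample document are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK's javax.xml.xpath, not Caucho's XPath.
class XPathSketch {
    /** Evaluates an XPath expression against an XML string and returns its text value. */
    static String evaluate(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data standing in for a picture file.
        String xml = "<picture><thumbnails>"
                + "<image><src>small.jpg</src></image>"
                + "<image><src>medium.jpg</src></image>"
                + "</thumbnails></picture>";
        System.out.println(evaluate(xml, "/picture/thumbnails/image[2]/src"));
    }
}
```

Note that XPath positions start at 1, so image[2] selects the second image element, whereas the List returned by getThumbnails() is indexed from 0.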
This article has discussed embedding Java inside a JSP to extract data from XML nodes. There is another popular model for accomplishing this task: the Extensible Stylesheet Language (XSL). This model is radically different from the JSP model I've been discussing. In JSP, the main document is HTML, containing snippets of Java code; in XSL, the main document is an XSL document, containing snippets of HTML. There is a lot to say about the relationship between XSL and Java/JSP, much more than I have space for here. A future article in JavaWorld will explore using XSL and JSP together.
After reading this tutorial, you should have a good idea of the structure of a JSP-XML application, and of its power. You should also have some idea of its limitations.
The most tedious part of developing a JSP-XML application is creating JavaBeans for each of the elements in your XML schema. The XML Data Binding group is developing technology that will automatically generate Java classes from a given schema. Also, I have developed a prototype open source Java-XML data binding technology. And IBM alphaWorks has recently released XML Master, or XMas, another XML-Java data binding system.
Another possibility is to expand the functionality of the filesystem, building some more powerful features, such as queries and transactions. Naturally, I am contemplating implementing this type of functionality as an open source project as well. Anybody want to write an XML search engine?