Servlet 2.4: What's in store

A full update on the latest servlet spec

On March 7, 2003, Sun Microsystems (working with the JSR (Java Specification Request) 154 expert group) published the "Proposed Final Draft 2" of the Servlet 2.4 specification (see Resources for a link to the formal specification). As it's still in the Proposed Final Draft stage, the specification is not quite finished, and technical details may change. A Proposed Final Draft 3 is even a possibility. However, changes should not prove significant before the specification's final release, and, in fact, server vendors have already begun to implement the new features. That makes now a good time to start learning about what's coming in Servlet 2.4 and check out its integration with J2EE (Java 2 Platform, Enterprise Edition) 1.4.

In this article, I describe what changed between 2.3 and 2.4. I also explain the decision-making process behind the changes and tell you about a few things that didn't make it. To keep the article focused, I assume you're familiar with the classes and methods of previous Servlet API versions. If that's not the case, you can peruse Resources for links to sites (and my book!) that will help get you up to speed.

Servlet 2.4 lacks some of the fireworks of past releases. Servlet 2.2 introduced the notion of self-contained Web applications. Servlet 2.3 added the power of filters and filter chains. Servlet 2.4, while adding several interesting features, has no superstars and spends more time polishing and clarifying the features that came before: a tying up of loose ends. This work's effect is that servers faithfully implementing 2.4 will be more interoperable than any past servers. But don't let me imply there's nothing new in Servlet 2.4! Here's a list of what's new:

  • Servlets now require HTTP/1.1 and J2SE (Java 2 Platform, Standard Edition) 1.3, and can work with J2EE 1.4
  • ServletRequest has new methods to observe the client connection
  • New support for internationalization and charset choice
  • RequestDispatcher has new features and clarifications
  • New ServletRequest listener classes and methods
  • A deprecated SingleThreadModel
  • HttpSession details and interaction with logins have been clarified
  • Classloading and welcome-file behavior has been clarified
  • The web.xml file now uses XML Schema and has added a slew of new elements

Before we begin looking at these changes, let me point out that most servers don't yet have fully compliant Servlet 2.4 implementations. If you want to test those features, your best bet is to download the official reference implementation server, Apache Tomcat 5.0. It's open source, and you can download the server for free. Tomcat 5.0 will be the first Tomcat version to support Servlet 2.4, but, of course, its latest release is still an alpha. (See Resources for more information on Tomcat.)

Upgraded support for HTTP, J2SE, and J2EE

Servlet 2.4 depends on HTTP/1.1 and J2SE 1.3. Previously, servlets relied upon HTTP/1.0 and J2SE 1.2. Having 2.4 upgrade these minimum-level requirements means servlet authors can reliably depend on the new features of HTTP/1.1 and J2SE 1.3. At the same time, these requirements complicate the task of the servlet container developer because HTTP/1.1 has more special cases and complexities than HTTP/1.0. Some servers already support HTTP/1.1, but those that don't will have to spend some time upgrading. Note that, contrary to what some people think, having J2SE 1.3 as a minimum requirement does not mean that if you implement Servlet 2.4 on J2SE 1.2, you've succeeded with a good hack. That breaks the contract with the servlet author, a contract that says an author may rely on J2SE 1.3 features when writing against Servlet 2.4.

As another change, Servlet 2.4 will be part of the upcoming J2EE 1.4 (in fact, the two specs will be released concurrently, probably around JavaOne 2003). Of course, it's important to note that servlets can run standalone; you don't have to buy a full J2EE container to run servlets. Apache Tomcat, for example, doesn't implement all of J2EE. But when you run servlets as part of J2EE 1.4, you can take advantage of extra features and extra deployment descriptor elements exposing JNDI (Java Naming and Directory Interface) resources, EJB (Enterprise JavaBeans), message queues, and JAX-RPC (Java API for XML-based Remote Procedure Call) services. I'll talk about some of those elements later.

The upgrade to HTTP/1.1 caused one code change. Servlets have a new static constant HttpServletResponse.SC_FOUND to represent status code 302. Found is the HTTP/1.1 name for what HTTP/1.0 called Moved temporarily. HttpServletResponse.SC_MOVED_TEMPORARILY still exists and represents 302, but SC_FOUND is now preferred. SC_MOVED_TEMPORARILY can be considered deprecated, but deprecating variables, even public constants, is technically impossible in Java.

New ServletRequest methods

The ServletRequest interface (and the ServletRequestWrapper class) adds four new methods in Servlet 2.4:

  • getRemotePort(): Returns the IP source port of the client or last proxy that sent the request
  • getLocalName(): Returns the host name of the IP interface on which the request was received
  • getLocalAddr(): Returns the IP address of the interface on which the request was received
  • getLocalPort(): Returns the IP port number of the interface on which the request was received

These methods provide a mechanism to query the low-level IP connection details and understand how the connection was routed. The getRemotePort() method, combined with the preexisting getRemoteAddr() and getRemoteHost() methods, exposes the client side of the IP connection. The new getLocalPort(), getLocalAddr(), and getLocalName() methods expose the server side of the IP connection. The preexisting getServerName() and getServerPort() methods have been newly defined to expose the HTTP-layer details by simply returning the "host:port" extracted from the HTTP Host header. On a virtual hosted or load-balanced system, these methods provide a way to learn what clients, proxies, or load-balance devices connect, where they physically connect, and where they virtually connect.
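The remote/local distinction is easier to see on a bare TCP connection. The following is a minimal sketch using only java.net sockets, not the Servlet API; the class and field names are hypothetical, and the mapping to the servlet methods in the comments is an analogy rather than a statement about how any container implements them.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Plain-socket analogy for the remote/local accessors; this is not servlet code. */
final class ConnectionEndpointsDemo {

    /** What the accepting (server) side can observe about one TCP connection. */
    static final class Endpoints {
        final String remoteAddr; // analogous to getRemoteAddr()
        final int remotePort;    // analogous to getRemotePort()
        final String localAddr;  // analogous to getLocalAddr()
        final int localPort;     // analogous to getLocalPort()
        final int clientPort;    // port the client socket reports for itself
        final int listenPort;    // port the server socket is listening on

        Endpoints(Socket accepted, int clientPort, int listenPort) {
            this.remoteAddr = accepted.getInetAddress().getHostAddress();
            this.remotePort = accepted.getPort();
            this.localAddr = accepted.getLocalAddress().getHostAddress();
            this.localPort = accepted.getLocalPort();
            this.clientPort = clientPort;
            this.listenPort = listenPort;
        }
    }

    /** Opens a loopback connection and reports what the accepting side sees. */
    static Endpoints observeLoopbackConnection() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            return new Endpoints(accepted, client.getLocalPort(), server.getLocalPort());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Endpoints e = observeLoopbackConnection();
        System.out.println("remote side: " + e.remoteAddr + ":" + e.remotePort);
        System.out.println("local side:  " + e.localAddr + ":" + e.localPort);
    }
}
```

The remote port seen by the server is the client's own ephemeral port, while the local port is the one the server listens on. Behind a proxy or load balancer, the "remote" values would describe that intermediary rather than the end user.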

Internationalization

Also in Servlet 2.4, the ServletResponse interface (and the ServletResponseWrapper) adds two new methods:

  • setCharacterEncoding(String encoding): Sets the response's character encoding. This method provides an alternative to passing a charset parameter to setContentType(String) or passing a Locale to setLocale(Locale). This method has no effect if called after getWriter() has been called or if the response has committed. For a list of acceptable Internet charsets, see Resources.
  • getContentType(): Returns the response's content type. This may include a charset parameter set by either setContentType(), setLocale(), or setCharacterEncoding(). If no type has been specified, the method returns null.

The setCharacterEncoding() method pairs with the preexisting getCharacterEncoding() method to provide an easy way to manipulate and view the response's character encoding (charset). You can now avoid setting the charset via the awkward setContentType("text/html; charset=UTF-8") call.

The new getContentType() method pairs with the preexisting setContentType() method to expose the content type you've assigned. Formerly, this wouldn't have been too interesting, but now the type might be dynamically set with a combination of setContentType(), setLocale(), and setCharacterEncoding() calls, and this method provides a way to view the generated type string.

So which is better, setLocale() or setCharacterEncoding()? It depends. The former lets you specify a locale like ja for Japanese and lets the container handle the work of determining an appropriate charset. That's convenient, but, of course, many charsets might work for a given locale, and the developer has no choice in the matter. The latter method provides a new, easy way to choose a specific charset, letting you override the container's choice of Shift_JIS with EUC-JP, for example.
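To see why the choice of charset matters, consider what actually goes over the wire. This sketch uses only java.nio.charset, not the Servlet API, and the class name is hypothetical. It encodes the same two kanji under several charsets. It assumes the JVM ships the extended charsets (Shift_JIS and EUC-JP), which is why it checks Charset.isSupported() first.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

/** Shows that one string becomes different bytes under different charsets. */
final class CharsetChoiceDemo {

    /** Two kanji (U+65E5, U+672C), escaped so this source file stays ASCII. */
    static final String SAMPLE = "\u65E5\u672C";

    /** Encodes the sample text with the named charset. */
    static byte[] encode(String charsetName) {
        return SAMPLE.getBytes(Charset.forName(charsetName));
    }

    /** True when both charsets exist on this JVM and yield different bytes. */
    static boolean producesDifferentBytes(String first, String second) {
        if (!Charset.isSupported(first) || !Charset.isSupported(second)) {
            return false;
        }
        return !Arrays.equals(encode(first), encode(second));
    }

    /** Formats bytes as space-separated hex pairs for printing. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Shift_JIS", "EUC-JP", "UTF-8"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + " -> " + hex(encode(name)));
            }
        }
    }
}
```

A client that decodes the bytes with a charset other than the one the response declared will display garbage, so the declared charset and the writer's actual encoding have to agree.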

However, the story doesn't end there. Servlet 2.4 also introduces a new <locale-encoding-mapping-list> element in the web.xml deployment descriptor to let the deployer assign locale-to-charset mappings outside servlet code. It looks like this:

<locale-encoding-mapping-list>
  <locale-encoding-mapping>
    <locale>ja</locale>
    <encoding>Shift_JIS</encoding>
  </locale-encoding-mapping>
  <locale-encoding-mapping>
    <locale>zh_TW</locale>
    <encoding>Big5</encoding>
  </locale-encoding-mapping>
</locale-encoding-mapping-list>

Now within this Web application, any response assigned to the ja locale uses the Shift_JIS charset, and any assigned to the zh_TW Chinese/Taiwan locale uses the Big5 charset. These values could later be changed to UTF-8 when it grows more popular among clients. Any locales not mentioned in the list will use the container-specific defaults as before.
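The lookup this element implies can be modeled in a few lines. The sketch below is a hypothetical model, not container code. The class name, the fallback from a full locale such as ja_JP to its language ja, and the ISO-8859-1 default are all my assumptions, since the article only says unlisted locales use container-specific defaults.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical model of a locale-to-charset table with a container default. */
final class LocaleEncodingMappings {
    private final Map<String, String> mappings = new LinkedHashMap<String, String>();
    private final String containerDefault;

    /** containerDefault stands in for whatever the container would pick on its own. */
    LocaleEncodingMappings(String containerDefault) {
        this.containerDefault = containerDefault;
    }

    /** Registers one locale/encoding pair, as a deployer would in web.xml. */
    LocaleEncodingMappings map(String locale, String encoding) {
        mappings.put(locale, encoding);
        return this;
    }

    /** Exact locale first, then language only (an assumption), then the default. */
    String encodingFor(Locale locale) {
        String exact = mappings.get(locale.toString());
        if (exact != null) {
            return exact;
        }
        String byLanguage = mappings.get(locale.getLanguage());
        return byLanguage != null ? byLanguage : containerDefault;
    }

    public static void main(String[] args) {
        LocaleEncodingMappings table = new LocaleEncodingMappings("ISO-8859-1")
                .map("ja", "Shift_JIS")
                .map("zh_TW", "Big5");
        System.out.println(table.encodingFor(Locale.JAPANESE));
        System.out.println(table.encodingFor(Locale.TAIWAN));
        System.out.println(table.encodingFor(Locale.FRENCH));
    }
}
```

The point of keeping the table outside servlet code is visible here: switching ja to UTF-8 later means changing one mapping, not every servlet that calls setLocale().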

RequestDispatcher changes

Servlet 2.4 adds five new request attributes to provide extra information during a RequestDispatcher forward() call. In case you've forgotten, when you forward() to a servlet, the servlet container changes the target servlet's path environment as if it were the first servlet being invoked. The methods getRequestURI(), getContextPath(), getServletPath(), getPathInfo(), and getQueryString() all return information based on the URI (Uniform Resource Identifier) passed to the getRequestDispatcher() method. However, sometimes an advanced forward() target servlet might like to know the true original request URI. For this, Servlet 2.4 adds the following request attributes:

  • javax.servlet.forward.request_uri
  • javax.servlet.forward.context_path
  • javax.servlet.forward.servlet_path
  • javax.servlet.forward.path_info
  • javax.servlet.forward.query_string

Inside a forwarded servlet you'll see getRequestURI() return the path to the target servlet as always, but now if you want the original path, you can call request.getAttribute("javax.servlet.forward.request_uri"). One special caveat: if forward() happens through a getNamedDispatcher() call, these attributes aren't set because, in that case, the original path elements aren't changed.
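A target servlet that wants the original URI regardless of how it was reached could wrap the lookup in a small helper. This is a sketch under assumptions: RequestView is a hypothetical stand-in for the two HttpServletRequest methods involved, so the example compiles without a servlet container. Only the attribute name comes from the specification.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of reading the original request URI inside a forward() target. */
final class ForwardAwarePaths {
    static final String FORWARD_REQUEST_URI = "javax.servlet.forward.request_uri";

    /** Hypothetical stand-in for the HttpServletRequest methods used below. */
    interface RequestView {
        String getRequestURI();
        Object getAttribute(String name);
    }

    /**
     * Returns the URI the client originally asked for. When the attribute is
     * absent (no forward happened, or the forward went through a named
     * dispatcher), the current request URI is already the original one.
     */
    static String originalRequestUri(RequestView request) {
        Object original = request.getAttribute(FORWARD_REQUEST_URI);
        return original != null ? original.toString() : request.getRequestURI();
    }

    /** Builds an in-memory RequestView for demonstration purposes. */
    static RequestView request(final String uri, final Map<String, Object> attributes) {
        return new RequestView() {
            public String getRequestURI() {
                return uri;
            }
            public Object getAttribute(String name) {
                return attributes.get(name);
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<String, Object>();
        attributes.put(FORWARD_REQUEST_URI, "/shop/checkout");
        RequestView forwarded = request("/shop/internal/order.jsp", attributes);
        System.out.println(originalRequestUri(forwarded));
    }
}
```

In a real servlet, the same two calls would be made on the HttpServletRequest passed to service().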

This set of attributes may remind you of these request attributes added with Servlet 2.2:

  • javax.servlet.include.request_uri
  • javax.servlet.include.context_path
  • javax.servlet.include.servlet_path
  • javax.servlet.include.path_info
  • javax.servlet.include.query_string

However, these work just the opposite of the forward() attributes. In an include(), the path elements don't change, so the include attributes act as the backdoor to access the target servlet's path elements. Compare this with a forward() where the path elements change so the forward attributes represent the backdoor to the original path elements. Yes, it gets complicated. As soon as servlets began to use the URI space as an internal dispatch mechanism, the door to complexity opened.

Another area where we see this complexity is in the interaction between the RequestDispatcher and filters. Should filters invoke for forwarded requests? Included requests? What about for URIs invoked via the <error-page> mechanism? Before Servlet 2.4, these questions were left as open issues. Now Servlet 2.4 makes it a developer's choice. There's a new <dispatcher> element in the deployment descriptor with possible values REQUEST, FORWARD, INCLUDE, and ERROR. You can add any number of <dispatcher> entries to a <filter-mapping> like this:

<filter-mapping>
  <filter-name>Logging Filter</filter-name>
  <url-pattern>/products/*</url-pattern>
  <dispatcher>REQUEST</dispatcher>
  <dispatcher>FORWARD</dispatcher>
</filter-mapping>

This indicates the filter should be applied to requests directly from the client as well as forward requests. Adding the INCLUDE and ERROR values indicates that the filter should also be applied for include requests and <error-page> requests. Mix and match for what you want. If you don't specify any <dispatcher> elements, the default is REQUEST.
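The matching rule, including the REQUEST default, can be modeled as follows. This is a hypothetical sketch of the decision, not container code. The Dispatch enum and the method names are mine; Servlet 2.4 itself expresses these values only as strings in web.xml.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of how <dispatcher> values select when a filter runs. */
final class FilterDispatchRules {

    /** The four values a <dispatcher> element may hold. */
    enum Dispatch { REQUEST, FORWARD, INCLUDE, ERROR }

    /** Converts declared <dispatcher> values; an empty list means REQUEST only. */
    static Set<Dispatch> effective(List<String> declared) {
        if (declared.isEmpty()) {
            return EnumSet.of(Dispatch.REQUEST);
        }
        EnumSet<Dispatch> result = EnumSet.noneOf(Dispatch.class);
        for (String value : declared) {
            result.add(Dispatch.valueOf(value.trim()));
        }
        return result;
    }

    /** True when a filter mapping with these declarations covers this dispatch kind. */
    static boolean applies(List<String> declared, Dispatch kind) {
        return effective(declared).contains(kind);
    }

    public static void main(String[] args) {
        List<String> logging = Arrays.asList("REQUEST", "FORWARD");
        System.out.println(applies(logging, Dispatch.FORWARD));
        System.out.println(applies(logging, Dispatch.INCLUDE));
        System.out.println(applies(Collections.<String>emptyList(), Dispatch.REQUEST));
    }
}
```

Note one consequence of the model: a mapping that lists only FORWARD no longer covers direct client requests, because the REQUEST default applies only when no <dispatcher> element is present at all.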

The last RequestDispatcher change is to allow, for the first time, relative paths in request.getRequestDispatcher() calls. The path will be interpreted relative to the current request's path. It's a minor change, but comes in handy when dispatching to a sibling servlet.
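Relative resolution of this kind follows the same rules as relative URI references. As an illustration only, java.net.URI can show what a relative dispatcher path would resolve to. The helper below is a hypothetical sketch, and it assumes the container resolves paths the way URI.resolve() does, which the article does not spell out.

```java
import java.net.URI;

/** Illustrates relative-path resolution against the current request path. */
final class RelativeDispatchPaths {

    /**
     * Resolves dispatcherPath against currentRequestPath using standard URI
     * rules: a leading "/" replaces the whole path, anything else is resolved
     * against the current path's parent "directory".
     */
    static String resolve(String currentRequestPath, String dispatcherPath) {
        return URI.create(currentRequestPath).resolve(dispatcherPath).getPath();
    }

    public static void main(String[] args) {
        System.out.println(resolve("/products/list", "detail"));
        System.out.println(resolve("/products/list", "../cart"));
        System.out.println(resolve("/products/list", "/cart/view"));
    }
}
```

The first call is the sibling-servlet case: a servlet mapped at /products/list can dispatch to "detail" without repeating the /products prefix.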

Listeners

Servlet 2.3 introduced the idea of context and session listeners, classes that could observe when a context or session was initialized or about to be destroyed, and when attributes were added to or removed from the context or session. Servlet 2.4 expands the model to add request listeners, allowing developers (or more likely tool vendors) to observe as requests are created and destroyed, and as attributes are added to and removed from a request. Servlet 2.4 adds the following classes:

  • ServletRequestListener
  • ServletRequestEvent
  • ServletRequestAttributeListener
  • ServletRequestAttributeEvent

These classes have been modeled after the familiar ServletContextListener, ServletContextEvent, ServletContextAttributeListener, and ServletContextAttributeEvent design, and are assigned to execute using the same <listener> elements. The request variety of listener was added primarily to help debugging tools hook into the request handling. The practical applications beyond that may be slim, so I don't dig into details here.

Servlet 2.4 also attempts to clarify what happens when a listener throws an exception. Because a listener always invokes outside the service() call stack, the exception can't propagate to the servlet for handling. This issue remains undecided in Proposed Final Draft 2, but odds are listener exceptions will be handled by the <error-page> directive if it exists, and if not, will result in a simple 500 error to the client.

Session changes

Perhaps the most popular new method added in Servlet 2.4 is this one: HttpSession.logout(). This method provides a mechanism to reliably log out a user who logged in using one of the standard <auth-method> mechanisms (BASIC, DIGEST, FORM, CLIENT-CERT). Unfortunately, this method may also be the one that must be dropped from 2.4 before the final release. Placing the logout() call on the session assumes that the session manages the login, when, for everything but FORM logins, it won't. In BASIC, DIGEST, and CLIENT-CERT authentication, the client holds the login credentials and presents them to the server during the request. You can call logout() and empty the session, but the client will still send valid credentials. The server has no way to reliably make the client stop sending them. This is one reason why so many Web pages use form-based logins; they allow logout by invalidating the session or removing a client cookie.

Another 2.4 session change allows zero or negative values in the <session-timeout> element to indicate sessions should never time out. In general, such an extreme measure should be avoided, but in some cases, it might prove useful if you manually call invalidate() in some reliable fashion. Use with caution.

Lastly, Servlet 2.4 changes from may to must the requirement that a distributed session throw an IllegalArgumentException if an object placed in the session can't be serialized or otherwise sent across the wire. The added restriction should help portability by avoiding silent failures.
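The fail-fast behavior this rule asks for looks roughly like the following. It is a hypothetical model, not container code. The class name is mine, and it checks only java.io.Serializable, whereas a real distributed container may accept other object types it knows how to move across the wire.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a distributed session that rejects non-serializable values. */
final class DistributableSessionSketch {
    private final Map<String, Object> attributes = new HashMap<String, Object>();

    /**
     * Stores the value, or throws IllegalArgumentException immediately if it
     * could not be replicated, instead of failing silently at replication time.
     */
    void setAttribute(String name, Object value) {
        if (value == null) {
            attributes.remove(name); // mirrors the "null removes" convention
            return;
        }
        if (!(value instanceof Serializable)) {
            throw new IllegalArgumentException(
                    "Session attribute '" + name + "' is not serializable: "
                            + value.getClass().getName());
        }
        attributes.put(name, value);
    }

    /** Returns the stored value, or null if none exists. */
    Object getAttribute(String name) {
        return attributes.get(name);
    }

    public static void main(String[] args) {
        DistributableSessionSketch session = new DistributableSessionSketch();
        session.setAttribute("user", "alice");          // String is Serializable
        try {
            session.setAttribute("lock", new Object()); // plain Object is not
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With "may" in the spec, the second call could have succeeded on one server and quietly lost the attribute on another; with "must", the error surfaces at the line that caused it.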

Miscellaneous clarifications

Probably my favorite clarification in Servlet 2.4 is this: welcome files can be servlets. This means that index.tea with a *.tea handler can be the default file just as well as index.html or index.jsp. Most servers supported this, but some didn't, and now happily with 2.4, all must support it.

Also clarified is that any library files exposed by the container apart from the WEB-INF structure (such as the JARs Tomcat loads from $CATALINA_HOME/shared/lib) must be loaded by the same classloader within any single JVM. This enhances inter-Web application communication by avoiding potential ClassCastException problems.

Deprecations

Only one class or method was deprecated in Servlet 2.4, and this class sorely deserved it. SingleThreadModel (STM), a bad idea from the beginning, has been deprecated as of 2.4. STM is dead! Long live multithreaded programming! Although STM looks good at first blush, the alternate life cycle imposed by STM actually provides no benefits regarding thread safety, just a false sense of security. The expert group unanimously decided to deprecate it. For details on why STM should be considered harmful, see Resources.

Schema

The last change I talk about isn't a code change, but rather a format change. The web.xml file, formerly defined using a document type definition (DTD), now has its definition specified with the W3C's (World Wide Web Consortium) XML Schema language. Version 2.4 servers must still accept the 2.2 and 2.3 deployment descriptor formats, but all new elements are solely specified in Schema.

Schema is a much more verbose language than DTDs, more expressive in some ways and less expressive in others. Some restrictions have been added, like <role-name> uniqueness. Others have been relaxed; for example, the ordering of elements directly under <web-app> is no longer fixed, and <distributable/> may appear any number of times without error. Also, the <description/> tag now supports an xml:lang attribute to indicate which language is used in the description if not English.

Simple servlet containers aren't required to validate against Schema; J2EE containers are. The spec warns developers, "The deployment descriptor must be valid against the Schema," and that's good advice for portability. For the most part, the change from DTDs to XML Schema won't affect the average servlet programmer; however, it does make understanding the new web.xml format more difficult if you don't have help.

One feature of Schema (or bug, depending on how you look at things) is that elements in the web.xml file can be defined in other Schema documents from other J2EE specifications. So while the Servlet 2.4 web.xml schema mentions <message-destination>, <message-destination-ref>, and <service-ref> and dictates where they may go in web.xml, these elements' actual definitions and children are imported. Also, some elements that formerly were defined within the Servlet specification have been removed and, while still referenced by name, now get their definitions from the J2EE specification. This list includes <env-entry>, <ejb-ref>, <ejb-local-ref>, <resource-ref>, <resource-env-ref>, and all their many possible child elements. You'll also see that the <jsp-config> element definition has been moved into the JSP (JavaServer Pages) specification, although it too still appears in the Servlet schema by name.

How all these imports are supposed to be managed hasn't been made clear. And it definitely seems odd for a technology low in the stack like servlets to reference a technology above it, such as how <service-ref> gets imported from JAX-RPC. It's like TCP/IP needing to know about HTTP. Sun has said before that standalone servlets aren't a high priority, which probably helps explain some of this integrated design. How this tight coupling plays out as the specs continue to update will be interesting to watch.

What you don't see

Version 2.4 drops a few interesting things. One is Schema extensibility, present until the Proposed Final Draft stage but removed in Proposed Final Draft 2. The extensibility proposal intended to provide a way to add even more third-party elements to web.xml files. It was removed at the expert group's behest because using a single file for configuration quickly creates an unworkable mess, like putting all your source code in the same file.

A few other items were postponed. One is the New I/O (input/output) API, an exciting new J2SE feature that greatly speeds client-server communication thanks to a new channel metaphor that lets you buffer in system memory and memory-map files, leverage DMA (direct memory access), and scatter/gather hardware devices' I/O features. Unfortunately, for Servlet 2.4 to use New I/O and channels, J2SE 1.4 would have been set as a minimum requirement, and doing so was considered premature. Server vendors can still use New I/O in their implementations if they like, but servlets won't be able to take full advantage of New I/O until they can get a true channel to communicate with the client.

Also not included are any rules on how HTTP and HTTPS interfaces on the same server should interoperate. Should sessions be the same, or must they differ? Should forward() and include() work, or should you use sendRedirect()? Perhaps these issues can be clarified in the next release.

Start serving up Servlet 2.4

As I've described in this article, Servlet 2.4 adds new minimum requirements, new methods to observe the request, new methods to handle the response, new internationalization support, several RequestDispatcher enhancements, new request listener classes, session clarifications, and a new Schema-based deployment descriptor as well as several new J2EE elements. The specification document overall has also been tightened to remove ambiguities that might interfere with cross-platform deployment. All in all, the spec includes four new classes and seven methods added to existing classes, one new constant variable, and one deprecated class. For a cheat sheet on moving from 2.3 to 2.4, see the sidebar below.

Jason Hunter is author of the book Java Servlet Programming, 2nd Edition (O'Reilly, 2001; ISBN: 0596000405) and coauthor of the new Java Enterprise Best Practices (O'Reilly, 2002; ISBN: 0596003846). He's an Apache Member and, as Apache's representative to the Java Community Process Executive Committee, he established a landmark agreement for open source Java. He's publisher of Servlets.com, an original contributor to Apache Tomcat, and a member of the expert groups responsible for Servlet/JSP and JAXP (Java API for XML Parsing) development. He cocreated the open source JDOM library to enable optimized Java and XML integration. Recently he designed and developed CountryHawk, a product that quickly determines a user's country based on their IP address.
