A recipe for cookie management

Integrate an easy-to-use library for client-side cookie handling

While developing a universal email client offering single-point access to all major Internet mail services, Web-based or otherwise, I found my application often had to act as a mini Web browser to interact with mail provider Websites. I stumbled upon the same need for Website interaction while developing XML Web service implementations to facilitate machine access to Websites. These sites often use cookies for state management—that is, to maintain user session data. In both cases, I realized that most Website interaction logic dealt with cookie handling. I also noticed that although both applications performed cookie handling, the logic was quite different and not interchangeable. In response to these limitations, I set out to develop a lightweight general-purpose library devoted to cookie handling. In this article, I share this library with you.

To illustrate the library in action, I present a console-based Hotmail email checker. Further, I explore client-side state management from the mobile device perspective using the Mobile Information Device Profile (MIDP) from the Java 2 Platform, Micro Edition (J2ME).

Cookie basics

Let's begin by answering some questions:

  • What is state management, and why do we need it?
  • What are cookies, and how do they fit into the picture?

To answer the first question, we must examine HTTP more closely. HTTP is basically a stateless protocol because, from a Web server's perspective, all HTTP requests are independent of preceding requests. That means each HTTP response depends entirely on the information contained in the corresponding request. While this behavior allows a simple and efficient Web server implementation, using it as a basis for complex Web applications proves rather difficult.

The state management mechanism overcomes this HTTP limitation and allows Web clients and servers to maintain a relationship between requests. The period during which this relationship holds is called a session. Most Web applications that require you to log in use sessions and state management. Shopping cart applications use state management to hold a list of all items marked for purchase. State management enables customization of portals and search engines to a particular user's preferences. Web applications can even use state management to tailor Website content based on user interests.

Cookies make state management possible. Cookies are small pieces of text that the server stores on the local machine and that the client sends back with every request to that same server. The IETF RFC (request for comments) 2965, HTTP State Management Mechanism, is the current cookie specification. The Web server uses HTTP headers, specified in the RFC, to send cookies to the client. At the client end, the browser parses these cookies and stores them in a local file. It then attaches these cookies automatically to any requests to the same server. In the remainder of this article, I use the terms cookie handling and state management synonymously.

If you want to find out which of the sites you visit use cookies, try this simple experiment:

Warning: Perform this exercise only if you feel comfortable changing your browser's settings and generally know your way around your system.

  • Open your favorite browser. I assume you're using either Internet Explorer (IE) 5+ or Netscape Navigator 4+.
  • Disable automatic cookie handling:

    • In IE, from the Tools menu, choose Internet Options, then Security. Click Custom Level and scroll down until you see "Allow cookies that are stored on your computer." Choose the Prompt radio button. Also choose the Prompt radio button under "Allow per-session cookies (not stored)." Click OK to return to your main window.
    • In Netscape Navigator, from the Edit menu, choose Preferences, then Advanced. Check "Warn me before accepting a cookie." Now click OK to return to the main window.
  • Now, browse your favorite sites. Odds are, when you check your Webmail or shop online, dialog boxes will bombard you, asking for your permission to accept cookies.

Reverse the steps above to return to your original settings. You can also see what cookies are stored on your local machine (previous warning applies):

  • For IE: Using Windows Explorer or My Computer, browse to C:\Windows\Cookies. All the text files in this directory contain cookies.
  • For Netscape Navigator:

    • On Windows, using Windows Explorer or My Computer, browse to C:\Program Files\Netscape\Users. Look for a file named "cookies.txt" or "cookies" in the subdirectories.
    • On Unix(-like) systems, look for a file named "cookies" in the ".netscape" directory.

Note: Depending on your system setup, the steps to disable automatic cookie handling and view stored cookies might vary.

Now that you know the basics, I next explain how this discussion relates to Java.

State management in Java

Java applications require cookie handling in the following scenarios:

  • Website interaction: To interact with Websites, Internet-based client-side applications often act as mini Web browsers. These sites use cookies for state management—that is, to maintain user session data.
  • Web services implementations: Web services promise to make the Web a friendlier place for machines. One desirable way to permit machine-Website interaction is to have a Web service front the Website. Thus, the Web service presents a machine-friendly view of the target Website. Such Web services' implementations would need cookie handling to carry out the actual Website interaction.
  • Web browsers: Java-based Web browsers would need cookie-handling modules to support state management.

To carry out client-side cookie handling, browsers follow these steps:

  • For retrieving cookies:

    1. Extract the cookies from the incoming HTTP headers
    2. Parse the cookies for individual components (name, value, path, and so on)
    3. Determine if the host is permitted to set those cookies
  • For sending cookies:

    1. Determine which cookies can be sent to the host
    2. For multiple cookies, determine the order in which cookies must be sent
    3. Format and send the cookies with outgoing HTTP headers
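As a rough illustration of the parsing, matching, and ordering steps above, the following sketch uses only the standard library. The class and method names are hypothetical and are not part of jCookie, and the rules are deliberately simplified from RFC 2965: the parser splits naively on semicolons (so it does not handle quoted strings), domain matching covers only exact and leading-dot suffix matches, and path matching is a plain prefix test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of jCookie) illustrating the steps listed above. */
final class CookieStepsSketch {

    /**
     * Splits a Set-Cookie value such as "sid=abc; Path=/shop; Domain=.example.com"
     * into an ordered map. The first pair is the cookie itself and keeps its name
     * as written; attribute names that follow are lower-cased.
     * Assumption: no quoted strings containing semicolons.
     */
    static Map<String, String> parse(String setCookieValue) {
        Map<String, String> parts = new LinkedHashMap<>();
        boolean first = true;
        for (String piece : setCookieValue.split(";")) {
            String p = piece.trim();
            if (p.isEmpty()) {
                continue;
            }
            int eq = p.indexOf('=');
            String key = (eq < 0) ? p : p.substring(0, eq).trim();
            String val = (eq < 0) ? "" : p.substring(eq + 1).trim();
            parts.put(first ? key : key.toLowerCase(), val);
            first = false;
        }
        return parts;
    }

    /** Simplified domain match: equal hosts, or host ends with a leading-dot domain. */
    static boolean domainMatches(String host, String cookieDomain) {
        String h = host.toLowerCase();
        String d = cookieDomain.toLowerCase();
        return h.equals(d) || (d.startsWith(".") && h.endsWith(d));
    }

    /** Simplified path match: the cookie path must be a prefix of the request path. */
    static boolean pathMatches(String requestPath, String cookiePath) {
        return requestPath.startsWith(cookiePath);
    }

    /** Orders paths so that more specific (longer) paths come first. */
    static List<String> mostSpecificFirst(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));
        return sorted;
    }
}
```

A caller might parse each incoming header with parse(), keep only cookies for which domainMatches() and pathMatches() hold for the target URL, and then order them with mostSpecificFirst() before formatting the outgoing Cookie header. A complete implementation would also need the expiry, port, and security checks that the specification describes.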

A client-side Java application should follow all the above steps. But implementing all those steps using the rules specified in RFC 2965 eats up considerable time and distracts the developer from the core application. As a result, the developer often chooses to compromise the specification and ends up with slipshod custom cookie-handling code that could break easily.

For example, suppose you want to write a Java client application that interacts with a servlet-based shopping Web application. On the server side, when the servlet first asks the servlet container for a session by calling request.getSession(), the container creates a new session and the server uses a session ID to retrieve the session object on subsequent requests. The server automatically sends this session ID to the client as an HTTP cookie. In subsequent requests, the client sends back the same session ID along with its request. The server uses the ID to locate the right session object for the servlet processing the request. Typically, the code on the client would look like this:

      // Assumes an existing java.net.URL named url; uses java.net.HttpURLConnection and java.io.InputStream.
      /* To get cookie. */
      HttpURLConnection huc = (HttpURLConnection) url.openConnection();
      InputStream is = huc.getInputStream();
      // Retrieve session ID from response.
      String cookieVal = huc.getHeaderField("Set-Cookie");
      String sessionId = null;
      if (cookieVal != null) {
            // Keep only the leading name=value pair; attributes follow the first semicolon.
            int end = cookieVal.indexOf(';');
            sessionId = (end >= 0) ? cookieVal.substring(0, end) : cookieVal;
      }
      /* To send cookie. */
      HttpURLConnection huc2 = (HttpURLConnection) url.openConnection();
      if (sessionId != null)
            huc2.setRequestProperty("Cookie", sessionId);
      InputStream is2 = huc2.getInputStream();

The cookie specification RFC 2965 defines a new header, Set-Cookie2, for version 1 cookies. If we upgraded the server to use the new header, the above code would break, because it reads only the Set-Cookie header. The code also could not handle multiple cookies, because getHeaderField() returns only one header value. In addition, a version 1 cookie's value can be a quoted string. If the session cookie's value were a quoted string containing a semicolon, the substring() call would cut the value short. In short, the code snippet above is not insulated from the cookie version used and is unnecessarily coupled to it.

The above code might be acceptable for a simple application that interacts with only one particular host and path map. But for a more ambitious application, cookie management grows complex when multiple hosts and paths are involved. It would prove painful and unproductive for a developer to implement all the algorithms, security checks, and balances outlined in the cookie specification.
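To make two of those failure modes concrete, here is a minimal sketch, again with hypothetical names that do not come from jCookie. It gathers every Set-Cookie and Set-Cookie2 value from a header map of the shape returned by URLConnection.getHeaderFields(), and it splits a header value on semicolons only when they fall outside double quotes. It assumes no backslash-escaped quotes inside quoted strings and does not split the comma-separated cookie lists that a single Set-Cookie2 header may carry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of jCookie) for scanning cookie headers. */
final class HeaderScanSketch {

    /**
     * Collects all Set-Cookie and Set-Cookie2 values from a header map such as
     * the one returned by URLConnection.getHeaderFields(). Header names are
     * compared case-insensitively; the status line has a null key and is skipped.
     */
    static List<String> cookieHeaderValues(Map<String, List<String>> headers) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            String name = e.getKey();
            if (name != null && (name.equalsIgnoreCase("Set-Cookie")
                    || name.equalsIgnoreCase("Set-Cookie2"))) {
                values.addAll(e.getValue());
            }
        }
        return values;
    }

    /**
     * Splits a header value on semicolons that are outside double quotes.
     * Assumption: quoted strings contain no backslash-escaped quotes.
     */
    static List<String> splitOutsideQuotes(String headerValue) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char ch : headerValue.toCharArray()) {
            if (ch == '"') {
                inQuotes = !inQuotes;
            }
            if (ch == ';' && !inQuotes) {
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(ch);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString().trim());
        }
        return parts;
    }
}
```

Even with both helpers in place, the domain, path, port, and expiry rules remain to be written, which is the effort a dedicated library is meant to save.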

Enter jCookie

To ease some of this heartache, I've developed a general-purpose cookie library christened, predictably, jCookie, which implements the cookie specification. The library minimizes the extra coding and effort required for client-side cookie handling and lets the developer focus on the core application. Other APIs/libraries do exist (for example, HTTPClient from Apache), but they use architectures far removed from the built-in native java.net APIs, thus introducing a new learning curve. My API can be as simple as a method call on existing java.net objects.

You could also use a stripped-down version of jCookie currently under development, called jCookieMicro, on J2ME-enabled mobile devices to create an exciting suite of fat clients that can interact with Web server-based applications.

I now introduce the jCookie API's major actors. I'll start with the two main data structures:

  1. Class Cookie: An instance of this class represents a single cookie. It encapsulates all the cookie properties defined by RFC 2965 and provides access to those properties using getters and setters.
  2. Class CookieJar: An instance of this class acts as a container for a collection of Cookie objects. It conforms to the Collections Framework and provides operations to manipulate the cookie collection.
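To give a feel for what such a pair of data structures could look like, the sketch below shows a simple cookie value object and a container built on java.util.AbstractCollection. This is an assumption-laden illustration only: the names SketchCookie and SketchCookieJar, the chosen properties, and the replace-on-add rule are mine and may differ from the actual jCookie classes.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

/** Illustrative stand-in for a cookie value object; the real Cookie class may differ. */
final class SketchCookie {
    private final String name;
    private String value;
    private String domain;      // null until set
    private String path = "/";  // assumed default for this sketch

    SketchCookie(String name, String value) {
        this.name = name;
        this.value = value;
    }

    String getName() { return name; }
    String getValue() { return value; }
    void setValue(String value) { this.value = value; }
    String getDomain() { return domain; }
    void setDomain(String domain) { this.domain = domain; }
    String getPath() { return path; }
    void setPath(String path) { this.path = path; }
}

/** Illustrative container that plugs into the Collections Framework. */
final class SketchCookieJar extends AbstractCollection<SketchCookie> {
    private final List<SketchCookie> cookies = new ArrayList<>();

    /** Adds a cookie, replacing any stored cookie with the same name, domain, and path. */
    @Override
    public boolean add(SketchCookie c) {
        cookies.removeIf(old -> old.getName().equals(c.getName())
                && Objects.equals(old.getDomain(), c.getDomain())
                && old.getPath().equals(c.getPath()));
        return cookies.add(c);
    }

    @Override
    public Iterator<SketchCookie> iterator() {
        return cookies.iterator();
    }

    @Override
    public int size() {
        return cookies.size();
    }
}
```

Because the container extends AbstractCollection, it works with the enhanced for loop, addAll(), and other standard collection operations without further code.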

The API presents two views to simultaneously satisfy the developer requiring transparent cookie handling and the developer requiring advanced features. The following figure illustrates these views, or layers.

Figure: jCookie library's layered views

The jCookie architecture

Below, I describe the layers and the various classes they use.

Layer 1

Those developers wanting (almost) transparent cookie handling, which is usually the case, use Layer 1. At this level, you use the Client class to handle cookies. It has two primary methods:

  • public CookieJar getCookies(URLConnection urlConn): This method extracts the cookies from the given URLConnection, parses them into Cookie objects, and returns them as a CookieJar
  • public CookieJar setCookies(URLConnection urlConn, CookieJar cj): This method picks out the appropriate Cookie objects from the CookieJar and sets URLConnection's headers

Layer 0

Those developers who cannot breathe without being neck-deep in code (myself included) use Layer 0. At this level, you can change the parsing logic and the security rules used by the cookie-handling code. To do this, first implement the CookieParser interface. It has the following four methods:

  • public Header getCookieHeaders(CookieJar cj): Converts the Cookies in the CookieJar to a header set suitable for sending with an HTTP request
  • public boolean allowedCookie(Cookie c, URL url): Checks whether a request to the given URL can return the specified Cookie
  • public CookieJar parseCookies(Header h, URL url): Converts the headers in an HTTP response into a CookieJar of Cookie objects
  • public boolean sendCookieWithURL(Cookie c, URL url, boolean bRespectExpires): Checks whether the given Cookie can be sent with a request for the given URL

You can use the Client class's setCookieParser(CookieParser cp) method to set the CookieParser implementation. The default CookieParser used by the library is an implementation of the RFC 2965 cookie specification.

At Layer 1, jCookie acts as a library; at Layer 0, it incorporates elements of an API.

jCookie usage

The Client class invokes the cookie-handling logic at both layers. It provides the application developer's library view. To use the jCookie library, follow these steps:

  • To retrieve cookies from responses to requests:

    1. Create a URLConnection object and set it up
    2. Connect the URLConnection
    3. Create a Client object and set a custom CookieParser if desired
    4. Obtain a CookieJar of Cookies by calling the Client instance's getCookies() method, passing in URLConnection as an argument
    5. Do something with the HTTP response
  • To send cookies with a request (assuming a CookieJar was retrieved earlier):

    1. Create a URLConnection object and set it up
    2. Create a Client object and set a custom CookieParser if desired
    3. Set cookie headers by calling the Client instance's setCookies() method, passing in URLConnection and CookieJar as arguments
    4. Connect the URLConnection
    5. Do something with the HTTP response

The following snippet shows common jCookie usage. The jCookie code is highlighted:
