Pool resources using Apache's Commons Pool Framework

Create a simple thread pool to handle concurrent requests

Pooling resources (also called object pooling) among multiple clients is a technique used to promote object reuse and to reduce the overhead of creating new resources, resulting in better performance and throughput. Imagine a heavy-duty Java server application that sends hundreds of SQL queries by opening and closing connections for every SQL request. Or a Web server that serves hundreds of HTTP requests, handling each request by spawning a separate thread. Or imagine creating an XML parser instance for every request to parse a document without reusing the instances. These are some of the scenarios that warrant optimization of the resources being used.

Resource usage could prove critical at times for heavy-duty applications. Some famous Websites have shut down because of their inability to handle heavy loads. Most problems related to heavy loads can be handled, at a macro level, using clustering and load-balancing capabilities. Concerns remain at the application level with respect to excessive object creation and the availability of limited server resources like memory, CPU, threads, and database connections, which could represent potential bottlenecks and, when not utilized optimally, bring down the whole server.

In some situations, the database usage policy could enforce a limit on the number of concurrent connections. Also, an external application could dictate or restrict the number of concurrent open connections. A typical example is a domain registry (like Verisign) that limits the number of available active socket connections for registrars (like BulkRegister). Pooling resources has proven to be one of the best options in handling these types of issues and, to a certain extent, also helps in maintaining the required service levels for enterprise applications.

Most J2EE application server vendors provide resource pooling as an integral part of their Web and EJB (Enterprise JavaBean) containers. For database connections, the server vendor usually provides an implementation of the DataSource interface, which works in conjunction with the JDBC (Java Database Connectivity) driver vendor's ConnectionPoolDataSource implementation. The ConnectionPoolDataSource implementation serves as a resource manager connection factory for pooled java.sql.Connection objects. Similarly, EJB instances of stateless session beans, message-driven beans, and entity beans are pooled in EJB containers for higher throughput and performance. XML parser instances are also candidates for pooling, because the creation of parser instances consumes much of a system's resources.

A successful open source resource-pooling implementation is DBCP, a database connection pooling component from the Apache Software Foundation that is built on the Commons Pool framework and is extensively used in production-class enterprise applications. In this article, I briefly discuss the internals of the Commons Pool framework and then use it to implement a thread pool.

Let's first look at what the framework provides.

Commons Pool framework

The Commons Pool framework offers a basic and robust implementation for pooling arbitrary objects. Several implementations are provided, but for this article's purposes, we use the most generic implementation, the GenericObjectPool. It uses a CursorableLinkedList, which is a doubly-linked-list implementation (part of the Jakarta Commons Collections), as the underlying data structure for holding the objects being pooled.

On top, the framework provides a set of interfaces that supply lifecycle methods and helper methods for managing, monitoring, and extending the pool.

The interface org.apache.commons.pool.PoolableObjectFactory defines the following lifecycle methods, which prove essential for implementing a pooling component:

    // Creates an instance that can be returned by the pool
    Object makeObject() throws Exception;
    // Destroys an instance no longer needed by the pool
    void destroyObject(Object obj) throws Exception;
    // Validates the object before it is used
    boolean validateObject(Object obj);
    // Initializes an instance to be returned by the pool
    void activateObject(Object obj) throws Exception;
    // Uninitializes an instance to be returned to the pool
    void passivateObject(Object obj) throws Exception;

As the method signatures suggest, this interface primarily deals with the following:

  • makeObject(): Implement the object creation
  • destroyObject(): Implement the object destruction
  • validateObject(): Validate the object before it is used
  • activateObject(): Implement the object initialization code
  • passivateObject(): Implement the object uninitialization code

Another core interface, org.apache.commons.pool.ObjectPool, defines the following methods for managing and monitoring the pool:

    // Obtain an instance from my pool
    Object borrowObject() throws Exception;
    // Return an instance to my pool
    void returnObject(Object obj) throws Exception;
    // Invalidates an object from the pool
    void invalidateObject(Object obj) throws Exception;
    // Used for pre-loading a pool with idle objects
    void addObject() throws Exception;
    // Return the number of idle instances
    int getNumIdle() throws UnsupportedOperationException;
    // Return the number of active instances
    int getNumActive() throws UnsupportedOperationException;
    // Clears the idle objects
    void clear() throws Exception, UnsupportedOperationException;
    // Close the pool
    void close() throws Exception;
    // Set the ObjectFactory to be used for creating instances
    void setFactory(PoolableObjectFactory factory)
        throws IllegalStateException, UnsupportedOperationException;

An ObjectPool implementation takes a PoolableObjectFactory as an argument in its constructors, thereby delegating object creation to the factory implementation we supply. I don't talk much about design patterns here since that is not our focus. For readers interested in looking at the UML class diagrams, please see Resources.
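To make the division of labor between pool and factory concrete, here is a minimal sketch of the order in which a pool might call the factory's lifecycle methods during a borrow, return, and second borrow. Note that SketchFactory and SketchPool are stand-ins I made up for illustration; they are not the Commons Pool classes, and they ignore threading, validation, and limits entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Stand-in for the factory contract described above; NOT the Commons Pool interface.
interface SketchFactory {
    Object makeObject();
    void activateObject(Object obj);
    void passivateObject(Object obj);
}

// A deliberately tiny, single-threaded pool that only shows the call order.
class SketchPool {
    private final Deque<Object> idle = new ArrayDeque<Object>();
    private final SketchFactory factory;

    SketchPool(SketchFactory factory) {
        this.factory = factory;
    }

    Object borrowObject() {
        // Reuse an idle instance if one exists; otherwise ask the factory for a new one.
        Object obj = idle.isEmpty() ? factory.makeObject() : idle.pop();
        factory.activateObject(obj);
        return obj;
    }

    void returnObject(Object obj) {
        // Let the factory clean the instance up before it goes back to the idle list.
        factory.passivateObject(obj);
        idle.push(obj);
    }
}

class LifecycleSketch {
    // Runs borrow, return, borrow and records which factory callbacks fire.
    static List<String> run() {
        final List<String> events = new ArrayList<String>();
        SketchPool pool = new SketchPool(new SketchFactory() {
            public Object makeObject() { events.add("make"); return new StringBuilder(); }
            public void activateObject(Object obj) { events.add("activate"); }
            public void passivateObject(Object obj) { events.add("passivate"); }
        });
        Object first = pool.borrowObject();
        pool.returnObject(first);
        Object second = pool.borrowObject();
        events.add(first == second ? "reused" : "new");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [make, activate, passivate, activate, reused]
    }
}
```

The second borrow triggers no makeObject() call: the returned instance is handed out again, which is the whole point of pooling.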

As mentioned above, the class org.apache.commons.pool.impl.GenericObjectPool is only one implementation of the org.apache.commons.pool.ObjectPool interface. The framework also provides implementations for keyed object pools, using the interfaces org.apache.commons.pool.KeyedObjectPoolFactory and org.apache.commons.pool.KeyedObjectPool, where one can associate a pool with a key (as in HashMap) and thus manage multiple pools.

A successful pooling strategy depends on how we configure the pool. A pool whose configuration parameters are not well tuned can itself become a resource hog. Let's look at some important parameters and their purpose.

Configuration details

The pool can be configured using the GenericObjectPool.Config class, which is a static nested class. Alternatively, we could just use the GenericObjectPool's setter methods to set the values.

The following list details some of the available configuration parameters for the GenericObjectPool implementation:

  • maxIdle: The maximum number of idle ("sleeping") instances allowed in the pool; idle instances beyond this limit are released.
  • minIdle: The minimum number of idle instances the pool tries to keep; no extra objects are created while at least this many are idle.
  • maxActive: The maximum number of active instances in the pool.
  • timeBetweenEvictionRunsMillis: The number of milliseconds to sleep between runs of the idle-object evictor thread. When zero or negative, no idle-object evictor thread will run. Set a positive value only when you want the evictor thread to run.
  • minEvictableIdleTimeMillis: The minimum amount of time an object may sit idle in the pool before it is eligible for eviction by the idle-object evictor. If a zero or negative value is supplied, no objects are evicted due to idle time alone.
  • testOnBorrow: When "true," objects are validated before the pool hands them out. If an object fails validation, it is dropped from the pool, and the pool attempts to borrow another.

Optimal values should be provided for the above parameters to achieve maximum performance and throughput. Since the usage pattern varies from application to application, tune the pool with different combinations of parameters to arrive at the optimal solution.
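As a rough illustration, the parameters above might be set as in the following sketch. It assumes a Commons Pool 1.x jar on the classpath; the numeric values are arbitrary placeholders rather than recommendations, and the StringBuffer factory exists only so that the pool has something to create.

```java
import org.apache.commons.pool.BasePoolableObjectFactory;
import org.apache.commons.pool.impl.GenericObjectPool;

public class PoolConfigSketch {
    public static void main(String[] args) throws Exception {
        // Option 1: collect the settings in a Config object (placeholder values).
        GenericObjectPool.Config config = new GenericObjectPool.Config();
        config.maxActive = 10;                          // at most 10 borrowed instances
        config.maxIdle = 5;                             // keep at most 5 idle instances
        config.minIdle = 2;                             // try to keep at least 2 idle
        config.timeBetweenEvictionRunsMillis = 30000L;  // run the evictor every 30 s
        config.minEvictableIdleTimeMillis = 60000L;     // evict after 60 s of idleness
        config.testOnBorrow = true;                     // validate before handing out

        // Trivial factory used only for this sketch.
        GenericObjectPool pool = new GenericObjectPool(new BasePoolableObjectFactory() {
            public Object makeObject() {
                return new StringBuffer();
            }
        }, config);

        // Option 2: adjust individual values through the setter methods.
        pool.setMaxActive(20);

        Object obj = pool.borrowObject();
        try {
            System.out.println("active=" + pool.getNumActive() + ", idle=" + pool.getNumIdle());
        } finally {
            pool.returnObject(obj); // always give the instance back
        }
        pool.close();
    }
}
```

Returning the object in a finally block matters: an instance that is never returned counts against maxActive for as long as the pool lives.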

To understand more about the pool and its internals, let's implement a thread pool.

Proposed thread pool requirements

Suppose we were told to design and implement a thread pool component for a job scheduler to trigger jobs at specified schedules and report the completion and, possibly, the result of the execution. In such a scenario, the objective of our thread pool is to pool a predefined number of threads and execute the scheduled jobs in independent threads. The requirements are summarized as follows:

  • The thread should be able to invoke any arbitrary class method (the scheduled job)
  • The thread should be able to return the result of an execution
  • The thread should be able to report the completion of a task

The first requirement provides scope for a loosely coupled implementation as it doesn't force us to implement an interface like Runnable. It also makes integration easy. We can implement our first requirement by providing the thread with the following information:

  • The name of the class
  • The name of the method to be invoked
  • The parameters to be passed to the method
  • The parameter types of the parameters passed
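With those four pieces of information, the invocation itself can be done through Java reflection. The sketch below is one way it might look; the class ReflectiveInvokerSketch and its invoke() method are names I chose for illustration, and the sketch assumes the target class has a public no-argument constructor.

```java
import java.lang.reflect.Method;

class ReflectiveInvokerSketch {
    // Loads the class by name, creates an instance with its no-arg constructor,
    // and invokes the named method with the given parameters.
    // Reflection failures are wrapped in an unchecked exception.
    static Object invoke(String className, String methodName,
                         Class<?>[] paramTypes, Object[] params) {
        try {
            Class<?> cls = Class.forName(className);
            Object target = cls.getDeclaredConstructor().newInstance();
            Method method = cls.getMethod(methodName, paramTypes);
            return method.invoke(target, params);
        } catch (Exception e) {
            throw new IllegalStateException(
                    "Could not invoke " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Calls new StringBuilder().append("job done") without compile-time coupling.
        Object result = invoke("java.lang.StringBuilder", "append",
                new Class<?>[] { String.class }, new Object[] { "job done" });
        System.out.println(result); // job done
    }
}
```

Because the caller supplies only strings and arrays, the job class needs no knowledge of the thread pool, which is the loose coupling the first requirement asks for.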

The second requirement allows a client using the thread to receive the execution result. A simple implementation would be to store the result of the execution and provide an accessor method like getResult().

The third requirement is somewhat related to the second requirement. Reporting a task's completion may also mean that the client is waiting to get the result of the execution. To handle this capability, we can provide some form of a callback mechanism. The simplest callback mechanism can be implemented using the java.lang.Object's wait() and notify() semantics. Alternatively, we could use the Observer pattern, but for now let's keep things simple. You might be tempted to use the java.lang.Thread class's join() method, but that won't work since the pooled thread never completes its run() method and keeps running as long as the pool needs it.
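The wait() and notify() handoff can be sketched as follows. ResultHandoffSketch is a hypothetical helper, not part of the design shown later; it only demonstrates how a client might block until a worker publishes its result.

```java
class ResultHandoffSketch {
    private final Object lock = new Object();
    private boolean done = false;
    private Object result;

    // Called by the worker when the task finishes.
    void complete(Object value) {
        synchronized (lock) {
            result = value;
            done = true;
            lock.notifyAll(); // wake every client waiting for the result
        }
    }

    // Called by the client; blocks until complete() has run.
    Object awaitResult() throws InterruptedException {
        synchronized (lock) {
            while (!done) {   // the loop guards against spurious wakeups
                lock.wait();
            }
            return result;
        }
    }

    // Runs a task on another thread and waits for its result.
    static Object demo() {
        final ResultHandoffSketch handoff = new ResultHandoffSketch();
        Thread worker = new Thread(new Runnable() {
            public void run() {
                handoff.complete(Integer.valueOf(6 * 7));
            }
        });
        worker.start();
        try {
            return handoff.awaitResult();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 42
    }
}
```

The done flag is what makes the handoff safe: if the worker finishes before the client starts waiting, the client sees the flag and returns immediately instead of waiting for a notification that has already passed.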

Now that we have our requirements ready and a rough idea as to how to implement the thread pool, it's time to do some real coding.

At this stage, our UML class diagram of the proposed design looks like the figure below.

Figure: Our design's UML class diagram.

Implementing the thread pool

The object we are going to pool is actually a wrapper around a thread. Let's call the wrapper the WorkerThread class, which extends the java.lang.Thread class. Before we can start coding WorkerThread, we must implement the framework requirements. As we saw earlier, we must implement the PoolableObjectFactory, which acts as a factory, to create our poolable WorkerThreads. Once the factory is ready, we implement the ThreadPool by extending the GenericObjectPool. Then, we finish our WorkerThread.

Implementing the PoolableObjectFactory interface

We begin with the PoolableObjectFactory interface and try to implement the necessary lifecycle methods for our thread pool. We write the factory class ThreadObjectFactory as follows:

 

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.commons.pool.PoolableObjectFactory;

    public class ThreadObjectFactory implements PoolableObjectFactory {

        // Assumption: the listing does not show how log is declared;
        // a Commons Logging logger is used here.
        private static final Log log = LogFactory.getLog(ThreadObjectFactory.class);

        public Object makeObject() {
            return new WorkerThread();
        }

        public void destroyObject(Object obj) {
            if (obj instanceof WorkerThread) {
                WorkerThread rt = (WorkerThread) obj;
                rt.setStopped(true); // Make the running thread stop
            }
        }

        public boolean validateObject(Object obj) {
            if (obj instanceof WorkerThread) {
                WorkerThread rt = (WorkerThread) obj;
                // A thread without a thread group has already died
                if (rt.isRunning() && rt.getThreadGroup() == null) {
                    return false;
                }
            }
            return true;
        }

        public void activateObject(Object obj) {
            log.debug(" activateObject...");
        }

        public void passivateObject(Object obj) {
            log.debug(" passivateObject..." + obj);
            if (obj instanceof WorkerThread) {
                WorkerThread wt = (WorkerThread) obj;
                wt.setResult(null); // Clean up the result of the execution
            }
        }
    }

Let's walk through each method in detail:

Method makeObject() creates the WorkerThread object. For every request, the pool is checked to see whether a new object is to be created or an existing object is to be reused. For example, if a particular request is the first request and the pool is empty, the ObjectPool implementation calls makeObject() and adds the WorkerThread to the pool.

Method destroyObject() removes the WorkerThread object from the pool by setting a Boolean flag and thereby stopping the running thread. We will look at this piece again later, but notice that we are now taking control over how our objects are being destroyed.
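To see why a Boolean flag is enough to stop a pooled thread, consider the following sketch. StoppableWorkerSketch is a hypothetical, stripped-down stand-in for WorkerThread: it does no work and only shows a run() loop that keeps the thread alive until setStopped(true) is called.

```java
class StoppableWorkerSketch extends Thread {
    private final Object lock = new Object();
    private boolean stopped = false;

    // Asks the loop in run() to exit and wakes the thread if it is waiting.
    void setStopped(boolean value) {
        synchronized (lock) {
            stopped = value;
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        synchronized (lock) {
            while (!stopped) {
                try {
                    lock.wait(); // idle until there is a stop request
                } catch (InterruptedException e) {
                    return;      // treat interruption as a stop request too
                }
            }
        }
    }

    // Starts a worker, requests a stop, and reports whether the thread ended.
    static boolean demo() {
        StoppableWorkerSketch worker = new StoppableWorkerSketch();
        worker.start();
        worker.setStopped(true);
        try {
            worker.join(5000); // join() works here because run() now returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + demo()); // stopped: true
    }
}
```

Until the flag is set, run() never returns, which is also why join() is of no use for detecting the completion of an individual task.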
