Solving common Java EE performance problems

A troubleshooting manual for your Java EE environment

Java EE (Java Platform, Enterprise Edition) applications, regardless of the application server they are deployed to, tend to experience the same sets of problems. As a Java EE tuner, I have been exposed to a variety of environments and have made some observations about common problems. In this capacity, I see my role as similar to that of an automobile mechanic: you tell your mechanic that the engine is chirping; then he asks you a series of questions that guide you in quantifying the nature, location, and circumstances of the chirp. From this information, he forms a good idea about a handful of possible causes of the problem.

In much the same way, I spend the first day of a tuning engagement interviewing my clients. During this interview, I look for known problems as well as architectural decisions that may negatively affect the performance of the application. With an understanding of the application architecture and the symptoms of the problem, I greatly increase my chances of resolving the problem. In this article, I share some of the common problems that I have encountered in the field, along with their symptoms. Hopefully, it can serve as a troubleshooting manual for your Java EE environment.

Out-of-memory errors

One of the most common problems that plagues enterprise applications is the dreaded OutOfMemoryError. The error is typically followed by one of the following:

  • An application server crash
  • Degraded performance
  • A seemingly endless loop of repeated garbage collections that nearly halts processing and usually leads to an application server crash

Regardless of the symptoms, you will most likely need to reboot the application server before performance returns to normal.

Causes of out-of-memory errors

Before you attempt to resolve an out-of-memory error, it helps first to understand how one can occur. When a process attempts to create a new object instance and the JVM lacks the memory to hold it anywhere in its process memory space, including all regions in the heap as well as the permanent memory space, the garbage collector executes to try to free enough memory to allow the new object's creation. If the garbage collector cannot free enough memory to hold the new object, then it throws an OutOfMemoryError.

Out-of-memory errors most commonly result from Java memory leaks. Recall from previous discussions that a Java memory leak is the result of maintaining a lingering reference to an unused object: you are finished using an object, but because one or more other objects still reference that object, the garbage collector cannot reclaim its memory. The memory occupied by that object is thus lost from the usable heap. These types of memory leaks typically occur during Web requests, and while one or two leaked objects may not crash your application server, 10,000 or 20,000 requests might. Furthermore, most objects that are leaked are not simple objects such as Integers or Doubles, but rather represent subgraphs within the heap. For example, you may inadvertently hold on to a Person object, and that Person object has a Profile object that has several PerformanceReview objects that each maintain sets of data. Rather than losing 100 bytes of memory that the Person object occupies, you lose the entire subgraph that might account for 500 KB or more of memory.
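
To make this concrete, the following is a minimal sketch of the pattern just described; the class and method names are hypothetical. A static map records each visitor, and nothing ever removes the entries, so every request leaks an entire Person subgraph:

 import java.util.HashMap;
 import java.util.Map;

 public class LeakyRegistry {

     // This map lives for the lifetime of the class, not the request:
     // entries placed here are never eligible for garbage collection
     // unless they are explicitly removed.
     private static final Map<String, Person> VISITORS = new HashMap<String, Person>();

     public void handleRequest(String sessionId) {
         Person person = loadPerson(sessionId);

         // The leak: an entry is added on every request and never removed,
         // so the Person and everything it references survive all collections.
         VISITORS.put(sessionId, person);
         // ... render the response ...
     }

     private Person loadPerson(String sessionId) {
         return new Person(sessionId); // stands in for a database load
     }
 }

 class Person {
     private final String id; // in a real application: Profile, PerformanceReviews, etc.
     Person(String id) { this.id = id; }
 }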

In order to identify the root of this problem, you need to determine whether a real memory leak exists or whether something else is manifesting as an OutOfMemoryError. I use the following two techniques when making this determination:

  • Analyze deep memory statistics
  • Inspect the growth pattern of the heap

The JVM tuning process is not the same for all JVMs, such as Sun and IBM, but some commonalities exist.

Sun JVM memory management

The Sun JVM is generational, meaning that objects are created in one space and given several chances to die before they are tenured into a long-term space. Specifically, the Sun JVM is broken into the following spaces:

  • Young generation, including Eden and two survivor spaces (the From space and the To space)
  • Old generation
  • Permanent generation

Figure 1 illustrates the breakdown of the Sun heap's generations and spaces.

Figure 1. The Sun JVM is partitioned into two major generations: the old generation and the young generation

Objects are created in Eden. When Eden is full, the garbage collector iterates over all objects in Eden, copies live objects to the first survivor space, and frees memory for any dead objects. When Eden again becomes full, it repeats the process by copying live objects from Eden to the second survivor space, and then copying live objects from the first survivor space to the second survivor space. If the second survivor space fills and live objects remain in Eden or in the first survivor space, then these objects are tenured (that is, they are copied to the old generation). When the garbage collector cannot reclaim enough memory by executing this type of minor collection, also known as a copy collection, then it performs a major collection, also known as a stop-the-world collection. During the stop-the-world collection, the garbage collector suspends all threads and performs a mark-and-sweep collection on the entire heap, leaving the entire young generation empty and ready to restart this process.

Figures 2 and 3 illustrate how minor collections run.

Figure 2. Objects are created in Eden until it is full.
Figure 3. The order of processing is important: The garbage collector first traverses Eden and then the survivor space; this ensures that objects are given ample opportunity to die before being tenured.

Figure 4 illustrates how a major collection runs.

Figure 4. When the garbage collector frees all dead objects and moves all live objects to a newly compacted tenured space, it leaves Eden and both survivor spaces empty.

From Sun's implementation of garbage collection, you can see that objects in the old generation can be collected only by a major collection. Long-lived objects are expensive to clean up, so you want to ensure that short-lived objects die in a timely manner before they have a chance to be tenured, and hence require a major garbage collection to reclaim their memory.

All of this background prepares us to identify memory leaks. Memory is leaked in Java when an object maintains an unwanted reference to another object, hence stopping the garbage collector from reclaiming its memory. In light of the architecture of the Sun JVM, objects whose references are never released will make their way through Eden and the survivor spaces into the old generation. Furthermore, in a multiuser Web-based environment, if multiple requests are being made to leaky code, we will see a pattern of growth in the old generation.

Figure 5 highlights potential candidates for leaked objects: objects that survive multiple major collections in the tenured space. Not all objects in the tenured space represent memory leaks, but all leaked objects will eventually end up in the tenured space. If a true memory leak exists, the tenured space will begin filling up with leaked objects until it runs out of memory.

Therefore, we want to track the effectiveness of garbage collection in the old generation: each time that a major garbage collection runs, how much memory is it able to reclaim? Is the memory use in the old generation growing according to any discernable pattern?

Figure 5. The shaded objects are those that have survived multiple major collections and are potential memory leaks

Some of this information is available through monitoring APIs, and detailed information is available through verbose garbage collection logs. The level of logging affects the performance of the JVM, and as with almost any monitoring technology, the more detailed (and useful) information you want, the more expensive it is to obtain. For the purposes of determining whether a memory leak exists, I use relatively standard settings that show the overall change in generational memory between garbage collections and draw conclusions from that. Sun reports the overhead for this level of logging at approximately 5 percent, and many of my clients run with these settings enabled all the time to ensure that they can manage and tune garbage collection. The following settings usually give you enough information to analyze:

 -verbose:gc -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Observable trends in the heap overall can point to a potential memory leak, but looking specifically at the growth rate of the old generation can be more definitive. But remember that none of this investigation is conclusive: in order to conclusively determine that you have a memory leak, you need to run your application off-line in a memory profiler.

IBM JVM memory management

The IBM JVM works a little differently. Rather than starting with a large generational heap, it maintains all objects in a single space and frees memory as the heap grows, running different levels of garbage collection. The main behavior of this heap is that it starts relatively small, fills up, and at some point executes a mark-sweep-compact garbage collection to clean up dead objects and compact live objects at the bottom of the heap. As the heap grows, long-lived objects get pushed to the bottom of the heap, so your best bet for identifying potential memory leaks is to observe the behavior of the heap in its entirety: is the heap trending upward?

Resolving memory leaks

Memory leaks are elusive, but if you can identify the request causing the memory leak, then your work is much easier. Take your application to a development environment and run it inside a memory profiler, performing the following steps:

  1. Start your application inside the memory profiler
  2. Execute your use-case (make the request) once to allow the application to load all of the objects that it needs in memory to satisfy the request; this reduces the amount of noise that you have to sift through later
  3. Take a snapshot of the heap to capture all objects in the heap before the use-case has been executed
  4. Execute your use-case again
  5. Take another snapshot of the heap to capture all objects in the heap after the use-case has been executed
  6. Compare the two snapshots and look for objects that should not remain in the heap after executing the use-case

At this point, you will need access to developers involved in coding the request you are testing so that they can make a determination about whether an object is, in fact, being leaked or if it is supposed to remain in memory for some purpose.

If nothing screams out as a leaked object after performing this exercise, one trick I sometimes use is to repeat Step 4 a distinctive number of times. For example, I might configure my load tester to execute the request 17 times, in hopes that my leak analysis will show 17 instances of some object (or a multiple of 17). This technique is not always effective, but it has greatly helped me when each execution of a request leaks objects.

If you cannot isolate the memory leak to a specific request, then you have two options:

  • Profile each suspected request until you find the memory leak
  • Configure a monitoring tool with memory capabilities

The first option is feasible in a small application or if you were lucky enough to partially isolate the problem, but not very feasible for large applications. The second option is more effective if you can gain access to the monitoring tools. These tools track object creation and destruction counts through bytecode instrumentation and typically report the number of objects held in predefined or user-defined classes, such as the Collections classes, as a result of individual requests. For example, a monitoring tool might report that the /action/login.do request left 100 objects in a HashMap after it completed. This report does not tell you where the memory leak is in the code or the specific object that it leaks, but it tells you, with very low overhead, what requests you need to look at inside a memory profiler. Finding memory leaks in a production environment without crashing your application server is tricky, but tools with these monitoring capabilities make your job much easier!

Artificial memory leaks

A few issues can appear to be memory leaks that in actuality are not. I refer to these as artificial memory leaks, and they may appear in the following situations:

  • Premature analysis
  • Leaky sessions
  • Permanent space anomalies

This section examines each artificial memory leak, describing how to detect it and how to work around it.

Premature analysis

To avoid a false positive when searching for memory leaks, you need to ensure that you are observing and analyzing the heap at the appropriate time. The danger is that, because a certain number of long-lived objects need to be in the heap, the trend may look deceptive until the heap reaches a steady state and contains its core objects. Wait until your application reaches this steady state before performing any trend analysis on the heap.

To detect whether you are analyzing the heap prematurely, continue monitoring it for a couple of hours after your analysis snapshot to see whether the upward heap trend levels off or continues indefinitely. If the trend levels off, then capture a new memory recording at that point. If the trend continues upward, then analyze the memory session you have.

Leaky sessions

Memory leaks tend to occur during Web requests, but during a Web request, objects can be stored only in a finite number of places. Those places include the following:

  • Page scope
  • Request scope
  • Session scope
  • Application scope
  • Static variables
  • Long-lived class variables, such as inside a servlet itself

When implementing JSPs (JavaServer Pages), any variable created inside the JSP itself will be eligible for garbage collection as soon as the page completes; these variables exist for the lifetime of a single page.

Attributes and parameters that are passed from the Web server to the application server, as well as attributes that are passed between servlets and JSPs, live inside an HttpServletRequest object. The HttpServletRequest object serves as a communication mechanism for various components in your dynamic Web tier, but as soon as the request is complete and the socket connected to the user is closed, the servlet container frees all variables stored in the HttpServletRequest. These variables exist for the lifetime of a single request.

HTTP is a stateless protocol, meaning that a client makes a request of the server, the server responds to the request, the communication is terminated, and the conversation is complete. Because we appreciate being able to log on to a Website, add items to a shopping cart, and then check out, Web servers have devised a mechanism to define an extended conversation that spans multiple requests: the session. Attributes and parameters can be stored on a per-user basis inside an HttpSession object and then accessed by any servlet or JSP in the application when that user accesses them. In this way, the login page can locate your information and add it to the HttpSession, so that the shopping cart can add items to it and the check-out page can access your credit card number to bill you. Because the protocol is stateless and the client always initiates the communication, the server must decide how long the maximum break in communication can be before it considers the conversation over and discards the user's data. This length of time is referred to as the session time-out, and it is configurable inside the application server (typically through the session-timeout element in the application's web.xml file, specified in minutes). Unless objects are explicitly removed from the session or the session is programmatically invalidated, objects will stay in the session for at least the duration of the time-out, measured from the last time the user accessed the Web server.
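
As an illustration, here is a hedged sketch of how a servlet might place data into the session; the attribute name is arbitrary. Everything stored this way is held in the server's memory until the session is invalidated or times out:

 import java.io.IOException;
 import javax.servlet.ServletException;
 import javax.servlet.http.HttpServlet;
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;
 import javax.servlet.http.HttpSession;

 public class LoginServlet extends HttpServlet {
     protected void doPost(HttpServletRequest request, HttpServletResponse response)
             throws ServletException, IOException {
         // getSession(true) creates the session if one does not already exist
         HttpSession session = request.getSession(true);

         // This object now consumes heap until the session is explicitly
         // invalidated or the session time-out expires
         session.setAttribute("userProfile", request.getParameter("username"));
     }
 }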

While the session manages objects on a per-user basis, the ServletContext object manages objects on an application basis. The ServletContext is sometimes referred to as application scope, because through a servlet's ServletContext or a JSP's application object, you are able to maintain and share objects with all other servlets and JSPs for all users in the same application. The ServletContext is a prime location to place application configuration information and to cache application-wide data, such as database JNDI (Java Naming and Directory Interface) lookup results.

If data is not stored in one of these four predefined scopes (page, request, session, or application), it may instead be held in one of the following:

  • Static variables
  • Long-lived class variables

Static variables are maintained in the JVM on a per-class basis and do not require a class instance to be alive in the heap for the static variable to exist. All class instances share the same static variable values, so changing a static variable in one class instance affects all other instances of the same class type. Therefore, if the application places an object into a static variable and never clears that variable, the object is not reclaimed by the JVM, even after every instance of the class has been garbage collected. These static objects are prime locations for leaking memory!

Finally, objects can be added to internal data structures or member variables inside long-lived classes such as servlets. When a servlet is created and loaded into memory, it has only one instance in memory, and multiple threads are configured to access that servlet instance. If it loads configuration information in its init() method, stores it in class variables, and reads that information while servicing requests, then all threads are assured of seeing the same information. One common problem that I have seen is the use of servlet class variables to store data such as page caches. These caches, in and of themselves, are good to have, but probably the worst place to manage them is from inside a servlet. If you are considering using a cache, then you are best served by integrating a third-party cache, like Tangosol's Coherence, into your application framework for that specific purpose.
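
As a hedged sketch of that anti-pattern (the names are hypothetical), consider a servlet that caches rendered pages in a member variable. Because the single servlet instance lives as long as the application and nothing ever evicts entries, the cache can only grow:

 import java.io.IOException;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import javax.servlet.ServletException;
 import javax.servlet.http.HttpServlet;
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;

 public class PageServlet extends HttpServlet {

     // One servlet instance serves all threads, so this map is effectively
     // global, and with no eviction policy it can only grow
     private final Map<String, String> pageCache = new ConcurrentHashMap<String, String>();

     protected void doGet(HttpServletRequest request, HttpServletResponse response)
             throws ServletException, IOException {
         String key = request.getRequestURI();
         String page = pageCache.get(key);
         if (page == null) {
             page = renderPage(key);    // the expensive work we want to avoid repeating
             pageCache.put(key, page);  // cached forever: nothing ever removes it
         }
         response.getWriter().write(page);
     }

     private String renderPage(String key) {
         return "<html>...</html>"; // placeholder for real rendering
     }
 }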

When page- or request-scoped variables maintain references to objects, they are automatically cleaned up before the request completes. Likewise, if session-scoped variables maintain references to objects, they are automatically cleaned up when your application explicitly invalidates the session or when the session time-out is exceeded.

Probably the greatest number of false positives in memory leak detection that I see are leaky sessions. A leaky session does not leak anything at all; it consumes memory, resembling a memory leak, but its memory is eventually reclaimed. If the application server is about to run out of memory, the best strategy to determine whether you have a memory leak or a poorly managed session is to stop all input to this application server instance, wait for the sessions to time out, and then see if memory is reclaimed. Obviously, this procedure is not possible in production, but it offers a surefire way to test in production-staging, with your load tester, if you suspect that you may have large sessions rather than a memory leak.

In general, if you have excessively large sessions, the true resolution is to refactor your application to reduce session memory overhead. The following two workaround solutions can minimize the impact of excessively large sessions:

  • Increase the heap size to support your sessions
  • Decrease the session time-out to invalidate sessions more quickly

A larger heap will spend more time in garbage collection, which is not ideal, but it is better than an OutOfMemoryError. Increase the size of your heap to support your sessions for the duration of your time-out value; this means that you need enough memory to hold all active user sessions as well as the sessions of users who abandon your Website within the session time-out interval. If the business rules permit, decreasing the session time-out will cause session data to expire earlier and lessen its impact on heap memory.
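
To put illustrative numbers on this (the figures are assumptions, not measurements): if the average session holds 500 KB, users start 1,000 new sessions per hour, and the session time-out is 30 minutes, then at steady state roughly 500 sessions are alive at any moment, occupying about 250 MB of heap. Halving the time-out to 15 minutes would cut that steady-state footprint to roughly 125 MB, which is why the time-out value and the heap size must be tuned together.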

In summary, here are the steps to perform, prioritized from most desirable to least desirable:

  • Refactor your application to store the minimum amount of information necessary in session-scoped variables
  • Encourage your users to log out of your application, and explicitly invalidate sessions when they do
  • Decrease your session time-out to force memory to be reclaimed sooner
  • Increase your heap size

However, unwanted object references maintained from application-scoped variables, static variables, and long-lived classes are, in fact, memory leaks that need to be analyzed in a memory profiler.

Permanent space anomalies

The purpose of the permanent space in the JVM process memory is typically misunderstood. The heap itself only contains class instances, but before the JVM can create an instance of a class on the heap, it must load the class bytecode (.class file) into the process memory. It can then use that class bytecode to create an instance of the object in the heap. The space in the process memory that the JVM uses to store the bytecode versions of classes is the permanent space. Figure 6 illustrates the relationship between the permanent space and the heap: it exists inside the JVM process memory, but is not part of the heap itself.

Figure 6. The relationship between the permanent space and the heap

In general, you want the permanent space to be large enough to hold all classes in your application, because reading classes from the file system is obviously more expensive than reading them from memory. To help you ensure that classes are not unloaded from the permanent space, the JVM has a tuning option:

 -Xnoclassgc

This option tells the JVM not to perform garbage collection on (and unload) the class files in the permanent space. This tuning option sounds attractive, but it raises a question: what does the JVM do if the permanent space is full when it needs to load a new class? In my observation, the JVM examines the permanent space, sees that it needs memory, and triggers a major garbage collection. The garbage collection cleans up the heap but cannot touch the permanent space, so its efforts are fruitless. The JVM then looks at the permanent space again, sees that it is full, and repeats the process, again and again.

When I first encountered this problem, the customer was complaining of very poor performance and an eventual OutOfMemoryError after a certain amount of time. After examining verbose garbage collection logs in conjunction with heap utilization and process memory utilization charts, I soon discovered that the heap was running well, but the process was running out of memory. This customer maintained literally thousands of JSPs, and each one was translated to Java code, compiled to bytecode, and loaded into the permanent space before an instance could be created in the heap. The environment was running out of permanent space, but because of the -Xnoclassgc tuning option, the JVM was unable to unload classes to make room for new ones. To correct this out-of-memory error, I configured the heap with a huge permanent space (512 MB) and removed the -Xnoclassgc option.

As Figure 7 illustrates, when the permanent space becomes full, it triggers a full garbage collection that cleans up Eden and the survivor spaces, but does not reclaim any memory from the permanent space.

Figure 7. Garbage collection behavior when the permanent space becomes full.
Note
When sizing the permanent space, consider using 128 MB, unless your applications have a large number of classes, in which case, you can consider using 256 MB. If you have to configure the permanent space to use anything more, then you are only masking the symptoms of a significant architectural issue. Configuring the permanent space to 512 MB is OK while you address your architectural issues, but just realize that it is only a temporary solution to buy you time while you address the real problems. Creating a 512 MB permanent space is analogous to getting painkillers from your doctor for a broken foot. True, the painkillers make you feel better, but eventually they will wear off, and your foot will still be broken. The real solution is to have the doctor set your foot and put a cast on it to let it heal. The painkillers can help while the doctor sets your foot, but they are used to mask the symptoms of the problem while the core problem is resolved.

As a general recommendation, when configuring the permanent space, make it large enough to hold all of your classes, but allow the JVM to unload classes when it needs to. Size it large enough so that hopefully it will not unload classes, but a minor slowdown to load classes from the file system is far more preferable than a JVM OutOfMemoryError crash!
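
On the Sun JVM, the permanent space is sized with the following options; the values shown reflect the note above and are starting points to validate against your own application's class footprint, not universal settings:

 -XX:PermSize=128m -XX:MaxPermSize=256m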

Thread pools

The main entry point into any Web or application server is a process that receives a request and places it into a request queue for an execution thread to process. After tuning memory, the tuning option with the biggest impact in an application server is the size of the execution thread pool. The size of the thread pool controls the number of simultaneous requests that can be processed at one time. If the pool is sized too small, then requests will wait in the queue for processing, and if the pool is sized too large, then the CPU will spend too much time switching contexts between the various threads.

Each server has a socket it listens on. A process that receives an incoming request places the request into an execution queue, and the request is subsequently removed from the queue by an execution thread and processed. Figure 8 illustrates the components that make up the request processing infrastructure inside a server.

Figure 8. The request processing infrastructure inside a server.
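
Application servers implement this infrastructure internally, but the moving parts can be modeled with java.util.concurrent; the following is a simplified sketch, not any particular server's implementation. A bounded queue holds pending requests while a fixed pool of worker threads drains it:

 import java.util.concurrent.ArrayBlockingQueue;
 import java.util.concurrent.ThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;

 public class RequestProcessingModel {
     public static void main(String[] args) {
         int poolSize = 50; // the "execution thread pool" size being tuned

         // Pending requests wait here, just as they wait in the server's
         // execution queue when all execution threads are busy
         ArrayBlockingQueue<Runnable> executionQueue = new ArrayBlockingQueue<Runnable>(1000);

         ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
                 poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, executionQueue);

         // The listener socket would hand each accepted request to the pool
         threadPool.execute(new Runnable() {
             public void run() {
                 // ... service the request ...
             }
         });

         threadPool.shutdown();
     }
 }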

Thread pools that are too small

When my clients complain of degraded performance at relatively low load that worsens measurably as the load increases, I first check the thread pools. Specifically, I am looking for the following information:

  • Thread pool utilization
  • Number of pending requests (queue depth)

When the thread pool is 100 percent in use and requests are pending, the response time degrades substantially, because requests that otherwise would be serviced quickly spend additional time inside a queue waiting for an execution thread. During this time, CPU utilization is usually low, because the application server is not doing enough work to keep the CPU busy. At this point, I increase the size of the thread pool in steps, monitoring the throughput of the application until it begins to decrease. You need consistent load or, even better, an accurate load tester to ensure your measurements' accuracy. Once you observe a dip in throughput, lower the thread pool size one step, back to the size at which throughput was maximized.

Figure 9 illustrates the behavior of a thread pool that is sized too small.

Figure 9. When all threads are in use, requests back up in the execution queue.

Every time I read performance tuning documents, one thing that bothers me is that they never recommend specific values for the size of your thread pools. Because these values depend so heavily on what your application is doing, the documents are right to generalize their recommendations, but readers would benefit greatly from best-practice starting values or ranges. For example, consider the following two applications:

  • One application retrieves a string from memory and forwards it to a JSP for presentation.
  • Another application queries 1,000 metric values from a database and computes the average, variance, and standard deviation of those metrics.

The first application responds to requests very rapidly, perhaps in less than 0.25 seconds, and does not make much use of the CPU. The second application may take 3 seconds to respond and is CPU intensive. Therefore, configuring a thread pool with 100 threads for the first application may be too low, because the application can support 200 simultaneous requests; but 100 threads may be too high for the second application, because it saturates the CPU at 50 threads.

However, most applications do not exhibit this extreme dynamic in functionality. Most do similar things, but do them for different domains. Therefore, my recommendation is for you to configure between 50 and 75 threads per CPU. For some applications, this number may be too low, and for others it may be too high, but as a best practice, I start with 50 to 75 threads per CPU, monitor the CPU performance along with application throughput, and make adjustments.

Thread pools that are too large

In addition to having thread pools that are sized too small, environments can be configured with too many threads. When load increases in these environments, the CPU is consistently high, and response time is poor, because the CPU spends too much time switching contexts between threads and little time allowing the threads to perform their work.

The main indication that a thread pool is too large is a consistently high CPU utilization rate. Many times, high CPU utilization is associated with garbage collection, but high CPU utilization during garbage collection differs in one main way from that of thread pool saturation: garbage collection causes CPU spikes, while saturated thread pools cause consistently high CPU utilization.

When this occurs, requests may be pending in the queue, but not always, because pending requests do not affect the CPU as processing requests do. Decreasing the thread pool size may cause requests to wait, but having requests waiting is better than processing them if processing the requests saturates the CPU utilization. A saturated CPU results in abysmal performance across the board, and performance is better if a request arrives, waits in a queue, and then is processed optimally. Consider the following analogy: many highways have metering lights that control the rate that traffic can enter a crowded highway. In my opinion, the lights are ineffective, but the theory is sound. You arrive, wait in line behind the light for your turn, and then enter the highway. If all of the traffic entered the highway at the same time, we would be in complete gridlock, with no one able to move, but by slowing down the rate that new cars are added to the highway, the traffic is able to move. In practice, most metropolitan areas have so much traffic that the metering lights do not help, and what they really need is a few more lanes (CPUs), but if the lights could actually slow down the rate enough, then the highway traffic would flow better.

To fix a saturated thread pool, reduce the thread pool size in steps until the CPU is running between 75 and 85 percent during normal user load. If the size of the queue becomes too unmanageable, then you need to do one of the following two things:

  • Run your application in a code profiler, and tune the application code
  • Add additional hardware

If your user load has exceeded the capacity of your environment, you need to either change what you are doing (refactor and tune code) to lessen the CPU impact or add CPUs.

JDBC connection pools

Most Java EE applications connect to a backend data source, and often these applications communicate with that backend data source through a JDBC (Java Database Connectivity) connection. Because database connections can be expensive to create, application servers opt to pool a specific number of connections and share them among processes running in the same application server instance. If a request needs a database connection when one is unavailable in the connection pool, and the connection pool is unable to create a new connection, then the request must wait for a connection to become available before it can complete its operation. Conversely, if the database connection pool is too large, then the application server wastes resources, and the application has the potential to force too much load on the database. As with all of our tuning efforts, the goal is to find the most appropriate place for a request to wait to minimize its impact on saturated resources; having a request waiting outside the database is best if the database is under duress.
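
For context, here is a hedged sketch of how application code typically borrows and returns pooled connections; the JNDI name is hypothetical. The essential point is that close() on a pooled connection returns it to the pool rather than closing the physical connection, so a connection that is not closed promptly in a finally block is lost to other requests:

 import java.sql.Connection;
 import java.sql.SQLException;
 import javax.naming.InitialContext;
 import javax.naming.NamingException;
 import javax.sql.DataSource;

 public class OrderDao {
     public void loadOrders() throws NamingException, SQLException {
         // The DataSource fronts the application server's connection pool;
         // "jdbc/AppDataSource" is a hypothetical JNDI name
         DataSource dataSource =
                 (DataSource) new InitialContext().lookup("jdbc/AppDataSource");

         Connection connection = dataSource.getConnection(); // waits here if the pool is exhausted
         try {
             // ... execute queries ...
         } finally {
             connection.close(); // returns the connection to the pool
         }
     }
 }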

An application server with an inadequately sized connection pool is characterized by the following:

  • Slow-running application
  • Low CPU utilization
  • High database connection pool utilization
  • Threads waiting for a database connection
  • High execution thread utilization
  • Pending requests in the request queue (potentially)
  • Database CPU utilization that is medium to low (because enough requests cannot be sent to it to make it work hard)

If you observe these characteristics, increase the size of the connection pool until database connection pool utilization is running at 70 to 80 percent utilization during average load and threads are rarely observed waiting for a connection. Be cognizant of the load on the database, however, because you do not want to force enough load to the database to saturate its resources.

JDBC prepared statements

Another important tuning aspect related to JDBC is the correct sizing of JDBC connection prepared statement caches. When your application executes a SQL statement against the database, it does so by passing through three phases:

  • Preparation
  • Execution
  • Retrieval

During the preparation phase, the database driver may ask the database to compute an execution plan for the query. During the execution phase, the database executes the query and returns a reference to a result set. During the retrieval phase, the application iterates over the result set and obtains the requested information.

The database driver optimizes this process: the first time you prepare a statement, it asks the database to prepare an execution plan and caches the result. On subsequent preparations, it loads the already prepared statement from the cache without having to go back to the database.
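
Here is a hedged sketch of the pattern the statement cache rewards; the table and column names are hypothetical. The SQL text is the cache key, so a parameterized statement with bind variables lets every execution after the first reuse the cached preparation, whereas concatenating values into the SQL produces a new statement, and a cache miss, every time:

 import java.sql.Connection;
 import java.sql.PreparedStatement;
 import java.sql.ResultSet;
 import java.sql.SQLException;

 public class MetricDao {
     public double loadValue(Connection connection, int metricId) throws SQLException {
         // One SQL string for all metric IDs: after the first call, the driver
         // can satisfy the preparation phase from its statement cache
         PreparedStatement statement = connection.prepareStatement(
                 "SELECT value FROM metrics WHERE id = ?");
         try {
             statement.setInt(1, metricId);
             ResultSet resultSet = statement.executeQuery();
             return resultSet.next() ? resultSet.getDouble(1) : 0.0;
         } finally {
             statement.close();
         }
     }
 }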

When the prepared statement cache is sized too small, the database driver is forced to re-prepare noncached statements, which incurs additional processing time as well as network time if the driver must go back to the database. The primary symptom of an inadequately sized prepared statement cache is a significant amount of JDBC processing time spent repeatedly preparing the same statements: you would expect preparation time to be high initially and then diminish on subsequent calls.

To complicate things ever so slightly, prepared statements are cached on a per-connection basis, meaning that a cached statement can be prepared for each connection. The impact of this complication is that if you have 100 statements that you want to cache, but you have 50 database connections in your connection pool, then you need enough memory to hold 5,000 prepared statements.

Through performance monitoring, determine how many unique SQL statements your application runs and how many of those statements are executed frequently, and size the prepared statement cache to hold those frequently executed statements, keeping in mind the per-connection multiplication described above.

Entity bean and stateful session bean caches

While stateless objects can be pooled, stateful objects like entity beans and stateful session beans need to be cached, because each bean instance is unique. When you need a stateful object, you need a specific instance of that object, and a generic instance will not suffice. As an analogy, consider that when you check out of a supermarket, which cashier you use doesn't matter; any cashier will do. In this example, cashiers can be pooled, because your only requirement is a cashier, not Steve the cashier. But when you leave the supermarket, you want to bring your children with you; other people's children will not suffice: you need your own. In this example, children need to be cached.

The benefit to using a cache is that you can serve requests from memory rather than going across the network to load an object from a database. Figure 10 illustrates this benefit. Because caches hold stateful information, they need to be configured at a finite size. If they were able to grow without bound, then your entire database would eventually be in memory! The size of the cache and the number of unique, frequently accessed objects dictate the performance of the cache.

Figure 10. The application requests an object that is present in the cache, so a reference to that object is returned without making a network trip to the database.

When a cache is sized too small, the cache management overhead can dramatically affect the performance of the cache. Specifically, when a request queries for an object that is not present in a full cache, then the following steps, illustrated in Figure 11, must be performed:

  1. The application requests an object
  2. The cache is examined to see if the object is already in the cache
  3. An object is chosen to remove from the cache (typically using a least-recently-used algorithm)
  4. The object is removed from the cache (passivated)
  5. The new object is loaded from the database into the cache (activated)
  6. A reference to the object is returned to the application

Figure 11. Because the requested object is not in the cache, an existing object must be selected for removal and evicted to make room for the new one.

If these steps must be performed for the majority of requested objects, then using a cache was not the best idea in the first place! When this process occurs frequently, the cache is said to thrash. Recall that removing an object from the cache is called passivation, and loading an object from persistent storage into the cache is called activation. The percentage of requests that are served by the cache is the hit ratio, and the percentage that are not served is the miss ratio.
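
The least-recently-used policy mentioned in Step 3 can be illustrated in a few lines with java.util.LinkedHashMap; this is a minimal sketch of the idea, not any application server's actual cache implementation:

 import java.util.LinkedHashMap;
 import java.util.Map;

 public class LruCache<K, V> extends LinkedHashMap<K, V> {
     private final int capacity;

     public LruCache(int capacity) {
         // accessOrder=true: iteration order runs from least- to
         // most-recently accessed entry
         super(capacity, 0.75f, true);
         this.capacity = capacity;
     }

     @Override
     protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
         // When the cache is full, evict the least-recently-used entry,
         // the analogue of passivating a bean from a full cache
         return size() > capacity;
     }
 }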

While the cache is being initialized, its hit ratio will be zero and its activation count will be high, so you need to observe the cache's performance after it is initialized. To work around the initialization phase, you can compare the passivation count to the total number of requests for objects in the cache, because passivations occur only after the cache has been initialized. In general, though, we are mostly concerned with the cache miss ratio. If the miss ratio is greater than 25 percent, then the cache is probably too small; if it is above 75 percent, then either the cache is far too small or the object probably should not be cached at all.

Once you determine that your cache is too small, try increasing its size and measure the improvement. If the miss ratio comes down to less than 20 percent, then your cache is well sized, but if increasing the size of the cache does not have much of an effect, then you need to work with the application technical owner to determine whether the object should be cached or whether the application needs to be refactored with respect to that object.

Stateless session bean and message-driven bean pools

Stateless session beans and message-driven beans implement business processes, and as such, do not maintain their states between invocations. When your application needs access to these beans' business functionality, it obtains a bean instance from a pool, calls one or more of its methods, and then returns the bean instance to the pool. If your application needs the same bean type later, it obtains another one from the pool, but receiving the same instance is not guaranteed.

Pools allow an application to share resources, but they present another potential wait point for your application. If no bean is available in the pool, then requests will wait for one to be returned to the pool before continuing. These pools are tuned pretty well by default in most application servers, but I have seen environments where customers have introduced problems by sizing them too small. Stateless bean pools should generally be sized the same as your execution thread pool, because a thread can use only one instance at a time; anything more would be wasteful. Furthermore, some application servers optimize pool sizes to match the thread count, but as a safety precaution, you should configure them this way yourself.

Transactions

One of the benefits of using enterprise Java is its inherent support for transactions. By adding an annotation to a method in a Java EE 5 EJB (Enterprise JavaBeans) component, you can control how that method participates in transactions. A transaction can complete in one of the following two ways:

  • It can be committed
  • It can be rolled back

When a transaction is committed, it has completed successfully, but when it rolls back, something went wrong. Rollbacks come in the following two flavors:

  • Application rollbacks
  • Nonapplication rollbacks

An application rollback is usually the result of a business rule. Consider a Web application that asks users to take a survey to enter a drawing for a prize. The application may ask the user to enter an age, and a business rule might state that users need to be 18 years of age or older to enter the drawing. If a 16-year-old submits information, the application may throw an exception that redirects the user to a Webpage informing that user that he or she is not eligible to enter the drawing. Because the application threw an exception, the transaction in which the application was running rolled back. This rollback is a normal programming practice and should be alarming only if the number of application rollbacks becomes a measurable percentage of the total number of transactions.
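
As a hedged sketch of that survey example (the bean and exception names are hypothetical), EJB 3.0 lets you mark a business-rule exception so that throwing it rolls back the container-managed transaction, which is exactly the kind of expected, application-level rollback described above:

 import javax.ejb.ApplicationException;
 import javax.ejb.Stateless;
 import javax.ejb.TransactionAttribute;
 import javax.ejb.TransactionAttributeType;

 // rollback=true: throwing this exception rolls back the current
 // transaction, producing an "application rollback"
 @ApplicationException(rollback = true)
 class TooYoungException extends Exception {
     TooYoungException(String message) { super(message); }
 }

 @Stateless
 public class DrawingEntryBean {
     @TransactionAttribute(TransactionAttributeType.REQUIRED)
     public void enterDrawing(String name, int age) throws TooYoungException {
         if (age < 18) {
             throw new TooYoungException("Entrants must be 18 or older");
         }
         // ... persist the survey entry ...
     }
 }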

A nonapplication rollback, on the other hand, is a very bad thing. The three types of nonapplication rollbacks follow:

  • System rollback
  • Time-out rollback
  • Resource rollback

A system rollback means that something went very wrong in the application server itself, and the chances of recovery are slim. A time-out rollback indicates that some process within the application server timed out while processing a request; unless your time-outs are set very low, this constitutes a serious problem. A resource rollback means that when the application server was managing its resources internally, it had a problem with one of them. For example, if you configure your application server to test database connections by executing a simple SQL statement, and the database becomes unavailable to the application server, then anything interacting with that resource will receive a resource rollback.

Nonapplication rollbacks are always serious issues that require immediate attention, but you do need to be cognizant of the frequency of application rollbacks. Many times people overreact to the wrong types of exceptions, so knowing what each type means to your application is important.

Summary

While each application and each environment is different, a common set of issues tends to plague most environments. This article focused not on application code issues, but on the following environmental issues that can manifest as poor performance:

  • Out-of-memory errors
  • Thread pool sizes
  • JDBC connection pool sizes
  • JDBC prepared statement cache sizes
  • Cache sizes
  • Pool sizes
  • Excessive transaction rollbacks

In order to diagnose performance problems effectively, you need to understand how problem symptoms map to the root cause of the underlying problem. If you can triage the problem to application code, then you need to forward the problem to the application support delegate, but if the problem is in the environment, then resolving it is within your control.

The root of a problem depends on many factors, but some indicators can increase your confidence in one diagnosis and let you rule out others entirely. I hope this article can serve as a starting troubleshooting guide for your Java EE environment, one that you can customize as issues arise.

Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical-editing countless software publications, he is also the Java Host on InformIT.com. As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 performance architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.
