J2EE application performance optimization

How to extract maximum performance from your J2EE Web applications

In today's world of larger-than-ever software applications, users still expect real-time data and software that can process data at blazing speeds. With the advent of broadband access, users have grown accustomed to receiving responses in seconds, regardless of how much data the server handles. It is becoming increasingly important for Web applications to keep response times as short as possible. The most obvious and simple way to improve a Web site's performance is to scale up the hardware, but scaling the hardware does not work in all cases and is rarely the most cost-effective approach. Other solutions can improve performance without additional hardware costs. This article offers suggestions that prove helpful when trying to maximize a J2EE Web application's performance.

If you decide to try this article's recommendations, keep in mind that they are suggestions only. Performance tuning is as much an art as it is a science. Changes that often result in improvement might make no difference in some cases or, in rare scenarios, can even degrade performance. For best results with performance tuning, take a holistic approach.

Overview

Figure 1 illustrates at a broad level how a J2EE application appears when deployed in a production environment. To get the best performance from a J2EE application, all the underlying layers must be tuned, including the application itself.

Figure 1. J2EE application architecture

For maximum performance, all the components in Figure 1—operating system, Web server, application server, and so on—need to be optimized. This article will give a glimpse into tuning a J2EE application server, Web server, relational database, and your J2EE application. For maximum payoff from this article, follow these guidelines:

  • Set a goal: Before you begin tuning your J2EE application's performance, set a goal. Often this goal addresses the maximum concurrent users the application will support for a given limit on response times. But the goal can also focus on other variables—for example, the response times should not increase more than 10 percent during the peak hour of user load.
  • Identify problem areas: It is important to identify the bottlenecks when you start making changes to improve performance. A little investigation into problems might reveal the specific component that causes poor performance. For example, if the CPU usage on an application server is high, you will want to focus on tuning the application server first.
  • Follow a methodical and focused path: Once the goal is set, make the changes expected to have the biggest impact on performance first. Your time is better spent tuning a method that takes 10 seconds but gets called 100 times than tuning a method that takes one minute but gets called only once. Ideally, test one change at a time before using it in a production environment: make a single change, stress-test it, and only if it has a positive impact make it permanent.

Identify bottlenecks

The goal of performance tuning is to identify bottlenecks and remove them. It is an iterative process: once one area of the application improves, another area becomes the bottleneck. You must repeat the cycle of identifying the bottleneck, resolving it, and identifying the next one until the desired goal has been reached. Two kinds of tools prove helpful in this process. First, you need stress tools that generate load for your application. Second, you need monitoring tools that collect data for various performance indicators.

Stress tools

Many different stress tools are available in the market today. Some of the popular ones are:

  • Mercury Interactive's LoadRunner
  • Segue's SilkPerformer
  • RadView Software's WebLoad

For a comprehensive list of stress tools see this article's Resources section. You must choose a tool that best fits your needs based on the tool's features and associated price tag. Some of the important features you should consider before choosing a tool are:

  • Support for a large number of concurrent Web users, each running through a predefined set of URL requests
  • Ability to record test scripts automatically from a browser
  • Support for cookies, HTTP request parameters, HTTP authentication mechanisms, and HTTP over SSL (Secure Sockets Layer)
  • Option to simulate requests from multiple client IP addresses
  • Flexible reporting module that lets you specify the log level and analyze the results long after the tests were run
  • Option to specify the test duration, schedule tests for a later time, and specify the total load

For the sample tests mentioned in this article, LoadRunner was used.

Performance monitors

Using a monitoring tool, you collect data for various system performance indicators on all the appropriate nodes in your network topology. Many stress tools also provide monitoring tools. Windows also has a built-in performance monitor that is sufficient for many purposes. The Windows performance monitor can be started from the Administrative Tools menu (accessed from the Control Panel) or by typing "perfmon" in the Run window (accessed from the Start menu). You can display the performance counter data in real time, but usually you'll want to log this data into a file so it can be viewed later. To log the data into a file, go to the Counter Logs selection on the left-hand side of the Performance window, right-click, and select New Log Settings as shown in Figure 2.

Figure 2. Windows performance monitor

You can also set the file path/name for where the data should be logged, as well as a schedule of when to collect this data. It is possible to log the data and view it in real time, but the performance counters must be added twice—first, in the System Monitor link on the left-hand side and second, in Counter Logs, as shown above.

Many performance counters are available in Windows OS. The following table lists some of the important counters that you should always monitor:

| System resource | Performance counter | Description |
| --- | --- | --- |
| CPU | System: Processor queue length | Indicates the number of threads in the server's processor queue waiting to be executed by the CPU. As a rule of thumb, if the processor queue remains at a value higher than 2 for an extended period of time, the CPU is most likely a bottleneck. |
| CPU | Processor: Percent of processor time | The total overall CPU utilization for the server. A value that exceeds 70-80 percent for long periods indicates a CPU bottleneck. |
| CPU | System: Percent of total privileged time | The percentage of total CPU execution time spent running in kernel/privileged mode. All I/O operations run in kernel mode, so a high value (about 20-30 percent) usually indicates problems with the network, disk, or another I/O interface. |
| RAM | Memory: Pages per second | The number of pages read from or written to disk to resolve hard page faults. A value that continuously exceeds 25-30 indicates a memory bottleneck. |
| RAM | Memory: Available bytes | The amount of physical memory, in bytes, available to processes running on the computer. A low value (less than 10 MB) usually means the machine requires more RAM. |
| Hard disk | Physical disk: Average disk queue length | The average number of requests queued for the selected disk. A sustained value above 2 indicates an I/O bottleneck. |
| Hard disk | Physical disk: Percent of disk time | The percentage of elapsed time the selected disk drive is busy servicing requests. A continuous value above 50 percent indicates a bottleneck with the hard disks. |
| Network | Network interface: Total bytes per second | The rate at which bytes are sent and received on the selected network interface. Compared against the interface's bandwidth, this counter can tell you whether the network is the problem. |
| Network | Network interface: Output queue length | The length of the output packet queue, in packets. A sustained value higher than 2 indicates a bottleneck in the network interface. |

You should add the above counters (and any others as appropriate) in your counter log and collect this data while you stress-test your application using the stress tool. A file generated by the counter log can be opened later by clicking on the View Log File Data button on the right-hand side toolbar. Looking at these counters should give you some hint as to where the problem exists—application server, Web server, or database server.
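If you would rather script the collection than drive the GUI, Windows XP and Windows Server 2003 also ship with the typeperf command-line tool, which can log the same counters to a CSV file. The following is a rough sketch only; counter paths can vary slightly between Windows versions and locales, and the interval and sample count are placeholders:

typeperf "\System\Processor Queue Length" ^
         "\Processor(_Total)\% Processor Time" ^
         "\Memory\Pages/sec" ^
         "\Memory\Available Bytes" ^
         "\PhysicalDisk(_Total)\Avg. Disk Queue Length" ^
         "\Network Interface(*)\Bytes Total/sec" ^
         -si 15 -sc 240 -f CSV -o perfcounters.csv

The resulting CSV file can then be correlated with the response-time reports produced by your stress tool.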

After identifying the bottleneck this way, you should try to resolve it. You can use two different strategies—either tune the hosting environment in which your application runs or tune the application itself.

Environment tuning

In this section, we look at the possible tuning options in a typical J2EE Web application hosting environment. As already discussed above, a J2EE application environment usually consists of an application server, Web server, and a backend database.

Web server/application server tuning

Most application servers and Web servers provide similar configuration options, though they have different mechanisms for setting them. In this article, I cover Tomcat 4.1.x and Apache 1.3.27, which talk to each other using the JK connector. The configuration options presented here exist in most other application servers and Web servers, but you will need to locate the correct place to set them.

Apache

Probably the most important setting for your Windows Apache HTTP Server is the number of threads. This value should be high enough to handle the maximum number of concurrent users, but not so high that it adds overhead through excessive context switching. The optimum value can be determined by monitoring the number of threads in use during peak hours. To monitor the threads in use, make sure the following configuration directives are present in the Apache configuration file (httpd.conf):

LoadModule status_module modules/mod_status.so
<Location /server-status>
    SetHandler server-status
    Allow from all
</Location>

Now, from your browser, make an HTTP request to your Apache server with this URL: http://<apache_machine>/server-status. It displays how many requests are being processed and their status (reading request, writing response, etc.). Monitor this page during peak load on the server to ensure the server is not running out of idle threads. After you come up with the optimum number of threads for your application, change the ThreadsPerChild directive in the configuration file to an appropriate value.
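Once you have settled on a value, the change itself is a single directive in httpd.conf. The number below is only a placeholder; derive yours from the server-status observations:

# Apache 1.3 on Windows runs a single child process;
# ThreadsPerChild caps the number of requests it can serve concurrently
ThreadsPerChild 300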

A few other items that improve performance in the Apache HTTP Server are listed below; a sample httpd.conf fragment illustrating them follows the list:

  • DNS reverse lookups are inefficient. Switch off DNS lookups by setting HostnameLookups in the configuration file to off.
  • Do not load unnecessary modules. Apache allows dynamic modules that extend basic functionality of the Apache HTTP Server. Comment out all the LoadModule directives you don't need.
  • Try to minimize logging as much as possible. Look for the LogLevel, CustomLog, and LogFormat directives in the configuration file to change the logging level.
  • Also minimize the JK connector's logging by setting the JkLogLevel directive to emerg.
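Taken together, those settings might look like the following httpd.conf fragment. The module names commented out here are only examples of modules an application might not need; keep whatever your installation actually uses:

# Skip reverse DNS lookups on every request
HostnameLookups Off

# Comment out dynamic modules the application does not use (examples only)
# LoadModule autoindex_module modules/mod_autoindex.so
# LoadModule info_module modules/mod_info.so

# Keep Apache logging to a minimum
LogLevel warn

# Log JK connector messages only in emergencies
JkLogLevel emerg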

Tomcat

The two most important configuration options for Tomcat are its heap size and number of threads. Unfortunately, there is no good way to determine the heap size needed because, in most cases, the JVM doesn't start cleaning up memory until it reaches the maximum memory allocated. One good rule of thumb is to allocate half of the total physical RAM as the Tomcat heap size. If you still run out of memory, look into your application design to reduce memory usage, identify any memory leaks, or try the JVM's various garbage collector options. To change the heap size, add -Xms<size> and -Xmx<size> as JVM parameters to the command line that starts Tomcat. <size> is the JVM heap size, usually specified in megabytes by appending the suffix m, for example, 512m. -Xms sets the initial heap size and -Xmx the maximum heap size; for server applications, both should be set to the same value.
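One common way to pass these flags to Tomcat 4.1 is through the CATALINA_OPTS environment variable, which the startup scripts (catalina.bat/catalina.sh) pick up. A minimal sketch on Windows, with 512m as a purely illustrative value:

rem Set initial and maximum heap to the same (illustrative) size
set CATALINA_OPTS=-Xms512m -Xmx512m
<Tomcat>\bin\startup.bat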

The number of threads in Tomcat can be modified by changing the values of minProcessors and maxProcessors attributes for the appropriate connector in <Tomcat>/conf/server.xml. If you are using the JK connector, change the values of its attributes. Again, there is no simple way to decide the optimum value for these attributes. The value should be set such that enough threads are available to handle your Web application's peak load. You can monitor a process's current thread count in the Windows Task Manager, which can assist in determining the correct value of these attributes.
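As an illustration only, a Coyote HTTP/1.1 connector entry in <Tomcat>/conf/server.xml for Tomcat 4.1 might look like the following; the thread-pool numbers are placeholders, not recommendations, and should come from your own load tests:

<!-- Example connector with an explicitly sized thread pool;
     tune minProcessors/maxProcessors from load-test results -->
<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
           port="8080"
           minProcessors="25"
           maxProcessors="150"
           acceptCount="100"
           enableLookups="false"/>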

A few other options you should be aware of:
