J2EE application performance optimization

How to extract maximum performance from your J2EE Web applications

Page 2 of 3
  • If you are not using the latest JRE (Java Runtime Environment), consider upgrading to the latest one. I have seen up to a 30 percent performance improvement after upgrading from JRE 1.3.1 to JRE 1.4.1.
  • Add the -server option to the JVM options for Tomcat, which should result in better performance for server applications. Note that this option, in some cases, causes the JVM to crash for no apparent reason. If you face this problem, remove the option.
  • Change the default Jasper (JavaServer Pages, or JSP, compiler) settings in <Tomcat>/conf/web.xml by setting the JSP servlet's init parameters development and reloading to false and logVerbosityLevel to FATAL.
  • Minimize logging in Tomcat by setting debug="0" everywhere in <Tomcat>/conf/server.xml.
  • Remove any unnecessary resources from the Tomcat configuration file. Some examples include the Tomcat examples Web application and extra <Connector> and <Listener> elements.
  • Set the autoDeploy attribute of the <Host> tag to false (unless you need any of the default Tomcat applications like Tomcat Manager).
  • Make sure you have set reloadable="false" for all your Web application contexts in <Tomcat>/conf/server.xml.
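
After upgrading the JRE or adding the -server option, it helps to confirm what the running JVM actually reports. The following is a minimal sketch that reads standard system properties; the class name JvmInfo is hypothetical, and the assumption that the VM name contains "Server VM" when -server is in effect holds for HotSpot-based JVMs but may not hold for others:

```java
// Minimal sketch: report which JRE and VM mode the current process is using.
// JvmInfo is an illustrative name, not part of Tomcat or the JDK.
final class JvmInfo {

    /** Returns a one-line summary built from standard system properties. */
    static String describe() {
        String version = System.getProperty("java.version"); // JRE version
        String vmName = System.getProperty("java.vm.name");  // on HotSpot, usually mentions Client or Server VM
        String vmInfo = System.getProperty("java.vm.info");  // for example "mixed mode"
        return "java.version=" + version
                + ", java.vm.name=" + vmName
                + ", java.vm.info=" + vmInfo;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

To check the JVM that Tomcat itself runs on, the same describe() call would have to be made from code running inside Tomcat, such as a test servlet or JSP.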

Database tuning

In the case of Microsoft SQL Server, more often than not, you do not need to modify any configuration options, since it automatically tunes your database to a great degree. You should change these settings only if your stress tests identify the database as a bottleneck. Some of the configuration options that you can try are:

  • Run your SQL Server on a dedicated server instead of a shared machine.
  • Keep your application database and your temporary database on different hard disks.
  • Consider taking local backups and moving them to a different machine. The backups should complete much faster.
  • Normalize your database to the third normal form. This is usually the best compromise, as the fourth and fifth forms of normalization can result in performance degradation.
  • If you have more than 4 GB of physical RAM available, set the awe enabled configuration option to 1, which will allow SQL Server to use more than 4 GB of memory up to a maximum of 64 GB (depending on the SQL Server edition).
  • In case you have many concurrent queries executing and enough memory is available, you can increase the value of the min memory per query option (default is 1,024 KB).
  • Change the value of the max worker threads option, which sets the maximum number of worker threads available to service user requests. Once this limit is reached, any new user requests will wait until one of the existing worker threads finishes its current task. The default value for this option is 255.
  • Set the priority boost option to 1. This will allow SQL Server to run with a higher priority as compared to the other applications running on the same server. If you are running on a dedicated server, it is usually safe to set this option.

If none of the configuration options resolves the bottleneck, you should consider scaling up the database server. Horizontal scaling is generally not practical in SQL Server, as it does not support true load-balanced clustering (its clustering provides failover only), so usually the only option is vertical scaling.

Application tuning

Once you have tuned your hosting environment, it is time to get down and dirty inside your application source code and database schema. In this section, we look at one of the many possible ways to tune your Java code and your SQL queries.

Java code optimization

The most popular way to optimize Java code is by using a profiler. Sun's JVM has built-in support for profiling (Java Virtual Machine Profiler Interface, or JVMPI) that can be switched on at execution time by passing the right JVM parameters. Many commercial profilers are available; some rely on JVMPI, others provide their own custom hooks into Java applications (using bytecode instrumentation or some other method). Be aware, though, that all these profilers add significant overhead, so your application cannot be profiled at a realistic load level. Use these profilers with a single user or a limited number of users. Even so, it is still a good idea to run your application through the profiler and analyze the results for any obvious bottlenecks.

To identify your application's slowest areas in a full-fledged deployed environment, you can add your own timing logs to the application, which can be switched off easily in the production environment. A logging API, such as log4j or J2SE 1.4's Java Logging API, is handy for this purpose. The code below shows a sample utility class that can help you add timing logs to your application:

import java.util.HashMap;
import java.util.Map;
// import org.apache.log4j.Logger;

public class LogTimeStamp {
    // Per-thread timing state, keyed by thread name (names are assumed to be unique)
    private static final Map<String, ThreadExecInfo> ht = new HashMap<String, ThreadExecInfo>();
    // Preferably we should use log4j instead of System.out
    // private static Logger logger = Logger.getLogger("LogTimeStamp");

    private static class ThreadExecInfo {
        long timestamp; // time of the previous checkpoint, in milliseconds
        int stepno;     // number of the next checkpoint; 0 means "just reset"
    }

    public static void LogMsg(String Msg) {
        LogMsg(Msg, false);
    }

    /*
     * Passing true in the second parameter of this function resets the counter for
     * the current thread. Otherwise it keeps track of the last invocation and prints
     * the current counter value and the time difference between the two invocations.
     */
    public static void LogMsg(String Msg, boolean flag) {
        ThreadExecInfo thr;
        long timestamp = System.currentTimeMillis();
        String threadName = Thread.currentThread().getName();
        // Only the map access is synchronized; each entry is used by its own thread
        synchronized (ht) {
            thr = ht.get(threadName);
            if (thr == null) {
                thr = new ThreadExecInfo();
                ht.put(threadName, thr);
            }
        }
        if (flag) {
            thr.stepno = 0;
        }
        // Nothing is printed for the starting point (stepno == 0)
        if (thr.stepno != 0) {
            // logger.debug(threadName + ":" + thr.stepno + ":" +
            //         Msg + ":" + (timestamp - thr.timestamp));
            System.out.println(threadName + ":" + thr.stepno + ":" +
                    Msg + ":" + (timestamp - thr.timestamp));
        }
        thr.stepno = thr.stepno + 1;
        thr.timestamp = timestamp;
    }
}

After adding the above class to your application, invoke LogTimeStamp.LogMsg() at various checkpoints in your code. Each call prints the time (in milliseconds) the current thread took to get from the previous checkpoint to this one. First, call LogTimeStamp.LogMsg("Your Msg", true) at the point in the code where a user request starts; this call resets the counter and prints nothing itself. Then insert further invocations, as in the following code:

    public void startingMethod() {
        ...
        LogTimeStamp.LogMsg("This is a test message", true); //This is starting point
        ...
        LogTimeStamp.LogMsg("One more test message"); //This will become check point 1
        method1();
        ...
    }
    public void method1() {
        ...
        LogTimeStamp.LogMsg("Yet another test message"); //This will become check point 2
        method2();
        ...
        LogTimeStamp.LogMsg("Oh no another test message"); //This will become check point 4
    }
    public void method2() {
        ...
        LogTimeStamp.LogMsg("Wow! another test message"); //This will become check point 3
        ...
    }

The Perl script analyze.pl, which can be downloaded from Resources, can take the output of the above log messages as input and print the results in the format below. From these results, you now know which part of the code requires the most time and can concentrate on optimizing that part:

Transactions                           Avg. Time  Max Time  Min Time
--------------------------------------------------------------------
[This is a ...] to [One more t...]         14410     20937      7500
[One more t...] to [Yet anothe...]            16        62         0
[Yet anothe...] to [Wow! anoth...]         39860     50844     27703
[Wow! anoth...] to [Oh no anot...]           711      1844        94
[Oh no anot...] to [OK thats e...]         68089    228452     19718
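
For readers who would rather stay in Java, the same kind of summary can be approximated with a small class. This is only a sketch and not a port of analyze.pl: the class name TimingLogSummary is hypothetical, and it assumes each log line has the thread:step:message:elapsed form printed by LogTimeStamp and that thread names contain no colons. Because the starting call prints nothing, the first transition of each request is labeled "(start)" here:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative aggregator for LogTimeStamp output (hypothetical, not analyze.pl).
final class TimingLogSummary {

    /** Running statistics for one checkpoint-to-checkpoint transition. */
    static final class Stats {
        long count;
        long total;
        long max = Long.MIN_VALUE;
        long min = Long.MAX_VALUE;

        void add(long ms) {
            count++;
            total += ms;
            max = Math.max(max, ms);
            min = Math.min(min, ms);
        }

        long avg() {
            return total / count; // integer average in milliseconds
        }
    }

    /** Groups elapsed times by "[previous message] to [current message]". */
    static Map<String, Stats> summarize(List<String> lines) {
        Map<String, Stats> result = new LinkedHashMap<>();
        Map<String, String> lastMsgByThread = new HashMap<>();
        for (String line : lines) {
            int a = line.indexOf(':');         // end of thread name
            int b = line.indexOf(':', a + 1);  // end of step number
            int c = line.lastIndexOf(':');     // start of elapsed time
            if (a < 0 || b < 0 || c <= b) {
                continue; // not a timing line
            }
            int step;
            long elapsed;
            try {
                step = Integer.parseInt(line.substring(a + 1, b));
                elapsed = Long.parseLong(line.substring(c + 1).trim());
            } catch (NumberFormatException e) {
                continue; // skip unrelated output that happens to contain colons
            }
            String thread = line.substring(0, a);
            String msg = line.substring(b + 1, c);
            String from = (step == 1) ? "(start)" : lastMsgByThread.get(thread);
            if (from == null) {
                from = "(unknown)";
            }
            lastMsgByThread.put(thread, msg);
            String key = "[" + from + "] to [" + msg + "]";
            result.computeIfAbsent(key, k -> new Stats()).add(elapsed);
        }
        return result;
    }
}
```

Feeding it the captured log lines and printing avg(), max, and min for each key gives the same three columns as the table above.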

The above approach represents just one of the ways to tune your Java code. You can use whatever methodology works for you. Some excellent resources are available for Java performance tuning. Check Resources for some links. Some general suggestions you should be aware of while developing a J2EE application are:

  • Avoid using synchronized blocks in your code as much as possible. That does not mean you should abdicate handling synchronization for your code's multithreaded parts, but you should try to limit its usage. Synchronized blocks can severely impair your application's scalability.
  • Proper logging proves necessary in serious software development. You should try to use a logging mechanism (like log4j) that lets you switch off logging in the production environment to reduce logging overhead.
  • Instead of creating and destroying resources every time you need them, use a resource pool for every resource that is costly to create. One obvious choice for this is your JDBC (Java Database Connectivity) Connection objects. Threads are also usually good candidates for pooling. Many free APIs are available for pooling various resources.
  • Try to minimize the objects you store in HttpSession. Extra objects in HttpSession not only lead to more memory usage, they also add additional overhead for serialization/deserialization in case of persistent sessions.
  • Wherever possible, use RequestDispatcher.forward() instead of HttpServletResponse.sendRedirect(), as the latter involves an extra round trip to the browser.
  • Minimize the use of SingleThreadModel in servlets so that the servlet container does not have to create many instances of your servlet.
  • Java stream objects perform better than reader/writer objects because they do not have to convert characters to bytes. Where your output is already available as bytes, use OutputStream in place of PrintWriter.
  • Reduce the default session timeout either by changing your servlet container configuration or by calling HttpSession.setMaxInactiveInterval() in your code.
  • Just as we switched off the DNS lookup in the Web server configuration, try not to use ServletRequest.getRemoteHost(), which involves a reverse DNS lookup.
  • Always add directive <%@ page session="false"%> to JSP pages where you do not need a session.
  • Excessive use of custom tags also may result in poor performance. Keep performance in mind while designing your custom tag library.
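
To make the resource pooling suggestion concrete, here is a minimal sketch of a fixed-size pool built on the standard java.util.concurrent classes. The class SimplePool and its method names are hypothetical, and the sketch leaves out things a real pool needs, such as validating or replacing broken resources. For JDBC connections in particular, the pooling offered by your container or an established pooling library is usually the better choice:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Illustrative fixed-size pool: resources are created once and then reused.
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    /** Creates all resources up front, paying the creation cost only once. */
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Waits up to timeoutMillis for a free resource; returns null if none becomes free. */
    T borrow(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    /** Hands a resource back so that another thread can reuse it. */
    void giveBack(T resource) {
        idle.offer(resource);
    }

    /** Number of resources currently free. */
    int available() {
        return idle.size();
    }
}
```

Callers should return the resource in a finally block so that it goes back to the pool even when the work in between throws an exception.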

SQL query optimization

The optimization of SQL queries is a vast subject in itself, and many books cover only this topic. SQL query running times can vary by many orders of magnitude even if they return the same results in all cases. Here I just show how to identify the slow queries and offer a few suggestions as to how to fix some of the most common mistakes.

First of all, to identify slow queries, you can use SQL Profiler, a tool from Microsoft that comes standard with SQL Server 2000. This tool should be run on a machine other than your SQL Server database server and the results should be stored in a different database as well. Storing results in a database allows all kinds of reports to be generated using standard SQL queries. Profiling any application inherently adds a lot of overhead, so try to use appropriate filters that can reduce the total amount of data collected.

To start the profiling, from SQL Profiler's File menu, select New, then Trace, and give the connection information and the appropriate credentials to connect to the database you want to profile. A Trace Properties window will open, where you should enter a meaningful name you will recognize later. Select the Save To Table option and give the connection information and credentials for the database server (this should differ from the server you are profiling) where you want to store the data collected by the profiler. Next, you will be asked to provide the database and the table name where the results will be stored. Usually, you will also want to add a filter, so go to the Filters tab and add the appropriate filters (for example, "duration greater than or equal to 500 milliseconds" or "CPU greater than or equal to 20," as shown in Figure 3). Now click on the Run button and the profiling will start.

Figure 3. SQL Profiler. Click on thumbnail to view full-sized image.

Let's say that, based on SQL Profiler's results, you have identified the following query as the most time-consuming:

SELECT [TABLE1].[T1COL1], [TABLE1].[T1COL2], 
       [TABLE1].[T1COL3], [TABLE1].[T1COL4]
FROM ((([TABLE1] LEFT JOIN [TABLE4] ON [TABLE1].[T1COL4] = [TABLE4].[T4COL4]) 
        LEFT JOIN [TABLE3] ON [TABLE1].[T1COL3] = [TABLE3].[T3COL3]) 
        LEFT JOIN [TABLE2] ON [TABLE1].[T1COL2] = [TABLE2].[T2COL2]) 
WHERE [TABLE1].[T1COL5] = 'VALUE1'

Now your task is to optimize this query to improve performance. You look at your database and find that TABLE1 has 700,000 records, TABLE2 has 16, TABLE3 has 100, and TABLE4 happens to have more than 4 million records. You should first understand this query's cost, and Query Analyzer comes in handy for this task. Select Show Execution Plan in the Query menu and execute this query in Query Analyzer. Figure 4 shows the resulting execution plan.

Figure 4. Execution plan before indexes. Click on thumbnail to view full-sized image.

From this plan, you see that SQL Server is clearly doing full-table scans for all four tables, and together, they make up around 80 percent of the total query cost. Luckily, another feature in Query Analyzer can analyze a query and recommend appropriate indexes. Run Index Tuning Wizard, also from the Query menu. This wizard analyzes the query and gives recommendations for indexes. As shown in Figure 5, it recommends creating two indexes ([TABLE4].[T4COL4] and [TABLE1].[T1COL5]) and also indicates performance will improve 99 percent!

Figure 5. Index Tuning Wizard. Click on thumbnail to view full-sized image.

After creating the indexes, the execution time declines from 4,125 milliseconds to 110 milliseconds, and the new execution plan shown in Figure 6 shows only two table scans (not a problem as TABLE2 and TABLE3 both have limited records).

Figure 6. Execution plan after indexes. Click on thumbnail to view full-sized image.
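
The two recommended indexes correspond to ordinary CREATE INDEX statements. As a rough sketch only, they could also be applied from Java over JDBC. The class name RecommendedIndexes and the index names IX_TABLE4_T4COL4 and IX_TABLE1_T1COL5 are made up for illustration, and the wizard itself may choose different names or index options:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Illustrative only: index names are hypothetical, columns come from the example query.
final class RecommendedIndexes {

    /** One statement per column the wizard recommended indexing. */
    static final List<String> DDL = Arrays.asList(
            "CREATE INDEX IX_TABLE4_T4COL4 ON [TABLE4] ([T4COL4])",
            "CREATE INDEX IX_TABLE1_T1COL5 ON [TABLE1] ([T1COL5])");

    /** Runs each statement on an already open connection supplied by the caller. */
    static void create(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : DDL) {
                st.executeUpdate(sql);
            }
        }
    }
}
```

Whichever way the indexes are created, rerun the query with Show Execution Plan enabled to confirm that the scans on TABLE1 and TABLE4 are gone.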

This was just an example of what proper tuning can achieve in terms of performance. In general, SQL Server's auto-tuning features handle many tasks for you. For example, reordering the conditions in a WHERE clause will not yield any benefit, as SQL Server handles that internally. Still, here are a few things you should keep in mind while writing SQL queries:
