Wizard API updated!
Tim Boudreau has released a new version of the Swing Wizard library (version 0.997) that fixes the WizardException bug reported in JavaWorld's recent Open Source Java Project profile. The article's examples have been reworked to test out the new, improved WizardException. Thanks, Tim, for this helpful fix!
Open Source Java Projects: The Wizard API

Newsletter sign-up

Sign up for our technology specific newsletters.

Enterprise Java
View all newsletters

Email Address:

Optimize a query on a Map

Comparing techniques for performance tuning a query on a Map class

In "Optimizing a query on a collection", I considered how to optimize a query on an indexable collection. However, optimizing queries on Map classes turns out to be more complicated. In this article, I'll provide an example of how you can speed up bottlenecks consisting of Map queries.

The query

First I'll start with the problem. I'll use strings as keys for my collection. For their corresponding values, I'll simply use integer objects to indicate some value object. For the query, I'll use a simple test that checks whether any particular string key includes one of a specified substrings set, and the query will simply return the summation of the integer values for those string keys, which include the substrings. For example, the Map might be:

  • key -> value
  • "code" -> 1
  • "rode" -> 2
  • "load" -> 3
  • "toad" -> 4
  • "road" -> 5


The query in that case might be "sum of the values for those string keys in the map that contain the substrings od or lo" (the answer would be 1+2+3=6 for this example list).

For my actual keys, I'll generate multicharacter strings by using lowercase letters (a to z). For example, a collection of all four-character strings would generate a collection of 26 x 26 x 26 x 26 = 456976 four-character strings. The values will simply be an integer counter that increments by one as each string is added to the Map. I'll query that Map for the summation of integer values for those strings that contain any of the substrings ie, xy, or pq. I've elected to use a Hashtable object to hold the collection for the start of the tests.

I've chosen to use an easily generated collection for the data and a straightforward query to avoid any application-specific distractions. I want to focus on the tuning here. The query represents many query types I've seen in applications, though a more representative test would return the value collection for the keys that satisfy the query. I've opted for the summation to avoid generating too many collections in my tests.

The simple straightforward version of the query is:

    int count = 0;
    Enumeration enumeration = map.keys();
    String s;
    while(enumeration.hasMoreElements())
    {
      s = (String) enumeration.nextElement();
      if(     ( s.indexOf("ie") != -1 )
           || ( s.indexOf("xy") != -1 )
           || ( s.indexOf("pq") != -1 ) )
      {
        count += ((Integer) this.get(s)).intValue();
      }
    }
    return count;


For the tests, I'll use the Sun virtual machines (VMs) from the Java SDK 1.2 and 1.3. In addition, I'll test with the HotSpot VMs delivered with those two SDKs -- HotSpot 1.0 and 2.0. I'll also test one non-JIT VM, which in this case will be the 1.2 VM with the JIT turned off. You can easily turn off the JIT by setting the java.compiler property to NONE:

  java "-Djava.compiler=NONE" ...


Avoid synchronization

As I said earlier, I started by using a Hashtable object to hold the map. In most applications, bottleneck queries tend to be read-only or single-threaded. In either case, you can normally use a nonsynchronized object to hold the collection. To do so here requires a simple change to using a HashMap object instead of the Hashtable object I initially used. In addition, I need to change the code to use an iterator, as HashMap does not support an enumerator. In fact, I really shouldn't have used the enumerator in the original query since the Map interface does not support its use.

The query now reads as the following:

    int count = 0;
    Iterator iterator = map.keySet().iterator();
    String s;
    while(iterator.hasNext())
    {
      s = (String) iterator.next();
      if(     ( s.indexOf("ie") != -1 )
           || ( s.indexOf("xy") != -1 )
           || ( s.indexOf("pq") != -1 ) )
      {
        count += ((Integer) this.get(s)).intValue();
      }
    }
    return count;


However, the test (test2 in Table 1 below) does not change the query time very much. A couple of the VMs even register slower times. That looks slightly odd to me and, looking back at what I've done, I noticed that I've been a little naughty. I made two changes to the code and tested those changes together without determining whether either change individually speeds up the query time. I changed the class and changed the enumerator to an iterator. I should really check each change separately.

My first concern is that the iterator is accessed via a Set of Map elements. I would have a real performance problem if the Set were a whole new collection with all the elements copied from Map. But the Set implementations of the JDK Map classes are actually alternate views of the underlying collection, and so they should not cause any adverse performance impact for this query.

Table 1: Percentage of Speed Increase in Query Time for Each Optimization Test, Relative to the 1.2 JIT VM Test
Test 1.2 JIT VM HotSpot 1.0 2nd run* 1.3 JIT VM HotSpot 2.0 2nd run* 1.2 non-JIT Test details
test1 100% 129% 134% 125% 985% Hashtable with enumerator
test2 98.1% 129% 135% 124% 1104% HashMap with iterator
test3 108% 131% 138% 125% 1187% Hashtable with iterator
test4 91.5% 125% 124% 118% 1001% replaced .hasNext() call in test2
test5 120% 131% 134% 126% 1009% use Map.Entry Set view
test6 83.0% 107% 102% 99.1% 727% query in Map class
test7 78.1% 103% 105% 102% 720% specialized Map class of String keys and int values
test8 77.7% 73.3% 104% 69.7% 697% specialized Map class of char[] keys and int values
test9 19.4% 18.7% 22% 18.5% 111% specialized Map class of four-char keys and int values
test10 8.7% 12.0% 11.4% 10% 82.3% perfect minimum hashing on four-char keys and int values


NOTE: HotSpot VMs run methods in interpreted mode until the HotSpot VM profiler decides that the method is better run natively compiled. The method is then natively compiled with extensive optimizations applied to it. For the test runs here, some methods were optimized by the second test run, but not until the third test run were the HotSpot optimizations fully applied to the test by the VM. Consequently, I show the results of the first and third test runs in the HotSpot VMs.

The process of testing the two changes individually is straightforward: simply use a Hashtable with the iterator. The test result (test3 in Table 1) is very illuminating. Test3 clearly shows that enumerators are faster than iterators for the Maps I've reviewed here, and that is probably true for other JDK classes. The slower iterators combined with the faster nonsynchronized HashMap class balanced each other out to give test2's mixed results. Unfortunately, we don't have the option of using the faster nonsynchronized HashMap together with the faster enumerators.

1 | 2 | 3 | 4 | 5 |  Next >
Resources