Hashtables

When you create your own key object in a Hashtable, be careful

June 21, 2002

Q: When I use an object as a key in a Hashtable, what in the Object class must I override and why?

A: When you create your own key object for use in a Hashtable, you must override the Object.equals() and Object.hashCode() methods since Hashtable uses a combination of the key's hashCode() and equals() methods to store and retrieve its entries quickly. It's also a general rule that when you override equals(), you always override hashCode().
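To make the rule concrete, here is a minimal sketch of what happens with and without the overrides. The classes NaiveKey and ProperKey are hypothetical names invented for this illustration; they are not part of any library.

```java
import java.util.Hashtable;

class KeyOverrideDemo {
    // Hypothetical key that inherits equals() and hashCode() from Object.
    static final class NaiveKey {
        final int id;
        NaiveKey(int id) { this.id = id; }
    }

    // Hypothetical key that overrides both methods consistently.
    static final class ProperKey {
        final int id;
        ProperKey(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ProperKey)) return false;
            return ((ProperKey) o).id == id;
        }

        @Override
        public int hashCode() {
            return id;  // equal keys always produce the same hashcode
        }
    }

    public static void main(String[] args) {
        Hashtable<Object, String> table = new Hashtable<Object, String>();
        table.put(new NaiveKey(1), "naive");
        table.put(new ProperKey(1), "proper");

        // A fresh NaiveKey is a different object, so the inherited identity-based equals() fails.
        System.out.println(table.get(new NaiveKey(1)));   // null
        // A fresh ProperKey is equal by value and has the same hashcode.
        System.out.println(table.get(new ProperKey(1)));  // proper
    }
}
```

The value stored under the NaiveKey is still in the table; it simply cannot be reached unless you hold on to the original key object.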

More on why

A slightly more in-depth explanation will help you understand Hashtable's mechanism for storage and retrieval. A Hashtable internally contains buckets in which it stores the key/value pairs. The Hashtable uses the key's hashcode to determine to which bucket the key/value pair should map.

Figure 1. A Hashtable and its buckets

Figure 1 shows a Hashtable and its buckets. When you pass a key/value to the Hashtable, it queries the key's hashcode. The Hashtable uses that code to determine the bucket in which to place the key/value. So, for example, if the hashcode equals zero, the Hashtable places the value into Bucket 0. Likewise, if the hashcode is two, the Hashtable places the value into Bucket 2. (This is a simplistic example; the Hashtable will massage the hashcode first so the Hashtable doesn't try to insert the value outside the bucket.)
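The "massaging" step can be sketched as follows. This is an illustration under the assumption that the table clears the hashcode's sign bit and then takes the remainder modulo the bucket count; the method name bucketFor and the bucket count of 11 are chosen for this example, and a real implementation may reduce the hashcode differently.

```java
class BucketIndexDemo {
    // Clear the sign bit so the value is non-negative, then wrap it into 0..bucketCount-1.
    static int bucketFor(Object key, int bucketCount) {
        return (key.hashCode() & 0x7FFFFFFF) % bucketCount;
    }

    public static void main(String[] args) {
        int buckets = 11;  // assumed bucket count for this sketch
        // Integer.hashCode() returns the int value itself, which keeps the arithmetic easy to follow.
        System.out.println(bucketFor(Integer.valueOf(0), buckets));   // 0
        System.out.println(bucketFor(Integer.valueOf(2), buckets));   // 2
        System.out.println(bucketFor(Integer.valueOf(13), buckets));  // 2, so 13 collides with 2
        // A negative hashcode still lands on a valid bucket index.
        System.out.println(bucketFor(Integer.valueOf(-1), buckets));
    }
}
```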

By using the hashcode this way, the Hashtable can also quickly determine in which bucket it has placed the value when you try to retrieve it.

Hashcodes, however, represent only half the picture. The hashcode only tells the Hashtable into which bucket to drop the key/value. Sometimes, however, multiple objects may map to the same bucket, an event known as a collision. In Java, the Hashtable responds to a collision by placing multiple values into the same bucket (other implementations may handle collisions differently). Figure 2 shows what a Hashtable might look like after a few collisions.

Figure 2. A Hashtable after a few collisions

Now imagine that you call get() with a key that maps to Bucket 0. The Hashtable will now need to perform a sequential search through the key/value pairs in Bucket 0 to find your requested value. To perform this lookup, the Hashtable executes the following steps:

  1. Query the key's hashcode
  2. Retrieve the list of key/values residing in the bucket given by the hashcode
  3. Scan through each entry sequentially until a key that equals the key passed into get() is found
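
The three steps above can be sketched with a toy table that keeps a list of entries per bucket. This is not the JDK's Hashtable source; the class ChainedLookupDemo, its Entry type, and the way the bucket index is computed are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ChainedLookupDemo {
    // One key/value pair stored in a bucket.
    static final class Entry {
        final Object key;
        final Object value;
        Entry(Object key, Object value) { this.key = key; this.value = value; }
    }

    private final List<List<Entry>> buckets = new ArrayList<List<Entry>>();

    ChainedLookupDemo(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<Entry>());
        }
    }

    private List<Entry> bucketFor(Object key) {
        int index = (key.hashCode() & 0x7FFFFFFF) % buckets.size();  // Step 1: query the hashcode
        return buckets.get(index);                                    // Step 2: fetch that bucket's list
    }

    void put(Object key, Object value) {
        List<Entry> bucket = bucketFor(key);
        for (int i = 0; i < bucket.size(); i++) {
            if (bucket.get(i).key.equals(key)) {
                bucket.set(i, new Entry(key, value));  // replace the existing mapping
                return;
            }
        }
        bucket.add(new Entry(key, value));  // a collision simply extends the list
    }

    Object get(Object key) {
        for (Entry e : bucketFor(key)) {  // Step 3: scan sequentially, comparing with equals()
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;  // no equal key in this bucket
    }

    public static void main(String[] args) {
        ChainedLookupDemo table = new ChainedLookupDemo(4);
        table.put(Integer.valueOf(0), "zero");
        table.put(Integer.valueOf(4), "four");  // 4 % 4 == 0, so this collides in Bucket 0
        System.out.println(table.get(Integer.valueOf(4)));  // four
        System.out.println(table.get(Integer.valueOf(8)));  // null
    }
}
```

Notice that hashCode() only narrows the search to one bucket; equals() makes the final decision, which is why both methods must agree.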

A long answer to a short question, I know, but it gets worse. Properly overriding equals() and hashCode() is a nontrivial exercise. You must take extreme care to honor both methods' contracts.

On implementing equals()

According to the equals() Javadoc, the method must conform to the following rules:

"The equals() method implements an equivalence relation:
  • It is reflexive: For any reference value x, x.equals(x) should return true
  • It is symmetric: For any reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true
  • It is transitive: For any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true
  • It is consistent: For any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified
  • For any non-null reference value x, x.equals(null) should return false"

In Effective Java, Joshua Bloch offers a five-step recipe for writing an effective equals() method. Here's the recipe in code form:

public class EffectiveEquals {
    private int valueA;
    private int valueB;
    . . .
    public boolean equals( Object o ) {
        if(this == o) {  // Step 1: Perform an == test
            return true;
        }
        if(!(o instanceof EffectiveEquals)) {  // Step 2: Instance of check
            return false;
        }
        EffectiveEquals ee = (EffectiveEquals) o;  // Step 3: Cast argument
        // Step 4: For each important field, check to see if they are equal
        // For primitives use ==
        // For objects use equals() but be sure to also
        // handle the null case first
        return ee.valueA == valueA &&
               ee.valueB == valueB;
    }
    . . .
}

Note: You need not perform a null check since null instanceof EffectiveEquals will evaluate to false.

Finally, for Step 5, go back to equals()'s contract and ask yourself if the equals() method is reflexive, symmetric, and transitive. If not, fix it!
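
As an illustration of the kind of mistake Step 5 is meant to catch, here is a sketch of an equals() that breaks symmetry by trying to interoperate with String. The class CaseInsensitiveName is a hypothetical name for this example, written in the spirit of the well-known case-insensitive string pitfall.

```java
import java.util.Locale;

class SymmetryCheckDemo {
    // Hypothetical class whose equals() accepts plain Strings as well as its own type.
    static final class CaseInsensitiveName {
        private final String name;
        CaseInsensitiveName(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (o instanceof CaseInsensitiveName) {
                return name.equalsIgnoreCase(((CaseInsensitiveName) o).name);
            }
            if (o instanceof String) {  // broken: String.equals() knows nothing about this class
                return name.equalsIgnoreCase((String) o);
            }
            return false;
        }

        @Override
        public int hashCode() {
            return name.toLowerCase(Locale.ROOT).hashCode();
        }
    }

    public static void main(String[] args) {
        CaseInsensitiveName x = new CaseInsensitiveName("Key");
        String y = "key";
        System.out.println(x.equals(y));  // true
        System.out.println(y.equals(x));  // false, so symmetry is violated
    }
}
```

One way to repair this sketch is to drop the String branch so that the class compares only against its own type.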

On implementing hashCode()

For hashCode()'s general contract, the Javadoc says:

  • "Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode() method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode() method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables."

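Creating a properly working hashCode() method proves difficult; it becomes even more difficult if the object in question is not immutable, because a key whose fields change after insertion may hash to a different bucket than the one in which the Hashtable stored it. You can calculate a hashcode for a given object in many ways. The most effective method bases the number upon the object's fields, specifically the same fields that equals() compares. Moreover, you can combine these values in various ways. Here are two popular approaches: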

  • You can turn the object's fields into strings, concatenate the strings, and return the hashcode of the resulting string
  • You can add each field's hashcode and return the result

While other, more thorough, approaches exist, the two aforementioned approaches prove the easiest to understand and implement.
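
Here is a minimal sketch of both approaches side by side. The class TwoFieldKey and the method names hashByString() and hashBySum() are hypothetical; the two int fields mirror those of EffectiveEquals, and the comma separator in the string version is an assumption added so that adjacent fields cannot run together.

```java
class HashCodeApproachesDemo {
    // Hypothetical key with two int fields, mirroring the fields of EffectiveEquals.
    static final class TwoFieldKey {
        private final int valueA;
        private final int valueB;

        TwoFieldKey(int valueA, int valueB) {
            this.valueA = valueA;
            this.valueB = valueB;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof TwoFieldKey)) return false;
            TwoFieldKey k = (TwoFieldKey) o;
            return k.valueA == valueA && k.valueB == valueB;
        }

        // Approach 1: turn the fields into strings, concatenate, and hash the result.
        int hashByString() {
            return (valueA + "," + valueB).hashCode();
        }

        // Approach 2: add each field's hashcode.
        int hashBySum() {
            return Integer.valueOf(valueA).hashCode() + Integer.valueOf(valueB).hashCode();
        }

        @Override
        public int hashCode() {
            return hashBySum();  // either approach satisfies the contract
        }
    }

    public static void main(String[] args) {
        TwoFieldKey a = new TwoFieldKey(1, 2);
        TwoFieldKey b = new TwoFieldKey(1, 2);
        TwoFieldKey c = new TwoFieldKey(2, 1);
        // Equal objects produce equal hashcodes, as the contract requires.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());  // true
        // Addition ignores field order, so these unequal keys collide.
        System.out.println(a.hashBySum() == c.hashBySum());               // true
        // The string form keeps field order and separates them here.
        System.out.println(a.hashByString() == c.hashByString());         // false
    }
}
```

Both versions are legal, since the contract allows unequal objects to share a hashcode. The trade-off is that the sum produces more collisions for keys whose fields are permutations of one another, while the string version allocates a new String on every call.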

Tony Sintes is an independent consultant and founder of First Class Consulting, a consulting firm that specializes in bridging disparate enterprise systems and training. Outside of First Class Consulting, Tony is an active freelance writer, as well as author of Sams Teach Yourself Object-Oriented Programming in 21 Days (Sams, 2001; ISBN: 0672321092).
