The ultimate superclass, Part 2

My previous post launched a three-part series on the java.lang.Object class and its methods. After introducing Object, I examined clone() and equals(). In this post, I dig deeper into Object by covering the finalize(), getClass(), and hashCode() methods.

Finalization

Q: What does the finalize() method accomplish?

A: The finalize() method lets an object whose class overrides this method perform cleanup tasks (such as releasing system resources) when the method is called by the garbage collector. The overriding method is known as a finalizer, and the cleanup activity is known as finalization. The default finalize() method does nothing; it simply returns when called.

Q: I've heard that I shouldn't use finalize(). Is this true?

A: As a rule, you should avoid using finalize(). Finalizers may be called more promptly on some JVM implementations than on others; for example, the finalizer thread might have a lower priority on some JVMs, resulting in finalizers executing infrequently. If you depend on a finalizer to close open files or release some other system resource, a tardy finalizer could cause you to run out of those resources, and your application could fail when it tries to open a new file or otherwise obtain a new system resource.

Q: What should I use in place of a finalizer?

A: You should provide a method that explicitly cleans up the object (e.g., java.io.FileInputStream's void close() method) and call this method in the context of a try-finally construct, which ensures that the object is cleaned up whether or not an exception is thrown from the try block. For example, consider the following idiom for releasing a lock:

Lock l = ...; // ... is a placeholder for the actual lock-acquisition code
l.lock();
try 
{
   // access the resource protected by this lock
} 
finally 
{
   l.unlock();
}

This idiom ensures that the lock is unlocked whether the try block completes normally or terminates because of a thrown exception.
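The same idiom applies to any object that offers an explicit termination method. The following is a minimal sketch, assuming a hypothetical Connection class (it isn't part of any library) whose close() method stands in for real resource cleanup. It shows close() being called even when the try block throws an exception.

```java
// Hypothetical resource class, used only for illustration.
final class Connection
{
   private boolean open = true;

   void send(String msg)
   {
      if (msg == null)
         throw new IllegalArgumentException("msg is null");
      // pretend to transmit msg
   }

   // Explicit termination method.
   void close()
   {
      open = false;
   }

   boolean isOpen()
   {
      return open;
   }
}

final class CleanupDemo
{
   // Returns the connection after use so that the caller can inspect it.
   static Connection useAndClose(String msg)
   {
      Connection c = new Connection();
      try
      {
         c.send(msg); // may throw
      }
      catch (IllegalArgumentException iae)
      {
         // swallowed only to keep this demo running
      }
      finally
      {
         c.close(); // runs whether or not send() threw
      }
      return c;
   }

   public static void main(String[] args)
   {
      System.out.println(useAndClose("hello").isOpen()); // false
      System.out.println(useAndClose(null).isOpen());    // false
   }
}
```

In both calls the connection ends up closed, which is the guarantee that the finally block provides.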

Q: Are there any use cases where a finalizer would be appropriate?

A: You could use a finalizer to act as a safety net in case an explicit termination method (such as a java.io.FileOutputStream object's close() method or a java.util.concurrent.locks.Lock object's unlock() method) isn't called. When this happens, the finalizer should be called eventually and the critical resource released.
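As a rough sketch of the safety-net idea, consider the following hypothetical NativeBuffer class (the class and its release() method are assumptions made for illustration). Clients are expected to call release() explicitly; the finalizer only steps in when they forget. Note that Object.finalize() has been deprecated since Java 9, so the sketch suppresses the resulting compiler warnings.

```java
// Hypothetical resource wrapper, used only for illustration.
final class NativeBuffer
{
   private boolean released;

   // Explicit termination method; clients are expected to call it.
   void release()
   {
      if (!released)
      {
         released = true;
         // free the underlying resource here
      }
   }

   boolean isReleased()
   {
      return released;
   }

   // Safety net: runs only if the garbage collector finalizes the object,
   // which might happen late or not at all.
   @SuppressWarnings({"deprecation", "removal"})
   @Override
   protected void finalize() throws Throwable
   {
      try
      {
         if (!released)
         {
            System.err.println("NativeBuffer wasn't released explicitly");
            release();
         }
      }
      finally
      {
         super.finalize();
      }
   }
}
```

Logging a message from the finalizer is a design choice: it flags the missing release() call so that the client code can be fixed, rather than silently relying on the safety net.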

Q: How do I code finalize()?

A: You often code finalize() according to the following pattern:

@Override
protected void finalize() throws Throwable
{
   try
   {
      // Finalize the subclass state.
      // ...
   }
   finally
   {
      super.finalize();
   }
}

A subclass finalizer typically calls its superclass finalizer. After finalizing the subclass in a try block, call super.finalize(); in a matching finally block. This way, the superclass is guaranteed to be finalized whether or not an exception is thrown from the try block.

Q: What happens when finalize() throws an exception?

A: When finalize() throws an exception that it doesn't catch, the exception is ignored. Furthermore, finalization of the object terminates, which can leave the object in a corrupt state. If another thread tries to use this object, the resulting behavior is nondeterministic. Although an uncaught exception usually causes the thread to terminate and a stack trace to be printed, neither behavior happens when the exception is thrown from the finalize() method.

Q: I'd like to experiment with finalize(). Can you provide me with a simple application?

A: Check out Listing 1.

Listing 1. Experiencing finalization

class LargeObject
{
   byte[] memory = new byte[1024*1024*4];

   @Override
   protected void finalize() throws Exception
   {
      System.out.println("finalized");
   }
}

public class FinalizeDemo
{
   public static void main(String[] args)
   {
      while (true)
         new LargeObject();
   }
}

Listing 1 presents the source code to a FinalizeDemo application that repeatedly instantiates the LargeObject class. Each LargeObject instance creates a 4-megabyte byte array. Because the application keeps no reference to a LargeObject object, each object becomes eligible for garbage collection immediately after it's created and will be garbage collected at some point.

One of the tasks performed by the garbage collector when garbage collecting an object is to call that object's finalize() method. LargeObject's overriding finalize() method prints a message to the standard output stream when called. It doesn't invoke its superclass finalize() method because Object is the superclass and its finalize() method does nothing.

Compile Listing 1 (javac FinalizeDemo.java) and run this application (java FinalizeDemo). When I build/run this application on my 64-bit Windows 7 platform via JDK 7u6, I observe a list of finalized messages. When I build/run this application on the same platform with the official JDK 8 release, I observe java.lang.OutOfMemoryError (along with several finalized messages).

Get the Class object

Q: What does the getClass() method accomplish?

A: The getClass() method returns the java.lang.Class object that's associated with the class of the object on which getClass() is called. The returned Class object is the object that's locked by static synchronized methods of the represented class (for example, static synchronized void foo() {}). It's also the entry point into the Reflection API. Because the class of the object on which getClass() is called has necessarily been loaded already, the lookup cannot fail, so type safety is assured.
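The following minimal sketch, which assumes a hypothetical Account class, illustrates both points: getClass() returns the same Class object as the class literal, and a static synchronized method holds the lock on that Class object while it runs. Thread.holdsLock() is used only to make the locking visible.

```java
// Hypothetical class, used only for illustration.
final class Account
{
   // A static synchronized method locks the Class object for Account.
   static synchronized boolean holdsClassLock()
   {
      return Thread.holdsLock(Account.class);
   }
}

final class GetClassDemo
{
   public static void main(String[] args)
   {
      Account acct = new Account();
      Class<? extends Account> clazz = acct.getClass();

      System.out.println(clazz.getSimpleName());            // Account
      System.out.println(clazz == Account.class);           // true
      System.out.println(Account.holdsClassLock());         // true
      System.out.println(Thread.holdsLock(Account.class));  // false
   }
}
```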

Q: Are there other ways to obtain a Class object?

A: There are two other ways to obtain a Class object. You can use a class literal, which is the name of a class followed by a .class suffix; for example, Account.class. Alternatively, you can call one of Class's forName() methods. Class literals are compact and the compiler enforces type safety; it won't compile source code when it cannot find the literal's specified class. forName() lets you dynamically load any reference type by specifying its package-qualified name. However, the name isn't checked at compile time, so a misspelled or missing type results in a java.lang.ClassNotFoundException being thrown at runtime.
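Here is a small sketch that contrasts the two approaches. The lookup() helper and the no.such.Type name are made up for illustration; only the class literal and Class.forName(String) come from the Java API.

```java
final class ClassLookupDemo
{
   // Returns the Class object for the given name, or null when the
   // type cannot be found.
   static Class<?> lookup(String name)
   {
      try
      {
         return Class.forName(name);
      }
      catch (ClassNotFoundException cnfe)
      {
         return null;
      }
   }

   public static void main(String[] args)
   {
      // The class literal is checked by the compiler.
      System.out.println(String.class == lookup("java.lang.String")); // true

      // The string name is checked only when the code runs.
      System.out.println(lookup("no.such.Type")); // null
   }
}
```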

Q: Should I prefer getClass() to instanceof when implementing the equals() method?

A: The topic of whether to use getClass() or instanceof when implementing the equals() method has been a source of much debate in the Java community. An excellent resource that can help you make this choice is Angelika Langer's Secrets of equals() - Part 1 article. As well as discussing problems when incorrectly overriding equals() (such as symmetry violation), Langer presents helpful guidelines for ensuring that equals() is correctly overridden.
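To make the symmetry problem concrete, here is a minimal sketch that assumes hypothetical Point and ColorPoint classes (they aren't taken from Langer's article). Both classes use instanceof, and the subclass also compares its extra field, which breaks symmetry when the two types are mixed.

```java
// Hypothetical classes, used only for illustration.
class Point
{
   final int x, y;

   Point(int x, int y)
   {
      this.x = x;
      this.y = y;
   }

   @Override
   public boolean equals(Object o)
   {
      if (!(o instanceof Point))
         return false;
      Point p = (Point) o;
      return p.x == x && p.y == y;
   }

   @Override
   public int hashCode()
   {
      return 31 * x + y;
   }
}

class ColorPoint extends Point
{
   final String color;

   ColorPoint(int x, int y, String color)
   {
      super(x, y);
      this.color = color;
   }

   @Override
   public boolean equals(Object o)
   {
      if (!(o instanceof ColorPoint))
         return false;
      return super.equals(o) && ((ColorPoint) o).color.equals(color);
   }
}

final class SymmetryDemo
{
   public static void main(String[] args)
   {
      Point p = new Point(1, 2);
      ColorPoint cp = new ColorPoint(1, 2, "red");
      System.out.println(p.equals(cp)); // true: a ColorPoint is a Point
      System.out.println(cp.equals(p)); // false: a Point isn't a ColorPoint
   }
}
```

Comparing getClass() results in both classes would make both calls return false, which restores symmetry at the cost of never treating a subclass instance as equal to a superclass instance. Which trade-off is right depends on the class hierarchy.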

Hash codes

Q: What does the hashCode() method accomplish?

A: The hashCode() method returns a hash code (the value returned from a hash function) for the object on which this method is called. This method is used by hash-based collection classes, such as java.util.HashMap, java.util.HashSet, and java.util.Hashtable.

Q: Why must I override hashCode() when also overriding equals() in my classes?

A: You must override hashCode() when also overriding equals() in your classes to ensure that your objects function properly with all hash-based collections. This is a good habit to get into, even when your objects won't be stored in hash-based collections.

Q: What is hashCode()'s general contract?

A: hashCode()'s general contract is as follows:

  • Whenever hashCode() is invoked on the same object more than once during an execution of a Java application, hashCode() must consistently return the same integer, provided no information used in equals() comparisons on the object is modified. However, this integer doesn't need to remain consistent from one execution of an application to another execution of the same application.
  • When two objects are equal according to the overriding equals() method, calling hashCode() on each of the two objects must produce the same integer result.
  • When two objects are unequal according to the overriding equals() method, the integers returned from calling hashCode() on these objects can be identical. However, having hashCode() return distinct values for unequal objects may improve hash table performance.
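The second and third rules can be observed with java.lang.String, whose hashCode() algorithm is documented. In this small sketch, the strings "Aa" and "BB" were chosen because they are unequal yet happen to share a hash code.

```java
final class HashContractDemo
{
   public static void main(String[] args)
   {
      // Two distinct but equal objects must have equal hash codes.
      String a = new String("Aa");
      String b = new String("Aa");
      System.out.println(a.equals(b));                  // true
      System.out.println(a.hashCode() == b.hashCode()); // true

      // Unequal objects are allowed to share a hash code.
      System.out.println("Aa".equals("BB"));                  // false
      System.out.println("Aa".hashCode() == "BB".hashCode()); // true
   }
}
```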

Q: What happens when I override equals() and don't override hashCode()?

A: When you override equals() and don't override hashCode(), you run into problems when storing your objects in a hash-based collection. For example, consider Listing 2.

Listing 2. Experiencing hash-based collection difficulty when only equals() is overridden

import java.util.HashMap;
import java.util.Map;

final class Employee
{
   private String name;
   private int age;

   Employee(String name, int age)
   {
      this.name = name;
      this.age = age;
   }

   @Override
   public boolean equals(Object o)
   {
      if (!(o instanceof Employee))
         return false;

      Employee e = (Employee) o;
      return e.getName().equals(name) && e.getAge() == age;
   }

   String getName()
   {
      return name;
   }

   int getAge()
   {
      return age;
   }
}

public class HashDemo
{
   public static void main(String[] args)
   {
      Map<Employee, String> map = new HashMap<>();
      Employee emp = new Employee("John Doe", 29);
      map.put(emp, "first employee");
      System.out.println(map.get(emp));
      System.out.println(map.get(new Employee("John Doe", 29)));
   }
}

Listing 2 declares an Employee class that overrides equals() and doesn't override hashCode(). It also declares a HashDemo class whose main() method demonstrates what can go wrong when storing an Employee instance as the key in a hashmap.

main() first creates a hashmap followed by an Employee object and stores an entry consisting of this object as the key and a string as the value in the hashmap. It then retrieves the entry by passing this object as the key and outputs the result. Similarly, it attempts to retrieve the entry by passing a new Employee object with identical fields and outputs the result.

Compile Listing 2 (javac HashDemo.java) and run this application (java HashDemo). You should observe the following output:

first employee
null

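If hashCode() had been properly overridden, you would have seen first employee instead of null on the second output line. The two Employee objects are equal according to equals(), and hashCode()'s general contract states that when two objects are equal according to the overriding equals() method, calling hashCode() on each of the two objects must produce the same integer result. Because Employee inherits Object's hashCode() method, which typically returns different values for distinct objects, the hashmap looks for the second key in the wrong place and doesn't find the entry.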

Q: How do I properly override hashCode()?

A: Item 8 in the first edition of Joshua Bloch's Effective Java Programming Language Guide presents a simple 4-step recipe for properly overriding hashCode(). The following steps are similar to what Bloch presents:

  1. Introduce an int variable named result (or whatever name you want) and initialize it to a constant nonzero value (e.g., 31). A nonzero value is used so that result will be affected by any initial fields whose hash value (computed in Step 2.1) is zero. If the initial value assigned to result were 0, the final hash value would be unaffected by such initial fields and the chance of collisions while hashing would increase. The nonzero value assigned to result is arbitrary.
  2. For each of the object's significant fields (these would be the fields used in an equals() comparison), f, perform the following steps:
    1. Calculate int-based hash code hc on field f, as follows:
      1. For a boolean field, calculate hc = f ? 0 : 1;.
      2. For a byte, char, short, or int field, calculate hc = (int) f;.
      3. For a long field, calculate hc = (int) (f ^ (f >>> 32));. This expression exclusive ORs the long integer's least significant 32 bits with its most significant 32 bits.
      4. For a float field, calculate hc = Float.floatToIntBits(f);.
      5. For a double field, calculate long l = Double.doubleToLongBits(f); hc = (int) (l ^ (l >>> 32));.
      6. For a reference field, if this class's equals() method compares the field by recursively calling equals() on the field, recursively invoke hashCode() on the field. If a more complex comparison is needed, calculate a "canonical representation" of the field and calculate that representation's hash code. If the field contains null, calculate hc = 0;.
      7. For an array field, regard each element as a separate field. For each significant element, calculate a hash code on the element by applying this recipe recursively and combine the resulting hash code into the overall hash code, as described in Step 2.2.
    2. Execute result = 37*result+hc;, which merges hc into the overall hash code. The multiplication causes the hash value to depend on field order, which results in an improved hash function when the class contains multiple similar fields. The number 37 is an odd prime number. An odd multiplier is used because, if the multiplier were even and the multiplication overflowed, information would be lost (multiplying by 2 is equivalent to shifting left by one bit).
  3. Return result.
  4. After writing hashCode(), make sure that equal instances of the class produce the same hash codes.
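As a rough sketch of how these steps might translate into code, consider the following hypothetical Reading class, which was invented here to exercise several field types. The initial value 17 is an arbitrary nonzero constant, and the main() method performs the check described in Step 4.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class, used only for illustration.
final class Reading
{
   private final boolean valid;
   private final int id;
   private final long timestamp;
   private final double value;
   private final String label; // may be null

   Reading(boolean valid, int id, long timestamp, double value, String label)
   {
      this.valid = valid;
      this.id = id;
      this.timestamp = timestamp;
      this.value = value;
      this.label = label;
   }

   @Override
   public boolean equals(Object o)
   {
      if (!(o instanceof Reading))
         return false;
      Reading r = (Reading) o;
      return r.valid == valid && r.id == id && r.timestamp == timestamp &&
             Double.doubleToLongBits(r.value) == Double.doubleToLongBits(value) &&
             (label == null ? r.label == null : label.equals(r.label));
   }

   @Override
   public int hashCode()
   {
      int result = 17;                                               // Step 1
      result = 37 * result + (valid ? 0 : 1);                        // boolean
      result = 37 * result + id;                                     // int
      result = 37 * result + (int) (timestamp ^ (timestamp >>> 32)); // long
      long bits = Double.doubleToLongBits(value);
      result = 37 * result + (int) (bits ^ (bits >>> 32));           // double
      result = 37 * result + (label == null ? 0 : label.hashCode()); // reference
      return result;                                                 // Step 3
   }
}

final class RecipeDemo
{
   public static void main(String[] args)
   {
      Reading r1 = new Reading(true, 7, 1L << 40, 2.5, "a");
      Reading r2 = new Reading(true, 7, 1L << 40, 2.5, "a");

      // Step 4: equal instances must produce equal hash codes.
      System.out.println(r1.equals(r2));                  // true
      System.out.println(r1.hashCode() == r2.hashCode()); // true

      Set<Reading> set = new HashSet<>();
      set.add(r1);
      System.out.println(set.contains(r2)); // true
   }
}
```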

To illustrate this recipe, Listing 3 presents a second version of Listing 2 whose Employee class properly overrides hashCode().
