Java 101: Trash talk, Part 2

The Reference Objects API allows programs to interact with the garbage collector

Java's garbage collection features tend to confuse new developers. I wrote this two-part series on garbage collection to dispel that confusion. Part 1 introduced you to garbage collection, explored various garbage collection algorithms, showed you how to request that Java run the garbage collector, explained the purpose behind finalization, and mentioned resurrection -- a technique for bringing objects back from the dead. Part 2 explores the Reference Objects API.

As you learned in Part 1, Java's garbage collector destroys objects. Although you typically write programs that ignore the garbage collector, situations arise in which a program needs to interact with the garbage collector.

For example, suppose you plan to write a Java-based Web browser program similar to Netscape Navigator or Internet Explorer. When it comes to displaying Webpage images, your first thought is for the browser to always download all images before displaying them to a user. However, you soon realize that the user will spend too much time waiting for images to download. Although a user might be willing to wait when visiting a Webpage for the first time, the user would probably not tolerate waiting each time he revisits the Webpage. To decrease the user's wait time, you can design the browser program to support an image cache, which allows the browser to save each image after the download completes in an object on the object heap. The next time the user visits the Webpage in the same browsing session, the browser can retrieve the corresponding image objects from the object heap and quickly display those images to the user.

To keep the discussion simple, I don't discuss a second-level disk-based image cache mechanism. Browsers like Netscape Navigator and Internet Explorer use this mechanism.

The image cache idea has a problem -- insufficient heap memory. When the user visits numerous Webpages with differently sized images, the browser must store all of those images in heap memory. At some point, free heap memory will shrink to a level where no room exists for more images. What does the browser do? By taking advantage of the Reference Objects API, the browser allows the garbage collector to remove images when the JVM needs additional heap space. In turn, when the browser needs to redraw an image, the API lets the browser find out whether the garbage collector has removed that image from memory. If the image is not in memory, the browser must first reload it. The browser can then restore that image to the image cache -- although the garbage collector might need to remove another image from the cache to make heap memory available for the reloaded image, assuming the object heap's free memory is low.

In addition to teaching you how to use the Reference Objects API to manage an image cache, this article teaches you how to use that API to obtain notification when significant objects are no longer strongly reachable and perform post-finalization cleanup. But first, we must investigate object states and the Reference Objects API class hierarchy.

Object states and the Reference Objects API class hierarchy

Prior to the release of Java 2 Platform, Standard Edition (J2SE) 1.2, an object could be in only one of three states: reachable, resurrectable, or unreachable.

  • An object is reachable if the garbage collector can trace a path from a root-set variable to that object. After the JVM creates an object, that object remains reachable as long as the program maintains at least one reference to it. Assigning null to an object reference variable reduces the number of references to the object by one. For example:

    Employee e = new Employee ();
    Employee e2 = e;
    e = null;

    In the above code fragment, the Employee object is initially reachable through e. After the second statement, it is reachable through both e and e2. After null is assigned to e, the object is reachable only through e2.

  • An object is resurrectable if it is currently unreachable through root-set variables, but has the potential to be made reachable through a garbage collector call to that object's overridden finalize() method. Because finalize()'s code can make the object reachable, the garbage collector must retrace all paths from root-set variables in an attempt to locate the object after finalize() returns. If the garbage collector cannot find a path to the object, it makes the object unreachable. If a path does exist, the garbage collector makes the object reachable. If the object is made reachable, the garbage collector will not run its finalize() method a second time when no more references to that object exist. Instead, the garbage collector makes that object unreachable.
  • An object is unreachable when no path from root-set variables to that object exists and when the garbage collector cannot call that object's finalize() method. The garbage collector is free to reclaim the object's memory from the heap.

With the release of J2SE 1.2, three new object states representing progressively weaker forms of reachability became available to Java: softly reachable, weakly reachable, and phantomly reachable. Subsequent sections explore each of those states.

Also with the J2SE 1.2 release, the state previously known as reachable became known as strongly reachable. For example, in the code fragment Employee e = new Employee ();, the Employee object is strongly reachable through root-set variable e (assuming e is a local variable).

The new object states became available to Java through reference objects. A reference object encapsulates a reference to another object, a referent. Furthermore, the reference object is an instance of a class that subclasses the abstract Reference class in the Reference Objects API -- a collection of classes in package java.lang.ref. Figure 1 presents a hierarchy of reference object classes that constitute much of the Reference Objects API.

Figure 1. A hierarchy of reference object classes composes much of the Reference Objects API

Figure 1's class hierarchy shows a class named Reference at the top and SoftReference, WeakReference, and PhantomReference classes below. The abstract Reference class defines those operations common to the other three classes. Those operations include:

  • Clear the current reference object
  • Add the current reference object to the currently registered reference queue
  • Return the current reference object's referent
  • Determine if the garbage collector has placed the current reference object on a reference queue

The aforementioned operations introduce a reference queue. What are reference queues, and why are they part of the Reference Objects API? I'll answer both questions during our exploration of soft references.
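Before moving on, the four operations listed above can be seen in a small sketch. It assumes they correspond to Reference's clear(), enqueue(), get(), and isEnqueued() methods, and it uses generics, which postdate the raw-type fragments elsewhere in this article. The class name ReferenceOperationsDemo and the describe() helper are hypothetical names chosen for illustration. The program drives every step itself, so nothing here depends on when the garbage collector runs.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;

class ReferenceOperationsDemo {
    // Hypothetical helper: summarizes what get() and isEnqueued() report.
    // isEnqueued() is deprecated in recent JDK releases but is still available.
    @SuppressWarnings("deprecation")
    static String describe(Reference<?> ref) {
        return "referent " + (ref.get() == null ? "cleared" : "present")
             + ", enqueued=" + ref.isEnqueued();
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        SoftReference<Object> ref = new SoftReference<>(referent, queue);

        // Return the referent: the same object we passed to the constructor.
        System.out.println(ref.get() == referent);   // true
        System.out.println(describe(ref));           // referent present, enqueued=false

        // Clear the reference object (program-initiated, not the collector).
        ref.clear();
        System.out.println(describe(ref));           // referent cleared, enqueued=false

        // Add the reference object to the queue it was registered with.
        System.out.println(ref.enqueue());           // true
        System.out.println(describe(ref));           // referent cleared, enqueued=true

        // The queue hands back the reference object, not the referent.
        System.out.println(queue.poll() == ref);     // true
        System.out.println(describe(ref));           // referent cleared, enqueued=false
    }
}
```

Note that clearing a reference does not enqueue it; the two operations are independent when the program performs them by hand.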

Soft references

The softly reachable state manifests itself in Java through the SoftReference class. When you initialize a SoftReference object, you store a reference to a referent in that object. The object contains a soft reference to the referent, and the referent is softly reachable if there are no other references, apart from soft references, to that referent. If heap memory is running low, the garbage collector can clear the soft references to softly reachable objects. The API documentation encourages, but does not require, JVM implementations to spare recently created or recently used soft references, and it guarantees that all soft references to softly reachable objects are cleared before the JVM throws an OutOfMemoryError. (A program can also clear a soft reference itself by calling SoftReference's inherited clear() method.) Assuming there are no other references to those referents, the referents enter the resurrectable state (if they contain overridden finalize() methods) or the unreachable state (if they lack overridden finalize() methods). Assuming the referents enter the resurrectable state, the garbage collector calls their finalize() methods. If those methods do not make the referents reachable, the referents become unreachable. The garbage collector can then reclaim their memory.

To create a SoftReference object, pass a reference to a referent in one of two constructors. For example, the following code fragment uses the SoftReference(Object referent) constructor to create a SoftReference object, which encapsulates an Employee referent:

SoftReference sr = new SoftReference (new Employee ());

Figure 2 shows the resulting object structure.

Figure 2. A SoftReference object and its Employee referent

According to Figure 2, the SoftReference object is strongly reachable through root-set variable sr. Also, the Employee object is softly reachable from the soft reference field inside SoftReference.

You often use soft references to implement image and other memory-sensitive caches. You can create an image cache by using the SoftReference and java.awt.Image classes. Image subclass objects allow images to load into memory. As you probably know, images can consume lots of memory, especially if they have large horizontal and vertical pixel dimensions and many colors. If you kept all such images in memory, the object heap would quickly fill up, and your program would grind to a halt. However, if you maintain soft references to Image subclass objects, your program can arrange for the garbage collector to notify you when it clears an Image subclass object's soft reference and moves it to the resurrectable state -- assuming no other references to Image exist. Eventually, assuming the Image subclass object lacks a finalize() method with code that resurrects the image, Image will transition to the unreachable state, and the garbage collector will reclaim its memory.

By calling SoftReference's inherited get() method, you can determine if an Image subclass object is still softly referenced or if the garbage collector has cleared that reference. get() returns null when the soft reference clears. Given the preceding knowledge, the following code fragment shows how to implement an image cache for a single Image subclass object:

SoftReference sr = null;
// ... Sometime later in a drawing method.
Image im = (sr == null) ? null : (Image) (sr.get());
if (im == null)
{
    // Either the image was never loaded or the garbage collector
    // cleared the soft reference: load the image and cache it again.
    im = getImage (getCodeBase(), "truck1.gif");
    sr = new SoftReference (im);
}
// Draw the image.
// Later, clear the strong reference to the Image subclass object.
// Once that is done -- assuming no other strong reference exists --
// the only reference to the Image subclass object is a soft
// reference. Eventually, when the garbage collector notes that it
// is running out of heap memory, it can clear that soft reference
// (and eventually remove the object).
im = null;

The code fragment's caching mechanism works as follows: To begin, there is no SoftReference object (sr contains null), so null is assigned to im. Because im contains null, the code calls the getImage() method, which loads truck1.gif. Next, the code creates a SoftReference object. As a result, there is both a strong reference (via im) and a soft reference (via sr) to the Image subclass object. After the code draws the image, null is assigned to im. Now there is only a single soft reference to Image. If the garbage collector notices that free memory is low, it can clear Image's soft reference in the SoftReference object that sr strongly references.

Suppose the garbage collector clears the soft reference. The next time the code fragment must draw the image, it discovers that sr does not contain null and calls sr.get() to retrieve a strong reference to Image -- the referent. Because the soft reference has been cleared, get() returns null, which is assigned to im. The code must now reload the image by calling getImage() -- a relatively slow process. However, if the garbage collector had not cleared the soft reference, sr.get() would return a reference to the Image subclass object, and the code could immediately draw that image without first loading it. And that is how a soft reference allows us to cache an image.
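The single-image fragment generalizes to a cache keyed by name. The following is a minimal sketch of that idea, not the browser design described earlier: SoftCache, its loader function, simulateClear(), and loadCount() are all hypothetical names, and a byte array stands in for an Image so the example runs without AWT. Because the garbage collector decides on its own when to clear soft references, simulateClear() calls clear() directly to imitate that event in a repeatable way.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache: values are held only through soft references.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;   // the slow path, e.g. a download
    private int loads;                     // how many times the slow path ran

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // null if the key was never cached or its soft reference was cleared
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;   // the caller now holds a strong reference
    }

    // Imitates the garbage collector clearing the soft reference for a key.
    void simulateClear(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref != null) {
            ref.clear();
        }
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<>(name -> new byte[1024]);
        byte[] first = cache.get("truck1.gif");    // loads
        byte[] second = cache.get("truck1.gif");   // served from the cache
        System.out.println(first == second);       // true
        System.out.println(cache.loadCount());     // 1

        cache.simulateClear("truck1.gif");
        cache.get("truck1.gif");                   // reference cleared, so it reloads
        System.out.println(cache.loadCount());     // 2
    }
}
```

This sketch is not thread-safe, and it never removes map entries whose soft references have been cleared; the SoftReference objects themselves stay in the map until they are replaced.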

The previous code fragment called sr.get() to learn whether or not the garbage collector cleared the sr-referenced object's internal soft reference to an Image subclass object. However, a program can also request notification by using a reference queue -- a data structure that holds references to Reference subclass objects. Under garbage collector (or even program) control, Reference subclass object references arrive at the end of the reference queue and exit from that queue's front. As a reference exits from the front, the following reference moves to the front, and other references move forward. Think of a reference queue as a line of people waiting to see a bank teller.

To use a reference queue, a program first creates an object from the ReferenceQueue class (located in the java.lang.ref package). The program then calls the SoftReference(Object referent, ReferenceQueue q) constructor to associate a SoftReference object with the ReferenceQueue object that q references, as the following code fragment demonstrates:

ReferenceQueue q = new ReferenceQueue ();
SoftReference sr = new SoftReference (new Employee (), q);
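One plausible use of such a queue, sketched below under stated assumptions, is to discard bookkeeping for referents that are gone. KeyedRef and purge() are hypothetical names, not part of the Reference Objects API; KeyedRef is a SoftReference subclass that remembers a map key. The sketch calls enqueue() itself to stand in for the garbage collector, whose timing a program cannot control, and it drains the queue with ReferenceQueue's non-blocking poll() method.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class QueueDrainDemo {
    // Hypothetical subclass: remembers which map key the reference belongs to.
    static final class KeyedRef extends SoftReference<Object> {
        final String key;

        KeyedRef(String key, Object referent, ReferenceQueue<Object> q) {
            super(referent, q);
            this.key = key;
        }
    }

    // Removes map entries whose reference objects have arrived on the queue.
    // Returns the number of entries removed.
    static int purge(Map<String, KeyedRef> map, ReferenceQueue<Object> q) {
        int removed = 0;
        Reference<?> ref;
        while ((ref = q.poll()) != null) {      // poll() returns null when the queue is empty
            KeyedRef keyed = (KeyedRef) ref;
            if (map.remove(keyed.key, keyed)) { // remove only if the entry still maps to this reference
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> q = new ReferenceQueue<>();
        Map<String, KeyedRef> map = new HashMap<>();
        KeyedRef a = new KeyedRef("a", new Object(), q);
        KeyedRef b = new KeyedRef("b", new Object(), q);
        map.put(a.key, a);
        map.put(b.key, b);

        System.out.println(purge(map, q));   // 0: nothing has been enqueued yet

        a.enqueue();                         // stands in for the garbage collector
        System.out.println(purge(map, q));   // 1
        System.out.println(map.keySet());    // [b]
    }
}
```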