Type dependency in Java, Part 2

Using covariance and contravariance in your Java programs

chrstphré cæmpbëll (Creative Commons BY or BY-SA)

Understanding type compatibility is fundamental to writing good Java programs, but the interplay of variances between Java language elements can seem highly academic to the uninitiated. This two-part article is for software developers ready to tackle the challenge! Part 1 revealed the covariant and contravariant relationships between simpler elements such as array types and generic types, as well as the special Java language element, the wildcard. Part 2 explores type dependency in the Java Collections API, in generics, and in lambda expressions.

We'll jump right in, so if you haven't already read Part 1, I recommend starting there.

API examples for contravariance

For our first example, consider the Comparator version of java.util.Collections.sort(), from the Java Collections API. This method's signature is:


<T> void sort(List<T> list, Comparator<? super T> c)

The sort() method sorts any List. Usually it's easier to use the overloaded version, with the signature:


sort(List<T extends Comparable<? super T>>)

In this case, extends Comparable expresses that the sort() may be called only if the necessary method-comparing elements (namely compareTo) have been defined in the element type (or in its supertype, thanks to ? super T):


sort(integerList); // Integer implements Comparable 
sort(customerList); // works only if Customer implements Comparable

Using generics for comparison

Obviously, a list is sortable only if its elements can be compared among each other. Comparison is done by the single method compareTo, which belongs to the interface Comparable. You must implement compareTo in the element class.

This type of element can be sorted just one way, however. For example, you may sort a Customer by their ID, but not by birthday or postal code. Using the Comparator version of sort() is more flexible:


publicstatic <T> void sort(List<T> list, Comparator<? super T> c)

Now we compare elements not in the element's class, but in an additional Comparator object. This generic interface has one object method:


int compare(T o1, T o2);

Contravariant parameters

Instantiating an object more than once enables you to sort objects using different criteria. But do we really need such a complicated Comparator<? super T> type parameter? In most cases, Comparator<T> would be enough. We could use its compare() method to compare any two elements in the List<T> object, as follows:

class DateComparator implements Comparator<java.util.Date> {
	public int compare(Date d1, Date d2) { return ... }
		// compares the two Date objects
}
List<java.util.Date> dateList = ... ; // List of Date objects
sort(dateList, new DateComparator()); // sorts dateList

Using the more complicated version of the method Collection.sort() set us up for additional use cases, however. The contravariant type parameter of Comparable makes it possible to sort a list of type List<java.sql.Date>, because java.util.Date is a supertype of java.sql.Date:


List<java.sql.Date> sqlList = ... ;
sort(sqlList, new DateComparator());

If we omit contravariance in the sort() signature (using only <T> or the unspecified, unsafe <?>), then the compiler rejects the last line as type error.

In order to call


sort(sqlList, new SqlDateComparator());

you would have to write an extra featureless class:


class SqlDateComparator extends DateComparator {}

Additional methods

Collections.sort() isn't the only Java Collections API method equipped with a contravariant parameter. Methods like addAll(), binarySearch(), copy(), fill(), and so on, can be used with similar flexibility.

Collections methods like max() and min() offer contravariant result types:


public static <T extends Object & Comparable<? super T>> T max(
	Collection<? extends T> collection) { ... }

As you see here, a type parameter can be requested to satisfy more than one condition, just by using &. The extends Object might appear superfluous, but it stipulates that max() returns a result of type Object and not of row Comparable in the bytecode. (There are no type parameters in bytecode.)

The overloaded version of max() with Comparator is even funnier:


public static <T> T max(Collection<? extends T> collection,
	Comparator<? super T> comp)

This max() has both contravariant and covariant type parameters. While the elements of the Collection must be of (possibly different) subtypes of a certain (not explicitly given) type, Comparator must be instantiated for a supertype of the same type. Much is required of the compiler's inference algorithm, in order to differentiate this in-between type from a call like this one:


Collection<AType> collection = ... ; 
Comparator<AnOtherType> comparator = ... ; 
max(collection, comparator);

Boxed binding of type parameters

As our last example of type dependency and variance in the Java Collections API, let's reconsider the signature of the sort() with Comparable. Note that it uses both extends and super, which are boxed:


static <T extends Comparable<? super T>> void sort(List<T> list) { ... }

In this case, we're not as interested in the compatibility of references as we are in binding the instantiation. This instance of the sort() method sorts a list object with elements of a class implementing Comparable. In most cases, sorting would work without <? super T> in the method's signature:


sort(dateList); // java.util.Date implements Comparable<java.util.Date> 
sort(sqlList); // java.sql.Date implements Comparable<java.sql.Date> 

The lower bound of the type parameter allows additional flexibility, however. Comparable doesn't necessarily need to be implemented in the element class; it's enough to have implemented it in the superclass. For example:


class SuperClass implements Comparable<SuperClass> {
		public int compareTo(SuperClass s) { ... } }
	class SubClass extends SuperClass {} // without overloading of compareTo()
	List<SuperClass> superList = ...;
	sort(superList);
	List<SubClass> subList = ...;
	sort(subList);

The compiler accepts the last line with


static <T extends Comparable<? super T>> void sort(List<T> list) { ... }

and rejects it with

static <T extends Comparable<T>> void sort(List<T> list) { ... }

The reason for this rejection is that the type SubClass (which the compiler would determine from the type List<SubClass> in the parameter subList) is not suitable as a type parameter for T extends Comparable<T>. The type SubClass doesn't implement Comparable<SubClass>; it only implements Comparable<SuperClass>. The two elements are not compatible due to the lack of implicit covariance, although SubClass is compatible to SuperClass.

On the other hand, if we use <? super T>, the compiler doesn't expect SubClass to implement Comparable; it's enough if SuperClass does it. It's enough because the method compareTo() is inherited from SuperClass and can be called for SubClass objects: <? super T> expresses this, effecting contravariance.

Contravariant accessing variables of a type parameter

The upper or the lower bound applies only to type parameter of instantiations referred by a covariant or contravariant reference. In the case of Generic<? extends SuperType> covariantReference; and Generic<? super SubType> contravariantReference;, we can create and refer objects of different Generic instantiations.

Different rules are valid for the parameter and result type of a method (such as for input and output parameter types of a generic type). An arbitrary object compatible to SubType can be passed as parameter of the method write(), as defined above.


contravariantReference.write(new SubType()); // OK
contravariantReference.write(new SubSubType()); // OK too
contravariantReference.write(new SuperType()); // type error
((Generic<? super SuperType>)contravariantReference).write(
	new SuperType()); // OK

Because of contravariance, it's possible to pass a parameter to write(). This is in contrast to the covariant (also unbounded) wildcard type.

The situation doesn't change for the result type by binding: read() still delivers a result of type ?, compatible only to Object:


Object o = contravariantReference.read();
SubType st = contravariantReference.read(); // type error

The last line produces an error, even though we've declared a contravariantReference of type Generic<? super SubType>.

The result type is compatible to another type only after the reference type has been explicitly converted:


SuperSuperType sst = ((Generic<SuperSuperType>)contravariantReference).read();
sst = (SuperSuperType)contravariantReference.read(); // unsafer alternative

Examples in the previous listings show that reading or writing access to a variable of type parameter behaves the same way, regardless of whether it happens over a method (read and write) or directly (data in the examples).

Reading and writing to variables of type parameter

Table 1 shows that reading into an Object variable is always possible, because every class and the wildcard are compatible to Object. Writing an Object is possible only over a contravariant reference after appropriate casting, because Object is not compatible to the wildcard. Reading without casting into an unfitting variable is possible with a covariant reference. Writing is possible with a contravariant reference.

The rows in Table 1 refer to the sort of reference, and the columns to the type of data to be accessed. The headings of "supertype" and "subtype" indicate the wildcard bounds. The entry "cast" means the reference must be casted. An instance of "OK" in the last four columns refers to the typical cases for covariance and contravariance.

See the end of this article for a systematic test program for the table, with detailed explanations.

Creating objects

On the one hand, you cannot create objects of the wildcard type, because they are abstract. On the other hand, you can create array objects only of an unbounded wildcard type. You cannot create objects of other generic instantiations, however.


Generic<Object>[] genericArray = new Generic<Object>[20]; // type error
Generic<?>[] wildcardArray = new Generic<?>[20]; // OK
genericArray = (Generic<Object>[])wildcardArray; // unchecked conversion
genericArray[0] = new Generic<Object>();
genericArray[0] = new Generic<String>(); // type error
wildcardArray[0] = new Generic<String>(); // OK

Because of the covariance of arrays, the wildcard array type Generic<?>[] is the supertype of the array type of all instantiations; therefore the assignment in the last line of the above code is possible.

Within a generic class, we cannot create objects of the type parameter. For example, in the constructor of an ArrayList implementation, the array object must be of type Object[] upon creation. We can then convert it to the array type of the type parameter:


class MyArrayList<E> implements List<E> {
	private final E[] content;
	MyArrayList(int size) {
		content = new E[size]; // type error
		content = (E[])new Object[size]; // workaround
	}
	...
}

For a safer workaround, pass the Class value of the actual type parameter to the constructor:


content = (E[])java.lang.reflect.Array.newInstance(myClass, size);

Multiple type parameters

A generic type can have more than one type parameter. Type parameters don't change the behavior of covariance and contravariance, and multiple type parameters can occur together, as shown below:


class G<T1, T2> {}
G<? extends SuperType, ? super SubType> reference;
reference = new G<SuperType, SubType>(); // without variance
reference = new G<SubType, SuperType>(); // with co- and contravariance

The generic interface java.util.Map<K, V> is frequently used as an example for multiple type parameters. The interface has two type parameters, one for key and one for value. It's useful to associate objects with keys, for example so that we can more easily find them. A telephone book is an example of a Map object using multiple type parameters: the subscriber's name is the key, the phone number is the value.

The interface's implementation java.util.HashMap has a constructor for converting an arbitrary Map object into an association table:


public HashMap(Map<? extends K,? extends V> m) ...

Because of covariance, the type parameter of the parameter object in this case does not have to correspond with the exact type parameter classes K and V. Instead, it can be adapted through covariance:


Map<CustomerNumber, Customer> customers;
	...
contacts = new HashMap<Id, Person>(customers); // covariant

Here, Id is a supertype of CustomerNumber, and Person is supertype of Customer.

Variance of methods

We've talked about variance of types; now let's turn to a somewhat easier topic.

A method is dependent on the type of its parameters. Along with input parameters, we also consider the method's return type as an output parameter. However, the return type does not belong to the method's signature, only its input (raw) parameter types.

Methods with different names or with a different number of parameters are incompatible. The question of compatibility arises only for methods with the same name and the same number of parameters.

Compatibility of method declarations and definitions

A method call in the body of a class can be compatible or incompatible to method definitions (in classes) and declarations (in abstract classes and interfaces). For a method declaration, the question is the compatibility to other declarations--this information helps you decide if the method is to be overridden or overloaded.

Among Java declarations and definitions, there is no variance concerning signature: if a method overrides another method, the signatures must be same. However, there is covariance concerning result type:


interface SuperType {
	void procedure(SuperType parameter);
	SuperType function();
 }
interface SubType extends SuperType {
	void procedure(SubType parameter); // overloaded: different signature
	@Override
	SubType function(); // overridden: same signature, different result type
}

Abiding Java's strict rules regarding signatures, procedure() is in this case overloaded, not overridden. Adding the annotation @Override for SubType.procedure() would produce an error message in the compiler, because the parameter type of procedure() is not the same for the two interfaces.

Whether a call is compatible to a declaration or a definition is decided on the basis of the signature. There is covariance here: the upward compatibility of the parameters implies the compatibility of the call. If we are concerned with the target object of a call, however, we don't speak about variance but about polymorphism:


SuperType superParameter = ... ;
SubType subParameter = ... ;
superReference.procedure(subParameter); // call is covariant concerning signature
subReference.procedure(superParameter); // polymorph

Concerning signatures, calls to declarations and definitions are covariant. Note that the last ones below are invariant.

Figure 1. Variance of methods concerning signature

Variance in functions

Method declarations and definitions in Java are covariant concerning result types. In contrast to parameters, the result type of a function does not belong to its signature. The result type can be extended upward in the overridden version, although it cannot be otherwise changed.

When it comes to function results in a call, we don't look for variance but, instead, upward compatibility:


SuperType superResult = subReference.function(); // upward compatible
SubType subResult = superReference.function();
	// type error: no downward compatibility

Note, however, that the upward compatibility demonstrated in the above code could also be considered a form of covariance.

Like the function result, access protection (private, public, etc.) and the exception specification (throws) do not belong to the signature. (These could be extended upward if the method was overriden, however.) Appending final prevents further overriding, as shown:


class SuperClass {
	void method() throws SuperException { ... }
}
class SubClass extends SuperClass {
	@Override
	final void method() throws SubException { ... }
}

All in all, method variances are simple: Declarations and definitions of the method signature are invariant to each other; all other elements--result types, exception specifications, and calls to declarations and definitions--are covariant. Table 2 summarizes method variances.

Using variances in lambda expressionsLambda expressions are anonymous methods. Like all anonymous language elements (including classes and objects) they are most suited to one-time usage. Lambdas satisfy the principle of code readability, and thus of software quality. We can summarize that principle as: The closer the definition is to its usage, the better.

With a lambda expression, you define anonymous program elements as they are used; like so:


method(new MyClass()); // anonymous object without reference, for one-time usage
new Interface() { ... };
	// anonymous class implementing the interface, for one-time usage
procedure(parameter -> { ... }); // lambda
	// anonymous method passed as parameter to procedure

The code assumes the following declarations:


@FunctionalInterface
interface Functionalinterface {
	void method(Type parameter);
}
...
void procedure(Functionalinterface functionalinterface) { ... }

Before Java 8, calling procedure() would have looked more complicated:


class Implementation implements Functionalinterface {
	public void method(Type parameter) { ... } }
...
Functionalinterface fi = new Implementation();
procedure(fi);

A somewhat simpler example, implemented with an anonymous class and object:


procedure(new Functionalinterface() {
	public void method(Type parameter) { ... } } );

Prior to Java 8, a callback was implemented like so:

i
Figure 2. Callbacks

In a traditional callback procedure, method a passes method c as parameter while calling b, so that that b calls c back. If you are programming b, you know that you will call a method at the place marked with the red circle in Figure 2, but you don't know which method you'll call.

Lambdas offer a much simpler way to implement callbacks:


procedure(parameter -> { ... }); // lambda

Simplifying calls with lambdas

If you were to program an algorithm for integrating a function, you would need to write it without knowing which function (for instance, sine) was to be integrated. Figure 3 shows an algorithm for calculating integrals of sine.

Wikipedia
Figure 3. Integrals of sine

The user of your integration algorithm would then pass their function (along with the limits a and b) when calling the algorithm. Below, you see the algorithm used to calculate the integral of sine between 0 and π, which is 1.0.


double result = integral((double x) -> Math.sin(x), 0, Math.PI);

In Java, a listener is a good example for a callback:


button.addActionListener( // pre-Java 8 version
	new ActionListener() {
		public void actionPerformed(ActionEvent e) {
			System.out.println("Button pushed"); }});

Say you are programming a class, Button. Your button needs to be programmed to react to a button push, but the action is undefined. Only the user of the class knows what exact action the application requires (in this case it will be System.out.println()).

To develop the generic button, you would create an ActionListener object and pass it to the Button object by calling addActionListener(). You would then implement the ActionListener method actionPerformed(), where the action is programmed. (In a more complex example you could use the ActionEvent parameter as well.) The user method will be "called back" by the Button class when the button is pushed.

Introducing lambdas simplifies this program:


button.addActionListener(
  e -> System.out.println("Button pushed") // lambda expression
);

Using the ready-to-use method reference for System simplifies even further:


button.addActionListener(System.out::println);

Because Math exports a method reference for sin, you can also use this one when calling integral():


result = integral(Math::sin, 0, Math.PI);

Then you must have declared


double integral(Function<Double, Double> function, double a, double b)

where Function is a generic interface with two type parameters.

Lambdas in functional interfaces

You can use lambda expressions after declaring a functional interface. Any interface is acceptable, so long as it declares exactly one (abstract) non-generic method. The compiler checks for these criteria if the interface has been marked by the annotation @FunctionalInterface.

The parameter type of Button.addActionListener(), namely java.awt.event.ActionListener, satisfies this criteria; therefore it can be called with a lambda expression as parameter. The lambda expression represents an anonymous method (the "value of the lambda") with one parameter, e, before the arrow ->, and a method body (a block) after it.

In Java 8, the lambda expression behaves like an object and can be passed as a parameter. The object is considered to be "of a lambda type." In the case of our example, the object is of type ActionListener of the functional interface.

You can also define a reference to a lambda expression:


ActionListener actionListener = e -> System.out.println("Button pushed");

and then use it later in the call:


button.addActionListener(actionListener);

This might be useful if you want to assign the same listener to more than one button.

Another frequent example is Runnable:


class OldRunnable implements Runnable {
  public void run() {
    System.out.println("Old");
  }
};
Runnable old = new OldRunnable();
new Thread(old).start();

This call is much simpler with a lambda expression:


new Thread(() -> System.out.println("New")).start();

A lambda expression has the type of its corresponding functional interface. In the following example, we could specify the types (String) of the two parameters, left and right, but the compiler can also determine types by inference:


Comparator<String> c;
c = (String left, String right) -> left.compareTo(right);
c = (left, right) -> left.compareTo(right); // equivalent

For this reference, we call the lambda value (a method body) as follows:


System.out.println(c.compare("Hallo", "World"));

Simplifying calls is not the only advantage of lambdas. Lambda expressions enable a new programming paradigm in Java, of mixing object-oriented programming with functional-style programming. You can write many algorithms more concisely and more readably using lambdas, and they even help improve program correctness, in many cases.

Covariance of lambda expressions

Lambdas must not be generic; therefore, they leave little room for variances. Oftentimes, it's a challenge for the compiler's inference algorithm to determine the type of a lambda expression; therefore the parameter types of the lambda expression must fit (almost) exactly to the types of the functional interface: just the usual compatibility rules for method parameters apply:


@FunctionalInterface
interface SuperFI {
	void method(SuperType parameter);
}
@FunctionalInterface
interface SubFI {
	void method(SubType parameter);
}
...
void covariantProcedure(SuperFI fi, SuperType superParameter {
	fi.method(superParameter);
}
void contravariantProcedure(SubFI sfi, SuperType subParameter) {
	sfi.method((SubType)subParameter); // run time error: ClassCastException
}
...
covariantProcedure(parameter -> {}, new SubType());
	// SubType is compatible to SuperType
contravariantProcedure(parameter -> {}, new SuperType());

The compiler accepts the unfitting SuperType object in the last row, but the casting in the body of contravariantProcedure() raises a ClassCastException at runtime. So we can say that lambda expressions (just like method calls) are covariant with respect to their signature.

Lambdas also behave like methods with respect to their result type and exception specification; in these cases they are covariant:


@FunctionalInterface
interface SuperInterface {
	SuperType function() throws SuperException;
}
@FunctionalInterface
interface SubInterface {
	SubType function() throws SubException;
}
...
SuperType superFunction(SuperInterface superParameter) throws SuperException {
	return superParameter.function(); // or do something more interesting
}
SubType subFunction(SubInterface subParameter) throws SubException {
	return subParameter.function(); // similarly
}
...
SuperType superVariable = superFunction(() -> new SuperType()); // normal
superVariable = subFunction(() -> new SubType()); // simply compatible
superVariable = superFunction(() -> new SubType()); // covariant
SubType subVariable = (SubType)superFunction(() -> new SuperType()); // error
subVariable = (SubType)superFunction(() -> new SuperType()); // run time error

The castings in the last two lines above cause a ClassCastException. Casting isn't helpful because there is no explicit contravariance. However, lambda expressions (just like methods) are implicitly covariant with respect to the types of their function result and exception specifications (see the third line in the above code).

Here, we've passed a lambda expression of SubType to a parameter of SuperType. This is a special case for lambdas: superFunction's parameter type is SuperInterface; usually the call would take a parameter of SubInterface, which is not its subtype. But lambdas are implicitly covariant, so it works.

Inference is the trick: if we define the two versions of function() with distinguishable signatures--meaning with parameters of different types--and implement the SubInterface traditionally, we must specify function()'s parameter type explicitly. But if a lambda expression has an unspecified parameter type, the compiler finds the one best fitting, and this is flexible:


superVariable = superFunction(parameter -> new SubType());
	// unspecified parameter type: inference changes SubType to SuperType

This flexibility ensures additional covariance: the lambda expression is upwardly compatible, even in a case were the traditional method is not.

Generic lambda expressions

Unlike their methods, functional interfaces may be generic. You saw this in the Comparator example, which is a generic functional interface. The type parameter may also be bounded, just like other generic types. We can also declare a wildcard reference, as shown:


@FunctionalInterface
interface GenericFunctionalInterface<T> {
	void method(T t);
}
...
GenericFunctionalInterface<?> wildcardReference;
wildcardReference = (String string) -> System.out.println(string);
wildcardReference = (Integer integer) -> integer++;

There is no way to use this reference, however. Any attempt to call this method fails:


wildcardReference.method("Wildcard"); // type error
wildcardReference.method(1); // type error
wildcardReference.method(new Object()); // type error

Nothing is compatible to the wildcard: no parameter we try to pass to method() will fit, not even an Object. Unfortunately, not even an upper bound for the wildcard helps:


GenericFunctionalInterface<? extends Type> covariantReference;
covariantReference = (Type parameter) -> System.out.println(parameter);
covariantReference.method(new Type(){}); // type error

The error shows that a lambda expression takes only the exact type of its parameter--there is no flexibility; there is no covariance.

The lower bound seems a bit more useful, because method() can be called with an object of the bound:


GenericFunctionalInterface<? super SubType> contravariantReference;
contravariantReference = (SuperType parameter) -> System.out.println(parameter);
contravariantReference.method(new SubType(){}); // compiles well
contravariantReference.method(new SuperType(){}); // type error

However, a supertype of the bound doesn't work anymore: the parameter must be exactly of the type of the lambda expression. In this case, it would be ?, and there is no such object, s there is no use of ? super.

Because of these rules, we can say that generic lambdas are invariant with respect to their type parameters.

In sum: Lambda expressions (just like method calls) are covariant with respect to signature, result type, and exception specification. Generic lambda types (functional interfaces) are invariant with respect to type parameter.

Conclusion

Covariance and contravariance refer to the compatibility or incompatibility of type-dependent language elements. In Java, array types are implicitly covariant and explicitly contravariant (through type conversion, or casting). Parametrized (generic) types are implicitly invariant, and this does not change with type conversion. Covariance and contravariance can be declared for parametrized types through binding the wildcard, however: we declare covariance through an upper bound (<? extends Bound>) and contravariance through a lower bound (<? super Bound>). Wildcard types are abstract, and can be used only for references.

Method calls are covariant to declarations and definitions concerning signature and result type. Method declarations and definitions are invariant among each other, concerning signature, and covariant concerning result type.

Table 3 summarizes this discussion of type dependency and variances in Java.