Type dependency in Java, Part 1

Covariance and contravariance for array types, generic types, and the wildcard element

chrstphre (CC BY 2.0)

Understanding type compatibility is fundamental to writing good Java programs, but the interplay of variances between Java language elements can seem highly academic to the uninitiated. This article is for software developers ready to tackle the challenge! Part 1 reveals the covariant and contravariant relationships between simpler elements such as array types and generic types, as well as the special Java language element, the wildcard. Part 2 explores type dependency and variance in common API examples and in lambda expressions.

Get the source code for this article, "Type dependency in Java, Part 1." Created for JavaWorld by Dr. Andreas Solymosi.

Concepts and terminology

Before we get into the relationships of covariance and contravariance among various Java language elements, let's be sure that we have a shared conceptual framework.

Compatibility

In object-oriented programming, compatibility refers to a directed relation between types, as shown in Figure 1.

Andreas Solymosi

Figure 1. Type compatibility

We say that two types are compatible in Java if it's possible to transfer data between variables of the types. Data transfer is possible if the compiler accepts it, and is done through assignment or parameter passing. As an example, short is compatible to int because the assignment intVariable = shortVariable; is possible. But boolean is not compatible to int because the assignment intVariable = booleanVariable; is not possible; the compiler won't accept it.

Because compatibility is a directed relation, sometimes T1 is compatible to T2 but T2 is not compatible to T1, or not in the same way. We'll see this further when we get to discussing explicit or implicit compatibility.

What matters is that compatibility among reference types is possible only within a type hierarchy. All class types are compatible to Object, for example, because all classes inherit implicitly from Object. Integer is not compatible to Float, however, because Float is not a superclass of Integer. Integer is compatible to Number, because Number is an (abstract) superclass of Integer. Because they are located in the same type hierarchy, the compiler accepts the assignment numberReference = integerReference;.

We talk about implicit or explicit compatibility, depending on whether compatibility has to be marked explicitly or not. For example, short is implicitly compatible to int (as shown above) but not vice versa: the assignment shortVariable = intVariable; is not possible. However, short is explicitly compatible to int, because the assignment shortVariable = (short)intVariable; is possible. Here we must mark compatibility by casting, also known as type conversion.

Similarly, among reference types: integerReference = numberReference; is not acceptable, only integerReference = (Integer) numberReference; would be accepted. Therefore, Integer is implicitly compatible to Number but Number is only explicitly compatible to Integer.

Dependency

A type might depend on other types. For example, the array type int[] depends on the primitive type int. Similarly, the generic type ArrayList<Customer> is dependent on the type Customer. Methods can also be type dependent, depending on the types of their parameters. For example, the method void increment(Integer i); depends on the type Integer. Some methods (like some generic types) depend on more than one type--such as methods having more than one parameter.

Covariance and contravariance

Covariance and contravariance determine compatibility based on types. In either case, variance is a directed relation. Covariance can be translated as "different in the same direction," or with-different, whereas contravariance means "different in the opposite direction," or against-different. Covariant and contravariant types are not the same, but there is a correlation between them. The names imply the direction of the correlation.

So, covariance means that the compatibility of two types implies the compatibility of the types dependent on them. Given type compatibility, one assumes that dependent types are covariant, as shown in Figure 2.


Figure 2. Covariance

The compatibility of T1 to T2 implies the compatibility of A(T1) to A(T2). The dependent type A(T) is called covariant; or more precisely, A(T1) is covariant to A(T2).

For another example: because the assignment numberArray = integerArray; is possible (in Java, at least), the array types Integer[] and Number[] are covariant. So, we can say that Integer[] is implicitly covariant to Number[]. And while the opposite is not true--the assignment integerArray = numberArray; is not possible--the assignment with type casting (integerArray = (Integer[])numberArray;) is possible; therefore, we say, Number[] is explicitly covariant to Integer[].

To summarize: Integer is implicitly compatible to Number, therefore Integer[] is implicitly covariant to Number[], and Number[] is explicitly covariant to Integer[]. Figure 3 illustrates.


Figure 3. Contravariance

Generally speaking, we can say that array types are covariant in Java. We'll look at examples of covariance among generic types later in the article.

Contravariance

Like covariance, contravariance is a directed relationship. While covariance means with-different, contravariance means against-different. As I previously mentioned, the names express the direction of the correlation. It is also important to note that variance is not an attribute of types generally, but only of dependent types (such as arrays and generic types, and also of methods, which I'll discuss in Part 2).

A dependent type such as A(T) is called contravariant if the compatibility of T1 to T2 implies the compatibility of A(T2) to A(T1). Figure 4 illustrates.


Figure 4. Covariance and contravariance

A language element (type or method) A(T) depending on T is covariant if the compatibility of T1 to T2 implies the compatibility of A(T1) to A(T2). If the compatibility of T1 to T2 implies the compatibility of A(T2) to A(T1), then the type A(T) is contravariant. If the compatibility of T1 to T2 does not imply any compatibility between A(T1) and A(T2), then A(T) is invariant.

Array types in Java are not implicitly contravariant, but they can be explicitly contravariant, just like generic types. I'll offer some examples later in the article.

Type-dependent elements: Methods and types

In Java, methods, array types, and generic (parametrized) types are the type-dependent elements. Methods are dependent on the types of their parameters. An array type, T[], is dependent on the types of its elements, T. A generic type G<T> is dependent on its type parameter, T. Figure 5 illustrates.


Figure 5. Dependent Java language elements

Mostly this article focuses on type compatibility, though I will touch on compatibility among methods toward the end of Part 2.

Implicit and explicit type compatibility

Earlier, you saw the type T1 being implicitly (or explicitly) compatible to T2. This is only true if the assignment of a variable of type T1 to a variable of type T2 is allowed without (or with) tagging. Type casting is the most frequent way to tag explicit compatibility:


variableOfTypeT2 = variableOfTypeT1; // implicit compatible
variableOfTypeT2 = (T2)variableOfTypeT1; // explicit compatible

For example, int is implicitly compatible to long and explicitly compatible to short:


int intVariable = 5;
long longVariable = intVariable; // implicit compatible
short shortVariable = (short)intVariable; // explicit compatible

Implicit and explicit compatibility exist not only in assignments, but also in passing parameters from a method call to a method definition and back. Besides input parameters, this also covers returning a function result, which acts as an output parameter.

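Note that boolean isn't compatible to any other primitive type, and a primitive and a reference type are never compatible, apart from the automatic boxing and unboxing between a primitive type and its wrapper class (int and Integer, for example).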

A (reference) subtype is implicitly compatible to its supertype, and a supertype is explicitly compatible to its subtype. This means that reference types are compatible only within their hierarchy branch--upward implicitly and downward explicitly:


referenceOfSuperType = referenceOfSubType; // implicit compatible
referenceOfSubType = (SubType)referenceOfSuperType; // explicit compatible

The Java compiler typically allows implicit compatibility for an assignment only if there is no danger of losing information at runtime between the different types. (Note, however, that this rule doesn't cover loss of precision, such as in an assignment from int to float.) For example, int is implicitly compatible to long because a long variable can hold every int value. In contrast, a short variable cannot hold every int value; thus, only explicit compatibility is allowed in that direction.


Figure 6. Implicit compatibility of arithmetic types in Java

Note that the implicit compatibility in Figure 6 assumes the relationship is transitive: short is compatible to long.
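The difference between losing information and losing precision can be made visible with a few concrete values. The following is a minimal sketch; the class name NarrowingDemo and its helper methods are made up for this illustration and are not part of the article's source code.

```java
class NarrowingDemo {
    // Widening int -> long is implicit: every int value fits into a long.
    static long widen(int value) {
        return value;
    }

    // Narrowing int -> short must be marked by a cast: only the low 16 bits survive.
    static short narrow(int value) {
        return (short) value;
    }

    // int -> float is implicit although a float has only 24 bits of precision.
    static float toFloat(int value) {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(widen(70_000));              // 70000
        System.out.println(narrow(70_000));             // 4464
        System.out.println((int) toFloat(16_777_217));  // 16777216
    }
}
```

The second line shows why the compiler insists on the cast: the value changes completely. The third line shows the exception to the rule: the conversion is accepted silently even though the last digit is rounded away.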

Similar to what you see in Figure 6, it's always possible to assign a reference of a subtype to a reference of a supertype. Keep in mind that the same assignment in the other direction could throw a ClassCastException, however, so the Java compiler allows it only with type casting.
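As a small sketch of that risk, the following program performs the same explicit downcast twice, once on a Number reference that really refers to an Integer and once on one that refers to a Float. The class DowncastDemo and the method canDowncastToInteger are hypothetical names chosen for this example.

```java
class DowncastDemo {
    // Returns true if the explicit downcast Number -> Integer succeeds at runtime.
    static boolean canDowncastToInteger(Number numberReference) {
        try {
            Integer integerReference = (Integer) numberReference; // explicit compatibility
            return true;
        } catch (ClassCastException e) {
            return false; // the object was not an Integer
        }
    }

    public static void main(String[] args) {
        Number fromInteger = Integer.valueOf(5); // implicit: Integer is compatible to Number
        Number fromFloat = Float.valueOf(1.5f);  // implicit: Float is compatible to Number

        System.out.println(canDowncastToInteger(fromInteger)); // true
        System.out.println(canDowncastToInteger(fromFloat));   // false
    }
}
```

The compiler accepts both casts because Integer and Number are in the same hierarchy branch; only the JVM can decide whether the object behind the reference fits.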

Covariance and contravariance for array types

In Java, some array types are covariant and/or contravariant. In the case of covariance, this means that if T is compatible to U, then T[] is also compatible to U[]. In the case of contravariance, it means that U[] is compatible to T[]. Arrays of primitive types are invariant in Java:


longArray = intArray; // type error
shortArray = (short[])intArray; // type error

Arrays of reference types are implicitly covariant and explicitly contravariant, however:


SuperType[] superArray;
SubType[] subArray;
	...
superArray = subArray; // implicit covariant
subArray = (SubType[])superArray; // explicit contravariant


Figure 7. Implicit covariance for arrays


What this means, practically, is that an assignment to an array component could throw ArrayStoreException at runtime. If an array reference of type SuperType[] references an array object of type SubType[], and one of its components is then assigned a SuperType object, the exception is thrown:


superArray[1] = new SuperType(); // throws ArrayStoreException

This is sometimes called the covariance problem. The true problem is not so much the exception (which could be avoided with programming discipline), but that the virtual machine must check every assignment to an array element at runtime. This puts Java at an efficiency disadvantage against languages without covariance (where a compatible assignment for array references is prohibited) or languages like Scala, where covariance can be switched off.

An example for covariance

In a simple example, the array reference is of type Object[] but the array object and the elements are of different classes:


Object[] objectArray; // array reference
objectArray = new String[3]; // array object; compatible assignment
objectArray[0] = Integer.valueOf(5); // throws ArrayStoreException

Because of covariance, the compiler cannot check the correctness of the last assignment to the array elements--the JVM does this, and at significant expense. However, the compiler can optimize the expense away, if there is no use of type compatibility between array types.


Figure 8. The covariance problem for arrays

Remember that in Java, a reference variable of some type is forbidden to refer to an object of its supertype: the arrows in Figure 8 must not be directed upwards.
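The covariance problem can be reproduced in a few lines. This is only a sketch: the class ArrayStoreDemo and the helper storeFails are names invented here, and the helper catches the exception merely to make the outcome printable.

```java
class ArrayStoreDemo {
    // Tries to store element in slot 0 and reports whether the JVM rejected it.
    static boolean storeFails(Object[] objectArray, Object element) {
        try {
            objectArray[0] = element; // accepted by the compiler, checked by the JVM
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] objectArray = new String[3]; // implicit covariance: String[] to Object[]

        System.out.println(storeFails(objectArray, "text"));             // false
        System.out.println(storeFails(objectArray, Integer.valueOf(5))); // true
    }
}
```

Both calls compile without complaint, because the static type of the array is Object[]. The runtime check is what keeps an Integer out of the String array.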

Variances and wildcards in generic types

Generic (parametrized) types are implicitly invariant in Java, meaning that different instantiations of a generic type are not compatible among each other. Even type casting will not result in compatibility:


Generic<SuperType> superGeneric;
Generic<SubType> subGeneric;
subGeneric = (Generic<SubType>)superGeneric; // type error
superGeneric = (Generic<SuperType>)subGeneric; // type error

The type errors arise even though subGeneric.getClass() == superGeneric.getClass(). The problem is that the method getClass() determines the raw type--this is why a type parameter does not belong to the signature of a method. Thus, the two method declarations


void method(Generic<SuperType> p); 
void method(Generic<SubType> p); 

must not occur together in an interface (or abstract class) definition.
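To see that the type parameter is not visible at runtime, one can compare the classes of two different instantiations. The sketch below uses ArrayList from the standard library in place of the article's Generic; the class ErasureDemo and the method sameRuntimeClass are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Compares the runtime classes of two arbitrary objects.
    static boolean sameRuntimeClass(Object first, Object second) {
        return first.getClass() == second.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();

        // Both objects share the single runtime class java.util.ArrayList.
        System.out.println(sameRuntimeClass(strings, integers)); // true

        // The compiler nevertheless treats the two instantiations as unrelated:
        // List<Object> objects = strings; // would not compile
    }
}
```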

Although generic types in Java are implicitly invariant, some variables can be used covariantly. These must be defined with a wildcard (?), which can be used as an actual type parameter. Generic<?> is the abstract supertype of all instantiations of the generic type, so all the instantiations of Generic are compatible to Generic<?>, as shown here:


Generic<?> wildcardReference; 
wildcardReference = new Generic<String>(); // implicitly compatible 
wildcardReference = new Generic<Integer>(); // implicitly compatible 

Because the wildcard type is abstract, it can be used only for references, and not for objects: new Generic<?>() would be rejected by the compiler.

An example for the usage of the wildcard is the parameter of a method manipulating a collection or an array, independently of its element type. Covariance makes it easy to write such a method for arrays:


static void swap(Object[] array, int i, int j) {
	... // swaps the elements with index i and j
}

The method can be called for an arbitrary array because every type is compatible to Object, and due to the covariance of array types:


Integer[] integerArray = {1, 2, 3};
swap(integerArray, 0, 2); // Integer[] is compatible to Object[]

The generic version of this method is more type safe because calling it does not use compatibility but generic instantiation, which can be checked by the compiler:


static <T> void swap(T[] array, int i, int j) { ... } 

Note that the two swap definitions may not occur together in the same class because they don't have a distinguishable signature.

Such a solution wouldn't work for ArrayList because of the rule of invariance for generic types. Because the wildcard enables covariance, it can be used as a workaround:


static void swap(List<?> list, int i, int j) { ... } // similarly 

Calling the wildcard version is possible with an arbitrary element type:


List<Integer> list = ...;
swap(list, 0, 2); // List<Integer> is compatible to List<?>

We call such compatibility unspecified covariance, because we have not specified which type or supertype makes the covariance possible. It's possible for compatibility based on unspecified covariance to occur on two levels at once: the level of the generic types (ArrayList to List) and the level of the elements (Integer to ?):


ArrayList<Integer> arrayList = ... ; // ArrayList<T> implements List<T> 
swap(arrayList, 0, 2); // ArrayList<Integer> is compatible to List<?> 
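The body of the wildcard version of swap is elided above. One possible way to fill it in, shown here only as a sketch, is to delegate to a private generic helper so that the compiler can give the unknown element type a name. The class WildcardSwapDemo and the helper swapCaptured are assumptions of this example, not the article's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WildcardSwapDemo {
    // Public signature with the unbounded wildcard: accepts a list of any element type.
    static void swap(List<?> list, int i, int j) {
        swapCaptured(list, i, j); // the compiler captures ? as some type T
    }

    // Inside the helper the element type has a name, so elements can be written back.
    private static <T> void swapCaptured(List<T> list, int i, int j) {
        T temp = list.get(i);
        list.set(i, list.get(j));
        list.set(j, temp);
    }

    public static void main(String[] args) {
        ArrayList<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        swap(arrayList, 0, 2); // ArrayList<Integer> is compatible to List<?>
        System.out.println(arrayList); // [3, 2, 1]
    }
}
```

The helper is needed because nothing can be written directly through a List<?> reference; the element type is unknown at that point.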

Explicit covariance for generic types

The light covariance shown in the previous section can be generalized. If Generic<SubType> were compatible to Generic<SuperType>, the types would be implicitly covariant. Binding the wildcard with extends would make the types explicitly covariant: Generic<SubType> is compatible to Generic<? extends SuperType>. We could declare a reference of this bounded wildcard type (but not of objects because all wildcard types are abstract):


Generic<? extends SuperType> covariantReference; 

This reference may refer to any instance of Generic with an actual type parameter of a subtype of SuperType:


covariantReference = new Generic<SuperType>(); // normal 
covariantReference = new Generic<SubType>(); // covariant 
covariantReference = new Generic<Object>(); // type error: only subtypes work 

This means that the type Generic<? extends SuperType> is the abstract supertype of all instantiations of Generic with a subtype of SuperType (just like Generic<?> is the abstract supertype of all instantiations of Generic with any type, as explained above).

Using wildcards for binding is necessary in cases where certain properties are expected of the type parameter--for example if the elements of the parameter collection are to be manipulated:


static void increment(Number[] array) { ... } // for every element + 1
static void increment(Collection<? extends Number> collection) { ... }
	// similarly

The first method may be called with any Number[] parameter as a consequence of covariance between arrays:


increment(integerArray); // Integer[]  is compatible to Number[]
 

The second method may be called with any ArrayList parameter with a Number instantiation as a consequence of covariance with the wildcard:


increment(integerArrayList); // ArrayList<Integer> is compatible to Collection<? extends Number>

Note that in the above code snippet, we're using compatibility on two levels again: the parametrized ArrayList<T> is a subtype of Collection<T> and Integer is a subtype of Number.
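What the upper bound buys us is that every element can be treated as a Number. The sketch below illustrates this with a read-only method, sum, which stands in for the elided increment bodies above; the class BoundedWildcardDemo and the method sum are hypothetical and chosen because reading is all that is needed here.

```java
import java.util.Arrays;
import java.util.Collection;

class BoundedWildcardDemo {
    // Accepts any collection whose element type is Number or a subtype of it.
    static double sum(Collection<? extends Number> collection) {
        double total = 0.0;
        for (Number element : collection) { // reading through the upper bound
            total += element.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));  // 6.0
        System.out.println(sum(Arrays.asList(1.5, 2.5))); // 4.0
    }
}
```

The same method serves a List<Integer> and a List<Double>; with the parameter type Collection<Number>, neither call would compile.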

Covariant accessing variables of a type parameter

You've seen in the previous few examples how the bounded wildcard can be used for explicit covariance among generic types. While this covariance works on variables of the type parameter, it doesn't always work on method parameters in a generic class.

Assume that in the class Generic we use the type parameter T as the type of the (input and output) parameters of methods:


class Generic<T> {
	T data;
	void write(T data) { this.data = data; } // T is input parameter type
	T read() { return data; } // T is output parameter type
}

In this case, the method write() cannot be called directly for a wildcardReference (it can be called only after type casting):


wildcardReference.write(new Object()); // type error
((Generic<Object>)wildcardReference).write(new Object()); // OK
	// however, warning by the compiler: unchecked cast

Because the wildcard is not a type, no type is compatible to it, not even Object. But the wildcard itself is compatible to Object (and to no other type); so the type parameter result of a function can be referred to by an Object reference:


Object object = wildcardReference.read(); 

The same rules apply to bounded wildcard types: input parameters cannot be passed directly; they can be passed only after casting. The output parameters can be assigned to a variable of the type (or a supertype) of the bound:


covariantReference.write(new SuperType()); // type error
covariantReference.write(new SubType()); // type error
((Generic<SuperType>)covariantReference).write(new SuperType()); // OK
((Generic<SuperType>)covariantReference).write(new SubType()); // OK
((Generic<SubType>)covariantReference).write(new SubType()); // OK
object = covariantReference.read(); // OK
SuperType superReference = covariantReference.read(); // OK
SubType subReference1 = covariantReference.read(); // type error
SubType subReference2 = ((Generic<SubType>)covariantReference).read(); // OK
SubType subReference3 = (SubType)covariantReference.read(); // unsafe

The type conversions in the last two program lines can throw a ClassCastException (as always), hence they are unsafe. Whether it happens or not depends on the type of the object in the last write(): if it is new SubType() (as in the sequence above), no exception will be thrown. The difference between the last two lines is that in the first one the reference is being converted (from Generic<? extends SuperType> to Generic<SubType>) and then read() is called, while in the second one the result of read() will be converted (from ? to SubType), so the first one is (somewhat) safer.
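The dependence on the last written object can be tried out directly. The following sketch repeats the article's Generic class and declares empty SuperType and SubType classes for the purpose; the wrapper class CovariantReadDemo and the method readAsSubType are hypothetical names, and the exception is caught only to make the result printable.

```java
class CovariantReadDemo {
    static class SuperType { }
    static class SubType extends SuperType { }

    static class Generic<T> {
        T data;
        void write(T data) { this.data = data; }
        T read() { return data; }
    }

    // Converts the result of read() to SubType and reports whether that succeeded.
    static boolean readAsSubType(Generic<? extends SuperType> covariantReference) {
        try {
            SubType subReference = (SubType) covariantReference.read(); // unsafe
            return true;
        } catch (ClassCastException e) {
            return false; // the stored object was not a SubType
        }
    }

    public static void main(String[] args) {
        Generic<SubType> subGeneric = new Generic<>();
        subGeneric.write(new SubType());

        Generic<SuperType> superGeneric = new Generic<>();
        superGeneric.write(new SuperType());

        System.out.println(readAsSubType(subGeneric));   // true
        System.out.println(readAsSubType(superGeneric)); // false
    }
}
```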

We can interpret this to mean that an unbounded wildcard is bounded by Object: Generic<?> has (almost) the same effect as Generic<? extends Object>. Thus, we can say that the unspecified covariance is an explicit covariance through the compatibility to Object.

These rules are valid not only for parameters but for every reading or writing access to variables of the type of the type parameter (assuming they are public or otherwise accessible). No type is compatible to ? but ? is compatible to the upper bound (as the case may be, to Object--but then to no other type):


wildcardReference.data = new Object(); // writing access -> type error
object = wildcardReference.data; // reading access is OK
wildcardReference.write(new Object()); // writing access -> type error
object = wildcardReference.read(); // reading access is OK
String string = wildcardReference.data; // error: ? is compatible only to Object
Generic<? extends String> covariant = new Generic<String>();
String s = covariant.data; // reading is possible with upper bound

Contravariance for parametrized types

Recall that contravariance means downwards compatibility. Arrays are explicitly contravariant; syntactically this can be expressed through type conversion (via casting, see above):


subArray = (SubType[])superArray; // explicit compatible (contravariant) 

Among different instantiations of a generic type, the compiler rejects type conversion. But generic types are explicitly contravariant, too. Syntactically this can be expressed through a lower bound of the wildcard with super:


Generic<? super SubType> contravariantReference; 

This variable can refer to any instantiation of Generic with any supertype (e.g., Object) of SubType:


contravariantReference = new Generic<SubType>(); // normal 
contravariantReference = new Generic<SuperType>(); // contravariant 
contravariantReference = new Generic<Object>(); // always possible 

Here, the assignment of the SuperType instantiation takes place downwards, namely to the SubType instantiation contravariantReference--this means contravariance.

A bound changes which read and write accesses the compiler allows (as discussed for the upper bound in the previous section). With Object as the lower bound, both are possible:


Generic<? super Object> contravariantO = new Generic<Object>(); // like above
contravariantO.data = new Object(); // now is OK: writing is possible
object = contravariantO.data; // also reading
contravariantO.write(new Object()); // now is OK
object = contravariantO.read(); // ? is compatible to Object

The contravariance (? super) with other types (like String) reverses the direction for reading and writing compared with covariance (? extends):


Generic<? super String> contravariantS = new Generic<String>();
contravariantS.data = new String(); // OK
string = contravariantS.data; // type error: contrary to covariant case
contravariantS.write(new String()); // OK
string = contravariantS.read(); // type error

The reason is that the lower bound (here String) is compatible to the wildcard but not vice versa. This is the difference between contravariantO.read() (OK) and contravariantS.read() (error): ? is compatible to Object but not to String.
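Both directions can be combined in one method signature: a covariant parameter to read from and a contravariant parameter to write to. The sketch below is an illustration under that assumption; the class CopyDemo and the method copy are made-up names and do not come from the article's source code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyDemo {
    // source is used covariantly (read only), target contravariantly (write only).
    static <T> void copy(List<? extends T> source, List<? super T> target) {
        for (T element : source) { // ? extends T is compatible to T
            target.add(element);   // T is compatible to ? super T
        }
    }

    public static void main(String[] args) {
        List<Integer> integers = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();

        copy(integers, numbers); // List<Number> is compatible to List<? super Integer>
        copy(integers, objects); // so is List<Object>

        System.out.println(numbers); // [1, 2, 3]
        System.out.println(objects); // [1, 2, 3]
    }
}
```

Reading an element back out of target inside copy would only yield an Object, for the same reason that contravariantS.read() cannot be assigned to a String.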

Conclusion to Part 1

Java types may be implicitly or explicitly compatible to each other. Implicit compatibility is asserted by the compiler, whereas explicit compatibility must be asserted by the programmer, in order to avoid exceptions at runtime. Dependent (array and generic) types may also be compatible: upward compatibility (called covariance) is mostly implicit, while downward compatibility (contravariance) is explicit for array types.

Java doesn't allow implicit variance for generic types, because doing so would threaten type safety. The wildcard is a compromise, allowing (implicit) unspecified covariance. Covariance and contravariance for generic types can only be explicit: the developer must define the upper and lower limits, as shown in Table 1.

If you've ever wondered about the many question marks (wildcards) and boxed genericity found in more recent versions of the Java standard libraries, Part 2 of this article should help. We'll look at contravariance in several API examples, and I'll also explain why the compiler sometimes rejects accessing variables of a generic type. You'll learn how to create objects of a generic type, as well as how the idea of variance can be transferred to method declarations, definitions, and calls. We'll conclude with a quick look into compatibility and variance in lambdas--including generic lambda expressions, which could be of interest to programming language enthusiasts.