Covariance and contravariance (computer science)
Many programming language type systems support subtyping. For instance, if the type Variance is how subtyping between more complex types relates to subtyping between their components. For example, how should a list of Depending on the variance of the type constructor, the subtyping relation of the simple types may be either preserved, reversed, or ignored for the respective complex types. In the OCaml programming language, for example, "list of Cat" is a subtype of "list of Animal" because the list type constructor is covariant. This means that the subtyping relation of the simple types is preserved for the complex types. On the other hand, "function from Animal to String" is a subtype of "function from Cat to String" because the function type constructor is contravariant in the parameter type. Here, the subtyping relation of the simple types is reversed for the complex types. A programming language designer will consider variance when devising typing rules for language features such as arrays, inheritance, and generic datatypes. By making type constructors covariant or contravariant instead of invariant, more programs will be accepted as well-typed. On the other hand, programmers often find contravariance unintuitive, and accurately tracking variance to avoid runtime type errors can lead to complex typing rules. In order to keep the type system simple and allow useful programs, a language may treat a type constructor as invariant even if it would be safe to consider it variant, or treat it as covariant even though that could violate type safety. Formal definition
Suppose
The article considers how this applies to some common type constructors. C# examplesFor example, in C#, if
The variance of a C# generic interface is declared by placing the The typing rules for interface variance ensure type safety. For example, an ArraysRead-only data types (sources) can be covariant; write-only data types (sinks) can be contravariant. Mutable data types which act as both sources and sinks should be invariant. To illustrate this general phenomenon, consider the array type. For the type We have the option to treat this as either:
If we wish to avoid type errors, then only the third choice is safe. Clearly, not every Conversely, a Covariant arrays in Java and C#Early versions of Java and C# did not include generics, also termed parametric polymorphism. In such a setting, making arrays invariant rules out useful polymorphic programs. For example, consider writing a function to shuffle an array, or a function that tests two arrays for equality using the boolean equalArrays(Object[] a1, Object[] a2);
void shuffleArray(Object[] a);
However, if array types were treated as invariant, it would only be possible to call these functions on an array of exactly the type Therefore, both Java and C# treat array types covariantly.
For instance, in Java As discussed above, covariant arrays lead to problems with writes into the array. Java[4]: 126 and C# deal with this by marking each array object with a type when it is created. Each time a value is stored into an array, the execution environment will check that the run-time type of the value is equal to the run-time type of the array. If there is a mismatch, an // a is a single-element array of String
String[] a = new String[1];
// b is an array of Object
Object[] b = a;
// Assign an Integer to b. This would be possible if b really were
// an array of Object, but since it really is an array of String,
// we will get a java.lang.ArrayStoreException.
b[0] = 1;
In the above example, one can read from the array (b) safely. It is only trying to write to the array that can lead to trouble. One drawback to this approach is that it leaves the possibility of a run-time error that a stricter type system could have caught at compile-time. Also, it hurts performance because each write into an array requires an additional run-time check. With the addition of generics, Java[4]: 126–129 and C# now offer ways to write this kind of polymorphic function without relying on covariance. The array comparison and shuffling functions can be given the parameterized types <T> boolean equalArrays(T[] a1, T[] a2);
<T> void shuffleArray(T[] a);
Alternatively, to enforce that a C# method accesses a collection in a read-only way, one can use the interface Function typesLanguages with first-class functions have function types like "a function expecting a Cat and returning an Animal" (written Those languages also need to specify when one function type is a subtype of another—that is, when it is safe to use a function of one type in a context that expects a function of a different type.
It is safe to substitute a function f for a function g if f accepts a more general type of argument and returns a more specific type than g. For example, functions of type
Using inference rule notation the same rule can be written as: In other words, the → type constructor is contravariant in the parameter (input) type and covariant in the return (output) type. This rule was first stated formally by John C. Reynolds,[5] and further popularized in a paper by Luca Cardelli.[6] When dealing with functions that take functions as arguments, this rule can be applied several times. For example, by applying the rule twice, we see that if . In other words, the type is covariant in the position of . For complicated types it can be confusing to mentally trace why a given type specialization is or isn't type-safe, but it is easy to calculate which positions are co- and contravariant: a position is covariant if it is on the left side of an even number of arrows applying to it. Inheritance in object-oriented languagesWhen a subclass overrides a method in a superclass, the compiler must check that the overriding method has the right type. While some languages require that the type exactly matches the type in the superclass (invariance), it is also type safe to allow the overriding method to have a "better" type. By the usual subtyping rule for function types, this means that the overriding method should return a more specific type (return type covariance) and accept a more general argument (parameter type contravariance). In UML notation, the possibilities are as follows (where Class B is the subclass that extends Class A which is the superclass):
For a concrete example, suppose we are writing a class to model an animal shelter. We assume that class AnimalShelter {
Animal getAnimalForAdoption() {
// ...
}
void putAnimal(Animal animal) {
//...
}
}
Now the question is: if we subclass Covariant method return typeIn a language which allows covariant return types, a derived class can override the class CatShelter extends AnimalShelter {
Cat getAnimalForAdoption() {
return new Cat();
}
}
Among mainstream OO languages, Java, C++ and C# (as of version 9.0 [7]) support covariant return types. Adding the covariant return type was one of the first modifications of the C++ language approved by the standards committee in 1998.[8] Scala and D also support covariant return types. Contravariant method parameter typeSimilarly, it is type safe to allow an overriding method to accept a more general argument than the method in the base class: class CatShelter extends AnimalShelter {
void putAnimal(Object animal) {
// ...
}
}
Only a few object-oriented languages actually allow this (for example, Python when typechecked with mypy). C++, Java and most other languages that support overloading and/or shadowing would interpret this as a method with an overloaded or shadowed name. However, Sather supported both covariance and contravariance. Calling convention for overridden methods are covariant with out parameters and return values, and contravariant with normal parameters (with the mode in). Covariant method parameter typeA couple of mainstream languages, Eiffel and Dart[9] allow the parameters of an overriding method to have a more specific type than the method in the superclass (parameter type covariance). Thus, the following Dart code would type check, with class CatShelter extends AnimalShelter {
void putAnimal(covariant Cat animal) {
// ...
}
}
This is not type safe. By up-casting a Despite the type safety problem, the Eiffel designers consider covariant parameter types crucial for modeling real world requirements.[11] The cat shelter illustrates a common phenomenon: it is a kind of animal shelter but has additional restrictions, and it seems reasonable to use inheritance and restricted parameter types to model this. In proposing this use of inheritance, the Eiffel designers reject the Liskov substitution principle, which states that objects of subclasses should always be less restricted than objects of their superclass. One other instance of a mainstream language allowing covariance in method parameters is PHP in regards to class constructors. In the following example, the __construct() method is accepted, despite the method parameter being covariant to the parent's method parameter. Were this method anything other than __construct(), an error would occur: interface AnimalInterface {}
interface DogInterface extends AnimalInterface {}
class Dog implements DogInterface {}
class Pet
{
public function __construct(AnimalInterface $animal) {}
}
class PetDog extends Pet
{
public function __construct(DogInterface $dog)
{
parent::__construct($dog);
}
}
Another example where covariant parameters seem helpful is so-called binary methods, i.e. methods where the parameter is expected to be of the same type as the object the method is called on. An example is the In older versions of Java, the comparison method was specified as an interface interface Comparable {
int compareTo(Object o);
}
The drawback of this is that the method is specified to take an argument of type class RationalNumber implements Comparable {
int numerator;
int denominator;
// ...
public int compareTo(Object other) {
RationalNumber otherNum = (RationalNumber)other;
return Integer.compare(numerator * otherNum.denominator,
otherNum.numerator * denominator);
}
}
In a language with covariant parameters, the argument to Avoiding the need for covariant parameter typesOther language features can provide the apparent benefits of covariant parameters while preserving Liskov substitutability. In a language with generics (a.k.a. parametric polymorphism) and bounded quantification, the previous examples can be written in a type-safe way.[12] Instead of defining class Shelter<T extends Animal> {
T getAnimalForAdoption() {
// ...
}
void putAnimal(T animal) {
// ...
}
}
class CatShelter extends Shelter<Cat> {
Cat getAnimalForAdoption() {
// ...
}
void putAnimal(Cat animal) {
// ...
}
}
Similarly, in recent versions of Java the class RationalNumber implements Comparable<RationalNumber> {
int numerator;
int denominator;
// ...
public int compareTo(RationalNumber otherNum) {
return Integer.compare(numerator * otherNum.denominator,
otherNum.numerator * denominator);
}
}
Another language feature that can help is multiple dispatch. One reason that binary methods are awkward to write is that in a call like Giuseppe Castagna[13] observed that in a typed language with multiple dispatch, a generic function can have some parameters which control dispatch and some "left-over" parameters which do not. Because the method selection rule chooses the most specific applicable method, if a method overrides another method, then the overriding method will have more specific types for the controlling parameters. On the other hand, to ensure type safety the language still must require the left-over parameters to be at least as general. Using the previous terminology, types used for runtime method selection are covariant while types not used for runtime method selection of the method are contravariant. Conventional single-dispatch languages like Java also obey this rule: only one argument is used for method selection (the receiver object, passed along to a method as the hidden argument Castagna suggests that examples where covariant parameter types are superior (particularly, binary methods) should be handled using multiple dispatch; which is naturally covariant. However, most programming languages do not support multiple dispatch. Summary of variance and inheritanceThe following table summarizes the rules for overriding methods in the languages discussed above.
Generic typesIn programming languages that support generics (a.k.a. parametric polymorphism), the programmer can extend the type system with new constructors. For example, a C# interface like There are two main approaches. In languages with declaration-site variance annotations (e.g., C#), the programmer annotates the definition of a generic type with the intended variance of its type parameters. With use-site variance annotations (e.g., Java), the programmer instead annotates the places where a generic type is instantiated. Declaration-site variance annotationsThe most popular languages with declaration-site variance annotations are C# and Kotlin (using the keywords InterfacesIn C#, each type parameter of a generic interface can be marked covariant ( interface IEnumerator<out T>
{
T Current { get; }
bool MoveNext();
}
With this declaration, The type checker enforces that each method declaration in an interface only mentions the type parameters in a way consistent with the
As an example of how these rules apply, consider the interface IList<T>
{
void Insert(int index, T item);
IEnumerator<T> GetEnumerator();
}
The parameter type In the common case of a generic data structure such as DataC# allows variance annotations on the parameters of interfaces, but not the parameters of classes. Because fields in C# classes are always mutable, variantly parameterized classes in C# would not be very useful. But languages which emphasize immutable data can make good use of covariant data types. For example, in all of Scala, Kotlin and OCaml the immutable list type is covariant: Scala's rules for checking variance annotations are essentially the same as C#'s. However, there are some idioms that apply to immutable datastructures in particular. They are illustrated by the following (excerpt from the) definition of the sealed abstract class List[+A] extends AbstractSeq[A] {
def head: A
def tail: List[A]
/** Adds an element at the beginning of this list. */
def ::[B >: A] (x: B): List[B] =
new scala.collection.immutable.::(x, this)
/** ... */
}
First, class members that have a variant type must be immutable. Here, Second, even if a data structure is immutable, it will often have methods where the parameter type occurs contravariantly. For example, consider the method def :: (x: A): List[A]
However, this would be a type error, because the covariant parameter Inferring varianceIt is possible to design a type system where the compiler automatically infers the best possible variance annotations for all datatype parameters.[16] However, the analysis can get complex for several reasons. First, the analysis is nonlocal since the variance of an interface For these reasons[17] most languages do very little variance inference. C# and Scala do not infer any variance annotations at all. OCaml can infer the variance of parameterized concrete datatypes, but the programmer must explicitly specify the variance of abstract types (interfaces). For example, consider an OCaml datatype type ('a, 'b) t = T of ('a -> 'b)
The compiler will automatically infer that type (-'a, +'b) t = T of ('a -> 'b)
Explicit annotations in OCaml become useful when specifying interfaces. For example, the standard library interface module type S =
sig
type key
type (+'a) t
val empty: 'a t
val mem: key -> 'a t -> bool
...
end
This ensures that e.g. Use-site variance annotations (wildcards)One drawback of the declaration-site approach is that many interface types must be made invariant. For example, we saw above that Use-site variance means the desired variance is indicated with an annotation at the specific site in the code where the type will be used. This gives users of a class more opportunities for subtyping without requiring the designer of the class to define multiple interfaces with different variance. Instead, at the point a generic type is instantiated to an actual parameterized type, the programmer can indicate that only a subset of its methods will be used. In effect, each definition of a generic class also makes available interfaces for the covariant and contravariant parts of that class. Java provides use-site variance annotations through wildcards, a restricted form of bounded existential types. A parameterized type can be instantiated by a wildcard Animal a = l.get(3);
because the type l.add(new Animal());
will be rejected as a type error since an While non-wildcard parameterized types in Java are invariant (e.g. there is no subtyping relationship between By applying two of the above three forms of subtyping, it becomes possible to, for example, pass an argument of type In the common case of a generic data structure Wildcards are flexible, but there is a drawback. While use-site variance means that API designers need not consider variance of type parameters to interfaces, they must often instead use more complicated method signatures. A common example involves the <T extends Comparable<T>> T max(Collection<T> coll);
However, this type is not general enough—one can find the max of a <T extends Comparable<? super T>> T max(Collection<T> coll);
The bounded wildcard The <T extends Comparable<? super T>> T max(Collection<? extends T> coll);
Comparing declaration-site and use-site annotationsUse-site variance annotations provide additional flexibility, allowing more programs to type check. However, they have been criticized for the complexity they add to the language, leading to complicated type signatures and error messages. One way to assess whether the extra flexibility is useful is to see if it is used in existing programs. A survey of a large set of Java libraries[16] found that 39% of wildcard annotations could have been directly replaced by declaration-site annotations. Thus the remaining 61% is an indication of places where Java benefits from having the use-site system available. In a declaration-site language, libraries must either expose less variance, or define more interfaces. For example, the Scala Collections library defines three separate interfaces for classes which employ covariance: a covariant base interface containing common methods, an invariant mutable version which adds side-effecting methods, and a covariant immutable version which may specialize the inherited implementations to exploit structural sharing.[19] This design works well with declaration-site annotations, but the large number of interfaces carry a complexity cost for clients of the library. And modifying the library interface may not be an option—in particular, one goal when adding generics to Java was to maintain binary backwards compatibility. On the other hand, Java wildcards are themselves complex. In a conference presentation[20] Joshua Bloch criticized them as being too hard to understand and use, stating that when adding support for closures "we simply cannot afford another wildcards". Early versions of Scala used use-site variance annotations but programmers found them difficult to use in practice, while declaration-site annotations were found to be very helpful when designing classes.[21] Later versions of Scala added Java-style existential types and wildcards; however, according to Martin Odersky, if there were no need for interoperability with Java then these would probably not have been included.[22] Ross Tate argues[23] that part of the complexity of Java wildcards is due to the decision to encode use-site variance using a form of existential types. The original proposals[24][25] used special-purpose syntax for variance annotations, writing Since wildcards are a form of existential types they can be used for more things than just variance. A type like However, type inference for existential types is a difficult problem. For the compiler implementer, Java wildcards raise issues with type checker termination, type argument inference, and ambiguous programs.[27] In general it is undecidable whether a Java program using generics is well-typed or not,[28] so any type checker will have to go into an infinite loop or time out for some programs. For the programmer, it leads to complicated type error messages. Java type checks wildcard types by replacing the wildcards with fresh type variables (so-called capture conversion). This can make error messages harder to read, because they refer to type variables that the programmer did not directly write. For example, trying to add a method List.add (capture#1) is not applicable (actual argument Cat cannot be converted to capture#1 by method invocation conversion) where capture#1 is a fresh type-variable: capture#1 extends Animal from capture of ? extends Animal Since both declaration-site and use-site annotations can be useful, some type systems provide both.[16][23] EtymologyThese terms come from the notion of covariant and contravariant functors in category theory. Consider the category whose objects are types and whose morphisms represent the subtype relationship ≤. (This is an example of how any partially ordered set can be considered as a category.) Then for example the function type constructor takes two types p and r and creates a new type p → r; so it takes objects in to objects in . By the subtyping rule for function types this operation reverses ≤ for the first parameter and preserves it for the second, so it is a contravariant functor in the first parameter and a covariant functor in the second. See also
References
External links
|