From Polymorphism To Polytypism
by Nicolas Wu
Posted on 10 December 2010
This post serves as an introduction to polytypic programming, and draws heavily from Generics for the Masses by Ralf Hinze, which I can heartily recommend if you enjoy this post. While most functional programmers will have heard of polymorphism, polytypic programming is less well known. In polytypic programming the focus is on structural induction to define a collection of types and functions. Ultimately, it means that functions which are similar in structure need only be defined in their base cases and construction, rather than in their specific instances.
Before we look at polytypic programming, let’s look at polymorphic programming. Polymorphic programming allows us to be more expressive with our type system by writing functions which generalise over different datatypes. For instance, the
foldr function is polymorphic over the type of list in question:
Here we have an implicit universal quantification which indicates that the function works for all values of type
b, thus making this function polymorphic.
We can use
foldr to obtain another polymorphic function like the
length function, which works out the length of a list:
While this function is generic in the sense that its definition is solely in terms of the structure of lists, it lacks genericity in the sense that it applies only to lists.
The example of
length can be generalised by writing a specific version that can be used for all the different container datastructures we are interested in. This generalised function, which we’ll call
size, can be defined explicitly for different structures:
Another version of
size can be written for lists:
These definitions could be brought together into a single
size function by making a new class definition,
Sizeable, which provides an interface for a
size function that is instantiated according to the datastructure in question.
But something has been lost along the way: neither of these definitions make use of the foldable properties of their respective container types. To fix this, we can make use of the
Foldable class, defined in
Data.Foldable, which has the following minimal interface:
The key point is that a type is considered to be foldable so long as there is a sensible conversion from that class to a monoidal datatype. A monoidal datatype instantiates the following class:
This allows us to rewrite and unify our
size functions by making the most of the underlying monoidal property of size:
Then we can define size in a different way:
While this example has shown us how
size can be thought of as a specific instance of a monoidal function applied to a foldable datatype, the solution has merely moved the problem of genericity away from the
Sizeable class, and into the
Foldable class, where an instance must still be written for each datastructure of interest: the value of
t cannot be universally quantified, and the result must be monoidal. The problem here is that the implicit structure of the data is not being exploited to its full effect.
With polytypic programming, the aim is to provide a generic definition of a function by using induction on the structure of types. In order to achieve this, we decompose our data constructions into their fundamental building blocks: every new datatype is composed of primitive types (like
Float) which are composed together using elementary types: the unit type, sums, and products.
In this post, the elementary types are represented using the types
Either a b and
(a, b). Together, these types can be used to represent other algebraic types. For example, the usual definition for a
List is as follows:
Roughly speaking, the following equivalence holds:
The isomorphism between these two representations of lists can be made explicit by providing the following two methods which witness that isomorphism:
This notion of isomorphism can be captured by using the following structure:
The representation in terms of elementary types provides us with a means of decomposing the structure of lists, and this allows us to define a polytypic version of
size which follows the structure of its argument. We do this by recursively constructing a function that can operate on any representable datatype. These construction functions are bundled together as instances of the class
The idea here is to construct a generic function
g for each representable type
a. The class
Generic is instantiated by a datatype that corresponds to each generic definition we require, and provides a family of functions, one for each primitive and elementary type, that constructs the generic function type.
All the datatypes which can be represented are members of the
Rep class, so we provide instances for each of the elementary and primitive types that could be encountered, where the responsibility of evaluating each type is delegated to a function given by the
Generic class that builds the generic function.
The tricky part is to provide an instance for each algebraic datatype of interest. To do so, we delegate to a function
datatype which makes use of the isomorphism between the datatype in question and the elementary representation:
This single instance declaration is enough to provide an interface for a multitude of polytypic function definitions. While this is similar to the
Foldable class we saw earlier on, in that an instance must be declared for each type we wish to evaluate, the difference is that
Rep can be used to recursively pick apart a datatype in terms of its isomorphism with its elementary type, and we are not restricted to returning monoidal types.
As mentioned above, the
Generic class is used to provide a collection of functions which construct generic types that can consume each of the primitive and elementary types.
For example, the type
Size a is used to turn an arbitrary type
a into an
Int that represents the size of
The generic instance of
Size demonstrates how the size of each primitive and elementary type can be calculated:
Here we make use of the
The interesting case is the
datatype function which uses the
fromData part of the datatype isomorphism.
Finally, the generic version of
size is defined as follows:
This curious definition brings together the represented datatype, through the explicit type class constraint,
Rep a, as well as the generic function definition, found implicitly in the datatype which
appSize operates on.
Since this post is actually a literate Haskell file we can use it to evaluate, say, the size of a list of characters in ghci:
This article has discussed the differences between polymorphism and polytypism. In the literature polytypic programming has been given several different names:
- Data-generic programming
- Structural polymorphism
- Type parametric programming
- Shape polymorphism
- Intensional polymorphism
Examples of polytypicity in practice can be found in the derivable operators in Haskell. For example, a function such as
(==) is polytypic:
This can be considered polytypic since we can ask the compiler to derive an instance of
Eq for an arbitrary datatype, and this is done based on the structure of the type. While such polytypic functionality exists in Haskell, it can only be accessed by using the
deriving keyword, and there is no way to contribute to the family of classes which can be derived.
Our discussion has detailed one of the many frameworks that can be used to achieve polytypicity. I particularly like this incarnation because it is possible in ordinary Haskell without making use of any extensions. The downside is that the isomorphism between datatypes and their elementary representation must be given explicitly.