From Polymorphism To Polytypism
by Nicolas Wu
Posted on 10 December 2010
Tags: Haskell
This post serves as an introduction to polytypic programming, and draws heavily from Generics for the Masses by Ralf Hinze, which I can heartily recommend if you enjoy this post. While most functional programmers will have heard of polymorphism, polytypic programming is less well known. In polytypic programming the focus is on structural induction to define a collection of types and functions. Ultimately, it means that functions which are similar in structure need only be defined in their base cases and construction, rather than in their specific instances.
Polymorphism
Before we look at polytypic programming, let’s look at polymorphic programming.
Polymorphic programming allows us to be more expressive with our type system
by writing functions which generalise over different datatypes.
For instance, the foldr
function is polymorphic over the type of list in
question:
foldr :: (a -> b -> b) -> b -> [a] -> b
Here we have an implicit universal quantification which indicates that the
function works for all values of type a
and b
,
thus making this function polymorphic.
We can use foldr
to obtain another polymorphic
function like the length
function, which works out the
length of a list:
length :: [a] -> Int
length = foldr (const succ) 0
While this function is generic in the sense that its definition is solely in terms of the structure of lists, it lacks genericity in the sense that it applies only to lists.
Foldability
The example of length
can be generalised by writing a specific
version that can be used for all the different container datastructures
we are interested in.
This generalised function, which we’ll call size
,
can be defined explicitly for different structures:
size_Maybe :: Maybe a -> Int
Nothing = 0
size_Maybe Just _) = 1 size_Maybe (
Another version of size
can be written for lists:
size_List :: [a] -> Int
= 0
size_List [] :xs) = 1 + size_List xs size_List (_
These definitions could be brought together into a single size
function by
making a new class definition, Sizeable
, which provides an interface for a
size
function that is instantiated according to the datastructure in question.
But something has been lost along the way: neither of these definitions make
use of the foldable properties of their respective container types.
To fix this, we can make use of the Foldable
class, defined in
Data.Foldable
, which has the following minimal interface:
class Foldable t where
foldMap :: Monoid m => (a -> m) -> t a -> m
The key point is that a type is considered to be foldable so long as there is a sensible conversion from that class to a monoidal datatype. A monoidal datatype instantiates the following class:
class Monoid m where
mempty :: m
mappend :: m -> m -> m
This allows us to rewrite and unify our size
functions
by making the most of the underlying monoidal property of size:
data Measure = Measure { measure :: Int }
instance Monoid Measure where
mempty = Measure 0
Measure x) `mappend` (Measure y) = Measure (x + y) (
Then we can define size in a different way:
size_Foldable :: Foldable t => t a -> Int
= measure . foldMap (const (Measure 1)) size_Foldable
While this example has shown us how size
can be thought of as a specific
instance of a monoidal function applied to a foldable datatype, the
solution has merely moved the problem of genericity away
from the Sizeable
class, and into the Foldable
class,
where an instance must still be written for each datastructure of interest:
the value of t
cannot be universally quantified,
and the result must be monoidal.
The problem here is that the implicit structure of the data is not
being exploited to its full effect.
Polytypism
With polytypic programming, the aim is to provide a generic definition of a
function by using induction on the structure of types.
In order to achieve this, we decompose our data constructions into
their fundamental building blocks: every new datatype is composed
of primitive types (like Int
, Char
, or Float
) which
are composed together using elementary types: the unit type, sums, and
products.
Representation
In this post, the elementary types are represented using the types ()
,
Either a b
and (a, b)
.
Together, these types can be used to represent other algebraic types.
For example, the usual definition for a List
is as follows:
data List a = Nil | Cons a (List a)
Roughly speaking, the following equivalence holds:
Nil | Cons a (List a) == Either () (a, List a)
The isomorphism between these two representations of lists can be made explicit by providing the following two methods which witness that isomorphism:
fromList :: [a] -> Either () (a, [a])
= Left ()
fromList [] :xs) = Right (x, xs) fromList (x
toList :: Either () (a, [a]) -> [a]
Left ()) = []
toList (Right (x, xs)) = (x:xs) toList (
This notion of isomorphism can be captured by using the following structure:
data Iso a b = Iso { fromData :: b -> a, toData :: a -> b }
The representation in terms of elementary types provides us with a means of
decomposing the structure of lists, and this allows us to define
a polytypic version of size
which follows the structure of its argument.
We do this by recursively constructing a function that can operate
on any representable datatype.
These construction functions
are bundled together as instances of the class Rep
:
class Rep a where
rep :: (Generic g) => g a
The idea here is to construct a generic function g
for each representable type a
. The class Generic
is instantiated
by a datatype that corresponds to each generic definition we require,
and provides a family of functions, one for each primitive and elementary type,
that constructs the generic function type.
All the datatypes which can be represented are members of the Rep
class,
so we provide instances for each of the elementary and primitive types
that could be encountered, where the responsibility of evaluating each type is
delegated to a function given by the
Generic
class that builds the generic function.
instance Rep () where
= unit
rep instance (Rep a, Rep b) => Rep (Either a b) where
= plus
rep instance (Rep a, Rep b) => Rep (a, b) where
= pair
rep instance Rep Char where
= char
rep instance Rep Int where
= int rep
The tricky part is to provide an instance for each algebraic datatype
of interest. To do so, we delegate to a function datatype
which makes use of the isomorphism between the datatype in question
and the elementary representation:
instance (Rep a) => Rep [a] where
= datatype (Iso fromList toList) rep
This single instance declaration is enough to provide an interface
for a multitude of polytypic function definitions. While
this is similar to the Foldable
class we saw earlier on,
in that an instance must be declared for each type we wish
to evaluate,
the difference is that Rep
can be used to recursively
pick apart a datatype in terms of its isomorphism with its elementary type,
and we are not restricted to returning monoidal types.
Generics
As mentioned above, the Generic
class is used to provide
a collection of functions which construct generic types that
can consume each of the primitive and elementary types.
class Generic g where
unit :: g ()
plus :: (Rep a, Rep b) => g (Either a b)
pair :: (Rep a, Rep b) => g (a, b)
char :: g Char
int :: g Int
datatype :: (Rep a) => Iso a b -> g b
For example, the type Size a
is used to turn an
arbitrary type a
into an Int
that represents
the size of a
:
newtype Size a = Size { appSize :: a -> Int }
The generic instance of Size
demonstrates how
the size of each primitive and elementary type can be calculated:
instance Generic Size where
= Size (const 0)
unit = Size (either size size)
plus = Size (uncurry (+) . (size × size))
pair = Size (const 1)
char = Size (const 1)
int = Size (size . fromData iso) datatype iso
Here we make use of the (×)
operator:
:: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(×)= (f a, g b) (f × g) (a, b)
The interesting case is the datatype
function
which uses the fromData
part of the datatype isomorphism.
Finally, the generic version of size
is defined as follows:
size :: (Rep a) => a -> Int
= appSize rep size
This curious definition brings together the represented datatype,
through the explicit type class constraint, Rep a
,
as well as the generic function definition, found implicitly
in the datatype which appSize
operates on.
Since this post is actually a literate Haskell file we can use it to evaluate, say, the size of a list of characters in ghci:
*Main> size "Hello World!"
12
Conclusion
This article has discussed the differences between polymorphism and polytypism. In the literature polytypic programming has been given several different names:
- Data-generic programming
- Structural polymorphism
- Type parametric programming
- Shape polymorphism
- Intensional polymorphism
Examples of polytypicity in practice can be found in the derivable
operators in Haskell. For example, a function such as (==)
is
polytypic:
class Eq a where
(==) :: a -> a -> Bool
This can be considered polytypic since we can ask the compiler to derive an
instance of Eq
for an arbitrary datatype, and this is done based on
the structure of the type.
While such polytypic functionality exists in Haskell, it can only be accessed
by using the deriving
keyword, and there is no way to contribute to the
family of classes which can be derived.
Our discussion has detailed one of the many frameworks that can be used to achieve polytypicity. I particularly like this incarnation because it is possible in ordinary Haskell without making use of any extensions. The downside is that the isomorphism between datatypes and their elementary representation must be given explicitly.