Lecturer in Computer Science
I am a lecturer in Computer Science at the University of Bristol, where I am a member of the Theory and Algorithms Group. I was previously a postdoctoral researcher at the University of Oxford, where I also obtained my doctorate and was an undergraduate student. My research is in the field of programming languages, where I develop tools and techniques that help software engineers write efficient code that is free from defects. I am particularly interested in functional programming, category theory, and the algebra of programming.
Here is a selection of papers that I have authored.
Folds and unfolds have been understood as fundamental building blocks for total programming, and have been extended to form an entire zoo of specialised structured recursion schemes. A great number of these schemes were unified by the introduction of adjoint folds, but more exotic beasts such as recursion schemes from comonads proved to be elusive. In this paper, we show how the two canonical derivations of adjunctions from (co)monads yield recursion schemes of significant computational importance: monadic catamorphisms come from the Kleisli construction, and more astonishingly, the elusive recursion schemes from comonads come from the Eilenberg–Moore construction. Thus, we demonstrate that adjoint folds are more unifying than previously believed.
The use of formal description techniques aims to prevent the defects found in software that arise due to poor planning at the design stage. However, the ensuing specifications are often designed with only a single application in mind and are not easily generalised. One area in which these deficiencies arise is that of the formal modelling of relational databases: many authors have drawn parallels between the formal description language, Z, and the relational model of data, but none of these contributions have managed to be both close to the relational model in terms of providing a practical means of database design and fully formal in terms of providing an appropriate metamodel. In this paper, we describe a generative template language, based on the formal template language (FTL). In particular, we extend the FTL, which was developed originally as means of expressing templates, to underpin an approach that facilitates the reuse of specifications in Z, paying particular attention to the formal design of relational databases. These templates encapsulate the common structure found in specifications and can be instantiated to produce specifications tailored to suit particular needs. To achieve this, we extend the FTL and present a mechanism for naming and referencing templates. We also introduce the semantics of template annotations to enforce the syntactic correctness of instantiations.
Modules over monads (or: actions of monads on endofunctors) are structures in which a monad interacts with an endofunctor, composed either on the left or on the right. Although usually not explicitly identified as such, modules appear in many contexts in programming and semantics. In this paper, we investigate the elementary theory of modules. In particular, we identify the monad freely generated by a right module as a generalisation of Moggi’s resumption monad and characterise its algebras, extending previous results by Hyland, Plotkin and Power, and by Filinski and Støvring. Moreover, we discuss a connection between modules and algebraic effects: left modules have a similar feeling to Eilenberg–Moore algebras, and can be seen as handlers that are natural in the variables, while right modules can be seen as functions that run effectful computations in an appropriate context (such as an initial state for a stateful computation).
Algebraic effect handlers are a recently popular approach for modelling side-effects that separates the syntax and semantics of effectful operations. The shape of syntax is captured by functors, and free monads over these functors denote syntax trees. The semantics is captured by algebras, and effect handlers pass these over the syntax trees to interpret them into a semantic domain.
This approach is inherently modular: different functors can be composed to make trees with richer structure. Such trees are interpreted by applying several handlers in sequence, each removing the syntactic constructs it recognizes. Unfortunately, the construction and traversal of intermediate trees is painfully inefficient and has hindered the adoption of the handler approach.
This paper explains how a sequence of handlers can be fused into one, so that multiple tree traversals can be reduced to a single one and no intermediate trees need to be allocated. At the heart of this optimization is keeping the notion of a free monad abstract, thus enabling a change of representation that opens up the possibility of fusion. We demonstrate how the ensuing code can be inlined at compile time to produce efficient handlers.
The past decades have witnessed an extensive study of structured recursion schemes. A general scheme is the hylomorphism, which captures the essence of divide-and-conquer: a problem is broken into sub-problems by a coalgebra; sub-problems are solved recursively; the sub-solutions are combined by an algebra to form a solution. In this paper we develop a simple toolbox for assembling recursive coalgebras, which by definition ensure that their hylo equations have unique solutions, whatever the algebra. Our main tool is the conjugate rule, a generic rule parametrized by an adjunction and a conjugate pair of natural transformations. We show that many basic adjunctions induce useful recursion schemes. In fact, almost every structured recursion scheme seems to arise as an instance of the conjugate rule. Further, we adapt our toolbox to the more expressive setting of parametrically recursive coalgebras, where the original input is also passed to the algebra. The formal development is complemented by a series of worked-out examples in Haskell.
A long-standing problem in logic programming is how to cleanly separate logic and control. While solutions exist, they fall short in one of two ways: some are too intrusive, because they require significant changes to Prolog’s underlying implementation; others are lacking a clean semantic grounding. We resolve both of these issues in this paper.
We derive a solution that is both lightweight and principled. We do so by starting from a functional specification of Prolog based on monads, and extend this with the effect handlers approach to capture the dynamic search tree as syntax. Effect handlers then express heuristics in terms of tree transformations. Moreover, we can declaratively express many heuristics as trees themselves that are combined with search problems using a generic entwining handler. Our solution is not restricted to a functional model: we show how to implement this technique as a library in Prolog by means of delimited continuations.
Algebraic effect handlers are a powerful means for describing effectful computations. They provide a lightweight and orthogonal technique to define and compose the syntax and semantics of different effects. The semantics is captured by handlers, which are functions that transform syntax trees.
Unfortunately, the approach does not support syntax for scoping constructs, which arise in a number of scenarios. While handlers can be used to provide a limited form of scope, we demonstrate that this approach constrains the possible interactions of effects and rules out some desired semantics.
This paper presents two different ways to capture scoped constructs in syntax, and shows how to achieve different semantics by reordering handlers. The first approach expresses scopes using the existing algebraic handlers framework, but has some limitations. The problem is fully solved in the second approach where we introduce higher-order syntax.
A domain-specific language can be implemented by embedding within a general-purpose host language. This embedding may be deep or shallow, depending on whether terms in the language construct syntactic or semantic representations. The deep and shallow styles are closely related, and intimately connected to folds; in this paper, we explore that connection.
Dynamic programming algorithms embody a widely used programming technique that optimizes recursively defined equations that have repeating subproblems. The standard solution uses arrays to share common results between successive steps, and while effective, this fails to exploit the structural properties present in these problems. Histomorphisms and dynamorphisms have been introduced to expresses such algorithms in terms of structured recursion schemes that leverage this structure. In this paper, we revisit and relate these schemes and show how they can be expressed in terms of recursion schemes from comonads, as well as from recursive coalgebras. Our constructions rely on properties of bialgebras and dicoalgebras, and we are careful to consider optimizations and efficiency concerns. Throughout the paper we illustrate these techniques through several worked-out examples discussed in a tutorial style, and show how a recursive specification can be expressed both as an array-based algorithm as well as one that uses recursion schemes.
Folds over inductive datatypes are well understood and widely used. In their plain form, they are quite restricted; but many disparate generalisations have been proposed that enjoy similar calculational benefits. There have also been attempts to unify the various generalisations: two prominent such unifications are the ‘recursion schemes from comonads’ of Uustalu, Vene and Pardo, and our own ‘adjoint folds’. Until now, these two unified schemes have appeared incompatible. We show that this appearance is illusory: in fact, adjoint folds subsume recursion schemes from comonads. The proof of this claim involves standard constructions in category theory that are nevertheless not well known in functional programming: Eilenberg-Moore categories and bialgebras.
This paper discusses our entry to the 2012 ICFP Programming Contest, written entirely in Haskell. Our solution uses many features of Haskell: pure immutable data structures, laziness, higher-order functions, concurrency, and exception handling. Each of these features plays an essential part in our overall solution, and we demonstrate how these key elements can be composed together. In this exposition, we stress the importance of how the code was structured in such a way that made safely refactoring and extending the model a relatively easy task, and how Haskell’s strong type system made it possible for our team to remain agile under changing specifications.
Sorting algorithms are an intrinsic part of functional programming folklore as they exemplify algorithm design using folds and unfolds. This has given rise to an informal notion of duality among sorting algorithms: insertion sorts are dual to selection sorts. Using bialgebras and distributive laws, we formalise this notion within a categorical setting. We use types as a guiding force in exposing the recursive structure of bubble, insertion, selection, quick, tree, and heap sorts. Moreover, we show how to distill the computational essence of these algorithms down to one-step operations that are expressed as natural transformations. From this vantage point, the duality is clear, and one side of the algorithmic coin will neatly lead us to the other “for free”. As an optimisation, the approach is also extended to paramorphisms and apomorphisms, which allow for more efficient implementations of these algorithms than the corresponding folds and unfolds.
A bidirectional transformation is a pair of mappings between source and view data objects, one in each direction. When the view is modified, the source is updated accordingly. The key to handling large data objects that are subject to relatively small modifications is to process the updates incrementally. Incrementality has been explored in the semi-structured settings of relational databases and graph transformations; this flexibility in structure makes it relatively easy to divide the data into separate parts that can be transformed and updated independently. The same is not true if the data is to be encoded with more general-purpose algebraic datatypes, with transformations defined as functions: dividing data into well-typed separate parts is tricky, and recursions typically create interdependencies. In this paper, we study transformations that support incremental updates, and devise a constructive process to achieve this incrementality.
Generic Haskell is an extension of Haskell that supports datatype generic programming. The central idea of Generic Haskell is to interpret a type by a function, the so-called instance of a generic function at that type. Since types in Haskell include parametric types such as ‘list of’, Generic Haskell represents types by terms of the simply-typed lambda calculus. This paper puts the idea of interpreting types as functions on a firm theoretical footing, exploiting the fact that the simply-typed lambda calculus can be interpreted in a cartesian closed category. We identify a suitable target category, a subcategory of Cat, and argue that slice, coslice and comma categories are a good fit for interpreting generic functions at base types.
Many authors have drawn parallels between the relational model of data and the formal description technique Z, yet none of these contributions have managed to be both close to the relational model in terms of providing a practical means of database design and fully formal in terms of providing an appropriate metamodel. We compare these various formalisms, and suggest how the use of the formal template approach of Amálio et al might help to overcome some of the issues faced. We demonstrate the application of this work via a short case study, and suggest further enhancements to the template language.