Euler #2 : Streaming Fibonaccis

Posted on 8 July 2010 by Nicolas Wu
Tags: Haskell, Euler

Time for the second problem in our series.

Find the sum of all the even-valued terms in the Fibonacci sequence which do not exceed four million.

First we’ll show how we can solve this problem using a list comprehension, like in the last post, and then we’ll show a different, far more efficient, solution.

Recursion

We start by writing a function that, given a value n, tells us the nth value in the Fibonacci sequence. The type of this function is as follows:

> fib :: Int -> Int

We begin its definition by describing the two base cases of a Fibonacci number:

> fib 0 = 0
> fib 1 = 1

For any other value of n, the definition of fib is the sum of the two previous values of fib:

> fib n = fib (n-1) + fib (n-2)

The order of definitions matters in Haskell: in this example the case for n overlaps with the other two cases, and would cause problems if the compiler always matched this case instead of the other two. To resolve these ambiguities, Haskell tries to match lines one after the other before applying them.

Maps

Our first goal is to produce a list of consecutive Fibonnaci numbers. This can be obtained by transforming a list of integers by using our function fib on each element. This kind of transformation is quite common, and a special function, map, does this for us.

A map takes a function and a list as arguments, and applies the function to each element in the list. Here’s its definition:

map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = f x : map f xs

The definition tells us that when the list is empty, we return an empty list back. Otherwise, we apply the function f to the first element of the list, and add the result to the front of the list generated by mapping f to the remainder. Since map is used all the time, this is a function supplied in the prelude, so we don’t need to define it.

Here’s what we get when we map fib onto a list of Int values:

*Main> map fib [1..10]
[1,1,2,3,5,8,13,21,34,55]

Using the same technique as in the last question we can produce a first attempt at answering the question:

> euler2 = sum [ x | x <- map fib [2 .. ] , even x, x < 4000000]

Here our list comprehension draws the value of x from the infinite list of Fibonacci values, which is generated using a map on the list of all natural numbers starting from 2. The predicate we use here tells the list only to accept values smaller than 4000000.

Unfortunately, this program never terminates: it keeps checking for values even when it has far surpassed the 4000000 mark because it has no notion of the monotonic property that fib enjoys: the program has no way of knowing that each new value of x will be greater than the last, and so it keeps checking the next value of x even after the value of 4000000 has been surpassed, just in case one is found that is smaller.

TakeWhile

To get round this problem, we need a function that takes values while they are smaller than 4000000, and when a value greater than this is found, we should stop taking values. This kind of behaviour is captured precisely by a takeWhile function, which takes values while some predicate holds true, and stops taking values when the predicate is false:

takeWhile :: (a -> Bool) -> [a] -> [a]
takeWhile p []     = []
takeWhile p (x:xs)
  | p x            = x : takeWhile p xs
  | otherwise      = []

This definition uses slightly different notation than what we’ve been used to because it does a case analysis based on the value of p x. This case analysis checks each condition to the right of the | symbol, and uses the definition to its right if the condition evaluates to True. The first condition passes when p x holds, and if it does, then the function returns the value of x added to the front of the result of taking xs while p holds. If none of the cases pass, then the definition to the right of otherwise is used, and in this case, x is discarded, and nothing else is returned. This function is also very common, and found in the prelude.

We modify our answer to make use of this takeWhile function, and end up with the following definition:

> euler2' = (sum . takeWhile (< 4000000))
>   [x | x <- map fib [2 .. ], even x]

Here we make use of the (.) operator, known as function composition, which performs one function after another. In this case, we perform a sum after the result of takeWhile (< 4000000) applied to our list of even Fibonnacis.

Function composition is very useful since it allows us to chain functions one after the other in a really easy way. Here’s its definition:

(.) :: (b -> c) -> (a -> b) -> a -> c
(g . f) x = g (f (x))

The composition of two functions, g . f, pronounced g after f, first applies f to some value x, and then applies g to the result. The type of this operator is a bit confusing at first, but when you consider that the function f is the first one that is applied to a value of type a, then everything falls into place.

Although the solution to our problem eventually returns a result, it is incredibly slow and inefficient, taking about 25 seconds before anything is returned on my machine. So we’ll now think of another way of solving the problem.

Streams

The reason our previous solution, euler2', is slow, is that each application of fib in our map must be recalculated from scratch. To make our program run faster, we’ll tackle the problem head on by avoiding the use of a map, and generating the Fibonacci sequence directly, and make use of the fact that each new value in the sequence is related to the previous two.

Here’s a definition that produces an infinite list, sometimes called a stream, of Fibonacci values:

> fibs   :: [Int]
> fibs   = 0 : fibs'
> fibs'  = 1 : fibs''
> fibs'' = zipWith (+) fibs fibs'

This definition makes use of mutual recursion, where the three functions fibs, fibs' and fibs'' depend on each other. The interesting case is that of fibs'', which makes use of a zipWith function. A zipWith basically takes a function and two lists, and combines the two lists by using that function to produce another list. Here’s the definition:

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys
zipWith _ _      _      = []

The first part of this definition simply applies f to x and y before zipping the rest of the two lists with f again. The second part of the definition is basically a “catch all” that basically says that if the first pattern doesn’t match, then return an empty list. This uses underscores, _, instead of values, since underscores match any expression.

Using our new version of fibs, we get the following solution to the problem:

> euler2'' = (sum . takeWhile (< 4000000))
>   [x | x <- fibs'', even x]

This version returns its result immediately!

Summary

In this post we’ve seen two different solution to our problem, one that uses map, and another that uses a stream. Here are some of the points we’ve covered:

  • Using map we can transform one list into another.
  • A takeWhile helps us limit infinite lists.
  • Functions can be composed using the (.) operator.
  • Creating streams can be very efficient!

Once again, we’ve covered a lot of ground in this short article, and I hope I haven’t left you confused. I’d love to hear some feedback from you, so please feel free to leave comments!

Comments