by Nicolas Wu

Posted on 15 July 2010

This week’s challenge:

Find the largest prime factor of a composite number.

What’s the prime factor of a number? Well, the short answer is that it’s the unique bag of primes that, when multiplied together, gives you the number you’re looking for.

The composite number we’ve been challenged with is `600851475143`

, which is pretty massive. In fact, since an `Int`

can only promise an upper bound of `2^29-1 = 536870911`

, we’ll have to use a different type, called `Integer`

in our calculations, since these are unbounded.

Values of type `Integer`

work just like `Int`

, and also make use of functions like `(+)`

and `div`

. You might be wondering how this works in terms of the types and the truth is that I simplified things earlier on, when we discussed the type of those functions.

The type of `(+)`

is actually:

`(+) :: (Num a) => a -> a -> a`

This basically says that `(+)`

is polymorphic and takes values of type `a`

, where each `a`

must be part of the `Num`

class. We haven’t talked about classes yet, but they’re basically a way of expressing the fact that some types are related, like `Int`

and `Integer`

, and other numbers like `Float`

that represent fractions. Other functions like `div`

and `mod`

are defined similarly. I’ll talk more about this in another article.

To solve our problem, we’ll need a definition that gives us the primes. To make our version reasonably efficient, we’ll implement a version of the Sieve of Eratosthenes, and pay tribute to the old codger while we strike off the primes from our list as we encounter them (though, apparently, the variant used here is actually due to Turner).

```
> primes :: [Integer]
> primes = sieve [2 .. ]
> where
> sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
```

This definition makes use of a `where`

clause, where part of the definition is given after the main body of our function, and restricted to the scope of the parent function. Here a `where`

is really what we want because `sieve`

is defined in terms of itself, as a recursive function, and we don’t particularly want to expose its definition to the rest of the program.

The idea behind `sieve`

is that we start with some prime number `p`

and the rest of the numbers `xs`

that we haven’t considered yet. The list of primes becomes the value of `p`

tagged onto the front of the rest of `xs`

where we remove multiples of `p`

.

Our next task is to use our list of primes and find out the largest one which factors into our number. Using basic arithmetic, it’s obvious to see that our factor will only be as big as the square root of our number, unless it is prime. We’ll start by defining a function that calculates integer square roots for us.

The prelude gives us a function `sqrt`

that gives us the square root of a floating point number, and has this type:

`sqrt :: (Floating a) => a -> a`

Here, the class `Floating`

is used for types that represent numbers with a “floating point”, like `1.5`

, or `4.125`

.

Since we want to work with values of type `Integer`

, or `Int`

, we want to convert to and from these types. To do so, we’ll make use of `fromIntegral`

:

`fromIntegral :: (Integral a, Num b) => a -> b`

This basically takes a value of type `a`

that is a member of the `Integral`

class, which are just whole numbers, and produces a value of type `b`

that is a `Num`

. To convert from our number that is fractional, we make use of the function `floor`

, which basically removes the fractional remainder of a number.

We can then compose these functions together to get a version of `sqrt`

that works for values of type `Integer`

:

```
> isqrt :: (Integral a) => a -> a
> isqrt = floor . sqrt . fromIntegral
```

We can now write a function, `pfactors`

, that solves the problem for us by finding the prime factors. This definition uses a list comprehension to generate all the primes less than the square root of the number, `x`

, that we’re interested in, such that each of those primes is a factor of `x`

:

```
> pfactors :: Integer -> [Integer]
> pfactors x
> | null ps = [x]
> | otherwise = ps
> where ps = [p | p <- takeWhile (<= isqrt x) primes, x `mod` p == 0]
```

Sadly, using this version is very inefficient, since we’re not making the most of the fact that when we find a factor, `p`

, we can reduce the problem to finding the factors of our original number divided by `p`

. In other words, we know that the following property holds when `p`

is a prime factor:

`pfactors x == p : pfactors (x `div` p)`

So let’s write a program that does make the most of this fact:

```
> pfactors' :: Integer -> [Integer] -> [Integer]
> pfactors' x (p:ps)
> | p > x = []
> | x `mod` p == 0 = p : pfactors' (x `div` p) (p:ps)
> | otherwise = pfactors' x ps
```

This version takes an additional argument, `(p:ps)`

, which is the increasing list of all primes. The first condition `p > x`

has the same effect as our `takeWhile`

, and returns the empty list when we know that no more factors can be found.

The next condition makes use of our fact about factors directly: if `p`

is a multiple of `x`

, then we add `p`

to the list of prime factors found, and we continue factoring using `x `div` p`

. The only tricky part is that we need to remember to pass on the whole list of primes `(p:ps)`

, and include `p`

, since that prime might factor more than once into the new number.

If none of the other conditions hold, then we know that `p`

is not a factor, and we keep searching the rest of `ps`

for factors of `x`

.

To answer our problem, we want to find the last value in our list, since factors are produced in increasing order, which we can do with the function `last`

, found in the prelude:

`last :: [a] -> a`

This function just takes a list, and returns its last value. Putting this together, we have the following:

`> euler3 = last (pfactors' 600851475143 primes)`

In this article we took a quick look at classes, and how they can be used to relax some of functions, like `(+)`

, so that they can be used in related types, like `Int`

and `Integer`

. We’ll talk more about classes in another article.

We’ve also seen the following points:

- Using a
`where`

clause allows us to create local function definitions. - Values of different types can be converted.
- Exploiting problem properties makes algorithms much faster!