Project Euler Problem 53
02 Jun 2015Rather than cherry-pick another PE problem, I'm gonna move right along to the next one in the sequence.
Combinatoric selections
How many, not necessarily distinct, values of n choose r, for 1 <= n <= 100, are greater than one million?
Naturally, the first thing to do is to define our choose
function. We can
pull the definition straight from the formula given in the problem definition:
choose :: Int -> Int -> Int
choose n r = fact n `div` (fact r * fact (n-r))
where fact i = product [1..i]
But this turns out to be a really inefficient way of computing binomial
coefficients (the fancy name for the output of this function). Its inefficient,
since we'll have to calculate three different factorials, and it'll also give
us nonsense answers when the factorials get big enough to cause integer
overflow. Of course, we can avoid overflow by using Integer
rather than
Int
, but we'd be better off avoiding this definition altogether.
choose :: Int -> Int -> Int
choose _ 0 = 1
choose 0 _ = 0
choose n k = choose (n-1) (k-1) * n `div` k
This is much better. It uses a recurrence relation for binomial coefficients that doesn't involve computation of factorials, which I found in the wikipedia article for combinations.
We still have the problem of overflow, which is unavoidable due to the size of
the values we'll be computing. For instance, 100 choose 50
is over 10^29 --
much higher than the maximum Int
value. Thankfully, all we need to do is
change our type signature:
choose :: Integer -> Integer -> Integer
The fact that we can change our type signature this easily means that our
choose
definition is not as type-specific as we're forcing it to be. In fact,
we can make it more abstract:
choose :: Integral a => a -> a -> a
This is possible because our definition uses three functions internally: (-)
,
(*)
, and div
. The first two are defined in the Num
typeclass, while the
third is defined in the Integral
typeclass -- which is itself defined in
terms of Num
. Thus Integral
is the least specific constraint we have to put
on this function; any Integral
can work as input.
Now, telling GHC to use a specific Integral
instance is as simply as
providing a type annotation. Doing so in GHCI shows us the difference between
the overflowing Int
and the infinite-precision Integer
:
> 100 `choose` 48 :: Integer
93206558875049876949581681100
> 100 `choose` 48 :: Int
-9982283485397623
We need to make a list of all the values of n choose k
for each n
between 1
and 100 inclusive, and for each k
between 1 and n
inclusive. We can define a
function that will give us all values for k
in terms of some input n
:
choices :: Integral a => a -> [a]
choices n = map (choose n) [1..n]
...and then map that function over all the values of n
. This will give us a
list of lists, but for our purposes it would be best to have a flat list. Thus,
we can use concatMap
.
values :: Integral a => [a]
values = concatMap choices [1..100]
We're not defining a function here, just a bare list of Integrals
. We can
filter the list by the predicate given in the problem, and find the length of
the filtered list.
solution :: Int
solution = length $ filter (>1000000) values
This is all well and good, but GHC will give us a warning if we compile this
definition. It warns that it's defaulting to Integer
as the instance for
the values in values
. That's fine for our purposes, since we wanted the
precision of Integer
s, but I hate having to deal with the warning (especially
since it shows up in ghc-mod
, my linter.)
To avoid it, we can inline a type annotation for our use of values.
solution = length $ filter (>1000000) (values :: [Integer])
However, its a little cleaner, and a lot more sensible, to put this annotation
in the type signature for values
itself. It doesn't make that much sense to
have values
be polymorphic, especially since we know that most instances of
Integral
are going to give us overflowing, nonsense results.
Lastly, we can note that concatMap
is the definition for >>=
in the list
monad. So if we want to look a bit fancier, we can rewrite values as
values = [1..100] >>= choices
Using >>=
makes it a little less obvious what's going on, though experienced
Haskellers will be able to tell that we're feeding all of the members of the
list through choices
, one after the other, and flattening the result.
However, I prefer the more explicit version, where we use concatMap
directly.