05 December 2006

The Division Bell Tolls for Me, Part One

With apologies to both Pink Floyd and John Donne.

I've been toying with Haskell. I wish I had enough free time for a systematic study of Richard Bird's Introduction to Functional Programming Using Haskell, Second Edition, but for now I have to content myself with dipping into it in small scraps of free time when both babies are asleep and the dishes are done. In addition, I'm using Haskell to model the behavior of some functions that I'm actually writing in C++.

One of the reasons I like Haskell is that with a tool like GHCi, which gives me a REPL (a Read, Evaluate, Print loop for interactive development), I can write toy one-liners that do something useful, and since Haskell supports typing, I can test how machine types behave. With an important exception, which I will discuss.

Let's say I want to try out a function that maps the fractional values -7/7 through 7/7 to unsigned integers in the range 0..65536 (the range of values representable by 16 bits, plus one). In Haskell I can try this out using a one-liner (typed into GHCi). Let's build it up. First, the range of input values. Ignore for now the possibility of using or introducing a true rational number type; instead we'll generate a set of tuples. In Haskell you can express this very concisely using a list comprehension:

[(numer, 7) | numer <- [-7..7] ]

[(-7,7),(-6,7), (-5,7), (-4,7), (-3,7), (-2,7), (-1,7), (0,7), (1,7), (2,7), (3,7), (4,7), (5,7), (6,7), (7,7)]

The list comprehension can be (roughly) read as "Make a list of tuples out of numer, 7 where numer takes on the values from the list -7 to 7." Think for a moment about how you could do this in your favorite language.

Now let's do our math on the resulting list. We can use map to apply a function to the list. We can use fst and snd to get to the values in the tuple; for example, to look at the numerators, do this:

map fst [(numer, 7) | numer <- [-7..7] ]

[-7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]

Let's go ahead and do our scaling. We want to describe a single function to pass each tuple to; in fact we want a lambda abstraction, a function that we don't bother to name:

map (\tup -> fst tup / snd tup * 65536 ) [(numer, 7) | numer <- [-7..7] ]

[-65536.0, -56173.71428571428, -46811.42857142857, -37449.142857142855, -28086.85714285714, -18724.571428571428, -9362.285714285714, 0.0, 9362.285714285714, 18724.571428571428, 28086.85714285714, 37449.142857142855, 46811.42857142857, 56173.71428571428, 65536.0]

Whoops, that's not quite right. We want to map to only the positive values. We have to shift the domain (the incoming numerators) so they are non-negative:

map (\tup -> (fst tup + snd tup) / snd tup * 65536 ) [(numer, 7) | numer <- [-7..7] ]

[0.0, 9362.285714285714, 18724.571428571428, 28086.85714285714, 37449.142857142855, 46811.42857142857, 56173.71428571428, 65536.0,  74898.28571428571, 84260.57142857143, 93622.85714285714, 102985.14285714286, 112347.42857142857, 121709.71428571429, 131072.0]

But now the codomain is too big; we have to cut it in half:

map (\tup -> (fst tup + snd tup) / snd tup * 65536 / 2) [(numer, 7) | numer <- [-7..7]]

[0.0, 4681.142857142857, 9362.285714285714, 14043.42857142857, 18724.571428571428, 23405.714285714286, 28086.85714285714, 32768.0, 37449.142857142855, 42130.28571428572, 46811.42857142857, 51492.57142857143, 56173.71428571428, 60854.857142857145, 65536.0]

That looks about right. Now let's apply a type. Because I'm eventually going to be implementing this without floating-point, I want the math to be done on machine integers. There are some exceptions, but most target architectures these days give you 32 bits for int, so Haskell's Int, which is defined to be 32 bits, should yield approximately what I want. To introduce typing, all I have to do is add a type to the list values. Type inferencing will do the rest. Oh, and since division by slash is not defined on Int, I have to change it to `div` (infix division):

map (\tup -> (fst tup + snd tup) `div` snd tup * 65536 `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 0, 0, 0, 0, 0, 0, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 65536]

Whoah. Something has gone disastrously long. Take a moment to figure out what it is. I'll wait.

Did you figure it out?

The problem is that when I use integer division in an expression, the result is truncated (the fractional part is lost). If the division occurs before any of the other operations, most of the significant bits of the math are lost! I can fix this by making sure to do the division last:

map (\tup -> (fst tup + snd tup) * 65536 `div` snd tup `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

That's better. While I'm at it, I can simplify out that factor of 2 by using half of 65536:

map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

In fact, I can ask GHC to verify that the results are the same:

map (\tup -> (fst tup + snd tup) * 65536 `div` snd tup `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ] == map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

True

Very cool! Of course, this is not a rigorous proof, but I'm satisfied that my function is correct for the given inputs, and sometimes that's enough.

5 comments:

The Alternate Moebyus said...

> With an important exception, which I will discuss.

I'd actually claim that the "machine types" are behaving exactly like they would in the C world :) Integer division is truncation, which is usually avoided by the usual trick of using the associative law

Nice to see you using Haskell, though. It's a breath of fresh air.

Paul R. Potts said...

Hi alternate moebyus,

Thanks for your comment. I am going to continue this topic and explore just how integer division in C actually behaves, what guarantees can be made about it, and what can't.

Unknown said...

I suspect a more Haskellish way to write the original division code would be zipWith (/) (repeat 7) [-7 .. 7]. Or even [ 7 / y | y <- [-7 .. 7] :-) The final code then becomes x + y) * 32768 `div` y) | x <- 7::Int, y <- (-7::Int)..(7::Int)] ] (Untested, but should work).

Paul R. Potts said...

Miles, thanks for the suggestions... I will play with these a little later. I am very new to Haskell.

Unknown said...

Hmmm, some brackets went missing in my comment. I meant, of course,

[ 7 / y | y <- [-7 .. 7] ]
and

[ (x + y) * 32768 `div` y) | x <- 7::Int, y <- (-7::Int)..(7::Int)] ]