05 December 2006

The Division Bell Tolls for Me, Part Two

OK. In our last installment I started talking about using Haskell one-liners to calculate some functions. The function in question was designed to map a set of fractions in the range -7/7 to 7/7 to the machine integers 0..65536. The function itself, along with a wrapper to test, it fits neatly into a Haskell one-liner (well, it might be more than one line, depending on your terminal width). You can use GHCi to evaluate it:

map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, 7) | numer]

I showed in the last installment that this function gave me the results I wanted:

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

Let's play with that function a little more, and then I want to talk about division (hence the title of the entry). In particular, integer division.

First, let's simplify our expressions by breaking them down and establishing some bindings. We can also get rid of the presumption that our output range starts at zero:

let gen_nths denom = [(numer, denom::Int) | numer <- denom::int let div_map m1 m2 = map (\tup -> (fst tup + snd tup) * (((m2::Int) - (m1::Int)) `div` 2) `div` snd tup + m1)

And now we can test it. Note that we need to add a few parentheses:

div_map (-32768) 32768 (gen_nths 11)

[-32768, -29790, -26811, -23832, -20853, -17874, -14895, -11916, -8937, -5958, -2979, 0, 2978, 5957, 8936, 11915, 14894, 17873, 20852, 23831, 26810, 29789, 32768]

OK. Now what is it we've done exactly? Well, we've generated some test values, and an algorithm, for distributing a fractional range onto a machine integer range. For our purposes, think of it as a little machine for testing division.

To see just why I chose this algorithm to play with Haskell, we have to go back and consider some C code. It's quite a headache-inducing paradigm shift to go back to C/C++, but bear with me. Let's just start with the simplest thing: we'll create the same mapping we just generated, hard-coding the values:

long denom = 11;
long numer;
for ( numer = -11; numer <= 11; numer++ )
{
    long val = numer * 32768 / denom;
    printf("%d\n", val);
}

And the results:

-32768
-29789
-26810
-23831
-20852
-17873
-14894
-11915
-8936
-5957
-2978
0
2978
5957
8936
11915
14894
17873
20852
23831
26810
29789
32768

Hmmm... wait a minute, let's compare the results:

Haskell            C
-32768             same
-29790             -29789
-26811             -26810
-23832             -23831
-20853             -20852
-17874             -17873
-14895             -14894
-11916             -11915
-8937              -8936
-5958              -5957
-2979              -2978
0                  same
2978               same
5957               same
8936               same
11915              same
14894              same
17873              same
20852              same
23831              same
26810              same
29789              same
32768              same

It looks like every single one of the negative values is off by one! In our third and final installment I'll dive a little deeper into the semantics of integer division.

2 comments:

The Alternate Moebyus said...

long val = ( numer + denom ) * 32768 / denom

Paul, am I reading this correctly? At numer=-11, this should evaluate to (-11+11)*whatever=0, and not -32768?

Shouldn't it be something on the lines of

(numer+denom)*32768/denom - 32768

Out of curiosity, what is sizeof(long)? Haskell defines Data.Bits.bitSize (1::Int) to be 32 somewhere in the standard...

Paul R. Potts said...

Hi alternate moebyus,

You are quite right... I made a mistake in the C code. Actually I copied-and-pasted the wrong version. To get rid of the shift to positive numbers just take out the +denom inside (numer+denom). I have made this change in the original article.

There was also a problem in that the declaration of the for loop variable inside the parens after "for" is a C++ construct, also supported in C99, but not C90. I have changed that too.

On the platforms I have access to: Mac PowerPC, an embedded PowerPC platform, Win32 and Linux on a Pentium, sizeof(long) is 4 (bytes).