07 December 2006

The Division Bell Tolls for Me, Part Three

In the last installment, I pointed out a discrepancy between the way that Haskell does integer division and the way that some C code does it. Let's look at just how integer division behaves in Haskell a little more closely. You can type these expressions into GHCi:

6 / 5

1.2

5 / 6

0.8333333333333334

(6::Int) `div` (5::Int)

1

(5::Int) `div` (6::Int)

0

Integer division in the positive numbers always rounds down: 1.2 rounds down to 1, and 0.8333... rounds down to 0. In fact, it looks like the fractional part is just truncated. But let's check some negative values to be sure:

-6 / 5

-1.2

(-6::Int) `div` (5::Int)

-2

-5 / 6

-0.8333333333333334

(-5::Int) `div` (-6::Int)

-1

That's interesting; clearly it is not strictly truncation (rounding towards zero) that is happening, or else -5/6 would give us zero, not -1. And we clearly aren't getting "rounding" in the usual sense of rounding to the nearest integer (although there is no universal rounding algorithm to round to the nearest integer; there are lots of different variations in practice).

It looks like we have floor() behavior, which in the case of negative values means rounding away from zero. To state it slightly more rigidly, if the result of the div operation is a whole number, we get the whole number, otherwise we get an integer that represents the exact answer minus the fractional part.

If you learned integer division the old-fashioned way, on paper, you know what remainders are. The remainder is the leftover part of the process, and represents the numeator of the fractional part of the quotient. In Haskell, div gives you the whole number result of integer division, and mod will give you the remainder. So, for example,

(2::Int) `div` (3::Int)

0

(2::Int) `mod` (3::Int)

2

In fact, what this points to is that there is a relationship between the result of div and the result of mod. It should hold true in all cases, actually; in C, it is part of the ISO standard. The relationship is this:

For any quotient o = m `div` n where n is not zero, and the remainder p = m `mod` n, o * n + p = m.

In other words, you can reverse the integer division by multiplication as long as you add the remainder back in.

This relationship should be obviously true for any non-negative n and m. But I'll go further and say that while the exact meaning of mod for negative numbers varies from language to language, and perhaps from implementation to implementation, this relationship should still hold true, with the possible exception of cases involving overflow (which I'll discuss further below).

We can represent the relationship in Haskell like this:

10 `div` 3

3

10 `mod` 3

1

let undivide n o p = o * n + p
undivide 3 ( 10 `div` 3 ) ( 10 `mod` 3 )

10

In fact, let's test it for a few values of n and m:

let check_divide m n = ( undivide n ( m `div` n ) ( m `mod` n ) ) == m
[ check_divide m n | m <- [(-5::Int)..(5::Int)], n <- [(-5::Int)..(-1::Int)] ]
[ check_divide m n | m <- [(-5::Int)..(5::Int)], n <- [(-5::Int)..(-1::Int)] ]

Try out the results for yourself; again, not a rigorous proof, but suggestive.

But what about the overflow cases? Let's examine them. If Haskell's Int type is a 32-bit signed int, then we should be able to represent the values −2,147,483,648 (corresponding to 10000000000000000000000000000000 in binary or 0x80000000 in hexadecimal), to +2,147,483,647 (corresponding to 01111111111111111111111111111111 in binary or 0x7FFFFFFF in hexadecimal). Let's ask Haskell what happens when we try to exceed our largest positive number:

(2147483647::Int) + (1::Int)

-2147483648

Yep, we overflow; we get the largest negative number. Conversely, we can underflow:

(-2147483648::Int) - (1::Int)

2147483647

We don't get run-time error checking (and probably don't want it, for the sake of efficiency, when using mahine types). Of course, division by zero is still considered a run-time error:

(2147483648::Int) `div` (0::Int)

*** Exception: divide by zero

What about overflow or underflow in the results of division? With multiplication, it is very easy to overflow; multiplying two large numbers can easily exceed the width of a 32-bit signed value. You can come pretty close with:

(46341::Int) * (46340::Int)

2147441940

because 46340 is the next-lowest integer value to the square root of 2,147,483,647. Generate a result just slightly higher and you will roll over to a large negative:

(46341::Int) * (46341::Int)

-2147479015

With multiplication there are lots of combinations of values that will overflow, but with division it is a little bit harder to trigger overflow or underflow. There is one way, though. While 2,147,483,647 divided by any lesser value will be a lesser value, and divided by itself will equal one, recall that twos-complement representation gives us a slightly asymmetric set of values: the largest representable negative number is one larger, in absolute magnitude, than the largest representable positive value. Thus:

(214483647::Int) `div` (1::Int)

214483647

(214483647::Int) `div` (-1::Int)

-214483647

(-214483648::Int) `div` (1::Int)

-214483648

In 32 bits of twos-complement, -2,147,483,648 is the "weird number." As a general rule, if you take a twos-complement number, flip all the bits, and add one, you get the negative of that number. The only exception is the weird number. Let's see what happens when we try to divide the weird number by -1:

(-2147483648::Int) `div` (-1::Int)

Now, I would like to tell you what GHCi says when I ask it to divide the weird number by -1. I'd also like to see what it says about the mod. But I can't. GHCi crashes when I type in those expressions.

So, I can't tell you what Haskell says. But I can tell you what the C standard says. Here's a hint: not only is it not defined, but in fact the result of any overflow that occurs with arithmetic operations on signed integral types is not defined by the language standard. Only the results of arithemtic operations on unsigned integers have defined semantics for overflow, making them portable -- assuming that the size of the type in question is the same. More on that next time.

7 comments:

Anonymous said...

What platform did GHCi crash on? I tried (-2147483648::Int) `div` (-1::Int) and (-2147483648::Int) `mod` (1::Int) and both produced 0 for me. I'm using Mac OS X 10.4.8 PPC.

Paul R. Potts said...

Hi docmach,

I am using GHCi 6.6, the binary distribution, on Win32 (Windows XP Pro SP2) on Intel (Pentium 4 3.6 GHz).

I just tested the Sep 2006 version of WinHUGS and it did the same thing.

I will have to try it at home; there I can test it on Windows 2000 and Fedora Core 5 on a different PC, and on a Mac Mini (G4) also running MacOS X 10.4.8.

Interesting that you got zero. The answer for the `div` is +2147483648, but of course that is not representable.

Antti-Juhani Kaijanaho said...

I trust you are aware of the quot-rem pair of functions in Haskell, in addition to the div-mod pair? :)

Paul R. Potts said...

Hi antti-juhani,

No, I was not! I am very new to Haskell, and just learning. I will look them up.

Thanks for adding my blog to Planet Haskell!

Paul R. Potts said...

Before anyone asks -- I have just filed a bug report on the GHC bug tracker for the "weird number" div -1 crash.

Gaurang said...

hey, am just a passerby, came via google searches. Basically wanted to tell you that I believe that PPC processor does not segfault when divide by zero happens, instead gives you some other answer like zero or a very high number (dont know which). On intel, the same thing generates an exception, and crashes.

Anyway this is what I beleive right now, am googling for this thing itself, when i found your page.

Paul R. Potts said...

Hi gaurang,

Thanks for your comment. I discovered the existence of SIGFPE to handle division overflow on Pentium CPUs. I had not previously been aware that overflow could give that result; it seems like a strange choice, since overflow from multiply does not.

I guess it comes down to what guarantees, if any, GHC makes (or can make) about the behavior of the code. Unfortunately I suspect they will probably just decide that this case has to be left undefined, since there probably isn't much they can do that would not ruin the efficiency of the compiled code with runtime checks.