22 December 2006

Happy Life Day and Wishing You Lots of Haskell in 2007

Let's remember the words of Princess Leia, in the Star Wars Holiday Special:

We celebrate a day of peace. A day of harmony. A day of joy we can all share together joyously. A day that takes us through the darkness. A day that leads us into might. A day that makes us want to celebrate the light. A day that brings the promise that one day, we'll be free to live, to laugh, to dream, to grow, to trust, to love, to be.

Happy Life Day, everyone! Or whatever tradition you celebrate.

I am not planning to work on any more entries until early January, so I just wanted to say that I am extremely grateful for all the comments and suggestions I've received on my blog entries. With two babies at home I don't get much time to study textbooks or tutorials, although I try, and I don't get much dedicated coding time. But the suggestions and concrete examples have helped me make good use of what free time I do have. Thanks to all of you for your support as I learn this exciting language!

19 December 2006

From Bits to Cells: Simple Cellular Automata in Haskell, Part Two

Let's get back to our CA generator. Literate Haskell follows:

Last time we defined a function to generate the next state of a given
cell in our cellular universe, given a rule number and a tuple consisting
of the current state of the cell to the left, the cell itself, and the
cell to the right.

>import Data.Bits

>genNextBit :: Int -> ( Bool, Bool, Bool ) -> Bool
>genNextBit rulenum ( left, center, right ) = rulenum `testBit` idx
>    where idx = ( if left then (4::Int) else (0::Int) ) .|.
>           ( if center then (2::Int) else (0::Int) ) .|.
>           ( if right then (1::Int) else (0::Int) )

Recall that we can use automatic currying to make a rule-applying
function like so:

>rule_30 = genNextBit 30

We can ask GHCi for the type:

:type rule_30

rule_30 :: (Bool, Bool, Bool) -> Bool
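
We can also apply it directly; a quick sanity check (my own, not part of
the original session):

rule_30 (False, True, False)

True

since (False, True, False) maps to index 2, and bit 2 of 30
(binary 00011110) is set.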

I've put it off while I work on the rules, but it is time to figure out
how to actually represent our CA universe. Let's start by using a list.
I know that I'm going to write a number of inefficient functions, and
do evil things like take the length of lists a lot, but let's suspend all
concerns about efficiency over to a future discussion and consider this
purely a proof-of-concept.

Our initial universe at time zero has one cell set to True:

>initial_universe = [True]

But that isn't quite the right representation for the universe, because
it implies that our whole universe is one cell in size. We can't even
apply our rule once because there is no left cell and right cell!
Really, we want to pretend that we have an _infinite_ universe; at
time zero, all the cells to the left and right hold False. Remember,
Haskell is so powerful that it can traverse an infinite list in only
0.0003 seconds! Well, if you don't evaluate the whole thing, that is.
Taking advantage of lazy evaluation, you can define all kinds of
infinite structures. This construct will give us an infinite list of
False values:

>allFalse :: [Bool]
>allFalse = False : allFalse

We don't want to evaluate allFalse, but we can partially evaluate it
using a function like take. So can we represent our universe like this?

>genInfiniteUniverse :: [Bool] -> [Bool]
>genInfiniteUniverse known_universe = allFalse ++ known_universe ++ allFalse

Let's try it:

take 10 ( genInfiniteUniverse initial_universe )

[False,False,False,False,False,False,False,False,False,False]

Nope! Since the left-hand side of the universe is infinite, we will
never reach the middle element, no matter how far we travel from the
start of the list!

That's no good. However, we can do it another way. We'll allow our
universe to be expanded on demand on the left and right sides:

>expandUniverse :: Int -> [Bool] -> [Bool]
>expandUniverse expand_by known_universe =
>   ( take expand_by allFalse ) ++ known_universe ++ ( take expand_by allFalse )

expandUniverse 3 initial_universe

[False,False,False,True,False,False,False]

We can use the expandUniverse function to expand our initial universe
out to a standardized width before we start applying the rules.

First, here's a routine to stringify a universe for display:

>boolToChar :: Bool -> Char
>boolToChar True = '#'
>boolToChar False = ' '

>stringifyUniverse :: [Bool] -> String
>stringifyUniverse ls = map boolToChar ls
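
For example (a quick check of my own):

stringifyUniverse [False, True, True, False]

" ## "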

Now our infrastructure is in place, so let's figure out how to apply
our generator rule. We know that we want to start with our initial
universe. Let's expand it to a fixed size. This will give us enough
elements to start making left/center/right tuples out of each consecutive
set of three cells. Each tuple is then used to look up the next state
of the cell at the center; this will become an element in our next
universe. Then we move to the next cell (not three cells down). This
means that the tuples overlap.

Let's make the tuples. We have to do some thinking here and consider
all the cases; the behavior isn't immediately obvious. The following
almost works:

universeToTuples :: [Bool] -> [(Bool, Bool, Bool)]
universeToTuples universe | length universe >= 3 =
   ( universe !! 0, universe !! 1, universe !! 2 ) :
   universeToTuples ( tail universe )
universeToTuples universe = []

universeToTuples [False, True, True, True, False]

[(False,True,True),
(True,True,True),
(True,True,False)]

But it isn't quite right: it leaves off the end cases, so each time
we apply our rules, the intermediate representation of the universe as
a list of tuples will shrink by a cell at each end. We actually want
the following tuples:

[(False,False,True),
(False,True,True),
(True,True,True),
(True,True,False),
(True,False,False)]

where the first element of the list is considered as if it was just
to the right of an implied False, and the last element is considered
as if it was just to the left of another implied False. This sounds
like another place we can use our universe expander:

>universeToTuples :: [Bool] -> [(Bool, Bool, Bool)]
>universeToTuples [] = []
>universeToTuples universe = tupleize $ expandUniverse 1 universe
>   where
>       tupleize xs =
>            if length xs > 3 then tuple3 xs : tupleize ( tail xs )
>               else [ tuple3 xs ]
>       tuple3 xs = ( xs !! 0, xs !! 1, xs !! 2 )

Why did I write it that way? Well, I tried to write tupleize using
guards, special-casing length xs > 3 followed by an unguarded case for
all other possibilities, but GHC didn't like it -- it told me I had non-exhaustive patterns. There is probably a smarter way to write this,
but note that we definitely don't want this version:

universeToTuples universe = ( xs !! 0, xs !! 1, xs !! 2 )
    : universeToTuples ( tail xs )
       where xs = expandUniverse 1 universe

In that version, the universe keeps expanding as we walk down the list,
and we never get to the end!
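
There is indeed a smarter way. Here is one possibility, sketched as an
aside rather than as part of the literate program: zip3 pairs up three
staggered views of the expanded list and stops when the shortest one
runs out, so no length checks are needed.

universeToTuples' :: [Bool] -> [(Bool, Bool, Bool)]
universeToTuples' universe = zip3 xs (drop 1 xs) (drop 2 xs)
    where xs = expandUniverse 1 universe

It even handles the empty universe, since zipping lists of lengths 2,
1, and 0 yields an empty list.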

OK, now that we have our tuples, we want to turn them into our new
universe, given a cellular rule number:

>tuplesToUniverse :: Int -> [(Bool, Bool, Bool)] -> [Bool]
>tuplesToUniverse rule [] = []
>tuplesToUniverse rule (tup:tups) = genNextBit rule tup : tuplesToUniverse rule tups

Note that we don't have to explicitly take the tail since we provide a
name for it in the pattern. We're ready to define our genUniverses function
that applies a given CA rule. We can express a given generation like this:

>nextUniverse :: Int -> [Bool] -> [Bool]
>nextUniverse rule universe = tuplesToUniverse rule $ universeToTuples universe
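
For instance, working one generation of rule 30 by hand (my own check):

nextUniverse 30 [False, True, False]

[True,True,True]

The expanded universe is [False, False, True, False, False], the tuples
index bits 1, 2, and 4 of the rule number, and all three of those bits
are set in 30.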

Now, let's generalize it:

>genUniverses :: Int -> Int -> Int -> [[Bool]]
>genUniverses rule width count = take count
>   ( iterate ( nextUniverse rule ) ( expandUniverse ( width `div` 2 ) initial_universe ) )

(You could also use a fold, and I'm sure there are lots of other ways to
do it, but iterate seems to work fine.)

And now, witness the unfolding of a universe! Note that the parameters
that go to genUniverses are the rule number, the width for display, and the number of generations:

putStr $ unlines $ map stringifyUniverse $ genUniverses 222 19 10

          #
         ###
        #####
       #######
      #########
     ###########
    #############
   ###############
  #################
 ###################

In general, a width of twice the number of generations minus one will show
all the transitions we are interested in; you could consider the diagonal
area above to be the "light cone" of events causally connected to that
single point (although some rules will generate True cells outside of
that "light cone" based on the other False values in the initial universe). Let's make a helper function to choose a width for us:

>showRule rule gens =
>   putStr $ unlines $ map stringifyUniverse $
>       genUniverses rule ( gens * 2 - 1 ) gens

Let's try a few of the other rules:

showRule 252 15
              #
              ##
              ###
              ####
              #####
              ######
              #######
              ########
              #########
              ##########
              ###########
              ############
              #############
              ##############
              ###############

showRule 78 15
               #
              ##
             ###
            ## #
           ### #
          ## # #
         ### # #
        ## # # #
       ### # # #
      ## # # # #
     ### # # # #
    ## # # # # #
   ### # # # # #
  ## # # # # # #
 ### # # # # # #

And finally, my all-time favorite, which draws a Sierpinski gasket:

showRule 82 32
                                #
                               # #
                              #   #
                             # # # #
                            #       #
                           # #     # #
                          #   #   #   #
                         # # # # # # # #
                        #               #
                       # #             # #
                      #   #           #   #
                     # # # #         # # # #
                    #       #       #       #
                   # #     # #     # #     # #
                  #   #   #   #   #   #   #   #
                 # # # # # # # # # # # # # # # #
                #                               #
               # #                             # #
              #   #                           #   #
             # # # #                         # # # #
            #       #                       #       #
           # #     # #                     # #     # #
          #   #   #   #                   #   #   #   #
         # # # # # # # #                 # # # # # # # #
        #               #               #               #
       # #             # #             # #             # #
      #   #           #   #           #   #           #   #
     # # # #         # # # #         # # # #         # # # #
    #       #       #       #       #       #       #       #
   # #     # #     # #     # #     # #     # #     # #     # #
  #   #   #   #   #   #   #   #   #   #   #   #   #   #   #   #
 # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Wow!

Followup: I had mentioned that my code might have a bug, because some pictures, such as this one:

showRule 29 11
          #
######### ###########
#         #
######### ###########
#         #
######### ###########
#         #
######### ###########
#         #
######### ###########
#         #

look different from the way Wolfram's book and web site show them, which is like this:

          #
######### ###########
          #
######### ###########
          #
######### ###########
          #
######### ###########
          #
######### ###########
          #

After a little investigation this seems to be because Wolfram's implementation wraps around; the left neighbor of the leftmost cell of a given universe is taken from the rightmost cell, and vice-versa, while my implementation pretends that there is always more empty space available to the left and right.

Whether you consider this a bug or not is up to you. The wraparound behavior is probably considered more "canonical." You can compare the results from my program to the pictures at Wolfram's MathWorld site here. If you replace my universeToTuples function with this one:

universeToTuples :: [Bool] -> [(Bool, Bool, Bool)]
universeToTuples [] = []
universeToTuples universe = tupleize $ wrapUniverse universe
   where
        wrapUniverse xs = ( last xs ) : ( xs ++ [ head xs ] )
       tupleize xs =
           if length xs > 3 then tuple3 xs : tupleize ( tail xs )
               else [ tuple3 xs ]
       tuple3 xs = ( xs !! 0, xs !! 1, xs !! 2 )

you will get the wraparound behavior.

Thanks for reading and as always, I appreciate your comments.

17 December 2006

From Bits to Cells: Simple Cellular Automata in Haskell, Part One

The little Haskell toy I wrote to simulate a dot-matrix printhead was useful in one sense: it helped me learn a little more Haskell! I couldn't help noticing that the results look like the transformations you get when applying simple cellular automata rules. So my next experiment is to implement those rules, and see what else I can learn from the results.

What follows is Literate Haskell, but there is no main program to run; it is for you to experiment with interactively, if you like.

Some Haskell code to implement simple cellular automata (CA) rules:
part one, round one.

The simplest type of one-dimensional CA behaves as follows:

- a cell may be on or off (one or zero).

- time ticks by in discrete intervals.

- at each tick, the state of each cell is updated using a function that
maps the current state of the cell to the left, the cell itself, and
the cell to the right to the new cell value.

Because this value-generating function depends on 3 bits of input, a
given CA rule actually consists of 2^3 = 8 possible mappings. The
mappings can be represented by the binary values 0b000 to 0b111
(decimal 0..7). By Wolfram's numbering convention for these simple
CA, we actually count down from 7 to 0. So, let's say we have a rule
that maps each possible set of left, center, and right values:

0b111, 0b110, 0b101, 0b100, 0b011, 0b010, 0b001, 0b000

to the new cell value:

0b0, 0b0, 0b0, 0b1, 0b1, 0b1, 0b1, 0b0

We can treat the resulting number, in this case 0b00011110, as the
rule number. This is rule 30 in Wolfram's schema; there are 2^8,
or 256, possible rules, numbered from 0 to 255.
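
(Checking the arithmetic: 0b00011110 = 16 + 8 + 4 + 2 = 30.)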

First, let's make a function that, given a rule number and three
values (left, center, and right), returns the new cell state. We
turn the three values into an index to look up the appropriate bit
in the rule number.

>import Data.Bits

>genNextBit :: Int -> ( Bool, Bool, Bool ) -> Bool
>genNextBit rulenum ( left, center, right ) = rulenum `testBit` idx
>    where idx = ( if left then (4::Int) else (0::Int) ) .|.
>           ( if center then (2::Int) else (0::Int) ) .|.
>           ( if right then (1::Int) else (0::Int) )

Hmmm... it is lacking a certain elegance, and I don't quite like
the way the indentation rules work in this case, but let's test it
with a function that generates the 8 rule indices in the form of tuples:

>genRuleIndices :: [( Bool, Bool, Bool )]
>genRuleIndices = [ ( (val `testBit` 2), (val `testBit` 1),
>                   ( val `testBit` 0) ) | val <- [(7::Int),
>                   (6::Int)..(0::Int)] ]

Now if we write:

genRuleIndices

we get:

[(True,True,True),
(True,True,False),
...
(False,False,True),
(False,False,False)]

and if we write:

map (genNextBit 30) genRuleIndices

this expression curries us a function (mmm... curry!) which takes
a starting state and applies rule 30, then feeds it the possible
starting states. The result is:

[False,False,False,True,True,True,True,False]

That looks like the bit pattern for rule 30.

Just for fun, let's confirm by making a function that will
translate a list of output bit values back into a rule number.
The signature should look like this:

sumBitVals :: [Bool] -> Int

And we want the list

>test_bit_vals = [False,False,False,True,True,True,True,False]

to map back to 30. Take a shot at it yourself; you might find
that the result is instructive. I'll wait.

(Musical interlude).

Did you try it? Let's look at some possible implementation strategies.
We could make a list of the powers of two and then do some list
manipulation to get the powers of two that correspond to our one-bits
summed up:

>powers_of_two_8_bits = reverse ( take 8 (iterate (2 *) 1) )

[128,64,32,16,8,4,2,1]

(An aside: since this is generated by a function that takes no
parameters at runtime, it would seem like a sufficiently smart
compiler could generate this list and stick it in the resulting
program so that no run-time calculation is required to generate
it. Does GHC do this? I don't know.)

Anyway, our implementation. We can tuple up our powers of two with
our bit values:

>tups = zip powers_of_two_8_bits test_bit_vals

[(128,False),(64,False),(32,False),(16,True),(8,True),(4,True),(2,True),(1,False)]

Then we can turn this back into a list of only the powers of two that are
"on," and sum the results:

sum (map (\ tup -> if snd tup then fst tup else 0) tups )

30

It seems like this should be simpler still. I looked in the standard
library for a function to filter one list using a list of boolean
values as a comb, but didn't find one. Let's write it ourselves:

>combListWithList :: [a] -> [Bool] -> [a]
>combListWithList [] [] = []
>combListWithList ls comb =
>   if (head comb) then (head ls) : combListWithList (tail ls) (tail comb)
>       else combListWithList (tail ls) (tail comb)

combListWithList powers_of_two_8_bits test_bit_vals

[16,8,4,2]

That seems good, although it doesn't express the idea that the lists
have to be the same length. It still seems amazing to me that I can
reel off functions like that in Haskell and have them work right the
first time! Now we can produce our final function:

>sumBitVals :: [Bool] -> Int
>sumBitVals ls = sum $ combListWithList powers_of_two_8_bits ls

sumBitVals test_bit_vals

30

There is probably an elementary way to implement the above function
using zipWith instead of my combList, but I'll leave that to you;
leave a comment if you figure it out!
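
For the impatient, here is one possible zipWith version, offered as a
sketch for you to verify rather than as the definitive answer. Note that
unlike combListWithList, zipWith simply stops at the end of the shorter
list:

sumBitVals' :: [Bool] -> Int
sumBitVals' ls = sum (zipWith (\ power flag -> if flag then power else 0)
    powers_of_two_8_bits ls)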

As another aside, here is another function I came up with to map the
bit values back to a number; it doesn't rely on a specific length.

>foldBitVals :: [Bool] -> Int
>foldBitVals ls = snd (foldr
>   (\ flag tup -> if flag then ((fst tup) * 2, (snd tup) + (fst tup))
>                           else ((fst tup) * 2, snd tup) )
>   (1, 0) ls )

foldBitVals test_bit_vals

30

Take a moment to understand the above function; it is a little strange.
I perform a fold using a tuple. The first element is the power of two,
so it keeps track of our place in the list, and the second is an
accumulator as we add up our values. This is a trick for stashing
a state that is more complex than a single value into a fold.

Anyway, that was a long diversion, so let's end for now. Next time we'll look at ways to represent our CA universe as it evolves by applying our rules. As always, your comments are welcome!

14 December 2006

The Revised Dot-Matrix Printhead

It looks like this blog has been unceremoniously removed from Planet Scheme. I guess that's only fair, since I haven't written anything specifically about Scheme in a while. Best of intentions, and all that. This blog probably belongs on "Planet Programming Language Dilettante."

Anyway, I received some helpful feedback on the short program I wrote to print bitmaps of binary numbers. Jason Creighton in particular took a shot at a rewrite, and the result was highly enlightening. I also received two specific pieces of advice: use short function names, and don't write a comprehension like [a | a <- [a..b]] when you can just write [a..b]. Oops. Check, and thanks!

(Semi) or (Ill) Literate Haskell follows:

Haskell Binary Printhead Toy, Second Round

>import Data.Bits

Our functions to generate lists of signed and unsigned integers of
numbits length, revised with names as supplied by Jason and
error-catching as supplied by yours truly:

>unsignedInts :: Int -> [Int]
>unsignedInts bits | (1 <= bits) && (bits <= 16) =
>   [(0::Int)..(bound::Int)]
>   where bound = 2 ^ bits - 1
>unsignedInts bits = error "expected a value from 1 to 16"

>signedInts :: Int -> [Int]
>signedInts bits | (1 <= bits) && (bits <= 16) =
>   [(- (bound::Int)) - 1..(bound::Int)]
>   where bound = 2 ^ ( bits - 1 ) - 1
>signedInts bits = error "expected a value from 1 to 16"
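
A quick taste of both (my own GHCi check):

unsignedInts 3

[0,1,2,3,4,5,6,7]

signedInts 3

[-4,-3,-2,-1,0,1,2,3]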

Jason came up with a considerably simpler way to map the bits of
the binary numbers to the characters for printing:

>boolToChar :: Bool -> Char
>boolToChar True = '#'
>boolToChar False = ' '

I'm adopting that suggestion enthusiastically, since I don't think
code can get much simpler than that!

Jason then proceeded to turn my dog food into a much more flavorful
wafer-thin after-dinner mint, leaving me struggling to understand
just what he did:

>bitPicture :: (Int -> [Int]) -> Int -> [Char]
>bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])
>  where
>    stringAtIndex = map boolToChar . bitsAtIndex
>    bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

Wow. First off, Jason has supplied a type signature which expresses
what the function actually _does_. The appropriate integer generator
(the signed or unsigned version) comes in as a parameter; that's what
(Int -> [Int]) means. We next get an integer argument (the number of
bits), and finally, we spit out a list of chars.

Jason proceeds to use a couple of where clauses to set up some
simplifying expressions. Let's look at the first one:

stringAtIndex = map boolToChar . bitsAtIndex

OK, this is slightly mind-twisting right off the bat. In the world
of C and Java you can't write things like that. The dot is function
composition expressed naturally; instead of something like

stringOfBoolsFromBitsAtIndex idx = map boolToChar (bitsAtIndex idx)

we can just get Haskell to chain the functions together for us.
No fuss, no muss. But this function makes a forward reference to
bitsAtIndex, which we haven't seen yet (completely illegal in C
and Java). Let's make sure we understand what bitsAtIndex does:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

That's definitely more dense than the way I wrote it (and that's a
good thing). To break down the list comprehension, the expression on
the right hand side of the vertical bar says that we feed the supplied
integer generator function the number of bits to use. That gives us a
list. From that list we will draw our "n"s and test bit idx on them.
By doing it this way, Creighton has pulled an act of Haskell
sleight-of-hand: instead of generating the bitmaps for each integer value,
like I did, and then using "transpose" to rearrange them, this version
accesses the "columns" of the table directly! With those two
expressions, we've got our string representing all the bits at a
given index _for the whole supplied set of integers_.

Wow. I feel like an audience member watching a magician smash up my
watch and then give it back to me rebuilt as a marvelous whirring
gyroscope.

There's more, but this part is relatively simple:

bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])

We already know what unlines does; it turns our list of strings into a
properly line-broken string. We've just described how stringAtIndex works,
and map stringAtIndex (reverse [0..(bits-1)]) just feeds stringAtIndex a
set of indices mapped from high to low, so that our most significant bit
comes out leftmost. But what does "$" do?

My reference at zvon.org says this is the "right-associating infix
application operator (f $ x = f x), useful in continuation-passing style."
I'm not sure I fully understand, and I'm not going to go into the meaning
of "continuation-passing style" right now, but I guess it saves us from
having to write:

unlines (map stringAtIndex (reverse [0..(bits-1)]))

to get the right order of operations, which is nice. So I'll try to
incorporate it into my own Haskell.
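
(A tiny standalone illustration: sum $ map (* 2) [1, 2, 3] means the
same thing as sum (map (* 2) [1, 2, 3]); both give 12.)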

Anyway, let's try it. We need to drive the function:

>main :: IO ()
>main = do
>   print "Unsigned values (6 bits)"
>   putStr $ bitPicture unsignedInts 6
>   print "Signed values (6 bits)"
>   putStr $ bitPicture signedInts 6

After using GHCi's :cd and :load command to read this file, just
type:

main

"Unsigned values (6 bits)"
                                 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

"Signed values (6 bits)"
 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Excellent. I laughed; I cried; I learned something today. I hope you did, too. Thanks, Jason! And thanks to the other people who commented!

Followup: for those following along the discussion in the comments, here is an alternate implementation of the bitPicture function:

bitPicture intGen bits =
    unlines $ map stringAtIndex $ [bits - 1, bits - 2..0]
    where stringAtIndex = map boolToChar . bitsAtIndex
          bitsAtIndex idx = map (flip testBit idx) (intGen bits)

Note that using [bits - 1, bits - 2..0] in the first line allows us to avoid the list reversal, so that seems like a clear win. The method of expressing bitsAtIndex above is one of several that were suggested. The idea behind the flip is that each element drawn from intGen bits should become the first argument to testBit, not the second. You can also express this by treating testBit as an infix operator:

bitsAtIndex idx = map (`testBit` idx) (intGen bits)

Both of these alter the behavior of Haskell's automatic currying so that partial application supplies the second parameter instead of the first, leaving the first as the sole parameter of the resulting curried function; in Common Lisp, you might do this explicitly using rcurry. And don't forget the original, using a list comprehension:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

Assuming there is not a specific concern about efficiency, it seems like the choice between these versions might be a matter of style. Thanks again to everyone who made suggestions.

12 December 2006

The Dot-Matrix Printhead: a Haskell Toy

The body of this article is a literate Haskell program. You can extract the text, then save it as a file with the extension .lhs, then execute it using GHCi.

This is a Haskell toy to visualize twos-complement binary numbers.

In my previous articles on the subject of integer division, I alluded
to underflow and overflow and the "weird number" -- well, now you
can see them!

Almost 30 years ago I wrote code on my Radio Shack TRS-80 to make my
dot-matrix printer spit out bitmaps of binary numbers, which made
pretty recursive patterns. This program commemorates that long-ago
geekery from my early teenage hacking days.

First, we need the bits library:

>import Data.Bits

We'll also need the list library later:

>import Data.List

Let's simulate n-bit binary numbers in both a signed and unsigned
interpretation. Given a number of bits, we want a function to make
a list of the possible values. For example, in unsigned form 8 bits
will give us a list of the values 0..255; in signed form, it will be
an asymmetrical list from -128..127. Note that this function generates
the whole list, so if you give it a numbits value higher than, say,
16, you might regret it.

Our calculation fails for values of numbits <= 0, so we put in a
guard case for that. We also want to discourage creating very large
lists, so we put in a guard case for numbits > 16.

>gen_n_bit_nums_unsigned numbits | numbits <= 0 = []
>gen_n_bit_nums_unsigned numbits | numbits > 16 = error "too many bits!"
>gen_n_bit_nums_unsigned numbits =
>   [ n | n <- [(0::Int)..((2::Int)^numbits - 1)] ]

>gen_n_bit_nums_signed numbits | numbits <= 0 = []
>gen_n_bit_nums_signed numbits | numbits > 16 = error "too many bits!"
>gen_n_bit_nums_signed numbits =
>   [ n | n <- [-(2::Int)^(numbits - 1)..((2::Int)^(numbits - 1) - 1)] ]

To test this we can execute:

gen_n_bit_nums_unsigned 4
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

gen_n_bit_nums_signed 4
[-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]

That looks about right.

Now we want a function to access the bits of those numbers from most
significant to least significant and build a list of corresponding
boolean values. This would be expressed more naturally if enumFromTo
generated descending sequences. I'm sure there's a better way, but for
now I will just reverse the resulting list:

>rightmost_bits num_bits val | num_bits > 16 = error "too many bits"
>rightmost_bits num_bits val | num_bits <= 0 = error "too few bits"
>rightmost_bits num_bits val =
>   reverse [ testBit (val::Int) bit_idx | bit_idx <- [0..(num_bits - 1)] ]

Once we have a function which will do it with one number, doing it with a
whole list of numbers is very easy. We will make a curried function and
use map. Currying in Haskell is very easy, which makes sense given that
the operation, "to curry" and the language, Haskell, are both named after
Haskell Curry!

Our rightmost_bits function takes two arguments, but we want to make a
new function which we can hand off as an argument to map, along with a
list. This new function will only take one parameter. We can ask GHCi
what it thinks the type of the rightmost_bits function is:

:type rightmost_bits

It says:

rightmost_bits :: Int -> Int -> [Bool]

We can make a new function out of it by doing partial application:
instead of supplying all its arguments, we just supply the first one.
(Below I actually apply gen_rightmost_bits, the mapping wrapper defined
a little further down, but the idea is the same.)

gen_rightmost_bits 4

If you do this in GHCi, you'll get an error message because the
result is not printable (GHCi can't apply show to the result).
But you can bind it:

let bit_generator_4 = gen_rightmost_bits 4

And then ask GHCi what its type is:

:type bit_generator_4
bit_generator_4 :: [Int] -> [[Bool]]

Take a moment to make sure you understand what we've done here:
Haskell has made us a curried function which takes a list of values
instead of a single value, and generates a list of lists. This
seems pretty close to magic! Let's make another function that
takes the number of bits and a list, generates the curried
function, then maps the curried function over the list:

>gen_rightmost_bits num_bits ls =
>   map (rightmost_bits num_bits) ls

As you can see, working with curried functions is completely
effortless in Haskell. Here's a general function to generate
a list of all the bit values for unsigned numbers of num_bits
size, along with its signed counterpart:

>gen_all_bits_unsigned num_bits =
>   gen_rightmost_bits num_bits (gen_n_bit_nums_unsigned num_bits)

>gen_all_bits_signed num_bits =
>   gen_rightmost_bits num_bits (gen_n_bit_nums_signed num_bits)

If we execute:

gen_all_bits_unsigned 4

We get back the following (I have reformatted the output for clarity):

[[False, False, False, False],
[False, False, False, True],
[False, False, True, False],
[False, False, True, True],
[False, True, False, False],
[False, True, False, True],
[False, True, True, False],
[False, True, True, True],
[True, False, False, False],
[True, False, False, True],
[True, False, True, False],
[True, False, True, True],
[True, True, False, False],
[True, True, False, True],
[True, True, True, False],
[True, True, True, True]]

This is recognizably a table of the unsigned binary values 0..15.
This approach doesn't seem very "Haskell-ish" yet -- instead of
generating the combinations as needed, I'm producing them all
ahead of time. It also feels to me like I'm pushing uphill on
the syntax, and having to use parentheses to force the order
of evaluation. We can think more about that later, but let's
get our pretty-printing working first.

How can I turn all those boolean values into a string I can print?
Well, we can take advantage of the fact that in Haskell, strings can
be treated as lists, and the same methods apply. This is not efficient,
but it allows us to use some of the standard functional techniques.
One of these is "fold."

Fold is a technique for boiling down a list into some kind of summary
form. Just what form this summary takes is up to you, since you can
supply your own function and a starting value. For example, to sum
the elements in a list, I could use foldl like this:

foldl (\ x y -> x + y) 0 [0, 1, 2]

The first parameter to foldl is a function. In this case I'll use
a lambda expression, which gives me a function I won't bother to
name. The second parameter is the starting value. The starting value
and the next item in the list are repeatedly passed to the function,
which accumulates a value. In this case, the final result is 3.

I can give foldl an empty string as the starting accumulator value
and my function can concatenate strings onto it:

>stringify_vals =
>   foldl (\ x y -> if y then x ++ "#" else x ++ " ") ""

Now you might want to try to feed our list of lists to this
function:

stringify_vals (gen_all_bits_unsigned num_bits)

If you do that, though, you'll get a type error like this:

"Couldn't match expected type `Bool' against inferred type `[Bool]'"

The reason can be seen if we examine the types:

:type (gen_all_bits_unsigned 4)
(gen_all_bits_unsigned 4) :: [[Bool]]

:type stringify_vals
stringify_vals :: [Bool] -> [Char]

We have a list of lists of bools, but we want to feed our
stringification function a list of bools. To apply each
element of (gen_all_bits_unsigned 4) to our stringification
function, we can use map again:

>stringify_all_unsigned_vals num_bits =
>   map stringify_vals (gen_all_bits_unsigned num_bits)

>stringify_all_signed_vals num_bits =
>   map stringify_vals (gen_all_bits_signed num_bits)

This will give us a list of strings, one for each list of boolean
values. (I've reformatted the output).

["    ",
"   #",
"  # ",
"  ##",
" #  ",
" # #",
" ## ",
" ###",
"#   ",
"#  #",
"# # ",
"# ##",
"##  ",
"## #",
"### ",
"####"]

That's nice: we can start to see our recursive structure! But
we really want that value to be turned into a single printable
string. To achieve that, we can use the unlines function.

>prettyprint_unsigned num_bits =
>   unlines (stringify_all_unsigned_vals num_bits)

>prettyprint_signed num_bits =
>   unlines (stringify_all_signed_vals num_bits)

When we interpret

prettyprint_unsigned 3

we see a single string with newline characters escaped, like so:

"   \n  #\n # \n ##\n#  \n# #\n## \n###\n"

This will give the results we want when we pass it to putStr, like so:

putStr (prettyprint_unsigned 8)

        #
       #
       ##
      #
      # #
      ##
      ###
     #
     #  #
     # #
     # ##
     ##
     ## #
     ###
     ####

    (etc.)

Hmmm. That's nice, but it would be really cool if the results
were rotated, so that we could see what it would look like if
a printhead moving horizontally was printing out these values.
To do this, we can actually start with our intermediate list of
lists. If our four-bit numbers give us

[[False, False, False, False],
[False, False, False, True],
[False, False, True, False],
etc.

we can see the values the other way using transpose. Let's write
rotated versions of gen_all_bits_unsigned and gen_all_bits_signed:

>gen_all_bits_unsigned_rot num_bits =
>   transpose
>   (gen_rightmost_bits num_bits (gen_n_bit_nums_unsigned num_bits))

>gen_all_bits_signed_rot num_bits =
>   transpose
>   (gen_rightmost_bits num_bits (gen_n_bit_nums_signed num_bits))
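
(If transpose is unfamiliar, a quick check of what it does:
transpose [[1,2,3],[4,5,6]] gives [[1,4],[2,5],[3,6]] -- rows become
columns.)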

And make a new stringify function to play with:

>stringify_all_unsigned_vals_rot num_bits =
>   map stringify_vals (gen_all_bits_unsigned_rot num_bits)

>prettyprint_unsigned_rot num_bits =
>   unlines (stringify_all_unsigned_vals_rot num_bits)

Now

putStr (prettyprint_unsigned_rot 6)

will give us:

                                 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Now that's what I'm talking about!

We can use the signed version to see the interesting transition
point between negative and positive values:

>stringify_all_signed_vals_rot num_bits =
>   map stringify_vals (gen_all_bits_signed_rot num_bits)

>prettyprint_signed_rot num_bits =
>   unlines (stringify_all_signed_vals_rot num_bits)

Now

putStr (prettyprint_signed_rot 6)

will give us:

 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Now let's give our program a main function using the IO monad and
have it print out several of our results:

>main :: IO ()
>main = do
>   print "Unsigned values (8 bits)"
>   putStr (prettyprint_unsigned 8)
>   print "Signed values (8 bits)"
>   putStr (prettyprint_signed 8)
>   print "Unsigned values (6 bits) rotated"
>   putStr (prettyprint_unsigned_rot 6)
>   print "Signed values (6 bits) rotated"
>   putStr (prettyprint_signed_rot 6)

After using GHCi's :cd and :load command to read this file, just
type:

main

I would invite any readers to play around with the code, and if you can make it more concise and/or more "Haskellish," please leave a comment.

Now if only we could get Haskell to generate the appropriate sound! My dot-matrix printer was loud. But that's a task for another day.

Update: there is a followup article to this one which develops the code further; see http://praisecurseandrecurse.blogspot.com/2006/12/revised-dot-matrix-printhead.html. Also, make sure to take a look at the comments on both this posting and the followup.

08 December 2006

The Divisive Aftermath

Antti-Juhani pointed out to me that Haskell provides the quot and rem pair of operations in addition to div and mod. The difference is that while both pairs satisfy the identity involving division, multiplication, and the remainder, div and mod round down (towards -infinity), while quot and rem round towards zero (truncating the fractional part).
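
A quick GHCi check (my own) makes the difference concrete:

(-7::Int) `div` (2::Int)

-4

(-7::Int) `mod` (2::Int)

1

(-7::Int) `quot` (2::Int)

-3

(-7::Int) `rem` (2::Int)

-1

Both pairs satisfy the identity: (-4) * 2 + 1 == -7, and
(-3) * 2 + (-1) == -7.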

That's great! (Besides the fact that I displayed my ignorance in public, that is). It means I really can use GHCi to model my functions for testing! And I was getting worked up to write a C version of Plauger's div which rounded away from zero for negative quotients so I could have some guaranteed-consistent behavior. Instead I can use the C standard function div and Haskell's quot and rem.

So this raises the question: is there a good reason to prefer one method of integer division rounding to another? Yes. It comes down to this: do you want your algorithms to behave smoothly across the origin, or to behave symmetrically around the origin?

Let's briefly consider some Haskell code for mapping a range of fractions onto a range of integers again.

let gen_nths denom =
    [ (numer, denom::Int) | numer <- [ (-denom::Int)..(denom::Int) ] ]

let div_map m1 m2 = map (\ tup -> fst tup *
    ( ( (m2::Int) - (m1::Int) ) `div` 2 ) `div` snd tup )

div_map (-10) 10 ( gen_nths 3 )

[-10, -7, -4, 0, 3, 6, 10]

Notice something about those values: the distances between them follow a consistent pattern. Reading the list left to right, we have distances of 3, 3, 4, 3, 3, 4. This pattern repeats. It is smooth across the origin.

Now let's change our div_map function to use quot and rem:

let quot_map m1 m2 = map ( \tup -> fst tup *
    ( ( (m2::Int) - (m1::Int) ) `quot` 2) `quot` snd tup )

quot_map (-10) 10 ( gen_nths 3 )

[-10, -6, -3, 0, 3, 6, 10]

Looking at the differences between values, from left to right, we find that they are 4, 3, 3, 3, 3, 4. The function is symmetric around the origin.

Does this matter in practice? Of course! Let's say that we have a mapping using a non-origin-crossing version of our quot_map function:

let gen_non_negative_nths denom =
    [ (numer, denom::Int) | numer <- [ (0::Int)..(denom::Int) ] ]
let quot_map_2 m1 m2 = map ( \tup -> fst tup *
    ( (m2::Int) - (m1::Int) ) `quot` snd tup )

quot_map_2 0 20 ( gen_non_negative_nths 6 )

[0, 3, 6, 10, 13, 16, 20]

Shiny. The intervals go 3, 3, 4, 3, 3, 4. But if we shift the values:

let gen_non_positive_nths denom =
    [ (numer, denom::Int) | numer <- [ (-denom::Int)..(0::Int) ] ]

quot_map_2 0 20 ( gen_non_positive_nths 6 )

[-20, -16, -13, -10, -6, -3, 0]

The intervals go 4, 3, 3, 4, 3, 3 -- they are backwards. In other words, the effect of integer rounding on the values of the codomain changes if you shift the domain of the numerator across the origin: the ordering of the pattern is reversed.

If we do the same thing using div:

let div_map_2 m1 m2 =
    map (\ tup -> fst tup * ( (m2::Int) - (m1::Int) ) `div` snd tup )

div_map_2 0 20 ( gen_non_negative_nths 6 )

[0, 3, 6, 10, 13, 16, 20]

div_map_2 0 20 ( gen_non_positive_nths 6 )

[-20, -17, -14, -10, -7, -4, 0]

we get a pattern of intervals (the effect of rounding) which remains the same: 3, 3, 4, 3, 3, 4.

Now, smoothness across the origin is what I want, at least for the kinds of functions I am working on now. But your problem domain may lend itself to writing functions which involve rotations, or mapping values across the origin: in that case, you're going to want the symmetry. The important thing is to know what strategy you are using and apply it consistently. I'm really impressed that Haskell gives me so many interesting options.

07 December 2006

The Division Bell Tolls for Me, Part Four (Conclusion)

Well, this was initially going to be a three-part article, but things have become more interesting. The situation is not unlike picking up a log in the forest and finding all kinds of nifty crawlies living under it. "Interesting" to certain geeky people like myself, at least! It's about to get geekier. If you are allergic to ANSI or ISO standards, you'd better stop reading now. But I claim the payoff is worth it, if you can make it through to the end. At least, for some value of "worth it."

In the last installment we were exploring the behavior of integer division in Haskell. I want to get back to C now. We learned that the behavior of integer division involving negative results is not strictly defined for C. The behavior of % (mod) is also not very strictly defined.

Let's start with Kernighan and Ritchie's The C Programming Language, Second Edition (I'm not even going to try to figure out rules for pre-ANSI/ISO C). K&R tell us on p. 10 that

...integer division truncates; any fractional part is discarded.

That isn't very detailed, but on p. 41 they tell us

the direction of truncation for / and the sign of the result for % are machine-dependent for negative operands, as is the action taken on overflow or underflow.

OK, but that doesn't describe exactly what the options are. Things get a little bit more detailed on p. 205, where we read that no guarantees are made about the sign of the remainder!

...if the second operand is zero, the result is undefined. Otherwise, it is always true that (a/b)*b + a%b is equal to a. If both operands are non-negative, then the remainder is non-negative and smaller than the divisor; if not, it is guaranteed only that the absolute value of the remainder is smaller than the absolute value of the divisor.

This isn't actually internally consistent, because if the sign of the remainder is not correct, then the identity for division doesn't work for negative values unless the sign of the quotient is allowed to be incorrect! I've never seen it suggested anywhere else that division could be that loosely defined. But K&R isn't the formal standard, so let's move on.

The reason for the "looseness" that exists in the C standard, of course, is that the standard was originally written to, as much as possible, codify existing practice, and C has a strong bias towards efficiency. Since (apparently) different hardware division algorithms had different behavior, the standards committee did not feel that it was appropriate to require existing implementations to change behavior. Doing so might have required existing architectures to have to replace hardware division with software division, which could have inflicted an enormous efficiency cost.

According to ISO/IEC 9899:1990 (known as C90):

When integers are divided and the division is inexact, if both operands are positive the result of the / operator is the largest integer less than the algebraic quotient and the result of the % operator is positive. If either operand is negative, whether the result of the / operator is the largest integer less than or equal to the algebraic quotient or the smallest integer greater than or equal to the algebraic quotient is implementation-defined, as is the sign of the result of the % operator. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a.

So we know that there are two options for rounding. In order to maintain the identity involving integer division and mod, though, the value of mod has to be consistent with the rounding strategy, even if the sign does not have to be consistent.

There is a back door that lets us get to defined behavior. Apparently the div function does have strictly defined rounding. In section 7.10.6.2 of the standard, we read:

If the division is inexact, the resulting quotient is the integer of lesser magnitude that is the nearest to the algebraic quotient.

That's interesting -- this is rounding towards zero. The div function returns a structure of type div_t that contains both the integer quotient and integer remainder. It is provided in the <stdlib.h> header, available in C++ as <cstdlib>. In his book The Standard C Library, P. J. Plauger provides an implementation for div. It looks like this:

div_t (div)(int numer, int denom)
    {        /* compute int quotient and remainder */
    div_t val;
    val.quot = numer / denom;
    val.rem = numer - denom * val.quot;
    if (val.quot < 0 && 0 < val.rem)
        {    /* fix remainder with wrong sign */
        val.quot += 1;
        val.rem -= denom;
        }
    return (val);
    }

We can see what he is doing: when the quotient is negative and the division was inexact, he shifts the quotient up towards zero and compensates by subtracting the denominator from the remainder. Note that he does not use the built-in % facility, presumably since its behavior does not have very strong guarantees placed on it.
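
Haskell, incidentally, bundles the quotient and remainder together just
as div_t does. A quick GHCi comparison (my own check) shows quotRem
matching the behavior C's div guarantees, and divMod matching div and
mod:

(-7::Int) `quotRem` (2::Int)

(-3,-1)

(-7::Int) `divMod` (2::Int)

(-4,1)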

C: A Reference Manual (5th Edition) by Harbison and Steele seems to indicate that the semantics are a little more rigorously defined. We read there that

For integral operands, if the mathematical quotient of the operands is not an exact integer, then the fractional part is discarded (truncation toward zero). Prior to C99, C implementations could choose to truncate toward or away from zero if either of the operands were negative. The div and ldiv library functions were always well defined for negative operands.

That seems pretty unambiguous, but then on page 419, when describing the div and ldiv function, the authors write

The returned quotient quot is the same as n/d, and the remainder rem is the same as n%d.

But that ignores the possibility of a pre-C99 implementation where / rounds away from zero for negative values, as does GHCi in my tests. So again we have a lack of rigorous internal consistency. It is worth noting here that K&R don't make any extra statements about the required rounding behavior of div and ldiv, and since K&R is still considered the "bible" by many C programmers, the guarantees on the div and ldiv function may not be very well-known -- or properly implemented.

Does C99 clarify this at all? ISO/IEC 9899:1999(E), Second Edition 1999-12-01, tells us that "the result of the / operator is the algebraic quotient with any fractional part discarded" and a footnote indicating that this is often called "truncation towards zero."

How about C++? ISO/IEC 14882, Second Edition (2003-10-15) does not actually say how rounding is performed, although it says that "If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined." There is a footnote indicating that "According to work underway toward the revision of ISO C, the preferred algorithm for integer division follows the rules defined in the ISO Fortran standard" (rounding towards zero). When we try to look up information about div and ldiv, we find that the C++ standard just refers us to the Standard C library.

OK, all the standards language is giving me a headache; let's take a look at some code. To begin with, let's confirm the way my C implementation does rounding of integer quotients. We took a look at 6/5, -6/5, 5/6, and -5/6 in Haskell; let's look at the same quotients in C:

int val_1 = 6;
int val_2 = 5;
int val_3 = -6;
int val_4 = -5;
int result_1 = val_1 / val_2;
int result_2 = val_3 / val_2;
int result_3 = val_2 / val_1;
int result_4 = val_4 / val_1;
printf("result_1 = %d, result_2 = %d, result_3 = %d, result_4 = %d\n",
    result_1, result_2, result_3, result_4);

This gives me:

result_1 = 1, result_2 = -1, result_3 = 0, result_4 = 0

Yep, I'm getting rounding towards zero for both positive and negative values. Next, some Haskell. Let's bring back our algorithm from part 2:

let gen_nths denom =
    [(numer, denom::Int) | numer <- [(-denom::Int)..(denom::Int)] ]

let div_map m1 m2 = map (\tup -> (fst tup + snd tup) *
    (((m2::Int) - (m1::Int)) `div` 2) `div` snd tup + m1)

div_map (-10) 10 (gen_nths 3)

[-10,-7,-4,0,3,6,10]

And now let's write up my C algorithm again. This time we'll use int instead of long since we don't need big numbers (although on my target, ints are actually 32 bits long).

unsigned int denom = 3;
int numer;
for ( numer = -3; numer <= 3; numer++ )
{
    int quot = numer * 10 / denom;
    printf("quotient %d\n", quot);
}

Go ahead and try it on your platform.

Whoah! What happened? You might have noticed something very odd about the first few quotients. They are monstrously large! On my target I get:

quotient 1431655755
quotient 1431655758
quotient 1431655762
quotient 0
quotient 3
quotient 6
quotient 10

What happened is that I injected a bug to point out yet another possible way your C code can fail catastrophically. This behavior can surprise even experienced programmers. Note that since my denominator is always a positive value, I declared it as an unsigned int instead of an int. So what went wrong?

Let's look at the first example: -3 * 10 / 3 yielding 1431655755. The behavior you are looking at is actually mandated by the C standard. When mixing signed and unsigned types, the compiler is required to promote (that is, convert) the actual type of the calculation to an unsigned type. So, internally, when we are performing the calculation -30 (signed) / 3 (unsigned), the compiler is required to notice that both a signed and an unsigned value are presented to the integer division operator. The 32-bit twos-complement representation of -3 is 0xFFFFFFFD (the conversion to an unsigned intermediate value changes the meaning, not the representation). I multiply this value by 10, yielding 0xFFFFFFE2, and then divide it by 3, yielding 0x5555554B, or a decimal value of 1431655755. The compiler sticks this unsigned value right into the signed result variable. The signed or unsigned status of the destination variable has absolutely no effect; it does not dissuade the compiler from deciding to treat the values as unsigned. (It is a common misconception to think that the destination value influences the type of the calculation in C.)
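
You can reproduce this arithmetic in GHCi using Data.Word -- a sketch of
what the C compiler is effectively doing, not a claim about Haskell's
own division semantics. With Data.Word in scope (:module + Data.Word):

(fromIntegral ((-3 * 10)::Int)) :: Word32

4294967266

(4294967266 :: Word32) `div` 3

1431655755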

If you have a target with 16-bit ints, your rolled-over values will be different (0xFFFD, 0xFFE2, and 0x554B, or decimal 21835). So the mixing of signed and unsigned values has not only given us the wrong answer, but made the code that generates the wrong answer produce inconsistent results depending on the size of int, even when the values we start with are not actually too big to fit into 16 bits.

If that wasn't perfectly clear, and even if it was, if you are really trying to write portable numeric code in C that has to be correct, I urge you to consider purchasing the MISRA (Motor Industry Software Reliability Association) book MISRA-C:2004, Guidelines for the Use of the C language in Critical Systems. It has the most detailed explanation of C's arithmetic type conversions, and more importantly, the implications of these type conversions in real code. The crux of the biscuit is that C is hard to get right, even for good programmers.

OK, so let's change my denominator back to a signed int and try it again. This time the results look more reasonable: -10, -6, -3, 0, 3, 6, and 10.

Are things any different if we use div?

int denom = 3;
int numer;
for ( numer = -3; numer <= 3; numer++ )
{
    div_t val = div(numer * 10, denom);
    printf("quotient %d\n", val.quot);
}

In my case, no, since on my compiler and target, / already seems to round towards zero (just as div is required to), while Haskell's div rounds downwards (floor behavior).

By the way, while we're at it, let's see what C has to say about the "weird" number divided by -1:

long most_negative_long = -2147483648;
long minus_one = -1;
long result = most_negative_long / minus_one;
ldiv_t ldiv_result = ldiv(most_negative_long, minus_one);
printf("results: %lX, %lX\n", result, ldiv_result.quot);

What do you get? I get zero as the result of both the / and ldiv.

I also get GCC telling me "warning: this decimal constant is unsigned only in ISO C90." What does that mean? The compiler is apparently warning us that in C99 we won't be getting an implicitly unsigned value out of the constant -2147483648, in case we might have wanted to use this constant in an expression involving signed types to force the calculation to be promoted to an unsigned type. But why would we get an unsigned value in the first place? Apparently in C90, the number's magnitude is interpreted first, and the value is negated. 2147483648 is too large (by one) to fit into a signed long, so the compiler promotes the type to an unsigned long, then negates it.

I have been trying to come up with an example of an expression that behaves differently when I use -2147483648 as a constant, as opposed to 0x80000000, but so far I haven't been able to come up with one. This may be because my compiler is already operating under C99 rules and so I never get the implicit promotion to an unsigned value.

Anyway, be that as it may, some parting advice and conclusions:

1. GHCi (on the PC and Mac) and C (Visual C++ 6.0 and GCC on the PC and Mac) yield distinctly different rounding behavior for integer division. C90 allows this behavior. Does Haskell? Would it make sense for Haskell to define integer division more strictly, the way that C99 does?

2. Division by zero is never OK; neither is taking % zero. Both are a fatal error in GHCi. The results are undefined in C (but a fatal error is generated by most implementations).

3. Overflow (which, in the case of integer division, occurs only when the "weird number" is divided by -1) produces undefined results, and your code should avoid it, both in Haskell and in C.

4. Operations on unsigned values are guaranteed to produce results that are consistent from platform to platform, assuming the integer size is the same. Operations on signed values don't have this guarantee.

5. The rounding (truncation) of inexact quotients may be either towards zero in all cases, or towards zero for positive quotients and away from zero when the results are negative. If your implementation follows the newer C99 standard rounding should always be towards zero.

6. Mixing signed and unsigned values in C expressions is very dangerous. If you do so, your implementation may have errors far more severe than differences in rounding behavior.

7. Consider avoiding signed division of negative values in C altogether, if your algorithm allows it. Either implement the algorithm using entirely unsigned integral types, or shift the domain of your function so that you are operating on non-negative signed values, and shift it back (using subtraction, which has well-defined semantics) when your division is done.

8. Algorithms which feed a range of values which cross the origin to integer division may vary between GHCi and C implementations. Haskell's div does not round towards zero the way C99 and the div and ldiv functions require, so if you prototype in GHCi with div, don't expect to get the same results in C; use quot when you want the C-style truncating behavior.

And, finally,

9. Make sure your implementation actually works the way it is supposed to. Only testing can truly accomplish this.
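Here is the GHCi check promised in conclusion 1. Haskell gives you both rounding behaviors: div rounds downwards, while quot truncates towards zero, matching what C99 mandates for /:

(-10::Int) `div` (3::Int)

-4

(-10::Int) `quot` (3::Int)

-3

So quot (and its companion rem) is the operator to reach for when prototyping C-style division.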

Interesting, isn't it, how studying one language can enrich your understanding of another! At least, it works for me!

The Division Bell Tolls for Me, Part Three

In the last installment, I pointed out a discrepancy between the way that Haskell does integer division and the way that some C code does it. Let's look at just how integer division behaves in Haskell a little more closely. You can type these expressions into GHCi:

6 / 5

1.2

5 / 6

0.8333333333333334

(6::Int) `div` (5::Int)

1

(5::Int) `div` (6::Int)

0

Integer division in the positive numbers always rounds down: 1.2 rounds down to 1, and 0.8333... rounds down to 0. In fact, it looks like the fractional part is just truncated. But let's check some negative values to be sure:

-6 / 5

-1.2

(-6::Int) `div` (5::Int)

-2

-5 / 6

-0.8333333333333334

(-5::Int) `div` (6::Int)

-1

That's interesting; clearly it is not strictly truncation (rounding towards zero) that is happening, or else -5/6 would give us zero, not -1. And we clearly aren't getting "rounding" in the usual sense of rounding to the nearest integer (although there is no universal rounding algorithm to round to the nearest integer; there are lots of different variations in practice).

It looks like we have floor() behavior, which in the case of negative inexact quotients means rounding away from zero. To state it more precisely: if the result of the div operation is a whole number, we get that whole number exactly; otherwise we get the largest integer less than the exact answer.
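We can spot-check that claim in GHCi by comparing div against floor applied to the exact quotient computed in floating point. Sweeping some small numerators and nonzero denominators is not a proof, but it is reassuring:

and [ (m `div` n) == floor (fromIntegral m / fromIntegral n :: Double) | m <- [(-6::Int)..6], n <- [(-3::Int)..(-1::Int)] ++ [(1::Int)..(3::Int)] ]

True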

If you learned integer division the old-fashioned way, on paper, you know what remainders are. The remainder is the leftover part of the process, and represents the numerator of the fractional part of the quotient. In Haskell, div gives you the whole number result of integer division, and mod will give you the remainder. So, for example,

(2::Int) `div` (3::Int)

0

(2::Int) `mod` (3::Int)

2

In fact, this points to a relationship between the result of div and the result of mod. It should hold true in all cases; in C, it is part of the ISO standard. The relationship is this:

For any quotient o = m `div` n where n is not zero, and the remainder p = m `mod` n, o * n + p = m.

In other words, you can reverse the integer division by multiplication as long as you add the remainder back in.

This relationship should be obviously true for any non-negative n and m. But I'll go further and say that while the exact meaning of mod for negative numbers varies from language to language, and perhaps from implementation to implementation, this relationship should still hold true, with the possible exception of cases involving overflow (which I'll discuss further below).
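For instance, with a negative numerator:

(-7::Int) `div` (2::Int)

-4

(-7::Int) `mod` (2::Int)

1

and sure enough, (-4) * 2 + 1 takes us back to -7.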

We can represent the relationship in Haskell like this:

10 `div` 3

3

10 `mod` 3

1

let undivide n o p = o * n + p
undivide 3 ( 10 `div` 3 ) ( 10 `mod` 3 )

10

In fact, let's test it for a few values of n and m:

let check_divide m n = ( undivide n ( m `div` n ) ( m `mod` n ) ) == m
[ check_divide m n | m <- [(-5::Int)..(5::Int)], n <- [(-5::Int)..(-1::Int)] ]

Try out the results for yourself; again, not a rigorous proof, but suggestive.

But what about the overflow cases? Let's examine them. If Haskell's Int type is a 32-bit signed int, then we should be able to represent the values -2,147,483,648 (corresponding to 10000000000000000000000000000000 in binary or 0x80000000 in hexadecimal) to +2,147,483,647 (corresponding to 01111111111111111111111111111111 in binary or 0x7FFFFFFF in hexadecimal).
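We can confirm those bounds directly in GHCi, since Int is an instance of the Bounded class:

(minBound::Int, maxBound::Int)

(-2147483648,2147483647)

Now let's ask Haskell what happens when we try to exceed our largest positive number: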

(2147483647::Int) + (1::Int)

-2147483648

Yep, we overflow; we wrap around to the most negative number. Conversely, we can underflow:

(-2147483648::Int) - (1::Int)

2147483647

We don't get run-time error checking (and probably don't want it, for the sake of efficiency, when using machine types). Of course, division by zero is still considered a run-time error:

(2147483647::Int) `div` (0::Int)

*** Exception: divide by zero

What about overflow or underflow in the results of division? With multiplication, it is very easy to overflow; multiplying two large numbers can easily exceed the width of a 32-bit signed value. You can come pretty close with:

(46341::Int) * (46340::Int)

2147441940

because 46340 is the next-lowest integer value to the square root of 2,147,483,647. Generate a result just slightly higher and you will roll over to a large negative:

(46341::Int) * (46341::Int)

-2147479015

With multiplication there are lots of combinations of values that will overflow, but with division it is a little bit harder to trigger overflow or underflow. There is one way, though. While 2,147,483,647 divided by any lesser value will be a lesser value, and divided by itself will equal one, recall that two's-complement representation gives us a slightly asymmetric set of values: the largest representable negative number is one larger, in absolute magnitude, than the largest representable positive value. Thus:

(2147483647::Int) `div` (1::Int)

2147483647

(2147483647::Int) `div` (-1::Int)

-2147483647

(-2147483648::Int) `div` (1::Int)

-2147483648

In 32 bits of two's complement, -2,147,483,648 is the "weird number." As a general rule, if you take a two's-complement number, flip all the bits, and add one, you get the negative of that number. The only exception is the weird number: flip its bits and add one, and you get the weird number right back.
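We can watch that happen in GHCi using complement from Data.Bits (bring it into scope with :module +Data.Bits). The second example assumes the same 32-bit Int as before:

complement (5::Int) + 1

-5

complement (-2147483648::Int) + 1

-2147483648

Now let's see what happens when we try to divide the weird number by -1: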

(-2147483648::Int) `div` (-1::Int)

Now, I would like to tell you what GHCi says when I ask it to divide the weird number by -1. I'd also like to see what it says about the mod. But I can't. GHCi crashes when I type in those expressions.

So, I can't tell you what Haskell says. But I can tell you what the C standard says. Here's a hint: not only is it not defined, but in fact the result of any overflow that occurs with arithmetic operations on signed integral types is not defined by the language standard. Only the results of arithmetic operations on unsigned integers have defined semantics for overflow, making them portable -- assuming that the size of the type in question is the same. More on that next time.
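Haskell mirrors this distinction, as far as I can tell: overflow on the fixed-precision signed types is left undefined by the Haskell report, while the Data.Word documentation says arithmetic on the fixed-width unsigned types is performed modulo 2^n, so overflow there simply wraps. A quick check in GHCi (after :module +Data.Word):

(maxBound::Word32) + 1

0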

05 December 2006

The Division Bell Tolls for Me, Part Two

OK. In our last installment I started talking about using Haskell one-liners to calculate some functions. The function in question was designed to map a set of fractions in the range -7/7 to 7/7 to the machine integers 0..65536. The function itself, along with a wrapper to test it, fits neatly into a Haskell one-liner (well, it might be more than one line, depending on your terminal width). You can use GHCi to evaluate it:

map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

I showed in the last installment that this function gave me the results I wanted:

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

Let's play with that function a little more, and then I want to talk about division (hence the title of the entry). In particular, integer division.

First, let's simplify our expressions by breaking them down and establishing some bindings. We can also get rid of the presumption that our output range starts at zero:

let gen_nths denom = [ (numer, denom::Int) | numer <- [(-denom)..denom] ]
let div_map m1 m2 = map (\tup -> (fst tup + snd tup) * (((m2::Int) - (m1::Int)) `div` 2) `div` snd tup + m1)

And now we can test it. Note that we need to add a few parentheses:

div_map (-32768) 32768 (gen_nths 11)

[-32768, -29790, -26811, -23832, -20853, -17874, -14895, -11916, -8937, -5958, -2979, 0, 2978, 5957, 8936, 11915, 14894, 17873, 20852, 23831, 26810, 29789, 32768]
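As a sanity check, plugging the original sevenths back in reproduces the fifteen values from the last installment:

div_map 0 65536 (gen_nths 7)

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]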

OK. Now what is it we've done exactly? Well, we've generated some test values, and an algorithm, for distributing a fractional range onto a machine integer range. For our purposes, think of it as a little machine for testing division.

To see just why I chose this algorithm to play with Haskell, we have to go back and consider some C code. It's quite a headache-inducing paradigm shift to go back to C/C++, but bear with me. Let's just start with the simplest thing: we'll create the same mapping we just generated, hard-coding the values:

long denom = 11;
long numer;
for ( numer = -11; numer <= 11; numer++ )
{
    long val = numer * 32768 / denom;
    printf("%d\n", val);
}

And the results:

-32768
-29789
-26810
-23831
-20852
-17873
-14894
-11915
-8936
-5957
-2978
0
2978
5957
8936
11915
14894
17873
20852
23831
26810
29789
32768

Hmmm... wait a minute, let's compare the results:

Haskell            C
-32768             same
-29790             -29789
-26811             -26810
-23832             -23831
-20853             -20852
-17874             -17873
-14895             -14894
-11916             -11915
-8937              -8936
-5958              -5957
-2979              -2978
0                  same
2978               same
5957               same
8936               same
11915              same
14894              same
17873              same
20852              same
23831              same
26810              same
29789              same
32768              same

It looks like every single one of the negative values is off by one! In our third and final installment I'll dive a little deeper into the semantics of integer division.

The Division Bell Tolls for Me, Part One

With apologies to both Pink Floyd and John Donne.

I've been toying with Haskell. I wish I had enough free time for a systematic study of Richard Bird's Introduction to Functional Programming Using Haskell, Second Edition, but for now I have to content myself with dipping into it in small scraps of free time when both babies are asleep and the dishes are done. In addition, I'm using Haskell to model the behavior of some functions that I'm actually writing in C++.

One of the reasons I like Haskell is that with a tool like GHCi, which gives me a REPL (a Read-Evaluate-Print Loop for interactive development), I can write toy one-liners that do something useful, and since Haskell lets me pin down types with explicit annotations, I can test how machine types behave. With an important exception, which I will discuss.

Let's say I want to try out a function that maps the fractional values -7/7 through 7/7 to unsigned integers in the range 0..65536 (the range of values representable by 16 bits, plus one). In Haskell I can try this out using a one-liner (typed into GHCi). Let's build it up. First, the range of input values. Ignore for now the possibility of using or introducing a true rational number type; instead we'll generate a set of tuples. In Haskell you can express this very concisely using a list comprehension:

[(numer, 7) | numer <- [-7..7] ]

[(-7,7),(-6,7), (-5,7), (-4,7), (-3,7), (-2,7), (-1,7), (0,7), (1,7), (2,7), (3,7), (4,7), (5,7), (6,7), (7,7)]

The list comprehension can be (roughly) read as "Make a list of tuples out of numer, 7 where numer takes on the values from the list -7 to 7." Think for a moment about how you could do this in your favorite language.

Now let's do our math on the resulting list. We can use map to apply a function to the list. We can use fst and snd to get to the values in the tuple; for example, to look at the numerators, do this:

map fst [(numer, 7) | numer <- [-7..7] ]

[-7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]

Let's go ahead and do our scaling. We want to describe a single function to pass each tuple to; in fact we want a lambda abstraction, a function that we don't bother to name:

map (\tup -> fst tup / snd tup * 65536 ) [(numer, 7) | numer <- [-7..7] ]

[-65536.0, -56173.71428571428, -46811.42857142857, -37449.142857142855, -28086.85714285714, -18724.571428571428, -9362.285714285714, 0.0, 9362.285714285714, 18724.571428571428, 28086.85714285714, 37449.142857142855, 46811.42857142857, 56173.71428571428, 65536.0]

Whoops, that's not quite right. We want to map to only the positive values. We have to shift the domain (the incoming numerators) so they are non-negative:

map (\tup -> (fst tup + snd tup) / snd tup * 65536 ) [(numer, 7) | numer <- [-7..7] ]

[0.0, 9362.285714285714, 18724.571428571428, 28086.85714285714, 37449.142857142855, 46811.42857142857, 56173.71428571428, 65536.0,  74898.28571428571, 84260.57142857143, 93622.85714285714, 102985.14285714286, 112347.42857142857, 121709.71428571429, 131072.0]

But now the codomain is too big; we have to cut it in half:

map (\tup -> (fst tup + snd tup) / snd tup * 65536 / 2) [(numer, 7) | numer <- [-7..7]]

[0.0, 4681.142857142857, 9362.285714285714, 14043.42857142857, 18724.571428571428, 23405.714285714286, 28086.85714285714, 32768.0, 37449.142857142855, 42130.28571428572, 46811.42857142857, 51492.57142857143, 56173.71428571428, 60854.857142857145, 65536.0]

That looks about right. Now let's apply a type. Because I'm eventually going to be implementing this without floating-point, I want the math to be done on machine integers. There are some exceptions, but most target architectures these days give you 32 bits for int, and GHC's Int is 32 bits on my machine (the Haskell 98 report only guarantees a range of at least 30 bits), so it should yield approximately what I want. To introduce typing, all I have to do is add a type to the list values. Type inferencing will do the rest. Oh, and since the / operator is not defined on Int, I have to change it to `div` (infix integer division):

map (\tup -> (fst tup + snd tup) `div` snd tup * 65536 `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 0, 0, 0, 0, 0, 0, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 65536]

Whoah. Something has gone disastrously wrong. Take a moment to figure out what it is. I'll wait.

Did you figure it out?

The problem is that when I use integer division in an expression, the result is truncated (the fractional part is lost). If the division occurs before any of the other operations, most of the significant bits of the math are lost! I can fix this by making sure to do the division last:

map (\tup -> (fst tup + snd tup) * 65536 `div` snd tup `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

That's better. While I'm at it, I can simplify out that factor of 2 by using half of 65536:

map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

In fact, I can ask GHC to verify that the results are the same:

map (\tup -> (fst tup + snd tup) * 65536 `div` snd tup `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ] == map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

True

Very cool! Of course, this is not a rigorous proof, but I'm satisfied that my function is correct for the given inputs, and sometimes that's enough.