22 December 2006

Happy Life Day and Wishing You Lots of Haskell in 2007

Let's remember the words of Princess Leia, in the Star Wars Holiday Special:

We celebrate a day of peace. A day of harmony. A day of joy we can all share together joyously. A day that takes us through the darkness. A day that leads us into might. A day that makes us want to celebrate the light. A day that brings the promise that one day, we'll be free to live, to laugh, to dream, to grow, to trust, to love, to be.

Happy Life Day, everyone! Or whatever tradition you celebrate.

I am not planning to work on any more entries until early January, so I just wanted to say that I am extremely grateful for all the comments and suggestions I've received on my blog entries. With two babies at home I don't get much time to study textbooks or tutorials, although I try, and I don't get much dedicated coding time. But the suggestions and concrete examples have helped me make good use of what free time I do have. Thanks to all of you for your support as I learn this exciting language!

19 December 2006

From Bits to Cells: Simple Cellular Automata in Haskell, Part Two

Let's get back to our CA generator. Literate Haskell follows:

Last time we defined a function to generate the next state of a given
cell in our cellular universe, given a rule number and a tuple consisting
of the current state of the cell to the left, the cell itself, and the
cell to the right.

>import Data.Bits

>genNextBit :: Int -> ( Bool, Bool, Bool ) -> Bool
>genNextBit rulenum ( left, center, right ) = rulenum `testBit` idx
>    where idx = ( if left then (4::Int) else (0::Int) ) .|.
>           ( if center then (2::Int) else (0::Int) ) .|.
>           ( if right then (1::Int) else (0::Int) )

Recall that we can use automatic currying to make a rule-applying
function like so:

>rule_30 = genNextBit 30

We can ask GHCi for the type:

:type rule_30

rule_30 :: (Bool, Bool, Bool) -> Bool

I've been putting this off while working on the rules, but it is time
to figure out how to actually represent our CA universe. Let's start
by using a list. I know that I'm going to write a number of inefficient
functions, and do evil things like take the length of lists a lot, but
let's defer all concerns about efficiency to a future discussion and
consider this purely a proof of concept.

Our initial universe at time zero has one cell set to True:

>initial_universe = [True]

But that isn't quite the right representation for the universe, because
it implies that our whole universe is one cell in size. We can't even
apply our rule once because there is no left cell and right cell!
Really, we want to pretend that we have an _infinite_ universe; at
time zero, all the cells to the left and right hold False. Remember,
Haskell is so powerful that it can traverse an infinite list in only
0.0003 seconds! Well, if you don't evaluate the whole thing, that is.
Taking advantage of lazy evaluation, you can define all kinds of
infinite structures. This construct will give us an infinite list of
False values:

>allFalse :: [Bool]
>allFalse = False : allFalse
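
(By the way, the Prelude's repeat would give us the same thing:
allFalse behaves exactly like repeat False.)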

We don't want to evaluate allFalse, but we can partially evaluate it
using a function like take. So can we represent our universe like this?

>genInfiniteUniverse :: [Bool] -> [Bool]
>genInfiniteUniverse known_universe = allFalse ++ known_universe ++ allFalse

Let's try it:

take 10 ( genInfiniteUniverse initial_universe )

[False,False,False,False,False,False,False,False,False,False]

Nope! Since the left-hand side of the universe is infinite, we will
never reach the middle element, no matter how far we travel from the
start of the list!

That's no good. However, we can do it another way. We'll allow our
universe to be expanded on demand on the left and right sides:

>expandUniverse :: Int -> [Bool] -> [Bool]
>expandUniverse expand_by known_universe =
>   ( take expand_by allFalse ) ++ known_universe ++ ( take expand_by allFalse )

expandUniverse 3 initial_universe

[False,False,False,True,False,False,False]

We can use the expandUniverse function to expand our initial universe
out to a standardized width before we start applying the rules.

First, here's a routine to stringify a universe for display:

>boolToChar :: Bool -> Char
>boolToChar True = '#'
>boolToChar False = ' '

>stringifyUniverse :: [Bool] -> String
>stringifyUniverse ls = map boolToChar ls
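
For example:

stringifyUniverse [False, True, True, False]

" ## "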

Now our infrastructure is in place, so let's figure out how to apply
our generator rule. We know that we want to start with our initial
universe. Let's expand it to a fixed size. This will give us enough
elements to start making left/center/right tuples out of each consecutive
set of three cells. Each tuple is then used to look up the next state
of the cell at the center; this will become an element in our next
universe. Then we move to the next cell (not three cells down). This
means that the tuples overlap.

Let's make the tuples. We have to do some thinking here and consider
all the cases; the behavior isn't immediately obvious. The following
almost works:

universeToTuples :: [Bool] -> [(Bool, Bool, Bool)]
universeToTuples universe | length universe >= 3 =
   ( universe !! 0, universe !! 1, universe !! 2 ) :
   universeToTuples ( tail universe )
universeToTuples universe = []

universeToTuples [False, True, True, True, False]

[(False,True,True),
(True,True,True),
(True,True,False)]

But it isn't quite right: it leaves off the end cases, so each time
we apply our rules, the intermediate representation of the universe
(a list of tuples used to look up cell mappings) will shrink. We
actually want the following tuples:

[(False,False,True),
(False,True,True),
(True,True,True),
(True,True,False),
(True,False,False)]

where the first element of the list is considered as if it was just
to the right of an implied False, and the last element is considered
as if it was just to the left of another implied False. This sounds
like another place we can use our universe expander:

>universeToTuples :: [Bool] -> [(Bool, Bool, Bool)]
>universeToTuples [] = []
>universeToTuples universe = tupleize $ expandUniverse 1 universe
>   where
>       tupleize xs =
>            if length xs > 3 then tuple3 xs : tupleize ( tail xs )
>               else [ tuple3 xs ]
>       tuple3 xs = ( xs !! 0, xs !! 1, xs !! 2 )

Why did I write it that way? Well, I tried to write tupleize using
guards, special-casing length xs > 3 followed by an unguarded case for
all other possibilities, but GHC didn't like it -- it told me I had non-exhaustive patterns. There is probably a smarter way to write this,
but note that we definitely don't want this version:

universeToTuples universe = ( xs !! 0, xs !! 1, xs !! 2 )
    : universeToTuples ( tail xs )
       where xs = expandUniverse 1 universe

In that version, the universe keeps expanding as we walk down the list,
and we never get to the end!
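
For what it's worth, a guarded tupleize does compile if you make the
guards exhaustive with an explicit otherwise -- a sketch, equivalent
to the literate definition above:

tupleize xs
    | length xs > 3 = tuple3 xs : tupleize ( tail xs )
    | otherwise     = [ tuple3 xs ]

With the otherwise case in place there is nothing left to fall
through, so GHC has no grounds for a non-exhaustive-patterns
complaint.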

OK, now that we have our tuples, we want to turn them into our new
universe, given a cellular rule number:

>tuplesToUniverse :: Int -> [(Bool, Bool, Bool)] -> [Bool]
>tuplesToUniverse rule [] = []
>tuplesToUniverse rule (tup:tups) = genNextBit rule tup : tuplesToUniverse rule tups
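
As an aside, this recursion is just map in disguise; here is an
equivalent one-liner (shown unmarked so it doesn't clash with the
literate definition above):

tuplesToUniverse rule tups = map ( genNextBit rule ) tups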

Note that we don't have to explicitly take the tail since we provide a
name for it in the pattern. We're ready to define our genUniverses function
that applies a given CA rule. We can express a given generation like this:

>nextUniverse :: Int -> [Bool] -> [Bool]
>nextUniverse rule universe = tuplesToUniverse rule $ universeToTuples universe

Now, let's generalize it:

>genUniverses :: Int -> Int -> Int -> [[Bool]]
>genUniverses rule width count = take count
>   ( iterate ( nextUniverse rule ) ( expandUniverse ( width `div` 2 ) initial_universe ) )

(You could also use a fold, and I'm sure there are lots of other ways to
do it, but iterate seems to work fine).
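
(For reference, iterate f x produces the list [x, f x, f (f x), ...],
so the first row of our output is generation zero -- the untouched,
padded starting universe.)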

And now, witness the unfolding of a universe! Note that the parameters
that go to genUniverses are the rule number, the width for display, and the number of generations:

putStr $ unlines $ map stringifyUniverse $ genUniverses 222 19 10

          #
         ###
        #####
       #######
      #########
     ###########
    #############
   ###############
  #################
 ###################

In general, a width of twice the number of generations - 1 will show
all the transitions we are interested in; you could consider the diagonal
area above to be the "light cone" of events causally connected to that
single point (although some rules will generate True cells outside of
that "light cone" based on the other False values in the initial universe). Let's make a helper function to choose a width for us:

>showRule rule gens =
>   putStr $ unlines $ map stringifyUniverse $
>       genUniverses rule ( gens * 2 - 1 ) gens

Let's try a few of the other rules:

showRule 252 15
              #
              ##
              ###
              ####
              #####
              ######
              #######
              ########
              #########
              ##########
              ###########
              ############
              #############
              ##############
              ###############

showRule 78 15
               #
              ##
             ###
            ## #
           ### #
          ## # #
         ### # #
        ## # # #
       ### # # #
      ## # # # #
     ### # # # #
    ## # # # # #
   ### # # # # #
  ## # # # # # #
 ### # # # # # #

And finally, my all-time favorite, which generates a Sierpinski gasket:

showRule 82 32
                                #
                               # #
                              #   #
                             # # # #
                            #       #
                           # #     # #
                          #   #   #   #
                         # # # # # # # #
                        #               #
                       # #             # #
                      #   #           #   #
                     # # # #         # # # #
                    #       #       #       #
                   # #     # #     # #     # #
                  #   #   #   #   #   #   #   #
                 # # # # # # # # # # # # # # # #
                #                               #
               # #                             # #
              #   #                           #   #
             # # # #                         # # # #
            #       #                       #       #
           # #     # #                     # #     # #
          #   #   #   #                   #   #   #   #
         # # # # # # # #                 # # # # # # # #
        #               #               #               #
       # #             # #             # #             # #
      #   #           #   #           #   #           #   #
     # # # #         # # # #         # # # #         # # # #
    #       #       #       #       #       #       #       #
   # #     # #     # #     # #     # #     # #     # #     # #
  #   #   #   #   #   #   #   #   #   #   #   #   #   #   #   #
 # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Wow!

Followup: I had mentioned that my code had a bug, because some pictures, such as this one:

showRule 29 11
          #
######### ###########
#         #
######### ###########
#         #
######### ###########
#         #
######### ###########
#         #
######### ###########
#         #

look different from the way Wolfram's book and web site show them, which is like this:

          #
######### ###########
          #
######### ###########
          #
######### ###########
          #
######### ###########
          #
######### ###########
          #

After a little investigation this seems to be because Wolfram's implementation wraps around; the left neighbor of the leftmost cell of a given universe is taken from the rightmost cell, and vice-versa, while my implementation pretends that there is always more empty space available to the left and right.

Whether you consider this a bug or not is up to you. The wraparound behavior is probably the more "canonical" choice. You can compare the results from my program to the pictures on Wolfram's MathWorld site. If you replace my universeToTuples function with this one:

universeToTuples :: [Bool] -> [(Bool, Bool, Bool)]
universeToTuples [] = []
universeToTuples universe = tupleize $ wrapUniverse universe
   where
       wrapUniverse xs =  ( last xs ) : ( xs ++ [ head xs ] )
       tupleize xs =
           if length xs > 3 then tuple3 xs : tupleize ( tail xs )
               else [ tuple3 xs ]
       tuple3 xs = ( xs !! 0, xs !! 1, xs !! 2 )

you will get the wraparound behavior.

Thanks for reading and as always, I appreciate your comments.

17 December 2006

From Bits to Cells: Simple Cellular Automata in Haskell, Part One

The little Haskell toy I wrote to simulate a dot-matrix printhead was useful in one sense: it helped me learn a little more Haskell! I couldn't help noticing that the results look like the transformations you get when applying simple cellular automata rules. So my next experiment is to implement those rules, and see what else I can learn from the results.

What follows is Literate Haskell, but there is no main program to run; it is for you to experiment with interactively, if you like.

Some Haskell code to implement simple cellular automata (CA) rules:
part one, round one.

The simplest type of one-dimensional CA behaves as follows:

- a cell may be on or off (one or zero).

- time ticks by in discrete intervals.

- at each tick, the state of each cell is updated using a function that
maps the current state of the cell to the left, the cell itself, and
the cell to the right to the new cell value.

Because this value-generating function depends on 3 bits of input, a
given CA rule actually consists of 2^3 = 8 possible mappings. The
mappings can be represented by the binary values 0b000 to 0b111
(decimal 0..7). By Wolfram's numbering convention for these simple
CA, we actually count down from 7 to 0. So, let's say we have a rule
that maps each possible set of left, center, and right values:

0b111, 0b110, 0b101, 0b100, 0b011, 0b010, 0b001, 0b000

to the new cell value:

0b0, 0b0, 0b0, 0b1, 0b1, 0b1, 0b1, 0b0

We can treat the resulting number, in this case 0b00011110, as the
rule number. This is rule 30 in Wolfram's schema; there are 2^8,
or 256, possible rules, numbered from 0 to 255.

First, let's make a function that, given a rule number and three
values (left, center, and right), returns the new cell state. We
turn the three values into an index to look up the appropriate bit
in the rule number.

>import Data.Bits

>genNextBit :: Int -> ( Bool, Bool, Bool ) -> Bool
>genNextBit rulenum ( left, center, right ) = rulenum `testBit` idx
>    where idx = ( if left then (4::Int) else (0::Int) ) .|.
>           ( if center then (2::Int) else (0::Int) ) .|.
>           ( if right then (1::Int) else (0::Int) )

Hmmm... it is lacking a certain elegance, and I don't quite like
the way the indentation rules work in this case, but let's test it
with a function that generates the 8 rule indices in the form of tuples:

>genRuleIndices :: [( Bool, Bool, Bool )]
>genRuleIndices = [ ( (val `testBit` 2), (val `testBit` 1),
>                   ( val `testBit` 0) ) | val <- [(7::Int),
>                   (6::Int)..(0::Int)] ]

Now if we write:

genRuleIndices

we get:

[(True,True,True),
(True,True,False),
...
(False,False,True),
(False,False,False)]

and if we write:

map (genNextBit 30) genRuleIndices

this expression curries us a function (mmm... curry!) which takes
a starting state and applies rule 30, then feeds it the possible
starting states. The result is:

[False,False,False,True,True,True,True,False]

That looks like the bit pattern for rule 30.

Just for fun, let's confirm by making a function that will
translate a list of output bit values back into a rule number.
The signature should look like this:

sumBitVals :: [Bool] -> Int

And we want the list

>test_bit_vals = [False,False,False,True,True,True,True,False]

to map back to 30. Take a shot at it yourself; you might find
the exercise instructive. I'll wait.

(Musical interlude).

Did you try it? Let's look at some possible implementation strategies.
We could make a list of the powers of two and then do some list
manipulation to get the powers of two that correspond to our one-bits
summed up:

>powers_of_two_8_bits = reverse ( take 8 (iterate (2 *) 1) )

[128,64,32,16,8,4,2,1]
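
You could also avoid the reverse by mapping over a descending range;
an equivalent definition (shown unmarked so it doesn't clash with the
one above) would be:

powers_of_two_8_bits = map (2 ^) [7, 6..0]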

(An aside: since this is generated by a function that takes no
parameters at runtime, it would seem like a sufficiently smart
compiler could generate this list and stick it in the resulting
program so that no run-time calculation is required to generate
it. Does GHC do this? I don't know.)

Anyway, our implementation. We can tuple up our powers of two with
our bit values:

>tups = zip powers_of_two_8_bits test_bit_vals

[(128,False),(64,False),(32,False),(16,True),(8,True),(4,True),(2,True),(1,False)]

Then we can turn this back into a list of only the powers of two that are
"on," and sum the results:

sum (map (\ tup -> if snd tup then fst tup else 0) tups )

30

It seems like this should be simpler still. I looked in the standard
library for a function to filter one list using a list of boolean
values as a comb, but didn't find one. Let's write our own:

>combListWithList :: [a] -> [Bool] -> [a]
>combListWithList [] [] = []
>combListWithList ls comb =
>   if (head comb) then (head ls) : combListWithList (tail ls) (tail comb)
>       else combListWithList (tail ls) (tail comb)

combListWithList powers_of_two_8_bits test_bit_vals

[16,8,4,2]

That seems good, although it doesn't express the idea that the lists
have to be the same length. It still seems amazing to me that I can
reel off functions like that in Haskell and have them work right the
first time! Now we can produce our final function:

>sumBitVals :: [Bool] -> Int
>sumBitVals ls = sum $ combListWithList powers_of_two_8_bits ls

sumBitVals test_bit_vals

30

There is probably an elementary way to implement the above function
using zipWith instead of my combList, but I'll leave that to you;
leave a comment if you figure it out!
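
For what it's worth, here is one zipWith version -- a sketch of my
own, primed so it doesn't clash with the definition above. Note that
zipWith simply stops at the end of the shorter list, so unlike
combListWithList it never complains about a length mismatch:

sumBitVals' :: [Bool] -> Int
sumBitVals' ls = sum ( zipWith (\ p b -> if b then p else 0)
                       powers_of_two_8_bits ls )

sumBitVals' test_bit_vals

30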

As another aside, here is another function I came up with, this one
computing the rule number from the bit values; it doesn't rely on a
specific list length.

>foldBitVals :: [Bool] -> Int
>foldBitVals ls = snd (foldr
>   (\ flag tup -> if flag then ((fst tup) * 2, (snd tup) + (fst tup))
>                           else ((fst tup) * 2, snd tup) )
>   (1, 0) ls )

foldBitVals test_bit_vals

30

Take a moment to understand the above function; it is a little strange.
I perform a fold using a tuple. The first element is the power of two,
so it keeps track of our place in the list, and the second is an
accumulator as we add up our values. This is a trick for stashing
a state that is more complex than a single value into a fold.

Anyway, that was a long diversion, so let's end for now. Next time we'll look at ways to represent our CA universe as it evolves by applying our rules. As always, your comments are welcome!

14 December 2006

The Revised Dot-Matrix Printhead

It looks like this blog has been unceremoniously removed from Planet Scheme. I guess that's only fair, since I haven't written anything specifically about Scheme in a while. Best of intentions, and all that. This blog probably belongs on "Planet Programming Language Dilettante."

Anyway, I received some helpful feedback on the short program I wrote to print bitmaps of binary numbers. Jason Creighton in particular took a shot at a rewrite, and the result was highly enlightening. I also received two specific pieces of advice: use short function names, and don't write a comprehension like [a | a <- [a..b]] when you can just write [a..b]. Oops. Check, and thanks!

(Semi) or (Ill) Literate Haskell follows:

Haskell Binary Printhead Toy, Second Round

>import Data.Bits

Our functions to generate lists of signed and unsigned integers of
numbits length, revised with names as supplied by Jason and error-
catching as supplied by yours truly:

>unsignedInts :: Int -> [Int]
>unsignedInts bits | (1 <= bits) && (bits <= 16) =
>   [(0::Int)..(bound::Int)]
>   where bound = 2 ^ bits - 1
>unsignedInts bits = error "expected a value from 1 to 16"

>signedInts :: Int -> [Int]
>signedInts bits | (1 <= bits) && (bits <= 16) =
>   [(- (bound::Int)) - 1..(bound::Int)]
>   where bound = 2 ^ ( bits - 1 ) - 1
>signedInts bits = error "expected a value from 1 to 16"

Jason came up with a considerably simpler way to map the bits of
the binary numbers to the characters for printing:

>boolToChar :: Bool -> Char
>boolToChar True = '#'
>boolToChar False = ' '

I'm adopting that suggestion enthusiastically, since I don't think
code can get much simpler than that!

Jason then proceeded to turn my dog food into a much more flavorful
wafer-thin after-dinner mint, leaving me struggling to understand
just what he did:

>bitPicture :: (Int -> [Int]) -> Int -> [Char]
>bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])
>  where
>    stringAtIndex = map boolToChar . bitsAtIndex
>    bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

Wow. First off, Jason has supplied a type signature which expresses
what the function actually _does_. The appropriate integer generator
(the signed or unsigned version) comes in as a parameter; that's what
(Int -> [Int]) means. We next get an integer argument (the number of
bits), and finally, we spit out a list of chars.

Jason proceeds to use a where clause to set up a couple of
simplifying definitions. Let's look at the first one:

stringAtIndex = map boolToChar . bitsAtIndex

OK, this is slightly mind-twisting right off the bat. In the world
of C and Java you can't write things like that. The dot is function
composition expressed naturally; instead of something like

stringOfBoolsFromBitsAtIndex idx = map boolToChar (bitsAtIndex idx)

we can just get Haskell to chain the functions together for us.
No fuss, no muss. But this function makes a forward reference to
bitsAtIndex, which we haven't seen yet (completely illegal in C
and Java). Let's make sure we understand what bitsAtIndex does:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

That's definitely more dense than the way I wrote it (and that's a
good thing). To break down the list comprehension, the expression on
the right hand side of the vertical bar says that we feed the supplied
integer generator function the number of bits to use. That gives us a
list. From that list we will draw our "n"s and test bit idx on them.
By doing it this way, Creighton has pulled an act of Haskell sleight-
of-hand: instead of generating the bitmaps for each integer value,
like I did, and then using "transpose" to rearrange them, this version
accesses the "columns" of the table directly! With those two
expressions, we've got our string representing all the bits at a
given index _for the whole supplied set of integers_.

Wow. I feel like an audience member watching a magician smash up my
watch and then give it back to me rebuilt as a marvelous whirring
gyroscope.

There's more, but this part is relatively simple:

bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])

We already know what unlines does; it turns our list of strings into a
properly line-broken string. We've just described how stringAtIndex
works, and map stringAtIndex (reverse [0..(bits-1)]) just feeds it a
set of indices mapped from high to low, so that our most significant bit
comes out leftmost. But what does "$" do?

My reference at zvon.org says this is the "right-associating infix
application operator (f $ x = f x), useful in continuation-passing style."
I'm not sure I fully understand, and I'm not going to go into the meaning
of "continuation-passing style" right now, but I guess it saves us from
having to write:

unlines (map stringAtIndex (reverse [0..(bits-1)]))

to get the right order of operations, which is nice. So I'll try to
incorporate it into my own Haskell.
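
In other words, $ is just function application at very low
precedence: f $ x means the same as f x. As a tiny example of the
parentheses it saves, these two expressions are equivalent, and
both evaluate to 12:

sum (map (2 *) [1, 2, 3])

sum $ map (2 *) [1, 2, 3]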

Anyway, let's try it. We need to drive the function:

>main :: IO ()
>main = do
>   print "Unsigned values (6 bits)"
>   putStr $ bitPicture unsignedInts 6
>   print "Signed values (6 bits)"
>   putStr $ bitPicture signedInts 6

After using GHCi's :cd and :load command to read this file, just
type:

main

"Unsigned values (6 bits)"
                                 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

"Signed values (6 bits)"
 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Excellent. I laughed; I cried; I learned something today. I hope you did, too. Thanks, Jason! And thanks to the other people who commented!

Followup: for those following along the discussion in the comments, here is an alternate implementation of the bitPicture function:

bitPicture intGen bits =
    unlines $ map stringAtIndex [bits - 1, bits - 2..0]
    where stringAtIndex = map boolToChar . bitsAtIndex
          bitsAtIndex idx = map (flip testBit idx) (intGen bits)

Note that using [bits - 1, bits - 2..0] in the first line allows us to avoid the list reversal, so that seems like a clear win. The method of expressing bitsAtIndex above is one of several that were suggested. The idea behind the flip is that each value from intGen should become the first parameter to testBit, not the second. You can also express this by treating testBit as an infix operator:

bitsAtIndex idx = map (`testBit` idx) (intGen bits)

Both of these alter the behavior of Haskell's automatic currying so that partial application supplies the second parameter instead of the first, leaving the first as the sole parameter to the resulting curried function; in Common Lisp, you might do this explicitly using rcurry. And don't forget the original, using a list comprehension:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

Assuming there is not a specific concern about efficiency, it seems like the choice between these versions might be a matter of style. Thanks again to everyone who made suggestions.

12 December 2006

The Dot-Matrix Printhead: a Haskell Toy

The body of this article is a literate Haskell program. You can extract the text, then save it as a file with the extension .lhs, then execute it using GHCi.

This is a Haskell toy to visualize twos-complement binary numbers.

In my previous articles on the subject of integer division, I alluded
to underflow and overflow and the "weird number" -- well, now you
can see them!

Almost 30 years ago I wrote code on my Radio Shack TRS-80 to make my
dot-matrix printer spit out bitmaps of binary numbers, which made
pretty recursive patterns. This program commemorates that long-ago
geekery from my early teenage hacking days.

First, we need the bits library:

>import Data.Bits

We'll also need the list library later:

>import List

Let's simulate n-bit binary number in both a signed and unsigned
interpretation. Given a number of bits, we want a function to make
a list of the possible values. For example, in unsigned form 8 bits
will give us a list of the values 0..255; in signed form, it will be
an asymmetrical list from -128..127. Note that this function generates
the whole list, so if you give it a numbits value higher than, say,
16, you might regret it.

Our calculation fails for values of numbits <= 0, so we put in a
guard case for that. We also want to discourage creating very large
lists, so we put in a guard case for numbits > 16.

>gen_n_bit_nums_unsigned numbits | numbits <= 0 = []
>gen_n_bit_nums_unsigned numbits | numbits > 16 = error "too many bits!"
>gen_n_bit_nums_unsigned numbits =
>   [ n | n <- [(0::Int)..((2::Int)^numbits - 1)] ]

>gen_n_bit_nums_signed numbits | numbits <= 0 = []
>gen_n_bit_nums_signed numbits | numbits > 16 = error "too many bits!"
>gen_n_bit_nums_signed numbits =
>   [ n | n <- [-(2::Int)^(numbits - 1)..((2::Int)^(numbits - 1) - 1)] ]

To test this we can execute:

gen_n_bit_nums_unsigned 4
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

gen_n_bit_nums_signed 4
[-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]

That looks about right.

Now we want a function to access the bits of those numbers from most
significant to least significant and build a list of corresponding
boolean values. This would be expressed more naturally if enumFromTo
supported descending sequences. I'm sure there's a better way, but for
now I will just reverse the resulting list:

>rightmost_bits num_bits val | num_bits > 16 = error "too many bits"
>rightmost_bits num_bits val | num_bits <= 0 = error "too few bits"
>rightmost_bits num_bits val =
>   reverse [ testBit (val::Int) bit_idx | bit_idx <- [0..(num_bits - 1)] ]

Once we have a function which will do it with one number, doing it with a
whole list of numbers is very easy. We will make a curried function and
use map. Currying in Haskell is very easy, which makes sense given that
the operation, "to curry" and the language, Haskell, are both named after
Haskell Curry!

Our rightmost_bits function takes two arguments, but we want to make a
new function which we can hand off as an argument to map, along with a
list. This new function will only take one parameter. We can ask GHCi
what it thinks the type of the rightmost_bits function is:

:type rightmost_bits

It says:

rightmost_bits :: Int -> Int -> [Bool]

We can make a new function out of it by doing partial application:
instead of supplying all the arguments, we just supply the first
one. (I'll demonstrate with gen_rightmost_bits, a wrapper whose
definition appears a little further below; its first argument is
also the number of bits.)

gen_rightmost_bits 4

If you do this in GHCi, you'll get an error message because the
result is not printable (GHCi can't apply show to the result).
But you can bind it:

let bit_generator_4 = gen_rightmost_bits 4

And then ask GHCi what its type is:

:type bit_generator_4
bit_generator_4 :: [Int] -> [[Bool]]

Take a moment to make sure you understand what we've done here:
Haskell has made us a curried function which takes a list of values
instead of a single value, and generates a list of lists. This
seems pretty close to magic! Here, then, is gen_rightmost_bits
itself: it takes the number of bits and a list, generates the
curried function, and maps it over the list:

>gen_rightmost_bits num_bits ls =
>   map (rightmost_bits num_bits) ls

As you can see, working with curried functions is completely
effortless in Haskell. Here's a general function to generate
a list of all the bit values for unsigned numbers of num_bits
size:

>gen_all_bits_unsigned num_bits =
>   gen_rightmost_bits num_bits (gen_n_bit_nums_unsigned num_bits)

>gen_all_bits_signed num_bits =
>   gen_rightmost_bits num_bits (gen_n_bit_nums_signed num_bits)

If we execute:

gen_all_bits_unsigned 4

We get back the following (I have reformatted the output for clarity):

[[False, False, False, False],
[False, False, False, True],
[False, False, True, False],
[False, False, True, True],
[False, True, False, False],
[False, True, False, True],
[False, True, True, False],
[False, True, True, True],
[True, False, False, False],
[True, False, False, True],
[True, False, True, False],
[True, False, True, True],
[True, True, False, False],
[True, True, False, True],
[True, True, True, False],
[True, True, True, True]]

This is recognizably a table of the unsigned binary values 0..15.
This approach doesn't seem very "Haskell-ish" yet -- instead of
generating the combinations as needed, I'm producing them all
ahead of time. It also feels to me like I'm pushing uphill on
the syntax, and having to use parentheses to force the order
of evaluation. We can think more about that later, but let's
get our pretty-printing working first.

How can I turn all those boolean values into a string I can print?
Well, we can take advantage of the fact that in Haskell, strings can
be treated as lists, and the same methods apply. This is not efficient,
but it allows us to use some of the standard functional techniques.
One of these is "fold."

Fold is a technique for boiling down a list into some kind of summary
form. Just what form this summary takes is up to you, since you can
supply your own function and a starting value. For example, to sum
the elements in a list, I could use foldl like this:

foldl (\ x y -> x + y) 0 [0, 1, 2]

The first parameter to foldl is a function. In this case I'll use
a lambda expression, which gives me a function I won't bother to
name. The second parameter is the starting value. The starting value
and the next item in the list are repeatedly passed to the function,
which accumulates a value. In this case, the final result is 3.

I can give foldl an empty string as the starting accumulator value
and my function can concatenate strings onto it:

>stringify_vals =
>   foldl (\ x y -> if y then x ++ "#" else x ++ " ") ""

Now you might want to try feeding our list of lists to this
function:

stringify_vals (gen_all_bits_unsigned num_bits)

If you do that, though, you'll get a type error like this:

"Couldn't match expected type `Bool' against inferred type `[Bool]'"

The reason can be seen if we examine the types:

:type (gen_all_bits_unsigned 4)
(gen_all_bits_unsigned 4) :: [[Bool]]

:type stringify_vals
stringify_vals :: [Bool] -> [Char]

We have a list of lists of bools, but we want to feed our
stringification function a list of bools. To apply each
element of (gen_all_bits_unsigned 4) to our stringification
function, we can use map again:

>stringify_all_unsigned_vals num_bits =
>   map stringify_vals (gen_all_bits_unsigned num_bits)

>stringify_all_signed_vals num_bits =
>   map stringify_vals (gen_all_bits_signed num_bits)

This will give us a list of strings, one for each list of boolean
values. Evaluating stringify_all_unsigned_vals 4 gives us the
following (I've reformatted the output):

["    ",
"   #",
"  # ",
"  ##",
" #  ",
" # #",
" ## ",
" ###",
"#   ",
"#  #",
"# # ",
"# ##",
"##  ",
"## #",
"### ",
"####"]

That's nice: we can start to see our recursive structure! But
we really want that value to be turned into a single printable
string. To achieve that, we can use the unlines function.

>prettyprint_unsigned num_bits =
>   unlines (stringify_all_unsigned_vals num_bits)

>prettyprint_signed num_bits =
>   unlines (stringify_all_signed_vals num_bits)

When we interpret

prettyprint_unsigned 3

we see a single string with newline characters escaped, like so:

"   \n  #\n # \n ##\n#  \n# #\n## \n###\n"

This will give the results we want when we pass it to putStr, like so:

putStr (prettyprint_unsigned 8)

        #
       #
       ##
      #
      # #
      ##
      ###
     #
     #  #
     # #
     # ##
     ##
     ## #
     ###
     ####

    (etc.)

Hmmm. That's nice, but it would be really cool if the results
were rotated, so that we could see what it would look like if
a printhead moving horizontally was printing out these values.
To do this, we can actually start with our intermediate list of
lists. If our four-bit numbers give us

[[False, False, False, False],
[False, False, False, True],
[False, False, True, False],
etc.

we can see the values the other way using transpose. Let's write
rotated versions of gen_all_bits_unsigned and gen_all_bits_signed:

>gen_all_bits_unsigned_rot num_bits =
>   transpose
>   (gen_rightmost_bits num_bits (gen_n_bit_nums_unsigned num_bits))

>gen_all_bits_signed_rot num_bits =
>   transpose
>   (gen_rightmost_bits num_bits (gen_n_bit_nums_signed num_bits))

And make a new stringify function to play with:

>stringify_all_unsigned_vals_rot num_bits =
>   map stringify_vals (gen_all_bits_unsigned_rot num_bits)

>prettyprint_unsigned_rot num_bits =
>   unlines (stringify_all_unsigned_vals_rot num_bits)

Now

putStr (prettyprint_unsigned_rot 6)

will give us:

                                 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Now that's what I'm talking about!

We can use the signed version to see the interesting transition
point between negative and positive values:

>stringify_all_signed_vals_rot num_bits =
>   map stringify_vals (gen_all_bits_signed_rot num_bits)

>prettyprint_signed_rot num_bits =
>   unlines (stringify_all_signed_vals_rot num_bits)

Now

putStr (prettyprint_signed_rot 6)

will give us:

 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Now let's give our program a main function using the IO monad and
have it print out several of our results:

>main :: IO ()
>main = do
>   print "Unsigned values (8 bits)"
>   putStr (prettyprint_unsigned 8)
>   print "Signed values (8 bits)"
>   putStr (prettyprint_signed 8)
>   print "Unsigned values (6 bits) rotated"
>   putStr (prettyprint_unsigned_rot 6)
>   print "Signed values (6 bits) rotated"
>   putStr (prettyprint_signed_rot 6)

After using GHCi's :cd and :load command to read this file, just
type:

main

I would invite any readers to play around with the code, and if you can make it more concise and/or more "Haskellish," please leave a comment.

Now if only we could get Haskell to generate the appropriate sound! My dot-matrix printer was loud. But that's a task for another day.

Update: there is a followup article to this one which develops the code further; see http://praisecurseandrecurse.blogspot.com/2006/12/revised-dot-matrix-printhead.html. Also, make sure to take a look at the comments on both this posting and the followup.

08 December 2006

The Divisive Aftermath

Antti-Juhani pointed out to me that Haskell provides the quot and rem pair of operations in addition to div and mod. The difference: both pairs satisfy the identity involving division, multiplication, and the remainder, but div and mod round down (towards negative infinity), while quot and rem round towards zero (truncating the fractional part).

That's great! (Besides the fact that I displayed my ignorance in public, that is). It means I really can use GHCi to model my functions for testing! And I was getting worked up to write a C version of Plauger's div which rounded away from zero for negative quotients so I could have some guaranteed-consistent behavior. Instead I can use the C standard function div and Haskell's quot and rem.
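
Here is a quick GHCi comparison of the two pairs (standard Prelude functions, nothing more):

(-7) `div` 2

-4

(-7) `mod` 2

1

(-7) `quot` 2

-3

(-7) `rem` 2

-1

Either way, quotient * 2 + remainder takes us back to -7, so both pairs satisfy the identity.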

So, this raises the question: is there a good reason to prefer one method of integer division rounding to another? Yes. It comes down to this: do you want your algorithms to behave smoothly across the origin, or symmetrically around the origin?

Let's briefly consider some Haskell code for mapping a range of fractions onto a range of integers again.

let gen_nths denom =
    [ (numer, denom::Int) | numer <- [ (-denom::Int)..(denom::Int) ] ]

let div_map m1 m2 = map (\ tup -> fst tup *
    ( ( (m2::Int) - (m1::Int) ) `div` 2 ) `div` snd tup )

div_map (-10) 10 ( gen_nths 3 )

[-10, -7, -4, 0, 3, 6, 10]

Notice something about those values: the distances between them follow a consistent pattern. Reading the list left to right, we have distances of 3, 3, 4, 3, 3, 4. This pattern repeats. It is smooth across the origin.

Now let's change our div_map function to use quot and rem:

let quot_map m1 m2 = map ( \tup -> fst tup *
    ( ( (m2::Int) - (m1::Int) ) `quot` 2) `quot` snd tup )

quot_map (-10) 10 ( gen_nths 3 )

[-10, -6, -3, 0, 3, 6, 10]

Looking at the differences between values, from left to right, we find that they are 4, 3, 3, 3, 3, 4. The function is symmetric around the origin.

Does this matter in practice? Of course! Let's say that we have a mapping using a non-origin-crossing version of our quot_map function:

let gen_non_negative_nths denom =
    [ (numer, denom::Int) | numer <- [ (0::Int)..(denom::Int) ] ]
let quot_map_2 m1 m2 = map ( \tup -> fst tup *
    ( (m2::Int) - (m1::Int) ) `quot` snd tup )

quot_map_2 0 20 ( gen_non_negative_nths 6 )

[0, 3, 6, 10, 13, 16, 20]

Shiny. The intervals go 3, 3, 4, 3, 3, 4. But if we shift the values:

let gen_non_positive_nths denom =
    [ (numer, denom::Int) | numer <- [ (-denom::Int)..(0::Int) ] ]

quot_map_2 0 20 ( gen_non_positive_nths 6 )

[-20, -16, -13, -10, -6, -3, 0]

The intervals go 4, 3, 3, 4, 3, 3 -- they are backwards. In other words, the effect of integer rounding on the values of the codomain changes if you shift the domain of the numerator across the origin: the ordering of the pattern is reversed.

If we do the same thing using div:

let div_map_2 m1 m2 =
    map (\ tup -> fst tup * ( (m2::Int) - (m1::Int) ) `div` snd tup )

div_map_2 0 20 ( gen_non_negative_nths 6 )

[0, 3, 6, 10, 13, 16, 20]

div_map_2 0 20 ( gen_non_positive_nths 6 )

[-20, -17, -14, -10, -7, -4, 0]

we get a pattern of intervals (the effect of rounding) which remains the same: 3, 3, 4, 3, 3, 4.

Now, smoothness across the origin is what I want, at least for the kinds of functions I am working on now. But your problem domain may lend itself to writing functions which involve rotations, or mapping values across the origin: in that case, you're going to want the symmetry. The important thing is to know what strategy you are using and apply it consistently. I'm really impressed that Haskell gives me so many interesting options.

07 December 2006

The Division Bell Tolls for Me, Part Four (Conclusion)

Well, this was initially going to be a three-part article, but things have become more interesting. The situation is not unlike picking up a log in the forest and finding all kinds of nifty crawlies living under it. "Interesting" to certain geeky people like myself, at least! It's about to get geekier. If you are allergic to ANSI or ISO standards, you'd better stop reading now. But I claim the payoff is worth it, if you can make it through to the end. At least, for some value of "worth it."

In the last installment we were exploring the behavior of integer division in Haskell. I want to get back to C now. We learned that the behavior of integer division involving negative results is not strictly defined for C. The behavior of % (mod) is also not very strictly defined.

Let's start with Kernighan and Ritchie's The C Programming Language, Second Edition (I'm not even going to try to figure out rules for pre-ANSI/ISO C). K&R tell us on p. 10 that

...integer division truncates; any fractional part is discarded.

That isn't very detailed, but on p. 41 they tell us

the direction of truncation for / and the sign of the result for % are machine-dependent for negative operands, as is the action taken on overflow or underflow.

OK, but that doesn't describe exactly what the options are. Things get a little bit more detailed on p. 205, where we read that no guarantees are made about the sign of the remainder!

...if the second operand is zero, the result is undefined. Otherwise, it is always true that (a/b)*b + a%b is equal to a. If both operands are non-negative, then the remainder is non-negative and smaller than the divisor; if not, it is guaranteed only that the absolute value of the remainder is smaller than the absolute value of the divisor.

This isn't actually internally consistent, because if the sign of the remainder can be wrong, then the identity for division can't hold for negative values unless the sign of the quotient is allowed to be wrong too. For example, with a = -7 and b = 2, a quotient of -3 forces a remainder of -1, while a quotient of -4 forces a remainder of 1; once the quotient is chosen, the identity (a/b)*b + a%b == a pins down the remainder, sign and all. I've never seen it suggested anywhere else that division could be that loosely defined. But K&R isn't the formal standard, so let's move on.

The reason for the "looseness" that exists in the C standard, of course, is that the standard was originally written to, as much as possible, codify existing practice, and C has a strong bias towards efficiency. Since (apparently) different hardware division algorithms had different behavior, the standards committee did not feel that it was appropriate to require existing implementations to change behavior. Doing so might have required existing architectures to have to replace hardware division with software division, which could have inflicted an enormous efficiency cost.

According to ISO/IEC 9899:1990 (known as C90):

When integers are divided and the division is inexact, if both operands are positive the result of the / operator is the largest integer less than the algebraic quotient and the result of the % operator is positive. If either operand is negative, whether the result of the / operator is the largest integer less than or equal to the algebraic quotient or the smallest integer greater than or equal to the algebraic quotient is implementation-defined, as is the sign of the result of the % operator. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a.

So we know that there are two options for rounding. In order to maintain the identity involving integer division and mod, though, the value of mod has to be consistent with the rounding strategy, even if the sign does not have to be consistent.

There is a back door that lets us get to defined behavior. Apparently the div function does have strictly defined rounding. In section 7.10.6.2 of the standard, we read:

If the division is inexact, the resulting quotient is the integer of lesser magnitude that is the nearest to the algebraic quotient.

That's interesting -- this is rounding towards zero. The div function returns a structure of type div_t that contains both the integer quotient and integer remainder. It is provided in the <stdlib.h> header, available in C++ as <cstdlib>. In his book The Standard C Library, P. J. Plauger provides an implementation for div. It looks like this:

div_t (div)(int numer, int denom)
    {        /* compute int quotient and remainder */
    div_t val;
    val.quot = numer / denom;
    val.rem = numer - denom * val.quot;
    if (val.quot < 0 && 0 < val.rem)
        {    /* fix remainder with wrong sign */
        val.quot += 1;
        val.rem -= denom;
        }
    return (val);
    }

We can see what he is doing: for negative quotients that are inexact answers to the division problem, he shifts the quotient up towards zero and adjusts the remainder to match. For example, on a platform where -7 / 2 rounds away from zero to -4, val.rem starts out as -7 - 2 * (-4) = 1; since val.quot < 0 and val.rem > 0, the fixup produces quot = -3 and rem = -1, the truncated-towards-zero result. Note that he does not use the built-in % facility, presumably because its behavior does not have very strong guarantees placed on it.

C: A Reference Manual (5th Edition) by Harbison and Steele seems to indicate that the semantics are a little more rigorously defined. We read there that

For integral operands, if the mathematical quotient of the operands is not an exact integer, then the fractional part is discarded (truncation toward zero). Prior to C99, C implementations could choose to truncate toward or away from zero if either of the operands were negative. The div and ldiv library functions were always well defined for negative operands.

That seems pretty unambiguous, but then on page 419, when describing the div and ldiv functions, the authors write

The returned quotient quot is the same as n/d, and the remainder rem is the same as n%d.

But that ignores the possibility of a pre-C99 implementation where / rounds away from zero for negative values, as Haskell's div does in my GHCi tests. So again we have a lack of rigorous internal consistency. It is worth noting here that K&R don't make any extra statements about the required rounding behavior of div and ldiv, and since K&R is still considered the "bible" by many C programmers, the guarantees on the div and ldiv functions may not be very well-known -- or properly implemented.

Does C99 clarify this at all? ISO/IEC 9899:1999(E), Second Edition 1999-12-01, tells us that "the result of the / operator is the algebraic quotient with any fractional part discarded" and a footnote indicating that this is often called "truncation towards zero."

How about C++? ISO/IEC 14882, Second Edition (2003-10-15) does not actually say how rounding is performed, although it says that "If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined." There is a footnote indicating that "According to work underway toward the revision of ISO C, the preferred algorithm for integer division follows the rules defined in the ISO Fortran standard" (rounding towards zero). When we try to look up information about div and ldiv, we find that the C++ standard just refers us to the Standard C library.

OK, all the standards language is giving me a headache; let's take a look at some code. To begin with, let's confirm the way my C implementation does rounding of integer quotients. We took a look at 6/5, -6/5, 5/6, and -5/6 in Haskell; let's look at the same quotients in C:

int val_1 = 6;
int val_2 = 5;
int val_3 = -6;
int val_4 = -5;
int result_1 = val_1 / val_2;
int result_2 = val_3 / val_2;
int result_3 = val_2 / val_1;
int result_4 = val_4 / val_1;
printf("result_1 = %d, result_2 = %d, result_3 = %d, result_4 = %d\n",
    result_1, result_2, result_3, result_4);

This gives me:

result_1 = 1, result_2 = -1, result_3 = 0, result_4 = 0

Yep, I'm getting rounding towards zero for both positive and negative values. Next, some Haskell. Let's bring back our algorithm from part 2:

let gen_nths denom =
    [(numer, denom::Int) | numer <- [(-denom::Int)..(denom::Int)] ]

let div_map m1 m2 = map (\tup -> (fst tup + snd tup) *
    (((m2::Int) - (m1::Int)) `div` 2) `div` snd tup + m1)

div_map (-10) 10 (gen_nths 3)

[-10,-7,-4,0,3,6,10]

And now let's write up my C algorithm again. This time we'll use int instead of long since we don't need big numbers (although on my target, ints are actually 32 bits long).

unsigned int denom = 3;
int numer;
for ( numer = -3; numer <= 3; numer++ )
{
    int quot = numer * 10 / denom;
    printf("quotient %d\n", quot);
}

Go ahead and try it on your platform.

Whoah! What happened? You might have noticed something very odd about the first few quotients. They are monstrously large! On my target I get:

quotient 1431655755
quotient 1431655758
quotient 1431655762
quotient 0
quotient 3
quotient 6
quotient 10

What happened is that I injected a bug to point out yet another possible way your C code can fail catastrophically. This behavior can surprise even experienced programmers. Note that since my denominator is always a positive value, I declared it as an unsigned int instead of an int. So what went wrong?

Let's look at the first example: -3 * 10 / 3 yielding 1431655755. The behavior you are looking at is actually mandated by the C standard. When signed and unsigned types are mixed in an expression, the compiler is required to convert the operands to an unsigned type. So, internally, when we perform the calculation -30 (signed) / 3 (unsigned), the compiler is required to notice that both a signed and an unsigned value are presented to the integer division operator. The 32-bit twos-complement representation of -3 is 0xFFFFFFFD (the conversion to an unsigned intermediate value changes the meaning, not the representation). Multiplying this value by 10 yields 0xFFFFFFE2, and dividing it by 3 yields 0x5555554B, or a decimal value of 1431655755. The compiler sticks this unsigned value right into the signed result variable. The signedness of the destination variable has absolutely no effect; it does not dissuade the compiler from treating the values as unsigned. (It is a common misconception to think that the destination variable influences the type of the calculation in C.)

If you have a target with 16-bit ints, your rolled-over values will be different (0xFFFD, 0xFFE2, and 0x554B, or decimal 21835). So the mixing of signed and unsigned values has not only given us the wrong answer, but made the code that generates the wrong answer produce inconsistent results depending on the size of int, even when the values we start with are not actually too big to fit into 16 bits.

If that wasn't perfectly clear, and even if it was, if you are really trying to write portable numeric code in C that has to be correct, I urge you to consider purchasing the MISRA (Motor Industry Software Reliability Association) book MISRA-C:2004, Guidelines for the Use of the C language in Critical Systems. It has the most detailed explanation of C's arithmetic type conversions, and more importantly, the implications of these type conversions in real code. The crux of the biscuit is that C is hard to get right, even for good programmers.

OK, so let's change my denominator back to a signed int and try it again. This time the results look more reasonable: -10, -6, -3, 0, 3, 6, and 10.

Are things any different if we use div?

int denom = 3;
int numer;
for ( numer = -3; numer <= 3; numer++ )
{
    div_t val = div(numer * 10, denom);
    printf("quotient %d\n", val.quot);
}

In my case, no: on my compiler and target, / already rounds towards zero, exactly as div is required to. (Haskell's div, by contrast, always rounds downwards -- floor behavior.)

By the way, while we're at it, let's see what C has to say about the "weird" number divided by -1:

long most_negative_long = -2147483648;
long minus_one = -1;
long result = most_negative_long / minus_one;
ldiv_t ldiv_result = ldiv(most_negative_long, minus_one);
printf("results: %lX, %lX\n", result, ldiv_result.quot);

What do you get? I get zero as the result of both the / and ldiv.

I also get GCC telling me "warning: this decimal constant is unsigned only in ISO C90." What does that mean? The compiler is apparently warning us that in C99 we won't be getting an implicitly unsigned value out of the constant -2147483648, in case we might have wanted to use this constant in an expression involving signed types to force the calculation to be promoted to an unsigned type. But why would we get an unsigned value in the first place? Apparently in C90, the number's magnitude is interpreted first, and the value is negated. 2147483648 is too large (by one) to fit into a signed long, so the compiler promotes the type to an unsigned long, then negates it.

I have been trying to come up with an example of an expression that behaves differently when I use -2147483648 as a constant, as opposed to 0x80000000, but so far I haven't been able to come up with one. This may be because my compiler is already operating under C99 rules and so I never get the implicit promotion to an unsigned value.

Anyway, be that as it may, some parting advice and conclusions:

1. GHCi (on the PC and Mac) and C (Visual C++ 6.0 and GCC on the PC and Mac) yield distinctly different rounding behavior for integer division. C90 allows this behavior. Does Haskell? Would it make sense for Haskell to define integer division more strictly, the way that C99 does?

2. Division by zero is never OK; neither is taking % zero. Both are a fatal error in GHCi. The results are undefined in C (but a fatal error is generated by most implementations).

3. Overflow (which, in the case of integer division, occurs only when the "weird number" is divided by -1) produces undefined results, and your code should avoid it, both in Haskell and in C.

4. Operations on unsigned values are guaranteed to produce results that are consistent from platform to platform, assuming the integer size is the same. Operations on signed values don't have this guarantee.

5. The rounding (truncation) of inexact quotients may be either towards zero in all cases, or towards zero for positive quotients and away from zero for negative ones. If your implementation follows the newer C99 standard, rounding should always be towards zero.

6. Mixing signed and unsigned values in C expressions is very dangerous. If you do so, your implementation may have errors far more severe than differences in rounding behavior.

7. Consider avoiding signed division of negative values in C altogether, if your algorithm allows it. Either implement the algorithm using entirely unsigned integral types, or shift the domain of your function so that you are operating on non-negative signed values, and shift it back (using subtraction, which has well-defined semantics) when your division is done; see the sketch just after this list.

8. Algorithms that feed integer division a range of values crossing the origin may behave differently in GHCi than in C. Because Haskell's div does not provide the guaranteed rounding towards zero that C99 and the div and ldiv functions require, it is difficult to prototype with div in GHCi and expect to get the same results in C (but see the note on quot just after this list).

And, finally,

9. Make sure your implementation actually works the way it is supposed to. Only testing can truly accomplish this.
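A postscript to points 5 and 8: the Haskell Prelude also provides quot (and its companion rem), which the Haskell 98 report defines as integer division truncated towards zero -- exactly the C99 behavior. I didn't use it in these experiments, but if you prototype with quot instead of div, the results should line up with C. A quick sketch, reproducing the C results from Part Two:

map (\numer -> (numer * (32768::Int)) `quot` (11::Int)) [-11..11]

[-32768, -29789, -26810, -23831, -20852, -17873, -14894, -11915, -8936, -5957, -2978, 0, 2978, 5957, 8936, 11915, 14894, 17873, 20852, 23831, 26810, 29789, 32768]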

Interesting, isn't it, how studying one language can enrich your understanding of another! At least, it works for me!

The Division Bell Tolls for Me, Part Three

In the last installment, I pointed out a discrepancy between the way that Haskell does integer division and the way that some C code does it. Let's look at just how integer division behaves in Haskell a little more closely. You can type these expressions into GHCi:

6 / 5

1.2

5 / 6

0.8333333333333334

(6::Int) `div` (5::Int)

1

(5::Int) `div` (6::Int)

0

Integer division in the positive numbers always rounds down: 1.2 rounds down to 1, and 0.8333... rounds down to 0. In fact, it looks like the fractional part is just truncated. But let's check some negative values to be sure:

-6 / 5

-1.2

(-6::Int) `div` (5::Int)

-2

-5 / 6

-0.8333333333333334

(-5::Int) `div` (6::Int)

-1

That's interesting; clearly it is not strictly truncation (rounding towards zero) that is happening, or else -5/6 would give us zero, not -1. And we clearly aren't getting "rounding" in the usual sense of rounding to the nearest integer (although there is no universal rounding algorithm to round to the nearest integer; there are lots of different variations in practice).

It looks like we have floor() behavior: rounding towards negative infinity, which in the case of negative quotients means rounding away from zero. To state it more precisely: if the result of the div operation is a whole number, we get that whole number; otherwise we get the largest integer less than the exact answer.

If you learned integer division the old-fashioned way, on paper, you know what remainders are. The remainder is the leftover part of the process, and represents the numerator of the fractional part of the quotient. In Haskell, div gives you the whole number result of integer division, and mod gives you the remainder. So, for example,

(2::Int) `div` (3::Int)

0

(2::Int) `mod` (3::Int)

2

In fact, what this points to is that there is a relationship between the result of div and the result of mod. It should hold true in all cases, actually; in C, it is part of the ISO standard. The relationship is this:

For any quotient o = m `div` n where n is not zero, and the remainder p = m `mod` n, o * n + p = m.

In other words, you can reverse the integer division by multiplication as long as you add the remainder back in.

This relationship is obviously true for positive n and non-negative m. But I'll go further and say that while the exact meaning of mod for negative numbers varies from language to language, and perhaps from implementation to implementation, this relationship should still hold, with the possible exception of cases involving overflow (which I'll discuss further below).

We can represent the relationship in Haskell like this:

10 `div` 3

3

10 `mod` 3

1

let undivide n o p = o * n + p
undivide 3 ( 10 `div` 3 ) ( 10 `mod` 3 )

10

In fact, let's test it for a few values of n and m:

let check_divide m n = ( undivide n ( m `div` n ) ( m `mod` n ) ) == m
[ check_divide m n | m <- [(-5::Int)..(5::Int)], n <- [(-5::Int)..(-1::Int)] ]
[ check_divide m n | m <- [(-5::Int)..(5::Int)], n <- [(1::Int)..(5::Int)] ]

Try out the results for yourself; again, not a rigorous proof, but suggestive.
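If you'd rather see a single True than scan a list of them, you can collapse the whole test with and, using a guard to skip the forbidden n = 0. A quick sketch:

and [ check_divide m n | m <- [(-5::Int)..(5::Int)], n <- [(-5::Int)..(5::Int)], n /= 0 ]

True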

But what about the overflow cases? Let's examine them. If Haskell's Int type is a 32-bit signed int, then we should be able to represent the values −2,147,483,648 (corresponding to 10000000000000000000000000000000 in binary or 0x80000000 in hexadecimal), to +2,147,483,647 (corresponding to 01111111111111111111111111111111 in binary or 0x7FFFFFFF in hexadecimal). Let's ask Haskell what happens when we try to exceed our largest positive number:

(2147483647::Int) + (1::Int)

-2147483648

Yep, we overflow; we get the largest negative number. Conversely, we can underflow:

(-2147483648::Int) - (1::Int)

2147483647
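Incidentally, rather than hard-coding those limits, you can ask GHCi for them; Int is an instance of Bounded, so minBound and maxBound are defined. On my 32-bit machine:

(minBound::Int, maxBound::Int)

(-2147483648,2147483647)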

We don't get run-time error checking (and probably don't want it, for the sake of efficiency, when using machine types). Of course, division by zero is still considered a run-time error:

(2147483647::Int) `div` (0::Int)

*** Exception: divide by zero

What about overflow or underflow in the results of division? With multiplication, it is very easy to overflow; multiplying two large numbers can easily exceed the width of a 32-bit signed value. You can come pretty close with:

(46341::Int) * (46340::Int)

2147441940

because 46340 is the next-lowest integer value to the square root of 2,147,483,647. Generate a result just slightly higher and you will roll over to a large negative:

(46341::Int) * (46341::Int)

-2147479015

With multiplication there are lots of combinations of values that will overflow, but with division it is a little bit harder to trigger overflow or underflow. There is one way, though. While 2,147,483,647 divided by any lesser value will be a lesser value, and divided by itself will equal one, recall that twos-complement representation gives us a slightly asymmetric set of values: the largest representable negative number is one larger, in absolute magnitude, than the largest representable positive value. Thus:

(2147483647::Int) `div` (1::Int)

2147483647

(2147483647::Int) `div` (-1::Int)

-2147483647

(-2147483648::Int) `div` (1::Int)

-2147483648

In 32 bits of twos-complement, -2,147,483,648 is the "weird number." As a general rule, if you take a twos-complement number, flip all the bits, and add one, you get the negative of that number. The only exception is the weird number.
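You can check that rule right in GHCi using Data.Bits -- a quick sketch, assuming a 32-bit Int:

:m + Data.Bits

complement (5::Int) + 1

-5

complement (-2147483648::Int) + 1

-2147483648

The weird number is its own negation. Now let's see what happens when we try to divide it by -1: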

(-2147483648::Int) `div` (-1::Int)

Now, I would like to tell you what GHCi says when I ask it to divide the weird number by -1. I'd also like to see what it says about the mod. But I can't. GHCi crashes when I type in those expressions.

So, I can't tell you what Haskell says. But I can tell you what the C standard says. Here's a hint: not only is it not defined, but in fact the result of any overflow that occurs with arithmetic operations on signed integral types is not defined by the language standard. Only the results of arithmetic operations on unsigned integers have defined semantics for overflow, making them portable -- assuming that the size of the type in question is the same.
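In the meantime, you can watch those defined unsigned semantics at work in GHCi using Word32 from Data.Word, which is specified to wrap modulo 2^32. A quick sketch:

:m + Data.Word

(4294967295::Word32) + 1

0

(0::Word32) - 1

4294967295

More on that next time.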

05 December 2006

The Division Bell Tolls for Me, Part Two

OK. In our last installment I started talking about using Haskell one-liners to calculate some functions. The function in question was designed to map a set of fractions in the range -7/7 to 7/7 to the machine integers 0..65536. The function itself, along with a wrapper to test it, fits neatly into a Haskell one-liner (well, it might be more than one line, depending on your terminal width). You can use GHCi to evaluate it:

map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

I showed in the last installment that this function gave me the results I wanted:

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

Let's play with that function a little more, and then I want to talk about division (hence the title of the entry). In particular, integer division.

First, let's simplify our expressions by breaking them down and establishing some bindings. We can also get rid of the presumption that our output range starts at zero:

let gen_nths denom = [ (numer, (denom::Int)) | numer <- [(-denom)..denom] ]

let div_map m1 m2 = map (\tup -> (fst tup + snd tup) * (((m2::Int) - (m1::Int)) `div` 2) `div` snd tup + m1)

And now we can test it. Note that we need to add a few parentheses:

div_map (-32768) 32768 (gen_nths 11)

[-32768, -29790, -26811, -23832, -20853, -17874, -14895, -11916, -8937, -5958, -2979, 0, 2978, 5957, 8936, 11915, 14894, 17873, 20852, 23831, 26810, 29789, 32768]

OK. Now what is it we've done exactly? Well, we've generated some test values, and an algorithm, for distributing a fractional range onto a machine integer range. For our purposes, think of it as a little machine for testing division.

To see just why I chose this algorithm to play with Haskell, we have to go back and consider some C code. It's quite a headache-inducing paradigm shift to go back to C/C++, but bear with me. Let's just start with the simplest thing: we'll create the same mapping we just generated, hard-coding the values:

long denom = 11;
long numer;
for ( numer = -11; numer <= 11; numer++ )
{
    long val = numer * 32768 / denom;
    printf("%ld\n", val);
}

And the results:

-32768
-29789
-26810
-23831
-20852
-17873
-14894
-11915
-8936
-5957
-2978
0
2978
5957
8936
11915
14894
17873
20852
23831
26810
29789
32768

Hmmm... wait a minute, let's compare the results:

Haskell            C
-32768             same
-29790             -29789
-26811             -26810
-23832             -23831
-20853             -20852
-17874             -17873
-14895             -14894
-11916             -11915
-8937              -8936
-5958              -5957
-2979              -2978
0                  same
2978               same
5957               same
8936               same
11915              same
14894              same
17873              same
20852              same
23831              same
26810              same
29789              same
32768              same

It looks like every negative value with a fractional part is off by one! In our third and final installment I'll dive a little deeper into the semantics of integer division.

The Division Bell Tolls for Me, Part One

With apologies to both Pink Floyd and John Donne.

I've been toying with Haskell. I wish I had enough free time for a systematic study of Richard Bird's Introduction to Functional Programming Using Haskell, Second Edition, but for now I have to content myself with dipping into it in small scraps of free time when both babies are asleep and the dishes are done. In addition, I'm using Haskell to model the behavior of some functions that I'm actually writing in C++.

One of the reasons I like Haskell is that with a tool like GHCi, which gives me a REPL (a Read, Evaluate, Print loop for interactive development), I can write toy one-liners that do something useful, and since Haskell is statically typed, I can test how machine types behave. With an important exception, which I will discuss.

Let's say I want to try out a function that maps the fractional values -7/7 through 7/7 to unsigned integers in the range 0..65536 (the range of values representable by 16 bits, plus one). In Haskell I can try this out using a one-liner (typed into GHCi). Let's build it up. First, the range of input values. Ignore for now the possibility of using or introducing a true rational number type; instead we'll generate a set of tuples. In Haskell you can express this very concisely using a list comprehension:

[(numer, 7) | numer <- [-7..7] ]

[(-7,7),(-6,7), (-5,7), (-4,7), (-3,7), (-2,7), (-1,7), (0,7), (1,7), (2,7), (3,7), (4,7), (5,7), (6,7), (7,7)]

The list comprehension can be (roughly) read as "Make a list of tuples out of numer, 7 where numer takes on the values from the list -7 to 7." Think for a moment about how you could do this in your favorite language.
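As an aside, the list comprehension isn't the only way to build this list in Haskell; zipping against an infinite list of sevens produces the same thing, thanks to lazy evaluation (zip stops at the end of the shorter list). A quick sketch:

zip [-7..7] (repeat 7)

[(-7,7),(-6,7),(-5,7),(-4,7),(-3,7),(-2,7),(-1,7),(0,7),(1,7),(2,7),(3,7),(4,7),(5,7),(6,7),(7,7)]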

Now let's do our math on the resulting list. We can use map to apply a function to the list. We can use fst and snd to get to the values in the tuple; for example, to look at the numerators, do this:

map fst [(numer, 7) | numer <- [-7..7] ]

[-7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]

Let's go ahead and do our scaling. We want to describe a single function to pass each tuple to; in fact we want a lambda abstraction, a function that we don't bother to name:

map (\tup -> fst tup / snd tup * 65536 ) [(numer, 7) | numer <- [-7..7] ]

[-65536.0, -56173.71428571428, -46811.42857142857, -37449.142857142855, -28086.85714285714, -18724.571428571428, -9362.285714285714, 0.0, 9362.285714285714, 18724.571428571428, 28086.85714285714, 37449.142857142855, 46811.42857142857, 56173.71428571428, 65536.0]

Whoops, that's not quite right. We want to map to only the positive values. We have to shift the domain (the incoming numerators) so they are non-negative:

map (\tup -> (fst tup + snd tup) / snd tup * 65536 ) [(numer, 7) | numer <- [-7..7] ]

[0.0, 9362.285714285714, 18724.571428571428, 28086.85714285714, 37449.142857142855, 46811.42857142857, 56173.71428571428, 65536.0,  74898.28571428571, 84260.57142857143, 93622.85714285714, 102985.14285714286, 112347.42857142857, 121709.71428571429, 131072.0]

But now the codomain is too big; we have to cut it in half:

map (\tup -> (fst tup + snd tup) / snd tup * 65536 / 2) [(numer, 7) | numer <- [-7..7]]

[0.0, 4681.142857142857, 9362.285714285714, 14043.42857142857, 18724.571428571428, 23405.714285714286, 28086.85714285714, 32768.0, 37449.142857142855, 42130.28571428572, 46811.42857142857, 51492.57142857143, 56173.71428571428, 60854.857142857145, 65536.0]

That looks about right. Now let's apply a type. Because I'm eventually going to be implementing this without floating-point, I want the math to be done on machine integers. There are some exceptions, but most target architectures these days give you 32 bits for int, so Haskell's Int, which is 32 bits in GHC on my machine (the Haskell 98 report only guarantees at least 30 bits), should yield approximately what I want. To introduce typing, all I have to do is add a type to the list values. Type inference will do the rest. Oh, and since division via the slash operator is not defined on Int, I have to change it to `div` (infix division):

map (\tup -> (fst tup + snd tup) `div` snd tup * 65536 `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 0, 0, 0, 0, 0, 0, 32768, 32768, 32768, 32768, 32768, 32768, 32768, 65536]

Whoah. Something has gone disastrously wrong. Take a moment to figure out what it is. I'll wait.

Did you figure it out?

The problem is that when I use integer division in an expression, the result is truncated (the fractional part is lost). If the division occurs before any of the other operations, most of the significant bits of the math are lost! I can fix this by making sure to do the division last:

map (\tup -> (fst tup + snd tup) * 65536 `div` snd tup `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

That's better. While I'm at it, I can simplify out that factor of 2 by using half of 65536:

map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

[0, 4681, 9362, 14043, 18724, 23405, 28086, 32768, 37449, 42130, 46811, 51492, 56173, 60854, 65536]

In fact, I can ask GHC to verify that the results are the same:

map (\tup -> (fst tup + snd tup) * 65536 `div` snd tup `div` 2) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ] == map (\tup -> (fst tup + snd tup) * 32768 `div` snd tup) [(numer, (7::Int)) | numer <- [(-7::Int)..(7::Int)] ]

True

Very cool! Of course, this is not a rigorous proof, but I'm satisfied that my function is correct for the given inputs, and sometimes that's enough.

27 November 2006

Some Very Basic Haskell and Thoughts on Error Diagnosis

So, yesterday I attempted to write a very simple Haskell one-liner. I wanted to use a list comprehension to dump the results of some simple math on a range of numbers. Something like this:

[n / 5 | n <- [0..1024] ]

That's a tremendously concise program. You could read it as follows: "generate a list of n divided by 5 where n is drawn from the enumerated list of numbers 0 to 1024." Type this into GHCi (that's GHC in interactive mode) and it instantly spits out [0.0, 0.2, 0.4 through 204.8].

That's cool, but the reason I was writing this little program was actually to look at rounding behavior when doing division on hardware-sized integers. The '/' operator is not defined on the Int type, so if I ask the list comprehension to operate on a list of Int-typed values I understandably get an error:

Prelude> [n / 5 | n <- [(0::Int)..(1024::Int)] ]
<interactive>:1:1:
   No instance for (Fractional Int)
     arising from use of `/' at <interactive>:1:1-5
   Possible fix: add an instance declaration for (Fractional Int)
   In the expression: n / 5
   In the expression: [n / 5 | n <- [(0 :: Int) .. (1024 :: Int)]]
   In the definition of `it':
     it = [n / 5 | n <- [(0 :: Int) .. (1024 :: Int)]]

GHCi is giving me a lot of context here; that's nice. The message indicates the definition of "it." Apparently "it" is the way GHCi implements a REPL; my expression gets bound to "it" and evaluated. Then it zooms in on what goes wrong. Understanding this error message completely requires an understanding of type classes; I'm not fully up on type classes yet, but it obviously has something to do with type. The suggestion in this case is probably not what I want: I think it is suggesting that I could extend the slash operator to handle integer arguments. That would be a sensible suggestion if I was doing something a little more sophisticated with my own classes. But instead let's try div, which is supposed to do integer division, instead of the slash operator:

Prelude> [n div 5 | n <- [(0::Int)..(1024::Int)] ]
<interactive>:1:1:
   Couldn't match expected type `(a -> a -> a) -> t1 -> t'
          against inferred type `Int'
   In the expression: n div 5
   In the expression: [n div 5 | n <- [(0 :: Int) .. (1024 :: Int)]]
   In the definition of `it':
     it = [n div 5 | n <- [(0 :: Int) .. (1024 :: Int)]]

Whoah. Now the error message is different, but I'm doing something wrong. Let's see if we can decode it. It says that in n div 5 it is expecting the type (a -> a -> a) -> t1 -> t. What does that mean?

I think it means that the expression n div 5 is being parsed as the function n applied to the two arguments div and 5, so n would have to be a function taking div (something of type a -> a -> a) and then 5 -- but n is an Int. Or something like that. This error message may have everything I need to know; maybe after studying my copy of Bird's Introduction to Functional Programming Using Haskell, which should be arriving in the mail any day now, I'll be able to decipher this; but then again, maybe not. Time to do some Googling. Googling complete, with no answer; time to ask for help.

I got a very quick and helpful reply on fa.haskell. (The newsgroup comp.lang.haskell supposedly has been created, but it doesn't seem to show up in Google Groups yet). The helpful reader suggested that I use `div` (note that those are backticks, not single quotation marks). What this really means is that div in Haskell is a prefix operator, not an infix operator like the slash. It expects its arguments to come afterwards. The magic backticks allow transforming a prefix operator into an infix operator. That works great.
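A quick check in GHCi that the prefix and infix spellings really do agree:

Prelude> div 1024 5
204
Prelude> 1024 `div` 5
204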

My larger topic is error messages. Having tried to contribute just a little tiny bit on d2c, the Dylan to C compiler, motivated by stumbling across bugs and places where error messages were a bit weak, I can attest to the following principle. Since I've never heard it expressed anywhere else, I'm going to call it Potts' First Principle of Programming Language Implementation. Here it is:

A compiler implementation will require several times more code to provide a helpful diagnostic message than it does to detect a failure, issue a cryptic message, and exit. The diagnostic-generating code will be complicated, difficult to understand, and inelegant. It is nevertheless worth writing.

The difference between the ugly implementation which is helpful to the end user and the wonderfully elegant implementation that produces a cryptic failure message is that the former will be adopted and the latter will not.

Consider writing two methods, one that compiles good code quickly but doesn't give you much help when a failure is detected, and another fully instrumented method that diagnoses the problem in English. Having worked with a large number of different tools, I've found that the primary difference between a truly usable tool and one that is not so usable is the quality of the diagnostics.

A quick example from Dylan: Dylan allows multiple inheritance, which means that the inheritance graph of a given class can be not just a list but a DAG (a directed acyclic graph). This means you can have a diamond-shaped inheritance graph, in which one or more base classes are included via more than one path. This presents a problem when searching for the next applicable method. The rule, expressed informally, is this: the different partial orderings of the inherited classes, from child to parent to nth-generation ancestor, can't contradict each other. You can't have a class which is both a sports car first and a fire truck second, as well as a fire truck first and a sports car second. If you do, you don't know what is going to happen when you try to shift it into gear.

The d2c compiler complained about this situation with an error message that was, on the positive side, concise. It said "Inconsistent CPL" and quit.

The code in the compiler that detected this condition was a model of elegance. In order for it to display a much more detailed message, something like "class wonderCar: the partial class precedence list (sportsCar, fireTruck) reached via base class superClassTypeA contradicts the partial class precedence list (fireTruck, sportsCar) reached via base class superClassTypeB," some much uglier code has to be inserted, or wrapped around, that model of elegance, to accumulate the information to display to the end user. It's worth writing that method. It could even be implemented as a kind of exception handler when the elegant version fails.

Diagnostic messages: they're what's for breakfast.

25 October 2006

Exploring Haskell

Having gotten somewhat stalled on learning more advanced Scheme programming paradigms, I have decided to try learning Haskell. In particular, I want to use Haskell to get both more lazy and more functional. It is all too easy for me to write Scheme programs which use recursion and lists but are basically translations of the way I would write things in Pascal. I can build my little closures, and do maps and folds, but the programs still evaluate everything, still compute everything, and still use a lot of mutation. My Scheme functions often aren't referentially transparent; they're often really methods to mutate objects.

It is reasonably legitimate to use Scheme that way, but I'm not learning much.

In particular I'd love to use something more advanced for my embedded work, but it seems to me that Scheme, with its lack of static typing, would just turn compile-time errors into run-time errors; my programs might be shorter, but they wouldn't really be any more explicit.

In particular, although I have become reasonably facile with s-expressions and tend to be able to see "past" the parentheses, Haskell confirms my belief that it isn't actually necessary for the programmer to play compiler and write the abstract syntax tree him or herself. There's a little bit of buzz beginning about how Haskell is easier for beginners, and my experience with Dylan leads me to believe that typing is actually valuable; in Dylan, for example, it can make a difference between code that uses full runtime dispatch and takes twenty minutes to run, or a version that takes a few seconds.

I'd also like to better understand the lazy paradigm, where a very large or infinite list or tree does not have to be fully generated or fully evaluated in order to be useful. I'd like to understand Monads, or at least how to use them.

A lot of my programming work involves explicit or implicit state machines and plumbing to manage state changes. To improve this I really just need something a lot more concise and expressive than C++. I need leverage, not totally new programming paradigms, at least not at first. So we'll see if Haskell can help me there.

I've been faintly interested in learning languages like Clean, or ML, or Miranda, or Ocaml. But so far what I'm reading leads me to think that it may not be worth my time to bother with these and that Haskell has absorbed a lot of the things that are cool about these languages and may, in fact, be the shit. A large amount of material is being written about Haskell, and the stable base/experimental extensions approach seems like the right thing. In other words, Haskell seems like a language upon which a lot of smart people are converging.

And if it isn't, maybe I can take some of what I learned using it back to Scheme, or maybe Kernel.

More to come!

31 August 2006

Generic Functions and Pointers to Member Functions

People who know me know that I still carry a torch for a wonderfully-designed but little-used programming language called Dylan. In my arrogant opinion Dylan took much of the best of Common Lisp and CLOS, regularized and streamlined it, and gave it a syntax more acceptable (due to familiarity) to programmers from the C and C++ world (although this annoyed some prefix syntax fans in the process). Dylan is a multi-paradigm language, where "everything is an object," designed to support functional styles, object-oriented styles, and procedural styles.

Dylan is still alive and kicking with several impressive implementations available, including d2c, an amazing compiler that generates C code from Dylan source, and which is written primarily in Dylan itself, and the formerly commercial Functional Objects compiler, which generates native machine code. I have not studied the Functional Objects codebase very much, but I have spent time with d2c. It is an amazing piece of code. Studying it makes me wish I could strap on an auxiliary brain or two and take a pill that gave me a post-graduate education in language implementation without all that tedious mucking about in graduate school, so I could contribute something useful to the project instead of just an occasional bug report or ignorant question.

In Dylan, and in Dylan's sire Common Lisp with CLOS, classes contain data. Classes don't contain methods. Methods are implemented using a construct known as a generic function. A generic function lets you define several specialized functions to operate on objects of different classes. A variety of functions with the same name are bundled together; as you define additional functions, they get added to the generic function. At run-time, when you make a call to the generic function, the specific function chosen can depend on the actual run-time type (the class) of the object you send to it. Classes can inherit from one or more other classes; this inheritance tree is used to make the decision about which actual method to call for a given object. Generic functions can be specialized in other ways; they are extraordinarily flexible and powerful. But I'm only going to touch today on generic function dispatch specialized on the class of the incoming object, incoming parameters, and return type.

In his article "A First Look at Dylan: Classes, Functions, and Modules," Steve Strassmann writes:

Object-oriented languages, including Dylan, provide polymorphic functions, which means a given function may be executed as one of several possible implementations of that function, called methods... when Dylan sees a call to name(x), depending on what type of object x is, one of several methods is selected and executed. In Dylan, name is called a generic function, consisting of a family of name methods that implement the functionality of name for various classes (see Figure 2). Each of these methods "belongs to" its generic function (in this case, name) rather than to a class. This is a key point; it's the core difference between C++'s object model and Dylan's.

In terms of implementation, this means that when you make a generic function call, the generic function isn't a "standard" function; a dispatch mechanism comes into play to select the proper method based on the call's arguments. The run-time context contained in the object is explicitly passed to the function.

Strassmann gives examples of generic functions; I have stripped down one of his examples somewhat here:

// No type declaration, works on any type of object
define method double (x)
   pair(x, x);
end method double;

// Works on all numbers
define method double (x :: <number>)
   2 * x;
end method double;

// Works on all strings
define method double (x :: <string>)
   concatenate(x, x);
end method double;

When double is invoked on an argument, the most specific method is invoked... for example, double("foo") would invoke the third method, because <string> is more specific than <object>, which is what the first method is specialized to. If no match is found, Dylan will catch it and signal an error.

Besides a plethora of different ways to specialize generic functions, Dylan supplies various forms of introspection; if you want to extend or alter the run-time dispatch, you can do so.

To C++ programmers, this is inside-out. In C++, classes "contain" methods, known more commonly as "member functions" because they are "members" of the class. C++ supports polymorphism and dispatch based on run-time type. I'm going to talk a little bit about how objects and member functions work together in C++ and how you can use them, and how, when we look at C++ in light of Dylan or CLOS, C++ actually gives us a subset of object dispatch, presented in an obfuscated form.

Let's say you don't want to just make a new subclass with specialized methods and allow polymorphic dispatch to operate based on run-time type, but instead you want to handle your own run-time dispatch. In Dylan you can do this using introspection and some handlers that allow you to override the standard generic function dispatch. Since you have the object and the generic function and can get at specific functions, you can just call them, passing the object.

To do the same thing in C++ you need to work around the fact that C++ obfuscates what is actually going on when you call a method. You can do this by using a slightly obscure C++ construct called the pointer to member. For the present discussion the member in question will always be a member function, but keep in mind that pointers to members can work with data members as well. That use is probably even less common: if you have an object, or a pointer or reference to one, you can access public data members directly, as you would access the members of a struct, or via an accessor function; I would think that pointers to data members are necessary only for serious runtime hacking or compiler implementation. But the key point here is that C++ pretends that both data members and member functions are part of classes, while CLOS and Dylan much more explicitly separate these concepts.

Pointers to member functions are not commonly used, probably in part because of their somewhat obscure syntax. The only place I've seen them used, in fact, is the Darwin kernel's IOKit source code. Recently I've had a need to customize class behaviors at runtime without subclassing and without a complete refactoring of a class into modular pieces, so I've been investigating the construct again.

The syntax is awkward, but it is consistent with the general C school of thought which says that declarations are read backwards (right to left). In C, if you have an integer variable:

int i;

to make a pointer out of it, you insert an asterisk to the left of the variable name:

int * i;

Similarly, if you have a prototype for a function returning int and taking an int parameter:

int fn ( int );

to take a pointer to it, you insert an asterisk to the left of the name:

int * fn ( int );

but the order of evaluation means that the parser reads it as a function returning a pointer to int. So you have to change the order of evaluation using parentheses:

int ( *fn ) ( int );

Now fn means a pointer to a function, not a function returning a pointer.

A member function is basically the same, with the addition of the class name and the scope resolution operator (::). When you are declaring your class, this prefix is not wanted or needed (it is implicit within the class declaration's curly braces), but when you define your methods, you do so like this:

int my_class::my_member_func( int )
{
   /* function body */
}

An aside: the separation between the class declaration and class definition is just one of the speed bumps you have to live with when using C++; it is there to support separate compilation, but if you've ever made the switch from C++ to Java you know that giving up the need to keep header files and implementation files in perfect synchronization gives you an immediate productivity boost.

If we want to make a variable that can point to that member function, we put an asterisk to the left of the name:

int my_class::*my_member_func_ptr( int );

The precedence rules get confused by this (during parsing, it probably looks like a collision between a definition of a member function and a conventional function returning a pointer with no type attached, or some such nonsense), so again we have to use parentheses:

int ( my_class::*my_member_func_ptr )( int );

and this gives us a variable, my_member_func_ptr, which can hold a pointer to a member function with the signature "returns int and takes one int parameter." Note that the parameter list, as in a function prototype or member function declaration, only requires the types of the parameters and not the names, although you can supply names for documentation purposes if you want.

Another aside: it would be easy to ridicule this syntax, but ridiculing the syntax of C++ is kind of like shooting fish in a barrel. I've shown that the syntax is at least somewhat consistent with the other C++ syntactical forms. Because of that, this particular syntax doesn't really bother me; if I can't remember it off the top of my head, I can mentally follow this derivation path and reason it out. This isn't necessarily true of a lot of other pieces of C++ syntax which do bother me and which I'm quite happy to rant about; see my piece on struct.

A pointer to member function does not have a standard representation; its size and layout may differ between implementations. You should treat it as an opaque data structure and make no assumptions about what it actually looks like inside, unless you want to write very non-portable code. There's nothing in the standard that says that pointers to member functions have to be the same size as ordinary pointers, for example, and the standard specifically says they are _not_ just specially typed pointers; they are not convertible to and from void* by casting, as other pointer types are, using either the old-style or new-style casts.
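You can get a feel for this by printing some sizes. On many compilers a pointer to member function is wider than a plain pointer, but the exact numbers are implementation-specific, so treat this as a sketch (the class and member names are made up):

#include <cstdio>

class my_class
{
public:
    int my_member_func( int x ) { return x; }
};

int main( void )
{
    int ( my_class::*mp )( int ) = &my_class::my_member_func;

    /* A pointer to member function is an opaque structure; its size is
       implementation-specific and often larger than a data pointer. */
    std::printf( "sizeof(void*) = %u, sizeof(mp) = %u\n",
                 (unsigned) sizeof( void* ), (unsigned) sizeof mp );
    return 0;
}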

The reason for this opacity is that a pointer to member function is not a usable pointer by itself; instead it contains whatever extra information is needed to represent a specific member function for a given class, probably an index or offset of some sort. You can imagine that the compiler needs a mechanism like this for its own use, so that it can turn

object.member_function(...)

or

object_pointer->member_function(...)

into C-style function calls with the addition of a hidden parameter:

looked_up_member_function( this, ... )

Remember our discussion on generic function dispatch? You might notice that when you rewrite the code to explicitly pass in the context (the "this"), it starts to look like Dylan (or, ignoring the difference between infix and prefix notation, CLOS). You can actually think of the C++ object model as a much more limited and obfuscated form of CLOS generic function dispatch. How weird is that?

The "this" pointer allows the function body access to the specific object instance variables, and can also be passed on to any subsidiary member function calls on the same object. In Dylan or CLOS you would access the data members (in slots) of the incoming object explicitly; C++ gives you access to the data members and member functions of "this" implicitly, although you can still use this-> as a prefix if you want to make very clear what you are doing, or protect against accidentally accessing something global or local in the member function's namespace that might inadvertently shadow a member. This merging of namespaces is another rich source of potential errors (remember my comment about how C++ is like obfuscated Dylan?)

Note that our pointer to member function is not defined to refer to a specific member function in the given class. It can have assigned to it any member function that is part of the class (or a derived class), and that has the same signature (that is, matching return type and parameter types). This, as you might imagine, allows the compiler to do dynamic dispatch based on runtime type; you can do the same thing, based on any criteria you choose.

Note also that the variable you've created is not yet initialized. There's that obfuscation again! In a language where everything is done by reference, that isn't even quite possible; your reference will wind up referring to something. But pointers enable whole new classes of errors, and there is unfortunately no such thing as a "reference to member function" in C++. Be careful to make certain you initialize it before use, preferably at the point the variable is defined! Not doing so will cause you severe tire damage!

And, in fact, the compiler may not help you avoid this. I tested GCC by writing a call to an uninitialized regular function pointer:

void ( *fptr ) ( void );
fptr();

and it generated a helpful warning for this case. When I wrote a similar call using an uninitialized pointer to a member function, like so:

void ( Class_c::*mptr ) ( void );
( this->*mptr ) ();

GCC produced no warning at all, and as you might expect, the call caused an immediate crash. This is the likely outcome, but according to the C++ standard the behavior is undefined (this is very bad; it means the program is not required by the standard to catch this as an error; it is free to silently fail in some insidious way, or destroy your hard drive, or electrify your keyboard).

You can initialize them to zero, the null pointer constant (there is an explicitly allowed conversion), but the results of calling a null pointer to member function are also undefined, so this doesn't buy you anything; you are better off initializing them at the point where they are defined. Here is an example of how to do so:

void ( my_class::*mp ) ( void ) = &my_class::member_function;

Since pointers to member functions are hard to read, especially for member functions that accept a long list of parameters, it is valuable to make the construct into a type using typedef. I recommend using the naming convention: class name + the _kind_ of member function + "_pm_t".

But follow your own convention, or your team's convention, if you like. Keep in mind that the type can be used to represent any member function of the class (or a derived class) that matches the signature. The member function will probably represent a handler of some kind, or in design pattern terms a specific strategy. Describe what it handles, or what the strategies are trying to accomplish.

When you write the typedef, keep in mind that it looks just like the variable definition, except that the type name replaces the variable name and you put typedef in front. To define a type called my_class_SomethingHandler_pm_t, use the form:

typedef int ( my_class::*my_class_SomethingHandler_pm_t ) ( int );

Once you have a type, the following definition creates a variable with the new type and initializes it with a specific member function:

my_class_SomethingHandler_pm_t pm = &my_class::my_member_func;

This assignment can also take place in parameter binding. This means you can write a standalone function, member function, or static member function that accepts any pointer to member function matching the specified signature, and then pass it a specific one at run-time.

Once you have a pointer to member function, you can call that member function, but to do this you need an object (the "this" that will be in effect for the duration of the call). If you have a pointer to the object (which could be "this"), use the syntax:

( object_ptr->*mp )( params );

If you have a local or global variable holding an object or a reference to an object, use the syntax:

( object.*mp ) ( params );

Note that the parentheses around the first part are mandatory, to help the parser. If your object pointer or pointer to member function is stored in a structure or you are accessing it via pointer you will probably have to introduce more parentheses to make sure the parts of the expression are evaluated in the right order. This kind of construct can get ugly fast, so be careful.
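Putting all the pieces together, here is a minimal compilable sketch; the class, the member functions, and the typedef name are all made up for illustration:

#include <cstdio>

// A made-up class to illustrate the syntax; both members share the
// signature "returns int and takes one int parameter."
class my_class
{
public:
    int doubler( int x ) { return 2 * x; }
    int tripler( int x ) { return 3 * x; }
};

// "Pointer to member function of my_class taking int and returning int."
typedef int ( my_class::*my_class_ScaleHandler_pm_t )( int );

int main( void )
{
    my_class obj;
    my_class *obj_ptr = &obj;

    // Initialize at the point of definition, as recommended above.
    my_class_ScaleHandler_pm_t pm = &my_class::doubler;

    std::printf( "%d\n", ( obj.*pm )( 21 ) );        // via object: 42

    pm = &my_class::tripler;                         // rebind at run-time
    std::printf( "%d\n", ( obj_ptr->*pm )( 14 ) );   // via pointer: 42

    return 0;
}

Rebinding pm at run-time is the whole point: the same call site dispatches to whichever handler you last assigned.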

In his C++ FAQ Lite section on pointers to member functions (available here), Marshall Cline suggests that when making calls using pointers to member functions, you always define a calling macro like this:

#define CALL_MEMBER_FN(object,ptrToMember) ((object).*(ptrToMember))

and then do the call like this:

result = CALL_MEMBER_FN(PO,PM)( PL );

but I wouldn't necessarily recommend that; C-style macros are still quite evil, and can make it very hard to figure out what the compiler is complaining about, or even to see what code it actually compiled, which may not be what you intended.