14 December 2006

The Revised Dot-Matrix Printhead

It looks like this blog has been unceremoniously removed from Planet Scheme. I guess that's only fair, since I haven't written anything specifically about Scheme in a while. Best of intentions, and all that. This blog probably belongs on "Planet Programming Language Dilettante."

Anyway, I received some helpful feedback on the short program I wrote to print bitmaps of binary numbers. Jason Creighton in particular took a shot at a rewrite, and the result as highly enlightening. I also recieved two specific pieces of advice: use short function names, and don't write a comprehension like [a | a <- [a..b]] when you can just write [a..b]. Oops. Check, and thanks!

(Semi) or (Ill) Literate Haskell follows:

Haskell Binary Printhead Toy, Second Round

>import Data.Bits

Our functions to generate lists of signed and unsigned integers of
numbits length, revised with names as suplied by Jason and error-
catching as supplied by yours truly:

>unsignedInts :: Int -> [Int]
>unsignedInts bits | (1 <= bits) && (bits <= 16) =
>   [(0::Int)..(bound::Int)]
>   where bound = 2 ^ bits - 1
>unsignedInts bits = error "expected a value from 1 to 16"

>signedInts :: Int -> [Int]
>signedInts bits | (1 <= bits) && (bits <= 16) =
>   [(- (bound::Int)) - 1..(bound::Int)]
>   where bound = 2 ^ ( bits - 1 ) - 1
>signedInts bits = error "expected a value from 1 to 16"

Jason came up with a considerably simpler way to map the bits of
the binary numbers to the characters for printing:

>boolToChar :: Bool -> Char
>boolToChar True = '#'
>boolToChar False = ' '

I'm adopting that suggestion enthusiastically, since I don't think
code can get much simpler than that!

Jason then proceeded to turn my dog food into a much more flavorful
wafter-thin after-dinner mint, leaving me struggling to understand
just what he did:

>bitPicture :: (Int -> [Int]) -> Int -> [Char]
>bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])
>  where
>    stringAtIndex = map boolToChar . bitsAtIndex
>    bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

Wow. First off, Jason has supplied a type signature which expresses
what the function actually _does_. The appropriate integer generator
(the signed or unsigned version) comes in as a parameter; that's what
(Int -> [Int]) means. We next get an integer argument (the number of
bits), and finally, we spit out a list of chars.

Jason proceeds to use a couple of where clauses to set up some
simplifying expressions. Let's look at the first one:

StringAtIndex = map boolToChar . bitsAtIndex

OK, this is slightly mind-twisting right off the bat. In the world
of C and Java you can't write things like that. The dot is function
composition expressed naturally; instead of something like

stringOfBoolsFromBitsAtIndex idx = map boolToChar (bitsAtIndex idx)

we can just get Haskell to chain the functions together for us.
No fuss, no muss. But this function makes a forward reference to
bitsAtIndex, which we haven't seen yet (completely illegal in C
and Java). Let's make sure we understand what bitsAtIndex does:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

That's definitely more dense than the way I wrote it (and that's a
good thing). To break down the list comprehension, the expression on
the right hand side of the vertical bar says that we feed the supplied
integer generator function the number of bits to use. That gives us a
list. From that list we will draw our "n"s and test bit idx on them.
By doing it this way, Creighton has pulled an act of Haskell sleight-
of-hand: instead of generating the bitmaps for each integer value,
like I did, and then using "transpose" to rearrange them, this version
accesses the "columns" of the table directly! With those two
expressions, we've got our string representing all the bits at a
given index _for the whole supplied set of integers_.

Wow. I feel like an audience member watching a magician smash up my
watch and then give it back to me rebuilt as a marvelous whirring
gyroscope.

There's more, but this part is relatively simple:

bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])

We already know what unlines does; it turns our list of strings into a
properly line-broken string. We've just described how stringAtIndex works.
and map stringAtIndex (reverse [0..(bits-1)[) just feeds stringAtIndex a
set of indices mapped from high to low, so that our most significant bit
comes out leftmost. But what does "$" do?

My reference at zvon.org says this is the "right-associating infix
application operator (f $ x = f x), useful in continuation-passing style."
I'm not sure I fully understand, and I'm not going to go into the meaning
of "continuation-passing style" right now, but I guess it saves us from
having to write:

unlines $ map (stringAtIndex (reverse [0..(bits-1)]))

to get the right order of operations, which is nice. So I'll try to
incorporate it into my own Haskell.

Anyway, let's try it. We need to drive the function:

>main :: IO ()
>main = do
>   print "Unsigned values (6 bits)"
>   putStr $ bitPicture unsignedInts 6
>   print "Signed values (6 bits)"
>   putStr $ bitPicture signedInts 6

After using GHCi's :cd and :load command to read this file, just
type:

main

"Unsigned values (6 bits)"
                                 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

"Signed values (6 bits)"
 ################################
                 ################                ################
         ########        ########        ########        ########
     ####    ####    ####    ####    ####    ####    ####    ####
   ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##  ##
  # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

Excellent. I laughed; I cried; I learned something today. I hope you did, too. Thanks, Jason! And thanks to the other people who commented!

Followup: for those following along the discussion in the comments, here is an alternate implementation of the bitPicture function:

bitPicture intGen bits =
    unlines $ map stringAtIndex $ [bits - 1, bits - 2..0]
    where stringAtIndex = map boolToChar . bitsAtIndex
        bitsAtIndex idx = map (flip testBit idx) (intGen bits)

Note that using [bits - 1, bits - 2..0] in the first line allows us to avoid the list reversal, so that seems like a clear win. The method of expressing bitsAtIndex above is one of several that were suggested. The idea behind the flip is that the return value of intGen should become the first parameter to bitsAtIndex, not the second. You can also express this by forcing stringAtIndex to be treated as an infix operator:

bitsAtIndex idx = map (`testBit` idx) (intGen bits)

Both of these alter the behavior of Haskell's automatic currying so that it replaces the second parameter instead of the first, leaving the first as the sole parameter to the resulting curried function; in Common Lisp, you might do this explicitly using rcurry. And don't forget the original, using a list comprehension:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

Assuming there is not a specific concern about efficiency, it seems like the choice between these versions might be a matter of style. Thanks again to everyone who made suggestions.

9 comments:

Neil Mitchell said...

A couple of comments, just tiny refinements:


>unsignedInts :: Int -> [Int]
>unsignedInts bits | (1 <= bits) && (bits <= 16) =
> [(0::Int)..(bound::Int)]
> where bound = 2 ^ bits - 1

I'd write this as:

unsignedInts :: Int -> [Int]
unsignedInts bits | 1 <= bits && bits <= 16 = [0..bound]
where bound = 2 ^ bits - 1

The brackets are unnecessary.

I'm pretty sure the type signature is entirely redundant? The type sig on unsignedInts should make it all Int's. If you did need a type sig, I'd put it on bound

where bound = (2 ^ bits - 1) :: Int

signedInt's could be refined in a similar way.

In a similar way:

>bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits-1)])
> where
> stringAtIndex = map boolToChar . bitsAtIndex
> bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

bitPicture intGen bits = unlines $ map stringAtIndex $ reverse [0..bits-1]
where
stringAtIndex = map boolToChar . bitsAtIndex
bitsAtIndex idx = map (`testBit` n) (intGen bits)

No need for a list comp in the second one, and a few less brackets all round :)

Neil Mitchell said...

Oh, and don't worry about $ -its useful in continuation passing style, but its a lot more useful to avoid some brackets!

f (big thing) == f $ big thing

f (big (thing here)) = f $ big $ thing here

Paul R. Potts said...

Hi Neil,

Thanks for your suggestions!

I think you are correct that there is no need for the explicit typing if a type signature is provided.

Although it seems to work correctly, the expression

1 <= bits && bits <= 16

(with no parentheses) makes me a little nervous. I work a lot C++, in which some operators have the wrong precedence. The above parses correctly in C++, but the expression

x & 1 == 0

doesn't say what you mean, unless you mean "always false," because "==" binds more tightly than bitwise "and." C programmers who endeavor to make life easier for their successors tend to fully parenthesize expressions that mix different kinds of operators. I don't really want to break that habit for Haskell or I'll be writing expressions that are silently incorrect when I go back to C/C++.

I have a similar problem with variable names; in Dylan, names like "gen-total" and "*module-var*" are not just legal but conventional, and if you want the "-" and "*" to be interpreted as a separate token you have to leave spaces around them. When I go from Dylan back to C I tend to write illegal variable names!

The conversion of

bitPicture intGen bits = unlines $ map stringAtIndex (reverse [0..(bits - 1)])

to

bitPicture intGen bits = unlines $ map stringAtIndex $ reverse [0..(bits - 1)]

seems entirely good, using the $ operator again; but the change from

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

to

bitsAtIndex idx = map (`testBit` n) (intGen bits)

loses the definition for n, so I get "Not in scope: 'n'." I have played around a bit with other possible ways to get rid of the comprehension here but they did not seem to work right. In Scheme I would consider using rcurry to make testBit idx into a function I could map the list onto, but I am not quite sure how to get that effect in Haskell.

Paul R. Potts said...

Oh, I found what I was looking for... a way to use ".." to express a descending list. Instead of:

reverse [0..(bits - 1)]

you can say

[bits - 1, bits - 2..0]

Clearer? More efficient? I'm not sure.

The Alternate Moebyus said...

Paul, I'd say the comprehension is more efficient, as it just becomes a enumFromThenTo, and does not involve reversing a list.

Unknown said...

Regarding:

bitsAtIndex idx = [ testBit n idx | n <- intGen bits ]

changed to

bitsAtIndex idx = map (`testBit` n) (intGen bits)

I think you might be misunderstanding things a little. 'n' should be 'idx'. Here, we'll start simple, using a lambda to simulate the list comprehension:

bitsAtIndex idx = map (\n -> testBit n idx) (intGen bits)

Then we'll use the standard function 'flip' to flip the arguments:

bitsAtIndex idx = map (\n -> flip testBit idx n) (intGen bits)

Now, by the definition of lambda, we know that (\x -> f x) is equivalent to (f), so we remove the lambda (in this case f is (flip testBit idx)):

bitsAtIndex idx = map (flip testBit idx) (intGen bits)

This could also be written:

bitsAtIndex idx = map (`testBit` idx) (intGen bits)

since (x `f` y) is equivalent to (f x y). Figuring out why is left as a learning experience. :)

Paul R. Potts said...

Thanks for the ongoing advice!

I had a conversation with another Paul here in Ann Arbor who is a Ph.D. student in Computer Science. It turns out Haskell is his favorite language, so we had an interesting and helpful discussion on currying, and he mentioned flip. Of course, everyone else at that get-together quickly found other people to talk to! I have also been studying Bird's book, which has interesting material on how list comprehensions are actually understood. I'm getting there...

Paul R. Potts said...

Some further followup:

GHC is confused by Literate Haskell programs that have lines starting with #. The Haskell 98 report does not mention anything special about the # character, but a bug report about this issue on the GHC site indicates it has something to do with special support for C preprocessor directives and will not likely be changed. I remembered to indent the output lines starting with # in my first article, but not the second. I have fixed that and will try to remember to test the .lhs content as-is in the future.

Neil, you are right that the line

bitsAtIndex idx = map (`testBit` idx) (intGen bits)

works. I should have been able to figure that out. I thought I had tried making that change but I must have broken something else and just made myself more confused. I have also followed your steps and feel like I still understand, so I think that's a good sign!

Neil Mitchell said...

Glad you figured it out - sorry for the one character typo in the original :)

I did consider suggesting [bit-1,bit-2..0] instead of the reverse, but it doesn't look as clean. It would be possible to add a GHC rule to convert:

reverse (fromEnumTo ...)

to switch the arguments, which is always safe, and would give you better performance, without having to duplicate bit.

I understand your comments about trying to keep "in C mindset" as well, it can get very confusing. I tend to either type semi-colons in Haskell, or miss them in C, depending on which transition I'm making.