21 December 2007

Newegg Makes Good

Newegg is refunding me for the camera and case, and I was able to find the camera elsewhere in time, so it is all working out OK. They have excellent customer service, so I will no doubt be buying from them again in the future.

Today is the shortest day of the year. It takes a lot of caffeine and St. John's Wort to keep me functioning this time of year. I think the barista at the local Caribou Coffee realized I needed a triple espresso and so I got a free upgrade. Either that or she was thinking "you know, we don't see you in here often enough... you're just not quite as addicted as we'd like you to be!"

This year really was a challenge -- illness, death, and mayhem all around us. Let's hope 2008 is better!

19 December 2007

Newegg Screws Up

So, I decided rather late in the season to get my son an inexpensive digital camera for Christmas. I ordered it from Newegg, a company I've had great service from in the past. I bought a motherboard and CPU from them a few years ago; I've ordered various little things like flash drives and hard drives.

They took the order for a camera, case, memory cards, and card reader. I paid with PayPal; everything went through normally. The package went out; it is supposed to be delivered today. I wanted to make sure of that, since we're obviously getting pretty close to Christmas, and my family is also planning on going out of town. In fact, we're taking the train out of town to see family. The plan was for my son to enjoy taking pictures of the trip. So it had to be here, or there wasn't much point.

Today, the day that my UPS package I've been tracking is scheduled for delivery, I got a note saying that it was all just a joke... the package I've been tracking doesn't actually have a camera in it, and I should get a refund in two or three days. Oh, it does have a cheap little camera case in it, which I threw in just because they were shipping me the camera anyway. Which means I just paid $5.24 to ship a $10 camera case.

I'm scrambling to try to find the camera locally. I'm fortunate in that my finances aren't that tight right now, but if they were, I'd be completely screwed, because I wouldn't be able to charge the replacement camera on my debit card until the PayPal refund went through.

I can't recall any company I've ever ordered from screwing up an order in quite this way. I mean, I've had items cancelled on me at the last minute, or delayed, but I've never, ever been charged for an item that was not actually shipped. I'm a bit boggled by just what kind of a failure in their IT infrastructure allowed the charge to go through. Wow!

18 December 2007

The Haskell School of Expression on Ubuntu

I have been looking at Paul Hudak's book The Haskell School of Expression (SOE). My first barrier was getting the sample code to work under Ubuntu "Gutsy Gibbon." While I highly recommend Graham Hutton's book, which I think of as the K&R of Haskell, SOE is more of a tutorial and may be more useful to people who learn by doing. I got a bit of assistance from Paul Liu. It turns out that in addition to installing ghc using sudo apt-get install ghc, I needed to install libghc6-opengl-dev.

Without this, when I tried to configure GLFW using runhaskell Setup.hs configure, I got an error that said Setup.hs: cannot satisfy dependency OpenGL>=2.1. I went down a dead end of trying to figure out what kind of package I needed to install to give me the right OpenGL libraries, but the key was realizing that it is looking for the Haskell OpenGL library, not the system library.

This is not specifically mentioned in any of the docs I found online -- the only place I found it mentioned was on the Debian package list -- so I am mentioning it here! Maybe if someone with the same issue is Googling for the answer, they will now find it quicker.

17 December 2007

Getting Back Online with Music Theory

Well, this hellacious year is almost over. Most of my spare time has been spent on the archiving and family history project, which I'm documenting at The Marcella Armstrong Memorial Collection. That project is bearing fruit in that I have almost a thousand images scanned, and so I'll be giving some of the originals to family members. I've created some beautiful prints, as well, and over Christmas I'm hoping to share some of the images with family members in the form of slideshows.

Meanwhile, in the back of my mind I'm thinking about how to get back into Haskell programming. Music theory keeps coming to mind.

The problem with music theory is that it is really a collection of ad hoc convenient rules and relationships in the guise of a coherent theory. The various named entities have completely inconsistent nomenclatures. You kind of get used to this when you learn to play chords and transcribe songs, but it is easy to forget how inconsistent it is, and thus how confusing it is for students.

For example, intervals. A major second is an interval also known as a whole step. If you start on the root note of a scale, say, a C major scale, the C is known as the root or "first." If you add a major second to a first, you might imagine that you'd get to a third note of the scale. But, no, you have a second note.

Does that mean the steps in the scale are off by one in their naming (that is, the first is really the zeroth)? No, because if you add two major seconds to the root, you have a third, not a fourth. It is the naming of the intervals that is off.

So can you just start the intervals at zero instead of one? Well, kind of. A minor second is a half-step, or the difference between adjacent keys on a piano. Call that 0.5. A major second is then 1.0. But this still doesn't explain the interval names, because a "fourth" is actually two whole steps and a half step. And if you stack a third on top of a third, you don't have a sixth -- you have a fifth!

So it keeps coming back to me that I should try to codify the rules of intervals and chords in some bits of Haskell code. It might even be of use to geeks trying to understand music theory.
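As a first stab at what that might look like, here is a minimal sketch -- the names and representation are entirely my own invention, not any existing library -- that measures every interval in semitones, so that stacking intervals becomes honest addition and the inconsistency of the traditional names stands out by contrast:

```haskell
-- A sketch only: intervals as semitone counts, so that "adding"
-- intervals is plain integer addition. All names here are my own.
newtype Interval = Interval Int deriving (Eq, Show)

minorSecond, majorSecond, minorThird, majorThird,
  perfectFourth, perfectFifth, minorSixth :: Interval
minorSecond   = Interval 1   -- a half step
majorSecond   = Interval 2   -- a whole step
minorThird    = Interval 3
majorThird    = Interval 4
perfectFourth = Interval 5   -- two whole steps and a half step
perfectFifth  = Interval 7
minorSixth    = Interval 8

-- Stacking one interval on top of another.
stack :: Interval -> Interval -> Interval
stack (Interval a) (Interval b) = Interval (a + b)

-- Two major seconds make a major third (2 + 2 = 4 semitones), and
-- two major thirds make a minor sixth (4 + 4 = 8 semitones) -- not
-- the "sixth" that the naive arithmetic 3 + 3 = 6 would suggest.
```

In this representation the arithmetic stays consistent even though the traditional names do not: stack majorThird majorThird comes out equal to minorSixth.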

Don't even get me started on time signatures or tuning!

30 October 2007

Memento Mori

So, I have been falling behind on all sorts of commitments -- I was excited by the prospect of working through a bunch of Haskell exercises and getting more serious about learning Haskell; I was learning all kinds of new jazz chord voicings in my guitar lessons; I had plans to record another podcast with original music.

Instead life intervened, and my mother died after a brief, unexpected illness, and then Grace's father died just under two weeks later. We've had a lot of death recently -- my grandmother died two years ago, at the age of 102, and Grace's brother died at the age of 40 just last year.

"Memento Mori" means roughly "remember, you will die." There's nothing quite like sitting in a room with the dead body of your mother to make this clear. It has been my task this year to think pretty hard about this, and to try to start planning for it, beginning what I hope will be my "ars moriendi" -- the art of dying well. And, I hope, a lot later.

I've been living on a kind of "split screen" for the last few months -- on the one hand, working on retirement plans and investments for my children's education and imagining what Grace and I are going to do for the next five, ten, twenty, or forty years, and on the other hand preparing our wills and making sure we are properly insured. It's been a strange combination of unnerving and reassuring. My grandmother made it to 102, and I have her genes, so I could have 60 years or more to live. Or I could take after my mother, and have 30.

Or none.

On the way to work this morning, as often happens, the light changed for me to make a left turn. I did my usual deep breath, look both ways, count slowly to five -- to wait for whoever was going to blast through the red light to go ahead and do so -- and then started to enter the intersection. At the ten second mark a woman in an SUV blasted through the red light, going way over the speed limit, talking on her cell phone. Missing me by only a few feet.

Memento, mori.

A year ago someone did this and I was hit by such an attack of road rage I actually chased him down, cut him off and forced him over to the side of the road, then got out of the car and chewed him out for nearly leaving my children fatherless. I can't advocate that behavior. Today I took a deep breath and let it go.

It would be a stupid way to die. But so is cancer, or heart disease. And we don't have control over everything. And believing that we do is a recipe for a heart attack.

There is good news in our lives too -- we just celebrated our sixth wedding anniversary and baby Sam's first birthday and baby Veronica's third birthday -- but these celebrations have all been kind of subdued.

There have been a whole bunch of miscellaneous estate issues -- fortunately my stepfather and the estate attorney have been managing most of this -- but my stepfather wants to sell the house he and my mother shared in the very short term, and so is trying to dispose of my mother's personal effects very quickly.

This means a house full of furniture, clothes, and personal effects. Since I'm the son that lives a mere 250 miles away, instead of 2,000, I'm the one that has to figure out how to triage everything and move anything we want to save into our rather cramped and cluttered apartment. Which means a major purge of our existing clutter, and also coming to terms with letting go of almost all of my mother's personal effects.

Along the way I discovered that my mother and my grandmother had amassed a huge collection of family photos and documents, going back several generations. In addition, hundreds of documents: letters, journals, autobiographies, even short stories.

We have pictures of people I think are my son's great-great-great-great-grandparents. I have not identified everyone yet, but there may even be pictures of a five-greats grandparent. My grandmother was a member of the Daughters of the American Revolution, which means she can trace her ancestry back to 1776. Which means my daughter can, too. That's good, because it appears from the organization's somewhat controversial history that they could use more black members!

There are Civil-war era photos. Cyanotypes from around 1900. Thousands of photographs -- perhaps 10,000. The oldest ones are mostly in pretty good shape, but many of the color photos, for example instant photographs from the 1970s, are fading badly. And there are some serious preservation issues -- photos that were recently annotated in ballpoint pen ink, which is acidic and eats through the paper until it stains the emulsion. Photos torn from albums and scotch-taped into new albums, or bundled together with paper clips or rubber bands and stuffed into acidic paper envelopes and shoe boxes.

I decided, and Grace concurred, that I was going to take on a preservation and archiving project. We can't let the collection of family history end with my generation. So I have embarked on that project, which will consist of organizing, cataloging, and propagating both the original artifacts and digital derived works.

So, for the immediate future, this project is now my highest priority. My other projects, blogs, and commitments are largely at a standstill. I apologize to everyone who I've ignored or failed to follow up with on some promise or another. But I think my children and grandchildren will approve.

Anyone interested can follow my progress on my blog, The Marcella Armstrong Memorial Collection.

31 August 2007

Another Death in the Family

We just received word that Grace's father, my father-in-law, died today at about 6:10 p.m. He died just 13 days after my mother.

This means that this month we lost both my mother and my wife's father. Our children just lost two of their four grandparents in a span of just under two weeks. Isaac, at 13, was old enough to get to know them a little bit, but Veronica and Sam, at 2 years 10 months and 10 months, respectively, will never really get to know them.

It's been a hell of a month. Since bad things usually come in threes, it makes me wonder what we're facing next! Maybe it's my turn?

22 August 2007

Eulogy for My Mother

(This is not about Haskell or even programming per se, but it is, in part, about the interaction between technology and human lives and the limits of that technology, and about focusing on what is important in life. So I thought the Haskell community might forgive me for posting it in this forum. I hope to get back to writing about Haskell soon).

Dear Mom,

In the past few years, whenever something happened in my life, good or bad, I always got the urge to call you and bring you up-to-date, reassure you, and get your reassurance and advice. For the past few weeks, I've had that urge to call you constantly, because a lot of big changes have been happening. Now I can't call you, but I can still write you a letter. I can't mail it to you, but here is what it says.

The big news is that you have died. This was quite a shock to everyone. It happened very fast. You had a lot of things going on -- oxygen masks, IVs, tests, pain medication, sedatives, all kinds of doctors and nurses, and lots of family members with worried expressions on their faces. It must have been very confusing, especially in the last few days, so I thought I would explain to you what happened in case you were confused.

About two years ago you were treated for breast cancer with a combination of surgery, chemotherapy and radiation. I know that was very hard on you, but the treatment seemed to be quite successful. You recovered some energy and were able to spend quite a bit of time in the last two years traveling with your husband. You were getting regular follow-up care to make sure the cancer did not come back.

About three weeks ago, after a short trip to Chautauqua, New York, you were exercising with your friends at the YMCA. Things didn't feel right, though; you had been complaining about a feeling of bloating and pain in your abdomen. You said that you must have eaten something that didn't agree with you. You were only a few days away from your regular follow-up visit with your doctor. But when you called and told him about the problem, he suggested you go to the Emergency Room immediately.

From that point everything started moving with frightening speed. You had a CAT scan, and it showed tumors in your abdomen. It isn't exactly clear whether these might have been caused by cancer cells from the breast cancer or whether it was another, separate case of ovarian cancer. It doesn't really matter much at this point.

It would be easy to get very angry at your doctors and blame them. I find it a bit hard to believe that oncologists screening someone regularly for cancer would fail to detect such an advanced case of cancer. It isn't like you came down with a rare tropical disease that was outside their specialty. If anything I would have expected them to suspect cancer and do extra tests to rule it out even when it wasn't there. However, it is important to note that this kind of cancer, in the abdominal cavity, often has no detectable symptoms at all until it is very advanced. And I remind myself that the treatment you got two years ago did restore your health and give you two good and enjoyable years you would not have had otherwise.

Anyway, the next thing was that you went down to the Magee-Womens hospital in Pittsburgh for some tests. The plan then was to find out what was going on and make a treatment plan. But your pain got rapidly worse. You were admitted. I spoke to you on the phone each day for a couple of days. I was making plans to come and trying to arrange with my father and brother to come too. We got the word that you had been moved to the intensive care unit, and so everything went into high gear. We moved up our visit. My brother and my father arranged to fly out. We all got into Pittsburgh the evening of Thursday the 9th of August, just after a series of severe storms had produced tornadoes and flooding. When my father and brother got in by plane around midnight they immediately went to the hospital. Grace and the children and I all came to visit too, in the middle of the night.

You were miserable and frightened in the ICU. You were curled up on your side, wearing an oxygen mask, with your eyes tightly closed. You were able to talk, but mostly we just wanted to hold your hand. It took a while for us to figure out exactly what was going on. You had a partially collapsed lung, and fluid building up around your lung. It was hard for you to breathe. You had pneumonia in one lung. You also had some sort of kidney infection. But with tubes and antibiotics and a lot of moral support over the next day or so you improved. You got your eyes opened and we were able to sit and talk with you. We talked about how you were doing, what we knew and didn't know about your condition. We told you about our families and how well everyone was doing. We talked about dying and what we thought it might be like, and what you believed and we believed would happen to you after you died. We told you we loved you.

You were moved back into a regular hospital room. This was still not a very comfortable environment but it was a far less frightening place than the ICU. Over the next couple of days you had a lot of visitors -- your husband Dick, your daughter-in-law Carolyn, my father Richard, my brother, your niece Linda, your nephew David, and Grace and the kids. We arranged for people to stay with you round-the-clock. Often one of us would be with you in your room while another one of us caught a quick nap in the waiting room next door.

One night while my brother was sitting with you during the middle of the night you began fighting to get out of bed. My brother tried to use the call button to get help but it didn't work. He had to run out into the hall and yell for help. You managed to get partly out of bed and it is a miracle you didn't tear out an IV. But you had a breathing crisis and you had to go back onto the high-pressure oxygen mask. You hated that mask because it was painful; it rubbed your face raw and dried out your lips and mouth terribly. For a few days it went back and forth like that. You'd improve a little bit; you got a transfusion and some new drugs and that seemed to help. But you'd get worse in the middle of the night.

The CAT scan and biopsy were postponed again and again. The doctors found that you had blood clots in your leg and also that something had gone wrong with your heart. It was very weak and it appears that you may have had a heart attack, or maybe it was damage from your previous chemo or radiation. The radiation in particular may have been poorly administered. Or maybe it was all three. They put you on heparin to thin your blood. They decided they could not do a biopsy. At one point the doctors were considering putting you back into the ICU but you did not want to go back. What you wanted was to leave the hospital, and to go up to a nursing home in Erie so that you could die in comfortable surroundings.

This brings me to the point of talking about your wishes. You made clear in your written directives that if there was no chance of recovery, you did not want invasive procedures or extreme measures taken that would just serve to prolong your life. You told these things to the doctors as well -- no tubes for breathing or feeding, and no zapping your heart with electricity if it stopped. It is one thing to check some boxes on a form, but I think it is quite another thing to realize that it is time for these directives to be carried out. I was, in fact, awed by your bravery in sticking to your principles.

So, we next focused all our attention on trying to get you moved to Erie. I have to confess that I was terrified that you would die before we could carry out your wishes. But there was only one day's delay, and on the morning of Tuesday, August 14th, the ambulance crew came to take you to Erie. I rode in the front of the ambulance with the driver while you were in the back with the nurse. For me it was the longest chunk of uninterrupted quiet time I had gotten since leaving for Pittsburgh and I finally broke down crying for a while.

I thought that the stress on the family was going to let up and that you would be in good care. But apparently hospice care in Michigan and Nevada and California is much different than the hospice care arrangement we had in Erie. They got you settled comfortably into a room at the Manchester Presbyterian Lodge. The social worker and hospice nurses and nursing home administrator and head nurse greeted us. But things did not go quite like we planned.

All of your IVs and needles and big heavy oxygen masks were removed. Instead you just had a simple cannula in your nose for oxygen. Your pain medication was changed to liquid morphine by mouth. But after 24 hours it became clear that things weren't quite right. You went from being able to communicate to sleeping around the clock. You were over-medicated. It also became clear that without some orders from the doctor, guidance from the hospice organization, or specific requests from family, the nursing staff at the home was not going to give you special care. For example, they still had you on a regular diet. They would plunk down a meatball sub on your table and take it away an hour later. Because you were so heavily drugged they would not even try to get fluids into you. You asked for pen and paper to write some notes but you were so sedated that we could only read a few words of what you wrote. You wanted to communicate but you couldn't. The system was failing you again.

So once again the family went into high gear. We started a round-the-clock visiting schedule with you. We had to feed you because the nurses were not trying hard enough. We got your diet changed to food that you were better able to eat. We started trying to get your doctor to change your prescription, but we were unable to get the doctor out to examine you in person and change your orders until the evening of your third day there. Your condition was changing for the worse so rapidly that we found this completely unacceptable. I began refusing part of your medication on your behalf.

This was probably the most nerve-wracking experience of my life, because above all I did not want you to be in any pain, but I thought that you deserved to share your last days with your family and friends. I think it was the right decision, though. You became alert again, while reporting that you were not in any pain. You were able to visit with family and friends. We got round-the-clock visitation going again. Carolyn came back, and brought her daughter Alicia. You had live music. You had grandchildren visit. You had a couple of good days.

We still felt like we had to watch over everything. Even after your doctor came back and changed your medication orders, my father followed him down to the nurses' station to see what he wrote -- and what he wrote was not what he had just agreed to write. Your doctor was literally attempting to euthanize you -- not to help you live out your last days as well as possible, but to drug you up and ignore you until you died.

I felt like we were having to act as doctor and nurse and social worker ourselves, all the time. Between the lack of sleep and the nervousness, I completely exhausted myself, to the point where I could not walk and began having hallucinations from lack of sleep. I developed an ulcer. The rest of us were also exhausted. It was the single most stressful week of my life to date.

You gradually became weaker. Your heart rate went up and your blood pressure went down. You began to accumulate more fluids in your lungs. We got your medication raised again so that there was no possibility you would be in any pain. The decision was made that you could not receive anything more by mouth to eat or drink. On Saturday evening at about 6:30 p.m. you died. It was as peaceful and graceful a death as we could make it. I was not there, but your husband and his daughter Alicia were there. Alicia was playing music.

I just want to say a couple more things.

First, that the way you faced your death was an inspiration to me and to everyone around you.

Second, when my father and I were visiting you in the ICU, you said that you were very proud that you had always had the ability to forgive everyone who had harmed you. I would like to ask you one last favor. I want you to forgive the doctor in Erie who displayed such indifference to your condition. I want you to forgive the other doctors that failed to detect your cancer. I even want you to forgive the doctor in Pittsburgh who had such terrible bedside manner that he burst into your hospital room, explained loudly that your heart was failing and that there was nothing more they could do, and left. I will forgive all these people. It might take me a while longer, though. I'm not quite as good at it as you were.

Third, I'd like you to remember that while we had our share of doctors with poor bedside manner and nurses who treated you like an annoyance, there were also quite a few people who were very kind to you. The hospice nurse who was with you on your last day was wonderful. The palliative care physicians in Pittsburgh were wonderful. They spoke very calmly to you and asked you if you had any unfinished business. You said that you wanted everyone to know that you loved them.

And last, I just want to thank you. You were a single mother in a difficult time. You had a hard and complicated life. You raised me and my brother. You did a great job. You re-married and brought new fathers into our lives. You cared for my stepfather Wence, and you cared for your mother, and you cared for your husband Dick. You enriched the lives of everyone around you. You had a lot of friends and you were well-loved, and you still are.

We'll go on. We'll miss you terribly. I wish you had gotten more time to enjoy your grandchildren. I wish we had gotten more time to spend with you. You said that you were concerned about the state of the world. The world will go on. It will be OK. We'll be OK. And we know that you are OK now too.

I'll see you again, before too long. And I'll be in touch.



07 May 2007

A Hard Drive Headache

This isn't exactly programming-related per se, but I present herewith my story demonstrating how hardware compatibility issues trump software compatibility issues any day!

I have a PC and a Mac on my desk at home.

I have a small handful of spare hard drives I use to back up these computers plus two laptops.

I try to buy hard drives when they hit a price bang-for-buck "sweet spot" -- this year, that is in the 250G range; a couple of years ago, it was in the 40G range. So I have a couple of 250G LaCie enclosures. I also have a couple of fanless "Mad Dog Multimedia" enclosures which I bought empty and filled with Seagate 40G drives. They're both USB/FireWire, which is ideal for moving things around between Macs and PCs.

All of them come with external power bricks. The power bricks for all four drives are nearly identical: black boxes with removable AC cables, attached DC cables, and little green LEDs. They all terminate in identical round DIN-4 connectors that plug into the drives, each with a little notch on one side and a flat spot on the other, to make sure you orient it correctly when you plug it in.

They're all plugged in to power strips on my desk.

The "Sunpro Powmax" bricks that came with the Mad Dog enclosures have a little diagram on them that looks like this (sorry, I don't have a digital camera that will zoom in close, or I'd take a picture).

          /       \
GND ---- | *     * | ---- GND
+12V --- | *     * | ---- +5V
          \       /

On the diagram above, the flat side of the connector is shown at the top, and the little "notch" is not shown, but it goes at the bottom.

The LaCie bricks, on the other hand, made by "Sunfone" instead of "Sunpro," have a little diagram that looks like this:

     --  ---
    /  \/   \
2  | *     * | 1  PIN 1:5VDC
4  | *     * | 3  PIN 2:12VDC
    \_______/     PIN 3:GND
                  PIN 4:GND

Where the flat part of the circle is at the bottom and the little notch is at the top, although this is actually pretty hard to distinguish in the diagram.

I'll flip them around for you so the orientation of the circular connectors is the same, and label the pins more clearly:

    __-----__           __-----__
  /           \       /           \
 |   G     G   |     |    G    G   | 
 |  +12V  +5V  |     |   +5V  +12V |
  \           /       \           /
    -___/\__-           -___/\__-

Just let that sink in for a moment.

It took me a while to guess why my hard drive, a 250G LaCie drive, wouldn't mount. Finally I started looking at the power bricks. You know that feeling you get when you find out that you've just broken something expensive?

I guess it could have been worse; it could have fried the PC's motherboard too, via the USB port. Who designs this crap? (I know who builds it, but I can't very well blame this on the manufacturers in China; it appears they both did a wonderful job manufacturing exactly what was specified).

Both power supplies have all the appropriate regulatory agency approvals: CE, FCC, UL, and some others I don't recognize. But, of course, they don't actually conform to any particular wiring standard.

After determining that the LaCie drive was out of warranty, and given that the warranty was unlikely to cover this kind of idiocy, I decided to try opening up the enclosure in the hopes that it might have some kind of fuse that I could replace, and that the drive itself was OK.

Surely they must have some kind of fuse in the enclosure's circuitry, right?

I'll bet you a new hard drive that they don't.

Well, the LaCie enclosure is a pretty nice enclosure, but it isn't designed to come apart. They have part of the drive covered with an easily torn (yes, I should know) adhesive foil, and mounted on some kind of sticky silicone anti-vibration pad. I could try stuffing a new drive in it at some point.

The dead drive smells like burning plastic. There's one little surface-mount component on the back that is visibly burnt. If this were 1980, with the knowledge I gained putting together Radio Shack 250-in-one electronic experiment kits, I would be able to tell you if it is a resistor or a capacitor or a diode. If this were 1980, I could probably even solder on a new part. But it is 2007 and all the little black parts look the same, and even if I could find a replacement component I doubt I could solder it on.

Needless to say, the drive would not spin up when I put it in the "Mad Dog" enclosure.

So I put the 40G drive back into the "Mad Dog" enclosure, and powered it up.

I'll give you two guesses which kind of power brick I plugged it into.

My office smells like burning surface-mount components. There's a nearly identical little burnt component on the other drive. It's in a slightly different place on the 40G drive.

I had some files on those drives that I wanted. The 250G drive had mostly unimportant stuff, but a few gigabytes of important stuff.

I'm no fool. I kept backups of that handful of important files.

On another hard drive. Guess which one?

I've got a headache.

I can't in good conscience ask Seagate to fix them under warranty. Can this kind of thing even be repaired any more? It seems like it should be possible. But I'll have to look into that later.

I'm going to bed. Ever feel like you want to yell at someone for being an idiot, but the only one you can find is between your own chair and keyboard?

09 April 2007

A Wonderful Waste of Time

I just encountered this great toy to challenge your three-dimensional geometry skills. The game in effect asks you to pretend you're creating an Escher-like puzzle, producing an object that only looks right from three precise vantage points. Ever wanted to pretend to be a 3D rendering optimizer function? Now's your chance!

Warning: if you start it, you won't be able to stop until you complete all the puzzles. Which I did, although it took me an embarrassingly long time.

Haskell for the Short Attention Span: A Simple File Filter

Well, there is more to say about run-length encoding and other interesting examples I was sent in response to my last article, but today I've been working on another little real-world problem. I've been able to spend a little bit of time on #haskell and the gang there has been extremely helpful. The task: filter a binary log file, turning it into a text representation and doing some stateful validation. I'm not going to show you all the code (it is pretty boring), but here are the minor pitfalls and helpful suggestions I encountered along the way. First, I started with a very basic file filter example from the Haskell 98 report:

main = do
    putStr "Input file: "
    ifile <- getLine 
    putStr "Output file: "
    ofile <- getLine 
    s <- readFile ifile 
    writeFile ofile (filter isAscii s)
    putStr "Filtering successful\n"

Pretty simple. As a sanity check, I wanted to run this example on a data file. The first thing I found is that isAscii is not in the default namespace; to use this program in GHC you'll need to import the module Char. You can learn this sort of thing via Hoogle, which I've started using quite a bit; it is a great little tool! You can also get to it directly through #haskell via Lambdabot, or even put it right into your GHCi session.

Anyway, the next thing I found was that, at least under Cygwin, unless you explicitly open a file in binary mode, you might encounter problems. One I should have anticipated -- you might see line feed translation. But there are apparently others -- like silent truncation of the data! Somewhere in the guts of Cygwin, or maybe Windows, control codes designed back in the Jurassic days of computing are still being honored: input was terminating on (apparently) an EOF character in my binary file. That diagnosis was another helpful suggestion from one of the very smart folks on #haskell.

I'll just use readBinaryFile instead of readFile, right? Well, no, readBinaryFile is part of MissingH, a third-party library. Back to #haskell, where I was advised that I could roll my own. Importing System.IO, I use the following snippet:

readBinaryFile s = System.IO.openBinaryFile s
    System.IO.ReadMode >>= System.IO.hGetContents

This works fine, although I would humbly suggest that GHC's standard library should provide easy access to binary files; while the solution is pretty trivial, it is a wart.

I next stubbed my toe on printf. It is available in module Text.Printf and provides a type-safe version of C's printf. As a long-time C and C++ programmer, you'd think I'd know just about everything there is to know about printf. So, I rapidly gave it a format string "%02X" and passed it a char. Apparently the uppercase X to produce a hex representation with uppercase A-F is not supported (grrr). Another minor wart -- if you provide printf, it should behave like printf -- but we'll move on.

Per more chatting on #haskell I was given this one liner to dump binary data in a nicely formatted way:

writeFile ofile (concat $ zipWith (printf "%02x %s") s (cycle $ replicate 19 "" ++ ["\n"]))

I want to take a moment to talk about how it works. First, cycle $ replicate 19 "" ++ ["\n"]. The replicate function gives us a list of 19 empty strings, which we then concatenate with a newline. Applying cycle to this list treats it as an infinitely repeating circular list of strings, where every twentieth is a newline. These arguments are then fed to printf using zipWith. zipWith is an interesting function: while zip takes two lists and generates a list of pairs produced by assembling the list elements into tuples, zipWith doesn't tupleize the elements; instead it feeds the elements to the provided function, and makes a list of the results.
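
To make the zip / zipWith distinction concrete, here is a small sketch of my own (the example values are made up; only the printf idiom comes from the one-liner above). Note that anchoring the result in a concrete String context is what lets printf's return type resolve:

```haskell
import Text.Printf (printf)

main :: IO ()
main = do
  -- zip assembles the elements of two lists into tuples:
  print (zip [1, 2, 3 :: Int] ["a", "b", "c"])
  -- zipWith instead feeds each pair of elements to a function;
  -- here printf formats each number against its paired suffix string:
  putStr (concat (zipWith (printf "%02x %s") [10, 11, 12 :: Int] ["", "", "\n"]))
```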

While this worked, it was interesting enough that I wanted to play with it using GHCi. But I had to give up on that; I kept tripping over the type checker. While I appreciate the masochistic joys of a strict type system and the safety that comes with it, it can be frustrating for programmers with experience in, say, Ruby, or even C.

For example:

let xs = [1..100]
let ys = take 100 (cycle $ replicate 19 "" ++ ["\n"])
zipWith (printf "%02x %s") xs ys

GHC replies:

Ambiguous type variable `c' in the constraint:
`PrintfType c' arising from use of `printf' at <interactive>:1:9-24
Probable fix: add a type signature that fixes these type variable(s)

Ugh. Using the :t command in GHC it is easy to see that GHC thinks the type of xs is [Integer] and ys is [[Char]] (a list of list of chars, also known as a list of strings). If I put roughly the same code in a Literate Haskell source file and ask GHC to load it, I get:

Ambiguous type variable `a' in the constraints:
  `Enum a'
    arising from the arithmetic sequence `1 .. 100'
    at E:\toy.lhs:3:5-12
  `Num a' arising from the literal `100' at E:\toy.lhs:3:9-11
  `PrintfArg a' arising from use of `printf' at E:\toy.lhs:5:18-33
Possible cause: the monomorphism restriction applied to the following:
  xs :: [a] (bound at E:\toy.lhs:3:0)
Probable fix: give these definition(s) an explicit type signature
              or use -fno-monomorphism-restriction

followed immediately by:

Ambiguous type variable `c' in the constraint:
  `PrintfType c' arising from use of `printf' at E:\toy.lhs:5:18-33
Possible cause: the monomorphism restriction applied to the following:
  result :: [c] (bound at E:\toy.lhs:5:0)
Probable fix: give these definition(s) an explicit type signature
              or use -fno-monomorphism-restriction

Failed, modules loaded: none.

Wow. I'd say that is not really a newbie-friendly error message. However, this printf works fine in my real program. I'm not certain why, and I'm not going to dive into it too deeply right now. But here's a simpler type checking example: even though C is a typed language, you can treat numbers as chars and vice versa, as long as you keep integral promotion and sign extension in mind. GHC is a harsher mistress. Let's say we want to pattern-match on our binary data. The value 16 in my binary data is DLE, which stands for Data Link Escape; it is often used in serial data to indicate packet boundaries, while inside the payload it is escaped (doubled). So here's a little pattern to remove doubled DLEs:

de_dle (16:16:xs) = ...

Simple enough, right? No, to Haskell a number and a Char are not interchangeable. Back to #haskell, where I got a quick explanation of the type checker's error messages. I turned the numbers into chars:

de_dle ('\16':'\16':xs) = ...

And that works just beautifully.
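
For illustration, the finished filter step might look something like this (my reconstruction, not the code from my real program): it collapses each doubled DLE into a single one and passes everything else through untouched.

```haskell
-- A sketch: collapse each doubled DLE ('\16') into a single one,
-- leaving all other characters alone. The doubled-DLE pattern must
-- come first so it is tried before the general one.
de_dle :: String -> String
de_dle ('\16' : '\16' : xs) = '\16' : de_dle xs
de_dle (x : xs)             = x : de_dle xs
de_dle []                   = []

main :: IO ()
main = print (de_dle "ab\16\16cd" == "ab\16cd")  -- prints True
```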

Anyway, to make a long story shorter, I was able to write my file filter, which does some nice pattern matching and formatted output. The problems I had while developing it were the kind I like: problems choosing my algorithm properly, not problems fighting with the language. The runtime was quite helpful here; while processing the file, if I hit a case which my patterns did not handle, I got a runtime warning about non-exhaustive patterns. That led me to reorganize my patterns, and the result was much clearer.

There was one more minor pitfall remaining. I compiled my program using GHC, but when I ran it, instead of my prompts for input and output filenames, I got nothing! Haskell was silently waiting for input. Back to #haskell. To make the output show up, I had to import System.IO and do hSetBuffering stdout NoBuffering. (Alternately, I could flush stdout immediately after each putStr, but that seems even uglier). I hope that saves someone a little aggravation.
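
In code, the fix looks something like this minimal sketch (assuming GHC's System.IO, as above; the filename handling is just a stand-in):

```haskell
import System.IO

main :: IO ()
main = do
  -- Without this, a compiled GHC program block-buffers stdout, so a
  -- prompt with no trailing newline sits in the buffer while the
  -- program silently waits on getLine.
  hSetBuffering stdout NoBuffering
  putStr "Input file: "
  ifile <- getLine
  putStrLn ("filtering " ++ ifile)
```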

Speaking of aggravation, how did it all come out? Well, the original binary log file is about six megabytes. Since I was doing this by hand, I was more concerned that the program ran correctly than that it ran fast; I would have been satisfied with anything under a half-hour. In fact, without any attempts at optimization at all, my filter ran in under thirty seconds, which is more than fast enough for an ad hoc little tool. I started out trying to do this task using regular expressions in vi, but that was quickly going nowhere, and taking forever besides. The vi that came with Cygwin doesn't support some of the more advanced regular expression features (there is no {x,y} syntax for specifying the number of repeats of a pattern). Notepad++, my workhorse Windows text editor, also doesn't support this syntax. Without it the regexes were becoming hideous, although in (say) Perl they would have been rather simple. But I was able to use Haskell instead, thanks to GHC and the kind folks on #haskell!

07 March 2007

Haskell for the Short Attention Span: Run-Length Encoding, Part 3

This isn't a big improvement, but it occurs to me that my implementation from last time:

>deRLE :: [Integer] -> Bool -> Integer -> [Integer]
>deRLE [] False _ = []
>deRLE points False max = deRLEHelper points max
>deRLE points True max = deRLEHelper (0 : points) max

>deRLEHelper :: [Integer] -> Integer -> [Integer]
>deRLEHelper [] max = []
>deRLEHelper [unmatched] max = [unmatched..max]
>deRLEHelper (start:end:rest) max = [start..end - 1]
>    ++ deRLEHelper rest max

can be yet shorter; since deRLEHelper handles the empty list specially, deRLE doesn't need to:

>deRLE :: [Integer] -> Bool -> Integer -> [Integer]
>deRLE points False max = deRLEHelper points max
>deRLE points True max = deRLEHelper (0 : points) max

That way only the function that actually recurses down to the empty list has to explicitly handle it.

Can we make it even more concise and still retain the self-documenting aspects of the code? I'll have to come back to that question. It's time to respond to comments. When I first mentioned my triplify function, which breaks a string into strings of three characters, I wrote "I'm sure there is a very clever one-line fold that applies take 3 in a mind-expanding manner, but I was unable to find it." I got some great suggestions. The first, due to Cale Gibbard, was:

map (take 3) . takeWhile (not . null) . iterate (drop 3)

Let's look at it piecewise as a pipeline, from right to left.

First, our list/string (for example, "ABCDEFG") is fed to the function iterate along with (drop 3). Iterate is an interesting function. I tried to understand it a few months ago by reading the description at zvon.org, which says that iterate

creates an infinite list where the first item is calculated by applying the function on the second argument, the second item by applying the function on the previous result and so on.

They give the following example:

Input: take 10 (iterate (2*) 1)
Output: [1,2,4,8,16,32,64,128,256,512]

Note that if the behavior followed the description, the output would be [2,4,8,16,32,64,128,256,512,1024]. Their description is wrong, although the output matches GHCi's output. What does iterate really do? I'll describe it in the form of a poem.

iterate takes a function and a value, and returns a list consisting of:

the value

followed by
  the value of applying the function to the value

followed by
  the value of applying the function to
    the value of applying the function to the value

followed by
  the value of applying the function to
    the value of applying the function to
      the value of applying the function to the value
it will keep talking
  even when
    it has nothing
      new to say
        so please don't
        ask it any

That is, I can't evaluate

iterate (drop 3) "ABCDEFG"

because even though the input is finite, the output isn't; there's no termination to the recursion. However, I can evaluate part of the result:

take 4 (iterate (drop 3) "ABCDEFG")

And because of Haskell's lazy evaluation, we can feed this infinite result "upstream" to be used in the remainder of our calculation, and it will work fine as long as it doesn't evaluate the complete result. The next stage in our "points-free" pipeline is:

takeWhile (not . null)

The takeWhile function operates on a list, returning its elements for as long as they satisfy the supplied predicate. This has the effect of squashing the overly talkative iterate, because we stop evaluating its results as soon as it returns the first null value. Lazy evaluation is so cool!

Now, our finite list of strings is handed off to map (take 3), which generates a list of 3-character prefixes. Here's the whole process:

iterate (drop 3) "ABCDEFG"

yields an infinite list beginning ["ABCDEFG","DEFG","G",""];

takeWhile (not . null)

applied to the above will take elements from the list until it hits the first null: ["ABCDEFG","DEFG","G"]

map (take 3)

applied to the above will give us ["ABC","DEF","G"]. Note that the short remainder is preserved.


That seems so useful, I'm surprised it isn't a function in the Standard Prelude, called groupsOf. Although it would also be nice to have a version which didn't keep the short remainder, perhaps groupsOfOnly.
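
Here is a sketch of what those hypothetical functions might look like (groupsOf and groupsOfOnly are my proposed names, not Prelude functions, and this is just one possible implementation):

```haskell
-- Break a list into groups of n, keeping a short final group, using
-- the iterate/takeWhile/map pipeline discussed above.
groupsOf :: Int -> [a] -> [[a]]
groupsOf n = map (take n) . takeWhile (not . null) . iterate (drop n)

-- A stricter variant that discards a short final group.
groupsOfOnly :: Int -> [a] -> [[a]]
groupsOfOnly n = takeWhile ((== n) . length) . groupsOf n

main :: IO ()
main = do
  print (groupsOf 3 "ABCDEFG")      -- ["ABC","DEF","G"]
  print (groupsOfOnly 3 "ABCDEFG")  -- ["ABC","DEF"]
```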

I got another suggestion from a user called "kirby." He suggested the function

takeWhile (/= "") . unfoldr (Just . splitAt 3)

Wow, my first unfoldr! I'm going to have to stop there for today, while I learn a little something about Just, Nothing, and the Maybe warm fuzzy thing. My first funster!
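
For the impatient, here is a tiny preview of how unfoldr and Maybe fit together (my own toy example, not kirby's): the step function returns Just (element, next seed) to keep going, or Nothing to stop.

```haskell
import Data.List (unfoldr)

-- unfoldr calls the step function repeatedly: Just (a, b) emits a and
-- continues from seed b; Nothing ends the list.
countdown :: Int -> [Int]
countdown = unfoldr step
  where step 0 = Nothing
        step n = Just (n, n - 1)

main :: IO ()
main = print (countdown 5)  -- [5,4,3,2,1]
```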

06 March 2007

English Majors as Programmers

The Embedded Muse is an e-newsletter edited by Jack Ganssle; see the Ganssle Group home page. In Issue 142 he writes:

I've found that some of the best developers of all are English majors. They'll often graduate with no programming experience at all, and certainly without a clue about the difference between DRAM and EPROM. But they can write. That's the art of conveying information concisely and clearly. Software development and writing are both the art of knowing what you're going to do, and then lucidly expressing your ideas. The worst developers, regardless of background, fail due to their inability to be clear. Their thoughts and code tend to ramble rather than zero-in on the goal... Too many engineering-trained developers have a total disregard for stylistic issues in programming. Anything goes. Firmware is the most expensive thing in the universe, so it makes sense to craft it carefully and in accordance with a standard style guide. Make sure it clearly communicates its intent. This is where the English majors shine; they've spent 4 years learning everything there is to know about styles and communication.

As an English major who has programmed computers since the age of ten, and who also took every programming course I could manage, it's nice to hear a little love for English majors! But I should add that an English degree is not a requirement; I also know some excellent developers who majored in more technical subjects, such as anthropology and archaeology!

05 March 2007

Haskell for the Short Attention Span: Run-Length Encoding, Part 2

Greetings! Last time I introduced a piece of code to solve a small real-world problem: decoding a set of function IDs from a run-length-encoded list of change points. I received some excellent feedback. One of the most valuable suggestions was that I avoid unnecessarily taking the length of the list in my "triplify" function. That makes a great deal of sense, considering that ideally I'd like to be able to handle infinite lists, and of course it would be nice if the runtime of my algorithm did not spiral out of control quite so quickly!

I also gave a little more thought to the various base cases. One of the nicest things about Haskell is its pattern matching. Indeed, even a simple function like "square a = a * a" makes use of pattern matching, where "a" on the left-hand side is irrefutable. My implementation last time failed to fully take advantage of pattern matching on multiple parameters, particularly to handle base cases, and did not even handle all the various termination conditions properly. So, here is a slightly condensed version of the code that makes the termination conditions much more explicit, and so I claim even though it is shorter, it is more self-explanatory.

>import Char
>import Numeric
>rle_raw_str = "00 30 90 09 31 01 10 32 20 22 12 30 23 12 41 24" ++
>  "34 40 44 1C 0F C1 1D 00 D0 1D 03 D0 4D 05 D0 7D 0A D0 CD 10" ++
>    "D1 2D 20 D2 6D 30 D3 1D 40 D4 1D 50 D5 1F 00"

A new triplify:

>triplify :: String -> [String]
>triplify [] = []
>triplify (x:y:z:rest) = [x, y, z] : triplify rest
>triplify str = error "Unexpected string length (trying to match 3 characters)"

Treating the input data as a list of 12-bit numbers:

>rle_change_points = map (fst . head . readHex)
>  (triplify (filter isHexDigit rle_raw_str))

The main function: given a list of change points, a hint as to whether we are initially presuming the decoder will output ones or zeroes, and a maximum value, decode RLE data from the change points. Return the indices of the one bits.

>deRLE :: [Integer] -> Bool -> Integer -> [Integer]
>deRLE [] False _ = []
>deRLE points False max = deRLEHelper points max
>deRLE points True max = deRLEHelper (0 : points) max

>deRLEHelper :: [Integer] -> Integer -> [Integer]
>deRLEHelper [] max = []
>deRLEHelper [unmatched] max = [unmatched..max]
>deRLEHelper (start:end:rest) max = [start..end - 1] 
>  ++ deRLEHelper rest max

This is more like it -- the kind of program that resembles a proof more than code. I'm starting to prefer this style!

A brief explanation: if we're initially decoding zeroes, the empty list case can be disposed of quickly. The program filters this case out first so that the recursive helper function, which accumulates the output, doesn't need to handle this case. The next two lines handle non-empty lists. If we're initially decoding ones, we can still treat all cases the same if we just prepend a zero to "flip" the decoder. Our helper function then has only three cases: a termination case where an interval of ones is not closed (because we have a single change point left), in which case we spit out ones up to and including our maximum possible value; a termination case where we have two change points forming a closed interval, in which case we spit out an interval inclusive of the start and exclusive of the end, and a non-termination case in which we have more than two entries, which spits out a closed interval and continues the process.

Let's look at some test cases. First, we confirm that if we're generating zeroes and we encounter no change points, our result set is empty:

deRLE [] False 0xF

[]

We confirm that if we hit a single change point, the result set will contain all the values up to and including the maximum possible:

deRLE [0] False 0xF

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

deRLE [11] False 0xF

[11, 12, 13, 14, 15]

In the general case where our interval is closed, our result set is inclusive on the left and exclusive on the right:

deRLE [5, 10] False 0xF

[5, 6, 7, 8, 9]

Closing one range and opening another without closing it works correctly:

deRLE [5, 10, 11] False 0xF

[5, 6, 7, 8, 9, 11, 12, 13, 14, 15]

If we begin assuming our decoder generates ones, an empty list of change points maps to all possible values:

deRLE [] True 0xF

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

A single change point at zero toggles the state of the decoder and produces an empty result:

deRLE [0] True 0xF

[]

A single change point greater than zero allows some values into our result set before it is closed:

deRLE [5] True 0xF

[0, 1, 2, 3, 4]

Two change points define a range to exclude from the results, and we get the rest of the values in the range:

deRLE [5, 10] True 0xF

[0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 15]

Three change points suppress the rest of the result set:

deRLE [5, 10, 13] True 0xF

[0, 1, 2, 3, 4, 10, 11, 12]

With these tests I'm satisfied that deRLE works for all expected cases.

I received more comments, including some recommendations for a nifty unfold. Thanks to everyone who made suggestions. Next time I'll take a look at how those work, talk about anything else that comes to mind, and then proceed to my next real-world problem!

01 March 2007

Haskell for the Short Attention Span: Run Length Encoding, Part 1

I'm a fortunate man. Some 50% of people who suffer herpes zoster infections involving the eye wind up with permanent eye damage. I seem to have escaped with little or no permanent damage. The nerves are not completely healed, and there is still visible damage under the skin, but it is no longer painful to work at the computer or read!

It's time to jump back on the horse and risk looking foolish in public by writing some code. I still consider myself to be close to a rank beginner in Haskell, so you're reading my "finger exercises." To experienced functional programmers they will probably sound like a new violin student sawing away. However, since I haven't been a student for almost twenty years and there is no one to assign me homework but myself, this seems to be the next best way to force myself to learn something!

With three children at home, two of them babies, I don't get a lot of unbroken free time; I've got a guaranteed short attention span. Consequently, I'm going to look at some very small programs. However, rather than make up toy programs from scratch, I am now going to turn to a few real-world problems that I've stumbled across while doing embedded programming.

OK. Consider a spherical cow of uniform density... wait, I mean, consider a black-box system with an outward-facing interface in which a number of functions are made available. Functions are accessed via a 12-bit unique ID. A client of that system doesn't know which functions are available. The client needs a method to retrieve a list of function IDs. Let's assume we don't have bandwidth to burn, so we don't want to send a complete list of all the function IDs. Let's further assume that the functions are organized into a small number of groups with consecutive IDs. In other words, this is data that is very amenable to a simple compression scheme.

We're going to do it using Run-Length Encoding (RLE). Here is a brief informal explanation of RLE; if you already know all about RLE, skip the next paragraph (but read the ones after that, please!)

RLE is a commonly used technique for encoding bitmaps; a variant of RLE is even used in fax machines. Rather than store a bit for every bit in a bitmap, RLE records _runs_ of repeating data. If you have a raster image that you're scanning left to right, top to bottom, and that image has a lot of horizontal lines in it, or black and white rectangles, this might be a big win. The RLE encoding will record long alternating runs of ones and zeroes. If the image has a lot of vertical lines, this might not be much of a win. The worst-case scenario is a 50% checkerboard: in that case, we'll have to indicate runs of length 1, and the overhead will be far worse than just encoding the bits. For that kind of image you'd be better off using a means of compression that actually adapts to the frequency of occurrence of particular patterns in the data, such as Huffman coding or LZW.

Now, there are several possible ways to represent the runs of RLE data. One scheme might use tuples consisting of the bit value and the number of occurrences of that value (the "run length"). For example, the tuples [(0, 14), (1, 28), (0, 3)] could represent a run of fifteen zeroes, followed by a run of twenty-nine ones, followed by four zeroes. Why are the values off by one? Because we would never need to encode a run of length zero, so we allow zero to represent one, and so on. A tuple like this could be encoded in a single byte, where the high-order bit represents the value and the remaining seven bits represent the run length - 1. This gives us the ability to represent runs of up to 128 bits. Of course, if we have a run of length 129 or greater, we'll need to represent it with multiple bytes. For degenerate cases where we encode very long runs, we'll have unnecessary redundancy.
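
That single-byte tuple scheme can be sketched directly (an illustration of my own; encodeRun and decodeRun are made-up names, and I take the actual run length as input rather than length - 1):

```haskell
import Data.Bits ((.&.), (.|.), testBit)
import Data.Word (Word8)

-- High bit holds the bit value; low seven bits hold run length - 1,
-- so one byte covers runs of 1 through 128.
encodeRun :: Bool -> Int -> Word8
encodeRun bit len
  | len < 1 || len > 128 = error "run length must be 1..128"
  | otherwise = (if bit then 0x80 else 0x00) .|. fromIntegral (len - 1)

decodeRun :: Word8 -> (Bool, Int)
decodeRun b = (testBit b 7, fromIntegral (b .&. 0x7F) + 1)

main :: IO ()
main = print (map decodeRun
  [encodeRun False 15, encodeRun True 29, encodeRun False 4])
  -- [(False,15),(True,29),(False,4)]
```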

For that low-entropy bitmap, a better scheme might be to just record the points at which the data _changes_. We have to choose an initial state for decoding: should the decoder, prior to the first change point, produce ones, or zeroes? Once we know that, we don't need to encode the bit values at all. If we assume that our encoded data begins with a run of zeroes, we could represent the example above (fifteen zeroes, followed by twenty-nine ones, followed by four zeroes) by recording only [15, 44]. Let's say that we want our decoder to spit out the indices of the one bits. We'll get [15..43] -- the indices of the twenty-nine one bits.

Using this scheme, a bitmap of all zeroes would be represented by an empty list, while a bitmap of all ones would be represented by a single [0]; it turns on the "ones" state and never turns it off, so we just generate ones until our range of values is exhausted. (This implies that, in addition to knowing what starting state it should assume, the decoder has to know the size of the bitmap we are representing). A bitmap consisting of all zeroes followed by a single one would be represented by [N - 1] where N is the number of bits in the bitmap.

If, instead of assuming that our data starts with an implicit run of zeroes, we assume instead that it starts with an implicit run of ones, it should be obvious that we could encode the same example using [0, 15, 44]; in this case, a bitmap of all zeroes would be represented by [0], which would yield an empty list of one-bit indices, while a bitmap of all ones would be represented by an empty list, yielding a list [0..n - 1] of bit indices.

We'll assume that our function IDs are represented this way: they are the one bits in the bitmap, and we want a list of their indices. When we query our interface about its function IDs, we get back a response that consists of a short list of change points. Here are the data bytes:
00 30 90 09 31 01 10 32 20 22 12 30 23 12 41 24 34 
40 44 1C 0F C1 1D 00 D0 1D 03 D0 4D 05 D0 7D 0A D0
CD 10 D1 2D 20 D2 6D 30 D3 1D 40 D4 1D 50 D5 1F 00
The first time I encountered data like this, I wrote myself a little bit of Ruby code to decode it. Here's the first part of the Ruby code, which turns these bytes into a series of 12-bit unsigned numbers by assembling three nybbles (half-bytes) at a time:
rle_raw_list = gets.chomp.delete(" ")

rle_change_func_ids = []
0.step(rle_raw_list.length - 2, 3) { |nybble_idx|
  rle_change_func_ids.push(rle_raw_list[nybble_idx..nybble_idx + 2].hex)
}
Here's how it works. The first line reads data from standard input and, reading left to right, removes any end-of-line character and deletes spaces. We next create an empty list. "0.step" is a cute little bit of idiomatic Ruby shorthand: everything is an object, including integers, and they have a "step" method, so by calling 0.step we loop from zero to the string length - 2, incrementing by threes. We also provide a Ruby block, which is basically a closure that receives one parameter, the loop index, bound to the name nybble_idx. Working inside-out, the body of this closure takes three-character substrings of the raw data, passes them to the hex method to yield an integer, and then uses push, which treats the list like a stack and appends the new integer to the end. The result looks like this:
[0x003, 0x090, 0x093, 0x101, 0x103 ...]
This is the list of change points. Once I have that, I can generate the runs of data. I'll assume that our data starts with an initial run of ones:
rle_state = true
current_func_id = 0

for change_func_id in rle_change_func_ids
  if rle_state == true
    (current_func_id...change_func_id).each { |valid_func_id|
      puts sprintf('0x%02X', valid_func_id)
    }
    rle_state = false
  elsif rle_state == false
    rle_state = true
    current_func_id = change_func_id
  end
end
Now, I should mention that I like Ruby, and especially like its support for nice idioms like blocks and closures and iterators and chaining functions. But while this Ruby code worked well enough for my little task, you might notice that it doesn't actually handle all the termination cases properly. As I work with Haskell, I'm finding that I like it quite a bit more than Ruby. One of the reasons is that Haskell code, being less imperative, seems closer to a kind of abstract, platonic representation of the problem space. Problems such as the failure to properly handle various termination cases tend to look much more obvious, particularly when functions are broken down into pieces using pattern matching.

Anyway, here's my Haskell version:
>import Char
>import Numeric

>rle_raw_str = "00 30 90 09 31 01 10 32 20 22 12 30 23 12 41 24" ++
> "34 40 44 1C 0F C1 1D 00 D0 1D 03 D0 4D 05 D0 7D 0A D0 CD 10" ++
> "D1 2D 20 D2 6D 30 D3 1D 40 D4 1D 50 D5 1F 00"
(My first opportunity to look silly: I'm sure there's a way to break a string across multiple lines with some sort of continuation character, but I could not get "\" to work together with Literate Haskell mode, so I'll just concatenate them and hope no one notices...)
>rle_bytes_str = filter isHexDigit rle_raw_str
That was easy!

Now, my second opportunity to look silly: I'm sure there is a very clever one-line fold that applies "take 3" in a mind-expanding manner, but I was unable to find it. This was about as clever as I could manage, but it does have the advantage of checking for unexpected data length:
>triplify :: String -> [String]
>triplify str | length str == 0 = []
>triplify str | length str `mod` 3 == 0 = ( take 3 str ) : triplify ( drop 3 str )
>triplify str = error "bad length (expected multiple of three)"
To see the results, evaluate "triplify rle_bytes_str." I get:

["003","090","093","101","103", ...]

Now I want to turn these into a list of numbers; these are the change points. Mumble, mumble...
>rle_change_points = map (fst . head . readHex) (triplify rle_bytes_str)

Our playing field is the range of integers 0x000..0xFFF or 0..4095 in decimal. We want to lazily generate a list of valid values using our RLE decoding scheme. Rather than walk the list of change points spitting out values using some kind of toggling state variable to keep track of whether the decoder is processing a run of ones or a run of zeroes, or walking all the possible values filtering it by the list of change points, we'll use two functions, one for extracting ranges, and one for generating values from ranges. Both functions receive a list representing "the rest of the work to do" in the form of the remainder of the list of change points, and so this is a _little_ bit like using continuation-passing style, or using a Monad to maintain the progress of the calculation. Maybe in Part 2 I'll have learned enough to explain what I mean by that!

Our first function, getRangeFromChangePoints, has to handle some interesting cases. Assuming we are decoding RLE in the range of possible values [X..Y], what happens when our list contains a final value, N, that opens a final range but doesn't explicitly close it? Well, there are two approaches we could take. The more conservative approach is to assume that this was a harmless mistake and close the range [N..N + 1) so that it contains only the value N. (Note that I'm using interval notation in which a square bracket indicates that the range _includes_ the bound specified, and a parenthesis _excludes_ the bound specified). However, there is a problem with this approach: it means we can't generate a range that includes the maximum value, Y! That's no good, so we'll assume that a final range opening with N means that we really want [N..Y] (inclusive). In fact, since we treat ranges as exclusive at the end, this is actually the only way we can generate a range containing Y.

For now, we also assume that running out of values is an error. We're assuming that our caller will handle the termination conditions.
>getRangeFromChangePoints :: [Integer] -> (Integer, Integer, [Integer])
>getRangeFromChangePoints ([]) = error "getRangeFromChangePoints: no change points in list!"
>getRangeFromChangePoints (a:[]) = (a, max, []) where max = 0xFFF
>getRangeFromChangePoints (a:b:[]) = (a, b - 1, [])
>getRangeFromChangePoints (a:b:ls) = (a, b - 1, ls)
Wow, that's pretty wordy as Haskell code goes, but I kind of like the way the pattern-matching breaks down. Note that in particular it makes the termination conditions obvious. This helps me think about the problem domain, rather than the implementation, in a way that the imperative Ruby code did not.

Next, here is the recursive function to generate values from ranges. We represent a range using the first two values of our tuple, while the third value is a list of the remaining ranges (sort of like our continuation). We handle the termination case, in which we have no more work to do, by just returning a list from our range, while the non-terminating case gets the next range (including the list representing our remaining work) and calls the function again to continue generating the list.
>genValuesFromRange :: (Integer, Integer, [Integer]) -> [Integer]
>genValuesFromRange (first, last, []) = [first..last]
>genValuesFromRange (first, last, to_do) = [first..last] ++
> (genValuesFromRange (getRangeFromChangePoints to_do))
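To make the chaining concrete, here is a self-contained run starting from a made-up range, (2, 4, [9, 11]): the decoder emits [2..4], then picks up the remaining change points and continues.

```haskell
-- The two functions from the post, applied to a made-up starting range.
getRangeFromChangePoints :: [Integer] -> (Integer, Integer, [Integer])
getRangeFromChangePoints []       = error "getRangeFromChangePoints: no change points in list!"
getRangeFromChangePoints [a]      = (a, 0xFFF, [])
getRangeFromChangePoints [a, b]   = (a, b - 1, [])
getRangeFromChangePoints (a:b:ls) = (a, b - 1, ls)

genValuesFromRange :: (Integer, Integer, [Integer]) -> [Integer]
genValuesFromRange (first, last, [])    = [first..last]
genValuesFromRange (first, last, to_do) =
  [first..last] ++ genValuesFromRange (getRangeFromChangePoints to_do)

main :: IO ()
main = print (genValuesFromRange (2, 4, [9, 11]))  -- [2,3,4,9,10]
```

The remaining change points [9, 11] decode to the closed range [9..10], so the result is the two ranges concatenated.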
To generate the initial case we use a helper function called valuesFromRLE. In addition to the change-points list, it takes a Boolean value which indicates whether our RLE decoding is initially inclusive or initially exclusive: that is, whether the first value in our first range indicates that the range is starting or ending. If we want it to behave inclusively, we prepend a zero value to create a range starting with zero. Note that this is harmless if our first range already starts with zero, because we'll generate a range like [0..(-1)], which contains no members.
>valuesFromRLE :: [Integer] -> Bool -> [Integer]
>valuesFromRLE change_points initially_inclusive = genValuesFromRange $
> if initially_inclusive then getRangeFromChangePoints (0 : change_points)
> else getRangeFromChangePoints change_points
Note that we can evaluate the whole thing, but more interestingly, we can use "take" to produce only the first few elements:
take 16 $ valuesFromRLE rle_change_points True
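Since rle_change_points comes from earlier in the series, here is the same experiment as a self-contained program using a made-up change-point list, showing both the inclusive and exclusive interpretations:

```haskell
-- The full decoder from the post, run on a made-up change-point list.
getRangeFromChangePoints :: [Integer] -> (Integer, Integer, [Integer])
getRangeFromChangePoints []       = error "getRangeFromChangePoints: no change points in list!"
getRangeFromChangePoints [a]      = (a, 0xFFF, [])
getRangeFromChangePoints [a, b]   = (a, b - 1, [])
getRangeFromChangePoints (a:b:ls) = (a, b - 1, ls)

genValuesFromRange :: (Integer, Integer, [Integer]) -> [Integer]
genValuesFromRange (first, last, [])    = [first..last]
genValuesFromRange (first, last, to_do) =
  [first..last] ++ genValuesFromRange (getRangeFromChangePoints to_do)

valuesFromRLE :: [Integer] -> Bool -> [Integer]
valuesFromRLE change_points initially_inclusive =
  genValuesFromRange $
    if initially_inclusive
      then getRangeFromChangePoints (0 : change_points)
      else getRangeFromChangePoints change_points

main :: IO ()
main = do
  -- Inclusive: the prepended zero opens [0..1], and the odd count of
  -- change points means the last range runs out to 0xFFF.
  print (take 10 (valuesFromRLE [2, 5, 9, 11] True))  -- [0,1,5,6,7,8,11,12,13,14]
  -- Exclusive: the same points pair up into closed ranges, so the
  -- whole result is finite.
  print (valuesFromRLE [2, 5, 9, 11] False)           -- [2,3,4,9,10]
```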

And there you have it. If you like, play with the code. Next time we'll see if we can encode the function IDs back into change points, and confirm the round trip. As always, I appreciate your comments.

15 February 2007

Richard or Edna? (And Graham Hutton's Programming in Haskell)

For some reason, Richard Bird's Introduction to Functional Programming Using Haskell, 2nd Edition, is now listed on Amazon as written by "Edna Bird" (and coauthored by "Wadler" with no first name). Assuming Richard is not actually undergoing a transformation along the lines of Walter Carlos to Wendy, this looks like an error...

Anyway, my copy of Graham Hutton's Programming in Haskell arrived from Amazon. The first remarkable thing about this book is that it is the thinnest book on Haskell I've yet seen. The highest page number is 170. It is even thinner than The C Programming Language, Second Edition.

What does this mean? Well, for one thing, it means the chapters are very brief. It is a beginner's text aimed at novices, even those who may never have programmed (in any language). I would definitely consider using it as a textbook for an introductory programming class. (I have a dream about teaching a programming class to gifted high school students or college first-year students before they have had their minds corrupted by too much Java or C++; we'd learn a little Haskell, Scheme, and Dylan, in that order, and then use the perspective gained to take a brief look at Java, Pascal, and BASIC, in that order, implementing a parser for Pascal and a complete implementation of BASIC along the way. We'd start with this book. But anyway).

The explanation of curried functions (or how "all functions with multiple arguments are curried") is the clearest I've seen. So is the explanation of basic types and classes. His chapter on recursion is also a model of clarity. (I'm pleased to see it goes above and beyond Fibonacci numbers). In general, Hutton seems to be encouraging his readers to think like programmers think, but he doesn't require very many words to get his ideas across.
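If you have never seen that idea before, a two-line sketch (my own example, not Hutton's) shows what "all functions with multiple arguments are curried" buys you:

```haskell
-- add nominally takes two arguments, but really takes one and returns
-- a function awaiting the other -- so partial application comes for free.
add :: Integer -> Integer -> Integer
add x y = x + y

increment :: Integer -> Integer
increment = add 1  -- a new function made by partially applying an old one

main :: IO ()
main = print (increment 41)  -- 42
```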

If you are busy like I am, you don't have time to read a long review, either, so I'll just conclude by saying that this looks like the introductory Haskell text!

08 February 2007

Graham Hutton's Book has Shipped

Amazon tells me that my copy of Graham Hutton's Programming in Haskell has shipped. Excellent!

With the passage of time and the assistance of an acupuncturist, the nerve damage and eye sensitivity left in the wake of my shingles infection has healed, to the point where reading is no longer quite so painful. And I don't have to wear my sunglasses to work on the computer anymore!

So, soon, hopefully back to actually writing some Haskell, as opposed to just reading about it. I have a couple of small real-world programs to try porting from Ruby, which should be educational. More soon!

17 January 2007

Haskell Books Piling Up, and Herpes Zoster Ophthalmicus at One Month

I've received a couple of Haskell-related books -- my Christmas presents to myself. I have only been able to skim these books, but here are my first impressions.

The first book is Chris Okasaki's Purely Functional Data Structures. When I was working on some code to solve Sudoku puzzles in Scheme, I became stymied because in the midst of my efforts to program in a functional style, I kept slipping back into an imperative style. This book is the answer to the question "how do you manage data structures when you don't have mutable variables?" The answer, of course, is that you return new data structures based on the old ones. This is not necessarily as inefficient as you might expect, because the new data structure can frequently share content -- sometimes most of the content -- with the previous data structure. Okasaki's book is a model of clarity; on the back it indicates that it can be used for self-study. A lot of texts claim that, but this book seems to actually deliver it. The diagrams are marvelously clear and the learning curve is gentle. The body of the text uses Standard ML with extensions but there is an appendix containing Haskell code. I will have more to say about this book in future postings; I am really looking forward to trying out some of the ideas it contains.
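As a tiny illustration of that sharing (my own example, not one of Okasaki's): consing onto an immutable Haskell list allocates exactly one new cell, and the old list survives unchanged as the tail of the new one.

```haskell
-- "Updating" a persistent list: the old version is untouched, and the
-- new version shares all of the old cells rather than copying them.
push :: a -> [a] -> [a]
push = (:)  -- one new cons cell; the entire tail is shared

main :: IO ()
main = do
  let old = [2, 3, 4]
      new = push 1 old
  print old  -- [2,3,4]: the previous structure is still fully usable
  print new  -- [1,2,3,4]: built in O(1), not by copying
```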

The second book is The Haskell Road to Logic, Maths, and Programming by Kees Doets and Jan van Eijck. This is part of the King's College Texts in Computing series which includes An Introduction to Lambda Calculi for Computer Scientists. That book is thin, while this one is fairly fat; this volume is bursting with exposition (and code) on logic, set theory, and other topics dear to my heart. It looks quite fascinating and again I hope to have more to say about it soon.

It has been approximately a month since I was hit with the first symptoms of herpes zoster ophthalmicus (shingles infection involving the eye). The news is generally very good: there is little long-term damage to my cornea. My ophthalmologist can see some slight scarring, but it should continue to improve. But the pain and hyper-sensitivity to light have persisted, and I have found it difficult to return to work because my eye tends to start watering madly or lose focus when I work at the computer for longer than a few minutes. The problem is in the branches of the trigeminal nerve, not the eyeball itself. Nerves heal slowly, so this is unfortunately not an illness for the impatient!

02 January 2007


Well, I had great plans for my ten days off -- work on the apartment, organize the office, deep-clean the kitchen, read, and study Haskell!

So much for best-laid plans. Just before Christmas I came down with a shingles infection, which started in my right eye.

Shingles is a reactivation of the varicella-zoster virus, the same virus that causes chicken pox. I pulled a near all-nighter the Monday a week before Christmas, which might not have been too bad, but then the following night we had a sick baby and I was again unable to sleep. That was apparently enough stress to cause the virus to reactivate.

Shingles is painful. I have become intimately familiar with two new types of pain, which I call the blowtorch and the nightstick. The blowtorch is a feeling like my skin is being burned by an open flame and is bubbling. That is the sensation of the blisters forming, although it comes at other random times too. The nightstick is deep pain along the nerves of my face, where they feel like I've been beaten. That produces a tremendous amount of muscle tension in my jaw, and around the base of my skull as well, where large areas are still hyper-sensitive to touch.

Valtrex (an antiviral drug) also has nasty side effects. I got waves of fever and chills, often back-to-back. It also caused nausea, dizziness, and strange visions. I'm off the Valtrex now, so thankfully those side effects have gone away.

For most of my ten days off work I was fairly incapacitated: unable to see, unable to read, lying in bed with fever or chills and with strange pains, covered with blisters and scabs. When I got out of bed, I was dizzy, stumbling around the house looking like a crazed sunburned pirate with an eye-patch on.

Anyway, I am now back at work. The scabs are gone, although my face is still visibly splotchy. My right eye is still swollen and there is still some pain. I am still using an antibiotic gel in my eye, which means my right eye is still hazy, so I am not driving yet. I see my regular doctor today and the ophthalmologist again tomorrow. I think the likely outcome is that my vision will not be significantly damaged. It seems like the pain will persist at reduced level for at least a while longer.

Dear reader, I sincerely hope you had a better holiday than mine. And if not, God bless you!