Greetings! Last time I introduced a piece of code to solve a small real-world problem: decoding a set of function IDs from a run-length-encoded list of change points. I received some excellent feedback. One of the most valuable suggestions was that I avoid unnecessarily taking the length of the list in my "triplify" function. That makes a great deal of sense, considering that ideally I'd like to be able to handle infinite lists, and of course it would be nice if the runtime of my algorithm did not spiral out of control quite so quickly!
I also gave a little more thought to the various base cases. One of the nicest things about Haskell is its pattern matching. Indeed, even a simple function like "square a = a * a" makes use of pattern matching, where "a" on the left-hand side is irrefutable. My implementation last time failed to fully take advantage of pattern matching on multiple parameters, particularly to handle base cases, and did not even handle all the various termination conditions properly. So, here is a slightly condensed version of the code that makes the termination conditions much more explicit, and so I claim even though it is shorter, it is more self-explanatory.
>import Char >import Numeric >rle_raw_str = "00 30 90 09 31 01 10 32 20 22 12 30 23 12 41 24" ++ > "34 40 44 1C 0F C1 1D 00 D0 1D 03 D0 4D 05 D0 7D 0A D0 CD 10" ++ > "D1 2D 20 D2 6D 30 D3 1D 40 D4 1D 50 D5 1F 00"
A new triplify:>
>triplify :: String -> [String] >triplify  =  >triplify (x:y:z:rest) = [x, y, z] : triplify rest >triplify str = error "Unexpected string length (trying to match 3 characters)"
Treating the input data as a list of 12-bit numbers:
>rle_change_points = map (fst . head . readHex) > (triplify (filter isHexDigit rle_raw_str))
The main function: given a list of change points, a hint as to whether we are initially presuming the decoder will output ones or zeroes, and a maximum value, decode RLE data from the change points. Return the indices of the one bits.
>deRLE :: [Integer] -> Bool -> Integer -> [Integer] >deRLE  False _ =  >deRLE points False max = deRLEHelper points max >deRLE points True max = deRLEHelper (0 : points) max >deRLEHelper :: [Integer] -> Integer -> [Integer] >deRLEHelper  max =  >deRLEHelper [unmatched] max = [unmatched..max] >deRLEHelper (start:end:rest) max = [start..end - 1] > ++ deRLEHelper rest max
This is more like it -- the kind of program that resembles a proof more than code. I'm starting to prefer this style!
A brief explanation: if we're initially decoding zeroes, the empty list case can be disposed of quickly. The program filters this case out first so that the recursive helper function, which accumulates the output, doesn't need to handle this case. The next two lines handle non-empty lists. If we're initially decoding ones, we can still treat all cases the same if we just prepend a zero to "flip" the decoder. Our helper function then has only three cases: a termination case where an interval of ones is not closed (because we have a single change point left), in which case we spit out ones up to and including our maximum possible value; a termination case where we have two change points forming a closed interval, in which case we spit out an interval inclusive of the start and exclusive of the end, and a non-termination case in which we have more than two entries, which spits out a closed interval and continues the process.
Let's look at some test cases. First, we confirm that if we're generating zeroes and we encounter no change points, our result set is empty:
deRLE  False 0xF 
We confirm that if we hit a single change point, the result set will contain all the values up to and including the maximum possible:
deRLE  False 0xF [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15] deRLE  False 0xF [11,12,13,14,15]
In the general case where our interval is closed, our result set is inclusive on the left and exclusive on the right:
deRLE [5, 10] False 0xF [5, 6, 7, 8, 9]
Closing one range and opening another without closing it works correctly:
deRLE [5, 10, 11] False 0xF [5,6,7,8,9,11,12,13,14,15]
If we begin assuming our decoder generates zeroes, an empty list of change points maps to all possible values:
deRLE  True 0xF [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
A single change point at zero toggles the state of the decoder and produces and empty result:
deRLE  True 0xF 
A single change point greater than zero allows some values into our result set before it is closed:
deRLE  True 0xF [0,1,2,3,4]
Two change points defines a range to exclude from the results, and we get the rest of the values in the range:
deRLE [5, 10] True 0xF [0,1,2,3,4,10,11,12,13,14,15]
Three change points suppresses the rest of the result set:
deRLE [5, 10, 13] [0,1,2,3,4,10,11,12]
With these tests I'm satisified that deRLE works for all expected cases.
I received more comments, including some recommendations for a nifty unfold. Thanks to everyone who made suggestions. Next time I'll take a look at how those work, talk about anything else that comes to mind, and then proceed to my next real-world problem!