Praise, Curse, and Recurse: Scheme

Showing posts with label Scheme. Show all posts

31 May 2013

An Interesting Interview Question

(Warning: some non-FP content!)

So yesterday I had a job interview for a developer position and I got the chance to put my money where my mouth is, so to speak, about my C and C++ programming skills. I was asked to write 3 functions given specifications including their prototypes, and to review a piece of C++ sample code. The interviewers left me alone with some paper for fifteen minutes or so.

There were four tasks. I scrambled around at the first one, started down a blind alley, started to get nervous, realized I was getting nervous, and decided to move on to the next questions and come back to the first question if I could. So I'll reveal that one last.

The second task was to write a "population count" -- to count the one bits in an unsigned char. This is a common operation in real-world code that operates on packed data structures such as tries. I chose a very simple lookup-table approach, with a table to look up values nybble-wise. Easy-peasy. The function itself then becomes a one-liner. "Full marks," as they say in some countries.

The third task was to write a function that tests whether the CPU it is running on is big-endian, or little-endian. If you don't know what that means, I'm not going to explain it in great detail here, but it refers to the way that the individual bytes of multi-byte values are laid out in memory. You can write quite a bit of code without running into endian problems, but it becomes an issue if you need to exchange data over a network or across a bus between processors that store their data differently. Fortunately I have written quite a few drivers, so this one was also easy.

The last task was to find errors and questionable practices. There were a lot of them -- I marked up my page with several "WTFs?" They weren't all errors that a compiler would complain about. Fortunately I've studied C++ extensively -- I used to subscribe to C++ Report -- and I was able to talk about several design issues, ranging from the lack of a virtual destructor in a base class to failure to properly implement the necessary constructor, copy constructor, and assignment operator. There was a problem with a const method. There was a pointer aliasing problem. There were a few other small issues I did not notice. I missed an issue: dynamic_cast can return NULL. I missed this because I have never had to use that sort of run-time introspection in C++, although one of my interviewers pointed out how it can be useful and necessary for adding test functionality to a framework. Overall, I think I got most of the credit for this one.

The first task, the one I botched, was to write a function to reverse a standard null-terminated C string, in place. Wow, the world loves nothing more than to humble us now and then, doesn't it?

Right off the bat, I became confused because of the prototype for the function. The question indicated that the first parameter would be a pointer to a zero-terminated string, but the second parameter was to be a size_t type -- a buffer size. I suppose this is analogous to the standard C library routines strn*, which are safer versions of the old strcpy, strlen, etc. These versions take a number indicating how far the function can travel in memory before bailing out -- giving a sort of "backup plan" in case the incoming string is not properly null-terminated. These are certainly better than the old functions that can rip through memory, or leave your code vulnerable to a variety of stack-smashing and buffer-overflow exploits. But to me there's something inherently horrible about a function with such a contradictory "contract" -- it is designed in such a way that it will hide, or at least give safe harbor, to bugs in the code that calls it. It has the information it needs to report that it has been called with an unterminated string, but there is no provision to return an error code, and the assignment did not indicate what I should do if the incoming string is not terminated. Presumably I'm not supposed to crash the computer -- although with no way to return an error, forcing a crash immediately is really the safest possible option!

The "safer" standard library function strnlen receives a value telling it how many characters it should count before it bails out, and returns a value indicating how many characters it actually counted. If you get back the same number you provided, it means that the incoming string is not properly null-terminated. You can (and hopefully will) do something with that information. But the "safer" strncpy doesn't provide any indication that it has been handed a source string that isn't null-terminated, and so can leave the destination filled with an unterminated string. That's inherently unsafe. The prototype given makes this function a little like strncpy.

So while my brain was chewing on all that in the background, I started trying to get something down on paper, writing a shell for the function, to dispose of some possible error conditions first. And I confused myself further.

In the bad old days, size_t was not guaranteed to be unsigned, so it was possible that the provided buffer size could be less than zero. That's not the case in standard C, but I coded to test for it anyway. Most compilers would tell me that I was testing for a condition that can't happen, fortunately.

Next, I tested for whether I could bail out early -- whether the provided buffer size value is less than or equal to 1. I'm not sure if this would be a performance win -- it depends a lot on the sizes of the string that the function will be given. But I wanted to show that I was at least thinking about these different possible cases.

And then it was time to write the meat of the function. I had my heart set on writing a recursive solution. Why? To show off, maybe, even though this isn't actually a very practical solution in C. But more likely, because I've gotten used to reversing lists in languages like Scheme and Haskell that use Lisp-style lists. Which... are not at all like plain old arrays of bytes. But I was determined. To show that I was at least aware that recursive functions can fail when they use too much stack space, invented a constant, MAX_STACK_SPACE_ALLOWED, and made sure my buffer size didn't exceed it. So then I had several error-checks and had filled most of the page. There was nothing else I could do but write the recursive helper function to reverse the string.

And I realized that it has been a long time since I wrote something like that in plain C I was still chewing over what to do with the buffer length parameter, and wondering if I was missing something fundamental about the problem. And so I panicked and bailed out, feeling really stupid. Stupid brain! How could you let me down like that?

I had some more time to think about it today, and play around with some code, and think about the "contract" for the function and just why I got stuck. The idea behind a recursive reverse of a list is that you recurse to the end of the list, and then build up a list of the elements in reverse order as the recursion "unwinds." In Scheme, something like this:

(define (my-reverse lat)
   (if (null? lat)
      '()
      (append (my-reverse (cdr lat)) (list (car lat)))))

C doesn't do the data-structure bookkeeping for you. Recall that we're supposed to the reverse in place. This solution starts with an empty list and creates new cons cells to add to it. And of course a list doesn't look like a primitive string of bytes. So I've got to get my head back in the C game.

As a warmup, let's write a recursive solution that uses an auxiliary buffer for the output, because that sounds easier to me. First, I want a routine to handle some special cases, which will then call a recursive function. It is at this point that I have to figure out what I'm going to do with an unterminated string. My answer: I'm going to terminate it. You're terminated, sucker. Yes, I'm going to overwrite a character of the data you passed me. Dude, where's my data? No, DUDE! Where's my string terminator?

Is that the right answer? I don't know. Actually, I'm pretty sure that this is not what the guys that gave me the task had in mind. In retrospect I think it probably would have made more sense to just use strnlen to get a "working length," and just reverse that many characters. But I think the bigger problem is that asking for a function with that prototype is asking the wrong question. It's a guaranteed "WTF?" for someone. Anyway, here's the "preamble" error-checking code as I wrote it today, for working with a function that returns the reversed string in an auxiliary data buffer:

void ReverseString( char *str_p, size_t buffer_size )
{
    if ( NULL == str_p )
    {
        printf( "NULL pointer!\n" );
        return;
    }
    else
    {
        unsigned long len = strnlen( str_p, buffer_size );
        if ( buffer_size == len )
        {
            str_p[buffer_size - 1] = '\0';
            len--;
            printf( "Terminating your string for you!\n" );
        }

        /* Recursive with an auxiliary array */
        char *str_out_p = (char *)( malloc( buffer_size ) );
        if ( NULL == str_out_p )
        {
            printf( "malloc failed!\n" );
            return;
        }
        else
        {
            strncpy( str_out_p, str_p, buffer_size );
            printf( "Before: %s\n", str_p );
            ReverseStringAuxHelper ( str_out_p, str_p );
            printf( "After: %s\n", str_out_p );
            free( str_out_p );
        }

        /* This doesn't actually meet the requirements... */
    }
}

So let's write ReverseStringAuxHelper now. Something like this:

void ReverseStringAuxHelper( char *str_out_p, char *str_in_p )
{
    /* Do some setup, and call a recursive helper */
}

Our basic strategy for recursive calls is clear: we want to call ourselves on a shorter substring of the input string, until we hit a NUL. And then... build up the data to return. But how do we do that? As we "unwind" the call stack, we will have a series of pointers to characters in the input string, in reverse order. But we want to write them into the outgoing string in forward order. To keep track of the index into the auxiliary data structure we could use a global (a file-scope variable, maybe declared static). Gag me with a spoon. No, let's use a local. You can't nest functions in C; you don't have lexical closure. But we can do something like this (get your barf bags ready):

void ReverseStringAuxHelper( char *str_out_p, char *str_in_p )
{
    char *str_out_walking_p = str_out_p;
    ReverseStringAuxRecursive
        ( &str_out_walking_p, str_in_p );
}

We can pass the address of the local pointer variable, to allow the recursive function to modify it -- so it can use the pointer, and increment its value. The recursive function now has a prototype that looks like this:

void ReverseStringAuxRecursive( char **str_out_p_p,
    char *str_in_p );

Note the double-asterisks: a pointer to a pointer. And now the whole function:

void ReverseStringAuxRecursive( char **str_out_p_p,
    char *str_in_p )
{
    if ( '\0' != *str_in_p )
    {
        ReverseStringRecursiveAuxHelper( str_out_p_p, 
            str_in_p + 1 );
        **str_out_p_p = *str_in_p;
        *str_out_p_p += 1;
    }
}

Good heavens. That is.. inelegant. In C++ we could do that by reference, at least, which would clean the appearance of the pointer expressions, but that really just sweeps it under the rug. It works, but can we do better? In fact, after sleeping on it for a while, I woke up abruptly with a prettier solution in my head. The key insight was realizing that there is no good reason I can't recurse with two parameters that are each changed to bring us closer to the terminating condition. We have already calculated the length of the string, so why don't we use that to figure out the terminating condition?

void ReverseStringAux2Recursive( char *str_out_back_p,
    char *str_in_fwd_p, char *str_in_term_p )
{
    if ( str_in_fwd_p < str_in_term_p )
    {
        *str_out_back_p = *str_in_fwd_p;
        ReverseStringAux2Recursive( str_out_back_p - 1, 
            str_in_fwd_p + 1, str_in_term_p );
    }
}

The idea here is that we will recurse on the input string, walking forwards through it and ending when we reach a supplied end pointer, and copying characters into the output string walking backwards. It's tail-recursive, which is nice (I'll talk about that a little more later). We could let the function write the terminating NUL by changing the test to read

if ( str_in_fwd_p < str_in_term_p )

But if we did that, we'd wind up writing a terminating NUL before the first character in the output string. We don't want to do that, so we need a special case for this. Let's just let our helper function do it.

void ReverseStringAux2Helper( char *str_out_p, 
    char *str_in_p, unsigned long len )
{
    /*
       The caller must guarantee that the
       output string can hold len + 1 chars
    */
    ReverseStringAux2Recursive( &str_out_p[len - 1],
        str_in_p, &str_in_p[len] );
    /* Write the terminator in the output string */
    str_out_p[len] = '\0';
}

That doesn't look too bad. We make use of the fact that we've already calculated the length of the string. Now, that we've figured out how to do this with two changing pointers, can we do that in place? Yes, we can:

void ReverseStringInPlaceRecursive( char *fwd_p, 
    char *back_p )
{
    if ( fwd_p < back_p )
    {
        char temp = *fwd_p;
        *fwd_p = *back_p;
        *back_p = temp;
        ReverseStringInPlaceRecursive(fwd_p + 1,
            back_p - 1);
    }
}

void ReverseStringInPlaceHelper( char *str_p,
    unsigned long len )
{
    ReverseStringInPlaceHelper( str_p,
        &str_p[len - 1] );
}

That's longer, but most of the body of the recursive function is just devoted to swapping characters using a temp value, before the tail recursion. But what's going on with

if ( fwd_p < back_p )

? Well, I'm glad you asked. Note that we're swapping characters. We walk along the string, going backwards, swapping characters from the front of the string, going forwards. But then a funny thing happens. The pointers meet in the middle, and cross. In other words, you want "ABC" to become "CBA" but if you don't stop the reversing process at "B", the left and right sides of the string are swapped again, giving you "CBC." The resulting string is palindromic -- it reads the same forward as backwards -- and so reversing it again leaves it unchanged -- but that's not what you want. Run this in an interactive debugger if you can -- it's interesting to watch. Well, if you like that sort of thing. It might help if I give you a picture:

By the way, both the recursive reverse in place and the second recursive reverse to an auxiliary string have a subtle bug. Can you spot it? I'll come back to that later at the end of this post.

Anyway, pretty as that in-place solution is, we probably don't really want a recursive solution in C. The biggest reason is that although I hear some C compilers now will optimize tail recursion into simple iteration, they aren't required to do it. I've heard that GCC can do it, but I wasn't able to figure out how to get my setup (LLVM GCC 4.2 with XCode 4.6.2) to do the optimization, as far as I could tell. So I'd say you definitely can't count on it, and so a big string might blow up your stack.

So what about an iterative solution? There are a few options, but a basic pointer math approach is very idiomatic C, and it looks very similar to the recursive solution, even if walking a pointer backwards seems a little odd and hazardous:

char *fwd_p = str_p;
char *back_p = &(str_p[len - 1]);

while ( fwd_p < back_p )
{
    char temp = *fwd_p;
    *fwd_p = *back_p;
    *back_p = temp;

    fwd_p++;
    back_p--;
}

That's not a complete function. Here's a complete function with a test routine. This version is written without the buffer size parameter of the original spec. You'll need to include assert.h, string.h, and stdio.h

/* in-place reverse with two pointers */
void ReverseStringWithPointers( char *str_p )
{
    /*
         There are several conditions that would induce us to exit
         early. A null pointer is the most obvious one.
    */
    if ( NULL == str_p )
    {
        return;
    }
    else
    {
        /* 
            A zero-length string is the next
        */
        if ( '\0' == *str_p )
        {
            return;
        }
        else
        {
            /*
                Assume the string is properly terminated; find the end
            */
            char *back_p = str_p + 1;
            while ( '\0' != *back_p)
            {
                back_p++;
            }
            /* Leave back_p pointing to the last character before the terminator */
            back_p--;

            while ( str_p < back_p )
            {
                char temp = *str_p;
                *str_p = *back_p;
                *back_p = temp;
                    
                str_p++;
                back_p--;
            }
        }
    }
}

char const *one_char_str = "1";
char const *two_char_str = "12";
char const *three_char_str = "123";

int main(int argc, const char * argv[]) {

    /* Put these in writable storage */
    char writable_one_char_str[2];
    char writable_two_char_str[3];
    char writable_three_char_str[4];
    strcpy( writable_one_char_str, one_char_str );
    strcpy( writable_two_char_str, two_char_str );
    strcpy( writable_three_char_str, three_char_str );
    
    ReverseStringWithPointers( NULL );
    ReverseStringWithPointers( writable_one_char_str );
    ReverseStringWithPointers( writable_two_char_str );
    ReverseStringWithPointers( writable_three_char_str );
    
    assert( 0 == strncmp( writable_one_char_str, "1", 1 ) );
    assert( 0 == strncmp( writable_two_char_str, "21", 2 ) );
    assert( 0 == strncmp( writable_three_char_str, "321", 3 ) );
    
    return 0;
}

That's probably a preferred interview solution for the "standard" in-place string reverse without the buffer size parameter. It illustrates how idiomatic C pointer math is; it's idiomatic because it expressive and it works. It is probably one of the biggest reasons for C's longevity.

So, any other ideas for implementing the string reverse? Well, for the core logic you can use array indexing like so:

for ( unsigned long idx = 0; idx < ( ( len / 2 ) - 1 ); idx++ )
{
    unsigned long rev_idx = len - 1 - idx;
    char temp = str_p[idx];
    str_p[idx] = str_p[rev_idx];
    str_p[rev_idx] = temp;
}

But that is actually problematic for some string lengths. If len is less than two, dividing it by two to find the array's halfway point yields zero, and when you subtract one, you'll have an underflow, giving a very large positive number for your array index, giving undefined befhavior (if you're lucky, you'll get an immediate crash). This approach can be rehabilitated by screening the early return conditions.

Oh, there's another little trick you can do. I'll leave it to you to figure this one out, if you haven't seen it before. Instead of using a temporary variable to swap the characters, you could do this:

str_p[rev_idx] ^= str_p[idx];
str_p[idx] ^= str_p[rev_idx];
str_p[rev_idx] ^= str_p[idx];

And there, dear reader, I will leave you for today. I hope you found this interesting and maybe learned something. I did. Maybe you can see why my brain locked up -- I didn't want to write a function with that prototype; it it seems to contradict itself as far as whether it should be safe, or just efficient. Wanting to write a recursive solution was a secondary issue. Oh, and of course let's not forget that this is the wrong solution for modern strings -- for any Unicode types. It only works for old-school strings of bytes.

Don't get me wrong -- I'm not actually complaining about the interview question. It's a great interview question, because it looks easy but it should also bring up a lot of safety and design issues in the programmer's head. I've been humbled just a bit. I've been reminded that while I've spent years writing code that uses STL and reference counting and nice, modern features of C++ and more modern libraries, I've forgotten just a bit how low-level C often turns out to be. I should never do that, because of course sometimes you have to debug all that nice, modern code, and remember what is going on underneath. Had I been asked to write the solution in a simple assembly pseudo-code, I would have had to come up with an iterative solution immediately, although I'm not sure I would have been able to work out the pointer-crossing problem in the interview setting. Recursion is almost certainly the wrong way to go in C, but maybe you can see why I was drawn to the idea, especially of writing the reversed string to a second buffer. It's funny how a relatively simple problem in C can be so "interesting." May we one day no longer live in such interesting times...

FOLLOWUP: I got feedback that the interviewers were quite pleased with both my performance and attitude. I actually sent them the source code for my experiments today, too, with some comments, in case they have anything to add. I screwed that up again, though, because the first version had a crashing bug, so I sent another one, and that still didn't have the most elegant form of the recursive functions. And, a couple of the solutions have that subtle bug I mentioned earlier. What is the subtle bug?

Obviously, it is a bug to write past the end an array in memory -- whether we're talking about an array allocated dynamically on the heap, on the stack, or in static storage, like a global variable. You can't guarantee that this bug will be caught -- it could be symptomless, just silently corrupting some other variable in your program. And obviously it is a similar bug to write before the beginning of an array. But did you know it is a bug to calculate the address of an item before the first item, even if you never use it to write to the array?

The C standard guarantees that you are allowed to create the address of an extra array element past the end of your array. Reading or writing at this location is "undefined behavior" -- that is, very bad. But there is no such guarantee for creating a pointer to an array element at index -1, even if you don't read or write it. See also this CERT coding guideline.

There are several places in our recursive code where that is actually a concern. If we have a zero-length string, our calculated len is zero. Every place where we use len should be inspected. For example, in ReverseStringAux2Helper, I pass the value &str_out_p[len - 1]. This gives me the address of the last writable character in the output array. But for a zero length string, this will result in calculating the address of the character before the first character of the output string.

The rationale for this, I believe, goes something like this. On an arbitrary computer running your C program, you don't actually know what your memory addresses look like, at the hardware level. C has to cover a lot of weird and arcane and now-deceased architectures: segmented pointers, odd pointer sizes, architectures where different types of addresses have different numbers of bits, architectures where only address registers can hold addresses and they aren't the same size as any other type, etc. What if you're running this on a small processor, and it just so happens that the array in question was started at hardware address zero? Calculating the address of element -1, in an unsigned pointer type, would "underflow" into a large number. I don't specifically know of an architecture where just storing that number in an address register or in memory will really crash, but there could be one, or at least could have been one. And the designers of the C standard wanted to make sure that C was as portable as possible.

The logic in ReverseStringAux2Recursive determines that there is nothing to do, and so we don't write to that address, but it is still illegal. To be strictly scrupulous about observing the standard, logic should be changed so that in the case of the zero length string, we don't get to the point where we calculate the address of that non-existent element. This also happens in the code where we calculate the length of a string and determine if it needs to be forcibly terminated -- in the case of a specified buffer length of zero, len underflows. A zero-length string can't be a valid string. I really do need to avoid this case entirely. I also need to think hard about the string size that is specified as one character, which can become a zero-length string if we forcibly terminate it. So the wrapper logic, given my chosen strategy of terminating unterminated strings, needs to look something like this:

void ReverseString( char *str_p, size_t buffer_size )
{
    if ( NULL == str_p )
    {
        printf( "ReverseString: received NULL pointer!\n" );
        return;
    }
    else if ( 0 == buffer_size )
    {
        printf( "ReverseString: received buffer_size 0!\n" );
        return;
    }
    else
    {
        /*
           strnlen returns the number of characters before
           the NUL or the provided maximum. If we get back
           the maximum, assume the string is NOT terminated.
        */
        unsigned long len = strnlen( str_p, buffer_size );
        if ( buffer_size == len )
        {
            /*
               Forcibly terminate the incoming string,
               consider its length to be one shorter, log a
               problem, and continue
            */
            str_p[buffer_size - 1] = '\0';
            len--;
            printf( "ReverseString: no termination found!\n" );

            /*
               If our adjusted string length is zero, don't
               continue because some of our algorithms that
               create pointers to the first character before
               the terminator will generate a pointer to an
               array item at index -1, an invalid operation
               per the C standard
            */
            if ( 0 == len )
            {
                printf( "After termination, len is 0!\n" );
                return;
            }
        }
        
        /*
           There is really nothing we need to do to reverse
           a string of length 1, but rather than rule that out
           here, we let our algorithms handle that case to
           exercise their boundary conditions.
        */

This is another case where the two parameters, the string pointer itself and the buffer_size, can contradict each other, with results that are complicated no matter what you do.

What happens if we don't forcibly terminate incoming strings that aren't terminated within buffer_size chars, but just reverse buffer_size chars? Well, the preamble code is a little different. Here's a version of the whole function that does that. The reversal algorithm in this version isn't very efficient -- it copies bytes twice, and it allocates extra storage. But efficiency wasn't mentioned in the task description, and I think it is correct, at least for this interpretation of how to deal with strings that aren't terminated within buffer_length chars. It also seems like something I might have had a better chance of getting right on paper, quickly, when put on the spot.

void ReverseStringFinal( char *str_p, size_t buffer_size )
{
    /*
       We take the buffer_size parameter to mean
       that we shouldn't touch more that buffer_size
       bytes, even if the string is longer.

       There are several error conditions that really
       should trigger failure, returning some kind of
       error code, but the spec doesn't give us that
       option.
    */
    if ( NULL == str_p )
    {
        return;
    }
    else
    {
        /*
           strnlen will return buffer_size if it
           doesn't find a terminator before counting
           buffer_size characters.
        */
        unsigned long len = strnlen( str_p, buffer_size );
        /* I'd like to use max(), but it is not standard C */
        unsigned long working_len = 0;
        if ( len < buffer_size )
        {
            working_len = len;
        }
        else
        {
            working_len = buffer_size;
        }

        if ( working_len < 2 )
        {
            /*
               Reversing a string of 0 or 1 character
               is a no-op
            */
            return;
        }
        else
        {
            /*
               An inefficient but simple solution using a
               temporary buffer
            */
            char *temp_buf_p = (char *)
                ( malloc( working_len ) );
            if ( NULL == temp_buf_p )
            {
                return;
            }
            else
            {
                /* Copy chars backwards */
                for ( unsigned long idx = 0;
                    idx < working_len; idx++ )
                {
                    temp_buf_p[ working_len - 1 - idx] = 
                        str_p[idx];
                }
                /* Copy forwards into original buffer */
                for ( unsigned long idx = 0; 
                    idx < working_len; idx++ )
                {
                    str_p[idx] = temp_buf_p[idx];
                }
                /*
                    Don't write a null; only touch
                    the chars in the working length.
                */
                free( temp_buf_p );
            }
        }
    }
}

30 August 2006

The Kernel Programming Language and the Quest for Simplicity

I've been reading with considerable interest about the Kernel programming language under development by John Shutt. Information and a PDF file are available here.

I am not sure I fully understand the implications and details of his $vau and $wrap constructions, but I strongly support his goals, and the rest of his report kept me, literally, awake half the night reading it. Shutt writes:

...the mandate of the Kernel language is to have a clean design, flowing from a broad design philosophy as refined by a series of more specific design guidelines --- just one of which is that all manipulable entities should be first-class.

He is trying to create a Scheme-like language which is even more uniform, predictable, and configurable than R5RS Scheme.

I don't have a Ph.D. in the programming language field, so bear with me if my terminology is a little vague. It seems that the big thing Shutt is attempting is to give fine-grained control over evaluation, essentially separating out the parameter-evaluation stage of a lambda call into a distinct and fully configurable stage. This means you can do more clever things with objects without evaluating them. I think it also points you towards macro-like constructions that don't lead off into the existing slightly swampy maze of different Scheme macro implementations. It also could be a more explicit and flexible means of handling the need for quoting.

On top of that he is also doing a number of slightly smaller things (on the scale of theoretical complexity), but which I think are great.

For instance, Shutt has in his design very explicit rules for what to do with cyclic structures, a.k.a. infinite lists. While in some situations creating these would be considered an error, in others they make perfect sense: for example, in creating trees that describe recursive-descent parsing. Therefore, giving a sensible semantics to handling cyclic structures is a sensible and intriguing thing to do. If you want to write functions which aren't willing to handle cyclic structures, you can use one of his predicates to check for that possibility.

He also has addressed a number of minor irritants which don't require a deep understanding of the lambda calculus to appreciate. For instance, he proposes that boolean evaluation should never be implicit: that is, if you want to evaluate something as a boolean value, you have to use a predicate on it to generate a boolean value, and compare it to true or false. In other words, no more "anything not explicitly false is not true" or "false is false, and nil is false, but anything else is true, and true is true" or other bizarre variant on this theme.

As someone who switches languages a lot, I have always tried to make my boolean evaluations absolutely explicit to avoid the potential confusion that implicit evaluation can bring. For example, I always write (pedantically) in C++:

if ( false == stack.full() ) {
    push()...

In addition, he has addressed the question "if everything in Scheme has a value, what is the value of a control structure?" To make this explicit he proposes an immutable value called inert, somewhat analagous to void in C/C++, and which is an explicit type and which even has explict external representation.

There's lots more great stuff in Shutt's draft report and it addresses many of the weaknesses that make Scheme implementations so maddeningly inconsistent. Read it, study it, live it. I am looking forward to Shutt's doctoral thesis and I hope that there is one day very soon a freely available implementation of Kernel to try out!

09 February 2006

Thinking about Sudoku and Scheme, Part 5.1

Just a brief post today. I have been too busy to do much on my Sudoku solving program itself, but I did spend some time polishing up my array2D library. With revision this library has finally assumed what I feel is its "natural" shape. The optimal design was not clear at the outset, but with some use I feel I've discerned the implementation that minimizes the amount of redundant code and thus the number of places that possible errors may be hiding. I'm sure it could be even more concise; I have a feeling that a little macro magic could cut the lines of code in half, but the shape of that implementation has not yet become clear to me, and I have an allergy to using techniques that don't provide a clear advantage.

It is really quite a simple library, but in order to provide some assurance of correctness, I also wrote a series of PLT Scheme test cases. I have not figured out how to use DrScheme to tell me if this test suite covers all of the code, or if this is even possible, but it is at least nearly complete.

The library does not do meaningful error-checking. Scheme textbook code rarely does anything by way of error-checking. As a result I'm deficient on strategies I could use to address this, especially when trying to write to RSR5.

It isn't possible to code around every conceivable error. For some applications, it makes sense to perform verification any time data is read from or written to memory. We can't operate at that level. We also shouldn't put in effort to detect flaws in the implementation itself. Given our partially functional approach, in which we don't update any top-level definitions, there aren't a lot of unexpected states our library can get into. We should focus our attention on the response to incorrect inputs. There are only a couple of classes of errors that the library should expect:

1. The set of array boundary errors. Because coordinate pairs are mapped to a single index on a vector, many of possible boundary errors will result in an incorrect cell being set or get. Only cell references that exceed the internal representation will generate a meaningful run-time error, and this error will not tell us if the x or y index was out of bounds.

2. The set of errors that can arise from receiving an incorrect value-generator or mapping function. In Scheme, a function signature really consists only of the name and the number of parameters, and does not include type information about the parameters. Supplying an incorrect number of parameters will be caught with a meaningful error at run-time but because the passed functions may be called via several levels of indirection, and the function may not be bound to a named variable, it may be difficult to glean from the error message just what has gone wrong. (Another way to say this is that DrScheme has thrown away too much context information).

Type errors are potentially more insidious, but in my implementation these functions will tend to receive an array (vector), integer indices, and possibly another function. The result of doing a vector access on an integer or function, or treating a vector or function like an integer, should at least fail fast with a moderately meaningful error.

There are richer language constructs that could address these problems. For example, either Common Lisp or Dylan would allow me to specify type information for the parameters. The runtime is then capable of flagging type errors as soon as they happen, or better yet, verify type safety across functions and generate code that operates on POD (plain old data). Dylan also supports limited types, although implementation support in Gwydion's d2c is spotty. Limited types would allow range checking further upstream. Of course, if I wrote this program in Common Lisp or Dylan, I would not bother simulating an array. If our language supported exception handling, I could wrap the top-level operations to catch exceptions without too much modification inside all my functions.

Another approach to bounds-checking in my 2-D array would be to encapsulate the array data together with metadata on its x and y dimensions. I am doing this to a limited extent with the closure generation, but the captured x-dimension is not used to validate that the x-index is within bounds. Other array implementations use a record or a closure to wrap up the representation itself. But then it is more difficult to display the contents of the array, and if we are going to start down that road, we might as well try to generalize our library for an n-dimensional array. That seems like overkill, so for now we will call array2D complete as it is.

31 January 2006

Interesting Embeddable Languages

There are a few languages I am paying particular attention to these days.

The first is Scheme; that's the language I'm using for my Sudoku experiments.

The second is Lua. Lua is a light language that includes a byte code compiler and interpreter. It is designed to have a small implementation footprint and to interface well with C. It has lightweight threads. Lua's home is at http://lua.org.

The third is Io. Io has a syntax that looks a bit weird, unless you're familiar with Smalltalk. I've never used Smalltalk, but I have used NewtonScript. The prototype-based inheritance scheme used in Io reminds me of NewtonScript's inheritance, although Io does not support dual inheritance chains. It may not be quite portable enough to move over to an arbitrary platform, but I won't know for sure until I try. Io's home is at http://iolanguage.com.

And then there is Ruby. Ruby is not quite so lightweight, but it is designed to be embeddable; it would not be my first choice, though, because it does not have a well-tested compiler or byte code runtime. It has too little regularity in its syntax (otherwise known as "too much syntax"), and uses sigils, which I don't like except as convention, as in Dylan, but as far as I'm concerned for usability it beats Perl hands-down. I prefer it to Python for no particularly good reasons other than Python's use of whitespace, which is a relatively trivial feature, and Guido's apparent hostility towards fully usable lexical closures. Ruby's home is at http://ruby-lang.org.

Ruby's support for higher-order functions has the side effect of making it serve as an acceptable dialect of Lisp, even without an exact equivalent of Lisp macros. I was able to bring up my basic Sudoku code in just a couple of hours using Ruby. Ruby is great for rapid prototyping and the standard libraries "feel" just the right size. There's some lessons here to inform future generations of Lisp, if the Lisp community is willing to listen to them.

There are some Forth-inspired languages that are also interesting; I am a fan of Forth, and consider it to be the "other" programmable programming language. Joy (http://www.latrobe.edu.au/philosophy/phimvt/joy.html) is a sort of functional variant on Forth.

30 January 2006

Thinking about Sudoku and Scheme, Part 4

Last time, I wrote:

Now we've got our given board and our candidate board, and we can start smashing them together to shake out candidates, and begin finding values to go in our boxes.

So let's do that. But first, I need to confess something. You may have found yourself wondering "You already have a data structure that contains the givens and their coordinates. Why make the given-board vector at all?" If you wondered that, congratulations. You're paying attention, possibly more than I am, and working towards writing code that is simple.

So what's the answer? The answer is "I don't have a good reason; the code just grew that way, and has not been optimally cleaned up yet." As I write this code, I'm learning to use Scheme more effectively. It has been an iterative process. Right now I'm in the part of the cycle that is excited to have the code successfully working on the first part of the Sudoku board solution. At some point I'll probably be in the part of the cycle that wants to rip up the code and make it smaller and simpler. Scheme lends itself well to stepwise refinement. This is both a blessing and a curse. A blessing, because it is easy to rewrite your code. In C++, assuming your design is not visibly horrible and doesn't crash, it is generally too painful to redo all those interfaces and class definitions and so working code often doesn't get improved unless there is a burning need. Scheme, especially if it is written in an appropriate functional style, is easy to improve locally. This also means that you can always improve it some more, and fall into the trap of improving it endlessly without solving an actual problem. That's the curse.

OK, now let's get to the meaty part, keeping in mind that this is code in progress. Here's how we map the givens and use them to cancel the first round of candidates. From the top:

(cancel-givens
(make-given-board given-list)
(make-initial-candidate-board))

Simple enough so far, right? Now we just have to define cancel-givens, and we're done.

(define cancel-givens
 (lambda (given-board candidate-board)
   (map-board-coordinates-elts
    given-board
    (lambda (x-idx y-idx elt)
      (if elt
          (cancel-given
           candidate-board x-idx y-idx elt))))
   candidate-board))

We know roughly what that map function is going to look like; it will call a function with the coordinates and elements of the given board. Now we just need to define cancel-given, and we're done. here's cancel-given. This one is big (most likely it could and should be further simplified):

(define cancel-given
 (lambda (candidate-board x-idx y-idx given)
   (reduce-candidate-list-to-given!
    candidate-board x-idx y-idx given)
   (letrec ((filtered-neighborhood-unique-coordinates-list
             (make-filtered-coordinate-list
              (get-unique-neighborhood-coordinates
               x-idx y-idx) (list x-idx y-idx)))
            (candidate-remover
             (make-candidate-remover given))
            (coordinate-list-handler
             (lambda (coordinate-list)
               (if (not (null? coordinate-list))
                   (begin
                     (candidate-remover
                      candidate-board
                      (caar coordinate-list)
                      (cadar coordinate-list))
                     (coordinate-list-handler
                      (cdr coordinate-list)))))))
     (coordinate-list-handler
      filtered-neighborhood-unique-coordinates-list))
   candidate-board))

Let's look at this one in a little more detail. Remember that this is called with each given. The first thing we do is turn the candidate list in the box with the given to a list containing only the given. Maybe that isn't the best representation, but it will do for now:

(define reduce-candidate-list-to-given!
 (lambda (candidate-board x-idx y-idx given)
   (set-board-elt!
    candidate-board
    x-idx y-idx
    (list given))))

Note that this is not strictly functional in style. I have tried to move over to a functional style, but my roots are still in imperative programming, so a fully functional, referentially-transparent implementation does not come naturally quite yet. In general, I am at a bit of a loss for good idioms to handle the gamut of solving strategies on the board over time without introducing mutation, but that is probably from lack of experience in purely functional programming techniques and data structures. I could keep regenerating the whole board, but that seems strange to me. I may explore a stricter functional design in a future version.

Meanwile, let's continue on and look at the next task, which is to remove the given from the entire neighborhood. Recall that the neighborhood of a box is the combined row, column and grid. We want a list of coordinates. Simply combining the coordinates for the row, column and grid will generate duplicate sets of coordinates; we also don't want to process the box itself, or we'll remove the given we just set. So we first use get-unique-neighborhood-coordinates:

(define get-unique-neighborhood-coordinates
 (lambda (x-idx y-idx)
   (let ((grid-bounds
          (map-coordinates-to-grid-bounds x-idx y-idx))
         (acc
          (make-unique-coordinate-accumulator)))
     (map-board-row-coordinates y-idx (car acc))
     (map-board-col-coordinates x-idx (car acc))
     (map-board-subarray-coordinates
      (car grid-bounds)
      (cadr grid-bounds)
      (caddr grid-bounds)
      (cadddr grid-bounds)
      (car acc))
     ((cadr acc)))))

The function map-coordinates-to-grid-bounds uses a little bit of Scheme math to determine the bounds of the 3x3 grid our box is in. Here's how I do that:

(define map-coordinates-to-grid-bounds
 (lambda (x-idx y-idx)
   (list (get-grid-lower-bound x-idx grid-x-dim)
         (get-grid-upper-bound x-idx grid-x-dim)
         (get-grid-lower-bound y-idx grid-y-dim)
         (get-grid-upper-bound y-idx grid-y-dim))))

And here are the helper functions:

(define get-grid-lower-bound
 (lambda (idx dim)
   (* dim (floor (/ idx dim)))))

(define get-grid-upper-bound
 (lambda (idx dim)
   (- (* dim (ceiling (/ (+ 1 idx) dim))) 1)))

The particulars of Scheme math require some care if you want to make sure you get a whole number result rounded in the direction that you want. If you're used to using languages that don't do exact fractions, this may seem like too much work. This way seems to work, though, so we will move on for now.

Next, we use an accumulator (created by make-unique-coordinate-accumulator) to collect up all the coordinates. This isn't a numeric accumulator, but a list accumulator: when we create it, it holds a reference to an empty list, and as we call it with coordinates, the list is assembled. Our accumulator generator looks like this:

(define make-unique-coordinate-accumulator
 (lambda ()
   (let ((coordinates (list)))
     (list
      (lambda (x-idx y-idx)
        (if (not (member (list x-idx y-idx) coordinates))
            (set! coordinates
                  (cons (list x-idx y-idx) coordinates))))
      (lambda ()
        coordinates)))))

This is again not strictly functional; in a more functional style, we'd probably generate a new accumulator with a new list created from the previous one. But then the client that held the accumulator would have to mutate its binding in order to get the new accumulator, so it would have to be strictly functional as well -- a sort of viral strict functionality would require rewriting all the upstream code. Something to think about!

This is yet another function generator, with a twist. We use a let to create a binding for a list, called coordinates. This is the list that will accumulate the coordinates. The actual return value of the function is... another list! This time, a list of two functions. The first function in the list receives coordinates and if the coordinate pair is not already present in the closed-over list, as determined by member, we destructively set coordinates to be the list with the new coordinate pair prepended. The second function in the list doesn't look like a function at all; in fact, it consists only of:

(lambda ()
 coordinates)

All that does is return the value held in the binding named coordinates. It's the rough equivalent of a return statement in C, C++, or Java. The important thing to note here is that both of the functions in the list we return contain a reference to coordinates. The binding in question is the one that is in lexical scope when the lambda form is executed. Note that we could stuff these functions inside some other data structure; we could even put them in a single cons cell using set-car! and set-cdr! But for now I am still using the list as my "Swiss Army Knife" data structure, so we'll use that.

Returning to the innards of get-unique-neighborhood-coordinates let's look at what happens next:

(map-board-row-coordinates y-idx (car acc))
    (map-board-col-coordinates x-idx (car acc))
    (map-board-subarray-coordinates
     (car grid-bounds)
     (cadr grid-bounds)
     (caddr grid-bounds)
     (cadddr grid-bounds)
     (car acc))

I'm applying my accumulator to the row, the column, and the subarray. The map functions are unexceptional; they are wrappers around my 2D array implementation. To pass the four parameters that constitute the bounding rectangle of the subarray, I'm again using those all-purpose list functions car, cadr, caddr, and cadddr. These simply correspond to retrieving the first, second, third, and fourth element in a list. Actually, in RSR5 Scheme you can rewrite this to use first, second, third, and fourth if you want to, but older textbooks generally use the composed list abbreviations, so I will stick with them for now; rather than just use clearer names, if I rewrite this to get rid of the composed list functions, I'd rather use a self-documenting data structure of some kind.

The last expression in the function is the return value. Don't overlook its importance:

(cadr acc)

This actually calls the function contained in the second element of our accumulator list, which will retrieve the contents of the accumulator. Since it is the last expression in the function, get-unique-neighborhood-coordinates will return its value.

Now we're back in cancel-given. The next thing we do is call make-filtered-coordinate-list on our unique list. We do this to remove the box containing our given, so that we don't wind up removing that candidate as well:

(define make-filtered-coordinate-list
 (lambda (coordinate-list coordinate-pair)
   (if (null? coordinate-list) (list)
       (if (not (equal?
                 coordinate-pair (car coordinate-list)))
           (cons (car coordinate-list)
                 (make-filtered-coordinate-list
                  (cdr coordinate-list) coordinate-pair))
           (make-filtered-coordinate-list
            (cdr coordinate-list) coordinate-pair)))))

This function is a textbook case of filtering out one member from a list, returning a new list. The key implementation detail here is that it recursively walks the list, building up a new list as it goes. As it stops to examine each element in the list, if it is not the equal to the item in question, it goes ahead and conses it on to the generated list; if it is, the element is not consed, and we just continue to recurse. You can see similar examples in The Little Schemer. Note that this function counts on the fact that equal? works properly on data structures that are not atoms and on atoms that are different objects in memory. In Scheme, Lisp, and other dynamic languages -- basically, as soon as you get away from "unboxed" primitive data types -- equality is a complex subject!

OK, are you still with me? We've got a list of the coordinates for the neighborhood we're examining. Next, we make a candidate remover, which is another function generator; when you pass it a candidate, it removes it from all the candidate list. That should be straightforward enough by now, so I won't belabor it. So, does it work?

Well, of course, it did not work on the first try. Or the second or third try! In fact, while I have presented this design top-down, I actually wrote it using a much more realistic mixture of top-down and bottom-up design. Sometimes the higher-order algorithm will be clear. This is often best worked out on paper. The cancel-givens and cancel-given function started out as a stub that called stubs. To test the implementation function-by-function, the stubs just returned fixed values. These were later fleshed out with real code. As I implemented more functions, I found that certain other function definitions seemed to naturally fall out, so wrote those next. Meanwhile, had already gone "bottom up" in the process of writing my 2D array functions, so my implementation hooked up in the middle. Then there was a bit of "impedance mismatch," where the functions I thought I would need turned out not to be precisely the functions I did need. I wound up iterating in both the "bottom up" and "top down" directions multiple times before getting even to this point. But using the DrScheme listener, turnaround on this kind of iteration is very quick.

The result of removing all the givens from the board, and crossing off candidates in the neighborhood of each given as I go, is a candidate-board vector that looks like the one below. A list of only one value indicates that the value has been solved. I have formatted the output lightly to make the rows a little more clear, and I have marked the initial set of givens in bold:

((2 3 7) (3 8) (2 3 7 8) (2 3 4) (6) (5) (2 3 4 8) (9) (1)

(1 2 3 6 7) (1 3 6 8) (2 3 6 7 8) (2 3 4 9) (2 3 7 8)

(2 3 7 8 9) (5) (4 6 8) (2 3 4 6 8)

(4) (5) (9) (1) (2 3 8) (2 3 8) (2 3 8) (7) (2 3 6 8)

(2 5 6) (4 6 8) (2 5 6 8) (2 5) (9) (1 2 7) (1 2 4 8)

(3) (2 4 6 8)

(2 3 5 9) (3 4) (1) (8) (2 3) (6) (7) (4 5) (2 4 9)

(2 3 5 6 9) (7) (2 3 5 6 8) (2 3 5) (4) (1 2 3) (1 2 8 9)

(1 5 6 8) (2 6 8 9)

(1 3 7) (9) (3 7) (3) (1 3 8) (4) (6) (2) (5)

(1 3 5 6 7) (1 3 6) (4) (2 3 6 9) (1 2 3 8) (1 2 3 8 9)

(1 3 8 9) (1 8) (3 7 8 9)

(8) (2) (3 6) (7) (5) (1 3 9) (1 3 4 9) (1 4) (3 4 9))

Note that there were 28 givens. The next important question to ask is "did we find any more solved boxes?" That is, did we reduce the candidate list in any of the non-given boxes to a length of 1, indicating only a single possibility in that box?

If you look closely, you will notice that, in fact, we have solved some boxes, simply by crossing out the givens and following through on the implications of those givens for the boxes in the neighborhood of each given. In the third-from-last row, which would be row 6 in our zero-based indexing, we see that there is a single 3 candidate in one of the candidate lists. This was not a given:

(1 3 7) (9) (3 7) (3) (1 3 8) (4) (6) (2) (5)

However, our code did not yet discover that we have made more progress. In the next installment, we'll look at how the code can make that discovery, and what the implications are. We'll also look how to apply some slightly less elementary solving rules to the board.

Practical Subversion and Other Subversive Practices

Since I'm a fan of the book Programming Ruby, I purchased the two Pragmatic Programmer books, Pragmatic Version Control Using CVS and Pragmatic Version Control Using Subversion. I was planning to use them as a quick refresher to CVS and a quick cross-introduction to Subversion, since I have a couple of years of experience using (and occasionally fixing) CVS repositories, and also have used a couple of other tools including MKS RCS, Voodoo, and (a little bit of) ClearCase, and we are implementing a Subversion repository at work.

I wound up returning the books to the store, though. I would recommend either of them to someone who is new to both tools, but they are too light for someone who has already used one of these version control systems extensively.

For CVS, there are plenty of free texts available online, including the venerable Cederquist, which is a bit hard to use as a tutorial, and the Fogel and Bar book (at http://cvsbook.red-bean.com/). I have the second edition in the dead tree edition; the third edition is available free online. If you like it, buy a paper copy.

For Subversion, although there are some free texts available, I decided that Practical Subversion, by Garrett Rooney, is the right text, on the right level, at the right time, for me. The chapter on best practices alone helps differentiate it from other books which tend to be organized around a command-line options reference. We'll see how it holds up as I gain a little more experience with Subversion.

I also picked up a copy of The Reasoned Schemer. I am a big fan of The Little Schemer and The Seasoned Schemer, although I must admit I have not completed The Seasoned Schemer.

I also started reading A Little Java, a Few Patterns, which is an effort to teach functional and recursive programming using Java. Given that I had a lot of Java experience and had already thought a lot about design patterns, it seemed like a strange and forced hybrid, so I decided not to bother with it and just work on Scheme instead. The book may be interesting to developers who know Java and who would like to begin to understand Scheme.

I am now re-skimming The Little Schemer as a warmup to a new book on logic programming. I've been interested in learning Prolog for a while now but have not gotten much past reading part of Clocksin and Mellish's book Programming in Prolog, 5th ed. Given that I'm currently learning Scheme, implementing a Prolog-like language in Scheme might be the impetus that I need. If I get enough time to work on it I will report on what I find.

27 January 2006

Thinking about Sudoku and Scheme, Part 3

OK, so I've given you a small view of one part of the array2D library and how it is implemented. Now let's look at my Sudoku solver from the top. Note that this is a hobby project, done in the rare scraps of free time I am able to piece together these days, by a relative novice in Scheme. In other words, it is not finished, polished, textbook work, so be kind.

Let's review what Sudoku is and consider some of the most elementary solving strategies. First, terminology. These are the terms I use; some are borrowed and some are my own. Other Sudoku geeks may use alternate terminology.

Box: the smallest unit; a single space for a number. I identify boxes using zero-based x, y coordinate pairs.

Open box: a box that does not have a number in it yet.

Coordinates: on a 9x9 grid, which does not represent the only possibility in Sudoku, the top row goes from (0, 0) to (8, 0) and the rightmost colum goes from (8, 0) to (8, 8).

Given: one of the numbers already provided in the puzzle.

Grid: in a 9x9 Sudoku puzzle, one of the 9 3x3 sub-parts; on a 16x16 puzzle, one of the 16 4x4 sub-parts, etc. Grids are not necessarily square; some published puzzles could contain 4x3 grids, for example.

Virtual row: a single group that must contain the set of possible numbers (in a 9x9 puzzle, 1 through 9). Note that each box is actually part of three virtual rows: its row, column, and grid.

Number set: for a 9x9 puzzle, the integers 1 through 9. I have seen Sudoku boards that allow the numbers 1 through 12, 1 through 16, 1 through 25, and 1 through 49. Sometimes letters are used: for example, the numbers 1 through 16 can be encoded using the hexadecimal digits 0..F (or 1..G), and the numbers 1 through 25 can be encoded by the letters A..Y. There are more exotic Sudoku involving the letters that make a specific word, but this is really just an encoding issue and does not fundamentally change the rules.

Sudoku rule: there is probably a good formal expression of this, but informally speaking, the Sudoku rule says that each box in each virtual row must contain one of the numbers in the number set.

By implication, each box cannot contain a number that is not a member of the number set; it also cannot contain a duplicate of another number in the same virtual row; in a completed puzzle each virtual row must be filled, which means the whole puzzle must be filled.

If you find that a puzzle you are working on seems to break this rule, either the puzzle was not valid to begin with, or you have made a solving error. The second case is more likely, but I have found errors in published puzzles.

There are several possible types of errors. The givens may violate the Sudoku rule, which will lead to an insoluble puzzle, or more than one solution may be possible, which could lead you to find a solution that follows the Sudoku rule but which does not match the published solution.

Candidate: one of the possible values that could legally be written in a box.

On an empty 9x9 grid, each box has as its candidates all the numbers 1 through 9, so no progress can be made towards a solution.

When the givens are taken into account, the set of candidates in the remaining open boxes in the neighborhood must be reduced to maintain the Sudoku rule. As the puzzle is filled in, the candidates are reduced further.

As you perform logical deductions on the board, you will remove candidates; when only one candidate remains for a box, the number should be written down and the candidates for boxes elsewhere in the neighborhood must have that number removed; repeat until no open boxes remain. (This is actually in a nutshell the algorithm for solving any Sudoku puzzle, and translates directly into the main loop of a computer solution that uses human solving techniques rather than testing every possibile chain of possibilities).

Noncandidate: instead of writing down candidates as you work on the puzzle, you could also write down noncandidates: that is, numbers that have been ruled out for each box. When the set of noncandidates for a given box leaves only one remaining candidate in the number set, you can write down that possibility, and you must then add this number this to the sets of noncandidates elsewhere in the neighborhood. You could represent noncandidates by writing them down directly, although this can be confusing, or writing down the number set in each open box and crossing out numbers as they become noncandidates.

Single Candidate: the case when one candidate remains in a box; once you recognize this case, you should write in the number; the box is solved, you should then make sure to reduce the candidates in the neighborhood. Which brings us to another term.

Neighborhood: the combined row, column, and grid that contains a given open box. Neighborhood is a distinct concept from virtual row because if a number appears anywhere in the neighborhood containing a given box, it cannot be a valid candidate for the box; this is because the Sudoku rule applies to each box three ways.

Basic Solving Strategies

For demonstration purposes, we will use the following 9x9 puzzle (taken from _Sudoku: Easy to Hard Presented by Will Shortz_ Volume 3):

-------------------------
| 9 . 8 | 3 . . | 7 4 2 |
| . 4 . | 5 8 . | . 9 . |
| 2 . . | . 7 . | 1 . 8 |
-------------------------
| . . 6 | 2 9 4 | 8 3 . |
| 4 8 . | . . . | . . 1 |
| 3 7 . | 6 . 8 | . 2 . |
-------------------------
| . 3 . | . 5 6 | . . . |
| 6 . . | 1 . . | . 8 3 |
| . 2 . | . . 3 | . 1 5 |
-------------------------

Let's look at how we might represent this board in Scheme, and then start coming up with functions to operate on it.

To allow my solver to be configured for different board sizes and grid dimensions, we'll provide bindings for the independent dimension values:

(define max-box-value 9)
(define board-x-dim 9)
(define board-y-dim 9)
(define grid-x-dim 3)
(define grid-y-dim 3)

To start solving our sample puzzle, let's use the following algorithm:

1. Make a given board. A given board is a board array that contains only the givens. Here are the givens, as a list of list, where each list contains a coordinate pair as a list and a value:

(define given-list
  '(((4 0) 6) ((5 0) 5) ((7 0) 9) ((8 0) 1)
    ((6 1) 5)
    ((0 2) 4) ((1 2) 5) ((2 2) 9) ((3 2) 1) ((7 2) 7)
    ((4 3) 9) ((7 3) 3)
    ((2 4) 1) ((3 4) 8) ((5 4) 6) ((6 4) 7)
    ((1 5) 7) ((4 5) 4)
    ((1 6) 9) ((5 6) 4) ((6 6) 6) ((7 6) 2) ((8 6) 5)
    ((2 7) 4)
    ((0 8) 8) ((1 8) 2) ((3 8) 7) ((4 8) 5)))

Here's a function which will make a given board:

(define make-given-board
  (lambda (given-list)
    (let ((given-board
           (make-vector (* board-x-dim board-y-dim) #f))
          (setter! (array2D-make-setter board-x-dim)))
      (letrec ((do-write-givens
                (lambda (puzzle-array givens)
                  (if (not (null? givens))
                      (begin
                        (do-write-givens
                         puzzle-array (cdr givens))
                        (setter! puzzle-array
                                 (caaar givens)
                                 (cadaar givens)
                                 (cadar givens)))))))
        (do-write-givens given-board given-list))
      given-board)))

This is a pretty simple function, but let's look at a couple of points. First, the default value of our vector is #f, the canonical "false" value. This has no real purpose other than allowing us to use the elements of this array act as an argument to Scheme's if or cond and give the expected results. We've already seen how to use letrec to create a self-recursive helper function; we already know how the curried setter! function works. The body of this function is a pretty straightforward example of recursion over a list. The arguments we pass to setter! are a little more interesting:

(setter! puzzle-array
              (caaar givens)
              (cadaar givens)
              (cadar givens))

These primitives provide shortcuts for expressing combinations of car and cdr, and are read inside out from right to left; caaar becomes (car (car (car ...))), cadaar becomes (car (cdr (car (car ...)))).

An aside: I have a love-hate relationship with these constructions. On the one hand, they allow you to do quick deconstruction of arbitrary list structures, and that's a good thing because lists are very quick and dirty and you can easily put together data structures using lists. On the other hand, they allow you to do quick deconstruction of arbitrary list structures, and that's bad because they are not self-documenting in any way and nothing in them is named. Using something like a plist or record type is probably a better idea, so I will consider that later; Common Lisp provides destructuring, which would also be a valid option because it is self-documenting.

Anyway, that will build a given board. Now let's build a candidate board. The initial candidate list can be easily generated using another function:

(define make-initial-candidate-list
  (lambda ()
    ((lambda (max)
       (letrec ((make-candidate-list
                 (lambda (count)
                   (if (> count max) (list)
                       (cons count
                             (make-candidate-list
                              (+ count 1)))))))
         (make-candidate-list 1)))
     max-box-value)))

I'm going to wrap this up in the form of a value generator. A value generator is a function that receives x and y indices and returns a new value for the board. In this case our value generator will ignore the parameters:

(define initial-candidate-value-generator
  (lambda (x-idx y-idx)
    (make-initial-candidate-list)))

Now we can use our set-board-elts primitive, which is a curried array mapping function, to fill in the candidates. Each box gets a list containing the full set of candidates:

(define make-initial-candidate-board
  (lambda ()
    (let ((candidate-board
           (make-vector
            (* board-x-dim board-y-dim) '())))
      (set-board-elts
       candidate-board initial-candidate-value-generator)
      candidate-board)))

Now we've got our given board and our candidate board, and we can start smashing them together to shake out candidates, and begin finding values to go in our boxes. More on that next time.

25 January 2006

Thinking about Sudoku and Scheme, Part 2

Here is how I implemented some of my array primitives. I'll start in the middle:

(define map-board-row-elts
  (array2D-make-row-elt-mapper board-x-dim))

When treating a vector like a two-dimensional array, assuming that you store them row-first, the key piece of information you need is the size of the row. (More generally, for an n-dimensional array, to access an element you need to know the size of dimensions 1 to n - 1). On the top side of my program, the dimensions of my Sudoku board are known. The array primitives don't know. Rather than pass this information to every array access call, I create a function. The function array2D-make-row-elt-mapper captures this information using closure and returns another function:

(define array2D-make-row-elt-mapper
  (lambda (x-dim)
    (lambda (array2D y-idx elt-handler)
      (array2D-map-row-elts
       array2D y-idx x-dim elt-handler))))

Note the nested lambda expressions. The top one receives the x-dim parameter, and the body of this function is another lambda expression. This second lambda expression is the return value of this function: it returns a function. This returned function invokes another function which requires an x-dim parameter.

In this case, the purpose of returning this function is to provide the functionality of another function, array2D-map-row-elts, but without the need to provide an x-dim parameter. The new function is a curried function.

What is a curried function? Currying means taking a function with n parameters and creating a new function with n - 1 parameters. The new function provides a default value for the missing parameter, giving you a simplified API to the original function.

Common Lisp and other languages provide curry and rcurry functions which peel off parameters from the left or right of the paramter list, respectively. So our currying may not meet a strict definition, although we are doing the same thing: providing an interface function that will supply a value for one of the original parameters. Currying often uses a constant for these default parameters; in our case, we're using a variable, but the idea is similar. Scheme does not have a standard curry function, but it is easy enough to express currying by using nested lambda expressions.

In my case, x-dim is the curried parameter. Inside this lambda expression is a reference to x-dim. Scheme supports lexical closure; at the time this lambda expression is evaluated, the result is a function object that "closes over" or "captures" -- that is, retains a reference to -- the variable binding in effect when the lambda expression is evaluated.

This is the variable binding owned by the caller, the variable bound to the name board-x-dim. It is not a copy: if some other piece of code updates the value in that variable, this function would see a different value for x-dim when it executes. This creates complex state with multiple possible "listeners" for the state change: this is one of the reasons that Scheme programmers tend to prefer a functional style where variables are not altered after initial binding.

If you are a C programmer, you may find this very confusing. You can think of it as if the returned function retains a pointer to the board-x-dim sent by the caller. The C runtime model is quite different, though: it has distinct lifetimes for variables; local non-static ("automatic") variables that exist on the stack may be recycled when a given function returns, so it is unsafe to retain pointers to them (in fact, retaining a pointer to an automatic variable is a common beginner's bug in C).

In Scheme, bindings exist as long as there are references to them; when there are no more references to them, they may be garbage-collected (or not).

Let's look at the definition of the next function we call, array2D-map-row-elts:

((define array2D-map-row-elts
  (lambda (array2D y-idx x-dim elt-handler)
    (array2D-map-subrow-elts
     array2D
     0 (- x-dim 1)
     y-idx
     x-dim
     elt-handler)))

OK, we call yet another function: array2D-map-subrow-elts, which is more general (it will map any contiguous subset of the array elements). It is the general case, and array2D-map-row-elts is the more specific case (it maps the whole row), which means we can quite naturally define this specific case by parameterizing the general case. It isn't quite the same as currying, but it is a similar idea; we provide some fixed parameters representing the start of the row (index 0) and end (index x-dim - 1). Here is the general function:

(define array2D-map-subrow-elts
  (lambda (array2D
           x-first-idx x-last-idx
           y-idx
           x-dim
           elt-handler)
    (letrec ((iter-x
              (lambda (x-idx y-idx)
                (elt-handler
                 (array2D-get array2D x-idx y-idx x-dim))
                (if (< x-idx x-last-idx)
                    (iter-x (+ x-idx 1) y-idx)))))
      (iter-x x-first-idx y-idx))))

Are we there yet?

This function receives all the parameters and walks through the array, starting with an arbitrary x-idx (in this case, zero), and proceeding until we have processed the last legal x index (in this case, x-idx - 1).

Note that instead of iteration to walk the array, I use recursion. I won't go into a full explanation of recursion here; suffice it to say that although many people feel that recursion is unnatural and does not reflect how they think about iteration in the "real world," after expressing algorithms recursively for a while, it starts to seem much more natural, and the proposition that recursion and iteration can be transformed into each other does not seem so strange.

This function is formulated using tail recursion, where the bindings are not referenced after the recursive function call. In C or Pascal, the compilers are generally not smart about this sort of thing, and recursing on a large array could result in a very deep set of stack frames. A language like Scheme is able to recognize that since the values that exist in previous function calls are not used, it is free to lose track of the previous values, so no deep stack is actually required. Since this is the case, the decision to use recursion was largely arbitrary. I chose it because I wanted to practice writing recursive functions.

The letrec may be unfamiliar. If the innermost function was not recursive, we could just call it like this, without ever binding the function to a name at all:

((lambda (x-idx y-idx)
   ... )
0 y-idx)

In this case the innermost function is never bound to a name, because nothing has to refer to it by name. The expression around it would simply call it. This is also known as applying parameters to the anonymous function object.

You might think that we could just do this with let. The semantics of let allow the call from outside the function, but the binding to the name iter-x would exist only within the body of the let expression. This body consists of the expression or expressions that occur after the binding list in the let expression. Since the definition of the function occurs as part of the binding list, not part of the body of the let expression, let will not work; letrec is designed specifically to allow definition of recursive or mutually recursive functions within the binding list.

So now are we there yet?

Almost. My final comment on this function concerns the use of the elt-handler parameter. The generated function allows the caller to pass in a handler function. Applying functions to functions is known as higher-order programming; support for higher-order programming is one of the essential features of Lisp and Scheme.

It might seem like this is much too complicated. Functions generating functions calling functions calling functions? Well, yes, this is a bit of overkill to access an array; normally we would use some standard functions for handling data structures. But the general style: creating a lot of small functions -- appears very commonly in Scheme programs, and now you have had a taste of how a typical data structure library might be implemented.

I don't claim to be very experienced with Scheme yet, but I am happy to have gotten to the point where writing functions like array2D-make-row-elt-mapper and its supporting functions comes rapidly and naturally. Reading examples helps, but the lessons don't really click until you tackle a programming problem, and feel your mind begin to undergo rewiring. The level of abstraction, and thus the leverage you can gain from Scheme, become much higher than this, so stay tuned.

Thinking about Sudoku and Scheme, Part 1

I've been working on some tools to help me solve Sudoku puzzles in Scheme. That goal is actually secondary to the goal of learning to write better, more idiomatic Scheme programs, and so progress towards the Sudoku solver is frequently interrupted while I backtrack in an attempt to reorganize and optimize the supporting code.

I'm working on this both in a bottom-up and top-down direction. Bottom up, to develop something of a domain-specific Sudoku-solving language. Top down, because my higher-level strategy is also not that clear at the outset and is undergoing stepwise refinement.

My approach to solving Sudoku by computer is not to recursively explore every the tree of moves until a solution is found. That approach is easily doable but less interesting to me than trying to emulate some of the solving strategies that can be performed by a human solver. Ultimately I would like the tool to assist a human solver, not replace her.

My tool of choice is PLT Scheme version 3.00. For reasons to painful to go into just now, I have currently given up on using either SRFIs or SLIB for libraries and am writing everything in plain RSR5 Scheme. More on that later.