30 January 2006

Thinking about Sudoku and Scheme, Part 4

Last time, I wrote:

Now we've got our given board and our candidate board, and we can start smashing them together to shake out candidates, and begin finding values to go in our boxes.

So let's do that. But first, I need to confess something. You may have found yourself wondering "You already have a data structure that contains the givens and their coordinates. Why make the given-board vector at all?" If you wondered that, congratulations. You're paying attention, possibly more than I am, and working towards writing code that is simple.

So what's the answer? The answer is "I don't have a good reason; the code just grew that way, and has not been optimally cleaned up yet." As I write this code, I'm learning to use Scheme more effectively. It has been an iterative process. Right now I'm in the part of the cycle that is excited to have the code successfully working on the first part of the Sudoku board solution. At some point I'll probably be in the part of the cycle that wants to rip up the code and make it smaller and simpler. Scheme lends itself well to stepwise refinement. This is both a blessing and a curse. A blessing, because it is easy to rewrite your code. In C++, assuming your design is not visibly horrible and doesn't crash, it is generally too painful to redo all those interfaces and class definitions and so working code often doesn't get improved unless there is a burning need. Scheme, especially if it is written in an appropriate functional style, is easy to improve locally. This also means that you can always improve it some more, and fall into the trap of improving it endlessly without solving an actual problem. That's the curse.

OK, now let's get to the meaty part, keeping in mind that this is code in progress. Here's how we map the givens and use them to cancel the first round of candidates. From the top:

(cancel-givens
(make-given-board given-list)
(make-initial-candidate-board))

Simple enough so far, right? Now we just have to define cancel-givens, and we're done.

(define cancel-givens
 (lambda (given-board candidate-board)
   (map-board-coordinates-elts
    given-board
    (lambda (x-idx y-idx elt)
      (if elt
          (cancel-given
           candidate-board x-idx y-idx elt))))
   candidate-board))

We know roughly what that map function is going to look like; it will call a function with the coordinates and elements of the given board. Now we just need to define cancel-given, and we're done. here's cancel-given. This one is big (most likely it could and should be further simplified):

(define cancel-given
 (lambda (candidate-board x-idx y-idx given)
   (reduce-candidate-list-to-given!
    candidate-board x-idx y-idx given)
   (letrec ((filtered-neighborhood-unique-coordinates-list
             (make-filtered-coordinate-list
              (get-unique-neighborhood-coordinates
               x-idx y-idx) (list x-idx y-idx)))
            (candidate-remover
             (make-candidate-remover given))
            (coordinate-list-handler
             (lambda (coordinate-list)
               (if (not (null? coordinate-list))
                   (begin
                     (candidate-remover
                      candidate-board
                      (caar coordinate-list)
                      (cadar coordinate-list))
                     (coordinate-list-handler
                      (cdr coordinate-list)))))))
     (coordinate-list-handler
      filtered-neighborhood-unique-coordinates-list))
   candidate-board))

Let's look at this one in a little more detail. Remember that this is called with each given. The first thing we do is turn the candidate list in the box with the given to a list containing only the given. Maybe that isn't the best representation, but it will do for now:

(define reduce-candidate-list-to-given!
 (lambda (candidate-board x-idx y-idx given)
   (set-board-elt!
    candidate-board
    x-idx y-idx
    (list given))))

Note that this is not strictly functional in style. I have tried to move over to a functional style, but my roots are still in imperative programming, so a fully functional, referentially-transparent implementation does not come naturally quite yet. In general, I am at a bit of a loss for good idioms to handle the gamut of solving strategies on the board over time without introducing mutation, but that is probably from lack of experience in purely functional programming techniques and data structures. I could keep regenerating the whole board, but that seems strange to me. I may explore a stricter functional design in a future version.

Meanwile, let's continue on and look at the next task, which is to remove the given from the entire neighborhood. Recall that the neighborhood of a box is the combined row, column and grid. We want a list of coordinates. Simply combining the coordinates for the row, column and grid will generate duplicate sets of coordinates; we also don't want to process the box itself, or we'll remove the given we just set. So we first use get-unique-neighborhood-coordinates:

(define get-unique-neighborhood-coordinates
 (lambda (x-idx y-idx)
   (let ((grid-bounds
          (map-coordinates-to-grid-bounds x-idx y-idx))
         (acc
          (make-unique-coordinate-accumulator)))
     (map-board-row-coordinates y-idx (car acc))
     (map-board-col-coordinates x-idx (car acc))
     (map-board-subarray-coordinates
      (car grid-bounds)
      (cadr grid-bounds)
      (caddr grid-bounds)
      (cadddr grid-bounds)
      (car acc))
     ((cadr acc)))))

The function map-coordinates-to-grid-bounds uses a little bit of Scheme math to determine the bounds of the 3x3 grid our box is in. Here's how I do that:

(define map-coordinates-to-grid-bounds
 (lambda (x-idx y-idx)
   (list (get-grid-lower-bound x-idx grid-x-dim)
         (get-grid-upper-bound x-idx grid-x-dim)
         (get-grid-lower-bound y-idx grid-y-dim)
         (get-grid-upper-bound y-idx grid-y-dim))))

And here are the helper functions:

(define get-grid-lower-bound
 (lambda (idx dim)
   (* dim (floor (/ idx dim)))))

(define get-grid-upper-bound
 (lambda (idx dim)
   (- (* dim (ceiling (/ (+ 1 idx) dim))) 1)))

The particulars of Scheme math require some care if you want to make sure you get a whole number result rounded in the direction that you want. If you're used to using languages that don't do exact fractions, this may seem like too much work. This way seems to work, though, so we will move on for now.

Next, we use an accumulator (created by make-unique-coordinate-accumulator) to collect up all the coordinates. This isn't a numeric accumulator, but a list accumulator: when we create it, it holds a reference to an empty list, and as we call it with coordinates, the list is assembled. Our accumulator generator looks like this:

(define make-unique-coordinate-accumulator
 (lambda ()
   (let ((coordinates (list)))
     (list
      (lambda (x-idx y-idx)
        (if (not (member (list x-idx y-idx) coordinates))
            (set! coordinates
                  (cons (list x-idx y-idx) coordinates))))
      (lambda ()
        coordinates)))))

This is again not strictly functional; in a more functional style, we'd probably generate a new accumulator with a new list created from the previous one. But then the client that held the accumulator would have to mutate its binding in order to get the new accumulator, so it would have to be strictly functional as well -- a sort of viral strict functionality would require rewriting all the upstream code. Something to think about!

This is yet another function generator, with a twist. We use a let to create a binding for a list, called coordinates. This is the list that will accumulate the coordinates. The actual return value of the function is... another list! This time, a list of two functions. The first function in the list receives coordinates and if the coordinate pair is not already present in the closed-over list, as determined by member, we destructively set coordinates to be the list with the new coordinate pair prepended. The second function in the list doesn't look like a function at all; in fact, it consists only of:

(lambda ()
 coordinates)

All that does is return the value held in the binding named coordinates. It's the rough equivalent of a return statement in C, C++, or Java. The important thing to note here is that both of the functions in the list we return contain a reference to coordinates. The binding in question is the one that is in lexical scope when the lambda form is executed. Note that we could stuff these functions inside some other data structure; we could even put them in a single cons cell using set-car! and set-cdr! But for now I am still using the list as my "Swiss Army Knife" data structure, so we'll use that.

Returning to the innards of get-unique-neighborhood-coordinates let's look at what happens next:

(map-board-row-coordinates y-idx (car acc))
    (map-board-col-coordinates x-idx (car acc))
    (map-board-subarray-coordinates
     (car grid-bounds)
     (cadr grid-bounds)
     (caddr grid-bounds)
     (cadddr grid-bounds)
     (car acc))

I'm applying my accumulator to the row, the column, and the subarray. The map functions are unexceptional; they are wrappers around my 2D array implementation. To pass the four parameters that constitute the bounding rectangle of the subarray, I'm again using those all-purpose list functions car, cadr, caddr, and cadddr. These simply correspond to retrieving the first, second, third, and fourth element in a list. Actually, in RSR5 Scheme you can rewrite this to use first, second, third, and fourth if you want to, but older textbooks generally use the composed list abbreviations, so I will stick with them for now; rather than just use clearer names, if I rewrite this to get rid of the composed list functions, I'd rather use a self-documenting data structure of some kind.

The last expression in the function is the return value. Don't overlook its importance:

(cadr acc)

This actually calls the function contained in the second element of our accumulator list, which will retrieve the contents of the accumulator. Since it is the last expression in the function, get-unique-neighborhood-coordinates will return its value.

Now we're back in cancel-given. The next thing we do is call make-filtered-coordinate-list on our unique list. We do this to remove the box containing our given, so that we don't wind up removing that candidate as well:

(define make-filtered-coordinate-list
 (lambda (coordinate-list coordinate-pair)
   (if (null? coordinate-list) (list)
       (if (not (equal?
                 coordinate-pair (car coordinate-list)))
           (cons (car coordinate-list)
                 (make-filtered-coordinate-list
                  (cdr coordinate-list) coordinate-pair))
           (make-filtered-coordinate-list
            (cdr coordinate-list) coordinate-pair)))))

This function is a textbook case of filtering out one member from a list, returning a new list. The key implementation detail here is that it recursively walks the list, building up a new list as it goes. As it stops to examine each element in the list, if it is not the equal to the item in question, it goes ahead and conses it on to the generated list; if it is, the element is not consed, and we just continue to recurse. You can see similar examples in The Little Schemer. Note that this function counts on the fact that equal? works properly on data structures that are not atoms and on atoms that are different objects in memory. In Scheme, Lisp, and other dynamic languages -- basically, as soon as you get away from "unboxed" primitive data types -- equality is a complex subject!

OK, are you still with me? We've got a list of the coordinates for the neighborhood we're examining. Next, we make a candidate remover, which is another function generator; when you pass it a candidate, it removes it from all the candidate list. That should be straightforward enough by now, so I won't belabor it. So, does it work?

Well, of course, it did not work on the first try. Or the second or third try! In fact, while I have presented this design top-down, I actually wrote it using a much more realistic mixture of top-down and bottom-up design. Sometimes the higher-order algorithm will be clear. This is often best worked out on paper. The cancel-givens and cancel-given function started out as a stub that called stubs. To test the implementation function-by-function, the stubs just returned fixed values. These were later fleshed out with real code. As I implemented more functions, I found that certain other function definitions seemed to naturally fall out, so wrote those next. Meanwhile, had already gone "bottom up" in the process of writing my 2D array functions, so my implementation hooked up in the middle. Then there was a bit of "impedance mismatch," where the functions I thought I would need turned out not to be precisely the functions I did need. I wound up iterating in both the "bottom up" and "top down" directions multiple times before getting even to this point. But using the DrScheme listener, turnaround on this kind of iteration is very quick.

The result of removing all the givens from the board, and crossing off candidates in the neighborhood of each given as I go, is a candidate-board vector that looks like the one below. A list of only one value indicates that the value has been solved. I have formatted the output lightly to make the rows a little more clear, and I have marked the initial set of givens in bold:

((2 3 7) (3 8) (2 3 7 8) (2 3 4) (6) (5) (2 3 4 8) (9) (1)

(1 2 3 6 7) (1 3 6 8) (2 3 6 7 8) (2 3 4 9) (2 3 7 8)

(2 3 7 8 9) (5) (4 6 8) (2 3 4 6 8)

(4) (5) (9) (1) (2 3 8) (2 3 8) (2 3 8) (7) (2 3 6 8)

(2 5 6) (4 6 8) (2 5 6 8) (2 5) (9) (1 2 7) (1 2 4 8)

(3) (2 4 6 8)

(2 3 5 9) (3 4) (1) (8) (2 3) (6) (7) (4 5) (2 4 9)

(2 3 5 6 9) (7) (2 3 5 6 8) (2 3 5) (4) (1 2 3) (1 2 8 9)

(1 5 6 8) (2 6 8 9)

(1 3 7) (9) (3 7) (3) (1 3 8) (4) (6) (2) (5)

(1 3 5 6 7) (1 3 6) (4) (2 3 6 9) (1 2 3 8) (1 2 3 8 9)

(1 3 8 9) (1 8) (3 7 8 9)

(8) (2) (3 6) (7) (5) (1 3 9) (1 3 4 9) (1 4) (3 4 9))

Note that there were 28 givens. The next important question to ask is "did we find any more solved boxes?" That is, did we reduce the candidate list in any of the non-given boxes to a length of 1, indicating only a single possibility in that box?

If you look closely, you will notice that, in fact, we have solved some boxes, simply by crossing out the givens and following through on the implications of those givens for the boxes in the neighborhood of each given. In the third-from-last row, which would be row 6 in our zero-based indexing, we see that there is a single 3 candidate in one of the candidate lists. This was not a given:

(1 3 7) (9) (3 7) (3) (1 3 8) (4) (6) (2) (5)

However, our code did not yet discover that we have made more progress. In the next installment, we'll look at how the code can make that discovery, and what the implications are. We'll also look how to apply some slightly less elementary solving rules to the board.

No comments: