27 January 2006

Thinking about Sudoku and Scheme, Part 3

OK, so I've given you a small view of one part of the array2D library and how it is implemented. Now let's look at my Sudoku solver from the top. Note that this is a hobby project, done in the rare scraps of free time I am able to piece together these days, by a relative novice in Scheme. In other words, it is not finished, polished, textbook work, so be kind.

Let's review what Sudoku is and consider some of the most elementary solving strategies. First, terminology. These are the terms I use; some are borrowed and some are my own. Other Sudoku geeks may use alternate terminology.

Box: the smallest unit; a single space for a number. I identify boxes using zero-based x, y coordinate pairs.

Open box: a box that does not have a number in it yet.

Coordinates: on a 9x9 grid, which does not represent the only possibility in Sudoku, the top row goes from (0, 0) to (8, 0) and the rightmost colum goes from (8, 0) to (8, 8).

Given: one of the numbers already provided in the puzzle.

Grid: in a 9x9 Sudoku puzzle, one of the 9 3x3 sub-parts; on a 16x16 puzzle, one of the 16 4x4 sub-parts, etc. Grids are not necessarily square; some published puzzles could contain 4x3 grids, for example.

Virtual row: a single group that must contain the set of possible numbers (in a 9x9 puzzle, 1 through 9). Note that each box is actually part of three virtual rows: its row, column, and grid.

Number set: for a 9x9 puzzle, the integers 1 through 9. I have seen Sudoku boards that allow the numbers 1 through 12, 1 through 16, 1 through 25, and 1 through 49. Sometimes letters are used: for example, the numbers 1 through 16 can be encoded using the hexadecimal digits 0..F (or 1..G), and the numbers 1 through 25 can be encoded by the letters A..Y. There are more exotic Sudoku involving the letters that make a specific word, but this is really just an encoding issue and does not fundamentally change the rules.

Sudoku rule: there is probably a good formal expression of this, but informally speaking, the Sudoku rule says that each box in each virtual row must contain one of the numbers in the number set.

By implication, each box cannot contain a number that is not a member of the number set; it also cannot contain a duplicate of another number in the same virtual row; in a completed puzzle each virtual row must be filled, which means the whole puzzle must be filled.

If you find that a puzzle you are working on seems to break this rule, either the puzzle was not valid to begin with, or you have made a solving error. The second case is more likely, but I have found errors in published puzzles.

There are several possible types of errors. The givens may violate the Sudoku rule, which will lead to an insoluble puzzle, or more than one solution may be possible, which could lead you to find a solution that follows the Sudoku rule but which does not match the published solution.

Candidate: one of the possible values that could legally be written in a box.

On an empty 9x9 grid, each box has as its candidates all the numbers 1 through 9, so no progress can be made towards a solution.

When the givens are taken into account, the set of candidates in the remaining open boxes in the neighborhood must be reduced to maintain the Sudoku rule. As the puzzle is filled in, the candidates are reduced further.

As you perform logical deductions on the board, you will remove candidates; when only one candidate remains for a box, the number should be written down and the candidates for boxes elsewhere in the neighborhood must have that number removed; repeat until no open boxes remain. (This is actually in a nutshell the algorithm for solving any Sudoku puzzle, and translates directly into the main loop of a computer solution that uses human solving techniques rather than testing every possibile chain of possibilities).

Noncandidate: instead of writing down candidates as you work on the puzzle, you could also write down noncandidates: that is, numbers that have been ruled out for each box. When the set of noncandidates for a given box leaves only one remaining candidate in the number set, you can write down that possibility, and you must then add this number this to the sets of noncandidates elsewhere in the neighborhood. You could represent noncandidates by writing them down directly, although this can be confusing, or writing down the number set in each open box and crossing out numbers as they become noncandidates.

Single Candidate: the case when one candidate remains in a box; once you recognize this case, you should write in the number; the box is solved, you should then make sure to reduce the candidates in the neighborhood. Which brings us to another term.

Neighborhood: the combined row, column, and grid that contains a given open box. Neighborhood is a distinct concept from virtual row because if a number appears anywhere in the neighborhood containing a given box, it cannot be a valid candidate for the box; this is because the Sudoku rule applies to each box three ways.

Basic Solving Strategies

For demonstration purposes, we will use the following 9x9 puzzle (taken from _Sudoku: Easy to Hard Presented by Will Shortz_ Volume 3):

```-------------------------
| 9 . 8 | 3 . . | 7 4 2 |
| . 4 . | 5 8 . | . 9 . |
| 2 . . | . 7 . | 1 . 8 |
-------------------------
| . . 6 | 2 9 4 | 8 3 . |
| 4 8 . | . . . | . . 1 |
| 3 7 . | 6 . 8 | . 2 . |
-------------------------
| . 3 . | . 5 6 | . . . |
| 6 . . | 1 . . | . 8 3 |
| . 2 . | . . 3 | . 1 5 |
-------------------------```

Let's look at how we might represent this board in Scheme, and then start coming up with functions to operate on it.

To allow my solver to be configured for different board sizes and grid dimensions, we'll provide bindings for the independent dimension values:

```(define max-box-value 9)
(define board-x-dim 9)
(define board-y-dim 9)
(define grid-x-dim 3)
(define grid-y-dim 3)```

To start solving our sample puzzle, let's use the following algorithm:

1. Make a given board. A given board is a board array that contains only the givens. Here are the givens, as a list of list, where each list contains a coordinate pair as a list and a value:

```(define given-list
'(((4 0) 6) ((5 0) 5) ((7 0) 9) ((8 0) 1)
((6 1) 5)
((0 2) 4) ((1 2) 5) ((2 2) 9) ((3 2) 1) ((7 2) 7)
((4 3) 9) ((7 3) 3)
((2 4) 1) ((3 4) 8) ((5 4) 6) ((6 4) 7)
((1 5) 7) ((4 5) 4)
((1 6) 9) ((5 6) 4) ((6 6) 6) ((7 6) 2) ((8 6) 5)
((2 7) 4)
((0 8) 8) ((1 8) 2) ((3 8) 7) ((4 8) 5)))```

Here's a function which will make a given board:

```(define make-given-board
(lambda (given-list)
(let ((given-board
(make-vector (* board-x-dim board-y-dim) #f))
(setter! (array2D-make-setter board-x-dim)))
(letrec ((do-write-givens
(lambda (puzzle-array givens)
(if (not (null? givens))
(begin
(do-write-givens
puzzle-array (cdr givens))
(setter! puzzle-array
(caaar givens)
(do-write-givens given-board given-list))
given-board)))```

This is a pretty simple function, but let's look at a couple of points. First, the default value of our vector is #f, the canonical "false" value. This has no real purpose other than allowing us to use the elements of this array act as an argument to Scheme's if or cond and give the expected results. We've already seen how to use letrec to create a self-recursive helper function; we already know how the curried setter! function works. The body of this function is a pretty straightforward example of recursion over a list. The arguments we pass to setter! are a little more interesting:

```(setter! puzzle-array
(caaar givens)

These primitives provide shortcuts for expressing combinations of car and cdr, and are read inside out from right to left; caaar becomes (car (car (car ...))), cadaar becomes (car (cdr (car (car ...)))).

An aside: I have a love-hate relationship with these constructions. On the one hand, they allow you to do quick deconstruction of arbitrary list structures, and that's a good thing because lists are very quick and dirty and you can easily put together data structures using lists. On the other hand, they allow you to do quick deconstruction of arbitrary list structures, and that's bad because they are not self-documenting in any way and nothing in them is named. Using something like a plist or record type is probably a better idea, so I will consider that later; Common Lisp provides destructuring, which would also be a valid option because it is self-documenting.

Anyway, that will build a given board. Now let's build a candidate board. The initial candidate list can be easily generated using another function:

```(define make-initial-candidate-list
(lambda ()
((lambda (max)
(letrec ((make-candidate-list
(lambda (count)
(if (> count max) (list)
(cons count
(make-candidate-list
(+ count 1)))))))
(make-candidate-list 1)))
max-box-value)))```

I'm going to wrap this up in the form of a value generator. A value generator is a function that receives x and y indices and returns a new value for the board. In this case our value generator will ignore the parameters:

```(define initial-candidate-value-generator
(lambda (x-idx y-idx)
(make-initial-candidate-list)))```

Now we can use our set-board-elts primitive, which is a curried array mapping function, to fill in the candidates. Each box gets a list containing the full set of candidates:

```(define make-initial-candidate-board
(lambda ()
(let ((candidate-board
(make-vector
(* board-x-dim board-y-dim) '())))
(set-board-elts
candidate-board initial-candidate-value-generator)
candidate-board)))```

Now we've got our given board and our candidate board, and we can start smashing them together to shake out candidates, and begin finding values to go in our boxes. More on that next time.