31 January 2006

Interesting Embeddable Languages

There are a few languages I am paying particular attention to these days.

The first is Scheme; that's the language I'm using for my Sudoku experiments.

The second is Lua. Lua is a light language that includes a byte code compiler and interpreter. It is designed to have a small implementation footprint and to interface well with C. It has lightweight threads. Lua's home is at http://lua.org.

The third is Io. Io has a syntax that looks a bit weird, unless you're familiar with Smalltalk. I've never used Smalltalk, but I have used NewtonScript. The prototype-based inheritance scheme used in Io reminds me of NewtonScript's inheritance, although Io does not support dual inheritance chains. It may not be quite portable enough to move over to an arbitrary platform, but I won't know for sure until I try. Io's home is at http://iolanguage.com.

And then there is Ruby. Ruby is not quite so lightweight, but it is designed to be embeddable; it would not be my first choice, though, because it does not have a well-tested compiler or byte code runtime. It has too little regularity in its syntax (otherwise known as "too much syntax"), and uses sigils, which I don't like except as convention, as in Dylan, but as far as I'm concerned for usability it beats Perl hands-down. I prefer it to Python for no particularly good reasons other than Python's use of whitespace, which is a relatively trivial feature, and Guido's apparent hostility towards fully usable lexical closures. Ruby's home is at http://ruby-lang.org.

Ruby's support for higher-order functions has the side effect of making it serve as an acceptable dialect of Lisp, even without an exact equivalent of Lisp macros. I was able to bring up my basic Sudoku code in just a couple of hours using Ruby. Ruby is great for rapid prototyping and the standard libraries "feel" just the right size. There's some lessons here to inform future generations of Lisp, if the Lisp community is willing to listen to them.

There are some Forth-inspired languages that are also interesting; I am a fan of Forth, and consider it to be the "other" programmable programming language. Joy (http://www.latrobe.edu.au/philosophy/phimvt/joy.html) is a sort of functional variant on Forth.

30 January 2006

Thinking about Sudoku and Scheme, Part 4

Last time, I wrote:

Now we've got our given board and our candidate board, and we can start smashing them together to shake out candidates, and begin finding values to go in our boxes.

So let's do that. But first, I need to confess something. You may have found yourself wondering "You already have a data structure that contains the givens and their coordinates. Why make the given-board vector at all?" If you wondered that, congratulations. You're paying attention, possibly more than I am, and working towards writing code that is simple.

So what's the answer? The answer is "I don't have a good reason; the code just grew that way, and has not been optimally cleaned up yet." As I write this code, I'm learning to use Scheme more effectively. It has been an iterative process. Right now I'm in the part of the cycle that is excited to have the code successfully working on the first part of the Sudoku board solution. At some point I'll probably be in the part of the cycle that wants to rip up the code and make it smaller and simpler. Scheme lends itself well to stepwise refinement. This is both a blessing and a curse. A blessing, because it is easy to rewrite your code. In C++, assuming your design is not visibly horrible and doesn't crash, it is generally too painful to redo all those interfaces and class definitions and so working code often doesn't get improved unless there is a burning need. Scheme, especially if it is written in an appropriate functional style, is easy to improve locally. This also means that you can always improve it some more, and fall into the trap of improving it endlessly without solving an actual problem. That's the curse.

OK, now let's get to the meaty part, keeping in mind that this is code in progress. Here's how we map the givens and use them to cancel the first round of candidates. From the top:

(make-given-board given-list)

Simple enough so far, right? Now we just have to define cancel-givens, and we're done.

(define cancel-givens
 (lambda (given-board candidate-board)
    (lambda (x-idx y-idx elt)
      (if elt
           candidate-board x-idx y-idx elt))))

We know roughly what that map function is going to look like; it will call a function with the coordinates and elements of the given board. Now we just need to define cancel-given, and we're done. here's cancel-given. This one is big (most likely it could and should be further simplified):

(define cancel-given
 (lambda (candidate-board x-idx y-idx given)
    candidate-board x-idx y-idx given)
   (letrec ((filtered-neighborhood-unique-coordinates-list
               x-idx y-idx) (list x-idx y-idx)))
             (make-candidate-remover given))
             (lambda (coordinate-list)
               (if (not (null? coordinate-list))
                      (caar coordinate-list)
                      (cadar coordinate-list))
                      (cdr coordinate-list)))))))

Let's look at this one in a little more detail. Remember that this is called with each given. The first thing we do is turn the candidate list in the box with the given to a list containing only the given. Maybe that isn't the best representation, but it will do for now:

(define reduce-candidate-list-to-given!
 (lambda (candidate-board x-idx y-idx given)
    x-idx y-idx
    (list given))))

Note that this is not strictly functional in style. I have tried to move over to a functional style, but my roots are still in imperative programming, so a fully functional, referentially-transparent implementation does not come naturally quite yet. In general, I am at a bit of a loss for good idioms to handle the gamut of solving strategies on the board over time without introducing mutation, but that is probably from lack of experience in purely functional programming techniques and data structures. I could keep regenerating the whole board, but that seems strange to me. I may explore a stricter functional design in a future version.

Meanwile, let's continue on and look at the next task, which is to remove the given from the entire neighborhood. Recall that the neighborhood of a box is the combined row, column and grid. We want a list of coordinates. Simply combining the coordinates for the row, column and grid will generate duplicate sets of coordinates; we also don't want to process the box itself, or we'll remove the given we just set. So we first use get-unique-neighborhood-coordinates:

(define get-unique-neighborhood-coordinates
 (lambda (x-idx y-idx)
   (let ((grid-bounds
          (map-coordinates-to-grid-bounds x-idx y-idx))
     (map-board-row-coordinates y-idx (car acc))
     (map-board-col-coordinates x-idx (car acc))
      (car grid-bounds)
      (cadr grid-bounds)
      (caddr grid-bounds)
      (cadddr grid-bounds)
      (car acc))
     ((cadr acc)))))

The function map-coordinates-to-grid-bounds uses a little bit of Scheme math to determine the bounds of the 3x3 grid our box is in. Here's how I do that:

(define map-coordinates-to-grid-bounds
 (lambda (x-idx y-idx)
   (list (get-grid-lower-bound x-idx grid-x-dim)
         (get-grid-upper-bound x-idx grid-x-dim)
         (get-grid-lower-bound y-idx grid-y-dim)
         (get-grid-upper-bound y-idx grid-y-dim))))

And here are the helper functions:

(define get-grid-lower-bound
 (lambda (idx dim)
   (* dim (floor (/ idx dim)))))

(define get-grid-upper-bound
 (lambda (idx dim)
   (- (* dim (ceiling (/ (+ 1 idx) dim))) 1)))

The particulars of Scheme math require some care if you want to make sure you get a whole number result rounded in the direction that you want. If you're used to using languages that don't do exact fractions, this may seem like too much work. This way seems to work, though, so we will move on for now.

Next, we use an accumulator (created by make-unique-coordinate-accumulator) to collect up all the coordinates. This isn't a numeric accumulator, but a list accumulator: when we create it, it holds a reference to an empty list, and as we call it with coordinates, the list is assembled. Our accumulator generator looks like this:

(define make-unique-coordinate-accumulator
 (lambda ()
   (let ((coordinates (list)))
      (lambda (x-idx y-idx)
        (if (not (member (list x-idx y-idx) coordinates))
            (set! coordinates
                  (cons (list x-idx y-idx) coordinates))))
      (lambda ()

This is again not strictly functional; in a more functional style, we'd probably generate a new accumulator with a new list created from the previous one. But then the client that held the accumulator would have to mutate its binding in order to get the new accumulator, so it would have to be strictly functional as well -- a sort of viral strict functionality would require rewriting all the upstream code. Something to think about!

This is yet another function generator, with a twist. We use a let to create a binding for a list, called coordinates. This is the list that will accumulate the coordinates. The actual return value of the function is... another list! This time, a list of two functions. The first function in the list receives coordinates and if the coordinate pair is not already present in the closed-over list, as determined by member, we destructively set coordinates to be the list with the new coordinate pair prepended. The second function in the list doesn't look like a function at all; in fact, it consists only of:

(lambda ()

All that does is return the value held in the binding named coordinates. It's the rough equivalent of a return statement in C, C++, or Java. The important thing to note here is that both of the functions in the list we return contain a reference to coordinates. The binding in question is the one that is in lexical scope when the lambda form is executed. Note that we could stuff these functions inside some other data structure; we could even put them in a single cons cell using set-car! and set-cdr! But for now I am still using the list as my "Swiss Army Knife" data structure, so we'll use that.

Returning to the innards of get-unique-neighborhood-coordinates let's look at what happens next:

(map-board-row-coordinates y-idx (car acc))
    (map-board-col-coordinates x-idx (car acc))
     (car grid-bounds)
     (cadr grid-bounds)
     (caddr grid-bounds)
     (cadddr grid-bounds)
     (car acc))

I'm applying my accumulator to the row, the column, and the subarray. The map functions are unexceptional; they are wrappers around my 2D array implementation. To pass the four parameters that constitute the bounding rectangle of the subarray, I'm again using those all-purpose list functions car, cadr, caddr, and cadddr. These simply correspond to retrieving the first, second, third, and fourth element in a list. Actually, in RSR5 Scheme you can rewrite this to use first, second, third, and fourth if you want to, but older textbooks generally use the composed list abbreviations, so I will stick with them for now; rather than just use clearer names, if I rewrite this to get rid of the composed list functions, I'd rather use a self-documenting data structure of some kind.

The last expression in the function is the return value. Don't overlook its importance:

(cadr acc)

This actually calls the function contained in the second element of our accumulator list, which will retrieve the contents of the accumulator. Since it is the last expression in the function, get-unique-neighborhood-coordinates will return its value.

Now we're back in cancel-given. The next thing we do is call make-filtered-coordinate-list on our unique list. We do this to remove the box containing our given, so that we don't wind up removing that candidate as well:

(define make-filtered-coordinate-list
 (lambda (coordinate-list coordinate-pair)
   (if (null? coordinate-list) (list)
       (if (not (equal?
                 coordinate-pair (car coordinate-list)))
           (cons (car coordinate-list)
                  (cdr coordinate-list) coordinate-pair))
            (cdr coordinate-list) coordinate-pair)))))

This function is a textbook case of filtering out one member from a list, returning a new list. The key implementation detail here is that it recursively walks the list, building up a new list as it goes. As it stops to examine each element in the list, if it is not the equal to the item in question, it goes ahead and conses it on to the generated list; if it is, the element is not consed, and we just continue to recurse. You can see similar examples in The Little Schemer. Note that this function counts on the fact that equal? works properly on data structures that are not atoms and on atoms that are different objects in memory. In Scheme, Lisp, and other dynamic languages -- basically, as soon as you get away from "unboxed" primitive data types -- equality is a complex subject!

OK, are you still with me? We've got a list of the coordinates for the neighborhood we're examining. Next, we make a candidate remover, which is another function generator; when you pass it a candidate, it removes it from all the candidate list. That should be straightforward enough by now, so I won't belabor it. So, does it work?

Well, of course, it did not work on the first try. Or the second or third try! In fact, while I have presented this design top-down, I actually wrote it using a much more realistic mixture of top-down and bottom-up design. Sometimes the higher-order algorithm will be clear. This is often best worked out on paper. The cancel-givens and cancel-given function started out as a stub that called stubs. To test the implementation function-by-function, the stubs just returned fixed values. These were later fleshed out with real code. As I implemented more functions, I found that certain other function definitions seemed to naturally fall out, so wrote those next. Meanwhile, had already gone "bottom up" in the process of writing my 2D array functions, so my implementation hooked up in the middle. Then there was a bit of "impedance mismatch," where the functions I thought I would need turned out not to be precisely the functions I did need. I wound up iterating in both the "bottom up" and "top down" directions multiple times before getting even to this point. But using the DrScheme listener, turnaround on this kind of iteration is very quick.

The result of removing all the givens from the board, and crossing off candidates in the neighborhood of each given as I go, is a candidate-board vector that looks like the one below. A list of only one value indicates that the value has been solved. I have formatted the output lightly to make the rows a little more clear, and I have marked the initial set of givens in bold:

((2 3 7) (3 8) (2 3 7 8) (2 3 4) (6) (5) (2 3 4 8) (9) (1)

(1 2 3 6 7) (1 3 6 8) (2 3 6 7 8) (2 3 4 9) (2 3 7 8)

(2 3 7 8 9) (5) (4 6 8) (2 3 4 6 8)

(4) (5) (9) (1) (2 3 8) (2 3 8) (2 3 8) (7) (2 3 6 8)

(2 5 6) (4 6 8) (2 5 6 8) (2 5) (9) (1 2 7) (1 2 4 8)

(3) (2 4 6 8)

(2 3 5 9) (3 4) (1) (8) (2 3) (6) (7) (4 5) (2 4 9)

(2 3 5 6 9) (7) (2 3 5 6 8) (2 3 5) (4) (1 2 3) (1 2 8 9)

(1 5 6 8) (2 6 8 9)

(1 3 7) (9) (3 7) (3) (1 3 8) (4) (6) (2) (5)

(1 3 5 6 7) (1 3 6) (4) (2 3 6 9) (1 2 3 8) (1 2 3 8 9)

(1 3 8 9) (1 8) (3 7 8 9)

(8) (2) (3 6) (7) (5) (1 3 9) (1 3 4 9) (1 4) (3 4 9))

Note that there were 28 givens. The next important question to ask is "did we find any more solved boxes?" That is, did we reduce the candidate list in any of the non-given boxes to a length of 1, indicating only a single possibility in that box?

If you look closely, you will notice that, in fact, we have solved some boxes, simply by crossing out the givens and following through on the implications of those givens for the boxes in the neighborhood of each given. In the third-from-last row, which would be row 6 in our zero-based indexing, we see that there is a single 3 candidate in one of the candidate lists. This was not a given:

(1 3 7) (9) (3 7) (3) (1 3 8) (4) (6) (2) (5)

However, our code did not yet discover that we have made more progress. In the next installment, we'll look at how the code can make that discovery, and what the implications are. We'll also look how to apply some slightly less elementary solving rules to the board.

Practical Subversion and Other Subversive Practices

Since I'm a fan of the book Programming Ruby, I purchased the two Pragmatic Programmer books, Pragmatic Version Control Using CVS and Pragmatic Version Control Using Subversion. I was planning to use them as a quick refresher to CVS and a quick cross-introduction to Subversion, since I have a couple of years of experience using (and occasionally fixing) CVS repositories, and also have used a couple of other tools including MKS RCS, Voodoo, and (a little bit of) ClearCase, and we are implementing a Subversion repository at work.

I wound up returning the books to the store, though. I would recommend either of them to someone who is new to both tools, but they are too light for someone who has already used one of these version control systems extensively.

For CVS, there are plenty of free texts available online, including the venerable Cederquist, which is a bit hard to use as a tutorial, and the Fogel and Bar book (at http://cvsbook.red-bean.com/). I have the second edition in the dead tree edition; the third edition is available free online. If you like it, buy a paper copy.

For Subversion, although there are some free texts available, I decided that Practical Subversion, by Garrett Rooney, is the right text, on the right level, at the right time, for me. The chapter on best practices alone helps differentiate it from other books which tend to be organized around a command-line options reference. We'll see how it holds up as I gain a little more experience with Subversion.

I also picked up a copy of The Reasoned Schemer. I am a big fan of The Little Schemer and The Seasoned Schemer, although I must admit I have not completed The Seasoned Schemer.

I also started reading A Little Java, a Few Patterns, which is an effort to teach functional and recursive programming using Java. Given that I had a lot of Java experience and had already thought a lot about design patterns, it seemed like a strange and forced hybrid, so I decided not to bother with it and just work on Scheme instead. The book may be interesting to developers who know Java and who would like to begin to understand Scheme.

I am now re-skimming The Little Schemer as a warmup to a new book on logic programming. I've been interested in learning Prolog for a while now but have not gotten much past reading part of Clocksin and Mellish's book Programming in Prolog, 5th ed. Given that I'm currently learning Scheme, implementing a Prolog-like language in Scheme might be the impetus that I need. If I get enough time to work on it I will report on what I find.

27 January 2006

Thinking about Sudoku and Scheme, Part 3

OK, so I've given you a small view of one part of the array2D library and how it is implemented. Now let's look at my Sudoku solver from the top. Note that this is a hobby project, done in the rare scraps of free time I am able to piece together these days, by a relative novice in Scheme. In other words, it is not finished, polished, textbook work, so be kind.

Let's review what Sudoku is and consider some of the most elementary solving strategies. First, terminology. These are the terms I use; some are borrowed and some are my own. Other Sudoku geeks may use alternate terminology.

Box: the smallest unit; a single space for a number. I identify boxes using zero-based x, y coordinate pairs.

Open box: a box that does not have a number in it yet.

Coordinates: on a 9x9 grid, which does not represent the only possibility in Sudoku, the top row goes from (0, 0) to (8, 0) and the rightmost colum goes from (8, 0) to (8, 8).

Given: one of the numbers already provided in the puzzle.

Grid: in a 9x9 Sudoku puzzle, one of the 9 3x3 sub-parts; on a 16x16 puzzle, one of the 16 4x4 sub-parts, etc. Grids are not necessarily square; some published puzzles could contain 4x3 grids, for example.

Virtual row: a single group that must contain the set of possible numbers (in a 9x9 puzzle, 1 through 9). Note that each box is actually part of three virtual rows: its row, column, and grid.

Number set: for a 9x9 puzzle, the integers 1 through 9. I have seen Sudoku boards that allow the numbers 1 through 12, 1 through 16, 1 through 25, and 1 through 49. Sometimes letters are used: for example, the numbers 1 through 16 can be encoded using the hexadecimal digits 0..F (or 1..G), and the numbers 1 through 25 can be encoded by the letters A..Y. There are more exotic Sudoku involving the letters that make a specific word, but this is really just an encoding issue and does not fundamentally change the rules.

Sudoku rule: there is probably a good formal expression of this, but informally speaking, the Sudoku rule says that each box in each virtual row must contain one of the numbers in the number set.

By implication, each box cannot contain a number that is not a member of the number set; it also cannot contain a duplicate of another number in the same virtual row; in a completed puzzle each virtual row must be filled, which means the whole puzzle must be filled.

If you find that a puzzle you are working on seems to break this rule, either the puzzle was not valid to begin with, or you have made a solving error. The second case is more likely, but I have found errors in published puzzles.

There are several possible types of errors. The givens may violate the Sudoku rule, which will lead to an insoluble puzzle, or more than one solution may be possible, which could lead you to find a solution that follows the Sudoku rule but which does not match the published solution.

Candidate: one of the possible values that could legally be written in a box.

On an empty 9x9 grid, each box has as its candidates all the numbers 1 through 9, so no progress can be made towards a solution.

When the givens are taken into account, the set of candidates in the remaining open boxes in the neighborhood must be reduced to maintain the Sudoku rule. As the puzzle is filled in, the candidates are reduced further.

As you perform logical deductions on the board, you will remove candidates; when only one candidate remains for a box, the number should be written down and the candidates for boxes elsewhere in the neighborhood must have that number removed; repeat until no open boxes remain. (This is actually in a nutshell the algorithm for solving any Sudoku puzzle, and translates directly into the main loop of a computer solution that uses human solving techniques rather than testing every possibile chain of possibilities).

Noncandidate: instead of writing down candidates as you work on the puzzle, you could also write down noncandidates: that is, numbers that have been ruled out for each box. When the set of noncandidates for a given box leaves only one remaining candidate in the number set, you can write down that possibility, and you must then add this number this to the sets of noncandidates elsewhere in the neighborhood. You could represent noncandidates by writing them down directly, although this can be confusing, or writing down the number set in each open box and crossing out numbers as they become noncandidates.

Single Candidate: the case when one candidate remains in a box; once you recognize this case, you should write in the number; the box is solved, you should then make sure to reduce the candidates in the neighborhood. Which brings us to another term.

Neighborhood: the combined row, column, and grid that contains a given open box. Neighborhood is a distinct concept from virtual row because if a number appears anywhere in the neighborhood containing a given box, it cannot be a valid candidate for the box; this is because the Sudoku rule applies to each box three ways.

Basic Solving Strategies

For demonstration purposes, we will use the following 9x9 puzzle (taken from _Sudoku: Easy to Hard Presented by Will Shortz_ Volume 3):

| 9 . 8 | 3 . . | 7 4 2 |
| . 4 . | 5 8 . | . 9 . |
| 2 . . | . 7 . | 1 . 8 |
| . . 6 | 2 9 4 | 8 3 . |
| 4 8 . | . . . | . . 1 |
| 3 7 . | 6 . 8 | . 2 . |
| . 3 . | . 5 6 | . . . |
| 6 . . | 1 . . | . 8 3 |
| . 2 . | . . 3 | . 1 5 |

Let's look at how we might represent this board in Scheme, and then start coming up with functions to operate on it.

To allow my solver to be configured for different board sizes and grid dimensions, we'll provide bindings for the independent dimension values:

(define max-box-value 9)
(define board-x-dim 9)
(define board-y-dim 9)
(define grid-x-dim 3)
(define grid-y-dim 3)

To start solving our sample puzzle, let's use the following algorithm:

1. Make a given board. A given board is a board array that contains only the givens. Here are the givens, as a list of list, where each list contains a coordinate pair as a list and a value:

(define given-list
  '(((4 0) 6) ((5 0) 5) ((7 0) 9) ((8 0) 1)
    ((6 1) 5)
    ((0 2) 4) ((1 2) 5) ((2 2) 9) ((3 2) 1) ((7 2) 7)
    ((4 3) 9) ((7 3) 3)
    ((2 4) 1) ((3 4) 8) ((5 4) 6) ((6 4) 7)
    ((1 5) 7) ((4 5) 4)
    ((1 6) 9) ((5 6) 4) ((6 6) 6) ((7 6) 2) ((8 6) 5)
    ((2 7) 4)
    ((0 8) 8) ((1 8) 2) ((3 8) 7) ((4 8) 5)))

Here's a function which will make a given board:

(define make-given-board
  (lambda (given-list)
    (let ((given-board
           (make-vector (* board-x-dim board-y-dim) #f))
          (setter! (array2D-make-setter board-x-dim)))
      (letrec ((do-write-givens
                (lambda (puzzle-array givens)
                  (if (not (null? givens))
                         puzzle-array (cdr givens))
                        (setter! puzzle-array
                                 (caaar givens)
                                 (cadaar givens)
                                 (cadar givens)))))))
        (do-write-givens given-board given-list))

This is a pretty simple function, but let's look at a couple of points. First, the default value of our vector is #f, the canonical "false" value. This has no real purpose other than allowing us to use the elements of this array act as an argument to Scheme's if or cond and give the expected results. We've already seen how to use letrec to create a self-recursive helper function; we already know how the curried setter! function works. The body of this function is a pretty straightforward example of recursion over a list. The arguments we pass to setter! are a little more interesting:

(setter! puzzle-array
              (caaar givens)
              (cadaar givens)
              (cadar givens))

These primitives provide shortcuts for expressing combinations of car and cdr, and are read inside out from right to left; caaar becomes (car (car (car ...))), cadaar becomes (car (cdr (car (car ...)))).

An aside: I have a love-hate relationship with these constructions. On the one hand, they allow you to do quick deconstruction of arbitrary list structures, and that's a good thing because lists are very quick and dirty and you can easily put together data structures using lists. On the other hand, they allow you to do quick deconstruction of arbitrary list structures, and that's bad because they are not self-documenting in any way and nothing in them is named. Using something like a plist or record type is probably a better idea, so I will consider that later; Common Lisp provides destructuring, which would also be a valid option because it is self-documenting.

Anyway, that will build a given board. Now let's build a candidate board. The initial candidate list can be easily generated using another function:

(define make-initial-candidate-list
  (lambda ()
    ((lambda (max)
       (letrec ((make-candidate-list
                 (lambda (count)
                   (if (> count max) (list)
                       (cons count
                              (+ count 1)))))))
         (make-candidate-list 1)))

I'm going to wrap this up in the form of a value generator. A value generator is a function that receives x and y indices and returns a new value for the board. In this case our value generator will ignore the parameters:

(define initial-candidate-value-generator
  (lambda (x-idx y-idx)

Now we can use our set-board-elts primitive, which is a curried array mapping function, to fill in the candidates. Each box gets a list containing the full set of candidates:

(define make-initial-candidate-board
  (lambda ()
    (let ((candidate-board
            (* board-x-dim board-y-dim) '())))
       candidate-board initial-candidate-value-generator)

Now we've got our given board and our candidate board, and we can start smashing them together to shake out candidates, and begin finding values to go in our boxes. More on that next time.

Version Control Systems

I've had a fair amount of experience using CVS, but not for a couple of years ago. Recently I have been investigating version control systems again. One scenario I wanted to investigate was the following. Assuming a tree of files like this:


From this, we want to be able to checkout the following to a sandbox directory with a single checkout command:


If you're a CVS or svn user, go along with me for a moment and assume that this is a valid thing to want to do.

Unfortunately, it seems to be impossible.

What about CVS modules?

Well, I would summarize what modules can do by saying that upon checkout, the use of modules can collapse existing directory hierarchy, or add directory hierarchy that does not exist in the repository, or place directories in places where they don't exist to start with, but they fall short of allowing you to arbitrarily rearrange the tree, and they won't rearrange files. This is probably in part because of the ways in which metadata is stored on disk for CVS sandboxes: it goes into your directories, even though CVS does not treat directories in general as first-class citizens.

Here is what you can do with modules:

Alias modules will simply allow you to use a different name for an existing module or path. When you check out, all intermediate directories will be created. For example:

r5rv_alias_dirs -a r5rv/a/b r5rv/c/d

Will give you:


Regular modules let you label all or some of the files in a directory with a module name. For example, this will get only the files indicated and write them in a directory with the name of the module, generating no intermediate directories in the output:

r5rv_foo r5rv/a/b foo.cpp

This will give you:


Leaving off the filename will get you all the files in the specified subdirectory:

r5rv_b r5rv/a/b

giving you:


Note that for regular modules you can't put multiple directories on the same line, or CVS gets confused.

If you want to put specific files or the contents of specific directories together, you have to use ampersand modules. But: when CVS writes out ampersand modules, it addes the module names to the directory structure! So, given:

r5rv_bar r5rv/c/d/ bar.cpp
r5rv_foo_bar &r5rv_foo &r5rv_bar

Checking out r5rv_foo_bar will give you:


Not quite what we want.

You can exclude directories when defining an alias module. For example:

r5rv_no_c -a !r5rv/c r5rv

will give you only:


Note that if you already have r5rv checked out from the module r5rv_alias_dirs above, CVS will not remove the contents of c/, so it would be best to remove your local sandbox version of r5rv before checking out this new module.

So can we achieve an output dir like:


I don't think CVS can do this directly, but we come close, if we're willing to accept intermediate directories around our individual files. Using as a model the r5rv_foo_bar module above, we can change it to rename the working directory to something other than the module name:

r5rv_foo_bar_renamed -d r5rv_proj &r5rv_foo &r5rv_bar

We get:


What if we wanted to fix up this hierarchy with a post-checkout script? Well, assuming we can guarantee the script will run on the client, if we have a script that will move our files, we can specify that we want a script to run by defining a module like this:

r5rv_post -o combine.sh -d r5rv_proj &r5rv_foo &r5rv_bar

This gives us what we want, but there is one problem: we have not properly preserved the contents of the CVS metadata directories. This means that CVS no longer knows what directory foo.cpp and bar.cpp belong to, and will be generating all kinds of errors when, for example, doing a status check or update. There are some workarounds that can be put into place using .cvsignore files, and I suppose that if I was feeling ambitious I could write some scripts that would actually run sed on the metadata files in the CVS subdirectories, but it should be obvious that this solution would be ugly and brittle.

So what about Subversion?

In Subversion, when you commit your initial structure, you can do something like this:



Let's say you want to be able to mix-and-match these into your output. You can check them in to svn as separate projects. Easily done:

svn import ./a file:///cygdrive/e/repo-svn/r5rv-a -m "initial"
svn import ./b file:///cygdrive/e/repo-svn/r5rv-b -m "initial"

Now you want to create a directory that will contain both of them upon checkout:

mkdir r5rv-proj
svn import ./r5rv-proj file:///cygdrive/e/repo-svn/r5rv-proj -m "initial"

Then over in your workspace, you check out that empty directory: the last parameter is the name to give it in your workspace:

svn checkout file:///cygdrive/e/repo-svn/r5rv-proj r5rv-proj

Now you have a working copy of an empty directory (something that is impossible in CVS!) You want to add a svn:externals property to it. This consists of a set of svn URLs for things that should be retrieved here when the directory is checked out. You can do this with svn propset, but since you want one local directory name and one svn URL per line, you can't really do it directly on the command line. Instead you can do something like:

echo "r5rv-a file:///cygdrive/e/repo-svn/r5rv-a" >> props
echo "r5rv-c file:///cygdrive/e/repo-svn/r5rv-c" >> props

Check that it looks right:

$ cat props
r5rv-a file:///cygdrive/e/repo-svn/r5rv-a
r5rv-b file:///cygdrive/e/repo-svn/r5rv-b

then set the externals property using -F, which means read from a file, like so:

$ svn propset svn:externals -F props r5rv-proj/
property 'svn:externals' set on 'r5rv-proj'

You can confirm the property has been set:

$ svn propget svn:externals r5rv-proj
r5rv-a file:///cygdrive/e/repo-svn/r5rv-a
r5rv-b file:///cygdrive/e/repo-svn/r5rv-b

Clean up:

$ rm props

OK, now commit:

svn commit r5rv-proj/ -m "added external property"

And update:

svn update r5rv-proj/

You see something like this:

Fetching external item into 'r5rv-proj/r5rv-a'
A    r5rv-proj/r5rv-a/b
A    r5rv-proj/r5rv-a/b/foo.cpp
Updated external to revision 4.

Fetching external item into 'r5rv-proj/r5rv-c'
A    r5rv-proj/r5rv-c/d
A    r5rv-proj/r5rv-c/d/bar.cpp
Updated external to revision 4.

That's a lot of work! But the model is actually simpler than elaborate use of entries in the modules file. In addition, under svn, a non-privileged user can make this change (that is, a user with ordinary write access, rather than administrator access). And more importantly, these changes are all versioned; under CVS, if you want to use those module definitions, you have to make sure that the paths specified in the modules don't change, or the modules file will be out of sync with the repository structure. This is a general problem with CVS, which is not really designed to cope well with restructuring the repository.

Now your local project looks like this:


Which is pretty close to what we want, taking into account the fact that neither CVS nor svn seems to be able to write individual files from different repository locations to the same output directory in the working copy.

However, there are a couple of things to keep in mind about the svn solution. First, your external links have to be complete svn URLs. That is, they can't be relative URLS to directories in the same repository [see note 1].

Worse, it seems that since I'm using svn URLs that start with svn+ssh, the URL I put into the svn::externals property has to include my username! [see note 2]. This means that the external URL will only work when I do the update, not when my co-workers do it. What a mess.

It appears that svn ignores externals when you perform a commit. That is, it assumes you are using an external piece of code that you don't own, and don't want to (or don't have permission to) change. However, you can specify that the external reference should either remain stuck on a particular version, or should refer to the head; if you choose the latter, when you do an update, if the external tree have changed your working copy will be updated.

Anyway, there it is: externals are slightly more flexible than CVS, but not arbitrarily flexible, and with some critical usability issues. Neither of the tools really will let you assemble a checkout out of arbitrary directories and files.

I am going with svn and hoping not to look back, but the ability to define completely arbitrary modules would have been a nice fit to our build process.

From the point of view of programming languages, or even a DSL for version control, this would seem like a perfect example of failed Greenspunning. Why shouldn't the modules system in CVS let you specify arbitrary paths, or even use regular expression pattern matching, to specify the members of a retrievable module? Why shouldn't a tool like svn let you use a DSL to specify what actually happens in the checkout process, with primitives that correspond to the domain entities and proper handling of metadata?

Without this, it seems to me, there is still room in the version control space for a better tool. But I guess there is always room for a better tool; the question is "how flexible is flexible enough?"

NOTE 1: Apparently this subversion feature (relative external links) has been under discussion for a long time. See the bug report. It looks like it might appear in svn 1.5, but I'm not holding my breath.

NOTE 2: There seems to be a way to configure resource files so that command-line svn clients will let you use external svn+ssh URLs that don't specify a username. However, at my workplace we are all using the TortoiseSVN client, and I'm not aware of any way to make that client work with these URLs without specifying the username. Also, just as a matter of design, it seems like a bad idea to force your repository to contain hard-coded references to its own URL, since these will break if the entire repository is moved.

25 January 2006

Thinking about Sudoku and Scheme, Part 2

Here is how I implemented some of my array primitives. I'll start in the middle:

(define map-board-row-elts
  (array2D-make-row-elt-mapper board-x-dim))

When treating a vector like a two-dimensional array, assuming that you store them row-first, the key piece of information you need is the size of the row. (More generally, for an n-dimensional array, to access an element you need to know the size of dimensions 1 to n - 1). On the top side of my program, the dimensions of my Sudoku board are known. The array primitives don't know. Rather than pass this information to every array access call, I create a function. The function array2D-make-row-elt-mapper captures this information using closure and returns another function:

(define array2D-make-row-elt-mapper
  (lambda (x-dim)
    (lambda (array2D y-idx elt-handler)
       array2D y-idx x-dim elt-handler))))

Note the nested lambda expressions. The top one receives the x-dim parameter, and the body of this function is another lambda expression. This second lambda expression is the return value of this function: it returns a function. This returned function invokes another function which requires an x-dim parameter.

In this case, the purpose of returning this function is to provide the functionality of another function, array2D-map-row-elts, but without the need to provide an x-dim parameter. The new function is a curried function.

What is a curried function? Currying means taking a function with n parameters and creating a new function with n - 1 parameters. The new function provides a default value for the missing parameter, giving you a simplified API to the original function.

Common Lisp and other languages provide curry and rcurry functions which peel off parameters from the left or right of the paramter list, respectively. So our currying may not meet a strict definition, although we are doing the same thing: providing an interface function that will supply a value for one of the original parameters. Currying often uses a constant for these default parameters; in our case, we're using a variable, but the idea is similar. Scheme does not have a standard curry function, but it is easy enough to express currying by using nested lambda expressions.

In my case, x-dim is the curried parameter. Inside this lambda expression is a reference to x-dim. Scheme supports lexical closure; at the time this lambda expression is evaluated, the result is a function object that "closes over" or "captures" -- that is, retains a reference to -- the variable binding in effect when the lambda expression is evaluated.

This is the variable binding owned by the caller, the variable bound to the name board-x-dim. It is not a copy: if some other piece of code updates the value in that variable, this function would see a different value for x-dim when it executes. This creates complex state with multiple possible "listeners" for the state change: this is one of the reasons that Scheme programmers tend to prefer a functional style where variables are not altered after initial binding.

If you are a C programmer, you may find this very confusing. You can think of it as if the returned function retains a pointer to the board-x-dim sent by the caller. The C runtime model is quite different, though: it has distinct lifetimes for variables; local non-static ("automatic") variables that exist on the stack may be recycled when a given function returns, so it is unsafe to retain pointers to them (in fact, retaining a pointer to an automatic variable is a common beginner's bug in C).

In Scheme, bindings exist as long as there are references to them; when there are no more references to them, they may be garbage-collected (or not).

Let's look at the definition of the next function we call, array2D-map-row-elts:

((define array2D-map-row-elts
  (lambda (array2D y-idx x-dim elt-handler)
     0 (- x-dim 1)

OK, we call yet another function: array2D-map-subrow-elts, which is more general (it will map any contiguous subset of the array elements). It is the general case, and array2D-map-row-elts is the more specific case (it maps the whole row), which means we can quite naturally define this specific case by parameterizing the general case. It isn't quite the same as currying, but it is a similar idea; we provide some fixed parameters representing the start of the row (index 0) and end (index x-dim - 1). Here is the general function:

(define array2D-map-subrow-elts
  (lambda (array2D
           x-first-idx x-last-idx
    (letrec ((iter-x
              (lambda (x-idx y-idx)
                 (array2D-get array2D x-idx y-idx x-dim))
                (if (< x-idx x-last-idx)
                    (iter-x (+ x-idx 1) y-idx)))))
      (iter-x x-first-idx y-idx))))

Are we there yet?

This function receives all the parameters and walks through the array, starting with an arbitrary x-idx (in this case, zero), and proceeding until we have processed the last legal x index (in this case, x-idx - 1).

Note that instead of iteration to walk the array, I use recursion. I won't go into a full explanation of recursion here; suffice it to say that although many people feel that recursion is unnatural and does not reflect how they think about iteration in the "real world," after expressing algorithms recursively for a while, it starts to seem much more natural, and the proposition that recursion and iteration can be transformed into each other does not seem so strange.

This function is formulated using tail recursion, where the bindings are not referenced after the recursive function call. In C or Pascal, the compilers are generally not smart about this sort of thing, and recursing on a large array could result in a very deep set of stack frames. A language like Scheme is able to recognize that since the values that exist in previous function calls are not used, it is free to lose track of the previous values, so no deep stack is actually required. Since this is the case, the decision to use recursion was largely arbitrary. I chose it because I wanted to practice writing recursive functions.

The letrec may be unfamiliar. If the innermost function was not recursive, we could just call it like this, without ever binding the function to a name at all:

((lambda (x-idx y-idx)
   ... )
0 y-idx)

In this case the innermost function is never bound to a name, because nothing has to refer to it by name. The expression around it would simply call it. This is also known as applying parameters to the anonymous function object.

You might think that we could just do this with let. The semantics of let allow the call from outside the function, but the binding to the name iter-x would exist only within the body of the let expression. This body consists of the expression or expressions that occur after the binding list in the let expression. Since the definition of the function occurs as part of the binding list, not part of the body of the let expression, let will not work; letrec is designed specifically to allow definition of recursive or mutually recursive functions within the binding list.

So now are we there yet?

Almost. My final comment on this function concerns the use of the elt-handler parameter. The generated function allows the caller to pass in a handler function. Applying functions to functions is known as higher-order programming; support for higher-order programming is one of the essential features of Lisp and Scheme.

It might seem like this is much too complicated. Functions generating functions calling functions calling functions? Well, yes, this is a bit of overkill to access an array; normally we would use some standard functions for handling data structures. But the general style: creating a lot of small functions -- appears very commonly in Scheme programs, and now you have had a taste of how a typical data structure library might be implemented.

I don't claim to be very experienced with Scheme yet, but I am happy to have gotten to the point where writing functions like array2D-make-row-elt-mapper and its supporting functions comes rapidly and naturally. Reading examples helps, but the lessons don't really click until you tackle a programming problem, and feel your mind begin to undergo rewiring. The level of abstraction, and thus the leverage you can gain from Scheme, become much higher than this, so stay tuned.

Some Computer Science Classics

I've been building up a computer science library via ABE Books (http://abebooks.com), highly recommended), purchasing a couple of used books and a couple of new books each month.

Recently arrived used:

  • Hoare, Communicating Sequential Processes
  • Dijkstra, A Discipline of Programming
  • Agha, Actors

Recently arrived new:

  • Brodie, Thinking Forth

Some comments on these texts later.

Thinking about Sudoku and Scheme, Part 1

I've been working on some tools to help me solve Sudoku puzzles in Scheme. That goal is actually secondary to the goal of learning to write better, more idiomatic Scheme programs, and so progress towards the Sudoku solver is frequently interrupted while I backtrack in an attempt to reorganize and optimize the supporting code.

I'm working on this both in a bottom-up and top-down direction. Bottom up, to develop something of a domain-specific Sudoku-solving language. Top down, because my higher-level strategy is also not that clear at the outset and is undergoing stepwise refinement.

My approach to solving Sudoku by computer is not to recursively explore every the tree of moves until a solution is found. That approach is easily doable but less interesting to me than trying to emulate some of the solving strategies that can be performed by a human solver. Ultimately I would like the tool to assist a human solver, not replace her.

My tool of choice is PLT Scheme version 3.00. For reasons to painful to go into just now, I have currently given up on using either SRFIs or SLIB for libraries and am writing everything in plain RSR5 Scheme. More on that later.


This blog, "Curse and Recurse" is a spinoff of my weblogs "Geek Like Me" and "Geek Like Me, Too."



The topic of this blog is pragmatic programming and programming languages. I am interested in clearer and more effective ways to express programs and tools that will get to bug-free code faster, Sapir-Worf be damned. I don't have a Ph.D. and this is not an exposition about theoretical type systems.

The languages I am currently most interested in are Scheme, and Ruby. I have long-term goals to learn idiomatic programming in Common Lisp and to find better approaches to embedded systems programming by using embedded languages such as Lua and Io. This is not currently another Ruby on Rails weblog and I am not doing web development these days, although I have done so in the distant past.

If you are still reading, welcome! Maybe we can both learn something.