16 June 2013

Objective-Dylan, or Perhaps Subjective-C?

Yesterday my wife took the kids with her on an overnight trip to Ann Arbor so I've had a bit of extra quiet time. How am I making use of this bounty? Getting on with some minor home repairs? Cleaning my office from top to bottom? Er, no... porting the game logic I've written so far in Objective-C back to Dylan, so that I can do some more thinking about it.

So after a phone job interview yesterday (which went well, I thought -- I'm optimistic about this possibility!) I started working on this task, and then about twelve hours later, around 2 a.m., I had the basic setup and population of the game board working. It's embarrassing to admit how long it took. I started on my Mac, and when I began encountering constant runtime errors switched over to my Ubuntu box, thinking that the Mac version of Open Dylan might just be broken (it isn't; I got the identical behavior on the Linux build). I finally figured out workarounds -- it's funny how taking a break clears my head far better than pressing on ever does -- then read a little Gene Wolfe (I'm working my way through In Green's Jungles, one of his books I've repeatedly tried and failed to finish), and fell asleep with no children in the bed to kick or otherwise interfere with a good night's sleep. I'm back up this morning, had a bath, and I'm drinking a large coffee with soy creamer and stevia and trying to hold off on a lunch break until I have some more done. It's about 10 a.m. and I'm expecting my family back in about six hours, so the race is on!

This has taken far longer than I hoped; I lost quite a bit of time stumbling across things in Open Dylan that still seem just plain broken. I had to start working on a smaller and smaller program to figure out exactly what was broken. These things I've flagged in comments, as places where, basically, I wish Dylan worked a certain way, and it doesn't. I may just be asking for something that doesn't quite match the original spec or isn't quite possible, but I'll share those with the Dylan Hackers team and see if it seems like I can help with them. The biggest thing that was broken, though, was me -- my brain, that is -- since it's been a long time since I've worked with Dylan's type-union and singleton pseudo-classes and I had forgotten the details. The compiler was not a big help with this, since it is such a dynamic language and leaves an awful lot of things to the runtime to figure out, which it does by throwing an error message that may or may not help much. The documentation is a bit scanty, but it does contain everything you need to know, if you re-read and squint at the scanty examples that are out there hard enough.

The good news is that the port is working and I'd like to share it. Dylan is still up there with Scheme (and now Haskell) as one of my favorite languages for designing programs -- yes, even though Dylan is quite old as languages go. I like to see what it can do especially with generic functions and its sophisticated model for object-oriented dispatch. I've been a little stymied as to how to express the design best in Objective-C. If it was a complicated game design, I wouldn't feel bad about having a program that looked complex. But it's really an elegantly simple game, and so I feel like the implementation should reflect that. My Objective-C implementation has been feeling more and more bloated and pointlessly complex, although it works, so my thought was to get it down to a simple implementation that takes full advantage of Dylan's object-oriented programming features, largely borrowed from CLOS, and then port that back to Objective-C, adding whatever minimalist support is needed to fake up some of the features that Dylan gives me that Objective-C doesn't have. This might be by way of also writing a Haskell or Scala implementation later, for yet more learning and language comparison, although really what I should focus on is getting the iOS GUI up and working so that I have something to show people.

Anyway, I've got a Dylan program that plays the Polar game, using singletons to represent tile types, and methods dispatched on singletons to handle specific kinds of collisions. The classes -- which are empty, pretty much used only for their usefulness as types, for driving dispatch -- are like so:

In Dylan you can create some instances, and create something called a type-union, which is something that is a type, I think, but not a class. You can use it to define a slot type or a parameter type. But you can't make one:

define constant $the-bomb = make();
define constant $the-empty = make();
define constant $the-heart = make();
define constant $the-house = make();
define constant $the-ice-block = make();
define constant $the-mountain = make();
define constant $the-tree = make();

define constant  = type-union(
    singleton( $the-bomb ), singleton( $the-empty ),
    singleton( $the-heart ), singleton( $the-house ),
    singleton( $the-ice-block ), singleton( $the-mountain ),
    singleton( $the-tree ), singleton( #f ) );

And eventually dispatch on singletons -- meaning that a given method will be called with it is called with references to the exact objects that you specify:

define method collide( model :: <model>, dir :: <dir>,
    heart-pos :: <pos>, heart-tile == $the-heart,
    house-pos :: <pos>, house-tile == $the-house )
    format-out( "collide: $the-heart / $the-house\n" );
    setTileAtPos( model, heart-pos, $the-empty );
    model.decrementHeartCount();
end;

That gives you an idea of how some of the code in the Dylan program is organized. I have it mostly working, however, I'm not going to present the full code quite yet because I have a crashing bug, and I haven't yet been able to figure out if it is a dumb mistake on my part or a compiler or runtime bug in Open Dylan. I've also asked the Dylan hackers to take a look at my design and see what they think -- if they can find, as I put it, "a simpler design struggling to get out." Which is always the challenge, when trying to write not just functional, but model code, isn't it?

07 June 2013

Objective-C, Day 5

(Warning: non-FP content). (I've been including this for the benefit of any "Planet X" aggregators who are including my feed, where X is a functional language like Haskell. I'm assuming you've got that by now and either don't care or are unsubscribing/ignoring this series if it bothers you. I expect to get back to more functional language code at some point, maybe even implementing the same problem. But I dabble in different languages and sometimes the digressions go on for a while...)

So, today's topic is dispatch. I'm starting to design and implement the logic for pieces interacting on the board. The key in object-oriented designs is always to determine who (which object) manages which state. The design I've come up with is a sort of hybrid design, where position on the board is not actually a property of the board tiles. Instead, there's a board which keeps track of them. The instances of, say, a tree on the board don't have any unique state, so I'm not instantiating different objects for each one; they all point to a singleton (which I should properly enforce with a singleton factory method at some point).

There are trade-offs. On paper, my design involved adding a "push" method to the classes for tile pieces. In a language like Dylan, this "push" method would be a generic function, and dispatch on multiple parameter types, so that I could write some very short methods and use the method dispatcher, instead of explicit if-then or switch logic to find the right bit of code at run-time according to the types of the interacting objects (either their literal types or an enum, or some such). I could even do this in C++ because it has method overloading based on the parameters -- as long as this is based on their static type known at compile-time. Which... isn't true in this case. I miss Dylan's generic function dispatch! Objective-C is a underpowered in this respect even compared to C++. For example, I'd like to be able to write methods like this for the bomb class (this is pseudocode):

bomb::push(mountain)
{
    // blow up the mountain
}

bomb::push(empty)
{
    // slide the bomb onto the tile
}

But I can't. I can't just use the class construct to organize methods to call without an instance -- there doesn't seem to be the equivalent of C++ static methods. There also is no equivalent of a pure virtual function; I can't declare the need for a push() method in the common base class of the tile pieces and have the compiler demand that I implement in any subclasses to make them instantiable. The closest I can come, I think, is to create a method that generates an error if it itself is called instead of being overridden in a subclass. That seems to lack semantic clarity. So maybe I could do this, with double dispatch, but it doesn't seem worth the trouble for a small number of collision behaviors, when the behavior isn't simply supported by the language. I keep telling myself "thin layer on top of C... thin layer on top of C..."

So last night I had my Mac running upstairs in the office, and I used my iPad downstairs to connect to it via VNC, and write some code, running it on the iPad simulator, which I then viewed and controlled with a real iPad (mad scientist laugh). It needs a little tweaking this morning but here's more-or-less what I came up with. Note that I have started using some different naming conventions for file-scope and local variables. I'm not sure they are very standard Objective-C but they are closer to what I've grown comfortable with over the years in C and C++.

typedef struct {
    int x_idx;
    int y_idx;
} pos_t;

typedef enum {
    dir_east,
    dir_west,
    dir_south,
    dir_north
} dir_e;

// A straight C function to return a pos_t updated with a dir_e;
// the result may be out of bounds
pos_t getUpdatedPos( pos_t original_pos, dir_e dir );
BOOL posValid( pos_t pos );

static const int board_width = 24, board_height = 4;
// The short board design is part of what makes it so
// easy to get sliding objects stuck against the edges
// of the world or in corners where you can't get the
// penguin avatar around to the other side to push them.
// We could consider a bigger board later and maybe
// implement the original puzzles surrounded by water,
// or something like that.

// Equivalent of C++ class forward declaration
@class ArcticSlideTile;
@class ArcticSlideBomb;
@class ArcticSlideEmpty;
@class ArcticSlideHeart;
@class ArcticSlideHouse;
@class ArcticSlideIceBlock;
@class ArcticSlideMountain;
@class ArcticSlideTree;

@interface ArcticSlideModel : NSObject
{
    ArcticSlideTile* board[board_height][board_width];
}

- (id)init;
- (id)initWithLevelIndex:(int)level_idx;
- (NSString*) description;
- (ArcticSlideTile*)getTileFromPosition:(pos_t)pos
                            inDirection:(dir_e)dir;
- (void)setTileAtPosition:(pos_t)pos
                       to:(ArcticSlideTile*)type;

@end

@interface ArcticSlideTile : NSObject

- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;

@end


@interface ArcticSlideBomb : ArcticSlideTile
// Bombs can be pushed and will slide until they hit an
// object and stop. If the object is a mountain, both bomb
// and mountain are destroyed. If another object hits a bomb
// it stops (I think -- I'm not sure it is possible to set 
// up a board such that you can slide something into a bomb).

// push is called when the penguin pushes against a tile.
// It returns YES if the penguin can move onto the tile with
// this action. This is only ever the case for a tree or empty
// tile.
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

@interface ArcticSlideEmpty : ArcticSlideTile
// The penguin can always step onto an empty tile
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (BOOL)slideFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

@interface ArcticSlideHeart : ArcticSlideTile
// When a heart hits a house the heart disappears (getting
// all the hearts into the house is how you win the game).
// Otherwise they cannot be destroyed, and slide like other
// slidable items.
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

@interface ArcticSlideHouse : ArcticSlideTile
// Houses cannot be pushed and stop other objects except
// hearts. When a heart hits a house the heart disappears
// (getting the hearts into the house is how you win the game).
// So the model should keep track of the number of hearts
// on the board and trigger a "win the level" behavior when
// the last heart is removed.
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

@interface ArcticSlideIceBlock : ArcticSlideTile
// Ice blocks can be pushed and will slide until they hit
// an object and stop. If they are pushed directly against
// an object they will be crushed (there should be an animation)
// and disappear.
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

@interface ArcticSlideMountain : ArcticSlideTile
// Mountains cannot be moved and are destroyed by bombs.
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

@interface ArcticSlideTree : ArcticSlideTile
// Trees cannot be pushed or destroyed and stop all sliding
// objects, but the penguin avatar character can walk through
// them.
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir;
- (NSString*) description;
@end

Here's part of the implementation. It seems way too wordy; I need to rethink the amount of code required for each step. At the least, some refactoring seems to be in order. As I mentioned earlier, I'm not sure the tile classes are really earning their keep. I got rid of the singleton instantiation machinery but now there are order of initialization dependencies. I'll need to do further thinking as I consider what communication needs to happen between the "model" part and "controller" part -- how to interact with the GUI, how to indicate that tiles need to be redrawn, or animated transitions should play, or sound effects should play, or that the score is changed, and what to do when a level is completed. There is lots more to think about for such a simple little game! And of course I haven't really even begun to implement the "view" parts.

static ArcticSlideEmpty* empty_p;
static ArcticSlideTree* tree_p;
static ArcticSlideMountain* mountain_p;
static ArcticSlideHouse* house_p;
static ArcticSlideIceBlock* ice_block_p;
static ArcticSlideHeart* heart_p;
static ArcticSlideBomb* bomb_p;

static ArcticSlideModel* model_p;

pos_t getUpdatedPos( pos_t original_pos, dir_e dir )
{
    pos_t updated_pos = original_pos;
    int x_offset = 0;
    int y_offset = 0;
    if ( dir_east == dir )
    {
        x_offset = 1;
        y_offset = 0;
    }
    else if ( dir_west == dir )
    {
        x_offset = -1;
        y_offset = 0;
    }
    else if ( dir_north == dir )
    {
        x_offset = 0;
        y_offset = -1;
    }
    else if ( dir_south == dir )
    {
        x_offset = 0;
        y_offset = +1;
    }
    updated_pos.x_idx += x_offset;;
    updated_pos.y_idx += y_offset;
    return updated_pos;
}

BOOL posValid( pos_t pos )
{
    return ( ( ( pos.x_idx >= 0 ) ||
               ( pos.x_idx < board_width  ) ) ||
             ( ( pos.y_idx >= 0 ) ||
               ( pos.y_idx < board_height ) ) );
}

@implementation ArcticSlideTile

- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir
{
    // Should be implemented in subclass
    return NO;
}

@end

@implementation ArcticSlideBomb

- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir
{
    // Penguin has pushed bomb in the given direction.
    // Get our own position:
    pos_t bomb_pos = getUpdatedPos( pos, dir );
    // What are we being pushed into?
    ArcticSlideTile *target_tile_p =
    [model_p getTileFromPosition:bomb_pos
                     inDirection:dir];
    
    if ( nil == target_tile_p )
    {
        // Edge of the world. TODO:
        // queue a "boop" sound effect
    }
    else if ( mountain_p == target_tile_p )
    {
        // bomb pushed into mountain
        // TODO: queue animation of bomb moving onto
        // mountain, animate explosion
        // remove bomb and mountain
        pos_t new_bomb_pos = getUpdatedPos( bomb_pos, dir );
        [model_p setTileAtPosition:new_bomb_pos
                                to:empty_p];
        new_bomb_pos = getUpdatedPos( new_bomb_pos, dir );
        [model_p setTileAtPosition:new_bomb_pos
                                to:empty_p];
    }
    else if ( empty_p == target_tile_p )
    {
        // TODO: queue bomb moving into space
        pos_t new_bomb_pos = getUpdatedPos( bomb_pos, dir );
        // Set bomb at new position
        [model_p setTileAtPosition:new_bomb_pos
                                to:bomb_p];
        // Remove bomb from old position
        [model_p setTileAtPosition:bomb_pos
                                to:empty_p];

        // Bombs will continue to slide until stopped
        ArcticSlideTile *target_tile_p =
        [model_p getTileFromPosition:new_bomb_pos
                         inDirection:dir];

        while ( empty_p == target_tile_p )
        {
            // TODO: animate bomb moving into space
            pos_t new_bomb_pos = getUpdatedPos( bomb_pos, dir );
            // set bomb at new position
            [model_p setTileAtPosition:new_bomb_pos
                                    to:bomb_p];
            // remove bomb from old position
            [model_p setTileAtPosition:bomb_pos
                                    to:empty_p];
        }

        if ( mountain_p == target_tile_p )
        {
            // bomb pushed into mountain
            // TODO: queue animation of bomb moving
            // onto mountain, animate explosion
            // remove bomb and mountain
            [model_p setTileAtPosition:new_bomb_pos
                                    to:empty_p];
            new_bomb_pos = getUpdatedPos( new_bomb_pos, dir );
            [model_p setTileAtPosition:new_bomb_pos
                                    to:empty_p];
        }
    }
    // The penguin cannot actually move in this turn
    return NO;
}

- (NSString*) description
{
    return @"Bomb  ";
}

@end

@implementation ArcticSlideEmpty
- (BOOL)pushFromPosition:(pos_t)pos inDirection:(dir_e)dir
{
    // If the penguin pushes onto an empty tile, he can always
    // move there
    return YES;
}

- (NSString*) description
{
    return @"      ";
}
@end

I'm leaving out unfinished tile classes for clarity, but here is the model implementation:

@implementation ArcticSlideModel

- (id)init
{
    // Initialize the global tile objects. I messed around
    // with singleton factory methods for creating a single
    // instance of each of these and accessing it everywhere
    // but the resulting code was too wordy to justify this.

    empty_p = [[ArcticSlideEmpty alloc] init];
    tree_p = [[ArcticSlideTree alloc] init];
    mountain_p = [[ArcticSlideMountain alloc] init];
    house_p = [[ArcticSlideHouse alloc] init];
    ice_block_p = [[ArcticSlideIceBlock alloc] init];
    heart_p = [[ArcticSlideHeart alloc] init];
    bomb_p = [[ArcticSlideBomb alloc] init];

    self = [super init];

    for ( unsigned int idx_y = 0;
         idx_y < board_height; idx_y++ )
    {
        for ( unsigned int idx_x = 0;
             idx_x < board_width; idx_x++ )
        {
            board[idx_y][idx_x] = empty_p;
        }
    }

    return self;
}

- (id)initWithLevelIndex:(int)level_idx
{
    self = [self init];

    // Lookup table to decode the original Polar resource
    // data as strings
    ArcticSlideTile *
        polar_data_tile_map[POLAR_DATA_NUM_TILE_VALS] =
    {
        empty_p, tree_p, mountain_p, house_p, ice_block_p,
        heart_p, bomb_p
    };

    if ( level_idx > ( num_polar_levels - 1) )
    {
        NSLog(@"initWithLevelIndex: bad level_idx %d!\n",
              level_idx);
    }
    else
    {
        unsigned int level_data_idx = 0;
        for ( unsigned int idx_y = 0;
             idx_y < board_height; idx_y++ )
        {
            for ( unsigned int idx_x = 0;
                 idx_x < board_width; idx_x++ )
            {
                int polar_data_tile_val =
                    polar_levels[level_idx]
                                [level_data_idx] - '0';
                if ( ( polar_data_tile_val < 0 ) ||
                     ( polar_data_tile_val > 
                       polar_data_max_tile_val ) )
                {
                    NSLog(@"tile value %d out of range!\n",
                          polar_data_tile_val );
                    self = nil;
                }
                else
                {
                    board[idx_y][idx_x] =
                        polar_data_tile_map[polar_data_tile_val];
                    level_data_idx++;
                }
            }
        }
    }

    return self;

}

- (ArcticSlideTile*)getTileFromPosition:(pos_t)pos 
                            inDirection:(dir_e)dir
{
    pos_t updated_pos = getUpdatedPos(pos, dir);
    if ( posValid( updated_pos ) )
    {
        return board[updated_pos.y_idx]
                    [updated_pos.x_idx];
    }
    else
    {
        return nil;
    }
}

- (NSString*)description
{
    NSMutableString *desc_str =[[NSMutableString alloc]init];
    
    [desc_str appendString:@"ArcticSlideModel board state:\n"];
    for ( unsigned int idx_y = 0;
         idx_y < board_height; idx_y++ )
    {
        for ( unsigned int idx_x = 0;
             idx_x < board_width; idx_x++ )
        {
            [desc_str appendString:[board[idx_y][idx_x] 
                                    description]];
        }
        [desc_str appendString:@"\n"];
    }
    return desc_str;
}

- (void)setTileAtPosition:(pos_t)pos to:(ArcticSlideTile*)type
{
    board[pos.y_idx][pos.x_idx] = type;
}

@end

It's not much yet, and it doesn't have any kind of user interface outside of NSLog, but my code will successfully respond to moving the penguin through trees, through open space, and pushing a bomb, which then moves into an open space, continues to slide until it comes up against a mountain, and destroys the mountain. I'm driving this with a test method like this:

NSLog(@"%@\n", model_p);
// Penguin starts at 0,0, on a tree tile
pos_t penguin_pos = { 0, 0 };

// Walk the penguin south onto another tree tile
ArcticSlideTile* tile_p =
[model_p getTileFromPosition:penguin_pos 
                 inDirection:dir_south];
NSLog(@"Penguin is facing: %@\n", tile_p);
BOOL allowed = [tile_p pushFromPosition:penguin_pos
                            inDirection:dir_south];
NSLog(@"Penguin allowed: %s\n", ( allowed ? "YES" : "NO" ) );
tile_p = [model_p getTileFromPosition:penguin_pos
                              inDirection:dir_south];
penguin_pos = getUpdatedPos(penguin_pos, dir_south);
NSLog(@"Penguin is facing: %@\n", tile_p);
    
// Walk the penguin east onto an empty space
tile_p = [model_p getTileFromPosition:penguin_pos
                              inDirection:dir_east];
NSLog(@"Penguin is facing: %@\n", tile_p);
allowed = [tile_p pushFromPosition:penguin_pos
                       inDirection:dir_east];
NSLog(@"Penguin allowed: %s\n", ( allowed ? "YES" : "NO" ) );
tile_p = [model_p getTileFromPosition:penguin_pos
                          inDirection:dir_east];
penguin_pos = getUpdatedPos(penguin_pos, dir_east);

// Try walking into a bomb, which should slide
// and blow up a mountain
tile_p = [model_p getTileFromPosition:penguin_pos
                          inDirection:dir_east];
NSLog(@"Penguin is facing: %@\n", tile_p);
allowed = [tile_p pushFromPosition:penguin_pos
                           inDirection:dir_east];
NSLog(@"Penguin allowed: %s\n", ( allowed ? "YES" : "NO" ) );

NSLog(@"%@\n", model_p);

I'll think on this whole model of updates some more. Maybe it can be even simpler. And I have to consider how the penguin state will be managed, including its orientation (in the original, the penguin can face in the cardinal directions). Should I preserve that in a touch-driven game user interface?

06 June 2013

Objective-C, Day 4

(Warning: more non-FP content)

It's time to get the board representation filled out with a real board. I've only got a couple of hours left before I have to pack up my computer to leave my Undisclosed Location, but let's see if I can get a little more done. Let's review what level 1 looks like:

Level layouts are taken from the original Macintosh Polar game created by Go Endo. These were originally MacOS resources of type 'STGE.' Let's see if we can decode them. Using ResEdit, the raw data for 'STGE' resource ID -16000 looks like:

0x0000 0x0000 0x0003 0x0001
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0001 0x0000
0x0000 0x0000 0x0000 0x0000
0x0004 0x0000 0x0000 0x0001
0x0000 0x0006 0x0000 0x0002
0x0000 0x0005 0x0004 0x0005
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0001 0x0000 0x0000
0x0001 0x0000 0x0000 0x0001
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0005
0x0000 0x0000 0x0000 0x0002
0x0003 0x0000 0x0000 0x0001
0x0001 0x0000 0x0000 0x0000
0x0000 0x0001 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000 0x0000
0x0000 0x0000 0x0000

There are 99 16-bit values. My best guess is that this corresponds to the 24x4 grid (96 board positions) plus 3 extras for some kind of of header of footer data (maybe the total number of hearts is indicated, for example). There are 7 unique values, so it seems extremely likely that they correspond almost exactly to our eight different tile types, with zero representing a blank space. But the counts of each type don't _quite_ match up. The first board has 8 trees, 1 bomb, 2 hearts, 2 ice blocks, 2 mountains, 3 hearts, 1 house, and 1 penguin (there is always 1 penguin), while this 'STGE' resource has: 9 ones, 2 twos, 2 threes, 2 fours, 3 fives, and 1 six. The counts are very close, so this has to represent level 1, and the the 5 almost certainly represents a heart, but I'm not clearly seeing the layout. The first vertical column goes penguin, tree, tree, tree. I don't quite see a pattern that looks like that, but resources -1599 and -15996 give me a hint that the "extra" data is at the front: they contain 0x0007 and 0x0008 as their third values. Those don't appear anywhere else so they probably don't indicate tiles. So let's try rearranging resource -16000 without the first 6 bytes, remove redundant zeroes for clarity, and looking at the values aligned by groups of 24 instead of 4:

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 4 0 0
1 0 6 0 2 0 5 4 5 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 2 3 0 0
1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

There's the board. The left column is actually all trees -- when the board first appears, the penguin is hiding a tree. There are actually nine trees. So the encoding looks like so: empty space = 0, tree = 1, mountain = 2, home = 3, ice block = 4, heart = 5, and bomb = 6. The penguin doesn't have a value, but his starting position is probably represented by the first two values, (0,0), most likely encoded as row index, column index to correspond to row-major indexing, and there are 3 hearts to count down as the game is solved.

Can we validate this with the second board? Yes, it looks like there's a 4 indicating 4 hearts. In all the boards I've seen so far (the first four), the penguin is in the upper left. The fifth resource has 0, 1 for its first two values, so I'm guessing I can confirm the encoding of the penguin's position if and when I get to that stage.

And now to come up with a quick way to populate the board in my code. As I'm still a complete n00b in Objective-C, and on the general principle that there's no real need to make this any more clever or less obvious than it has to be, let's just use a string with the data from the resource. Let's add an init method to the interface for the model class:

- (id)initWithLevelIndex:(int)level_idx;

And here is some data, and an initializer method:

static const int num_polar_levels = 1;
static const int polar_data_len = 96;
static const int polar_data_num_tile_vals = 7; // 0-6 inclusive
static const int polar_data_max_tile_val = 6;
static const NSString *polar_levels[num_polar_levels] =
{
    @"100000000000000100000400" \
    @"106020545000000000100100" \
    @"100000000000000050002300" \
    @"110000100000000000000000"
};

- (id)initWithLevelIndex:(int)level_idx
{
    // Call our own basic initializer. This will 
    // result in redundant setting of board values,
    // but maybe I will clean that up later.
    self = [self init];

    // A simple lookup table to decode the original
    // Polar resource data as strings
    ArcticSlideTile 
        *polar_data_tile_map[polar_data_num_tile_vals] = {
        empty, tree, mountain, house,
        ice_block, heart, bomb };

    if ( level_idx > ( num_polar_levels - 1) )
    {
        NSLog( @"initWithLevelLayout: level %d out of range!\n",
               level_idx );
        self = nil;
    }
    else
    {
        const NSString* level_str = polar_levels[level_idx];
        unsigned int level_data_idx = 0;
        for ( unsigned int idx_y = 0;
             idx_y < board_height; idx_y++ )
        {
            for ( unsigned int idx_x = 0;
                 idx_x < board_width; idx_x++ )
            {
                NSRange range = NSMakeRange(level_data_idx, 1);
                const NSString * item_str =
                    [level_str substringWithRange: range];
                int polar_data_tile_val = [item_str intValue];
                if ( polar_data_tile_val >
                    polar_data_max_tile_val )
                {
                    NSLog(@"tile val %d out of range!\n",
                        polar_data_tile_val );
                    self = nil;
                }
                else
                {
                    board[idx_y][idx_x] =
                        polar_data_tile_map[polar_data_tile_val];
                    level_data_idx++;
                }
            }
        }
    }
    return self;
}

Hmmm, that seems overly complicated. There probably is a better, more idiomatic Objective-C way to use NSStrings for that, but I'm tempted to just write it with a basic C string. Using NSString objects in this context didn't even really help me catch bugs or avoid crashes, since I had forgotten to initialize a heart object and the result was hitting a nil object pointer at runtime and crashing, pretty much the same result as dereferencing a null pointer in straight C except with a better error logged. I'm a little disconcerted by Objective-C's inability to allocate objects on the stack, but that comes down to the aforementioned "thin veneer" over C. I don't really care about the overhead in using method dispatch in this simple piece of code operating on such a small amount of data, but I do care about simplicity. Just to compare, here's a more straightforward C implementation of that inner loop:

static const char *polar_levels[num_polar_levels] =
{
    "100000000000000100000400"
    "106020545000000000100100"
    "100000000000000050002300"
    "110000100000000000000000"
};

int polar_data_tile_val = level_str_p[level_data_idx] - '0';
if ( ( polar_data_tile_val < 0 ) || 
     ( polar_data_tile_val > 
       polar_data_max_tile_val ) )
{
    NSLog(@"polar data tile value %d out of range!\n",
          polar_data_tile_val );
    self = nil
}

Maybe the lesson here is "use Objective-C objects only when objects are a win."

I'm going to continue with this project, developing the code to handle the interactions of objects on the playing field and eventually the user interface. However, now that my week away from home is done, I might not be able to make progress very quickly. Stay tuned -- I'll post what I can, when I can. As always, if you have any comments or questions, I'm happy to have feedback.

05 June 2013

Objective-C, Day 3

(Warning: non-FP content)

So, this is day 5 in my Undisclosed Location and I haven't gotten much done -- I'm still engaged in a job search, and spent about eight hours working on a "take-home test" for an employer (with a few interruptions), and I'm pursuing more leads, and I'm trying to socialize with my hosts occasionally, so there are some distractions. But I've got enough information to go on to start implementing something original.

What I want to implement is a small game. I'll get as far as I can in the back-end code today (the Model and Controller parts of the MVC "trinity") and we'll see just how productive I really am in Objective-C. I read most of Objective-C Programming by Aaron Hillegass last night -- it's a very quick read for someone well-versed in C, and it reiterates parts of the iOS Programming book. I haven't covered protocols, categories, blocks, or run loops, but I think I can get by without those things for now.

Many years ago there existed on old-school MacOS a small game called "Polar." It was a very simple game, written by a guy (Go Endo) who was probably a student at the time, but I was fond of it -- fond enough to save it for 23 years, with the intention of studying its design and re-implementing it in the future. (In fact, I've saved a lot of old bits and bobs like this). I made notes of how to beat the first 3 levels (it was one of those "incredibly simple game play, but maddeningly tricky" games), drew out the levels, made notes on how the objects behaved, etc. I haven't been able to run that game for a long time, but today I just got it working under SheepShaver. Here's what level 1 looks like (blown up a bit):

The penguin is your avatar. The rest of the objects are ice blocks, trees, hearts, bombs, mountains, and houses. The world is a sheet of ice. You can walk around on the ice. You can walk through trees. Some objects (trees, mountains, and houses) can't be moved. Bombs, hearts, and ice blocks move without friction -- if you push them, they will keep going until they hit the edge of the world or another object. If an sliding ice block hits another object, it stops. If you push it against another object, it crumbles and disappears. The penguin can walk through trees, but other objects can't. Bombs will blow up mountains when they slide into them. Everything else simply stops them. The goal of the game is to slide all the hearts into a house (cute, huh?) But because the ice is frictionless, it's incredibly easy to get objects stuck against walls or corners where you can no longer move them the way you need to. So you have to carefully plan out your moves, and if you get stuck, there's an option to start the level over.

I should mention that the original game had a copyright notice (1990), and was shareware ($2.00). I can't remember if I ever sent the author $2.00. I'm not sure how he would feel about me taking apart and trying to re-implement his game, or whether he'd try to assert that copyright prevented me from doing so, but I'll assume he's a nice guy and wouldn't care as long as I don't charge for it, and go ahead, on the theory that easier to ask forgiveness than permission. I was not able to find him online -- maybe "Go Endo" was a pseudonym?

Back in 1991 I came up with a C++ class design (actually, it doesn't quite look like C++; I think it was written using THINK C's object-oriented extensions, which are sort of lost in the mists of time to me -- what is that indirect keyword? What did #pragma options(virtual) do? I don't remember for sure, but let's see if we can come up with an Objective-C implementation. The objects, other than the penguin, which can face in different directions, don't seem to have any real distinct properties except for their locations in the board, so I'm tempted not to take the obvious route and implement a board full of instances of the objects. I'm more inclined to just define a class for the board, and let it encapsulate most of the game logic. For the objects themselves, I'd like to just make references (pointers) to their classes (the class object) rather than putting them in a container. That's apparently not really possible -- there are no predefined singletons for the class objects that are accessible at run-time the way there are for NSNull. So I have to make some (sort of) singletons.

Here's what I've got today:

@interface ArcticSlideTile : NSObject
// Not necessarily useful yet, but I am guessing it might
// be helpful to have a separate base class at some point.
@end

@interface ArcticSlideTileStateless : ArcticSlideTile
// In implementation, there will be a single shared instance.
@end

@interface ArcticSlideBomb : ArcticSlideTileStateless
// Bombs can be pushed and will slide until they
// hit an object and stop. If the object is a mountain,
// both bomb and mountain are destroyed and
// there should be an animation. If another object
// hits a bomb it stops (I'm not sure you can test this
// combination in the original game with the board layouts
// available
- (NSString*) description;
@end

@interface ArcticSlideEmpty : ArcticSlideTileStateless
// The penguin can walk on empty space. Pushable objects
// on empty space are on ice and they slide until something
// stops them.
- (NSString*) description;
@end

@interface ArcticSlideHeart : ArcticSlideTileStateless
// When a heart hits a house the heart disappears (getting
// all the hearts into the house is how you win the game).
// Otherwise they cannot be destroyed, and slide like other
// slidable items.
- (NSString*) description;
@end

@interface ArcticSlideHouse : ArcticSlideTileStateless
// Houses cannot be pushed and stop other objects
// except hearts. When a heart hits a house the heart
// disappears (getting the hearts into the house is
// how you win the game). So the model should keep track
// of the number of hearts on the board and trigger a
// "win the level" behavior when the last heart is 
// destroyed.
- (NSString*) description;
@end

@interface ArcticSlideIceBlock : ArcticSlideTileStateless
// Ice blocks can be pushed and will slide until they
// hit an object and stop. If they are pushed directly
// against an object they will be crushed (there should
// be an animation) and disappear.
- (NSString*) description;
@end

@interface ArcticSlideMountain : ArcticSlideTileStateless
// Mountains cannot be moved and are destroyed by bombs.
- (NSString*) description;
@end

@interface ArcticSlidePenguin : ArcticSlideTile
// The penguin is the avatar. It has the special
// quality of being able to face different directions
// in the original game, although that's because you 
// can click near him to turn him and make him walk in
// different directions. In a touchscreen implementation
// I'm not sure how this should be implemented -- maybe
// he can slide in discreet steps. The penguin has the
// ability to walk through trees. We might want to 
// implement this temporary state using using an
// "override" object reference in the model. Sliding
// objects might be implemented the same way.
typedef enum {
    north, south, east, west
}penguinDirection_e;

@property penguinDirection_e facing;
- (NSString*) description;
@end

@interface ArcticSlideTree : ArcticSlideTile
// Trees cannot be pushed or destroyed and stop
// all sliding objects, but the penguin avatar 
// character can walk through them.
- (NSString*) description;
@end

static const int board_width = 24, board_height = 4;
// The short board design is part of what makes it
// so easy to get sliding objects stuck against the
// edges of the world or in corners where you can no longer
// get around to the other side to push them. We could
// consider a bigger board later and maybe implement the
// original puzzles surrounded by impassible water.

@interface ArcticSlideModel : NSObject
{
    ArcticSlideTile* board[board_height][board_width];
}

- (id)init;
- (NSString*) description;

@end

@implementation ArcticSlideTile
{
}
@end

@implementation ArcticSlideTileStateless
{
}
@end

@implementation ArcticSlideBomb
- (NSString*) description
{
    return @"Bomb";
}
@end

@implementation ArcticSlideEmpty
- (NSString*) description
{
    return @"Empty";
}
@end

@implementation ArcticSlideHouse
- (NSString*) description
{
    return @"House";
}
@end

@implementation ArcticSlideIceBlock
- (NSString*) description
{
    return @"Ice Block";
}
@end

@implementation ArcticSlideMountain
- (NSString*) description
{
    return @"Mountain";
}
@end

@implementation ArcticSlidePenguin
- (NSString*) description
{
    return @"Mountain";
}
@end

@implementation ArcticSlideTree
- (NSString*) description
{
    return @"Tree";
}
@end

@implementation ArcticSlideModel
{
    ArcticSlideBomb *bomb;
    ArcticSlideEmpty *empty;
    ArcticSlideHeart *heart;
    ArcticSlideHouse *house;
    ArcticSlideIceBlock *ice_block;
    ArcticSlideMountain *mountain;
    ArcticSlidePenguin *penguin;
    ArcticSlideTree* tree;

}
- (id)init
{
    bomb = [[ArcticSlideBomb alloc] init];
    empty = [[ArcticSlideEmpty alloc] init];
    heart = [[ArcticSlideHeart alloc] init];
    house = [[ArcticSlideHouse alloc] init];
    ice_block = [[ArcticSlideIceBlock alloc] init];
    mountain = [[ArcticSlideMountain alloc] init];
    penguin = [[ArcticSlidePenguin alloc] init];
    tree = [[ArcticSlideTree alloc] init];

    for ( unsigned int idx_y = 0;
         idx_y < board_height; idx_y++ )
    {
        for ( unsigned int idx_x = 0;
             idx_x < board_width; idx_x++ )
        {
            board[idx_y][idx_x] = empty;
        }
    }
    return self;
}

- (NSString*)description
{
    NSMutableString *desc_str =
        [[NSMutableString alloc]init];
    
    [desc_str appendString:@"ArcticSlideModel board state:\n"];
    for ( unsigned int idx_y = 0;
         idx_y < board_height; idx_y++ )
    {
        for ( unsigned int idx_x = 0;
             idx_x < board_width; idx_x++ )
        {
            [desc_str
             appendString:[board[idx_y][idx_x] description]];
            [desc_str appendString:@" "];
        }
        [desc_str appendString:@"\n"];
    }
    return desc_str;
}

@end

My quiet time at my undisclosed location ends tomorrow. I'm not sure I'll be able to get back to this for a few days. I haven't gotten nearly as much done with this as I've hoped. I've been distracted by a lot of job-search things. But hey, it's a start!

03 June 2013

Objective-C, Day 2

(Warning: non-FP content)

So today I'm working on chapter 3 in iOS Programming. This is about memory management. I have vague memories of manual reference-counting using retain and release in earlier experiments with Objective-C, but this book teaches ARM (Automatic Reference Counting). You enable ARM when you configure an XCode project of iOS.

The idea behind object references (well, pointers in iOS) and trees of objects and their owners is not new to me. And here they they still do not clarify whether a statement like items = nil actually invokes runtime behavior to destroy objects. I believe it does not. When I override a dealloc method to log, I see that the dealloc methods are called when the curly brace at the end of @autoreleasepool is reached. This makes sense, as in C it is the point where variables go out of scope. So reference-counting bookkeeping must be invoked then. It does not seem to be possible to step in with the debugger at this point to view what is happening, though -- Apple's Objective-C runtime is proprietary software.

We next get into weak references. There's a __weak specifier that can appear as part of an object pointer declaration, and iOS Programming says that "an interesting property of weak references is that they know when the object they reference is destroyed... the [parent object] automatically sets its [child object] instance variable to nil." I'm going to have to read more about that. I really wish this book were more precise in its language. There's another specifier, __unsafe_unretained, which is not set -- and so a pointer to a destroyed object could still be dereferenced. I suppose I should just be happy that this works and use it, but I'm the type to want to know what is going on at the register level. Apparently ARC (the "automatic" in reference-counting) is all due to some amazing work done in clang, where the static analysis actually figures out all the possible points of object creation and deletion and can wrap up your reference-counting for you. That seems like a great development -- because I've never quite liked how garbage collectors (at least, those without hardware support), no matter how efficient they are, had to actually be implemented, looking at dumb machine pointers. It seems to me that there could be room in purely-GC'ed languages for this kind of static analysis. But I haven't thought very hard about that yet.

Properties seem to be a shorthand, asking the compiler to provide getters and setters. There's a great example as to how much code this can eliminate. There's a strange wart in that we are asked to specify (nonatomic) in every property, because that is not the default setting. Presumably this is to support multiple threads accessing the object. Again, a dark corner to look into.

For a little refresher, I took a break by reading the first few chapters of Brad Cox's book. He outlines his vision of a software IC, along the lines of a hardware C, that would support reusability primarily through late binding, limited interfaces, and wrapping up functionality with data so that clients don't have to write the operations on an object's data type. It's an interesting vision.

Cox writes of early Objective-C:

Objective-C adds precisely one new data type, the object, to those C provides already, and precisely one new operation, the message expression.

Cox's book indicates that early versions of Objective-C used an intermediate preprocessor step, between the C macro-preprocessing step and the regular C compiler. If you are my age you might recall that, early on, C++ was treated similarly, with a program called CFront. This is no longer the case with either Objective-C or C++, although this approach lives on in tools like Qt, with its Meta Object Compiler, or moc).

Cox describes the pragmatic design of Objective-C:

One of Objective-C's key features is that it is always possible to bypass the object-oriented machinery to access an object's private information directly. This is one of a hybrid language's greatest theoretical weaknesses and one of its greatest pragmatic strengths. Low-level code is often best developed in a conventional manner, exploiting static binding to obtain high machine efficiency and strong type checking. Conversely, user-level code is often best written as objects. It is important that these levels interface smoothly and efficiently.

There is an interesting passage about object identifiers. In a slight muddling of his earlier statement he writes that "objects are identified by a new Objective-C data type called an id, or object identifier." In practice, though, this isn't really a new data type per se. The address (a pointer to) an object is this identifier. The veneer over C is so thin that, he writes, it would have been perfectly feasible to implement the dynamic message sends using straightforward C language calls like

reply = _msg_(aReceiver, "aMessage", argument1, ...)
(I assume he'd use C's variable-length argument list mechanism here), but for efficiency he wanted to avoid string comparison in the dispatch mechanism. Cox was quite aware of the reaction that messages like [anObject do:arg1 with:arg2] would inspire in C programmers, writing "the convention seems strange to those accustomed to conventional function call syntax, and frankly, I find it a mixed blessing."

In this formulation, Objective-C classes were still objects, and class methods were called "factory methods" (but I'm peeking ahead a number of pages, so there might be some differences from their current incarnation), as distinct from instance methods. Cox writes "...the programmer conceives the initial object... the loader gives it life by installing it in memory... this is done once for each class in the system. These primal objects are called factory objects... every factory's name is published in a global variable."

Cox is a very thoughtful writer and he presents a minimalist view of what an object-oriented language requires in order to provide the most basic advantage of OOP. He writes:

One last time: the only substantive difference between conventional programming and object-oriented programming is the selection mechanism. This is not to minimize its importance, but to demystify what is basically a very simple notion. Its significance is that it moves a single responsibility across the boundary between the consumer of a service and the supplier of that service. In conventional programming, the consumer is responsible for choosing functions that operate properly on the data managed by that service. The selection mechanism moves that single responsibility off the consumer and onto the supplier. In this small change lies all the power of the technique.

That really caused me to, as they say, nearly drop my monocle. Could this really be where most of the advantage, such as it is, imperfectly realized, understood, and applied, of OOP comes from?

The mainstream business world got C++ instead, a mash-up of C and Simula, and then Java instead, the Cobol of object-oriented programming languages. I was not pleased. As I consider my career I also consider which implementation languages I should focus my efforts on, for the next decade. After exposure to Haskell it's hard to believe that, ultimately, functions -- without explicit state -- won't prove to be a cleaner reusable element than classes, whatever kind of binding they use. And don't get me wrong -- I like classes. I'm pretty certain that I'll still be writing C code, at least occasionally, in ten years. I'd prefer to be writing less C++. But what I'd really like is to move on -- to pick a "last programming language." Objective-C isn't that, for me -- it's too imperative, and can't truly be made safe, just safer. I'm enjoying it in this context, but would Objective-C even be an viable option outside of the Apple ecosystem? It doesn't seem to have gained much adoption. The state of GnuStep doesn't seem terribly robust. And so my career-long quest for a Better Way continues, while at the same time trying to gain facility with all the Good Ways along the way.

02 June 2013

Objective-C, Day 1

(Warning: more non-FP content).

So I'm in an undisclosed location away from home specifically to spend a few quiet days focusing on programming. I had a goal to teach myself enough iOS programming to get a small demo app submitted to the app store by the end of the week. I'm not sure I'll manage that, but I'm learning. I have 3 relevant books with me: iOS Programming: The Big Nerd Ranch Guide (Third Edition, Second Printing, August 2012), by Joe Conway and Aaron Hilegass; Objective-C Programming: The Big Nerd Ranch Guide (Second Printing, January 2012), by Aaron Hillegass; and a much older book, Object Oriented Programming: An Evolutionary Approach (Reprinted with corrections August 1986), by Brad J. Cox. Brad Cox is the guy who created Objective-C, and although the language he describes in his book is not the same as modern Objective-C, I thought it might give me some insight into the language design. I own, but did not bring with me, a number of books on Smalltalk including the "blue book" (Smalltalk-80: The Language and its Implementation by Adele Goldberg and David Robinson). I wish I had brought it! But on the other hand, maybe it's best that I not go too far down a rabbit hole into the history of programming languages and the way language constructs are adopted over time by other languages -- even though it's one of my favorite subjects.

So far, I've worked through a few chapters of each. I'm not really here to write reviews of the books; let's just say they have their various strengths and weaknesses. But here are some thoughts on the process so far.

1. Learning Objective-C is a little frustrating (that's good, actually -- most languages are a LOT frustrating to learn).

1a. One of the fundamental strengths of C, as a kind of portable assembly language, is that almost everything about the behavior of your program is right on the surface. I don't mean that there aren't sometimes surprising behaviors. But as far as the usage of memory and the order of execution, C is pretty easy to reason about, with a few minor exceptions (for example, the switch statement implies that the compiler may create conditional tests whose order and number don't match exactly what is indicated in your source code).

1b. Objective-C hides some functionality in a way that C programmers may find disturbing. Those coming to it from another language might not even notice that this is an issue, since they might expect that the runtime is doing a lot of implicit things. Or they might not think about it at all.

1c. The syntax of Objective-C is a weird mash-up of Smalltalk and C. It has strange aesthetics.

1d. Nevertheless, it is pretty clear that Objective-C is an enormously pragmatic and highly successful language. Therefore, from the perspective of a language designer or someone who studies the history and design of programming languages, it would be an enormous mistake to dismiss it because of its hybrid design. By all accounts it is an extremely productive language once a certain level of expertise is achieved.

2. These books are not necessarily written for someone who knows other languages (such as C and C++) well, and that is also frustrating.

2a. The iOS Programming book uses language imprecisely here and there and it is driving me to distraction.

2b. This is not necessarily their fault; terminology is often only precise within a specific domain and has imprecise meaning or different meaning outside of that domain.

3. Now, some details.

3a. The sentence "The class acts as a factory that creates instances of that class." This needs a little unpacking. The language of "factories" seems to come from design patterns and is also big in .NET and enterprise-y Java as the concept of a factory method. But this sentence actually buries a few surprises for C++ programmers. The first is that classes are objects -- they actually have a distinct run-time representation. And instances have a different run-time representation than classes. This seems reminiscent of the concept of prototypes in languages like NewtonScript, Self, and Omega, where objects can inherit from other objects, but it is different as well, since in NewtonScript a prototype is just another instance of a fundamental container data structure, a frame. I'm not sure yet whether in Objective-C you can create a class at runtime, but I wouldn't be surprised.

3b. From Smalltalk: a method call is triggered by sending an object a message. This implies more run-time overhead than a function call, even a virtual function call from C++, although no doubt this has been highly optimized in the runtime. A message-sending expression can return a value. So, for example, it seems to be idiomatic Objective-C to create an object with a statement like

Object *obj = [[Object alloc] init]

This wraps up quite a bit of behavior: Object is used as a type for a simple pointer variable obj, and also as a reference to the runtime Object class object. Sending Object the alloc message returns a new instance of an object, which can then be used as the receiver for the nested message send, sending it the init message, which returns its receiver.

3c. This "message" is a distinct thing, and seems to either be the same as or closely related to a "selector." Syntactically a selector is something like "methodNamePart1:argument1Name:argument2Name" and a selector like this is compatible with a message-sending statement like "[receiver methodNamePart1: argument1Name: argument1 argument2Name: argument2]." This implies (but I have not seen clearly explained, yet) that the compiler has to break all that down and keep track of it. iOS Programming mentions "arguments interposed into the selector." This seems to be similar to using required keyword arguments as in Dylan which, I think, borrowed the idea from CLOS. I think the packed string with colon-separated selectors might be an innovation in Objective-C; I'm not certain.

3d. iOS Programming mentions that an expression like obj_p = nil "destroys the object and sets its instance variable to nil." If this is accurate, and they don't just mean that the object is "cut loose," this implies hidden run-time behavior that I would like to see explained much more fully.

3e. Sending a message to nil is apparently a no-op at run-time. This scares me a bit; won't that tend to hide a situation where we might actually want a crash?

3f. I like the extensions to printf format strings: %@ representing "any object," producing a string generated by calling the object's description method.

3g. I'm not sure I understand the need for NSNull, if the compiler can already wrap special run-time behavior around nil. But I don't claim to understand everything yet.

3e. Instance variables are not included in class objects. That's... interesting. It sounds a bit like how static methods and fields in a C++ class are available without creating an instance.

3f. Getters and setters are not created for you, and by convention (which can be important, apparently, for compatibility with code that uses introspection) are called set+instance var name and the instance var name. I have a great distaste for different things that have the same name.

3g. Some interesting bits and pieces to digest further, later: instance methods; initializers; id, isa, self, super, and the "designated initializer." iOS Programming talks about the importance of the designated initializer for some time, which is a shame because it also doesn't seem to explain in any way how you designate an initializer.

3h. Class names seem to be all in one namespace -- in other words, I don't see evidence of a real module system. I know from watching this process in other languages that it can be very difficult to retrofit a language with a module system later.

That's about all I've got for now. I'm sure it will make more sense as I press on, for better or worse!

31 May 2013

An Interesting Interview Question

(Warning: some non-FP content!)

So yesterday I had a job interview for a developer position and I got the chance to put my money where my mouth is, so to speak, about my C and C++ programming skills. I was asked to write 3 functions given specifications including their prototypes, and to review a piece of C++ sample code. The interviewers left me alone with some paper for fifteen minutes or so.

There were four tasks. I scrambled around at the first one, started down a blind alley, started to get nervous, realized I was getting nervous, and decided to move on to the next questions and come back to the first question if I could. So I'll reveal that one last.

The second task was to write a "population count" -- to count the one bits in an unsigned char. This is a common operation in real-world code that operates on packed data structures such as tries. I chose a very simple lookup-table approach, with a table to look up values nybble-wise. Easy-peasy. The function itself then becomes a one-liner. "Full marks," as they say in some countries.

The third task was to write a function that tests whether the CPU it is running on is big-endian, or little-endian. If you don't know what that means, I'm not going to explain it in great detail here, but it refers to the way that the individual bytes of multi-byte values are laid out in memory. You can write quite a bit of code without running into endian problems, but it becomes an issue if you need to exchange data over a network or across a bus between processors that store their data differently. Fortunately I have written quite a few drivers, so this one was also easy.

The last task was to find errors and questionable practices. There were a lot of them -- I marked up my page with several "WTFs?" They weren't all errors that a compiler would complain about. Fortunately I've studied C++ extensively -- I used to subscribe to C++ Report -- and I was able to talk about several design issues, ranging from the lack of a virtual destructor in a base class to failure to properly implement the necessary constructor, copy constructor, and assignment operator. There was a problem with a const method. There was a pointer aliasing problem. There were a few other small issues I did not notice. I missed an issue: dynamic_cast can return NULL. I missed this because I have never had to use that sort of run-time introspection in C++, although one of my interviewers pointed out how it can be useful and necessary for adding test functionality to a framework. Overall, I think I got most of the credit for this one.

The first task, the one I botched, was to write a function to reverse a standard null-terminated C string, in place. Wow, the world loves nothing more than to humble us now and then, doesn't it?

Right off the bat, I became confused because of the prototype for the function. The question indicated that the first parameter would be a pointer to a zero-terminated string, but the second parameter was to be a size_t type -- a buffer size. I suppose this is analogous to the standard C library routines strn*, which are safer versions of the old strcpy, strlen, etc. These versions take a number indicating how far the function can travel in memory before bailing out -- giving a sort of "backup plan" in case the incoming string is not properly null-terminated. These are certainly better than the old functions that can rip through memory, or leave your code vulnerable to a variety of stack-smashing and buffer-overflow exploits. But to me there's something inherently horrible about a function with such a contradictory "contract" -- it is designed in such a way that it will hide, or at least give safe harbor, to bugs in the code that calls it. It has the information it needs to report that it has been called with an unterminated string, but there is no provision to return an error code, and the assignment did not indicate what I should do if the incoming string is not terminated. Presumably I'm not supposed to crash the computer -- although with no way to return an error, forcing a crash immediately is really the safest possible option!

The "safer" standard library function strnlen receives a value telling it how many characters it should count before it bails out, and returns a value indicating how many characters it actually counted. If you get back the same number you provided, it means that the incoming string is not properly null-terminated. You can (and hopefully will) do something with that information. But the "safer" strncpy doesn't provide any indication that it has been handed a source string that isn't null-terminated, and so can leave the destination filled with an unterminated string. That's inherently unsafe. The prototype given makes this function a little like strncpy.

So while my brain was chewing on all that in the background, I started trying to get something down on paper, writing a shell for the function, to dispose of some possible error conditions first. And I confused myself further.

In the bad old days, size_t was not guaranteed to be unsigned, so it was possible that the provided buffer size could be less than zero. That's not the case in standard C, but I coded to test for it anyway. Most compilers would tell me that I was testing for a condition that can't happen, fortunately.

Next, I tested for whether I could bail out early -- whether the provided buffer size value is less than or equal to 1. I'm not sure if this would be a performance win -- it depends a lot on the sizes of the string that the function will be given. But I wanted to show that I was at least thinking about these different possible cases.

And then it was time to write the meat of the function. I had my heart set on writing a recursive solution. Why? To show off, maybe, even though this isn't actually a very practical solution in C. But more likely, because I've gotten used to reversing lists in languages like Scheme and Haskell that use Lisp-style lists. Which... are not at all like plain old arrays of bytes. But I was determined. To show that I was at least aware that recursive functions can fail when they use too much stack space, invented a constant, MAX_STACK_SPACE_ALLOWED, and made sure my buffer size didn't exceed it. So then I had several error-checks and had filled most of the page. There was nothing else I could do but write the recursive helper function to reverse the string.

And I realized that it has been a long time since I wrote something like that in plain C I was still chewing over what to do with the buffer length parameter, and wondering if I was missing something fundamental about the problem. And so I panicked and bailed out, feeling really stupid. Stupid brain! How could you let me down like that?

I had some more time to think about it today, and play around with some code, and think about the "contract" for the function and just why I got stuck. The idea behind a recursive reverse of a list is that you recurse to the end of the list, and then build up a list of the elements in reverse order as the recursion "unwinds." In Scheme, something like this:

(define (my-reverse lat)
   (if (null? lat)
      '()
      (append (my-reverse (cdr lat)) (list (car lat)))))

C doesn't do the data-structure bookkeeping for you. Recall that we're supposed to the reverse in place. This solution starts with an empty list and creates new cons cells to add to it. And of course a list doesn't look like a primitive string of bytes. So I've got to get my head back in the C game.

As a warmup, let's write a recursive solution that uses an auxiliary buffer for the output, because that sounds easier to me. First, I want a routine to handle some special cases, which will then call a recursive function. It is at this point that I have to figure out what I'm going to do with an unterminated string. My answer: I'm going to terminate it. You're terminated, sucker. Yes, I'm going to overwrite a character of the data you passed me. Dude, where's my data? No, DUDE! Where's my string terminator?

Is that the right answer? I don't know. Actually, I'm pretty sure that this is not what the guys that gave me the task had in mind. In retrospect I think it probably would have made more sense to just use strnlen to get a "working length," and just reverse that many characters. But I think the bigger problem is that asking for a function with that prototype is asking the wrong question. It's a guaranteed "WTF?" for someone. Anyway, here's the "preamble" error-checking code as I wrote it today, for working with a function that returns the reversed string in an auxiliary data buffer:

void ReverseString( char *str_p, size_t buffer_size )
{
    if ( NULL == str_p )
    {
        printf( "NULL pointer!\n" );
        return;
    }
    else
    {
        unsigned long len = strnlen( str_p, buffer_size );
        if ( buffer_size == len )
        {
            str_p[buffer_size - 1] = '\0';
            len--;
            printf( "Terminating your string for you!\n" );
        }

        /* Recursive with an auxiliary array */
        char *str_out_p = (char *)( malloc( buffer_size ) );
        if ( NULL == str_out_p )
        {
            printf( "malloc failed!\n" );
            return;
        }
        else
        {
            strncpy( str_out_p, str_p, buffer_size );
            printf( "Before: %s\n", str_p );
            ReverseStringAuxHelper ( str_out_p, str_p );
            printf( "After: %s\n", str_out_p );
            free( str_out_p );
        }

        /* This doesn't actually meet the requirements... */
    }
}

So let's write ReverseStringAuxHelper now. Something like this:

void ReverseStringAuxHelper( char *str_out_p, char *str_in_p )
{
    /* Do some setup, and call a recursive helper */
}

Our basic strategy for recursive calls is clear: we want to call ourselves on a shorter substring of the input string, until we hit a NUL. And then... build up the data to return. But how do we do that? As we "unwind" the call stack, we will have a series of pointers to characters in the input string, in reverse order. But we want to write them into the outgoing string in forward order. To keep track of the index into the auxiliary data structure we could use a global (a file-scope variable, maybe declared static). Gag me with a spoon. No, let's use a local. You can't nest functions in C; you don't have lexical closure. But we can do something like this (get your barf bags ready):

void ReverseStringAuxHelper( char *str_out_p, char *str_in_p )
{
    char *str_out_walking_p = str_out_p;
    ReverseStringAuxRecursive
        ( &str_out_walking_p, str_in_p );
}

We can pass the address of the local pointer variable, to allow the recursive function to modify it -- so it can use the pointer, and increment its value. The recursive function now has a prototype that looks like this:

void ReverseStringAuxRecursive( char **str_out_p_p,
    char *str_in_p );

Note the double-asterisks: a pointer to a pointer. And now the whole function:

void ReverseStringAuxRecursive( char **str_out_p_p,
    char *str_in_p )
{
    if ( '\0' != *str_in_p )
    {
        ReverseStringRecursiveAuxHelper( str_out_p_p, 
            str_in_p + 1 );
        **str_out_p_p = *str_in_p;
        *str_out_p_p += 1;
    }
}

Good heavens. That is.. inelegant. n C++ we could do that by reference, at least, which would clean the appearance of the pointer expressions, but that really just sweeps it under the rug. It works, but can we do better? In fact, after sleeping on it for a while, I woke up abruptly with a prettier solution in my head. The key insight was realizing that there is no good reason I can't recurse with two parameters that are each changed to bring us closer to the terminating condition. We have already calculated the length of the string, so why don't we use that to figure out the terminating condition?

void ReverseStringAux2Recursive( char *str_out_back_p,
    char *str_in_fwd_p, char *str_in_term_p )
{
    if ( str_in_fwd_p < str_in_term_p )
    {
        *str_out_back_p = *str_in_fwd_p;
        ReverseStringAux2Recursive( str_out_back_p - 1, 
            str_in_fwd_p + 1, str_in_term_p );
    }
}

The idea here is that we will recurse on the input string, walking forwards through it and ending when we reach a supplied end pointer, and copying characters into the output string walking backwards. It's tail-recursive, which is nice (I'll talk about that a little more later). We could let the function write the terminating NUL by changing the test to read

if ( str_in_fwd_p < str_in_term_p )

But if we did that, we'd wind up writing a terminating NUL before the first character in the output string. We don't want to do that, so we need a special case for this. Let's just let our helper function do it.

void ReverseStringAux2Helper( char *str_out_p, 
    char *str_in_p, unsigned long len )
{
    /*
       The caller must guarantee that the
       output string can hold len + 1 chars
    */
    ReverseStringAux2Recursive( &str_out_p[len - 1],
        str_in_p, &str_in_p[len] );
    /* Write the terminator in the output string */
    str_out_p[len] = '\0';
}

That doesn't look too bad. We make use of the fact that we've already calculated the length of the string. Now, that we've figured out how to do this with two changing pointers, can we do that in place? Yes, we can:

void ReverseStringInPlaceRecursive( char *fwd_p, 
    char *back_p )
{
    if ( fwd_p < back_p )
    {
        char temp = *fwd_p;
        *fwd_p = *back_p;
        *back_p = temp;
        ReverseStringInPlaceRecursive(fwd_p + 1,
            back_p - 1);
    }
}

void ReverseStringInPlaceHelper( char *str_p,
    unsigned long len )
{
    ReverseStringInPlaceHelper( str_p,
        &str_p[len - 1] );
}

That's longer, but most of the body of the recursive function is just devoted to swapping characters using a temp value, before the tail recursion. But what's going on with

if ( fwd_p < back_p )

? Well, I'm glad you asked. Note that we're swapping characters. We walk along the string, going backwards, swapping characters from the front of the string, going forwards. But then a funny thing happens. The pointers meet in the middle, and cross. In other words, you want "ABC" to become "CBA" but if you don't stop the reversing process at "B", the left and right sides of the string are swapped again, giving you "CBC." The resulting string is palindromic -- it reads the same forward as backwards -- and so reversing it again leaves it unchanged -- but that's not what you want. Run this in an interactive debugger if you can -- it's interesting to watch. Well, if you like that sort of thing. It might help if I give you a picture:

By the way, both the recursive reverse in place and the second recursive reverse to an auxiliary string have a subtle bug. Can you spot it? I'll come back to that later at the end of this post.

Anyway, pretty as that in-place solution is, we probably don't really want a recursive solution in C. The biggest reason is that although I hear some C compilers now will optimize tail recursion into simple iteration, they aren't required to do it. I've heard that GCC can do it, but I wasn't able to figure out how to get my setup (LLVM GCC 4.2 with XCode 4.6.2) to do the optimization, as far as I could tell. So I'd say you definitely can't count on it, and so a big string might blow up your stack.

So what about an iterative solution? There are a few options, but a basic pointer math approach is very idiomatic C, and it looks very similar to the recursive solution, even if walking a pointer backwards seems a little odd and hazardous:

char *fwd_p = str_p;
char *back_p = &(str_p[len - 1]);

while ( fwd_p < back_p )
{
    char temp = *fwd_p;
    *fwd_p = *back_p;
    *back_p = temp;

    fwd_p++;
    back_p--;
}

That's not a complete function, but you get the idea.

So, any other ideas for implementing it? Well, you can use array indexing like so:

for ( unsigned long idx = 0; idx < ( ( len / 2 ) - 1 ); idx++ )
{
    unsigned long rev_idx = len - 1 - idx;
    char temp = str_p[idx];
    str_p[idx] = str_p[rev_idx];
    str_p[rev_idx] = temp;
}

But that is actually problematic for some string lengths. Can you see why? Well, if len is less than two, dividing it by two to find the array's halfway point yields zero, and when you subtract one, you'll have an underflow, giving a very large positive number for your array index, giving undefined befhavior (if you're lucky, you'll get an immediate crash). This approach can be rehabilitated by screening the initial conditions, and we'll talk more about that below.

Oh, there's another little trick you can do. I'll leave it to you to figure this one out, if you haven't seen it before. Instead of using a temporary variable to swap the characters, you could do this:

str_p[rev_idx] ^= str_p[idx];
str_p[idx] ^= str_p[rev_idx];
str_p[rev_idx] ^= str_p[idx];

And there, dear reader, I will leave you for today. I hope you found this interesting and maybe learned something. I did. Maybe you can see why my brain locked up -- I didn't want to write a function with that prototype; it it seems to contradict itself as far as whether it should be safe, or just efficient. Wanting to write a recursive solution was a secondary issue. Oh, and of course let's not forget that this is the wrong solution for modern strings -- for any Unicode types. It only works for old-school strings of bytes.

Don't get me wrong -- I'm not actually complaining about the interview question. It's a great interview question, because it looks easy but it should also bring up a lot of safety and design issues in the programmer's head. I've been humbled just a bit. I've been reminded that while I've spent years writing code that uses STL and reference counting and nice, modern features of C++ and more modern libraries, I've forgotten just a bit how low-level C often turns out to be.

I should never do that, because of course sometimes you have to debug all that nice, modern code, and remember what is going on underneath. Had I been asked to write the solution in a simple assembly pseudo-code, I would have had to come up with an iterative solution immediately, although I'm not sure I would have been able to work out the pointer-crossing problem in the interview setting. Recursion is almost certainly the wrong way to go in C, but maybe you can see why I was drawn to the idea, especially of writing the reversed string to a second buffer. It's funny how a relatively simple problem in C can be so "interesting." May we one day no longer live in such interesting times...

FOLLOWUP: I got feedback that the interviewers were quite pleased with both my performance and attitude. I actually sent them the source code for my experiments today, too, with some comments, in case they have anything to add. I screwed that up again, though, because the first version had a crashing bug, so I sent another one, and that still didn't have the most elegant form of the recursive functions. And, a couple of the solutions have that subtle bug I mentioned earlier. What is the subtle bug?

Obviously, it is a bug to write past the end an array in memory -- whether we're talking about an array allocated dynamically on the heap, on the stack, or in static storage, like a global variable. You can't guarantee that this bug will be caught -- it could be symptomless, just silently corrupting some other variable in your program. And obviously it is a similar bug to write before the beginning of an array. But did you know it is a bug to calculate the address of an item before the first item, even if you never use it to write to the array?

The C standard guarantees that you are allowed to create the address of an extra array element past the end of your array. Reading or writing at this location is "undefined behavior" -- that is, very bad. But there is no such guarantee for creating a pointer to an array element at index -1, even if you don't read or write it. See also this CERT coding guideline.

There are several places in our code where that is actually a concern. If we have a zero-length string, our calculated len is zero. Every place where we use len should be inspected. For example, in ReverseStringAux2Helper, I pass the value &str_out_p[len - 1]. This gives me the address of the last writable character in the output array. But for a zero length string, this will result in calculating the address of the character before the first character of the output string.

The rationale for this, I believe, goes something like this. On an arbitrary computer running your C program, you don't actually know what your memory addresses look like, at the hardware level. C has to cover a lot of weird and arcane and now-deceased architectures: segmented pointers, odd pointer sizes, architectures where different types of addresses have different numbers of bits, architectures where only address registers can hold addresses and they aren't the same size as any other type, etc. What if you're running this on a small processor, and it just so happens that the array in question was started at hardware address zero? Calculating the address of element -1, in an unsigned pointer type, would "underflow" into a large number. I don't specifically know of hardware where just storing that number in an address register or in memory is really a hardware error, but it could be. And the designers of the C standard wanted to make sure that C was as portable as possible.

The logic in ReverseStringAux2Recursive determines that there is nothing to do, and so we don't write to that address, but it is still illegal. To be strictly scrupulous about observing the standard, logic should be changed so that in the case of the zero length string, we don't get to the point where we calculate the address of that non-existent element. This also happens in the code where we calculate the length of a string and determine if it needs to be forcibly terminated -- in the case of a specified buffer length of zero, len underflows. A zero-length string can't be a valid string. I really do need to avoid this case entirely. I also need to think hard about the string size that is specified as one character, which can become a zero-length string if we forcibly terminate it. So the wrapper logic, given my chosen strategy of terminating unterminated strings, needs to look something like this:

void ReverseString( char *str_p, size_t buffer_size )
{
    if ( NULL == str_p )
    {
        printf( "ReverseString: received NULL pointer!\n" );
        return;
    }
    else if ( 0 == buffer_size )
    {
        printf( "ReverseString: received buffer_size 0!\n" );
        return;
    }
    else
    {
        /*
           strnlen returns the number of characters before
           the NUL or the provided maximum. If we get back
           the maximum, assume the string is NOT terminated.
        */
        unsigned long len = strnlen( str_p, buffer_size );
        if ( buffer_size == len )
        {
            /*
               Forcibly terminate the incoming string,
               consider its length to be one shorter, log a
               problem, and continue
            */
            str_p[buffer_size - 1] = '\0';
            len--;
            printf( "ReverseString: no termination found!\n" );

            /*
               If our adjusted string length is zero, don't
               continue because some of our algorithms that
               create pointers to the first character before
               the terminator will generate a pointer to an
               array item at index -1, an invalid operation
               per the C standard
            */
            if ( 0 == len )
            {
                printf( "After termination, len is 0!\n" );
                return;
            }
        }
        
        /*
           There is really nothing we need to do to reverse
           a string of length 1, but rather than rule that out
           here, we let our algorithms handle that case to
           exercise their boundary conditions.
        */

This is another case where the two parameters, the string pointer itself and the buffer_size, can contradict each other, with results that are complicated no matter what you do.

What happens if we don't forcibly terminate incoming strings that aren't terminated within buffer_size chars, but just reverse buffer_size chars? Well, the preamble code is a little different. Here's a version of the whole function that does that. The reversal algorithm in this version isn't very efficient -- it copies bytes twice, and it allocates extra storage. But efficiency wasn't mentioned in the task description, and I think it is correct, at least for this interpretation of how to deal with strings that aren't terminated within buffer_length chars. It also seems like something I might have had a better chance of getting right on paper, quickly, when put on the spot.

void ReverseStringFinal( char *str_p, size_t buffer_size )
{
    /*
       We take the buffer_size parameter to mean
       that we shouldn't touch more that buffer_size
       bytes, even if the string is longer.

       There are several error conditions that really
       should trigger failure, returning some kind of
       error code, but the spec doesn't give us that
       option.
    */
    if ( NULL == str_p )
    {
        return;
    }
    else
    {
        /*
           strnlen will return buffer_size if it
           doesn't find a terminator before counting
           buffer_size characters.
        */
        unsigned long len = strnlen( str_p, buffer_size );
        /* I'd like to use max(), but it is not standard C */
        unsigned long working_len = 0;
        if ( len < buffer_size )
        {
            working_len = len;
        }
        else
        {
            working_len = buffer_size;
        }

        if ( working_len < 2 )
        {
            /*
               Reversing a string of 0 or 1 character
               is a no-op
            */
            return;
        }
        else
        {
            /*
               An inefficient but simple solution using a
               temporary buffer
            */
            char *temp_buf_p = (char *)
                ( malloc( working_len ) );
            if ( NULL == temp_buf_p )
            {
                return;
            }
            else
            {
                /* Copy chars backwards */
                for ( unsigned long idx = 0;
                    idx < working_len; idx++ )
                {
                    temp_buf_p[ working_len - 1 - idx] = 
                        str_p[idx];
                }
                /* Copy forwards into original buffer */
                for ( unsigned long idx = 0; 
                    idx < working_len; idx++ )
                {
                    str_p[idx] = temp_buf_p[idx];
                }
                /*
                    Don't write a null; only touch
                    the chars in the working length.
                */
                free( temp_buf_p );
            }
        }
    }
}