20 August 2013

Arduino, Day 2: the Most Minimal Scripting Language, or, Reifying Frozen Functions for Fun

Planet Haskell readers might want to skip this one.

Recently I've been thinking about the possibilities offered by embedded interpreters. This is becoming a common design pattern in commercial software -- games written in C++ embedding Lua, for example, or Python for graphical user interfaces, or occasionally Scheme, or even the JVM. However, adapting an existing full-blown scripting language won't necessarily fly on a small system. What if you have a DSP, with nothing resembling a *nix-like operating system -- no standard C library, or almost none, no POSIX-like APIs, etc.

Well, there are smaller languages. I've done a little research recently and come across some very small, cut-down implementations of Scheme or other Lisp dialects, or even a tiny BASIC, and languages like Forth can be very small.

But what if we have a really tiny chip? What if the SRAM budget we had to implement our embedded language was, say, not 2 MiB, or even 2 KiB, but 200 bytes?

The Arduino Uno R3 uses an ATmega328 microprocessor. It's considered an 8-bit chip. Calling chips 8-bit, 16-bit, 32-bit, 64-bit, etc. can be a little confusing these days, because a lot of chips can operate on several different data sizes, but this chip has 8-bit registers, so we'll consider it to be primarily an 8-bit part. It has 32 KiB of onboard flash memory to hold your program (it is persistent across power cycles). 32 KiB can hold a moderately complex compiled program. But as for RAM -- the working memory for computation in progress and run-time state -- it has only 2 KiB, a mere 2048 bytes. That's less RAM than the first computer I owned, which was a Radio Shack TRS-80 Model 1, with 4 KiB, although I did work with some smaller computers -- for example, a KIM-1. And presumably we don't want to use all our SRAM to implement our small scripting language itself; the scripts need to have some functions to call.

A microcontroller like this is designed to run simple programs. It's designed to be "bomb-proof" -- hard to fry with stray excess voltage. It has a low pin count -- miniscule, compared to the processors in my desktop and laptop machines -- and it has a number of built-in peripherals designed for digital I/O. It's a pretty cool chip -- these Arduinos have taken off among electronics hobbyists and are part of the whole "maker" movement because they exist in a sort of power per dollar sweet spot. If you fry the chip itself, you can replace it for a couple of dollars; if you fry the whole board, you're only out a few more.

What they aren't, though, is very well-suited for the modern programming model that assumes a very capable operating system, dynamic linking, dynamic allocation using malloc or new, and container objects created using class libraries like the STL (don't get me wrong, the STL is impressively efficient, but it just isn't designed to work around a very tiny memory model). So the Arduino's GCC toolchain doesn't even attempt to provide, say, <vector>.

So what would an embedded scripting language for the Uno R3 look like? Well, for one thing, the Arduino doesn't come with a file system to speak of. You can buy a "shield" of some time that provides a controller and a slot for a memory card, but this isn't included with the basic bard. So you can't count on a user's ability to place a script in a text file and update just that script in order to change the program's behavior, although I might do some experiments along those lines. Without something like that, you have to use the toolchain -- compiling your program and downloading it with avrdude. You can't count on files, but you have strings -- C-style strings, in the form of arrays of bytes. So you could implement a small BASIC or Scheme interpreter that executed a program contained in a C string. I will do some experiments along those lines too, if I can, especially if I can force the strings to live in the 32 KiB flash, which I think is possible, rather than the 2 KiB SRAM.

But what if you want something even simpler? What if pretty much any sort of interpreter looked like it was going to be too heavyweight for your purposes, using too much SRAM -- too much space for tables, and too much stack? Is there something even simpler you can do?

There is. You can write a "script" that consists of a series of objects, using some tricks I'll demonstrate. This is very definitely a work in progress; it has some rather extreme design compromises. In particular, it isn't at all friendly to the "script" author. If there is a typo, the compiler can't provide a very novice-friendly error message. It seems a little sadistic to inflict this sort of hair-shirt discipline on hobbyists, but let's consider it a thought experiment, exploring a a couple of implementation options.

I'm going to build my "scripting language" up in reverse; instead of showing you the syntax of the "scripting language" first, I'll show you the constructs I'm using to implement it, we'll and arrive at the syntax last. But first I need to explain some of the less common things I'm doing with the implementation.

Let's say I want to easily differentiate a set of overloaded functions, each of which is specialized on a single parameter type. (I won't explain why quite yet, but it will make sense later). What's the best way to do that? How about with a set of typedef'ed types, or enum types to distinguish them?

C++ and C are weakly typed, at least in places. Although the C++ class system can enforce strong typing, when working with POD (plain old data) types and particularly the typedef and enum keywords, typing isn't strongly enforced. What do I mean by that? Well, typedef and enum allow you to write code that looks superficially like it is strongly typed:

typedef int special_int;
void handle( int a );
void handle( special_int a );

But on my compiler, if I provide the functions to match these prototypes, and call them:

int handle_param_1 = 0;
special_int handle_param_2 = 0;
handle( handle_param_1 );
handle( handle_param_2 );

The compiler complains that I'm redefining handle. The compiler won't (and can't) really make use of it to provide type safety, because it's not really a distinct type, but a type alias -- that is, really just a nickname, more for documentation purposes than anything else; the compiler doesn't really see the alias as a distinct type at all.

In the case of enums, using enum can make your code more self-documenting, and it appears that I can reliably overload functions using specific enum types, which the compiler keeps distinct:

typedef enum { red, green, blue } color;
typedef enum { salty, sweet, sour } flavor;
void handle( color a );
void handle( flavor a );

But if there is no overloaded function with a parameter that exactly matches the enum type, C++ will happily promote the enum type to an int, or apply a standard conversion to turn it into a float -- so type-safety is potentially compromised again here. If you forget to provide an overloaded handle method that accepts your enum type, but there exists one that takes a single float parameter, the compiler will call that one, without a whisper of warning. Types don't get much weaker than that! (All right, yes they do, but let's not get into a sideways rant about PHP or Perl).

The very latest versions of C++ do provide a new "enum class" feature that gives one more control over enumerated types and allows the more powerful class type system to kick in and keep things straight, but it isn't widely used yet, and I'm not sure the compiler I have available for Arduino supports it. So, at least for today, we'll leave it out.

There is a tweak that makes this quite easy, though. In C++, a struct is actually a class whose members are all public by default. And it's possible to create an empty struct, with no data members. This gives me something I'll call it a type tag (I could call it a typeType in honor of the old AppleEvent manager's descriptor type, but let's not), and I'll call an instance of it a type token. They will let me reliably overload functions, since it uses C++ class types, designed for greater type safety and strictness than C's "plain old data" types.

Let's briefly review structs, in case you need a review: in C, structs are in their own namespace, but they can be typedef'ed into the common namespace:

struct s_tag { int m1, float m2 };

You can instantiate a struct in C like this:

struct s_tag s1;

But you can't say:

s_tag s1;

Unless you've used the typedef trick:

typedef struct s_tag s_type;

s_type s2;

These are used together so frequently that the C syntax supports wrapping the type definition and typedef up together:

typedef struct s_tag { int m1; float m2 } s_type;

So what is the point of that s_tag? Well, it is mostly vestigial, but it has one important use; it is the only way you can declare a struct that is self-referential. This is imperative when declaring a tree or list of struct types, if you are going to take advantage of type-checking, instead of mashing everything through void pointers and unsafe casting to get it back to the type you need. So,

typdef struct s_tag { int m1; float m2; s_tag *next_p } s_type;

and not:

typedef struct s_tag { int m1; float m2; s_type *next_p } s_type;

because the latter would require a forward reference to an incomplete type, and there's no way to specify that. Of course, it doesn't seem that unreasonable these days to ask a compiler to keep track of incomplete things as it goes and make sense of them in a later pass, but back in the day machines were smaller and slower, and so we have that vestigial struct tag.

And in C++, designed to maintain as much compatibility with C as possible, there are (of course) some subtle complications that can come into play, but that's basically how structs work.

In case you're curious, yes, there's a similar "enum tag" for enums, and you can refer to enum types in the enum namespace. I haven't see this sort of thing much in real-world code. There's yet another namespace for unions, and unions are structs as well in C++ -- we'll take advantage of that later. But moving on, let's define some type tags:

typedef struct {} step_1_tt;
typedef struct {} step_2_tt;
typedef struct {} step_3_tt;

And here are some "type tag" instances:

static steps::step_1_tt step_1_tt_i;
static steps::step_2_tt step_2_tt_i;
static steps::step_3_tt step_3_tt_i;

The empty struct isn't standard C (although I think GCC allows it as an extension), but it is allowed in C++. In standard C there is a similar trick where you can declare a pointers to incomplete (that is, forward-declared) struct types (does that let you use self-reference in structs without using the vestigial tag? I'll have to look into that. There's a lot to know for such a seemingly simple language feature!)

Now, we need to be able to distinguish our tiny scripting language "tokens," and for that we use an enumeration. (We could do this with subtypes, as well, and maybe I'll consider that for later, but for now we'll do it this way). We'll define several types that correspond to functions:

typedef enum

} step_e;

Next up, we want to define the functions. This is pretty basic. We could do it with function pointers, of specific function pointer types, and I might explore that later, but for now we'll refer to a hard-coded set of actual functions. Here they are, simple placeholders that print their arguments, for testing:

void step_func_a( signed long );
void step_func_b( unsigned long );
void step_func_c( signed short );

void step_func_a( signed long p1 )
    std::cout << "step func a with param" << p1 << "\n";

void step_func_b( unsigned long p1 )
    std::cout << "step func b with param" << p1 << "\n";

void step_func_c( signed short p1 )
    std::cout << "step func c with param" << p1 << "\n";

So those are the functions we can call from our "scripts." But what about the parameters? We'll use unions. This is unsafe, in the sense that the type system will not keep us safe, and we'll have to use some extra bookkeeping have to make sure we decode parameters the same way they were encoded. Because a union in C++ is also a class, we can provide constructors for our union type, and even overload them, to handle the different types that might be passed in:

union step_param_u
    step_param_slong    slong_param;
    step_param_ulong    ulong_param;
    step_param_sshort   sshort_param;
    step_param_ushort   ushort_param;
    step_param_schar    schar_param;
    step_param_uchar    uchar_param;
    // We could add more here, for any parameter types we need
    // to store and reconstitute later

    step_param_u()                   : slong_param( 0 )   {};
    step_param_u( signed long   p1 ) : slong_param ( p1 ) {};
    step_param_u( unsigned long p1 ) : ulong_param ( p1 ) {};
    step_param_u( signed short  p1 ) : sshort_param( p1 ) {};
    step_param_u( signed char   p1 ) : schar_param( p1 )  {};


One of those union objects will represent one parameter. Let's wrap that all up with a struct that will embody a "frozen function call" with up to 3 parameters. We could extend that if we need more.

There's a sort of type-system hole in that we don't have a safe, distinguishable way to represent a null parameter -- that is, one that wasn't provided -- and so there's not quite a sensible default constructor for step_param_u. We could use subclasses of our "frozen function call" to represent function calls with no parameters, one parameter, two parameters, three parameters, etc. so that we don't have to represent the idea of an unused parameter at all, but this results in an explosion of types (and I'm not even really taking return values into account at all, at this point; our scripting language uses a completely imperative paradigm). Maybe the number of unique parameter and return value combinations is not that large in practice, but for now let's just get a simple thing working:

struct step_s
    step_e step_type;
    step_param_u param_1;
    step_param_u param_2;
    step_param_u param_3;

    step_s( step_1_tt, signed long p1 ) :
        step_type( step_type_1 ),
        param_1( p1 )
    step_s( step_2_tt, signed long p1 ) :
        step_type( step_type_2 ),
        param_1( p1 )
    step_s( step_3_tt, signed short p1 ) :
        step_type( step_type_3 ),
        param_1( p1 )


And now for a trick to create our so-called. We turn things that look like function calls into actual constructor calls by adding a parameter. For this quick-and-dirty prototype, I'll use some macros:

#define MAKE_SEQUENCE(p1) step_s const p1[]
#define step_call_type_1(p1) step_s(step_1_tt_i,p1)
#define step_call_type_2(p1) step_s(step_2_tt_i,p1)
#define step_call_type_3(p1) step_s(step_3_tt_i,p1)

Is it raw and wriggling, precious? Make sure you understand what's going on here: these macros turn function-call-like syntax into constructor calls by passing our type tokens to select a matching overloaded constructor. Of course, the compiler is looking at the preprocessed code, which can make error messages harder to understand. I sometimes have to ask the compiler to generate preprocessor output to diagnose a problem, but a beginning Arduino programmer using the IDE won't even have a way to generate the preprocessor output.

And so here is an example of our tiny "scriping language" -- a sequence of steps to take, in reality function calls to make:

sequence(sequence_1) =



And that's our syntax.

Wait, really? Yes. That's a little script. That's why I call it the "most minimal" scripting language.

That actually turns into an array definition, with an implict (not explicitly declared) size, and an initialization list. Note that the ability to include constructor call expressions in an array initialization list is a newer feature of C++. Older compilers won't be able to handle the syntax at all. To support an older compiler we'd have to do some kind of post-definition assignment, which might require something like calling a variadic function with a list of macro invocations, and that could get really ugly really fast. Er, that is, even uglier than what we're already doing.

So how do we execute that script? Well, we need another bit of macro magic to find the length of a sequence:

// Length of a sequence; see Imperfect C++, the book by Matt Wilson, or his blog post:
// http://blog.imperfectcplusplus.com/2010/06/imperfect-cinventions-dimensionof.html
#define SEQUENCE_LENGTH(seq) (sizeof(seq)/sizeof(0[(seq)]))

And now we have all we need to write a function that runs a sequence. How about this, as the simplest thing that can work:

static void run_sequence( step_s const p1[], int const seq_len )
    for ( int seq_idx = 0; seq_idx < seq_len; seq_idx++ )


Where invoke is a method of step_s:

void invoke() const
    switch( step_type )
        case step_type_1:
            step_func_a( param_1.slong_param);
        case step_type_2:
            step_func_a( param_1.ulong_param);
        case step_type_3:
            step_func_a( param_1.sshort_param);


Well, actually, we need this, too, so a particular instance can be sized at compile time:

#define RUN_SEQUENCE(p1) run_sequence(p1,SEQUENCE_LENGTH(p1))

And then I can just execute RUN_SEQUENCE(sequence_1).

Note that invoke assumes that the entire set of functions we might want to invoke from our script is known. This could be extended in some fashion -- breaking it down into a system library and a user library, allowing the user to add macro definitions to create new usable "statements" to use in our scripts. This would involve a more generic way to represent steps, using function signature-specific subclasses, or a predefined set of supported function pointer types. I think that function pointers can be cast to function pointers of different types, and that might get us part of the way there, but I see an explosion happening somewhere, either in the switch statements in invoke or in subclasses. it doesn't seem easy to really capture everything we might need at compile time. A function invocation in C isn't something that supports much in the way of compile-time introspection -- there's no standard facility to look up its arity, its number of parameters, and their types; we aren't working with S-expressions here. Even gruesome hacks involving variable-length argument lists won't let us really work this out for a general case, as far as I know, and I don't think template metaprogramming can entirely remedy this, although there might be some neat tricks to minimize the code we have to write.

Some extensions to our "scripting language" are clearly possible: for example, our language could be extended to include labels, goto, counters, and even conditional branching, fairly easily -- it's just that all of those items would look like function calls and be macro invocations that boiled down to overloaded constructor calls to some single struct or class type, or different struct or class types that have a single base class so that we can treat them uniformly. Essentially, we're implementing a script without ever parsing the text of our scripting language, per se, and that's what keeps it tiny.

The run_sequence function would need a little more complexity. In fact it could run multiple sequences in multiple threads -- there exist solutions for cooperative multi-tasking on Arduino. To my taste the ones I've seen so far are all still too heavyweight. I'm even tempted to implement a preemptive solution, so that a thread that is executing a delay can't delay the other threads. This may be a topic of some ongoing research, aka "playing," as my time allows.

Next time, if I can work something out, I'll consider some template-based ideas and whether they buy us anything useful. As always, I welcome your comments.

No comments: