I’ll admit that, although I’ve always appreciated and respected Lisp-ish languages (ever since I read SICP) I’ve had a lot of problems getting past the parentheses and the dynamic typing enough to actually use them for my personal projects. I’ve always felt that static typing, especially with good IDE support, makes it so much easier to write programs. Hell, even C# is for many purposes an acceptable functional language thanks to Erik Meijer and the majesty of LINQ.

On of the big benefits of Lisp and its homoiconic goodness is the macro system. Code is data, data is code, you can write code that manipulates code etc etc. That’s all great and I’ve heard it a gajillion times from my Lisp weenie friends. I always figured that personally, in order to really get into Lisp, to be able to look past the parentheses and really embrace it, I would have to find a problem I couldn’t solve using my regular toolbox.

I think that finally happened!

So, a couple days ago I started working on an F# implementation of Push, the framework for genetic programming. Push consists of a stack-based language, and a system to evolve programs written in that language. The main program state consists of a bunch of different stacks:

type ProgramState = { Exec : Program stack; Integer : int stack; Float : float stack; Code : Program stack; Boolean : bool stack; }

As part of the implementation of the language, I need to write different primitive functions (“instructions”) for arithmetic, logic, and stack manipulation, among other things. By convention, many instructions are named in the form `TYPE.INST`

, and there are no type-generic functions. For example, `INTEGER.+`

and `FLOAT.+`

are two different instructions that add the first two numbers on the integer and float stacks, respectively. This makes sense, because float addition and integer addition are really two different animals.

However, some operations really *are* perfectly generic. For example, `FLOAT.DUP`

and `INTEGER.DUP`

both push a duplicate of the first item onto the float stack and integer stack, respectively. `EXEC.DUP`

, `BOOLEAN.DUP`

, etc. all work similarly. This is a generic operation – it doesn’t require any special information about floats, booleans, or whatever. It just requires knowledge of stacks.

So, you would expect to be able to write a function of the form `'a stack -> 'a stack`

, called `DUP`

, which accomplishes this, and so you can:

let DUP stk = match stk with | a::rest -> a::a::rest | _ -> stk

Here’s a few more definitions of this sort of function, to hammer the point home:

let POP stk = match stk with | a::rest -> a::a::rest | _ -> stk let SWAP stk = match stk with | a::b::rest -> b::a::rest | _ -> stk let ROT stk = match stk with | a::b::c::rest -> c::a::b::rest | _ -> stk

So, given these datatype-generic definitions of stack manipulation function, it would be nice to have a function that generates specialized implementations of each of these functions for each stack type, so they can actually be used as Push instruction. That is, given the function `DUP`

, give me implementations of `EXEC.DUP`

, `INTEGER.DUP`

, etc. Each of these functions is of type `ProgramState -> ProgramState`

. So, we want a function that takes an input function of type `'a stack -> 'a stack`

, and gives us a function `ProgramState -> ProgramState`

. Let’s give it a try:

let make_instructions (func : 'a stack -> 'a stack) : ((ProgramState -> ProgramState) list) = [ fun ps -> { ps with Integer = func ps.Integer }; fun ps -> { ps with Boolean = func ps.Boolean }; fun ps -> { ps with Exec = func ps.Exec }; fun ps -> { ps with Float = func ps.Float }; fun ps -> { ps with Code = func ps.Code }; fun ps -> { ps with Auxiliary = func ps.Auxiliary }; fun ps -> { ps with Tag = func ps.Tag }; fun ps -> { ps with Zip = func ps.Zip }; ]

This doesn’t work though. The type checker pauses at the fourth line and says, “hey, you just used `'a`

as an `int`

, now you’re using it as a `bool`

. That’s not allowed!” What gives? We specified that `func`

has to be a generic function `'a stack -> 'a stack`

, it should be able to work no matter what stack we give it. The problem lies with a subtle ambiguity in the type signature of `make_instructions`

. A function containing a type variable `'a`

is called parametric because *for any* type `'a`

, the function works. That is, the signature `('a stack -> 'a stack) -> (ProgramState -> ProgramState)`

is really shorthand for `forall 'a. ('a stack -> 'a stack) -> (ProgramState -> ProgramState)`

. Do you see what the problem is now?

That’s right! (hah) We want a function of the form `(forall 'a. 'a stack -> 'a stack) -> (ProgramState -> ProgramState)`

. That is, it’s not supposed to work for any `'a`

, it’s supposed to work for any parametric function `'a stack -> 'a stack`

. We’re moving the quantifier (the “`forall`

” part) to an inner term. This is called higher-rank or impredicative polymorphism, and is not expressible in F# – F# only permits quantifiers at the outermost level. There are extensions to Haskell and SML (and many other languages) that permit this sort of thing, but unfortunately I didn’t have one on hand.

This is where Typed Racket comes in. Typed Racket has a quite expressive static type system, which supports the same sort of parametric/generic polymorphism as F#, but it has the ultimate Lisp trick up its sleeve: macros. Using macros I was able to accomplish the same sort of abstraction without needing impredicative polymorphism. The following macro allows me to construct a whole bunch of specialized instruction definitions by providing only one generic function:

(define-syntax define-instructions (syntax-rules () ((_ instruction definition) (set! instructions (append (list (Instruction-Definition 'Exec (symbol->string 'instruction) (? (ps) (struct-copy ProgramState ps [Exec (definition (ProgramState-Exec ps))]))) (Instruction-Definition 'Integer (symbol->string 'instruction) (? (ps) (struct-copy ProgramState ps [Integer (definition (ProgramState-Integer ps))]))) (Instruction-Definition 'Float (symbol->string 'instruction) (? (ps) (struct-copy ProgramState ps [Float (definition (ProgramState-Float ps))]))) (Instruction-Definition 'Code (symbol->string 'instruction) (? (ps) (struct-copy ProgramState ps [Code (definition (ProgramState-Code ps))]))) (Instruction-Definition 'Boolean (symbol->string 'instruction) (? (ps) (struct-copy ProgramState ps [Boolean (definition (ProgramState-Boolean ps))]))) instructions))))) (define-instructions DUP (p?: (a) ([stack : (Stack a)]) (match stack [(cons hd tl) (cons hd (cons hd tl))] [_ stack]))) (define-instructions POP (p?: (a) ([stack : (Stack a)]) (match stack [(cons hd tl) tl] [_ stack]))) (define-instructions SWAP (p?: (a) ([stack : (Stack a)]) (match stack [(list-rest a b tl) (cons b (cons a tl))] [_ stack]))) (define-instructions ROT (p?: (a) ([stack : (Stack a)]) (match stack [(list-rest a b c tl) (cons c (cons a (cons b tl)))] [_ stack])))

This gets us exactly the functionality I wanted in the above F# example – I only have to write one parametric function and the concrete implementations are automatically added to my global instruction registry.

The reason this works is because each parametric function is copied to the site of each instruction definition. The type checker doesn’t fail to unify type variables because each definition is completely independent. This is similar to the naive implementation of let-polymorphism in functional languages by simply macro-expanding out let bindings – freshening type variables is accomplished by “freshening” entire expressions. I have never looked into how to implement impredicative polymorphism but this seems like it could be relevant.

Basically, the point of this blog post was just to point out some of the cool stuff you can do with the combination of static typing, especially more sophisticated type system features like parametric polymorphism, and macros. I’m very interested to see whether this simple example could be turned into a workable spec for implementing higher-rank polymorphism on top of Typed Racket, or if not what the limitations are.

Note: after writing this post I did some research and came across this paper by Matthias Felleisen. I haven’t had a chance to read it yet but I wonder whether his notion of “macro expressibility” might be similar to this idea, that it is possible to construct more sophisticated type system features using simple building blocks (like normal parametric polymorphism) and macros.