More fun with higher rank types

August 16, 2011

Quick correction to the last blog post: I actually didn’t need to use macros to get the effect of higher-rank polymorphism – it turns out that Typed Racket supports it out of the box!

In the interests of accuracy, here’s an example of using higher rank types in TR to make a function that accomplishes the sort of thing we discussed last post (use a polymorphic “template function” on lists as a basis to make instances for several different concrete types of list):

#lang typed/racket

(struct: ProgramState
  ([Boolean : (Listof Boolean)]
   [Integer : (Listof Integer)]
   [Float : (Listof Float)]))

(: generate-instructions ((All (a) ((Listof a) -> (Listof a)))
                          -> (Listof (ProgramState -> ProgramState))))

(define (generate-instructions template-func)
  (list
   (λ: ([ps : ProgramState])
     (struct-copy ProgramState ps
                  [Boolean (template-func (ProgramState-Boolean ps))]))
   (λ: ([ps : ProgramState])
     (struct-copy ProgramState ps
                  [Integer (template-func (ProgramState-Integer ps))]))
   (λ: ([ps : ProgramState])
     (struct-copy ProgramState ps
                  [Float (template-func (ProgramState-Float ps))]))))

Now there’s no type-checking error, because the quantifier (“All”) is part of the type of template-func, not of generate-instructions as a whole. Let’s try it out with a template-func that, say, duplicates the top element of a list, and make sure it does what we think it does:

(: duplicate-top (All (a) ((Listof a) -> (Listof a))))
(define (duplicate-top xs)
  (match xs
    [(cons hd tl) `(,hd ,hd . ,tl)]
    ['() '()]))

(define start-state (ProgramState '(#f) '(12) '(3.14)))

(match-define `(,dup-bools ,dup-ints ,dup-floats) 
  (generate-instructions duplicate-top))

(define end-state (dup-bools (dup-ints (dup-floats start-state))))

(ProgramState-Boolean end-state)
(ProgramState-Integer end-state)
(ProgramState-Float end-state)

Running this produces:

Welcome to DrRacket, version 5.1.3 [3m].
Language: typed/racket [custom].
'(#f #f)
'(12 12)
'(3.14 3.14)

Just as expected. Cool!

Impredicative polymorphism? Nah, macros!

August 10, 2011

I’ll admit that, although I’ve always appreciated and respected Lisp-ish languages (ever since I read SICP), I’ve had a lot of problems getting past the parentheses and the dynamic typing enough to actually use them for my personal projects. I’ve always felt that static typing, especially with good IDE support, makes it so much easier to write programs. Hell, even C# is for many purposes an acceptable functional language thanks to Erik Meijer and the majesty of LINQ.

One of the big benefits of Lisp and its homoiconic goodness is the macro system. Code is data, data is code, you can write code that manipulates code, etc. That’s all great, and I’ve heard it a gajillion times from my Lisp weenie friends. I always figured that personally, in order to really get into Lisp – to be able to look past the parentheses and really embrace it – I would have to find a problem I couldn’t solve using my regular toolbox.

I think that finally happened!

So, a couple days ago I started working on an F# implementation of Push, the framework for genetic programming. Push consists of a stack-based language, and a system to evolve programs written in that language. The main program state consists of a bunch of different stacks:

type ProgramState =
    { Exec : Program stack;
      Integer : int stack;
      Float : float stack;
      Code : Program stack;
      Boolean : bool stack; }

As part of the implementation of the language, I need to write different primitive functions (“instructions”) for arithmetic, logic, and stack manipulation, among other things. By convention, many instructions are named in the form TYPE.INST, and there are no type-generic functions. For example, INTEGER.+ and FLOAT.+ are two different instructions that add the first two numbers on the integer and float stacks, respectively. This makes sense, because float addition and integer addition are really two different animals.

However, some operations really are perfectly generic. For example, FLOAT.DUP and INTEGER.DUP both push a duplicate of the first item onto the float stack and integer stack, respectively. EXEC.DUP, BOOLEAN.DUP, etc. all work similarly. This is a generic operation – it doesn’t require any special information about floats, booleans, or whatever. It just requires knowledge of stacks.

So, you would expect to be able to write a function of the form 'a stack -> 'a stack, called DUP, which accomplishes this, and so you can:

let DUP stk = match stk with | a::rest -> a::a::rest | _ -> stk

Here are a few more definitions of this sort of function, to hammer the point home:

let POP stk = match stk with | a::rest -> rest | _ -> stk
let SWAP stk = match stk with | a::b::rest -> b::a::rest | _ -> stk
let ROT stk = match stk with | a::b::c::rest -> c::a::b::rest | _ -> stk

So, given these datatype-generic definitions of stack manipulation functions, it would be nice to have a function that generates specialized implementations of each of them for each stack type, so they can actually be used as Push instructions. That is, given the function DUP, give me implementations of EXEC.DUP, INTEGER.DUP, etc. Each of these functions is of type ProgramState -> ProgramState. So, we want a function that takes an input function of type 'a stack -> 'a stack, and gives us a list of functions of type ProgramState -> ProgramState. Let’s give it a try:

let make_instructions (func : 'a stack -> 'a stack)
    : ((ProgramState -> ProgramState) list) =
        [ fun ps -> { ps with Integer = func ps.Integer };
          fun ps -> { ps with Boolean = func ps.Boolean };
          fun ps -> { ps with Exec = func ps.Exec };
          fun ps -> { ps with Float = func ps.Float };
          fun ps -> { ps with Code = func ps.Code } ]
This doesn’t work though. The type checker pauses at the fourth line and says, “hey, you just used 'a as an int, now you’re using it as a bool. That’s not allowed!” What gives? We specified that func has to be a generic function 'a stack -> 'a stack, it should be able to work no matter what stack we give it. The problem lies with a subtle ambiguity in the type signature of make_instructions. A function containing a type variable 'a is called parametric because for any type 'a, the function works. That is, the signature ('a stack -> 'a stack) -> (ProgramState -> ProgramState) is really shorthand for forall 'a. ('a stack -> 'a stack) -> (ProgramState -> ProgramState). Do you see what the problem is now?

That’s right! (hah) We want a function of the form (forall 'a. 'a stack -> 'a stack) -> (ProgramState -> ProgramState). That is, it’s not supposed to work for any 'a, it’s supposed to work for any parametric function 'a stack -> 'a stack. We’re moving the quantifier (the “forall” part) to an inner term. This is called higher-rank or impredicative polymorphism, and is not expressible in F# – F# only permits quantifiers at the outermost level. There are extensions to Haskell and SML (and many other languages) that permit this sort of thing, but unfortunately I didn’t have one on hand.

This is where Typed Racket comes in. Typed Racket has a quite expressive static type system, which supports the same sort of parametric/generic polymorphism as F#, but it has the ultimate Lisp trick up its sleeve: macros. Using macros I was able to accomplish the same sort of abstraction without needing impredicative polymorphism. The following macro allows me to construct a whole bunch of specialized instruction definitions by providing only one generic function:

(define-syntax define-instructions
  (syntax-rules ()
    ((_ instruction definition)
     (set! instructions
           (append instructions
                   (list (cons (string-append "EXEC." (symbol->string 'instruction))
                               (λ (ps) (struct-copy ProgramState ps
                                                    [Exec (definition (ProgramState-Exec ps))])))
                         (cons (string-append "INTEGER." (symbol->string 'instruction))
                               (λ (ps) (struct-copy ProgramState ps
                                                    [Integer (definition (ProgramState-Integer ps))])))
                         (cons (string-append "FLOAT." (symbol->string 'instruction))
                               (λ (ps) (struct-copy ProgramState ps
                                                    [Float (definition (ProgramState-Float ps))])))
                         (cons (string-append "CODE." (symbol->string 'instruction))
                               (λ (ps) (struct-copy ProgramState ps
                                                    [Code (definition (ProgramState-Code ps))])))
                         (cons (string-append "BOOLEAN." (symbol->string 'instruction))
                               (λ (ps) (struct-copy ProgramState ps
                                                    [Boolean (definition (ProgramState-Boolean ps))])))))))))

(define-instructions DUP
  (plambda: (a) ([stack : (Stack a)])
       (match stack
         [(cons hd tl) (cons hd (cons hd tl))]
         [_ stack])))

(define-instructions POP
  (plambda: (a) ([stack : (Stack a)])
       (match stack
         [(cons hd tl) tl]
         [_ stack])))

(define-instructions SWAP
  (plambda: (a) ([stack : (Stack a)])
       (match stack
         [(list-rest a b tl) (cons b (cons a tl))]
         [_ stack])))

(define-instructions ROT
  (plambda: (a) ([stack : (Stack a)])
       (match stack
         [(list-rest a b c tl) (cons c (cons a (cons b tl)))]
         [_ stack])))

This gets us exactly the functionality I wanted in the above F# example – I only have to write one parametric function and the concrete implementations are automatically added to my global instruction registry.

This works because each parametric function is copied to the site of each instruction definition. The type checker never has to unify type variables across stacks, because each expanded definition is completely independent. This is similar to the naive implementation of let-polymorphism in functional languages by simply macro-expanding out let bindings – freshening type variables is accomplished by “freshening” entire expressions. I have never looked into how to implement impredicative polymorphism, but this seems like it could be relevant.
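
To make the “freshening by copying” idea concrete, here’s a hypothetical miniature of the trick (names and details are mine, not from the Push implementation): a macro that copies one polymorphic template to each use site, so each copy is type-checked with its own fresh type variables.

```racket
#lang typed/racket

;; Each expansion of `dup` is an independent copy of the same
;; polymorphic template, so one use can be at Integer and another
;; at Boolean without any higher-rank types.
(define-syntax-rule (dup lst)
  ((plambda: (a) ([xs : (Listof a)])
     (if (null? xs) xs (cons (car xs) xs)))
   lst))

(dup '(1 2 3))  ; => '(1 1 2 3)
(dup '(#t #f)) ; => '(#t #t #f)
```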

Basically, the point of this blog post was just to point out some of the cool stuff you can do with the combination of static typing, especially more sophisticated type system features like parametric polymorphism, and macros. I’m very interested to see whether this simple example could be turned into a workable spec for implementing higher-rank polymorphism on top of Typed Racket, or if not what the limitations are.

Note: after writing this post I did some research and came across this paper by Matthias Felleisen. I haven’t had a chance to read it yet but I wonder whether his notion of “macro expressibility” might be similar to this idea, that it is possible to construct more sophisticated type system features using simple building blocks (like normal parametric polymorphism) and macros.


August 1, 2011

I started playing with Typed Racket, which is basically an attempt to build a type system that can accommodate idiomatic Scheme programming without modification. This leads them to such novel features as untagged unions, non-uniformly typed varargs, and (pretty helpful when you have untagged unions) a type-refining version of instanceof called “occurrence typing”.

As more of an F# programmer than a Scheme programmer, the type system feature that most clearly appealed to me was the ability to abstract over variable numbers of function arguments. You see functions of the form foo2, foo3, foo4,… all the time, identical except for their number of arguments – a clear failure of the type system’s ability to abstract out boilerplate. Typed Racket then gives us an exciting opportunity to explore relatively uncharted territory, modes of abstraction that are not possible in mainstream functional programming languages. One possible use of this feature is to create some really clean foundations for a probabilistic modeling language. I’ve found that working with the data type of probability distributions really showcases the power and expressivity of functional programming, and it’s going to be fun to see if this type system can help us come up with an even cleaner formulation.

Here’s our definition of a probability distribution and a couple helper functions:

(define-type (Dist a) (-> a))
(: sample-dist (All (a) ((Dist a) -> a)))
(define (sample-dist d) (d))
(: always (All (a) (a -> (Dist a))))
(define (always val) (λ () val))

Here, a probability distribution over values of type a is represented as a non-referentially-transparent “sampling function” that takes no arguments and returns a sample value of type a. The usual way to make these into pure functions is to change the type from () -> a to [0, 1] -> a and feed them a random source, but the version above is even simpler and fine for our purposes.

(always some-value)

produces a probability distribution which is always some-value, that is, it returns some-value every time you sample it.

The type Dist is a functor (among many other things), so it has an operation, fmap:

(: dist-fmap (All (a b) ((a -> b) -> ((Dist a) -> (Dist b)))))
(define ((dist-fmap func) d)
  (λ () (func (sample-dist d))))
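
As a quick sanity check – this is hypothetical usage, building on the always and sample-dist definitions above:

```racket
;; lift the one-argument add1 into the world of distributions
(define bumped ((dist-fmap add1) (always 5)))
(sample-dist bumped) ; => 6
```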

fmap is great for probability modeling – it lets you “lift” operations up into the world of distributions. You can take a function of normal values and turn it into a function over random values. You can see the analogy to lists and map: map takes a normal function and turns it into a function from lists to lists. But our dist-fmap only works for functions of one argument. Boo. So now we can’t even do things like add two random numbers and get the resulting distribution. We could go ahead and define a bunch more functions, fmap2, fmap3, etc., that let us lift functions of varying numbers of arguments into the world of distributions. But Typed Racket has a trick up its sleeve:

(: dist-fmap* (All (c a b ...) ((a b ... b -> c) ->
                                ((Dist a) (Dist b) ... b -> (Dist c)))))
(define (dist-fmap* f)
  (λ (da . dbs) (λ () (apply f (sample-dist da) (map sample-dist dbs)))))
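
For instance (again, hypothetical usage of the definitions above), lifting the two-argument + instantiates the dotted variable b ... with one element, and a three-argument function instantiates it with two:

```racket
;; two arguments: b ... instantiated with one type
(define sum-dist ((dist-fmap* +) (always 1) (always 2)))
(sample-dist sum-dist)  ; => 3

;; three arguments: b ... instantiated with two types
(define max-dist ((dist-fmap* max) (always 1) (always 5) (always 3)))
(sample-dist max-dist)  ; => 5
```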

This definition uses a “dotted type” to represent the type of all fmap-functions for distributions. The type variable b ... b represents a list of type variables, each of which can be substituted with any type, forming a truly general definition of fmap. Of course, this means Dist isn’t just a functor any more – it has a bit more structure, which is a subject for a blog post in and of itself. I like to give this function a shorthand:

(define ^ dist-fmap*)

To me, this conveys the “lifting” idea of the operator. With the power of the magic ^, we can now take any function and turn it into a function on distributions. But wait – we don’t want everything to be a distribution; sometimes we need some constant values. While we think of ^ as allowing us to lift functions into the land of probability, we can think of always as the ability to lift constant values. So, one more shorthand:

(define p always)

and now we can easily lift functions and values (I know, functions are values, but you know). Well, now that we have these cool tools, let’s see what they can do. Let’s start with the uniform random variable on the interval [0,1]:

(: uniform (Dist Real))
(define uniform random)
> (uniform)
> (uniform)

Let’s kick it up a notch:

> (((^ min) uniform uniform uniform uniform uniform))
> (((^ min) uniform uniform uniform uniform uniform))
> (((^ min) uniform uniform uniform uniform uniform))

Cool! That’s called the 5th order-statistic. The nth order-statistic is the distribution of the smallest element in n samples of a uniform random variable. And it’s written in about the cleanest way you could imagine. But repetition is the mother of crappy code, and having to repeat “uniform” all those times is lame. That gives me an idea:

(: lift-list (All (a) ((Listof (Dist a)) -> (Dist (Listof a)))))
(define (lift-list ds)
  (foldl (^ cons) (p '()) ds))

(: sample-population (All (a) ((Dist a) Natural -> (Dist (Listof a)))))
(define (sample-population d sample-size)
  (lift-list (repeat d sample-size))) 
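
One note: repeat isn’t a standard Racket function, so I’m assuming here that it means “a list of n copies of x” – which is exactly what the built-in make-list provides. A minimal definition:

```racket
;; assumed helper: n copies of x, in a list
(: repeat (All (a) (a Natural -> (Listof a))))
(define (repeat x n) (make-list n x))
```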

(: nth-order-statistic (Natural -> (Dist Real)))
(define (nth-order-statistic n)
  (: min-list ((Listof Real) -> Real))
  (define (min-list xs) (apply min xs))
  ((^ min-list) (sample-population uniform n)))

We could have solved this problem a bunch of ways, but lift-list and sample-population are useful for more than just this one example. They allow us to create a derived distribution of sample sets that we can then fold in interesting ways. min is one of those folds, and the combination of sample-population and lifted min gives us the order statistics. We can even calculate sample mean and standard deviation functions, parametrized by sample size (bodies of statistical functions mean and sigma omitted for brevity):

> (((^ mean) (sample-population uniform 1000)))
> (((^ mean) (sample-population uniform 1000)))
> (((^ sigma) (sample-population uniform 1000)))
> (((^ sigma) (sample-population uniform 1000)))

Seems pretty accurate (and totally meta – distributions of statistics!).

Anyhow, we’ve established that we can do some pretty cool numerical modeling with this functional language, I think. So far though, this is pretty tame. With the ^ operator I’m making fun use of TR’s typed variadic arguments, but I want to get a bit more adventurous and a bit more Rackety. I really enjoyed the Pictures tutorial, so I wanted to do something in that vein. Unfortunately, slideshow, the library used, is not yet ported to Typed Racket. I would love to port it myself, but I was having trouble digging around and figuring out everything I had to wrap up, and dammit, I want to see some pictures, so let’s take a shortcut. Luckily we have the magic typed/racket/no-check switch, which is going to let us take this existing typeful infrastructure we have created, and link in some dynamically typed code. This should be a great existence proof of the fact that programming in a dynamic language does not mean programming without types. So, we put the following lines at the top of the page:

#lang typed/racket/no-check
(require slideshow)

And we’re off to the races. The types of all the following expressions we’re going to write are (Dist Pict), where Pict is a picture data type. That’s right, we’re dealing in picture-valued random distributions now. Once again for the record, although we’re not officially type-checking, we know exactly what types we’re dealing with and are writing perfectly type-safe code. Here’s a circle of radius 0-50, evenly distributed:

> (((^ circle) ((^ *) (p 50) uniform)))

> (((^ circle) ((^ *) (p 50) uniform)))

> (((^ circle) ((^ *) (p 50) uniform)))

> (((^ circle) ((^ *) (p 50) uniform)))

> (((^ circle) ((^ *) (p 50) uniform)))

Now maybe I’m just sheltered, but this totally rocks my world. Here we’ve created this beautiful little mini-language for creating probabilistic data, and we were in some ways guided there by the types. The design feels really clean and sort of inevitable (modulo minor quibbles like restoring referential transparency, etc.) – I mean, to me this feels about as good a way of drawing random pictures as I could have imagined.

Anyhow, let’s draw some more pictures. You get some pretty cool patterns by overlaying bunches of these circles:

> ((apply (^ cc-superimpose)
          (repeat ((^ circle) ((^ *) (p 100) uniform)) 10)))

> ((apply (^ cc-superimpose)
          (repeat ((^ circle) ((^ *) (p 100) uniform)) 10)))

> ((apply (^ cc-superimpose)
          (repeat ((^ circle) ((^ *) (p 100) uniform)) 10)))

Neato. I have to say, I think this gives me an opportunity to combine functionality from different libraries in ways the authors truly hadn’t anticipated. Let’s try those drawings again, but with the Gaussian distribution imported from the “random” library on PLaneT:

(require (planet schematics/random:1:0/random))
> ((apply (^ cc-superimpose)
          (repeat ((^ circle) ((^ *) (p 60) ((^ abs) random-gaussian))) 20)))

> ((apply (^ cc-superimpose)
          (repeat ((^ circle) ((^ *) (p 60) ((^ abs) random-gaussian))) 20)))

> ((apply (^ cc-superimpose)
          (repeat ((^ circle) ((^ *) (p 60) ((^ abs) random-gaussian))) 20)))

We’ve got to make sure we take the absolute value (the lifted absolute value, (^ abs), naturally), as negative circle radii are a big turn-off for drawing libraries. So, it’s easy to see that with the Gaussian, the radii are much more concentrated around the mean.

Anyhow, I’m going to stop for now with the vaguely suggestive drawings and take a breather. Look out for another post getting into the fmap*/^ operator some more and talking about why I think it’s cool. Also, more pictures. This stuff is pretty weak sauce compared to what could be done – the libraries are very powerful. I’m also quite interested in trying to draw some less geometric things. Imagine drawing a bunch of stick figures, eyes, mouths, and limbs at constrained random angles, and seeing what kind of scenes emerge. Stochastic cartooning, here we come! I’m sure I’m not the first to think of this sort of thing – does anyone know of similar projects?