Variable-arity polymorphism from scratch in Scala, Part II: Type-level computation, HList, KList

October 23, 2011

(Disclaimer, once again: much of this code is shamelessly stolen from the truly mind-blowing series of blog articles over at Apocalisp. Also, the main thrust of this blog post requires very little in the way of type-level computation – go figure.)

So, in the last installment we talked about how type constructors are functions, and higher-kinded types are higher-order functions, and then I said a whole bunch of other stuff, too. The important part was that type constructors are functions. The corollary of that  is that type-checking involves evaluation, and thus computation, on the level of types.

In fact, it turns out that adding higher-kinded types to even the most basic type system requires embedding the entire simply-typed lambda calculus into the type system. The simply-typed LC is a mathematical language for computation and while it’s not Turing-complete (thankfully, otherwise the type checker might not terminate), it’s remarkably expressive. As an aside, I thought I heard that due to nasty interactions with subtyping, recursive types, and all-around bad-assery, the core Scala type checking algorithm is in fact not guaranteed to terminate (they of course practically get around this in prodction by adding a fixed limit to recursion depth or some such mechanism).

Anyhow, cool, we have the ability to perform computations in the type system (as a matter of fact, we already had that due to bounded quantification and recursive types, but I don’t understand exactly how much power that buys you (gotta be at least Peano arithmetic) and how it compares). To that end, check out this cool thingie we can build without using any higher kinds:

  sealed trait HList { }

  final case class HCons[H, T <: HList](head : H, tail : T) extends HList

  case class HNil extends HList

  object HNil extends HNil

  val hl = HCons(1, HCons("asdq", HCons(3, HNil)))
  val a : Int = hl.head
  val b : String = hl.tail.head
  val c : Int = hl.tail.tail.head

An HList is a type and data structure for heterogeneous lists. It’s sort of like an extensible tuple – it permits any length but keeps all the type information of each node. You can see it basically does its consing at the type level as well as the term level, increasing the number of *‘s in the kind as it builds the list. This is an example of the tremendous power of bounded quantification, it’s all done without appealing to any “magic” Scala features. It uses case classes for simplicity, but you can actually build these in Java – it’s just completely insane because you have to write out the types by hand. (Note: I did a little checking and it turns out the secret ingredient here, allowing a type constructor to appear in the bound of its own variable, has a special name – F-bounded polymorphism)

We can even add a neat little infix type shorthand for both the type and the constructor HCons:

  final case class HCons[H, T <: HList](head : H, tail : T) extends HList {
    def :+:[T](v : T) = HCons(v, this)

  case class HNil extends HList {
    def :+:[T](v : T) = HCons(v, this)

  type :+:[H, T <: HList] = HCons[H, T]
  val :+: = HCons

  val hl = 1 :+: "asdq" :+: 3 :+: HNil

So, what is the point of this? Well, HLists are a key ingredient in the variable-arity polymorphism which is the namesake of this blog post. That’s right, there is an actual point to this.

Well after HLists, there’s something called a KList. The motivation is thus – HLists are well and good, but they don’t let me exploit any shared properties between elements of the list. For example, maybe the elements all have different element types, but they’re all Lists. This comes up a lot – for example, in Typed Racket, the creator’s decision to include a mechanism for variable-arity polymorphism was mainly to support functions such as map and fold over variable numbers of lists. These functions are often called map2, map3, map4, etc. or zipWith, and the key element is application of some single-argument type constructor (that is, a constructor of kind * -> *) to a list of types. So, the KList consists of a pair: a type constructor of kind * -> *, and an HList. You could thus store a variable-length, heterogeneous list of List‘s and have a suitable input type for your variable-arity map function.

  sealed trait KList[+M[_], HL <: HList]

  final case class KCons[H, T <: HList, +M[_]](head: M[H], tail: KList[M, T])
    extends KList[M, H :+: T] {
    def :^: [N[X] >: M[X], G](g: N[G]) = KCons(g, this)

  sealed case class KNil extends KList[Nothing, HNil] {
    def :^: [M[_], H](h: M[H]) = KCons(h, this)

  object KNil extends KNil

  val :^: = KCons

  val (m : KList[List, (Int :+: String :+: HNil)]) =
    List(1, 2, 3, 4) :^: List("str1", "str2") :^: KNil

You can see from the type of m (annotation provided for checking purposes, it is optional and inferred by the compiler), that the KList type is doing its job – the List type is separate from the element types. Cool!

Oh also, there’s probably a bunch of crazy stuff that requires explaining up in that there definition there. The +M[_] means its covariant in the type constructor argument. The N[X] >: M[X] means it’s allowed to get less specific in the type constructor argument as it builds up the list (this makes sense). Also note that the way HCons (a.k.a. :+:) is used in the KCons type, and HNil in the KNil type, means that it’s impossible to build a KList with a different number of elements than its associated HList. Very cool.

Ok, so now we have the KList, what do we do with it? Let’s try making that variable-arity map from earlier. We know it’ll take a KList with constructor type List, and to operate on the HList of argument types it will need to take a function that operates on HLists. The type should look something like:

  def mapN[HL <: HList, T](kl : KList[List, HL])(f : HL => T) : List[T]

So now we can actually express the type of a variable-arity, variable-type, polymorphic map function like in Typed Racket – but we built it ourselves! In fact, TR’s dotted type variables seem very much like KLists – they abstract part of their type-definition and apply it to a variable-length list of type arguments. Look at the type-signature for TR’s map:

  (: map
     (All (C A B ...)
          ((A B ... B -> C) (Listof A) (Listof B) ... B
           (Listof C))))

Looking at it through what we know about KLists, it’s like TR turns the “dotted type” into a single-argument type constructor ((Listof B) parametrized by B, which gives the Listof constructor), and create a KList using that and the type list (B). This is exactly what we’re doing with our mapN definition. Of course TR’s version is limited in power and first-class in the type system, which has the advantage of nicer syntax, and not requiring full higher-kinded types to implement, which results in much less type system complexity. On the other hand… we built it ourselves! Wow!

We have left one thing out up until now, which is the actual implementation of the mapN function. I guess technically it counts as cheating unless we do.

The main challenge is to find a way to get the values out of the lists. By that, I mean – fundamentally, the KList is built from values of type List[T] – we’re attempting to apply a function that requires individual elements. It is not, in general, possible to construct a function C[T] => T for every type constructor C. If it was, things like the Haskell IO monad wouldn’t work – effects would be able to escape into the rest of the code. There’s a cool concept underlying this, called natural transformations, (covered in the Apocalisp articles), but I’m going to save that for next time. Right now let’s just get the values out of the Lists the old fashioned way – pattern matching:

def heads[HL <: HList] : (KList[List,  HL] => HL) = {
  case KCons((hd :: _), tail) => HCons(hd, heads(tail))
  case KNil => HNil

def tails[HL <: HList] : (KList[List,  HL] => KList[List,  HL]) = {
  case KCons((_ :: tl), tail) => KCons(tl, tails(tail))
  case KNil => KNil

def toList : (KList[List, _] => List[List[Any]]) = {
  case KCons(hd, tl) => hd :: toList(tl)
  case KNil => List.empty

def anyEmpty(kl : KList[List, _]) : Boolean = toList(kl).exists(_.isEmpty)

def mapN[HL <: HList, T](kl : KList[List, HL])(f : HL => T) : List[T] =
       f(heads(kl)) :: mapN(tails(kl))(f)

Keep in mind that those match cases are not exhaustive because I know we’re pre-checking the KList for any empty components. We write it in curried style so that Scala can infer the type of later arguments from earlier ones. When tested, our mapN does what we expect it to do:

val m = List(1, 2, 3, 4) :^: List("str1", "str2") :^: KNil
println(mapN(m){ case (intVal :+: strVal :+: _) => intVal.toString() + strVal })

It prints “List(1str1, 2str2)”.

So, we’ve succeeded in implementing the nasty mapN function and proving that we can construct variable-arity polymorphism in Scala. With a little syntactic sugar (why can’t Scala have macros!?) this would be a pretty reasonable foundation for a language feature.

What we’ll show next time is that there’s a broader concept hidden under all the digging around for heads and tails that we were doing in the implementation of mapN – It doesn’t have to be this ugly. It can be easy, and, well, natural.

(In progress. Come back later.)


Variable-arity polymorphism from scratch in Scala, Part I: What are higher kinded types?

October 23, 2011

Disclaimer: this post is based on an amazing series of blog posts on type-level programming over at Apocalisp. I don’t know any Scala (I wrote my first Scala code yesterday, while reading the aforementioned blog posts), and I certainly wouldn’t have been able to put together some of the type-machinery in the later posts on my own. So be warned that I… uh… don’t necessarily know what some of these keywords and such precisely mean. Per se.

Act I:

So, I thought I knew a decent bit about type systems – I’ve read most of TAPL/ATTAPL, programmed quite a bit in ML-based languages, done a bit of Haskell, I use words like “monad” and “functor” in polite conversation, etc. Typed Racket appeals to me because it brings powerful and novel type system features to the table that are like nothing I’ve seen in those more traditional functional languages. They allow static checking of the idioms of Scheme programming – dynamic-flavored features like variable-arity functions, path-dependent type refinements, and untagged unions. It’s mind-expanding.

So, anyhow – Scala is insane. I have only begun to scratch the surface of its type system, and already I’m learning more about types than I have since I first picked up TAPL. It continues in the vein of Typed Racket by showing me just how little I know about type systems, and I love it.

Act II, or What am I actually talking about, anyhow?:

Scala is an object-oriented language first and foremost. That means subtyping (and its big brother, bounded quantification) plays a big part in the type system – something that can’t be said about ML or Haskell (type classes notwithstanding). So that’s one thing that sets it apart, but it’s not really the big one – after all, I program C# at my day job, which thanks to LINQ, is basically a functional-OO language with subtyping and bounded quantification. The big feature in Scala is higher-kinded types (A note: Haskell has higher-kinded types as well, but no subtyping.) You’ll notice I didn’t make “higher-kinded types” a hyperlink because I don’t really know a good short explanation of it, so I’ll attempt to provide one.

Act III, or OK, I’m getting to it, for seriously:

Regular ol’ Java or C# style generics are often called, in functional languages, “type constructors”. This is because they can be thought of as functions that take an input type, and return (or “construct”) a result type. If you have a type like List<T>, you can plug in a type like Integer, and get a type in return – in this case, List<Integer>.

Well, if generics are like functions, then they need to have “types” of their own – for example, to make sure that you don’t give them the wrong number of arguments. For example, what if you tried to apply the type constructor Pair<A, B> to the input types <Integer, Integer, String>? Obviously it wouldn’t work – there aren’t enough slots. The way we express this requirement is through something called a “kind” – kinds are the types… of types.

By convention, kinds of type constructors are written using the same -> symbol used for function types (as in, A -> B is the type of a function from A to B), but using *‘s instead of letters to represent the types. For example, the kind of Integer is *, since it takes no arguments – it’s a constant type, not a function from types to types. The kind of List<T> is * -> * – it takes a type and returns a type. The kind of Pair<A, B> is * -> * -> * (you could also write it as (*, *) -> * if you don’t like currying).

Continuing the analogy between type constructors and functions, any red-blooded functional programmer knows that some of our best function friends are “higher-order functions”. They have types like (A -> B) -> List<A> -> List<B>, (A -> Boolean) -> List<A> -> List<A>, and (A -> B -> B) -> B -> List<A> -> B (Extra credit: what functions are those? Answer in white: map, filter, fold).

These higher-order functions are functions that take functions as arguments, and/or return functions as results. Well, the analogy to type constructors would be type constructors that take and/or return other type constructors. These would have kinds that look like, for example, (* -> *) -> *, or to use the C#/Java notation, A<B<C>>. This is not possible in Java/C#, ML, Typed Racket – but it is possible in Scala.

Act IV, or … so what?:

Ok, phew, now we know what higher kinded types are. We are still conspicuously missing a single example though. A weaker person than myself might jump to monads, but frankly those are about as intuitive as a course in macroeconomics. Let’s do something a bit more reasonable and (hopefully) shorter.

map has got to be one of the most kickass higher-order functions. It’s simple and easy to understand, replaces a lot of the need for for-loops, and it has one of the best characteristics of a properly-chosen abstraction: it works well as a verb and a noun. By that I mean, once you learn about the map function, it becomes natural to start thinking about “maps”, and “mapping”. Before long you start talking about algorithms in terms of “map this function over it, filter it by this predicate, then fold up the results and bam!” Nice, composable abstractions.

Anyhow, the great thing about map (defined for this purpose as the version with type (A -> B) -> List<A> -> List<B>), is that it’s even more general than it first seems By that, I mean that not only lists can be mapped over. You could map a function over a tree by applying the function at each node. You could map over a maybe/option/nullable type by applying the function if there’s a value present, and doing nothing if there isn’t.  You can map over a lot more stuff than that, but let’s stick with those for now. Here’s what the type signatures of those different maps would look like:

list-map : (A -> B) -> List<A> -> List<B>
tree-map : (A -> B) -> Tree<A> -> Tree<B>
option-map : (A -> B) -> Option<A> -> Option<B>

Clearly there’s a pattern – these functions all have type (A -> B) -> K<A> -> K<B> where K is some type constructor of kind * -> *. As good programmers, we follow the Don’t Repeat Yourself (DRY) principal, so we look for commonality in code and try to abstract over it and pull it out into one place. Not only does this aid maintenance by keeping changes and bugs localized, it guides us towards a deeper understanding of the structure of our problems (the problems of software design… an understanding of our personal problems is beyond the scope of this article.)

Coming at this from an OO perspective, it’s tempting to try the following solution: what it looks like is that Tree, List, and Option are all implementing an interface – we’ll call it Mappable<T>. In pseudo-C#:

interface Mappable<A> {
    Mappable<B> Map<B>(Func<A, B> fun);

You may be noticing that our definition doesn’t use any higher-kinded types at all! So what’s the problem? Just have Tree, Option, and List implement our Mappable interface, and we can write generic mapping code with no trouble. Or can we? The trouble is, the return type of our Mappable.Map function, is Mappable. That means the only thing we can do with the result of this function is to map more functions over it – we can never get any data out of it. We lose the functions and abilities that are unique to each different type of Mappable<B> structure. That’s no good! Instead, we need the List version of Mappable.Map to return a List, the Tree version to return a Tree, etc.

Clearly, our solution using interfaces is unsatisfactory – abstractness of the inputs means abstractness of the outputs. There’s only one way we know of to have a function that’s polymorphic in the inputs but doesn’t lose any type information – generics (parametric polymorphism). However, we can’t just use regular generics, because we need access to the element types, too – A and B. We need a generic function that is aware that some of its arguments are themselves type constructors.

The solution, as you could imagine, uses higher-kinded types. If C# supported such a thing, it would probably look something like this:

interface Mappable<M<A>> {
  M<B> Map<B>(Func<A, B> func);

Notice that the interface is parametrized not just by the element type, but by the Mappable type constructor, too – that lets us make sure that the result of mapping a List is a List, and so forth. In Scala, a language that actually does support this stuff, it’s done a bit differently. We’ll use the Scala sorta-equivalent of an interface, called a trait. The trait doesn’t get parametrized by the element type of the container, just the type constructor (the “Mappable” part). Also, Scala uses square instead of angle brackets for type parameters (ooo la la!):

trait Functor[F[_]] {
  def fmap[A, B](r: F[A], f: A => B) : F[B]

If this was the Usual Suspects, here would be where you find out the true identity of Verbal Kint. Mappable was Functor all along! Bumm bumm bummmmmm.

Anyhow, that took way longer than I thought it was going to. I hope it was at least vaguely informative about higher-kinded types. Next installment I’m going to jump right in the deep end with totally crazy type constructor baloney.

RushGP released!

October 23, 2011

I released a little project I’ve been working on, called RushGP – a version of the PushGP genetic programming system, written in Typed Racket. Actually, I released it a couple months ago, but this blog has been eerily silent (almost _too_ silent…. gasp!) Anyhow, it borrows the basic structure (and a good bit of code) from Lee Spector’s (the inventer of PushGP and all around prolific genetic programming guy) Racket implementation from a couple years ago, SchushGP.

Anyhow, should anyone be interested, it runs on the latest version of DrRacket (or at least it should… failing that, grab one of the nightly builds), and can be found at my GitHub page.