Variable-arity polymorphism from scratch in Scala, Part I: What are higher kinded types?

Disclaimer: this post is based on an amazing series of blog posts on type-level programming over at Apocalisp. I don’t know any Scala (I wrote my first Scala code yesterday, while reading the aforementioned blog posts), and I certainly wouldn’t have been able to put together some of the type-machinery in the later posts on my own. So be warned that I… uh… don’t necessarily know what some of these keywords and such precisely mean. Per se.

Act I:

So, I thought I knew a decent bit about type systems – I’ve read most of TAPL/ATTAPL, programmed quite a bit in ML-based languages, done a bit of Haskell, I use words like “monad” and “functor” in polite conversation, etc. Typed Racket appeals to me because it brings powerful and novel type system features to the table that are like nothing I’ve seen in those more traditional functional languages. They allow static checking of the idioms of Scheme programming – dynamic-flavored features like variable-arity functions, path-dependent type refinements, and untagged unions. It’s mind-expanding.

So, anyhow – Scala is insane. I have only begun to scratch the surface of its type system, and already I’m learning more about types than I have since I first picked up TAPL. It continues in the vein of Typed Racket by showing me just how little I know about type systems, and I love it.

Act II, or What am I actually talking about, anyhow?:

Scala is an object-oriented language first and foremost. That means subtyping (and its big brother, bounded quantification) plays a big part in the type system – something that can’t be said about ML or Haskell (type classes notwithstanding). So that’s one thing that sets it apart, but it’s not really the big one – after all, I program C# at my day job, which thanks to LINQ, is basically a functional-OO language with subtyping and bounded quantification. The big feature in Scala is higher-kinded types (A note: Haskell has higher-kinded types as well, but no subtyping.) You’ll notice I didn’t make “higher-kinded types” a hyperlink because I don’t really know a good short explanation of it, so I’ll attempt to provide one.

Act III, or OK, I’m getting to it, for seriously:

Regular ol’ Java or C# style generics are often called, in functional languages, “type constructors”. This is because they can be thought of as functions that take an input type, and return (or “construct”) a result type. If you have a type like List<T>, you can plug in a type like Integer, and get a type in return – in this case, List<Integer>.

Well, if generics are like functions, then they need to have “types” of their own – for example, to make sure that you don’t give them the wrong number of arguments. For example, what if you tried to apply the type constructor Pair<A, B> to the input types <Integer, Integer, String>? Obviously it wouldn’t work – there aren’t enough slots. The way we express this requirement is through something called a “kind” – kinds are the types… of types.

By convention, kinds of type constructors are written using the same -> symbol used for function types (as in, A -> B is the type of a function from A to B), but using *‘s instead of letters to represent the types. For example, the kind of Integer is *, since it takes no arguments – it’s a constant type, not a function from types to types. The kind of List<T> is * -> * – it takes a type and returns a type. The kind of Pair<A, B> is * -> * -> * (you could also write it as (*, *) -> * if you don’t like currying).

Continuing the analogy between type constructors and functions, any red-blooded functional programmer knows that some of our best function friends are “higher-order functions”. They have types like (A -> B) -> List<A> -> List<B>, (A -> Boolean) -> List<A> -> List<A>, and (A -> B -> B) -> B -> List<A> -> B (Extra credit: what functions are those? Answer in white: map, filter, fold).

These higher-order functions are functions that take functions as arguments, and/or return functions as results. Well, the analogy to type constructors would be type constructors that take and/or return other type constructors. These would have kinds that look like, for example, (* -> *) -> *, or to use the C#/Java notation, A<B<C>>. This is not possible in Java/C#, ML, Typed Racket – but it is possible in Scala.

Act IV, or … so what?:

Ok, phew, now we know what higher kinded types are. We are still conspicuously missing a single example though. A weaker person than myself might jump to monads, but frankly those are about as intuitive as a course in macroeconomics. Let’s do something a bit more reasonable and (hopefully) shorter.

map has got to be one of the most kickass higher-order functions. It’s simple and easy to understand, replaces a lot of the need for for-loops, and it has one of the best characteristics of a properly-chosen abstraction: it works well as a verb and a noun. By that I mean, once you learn about the map function, it becomes natural to start thinking about “maps”, and “mapping”. Before long you start talking about algorithms in terms of “map this function over it, filter it by this predicate, then fold up the results and bam!” Nice, composable abstractions.

Anyhow, the great thing about map (defined for this purpose as the version with type (A -> B) -> List<A> -> List<B>), is that it’s even more general than it first seems By that, I mean that not only lists can be mapped over. You could map a function over a tree by applying the function at each node. You could map over a maybe/option/nullable type by applying the function if there’s a value present, and doing nothing if there isn’t.  You can map over a lot more stuff than that, but let’s stick with those for now. Here’s what the type signatures of those different maps would look like:

list-map : (A -> B) -> List<A> -> List<B>
tree-map : (A -> B) -> Tree<A> -> Tree<B>
option-map : (A -> B) -> Option<A> -> Option<B>

Clearly there’s a pattern – these functions all have type (A -> B) -> K<A> -> K<B> where K is some type constructor of kind * -> *. As good programmers, we follow the Don’t Repeat Yourself (DRY) principal, so we look for commonality in code and try to abstract over it and pull it out into one place. Not only does this aid maintenance by keeping changes and bugs localized, it guides us towards a deeper understanding of the structure of our problems (the problems of software design… an understanding of our personal problems is beyond the scope of this article.)

Coming at this from an OO perspective, it’s tempting to try the following solution: what it looks like is that Tree, List, and Option are all implementing an interface – we’ll call it Mappable<T>. In pseudo-C#:

interface Mappable<A> {
    Mappable<B> Map<B>(Func<A, B> fun);

You may be noticing that our definition doesn’t use any higher-kinded types at all! So what’s the problem? Just have Tree, Option, and List implement our Mappable interface, and we can write generic mapping code with no trouble. Or can we? The trouble is, the return type of our Mappable.Map function, is Mappable. That means the only thing we can do with the result of this function is to map more functions over it – we can never get any data out of it. We lose the functions and abilities that are unique to each different type of Mappable<B> structure. That’s no good! Instead, we need the List version of Mappable.Map to return a List, the Tree version to return a Tree, etc.

Clearly, our solution using interfaces is unsatisfactory – abstractness of the inputs means abstractness of the outputs. There’s only one way we know of to have a function that’s polymorphic in the inputs but doesn’t lose any type information – generics (parametric polymorphism). However, we can’t just use regular generics, because we need access to the element types, too – A and B. We need a generic function that is aware that some of its arguments are themselves type constructors.

The solution, as you could imagine, uses higher-kinded types. If C# supported such a thing, it would probably look something like this:

interface Mappable<M<A>> {
  M<B> Map<B>(Func<A, B> func);

Notice that the interface is parametrized not just by the element type, but by the Mappable type constructor, too – that lets us make sure that the result of mapping a List is a List, and so forth. In Scala, a language that actually does support this stuff, it’s done a bit differently. We’ll use the Scala sorta-equivalent of an interface, called a trait. The trait doesn’t get parametrized by the element type of the container, just the type constructor (the “Mappable” part). Also, Scala uses square instead of angle brackets for type parameters (ooo la la!):

trait Functor[F[_]] {
  def fmap[A, B](r: F[A], f: A => B) : F[B]

If this was the Usual Suspects, here would be where you find out the true identity of Verbal Kint. Mappable was Functor all along! Bumm bumm bummmmmm.

Anyhow, that took way longer than I thought it was going to. I hope it was at least vaguely informative about higher-kinded types. Next installment I’m going to jump right in the deep end with totally crazy type constructor baloney.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: