Why traits and not just data?

February 7, 2012

I started learning Scala coming from the perspective of functional programming. I really want to “get” traits/mixins, but I’m having trouble seeing any reason to use inheritance/super heavily to achieve “stackability” when weighed against the benefit of a more data-driven approach that just uses functions as data.

In the chapter on traits in the excellent Artima book, Programming in Scala, their example to demonstrate the heavy use of traits is the idea of a modifiable queue, which performs certain stackable modifications on its inputs/outputs. The authors urge you to consider traits in any situation where you are doing a lot of “stackable modifications”.

I’m not sure why you wouldn’t just do this using lists of functions. Instead of relying on super and linearization to apply modifiers in the correct order, just treat the concept of a “modifier” as a piece of data and manipulate it as you would any other data. What’s nice about this is that these data-driven “mixins” can be added at runtime, removed, and re-ordered.

I’m not claiming traits are useless or anything – there are definitely benefits to first-class syntax, and I’m pretty sure I’m missing something more than that. But it seems that most of the time, if you want to organize code as a bunch of stackable modifications, function-valued data is a simpler and more flexible way to go than trying to ponder your way through a bunch of multiple inheritance.

Their example:

import collection.mutable.ArrayBuffer

object MainModule {

  abstract class IntQueue {
    def get(): Int
    def put(x: Int)
  }

  class BasicIntQueue extends IntQueue {
    private val buf = new ArrayBuffer[Int]
    def get() = buf.remove(0)
    def put(x: Int) { buf += x}
  }

  trait Doubling extends IntQueue {
    abstract override def put(x: Int) { super.put(2 * x)}
  }

  trait Incrementing extends IntQueue {
    abstract override def put(x: Int) { super.put(x + 1)}
  }

  def main(args: Array[String]) {

    val q1 = new BasicIntQueue with Doubling with Incrementing

    q1 put 10
    q1 put 20

    println(q1.get())
    println(q1.get())

    val q2 = new BasicIntQueue with Incrementing with Doubling

    q2 put 10
    q2 put 20

    println(q2.get())
    println(q2.get())

  }

}

My re-writing:

import collection.mutable.ArrayBuffer

object MainModule2 {

  abstract class Queue[T] {
    def put(item: T)
    def get(): T
  }

  trait BasicIntQueue extends Queue[Int] {
    private val buf = new ArrayBuffer[Int]
    def get() = buf.remove(0)
    def put(x: Int) { buf += x }
  }

  case class ModifiableQueue(
    getModifiers: List[Int=>Int],
    putModifiers: List[Int=>Int]) extends BasicIntQueue {

    private def compose[A](f: A => A, g: A => A): A => A =
      { x => f(g(x)) }

    override def get() = (getModifiers reduce compose[Int])(super.get())
    override def put(item: Int) {
      super.put((putModifiers reduce compose[Int])(item))
    }

    def modify(qm: QueueModifier) =
      ModifiableQueue(
        this.getModifiers ++ List(qm.getModifier),
        this.putModifiers ++ List(qm.putModifier))
  }

  case class QueueModifier(
    getModifier: Int => Int = { x => x },
    putModifier: Int => Int = { x => x })

  val doubling = QueueModifier(putModifier = { 2 * _ })

  val incrementing = QueueModifier(putModifier = { 1 + _ })

  def queue = new ModifiableQueue(Nil, Nil)

  def main(args: Array[String]) {

    // just like mixins:
    val q1 = queue modify doubling modify incrementing

    q1 put 10
    q1 put 20

    println("increment, then double:")

    println(q1.get())
    println(q1.get())

    val q2 = queue modify incrementing modify doubling

    q2 put 10
    q2 put 20

    println("double, then increment:")

    println(q2.get())
    println(q2.get())

    // but now "traits" are just first-class data...
    // way better than mixins because it can be done at runtime,
    // composably, higher-order, etc:

    val q3 = (List(incrementing, doubling) foldLeft queue) { _ modify _ }

    q3 put 10
    q3 put 20

    println("folded in modifiers:")

    println(q3.get())
    println(q3.get())

    // they can even be removed, stacked N times, etc:

    val q4 = q3.copy(putModifiers = q3.putModifiers.tail)

    q4 put 10
    q4 put 20

    println("removed last modifier:")

    println(q4.get())
    println(q4.get())

    def modifyN(q: ModifiableQueue, qm: QueueModifier, n: Int) =
      (1 to n map { _ => qm } foldLeft q) { _ modify _ }

    val q5 = modifyN(queue, doubling, 5)

    q5 put 10
    q5 put 20

    println("doubled 5 times:")

    println(q5.get())
    println(q5.get())

  }

}

This is a very simple example, and has the downside that a new “modifier” list must be added for each method we which to modify. I’m going to think this through some more and see if I can’t figure out a more generic approach that uses reflection, traits, dictionaries, or any of the above. One approach could be to define a generic trait that represents the “wrapping” of one method, and mix that in for each method you wish to stackably modify.

Advertisements

F#-style pipe operator in Scala

February 7, 2012

This is super trivial, but sometime’s it’s nice to use the F# style pipe operator to write computations that read in the order they execute (basically postfix function application). In idiomatic Scala, this is usually accomplished by method calls and implicits to add extension methods. This is aided by the fact that many common functions for which postfix notation is convenient, like map, are pre-defined as methods on collections. But, general piping can still be a tool for the toolbox:

object MainModule {

  case class Pipeable[A](a: A) {
    def |>[B](f: A => B) = f(a)
  }

  implicit def convert[A](s: A) = Pipeable(s)  

  def main(args: Array[String]) {

    def map2[A, B](f: A => B)(xs: List[A]) = xs map f

    val test = List(1, 2, 3, 4, 5) |>
      map2 { _ + 1 } |>
      map2 { _ * 2 }

    println(test)

  }
}

ScalushGP!

November 16, 2011

I recently (read: a couple weeks ago) released a partial implementation of PushGP in Scala, called (imaginatively!) ScalushGP. It’s still in early stages (I’m not 100% convinced it’s bug free), but it’s been pretty fun getting to know Scala.


Variable-arity polymorphism from scratch in Scala, Part II: Type-level computation, HList, KList

October 23, 2011

(Disclaimer, once again: much of this code is shamelessly stolen from the truly mind-blowing series of blog articles over at Apocalisp. Also, the main thrust of this blog post requires very little in the way of type-level computation – go figure.)

So, in the last installment we talked about how type constructors are functions, and higher-kinded types are higher-order functions, and then I said a whole bunch of other stuff, too. The important part was that type constructors are functions. The corollary of that  is that type-checking involves evaluation, and thus computation, on the level of types.

In fact, it turns out that adding higher-kinded types to even the most basic type system requires embedding the entire simply-typed lambda calculus into the type system. The simply-typed LC is a mathematical language for computation and while it’s not Turing-complete (thankfully, otherwise the type checker might not terminate), it’s remarkably expressive. As an aside, I thought I heard that due to nasty interactions with subtyping, recursive types, and all-around bad-assery, the core Scala type checking algorithm is in fact not guaranteed to terminate (they of course practically get around this in prodction by adding a fixed limit to recursion depth or some such mechanism).

Anyhow, cool, we have the ability to perform computations in the type system (as a matter of fact, we already had that due to bounded quantification and recursive types, but I don’t understand exactly how much power that buys you (gotta be at least Peano arithmetic) and how it compares). To that end, check out this cool thingie we can build without using any higher kinds:

  sealed trait HList { }

  final case class HCons[H, T <: HList](head : H, tail : T) extends HList

  case class HNil extends HList

  object HNil extends HNil

  val hl = HCons(1, HCons("asdq", HCons(3, HNil)))
  val a : Int = hl.head
  val b : String = hl.tail.head
  val c : Int = hl.tail.tail.head

An HList is a type and data structure for heterogeneous lists. It’s sort of like an extensible tuple – it permits any length but keeps all the type information of each node. You can see it basically does its consing at the type level as well as the term level, increasing the number of *‘s in the kind as it builds the list. This is an example of the tremendous power of bounded quantification, it’s all done without appealing to any “magic” Scala features. It uses case classes for simplicity, but you can actually build these in Java – it’s just completely insane because you have to write out the types by hand. (Note: I did a little checking and it turns out the secret ingredient here, allowing a type constructor to appear in the bound of its own variable, has a special name – F-bounded polymorphism)

We can even add a neat little infix type shorthand for both the type and the constructor HCons:

  final case class HCons[H, T <: HList](head : H, tail : T) extends HList {
    def :+:[T](v : T) = HCons(v, this)
  }

  case class HNil extends HList {
    def :+:[T](v : T) = HCons(v, this)
  }

  type :+:[H, T <: HList] = HCons[H, T]
  val :+: = HCons

  val hl = 1 :+: "asdq" :+: 3 :+: HNil

So, what is the point of this? Well, HLists are a key ingredient in the variable-arity polymorphism which is the namesake of this blog post. That’s right, there is an actual point to this.

Well after HLists, there’s something called a KList. The motivation is thus – HLists are well and good, but they don’t let me exploit any shared properties between elements of the list. For example, maybe the elements all have different element types, but they’re all Lists. This comes up a lot – for example, in Typed Racket, the creator’s decision to include a mechanism for variable-arity polymorphism was mainly to support functions such as map and fold over variable numbers of lists. These functions are often called map2, map3, map4, etc. or zipWith, and the key element is application of some single-argument type constructor (that is, a constructor of kind * -> *) to a list of types. So, the KList consists of a pair: a type constructor of kind * -> *, and an HList. You could thus store a variable-length, heterogeneous list of List‘s and have a suitable input type for your variable-arity map function.

  sealed trait KList[+M[_], HL <: HList]

  final case class KCons[H, T <: HList, +M[_]](head: M[H], tail: KList[M, T])
    extends KList[M, H :+: T] {
    def :^: [N[X] >: M[X], G](g: N[G]) = KCons(g, this)
  }

  sealed case class KNil extends KList[Nothing, HNil] {
    def :^: [M[_], H](h: M[H]) = KCons(h, this)
  }

  object KNil extends KNil

  val :^: = KCons

  val (m : KList[List, (Int :+: String :+: HNil)]) =
    List(1, 2, 3, 4) :^: List("str1", "str2") :^: KNil

You can see from the type of m (annotation provided for checking purposes, it is optional and inferred by the compiler), that the KList type is doing its job – the List type is separate from the element types. Cool!

Oh also, there’s probably a bunch of crazy stuff that requires explaining up in that there definition there. The +M[_] means its covariant in the type constructor argument. The N[X] >: M[X] means it’s allowed to get less specific in the type constructor argument as it builds up the list (this makes sense). Also note that the way HCons (a.k.a. :+:) is used in the KCons type, and HNil in the KNil type, means that it’s impossible to build a KList with a different number of elements than its associated HList. Very cool.

Ok, so now we have the KList, what do we do with it? Let’s try making that variable-arity map from earlier. We know it’ll take a KList with constructor type List, and to operate on the HList of argument types it will need to take a function that operates on HLists. The type should look something like:

  def mapN[HL <: HList, T](kl : KList[List, HL])(f : HL => T) : List[T]

So now we can actually express the type of a variable-arity, variable-type, polymorphic map function like in Typed Racket – but we built it ourselves! In fact, TR’s dotted type variables seem very much like KLists – they abstract part of their type-definition and apply it to a variable-length list of type arguments. Look at the type-signature for TR’s map:

  (: map
     (All (C A B ...)
          ((A B ... B -> C) (Listof A) (Listof B) ... B
           ->
           (Listof C))))

Looking at it through what we know about KLists, it’s like TR turns the “dotted type” into a single-argument type constructor ((Listof B) parametrized by B, which gives the Listof constructor), and create a KList using that and the type list (B). This is exactly what we’re doing with our mapN definition. Of course TR’s version is limited in power and first-class in the type system, which has the advantage of nicer syntax, and not requiring full higher-kinded types to implement, which results in much less type system complexity. On the other hand… we built it ourselves! Wow!

We have left one thing out up until now, which is the actual implementation of the mapN function. I guess technically it counts as cheating unless we do.

The main challenge is to find a way to get the values out of the lists. By that, I mean – fundamentally, the KList is built from values of type List[T] – we’re attempting to apply a function that requires individual elements. It is not, in general, possible to construct a function C[T] => T for every type constructor C. If it was, things like the Haskell IO monad wouldn’t work – effects would be able to escape into the rest of the code. There’s a cool concept underlying this, called natural transformations, (covered in the Apocalisp articles), but I’m going to save that for next time. Right now let’s just get the values out of the Lists the old fashioned way – pattern matching:

def heads[HL <: HList] : (KList[List,  HL] => HL) = {
  case KCons((hd :: _), tail) => HCons(hd, heads(tail))
  case KNil => HNil
}

def tails[HL <: HList] : (KList[List,  HL] => KList[List,  HL]) = {
  case KCons((_ :: tl), tail) => KCons(tl, tails(tail))
  case KNil => KNil
}

def toList : (KList[List, _] => List[List[Any]]) = {
  case KCons(hd, tl) => hd :: toList(tl)
  case KNil => List.empty
}

def anyEmpty(kl : KList[List, _]) : Boolean = toList(kl).exists(_.isEmpty)

def mapN[HL <: HList, T](kl : KList[List, HL])(f : HL => T) : List[T] =
   if(anyEmpty(kl))
      List.empty
   else
       f(heads(kl)) :: mapN(tails(kl))(f)

Keep in mind that those match cases are not exhaustive because I know we’re pre-checking the KList for any empty components. We write it in curried style so that Scala can infer the type of later arguments from earlier ones. When tested, our mapN does what we expect it to do:

val m = List(1, 2, 3, 4) :^: List("str1", "str2") :^: KNil
println(mapN(m){ case (intVal :+: strVal :+: _) => intVal.toString() + strVal })

It prints “List(1str1, 2str2)”.

So, we’ve succeeded in implementing the nasty mapN function and proving that we can construct variable-arity polymorphism in Scala. With a little syntactic sugar (why can’t Scala have macros!?) this would be a pretty reasonable foundation for a language feature.

What we’ll show next time is that there’s a broader concept hidden under all the digging around for heads and tails that we were doing in the implementation of mapN – It doesn’t have to be this ugly. It can be easy, and, well, natural.

(In progress. Come back later.)


Variable-arity polymorphism from scratch in Scala, Part I: What are higher kinded types?

October 23, 2011

Disclaimer: this post is based on an amazing series of blog posts on type-level programming over at Apocalisp. I don’t know any Scala (I wrote my first Scala code yesterday, while reading the aforementioned blog posts), and I certainly wouldn’t have been able to put together some of the type-machinery in the later posts on my own. So be warned that I… uh… don’t necessarily know what some of these keywords and such precisely mean. Per se.

Act I:

So, I thought I knew a decent bit about type systems – I’ve read most of TAPL/ATTAPL, programmed quite a bit in ML-based languages, done a bit of Haskell, I use words like “monad” and “functor” in polite conversation, etc. Typed Racket appeals to me because it brings powerful and novel type system features to the table that are like nothing I’ve seen in those more traditional functional languages. They allow static checking of the idioms of Scheme programming – dynamic-flavored features like variable-arity functions, path-dependent type refinements, and untagged unions. It’s mind-expanding.

So, anyhow – Scala is insane. I have only begun to scratch the surface of its type system, and already I’m learning more about types than I have since I first picked up TAPL. It continues in the vein of Typed Racket by showing me just how little I know about type systems, and I love it.

Act II, or What am I actually talking about, anyhow?:

Scala is an object-oriented language first and foremost. That means subtyping (and its big brother, bounded quantification) plays a big part in the type system – something that can’t be said about ML or Haskell (type classes notwithstanding). So that’s one thing that sets it apart, but it’s not really the big one – after all, I program C# at my day job, which thanks to LINQ, is basically a functional-OO language with subtyping and bounded quantification. The big feature in Scala is higher-kinded types (A note: Haskell has higher-kinded types as well, but no subtyping.) You’ll notice I didn’t make “higher-kinded types” a hyperlink because I don’t really know a good short explanation of it, so I’ll attempt to provide one.

Act III, or OK, I’m getting to it, for seriously:

Regular ol’ Java or C# style generics are often called, in functional languages, “type constructors”. This is because they can be thought of as functions that take an input type, and return (or “construct”) a result type. If you have a type like List<T>, you can plug in a type like Integer, and get a type in return – in this case, List<Integer>.

Well, if generics are like functions, then they need to have “types” of their own – for example, to make sure that you don’t give them the wrong number of arguments. For example, what if you tried to apply the type constructor Pair<A, B> to the input types <Integer, Integer, String>? Obviously it wouldn’t work – there aren’t enough slots. The way we express this requirement is through something called a “kind” – kinds are the types… of types.

By convention, kinds of type constructors are written using the same -> symbol used for function types (as in, A -> B is the type of a function from A to B), but using *‘s instead of letters to represent the types. For example, the kind of Integer is *, since it takes no arguments – it’s a constant type, not a function from types to types. The kind of List<T> is * -> * – it takes a type and returns a type. The kind of Pair<A, B> is * -> * -> * (you could also write it as (*, *) -> * if you don’t like currying).

Continuing the analogy between type constructors and functions, any red-blooded functional programmer knows that some of our best function friends are “higher-order functions”. They have types like (A -> B) -> List<A> -> List<B>, (A -> Boolean) -> List<A> -> List<A>, and (A -> B -> B) -> B -> List<A> -> B (Extra credit: what functions are those? Answer in white: map, filter, fold).

These higher-order functions are functions that take functions as arguments, and/or return functions as results. Well, the analogy to type constructors would be type constructors that take and/or return other type constructors. These would have kinds that look like, for example, (* -> *) -> *, or to use the C#/Java notation, A<B<C>>. This is not possible in Java/C#, ML, Typed Racket – but it is possible in Scala.

Act IV, or … so what?:

Ok, phew, now we know what higher kinded types are. We are still conspicuously missing a single example though. A weaker person than myself might jump to monads, but frankly those are about as intuitive as a course in macroeconomics. Let’s do something a bit more reasonable and (hopefully) shorter.

map has got to be one of the most kickass higher-order functions. It’s simple and easy to understand, replaces a lot of the need for for-loops, and it has one of the best characteristics of a properly-chosen abstraction: it works well as a verb and a noun. By that I mean, once you learn about the map function, it becomes natural to start thinking about “maps”, and “mapping”. Before long you start talking about algorithms in terms of “map this function over it, filter it by this predicate, then fold up the results and bam!” Nice, composable abstractions.

Anyhow, the great thing about map (defined for this purpose as the version with type (A -> B) -> List<A> -> List<B>), is that it’s even more general than it first seems By that, I mean that not only lists can be mapped over. You could map a function over a tree by applying the function at each node. You could map over a maybe/option/nullable type by applying the function if there’s a value present, and doing nothing if there isn’t.  You can map over a lot more stuff than that, but let’s stick with those for now. Here’s what the type signatures of those different maps would look like:

list-map : (A -> B) -> List<A> -> List<B>
tree-map : (A -> B) -> Tree<A> -> Tree<B>
option-map : (A -> B) -> Option<A> -> Option<B>

Clearly there’s a pattern – these functions all have type (A -> B) -> K<A> -> K<B> where K is some type constructor of kind * -> *. As good programmers, we follow the Don’t Repeat Yourself (DRY) principal, so we look for commonality in code and try to abstract over it and pull it out into one place. Not only does this aid maintenance by keeping changes and bugs localized, it guides us towards a deeper understanding of the structure of our problems (the problems of software design… an understanding of our personal problems is beyond the scope of this article.)

Coming at this from an OO perspective, it’s tempting to try the following solution: what it looks like is that Tree, List, and Option are all implementing an interface – we’ll call it Mappable<T>. In pseudo-C#:

interface Mappable<A> {
    Mappable<B> Map<B>(Func<A, B> fun);
}

You may be noticing that our definition doesn’t use any higher-kinded types at all! So what’s the problem? Just have Tree, Option, and List implement our Mappable interface, and we can write generic mapping code with no trouble. Or can we? The trouble is, the return type of our Mappable.Map function, is Mappable. That means the only thing we can do with the result of this function is to map more functions over it – we can never get any data out of it. We lose the functions and abilities that are unique to each different type of Mappable<B> structure. That’s no good! Instead, we need the List version of Mappable.Map to return a List, the Tree version to return a Tree, etc.

Clearly, our solution using interfaces is unsatisfactory – abstractness of the inputs means abstractness of the outputs. There’s only one way we know of to have a function that’s polymorphic in the inputs but doesn’t lose any type information – generics (parametric polymorphism). However, we can’t just use regular generics, because we need access to the element types, too – A and B. We need a generic function that is aware that some of its arguments are themselves type constructors.

The solution, as you could imagine, uses higher-kinded types. If C# supported such a thing, it would probably look something like this:

interface Mappable<M<A>> {
  M<B> Map<B>(Func<A, B> func);
}

Notice that the interface is parametrized not just by the element type, but by the Mappable type constructor, too – that lets us make sure that the result of mapping a List is a List, and so forth. In Scala, a language that actually does support this stuff, it’s done a bit differently. We’ll use the Scala sorta-equivalent of an interface, called a trait. The trait doesn’t get parametrized by the element type of the container, just the type constructor (the “Mappable” part). Also, Scala uses square instead of angle brackets for type parameters (ooo la la!):

trait Functor[F[_]] {
  def fmap[A, B](r: F[A], f: A => B) : F[B]
}

If this was the Usual Suspects, here would be where you find out the true identity of Verbal Kint. Mappable was Functor all along! Bumm bumm bummmmmm.

Anyhow, that took way longer than I thought it was going to. I hope it was at least vaguely informative about higher-kinded types. Next installment I’m going to jump right in the deep end with totally crazy type constructor baloney.


RushGP released!

October 23, 2011

I released a little project I’ve been working on, called RushGP – a version of the PushGP genetic programming system, written in Typed Racket. Actually, I released it a couple months ago, but this blog has been eerily silent (almost _too_ silent…. gasp!) Anyhow, it borrows the basic structure (and a good bit of code) from Lee Spector’s (the inventer of PushGP and all around prolific genetic programming guy) Racket implementation from a couple years ago, SchushGP.

Anyhow, should anyone be interested, it runs on the latest version of DrRacket (or at least it should… failing that, grab one of the nightly builds), and can be found at my GitHub page.


More fun with higher rank types

August 16, 2011

Quick correction to the last blog post: I actually didn’t need to use macros to get the effect of higher-rank polymorphism – it turns out that Typed Racket supports it out of the box!

In the interests of accuracy, here’s an example of using higher rank types in TR to make a function that accomplishes the sort of thing we discussed last post (use a polymorphic “template function” on lists as a basis to make instances for several different concrete types of list):

#lang typed/racket

(struct:
  ProgramState
  ([Boolean : (Listof Boolean)]
   [Integer : (Listof Integer)]
   [Float : (Listof Float)]))

(: generate-instructions ((All (a) ((Listof a) -> (Listof a)))
                          ->
                          (Listof (ProgramState -> ProgramState))))

(define (generate-instructions template-func)
  (list
   (λ: ([ps : ProgramState])
     (struct-copy ProgramState ps
                  [Boolean (template-func (ProgramState-Boolean ps))]))
   (λ: ([ps : ProgramState])
     (struct-copy ProgramState ps
                  [Integer (template-func (ProgramState-Integer ps))]))
   (λ: ([ps : ProgramState])
     (struct-copy ProgramState ps
                  [Float (template-func (ProgramState-Float ps))]))))

Now there’s no type-checking error because the quantifier (“All”) is part of template-func, not generate-instructions. Let’s try it out with a template-func that, say, duplicates the top element of a list, and make sure it does what we think it does:

(: duplicate-top (All (a) ((Listof a) -> (Listof a))))
(define (duplicate-top xs)
  (match xs
    [(cons hd tl) `(,hd ,hd . ,tl)]
    ['() '()]))

(define start-state (ProgramState '(#f) '(12) '(3.14)))

(match-define `(,dup-bools ,dup-ints ,dup-floats) 
  (generate-instructions duplicate-top))

(define end-state (dup-bools (dup-ints (dup-floats start-state))))

(ProgramState-Boolean end-state)
(ProgramState-Integer end-state)
(ProgramState-Float end-state)

Running this produces:

Welcome to DrRacket, version 5.1.3 [3m].
Language: typed/racket [custom].
'(#f #f)
'(12 12)
'(3.14 3.14)
>

Just as expected. Cool!