Non-empty lists

I recently watched a video about the library scalaz, which among other things discussed a “non-empty list” data structure provided by the library. I enjoyed very much when someone in the library asked “isn’t that just Cons?”… This is an awesome implication of the fact that in Scala, data constructors are actually full-blown types, something that functional programming guys – like myself – can easily forget.

Of course, scalaz’s NonEmptyList is a bit more than just Cons (or :: or whatever it’s called in Scala’s standard library), because it implements standard collection functions like map, flatmap, etc. such that they preserve the NonEmptyList result type.

Anyhow, this video, and my subsequent investigation of scalaz’s NonEmptyList, got me thinking about this data type. List is such a fundamental data structure – lists and folds on lists (remember, a data type is completely defined by its catamorphism) basically embody the concept of induction. So, it seemed like NonEmptyList should have some sort of similarly deep meaning, and a similarly pithy definition.

I was sort of disappointed to find that the scalaz definition of NonEmptyList actually appeals to the definition of List directly, looking something like this (scalaz doesn’t use a case class, but this gets the idea across):

case class NonEmptyList[+A](hd: A, tl: List[A])

I started reading a Haskell-cafe thread about NonEmptyLists, and saw Edward Kmett (who is one of my favorite writers on functional programming) use a NonEmptyList that was defined in a very similar way, once again appealing to the definition of List:

data NE a = NE a [a] deriving (Eq,Ord,Show,Read)

I felt kind of dissatisfied with these definitions. I find using the definition of List inside NonEmptyList to be sort of ugly – it seems to get in the way of understanding the essence of NonEmptyList, taken as its own concept. From a library implementer’s perspective, I understand the practicality – plenty of functions are already written to work with lists, and this is probably a fairly efficient encoding, performance-wise, as only the first cons cell needs to have any special information compared to a regular List.

With all those practical considerations aside, I thought this was a much slicker definition:

case class NonEmptyList[+A](hd: A, tl: Option[NonEmptyList[A]])

This seems to get a bit more at the fundamental nature of NonEmptyLists. Of course it is trivially isomorphic to the above definition, but I like it better since its a bit more self-contained – Option is simpler than List (something like forall X. 1 + X vs. forall X. mu. x. 1 + X * x).

As far as I understand it, this is the equivalent of calling someNonEmptyList.tail.toNel (toNel produces an Option[NonEmptyList[A]]) in the scalaz library, so they certainly use it this way. Still, I think its slicker to do it by default rather than relying on extension methods on List. Don’t get me wrong, I’m not so foolish as to think they hadn’t considered doing it in the way I propose – these guys know a hell a lot more about types and FP than I do (I’ve quoted their blogs a bunch, and I just bought a pre-release of their upcoming book).

As I continued to read the Haskell-cafe thread, I was excited to see that the next (and final) post was by Conor McBride, and showed that same slick formulation:

data NE x = x :> Maybe (NE x)

Conor McBride is a rock-star of type theory – the guy who showed the definition of “differentiation” (think calculus) can be extended to datatypes! What a boss.

It’s always a fun experience when you see people discussing a problem, you think “why don’t you solve it *this* way??”, and then someone whom you look up to offers that same idea. The occasional affirmation that you’re not totally out to lunch. Unfortunately for me, this only ever seems to happen while looking back at things – if I could just have decent ideas *before* other people have them, then I might be on to something.

All kidding aside, I feel like this phenomenon is common with many aspiring academics or budding programmers (at least among my friends and I): first you start reinventing wheels that were invented 50 years ago, then 20, then 5, then 1, then before you know it you’re actually doing some original research!

Anyhow, I just really liked the idea of NonEmptyList, and this particular slick little formulation, and wanted to share it with everybody. I know above I speculated as to whether NonEmptyList had some sort of similarly deep meaning as List, and I’ll let you know if I read of (or think of) any.


3 Responses to Non-empty lists

  1. Good insight. I think the value of statically non-empty collections is under-recognized in Scala ATM, i.e. no support for them in core. I often have Scala collections that hold 1+ elements, and I wish I could tell the typesystem that, without needing Scalaz. Analogously for Natural numbers vs Ints.

  2. Julian says:

    Hi! Thanks for your entertaining and insightful post. I just ran into it after googling for Scala nonempty lists. I think you might have missed a slicker (and perhaps more basic) formulation. In Haskell terms,

    data NonEmptyList a =
    | Cont a (NonEmptyList a)
    | End a

    Similarly with Scala case classes for the disjoint union. After all, why (conceptually) should you have to pair an item with Nothing in order to make it a nonempty list? Just wrap it in the constructor that tells you it’s at the End of one 🙂

    Also, +1 to Ben’s point. It should not be way off there in scalaz!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: