General musings on programming languages, and Java.

Monday, January 21, 2008

A Trivial Display of How Scala's Type System Beats Java's

This article shows that Scala lets you add methods to objects, depending on their generic types. In Scheme, and other lisps and languages influenced by lisp, there is a construct called a cons, which is a pair of values. Linked lists can be built up by 'consing' pairs together. The first part of a cons can be called the 'car' (if used as a list, this is the head), and the second the 'cdr' (if used as a list, this is the tail). So if you have a cons with a cons as its second element, you can think of that as a compound data structure with three elements. E.g.:

(cons 1 (cons 2 3))

The first value of the second part of that, 2, the head of the tail, the car of the cdr, is known as the cadr. Let's write a Cons class in Java and Scala and then look at how to add the cadr.

Consider this Java code:

class Cons<A,D>
{
    public final A car;
    public final D cdr;

    public Cons(A car,D cdr)
    {
        this.car=car;
        this.cdr=cdr;
    }
}
equivalent to this Scala:
case class Cons[A,D](car: A,cdr: D)
(yes, that's all)

Now let's think about cadr. If a Cons has a car of type A and a cdr of type D, what type is the cadr? It's the type of the car of the cdr, which we haven't got a type parameter for. We could try this:

class Cons<A,D extends Cons<E,F>> but there's a recursion problem here -- F must extend Cons<E,F>>, not going to work. Plus, even if that worked, we could no longer use Cons for simple pairs.

We can make a static method elsewhere:

public static <A,D,E> D cadr(Cons<A,Cons<D,E>> cons)
{
    return cons.cdr.car;
}
That works fine, but it makes cadr somewhat a second-class citizen. It's a shame we can't do pair.cadr. In steps Scala.
implicit def addCadr[A,D,E](cons: Cons[A,Cons[D,E]])=new { def cadr=cons.cdr.car }

Cons(1,Cons(2,3)).cadr gives 2
addCadr, when in scope (it can be imported if it's not written where we need it) defines an implicit conversion from Cons[A,Cons[D,E]] to an anonymous class that implements cadr. So any time we ask for cadr the compiler (not the runtime) looks at our Cons type, doesn't see cadr and looks for any implicit conversions that result in a type that implements cadr. In this case it finds one.

The above is probably not a very practical use, let's think of some. You can make a List[Char] have the same methods as a String, for convenience. You can make a List[Double] have a total method.

(The worst thing about Java generics is how annoying they are to type in blogger!)

Monday, January 07, 2008

In Defence of (0/:l)(_+_) in Scala

A few days ago, Doug posted a rather angry-sounding exit note about Scala, based on two and a half months of 'an in-depth look at the Scala language'. He pointed out some code as an example of how write-only Scala is:

(0/:l)(_+_)

I had no clue what this meant (the first part anyway) at first. Zero Slash Colon One? Is that a new band? I typed it into my Scala interpreter, and while I was typing it, I realised that the One was an Ell. Let's change the name now. l is a bad name. Call it x, xs, or kiwiFruit, but not l:

(0/:list)(_+_)

I then had a bit more of a clue. From the Scala book, I remember reading that operator names that end with a colon are right-associative. In other words, a+b is shorthand for a.+(b), but a+:b is shorthand for b.+:(a). So the above code can be rewritten as:

(list./:(0))(_+_)

This doesn't look a lot clearer, but we can now look up the /: method in the Scaladocs. It's not on List, it's on one of List's traits, Iterable. It's a fold. Here's a longer way of writing the code (and in a way I'd have understood immediately):

list.foldLeft(0)(_+_)

If you're still left bewildered by this version, let's go a bit further.

(_+_) is an anonymous function that takes two values and adds them with the + operator. Let's make up two names and roll them into place. x gets rolled into the first _, y into the second:

list.foldLeft(0)(x, y => x+y)

And if anyone's still not following, what this does is to sum all the elements of list. If the list is List(2,3,4) it looks like 0+2+3+4. Some languages/tools call it reduce (notably MapReduce/Hadoop), some call it inject (Ruby. Any others?). Some languages make you use a for loop, which I assume must be to stop you from getting too cocky about how good your code looks.

So, 1 minute after not understanding the original code, I understood it. The blog in question got mentioned on the Scala mailing list, and Martin Odersky, Scala's inventor, apologised, kind of:

That was my fault. I included it because I liked it, and that for two reasons:

1. (z /: xs) (op) looks like an abbreviation of a left leaning tree
with a `z' on the lower left end (at least to me). I.e. something like

      op
    /    \
   op    x1
  /  \
 z   x0

That's the tree I always draw when I explain fold left.

2. (z /: xs) has the operands in the ``right'' order.
I can see both points, and I'll still use foldLeft. I would not balk at someone else's code using /:. It took me 1 minute to understand, and now I can happily read folds written that way.

In discussions with other Scala programmers, I tried to say that the time taken to learn things isn't as important as how useful they are once learned, but I couldn't find a good way to say it. David MacIver, in an unrelated post to the mailing list, said what I intended to, but much better:

"Optimising your notation to not confuse people in the first 10 minutes of seeing it but to hinder readability ever after is a really bad mistake."

If only I could get him to stop arguing with Ian Clarke and write stuff I want to read!

If I used folds a lot, and perhaps I will someday, I would quite happily use the /: operator. Once I've internalised its meaning I can just get on with reading the names that I chose for things, rather than reading what the language forces me to have. Of course, to most of us, myself included, I'm not that used to folds, but if I considered folding to be just as primitive as +, I'd much rather write /: than foldLeft, just as I'd rather write + than plus.

The rest of Doug's blog really surprised me; I don't know how two people can spend the same amount of time in a programming language's community and get such different results.

My only explanation is that I am comfortable with Haskell, and a long-time Java programmer, so I probably have an advantage when it comes to Scala, which is largely Java minus some bad things, plus some good things from Haskell and many other places.

And while it's in my text buffer:

Don't assume that Scala is only useful for writing web apps, desktop apps, object orientated programming, functional programming, scripting, encoding mathematical properties and concurrency just because that's all that's been discussed on the mailing list this week.

Blog Archive

About Me

A salsa dancing, DJing programmer from Manchester, England.