General musings on programming languages, and Java.

Monday, December 18, 2006

Why Closures in Dolphin is a Good Idea


On Javalobby, Mikael Grev argues that, while he personally likes closures, and would use them, he would not want them to exist in Java.

He makes quite a big deal out of keypresses - namely, the excessive number of keypresses required for an anonymous class in Java. However, this misses the point somewhat - if something takes many keypresses to type, it will take many 'brain cycles' to parse. In fact, even in the simplest code, an anonymous class distracts so much from the intent of the code that it almost prohibits some excellent coding styles. This article demonstrates that.

Apologies for the code formatting - posting from Google Docs seems a bit dodgy. Here's the original. If you're not familiar with use cases for closures, I strongly suggest you take a look at Neal Gafter's blog, and the links he has on there. Even if you disagree with the authors' points, the articles are at least very entertaining.

Closures Make Code Easier to Understand


Consider the following Haskell code:

map (+2) [1..10]

This is the usual Java code that's roughly equivalent:

int[] list=seq(1,10);
for (int a=0;a<list.length;a++)
    list[a]+=2;


Fairly readable, but it's not as expressive. It doesn't say 'add 2 to each element of a list from 1 to 10'. Instead, it says 'make a list. for each element of that list, add 2'.

That's two sentences, and the second has a sub-clause.

If I try to write the more expressive form (the Haskell version) in Java, I get something like:

result=map(new Function<Integer,Integer>()
{
    public Integer run(Integer input)
    {
        return input+2;
    }
},seq(1,10));


As you can see, the original less expressive form actually maps better onto Java than this better style. And not just in number of keypresses, but in readability. After all, I wrote it, you're just reading it, and you're probably cringing.
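
The snippet above leans on three helpers that are not part of the JDK: a Function interface with a run method, map, and seq. As a minimal sketch, assuming those names mean what they appear to mean (and assuming seq returns a List here rather than the int[] used in the loop version), they could be defined like this. The class name MapSketch is purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

final class MapSketch {
    // Hypothetical one-method interface matching the article's Function<Integer,Integer>.
    interface Function<I, O> {
        O run(I input);
    }

    // Assumed meaning of seq(1,10): the integers from 'from' to 'to', inclusive.
    static List<Integer> seq(int from, int to) {
        List<Integer> result = new ArrayList<Integer>();
        for (int i = from; i <= to; i++) {
            result.add(i);
        }
        return result;
    }

    // Applies f to every element and collects the results in a new list.
    static <I, O> List<O> map(Function<I, O> f, List<I> input) {
        List<O> result = new ArrayList<O>(input.size());
        for (I element : input) {
            result.add(f.run(element));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> result = map(new Function<Integer, Integer>() {
            public Integer run(Integer input) {
                return input + 2;
            }
        }, seq(1, 10));
        System.out.println(result); // prints [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    }
}
```

Even with the helpers written once and tucked away, the call site still carries five lines of ceremony around the single expression input+2.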

The (+2) syntax from Haskell is a way of specifying the + operator, but with one of its values pre-filled. There is a more verbose, probably more familiar, syntax, that resembles Java's impending closures more:

map (\x -> x+2) [1..10]

This is an anonymous function that takes a value, x, and returns x+2. The word anonymous is important. The moment we 'simplify' things by giving names to such code snippets, e.g., a named class, or a field that holds the function, or even a local variable that holds the function, we're not actually simplifying, we're increasing the number of sentences needed to express ourselves.

Function<Integer,Integer> addTwo=new Function<Integer,Integer>()
{
    public Integer run(Integer input)
    {
        return input+2;
    }
};

result=map(addTwo,seq(1,10));

Yes, this starts to look more attractive, but it's not really. It just means that we need to understand what addTwo is, as well as what map is and what seq is. We're adding to the number of things we either need to commit to the subconscious, or hold in primary mental space.

For this extremely trivial example, you might wonder what the big deal is. If this was all the benefit one could get from closures, I'd agree with Mikael. However, by making trivial code exceedingly trivial, you can make less trivial code trivial, and complex code, well, readable. Being able to understand more code at once means that you can spot mistakes in it better.

Closures help to keep your code DRY, and encourage excellence.


DRY is Don't Repeat Yourself. By making the above more expressive code also more attractive, you open yourself up to all sorts of optimisations (removal of repetition - I'm not talking about performance, though that does come into it somewhat).

Most operations on lists or Strings can be expressed in terms of mapping or folding (also called reducing). For example, joining a list of Strings to add colons in between is a fold:

result=fold(new String[]{"root","0","/bin/bash"},new Function<Pair<String,String>,String>()
{
    public String run(Pair<String,String> pair)
    {
        return pair.first()+":"+pair.second();
    }
});

Now, at first glance, that code is garbage. Let's add closures:

result=fold(new String[]{"root","0","/bin/bash"},{first,second => first+":"+second});

Now, if you understand that to 'fold' is to run a function on the first and second elements of a list, then run the function on the result of that and the third element, etc., then you'll probably quickly understand the code above - but the excess notation in the anonymous inner class version makes it harder to grasp. This makes using fold unattractive. fold and map are some of the best techniques available for working with lists of data. They are immensely flexible and scalable. Google's famous MapReduce is, by its authors' own description, inspired by the map and reduce primitives of functional languages.
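
To make that description of fold concrete, here is a rough sketch of one way it could be written. The article doesn't show its fold or Pair, so this version is an assumption: it uses a hypothetical two-argument interface, Function2, in place of Function<Pair<String,String>,String>, and it rejects an empty array rather than guessing at a result.

```java
final class FoldSketch {
    // Hypothetical two-argument function, standing in for Function<Pair<A,A>,A>.
    interface Function2<A> {
        A run(A first, A second);
    }

    // Fold without a seed value: f(f(f(x0, x1), x2), x3) and so on.
    // Assumes a non-empty array; an empty one is rejected.
    static <A> A fold(A[] items, Function2<A> f) {
        if (items.length == 0) {
            throw new IllegalArgumentException("fold needs at least one element");
        }
        A accumulated = items[0];
        for (int i = 1; i < items.length; i++) {
            accumulated = f.run(accumulated, items[i]);
        }
        return accumulated;
    }

    public static void main(String[] args) {
        String result = fold(new String[]{"root", "0", "/bin/bash"}, new Function2<String>() {
            public String run(String first, String second) {
                return first + ":" + second;
            }
        });
        System.out.println(result); // prints root:0:/bin/bash
    }
}
```

The loop inside fold is exactly the hand-coded loop most of us write over and over; the only part that changes from use to use is the body of run.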

So, without closures, we are not likely to come up with algorithms like MapReduce - that is, we are actively discouraged from writing the best code. Of course, we are able to think outside of the programming language that we use, but it tends to be slightly harder to do. I doubt that many Java programmers think in terms of folds and then convert that into a suitable Java version. Instead, we think in terms of the Java version, and maybe realise later that it was another hand-coded fold implementation.

Further, by keeping code DRY, you keep maintenance costs low, e.g., if you have a bug in your withLock() implementation, your using() implementation, your withResource() implementation, etc., you can fix it in one place. If you didn't use those, but hand-coded (or IDE-generated) it every time, then you're fixing it in many many places.
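
As an illustration of the 'fix it in one place' point, here is a minimal sketch of what a withLock() helper might look like with today's anonymous classes. The Body interface and the class name LockSketch are made up for this example; only Lock and ReentrantLock come from the JDK.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class LockSketch {
    // Hypothetical callback type: the code to run while the lock is held.
    interface Body<T> {
        T run();
    }

    // Acquires the lock, runs the body, and always releases the lock,
    // even if the body throws.
    static <T> T withLock(Lock lock, Body<T> body) {
        lock.lock();
        try {
            return body.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        final ReentrantLock lock = new ReentrantLock();
        String state = withLock(lock, new Body<String>() {
            public String run() {
                return lock.isHeldByCurrentThread() ? "held" : "not held";
            }
        });
        // prints "held, locked afterwards: false"
        System.out.println(state + ", locked afterwards: " + lock.isLocked());
    }
}
```

If the lock/unlock pairing ever needed changing, it would change in withLock alone, not at every call site.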

I once looked through some of the JDK source, and found that most of the resource allocations don't follow the suggested best practice - the try..finally{try{close}catch{log}} idiom. I wager that this would not have been the case had closures existed from the start. Reusable solutions would have been more attractive - more convenient.
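
For reference, this is roughly what the try..finally{try{close}catch{log}} idiom looks like when it is written once as a reusable method. It is a sketch under assumptions: withResource and the Use interface are hypothetical names, and logging through java.util.logging is just one reasonable choice for the 'log' step.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.StringReader;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ResourceSketch {
    private static final Logger LOG = Logger.getLogger(ResourceSketch.class.getName());

    // Hypothetical callback type: the code that uses the open resource.
    interface Use<R, T> {
        T run(R resource) throws IOException;
    }

    // Runs 'use' and always closes the resource afterwards.
    // A failure to close is logged rather than thrown, so it cannot
    // mask an exception (or a result) coming from 'use' itself.
    static <R extends Closeable, T> T withResource(R resource, Use<R, T> use) throws IOException {
        try {
            return use.run(resource);
        } finally {
            try {
                resource.close();
            } catch (IOException e) {
                LOG.log(Level.WARNING, "Failed to close resource", e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String line = withResource(new BufferedReader(new StringReader("root:0:/bin/bash")),
            new Use<BufferedReader, String>() {
                public String run(BufferedReader reader) throws IOException {
                    return reader.readLine();
                }
            });
        System.out.println(line); // prints root:0:/bin/bash
    }
}
```

The helper itself is short; it is the anonymous class at the call site that makes people reach for a hand-written try..finally instead.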

And Now to Refute Some Points in General

These are from Mikael:

"the benefits must be proven to be measurably greater than the costs". It's impossible to prove that, as the benefits and the costs both have humans as part of their variables.

"I would guess that the more advanced coders, the ones that is usually on the closure side, does this". In that statement, 'this' meant auto-generating code using an IDE for anonymous classes. That is probably true, but auto-generating code is a workaround for a missing language feature (not necessarily a feature that should be there though - it's only with this clause that I can make the generalisation). More advanced coders probably get a slight pang of 'this sucks' whenever they auto-generate an anonymous class, or getters and setters.

"That is unless you have to use one of the proposed syntaxes for handling exceptions thrown in the closure or have some funky return structure". Clearly, anonymous classes aren't going to be removed from the language, so if you find the syntax hard to understand, you can always revert to anonymous classes. I expect IDEs will provide automated routes to and from closures and anonymous classes.

"The solution to this aesthetics problem isn't closures though, it can be solved without adding complexity by just allowing a little syntactic sugar for the AICs." Even the syntactic sugar for AICs (anonymous [inner] classes) detracts from the expressiveness, and still discourages DRY and excellent code in the same way that AICs do now. Consider:

result=fold(new String[]{"root","0","/bin/bash"},new Function<Pair<String,String>,String>()
{
        return pair.first()+":"+pair.second();
});


It's not a lot better, it's still got a lot of verbosity that could be inferred (the type parameters to Function, the word Function itself). It's still distracting.

"Closures can do many things that AICs can't. Change the variables outside their scope for instance". Like with autoboxing, you could conceivably configure your IDE to prevent yourself from doing this. For most cases it won't matter. Neal Gafter explained the reasoning behind making code that's inside a closure behave the same as code that's outside it. It doesn't break the WYSIWYG nature of Java, because it's damn obvious that, say invokeAndWait{frame=new JFrame();} will assign to the nearest variable called frame.

"I still think that the AIC should only be working on a copy of the value." This could only promote out-of-sync bugs.

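To make the 'copy' concern concrete: an anonymous class today already works on a copy of each local it uses, which is why those locals must be final, so that the copy and the original can never disagree. When code really does need to share changing state, a common workaround is a one-element array. A small sketch (the class and method names are illustrative only):

```java
final class CaptureSketch {
    // Runs a task twice and returns how many times it ran.
    static int runTwice() {
        // A plain 'int count' could not be assigned from inside the anonymous
        // class: the class sees only a copy of a final local. The array
        // reference is final, but the slot it points at is shared.
        final int[] count = {0};
        Runnable increment = new Runnable() {
            public void run() {
                count[0]++; // changes the shared slot, not a private copy
            }
        };
        increment.run();
        increment.run();
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(runTwice()); // prints 2
    }
}
```

If the copy were allowed to change independently, the count inside run() and the count outside would silently drift apart.
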
"The primary cost here is that Java developers need to learn new constructs. Constructs that are not very Java-ish and therefore will take some time to getting used to." That's not a cost, it's a benefit. Those who learned how to use generics benefited from it. Generics didn't look very Java-ish, but they worked well. Learning new concepts actually helps programmers.

"Remember that not all are as bright as you and you gain nothing from alienating the Java-Joes however good that feels for your ego." Actually, I do teach some new Java programmers, and I'd be much more comfortable introducing them to:

invokeLater{frame.setVisible(true);}

than:

invokeLater(new Runnable()
{
    public void run()
    {
        frame.setVisible(true); //and, er, you'll have to make frame final.
    }
});


Closures are simpler, for all levels of programmers.

"Take the much loved Collection framework. If it'd been closure-enabled from day one it would've been even better. Now you need to squeeze in closures". Or make sure that closures are implemented in such a way that they are useful with the framework. For example, we can implement a Comparator as a closure, and don't even have to say the word 'Comparator'. It's inferred. Type inference is good.

Collections.sort(list,(x,y => y.intValue()-x.intValue()));

If the JDK had to include another version of sort that was closure compatible, which it doesn't, then I would agree with you.
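
For comparison, here is a sketch of the anonymous-class form that the closure above would stand in for, using the same existing Collections.sort(List, Comparator) overload. One deliberate difference, flagged as my own choice: it uses y.compareTo(x) instead of subtraction, because y - x can overflow for extreme values. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortSketch {
    // Returns a descending-order copy, sorted with the existing JDK sort method.
    static List<Integer> sortedDescending(List<Integer> input) {
        List<Integer> copy = new ArrayList<Integer>(input);
        Collections.sort(copy, new Comparator<Integer>() {
            public int compare(Integer x, Integer y) {
                // Orders like y - x, but cannot overflow.
                return y.compareTo(x);
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedDescending(Arrays.asList(3, 1, 2))); // prints [3, 2, 1]
    }
}
```

Nothing in sort itself has to change for a closure to be usable here; only the word Comparator and its scaffolding disappear from the call site.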

"You could argue the same way [against] for anything that gives more power to the developer. #DEFINE is such a thing.". The use cases for #DEFINE in C and C++, such as inclusion of header files, portability (hoho), definitions of function-like macros, are largely eliminated by simpler features in Java. What features combined are simpler than closures, for all (or most) use cases that closures have?

"With closures you can code Java that doesn't look like Java and that isn't something I'd like for the Java community.". You could replace 'closures' with 'generics' in that statement, rewind a few years, rinse and repeat.

And just a humorous note: This is from Mikael's top ten tips on how to become a Rock Star Programmer: "Write smart cool compressed code constructs". He's joking, but surely that's, well, closures. From the same place: "Less code, in a smart way, means less to maintain.". Agreed. Smart doesn't mean 'hard to understand'.

"Frankly (sort of) defining new keywords on a developer level scares the bejesus out of me." I wonder whether there was a time that allowing developers to write their own functions (rather than having them hard-wired into the machine) was scary.

Shai Almog said:

"Generally I tend to be wary from features that are designed for "experts"". Closures in Java appear not to be designed for experts, but for programmers. It looks to me like every effort is being spent to make programming in Java better. The usage syntax is very compelling.

"VM changes are that much worse even worse than half baked implementations (e.g. generics)." I've found that generics were cooked for just long enough. They could use some extra features, some garnish, but they taste nice. The worst thing about them is that they're awkward to talk about in comments on other people's blogs, with the old < etc.

This one's hard to quote, but, Shai conjectured that a closure-accepting method would be hard to maintain, because average programmers wouldn't understand the method.

1. Don't let code into your codebase that is above ALL your staff.
2. IDEs could probably refactor it into an equivalent interface-accepting method anyway with no change to the use site.
3. It's already possible to write code that is above the level of other programmers, e.g., with generics, enums, finally (yes, there are programmers who don't get finally), etc.

12 comments:

Anonymous said...

I would argue that the Haskell example is more declarative than the Java example

Anonymous said...

The map (+2) example doesn't use closures, just anonymous functions. A closure is an anonymous function that references the enclosing scope, as in:

times n list = map (\x -> x * n) list

the anonymous function \x -> x * n is a closure because n is a variable from the enclosing scope.

Anonymous said...

I've read the entire article now. I think what you want is a better syntax for anonymous functions. It's not a problem that the values have to be final: all values in Haskell are final.

And BTW this won't "fix" Java. It will improve Java but it will still be bad. Don't try to turn Java into Haskell: just use Haskell ;-).

Anonymous said...

Who said generics work?!

Isaac Gouy said...

"So, without closures, we are not likely to come up with algorithms like MapReduce - that is, we are actively discouraged from writing the best code. Of course, we are able to think outside of the programming language that we use, but it tends to be slightly harder to do."

iirc "Google's famous MapReduce" is implemented in ... C++

Isaac Gouy said...

...he would not want them to exist in Java.

Just wait a year or two until it becomes obvious how much more straightforward it is to code C# 3.0 using local type inference and LINQ, then Java programmers will decide that they always wanted closures :-)

Meanwhile there's Scala

Anonymous said...

I think you forgot the '\':

map (x -> x+2) [1..10]
should be:
map (\x -> x+2) [1..10]

Ricky Clarkson said...

Grr, the backslash was removed when I used Google Docs to post to blogger. The original's got it in.

Isaac Gouy noted correctly that Google implements MapReduce in C++. I checked the PDF that Google published about it, and they don't say how they came about the idea, but I'd wager it was someone who knew more than just C++ who invented it.

I've had a brief glance at Scala, and Nice. I'll be looking into those more, but at the moment I don't think that I'll be moving to those full-time either.

Tony, about admiring my optimism - I'd like to see Java improve, just as I'd like to see any technology improve. With Java having such a large mindshare, if Java implements something popular, no new language will be able to gain a footing without having that feature (or a significant paradigm shift).

E.g., imagine a new statically typed language without generics or an equivalent now..

Anonymous said...

It is nice that we can compare an upcoming feature of Java with Haskell, Scala, C# ...

In practice we have to explain this to C, VisualBasic and COBOL programmers. To these people Generics are ... strange. Closures will be a disaster.

Java is an API-language and not a programmer's language. Programming in Java demands more design than programming skills.

This has been one important reason for the adoption of Java in the enterprise (People there love UML more than programmers) - Mikael has a point there that adding a new concept to the language might weaken Java.

More and more, I prefer to leave Java as it is now and work on the integration with other languages on the VM - like Scala and perhaps even Prolog and Ruby

AlBlue said...

You need to be careful with your terminology. You're actually arguing that the Java language should have anonymous function support that can use local variables. The 'closure' comes at run-time, not at syntax time.

See also http://www.eclipsezone.com/eclipse/forums/t86911.html for more information

Anonymous said...

Hello all!

Anonymous said...

Thanks to author.

About Me

A salsa dancing, DJing programmer from Manchester, England.