General musings on programming languages, and Java.

Saturday, April 26, 2008

So You Like For Loops? Zip It Up!

I've written this little for loop many a time, it finds all duplicates in a sorted list of integers:

List<Integer> nums=Arrays.asList(1,2,2,3,4,6,6,7,8,8);

List<Integer> dups=new ArrayList<Integer>();
int prev=nums.get(0);
for (Integer i: nums.subList(1))
    if (prev==i)
Now go and read your email or something, and come back to read just the for loop (and the line above it). You have to trace through time in your head to work out what it does. The code doesn't say what it means, but that's ok, right, because it works.. hmm.

Let's make up a type called Pair<X,Y>, and quickly imagine that List now has this nice method, zip. Here's the above list, zipped with a sublist of it starting from 1.

output: ArrayList((1,2),(2,2),(2,3),(3,4),(4,6),(6,6),(6,7),(7,8),(8,8));
Can you see how that relates to the original list? I hope so. Given this data, how would you find duplicates? Well, you'd look for any Pairs where the X and Y value are the same, easy. We'll call that zipped list 'zipped', of type List<Pair<Integer, Integer>>:
List<Integer> dups=new ArrayList<Integer>();
for (Pair<Integer, Integer> pair: zipped)
    if (pair.x().equals(pair.y()))
This for loop is a common pattern now, it's a 'collector'. At each iteration we have a/the list of dups, and the current pair. If the current pair has equal elements we modify the list of dups to make it have one more. Really, it seems like it should be a general operation, filter:
List<Integer> dups=zipped.filter(new Predicate<Pair<Integer, Integer>>()
    public boolean invoke(Pair<Integer, Integer> pair)
        return pair.x().equals(pair.y());
}).map(new Function<Pair<Integer,Integer>,Integer>()
    public Integer invoke(Pair<Integer, Integer> pair)
        return pair.x();
It's kind of better but the syntax is getting in the way. Hold up your closure glasses to the screen, or if you don't have any, I've kindly repeated the code but using the proposed Java 7 closures syntax:
List<Integer> dups=zipped.filter( { Pair<Integer, Integer> pair => pair.x().equals(pair.y()) } ).map( { Pair<Integer, Integer> pair => pair.x() } );
I hope that helped you read the previous code. Let's go one step further with these glasses, they are now magically type inference glasses. We don't need to specify the type of 'pair' because it's blindingly obvious from the type of zipped. We don't need to specify the type of dups because it's blindingly obvious from the return type of filter:
val dups=zipped.filter( { pair => pair.x().equals(pair.y()) } ).map( { pair => pair.x() } );
It looks to me like the braces in there are a bit redundant, the () after x and y are just annoying, and .equals is a Java design error that our glasses can correct:
val => pair.x==pair.y).map(pair => pair.x)
Und viz zis Scala zee transfurmaschun vill be complete! (parody of a parody)

Update: Thanks to David MacIver, who spotted a mistake, I had to add 'map' to each of the non-for-loop examples, and in updating it, I've stopped short of what might be the Scala norm - in a closure such as (x => x+2), where the closure parameter is only used once, you can write (_+2). So above you'd write instead of => pair.x).


Robby O'Connor said...

NICE! BGGA/Scala turns the solution into basically a one line solution :) neat =)

Anonymous said...

Turning an uncommented java program with badly named variables into a scala one-liner is not the way to promote new languages.

Ricky Clarkson said...

I don't think commenting the for loop or naming variables differently would have improved the original code.

I strive to write code that is readable so that it doesn't need comments that redundantly describe it.

Anonymous said...

why not:

val dups = for( p <- list.tail ); if p._1==p._2 ) yield p._1

Anonymous said...

Looks really good to me. Does the Scala language do enough type inference to get the types of closures right?

This reminds me of the ML language, which also does this kind of type inference. I really like the advantage of having types that "get out of the way" in code without losing type safety. :-)

Rajiv said...

Interesting Blog, what’s your opinion on outsourcing, have you blogged about that or how technology could help in creating more opportunities for businesses and individuals? There is an interesting competition going on with a chance to win $1000 ( and i was thinking of referring to some of your blog items in my blog. Is it going to be alright with you?

Blog Archive

About Me

A salsa dancing, DJing programmer from Manchester, England.