Ricky's technical blog: Point-free Programming in Java 7

Friday, September 14, 2007

Point-free Programming in Java 7 - Beyond Closures

Here is an introduction to point-free programming, using Haskell, Scheme, Java 5 and the proposed Java 7 for examples, and a couple of ideas for how the Java 7 proposal could improve to make point-free programming more practical.

I've grown quite accustomed to programming in lambda style now, in Common Lisp, Scheme and Haskell. Here's a quick example of how that looks:

Scheme: (lambda (x y) (+ x y))
Haskell: \x y -> x+y
Java 5:

new FunctionIII()
{
    public int invoke(int x,int y)
    {
        return x+y;
    }
}

Java 7: {int x,int y => x+y} Not too bad. You could use this to fold (or reduce) across a list. To fold is to apply a function on a starting number and an element of a list, and then to apply the same function on the returned value and the next element of the list, until the end of the list. You could use this to compute the sum of all the elements of a list:

Scheme: (foldl (lambda (x y) (+ x y)) 0 (list 3 4 5 6 7))
Haskell: foldl (\x y -> x+y) 0 [3,4,5,6,7]
Java 5:

foldl(new FunctionIII()
{
    public int invoke(int x,int y)
    {
        return x+y;
    }
},0,asList(3,4,5,6,7));

Java 7: foldl({int x,int y => x+y},0,asList(3,4,5,6,7));

One point to note here is that you could, in Scheme, write this as (+ 3 4 5 6 7), because Scheme's + function takes any number of arguments, even 0 (whereupon it returns 0). So this is a contrived example using non-optimal code.

It's not idiomatic Haskell either. For a start, you can write sum [3,4,5,6,7], and avoid foldl altogether. Bear with me for a moment though. In Haskell, you can shorten \x y -> x+y to simply (+), which means 'the function called +'. In Scheme you don't even need the parentheses. +, used in a 'value position', which is anything other than the first in a list, refers to the function +, rather than calling it. So you can do:

Scheme: (foldl + 0 '(3 4 5 6 7))
Haskell: foldl (+) 0 [3,4,5,6,7]

This is the simplest example I can think of of pointfree style, which is where you remove lambda expressions but without introducing mutable variables. Here's a better example:

I want to multiply all elements of a list by 2. Here's the lambda style:

Scheme: (map (lambda (x) (* x 2)) (list 3 4 5 6 7))
Haskell: map (\x -> x*2) [3,4,5,6,7]
Java 5:

map(new FunctionII()
{
    public int invoke(int x)
    {
        return x*2;
    }
},asList(3,4,5,6,7));

Java 7: map({int x => x*2},asList(3,4,5,6,7));

But these lambda expressions don't really express my intent that well. I want to map the function 2* across the list. Haskell has a convenient syntax for this:

Haskell: map (2*) [3,4,5,6,7]

What I've done there is to specify the * function, with its first argument filled in. This works because all functions in Haskell are automatically curried. If I type this at the REPL:

:type (*)
Output: (*) :: (Num a) => a -> a -> a

This looks a bit confusing if you're not familiar with it. What it means is that (*) is a function with a type parameter, a, which in Java terms would look like <a extends Number>. It takes one parameter of type a, and returns a function that (takes one parameter of type a, and returns a value of type a). Parentheses used for disambiguation there, this turns out to be hard to write in English.

:type (2*)
Output: (2*) :: (Num t) => t -> t

That means that (2*) is a function with a type parameter, t, which is an instance of the Num typeclass (the same as 'a' in the first example). It takes one parameter of type t, and returns a value of type t.

So I can call (2*) as an ordinary function, adding a value (this is called taking a section, and is a way of partially applying a function). The following expressions all have the same output, 8:

(2*) 4
(*2) 4
(*) 2 4
((*) 2) 4

This seems like an interesting but not-so-useful result of automatic currying at first, but you can actually rewrite any lambda expression in this style. Sometimes the code will be easier to read than a lambda expression, sometimes not. Here's a more complex example:map ((+10) . (*2)) [3..6]

The . operator means function composition, like in mathematics - f(g(x)) would be written as (f . g) x. You can compose functions before you have the values to give them, so f . g has a meaning. Read ((+10) . (*2)) as: "multiply by 2 then add 10". Look ma, no variables!

Scheme doesn't have any support for this, as its procedures are not automatically curried, but you can make something similar happen:

(map (compose (plus 10) (multiply-by 2)) (list 3 4 5 6))

plus and multiply-by need defining in an odd way:

(define plus
    (lambda (x)
      (lambda (y)
        (+ x y))))

(define multiply-by (lambda (x) (lambda (y) (* x y))))

You can write a general curry procedure in Scheme, but as there's no portable way of discovering how many args a procedure takes, you have to supply the arity. Plus, some procedures are of variable arity.

(curry 2 +) would return a procedure that takes a value, and returns a procedure that takes a value and returns the sum of the two values, used like this:

(((curry 2 +) 4) 5) ;; output: 9

Java 5:

public static final FunctionIO<FunctionII> plus=new FunctionIO<FunctionII>()
{
    public FunctionII invoke(final int x)
    {
        return new FunctionII()
        {
            public int invoke(int y)
            {
                return x+y;
            }
        };
    }
};

For brevity I'll leave multiply-by out.. it's the same but with + changed to *!

Java 7:

public static final {int => {int => int}} plus={int x => {int y => x+y}}; public static final {int => {int => int}} multiplyBy={int x => {int y => x*y}};

map(compose(plus.invoke(10),multiplyBy.invoke(2)),asList(3,4,5,6));

Because both Scheme and Java make you curry explicitly, they discourage point-free programming (although it's still possible, and quite useful). I'd like to see all function types in Java 7 be automatically curried. In other words:

{int,int => int} should be the same as {int => {int => int}}.

This doesn't have to have any negative impacts at runtime, as demonstrated by Haskell's great performance.

To make it really useful, it should also be applicable to operators, so that #+ gives a {int,int => int} and #+.invoke(2) gives a {int => int}. (where #+ is a possible syntax for taking a reference to +).

Sometimes you'll want to provide the second argument rather than the first. Haskell's solution to this is a function called flip:

(flip (/)) 3 6 -- gives 2, the result of 6/3. I could use it in this possible future of Java with #/:

flip(#/).invoke(3) would return a {int => int}, that, when you pass it a number, it returns that number divided by 3. Equivalent to Haskell's (/3).

As you do more and more in point-free style, you get more and more used to it, as one would expect, and some constructs that seem unnatural at first seem normal later. I'm still a novice, but I'm having fun.

23 comments:

Slava Pestov said...: Some Schemes have a 'cut' macro.

(cut + 2 _ 3) is the same as

(lambda (x) (+ 2 x 3)); 5:47 am BST
Ricky Clarkson said...: Indeed. That's about as good as you can get with partial application without it being automatic, I think.

cut doesn't actually curry - (cut + 2 _ _) is the same as (lambda (x y) (+ 2 x y)), not (lambda (x) (lambda (y) (+ 2 x y))). But then it's not intended to.; 9:06 am BST
Anonymous said...: @Ricky Clarkson
@This doesn't have to have any negative impacts at runtime, as demonstrated by Haskell's great performance.

I don't really know much about this, but the statement surely isn't informative.

Also, you linked to the wrong Wikipedia link. At the very least, you should also point to: http://en.wikipedia.org/wiki/Tacit_programming; 10:05 pm BST
Tony Morris said...: John, a better statement might be "...as demonstrated by Scala's performance" where the impact of currying is zero.

Making the analogy to Haskell is a little hasty, given the vast differences between the (inferior) JVM and the many (relatively superior) Haskell compilers.

The key question that arises is "can you have a zero (or low) performance impact of exploiting the isomorphism between (A x B) -> C and A -> B -> C on the JVM?". The key part of that question is, "on the JVM", so the analogy to Haskell falls apart.

The answer can be provided by Scala however.; 12:02 am BST
Ricky Clarkson said...: John,

You're right, but anecdotal evidence is that Haskell is fast, and Haskell programmers use point-free programming a lot. I've never seen a Haskell discussion favour lambdas over point-free for performance reasons. If there is a difference in performance, I think it would favour point-free.

I don't know about tacit programming, but the wikipedia page about it is a 2-line stub pointing to a paper that is not accessible without money. I'm sure I could learn more about tacit programming without paying money, but that wikipedia page certainly isn't a good start.

(As it happens, I'll be able to access it from work on Monday, but for the general population, I'd prefer to link to free resources)

I've checked all my links again just now, they seem ok.; 11:24 am BST
Brian Slesinsky said...: All very nice, but why use point-free programming in the first place?; 8:53 am BST
Ricky Clarkson said...: Brian,

For some cases it expresses intent better than lambda expressions. map (*2) someList gives a list with someList's elements, but doubled. That seems very readable.

There is no duplication in (*2), x is duplicated in \x -> x*2. x is also a name we have to invent, and doesn't really mean anything. I like finding ways of removing unnecessary names, because that makes my code easier to think about.; 10:26 am BST
Anonymous said...: what do you mean by point-free programming? from your blog entry its not clear that you aren't talking about plain functional programming?; 5:11 pm BST
Anonymous said...: sorry, google first, ask later..; 5:14 pm BST
Anonymous said...: All this fp style is very annoying.

Java is OO not fp. And a half-assed adding of fp features won't make the language better.

I think languages should stay true to their base philosophy. You should use scheme or Haskell or something else if you want fp.

I fail to see how this will make Java better, because though this adds "glue" it does not change the base limitations of the language.

I vote for leaving Java as it is. Java 5 was bad enough.

As many have said..as soon as the language becomes too complex people will just abandon it for "simpler" less error prone languages (a la C++); 9:33 pm BST
Anonymous said...: Introduction of delegate will solve all the problems mentioned above...

I don't like the syntax you've proposed. It looks so... anti-Java...; 5:07 am BST
Anonymous said...: Delegation is another workaround to closures with more verbose syntax. If You want to do it, do it proper.; 6:59 am BST
Anonymous said...: So having delegation like C# makes it makes it less anti-Java becos C# looks more like Java than most dynamic languages?; 7:00 am BST
Anonymous said...: Java is turning into perlish nightmare with such things. It takes us nowhere further.
Sure it would cost a bit less typing, but what same person really cares? It would not save a life.
Java's syntax is fine, and as cited above, OO, not FP.; 9:12 am BST
Anonymous said...: Interesting last comment as I was just getting ready to post:

Perl:

map { $_ * 2 } (1, 2, 3)

Perl mapping is pretty efficient and the syntax is as simple as anything else in the article.; 10:51 pm BST
Anonymous said...: That's the thing -

Just because it is terse doesn't mean it is easy to maintain by teams of engineers.

I use both Perl and Java on the job and one of the primary reasons I would recommend Java is rapidly disappearing.; 5:50 pm BST
Ricky Clarkson said...: It's not so much terseness, but lack of redundancy that programming language fans such as myself look towards. That coupled with expressiveness. It's hard to argue that Java's for loop idiom is more expressive than map (+2) someList. (map (+ <> 2) some-list), someList.map(_+2) (Scala), and impossible to argue that it is less redundant.

As for teams of engineers understanding it, that's a learning curve issue. If they can't understand it at first sight, that means nothing. After 2 weeks, if any engineer can't understand the code in this article, I'd start worrying.; 6:02 pm BST
Anonymous said...: it IS a matter if terse vs redundancy.
How is your code below redundant?:
//code
new FunctionIII()
{
public int invoke(int x,int y)
{
return x+y;
}
}
// end code

Is it more verbose and explicit? Yes, not NOT redundant.

//code
(lambda (x) (+ 2 x 3))
//code
is more terse. Dollars-to-donuts what is happening on the jvm between the two is little to nothing making it is for brevity/terseness only.
BTW - try maintaining a perl script that was optimized for terseness. You will find yourself in parsing and reference hell.; 6:16 pm GMT
Ricky Clarkson said...: The FunctionIII type specifies that the input is two ints and that the output is an int.. then my anonymous class repeats that info. The rest of the tokens besides + would be repeated if I had another class but for -, or *. Also, I have to repeat it all if I want to do it for doubles or longs.

Just as not all OOP languages tend towards having C++'s idiosyncracies, not all terse languages tend towards Perl.

list fold + is far more readable than any anonymous class nonsense, and that's what counts. Size of code is part of readability.. things that take a lot of code are avoided by programmers.; 6:45 pm GMT
Hiren said...: In J Programming Language

foldl (+) 0 [3,4,5,6,7]
=
0 +/ 3 4 5 6 7

http://en.wikipedia.org/wiki/J_programming_language; 10:35 pm BST
Hiren said...: oops sorry i meant:
0 + +/ 3 4 5 6 7; 10:52 pm BST
Anonymous said...: Does anybody really care? It might be OK for programming gurus to get all gooey inside about scrunching their code into a one-liner, but is the next person going to understand what the hell it means?

Probably not, so I would say you've failed if other people cannot understand your code. Sure, write extremely cryptic code for your own amusement, but if it's going to be consumed/read by others, then keep your coding fantasies to yourself and make it readable!; 11:07 am BST
Ricky Clarkson said...: "but is the next person going to understand what the hell it means?"

Yes, if the syntax for it is workable and the 'next person' is willing to learn stuff.

There's a big difference between "can novices understand it?" and "can programmers understand it?".; 9:46 pm BST

Ricky's technical blog

Friday, September 14, 2007

Point-free Programming in Java 7 - Beyond Closures

23 comments:

Blog Archive

About Me

My Blog List