General musings on programming languages, and Java.

Saturday, December 29, 2007

Make Scala Your Language for 2008

Scala is a statically typed language for the JVM, with features that make it comparable to Ruby, Groovy, Haskell, Python, Erlang and Smalltalk. It's pronounced "Skah-la" rather than "Skay-la". It has closures, gets rid of Java's controversial checked exceptions, and is almost perfectly interoperable with Java APIs.

There's an official book nearing completion (available now in PDF form), and some very clever people are using Scala (and some not so clever ones, like me). One of the core developers is called Lex Spoon, which has to be a plus for any language. Scala's been cooking, and by the time you finish reading this and procrastinating about whether to use it, it'll be ready.

It is one of those languages where boilerplate isn't welcome, yet it is statically typed and supports both OOP and functional programming without blinking. Scala programs can use Java APIs effortlessly, and Scala turns out to be better than Java for testing Java code (despite some integration concerns raised by Ola Bini)!

So how can you use it? Install it, and then you can launch its interpreter. You can use Scala as a scripting language or as a regular compile-to-bytecode JVM language (though the difference is an elaborate illusion). Here I'll launch the interpreter with the Google Translator API on the classpath:

$ wget -q http://google-api-translate-java.googlecode.com/files/google-api-translate-java-0.26.jar
$ scala -classpath google-api-translate-java-0.26.jar
Welcome to Scala version 2.6.0-final.
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.google.api.translate.{Language,Translate}
import com.google.api.translate.{Language, Translate}

scala> import Translate.translate
import Translate.translate

scala> import Language.{ENGLISH,SPANISH}
import Language.{ENGLISH, SPANISH}

scala> translate("bastante facil",SPANISH,ENGLISH)
res0: java.lang.String = Fairly easy

So it's pretty handy for trying out APIs, even built-in ones. Let's look at replacing all backslashes in a String with forward slashes, presumably to insert the resulting code into our Java program.

scala> "blah\\blah\\".replaceAll("\\","/")
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
 ^
        at java.util.regex.Pattern.error(Pattern.java:1700)
        at java.util.regex.Pattern.compile(Pattern.java:1453)
        at java.util.regex.Pattern.<init>(Pattern.java:1130)
        at java.util.regex.Pattern.compile(Pattern.java:822)
        at java.lang.String.replaceAll(String.java:2190)
        at .(:4)
        at...
scala> "blah\\blah\\".replaceAll("\\\\","/")
res2: java.lang.String = blah/blah/

As you can see, it's useful for prototyping little bits of Java too. But in this case Scala has a better way. Strings delimited with """ do not need any escaping (and can span multiple lines):

scala> """blah\blah\""".replaceAll("""\\""","/")
res4: java.lang.String = blah/blah/
Now the only escaping necessary is what Java's regex implementation requires.
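
For anyone pasting the result back into a Java program, here is a minimal sketch of the same replacement on the Java side. The class name Slashes is made up for illustration, and Pattern.quote and String.replace(char, char) are standard JDK alternatives that the session above doesn't use; treat it as one way to do it rather than the way.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not from the post.
class Slashes {
    // The regex needs \\ to match one backslash, and the Java
    // string literal doubles each of those again.
    static String viaRegex(String s) {
        return s.replaceAll("\\\\", "/");
    }

    // Pattern.quote escapes the literal backslash for the regex engine.
    static String viaQuote(String s) {
        return s.replaceAll(Pattern.quote("\\"), "/");
    }

    // String.replace(char, char) involves no regex at all.
    static String viaChar(String s) {
        return s.replace('\\', '/');
    }

    public static void main(String[] args) {
        String input = "blah\\blah\\";
        System.out.println(viaRegex(input)); // blah/blah/
        System.out.println(viaQuote(input)); // blah/blah/
        System.out.println(viaChar(input));  // blah/blah/
    }
}
```

Only the first variant carries the four-backslash literal that the interpreter session arrived at by trial and error.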

Scala's method call syntax can be used without punctuation in some cases. x.y(z) can be written x y z, and x.y() can be written x.y. It also has implicit conversions, so if you define a conversion from type X to type Y, it looks as though X has all Y's methods. The type Int has a conversion to RichInt, and RichInt has a to(Int) method, so I can do:

scala> 1.to(10)
res0: Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
or even better:
scala> 1 to 10
res1: Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
The lack of punctuation makes some things really attractive:
scala> (1 to 10) ++ (20 to 30) map (_.doubleValue) map Math.sqrt filter (x => x-x.intValue>0.5) map (x => x*x) map Math.round
res60: Seq[Long] = Array(3, 7, 8, 21, 22, 23, 24)

This code concatenates the ranges 1 to 10 and 20 to 30, converts each value to a double, takes the square root of each, keeps only those square roots whose fractional part is greater than 0.5, squares the values that remain, and finally rounds each of them. Every step gives a new sequence rather than changing the previous one.
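
For comparison, here is a rough Java rendering of the same steps. It assumes Java 8 streams, which this post doesn't use, and the class name Pipeline is invented; it's a sketch for checking the steps, not a claim about how Scala implements them.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical class name; mirrors the Scala one-liner step by step.
class Pipeline {
    static long[] run() {
        return IntStream.concat(IntStream.rangeClosed(1, 10),
                                IntStream.rangeClosed(20, 30)) // (1 to 10) ++ (20 to 30)
                .asDoubleStream()                 // map (_.doubleValue)
                .map(Math::sqrt)                  // map Math.sqrt
                .filter(x -> x - (int) x > 0.5)   // fractional part > 0.5
                .map(x -> x * x)                  // square what is left
                .mapToLong(Math::round)           // map Math.round
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // [3, 7, 8, 21, 22, 23, 24]
    }
}
```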

You can read the code pretty much like a bash pipeline. Here's the same thing as I imagine a bash programmer would like it:

concat $(range 1 10) $(range 20 30) | map doubleValue | map Math.sqrt | filter x-x.intValue>0.5 | map x*x | map Math.round
As with bash, each new Range does not stomp over the memory of the previous one.

Let's insert the punctuation again to see how it looks with Java-style punctuation:

scala> 1.to(10).++(20.to(30)).map((_.doubleValue)).map(Math.sqrt).filter((x => x-x.intValue>0.5)).map((x => x*x)).map(Math.round)
Eek! I think it's safe to say we wouldn't write such elegant code so often if we had to write (and read) the punctuation!

This use of methods as if they were infix operators is really powerful; so powerful that it is used for what we normally call infix operators. 3+4 is just 3.+(4) (operator precedence rules are preserved though).

That's enough for now.

Sunday, December 16, 2007

A Functional Way of Testing OOP Programs

A message in OOP implies effects on entities[1], rather than mathematical functions. If you use mathematical functions in your code, how often they're evaluated isn't important:

    print cos 0*cos 0

is equivalent to:

    let x=cos 0 in print x*x

If you instead say that cos 0 is a message you send to an object, then you can't make that optimisation without knowing how the code you're calling works, because you'd be eliminating a message. cos 0 as a message may cause effects that you can't easily see as a caller; collapsing two messages to one can introduce different behaviour.
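
A small Java sketch of that point, under the assumption that the receiver counts the messages it gets (the CountingCos class is invented for illustration): collapsing two sends into one leaves the product unchanged but leaves the object in a different state.

```java
// Hypothetical receiver whose "cos 0" message has a hidden effect.
class CountingCos {
    private int calls = 0;

    // Called for its value, but it also counts how often it was sent.
    double cosZero() {
        calls++;
        return Math.cos(0);
    }

    int calls() {
        return calls;
    }

    // cos 0 * cos 0, sent as two messages.
    static int callsAfterTwoMessages() {
        CountingCos c = new CountingCos();
        double product = c.cosZero() * c.cosZero();
        return c.calls(); // 2
    }

    // let x = cos 0 in x * x, sent as one message.
    static int callsAfterCollapsing() {
        CountingCos c = new CountingCos();
        double x = c.cosZero();
        double product = x * x;
        return c.calls(); // 1
    }

    public static void main(String[] args) {
        System.out.println(callsAfterTwoMessages()); // 2
        System.out.println(callsAfterCollapsing());  // 1
    }
}
```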

However, most of the time that you send an object a message it doesn't appear to perform an action; it just returns you some value. I'll name a method called for its value a function, and a method called for its effects simply a method.

If you can intercept effects that methods cause, then the method no longer causes effects, but describes them. In other words, if you notice an action and control whether it really happens, you've made the method appear to be a function, and you've made it easier to test, because you can observe all the actions that happen.

Such an interceptor, a body of code that intercepts effects as described, could use metaprogramming of some sort, perhaps by changing classes directly at runtime, perhaps through compile-time techniques such as macros. However it is implemented, it would apply an automated transformation to the innards of methods. Let's see what we'd want that to generate, by writing the result of the transformation ourselves.

The interceptors in the following code vet, log or reject effects. I've made an interceptor return an interceptor on each call so that interceptors themselves can be implemented using return values rather than state changes. The code in this article is something a bit like Java, so that implementation details of a particular language don't get in the way.

public Interceptor writeSomeTextToFile(interceptor,text,file)
{
    (interceptor,val out)=interceptor.new FileStream(file)
    interceptor=interceptor.write(out,text)
    return interceptor.close(out)
}

It looks doable, but pretty ugly. One part of it can be improved. If we change Interceptor so that it has a type parameter, we can get rid of the tuple return that interceptor.new gave:

public Interceptor[Nothing] writeSomeTextToFile
    (Interceptor[Nothing] interceptor,text,file)
{
    Interceptor[FileStream] out=interceptor.new FileStream(file)
    Interceptor[Nothing] two=out.write(text)
    return two.close(out)
}

It's really up to the Interceptor now what it does with that code. It could run the effects there and then, or store them and never execute them, and our code would be none the wiser, because we haven't seen any mechanism for getting values out of the Interceptor.

The code processor to add interceptors would be pretty handy. Let's say we have an annotation that instructs some build tool or macro to do that, so now our source code looks like:

@WithInterceptor
public void writeSomeTextToFile(text,file)
{
    val out=new FileStream(file)
    out.write(text)
    out.close()
}
Our unit test can look like:
val passed={
    val ceptor=new LoggingInterceptor()

    return ceptor.invoke(writeSomeTextToFile,"hello","/etc/passwd")
                 .matches(list(creating(FileStream,"/etc/passwd"),
                               writing(FileStream,"hello"),
                               closing(FileStream)))
}
It's now clearly far easier to reason about and test the method, because you can trivially observe all of its side-effects. You could even decide which ones to allow, externally to the code, to implement a sandbox. In the usual case that you want to execute the effects immediately, you can still do that.
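
Here is a minimal sketch of what such a logging interceptor might look like in real Java. Everything in it is an assumption made for illustration: the create/write/close method names, the use of plain strings to describe effects, and the FileWriting class. Each call returns a new interceptor instead of mutating state, and nothing ever touches the file system.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable interceptor: it records effects, it never runs them.
final class LoggingInterceptor {
    private final List<String> log;

    LoggingInterceptor() {
        this(Collections.<String>emptyList());
    }

    private LoggingInterceptor(List<String> log) {
        this.log = log;
    }

    // Returns a new interceptor carrying one more recorded effect.
    private LoggingInterceptor record(String effect) {
        List<String> next = new ArrayList<>(log);
        next.add(effect);
        return new LoggingInterceptor(Collections.unmodifiableList(next));
    }

    LoggingInterceptor create(String file) {
        return record("create " + file);
    }

    LoggingInterceptor write(String file, String text) {
        return record("write " + file + " " + text);
    }

    LoggingInterceptor close(String file) {
        return record("close " + file);
    }

    List<String> effects() {
        return log;
    }
}

class FileWriting {
    // Hand-written stand-in for what the @WithInterceptor transformation might generate.
    static LoggingInterceptor writeSomeTextToFile(LoggingInterceptor interceptor,
                                                  String text, String file) {
        return interceptor.create(file).write(file, text).close(file);
    }

    public static void main(String[] args) {
        System.out.println(
            writeSomeTextToFile(new LoggingInterceptor(), "hello", "/etc/passwd").effects());
        // [create /etc/passwd, write /etc/passwd hello, close /etc/passwd]
    }
}
```

A test then only has to compare the recorded list with the effects it expects, and /etc/passwd is never opened.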

This is a very long-winded way of showing that methods are functions in disguise. Allowing methods to have difficult-to-notice side-effects makes them harder to reason about. It's harder to write tests for them, it's harder to think about them.

This interceptor technique could be applied to existing code, to compare the effects of a 1,000-line method with the effects of a refactored version of it, in the same way we often write unit tests that compare returned values. It seems to make such a good regression test framework that I'd be very surprised if it didn't already exist for most mainstream languages.

The interceptor technique is very heavily based on monads (and may even just be a monad). Haskell programmers, the biggest monad users today, even have special syntax for the interceptor chaining; the translation I mentioned is built into their compiler. In fact, they can do all the things OO programmers do, but they make it harder to have unwanted side effects. To my knowledge though, Haskell's IO monad is largely implemented as a compiler hack, so it's hard to write the same tests for side effects that I've shown in this article.

[1] "in object-oriented programming languages such as Smalltalk or Java, a message is sent to an object, specifying a request for action." -- http://en.wikipedia.org/wiki/Message

About Me

A salsa dancing, DJing programmer from Manchester, England.