General musings on programming languages, and Java.

Friday, January 02, 2009

The Typeclass Pattern

The Typeclass Pattern

This pattern applies to C# and Java, and possibly some other typed languages. It doesn't apply to C++, which does generics via templates.

A simple problem to demonstrate it with is a generic Pythagoras method that takes two numbers a and b and computes sqrt(a * a + b * b) for them. In both C# and Java this can't be written in the most straightforward way:

public T Pythagoras<T>(T a, T b)
{
    return Math.sqrt(a * a + b * b);
}
There's more than one reason, but the 'innermost' is that T doesn't have the operator *, and there's no way to restrict T so that it does.

The equivalent C++ will work, because C++ generates code rather than using constraints.

So the missing information in Pythagoras is:

   * How to multiply two Ts. (let's assume T * T gives T)

   * How to add two Ts. (let's assume T + T gives T)

   * How to find the square root of a T.


interface Num<T>
{
    T Add(T one, T two);
    T Multiply(T one, T two);
    T Sqrt(T t);
}
Now Pythagoras can be written with an extra parameter, a Num:
public T Pythagoras<T>(T a, T b, Num<T> num)
{
    return num.Sqrt(num.Add(num.Multiply(a, a), num.Multiply(b, b)));
}
Clearly, some language support wouldn't go amiss for this, at least in the arithmetic case. What's more of a problem is that your method gets a 'surprising' extra parameter, the Num. It's one of those things that makes sense in isolation, but is really an unnecessary detail when reading the code to gain an understanding of it. Scala offers an interesting solution to that.

In Scala, you might write Pythagoras as:

def pythagoras[T](a: T, b: T)(implicit num: Num[T]) = num.sqrt(num.add(num.multiply(a, a), num.multiply(b, b)))
Then calling it would look like:
pythagoras(3, 5)
The second parameter list for pythagoras is an 'implicit' parameter list, meaning that in compilation the compiler looks for a Num[T] in scope that it can use for num. You can specify it explicitly, e.g., pythagoras(3, 5)(Num.integer), or you can just import Num.integer at some point.

As with all patterns, though, all it takes is a language to support it directly, and then the pattern disappears from user code. In Haskell, Num is a built in type class and Int, Double, etc. are types that are instances, or members, of Num. +, -, *, / are defined as methods that are part of the Num typeclass. Let's omit the sqrt part for a moment:

Prelude> let pythagorasSquared a b = a * a + b * b
Prelude> pythagorasSquared 3 4
25
Prelude> :type pythagorasSquared
pythagorasSquared :: (Num t) => t -> t -> t
I hope the first two commands I gave to the Haskell interpreter are self-explanatory. The third asks what type pythagorasSquared is. That's like asking what signature a method has.

pythagorasSquared is a function with one type parameter called t. t is an instance, or member, of the Num type class. pythagorasSquared takes two values of this type t, and returns a value of the same type. E.g., given two Doubles it will yield a Double. This usually works internally by passing around a Num implicitly, like in the Scala solution, but can work quite like the C++ code generation way, depending entirely on the compiler.

The reason that I omitted the sqrt is that in Haskell, sqrt is defined on Floating, another type class, rather than Num (giving the sqrt of an Int as an Int is perhaps not useful).

At some point, some readers will have stopped understanding the terms, but if you grasped the concept, this wasn't a waste of time!

11 comments:

James Iry said...

For the sake of completeness, with a slightly different formulation the Scala code can look like

def pythagoras[T <% Num[T]](a: T, b: T) =
((a * a) + (b * b)).sqrt

James Iry said...

I've put a complete example up at http://paste.pocoo.org/show/97661/

Reinier Zwitserloot said...

Isn't this (at least in Java) simply a case of incomplete libraries? Java already has a Number class. It doesn't have any particularly useful methods in it though; certainly no plus, minus, divide, or multiply.

If it DID have those, then you could just use them. Number isn't restricted; anyone can cook up new ones.

So what's the difference between an interface and a typeclass?

Isn't that pretty much always the case - that there's an incomplete or badly designed library involved when you're trying to use e.g. implicit def to patch this omission? I like the notion of being able to fix things, but perhaps it would have been nicer if you could instead fix the source of the problem, instead of using implicit's voodoo magic (and, yes, having import statements make invisible code do different things counts as voodoo in my book). In this case that would essentially mean making e.g. Integer implement another more useful interface of your own design that did have plus / minus / multiply / divide, and at least in the case of java, providing the trivial implementations for them.

The notion that an import statement can have actual effects instead of just being a useful shortcut irks me. My aesthetic sense also thinks the import statement, especially scala's take on it (specifically the notion that they can show up anywhere) is... revolting. An import statement is a lexical scoping issue, so it should have a visible scope. In theory it should look like:

import something, somethingelse, yetanotherthing {
/* code block where the imports 'count' */
}

/* imports no longer count here. */

Perhaps use 'import' only at the top, with the understanding that it applies to all code contained in that source file, and a with keyword that does use the braces everywhere else.

The 'import' term has been overridden with a few too many meanings, perhaps: In python 'import' really means 'include', in the sense that an import statement will trigger the imported stuff to 'load' - any initializers will run on the spot.

In java, no such thing happens. An import statement does not initialize a class at all; it's not even a statement. It is just syntax sugar so that you don't have to type full.package.name.ClassName all the time, that's all. In scala it's a mixed bag. It doesn't run initializers but it does increase the pool of implicits that are scanned by the compiler.

James Iry said...

There are 2 important differences.

1) A Haskell type class, or Ricky's impersonation of one, can be added to a type after the fact.

2) A type class can have operations that don't act on a value of a certain type, but return a value of that type. OO think has sunk into programming brains so much that this important use case is completely ignored. Here's one example: what operations does a stack need? You probably say push and pop and you can immediately imagine the interface. But I say there's one more important operation: creating a new stack. In OO land that means a separate "factory" interface. With type classes, on the other hand, it's all part of the same interface.

James Iry said...

So that's about type classes. Now, about Scala's imports. I think you are confusing Scala's import with...I don't know what. Scala's import is lexically scoped. And all it does is
1) create a shorthand notation for members of a package or object
2) add implicits from a package or object to the scope in which they are imported.

Import does not cause any effects in Scala. Imports can change the meaning of code, but that's already true in Java.

Ricky Clarkson said...

Yes, the specific implementation of Pythagoras here could be addressed reasonably by Java having a useful Number class (same for C#).

In some real code I have, I also need Num to have a zero() method, returning 0 in the appropriate type. Now having int (or Integer) have a zero() method is probably a bit odd. 5.zero()? As James says, interfaces and inheritance don't solve everything.

It's false that this technique is for patching bad or incomplete libraries, unless all libraries that a user wants to extend are bad or incomplete. If you give me a useful Number with plus, minus, etc., I'll want to extend it with greaterThan, lessThan, etc. If you provide those I'll want to extend it with succ, pred, isZero. It's better to support growth, than to presume you know what all library users are going to want.

Reinier Zwitserloot said...

Ah, but I again detect making a simple problem harder.

All we really need is 'static interfaces' - a way to declare that a certain class adheres to a certain contract; where the contract involves a list of static methods. Java doesn't support this, but it's not too difficult to imagine a language change that makes it possible. Then you could do something like:

(nb: There's always 1 generics parameter, and its always the implementing type).

public contract NUMBER<T> {
public static T zero();
public static T seq(T x);
}

public class Integer implements Number conforms NUMBER /* <Integer*> is implied and may be omitted */ {
public static Integer zero() { return 0; }
public static Integer seq(Integer i) { return i+1; }
/* ... other methods of integer ... */
}

You can now do:

public static T one(NUMBER<T> N) {
return N.seq(N.zero());
}

assert 1 == one(Integer.class);

Under the hood, at first usage of any contract, a singleton is created with the standard useful methods (.newInstance(), .class() - that second one required due to lack of generics reification), and all the methods defined in the contract (in this case, seq(T) and zero()). This singleton is what you get when you 'cast' a class to a contract it implements - for example by passing a class literal in a place where a contract is expected. There's precedence for this type of syntactic sugar: java's enum is a similar fake type that is really just a lot of smoke and mirrors; in actual fact you're getting java.lang.Enum subclasses.




Which use-cases for type classes can't you solve with this, presuming for a moment that we aren't interested in slapping bandaids on incomplete / badly designed APIs in this way?

James Iry said...

> Ah, but I again detect making a simple problem harder.

> Which use-cases for type classes can't you solve with this, presuming for a moment that we aren't interested in slapping bandaids on incomplete / badly designed APIs in this way

Ah, but I again detect making an impossible problem look simple. Every API is by definition incomplete for my needs. If an API was truly complete then I wouldn't need to write any code.

The post-hoc nature of type classes allows me to succinctly express generic concepts that are outside the scope of the original API's design. It allows me to impose a common structure over several types that somebody else wouldn't think had any useful relationship.

Reinier Zwitserloot said...

James:

Wrappers.

matroska said...

Cannot it be written also with

def pythagoras[T : Num](a: T, b: T) =
((a * a) + (b * b)).sqrt

using Context Bound instead View Bound with Scala 2.8?

apocalisp said...

Again, I detect somebody proposing an extremely complicated solution involving subtyping, to a problem that is solved simply and elegantly with type classes.

Blog Archive

About Me

A salsa dancing, DJing programmer from Manchester, England.

My Blog List