General musings on programming languages, and Java.

Monday, July 21, 2008

Designing an Object

If there may exist an object with a method that is not appropriate at all points in the existence of the object, then the object or the method are flawed.

A class encapsulating a compile phase in an IDE might have a blocking or non-blocking execute() method, plus a getErrorMessages(). There is an obvious protocol in using this class - instantiate, call execute(), call getErrorMessages(). It's not particularly hard to use, though it's also not hard to get wrong. Even if you decide not to help those users who don't bother to learn the protocol, it's worth thinking about whether that protocol should even exist, and what the alternative is.

Many readers would probably, when prompted at least, make execute() return the error messages (or a Future for them), which solves the problem quite well. If you wouldn't, keep adding phases plus methods only appropriate for each phase, to one class that grows and grows, until you end up agreeing or changing career :) Anyway, in this case it's clear that the object was flawed by doing two things one after the other - executing the compile phase and delivering results.

I bet most people could train themselves to spot this flaw and remove it, and if anyone only goes that far as a result of reading this post I'll be happy. But most people are probably quite happy with another flaw, java.util.Iterator.next(), which is only allowed when hasNext() returns true, in most Iterator implementations. But moving next() or hasNext() onto another object doesn't really work for Iterator. For a long time I was unhappy with Iterator, but didn't really have a solution, despite trying a couple of things out.

The biggest use of Iterator directly in Java for many years was in what has since been replaced by a foreach loop. There are some detractors of, well, anything new, but generally the foreach loop was really well received by the Java community. It provides a higher-level interface than the Iterator gives us. We can write a lot of code using the foreach loop that would have been more verbose and awkward to get right using Iterator directly. But foreach is only one abstraction; there are some more that are higher still than it, and don't (but can) depend on Iterator. If you don't know what those abstractions are I really think you should take the time to learn about map, filter and reduce, and the more general but less usable parent of those, fold. But this isn't a post about those, so I'll return to the topic at hand.

Iterator has been shown to be flawed, though flawed in a way that is acceptable to most of us and in a way we're used to, and in a way that seems non-trivial to solve (without knowing about map, filter, reduce and fold!). You might not be in a position to, or even want to, replace Iterator, but at least you should know not to copy its design, or bind yourself unnecessarily to it. You're now either armed with a simple way of deciding between two API designs, or you're about to tell me why I'm wrong.

Happy coding.

7 comments:

Alex Miller said...

Some call this "temporal coupling" meaning that there is a coupling in time between how you call methods on an interface. I think I probably first heard that from Dave Thomas 7 years ago or so.

Hamlet D'Arcy said...

This is covered fairly well in the 2nd edition of Joshua Bloch's Effective Java, Item 59: Avoid unnecessary use of checked exceptions.

He describes breaking a single method into two ( an isThingValid() and doThing() ) as a way to turn a checked exception into an unchecked one. If they were one method then the authors might throw a checked exception to indicate the invalid state, while if the API is two methods then they can throw a runtime exception because the users are assumed to have performed the check.

The tradeoff of using 2 methods is a more flexible API at the expense of an uglier calling sequence. And it's usually not appropriate during concurrent situations where the object's state can change.

Reinier Zwitserloot said...

No, Iterator is not the same thing as your hypothetical phased object.

getErrorList() doesn't make any sense until you call the execute() method. Without calling execute() the getErrorList() doesn't even come into play at all.

This is -NOT- true for Iterator.next(); you may use Iterators without ever calling hasNext, instead relying solely on NoMoreElementsException, or whatever it is that Iterator is specced to throw if you call next() when there are no more objects.

hasNext() is technically bad design, in the same way that this:

if ( !map.containsKey("foo") ) map.put("foo", 0);

is considerably worse than this hypothetical:

map.putIfAbsent("foo", 0);

because spreading out calls like that is dumb in a multi-threaded world.


@hamlet d'arcy: Yeah. Ignore Josh's advice there, he screwed that up. Check+Call is a fundamentally broken paradigm. Period. If the exception bothers you, return a 'Maybe'-like construction instead. For example, Iterator could have returned a Maybe<T>, which has a method 'exists()', which you could call to see if there's an element, or if you ran out. Calling next() after you've run out always gives you a Maybe that returns 'false' on 'exists()', and throws an exception on its get() call. Seems very similar to checking with hasNext(), except that this time, the Maybe is immutable. Or, just return 'null' if there's nothing there, though then you run into problems if the collection you're iterator over contains nulls.

Hence the following are equivalent, assuming the Iterator takes care of thread synchronicity internally:

synchronized ( it ) { if ( it.hasNext() ) return it.next(); else return null; }

Maybe<T> foo = it.next();
if ( foo.exists() ) return foo.get(); else return null;

note that in the second version, no synchronized needed. For convenience sake you might want to add a method to Maybe: ifExistsReturnObjectElseReturn(null); which you could use to one-liner the above into:

it.next().ifExistsReturnObjectElseReturn(null);

which, aside from my lack of creativity in coming up with a good name for this thing, seems like quite an improvement.

I like alex/Dave Thomas's term, 'temporal coupling'. Exactly what's going on, and not a good thing to have. Then again, aren't ALL methods that mutate the state of an object 'temporally coupled'? Maybe the term is a bit too broad.

GM said...

reiner:
I like alex/Dave Thomas's term, 'temporal coupling'. Exactly what's going on, and not a good thing to have. Then again, aren't ALL methods that mutate the state of an object 'temporally coupled'? Maybe the term is a bit too broad.

No, the term's fine. A mutation is never a "good thing to have". The question is to what lengths are you prepared to go to avoid it? (and the answer is "depends how much code quality is worth to you.")

Anonymous said...

One place it's really hard to avoid temporal coupling is with resources, e.g. Closeable.

Hamlet D'Arcy said...

@anonymous - Check out ARM blocks in Java and the Using keyword in F# (C# has it too, right?).

Reinier - This seems very close to the OCaml/F# option type, where a value can be either Some or None (except in those languages the compiler forces you to deal with the None case whereas in Java you can still call Maybe#get without the Maybe#exists check. Is cool). Thanks for the description though, I forwarded it to me team to start using!

Ricky Clarkson said...

For your reference, guys, the next post covers what Reinier mentioned - Maybe, from Haskell. It's equivalent to Scala's Option type, and probably has parallels in many languages.

I'll ask my manager to put Effective Java on our Safari bookshelf so that I can reread what Josh said about exceptions, but I think it's an awful idea to replace an exception with a protocol (or temporal coupling), especially when an alternative (Option or Either) exists. But then I suppose in the day of Effective Java's 1st edition, the lack of generics made Option and Either much less attractive.

Blog Archive

About Me

A salsa dancing, DJing programmer from Manchester, England.