General musings on programming languages, and Java.

Wednesday, September 10, 2008

Implementing the Builder pattern in Java without repeating code

While writing Java wrappers around some CGI requests at work, I began with a normal implementation of the builder design pattern. Then I realised that I was going to have to do this for about 50 CGI requests, that some of them were nested (CGI requests with query parameters to be sent on to a further CGI request on another machine), and that many of the parameters had interesting constraints. The API might be fine, but the implementation we were looking at, Josh Bloch's, encouraged repetition of logic.

Anyway, here's an example to get us started. The problem:

Person person = new Person("John", "Jackson", 1979, 11, 10, "AC2193");
We wanted something more like:
Person person = new Person.Builder()
    .forename("John")
    .surname("Jackson")
    .yearOfBirth(1979)
    .monthOfBirth(11)
    .dateOfBirth(10)
    .nationalInsuranceNumber("AC2193").build();
To keep the code short, we'll use a simpler class as an example, a Person consisting of name and age. As you'll see, even this can be large enough to be interesting.
public class Person
{
    private final String name;
    private final int age;

    private Person(Builder builder)
    {
        // Copy the values out of the builder; Person has no builder field.
        this.name = builder.name;
        this.age = builder.age;
    }

    public static class Builder
    {
        private String name;
        private int age;

        public Builder name(String name)
        {
            this.name = name;
            return this;
        }

        public Builder age(int age)
        {
            this.age = age;
            return this;
        }

        public Person build()
        {
            return new Person(this);
        }
    }

    public String getName()
    {
        return name;
    }

    public int getAge()
    {
        return age;
    }
}
Even for 2 parameters, this is quite a lot of code, though thus far there isn't really any logic to be duplicated. Now let's look at a really simple constraint: that values cannot be set twice.

The most obvious way of trying this would be to have a boolean alongside each field in the builder, e.g.:

private String name;
private boolean nameAlreadySet;

private int age;
private boolean ageAlreadySet;
And then in the name(String) and age(int) methods in the Builder you would check that boolean, and throw an IllegalStateException if the value had already been set. This is clearly repetition, which can lead to copy-and-paste errors or just make things hard to change.
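
As a rough sketch of what that looks like in full (FlaggedPersonBuilder and its accessor names are made up for illustration, not taken from the real code), note how the second setter is the first one copied and adapted by hand:

```java
// Hypothetical sketch of the flag-per-field approach; the class name is illustrative only.
final class FlaggedPersonBuilder {
    private String name;
    private boolean nameAlreadySet;

    private int age;
    private boolean ageAlreadySet;

    FlaggedPersonBuilder name(String name) {
        if (nameAlreadySet) {
            throw new IllegalStateException("name has already been set");
        }
        this.name = name;
        nameAlreadySet = true;
        return this;
    }

    FlaggedPersonBuilder age(int age) {
        // The same check again, repeated for every field.
        if (ageAlreadySet) {
            throw new IllegalStateException("age has already been set");
        }
        this.age = age;
        ageAlreadySet = true;
        return this;
    }

    // Accessors so the stored values can be inspected.
    String currentName() { return name; }
    int currentAge() { return age; }
}
```

With 50 requests and several parameters each, every new constraint would mean another field and another hand-copied check per parameter.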

In object-orientated programming the usual way of handling this would be to package the field with its boolean in an object and call it, say, MaxOnce. There is a good reason not to go down this path, though: it's difficult to chain MaxOnce with other such types, for example when we want BoundedInt, which prevents values outside a certain range, to work with MaxOnce. So we have the problem that the classes don't work together well. Time for another approach.
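
To make the chaining problem concrete, here is a minimal sketch of the two wrappers. Only the names MaxOnce and BoundedInt come from the text above; the fields and methods are assumptions for illustration. Each class owns its own storage, so wrapping one in the other does not combine their rules:

```java
// Hypothetical wrappers; the method names and fields are assumptions.
final class MaxOnce<T> {
    private T value;
    private boolean alreadySet;

    void set(T newValue) {
        if (alreadySet) {
            throw new IllegalStateException("value has already been set");
        }
        value = newValue;
        alreadySet = true;
    }

    T get() { return value; }
}

final class BoundedInt {
    private final int min;
    private final int max;
    private int value;

    BoundedInt(int min, int max, int initial) {
        this.min = min;
        this.max = max;
        set(initial);
    }

    void set(int newValue) {
        if (newValue < min || newValue > max) {
            throw new IllegalArgumentException(newValue + " is outside " + min + ".." + max);
        }
        value = newValue;
    }

    int get() { return value; }
}
```

A MaxOnce<BoundedInt> only stops the BoundedInt object from being replaced. Calling get().set(...) on it repeatedly still succeeds, so the "set at most once" rule is lost unless a third class is written for the combination.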

It would help if MaxOnce and BoundedInt were more like filters that data gets passed through (or not, if the data is invalid). Enter Parameter.

private final Parameter<String> name = maxOnce("name", null);

private final Parameter<Integer> age = bound(maxOnce("age", 0), 0, Integer.MAX_VALUE);
Notice how bound and maxOnce are chained together in the age parameter. It's easy to see how you might write other filters. Here's a largely useless example:
Parameter<Integer> number = not(5, bound(maxOnce("a number from one to ten, but not five", 0), 1, 10));
For a Parameter that has no default, it might be handy to store the value as an Option, rather than use null, or an empty String or some other sentinel. In another case we want to store the value as a TreeMap (for sparse arrays, mentioned later). So generally we'd like to be able to specify an input type and an output type for a Parameter.
private final Parameter<String, Option<String>> name = maxOnce("name");

private final Parameter<Integer, Option<Integer>> age = bound(maxOnce("age"), 0, Integer.MAX_VALUE);
Note that bound and maxOnce work together for the age parameter, as two filters.
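
One way such filters could be written is sketched below. This is a guess at the shape, not the real implementation: the reduce and initial methods are assumptions, and java.util.Optional (Java 8) stands in for the Option type. A Parameter here holds no value itself. It only says what to store when a new value arrives, which is what lets bound wrap maxOnce:

```java
import java.util.Optional;

// Hypothetical shape for a Parameter: no stored state, only rules for combining values.
interface Parameter<I, O> {
    String name();
    O initial();                     // what is stored before any value arrives
    O reduce(O stored, I incoming);  // what is stored after a new value arrives
}

final class Parameters {
    private Parameters() {}

    // Accepts the first value and rejects any later one.
    static <T> Parameter<T, Optional<T>> maxOnce(final String name) {
        return new Parameter<T, Optional<T>>() {
            public String name() { return name; }
            public Optional<T> initial() { return Optional.empty(); }
            public Optional<T> reduce(Optional<T> stored, T incoming) {
                if (stored.isPresent()) {
                    throw new IllegalStateException(name + " has already been set");
                }
                return Optional.of(incoming);
            }
        };
    }

    // Checks the range first, then passes the value on to the wrapped parameter.
    static <O> Parameter<Integer, O> bound(final Parameter<Integer, O> inner,
                                           final int min, final int max) {
        return new Parameter<Integer, O>() {
            public String name() { return inner.name(); }
            public O initial() { return inner.initial(); }
            public O reduce(O stored, Integer incoming) {
                if (incoming < min || incoming > max) {
                    throw new IllegalArgumentException(
                        inner.name() + " must be between " + min + " and " + max);
                }
                return inner.reduce(stored, incoming);
            }
        };
    }
}
```

Because bound only needs some Parameter<Integer, O> to delegate to, it can sit in front of maxOnce or any other filter with an Integer input without either knowing about the other.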

In a few cases, the Parameters that we use are allowed to take multiple indexed values. They are effectively sparse arrays. The Parameter's input type is a Pair<Integer, T> and the output type is a TreeMap<Integer, T> - each incoming Pair gets added to the TreeMap.

private final Parameter<Pair<Integer, String>, TreeMap<Integer, String>> = ...;
We can see that the Parameter is more a declaration than it is an actual value. It's quite handy, then, that Parameter holds no mutable state. We store the values in a Map<Parameter, Object>, but with slightly better types, wrapped up as a GenericBuilder. We don't modify that Map; we copy it when we add values, like CopyOnWriteArrayList does in the Java libraries.
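
The copy-on-add idea can be sketched as follows. Key and ImmutableValues are invented stand-ins for Parameter and GenericBuilder, and only the method names with, get and isDefault echo the code in this post; the rest is an assumption about how such a class might look:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Parameter: an immutable key with a default value.
final class Key<T> {
    final String name;
    final T defaultValue;

    Key(String name, T defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical stand-in for GenericBuilder: the map is never modified after construction.
final class ImmutableValues {
    private final Map<Key<?>, Object> values;

    ImmutableValues() {
        this(new HashMap<Key<?>, Object>());
    }

    private ImmutableValues(Map<Key<?>, Object> values) {
        this.values = values;
    }

    // Returns a new instance holding a copy of the map plus the new entry.
    <T> ImmutableValues with(Key<T> key, T value) {
        Map<Key<?>, Object> copy = new HashMap<Key<?>, Object>(values);
        copy.put(key, value);
        return new ImmutableValues(copy);
    }

    // Safe cast: with(...) only ever stores a T against a Key<T>.
    @SuppressWarnings("unchecked")
    <T> T get(Key<T> key) {
        return values.containsKey(key) ? (T) values.get(key) : key.defaultValue;
    }

    boolean isDefault(Key<?> key) {
        return !values.containsKey(key);
    }
}
```

Anything holding an earlier instance never sees later additions, so an object built from one of these cannot be changed through a builder that is still in use.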

Here's the original Person class with a new implementation:

public class Person
{
    private final GenericBuilder finalValues;

    private static final Parameter<String, Option<String>> nameParam = param("name", "The name of the person", Conversion.<String>identity());

    private static final Parameter<Integer, Option<Integer>> ageParam = notNegative(param("age", "The age of the person", Conversion.stringToInt));

    private Person(GenericBuilder finalValues)
    {
        if (finalValues.isDefault(nameParam) || finalValues.isDefault(ageParam))
            throw new IllegalStateException();

        this.finalValues = finalValues;
    }

    public static final class Builder
    {
        private GenericBuilder realBuilder = new GenericBuilder();

        public Builder name(String name)
        {
            realBuilder = realBuilder.with(nameParam, name);
            return this;
        }

        public Builder age(int age)
        {
            realBuilder = realBuilder.with(ageParam, age);
            return this;
        }

        public Person build()
        {
            return new Person(realBuilder);
        }
    }
      
    public String getName()
    {
        return finalValues.get(nameParam);
    }

    public int getAge()
    {
        return finalValues.get(ageParam);
    }
}
There are a couple of extra bells and whistles in the real code, such as building and parsing URLs containing the parameters. I have another post taking this one step further (using Parameters from code generation) in the pipeline.

12 comments:

Hamlet D'Arcy said...

This doesn't leave Person as immutable. You can modify it by holding onto a reference of the builder, right? Because the GenericBuilder's contract doesn't specify that each parameter can only be set once. I think the private constructor on Person should copy the builder parameters into final fields of simple data types.

This is great, and I'm all for it... but how do your co-workers feel about it? I've found three things in going with this approach... 1) the Option type is a good idea, 2) I really only need three generic interfaces in Java and most all other types can be expressed in terms of those: unit -> 'a, 'a -> 'b, and 'a -> 'b -> 'c. and 3) My co-workers hate me for trying to make Java a functional language.

On second thought, don't tell me how your co-workers feel. I can probably guess already and you'd risk turning your blog into another tired, pointless, "let's talk about the feel of Java" musing.

Ricky Clarkson said...

Hamlet,

Person is immutable, because GenericBuilder is. GenericBuilder.with returns a new GenericBuilder with a new TreeMap.

I will find out how my co-workers feel about this next week - this blog post is an adaptation from some "design rationale" that I wrote in advance of a code review. Perhaps I missed something out that would have helped you to realise that Person was immutable.

I have tried quite hard to make this immutable but without it appearing to be a maze of lambdas. The planned blog post in which I use Parameters to generate source might a) be useful if the code review results in a rewrite, b) allow me to generate other languages than Java from the same data - C++ and Delphi are possible use cases.

Ricky Clarkson said...

Correction, it returns a new GenericBuilder with a new HashMap, not a new TreeMap.

Hamlet D'Arcy said...

I would be tempted to modify Person to have name and age final fields and then annotate them with the data being represented by the current Parameter fields. That would hide some of the complexity of the GenericBuilder dependency tree and if you needed your Pojo serialized then you could do so without bringing the rest of the dependencies along for the ride.

Anonymous said...

Nice post Ricky..

Martin Dobmeier said...

Ricky,

very interesting post. I like that idea (been in the same situation once and have gone for your "obvious", but ugly, solution). I have to admit though that I don't understand every point you made. I hope you don't mind me asking a couple of dumb questions?

1) As far as I understand, the Option type stores the value passed to the builder or a default value if no value has been passed, right? So Option allows you to enforce a mandatory field? And the GenericBuilder is responsible for un-packing the value from the Option type?

2) What is the meaning of the Conversion type passed to the param method? What are the other parameters supposed to do? Are the methods param() and nonNegative() statically imported?

3) When are the constrains (like nonNegative) evaluated? Is this also done in the GenericBuilder?

Thanks, Martin

Ricky Clarkson said...

"So Option allows you to enforce a mandatory field?"

Yes, absolutely.

"And the GenericBuilder is responsible for un-packing the value from the Option type?"

No, actually, the Builder implementation is, because only that can know what it wants to do with a missing mandatory field.

"What is the meaning of the Conversion type passed to the param method?"

It is used for converting values to and from Strings (at least, in the version that exists since this blog post), which I use for generating and parsing URL parameters like foo=bar&baz=eggs, and for generating and parsing CSV.

"Are the methods param() and nonNegative() statically imported?"

Yes.

"When are the constrains (like nonNegative) evaluated?"

When a new value is to be stored. Actually the constraints are only a specialisation of a more general technique, reduction. Given a new value and an already-stored value, the Parameter can compute what value to store thereafter, i.e., it can reduce 2 values to 1, and thus reduce n values to 1 when applied n-1 times.

I have updated this code a lot since the blog post - it went down pretty well with my co-workers but it has changed enough, and in ways that I like, that I might blog about this again, this time with something the reader can actually execute.

hannan said...

I like that idea (been in the same situation once and have gone for your "obvious", but ugly, solution). And it is useful for me and most people who have the same problems. And I have added this article in my folder /url, to make more and more people know this. Thanx for your post!

Hot Water Systems said...
This comment has been removed by a blog administrator.
edward said...
This comment has been removed by the author.
edward said...

Thanks Ricky. I had the same problem and your post gave me the solution.

Anonymous said...

Indeed it's a nice pattern, and it is also mentioned by Joshua Bloch in Effective Java. I have also shared one example in my post What is Builder Pattern in Java; let me know how you find it.

About Me

A salsa dancing, DJing programmer from Manchester, England.