General musings on programming languages, and Java.

Sunday, February 18, 2007

Ricky's Properties for Java

While I think there should be some language change to make properties really useful, it's worth looking at how close we can get to properties without changing the language, to work out what needs changing. Suppose you were to write a method called, say, setField, or setf for short, that takes a property and a value, and sets it.  This is fairly reasonable as something you might like to do with arbitrary properties, for, say, a GUI.

  <T> T setf(Property<T> property,T value)
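
To make that signature concrete, here is a minimal sketch. The Property interface, the SimpleProperty class and the PropertyUtil holder are names made up for illustration, and having setf return the value it just set is an assumption; the signature alone doesn't say what is returned.

```java
// Hypothetical Property interface: just a getter and a setter.
interface Property<T> {
    T get();
    void set(T value);
}

// Trivial implementation, used only to exercise setf.
final class SimpleProperty<T> implements Property<T> {
    private T value;

    SimpleProperty(T initial) {
        this.value = initial;
    }

    public T get() {
        return value;
    }

    public void set(T newValue) {
        this.value = newValue;
    }
}

final class PropertyUtil {
    private PropertyUtil() {}

    // Sets the property, then returns the value just set.
    // Returning the new value is an assumed convention.
    static <T> T setf(Property<T> property, T value) {
        property.set(value);
        return value;
    }
}
```

Because the type parameter T ties the property and the value together, a call such as PropertyUtil.setf(titleProperty, 42) on a Property<String> would be rejected at compile time.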
This approach relies on having one object per property, so it's easy to see it as a memory leak.  It's not actually a memory leak, it's a memory overhead.  Suppose that you create 1000 objects, each of which takes 100 bytes normally.  Now you change them to expose Property objects as public final fields, and each object now takes 1000 bytes.  It's only a (potential) leak if you need to create objects every time you use a Property.  What we do have now, though, is n more objects, where n is the number of properties.

This isn't a hopeless situation; there is a possible VM-based solution, holding Property objects directly, i.e., without a pointer, as part of the object they're in.  This requires the size of a Property to be known by the verifier, so the actual Property implementation would need to be known at load time.

While this might seem like a hopelessly early optimisation, it's worth thinking about now, because if properties do get implemented in the language, and they are completely flexible (so that you can replace a Property object at runtime), then we'll no longer be left with the possibility of this optimisation.  A halfway house would be to make it possible to prevent a Property object from being replaced, or for the VM to be smart enough to tell which Properties aren't going to be replaced.

Obtaining a Property object is tricky

If we want setf to work with properties defined by existing code, we should be able to recognise the getX/setX convention, and make those into Properties.  Let's look at how we can create a legacy Property using current (Java 5/6) code:
  Property<String> nameProperty=new LegacyProperty<String>(new GetterAndSetter<String>() {
    public String get() {
      return object.getName();
    }
    public void set(String s) {
      object.setName(s);
    }
  });
This gets a bit shorter with the BGGA closures syntax, or even method references, but it doesn't get any sweeter.  It's still pointless duplication.

Another implementation would use reflection.

  Property<String> nameProperty=new ReflectiveProperty<String>("name",object);

Obviously there are the usual problems with this, such as type safety not being guaranteed at compile time, performance, and the need for tool support if the programmer is to be certain that it is refactor-proof.

There is an extra problem caused by erasure of generic types; there's no way of knowing that nameProperty really is a Property<String>.  setName could take an instance of some 'Name' class.  This is not simply a case of choosing dynamic typing over static typing, because erasure doesn't give us a choice.  The reflective solution is not typesafe at all unless we either implement reification or give ReflectiveProperty a 'type token', in this case String.class.
  Property<String> nameProperty=new
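
As a rough sketch of the type-token idea, something like the following could work. Everything here is an assumption made for illustration: the Property interface, the constructor's argument order, the Customer bean, and the choice to fail at construction time rather than on first use. It only handles reference types, not primitives.

```java
import java.lang.reflect.Method;

// Hypothetical Property interface: just a getter and a setter.
interface Property<T> {
    T get();
    void set(T value);
}

// Hypothetical reflective property. The Class<T> type token lets it check
// the bean's getX/setX pair when the property is constructed.
final class ReflectiveProperty<T> implements Property<T> {
    private final Object bean;
    private final Class<T> type;
    private final Method getter;
    private final Method setter;

    ReflectiveProperty(String name, Object bean, Class<T> type) {
        String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        Method foundGetter;
        Method foundSetter;
        try {
            foundGetter = bean.getClass().getMethod("get" + suffix);
            // Looking the setter up by parameter type fails if setX takes another type.
            foundSetter = bean.getClass().getMethod("set" + suffix, type);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                "No get/set pair for '" + name + "' of type " + type.getName(), e);
        }
        if (!type.isAssignableFrom(foundGetter.getReturnType())) {
            throw new IllegalArgumentException(
                "Getter for '" + name + "' does not return " + type.getName());
        }
        this.bean = bean;
        this.type = type;
        this.getter = foundGetter;
        this.setter = foundSetter;
    }

    public T get() {
        try {
            // The cast is checked at runtime against the type token.
            return type.cast(getter.invoke(bean));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public void set(T value) {
        try {
            setter.invoke(bean, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Example bean following the getX/setX convention (made up for this sketch).
class Customer {
    private String name = "unknown";

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```

With this sketch, asking for a ReflectiveProperty<Integer> on Customer's name fails when the property is created, which is the earliest a purely reflective approach can catch the mistake. It is still a runtime check, not a compile-time one.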

It doesn't work with legacy bean-manipulating code

Suppose I write a new class and don't write getters or setters, but instead expose my fields via Properties.
  class Person
  {
    public final Property<String> name=new DirectProperty<String>("unknown");
  }
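
For a version of that class that compiles on its own, here is a minimal sketch. It assumes DirectProperty simply stores the value inside the Property object; the Property interface and the BeanStyleLookup helper are made-up names. The helper does the kind of naive lookup by method name that bean-manipulating code might perform.

```java
// Hypothetical Property interface: just a getter and a setter.
interface Property<T> {
    T get();
    void set(T value);
}

// Assumed implementation: the value lives in the Property object itself.
final class DirectProperty<T> implements Property<T> {
    private T value;

    DirectProperty(T initial) {
        this.value = initial;
    }

    public T get() {
        return value;
    }

    public void set(T newValue) {
        this.value = newValue;
    }
}

class Person {
    public final Property<String> name = new DirectProperty<String>("unknown");
}

final class BeanStyleLookup {
    private BeanStyleLookup() {}

    // Returns true if the class has a public no-argument method with this name.
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        try {
            type.getMethod(methodName);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Usage is person.name.get() and person.name.set("Ricky"), while BeanStyleLookup.hasPublicMethod(Person.class, "getName") returns false.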
Now any code that reflects on Person looking for getX/setX methods won't find any.  It's arguable that the code should use Introspector to introspect, rather than direct manipulation, and hence that I could provide a BeanInfo class for Person, but not all the code that manipulates beans uses Introspector.

Erasure could make a List of Properties useless

Suppose you asked a bean for all its properties, either directly or via some introspector.  You'd get a List<Property<What?>>.  It cannot be Object.  It can be ?, though this would prevent set from being called.  It can be a raw type.  In any case, erasure will stop us from seeing the actual type of the property, unless we add a type token, as mentioned earlier.

What needs to change?

Now let's take the above and make it convenient to use by changing the language a little.

1.  All getX/setX pairs are properties.  This includes isX, read-only properties and write-only properties.  This allows new code to work with existing beans.
2.  All explicit property declarations generate getX/setX or isX/setX pairs at compile time.  This allows new beans to work with existing code.
3.  The generated code simply calls the Property's get/set methods.  There is no generated field in the declaring class, other than for the Property itself.
4.  A syntax is provided for getting at a named Property given the name of a bean.  This is statically checked for correctness.
5.  A syntax is provided for getting the value or setting the value of a property.  The '.' operator will suffice.  A field and a property cannot exist with the same name, which avoids compatibility issues with existing code.

The easiest argument against this is also the easiest to refute, namely that it calls non-obvious code.  The same argument could be used to reject polymorphism.  Plus, there are already precedents in Java.  arrayElement[index]=value is an assignment that does more than it appears to - it checks bounds.  String concatenation calls .toString() on objects.  + promotes low primitive types to int.  These are all good things; there's nothing fundamentally wrong with calling non-obvious code, as long as it is possible to discover what code is actually called.
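
Points 2 and 3 of the list above can be sketched in today's Java by writing out by hand what a compiler might generate. This is only a guess at the shape of the output: the Property interface, DirectProperty, and the field name nameProperty are all assumptions for illustration, and no property declaration syntax is implied.

```java
// Hypothetical Property interface: just a getter and a setter.
interface Property<T> {
    T get();
    void set(T value);
}

// Assumed implementation: the value lives in the Property object itself.
final class DirectProperty<T> implements Property<T> {
    private T value;

    DirectProperty(T initial) {
        this.value = initial;
    }

    public T get() {
        return value;
    }

    public void set(T newValue) {
        this.value = newValue;
    }
}

// Hand-written stand-in for what an explicit property declaration
// might expand to at compile time.
class Person {
    // Point 3: the only stored field is the Property itself.
    public final Property<String> nameProperty = new DirectProperty<String>("unknown");

    // Point 2: a bean-style accessor pair that existing code can find.
    // Point 3: each accessor simply delegates to the Property.
    public String getName() {
        return nameProperty.get();
    }

    public void setName(String value) {
        nameProperty.set(value);
    }
}
```

Existing code that calls person.getName() and new code that works with person.nameProperty would then see the same value, since there is only one place it is stored.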

The strongest argument in favour is readability - there should be no readability price for using properties.  Currently there is a price.


sproket said...

"Properties remove the readability price that we currently pay for using accessors instead of public fields."

You mean like:

for (int j = 0; j < count; j++) {

as opposed to :

for (int j = 0; j < getCount(); j++) {

OOPS! I shouldn't have a getter in a loop!

Maybe it should be

int count = getCount();
for (int j = 0; j < count; j++) {

See how properties REDUCE readability AND can introduce performance problems in code?

"Properties remove the code duplication price that we currently pay for getters and setters. That price is better measured in time to read code than keypresses."

In IntelliJ I press alt-insert and then choose the fields and press enter. Yeah, lots of keystrokes. Additionally I can view the structure of the class which quickly indicates fields vs properties.

Try a little harder.

Niels Bech said...

Well Sproket,

The for loop is basically designed to test the condition at each iteration, so neither count (as a property) nor getCount() would hold the "performance gains" you describe in the third example, and they shouldn't.

The performance gain is when using the for condition in a finite expression. Remember that the count property may change its value during the code we iterate over. Hence holding the "true" number at the beginning of the iteration can be a good thing.

As to the price of a method accessing a property value, I think that encapsulation is a good idea and both forms yield little or no overhead in the simple getter.

Anonymous said...

why not just use some annotation and bytecodegeneration via cglib or something else?

so everyone can choose to do it. and we don't break old code which may happen. This is illustrated by the for loop example.

Axel Rauschmayer said...

Excellent proposal! It does away with a lot of confusion that still exists, even in the Java API: one sometimes has getName(), sometimes name(). With your kind of first-class properties, things are a lot clearer and we'd get some of the elegance that Self has had for decades. Gilad Bracha's blog entry applies here, as well.

Note: Property change events are important to consider, too. Having the listener handling etc. generated is a big plus for GUI programming (data binding!).

All these new features are backwards compatible, easy to understand and not syntactically awkward. I wish closures could be as elegantly done (well, maybe if they were only syntactic sugar for single-method implementations plus runtime exceptions for return, break and continue).

About Me

A salsa dancing, DJing programmer from Manchester, England.