General musings on programming languages, and Java.

Friday, June 29, 2007

On how the lack of macros damages Java code

Update - commenters pointed out an obvious mistake - I should have kept the sample code here the same as my real code. I've updated the code samples.

Some time ago I asked why unit tests use assertions instead of returning a boolean. I didn't get any particularly convincing answers, so I continued returning a boolean in my home-grown test framework, which is so small it hardly deserves the name 'framework'. One obvious benefit of using exceptions is that you get the name of the test method and class in the exception output. In my framework, I had to supply that explicitly: each test object had a getName() method containing return "testUninstallingDrivers". Ok, now I have duplication:


public static final UnitTest testUninstallingDrivers=
    new UnitTest()
{
    public boolean test()
    {
        // stuff..
        return !card.hasDrivers();
    }

    public String getName()
    {
        return "testUninstallingDrivers";
    }
};
The usual "Java programmer" approach would be to just go back to exceptions, then you'd be able to get your IDE to show you the test method by clicking on the stack trace. However, there were some reasons not to go back to exceptions. I'd instead like to generate the getName() implementation based on the field name. I believe at least 2 out of the 3 major IDEs can help me with some kind of template like this:

public static final UnitTest $x=new UnitTest()
{
    public boolean test()
    {
        //Auto-generated method stub.  To stop this
        //annoying message, go through some dialogs
        //that you never really wanted to anyway.
        return false;
    }

    public String getName()
    {
        return "$x";
    }
};
So, that's all well and good, until I start refactoring, because the IDEs do not carry on linking the two expansions of $x even after the template has been applied. When I first heard someone rave about IDEA's Live Templates, I assumed they would provide such linking - they do not. Plus, even if they did, you'd still see the generated code, which isn't really that useful. The only part of the above code that's relevant to me is the name of the class and the code inside test. That is, I'd be happy with:

unitTest(testUninstallingDrivers,
{
    stuff..;
    return !card.hasDrivers();
})
..which would generate the above usual Java code. Of course, should I want to see the expanded template, I could just right-click on the template name or something. So what I'd really like from IDEA's Live Templates is for the template's name to stay in my source code, and to be expanded for compilation. But then, if it's staying in my source code, I can't compile without IDEA, or even with IDEA on another computer (unless I successfully manage my project settings). It would be better if the template could stay in my source code too. The source for the template. E.g.:

template unitTest(name,body)
{
    public static final UnitTest <name>=new UnitTest()
    {
       public boolean test()
       {
           <body>
       }

       public String getName()
       {
           return "<name>";
       }
    };
}
This would need a preprocessor before the usual Java compiler. There are some difficulties, like the <body> expansion above generating two sets of braces, but they can be resolved (a syntax like <@body> to splice in the body).

In case there's anyone out there who hasn't recognised this yet, I'm really talking about Lisp macros, and not even a significantly complex application of them. I don't see any reason why modern programmers don't use a decent macro system, pretending to be happy with 'Live' Templates, and refactoring everything at a higher level (such as changing the unit test framework to use exceptions) instead of working with the actual source code they like.

For a Lispy implementation of a unit test framework (in 26 lines of Lisp!), see this chapter from Practical Common Lisp.
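Without a preprocessor, one partial workaround in plain Java is to look the name up by reflection instead of generating it. This is not a macro, only a runtime substitute: the sketch below assumes the tests are held in static fields, as in the example above, and the names FieldNameSketch and nameOf are made up for illustration. Renaming the field in the IDE keeps the reported name in step, because there is only one copy of it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class FieldNameSketch {
    // Stand-in for the post's UnitTest type, without getName().
    interface UnitTest {
        boolean test();
    }

    public static final UnitTest testUninstallingDrivers = new UnitTest() {
        public boolean test() {
            return true; // placeholder body for the sketch
        }
    };

    // Returns the name of the static field in 'holder' that refers to 'test',
    // or null if no such field exists.
    static String nameOf(Class<?> holder, UnitTest test) {
        for (Field field : holder.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                if (field.get(null) == test) {
                    return field.getName();
                }
            } catch (IllegalAccessException e) {
                // Field is not readable from here; skip it.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(nameOf(FieldNameSketch.class, testUninstallingDrivers));
    }
}
```

The cost is a reflective scan at runtime, and the test body still sits inside the usual anonymous-class boilerplate, so this addresses the duplication but not the noise.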

Monday, June 25, 2007

Book Review - Release It! Design and Deploy Production-Ready Software

Release It! Design and Deploy Production-Ready Software, by Michael Nygard (Pragmatic Programmers)

You know the drill. Here's a sample case of how something went wrong in production. The author needs to pad out the book, let's make the case more convoluted. Maybe include some code. A book written in dry technical English, painful to write and worse to read. That's what I expected when I received this book, except for the 'Pragmatic Bookshelf' logo on the front. I was pleasantly surprised; none of the above apply.

The book took me quite some time to get through, not because it's padded out, but because it's dense. Despite being written in plain (and sometimes overly American) English, a lot of information is presented, and very readably. The author has clearly spent a lot of effort in making sure that the book doesn't patronise, yet does explain its terms clearly.

Even though I'm not exactly the target market for this book, as it is aimed at enterprise developers (the author defines enterprise as systems that cost money whenever they go down), I enjoyed it, and I can apply quite a lot of it to my lowly non-enterprise project. There are a lot of references to Java in the book, but I don't think that should deter those who don't use Java, as the same points can be applied to other technologies. Rarely does it go deeply into Java-specific problems.

Stop Waffling, And Tell Me What The Book's About

Stability, capacity and operations.

The stability part covers ways of making sure that errors in software don't spiral out of control. Even where there are failover subsystems in place, and everything seems isolated from everything else, there may be a hidden route of chaos, such that when one system goes down in a certain way, all the rest do. The 'Circuit Breaker' pattern will help to stop that.

Capacity has a lot more human factors in it - how many sessions can your server support? Can you make sessions shorter in the event of a surge, saving RAM but annoying users? Does your testing environment match the production environment closely enough? Can you get away with static content for those users who are browsing, not buying? Do you really need pretty-printed HTML, adding 50K to data transfers?

Operations is about ensuring that the administrators have an easy time of it, that you can have meaningful logs that are useful, that you can inspect (and even tinker with) a running application, throttling backups so that capacity isn't affected, and generally making sure that you can.

A lot of the book may seem intuitive, but it's good to see your intuitions formalised, and taken a few steps further. Some programmers reading this book will be screaming out things like "use Erlang and this problem disappears", etc., and they'll often be right, but it's useful to understand a problem, even if your environment makes it impossible. Further, many problems will out in other ways even if an environment prevents the major cause.

So I'd recommend this book to any programmer - learn from Michael's mistakes, and the mistakes of those around him, then hopefully your own mistakes won't be so costly. I'm glad that I keep the review copy, as I will be looking back through it again. Plus, I read some of it in the bath, so it's got a few crinkled pages - I don't fancy posting it back to anyone!

The stories about what happened in production are always real, though the names are changed, and the author builds up a real sense of suspense that I've not seen often in technical books. The problems that he and his colleagues find are often hilarious, but the lessons drawn from them are important. I've often found, as I said at the start of this review, that stories in technical books are very dry, very boring and just padded out, but the author has enough writing skill, and great raw material, so that hasn't happened.

To give you a flavour of how the book is written, here are a few randomish paragraphs:

Avoid fiddling
Human intervention leads to problems. Eliminate the need for recurring human intervention. Your system should run at least for a typical deployment cycle without manual disk cleanups or nightly restarts.
It seems like common sense, but there are plenty of systems that require routine restarts. Common sense just isn't that common, it seems.
For example, I've seen badly configured proxy servers start re-requesting a user's last URL over and over again. I was able to identify the user's session by its cookie and then trace the session back to the registered customer. Logs showed that the user was legitimate. For some reason, fifteen minutes after the user's last request, the request started reappearing in the logs. At first, these requests were coming in every thirty seconds. They kept accelerating, though. Ten minutes later, we were getting four or five requests every second. These requests had the user's identifying cookie but not his session cookie. So, each request was creating a new session. It strongly resembled a DDoS attack except that it came from one particular proxy server on one Navy base.
Would you have thought of that in advance? I wouldn't, I'd have just blocked the user automatically, assuming malice.
Amazon ran into trouble with the Xbox 360, too. In November 2006, Amazon decided to offer 1,000 units for just $100. News of the offer spread far and wide. Not surprisingly, the 1,000 units sold within five minutes. Unfortunately, nothing else sold during that time, because millions of visitors hammered on their Reload buttons, trying to load the special offer page and score a huge discount on the hot console.
The context here is attacks of self-denial, where systems get attacked by themselves, by a special offer attracting an incredible surge of page requests.

Again, I'd recommend this to any programmer, and to any administrator who deals with bespoke solutions. Perhaps if all you do as an administrator is install/configure applications for which a support contract exists, and you can afford the time between a problem and the support engineers appearing, then you won't benefit from this book. But then surely you're only 5 minutes away from being replaced by a shell script..

The one missing part, for me, is a discussion of how to go about designing a system from scratch - all the case studies describe existing systems, and often only a small part of them - I'd like some suggestions on how to write a system that will be able to scale easily - and how to avoid writing systems that won't.

Thursday, June 21, 2007

Missing Refactor - Convert to Anonymous Class

I'm still looking through my source for stray uses of 'private', following on from my recent blog about removing private, favouring other forms of encapsulation. One way I like to remove private is by converting named implementations of interfaces into anonymous ones. Here's some real code that I want to change:


public final class EthernetCableIcon implements
  ActionListener
{
  private final MouseInputListener listener;
  private final GUIContext context;
  private final JToggleButton button;

  private EthernetCableIcon(final GUIContext context)
  {
    button=new JToggleButton
      ("Ethernet Cable",EthernetCableHandler.icon);

    this.context=context;
    listener=new MouseInputListener(context,button);

    button.setVerticalAlignment(SwingConstants.CENTER);
    button.setVerticalTextPosition(SwingConstants.BOTTOM);

    button.setHorizontalTextPosition
      (SwingConstants.CENTER);

    button.addActionListener(this);

    button.setToolTipText("Click on this to draw an "+
      "Ethernet Cable, then drag on the display to "+
      "make one appear");
  }

  public static JToggleButton newButton
    (final GUIContext context)
  {
    return new EthernetCableIcon(context).button;
  }

  /**
   * Replaces the NetworkView's mouse listeners with its 
   * own, so that the EthernetCable can be dragged
   * without causing other problems.
  */
  public void actionPerformed(final ActionEvent event)
  {
    final Component view=context.getNetworkContext()
      .networkView;

    if (button.isSelected())
    {
      context.getNetworkContext().toggleListeners.off();

      view.addMouseListener(listener);
      view.addMouseMotionListener(listener);

      view.setCursor(new Cursor(Cursor.HAND_CURSOR));
    }
    else
    {
      view.removeMouseListener(listener);
      view.removeMouseMotionListener(listener);

      context.getNetworkContext().toggleListeners.on();
      view.setCursor(new Cursor(Cursor.DEFAULT_CURSOR));
    }
  }
}
My IDE, IntelliJ IDEA, has lots of refactorings, including from anonymous to nested, from nested to inner, moving classes, etc., but has nothing for making this an anonymous class. It's used precisely once, from the static method newButton. I've refactored it by hand, which isn't too bad, but it does make me wonder why I have to. Eclipse never supported this refactor either. Post-refactor:

public final class EthernetCableIcon
{
  public static JToggleButton newButton
    (final GUIContext context)
  {
    final JToggleButton button=new JToggleButton
      ("Ethernet Cable", EthernetCableHandler.icon);

    final MouseInputListener listener=new
      MouseInputListener(context, button);

    button.setVerticalAlignment(SwingConstants.CENTER);
    button.setVerticalTextPosition(SwingConstants.BOTTOM);

    button.setHorizontalTextPosition
      (SwingConstants.CENTER);

    button.addActionListener(new ActionListener()
    {
      public void actionPerformed(final ActionEvent e)
      {
        final Component view=context.getNetworkContext()
          .networkView;

        if (button.isSelected())
        {
          context.getNetworkContext()
            .toggleListeners.off();
          
          view.addMouseListener(listener);
          view.addMouseMotionListener(listener);

          view.setCursor(new Cursor(Cursor.HAND_CURSOR));
        }
        else
        {
          view.removeMouseListener(listener);
          view.removeMouseMotionListener(listener);

          context.getNetworkContext().toggleListeners.on();

          view.setCursor
            (new Cursor(Cursor.DEFAULT_CURSOR));
        }
      }
    });

    button.setToolTipText("Click on this to draw an "+
      "Ethernet Cable, then drag on the display to "+
      "make one appear");
    
    return button;
  }
}
Whether it's more readable now probably depends on the reader, but it certainly doesn't need private and hasn't lost anything in the encapsulation. I find it slightly more comprehensible anyway.

Tuesday, June 19, 2007

Taking private out of code without losing encapsulation

When I started programming in Java, I took private and public as fairly necessary access levels. It's only in the last couple of years that I've realised that you can achieve better encapsulation than private - by making the variable disappear, or at least not have a name in the relevant scope.

As an exercise, I thought I'd take a look at my codebase and see how it is impacted if I get rid of private. But of course, I didn't want to expose implementation details either, so I did it with the power of thought, a scarce resource.

Low-hanging fruit first: Private constructors. Sometimes these are used to prevent instantiation of a utility class. IDEA actually warns me if I instantiate a utility class, so I don't need the constructor. It's a no-brainer to delete unused code.

Ok, now private fields with get/set methods. There are two kinds - the first is the simple problem of having a private field for no reason - the get and set methods just pass through to the variable with no checking. Make the variable public and get rid of the methods. IDEA helps me with this - Ctrl-Alt-N inlines a method.

Private fields with more interesting get/set methods. In my code, so far, I don't have any interesting get methods, just some set methods that do some work after the value has been set. This isn't an original idea:

Make a Property interface, using generics (no evil casting) with get, set and addPropertyListener. Provide an implementation as a static factory method. Now change the private field to be a Property, adding the logic from the set method to the Property via a PropertyListener. Fix the get and set method so that they delegate to the Property, make sure everything compiles, then inline the get and set methods. If your get and set methods are more interesting than mine, then perhaps you'll need a more advanced Property type, e.g., addPrecondition, addPostcondition, etc.
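A minimal sketch of what such a Property might look like. The method names get, set and addPropertyListener come from the description above; everything else (PropertySketch, PropertyListener's propertyChanged signature, the newProperty factory) is an assumption made for illustration, not the code from my project.

```java
import java.util.ArrayList;
import java.util.List;

final class PropertySketch {
    // Hypothetical listener shape: notified after every set().
    interface PropertyListener<T> {
        void propertyChanged(T oldValue, T newValue);
    }

    interface Property<T> {
        T get();
        void set(T value);
        void addPropertyListener(PropertyListener<T> listener);
    }

    // Static factory. The state lives inside the anonymous class,
    // so nothing outside can name it and no 'private' is needed.
    static <T> Property<T> newProperty(final T initial) {
        return new Property<T>() {
            T value = initial;
            final List<PropertyListener<T>> listeners =
                new ArrayList<PropertyListener<T>>();

            public T get() {
                return value;
            }

            public void set(T newValue) {
                final T oldValue = value;
                value = newValue;
                // The work that used to sit in the setter now lives in listeners.
                for (PropertyListener<T> listener : listeners) {
                    listener.propertyChanged(oldValue, newValue);
                }
            }

            public void addPropertyListener(PropertyListener<T> listener) {
                listeners.add(listener);
            }
        };
    }

    public static void main(String[] args) {
        final Property<String> address = newProperty("10.0.0.1");
        address.addPropertyListener(new PropertyListener<String>() {
            public void propertyChanged(String oldValue, String newValue) {
                System.out.println(oldValue + " -> " + newValue);
            }
        });
        address.set("10.0.0.2");
    }
}
```

A field of this type can be exposed as public final: callers can read, write and observe the value, but cannot bypass the listeners.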

Now for the fruit at the top of the tree. I use a Map to allocate IDs for Cables (it's a network simulator, and the IDs are useful for the event log). I don't want to expose the Map, but I do want to expose the operation I have on it - cableIDFor(Cable cable). How can I do that without using private? The Map has to persist between invocations, so I can't create it inside cableIDFor.

I'll make a field, called cableIDFor. It will hold an object that has one method. That method will take in a Cable and return an Integer. In other words, it's a Function<Cable,Integer>.


public final Function<Cable,Integer> cableIDFor=
    new Function<Cable,Integer>()
{
    final Map<Integer,Cable> cableIDs=hashMap();
 
    public Integer invoke(Cable cable)
    {
        for (final Map.Entry<Integer, Cable> entry: 
                cableIDs.entrySet())
            if (Caster.equalT(entry.getValue(), cable))
                return entry.getKey();

        for (int a=0;;a++)
            if (!cableIDs.containsKey(a))
            {
                cableIDs.put(a, cable);
                return a;
            }
    }
};
You call that like so: cableIDFor.invoke(cable), and of course now you get it as a Function for free, so if you want to pass it to some other method, you can. Oh, and yes, it's not a pure function, but I know that and treat it accordingly.
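For anyone who wants to try the idea in isolation, here is a self-contained sketch. It is a simplification rather than my real code: the Function interface and the empty Cable class are stand-ins, and the map is keyed by cable identity (an IdentityHashMap) instead of searching entries with Caster.equalT.

```java
import java.util.IdentityHashMap;
import java.util.Map;

final class CableIdSketch {
    // Stand-in for a one-method function type.
    interface Function<A, R> {
        R invoke(A argument);
    }

    // Stand-in for the simulator's Cable class.
    static final class Cable {
    }

    // The map is a member of the anonymous class, so it persists between
    // invocations but cannot be named by any other code.
    final Function<Cable, Integer> cableIDFor = new Function<Cable, Integer>() {
        final Map<Cable, Integer> ids = new IdentityHashMap<Cable, Integer>();

        public Integer invoke(Cable cable) {
            Integer id = ids.get(cable);
            if (id == null) {
                // Nothing is ever removed, so the current size is the next free ID.
                id = ids.size();
                ids.put(cable, id);
            }
            return id;
        }
    };

    public static void main(String[] args) {
        final CableIdSketch sketch = new CableIdSketch();
        final Cable first = new Cable();
        final Cable second = new Cable();
        System.out.println(sketch.cableIDFor.invoke(first));
        System.out.println(sketch.cableIDFor.invoke(second));
        System.out.println(sketch.cableIDFor.invoke(first));
    }
}
```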

Thankfully I only call cableIDFor in one place, so I don't mind that IDEA doesn't provide an automated refactoring for this. Perhaps the structural search and replace will do it.

One more, a small one. I have a private field in a class that implements IncomingPacketListener - it's a StringBuffer, used for unit testing only. Easy solution, make the class anonymous and the name of the StringBuffer variable disappears from relevance.

I haven't lost any encapsulation in any of this, so I'm quite pleased. This all makes me realise that it isn't a problem that Common Lisp's OO support doesn't include private. That might actually be a good thing.

I expect a lot of people will be thinking "No! You're losing flexibility", but that's just not true. None of this code is external - I'm not actually impacting anything outside my own codebase. Externally-facing code gets put in separate projects that I normally leave closed so that I don't mess with it by accident. So, suppose that I decide that one of my public fields does need some validation - I can use IDEA's refactoring to encapsulate it, and then convert it into a Property or leave it as a private field with get/set. I'm not losing anything.

It's not less readable either. Because I use bits of functional programming in my code, it's actually fairly useful to be able to refer to a 'method' without executing it, which I can do if I convert it to a field that holds a Function. I end up with fewer anonymous classes this way. Likewise, even for 'normal' programmers, it's useful to be able to refer to a property instead of grabbing its value, for instance, for binding to a graphical object, etc.

Wednesday, June 13, 2007

Improving Your Visitors, and a Request For More Autoboxing in Java 7

Prior to Java 5, there was one really common way of implementing the visitor pattern. I'll use my network simulator as an example, defining a visitor that can visit any kind of network component (a card, a cable, a hub or a computer):

public interface Visitor
{
    void visit(Card card);
    void visit(Hub hub);
    void visit(Cable cable);
    void visit(Computer computer);
}
and an example accept method:
public void accept(Visitor visitor)
{
    visitor.visit(this);
}
For those not familiar with how this works, here's a typical example:
public static void drawAComponent
       (NetworkComponent component,final Graphics graphics)
{
    component.accept(new Visitor()
    {
        public void visit(Card card)
        {
            graphics.drawRect(10,10,20,20);
        }

        public void visit(Computer computer)...
    });
}
It works quite well, and you can adapt it to make it work over a collection. In IPSim there is a tree of components (each component knows its children, and there is a collection of top-level components). So you can do:
network.visitAll(new Visitor(){etc.});
If we want to inspect the object (get a value back from it) rather than do something, then we end up with code like this:
public static Dimension getSize(NetworkComponent component)
{
    final Dimension[] result={null};

    component.accept(new Visitor()
    {
        public void visit(Card card)
        {
            result[0]=new Dimension(50,60);
        }
        // etc.
    });

    return result[0];
}
Clearly it would be useful if the visitor could return something instead of having to set some value. We could make the Visitor's methods return an int, but then we'd have to write it again if we later needed another type. There are two solutions to this:

1. Make the methods return Object. This obviously results in casting, which we all know is for magicians. Let's move on to Java 5.
2. Make the methods return T, where T is a type parameter on Visitor. Ok, easy:
public interface Visitor<T>
{
    T visit(Card card);
    T visit(Cable cable);
    T visit(Hub hub);
    T visit(Computer computer);
}
and the accept method:
public T accept(Visitor<T> visitor)
{
    return visitor.visit(this);
}
The only problem with that is that T is undeclared in the accept method. Don't make the mistake of declaring it on the class (class Card<T> implements NetworkComponent<T>), because then you'll only be able to use one type of Visitor with each instance you create. Declare it on the method:
public <T> T accept(Visitor<T> visitor)
{
    return visitor.visit(this);
}
Now you've got something more useful; the above clunky code becomes:
public static Dimension getSize(NetworkComponent component)
{
    return component.accept(new Visitor<Dimension>()
    {
        public Dimension visit(Card card)
        {
            return new Dimension(50,60);
        }
        // etc.
    });
}
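Here is a small, complete version you can compile, cut down to two component types. The class names follow the examples above; VisitorSketch, describe and portCount (and the port numbers) are invented for the sketch. It shows the point of declaring T on the method: the same instance accepts a Visitor<String> and a Visitor<Integer>.

```java
final class VisitorSketch {
    interface Visitor<T> {
        T visit(Card card);
        T visit(Hub hub);
    }

    interface NetworkComponent {
        // T is declared on the method, so each call can pick its own result type.
        <T> T accept(Visitor<T> visitor);
    }

    static final class Card implements NetworkComponent {
        public <T> T accept(Visitor<T> visitor) {
            return visitor.visit(this);
        }
    }

    static final class Hub implements NetworkComponent {
        public <T> T accept(Visitor<T> visitor) {
            return visitor.visit(this);
        }
    }

    static String describe(NetworkComponent component) {
        return component.accept(new Visitor<String>() {
            public String visit(Card card) {
                return "card";
            }

            public String visit(Hub hub) {
                return "hub";
            }
        });
    }

    static int portCount(NetworkComponent component) {
        return component.accept(new Visitor<Integer>() {
            public Integer visit(Card card) {
                return 1; // made-up value
            }

            public Integer visit(Hub hub) {
                return 8; // made-up value
            }
        });
    }

    public static void main(String[] args) {
        final NetworkComponent hub = new Hub();
        System.out.println(describe(hub) + " with " + portCount(hub) + " ports");
    }
}
```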
This becomes much more interesting when you apply it across a collection. Instead of writing code that iterates over all the components and does something, we can write code that generates a result. For example, suppose I want to know if any of my components are a crossover cable. I want to visit all the components, and return true if any of them are a crossover cable. I could store a list of booleans, to process later, but that would be wasteful and indirect. What I want to do is apply the || operator between each pair of results.
public static boolean areAnyCrossover(Network network)
{
    return network.visitAll(new Visitor<Boolean>()
    {
        public Boolean visit(Cable cable)
        {
            return cable.isCrossover();
        }

        public Boolean visit(Computer computer)
        {
            return false;
        }

        public Boolean visit(Card card)
        {
            return false;
        }

        public Boolean visit(Hub hub)
        {
            return false;
        }
    },new Combinator<Boolean,Boolean>()
    {
        public Boolean combine(Boolean one,Boolean two)
        {
            return one||two;
        }
    });
}
Whew. This is an implementation (or at least behind the scenes there is one) of the 'reduce' algorithm, plus a bit of mapping going on first. It's very flexible. There's quite a bit of boilerplate above though, and as the number of component types grows, the amount of boilerplate grows and grows. The usual solution is to make an abstract class with empty (return null;) methods for each visit method, which you subclass to customise. That works pretty well (remember the @Override annotation!), but I've been avoiding subclassing for some time now, so I immediately started to wonder whether there was another way to do this. I'm not a zealot, if I didn't find another way, I'd just subclass. I found another way..
public static boolean areAnyCrossover(Network network)
{
    return network.visitAll(new DefaultVisitor<Boolean>()
        .visitCables(new Function<Cable,Boolean>()
        {
            public Boolean invoke(Cable cable)
            {
                return cable.isCrossover();
            }
        })
        .visitAnythingElse(new Function<NetworkComponent,
            Boolean>()
        {
            public Boolean invoke(NetworkComponent any)
            {
                return false;
            }
        }),new Combinator<Boolean,Boolean>()
        {
            public Boolean combine
                (Boolean one,Boolean two)
            {
                return one||two;
            }
        });
}
Ok, it doesn't look brilliant, but it does look a little like functional programming. Let's make these anonymous classes look more like functions:
public static boolean areAnyCrossover(Network network)
{
    return network.visitAll(new DefaultVisitor<Boolean>()
        .visitCables({Cable cable => cable.isCrossover()})
        .visitAnythingElse(
            {NetworkComponent component => false}),
        {Boolean one,Boolean two => one||two});
}
One more improvement would be if it were possible to get at || as a function. Let's imagine that #|| works for this. Let's also imagine that closures allowed you to elide unused parameters, that type inference were supported, and that method references were supported (!):
return network.visitAll(defaultVisitor
    .visitCables(Cable#isCrossover())
    .visitAnythingElse({ => false}),#||);
Interestingly, it still looks like Java, and it doesn't get bloated whenever you add a type. I don't suppose for an instant that Java 7 will go this far.

For an example of the <T> visitor pattern in current Java, see the source code for javac. It's littered with them, and it's quite nice to work with, but at least for now, remember your @Override..

And back to normal Java, suppose you implement Visitor<T>, but then come across those cases where you don't want to return something.
void doSomething(NetworkComponent component)
{
    component.accept(new Visitor<Void>()
    {
        public Void visit(Card card)
        {
            // whatever
            return null;
        }
        // etc.
    });
}
I'd like to be able to use my Visitor type with a <void> type parameter, but I cannot. Please, if you got this far, and are Neal Gafter or someone else relevant, consider allowing primitive generic type parameters. As far as I can see, this can all be done at the compiler level, making a Visitor<void> appear as a Visitor<Void> in bytecode. At the moment, I often implement two Visitor types for each hierarchy - one returning void, and one T. Perhaps that's overkill, but it reminds me that it's a workaround.
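For completeness, here is one way the visitAll reduction and the fluent DefaultVisitor from earlier in this post could be put together in current Java. It is a sketch under assumptions, not IPSim's implementation: the network is a flat List rather than a tree, Combinator takes a single type parameter, and the names ReduceSketch and visitCables are made up for the example.

```java
import java.util.Arrays;
import java.util.List;

final class ReduceSketch {
    interface Function<A, R> { R invoke(A argument); }
    interface Combinator<T> { T combine(T one, T two); }

    interface Visitor<T> {
        T visit(Cable cable);
        T visit(Hub hub);
    }

    interface NetworkComponent { <T> T accept(Visitor<T> visitor); }

    static final class Cable implements NetworkComponent {
        final boolean crossover;
        Cable(boolean crossover) { this.crossover = crossover; }
        boolean isCrossover() { return crossover; }
        public <T> T accept(Visitor<T> visitor) { return visitor.visit(this); }
    }

    static final class Hub implements NetworkComponent {
        public <T> T accept(Visitor<T> visitor) { return visitor.visit(this); }
    }

    // Fluent visitor: unhandled types fall through to visitAnythingElse,
    // which must be set before the visitor is used.
    static final class DefaultVisitor<T> implements Visitor<T> {
        Function<Cable, T> cables;
        Function<NetworkComponent, T> anythingElse;

        DefaultVisitor<T> visitCables(Function<Cable, T> function) {
            cables = function;
            return this;
        }

        DefaultVisitor<T> visitAnythingElse(Function<NetworkComponent, T> function) {
            anythingElse = function;
            return this;
        }

        public T visit(Cable cable) {
            return cables != null ? cables.invoke(cable) : anythingElse.invoke(cable);
        }

        public T visit(Hub hub) {
            return anythingElse.invoke(hub);
        }
    }

    // Map each component through the visitor, then reduce with the combinator.
    // Returns null when there are no components.
    static <T> T visitAll(List<? extends NetworkComponent> components,
                          Visitor<T> visitor, Combinator<T> combinator) {
        T result = null;
        boolean first = true;
        for (NetworkComponent component : components) {
            final T next = component.accept(visitor);
            result = first ? next : combinator.combine(result, next);
            first = false;
        }
        return result;
    }

    static boolean areAnyCrossover(List<? extends NetworkComponent> components) {
        final Boolean result = visitAll(components,
            new DefaultVisitor<Boolean>()
                .visitCables(new Function<Cable, Boolean>() {
                    public Boolean invoke(Cable cable) { return cable.isCrossover(); }
                })
                .visitAnythingElse(new Function<NetworkComponent, Boolean>() {
                    public Boolean invoke(NetworkComponent any) { return false; }
                }),
            new Combinator<Boolean>() {
                public Boolean combine(Boolean one, Boolean two) { return one || two; }
            });
        return result != null && result;
    }

    public static void main(String[] args) {
        System.out.println(areAnyCrossover(
            Arrays.<NetworkComponent>asList(new Hub(), new Cable(true))));
    }
}
```

Adding a new component type means one new method on Visitor and one on DefaultVisitor; existing callers that rely on visitAnythingElse don't change.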

About Me

A salsa dancing, DJing programmer from Manchester, England.