Static typing is not just about preventing simple bugs, and dynamic typing is not just about writing millions of tests that you wouldn't have to write with static typing.
Programmers of languages like Java, C#, C, and C++ (middle-of-the-road statically typed languages with some dynamic features) have intuitions built up from years of experience at getting the best out of their language, but they rarely stop to notice the contortions the language puts them through, or how little their compiler actually tells them about their code.
They often think that rigorous static typing is about preventing bugs that a few unit tests would catch, or that dynamic typing is about writing tests that static typing would make unnecessary.
Static Typing is Not About Low-Hanging Bugs
Many years back, clever mathematicians proved that types and programs express each other (the result is now known as the Curry-Howard correspondence). If you have an expressive enough type system, you only need to give the input and output types for a function, and you have its implementation. Consider the identity function, written in Scheme like this:
(lambda (x) x)
Don't worry if lambdas are unfamiliar to you. No, actually, do worry. Lambdas underpin most of computer science and certainly all of programming, even if that's not obvious. If, as I did, you gained a Computer Science degree without ever hearing about lambdas, ask for your money back, and spend it on copies of The Structure and Interpretation of Computer Programs (SICP) for yourself and all your friends and family. It's also available in PDF form for no money, in case you're unsuccessful in getting your money back from your university.
Anyway, a lambda expression is analogous to an anonymous function in many languages, and even an anonymous class (with one method) in Java. The particular lambda expression above is a function that takes a value, and returns that same value.
In Haskell, the syntax isn't much different:
\x -> x
The \ is Haskell's approximation to the Greek letter lambda, the bit before -> is the parameter list, and the bit after it is the result. Scheme doesn't care about telling you the types of things, but Haskell does. The type of \x -> x is t -> t, which means it's a function that takes a t (which can be anything) and returns a t (which is the same as the first t). All fairly obvious.
Given the above, there's only one logical implementation of t -> t, and that's \x -> x. You could make silly ones, where you store x in a local variable and then return it later, but they're all equivalent. Or perhaps the implementation could look up the type of x and grab something similarly-typed from a cache; all a bit silly though. Perhaps if all types in a language had a clone function, then you could provide a different implementation, but without that, or something similar, \x -> x is the only implementation there is.
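To make that concrete, here is a minimal Haskell sketch (names are mine) of the obvious version and a 'silly' one. Setting aside undefined and non-termination, which technically inhabit every Haskell type, the two are the same function, and parametricity rules out the type-inspecting cache trick entirely:

-- the obvious implementation
identity :: t -> t
identity x = x

-- the 'silly' one: store x in a local binding, return it later
identitySilly :: t -> t
identitySilly x = let stored = x in stored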
What you can see from this is that, at least for simple functions, you only need the types to derive the implementation. Does this scale up to things like \x y -> x/y, etc.? Yes, as it happens, though you then need the concept of dependent types, which are types that depend on runtime values. They can still be checked at compile time. In other words, types and values are isomorphic to each other (isomorphic is a mathematical way of saying 'structurally equivalent').
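Full dependent types live in languages like Coq, Agda and Idris, but GHC Haskell can fake the idea well enough for a sketch. Here the length of a vector appears in its type, so taking the head of an empty vector becomes a compile-time error rather than a crash (a type-level simulation, not true runtime-value dependency):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- natural numbers, promoted to the type level by DataKinds
data Nat = Z | S Nat

-- a list whose length n is part of its type
data Vec (n :: Nat) a where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- only callable on provably non-empty vectors
safeHead :: Vec ('S n) a -> a
safeHead (Cons x _) = x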
What does the Java version of the identity function look like? Oddly, it depends on whether you write it how Scheme works, or how Haskell works. In Scheme there are no compile time types (or you can say there is exactly one), and the closest that Java comes to that is Object:
public static Object identity(Object x)
{
    return x;
}
If you write it like Haskell, then you need to make sure that the return type is the same as the input type, with no silent upcasting (to Object):

public static <T> T identity(T x)
{
    return x;
}
There is quite a difference between the two pieces of code, which is particularly disturbing, because the Haskell and Scheme versions look so similar to each other. In fact, there's a project called Liskell, which gives Haskell lispy syntax. There, the two would probably be the same. This is a first hint that Java gets in the way of thinking about static typing, by making it very verbose, in particular by making untyped code look different to typed code.
Haskell is also guilty of this, partly by design, and partly because of some deficiencies in its type system. That's the reason that the side of the fence I sit on is the dynamically typed one, but I keep my eye on the static typing people, because they come up with great ideas, and secretly I'd like to be able to use their type systems on some of my code.
So static typing isn't just about making sure that you don't add two credit card numbers together; it's about expressing code as closely as possible to mathematics. Mathematicians tend to be very good at reducing problems to their smallest representations and coming up with very general solutions, so there can be no doubt that this approach will produce (and already has produced) amazing results.
Haskell's type system is certainly imperfect, and so are its programmers; this is clear from the presence of its property-based testing library, QuickCheck. If the type system were perfect, and the programmers using it never took shortcuts, there would be no need to run tests at all: the compiler succeeding would be evidence that the code could not crash, or produce any results outside the expected range.
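For a flavour of what QuickCheck does, here is a tiny property of my own; instead of a handful of hand-picked cases, QuickCheck generates 100 random inputs by default and hunts for a counterexample:

import Test.QuickCheck

-- reversing a list twice should give back the original list
prop_doubleReverse :: [Int] -> Bool
prop_doubleReverse xs = reverse (reverse xs) == xs

main :: IO ()
main = quickCheck prop_doubleReverse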
If you look for more and more sophisticated type systems, you will probably come across Coq at some point, a proof engine that can generate executable code. There you deal with logic and set theory, the kinds of things that sit at the pinnacle of static typing, and which my Computer Science degree, and probably yours, did not cover.
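For a taste of that style, here is a two-line sketch in Lean, a proof assistant in the same family as Coq. The statement is a type, and the term after := is its proof:

-- commutativity of addition on the naturals, proved by a library lemma
theorem addComm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b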
Go there. Have a look. Even if you decide that static typing is not for you, or that you'll get by with a lesser static typing system such as Haskell's or Java's, you can learn a very general way of thinking that will apply across a lot of programming, in the same way that understanding lambdas will help you when you see C# delegates, JavaScript's anonymous functions, etc.
Dynamic Typing is Not Just About Writing Tests
Many static typing proponents, even Haskellers, have the opinion that while Lisp, Python, Ruby, JavaScript, et al, might have some nice syntax, the fact that they are devoid of static types makes them ultimately useless. You can't easily get the compiler to tell you if you try to do something stupid. For example, 3/"hello world" will be accepted by the language despite it being obvious (read: provable) that a runtime problem will occur.
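For contrast, a sketch of the same expression on the static side, where the program is rejected before it ever runs:

-- bad = 3 / "hello world"
-- Uncommented, this fails to compile in Haskell; GHC complains (roughly)
-- that String has no Fractional instance, so the crash is provably absent.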
Many Java programmers assume that you have to guess at the types being used in a dynamic program, or provide excessive documentation. The greater point is that what works, works. Dynamic languages get out of your way so that you can explore a problem. Whereas in a statically typed language you have to make the solution consistent before you can try it out, in a dynamically typed language you can try out an incomplete function very easily:
(lambda (x)
  (if (< x 0)
      (do-this x)
      (do-that x)))
I can call the above lambda on negative numbers, even if do-that hasn't been written yet (assuming do-this has). The runtime doesn't generally complain unless it has to.
Once you have explored a problem, and you understand it, you probably have a nearly-working program, so you're pretty close to having a working one. You get no guarantees that the program will work for all possible inputs, and there isn't even a general way of restricting inputs (other than adding checks at runtime).
Dynamic programmers tend to think in the language they are using - that is, they write code while they're thinking, and test functions out to see whether they work for the cases they're interested in. If I write a function that adds two numbers together, I'm not even remotely interested in what happens if someone passes strings to it, because I'm not writing it for that use case.
This way of thinking keeps the code focused on the task in hand. If the code starts to get hard to read, or repetitive, the programmer will come up with an abstraction. E.g., beginner Java programmers often try this:
if (x==3 || 5 || 10)
which doesn't compile, because the compiler sees three expressions that should all be booleans, and the second and third are ints instead. The programmer is told they're wrong, and that the fix is:

if (x==3 || x==5 || x==10)
That's clearly garbage, because x== is repeated. The programmer has just adopted a poorer way of thinking to suit a language, instead of adapting the language.

if (x in (3,5,10))

would be great, as would:

if ((3,5,10) contains x)

or even, as valid Java:

if (asList(3,5,10).contains(x))
Of course, this latter example is a little ugly. Translated directly to Scheme:

(if (contains (list 3 5 10) x)

If this was used a lot, a Scheme programmer might write:

(if (in? x 3 5 10)
A Java programmer could do that too, but they would need to know more of the language to be able to do it. Therefore, the Java programmer is 'corrected', instead of shown how to write their idiom in Java.
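For comparison, a sketch of the same idiom in Haskell, where list membership is already a Prelude one-liner, exactly the kind of thing the 'corrected' beginner never finds out about:

-- membership via the standard elem function
interesting :: Int -> Bool
interesting x = x `elem` [3, 5, 10]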
Because the language gets out of the way, this kind of code is easy to write. That brings with it the worry that, because you can write code as good as you like, you can also write incredibly bad code, and this is completely true.
The remarkable thing is that the language doesn't try to judge whether your code is good or bad - it just runs the code. That's great, because it lets you learn from your own mistakes, and even better, it lets you come up with things that the language designers didn't think of.
It's become quite clear in recent years that Object-Oriented Programming as Java and C++ do it is not the pinnacle of software engineering, but you can't escape Java's OO system. You can't really implement multiple dispatch for two separate codebases in one central place unless you go outside the language (e.g., reflection, bytecode weaving). If it turns out that Haskell's typeclasses aren't the best way of expressing things, you can't remove them from the language, because existing code depends on them (the compiler always needs to understand them), and probably you will still have to use them because the language's central concepts depend on them.
By the language not telling you that you're wrong, it lets you be right in ways that the language designer might not have envisaged. For example, the code snippet (+ "hello" "world") looks wrong, because + is not defined on strings. But in a language like Scheme, + is only a variable, so I can write:
(let ((+ string-append))
  (+ "hello" "world"))
and it's no longer wrong (arguably a bit stupid though).
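Interestingly, even Haskell permits a version of this trick, because + there is also just a name that the Prelude happens to export; a minimal sketch:

import Prelude hiding ((+))

-- locally redefine + to mean string concatenation;
-- the redefinition is confined to this module
(+) :: String -> String -> String
(+) = (++)

greeting :: String
greeting = "hello" + " world"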
So how about those tests? Well, if you look at Java code:
public Integer add(Integer a, Integer b)
{
    return a + b;
}
Both a and b can possibly be null, so what happens if null is passed? Er, you get a NullPointerException. Some programmers will specify that in documentation for the method, or just write that null is not allowed. The same method in Scheme would be callable with any values at all, though it would fail. That changes the attitude of calling programmers. Instead of looking at the half-hearted type signature, and seeing that null is a possible value, the programmer is more likely to look at the implementation.
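Compare a type system with no null at all: the possibility of absence has to be written into the type, so the signature stops being half-hearted. A minimal Haskell sketch (names are mine):

-- no null exists to pass; both arguments are definitely Integers
add :: Integer -> Integer -> Integer
add a b = a + b

-- if absence were a real use case, the type would have to say so
addMaybe :: Maybe Integer -> Maybe Integer -> Maybe Integer
addMaybe ma mb = (+) <$> ma <*> mb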
In the first half of this entry, I said that types imply implementations; well, the reverse is true! Implementations imply types. You don't need a type signature to see what values can be passed to

(lambda (a b) (+ a b))

You can infer the type signature from the implementation. In this case, you can pass anything to that lambda that you can pass to the primitive + procedure, whose type signature actually depends on the implementation of Scheme you're using, but it accepts at least the types in the specification.
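This is exactly what Haskell's type inference mechanises: ask GHCi for the type of the equivalent lambda, with no annotations anywhere, and it answers:

ghci> :type \a b -> a + b
\a b -> a + b :: Num a => a -> a -> a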
So Scheme programmers aren't likely to write tests to see what happens if you pass credit cards to a function expecting addresses, because that's never going to be a use case. They informally restrict the inputs. They're emulating what a static type checker does, but mentally. That lets them organically approach a solution to a problem, instead of having to design it beforehand.
It's certainly true that this mental processing is error-prone, and bugs get into source code because of this. Then, many programmers are tempted to write swathes of tests, in an informal attempt to prove that their code is correct.
The bad part of this is that they aren't benefitting from the static typing camp's results now. They have a solution, but they can't use the machine to prove it. They'll end up writing the same kinds of tests over and over.
All this is why I think that, despite him talking about middle-of-the-road type checkers like Java's, Gilad Bracha was on the ball with his presentation about adding pluggable type checkers to dynamic languages. I fully expect that it will be some time before a great one exists, largely because too many people are sitting on the fence instead of knocking it down.
If you ignore mediocre languages, there isn't a great deal of difference between how statically typed programmers think and how dynamically typed programmers think. We all like lambdas, we all like side-effect free code, we all dislike crashing programs. So the next time you see a proposal to add static typing to your favourite language, or to add some new dynamic feature to your static language (including 'get-out' clauses like Haskell's unsafePerformIO), bear in mind that you're just witnessing a step on the road to convergence.
If the next big language has a mandatory static type system, then it will either be a crap one, or Computer Science degrees will suddenly contain a lot more mathematics. That won't happen. The next big language will be crap, or dynamic.