|
|
The recent "failure"
of the Chandler PIM project generated the question, "Can
Dynamic Languages Scale?" on TheServerSide, and, as is all too typical these days,
it turned into a "You suck"/"No you suck" flamefest between a couple of posters to
the site.
I now make the perhaps vain attempt to address the question meaningfully.
There's an implicit problem with using the word "scale" here, in that we can think
of a language scaling in one of two very orthogonal directions:
Part of the problem I think that appears on the TSS thread is that the posters never
really clearly delineate the differences between these two. Assembly language can
scale(2), but it can't really scale(1) very well. Most people believe that C scales(2)
well, but doesn't scale(1) well. C++ scores better on scale(1), and usually does well
on scale(2), but you get into all that icky memory-management stuff. (Unless, of course,
you're using the Boehm GC implementation, but that's another topic entirely.)
Scale(1) is a measurement of a language's ability to extend or enhance the complexity
budget of a project. For those who've not heard the term "complexity budget",
I heard it first from Mike Clark (though I can't find a citation for it via Google--if
anybody's got one, holler and I'll slip it in here), he of Pragmatic Project Automation
fame, and it's essentially a statement that says "Humans can only deal with a fixed
amount of complexity in their heads. Therefore, every project has a fixed complexity
budget, and the more you spend on infrastructure and tools, the less you have to spend
on the actual business logic." In many ways, this is a reflection of the ability of
a language or tool to raise the level of abstraction--when projects began to exceed
the abstraction level of assembly, for example, we moved to higher-level languages
like C to help hide some of the complexity and let us spend more of the project's
complexity budget on the program, and not with figuring out which register needed
to have the value of the interrupt to be invoked. This same argument can be seen in
the argument against EJB in favor of Spring: too much of the complexity budget was
spent in getting the details of the EJB beans correct, and Spring reduced that amount
and gave us more room to work with. Now, this argument is at the core of the Ruby/Rails-vs-Java/JEE
debate, and implicitly it's obviously there in the middle of the room in the whole
discussion over Chandler.
Scale(2) is an equally important measurement, since a project that cannot handle the
expected user load during peak usage times will have effectively failed just as surely
as if the project had never shipped in the first place. Part of this will be reflected
in not just the language used but also the tools and libraries that are part of the
overall software footprint, but choice of language can obviously have a major impact
here: Erlang is being tossed about as a good choice for high-scale systems because
of its intrinsic Actors-based model for concurrent processing, for example.
Both of these get tossed back and forth rather carelessly during this debate, usually
along the following lines:
"With static languages like Java, we get a select subset of code
tests, with 100% code coverage, every time we compile. We get those tests for "free".
The price we pay for those "free" tests is static typing, which certainly has hidden
costs."
Note that this argument frequently derails into the world of IDE
support and refactoring (as its witnessed on the TSS thread), pointing out that Eclipse
and IntelliJ provide powerful automated refactoring support that is widely believed
to be impossible on dynamic language platforms.
On the subject of IDE refactoring, scripting language proponents point out that the
original refactoring browser was an implementation built for (and into) Smalltalk,
one of the world's first dynamic languages.
A couple of thoughts come to mind on this whole mess.
First of all, it's a widely debatable point as to the actual refactorability of dynamic
languages. On NFJS speaker panels, Dave Thomas (he of the PickAxe book) would routinely
admit that not all of the refactorings currently supported in Eclipse were possible
on a dynamic language platform given that type information (such as it is in a language
like Ruby) isn't present until runtime. He would also take great pains to point out
that simple search-and-replace across files, something any non-trivial editor supports,
will do many of the same refactorings as Eclipse or IntelliJ provides, since type
is no longer an issue. Having said that, however, it's relatively easy to imagine
that the IDE could be actively "running" the code as it is being typed, in much the
same way that Eclipse is doing constant compiles, tracking type information throughout
the editing process. This is an area I personally expect the various IDE vendors will
explore in depth as they look for ways to capture the dynamic language dynamic (if
you'll pardon the pun) currently taking place.
What sometimes gets lost in this discussion is that not all dynamic languages need
be for programmers; a tremendous amount of success has been achieved by creating a
core engine and surrounding it with a scripting engine that non-programmers use to
exercise the engine in meaningful ways. Excel and Word do it, Quake and Unreal (along
with other equally impressively-successful games) do it, UNIX shells do it, and various
enterprise projects I've worked on have done it, all successfully. A model whereby
core components are written in Java/C#/C++ and are manipulated from the UI (or other
"top-of-the-stack" code, such as might be found in nightly batch execution) by these
less-rigorous languages is a powerful and effective architecture to keep in mind,
particularly in combination with the next point....
With the release of JRuby, and the work on projects like IronRuby and Ruby.NET, it's
entirely reasonable to assume that these dynamic languages can and will now run on
top of modern virtual machines like the JVM and the CLR, completely negating arguments
2 and 4. While a dynamic language will usually take some kind of performance and memory
hit when running on top of VMs that were designed for statically-typed languages,
work on the DLR and the MLVM, as well as enhancements to the underlying platform that
will be more beneficial to these dynamic language scenarios, will reduce that. Parrot
may change that in time, but right now it sits at a 0.5 release and doesn't seem to
be making huge inroads into reaching a 1.0 release that will be attractive to anyone
outside of the "bleeding-edge" crowd.
The allure of the dynamic language is strong on numerous levels. Without having to
worry about type details, the dynamic language programmer can typically slam out more
work-per-line-of-code than his statically-typed compatriot, given that both write
the same set of unit tests to verify the code. However, I think this idea that the
statically-typed developer must produce the same number of unit tests as his dynamically-minded
coworker is a fallacy--a large part of the point of a compiler is to provide those
same tests, so why duplicate its work? Plus we have the guarantee that the compiler
will always execute these tests, regardless of whether the programmer using it remembers
to write those tests or not.
Having said that, by the way, I think today's compilers (C++, Java and C#) are pretty
weak in the type expressions they require and verify. Type-inferencing languages,
like ML or Haskell and their modern descendents, F# and Scala, clearly don't require
the degree of verbosity currently demanded by the traditional O-O compilers. I'm pretty
certain this will get fixed over time, a la how C# has introduced implicitly
typed variables.
Meanwhile, why the rancor between these two camps? It's eerily reminiscent of the
ill-will that flowed back and forth between the C++ and Java communities during Java's
early days, leading me to believe that it's more a concern over job market and emplyability
than it is a real technical argument. In the end, there will continue to be a ton
of Java work for the rest of this decade and well into the next, and JRuby (and Groovy)
afford the Java developer lots of opportunities to learn those dynamic languages and
still remain relevant to her employer.
It's as Marx said, lo these many years ago: "From each language, according to its
abilities, to each project, according to its needs."
Oh, except Perl. Perl just sucks, period. :-)
I find it deeply ironic that the news piece TSS cited at the top of the discussion
claims that the Chandler project failed due to mismanagement, not its choice of implementation
language. It doesn't even mention what language was used to build Chandler, leading
me to wonder if anybody even read the piece before choosing up their sides and throwing
dirt at one another.