JavaWorld Daily Brew

Is "Performance" Subjective or Objective in nature?

 

(Editor's note: This post is likely to open a huge can of whoop-*ss on this blog,
so unless you want to get caught up in the huge bar fight that's about to break out,
you're advised to take your whiskey or beer and head outside for a smoke until the
cops come.)

As a fellow Scala writer, I've been following Daniel Spiewak's blog with no small
amount of interest, as he discovers little tidbits inside the Scala language (like
the Option type). Then I ran across this entry, about benchmarks and comparing the
performance of Java, Groovy and Scala:

I’ve seen these results dozens of times (looking back at the post), but they never
cease to startle me.  How could Groovy be that much slower than everything else? 
Granted it is very much a dynamic language, compared to Java and Scala which are peers
in static-land.  But still, this is a ray tracer we’re talking about!  There’s
no meta-programming involved to muddle the scene, so a halfway-decent optimizer should
be able to at least squeeze that gradient down to maybe 5x as slow, rather than a
factor of 830.

That's a huge discrepancy, and like Daniel, I'm not sure where the perf hit comes
from, particularly when we consider that JRuby, another language with an equally powerful
metaobject protocol (MOP) capabilities, is turning in performance times that are equal
to those we see with the original Ruby interpreter (according to Daniel's blog entry,
though I note that the comparison of JRuby to Java isn't given). And if the disbelievers
in the crowd are starting to tune this out based on the fact that "Ah, it must be
an edge case, after all, there's always one benchmark that any language will fail
compared to another one; maybe Groovy's just not cut out to do ray-tracing. Yeah,
that must be it. Besides, how often do I really do ray-tracing when I'm writing code
at work?", take heed, for Daniel notes this and starts to cite other evidence that
seems to establish a disturbing pattern:

If this were an isolated incident, I would probably just blow it off as bad benchmarking,
or perhaps an odd corner case that trips badness in the Groovy runtime.  Then
a week later, I read this post by Pete Knego (which shows Groovy's performance as
equally disappointing, on the order of 7.6x to 56x worse than equivalent Java code
--TKN).

All of this is old news, so the question is: Why am I bringing this up now? 
Well, I recently saw a post on Groovy Zone by none-other-than Rick Ross, talking
about this very subject.  Rick’s post was in response to two posts (here and here),
discussing ways to improve Groovy code performance by obfuscating code.

Uh, oh. I don't know about y'all, but anytime somebody is suggesting improving performance
by obfuscating code, I'm nervous--almost by definition, code obfuscation
makes code run more slowly, not more quickly, because now the bytecode is
pulled out of the familiar patterns that the JITter recognizes and therefore turns
more aggressively into optimized native code. I'm not saying Rick is wrong, but if his experiments
are leading him to understand that obfuscated code is somehow running faster than
non-obfuscated code, then something deeply strange is afoot.

(Editor's note: Better hurry and head outside folks, the Groovyists in the corner
are starting to grumble amongst themselves, working up the courage to toss that first
beer in the piano player's face.)

Daniel's not done here, though, and goes on:

Final result?

This text is being written as I was changing and trying things, I gained 20s from
minor changes of which I lost track. :-) I am currently at 1m30s (down from the
original 4m and comparing with Java’s 4s).

I’m sorry, this is acceptable performance?  This is someone who’s spent time
trying to optimize Groovy, and by his own admission, Groovy is 23x slower
than the equivalent Java code.  Certainly this is a far cry from the 830x slower
in the ray tracer benchmark, but in this case it’s simple string manipulation, rather
than a mathematically intensive test.

Coming back to Rick’s entry, he looks at the conclusion and has this to say about
it:

Language performance is highly overrated

Much is often made of the theoretical “performance” of a language based on benchmarks
and arcane tests. There have even been cases where vendors have built cheats into
their products specifically so they would score well on benchmarks. In the end, runtime
execution speed is not as important a factor as a lot of people would think it is
if they only read about performance comparisons. Other factors such as maintainability,
interoperability, developer productivity and tool and library support are all very
significant, too.

Wait a minute, that sounds a lot like something else I’ve read recently!  Maybe
something like this:

Is picking out the few performance weaknesses the right way to judge the
overall speed of Groovy?

To me the Groovy performance is absolutely sufficient because of the
easy integration with Java. If something’s too slow, I do it in Java.
And Java compared to Python is in most cases much faster.

I appreciate the efforts of the Groovy team to improve the performance,
but if they wouldn’t, this would be no real problem to me. Groovy is the
grooviest language with a development team always having the simplicity
and elegance of the language usage in mind - and that counts to me.  :-)

This is almost a mantra for the Groovy proponents: performance is irrelevant. 
What’s worse, is that the few times where they’ve been pinned down on a particular
performance issue that’s obviously a problem, the response seems to be along
the lines of: this test doesn’t really show anything, since micro-benchmarks are useless.

I’m sorry, but that’s a cop-out.  Face up to it, Groovy’s performance is terrible. 
Anyone who claims otherwise is simply not looking at the evidence.  Oh, and if
you’re going to claim that this is just a function of shoe-horning a dynamic language
onto the JVM, check out a direct comparison between JRuby and Groovy.  Groovy comes
out ahead in only four of the tests.

Uh, oh.

(Editor's note: Head for the doors, folks--those guys in the corner wearing the
black leather jackets sporting the "Grails Rulez" logos on the back have started to
head for the center of the room, and they're looking drunk, mean, and angry.)

Here comes the coup de grace

What really bothers me about the Groovy performance debates is that most “Groovyists”
seem to believe that performance is in the eye of the beholder.  The thought
is that it’s all just a subjective issue and so should be discounted almost completely
from the language selection process.  People who say this have obviously forgotten
what it means to try to write a scalable non-trivial application which performs decently
under load.  When you start getting hundreds of thousands of hits an hour, you’ll
be willing to sell your soul for every last millisecond.

The only answer I can think of is that the Groovy core team just doesn’t value
performance.  Why else would they consistently bury their heads in the
sand, ignoring the issues even when the evidence is right in front of them? 
It’s as if they have repeated their own “performance is irrelevant” mantra so many
times that they are actually starting to believe it.  It’s unfortunate, because
Groovy really is an interesting effort.  I may not see any value for my needs,
but I can understand how a lot of people would.  It fills a nice syntactic niche
that other languages (such as Ruby) just miss.  But all of its benefits are for
naught if it can’t deliver when it counts.

That did it.

(Editor's note: Shiiiiiiiit! I didn't say nothing, HE did, why're you swinging
that beer stein at *WHACK*)

^^^^^^^^^^^^^^^^^^^^^

OK, now that we've gotten that out of our system, let's sit back and examine
this issue more carefully, shall we?

The fact is, Groovy is slower than it should be. The Groovy guys can mumble about
how performance isn't that important and that developer productivity is what really
matters and similar kinds of rationale, but at the end of the day, the basic fact
remains that Groovy is, by measurement of several different tests, at least an order
of magnitude slower than compiled Java code.

Or is it? Funny thing is, looking at some of these tests, they don't say whether the
Groovy code was compiled first, or run through the Groovy interpreter. Theoretically
this shouldn't matter, since the Groovy architecture essentially compiles each class
once its source is read, but it might make a difference in practice.

Although Daniel's post doesn't mention it, I went back and double-checked. Peter Knego's
benchmark says that "Groovy code was inside a Groovy script, compiled with groovyc.",
which leads me to believe that it was compiled code rather than run through the Groovy
shell interpreter, but it would be nice if the actual code and batch/command scripts
that ran the benchmarks were available. (He also notes that each time, the benchmark
was "warmed up" by running the benchmark five or six times in a loop, presumably to
allow the JIT to work its magic, but, most notably, he doesn't say which JVM was
used, -client or -server.) Meanwhile, Derek Young's ray-tracing example explicitly
uses the Groovy interpreter, but defends that decision in comments: "The only reason
I didn’t use groovyc was because the difference was so great, and the
compilation overhead at the beginning of the run only takes a couple seconds. I decided
it wasn’t worth waiting another two and a half hours to time the compiled output.
Running with groovy first compiles the code just like groovyc does,
then executes that code. It doesn't interpret the source code or run any differently."

So, apparently, it doesn't make much difference. That's not a good development for
the Groovy language.
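
For anyone who wants to repeat this kind of test, the "warm-up" Peter describes can be sketched roughly as follows. This is a minimal sketch in plain Java, not the code either benchmark actually used: the class name, the `workload()` method and the run counts are all made up for illustration. It also prints the `java.vm.name` system property, since which VM ran the test is exactly the detail that was missing above.

```java
import java.util.function.LongSupplier;

public class WarmupBenchmark {

    // Hypothetical workload standing in for whatever code is being measured.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += (i % 7) * (long) i;
        }
        return sum;
    }

    // Runs the task warmupRuns times without timing it, so the JIT has a chance
    // to compile the hot code, then returns the best of timedRuns in nanoseconds.
    static long bestTimeNanos(LongSupplier task, int warmupRuns, int timedRuns) {
        long sink = 0;
        for (int i = 0; i < warmupRuns; i++) {
            sink += task.getAsLong();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        // Use the accumulated result so the work cannot be optimized away entirely.
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return best;
    }

    public static void main(String[] args) {
        // Report which VM ran the test alongside the numbers.
        System.out.println("JVM: " + System.getProperty("java.vm.name"));
        long best = bestTimeNanos(WarmupBenchmark::workload, 5, 10);
        System.out.println("best of 10 timed runs after 5 warm-up runs: "
                + (best / 1_000) + " microseconds");
    }
}
```

A hand-rolled loop like this is only a rough guide; the point is simply that the warm-up count and the VM used belong in the published results along with the timings.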

But let's put the hard numbers out of the way for a moment, and concentrate on the
much bigger question: does the core Groovy team just not value performance? And, as
a corollary to this, does the performance of a language really matter in a day and
age when CPUs are still doubling in size and number of cores? The position Rick and
others (myself included) take on this seems fairly clear: we long ago passed the
threshold where programmer time became more expensive than CPU time, and therefore
we should optimize for programmer productivity, not CPU efficiency. That's
important to recognize: a language should enable the programmer to express the core
idea without a great deal of "noise" or additional work, what Stu Halloway calls
"ceremony", and certainly Groovy takes the Java programmer a step closer towards
that place of lower ceremony.

But...

But I can't help it, folks. Ted's First Law of Computer Science states that "Dogma
is the Root of All Evil", and holding scripting languages up as the last language
you'll ever have to use is dogma, plain and simple. Ted's Second Law of Computer Science
states, "Context matters", and in this case, the context includes the performance
cost of using a language or tool. Taking a performance hit that weighs in at the orders
of magnitude mark is just too big to ignore--the ray-tracing example, at its close-to-three-orders-of-magnitude
hit, almost suggests that it would have been just as expensive to offload all those
calculations through a distributed RPC call to another machine, rather than calculate
it locally in Groovy, and when it becomes faster to go off-CPU to do a calculation
than to do it locally, something is wrong.

And Daniel's point is good to hear clearly through the noise: "When you
start getting hundreds of thousands of hits an hour, you’ll be willing to sell your
soul for every last millisecond." Forget getting hundreds or thousands
of hits an hour--the real test will be when the system gets hundreds or thousands
of hits per second. That's when developers will be scrambling to find ways
to eke out those last bits of performance from the system, even if it means selling
their last Mountain Dew (which for some is pretty much synonymous with "soul") to whatever
entity can give it to them.

So what exactly is my point with this particular entry, besides stirring the pot up
a little? In order:

  1. Measure for yourself. As with all things performance and
    scalability related, abstract benchmarks aren't a good measure of how well it works
    for you and your system. Build a prototype, measure, and then compare that against
    your performance and scalability goals. You did establish performance and
    scalability goals as part of your project's runup, right? (If you didn't, then you
    probably assume your users don't care about performance, and I suspect you'll be rudely
    surprised by how wrong that assumption is before long...)
  2. Benchmarks are tricky things. Programmers could learn something
    from politicians, and that is the imprecise nature of poll results. Thanks to the
    nature of statistical analysis and the sample size and source used to produce the
    poll, polls are always cited with a "plus or minus 3 percent" (or 5 percent, or 10
    percent) to indicate the imprecision assumed in the poll. Benchmarks, both across
    languages and across other products, should be assumed to have similar kinds of imprecision.
    As people have already noted, benchmarks very quickly get "gamed" in order to produce
    results that are unfairly biased one way or another if they're not explicitly written
    and administered to be fair and equal to all sides involved. This isn't to say that
    benchmarks aren't useful, they're just not useful to a point more precise than rounding
    to the nearest 10% figure.
  3. Groovy IS slower than it could or should be. As much as
    I like the people involved in the Groovy space, and as much as I like the language
    itself, I can't help but be very very worried that Groovy's performance numbers aren't
    anywhere close to where they should be. Yes, productivity will get you a long way
    in the technology-adoption market, but once people have adopted your language or tool,
    if your system proves to be unresponsive and performance-challenged, the doors letting
    people in will get blocked by the people trying to get out, and
    that's not good for Groovy.
  4. Groovy's performance may not be a reason not to use Groovy.
    Let's be honest, again: productivity matters. This is Mort's principal goal,
    remember, and there's nothing wrong with it. Groovy fits into that most natural of
    places, a scripting language gluing together pieces written in a system language.
    Perl and Python serve the same purpose for native/C/C++ code, PowerShell does the
    same (I believe) for .NET code, and it's high time we see the value in doing that
    for Java code.
  5. I believe that the core Groovy team holds performance as a value. I
    know Graeme and Guillaume, and I believe they believe in the value of performance.
    I believe that Groovy will get faster over time, as they discover new and better ways
    to compile Groovy code into bytecode. That doesn't mean users of Groovy should walk
    into this exercise with their eyes shut, mind you, but take the whole of this discussion
    into account as you figure out where Groovy can be used to make your life, as a developer,
    more productive and powerful. Certain parts of your system are perf-sensitive, and
    certain parts aren't. Identify which of those parts are which, and apply Groovy (and
    other tools) judiciously.
  6. These discussions are always good, so long as they're held without rancor. Groovy
    doesn't suck. It has warts, but so does everything. Hushing them up or pooh-poohing
    them just leads to arguments--I encourage the Groovy team to take the criticism of
    Groovy's performance the way I intend it in this blog entry: a challenge to be faced
    and overcome, and not as an indictment of any and all Groovy code everywhere. Because,
    and I will say this outright, if Groovy's backers seriously mean Groovy as "a better
    Java 7", then they have a large gap to fill.
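
To make the first two points a little more concrete, here is a minimal sketch of what "measure, then compare against your goals" might look like in Java. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the sample timings and the 150 ms goal are invented, and the 10% tolerance simply mirrors the "nearest 10%" rule of thumb above. The only figures taken from this discussion are the 1m30s and 4s run times quoted earlier.

```java
import java.util.Arrays;

public class GoalCheck {

    // Median of the measured samples; less sensitive to one slow outlier than the mean.
    static double median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Slowdown of the candidate relative to the baseline; 22.5 means "22.5x slower".
    static double slowdown(double baselineMillis, double candidateMillis) {
        return candidateMillis / baselineMillis;
    }

    // Treats any difference inside the tolerance band (for example 0.10) as noise.
    static boolean meaningfullyDifferent(double a, double b, double tolerance) {
        return Math.abs(a - b) > tolerance * Math.max(a, b);
    }

    // True when the median of the samples is within the stated goal.
    static boolean meetsGoal(long[] samplesMillis, double goalMillis) {
        return median(samplesMillis) <= goalMillis;
    }

    public static void main(String[] args) {
        long[] samples = {118, 95, 102, 240, 99}; // invented response times in ms
        System.out.println("median ms: " + median(samples));
        System.out.println("meets 150 ms goal: " + meetsGoal(samples, 150));

        // The run times quoted earlier: 1m30s for Groovy against 4s for Java.
        System.out.println("slowdown: " + slowdown(4_000, 90_000) + "x");

        // 100 ms versus 105 ms falls inside a 10% band, so it is reported as noise.
        System.out.println("100 vs 105 differ: " + meaningfullyDifferent(100, 105, 0.10));
    }
}
```

The design choice worth noting is that the goal and the tolerance are explicit inputs: a prototype either meets the number you wrote down beforehand or it doesn't, and two results that differ by less than the tolerance are not worth arguing over.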

 

Oh, and if some of you wouldn't mind sticking around to clean up the mess...? Getting
beer off the ceiling can be tricky.




