JavaWorld Daily Brew

The Complexities of Black Boxes

 

Kohsuke Kawaguchi has posted a blog entry describing how to watch the assembly code
get generated by the JVM during execution, using a non-product (debug or fastdebug)
build of Hotspot and the -XX:+PrintOptoAssembly flag, a trick he says he learned while
at TheServerSide Java Symposium a few weeks ago in Vegas. He goes on to do some analysis
of the generated assembly instructions, offering up some interesting insights into
the JVM's inner workings.

There's only one problem with this: the flag doesn't exist.

Looking at the source for the most recent builds of the JVM (b24, plus whatever new
changesets have been applied to the Mercurial repositories since then), and in particular
at the "globals.hpp" file (jdk7/hotspot/src/share/vm/runtime/globals.hpp), where all
the -XX flags are described, no such flag exists. It must have existed at one point,
since he's clearly been able to use it to get an assembly dump (as must whoever
taught him how to do it), but it's not there anymore.

OK, OK, I lied. It never was there for the client build (near as I can tell), but
it is there if you squint hard enough (jdk7/hotspot/src/share/vm/opto/c2_globals.hpp).
As the pathname to the source file implies, it's only there for the server build,
which is why Kohsuke has to specify the "-server" flag on the command line. If you
leave that off, you get an error message from the JVM saying the flag is unrecognized,
leading you to believe Kohsuke (and whoever taught him this trick) is clearly a few
megs shy in his mental heap. So when you try this trick, make sure to use "-server",
and make sure to run methods enough to force JIT to take place (or lower the JIT threshold
using -XX:CompileThreshold=1) in order to see the assembly actually get generated.
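
For what it's worth, here is a rough sketch of what such an experiment could look like. The class name HotLoop and its sum method are made up for illustration and are not from Kohsuke's post. The flags in the comment are the ones discussed above, and they assume a debug or fastdebug build of Hotspot; a product build will reject -XX:+PrintOptoAssembly.

```java
// Hypothetical example class; the names are illustrative only.
// On a debug/fastdebug Hotspot build, run it with something like:
//   java -server -XX:+PrintOptoAssembly -XX:CompileThreshold=1 HotLoop
class HotLoop {
    // A small method we would like to see compiled to native code.
    static int sum(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method repeatedly so the JIT has a reason to compile it.
        for (int i = 0; i < 20000; i++) {
            result += sum(100);
        }
        // Print the result so the work is actually used.
        System.out.println(result);
    }
}
```

The loop count of 20000 is an arbitrary choice. With the threshold lowered to 1 it is more than enough, and without that flag it is simply a guess at "enough calls to get the server compiler's attention."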

Oh, and make sure to swing the dead chicken--fresh, not frozen--by the neck over your
head three times, counterclockwise, while facing the moon and chanting, "Ohwah...
Tanidd... Eyah... Tiam...". If you don't, the whole thing won't work. Seriously.

...

Ever feel like that's how we tune the JVM? Me too. The whole thing is this huge black
box, and it's nearly impossible to extract any kind of useful information without
wandering into the scores of mysterious "-XX" flags, each of which is barely documented,
not guaranteed to do anything visibly useful, and barely understood by anybody outside
of Sun.

Hey, at least we have those flags in the JVM; the CLR developers have to
take whatever the Microsoft guys give them. ("And they'll like it, too! Why,
when I was their age, I had to program using nothing but pebbles by the side of the
road on my way to school! Uphill! Both ways! In the raging blizzards of Arizona!")

Interestingly enough, this conversation got me into an argument with a friend of mine
who works for Sun.

During the conversation, I mentioned that I was annoyed at the difficulty a Java developer
has in trying to see how the Java code he/she writes turns into assembly, making it
hard to understand what's really happening inside the black box. After all, the CLR
makes this pretty trivial--when you set a breakpoint in Visual Studio, if you have
the right flags turned on, your C# or VB source is displayed alongside the actual
native instructions, making it fairly easy to see the JITted code. This was always
of great help when trying to prove to skeptical C++ developers that the CLR wasn't
entirely brain-dead, and did a lot of the optimizations their favorite C++ compiler
did, in some cases even better than the C++ compiler might have done. "Why don't we
have some kind of double-X-show-me-the-code flag, so I can do the same with the JVM?",
I lamented.

His contention was that this lack of a flag is a good thing.

Convinced I was misunderstanding his position, I asked him what he meant by that,
and he said, roughly paraphrasing, that there are only about 20 or so people in the
world who could look at that assembly dump and not draw incredibly misguided impressions
of how the JVM operates internally; more importantly, because so few people could
do anything useful with that output, it was to our collective benefit that this information
was so hard to obtain.

To quote one of my favorite comedians, "Well excuuuuuuuuuuse ME." I was a bit... taken
aback, shall we say.

I understand his point--that sometimes knowledge without the right context around
it can lead to misinterpretation and misunderstanding. I'll agree totally with the
assertion that the JVM is an incredibly complex piece of software that does some very sophisticated
runtime analysis to get Java code to run faster and faster. I'll even grant you that
the timing involved in displaying the assembly dump is critical, since Hotspot
re-JITs methods that get used repeatedly, something the CLR has talked about ("code
pitching") but thus far hasn't executed on.

But this idea that only a certain select group of people are smart enough and understand
the JVM well enough to interpret the results correctly? That's dangerous, on several
levels.

First, it's potentially an elitist attitude to take, essentially presenting a "We
look down on you poor peasants who just don't get it" persona, and if word gets out
that this is how Sun views Java developers as a whole, then it's a black mark on Sun's
PR image and causes them some major pain and credibility loss. Now, let me be brutally
frank here: For the record, I don't think this is the case--everybody I've
met at Sun thus far is helpful and down-to-earth, and scary-smart. I have a hard time
believing that they're secretly thumbing their nose at me. I suppose it's possible,
but it's also possible that Bill Gates and Scott McNealy were in cahoots the whole
time, too.

Second, and more importantly, there will never be any more than those 20 people we
have now, unless Sun works to open the deep dark internals of the JVM to more people.
I know I'm not alone in the world in wanting to know how the JVM works at the same
level as I know how the CLR works, and now that the OpenJDK is up and running, if
Sun wants to see any patches or feature enhancements from the community, then they
need to invest in more educational infrastructure to get those of us who are interested
in this stuff more up to speed on the topic.

Third, and most important of all, so long as the JVM remains a black box, the
"myths, legends and lore" will haunt us forever.
Remember when all the Java performance
articles went on and on about how methods marked "final" were better-performing and
so therefore should be how you write your Java code? Now, close to ten years later,
we can look back at that and laugh, seeing it for the micro-optimization it is, but
if challenged on this idea, we have no proof. There is no way to create demonstrable
evidence to prove or disprove this point. Which means, then, that Java developers
can argue this back and forth based on nothing more than our mental model of the JVM
and what "logically makes sense".

Some will suggest that we can use micro-benchmarks to compare the two options
and see how, after a million iterations, the total elapsed time compares. Brian Goetz
has spent a lot of time and energy refuting this myth, but to put it in some degree
of perspective, a micro-benchmark to prove or disprove the performance benefits of
"final" methods is like changing the motor oil in your car and then driving across
the country over and over again, measuring how long until the engine explodes. You
can do it, but there's going to be so much noise from everything else around the experiment--weather,
your habits as a driver, the speeds at which you're driving, and so on--that the results
will be essentially meaningless unless there is a huge disparity, capable of shining
through the noise.
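
To make the complaint concrete, here is a sketch of the kind of naive micro-benchmark people tend to write for the "final" question. Every name in it (FinalBench, WithoutFinal, WithFinal) is hypothetical, and the timing approach is deliberately the simplistic one criticized above: whatever numbers it prints are subject to JIT warm-up, inlining, garbage collection and everything else going on in the machine, so they should not be read as evidence either way.

```java
// Hypothetical naive micro-benchmark; all names are illustrative only.
class FinalBench {
    static class WithoutFinal {
        int next(int x) { return x + 1; }
    }

    static class WithFinal {
        final int next(int x) { return x + 1; }
    }

    // Times `iterations` calls to the non-final method; returns elapsed nanoseconds.
    static long timeWithoutFinal(int iterations) {
        WithoutFinal target = new WithoutFinal();
        int v = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            v = target.next(v);
        }
        long elapsed = System.nanoTime() - start;
        // Check the result so the loop's work is actually used.
        if (v != iterations) throw new IllegalStateException("unexpected result");
        return elapsed;
    }

    // Times `iterations` calls to the final method; returns elapsed nanoseconds.
    static long timeWithFinal(int iterations) {
        WithFinal target = new WithFinal();
        int v = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            v = target.next(v);
        }
        long elapsed = System.nanoTime() - start;
        if (v != iterations) throw new IllegalStateException("unexpected result");
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1000000; // "a million iterations", as in the text
        System.out.println("non-final: " + timeWithoutFinal(n) + " ns");
        System.out.println("final:     " + timeWithFinal(n) + " ns");
    }
}
```

Run it a few times, or swap the order of the two calls in main, and the numbers will likely shift around, which is the noise problem in miniature.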

This is a position borne out across history--we've never been able to understand a
system until we can observe it from outside the system; examples abound, such as the
early medical understanding of Aristotle's theories weighed against the medical experiments
performed by the Renaissance thinkers. One story says a skeptic, looking at the body
in front of him disproving one of Aristotle's theories, shook his head and said, "I
would believe you except that it was Aristotle who said it." When mental models are
built on faith, rather than on fact and evidence, progress cannot reasonably occur.

Don't think the analogy holds? How long did we as Java developers hold faith with
the idea that object pools were a good idea, and that objects are expensive to create,
despite the fact that the Hotspot FAQ has explicitly told us otherwise since JDK 1.3?
I still run into Java developers who insist that object pools are a good
idea all across the board. I show them the Hotspot FAQ page, and they shake their
head and say, "I would believe you except that it was (so-and-so article author) who
said it."

Oh, and don't get me started on the near-total opacity of the Parrot and Ruby environments,
among others--this isn't a "static vs dynamic" thing, this is something everybody
running on a managed platform needs to be able to do.

I'm tired of arguing from a position of faith. I want evidence to either prove or
disprove my assertions, and more importantly, I want my mental model of how the JVM
operates to improve until it's more reflective of and closer to the reality. I can't
do that until the JVM offers up some mechanisms for gathering that evidence, or at
least for gathering it more easily and comprehensively. You shouldn't have to be a
JVM expert to get some verification that your understanding of how the JVM works is
correct or incorrect.




