
JVM performance optimization, Part 5: Is Java scalability an oxymoron?

Why better Java performance tuning won't solve Java's scalability problem


Page 2 of 4

  • Architecting deployments as large pools of chopped-up small instances, even though doing so makes operations monitoring and management far harder.
  • Tuning and re-tuning the JVM configuration, or even the application, to "avoid" (meaning postpone) the worst-case scenario of a stop-the-world compaction pause. The most that developers can hope for is that when a pause happens, it won't be during a peak load time. This is what I call a Don Quixote task: chasing an impossible goal.

Now let's dig a little deeper into Java's scalability problem.

Over-provisioning and over-instancing Java deployments

To be able to utilize all of the hardware memory available in a large box, many Java teams choose to scale their application deployment through multiple instances instead of within a single instance or a few larger instances. While running 16 application instances on a single box is a good way to use all of the available memory, it doesn't address the cost of managing and monitoring that many instances, especially when you've deployed multiple servers the same way.

Another problem disguised as a solution is the drastic precaution teams take to stay up during peak load: configuring heap sizes for the worst-case peak. Most of that memory isn't needed under everyday load, so it becomes an expensive waste of resources. In some cases teams go even further, deploying no more than two or three instances per box. This is exceptionally wasteful, both economically and in terms of environmental impact, particularly during off-peak hours.
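The cost of over-instancing can be sketched with back-of-envelope arithmetic. The figures below (box RAM, per-instance JVM overhead) are hypothetical assumptions for illustration only; measure your own deployment before drawing conclusions:

```java
public class InstanceCost {
    // Hypothetical figures for illustration only -- measure your own deployment.
    static final long BOX_RAM_GB = 64;
    static final long PER_INSTANCE_OVERHEAD_GB = 1; // JVM internals, code cache, GC metadata

    // RAM left for application heaps after fixed per-instance overhead
    static long usableHeapGb(int instances) {
        return BOX_RAM_GB - instances * PER_INSTANCE_OVERHEAD_GB;
    }

    public static void main(String[] args) {
        System.out.println("16 instances leave " + usableHeapGb(16) + " GB for application heap");
        System.out.println(" 2 instances leave " + usableHeapGb(2) + " GB for application heap");
    }
}
```

Under these assumed numbers, the 16-instance layout burns 16 GB of the box on duplicated JVM overhead before any application data is stored; the 2-instance layout burns 2 GB. The monitoring burden scales the same way.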

Now let's compare architectures. On the left side of Figure 2 we see many small-instance clusters, which are harder to manage and maintain. To the right are fewer, larger instances handling the same load. Which solution is more economical?

Figure 2. Fewer larger instances

Image copyright Azul Systems.

As I discussed in my last article, concurrent compaction is a truly viable solution to this problem. It makes fewer, larger instances possible and removes the scalability limitations commonly associated with the JVM. Currently only Azul's Zing JVM offers concurrent compaction, and Zing is a server-side JVM, not yet in the hands of developers. It would be terrific to see more developers take on Java's scalability challenge at the JVM level.

Since performance tuning is still our primary tool for managing Java scalability issues, let's look at some of the main tuning parameters and what we're actually able to achieve with them.

Tuning parameters -- some examples

The best-known JVM performance option -- one most Java developers specify on the command line when launching a Java application -- is -Xmx. This option lets you specify the maximum size of your Java heap, although results will vary by JVM.

Some JVMs include the memory needed for internal structures (such as compiler threads, GC structures, code cache, and so on) in the -Xmx setting, while others add more memory outside of it. As a result, the size of your Java processes may not always reflect your -Xmx setting.
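A quick way to see how -Xmx relates to what the JVM actually reports is to query the standard Runtime API. This is a minimal sketch; the exact figures vary by JVM, collector, and version:

```java
public class HeapReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper bound on the heap, roughly the -Xmx value
        long total = rt.totalMemory(); // heap memory currently committed by the JVM
        long free = rt.freeMemory();   // unused portion of the committed heap
        System.out.printf("max=%d MB, committed=%d MB, used=%d MB%n",
                max >> 20, total >> 20, (total - free) >> 20);
    }
}
```

Run it with, say, java -Xmx256m HeapReport: on some JVMs maxMemory() reports slightly less than 256 MB, and the operating-system process size will be larger still, because of the internal structures kept outside the -Xmx budget.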

Once the heap has grown to its maximum, the JVM won't return that memory to the system, and it can't grow beyond the limit either; -Xmx is a fixed upper bound. If you don't get your -Xmx setting "right" -- that is, if the application's object-allocation rate, object lifetimes, or object sizes exceed what your memory configuration can absorb -- the application will run out of memory, the JVM will throw an OutOfMemoryError, and your application will shut down.

If your application is struggling with memory availability, you currently have few options other than restarting it with a larger -Xmx size. To avoid downtime and frequent restarts, most enterprise production environments tune for the worst-case load -- hence over-provisioning.

Tip: Tune for production load

A common error among Java developers is to tune heap memory settings in a lab environment and forget to re-tune them for the production load. Lab and production loads can differ significantly, so always re-tune against your production load.

Tuning generational garbage collectors

Some other common tuning options for the JVM are -Xns (on JRockit) and -XX:NewSize (on HotSpot). These options tune the size of the young generation (or nursery): they specify how much of the heap should be dedicated to new allocation in a generational garbage collector.

Most Java developers will try to tune the nursery size based on a lab environment, which risks failure under production load. It's common to set a third or half of the heap as nursery -- a rule of thumb that seems to work most of the time but isn't based on any real analysis, as the "right" size is application dependent. You're better off measuring the actual promotion rate and the actual size of your long-lived objects, then setting the nursery as large as possible without causing promotion failure in the old space of the heap. (Promotion failure is a sign that the old space is too small; it triggers a series of garbage collection actions that can end in an out-of-memory error. See JVM performance optimization, Part 2 for an in-depth discussion of generational garbage collection and heap sizing.)
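Rather than guessing in the lab, you can watch actual collection behavior at runtime through the standard management API. A minimal sketch: comparing how fast the young-generation collection count climbs relative to the old-generation count, under real load, gives a first hint at your promotion behavior (bean names vary by JVM and collector):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // One bean per collector; typically one young-gen and one old-gen collector
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total collection time%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling this periodically in production (or wiring the same beans into your monitoring system via JMX) is a far sounder basis for nursery sizing than a lab guess.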

Another nursery-related option is -XX:SurvivorRatio. This option influences the promotion rate -- how long an object has to survive in the young generation before being promoted to old space. To set it "right" you need to know the frequency of young-generation collections and be able to estimate how long newly allocated objects stay referenced in your application. As with the nursery size, the "right" setting depends on the allocation rate, so values tuned in a lab environment will cost you in production.

Tuning concurrent garbage collectors

If you have a pause-sensitive application, your best bet is to use a concurrent garbage collector -- at least until someone invents something better. Although parallel collectors deliver excellent throughput benchmark scores and are often used in comparative JVM publications, parallel GC does not benefit response times. Concurrent GC is currently the only way to achieve some kind of consistency and the fewest stop-the-world interruptions. Different JVMs supply different options for enabling a concurrent garbage collector; on the Oracle (HotSpot) JVM it's -XX:+UseConcMarkSweepGC. G1, which takes a mostly concurrent approach to marking, has since become the default collector for the Oracle JVM.
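You can confirm which collector your process actually ended up with at runtime, which is useful when flags come from layered startup scripts. A sketch using the same management API as before; the name strings in the comment are HotSpot-specific examples, and other JVMs report different names:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class WhichCollector {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // HotSpot examples: "G1 Young Generation"/"G1 Old Generation" under G1,
            // "ParNew"/"ConcurrentMarkSweep" under -XX:+UseConcMarkSweepGC
            System.out.println("Collector bean: " + gc.getName());
        }
    }
}
```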


Resources

Earlier articles in the JVM performance optimization series:
