Java™ HotSpot Virtual Machine Performance Enhancements - JDK 7

 

Zero Based Compressed OOPS

In a 64-bit JVM with compressed oops enabled (the UseCompressedOops flag), the JVM asks the operating system to reserve memory for the Java heap starting at virtual address zero. If the operating system supports such a request and can reserve the heap there, zero-based compressed oops are used.

Zero-based compressed oops means that the narrow oop base is 0 instead of an arbitrary address (in the general case, the narrow oop base is the Java heap base minus one protected page size). With a zero base, a 64-bit pointer can be decoded from a 32-bit compressed oop without adding the Java heap base address, which makes encoding and decoding cheaper.
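The difference between the two decoding schemes can be sketched in a few lines of Java. This is an illustrative model only, not HotSpot source: the class and method names are hypothetical, and the 3-bit shift assumes 8-byte object alignment.

```java
public final class NarrowOopSketch {
    // Assumption: objects are 8-byte aligned, so the low 3 bits of an address are always zero.
    static final int SHIFT = 3;

    // General case: the heap sits at an arbitrary base, so decoding needs a shift and an add.
    public static long decode(long narrowOopBase, int narrowOop) {
        return narrowOopBase + ((narrowOop & 0xFFFFFFFFL) << SHIFT);
    }

    public static int encode(long narrowOopBase, long address) {
        return (int) ((address - narrowOopBase) >>> SHIFT);
    }

    // Zero-based case: the base is 0, so the add disappears and only the shift remains.
    public static long decodeZeroBased(int narrowOop) {
        return (narrowOop & 0xFFFFFFFFL) << SHIFT;
    }

    public static int encodeZeroBased(long address) {
        return (int) (address >>> SHIFT);
    }
}
```

In this model a 32-bit value combined with a 3-bit shift can address up to 32 GB, which is why a zero base is only possible when the whole heap fits below that limit.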

Read more about Compressed OOPS.

Escape Analysis Improvements

Escape analysis is a technique by which the Java™ HotSpot Server Compiler can analyze the scope of a new object's uses and decide whether to allocate it on the Java heap.

The Java HotSpot Server Compiler implements the flow-insensitive escape analysis algorithm described in:

 [Choi99] Jong-Deok Choi, Manish Gupta, Mauricio Serrano,
          Vugranam C. Sreedhar, Sam Midkiff,
          "Escape Analysis for Java", Proceedings of ACM SIGPLAN
          OOPSLA Conference, November 1, 1999

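The server compiler constructs a "connection graph" (CG) for the method being analyzed. It then makes a pass over the nodes of the graph and determines their escape state. A node's escape state may be one of the following:

GlobalEscape: The object escapes the method and the thread, for example because it is stored in a static field, stored in a field of an escaped object, or returned as the result of the current method.

ArgEscape: The object is passed as an argument, or is referenced by an argument, but does not globally escape during the call.

NoEscape: The object is scalar replaceable, meaning its allocation can be removed from generated code.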

After escape analysis, the server compiler eliminates scalar replaceable object allocations and their associated locks from generated code. The server compiler also eliminates locks for all non-globally escaping objects. It does not replace a heap allocation with a stack allocation for non-globally escaping objects.

Some scenarios for escape analysis are described below:
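As a sketch of the kind of code these optimizations target, the hypothetical classes below show a defensive copy that never leaves the calling method and a lock on an object that no other thread can see. Whether the server compiler actually removes the allocation or the lock depends on inlining and on the JVM version, so treat this as an illustration of the patterns, not as a guarantee.

```java
public final class EscapeScenarios {
    // Hypothetical value class used only for this illustration.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        Point copy() { return new Point(x, y); }
    }

    private final Point origin = new Point(3, 4);

    // Returns a defensive copy so callers cannot alias the internal field.
    Point getOrigin() { return origin.copy(); }

    // The copy is only read here and is never stored or returned. If getOrigin()
    // and copy() are inlined, the copy does not escape, so it is a candidate
    // for scalar replacement instead of a heap allocation.
    public int sumOfCoordinates() {
        Point p = getOrigin();
        return p.x + p.y;
    }

    // The StringBuffer is local to this method, so no other thread can contend
    // for its monitor; the locks taken by its synchronized append() calls are
    // candidates for elimination.
    public static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a).append(b);
        return sb.toString();
    }
}
```

In the first method the object would otherwise be allocated only to have two fields read from it; in the second, the monitor operations guard data that is never shared.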

NUMA Collector Enhancements

The Parallel Scavenger garbage collector has been extended to take advantage of machines with a NUMA (Non-Uniform Memory Access) architecture. Most modern computers are based on a NUMA architecture, in which it takes a different amount of time to access different parts of memory. Typically, every processor in the system has local memory that provides low access latency and high bandwidth, and remote memory that is considerably slower to access.

In the Java HotSpot VM, the NUMA-aware allocator has been implemented to take advantage of such systems and provide automatic memory placement optimizations for Java applications. The allocator controls the eden space of the young generation of the heap, where most of the new objects are created. It divides the space into regions each of which is placed in the memory of a specific node. The allocator relies on a hypothesis that a thread that allocates the object will be the most likely to use it. To ensure the fastest access to the new object the allocator places it in the region local to the allocating thread. The regions can be dynamically resized to reflect the allocation rate of the application threads running on different nodes. That makes it possible to increase performance even of single-threaded applications. In addition to that, "from" and "to" survivor spaces of the young generation, the old generation and the permanent generation have page interleaving turned on for them. This ensures that all threads have equal access latencies to these spaces on average.

The NUMA-aware allocator is implemented for Solaris (>= 9u2) and Linux (kernel >= 2.6.19, glibc >= 2.6.1) operating systems. It can be turned on with the -XX:+UseNUMA flag in conjunction with the Parallel Scavenger garbage collector, which is the default on a server-class machine and can also be enabled explicitly by specifying the -XX:+UseParallelGC option.
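One way to confirm that both flags took effect is to query them from inside the running JVM. The sketch below assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean (Java 7 or later); the class name NumaFlagCheck and the "unavailable" fallback are choices made for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public final class NumaFlagCheck {
    // Returns the current value of a HotSpot -XX flag as a string, or
    // "unavailable" if this JVM does not expose the flag.
    public static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Thrown, for example, when the flag does not exist on this JVM.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseNUMA       = " + flagValue("UseNUMA"));
        System.out.println("UseParallelGC = " + flagValue("UseParallelGC"));
    }
}
```

Running it as java -XX:+UseNUMA -XX:+UseParallelGC NumaFlagCheck prints the values the JVM actually settled on, which may differ from what was requested if the platform does not support NUMA.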

NUMA Performance Metrics

When evaluated against the SPEC JBB 2005 benchmark on an 8 chip Opteron machine, NUMA-aware systems showed the following performance increase:


Copyright ©2009 Sun Microsystems, Inc. All Rights Reserved.