Java Garbage Collection


The JDK provides several garbage collection algorithms to suit different application needs.

One aspect of Java performance tuning is selecting a suitable garbage collector based on the application's requirements: a maximum GC pause goal, an application throughput goal, and a minimum memory footprint.

Garbage Collection Algorithms

Design Choices

Serial versus Parallel

Concurrent versus Stop-the-world

Compacting versus Non-compacting versus Copying

Serial Collector(-XX:+UseSerialGC)

In Java 5, the serial collector is automatically chosen as the default garbage collector on machines that are not server-class machines.

Parallel Collector(-XX:+UseParallelGC)

In Java 5, the parallel collector is automatically chosen as the default garbage collector on server-class machines.

It is suitable for applications that run on multiprocessor machines and do not have pause time constraints, since infrequent but potentially long old generation collections will still occur.

Parallel Compacting Collector(-XX:+UseParallelOldGC)

The parallel compacting collector was introduced in J2SE 5.0 update 6.

Eventually, the parallel compacting collector will replace the parallel collector.

Concurrent Mark-Sweep (CMS) Collector(-XX:+UseConcMarkSweepGC)

The CMS collector is the only collector that is non-compacting.

CMS collector is suitable if your application needs shorter garbage collection pauses.

The Garbage-First Garbage Collector

G1 was first introduced in Java 6 Update 14 and is the long-term replacement for HotSpot's low-latency Concurrent Mark-Sweep GC.

G1 is a "server-style" collector with the following attributes: parallelism and concurrency, generational collection, compaction, and predictability.

Garbage Collection Changes Across JDK Versions

Enhancements in JDK 5.0


JDK 5.0 changed the default GC algorithm on server-class machines from the serial collector (-XX:+UseSerialGC) to the parallel collector (-XX:+UseParallelGC).

Ergonomics -- Automatic Selections

In Java 5, default values for the garbage collector, heap size, and HotSpot virtual machine type (client or server) are automatically chosen based on the platform and operating system on which the application is running.

These defaults better match the needs of different types of applications and require fewer command-line options.

Ergonomics -- Behavior-Based Tuning

Prior to Java 5.0, tuning garbage collection consisted principally of specifying the size of the overall heap and possibly the sizes of the generations in the heap, the size of the survivor spaces, and the threshold for promotion from the young generation to the old generation.

This is complex and requires careful experimentation and profiling.

From Java 5.0 onward, you can instead specify the desired behavior with a maximum pause time goal (-XX:MaxGCPauseMillis=n) and an application throughput goal (-XX:GCTimeRatio=n). The JVM then tunes the size of the heap to try to meet the specified goals.
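
To make the throughput goal concrete: as documented for HotSpot, -XX:GCTimeRatio=n asks the JVM to keep the ratio of garbage collection time to application time at 1/(1+n). The following is a minimal sketch of that arithmetic only; the class and method names are hypothetical and are not part of any JDK API.

```java
/** Illustrative helper; the class and method names are hypothetical. */
class GcTimeRatioExample {

    /** Target GC time fraction for -XX:GCTimeRatio=n, computed as 1 / (1 + n). */
    static double gcTimeFraction(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("ratio must be non-negative");
        }
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // Print the GC time goal implied by a few sample ratios.
        for (int n : new int[] {4, 19, 99}) {
            System.out.printf("GCTimeRatio=%d -> about %.1f%% of time in GC%n",
                    n, 100 * gcTimeFraction(n));
        }
    }
}
```

For example, n=19 corresponds to a goal of roughly 5% of time spent in garbage collection. These are goals rather than guarantees, so measured behavior may differ.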

Enhancements in JDK 6

Parallel compaction is used by default in JDK 6.

Parallel Compaction Enhancements(-XX:+UseParallelOldGC)

Parallel compaction is a feature that enables the parallel collector to perform major collections in parallel.

Parallel compaction was introduced in JDK 5.0 update 6; JDK 6 contains significant performance improvements.

CMS Enhancements(-XX:+ExplicitGCInvokesConcurrent)

The Concurrent Mark-Sweep collector has been enhanced to perform a concurrent collection in response to System.gc() and Runtime.getRuntime().gc() calls.

Garbage-First Garbage Collector

Java 6 Update 14 includes a preliminary version of the new Garbage-First collector, which is the long-term replacement for HotSpot's low-latency Concurrent Mark-Sweep GC.

Garbage Collection Ergonomics Improvement

Larger Default Young Generation Size

Java 5 allowed users to specify a desired behavior by setting a maximum pause time goal (-XX:MaxGCPauseMillis=n) and an application throughput goal (-XX:GCTimeRatio=n). In Java 6, the default selections have been further enhanced to improve application runtime performance and garbage collector efficiency.

GC Techniques

Reference counting

The runtime keeps track of how many live objects point to a particular object at a given time.

It is fairly efficient, except for the obvious flaw that cyclic constructs can never be garbage collected.
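
The cycle problem can be illustrated with a toy model. This is a minimal sketch that simulates reference counts by hand; the Node class and the addRef/release helpers are hypothetical, and HotSpot does not manage memory this way.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of reference counting; all names here are hypothetical. */
class RefCountDemo {

    /** A simulated heap object with an explicit reference count. */
    static final class Node {
        final String name;
        int refCount = 0;
        boolean freed = false;
        final List<Node> fields = new ArrayList<>();

        Node(String name) { this.name = name; }
    }

    /** Store a reference to target in owner, or in a root when owner is null. */
    static void addRef(Node owner, Node target) {
        if (owner != null) {
            owner.fields.add(target);
        }
        target.refCount++;
    }

    /** Drop one reference; free the object and release its fields at count zero. */
    static void release(Node target) {
        if (--target.refCount == 0) {
            target.freed = true;
            for (Node child : target.fields) {
                release(child);
            }
            target.fields.clear();
        }
    }

    public static void main(String[] args) {
        // Acyclic: root -> a -> b. Dropping the root reference frees both.
        Node a = new Node("a"), b = new Node("b");
        addRef(null, a);
        addRef(a, b);
        release(a);
        System.out.println("a freed: " + a.freed + ", b freed: " + b.freed);

        // Cyclic: root -> c -> d -> c. Dropping the root leaves both counts at 1.
        Node c = new Node("c"), d = new Node("d");
        addRef(null, c);
        addRef(c, d);
        addRef(d, c);
        release(c);
        System.out.println("c freed: " + c.freed + ", d freed: " + d.freed);
    }
}
```

In the cyclic case neither object is reachable from the program any more, yet each still holds a count of 1 from the other, so neither is ever freed.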

Tracing techniques (Mark and Sweep)

Mark all objects currently visible to the running program as live. Then recursively mark all objects reachable from those objects as live as well.

After that, all unmarked objects are considered unreachable and can be garbage collected.

Root set

The root set includes all object references held in local frames, as well as global data such as static fields.

Mark and sweep

This is the basis of all the garbage collectors in all commercial JVMs today.
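
The mark and sweep phases can be sketched over a simulated object graph. This is a simplified illustration under stated assumptions: the Obj class, the list-based "heap", and the method names are hypothetical, and production collectors are far more elaborate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy mark-and-sweep over a simulated heap; all names are hypothetical. */
class MarkSweepDemo {

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    /** Mark phase: flag every object reachable from the root set. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue; // already visited, which also makes cycles terminate
            }
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    /** Sweep phase: drop unmarked objects, reset marks, return reclaimed names. */
    static List<String> sweep(List<Obj> heap) {
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Obj> it = heap.iterator(); it.hasNext();) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false; // clear for the next collection cycle
            } else {
                reclaimed.add(o.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    static List<String> collect(List<Obj> heap, List<Obj> roots) {
        mark(roots);
        return sweep(heap);
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b); // a -> b, reachable from the root
        c.refs.add(d); // c <-> d form an unreachable cycle
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        System.out.println("reclaimed: " + collect(heap, Arrays.asList(a)));
    }
}
```

Running main prints `reclaimed: [c, d]`: because liveness is decided by reachability from the root set rather than by counts, the unreachable cycle is collected.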

Generational garbage collection

This is based on the observation that most objects are temporary or short-lived, while objects that have already existed for a while are very likely to continue to exist.

Few references from older to younger objects exist.

Memory is divided into generations; separate pools hold objects of different ages. Different algorithms can be used to perform garbage collection in the different generations, each algorithm optimized based on commonly observed characteristics for that particular generation.

HotSpot Generations

Memory in the Java HotSpot virtual machine is organized into three generations: a young generation, an old generation and a permanent generation.
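
One way to see how a running JVM has laid out its memory is to list its memory pools through the standard java.lang.management API. This is a sketch only: the MemoryPoolsDemo class and describePools method are hypothetical names, and the pool names printed depend on the JVM version and the selected collector (for instance, newer HotSpot releases report Metaspace rather than a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Lists the JVM's memory pools; class and method names are hypothetical. */
class MemoryPoolsDemo {

    /** One line per pool: memory type, pool name, and current usage in KB. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is not valid
            long usedKb = (usage == null) ? 0 : usage.getUsed() / 1024;
            lines.add(pool.getType() + " | " + pool.getName() + " | used " + usedKb + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```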

Young generation

The young generation is usually small and likely to contain a lot of objects that are no longer referenced.

Young generation collections occur relatively frequently and are efficient and fast. The garbage collection algorithm chosen for the young generation should be time efficient.

The young generation consists of an area called Eden plus two smaller survivor spaces.

Most objects are initially allocated in Eden. The survivor spaces hold objects that have survived at least one young generation collection and have thus been given additional chances to die before being considered "old enough" to be promoted to the old generation.

Old generation

Objects that survive some number of young generation collections are eventually promoted, or tenured, to the old generation. This generation is typically larger than the young generation and its occupancy grows more slowly. As a result, old generation collections are infrequent, but take significantly longer to complete.
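
The aging and promotion described above can be sketched as a small simulation. Everything here is an assumption made for illustration: the GenerationalDemo class, the `live` flag that stands in for reachability, and the fixed tenuring threshold are hypothetical, and the sketch ignores Eden and the survivor spaces as separate areas.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy model of aging and promotion; all names are hypothetical. */
class GenerationalDemo {

    static final class Obj {
        final String name;
        int age = 0;          // number of minor collections survived
        boolean live = true;  // stands in for "still reachable"

        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    GenerationalDemo(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in the young generation. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    /** Minor collection: drop dead young objects, age survivors, promote old enough ones. */
    void minorCollection() {
        for (Iterator<Obj> it = young.iterator(); it.hasNext();) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();
                continue;
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                it.remove();
                old.add(o);
            }
        }
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo(2);
        heap.allocate("cache");               // long-lived object
        heap.allocate("temp1").live = false;  // dies before the first collection
        heap.minorCollection();               // temp1 reclaimed, cache reaches age 1
        heap.allocate("temp2").live = false;
        heap.minorCollection();               // temp2 reclaimed, cache promoted at age 2
        System.out.println("young=" + heap.young.size() + ", old=" + heap.old.size());
    }
}
```

Running main prints `young=0, old=1`: the short-lived objects never leave the young generation, while the long-lived one is tenured after surviving two minor collections.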

The garbage collection algorithm chosen for the old generation should be space efficient.

Permanent Generation

The permanent generation stores class and method metadata, as well as the classes and methods themselves.

Once data is in the permanent generation, the default behavior is that it remains there permanently, although this may vary between garbage collection policies.

Garbage Collection Types

When the young generation fills up, a young generation collection (a minor collection) of just that generation is performed. When the old or permanent generation fills up, a full collection (a major collection) is typically done. That is, all generations are collected.

Command line options

Garbage Collector Selection

-XX:+UseSerialGC

-XX:+UseParallelGC

-XX:+UseParallelOldGC

-XX:+UseConcMarkSweepGC
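
To check which collectors a JVM is actually running with after passing one of these options, the standard GarbageCollectorMXBean API can be queried. A minimal sketch follows; the ActiveCollectors class and collectorNames method are hypothetical names, and the names reported vary with the JVM vendor, version, and selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reports the collectors registered in this JVM; names are hypothetical. */
class ActiveCollectors {

    /** Names of the garbage collectors, typically one per generation. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Active collectors: " + collectorNames());
    }
}
```

Running the same class with different selection flags and comparing the output is a quick way to confirm that a flag took effect on a given JVM.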

Options for the Parallel and Parallel Compacting Collectors

-XX:ParallelGCThreads=n

-XX:MaxGCPauseMillis=n

-XX:GCTimeRatio=n

Options for the CMS Collector

-XX:+CMSIncrementalMode

-XX:+CMSIncrementalPacing

-XX:ParallelGCThreads=n

Heap and Generation Sizes

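-Xmsn, -Xmxn, -XX:MaxPermSize=n

-XX:NewSize=n Initial size of the new (young) generation.

-XX:NewRatio=n Ratio of the old generation to the young generation. For example, n=3 makes the young generation one quarter of the heap.

-XX:SurvivorRatio=n Ratio of Eden to each survivor space. For example, n=6 makes each survivor space one eighth of the young generation.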
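
The arithmetic behind -XX:NewRatio and -XX:SurvivorRatio can be sketched as follows. This is an approximation under stated assumptions: the GenerationSizing class and its methods are hypothetical, the heap figure excludes the permanent generation, and a real JVM aligns sizes and may resize generations adaptively, so actual values can differ.

```java
/** Approximate generation sizing arithmetic; all names are hypothetical. */
class GenerationSizing {

    /** NewRatio=n means old:young = n:1, so young is heap / (n + 1). */
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** SurvivorRatio=n means eden:survivor = n:1, with two survivor spaces. */
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    /** Eden is what remains of the young generation after both survivor spaces. */
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 1024;      // e.g. -Xmx1024m
        int newRatio = 3;        // -XX:NewRatio=3
        int survivorRatio = 6;   // -XX:SurvivorRatio=6

        long young = youngSize(heapMb, newRatio);
        long old = heapMb - young;
        System.out.println("young=" + young + " MB, old=" + old + " MB, eden="
                + edenSize(young, survivorRatio) + " MB, each survivor="
                + survivorSize(young, survivorRatio) + " MB");
    }
}
```

With these sample values the sketch prints `young=256 MB, old=768 MB, eden=192 MB, each survivor=32 MB`.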

Garbage Collector Statistics

-XX:+PrintGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps
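
Besides these logging options, basic collection counts and accumulated collection times can be read from inside the application through the standard GarbageCollectorMXBean API. The sketch below is illustrative only: the GcStats class and totalCollections method are hypothetical names, the allocation loop merely makes a collection likely, and the numbers reported depend entirely on the JVM and heap settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reads GC counts and times programmatically; names are hypothetical. */
class GcStats {

    /** Sum of collection counts across all collectors; undefined (-1) values are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so that a young collection is likely to occur.
        long checksum = 0;
        for (int i = 0; i < 2_000; i++) {
            byte[] junk = new byte[256 * 1024];
            junk[0] = (byte) i;
            checksum += junk[0];
        }

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections: " + totalCollections()
                + " (checksum " + checksum + ")");
    }
}
```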

