Java Garbage Collection
The JDK provides several garbage collection algorithms to meet the needs of different kinds of applications.
One aspect of performance tuning a Java application is selecting a suitable GC based on the application's requirements: a desired maximum GC pause time, a desired application throughput, and a minimum memory footprint.
Garbage Collection Algorithms
Design Choices
Serial versus Parallel
Concurrent versus Stop-the-world
Compacting versus Non-compacting versus Copying
Serial Collector(-XX:+UseSerialGC)
In Java 5, the serial collector is automatically chosen as the default garbage collector on machines that are not server-class machines.
Parallel Collector(-XX:+UseParallelGC)
In Java 5, the parallel collector is automatically chosen as the default garbage collector on server-class machines.
It is suitable for applications that run on multiprocessor machines and do not have pause time constraints, since infrequent, but potentially long, old generation collections will still occur.
Parallel Compacting Collector(-XX:+UseParallelOldGC)
The parallel compacting collector was introduced in J2SE 5.0 update 6.
Eventually, the parallel compacting collector will replace the parallel collector.
Concurrent Mark-Sweep (CMS) Collector(-XX:+UseConcMarkSweepGC)
The CMS collector is the only collector that is non-compacting.
The CMS collector is suitable if your application needs shorter garbage collection pauses.
The Garbage-First Garbage Collector
G1 was first introduced in Java 1.6 Update 14 and is intended as the long-term replacement for HotSpot's low-latency Concurrent Mark-Sweep GC.
G1 is a "server-style" GC with the following attributes: parallelism and concurrency, generational collection, compaction, and predictability.
Garbage Collection Change in JDK versions
Enhancements in JDK 5.0
JDK 5.0 changed the default GC algorithm on server-class machines from the serial collector (-XX:+UseSerialGC) to the parallel collector (-XX:+UseParallelGC).
Ergonomics -- Automatic Selections
In Java 5, default values for the garbage collector, heap size, and HotSpot virtual machine type (client or server) are automatically chosen based on the platform and operating system on which the application is running.
This can better match the needs of different types of applications and require fewer command line options.
Ergonomics -- Support for Behavior-Based Tuning
Prior to Java 5.0, tuning garbage collection consisted principally of specifying the size of the overall heap and possibly the size of the generations in the heap, the size of the survivor spaces, and the threshold for promotion from the young generation to the old generation.
This is complex and requires careful experimentation and profiling.
From Java 5.0 onward, we can instead specify a desired behavior, such as a maximum pause time goal with -XX:MaxGCPauseMillis=n.
Enhancements in JDK 6
Parallel compaction is used by default in JDK 6.
Parallel Compaction Enhancements(-XX:+UseParallelOldGC)
Parallel compaction is a feature that enables the parallel collector to perform major collections in parallel.
Parallel compaction was introduced in JDK 5.0 update 6; JDK 6 contains significant performance improvements to it.
CMS Enhancements(-XX:+ExplicitGCInvokesConcurrent)
The Concurrent Mark-Sweep collector has been enhanced so that calls to System.gc() and Runtime.getRuntime().gc() can be serviced by a concurrent collection instead of a stop-the-world full collection.
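As a rough way to observe what an explicit collection request does, the sketch below reads the cumulative collection count from the standard java.lang.management API before and after calling System.gc(). The class name ExplicitGcDemo is illustrative only. Whether the resulting cycle runs concurrently depends on starting the JVM with -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent, and the JVM may ignore the request entirely (for example under -XX:+DisableExplicitGC).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class ExplicitGcDemo {
    /** Sums the collection counts of all collectors; undefined (-1) counts are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // only a request; the JVM decides whether and how to collect
        long after = totalCollections();
        System.out.println("Collections before: " + before + ", after: " + after);
    }
}
```

Running the same program with and without -XX:+ExplicitGCInvokesConcurrent and comparing the GC log is one way to see the difference in pause behavior.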
Garbage-First Garbage Collector
Java 1.6 Update 14 includes a preliminary version of the new Garbage-First algorithm, which is intended as the long-term replacement for HotSpot's low-latency Concurrent Mark-Sweep GC.
Garbage Collection Ergonomics Improvement
Larger Default Young Generation Size
Java 5 allows users to specify a desired behavior by setting a maximum pause time goal with -XX:MaxGCPauseMillis=n.
GC Techniques
Reference counting
The runtime keeps track of how many live objects point to a particular object at a given time.
It is fairly efficient, except for the obvious flaw that cyclic constructs can never be garbage collected.
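The cyclic flaw can be illustrated with a toy model. The sketch below is not how any JVM manages memory; Node, link, and release are hypothetical names introduced only for this example. It frees an object when its count reaches zero and shows that two objects referring to each other never get there.

```java
import java.util.ArrayList;
import java.util.List;

public class RefCountDemo {
    /** Toy object with an explicit reference count (hypothetical, for illustration only). */
    static final class Node {
        int refCount;
        boolean freed;
        final List<Node> refs = new ArrayList<>();
    }

    /** Records a reference from one node to another. */
    static void link(Node from, Node to) {
        from.refs.add(to);
        to.refCount++;
    }

    /** Drops one reference; at zero the node is freed and its children are released. */
    static void release(Node n) {
        n.refCount--;
        if (n.refCount == 0) {
            n.freed = true;
            for (Node child : n.refs) {
                release(child);
            }
        }
    }

    /** a -> b with one external reference to a: releasing it frees both. */
    static boolean chainIsFreed() {
        Node a = new Node(), b = new Node();
        a.refCount = 1; // external reference held by the "program"
        link(a, b);
        release(a);
        return a.freed && b.freed;
    }

    /** x <-> y with one external reference to x: neither count ever reaches zero. */
    static boolean cycleLeaks() {
        Node x = new Node(), y = new Node();
        x.refCount = 1; // external reference held by the "program"
        link(x, y);
        link(y, x);
        release(x); // x.refCount drops from 2 to 1, so nothing is freed
        return !x.freed && !y.freed;
    }

    public static void main(String[] args) {
        System.out.println("Chain freed: " + chainIsFreed());
        System.out.println("Cycle leaked: " + cycleLeaks());
    }
}
```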
Tracing techniques (Mark and Sweep)
Mark all objects currently visible to the running program as live, then recursively mark all objects reachable from those objects as live as well.
After that, all remaining objects are considered unreachable and can be garbage collected.
Root set
The root set consists of the references held in local variables on thread stack frames, together with global data such as static fields.
Mark and sweep
This is the basis of all the garbage collectors in all commercial JVMs today.
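The two phases can be sketched over a toy object graph. This is a minimal illustration under simplifying assumptions, not the implementation used by any JVM: the heap is a plain list, the root set is passed in explicitly, and Obj, mark, and sweep are hypothetical names. Note that the unreachable cycle c <-> d is collected, which reference counting alone cannot do.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class MarkSweepDemo {
    /** Toy heap object with outgoing references and a mark bit. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Mark phase: flag every object reachable from the root set. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue; // already visited, which also makes cycles terminate
            }
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    /** Sweep phase: drop unmarked objects from the heap and clear marks on survivors. */
    static List<String> sweep(List<Obj> heap) {
        List<String> collected = new ArrayList<>();
        for (Iterator<Obj> it = heap.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                collected.add(o.name);
                it.remove();
            }
        }
        return collected;
    }

    /** Builds a -> b (reachable) and c <-> d (unreachable cycle), then collects. */
    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a)); // the root set contains only a
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("Collected: " + demo()); // prints "Collected: [c, d]"
    }
}
```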
Generational garbage collection
This is based on the observation that most objects are temporary or short-lived, while objects that have already existed for a while are very likely to continue to exist.
Few references from older to younger objects exist.
Memory is divided into generations: separate pools that hold objects of different ages. Different algorithms can be used to perform garbage collection in the different generations, each optimized for the commonly observed characteristics of that particular generation.
HotSpot Generations
Memory in the Java HotSpot virtual machine is organized into three generations: a young generation, an old generation and a permanent generation.
Young generation
The young generation is usually small and likely to contain many objects that are no longer referenced.
Young generation collections occur relatively frequently and are efficient and fast. The garbage collection algorithm chosen for the young generation should be time efficient.
The young generation consists of an area called Eden plus two smaller survivor spaces.
Most objects are initially allocated in Eden. The survivor spaces hold objects that have survived at least one young generation collection and have thus been given additional chances to die before being considered "old enough" to be promoted to the old generation.
Old generation
Objects that survive some number of young generation collections are eventually promoted, or tenured, to the old generation. This generation is typically larger than the young generation and its occupancy grows more slowly. As a result, old generation collections are infrequent, but take significantly longer to complete.
The garbage collection algorithm chosen for the old generation should be space efficient.
Permanent Generation
The permanent generation stores metadata describing classes and methods, as well as the classes and methods themselves.
Once the data is in the permanent generation, the default behavior is that it remains there forever - this might vary between garbage collection policies.
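The generations described above can be inspected at runtime through the standard java.lang.management API. The sketch below lists the memory pools the running JVM reports; the class name MemoryPoolsDemo is illustrative, and the pool names printed depend on the collector in use and on the JVM version, so no particular names should be assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class MemoryPoolsDemo {
    /** Returns the names of all heap memory pools reported by the running JVM. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            String used = (usage == null) ? "n/a" : usage.getUsed() + " bytes";
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + used);
        }
    }
}
```

Running it under different collector flags is a simple way to see how each collector names and divides the heap.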
Garbage Collection Types
When the young generation fills up, a young generation collection (a minor collection) of just that generation is performed. When the old or permanent generation fills up, a full collection (a major collection) is typically done. That is, all generations are collected.
Command line options
Garbage Collector Selection
-XX:+UseSerialGC
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:+UseConcMarkSweepGC
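To check which collectors a selection flag actually activated, the running JVM can be asked directly. The sketch below uses the standard GarbageCollectorMXBean API; the class name ActiveCollectors is illustrative, and the reported names are JVM-specific labels rather than the flag names themselves.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {
    /** Returns the names of the collectors registered in the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```

For example, the program could be launched once as java -XX:+UseSerialGC ActiveCollectors and once as java -XX:+UseParallelGC ActiveCollectors to compare the reported names.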
Options for the Parallel and Parallel Compacting Collectors
-XX:ParallelGCThreads=n
-XX:MaxGCPauseMillis=n
-XX:GCTimeRatio=n
Options for the CMS Collector
-XX:+CMSIncrementalMode
-XX:+CMSIncrementalPacing
-XX:ParallelGCThreads=n
Heap and Generation Sizes
-Xmsn, -Xmxn, -XX:MaxPermSize=n
-XX:NewSize=n Default initial size of the new (young) generation
-XX:NewRatio=n Ratio between the young and old generations.
-XX:SurvivorRatio=n Ratio between each survivor space and Eden.
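The two ratio options are easier to read with worked numbers. The sketch below assumes the usual HotSpot interpretation: NewRatio=n makes the old generation n times the size of the young generation, and SurvivorRatio=n makes Eden n times the size of one of the two survivor spaces. The class and method names are hypothetical, and the real JVM aligns these sizes and may adjust them through ergonomics, so the results are approximations.

```java
public class GenerationSizes {
    /** Young generation size when old:young is newRatio:1. */
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** Size of one survivor space when eden:survivor is survivorRatio:1 (two survivor spaces). */
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    /** Eden is whatever remains of the young generation after both survivor spaces. */
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 600;     // e.g. -Xmx600m
        int newRatio = 2;      // -XX:NewRatio=2
        int survivorRatio = 8; // -XX:SurvivorRatio=8

        long young = youngSize(heapMb, newRatio);
        System.out.println("old      = " + (heapMb - young) + " MB");                 // 400 MB
        System.out.println("young    = " + young + " MB");                            // 200 MB
        System.out.println("eden     = " + edenSize(young, survivorRatio) + " MB");   // 160 MB
        System.out.println("survivor = " + survivorSize(young, survivorRatio) + " MB each"); // 20 MB
    }
}
```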
Garbage Collector Statistics
-XX:+PrintGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
Resources
Java HotSpot Garbage Collection
http://download.oracle.com/javase/6/docs/technotes/guides/vm/index.html
The Garbage-First Garbage Collector