GC Algorithms
Garbage collection algorithms are the strategies the JVM uses to identify and reclaim unreachable objects. Different algorithms make different trade-offs between throughput, latency, and memory overhead. The foundational algorithms are mark-sweep, mark-compact, and copying — each with distinct properties that make them suitable for different parts of the heap and different application requirements. Modern production collectors (G1GC, ZGC, Shenandoah) combine these algorithms with concurrency, parallelism, and incremental collection to achieve the best overall performance. This entry covers the core algorithms in depth, their correctness properties, performance characteristics, why generational collection is effective, and how the major production collectors implement these algorithms.
Core Algorithms — Mark-Sweep, Mark-Compact, Copying
// ── Mark-Sweep conceptually ──────────────────────────────────────────
//
// BEFORE GC:
// GC Roots → A → B → C
// ↘
// D (D reachable via A→B→D)
// E → F (E, F not reachable from any GC root)
// G (G not reachable)
//
// MARK phase (traverses from GC roots):
// Marked: A, B, C, D (reachable)
// Unmarked: E, F, G (unreachable)
//
// SWEEP phase (scan heap, reclaim unmarked):
// Reclaimed: E, F, G
//
// AFTER GC:
// GC Roots → A → B → C
// ↘
// D
// Free: [E space][F space][G space] ← fragmented free space
// ── Copying collection (young gen) ───────────────────────────────────
//
// BEFORE GC (fromspace):
// [A][B][C][D][E][F][G][H][I] ← mostly garbage (dark = live, light = dead)
// ↑ ↑ ↑ ↑
// live live live live (only 4 objects are live)
//
// COPY phase (copy only live objects to tospace):
// fromspace: [A][B][C][D][E][F][G][H][I] ← all reclaimed in one step
// tospace: [A][B][D][H] ← only 4 live objects
// ↑ bump pointer advances sequentially
//
// RESULT: No fragmentation, fast allocation, cost proportional to LIVE set
// Only 4 objects processed — 5 dead objects cost nothing!
// ── When each algorithm is used in HotSpot ────────────────────────────
//
// Young generation: Copying (semi-space or to-survivor)
// Cost: proportional to LIVE set → fast because most young objects die
//
// Old generation (G1): Concurrent marking + incremental compaction
// Cost: proportional to heap occupancy → slower but infrequent
//
// Old generation (ZGC): Concurrent marking + concurrent compaction
// Cost: concurrent → spread across application execution, minimal STW
// ── Fragmentation — why it matters ───────────────────────────────────
// After many mark-sweep cycles WITHOUT compaction:
// [ free ][obj][ free ][obj][ free ][obj][ free ]
// 8 bytes 4 16 bytes 4 32 bytes 4 8 bytes
// Total free: 64 bytes
// Can allocate: up to 32-byte object — fragmented, cannot use full free space
//
// After compaction:
// [obj][obj][obj][ 64 bytes free ]
// Can allocate: up to 64-byte object — contiguous, fully usableGenerational Collection — Why It Works
// ── Young generation lifecycle ───────────────────────────────────────
//
// Allocation in Eden:
// ┌──────────────────────────────────┬────────┬────────┐
// │ Eden (empty, fills fast) │ Surv S0│ Surv S1│
// └──────────────────────────────────┴────────┴────────┘
//
// After Eden fills → Minor GC:
// 1. Mark live objects in Eden + S0 (from space)
// 2. Copy survivors to S1 (to space), incrementing their age
// 3. Promote objects with age >= threshold to Old Gen
// 4. Eden + S0 entirely wiped → bump pointer reset to start of Eden
//
// ┌──────────────────────────────────┬────────┬────────┐
// │ Eden (empty again!) │ empty │[A,age2]│
// └──────────────────────────────────┴────────┴────────┘
// ↑ New allocations start here again ↑ A survived 2 GCs
// ── The card table for cross-generational references ──────────────────
//
// Old Gen: Young Gen (Eden):
// ┌────────┬────────┬────────┐ ┌────────────────┐
// │ OldObj │ OldObj │ OldObj │ │ YoungObj │
// │ ref ──┼────────┼────────┼─────────► (referenced │
// │ │ │ │ │ from old gen) │
// └────────┴────────┴────────┘ └────────────────┘
// card 0 card 1 card 2
// dirty! ← card table records this card is "dirty"
//
// During minor GC: dirty cards are added to GC root set
// → YoungObj is reached via OldObj.ref → kept alive (correctly)
// ── Tenuring threshold — adaptive promotion ───────────────────────────
// java -XX:MaxTenuringThreshold=15 (max age before forced promotion)
// java -XX:InitialTenuringThreshold=7 (starting threshold)
// The JVM dynamically adjusts threshold to keep Survivor spaces ~50% full
//
// jstat -gcnew <pid> shows tenuring info:
// TT MTT — current threshold, max threshold
// S0U S1U — Survivor space usage
// If Survivor fills → "premature promotion" → more old gen GCs
// ── Object allocation decision tree ──────────────────────────────────
// new Object() call:
// Is it larger than TLAB?
// YES → allocate in shared Eden (slower, lock needed)
// NO → allocate in TLAB (fast bump pointer, no lock)
// Does TLAB have space?
// YES → bump pointer + return (< 10ns)
// NO → retire TLAB, get new TLAB from Eden
// Eden has space?
// YES → new TLAB from Eden
// NO → trigger Minor GC
// After GC, Eden has space?
// YES → allocate
// NO → promote to Old Gen or OOMG1GC — The Default Collector
// ── G1GC key flags ────────────────────────────────────────────────────
// java -XX:+UseG1GC (default Java 9+, explicit for clarity)
// java -XX:MaxGCPauseMillis=200 (target max pause, default 200ms)
// java -XX:InitiatingHeapOccupancyPercent=45 (trigger concurrent mark at 45% full)
// java -XX:G1HeapRegionSize=4m (region size: 1-32MB, auto-calculated if not set)
// java -XX:G1NewSizePercent=5 (min % of heap for young gen)
// java -XX:G1MaxNewSizePercent=60 (max % of heap for young gen)
// ── G1GC phases ───────────────────────────────────────────────────────
//
// 1. Young-only GC (most frequent — every Eden fill):
// STW pause, evacuates Eden + Survivor to new Survivor/Old
// Duration: proportional to live set in young gen
// Goal: stay within MaxGCPauseMillis
//
// 2. Concurrent Marking Cycle (triggered by IHOP threshold):
// Initial Mark: STW, brief — piggybacks on Young GC
// Root Region Scan: concurrent
// Concurrent Mark: concurrent — traverses object graph
// Remark: STW, brief — finalises marking
// Cleanup: STW, brief — identifies empty regions
//
// 3. Mixed GC (follows concurrent marking if old gen is full enough):
// STW, evacuates young gen + selected old gen regions
// "Garbage First": selects regions with highest garbage density
// Continues in multiple increments until old gen reclaimed
// ── G1GC log analysis ─────────────────────────────────────────────────
// Healthy application log pattern:
// [0.234s] GC(1) Pause Young (Normal) 45M->22M(256M) 8ms ← Normal
// [1.456s] GC(8) Pause Young (Normal) 68M->31M(256M) 11ms ← Normal
// [5.678s] GC(25) Pause Young (Concurrent Start) ... ← Mark beginning
// [5.679s] Concurrent Cycle ← Mark running
// [8.901s] GC(26) Pause Remark 180M->180M(256M) 3ms ← Mark finishing
// [9.012s] GC(27) Pause Cleanup 180M->128M(256M) 2ms ← Empty regions reclaimed
// [9.234s] GC(28) Pause Young (Mixed) 128M->65M(256M) 15ms ← Mixed collection
//
// PROBLEM signs:
// Pause Full (G1 Compaction Pause) → emergency full GC, investigate
// "to-space exhausted" → heap too small or promotion failure
// Concurrent cycles not completing before next → IHOP too lowZGC and Shenandoah — Sub-Millisecond Latency
// ── ZGC key flags (Java 21 LTS) ──────────────────────────────────────
// java -XX:+UseZGC (enable ZGC)
// java -XX:+UseZGC -XX:+ZGenerational (generational ZGC, Java 21+, recommended)
// java -XX:SoftMaxHeapSize=2g (soft limit, GC more aggressively before this)
// java -Xmx4g (hard limit)
// ZGC automatically sizes GC threads — usually no tuning needed
// ── Shenandoah key flags ──────────────────────────────────────────────
// java -XX:+UseShenandoahGC
// java -XX:ShenandoahGCMode=iu (incremental update, default)
// java -XX:ShenandoahGCMode=satb (snapshot-at-beginning)
// ── Choosing the right GC ─────────────────────────────────────────────
//
// Criterion Serial Parallel G1GC ZGC Shenandoah
// Heap size Small Any Any Any Any
// Throughput LOW BEST GOOD GOOD GOOD
// Max pause (200MB heap) Secs 100-500ms <200ms <1ms <1ms
// Max pause (64GB heap) N/A Secs 200ms+ <1ms <1ms
// Memory overhead Lowest Low Medium Medium Medium
// JDK availability All All All 11+ 12+
// Maturity Oldest Old 8+years 4years 4years
//
// Decision:
// Default (most apps): G1GC — well-tested, balanced
// Latency-critical (<1ms): ZGC or Shenandoah
// Batch/throughput jobs: Parallel GC
// Memory-constrained: Serial GC
// Java 21+ new apps: ZGC with generational mode
// ── ZGC in practice — tuning is minimal ──────────────────────────────
// Unlike G1GC, ZGC rarely needs tuning beyond heap size.
// Key metrics to watch:
// Allocation stall: application threads blocked waiting for GC
// → Fix: increase heap (ZGC needs headroom for concurrent work)
//
// Recommendation: set Xmx 30-50% larger than live set
// Live set 2GB → -Xmx3g or -Xmx4g provides enough headroom for ZGC
// ── Concurrent GC overhead measurement ───────────────────────────────
// -Xlog:gc*:gc.log:time,uptime,level,tags
// Look for: concurrent-mark, concurrent-relocate durations
// These consume CPU but do NOT stop the application
// Throughput reduction ≈ GC thread CPU / total CPU