
Heap Memory

The heap is the runtime data area from which all Java object instances and arrays are allocated. It is shared across all threads in a JVM process, grows dynamically up to a configured maximum, and is managed entirely by the garbage collector. Understanding heap memory means understanding how objects are allocated, how the generational hypothesis drives GC design, how the major garbage collectors (G1, ZGC, Shenandoah, Parallel) partition and manage the heap, what triggers garbage collection, how to tune heap size and GC behaviour, and how to diagnose heap-related problems including OutOfMemoryError and excessive GC pause times. This entry covers heap structure, generational design, object allocation, garbage collection triggers, common collectors, heap tuning, and memory leak detection.

Heap Structure and the Generational Hypothesis

The heap is a single region of virtual memory, contiguous or at least logically contiguous. All objects created with 'new' are allocated on the heap, and all arrays, regardless of element type, are heap objects. The heap is shared: any thread can access any object on it, subject to synchronisation. This sharing is the source of both the heap's power (objects can be passed between threads by reference) and its complexity (concurrent access requires careful synchronisation).

The generational hypothesis is the empirical observation that drives modern GC design: most objects die young. In typical Java applications the vast majority of objects (commonly cited estimates range from 90% to 98%) become unreachable very shortly after creation, within milliseconds of being allocated and often within the same method call. Iterator objects, intermediate string results, boxing wrappers, and DTOs used to carry data to a view all have very short lifespans. A small minority of objects live for the duration of the application: caches, connection pools, static configuration.

This bimodal distribution of lifespans motivates partitioning the heap into generations. Young generation (Eden plus Survivor spaces in the traditional layout): newly allocated objects go here, and GC here is fast because most objects are already dead and only the few survivors need to be copied. Old generation (or tenured space): objects that survive multiple young-generation collections are promoted here; GC here is more expensive but rarer. This design concentrates GC effort where it pays off most, in the young generation where most of the garbage is.

G1 (the default collector since Java 9) replaces the traditional contiguous young/old split with a region-based layout but remains generational: each region acts as an Eden, Survivor, or Old region. ZGC and Shenandoah are also region-based; both began as single-generation collectors and gained generational modes in later JDK releases (Generational ZGC arrived in JDK 21). Wherever generations are used, the principle is the same: recently allocated objects are collected more frequently.
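
The generational layout described above can be observed from inside a running JVM. The following is a minimal sketch, assuming only the standard java.lang.management API; the class name HeapPools is made up for illustration. It lists the memory pools that the JVM reports as part of the heap. Pool names depend on the collector in use (under G1 they are typically "G1 Eden Space", "G1 Survivor Space", and "G1 Old Gen").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
final class HeapPools {

    /** Names of the memory pools the running JVM classifies as heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();   // may be null if the pool is no longer valid
            if (pool.getType() == MemoryType.HEAP && usage != null) {
                System.out.println(pool.getName() + ": " + usage.getUsed() / 1024 + " KB used");
            }
        }
    }
}
```

Running the same program with a different collector flag (for example -XX:+UseSerialGC) should show different pool names for the same young/old roles.
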
Java
// ── Object allocation and generational movement ──────────────────────
//
//  HEAP STRUCTURE (G1GC region-based, simplified view):
//
//  ┌────────────────────────────────────────────────────────────────┐
//  │                        HEAP                                    │
//  │                                                                │
//  │  ┌──────┐ ┌──────┐ ┌──────┐  Eden regions                     │
//  │  │ Eden │ │ Eden │ │ Eden │  ← new objects allocated here      │
//  │  └──────┘ └──────┘ └──────┘                                   │
//  │  ┌──────┐ ┌──────┐          Survivor regions                   │
//  │  │ Surv │ │ Surv │          ← objects that survived 1+ GC     │
//  │  └──────┘ └──────┘                                             │
//  │  ┌──────┐ ┌──────┐ ┌──────┐ Old regions                       │
//  │  │ Old  │ │ Old  │ │ Old  │ ← long-lived objects promoted here│
//  │  └──────┘ └──────┘ └──────┘                                   │
//  │  ┌──────────────────────────┐ Humongous regions                │
//  │  │   Humongous Object       │ ← objects > ~50% region size    │
//  │  └──────────────────────────┘                                  │
//  └────────────────────────────────────────────────────────────────┘

// ── Allocation lifecycle ──────────────────────────────────────────────
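import java.util.ArrayList;
import java.util.List;

public class AllocationDemo {

    // These objects follow different allocation paths:

    private final String userId = "user-42";   // illustrative value

    public void shortLived() {
        // Created in Eden, dies before or during next minor GC
        String temp = "Hello " + userId;          // ephemeral
        List<String> batch = new ArrayList<>(100); // ephemeral
        batch.add(temp);
        processBatch(batch);
        // temp and batch become garbage when the method returns,
        // provided processBatch() did not store a reference to them
    }

    private void processBatch(List<String> batch) { /* placeholder: consume the batch */ }

    // Cache is a hypothetical application class
    private static final Cache CACHE = new Cache(1000);  // long-lived
    // CACHE is allocated in Eden like any other object; the static field keeps
    // it reachable, so it survives minor GCs and is eventually promoted

    // ── How objects end up in old gen ─────────────────────────────────
    // 1. They survive enough minor GCs to reach the tenuring threshold
    //    (objects reachable from static fields always will)
    // 2. Survivor space overflows during a minor GC (premature promotion)
    // 3. They are humongous: G1 places them directly in dedicated regions
    //    that are collected with the old generation
}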

// ── Allocation rate and GC pressure ──────────────────────────────────
// HIGH allocation rate → frequent minor GCs → CPU overhead
// If live set is large → frequent major GCs → pause time

// Mitigating high allocation:
// 1. Object pooling for expensive-to-create objects
// 2. Reusing StringBuilder instead of creating new
// 3. Using primitive arrays instead of boxed collections
// 4. StringBuilder.setLength(0) to reset without reallocation
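
As an illustration of mitigations 2 and 4 above, here is a minimal sketch of a formatter that reuses one StringBuilder across calls. The class name LineFormatter and the key=value format are assumptions made for the example. Note that toString() still allocates the resulting String; what is saved is the builder and its backing array.

```java
import java.util.List;

// Hypothetical example class. Not thread-safe: each thread needs its own instance.
final class LineFormatter {
    private final StringBuilder buffer = new StringBuilder(64);

    String format(String key, int value) {
        buffer.setLength(0);                        // reset content, keep the backing array
        buffer.append(key).append('=').append(value);
        return buffer.toString();                   // the result String is still a new object
    }

    public static void main(String[] args) {
        LineFormatter formatter = new LineFormatter();
        for (String key : List.of("a", "b", "c")) {
            System.out.println(formatter.format(key, key.length()));
        }
    }
}
```

Whether this is worth doing depends on measurement: for code that is not on a hot path, the JIT and a fast young collection usually make the plain version cheap enough.
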

Garbage Collection — Triggers, Algorithms, and Collectors

Garbage collection reclaims memory occupied by objects that are no longer reachable from any GC root. GC roots are the starting points of reachability: local variables on thread stacks, static fields, JNI references, and class loaders. An object is live if a chain of strong references leads to it from some GC root. An object is garbage if no such chain exists, even when other unreachable objects still refer to it; this is why a group of objects that reference only each other is collected once nothing outside the group reaches it.

Minor GC (young generation collection) is triggered when Eden fills up. It collects only the young generation using a copying algorithm: live objects are copied to a Survivor space (or promoted to the old generation once they have survived enough collections, or when the Survivor space overflows), and all of Eden is then reclaimed at once. Minor GC is typically fast (milliseconds to tens of milliseconds) because its cost is proportional to the number of survivors, and most young objects are already dead.

Old generation collection is triggered as old-generation occupancy approaches its limit and promotions can no longer be absorbed. The terms are used loosely: a major GC collects the old generation, while a full GC collects the entire heap and is the most expensive kind. The goal of modern collectors (G1, ZGC, Shenandoah) is to do most old-generation work concurrently with application threads, reducing pause times from hundreds of milliseconds to a few milliseconds or less.

Stop-the-world (STW) pauses are the core GC challenge. During an STW pause all application threads are suspended while the GC performs work that cannot be done concurrently. The Serial and Parallel collectors do all of their work in STW pauses, which on large heaps can run to hundreds of milliseconds or more, too long for latency-sensitive applications. G1 works toward a configurable pause goal (200ms by default) and meets it in most cases. ZGC is designed for sub-millisecond pauses even on multi-terabyte heaps, and Shenandoah typically keeps pauses to a few milliseconds or less; for both, pause time is largely independent of heap size.
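
The collections described above are counted by the JVM and exposed through the standard management API. The sketch below assumes only java.lang.management; the class name GcStats is invented for illustration. It prints one line per collector bean. Bean names vary by collector (under G1 they typically include "G1 Young Generation" and "G1 Old Generation"), and the reported time is an approximate accumulated figure, not a per-pause measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration.
final class GcStats {

    /** Sum of collection counts across all collector beans; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** One line per collector bean: name, collection count, accumulated time in milliseconds. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Sampling these counters periodically and taking differences gives a rough GC frequency and overhead figure without enabling GC logging.
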
Java
// ── GC roots — starting points of object reachability ────────────────
//
// GC Root types:
// 1. Local variables on thread stacks (all active stack frames)
// 2. Static fields of loaded classes
// 3. Active JNI references
// 4. Objects referenced by ClassLoaders
// 5. Interned Strings in the String pool
// 6. Synchronisation monitors
//
// An object is LIVE if it is reachable from ANY root through any chain
// An object is GARBAGE if it is unreachable from all roots

// ── Object reachability demonstration ────────────────────────────────
public void reachabilityDemo() {
    Object a = new Object();   // first Object: reachable via local var 'a'
    Object b = new Object();   // second Object: reachable via local var 'b'
    a = b;                     // first Object is now unreachable → GARBAGE
    b = null;                  // second Object is still reachable via 'a' → LIVE
    // GC may collect the first Object now; the second becomes garbage
    // only once 'a' is reassigned or the method returns
}

// ── Reference types affect GC behaviour ──────────────────────────────
import java.lang.ref.*;

// Strong reference — object kept alive as long as any strong ref exists
Object strong = new Object();   // not collected while 'strong' is reachable

// Soft reference — collected when JVM needs memory (good for caches)
SoftReference<byte[]> soft = new SoftReference<>(new byte[1024 * 1024]);
byte[] data = soft.get();   // null if collected, non-null if still alive

// Weak reference: cleared at the next GC once no strong or soft references
// to the referent remain (good for canonicalising maps)
WeakReference<Object> weak = new WeakReference<>(new Object());

// Phantom reference: get() always returns null; the reference is enqueued
// after the referent has been finalised, for post-mortem cleanup
ReferenceQueue<Object> queue = new ReferenceQueue<>();
PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue);

// ── GC log output interpretation ─────────────────────────────────────
// Enable with: -Xlog:gc (Java 9+) or -verbose:gc (older)
//
// [0.234s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause)
//                         23M->8M(128M) 4.532ms
//   ↑         ↑              ↑       ↑      ↑         ↑
//   time  GC type         before  after  heap   pause duration
//
// Interpreting:
// Before: 23MB used, After: 8MB used, Heap: 128MB current capacity (not -Xmx), Pause: 4.5ms
// 15MB of garbage collected in 4.5ms: a healthy young GC

// ── Common GC collectors and their characteristics ────────────────────
// Serial GC (-XX:+UseSerialGC):
//   Single-threaded, stop-the-world. For small heaps (<100MB), embedded.
//
// Parallel GC (-XX:+UseParallelGC):
//   Multi-threaded STW. High throughput, accepts longer pauses. Batch jobs.
//
// G1GC (-XX:+UseG1GC, default Java 9+):
//   Concurrent + incremental. Balanced throughput/latency. Most apps.
//   Pause goal (not a guarantee): -XX:MaxGCPauseMillis, default 200
//
// ZGC (-XX:+UseZGC, Java 15+ production):
//   Mostly concurrent. Sub-millisecond pauses, any heap size.
//   For latency-critical apps with large heaps.
//
// Shenandoah (-XX:+UseShenandoahGC, included in many OpenJDK builds):
//   Mostly concurrent. Pauses typically a few ms or less. Originated at Red Hat.
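
The reference-type comments above mention soft references as a fit for caches. The following is a minimal sketch of that idea, not a production cache: the class name SoftCache and its single get method are assumptions made for illustration. Values are held through SoftReference, so the JVM may clear them under memory pressure, in which case the loader is simply called again.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical memory-sensitive cache. Assumes the loader never returns null.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();   // null if never cached or cleared by GC
        if (value == null) {
            value = loader.apply(key);                // recompute and cache again
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<Integer, byte[]> cache = new SoftCache<>();
        byte[] block = cache.get(1, k -> new byte[1024 * 1024]);
        System.out.println("cached block of " + block.length + " bytes");
    }
}
```

Two limitations are worth noting: cleared entries leave their keys behind in the map (a ReferenceQueue could be used to purge them), and concurrent callers may run the loader twice for the same key. A bounded cache is usually more predictable than relying on soft references.
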

Heap Sizing and Tuning

Heap size is controlled by two JVM flags: -Xms (initial and minimum heap size) and -Xmx (maximum heap size). Setting both to the same value prevents heap resizing at runtime, which avoids the overhead of heap expansion and makes memory consumption predictable. This is the standard recommendation for production deployments where predictable performance matters more than minimal memory use.

The correct heap size is the smallest value at which the application stays within its GC pause-time and throughput budget. Too small a heap causes frequent GC, high GC CPU overhead, and eventually OutOfMemoryError. Too large a heap wastes memory and, with stop-the-world collectors, makes full collections rarer but longer, because there is more live data to scan and copy. Finding the right size requires measurement under realistic workloads.

G1's -XX:MaxGCPauseMillis flag is a pause-time goal (default 200ms), not a hard limit. G1 adjusts the size of the young generation and the number of old regions it collects in each pause to try to stay within the goal. Setting it lower (50ms, 20ms) tells G1 to work in smaller increments, at the cost of throughput. ZGC and Shenandoah largely remove this trade-off for most workloads: their pauses stay very short, largely independent of heap size, in exchange for extra concurrent CPU work.

Visibility into heap usage and GC behaviour comes from GC logging (-Xlog:gc* on Java 9+, -XX:+PrintGCDetails on Java 8), from tools such as JConsole, VisualVM, JFR (Java Flight Recorder), and async-profiler, and from monitoring services such as New Relic and Datadog. GC tuning without measurement is guessing: always profile before tuning and measure the impact after each change.
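
A quick way to confirm which heap limits a JVM actually started with is to ask the Runtime. The sketch below uses only java.lang.Runtime; the class name HeapReport is hypothetical. maxMemory() roughly corresponds to -Xmx, totalMemory() is the heap currently committed (it starts near -Xms), and the difference between totalMemory() and freeMemory() approximates used heap.

```java
// Hypothetical helper class for illustration.
final class HeapReport {

    /** Approximate used heap in bytes: committed minus free. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Human-readable summary of maximum, committed, and used heap in MB. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return String.format("max=%dMB committed=%dMB used=%dMB",
                rt.maxMemory() / mb, rt.totalMemory() / mb, usedBytes() / mb);
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Running it as java -Xms512m -Xmx512m HeapReport should report max and committed values close to 512MB; the exact numbers can differ slightly because collectors reserve part of the heap for their own use.
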
Java
// ── Essential heap flags ──────────────────────────────────────────────
// java -Xms512m -Xmx2g MyApp
//   -Xms512m  → initial heap size 512MB
//   -Xmx2g    → maximum heap size 2GB
//   Recommendation: set Xms = Xmx in production for predictability

// java -Xms4g -Xmx4g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 MyApp
//   Fixed 4GB heap, G1GC, with a 100ms pause-time goal (not a hard maximum)

// ── Container-aware heap sizing (Java 10+) ────────────────────────────
// In containers (Docker/Kubernetes), heap is percentage of container memory:
// java -XX:MaxRAMPercentage=75.0 MyApp
//   Allocate up to 75% of container memory to heap
//   Better than -Xmx in containers where memory limits vary by environment

// ── OutOfMemoryError — types and causes ───────────────────────────────
// 1. "Java heap space"
//    Cause: live set exceeds -Xmx, or memory leak
//    Fix: increase heap, fix leak, or reduce live set
//
// 2. "GC overhead limit exceeded"
//    Cause: JVM spending >98% of time GC-ing and recovering <2% memory
//    Signals: heap is too small or there's a near-leak
//    Fix: increase heap or find what's holding references
//
// 3. "unable to create new native thread"
//    Cause: OS-level thread limit or virtual memory exhaustion
//    Fix: reduce -Xss, reduce thread count, increase OS limits
//
// 4. "Metaspace" (separate — see Metaspace entry)

// ── Heap dump for memory leak analysis ────────────────────────────────
// Generate on OutOfMemoryError (essential for production):
// java -XX:+HeapDumpOnOutOfMemoryError
//      -XX:HeapDumpPath=/dumps/heap.hprof MyApp

// Generate manually via jcmd:
// jcmd <pid> GC.heap_dump /tmp/heap.hprof

// Generate via jmap:
// jmap -dump:format=b,file=/tmp/heap.hprof <pid>

// Analyse with Eclipse Memory Analyzer (MAT) or IntelliJ heap analyser

// ── Memory leak pattern — unintentional retention ─────────────────────
public class CacheWithLeak {

    // LEAK: static map holds references indefinitely
    // Objects added are NEVER removed — heap grows without bound
    private static final Map<Long, byte[]> CACHE = new HashMap<>();

    public void cache(Long id, byte[] data) {
        CACHE.put(id, data);    // grows forever
    }
}

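// Alternative: WeakHashMap drops an entry once its KEY has no other strong refs.
// This suits keys whose lifetime is owned elsewhere. With boxed Long keys it is
// unreliable: entries can vanish at any GC, and cached small values never do.
private static final Map<Long, byte[]> SAFE_CACHE =
    Collections.synchronizedMap(new WeakHashMap<>());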

// Or with bounded cache:
private static final Map<Long, byte[]> BOUNDED = new LinkedHashMap<>() {
    @Override protected boolean removeEldestEntry(Map.Entry<Long, byte[]> e) {
        return size() > 10_000;   // evict when size exceeds limit
    }
};
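
The bounded-cache fix above can be extended into a small least-recently-used (LRU) cache. This is a minimal sketch under stated assumptions: the class name BoundedCaches and the factory method lru are invented for the example, and thread safety comes from a simple synchronized wrapper, not from a concurrent data structure.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical factory for a size-bounded LRU map.
final class BoundedCaches {

    static <K, V> Map<K, V> lru(int maxEntries) {
        // accessOrder = true: the eldest entry is the least recently accessed one
        Map<K, V> map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;   // evict once the bound is exceeded
            }
        };
        return Collections.synchronizedMap(map);
    }

    public static void main(String[] args) {
        Map<Long, String> cache = lru(2);
        cache.put(1L, "one");
        cache.put(2L, "two");
        cache.get(1L);              // touch 1, so 2 becomes the eldest
        cache.put(3L, "three");     // exceeds the bound and evicts 2
        System.out.println(cache.keySet());   // prints [1, 3]
    }
}
```

Because the map can never hold more than maxEntries values, its contribution to the live set is bounded, which is the property the leaking static HashMap lacks.
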

Object Allocation Internals and TLAB

In HotSpot, heap allocation uses Thread-Local Allocation Buffers (TLABs). Each thread is assigned its own private chunk of Eden space, often a few hundred KB, though HotSpot resizes TLABs adaptively. When a thread allocates an object, it simply advances a pointer within its TLAB: a plain bump-pointer operation that needs no locking, no compare-and-swap, and no coordination with other threads, because the buffer is private to the thread. Only when a thread's TLAB is exhausted does it need to go back to the shared Eden space for a new one, which requires synchronisation but is infrequent.

This TLAB design makes object allocation in HotSpot extremely fast, approaching the cost of stack allocation. The common intuition that "heap allocation is expensive" is based on older allocators; on the TLAB fast path, allocating a simple object takes a handful of machine instructions, commonly estimated at around 10-20 nanoseconds in a warm application.

Objects are initialised to their default values (zeros and nulls) as part of allocation. HotSpot ensures this by zeroing the memory itself, either when the TLAB is handed out or as each object is allocated. This is the JVM's guarantee that fields have predictable initial values. The constructor then runs on top of this zero-initialised memory to set the actual initial values.

Object size in memory follows alignment rules: by default sizes are rounded up to a multiple of 8 bytes. Object headers in HotSpot are 12 bytes with compressed class pointers (the default) or 16 bytes without; compressed object references (compressed oops) are likewise the default for heaps under about 32GB. The JVM reorders fields to minimise padding, roughly grouping them by size. Knowing object sizes is important for understanding GC overhead and for designing memory-efficient data structures.
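
The bump-pointer idea can be modelled in a few lines. The following is a toy simulation, not HotSpot code: the class BumpAllocator, its 8-byte alignment constant, and the -1 return value for exhaustion are all assumptions made for illustration. It hands out offsets within a fixed-size buffer the way a thread hands out space within its TLAB.

```java
// Toy model of a thread-private allocation buffer. Offsets stand in for addresses.
final class BumpAllocator {
    private static final int ALIGNMENT = 8;   // assumed object alignment in bytes
    private final int capacity;
    private int top;                          // offset of the next free byte

    BumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new block, or -1 if the buffer is exhausted. */
    int allocate(int size) {
        int aligned = (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1);   // round up to alignment
        if (aligned > capacity - top) {
            return -1;        // a real JVM would take the slow path and request a new TLAB
        }
        int start = top;
        top += aligned;       // the "bump": no locks or atomics, the buffer is thread-private
        return start;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpAllocator tlab = new BumpAllocator(64);
        System.out.println(tlab.allocate(12));   // 0
        System.out.println(tlab.allocate(20));   // 16
        System.out.println(tlab.allocate(32));   // -1, only 24 bytes remain
    }
}
```

The fast path is a comparison and an addition, which is why allocation cost is dominated by initialisation and later GC work, not by finding free space.
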
Java
// ── TLAB allocation — per-thread bump pointer ────────────────────────
//
// Thread 1's TLAB:
// ┌──────────────────────────────────────────────────────┐
// │ [Object A][Object B][Object C][free space .........] │
// │            ↑                   ↑                     │
// │           used                ptr  (bump pointer)    │
// └──────────────────────────────────────────────────────┘
//
// Allocating new Object D:
// ptr = ptr + sizeof(D)   ← plain pointer bump; TLAB is thread-private, so no atomics or locks
// initialise D at old ptr location
//
// Thread 2 has its OWN TLAB — no synchronisation between threads

// ── Object memory layout in HotSpot ──────────────────────────────────
// With compressed class pointers and compressed oops (both default below ~32GB heap):
//
// Object header: 12 bytes
//   Mark word:   8 bytes  (hash code, lock state, GC age)
//   Klass ptr:   4 bytes  (compressed pointer to class)
//
// Example object sizes:
// new Object()          = 16 bytes (12 header + 4 padding)
// new Integer(42)       = 16 bytes (12 header + 4 int field)
// new Long(42L)         = 24 bytes (12 header + 8 long field + 4 padding)
// new int[10]           = 56 bytes (12 header + 4 length + 40 data)
// new Object[10]        = 56 bytes (12 header + 4 length + 40 refs)

// ── Measuring object sizes ────────────────────────────────────────────
// Java agent approach (java.lang.instrument.Instrumentation, obtained in premain):
// Instrumentation instrumentation;
// long size = instrumentation.getObjectSize(object);

// JOL (Java Object Layout, org.openjdk.jol) library:
// System.out.println(ClassLayout.parseInstance(Integer.valueOf(42)).toPrintable());
// Output (abridged; exact format varies by JOL version):
// java.lang.Integer object internals:
//  OFFSET  SIZE   TYPE DESCRIPTION
//       0     8        (object header: mark)
//       8     4        (object header: class)
//      12     4    int Integer.value
// Instance size: 16 bytes

// ── Allocation rate impact on GC ─────────────────────────────────────
// Allocation rate = how fast threads are filling Eden
// Eden size ÷ allocation rate = time between minor GCs
//
// Example:
// Eden = 512MB, allocation rate = 1GB/second
// → minor GC every 0.5 seconds — very frequent!
//
// Reducing allocation rate:
// 1. Object pooling (reusable buffers, thread-local pools)
// 2. Avoid boxing in hot paths (IntStream vs Stream<Integer>)
// 3. Reuse StringBuilder, avoid string concatenation in loops
// 4. Off-heap memory for large datasets (ByteBuffer, MemorySegment)
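
The size and interval figures in the comments above can be reproduced with simple arithmetic. This is a back-of-the-envelope sketch under the assumptions already stated (12-byte object header, 16-byte array header including the length field, 8-byte alignment); the class name HeapMath is invented. It ignores alignment gaps between fields, so a tool such as JOL remains the authority for real layouts.

```java
// Hypothetical estimator for shallow object sizes and minor GC frequency.
final class HeapMath {
    static final int HEADER_BYTES = 12;         // assumes compressed class pointers
    static final int ARRAY_HEADER_BYTES = 16;   // object header plus 4-byte length
    static final int ALIGNMENT = 8;

    /** Rounds a size up to the next multiple of the object alignment. */
    static long alignUp(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Shallow size of an instance whose fields total fieldBytes. */
    static long instanceSize(long fieldBytes) {
        return alignUp(HEADER_BYTES + fieldBytes);
    }

    /** Shallow size of an array with the given length and element width. */
    static long arraySize(int length, int elementBytes) {
        return alignUp(ARRAY_HEADER_BYTES + (long) length * elementBytes);
    }

    /** Eden size divided by allocation rate: expected time between minor GCs. */
    static double secondsBetweenMinorGcs(long edenBytes, long allocatedBytesPerSecond) {
        return (double) edenBytes / allocatedBytesPerSecond;
    }

    public static void main(String[] args) {
        System.out.println(instanceSize(0));     // 16, matches new Object()
        System.out.println(instanceSize(8));     // 24, matches a boxed Long
        System.out.println(arraySize(10, 4));    // 56, matches new int[10]
        System.out.println(secondsBetweenMinorGcs(512L << 20, 1L << 30));   // 0.5
    }
}
```

Doubling Eden or halving the allocation rate doubles the interval between minor GCs, which is the lever behind most of the allocation-reduction advice above.
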

Related Topics in Java Memory Management

Stack Memory
Stack memory is the region of memory where the JVM stores method invocation frames, local variables, and partial results. Every thread has its own private stack created at thread creation, and the stack grows and shrinks as methods are called and return. Stack memory operates on a last-in, first-out discipline — the frame for the most recently called method sits on top, and when that method returns its frame is immediately discarded. Understanding stack memory explains why local variables are thread-safe by default, why recursive algorithms can cause StackOverflowError, why primitive values behave differently from objects, and what the JVM does at every method call and return. This entry covers stack frame structure, the stack pointer, local variable storage, operand stacks, frame lifecycle, thread isolation, and the performance characteristics that make stack allocation extremely fast.
Metaspace
Metaspace is the JVM memory region that stores class metadata — the internal representations of loaded classes, methods, fields, constant pools, and annotations. It replaced PermGen (Permanent Generation) in Java 8. Unlike PermGen which was a fixed-size heap region, Metaspace is allocated from native memory (outside the Java heap) and grows dynamically up to an optional maximum. Understanding Metaspace means understanding what class metadata contains, what causes Metaspace to grow, how class unloading reclaims Metaspace, what OutOfMemoryError from Metaspace looks like, how to monitor and limit it, and what the practical implications are for application servers, OSGi containers, and frameworks that generate classes dynamically.
JVM Memory Model
The JVM Memory Model encompasses two related but distinct concepts. The first is the runtime memory architecture — how the JVM partitions memory into regions (stack, heap, Metaspace, code cache, native memory) and what each region stores. The second, more precise meaning is the Java Memory Model (JMM) defined in the Java Language Specification — the formal set of rules that govern how multithreaded programs observe memory: when writes made by one thread become visible to other threads, what ordering guarantees exist, and how volatile, synchronized, and final provide those guarantees. This entry covers both: the complete runtime memory architecture with all regions explained, and the Java Memory Model's happens-before relationship, visibility rules, and the practical consequences for concurrent code.
Garbage Collection
Garbage collection is the automatic process by which the JVM reclaims memory occupied by objects that are no longer reachable from the running program. It is one of Java's most defining features — eliminating the manual memory management required in C and C++ and the entire class of bugs that comes with it: use-after-free, double-free, and memory leaks caused by forgotten deallocation. Understanding garbage collection means understanding reachability, GC roots, the collection process, what the GC guarantees and what it does not, how to work with it rather than against it, and how to diagnose and resolve GC-related performance problems. This entry covers the reachability model, GC triggers, what GC does not collect, finalization, the GC performance trade-off triangle, and practical guidance for writing GC-friendly code.