JVM Memory Model

The JVM Memory Model encompasses two related but distinct concepts. The first is the runtime memory architecture — how the JVM partitions memory into regions (stack, heap, Metaspace, code cache, native memory) and what each region stores. The second, more precise meaning is the Java Memory Model (JMM) defined in the Java Language Specification — the formal set of rules that govern how multithreaded programs observe memory: when writes made by one thread become visible to other threads, what ordering guarantees exist, and how volatile, synchronized, and final provide those guarantees. This entry covers both: the complete runtime memory architecture with all regions explained, and the Java Memory Model's happens-before relationship, visibility rules, and the practical consequences for concurrent code.

JVM Runtime Memory Architecture — All Regions

The JVM partitions its memory use into several distinct regions, each serving a different purpose. Understanding all of them is necessary for diagnosing memory-related problems and for sizing JVM deployments accurately.

The heap is usually the largest region and is where Java objects and arrays live. Its size is bounded by -Xms (initial) and -Xmx (maximum). All threads share the heap, and the garbage collector manages it. This is the region most developers think of as "JVM memory."

Metaspace stores class metadata. It is separate from the heap and is allocated from native memory. -XX:MaxMetaspaceSize limits it; without a limit it can grow until native memory is exhausted.

The stack is per-thread. Each thread has a private stack, with a maximum size set by -Xss, that holds the thread's method call frames. The stack is not garbage collected: frames are pushed and popped automatically as methods are called and return.

The code cache stores JIT-compiled native code. As the JIT compiler compiles frequently executed methods, the resulting native code is stored here. Its size is limited by -XX:ReservedCodeCacheSize (240MB by default with tiered compilation). When the code cache fills up, the JIT compiler stops compiling new methods, so newly hot code keeps running in the interpreter and performance degrades.

Direct memory (off-heap) is allocated explicitly with ByteBuffer.allocateDirect() or through java.nio channels. It resides outside the heap and is not reclaimed directly by the collector: the native memory is released only after the owning buffer object becomes unreachable and is cleaned up. -XX:MaxDirectMemorySize limits it. Netty, Cassandra drivers, and Kafka use direct memory heavily for zero-copy I/O.

Native memory covers everything else: JVM internals, thread state, JNI allocations, and garbage collector data structures. It has no single limiting flag and grows with the number of threads and loaded classes.
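
To make the heap versus direct-memory distinction concrete, the sketch below allocates one buffer of each kind and reads the JVM's own accounting for direct buffers through BufferPoolMXBean. The class name DirectMemoryDemo and the 1 MiB size are illustrative choices, and the pool name "direct" is what HotSpot reports; other JVMs may expose it differently.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Illustrative sketch: class and method names are hypothetical.
class DirectMemoryDemo {

    /** Bytes currently held by direct buffers, as reported by the "direct" buffer pool. */
    static long directBytesInUse() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1; // pool not exposed under this name by the running JVM
    }

    public static void main(String[] args) {
        int size = 1024 * 1024; // 1 MiB, arbitrary

        ByteBuffer onHeap  = ByteBuffer.allocate(size);        // backed by a byte[] on the Java heap
        ByteBuffer offHeap = ByteBuffer.allocateDirect(size);  // backed by native memory

        System.out.println("onHeap.isDirect()  = " + onHeap.isDirect());
        System.out.println("offHeap.isDirect() = " + offHeap.isDirect());
        System.out.println("direct pool bytes  = " + directBytesInUse());
    }
}
```

The heap buffer counts against -Xmx, while the direct buffer counts against -XX:MaxDirectMemorySize and shows up only in the buffer pool figures.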
Java
// ── Complete JVM memory picture ──────────────────────────────────────
//
// ┌────────────────────────────────────────────────────────────────────┐
// │                   JVM PROCESS MEMORY                               │
// │                                                                    │
// │  ┌─────────────────────────────────────────────────────────────┐  │
// │  │                    JAVA HEAP                                 │  │
// │  │  Eden | Survivor | Old Gen | Humongous (G1GC regions)        │  │
// │  │  Controlled by -Xms and -Xmx                                 │  │
// │  └─────────────────────────────────────────────────────────────┘  │
// │                                                                    │
// │  ┌──────────────────────────┐  ┌──────────────────────────────┐  │
// │  │       METASPACE          │  │        CODE CACHE            │  │
// │  │  Class metadata          │  │  JIT-compiled native code    │  │
// │  │  Native memory           │  │  -XX:ReservedCodeCacheSize   │  │
// │  │  -XX:MaxMetaspaceSize    │  └──────────────────────────────┘  │
// │  └──────────────────────────┘                                     │
// │                                                                    │
// │  ┌──────────────────────────┐  ┌──────────────────────────────┐  │
// │  │   THREAD STACKS (×N)     │  │      DIRECT MEMORY           │  │
// │  │  Per-thread stack        │  │  ByteBuffer.allocateDirect() │  │
// │  │  -Xss per thread         │  │  -XX:MaxDirectMemorySize     │  │
// │  └──────────────────────────┘  └──────────────────────────────┘  │
// │                                                                    │
// │  ┌──────────────────────────────────────────────────────────────┐ │
// │  │              NATIVE MEMORY (JVM internals)                   │ │
// │  │  GC metadata, JVM structures, JNI, symbols, etc.             │ │
// │  └──────────────────────────────────────────────────────────────┘ │
// └────────────────────────────────────────────────────────────────────┘

// ── Total process memory = sum of all regions ─────────────────────────
// Common mistake: setting -Xmx to available RAM leaves no room for other regions
// Typical non-heap memory (for a medium Spring Boot app):
// Metaspace:    150-300MB
// Code Cache:   50-150MB
// Thread stacks: threads × Xss (100 threads × 512KB = 50MB)
// Direct memory: depends on NIO usage
// GC overhead:  ~15-30% of heap size
//
// Rule of thumb: containerMemory - Xmx should be >= 500MB for non-heap

// ── Monitoring all memory regions ─────────────────────────────────────
// jcmd <pid> VM.native_memory summary   (breakdown of JVM-tracked native memory)
// Requires starting the JVM with -XX:NativeMemoryTracking=summary (or detail)
// Output:
// Native Memory Tracking:
// Total: reserved=4183MB, committed=1234MB
//   Java Heap (reserved=2048MB, committed=512MB)
//   Class (reserved=1085MB, committed=42MB)    ← Metaspace
//   Thread (reserved=166MB, committed=166MB)   ← Stack for all threads
//   Code (reserved=240MB, committed=12MB)      ← Code Cache
//   GC (reserved=195MB, committed=195MB)       ← GC data structures
//   Symbol (reserved=17MB, committed=17MB)     ← Symbol table (class, method, field names)
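
Some of the same regions can also be inspected from inside a running program through the standard java.lang.management API. The following is a minimal sketch; the class name MemoryRegionsReport and the mb helper are illustrative, and the pool names printed depend on the JVM and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative sketch: class and helper names are hypothetical.
class MemoryRegionsReport {

    /** Formats a byte count as whole megabytes; negative values mean "undefined". */
    static String mb(long bytes) {
        return bytes < 0 ? "n/a" : (bytes / (1024 * 1024)) + "MB";
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        System.out.println("Heap:     used=" + mb(heap.getUsed())
                + " committed=" + mb(heap.getCommitted()) + " max=" + mb(heap.getMax()));
        System.out.println("Non-heap: used=" + mb(nonHeap.getUsed())
                + " committed=" + mb(nonHeap.getCommitted()));

        // Individual pools: heap generations/regions, Metaspace, code cache segments.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            System.out.printf("  %-36s %-8s used=%s%n", pool.getName(), kind, mb(usage.getUsed()));
        }
    }
}
```

These beans report the heap, Metaspace, and code cache, but not thread stacks or the collector's own native structures, so Native Memory Tracking remains the more complete view.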

The Java Memory Model — Visibility and Happens-Before

The Java Memory Model (JMM) addresses a fundamental problem in multiprocessor systems: when does a write by one thread become visible to reads by other threads? On modern hardware, processors have multi-level caches, store buffers, and instruction reordering optimisations, and the JIT compiler may additionally keep values in registers or reorder instructions. Without explicit constraints, a value written by Thread 1 may not become visible to Thread 2 for an indeterminate time: Thread 2 may never see the write, or may see it after an unpredictable delay.

The JMM defines visibility through the happens-before relationship. A happens-before edge between two operations A and B means that the memory effects of A are guaranteed to be visible to B. If A happens-before B, then every write that was visible to A's thread when A executed is also visible to B's thread when B executes. The JMM specifies the complete set of rules that establish happens-before edges.

The key happens-before rules are: within a single thread, each action happens-before every subsequent action in program order; an unlock of a monitor happens-before every subsequent lock of the same monitor; a write to a volatile field happens-before every subsequent read of the same field; a call to Thread.start() happens-before any action in the started thread; and all actions in a thread happen-before any other thread successfully returns from join() on that thread. Happens-before is also transitive: if A happens-before B and B happens-before C, then A happens-before C.

The practical consequence: if two threads share data and no happens-before relationship connects the write to the read, the reading thread may see stale data, a partially initialised object, or writes in an order that defies the writing thread's program order. This is not a bug in the JVM. It is the intentional design of the JMM, which leaves the JVM and hardware free to optimise aggressively wherever no synchronisation is involved, while giving precise guarantees to programs that synchronise correctly.
Java
// ── Visibility problem without synchronisation ───────────────────────
public class VisibilityProblem {

    private boolean stopRequested = false;   // non-volatile shared variable

    public void start() throws InterruptedException {   // Thread.sleep() is checked
        Thread worker = new Thread(() -> {
            while (!stopRequested) {   // may NEVER see stopRequested = true!
                doWork();
            }
        });
        worker.start();

        Thread.sleep(1000);
        stopRequested = true;           // main thread writes
        // No happens-before between this write and the worker's read
        // Worker may loop forever — JIT may hoist the read out of the loop
    }
}

// ── Fix 1: volatile establishes happens-before ───────────────────────
public class VisibilityFixed {

    private volatile boolean stopRequested = false;
    //       ↑ volatile write happens-before subsequent volatile read

    public void start() throws InterruptedException {   // Thread.sleep() is checked
        Thread worker = new Thread(() -> {
            while (!stopRequested) {   // guaranteed to see the write
                doWork();
            }
        });
        worker.start();

        Thread.sleep(1000);
        stopRequested = true;   // volatile write — establishes happens-before
    }
}

// ── Fix 2: synchronized establishes happens-before ────────────────────
public class SynchronizedFixed {

    private boolean stopRequested = false;   // protected by 'this' monitor

    public synchronized void requestStop() {
        stopRequested = true;               // monitor unlock happens-before
    }

    public synchronized boolean isStopped() {
        return stopRequested;               // monitor lock happens-after
    }
}

// ── The happens-before rules in code ──────────────────────────────────
// Rule 1: Program order within a thread
int x = 1;   // hb
int y = 2;   // hb (y write guaranteed to see x write within same thread)

// Rule 2: Monitor unlock → lock
synchronized (lock) { sharedData = 42; }     // unlock hb
synchronized (lock) { System.out.println(sharedData); }  // subsequent lock

// Rule 3: Volatile write → read
volatile int counter = 0;   // declared as a field; locals cannot be volatile
counter = 1;                // volatile write hb
int v = counter;            // subsequent volatile read: guaranteed to see 1

// Rule 4: Thread.start() → all actions in started thread
sharedResource = "initialised";
Thread t = new Thread(() -> {
    System.out.println(sharedResource);  // guaranteed to see "initialised"
});
t.start();    // start() hb all actions in t

// Rule 5: All thread actions hb join() return
Thread worker = new Thread(() -> { result = computeResult(); });
worker.start();
worker.join();
System.out.println(result);  // guaranteed to see result computed by worker
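
The rules combine: program order plus the volatile rule lets an ordinary field "piggyback" on a volatile flag. The following runnable sketch shows this under the assumption of one writer and one reader; the class PiggybackVisibility and its method names are illustrative and do not come from any library.

```java
// Illustrative sketch: class and method names are hypothetical.
class PiggybackVisibility {

    private int payload;              // plain field, deliberately not volatile
    private volatile boolean ready;   // volatile flag that publishes payload

    /** Writer: the plain write precedes the volatile write in program order. */
    void publish(int value) {
        payload = value;   // (1) ordinary write
        ready = true;      // (2) volatile write
    }

    /** Reader: spins until it sees the volatile write, then reads the plain field. */
    int awaitPayload() {
        while (!ready) {          // (3) volatile read
            Thread.onSpinWait();  // spin hint only; not needed for correctness
        }
        return payload;           // (4) sees the value written at (1)
    }

    /** Runs one writer and one reader and returns what the reader observed. */
    static int runOnce(int value) {
        PiggybackVisibility shared = new PiggybackVisibility();
        int[] observed = new int[1];

        Thread reader = new Thread(() -> observed[0] = shared.awaitPayload());
        Thread writer = new Thread(() -> shared.publish(value));
        reader.start();
        writer.start();
        try {
            reader.join();   // join() makes the reader's write to observed[0] visible here
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println("reader observed " + runOnce(42));
    }
}
```

The chain is (1) before (2) by program order, (2) before (3) by the volatile rule, and (3) before (4) by program order, so (1) happens-before (4). If ready were not volatile, the reader could spin forever or return 0.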

volatile, synchronized, and final — Memory Semantics

The three primary tools for establishing happens-before in Java each have distinct memory semantics beyond their obvious synchronisation effects.

volatile provides two guarantees: visibility (a read of a volatile field sees the most recent write to that field by any thread) and ordering (accesses to volatile fields are not reordered with each other, ordinary reads and writes that precede a volatile write cannot be moved after it, and ordinary reads and writes that follow a volatile read cannot be moved before it). volatile is appropriate when a variable is written independently of its current value, typically by one thread while others read it. It is not appropriate for compound operations such as check-then-act or count++, which require read-modify-write atomicity.

synchronized provides mutual exclusion (only one thread at a time can hold a given monitor) and happens-before (everything done before an unlock is visible to everything done after a subsequent lock of the same monitor). synchronized can protect compound actions safely, but contention on a lock serialises all competing threads, which can severely reduce throughput in high-concurrency scenarios.

final has a special JMM guarantee: once an object's constructor has returned, every thread that obtains a reference to the object sees the correct values of its final fields, even without explicit synchronisation and even if the reference was published through a data race (a non-volatile, non-synchronised field). The guarantee holds only if the this reference does not escape during construction. It is why immutable objects remain safe when shared via unsynchronised publication: the final-field guarantee covers their contents. Objects whose fields are not final are not safely published this way, even if those fields are never modified after construction.
Java
// ── volatile — visibility + ordering, no atomicity ───────────────────
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileCounter {

    private volatile int count = 0;

    // WRONG — count++ is read-modify-write (three operations), not atomic:
    public void increment() {
        count++;   // Not thread-safe even with volatile!
        // Thread 1: read count (5), increment (6), write (6)
        // Thread 2: read count (5), increment (6), write (6)  ← lost update!
    }

    // Use AtomicInteger for atomic increment:
    private final AtomicInteger atomicCount = new AtomicInteger(0);
    public void safeIncrement() { atomicCount.incrementAndGet(); }

    // volatile IS appropriate for status flags (single writer, many readers):
    private volatile boolean running = false;
    public void start()  { running = true;  }
    public void stop()   { running = false; }
    public boolean isRunning() { return running; }
}

// ── synchronized — mutual exclusion + happens-before ──────────────────
public class SafeCounter {

    private int count = 0;

    public synchronized void increment() {
        count++;   // read-modify-write safely inside synchronized
    }

    public synchronized int getCount() {
        return count;   // always returns current value
    }
}

// ── final — safe publication without synchronisation ──────────────────
// UNSAFE publication (without final or synchronisation):
public class UnsafePublication {
    public int    x;
    public int    y;
    public UnsafePublication(int x, int y) { this.x = x; this.y = y; }
}
// Another thread receiving reference to UnsafePublication may see x=0 or y=0
// even if constructor ran completely — without happens-before, no guarantee

// SAFE publication via final fields:
public class SafePublication {
    public final int x;
    public final int y;
    public SafePublication(int x, int y) { this.x = x; this.y = y; }
}
// JMM final-field guarantee: any thread seeing a reference to SafePublication
// is guaranteed to see x and y correctly, provided 'this' did not escape the
// constructor; no synchronisation is needed for reading

// SAFE publication via volatile field holding the reference:
private volatile SafePublication pub;
// Thread 1: pub = new SafePublication(3, 4);  volatile write hb
// Thread 2: SafePublication p = pub;           volatile read → sees x=3, y=4

// ── Double-checked locking — correct implementation ───────────────────
public class Singleton {

    private static volatile Singleton instance;   // volatile required!

    public static Singleton getInstance() {
        if (instance == null) {                    // first check (no lock)
            synchronized (Singleton.class) {
                if (instance == null) {            // second check (with lock)
                    instance = new Singleton();    // volatile write → visible to all
                }
            }
        }
        return instance;
    }

    // Without volatile: the partially constructed Singleton may be visible
    // (reference published before object fully initialised)
    // With volatile: construction hb the volatile write, which hb any read
    // of 'instance' that observes the new reference
}
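
The difference between visibility and atomicity can be observed directly. The sketch below, with the hypothetical class name LostUpdateDemo and arbitrary thread and iteration counts, increments a volatile int and an AtomicInteger from several threads and compares the totals.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: class name and counts are hypothetical.
class LostUpdateDemo {

    static volatile int volatileCount;                            // visible, but ++ is not atomic
    static final AtomicInteger atomicCount = new AtomicInteger();

    /** Starts `threads` workers that each increment both counters `perThread` times. */
    static void run(int threads, int perThread) {
        volatileCount = 0;
        atomicCount.set(0);

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    volatileCount++;                 // read, add, write: updates can be lost
                    atomicCount.incrementAndGet();   // one atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();   // join() makes each worker's writes visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("expected = " + (4 * 100_000));
        System.out.println("atomic   = " + atomicCount.get());
        System.out.println("volatile = " + volatileCount);
    }
}
```

The atomic total always equals the expected count. The volatile total can never exceed it and, on a multi-core machine, usually falls short by a varying amount from run to run.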

Memory Model Practical Guide — Common Scenarios

The Java Memory Model has several practical implications that every Java developer writing concurrent code must understand. The most important is that Java offers no visibility guarantee for ordinary (non-volatile, non-synchronised) accesses to shared variables across threads. The JIT compiler, the CPU, and the memory subsystem are free to reorder, cache, and defer writes as they see fit, within the happens-before rules. Code that appears to work in development may break under heavy load, on different hardware, or with different JVM flags, because these optimisations are applied more aggressively as the JVM warms up.

The safest approach for shared mutable state is to use the java.util.concurrent package, which provides thread-safe data structures (ConcurrentHashMap, CopyOnWriteArrayList), atomic variables (AtomicInteger, AtomicReference), and higher-level synchronisers (CountDownLatch, CyclicBarrier, Semaphore, Phaser). These classes are well tested and often more efficient than hand-written synchronisation because they use lock-free algorithms where possible.

Immutability, meaning shared objects whose fields are all final and which are published via final fields or volatile references, is the simplest concurrent programming strategy and is often sufficient. Immutable objects need no synchronisation for reads because their state cannot change, so they can be shared freely across threads. The only synchronisation needed is for publishing the reference to the immutable object.
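
As one example of a java.util.concurrent synchroniser providing visibility, the sketch below hands a result from a worker to its caller through a CountDownLatch. The CountDownLatch documentation states that actions before countDown() happen-before actions following a successful return from await(). The class LatchHandOff and the method sumTo are illustrative names.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative sketch: class and method names are hypothetical.
// Each instance is single-use because a CountDownLatch cannot be reset.
class LatchHandOff {

    private final CountDownLatch done = new CountDownLatch(1);
    private long sum;   // plain field: visibility comes from the latch, not from volatile

    /** Computes 1 + 2 + ... + n on a worker thread and returns the result to the caller. */
    long sumTo(int n) {
        Thread worker = new Thread(() -> {
            long total = 0;
            for (int i = 1; i <= n; i++) {
                total += i;
            }
            sum = total;        // ordinary write...
            done.countDown();   // ...published by countDown()
        });
        worker.start();

        try {
            done.await();       // returns only after countDown() has been called
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sum;             // guaranteed to see the worker's write
    }

    public static void main(String[] args) {
        System.out.println(new LatchHandOff().sumTo(100));   // prints 5050
    }
}
```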
Java
// ── Scenario 1: One thread initialises, many threads read ────────────
// SAFE: join() establishes happens-before between the init and the read
static Config config;   // non-volatile
Thread initThread = new Thread(() -> config = Config.load());
initThread.start();
initThread.join();
// All actions in initThread happen-before join() returns,
// so the write to config is visible here
System.out.println(config);  // safe: join hb this read

// WRONG: no happens-before between the write and reads in other threads
executor.submit(() -> config = Config.load());
// Threads that read config without join(), Future.get(), volatile, or a lock
// have no visibility guarantee

// ── Scenario 2: Lazy initialisation (thread-safe) ─────────────────────
// Initialisation-on-demand holder idiom:
public class Registry {
    private Registry() { /* expensive init */ }

    private static class Holder {
        // Static initialisers run under the class initialisation lock (JLS 12.4.2),
        // so INSTANCE is created exactly once and is safely published
        static final Registry INSTANCE = new Registry();
    }

    public static Registry getInstance() {
        return Holder.INSTANCE;
        // First access initialises Holder, which is thread-safe by specification
        // No explicit synchronisation needed
    }
}

// ── Scenario 3: Producer-consumer with BlockingQueue ──────────────────
// BlockingQueue provides happens-before: put() hb take()
// Anything done before put() is visible after take()
BlockingQueue<Work> queue = new LinkedBlockingQueue<>();

// Producer thread:
Work work = new Work(data);
queue.put(work);       // happens-before any subsequent take()

// Consumer thread:
Work taken = queue.take();
taken.process();       // guaranteed to see work fully initialised

// ── Scenario 4: AtomicReference for lock-free update ─────────────────
AtomicReference<Config> configRef =
    new AtomicReference<>(Config.load());

// Any thread can update config atomically:
Config newConfig = Config.load();
configRef.set(newConfig);          // atomic + happens-before for readers

// Any thread reads the current config:
Config current = configRef.get();  // reads latest value

// Compare-and-swap for conditional update:
Config expected = configRef.get();
Config updated  = expected.withTimeout(30);
boolean swapped = configRef.compareAndSet(expected, updated);
// swapped = true if expected was still current — atomic

// ── Summary: when to use what ─────────────────────────────────────────
// volatile:       single writer, multiple readers, no compound operations
// synchronized:   compound operations, critical sections, notification (wait/notify)
// AtomicXxx:      single-variable compound operations (CAS, increment)
// java.util.concurrent: collections, higher-level coordination
// Immutability:   shared objects that do not change — safest approach
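
Scenario 4 and the immutability advice can be combined into one runnable sketch: an immutable value class whose fields are all final, held in an AtomicReference and replaced rather than mutated. Settings and SettingsHolder are hypothetical stand-ins for the Config type used in the snippets above.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch: both classes are hypothetical.
class SettingsHolder {

    private final AtomicReference<Settings> current;

    SettingsHolder(Settings initial) {
        current = new AtomicReference<>(initial);
    }

    /** Readers always get a complete, immutable snapshot. */
    Settings get() {
        return current.get();
    }

    /** Replaces the snapshot atomically; updateAndGet retries if another thread wins the race. */
    Settings changeTimeout(int seconds) {
        return current.updateAndGet(s -> s.withTimeout(seconds));
    }

    public static void main(String[] args) {
        SettingsHolder holder = new SettingsHolder(new Settings("example.internal", 10));
        holder.changeTimeout(30);
        System.out.println(holder.get().host() + " timeout=" + holder.get().timeoutSeconds());
    }
}

/** Immutable: all fields are final and there are no setters. */
final class Settings {

    private final String host;
    private final int timeoutSeconds;

    Settings(String host, int timeoutSeconds) {
        this.host = host;
        this.timeoutSeconds = timeoutSeconds;
    }

    String host() { return host; }

    int timeoutSeconds() { return timeoutSeconds; }

    /** Returns a modified copy instead of changing this instance. */
    Settings withTimeout(int seconds) {
        return new Settings(host, seconds);
    }
}
```

Because every update creates a new object, a reader holding an older snapshot keeps a consistent view and never observes a half-applied change.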
