volatile

The volatile keyword in Java is a field modifier that provides two guarantees: visibility and ordering. A write to a volatile field is guaranteed to be visible to every thread that subsequently reads that field: the JVM may not keep the value in a register or reuse a previously read value, and it emits whatever memory barriers the hardware needs so that the read observes the latest write. The common description "the write is flushed to main memory and the read fetches from main memory" is a useful mental model rather than a literal account of the hardware. A volatile write also establishes a happens-before relationship with every subsequent volatile read of the same variable, which means that all writes performed by a thread before a volatile write are visible to any thread that later reads that volatile variable and observes the write.

volatile does not provide atomicity for compound operations. A single read or a single write of a volatile field is atomic, including for long and double, but a read-modify-write such as counter++ on a volatile int is not atomic and is not thread-safe. Understanding precisely what volatile guarantees and what it does not is the foundation for using it correctly as a lightweight synchronization mechanism for specific, narrow patterns: flag variables, one-time publication, double-checked locking, and status fields that are written by one thread and read by many.

This entry covers the CPU cache model that makes volatile necessary, the exact visibility and ordering guarantees, the happens-before chain volatile creates, the atomicity limitation and where it leaves gaps, correct volatile patterns, incorrect volatile anti-patterns, and the relationship between volatile and the Java Memory Model.

Why volatile Exists — CPU Caches, Registers, and the Visibility Problem

Modern CPUs do not read and write directly to main memory on every memory access. Each core has private caches (typically L1 and L2, with an L3 that is usually shared between cores), a store buffer that holds writes before they reach the cache, and registers that may hold values between reads and writes. When a thread writes a field, the new value may initially exist only in a register or in that core's store buffer, and threads running on other cores may continue to read the old value. This is not a bug; it is a performance optimization that makes CPUs far faster by avoiding the latency of main memory access. But it means that without explicit coordination, a value written by thread A on core 1 may be seen by thread B on core 2 only after an arbitrarily long delay, or never.

The JIT compiler adds another source of trouble. It is permitted to hoist loop-invariant reads out of loops, keeping a field's value in a register across many iterations without re-reading it from memory. It may reorder independent memory operations for pipeline efficiency. It may eliminate reads that it proves (within a single thread's view) are redundant. All of these optimizations are correct for single-threaded code and dramatically improve performance. But they break multi-threaded code that relies on one thread seeing the writes of another.

The Java Memory Model (JMM) formalizes this problem. In the absence of synchronization (volatile accesses, synchronized blocks, or java.util.concurrent operations) and apart from the special guarantees for final fields, the JMM gives no guarantee about when, or whether, a write by one thread becomes visible to reads by another thread. A write to a non-volatile, non-synchronized field may never be observed by another thread, because the JMM explicitly allows the compiler and CPU to keep serving that thread the old value indefinitely.

The volatile keyword instructs both the JVM and the CPU to treat reads and writes of the marked field differently. Conceptually, a volatile write is pushed out to memory that all cores observe, and a volatile read fetches the current value rather than a stale copy. On real hardware, cache coherence protocols such as MESI already keep the caches themselves consistent, so what the JVM actually does is forbid the JIT from caching the field in a register and emit memory barriers where the architecture needs them. Those barriers prevent the CPU from reordering the volatile write with preceding writes and make the write globally visible (for example by draining the store buffer) before later operations proceed. The net effect is that a volatile write is always visible to subsequent volatile reads of the same field, with no stale register or buffered copy in between.

Java

// ── The visibility problem without volatile ───────────────────────────
class BrokenFlag {
    boolean running = true;   // NOT volatile — JIT may cache in register

    void stop() {
        running = false;   // write may stay in CPU cache / register
    }

    void loop() {
        // JIT may hoist 'running' into a register after first read:
        // effectively becomes: boolean local = running; while (local) { ... }
        while (running) {
            // work
        }
        System.out.println("Loop exited");  // may NEVER print
    }
}

BrokenFlag broken = new BrokenFlag();
Thread looper = new Thread(broken::loop, "looper");
looper.start();
Thread.sleep(100);
broken.stop();        // write not guaranteed visible to looper
looper.join(2000);    // may time out — looper may never see running=false
System.out.println("Looper alive: " + looper.isAlive());  // may be true — stuck

// ── The fix: volatile ─────────────────────────────────────────────────
class CorrectFlag {
    volatile boolean running = true;   // volatile: writes always visible

    void stop() {
        running = false;   // volatile write: flushed to main memory, barriers emitted
    }

    void loop() {
        // Each iteration re-reads 'running' from main memory — cannot be cached:
        while (running) {
            // work
        }
        System.out.println("Loop exited cleanly");  // always prints eventually
    }
}

CorrectFlag correct = new CorrectFlag();
Thread correctLooper = new Thread(correct::loop, "correct-looper");
correctLooper.start();
Thread.sleep(100);
correct.stop();        // volatile write: guaranteed visible
correctLooper.join();  // always completes
System.out.println("Exited: " + !correctLooper.isAlive());  // true

// ── CPU interaction: what volatile typically does at hardware level ───
// (Illustrative x86 pseudo-assembly; exact instructions depend on the JVM and CPU.)
// Without volatile on a write:
//   mov [addr], r1            // store may sit in the core's store buffer for a while
//   (no fence instruction)    // and the JIT may delay, reorder, or eliminate the store

// With volatile on a write (x86; other architectures vary):
//   lock xchg [addr], r1      // atomic + full memory fence
//   OR:
//   mov [addr], r1
//   mfence                    // memory fence: prior stores globally visible before later loads
//   (HotSpot commonly uses a locked no-op add on the stack in place of mfence)

// Without volatile on a read:
//   mov r1, [addr]            // JIT may skip this load and reuse a value held in a register

// With volatile on a read (x86):
//   mov r1, [addr]            // plain load: x86 needs no fence here, but the JIT must
//                             // perform the load every time and may not hoist or reuse it

// ── JIT reordering without volatile ──────────────────────────────────
class PlainReady {
    boolean ready = false;   // NOT volatile
    int value = 0;

    void thread1() {
        value = 42;
        ready = true;    // JIT or CPU may reorder: ready=true might become visible before value=42
    }

    void thread2() {
        if (ready) {
            System.out.println(value);  // may print 0: nothing orders the two writes for this reader
        }
    }
}

// With volatile on the flag:
class VolatileReady {
    volatile boolean readyV = false;
    int value = 0;

    void thread1() {
        value = 42;
        readyV = true;   // volatile write (StoreStore barrier before it): value=42 is committed first
    }

    void thread2() {
        if (readyV) {    // volatile read (LoadLoad barrier after it): later reads see writes made before readyV=true
            System.out.println(value);  // guaranteed: 42
        }
    }
}
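
The flag example is shown as fragments, so here is a minimal self-contained sketch of the same stop-flag idea that can be compiled and run as a single file. The class name StopFlagDemo, the helper runAndStop, and the 5-second join timeout are assumptions chosen for illustration; they are not part of the snippets above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: a runnable version of the volatile stop flag.
class StopFlagDemo {
    // volatile: the read in the loop condition must be performed on every iteration.
    private volatile boolean running = true;

    void stop() {
        running = false;   // single volatile store, no compound operation
    }

    void loop() {
        while (running) {
            // real work would go here
        }
    }

    /**
     * Starts a worker, lets it spin for delayMillis, requests a stop, and
     * reports whether the worker exited within a 5-second grace period.
     */
    static boolean runAndStop(long delayMillis) {
        StopFlagDemo demo = new StopFlagDemo();
        Thread worker = new Thread(demo::loop, "worker");
        worker.setDaemon(true);   // never keeps the JVM alive if something goes wrong
        worker.start();
        try {
            TimeUnit.MILLISECONDS.sleep(delayMillis);
            demo.stop();
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker exited: " + runAndStop(100));
    }
}
```

With the volatile modifier in place the worker is expected to exit shortly after stop() is called. Removing the modifier makes the outcome depend on the JIT and the hardware, which is exactly the failure mode this section describes.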

The happens-before Guarantee — Visibility, Ordering, and What volatile Covers

The Java Memory Model defines volatile's guarantee precisely through the happens-before relation. A volatile write to variable V happens-before every subsequent volatile read of V. Because happens-before is transitive and includes each thread's program order, all actions performed by the writing thread before the volatile write are also visible to the reading thread after the volatile read: not just the volatile variable itself, but every write that preceded it in the writing thread's program order.

This is the most important and most commonly misunderstood aspect of volatile. When thread A writes x = 10, then y = 20, then volatile flag = true, and thread B reads volatile flag == true, then reads x and y, thread B is guaranteed to see x == 10 and y == 20, even though x and y are not volatile. The volatile write to flag acts as a publication (release) barrier: it ensures all writes preceding it (x = 10, y = 20) are made visible first. The volatile read of flag acts as an acquisition (acquire) barrier: it ensures all reads following it (x, y) see those writes. This is the mechanism behind the safe publication pattern and the double-checked locking pattern.

The happens-before chain works in one direction, and it applies only when the read actually observes the write. A volatile read that returns an older value of the field gets no guarantee from a write that comes later; it inherits happens-before only from the write it observes and from writes that preceded that one. This means that if multiple threads write to the same volatile variable, a reader that observes thread A's write is guaranteed to see A's pre-write actions, but it is guaranteed to see thread B's pre-write actions only if B's volatile write came earlier in the order of volatile operations. If B writes later, the reader gets nothing from B.

volatile reads and writes are totally ordered with respect to each other across all threads. The JMM places all synchronization actions, including every volatile read and write, in a single synchronization order that is consistent with each thread's program order, so all threads agree on the order of the volatile operations on a given variable. This is stronger than happens-before, which is only a partial order, but it is not mutual exclusion: unlike synchronized, nothing stops two threads from interleaving their volatile accesses, so multiple volatile writes can race.

In the conventional barrier model, the ordering guarantee has two main components. The barriers before a volatile write (StoreStore, together with LoadStore) prevent any earlier read or write (volatile or not) from being reordered after the volatile write. The barriers after a volatile read (LoadLoad, together with LoadStore) prevent any later read or write (volatile or not) from being reordered before the volatile read. On x86 processors, these barriers are largely implicit because x86 has a strong memory model; on ARM and POWER processors, which have weaker memory models, explicit fence or acquire/release instructions are required and the JVM emits them.

Java

// ── happens-before chain: volatile write publishes all preceding writes ─
class SafePublication {
    int x = 0;           // not volatile
    int y = 0;           // not volatile
    volatile boolean published = false;  // volatile — the publication gate

    // Thread A:
    void writer() {
        x = 10;           // write 1 — not volatile
        y = 20;           // write 2 — not volatile
        published = true; // volatile write — StoreStore barrier:
                          // x=10 and y=20 are committed before published=true
    }

    // Thread B:
    void reader() {
        if (published) {  // volatile read — if this sees true:
                          //   LoadLoad barrier: reads below see all writes before published=true
            System.out.println(x);  // guaranteed: 10
            System.out.println(y);  // guaranteed: 20
        }
    }
}

SafePublication pub = new SafePublication();
Thread writer = new Thread(pub::writer, "writer");
Thread reader = new Thread(pub::reader, "reader");
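writer.start(); writer.join();  // run sequentially for a deterministic demo; join() followed by start()
reader.start(); reader.join();  // already orders the two threads, so volatile matters when they overlap
// Whenever the reader observes published == true, x and y are guaranteed to be 10 and 20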

// ── Which volatile write a read "observes" determines what it sees ────
class MultiWriter {
    volatile int v = 0;
    int a = 0, b = 0;  // not volatile

    void writerA() {
        a = 1;          // (A1)
        v = 1;          // (A2) volatile write
    }

    void writerB() {
        b = 2;          // (B1)
        v = 2;          // (B2) volatile write
    }

    void readerSees1() {
        if (v == 1) {   // reads (A2) — happens-before from (A1): sees a=1
            System.out.println(a);  // 1 — guaranteed
            System.out.println(b);  // 0 OR 2 — b has no happens-before from writerA
        }
    }

    void readerSees2() {
        if (v == 2) {   // reads (B2) — happens-before from (B1): sees b=2
            System.out.println(b);  // 2 — guaranteed
            System.out.println(a);  // 0 OR 1 — a has no happens-before from writerB
        }
    }
}

// ── Total order on volatile operations ───────────────────────────────
class TotalOrder {
    volatile int counter = 0;

    // All threads agree on the order of volatile writes to 'counter'.
    // If thread A writes 1 and thread B writes 2, all threads see either:
    //   0 → 1 → 2   OR   0 → 2 → 1
    // But never:
    //   thread C sees 0 → 1, thread D sees 0 → 2, both disagreeing on final value

    void increment() { counter++; }   // BUT: NOT atomic (see next section)
}

// ── Memory barrier positions ──────────────────────────────────────────
// Volatile WRITE barriers (conceptual model; actual barriers are CPU-specific):
//   [StoreStore + LoadStore] // all prior loads and stores complete before this volatile store
//   store volatile field
//   [StoreLoad barrier]      // this volatile store completes before any subsequent loads
//                            // (StoreLoad is the most expensive barrier; on x86 it needs
//                            //  MFENCE or a locked instruction)

// Volatile READ barriers:
//   load volatile field
//   [LoadLoad barrier]       // this volatile load completes before any subsequent loads
//   [LoadStore barrier]      // this volatile load completes before any subsequent stores

// x86 specifics (strong memory model, so most barriers are free):
// x86 has total store order (TSO): StoreStore, LoadLoad, and LoadStore are implicit.
// Only StoreLoad requires an explicit fence, which is emitted after volatile writes.
// ARM/POWER have weaker models: the barriers require explicit fence or acquire/release instructions.
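
As a runnable sketch of the publication gate described in this section, the following program overlaps the writer and the reader instead of running them one after the other. The reader spins on the volatile flag and only then reads the plain fields. The names PublicationDemo and countStaleReads are hypothetical, and Thread.onSpinWait() assumes Java 9 or later.

```java
// Hypothetical demo class: plain fields published through a volatile gate.
class PublicationDemo {
    private int x;                       // plain field
    private int y;                       // plain field
    private volatile boolean published;  // the publication gate

    void writer() {
        x = 10;
        y = 20;
        published = true;   // volatile write: x and y are published with it
    }

    /** Spins until the gate opens, then returns x + y. */
    int reader() {
        while (!published) {
            Thread.onSpinWait();   // Java 9+: a hint to the CPU, not a synchronization action
        }
        return x + y;   // ordered after the volatile read that observed 'true'
    }

    /**
     * Runs a writer and a reader concurrently for the given number of rounds
     * and counts the rounds in which the reader did not see x + y == 30.
     */
    static int countStaleReads(int rounds) {
        int stale = 0;
        for (int i = 0; i < rounds; i++) {
            PublicationDemo demo = new PublicationDemo();
            int[] seen = new int[1];   // written by the reader thread, read after join()
            Thread reader = new Thread(() -> seen[0] = demo.reader(), "reader");
            Thread writer = new Thread(demo::writer, "writer");
            reader.start();
            writer.start();
            try {
                reader.join();
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
            if (seen[0] != 30) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        System.out.println("Stale reads: " + countStaleReads(1_000));   // prints "Stale reads: 0"
    }
}
```

The count is zero by construction: a reader can leave its loop only after observing published == true, and that observation carries the happens-before edge from the writer's earlier plain writes. If published were a plain field, the program would no longer be guaranteed to print 0, or even to terminate.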

Atomicity Limitation, Correct Patterns, and Anti-Patterns

volatile provides visibility and ordering but not atomicity for compound operations. The word "atomic" means that an operation completes as a single indivisible unit with no intermediate state visible to other threads. Reading a volatile field is atomic. Writing a volatile field is atomic (including long and double, which are otherwise permitted by the JMM to be written in two 32-bit halves). But any operation that consists of more than one step (read, then modify, then write) is not atomic on a volatile field, because another thread can run between any two of those steps, whether through preemption or simply by executing in parallel on another core.

The canonical example is counter++. On a volatile int, this compiles to: read counter (volatile read), add 1 (CPU instruction), write counter (volatile write). If two threads execute counter++ simultaneously, both may read the same value, both add 1, and both write the same incremented value, so the result is one increment instead of two. This is the lost update problem, and volatile does not prevent it. For atomic compound operations, use AtomicInteger, AtomicLong, AtomicReference, or synchronized.

The correct patterns for volatile are narrow and well-defined. The flag pattern uses a single volatile boolean that one thread writes and one or more threads read: the reader loops checking the flag, and when it sees the new value, it stops or changes behavior. This is correct because the write is a single store that does not depend on the field's previous value, so there is no compound operation. The one-time publication pattern uses a volatile reference field that is written exactly once after the referent is fully initialized and then read by many threads. The volatile write ensures the fully initialized object is visible. Double-checked locking (checking a volatile reference, entering synchronized to initialize if null, checking again under the lock, initializing, and assigning the volatile) is correct in Java 5 and later because, under the revised memory model, the volatile write prevents other threads from seeing a partially initialized object. The status/configuration update pattern writes a new configuration object to a volatile reference field, which readers observe on their next access.

The anti-patterns follow directly from the atomicity limitation. Using volatile for a counter that is incremented by multiple threads is wrong. Using volatile for a check-then-act sequence (if volatile == null, then assign) is wrong, because the check and the act are not atomic and can be interleaved. Using volatile as a substitute for synchronized when a lock is actually needed for atomicity is wrong. The rule is: volatile is correct when the write does not depend on the previous value of the field, and when atomicity of a multi-step operation is not required.

Java

// ── Atomicity failure: volatile counter ──────────────────────────────
class VolatileCounter {
    volatile int count = 0;

    void increment() {
        count++;   // NOT atomic: read count → add 1 → write count
                   // Two threads can both read 5, both write 6 → count stays 6 (lost update)
    }
}

VolatileCounter vc = new VolatileCounter();
List<Thread> threads = new ArrayList<>();
for (int i = 0; i < 10; i++) {
    Thread t = new Thread(() -> {
        for (int j = 0; j < 1000; j++) vc.increment();
    });
    threads.add(t);
    t.start();
}
for (Thread t : threads) t.join();
System.out.println("Expected 10000, got: " + vc.count);  // e.g., 8743 — lost updates

// Fix: AtomicInteger for atomic compound operations:
AtomicInteger atomicCount = new AtomicInteger(0);
List<Thread> atomicThreads = new ArrayList<>();
for (int i = 0; i < 10; i++) {
    Thread t = new Thread(() -> {
        for (int j = 0; j < 1000; j++) atomicCount.incrementAndGet();  // atomic CAS
    });
    atomicThreads.add(t);
    t.start();
}
for (Thread t : atomicThreads) t.join();
System.out.println("Correct: " + atomicCount.get());  // always 10000

// ── CORRECT pattern 1: single-writer flag ─────────────────────────────
class CancellableTask implements Runnable {
    private volatile boolean cancelled = false;   // one writer (main), many readers

    public void cancel() { cancelled = true; }    // single store — no compound op

    @Override
    public void run() {
        while (!cancelled) {                       // volatile read: observes cancel() on the next check
            doWork();
        }
        System.out.println("Task cancelled cleanly");
    }

    private void doWork() {
        try { Thread.sleep(10); } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

CancellableTask task = new CancellableTask();
Thread worker = new Thread(task, "worker");
worker.start();
Thread.sleep(100);
task.cancel();     // volatile write: visible to the worker on its next read of the flag
worker.join();

// ── CORRECT pattern 2: one-time safe publication (double-checked locking) ─
class ResourceHolder {
    private volatile ExpensiveResource resource = null;   // volatile reference

    ExpensiveResource getResource() {
        if (resource == null) {                           // fast path — no lock
            synchronized (this) {                         // slow path — only on first call
                if (resource == null) {                   // double-check under lock
                    resource = new ExpensiveResource();   // volatile write: publishes fully init'd object
                }
            }
        }
        return resource;  // volatile read: never null here, and always a fully initialized object
    }
    // Correct in Java 5+ because volatile prevents partial initialization visibility
}

// ── CORRECT pattern 3: state/configuration swap ───────────────────────
class ConfigHolder {
    private volatile Config current = Config.defaultConfig();   // volatile reference

    void update(Config newConfig) {
        current = newConfig;   // atomic volatile write — replaces entire Config object
        // Readers always see either old or new Config, never a half-written one
    }

    Config get() {
        return current;   // volatile read — always sees the latest Config
    }
}

// ── ANTI-PATTERN: check-then-act is not atomic ────────────────────────
class UnsafeCheckThenAct {
    volatile String value = null;

    void initIfNull(String v) {
        if (value == null) {     // check  ← preemption can occur here
            value = v;           // act    ← two threads may both pass the check
        }
        // Two threads may both see null, both assign — last write wins arbitrarily
    }
}

// Fix: use synchronized for check-then-act:
class SafeCheckThenAct {
    volatile String value = null;   // volatile still needed for visibility outside synchronized

    synchronized void initIfNull(String v) {
        if (value == null) {     // check
            value = v;           // act — atomic under synchronized
        }
    }
}

// ── long and double visibility — volatile makes 64-bit writes atomic ──
class LongVisibility {
    volatile long timestamp = 0L;   // without volatile: JVM may write in two 32-bit halves
    // A reader could see the high 32 bits from one write and low 32 bits from another
    // volatile guarantees the full 64-bit write is atomic and visible

    void update(long ts) { timestamp = ts; }     // atomic volatile write
    long read() { return timestamp; }            // atomic volatile read
}
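
Where a check-then-act must be atomic but a lock is not wanted, one option is AtomicReference.compareAndSet, from the atomic classes recommended earlier in this section. The sketch below is a minimal illustration under that assumption; the names OnceInitializer and countWinners are hypothetical and exist only for the demo.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class: lock-free "set only if still null".
class OnceInitializer {
    private final AtomicReference<String> value = new AtomicReference<>();

    /** Sets the value only if it is still null; returns true if this call won. */
    boolean initIfNull(String v) {
        // The null check and the assignment happen as one atomic step.
        return value.compareAndSet(null, v);
    }

    String get() {
        return value.get();   // same visibility semantics as a volatile read
    }

    /** Races threadCount threads on one initializer and counts how many succeeded. */
    static int countWinners(int threadCount) {
        OnceInitializer init = new OnceInitializer();
        AtomicInteger winners = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                if (init.initIfNull("from-thread-" + id)) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("Winners: " + countWinners(8));   // prints "Winners: 1"
    }
}
```

Exactly one thread can move the reference from null to a non-null value, so every other caller's compareAndSet fails and the first value is kept. This differs from the plain volatile version, where the last writer to pass the check wins arbitrarily.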
