Thread Scheduler

The thread scheduler is the operating-system component that decides which RUNNABLE thread executes on which CPU core at any given moment. Java platform threads map one-to-one to native OS threads in the HotSpot JVM's default threading model, so scheduling them is OS thread scheduling. The JVM provides no scheduling guarantees beyond what the OS provides, and a standard (non-RTOS) OS provides no real-time guarantees. The scheduler's decisions are influenced by thread priority, the scheduling policy in use (CFS on Linux, preemptive priority scheduling on Windows), the number of available CPU cores, thread time slices (quanta), and OS-specific heuristics for interactive responsiveness.

Java programs interact with the scheduler through a small set of methods: Thread.sleep(), Thread.yield(), Object.wait(), and LockSupport.park(). Each gives up CPU time in a different way and has different wake-up semantics. Understanding how the scheduler works, which guarantees it does and does not make, and how Java's threading methods interact with it is essential for writing correct concurrent code and for diagnosing performance problems in multithreaded systems.

How the OS Scheduler Works — Time Slicing, Preemption, and the Run Queue

The OS scheduler maintains a run queue, a data structure of RUNNABLE threads waiting for CPU time, and tracks the thread currently executing on each core. On a machine with N cores, at most N threads execute at the same instant; all other RUNNABLE threads wait in the run queue. The scheduler periodically preempts running threads (removing them from a core when their time slice, or quantum, expires or when a more urgent thread becomes runnable) and selects new threads from the run queue to run next. This run-preempt-queue-run cycle creates the illusion of concurrent execution when there are more threads than cores.

The time slice (quantum) is the maximum duration a thread may run before the scheduler forcibly preempts it. On Linux with CFS, the quantum is not a fixed duration: CFS tracks a virtual runtime for each thread, prefers the runnable thread with the lowest virtual runtime, and preempts the running thread once it has received its proportional share. In practice this produces slices that typically range from roughly 0.75ms to 5ms, depending on system load and the number of runnable threads. On Windows, quanta are counted in clock ticks (a tick is typically about 15.6ms): client editions default to short quanta of two ticks, server editions default to long quanta of twelve ticks, and threads of the foreground process can be given a longer quantum.

Preemption is not limited to bytecode boundaries: it can happen between any two machine instructions, and a single bytecode often compiles to several of them. The JVM has no control over when the OS scheduler decides to preempt a running thread. From the Java program's perspective, any thread can be preempted at any point during execution, including between the read and the write of a compound operation like i++, between the null check and the field access of a reference, or in the middle of updating a multi-field object. This is the fundamental source of race conditions: two threads interleaved by the scheduler in arbitrary ways, reading and writing shared state without synchronization.

The scheduler also handles I/O blocking. When a thread issues a blocking OS-level I/O call (read() on a socket, write() to a file, accept() on a server socket), the OS typically moves the thread out of the run queue into a wait queue for the I/O completion event. While waiting, the thread consumes no CPU and is not counted among the OS's runnable threads. When the I/O completes, the OS moves the thread back to the run queue. This is why blocking I/O does not waste CPU (the thread is simply not scheduled while waiting) but does tie up a thread and its stack memory while the I/O is pending. This trade-off is the core motivation for non-blocking I/O and, in modern Java, virtual threads.

Java

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// ── Preemption is invisible and can happen anywhere ──────────────────
class SharedCounter {
    int value = 0;

    // value++ is NOT atomic: it compiles to read value, add 1, write value.
    // Preemption can occur between any of these three steps:
    void increment() {
        value++;   // read → preempt → other thread reads same value → both write same result
    }
}

SharedCounter counter = new SharedCounter();
Thread[] threads = new Thread[10];
for (int i = 0; i < threads.length; i++) {
    threads[i] = new Thread(() -> {
        for (int j = 0; j < 1_000; j++) counter.increment();
    });
    threads[i].start();
}
for (Thread t : threads) t.join();
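// Expected: 10,000. Actual: any value up to 10,000, varying from run to run
// (short loops may still print 10,000 on many runs; the bug is there regardless).
// Preemption between read and write causes lost updates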
System.out.println("Lost updates example: " + counter.value);

// Fix: remove the race condition, not the preemption (you can't stop preemption)
AtomicInteger atomicCounter = new AtomicInteger(0);
Thread[] fixedThreads = new Thread[10];
for (int i = 0; i < fixedThreads.length; i++) {
    fixedThreads[i] = new Thread(() -> {
        for (int j = 0; j < 1_000; j++) atomicCounter.incrementAndGet();
    });
    fixedThreads[i].start();
}
for (Thread t : fixedThreads) t.join();
System.out.println("Correct count: " + atomicCounter.get());  // always 10,000

// ── Thread.sleep() — yielding CPU with a time-based wake-up ──────────
// sleep() requests that the OS remove the thread from the run queue for
// AT LEAST the specified duration. "At least" — not exactly.
long before = System.nanoTime();
Thread.sleep(100);   // request: don't schedule me for 100ms
long actualMs = (System.nanoTime() - before) / 1_000_000;
System.out.printf("Requested 100ms sleep, actual: %dms%n", actualMs);
// Typical: 100–115ms on an unloaded system
// Under load or with a coarse system timer: 100–200ms

// sleep(0) requests no delay at all; HotSpot treats it much like Thread.yield()
Thread.sleep(0);   // hint to scheduler: reschedule now if anything else wants to run

// ── LockSupport.park(): the low-level primitive behind java.util.concurrent blocking
// (requires: import java.util.concurrent.locks.LockSupport; at the top of the file)

Thread parker = new Thread(() -> {
    System.out.println("Parker going to sleep via park()");
    LockSupport.park();   // blocks until unpark() is called
    System.out.println("Parker woke up");
}, "parker");

parker.start();
Thread.sleep(200);                          // let parker reach park()
System.out.println(parker.getState());      // WAITING
LockSupport.unpark(parker);                 // wake up parker
parker.join();

// park() with timeout — produces TIMED_WAITING:
Thread timedParker = new Thread(() -> {
    LockSupport.parkNanos(500_000_000L);    // 500ms in nanoseconds
    System.out.println("Timed park complete");
}, "timed-parker");
timedParker.start();
Thread.sleep(50);
System.out.println(timedParker.getState()); // TIMED_WAITING
timedParker.join();

// ── OS scheduler differences — Linux CFS vs Windows ──────────────────
// On Linux (CFS): each thread gets a "fair" share of CPU proportional to its weight.
// Time slices are dynamic: fewer runnable threads = longer effective quantum.
// Weights come from nice values and affect the share, not absolute time.
// By default, HotSpot on Linux does not map Java thread priorities to nice values.

// On Windows: quanta are counted in clock ticks (about 15.6ms each);
// the defaults are 2 ticks on client editions and 12 ticks on server editions.
// A higher-priority thread preempts a lower-priority thread as soon as it becomes RUNNABLE.
// Strict priorities make scheduling more predictable, but low-priority threads can be
// starved until the OS's anti-starvation boost kicks in.

// Neither provides real-time guarantees without RTOS configuration.
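
The park()/unpark() snippet above uses a sleep to let the parker reach park() first. The following is a minimal sketch of why that ordering is not actually required: unpark() grants a per-thread permit, and a later park() consumes it and returns at once. The class name ParkPermitDemo, the ready flag, and the handOff() helper are illustrative names, not part of the original text. Because park() may also return for no reason, the waiter re-checks its condition in a loop.

```java
import java.util.concurrent.locks.LockSupport;

class ParkPermitDemo {
    // Condition the waiter re-checks; park() alone is never treated as proof of a wake-up.
    private static volatile boolean ready;

    /** Hands a signal to a waiter thread without relying on timing. */
    static String handOff() {
        ready = false;
        String[] box = new String[1];
        Thread waiter = new Thread(() -> {
            while (!ready) {            // guard against spurious returns from park()
                LockSupport.park();
            }
            box[0] = "woken";
        }, "waiter");
        waiter.start();

        ready = true;                   // publish the condition first (volatile write)
        LockSupport.unpark(waiter);     // then grant the permit; safe even if waiter has not parked yet
        try {
            waiter.join();              // join() makes the write to box[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return box[0];
    }

    public static void main(String[] args) {
        // Permit granted before park(): the call below does not block.
        LockSupport.unpark(Thread.currentThread());
        LockSupport.park();
        System.out.println("park() returned without blocking");
        System.out.println(handOff());
    }
}
```

Setting the flag before calling unpark() is the design choice that matters: whichever thread runs first, the waiter either sees ready == true or finds a permit waiting for it.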

Thread.sleep(), Thread.yield(), and Object.wait() — Interacting with the Scheduler

Java provides three main mechanisms for a thread to voluntarily relinquish CPU time, each with different semantics and different use cases. Understanding the difference between them is essential for writing correct concurrent code and for understanding why certain common idioms work or fail.

Thread.sleep(long millis) asks the OS to keep the thread out of the run queue for at least the specified duration. The thread transitions to TIMED_WAITING, releases its CPU core for other threads to use, and will not be scheduled again until at least millis milliseconds have elapsed. "At least" is the operative phrase: there is no upper bound on how long sleep() might actually take. Under heavy load, a timer interrupt might be delayed; on systems with coarse timer resolution (Windows with its default timer interrupt interval of roughly 15ms), sleeps shorter than one tick may round up to the next timer tick. Thread.sleep() does not release any monitors the thread holds. If the thread holds a synchronized lock and calls sleep(), that lock remains held for the duration of the sleep, blocking any other thread waiting for it. This is a common source of concurrency bugs.

Thread.yield() is a hint to the scheduler that the current thread is willing to give up its remaining time slice. The scheduler may move the thread to the back of the run queue for its priority level, allowing other threads of equal or higher priority to run. The scheduler may also ignore the hint completely. yield() never changes the thread's state from RUNNABLE, does not release any locks, and does not provide any happens-before guarantee. Its correct uses are limited to specific performance scenarios: spin loops where busy-waiting is intentional but should be courteous to other threads, and test code that needs to encourage thread interleaving to expose races. In production application logic, yield() is almost always either unnecessary (the OS scheduler handles CPU sharing automatically) or masking a design problem.

Object.wait() is fundamentally different from sleep() and yield(): it must be called while holding the object's monitor (within a synchronized block or method), and it atomically releases the monitor and transitions the thread to WAITING. This atomic release is critical. No thread can observe a state between "released monitor" and "entered WAITING", so a notify() issued by a thread that acquires the monitor afterwards cannot be lost. A thread in wait() can be awakened by Object.notify() or Object.notifyAll() on the same monitor, or by Thread.interrupt(). When awakened by notify(), the thread transitions to BLOCKED (re-competing for the monitor it released) before becoming RUNNABLE. The canonical correct usage of wait() is inside a while loop that re-checks the condition: while (!condition) { obj.wait(); }. Using if instead of while is a bug for two reasons. First, a thread can wake from wait() without being notified (a spurious wakeup, which the Java Language Specification and the Object.wait() documentation explicitly permit). Second, even after a real notification, another thread may have acquired the monitor first and changed the condition again. In both cases the if version proceeds on a condition that no longer holds.
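
The two wake-up paths described above (notification and interruption) can be sketched in one small class. This is an illustrative example, not code from the original article: GuardedWaitDemo, awaitReady(), markReady(), and runInThread() are hypothetical names. The waiter re-checks its condition in a while loop and restores the interrupt flag if it is interrupted.

```java
class GuardedWaitDemo {
    private final Object monitor = new Object();
    private boolean ready = false;          // guarded by monitor

    /** Waits until markReady() is called; reports which path ended the wait. */
    String awaitReady() {
        synchronized (monitor) {
            try {
                while (!ready) {            // re-check after every wakeup, spurious or not
                    monitor.wait();         // releases the monitor while waiting
                }
                return "ready";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt for callers
                return "interrupted";
            }
        }
    }

    /** Changes the condition and notifies while holding the same monitor. */
    void markReady() {
        synchronized (monitor) {
            ready = true;
            monitor.notifyAll();
        }
    }

    /** Runs awaitReady() on a new thread and ends the wait by notify or by interrupt. */
    static String runInThread(GuardedWaitDemo demo, boolean interrupt) {
        String[] result = new String[1];
        Thread waiter = new Thread(() -> result[0] = demo.awaitReady(), "waiter");
        waiter.start();
        if (interrupt) {
            waiter.interrupt();             // wait() throws InterruptedException, now or on entry
        } else {
            demo.markReady();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runInThread(new GuardedWaitDemo(), false));
        System.out.println(runInThread(new GuardedWaitDemo(), true));
    }
}
```

Neither call depends on which thread runs first: if markReady() happens before the waiter enters the loop, the condition is already true and wait() is never called; if the interrupt arrives before wait(), wait() throws as soon as it is entered.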

Java

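import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

// ── Thread.sleep(): releases CPU, holds locks ────────────────────────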
Object lock = new Object();

Thread sleepWithLock = new Thread(() -> {
    synchronized (lock) {
        System.out.println("Acquired lock, going to sleep");
        try {
            Thread.sleep(2000);   // HOLDS LOCK during sleep — other threads blocked!
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println("Woke from sleep, releasing lock");
    }
}, "sleep-with-lock");

Thread blockedOnLock = new Thread(() -> {
    System.out.println("Waiting for lock...");
    synchronized (lock) {   // BLOCKED for 2 seconds while sleepWithLock sleeps
        System.out.println("Finally got lock");
    }
}, "blocked-on-lock");

sleepWithLock.start();
Thread.sleep(50);
blockedOnLock.start();
blockedOnLock.join();
sleepWithLock.join();

// ── Thread.yield() — courteous spin-wait ─────────────────────────────
AtomicBoolean flag = new AtomicBoolean(false);

Thread spinner = new Thread(() -> {
    // Busy-wait with yield — courteous but still burns CPU
    while (!flag.get()) {
        Thread.yield();   // give other threads a chance to set the flag
    }
    System.out.println("Flag set, spinner done");
}, "spinner");

Thread setter = new Thread(() -> {
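    // Restore the interrupt flag instead of silently swallowing the exception
    try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }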
    flag.set(true);
    System.out.println("Flag set by setter");
}, "setter");

spinner.start();
setter.start();
spinner.join();
setter.join();

// Better alternative for most cases: LockSupport.park() / unpark()
// or blocking queues — no CPU burned at all

// ── Object.wait() — releases lock AND suspends ────────────────────────
class ProducerConsumer {
    private final Queue<String> queue = new LinkedList<>();
    private final int maxSize = 5;

    synchronized void produce(String item) throws InterruptedException {
        // WHILE loop — not IF — to handle spurious wakeups:
        while (queue.size() == maxSize) {
            System.out.println("Queue full, producer waiting");
            wait();   // atomically: releases monitor + enters WAITING
        }
        queue.add(item);
        System.out.println("Produced: " + item);
        notifyAll();  // wake consumers
    }

    synchronized String consume() throws InterruptedException {
        // WHILE loop — guards against spurious wakeup on empty queue:
        while (queue.isEmpty()) {
            System.out.println("Queue empty, consumer waiting");
            wait();   // atomically: releases monitor + enters WAITING
        }
        String item = queue.poll();
        System.out.println("Consumed: " + item);
        notifyAll();  // wake producers
        return item;
    }
}

ProducerConsumer pc = new ProducerConsumer();

Thread producer = new Thread(() -> {
    for (int i = 0; i < 10; i++) {
        try { pc.produce("item-" + i); } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); break;
        }
    }
}, "producer");

Thread consumer = new Thread(() -> {
    for (int i = 0; i < 10; i++) {
        try { pc.consume(); } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); break;
        }
    }
}, "consumer");

producer.start(); consumer.start();
producer.join();  consumer.join();

// ── Spurious wakeup — why while, not if ──────────────────────────────
// The Java spec explicitly permits wait() to return without notify() being called.
// The OS may wake a thread for implementation-defined reasons.
// Code that uses if instead of while is incorrect:

synchronized (lock) {
    if (queue.isEmpty()) {
        lock.wait();               // WRONG: spurious wakeup → proceeds on empty queue
    }
    String item = queue.poll();    // returns null on an empty queue → NullPointerException once item is used
}

synchronized (lock) {
    while (queue.isEmpty()) {
        lock.wait();               // CORRECT: re-check condition after every wakeup
    }
    String item = queue.poll();    // guaranteed: queue is non-empty here
}
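
For comparison, here is a sketch of the same bounded producer/consumer hand-off built on java.util.concurrent.ArrayBlockingQueue, which performs the condition re-checking and signalling internally. The class name BlockingQueueDemo and the transfer() helper are illustrative assumptions, not part of the original example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BlockingQueueDemo {
    /** Sends 1..items through a bounded queue and returns the sum the consumer saw. */
    static int transfer(int items, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] sum = new int[1];

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= items; i++) {
                    queue.put(i);              // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "producer");

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    sum[0] += queue.take();    // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "consumer");

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();                   // join() makes sum[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(transfer(10, 5));   // 1 + 2 + ... + 10 = 55
    }
}
```

There is no explicit wait(), notifyAll(), or while loop to get wrong here, which is why a blocking queue is usually the safer choice for application code.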

Non-Determinism, Virtual Threads, and Scheduling Guarantees

The most important practical consequence of OS thread scheduling for Java programmers is non-determinism: the order in which threads execute is not fixed, not reproducible across runs, and not under the Java program's control. Two runs of the same program on the same machine with the same input can produce different thread interleaving patterns, different execution orders, and, in the presence of race conditions, different results. This non-determinism is not a bug in the JVM or the OS; it is a fundamental property of concurrent systems running on shared hardware. Tests that pass most of the time but occasionally fail ("flaky tests") are very often caused by race conditions exposed by different scheduling decisions across runs.

The Java Memory Model (JMM) provides the formal framework for reasoning about what results are possible given non-deterministic scheduling. The JMM defines which memory operations can be reordered by the compiler, the JVM, and the CPU, and it defines the happens-before relation as the mechanism that prevents specific reorderings when necessary. Correctly synchronized Java programs produce consistent results regardless of scheduling non-determinism because their synchronization establishes the happens-before relationships needed to constrain the set of possible outcomes to the intended ones. Incorrectly synchronized programs contain data races. The JMM does not treat a data race as undefined behavior: it still bounds what a racy program may observe, and its causality rules forbid values appearing out of thin air. What a racy program loses is sequential consistency, so a read may see a stale value, writes may appear to happen out of order, and a non-volatile long or double may be read in two halves.

Virtual threads (Project Loom, GA in Java 21) change the scheduling model significantly. Virtual threads are lightweight threads scheduled by the JVM rather than the OS. The JVM maintains a pool of carrier threads (by default one per available CPU core) and multiplexes many virtual threads onto these carriers. When a virtual thread blocks (on most network I/O, on a java.util.concurrent lock, or in Thread.sleep()), the JVM unmounts it from the carrier thread, stores its stack in heap memory, and mounts another virtual thread that is ready to run. Some operations pin the virtual thread to its carrier instead; in Java 21 the best-known case is blocking while inside a synchronized block or method. Unmounting allows millions of virtual threads to coexist without consuming millions of OS thread stacks. The JVM's virtual thread scheduler is a work-stealing scheduler (based on ForkJoinPool) that aims to keep carrier threads busy. Virtual thread scheduling is still non-deterministic, but the scheduling unit is now JVM-managed, enabling future JVM-level scheduling optimizations impossible with pure OS threads.

In practice, the scheduling guarantees a standard Java program can rely on amount to the following: a RUNNABLE thread is eventually scheduled on mainstream operating systems (this comes from OS starvation prevention, not from the Java specification), sleep() waits at least the requested duration subject to timer precision, notify() wakes one thread waiting on the monitor if any is waiting (though the woken thread may not run immediately), and synchronization operations establish happens-before as specified by the JMM. There are no ordering guarantees, no latency guarantees, no throughput guarantees, and no real-time guarantees without specialized OS and JVM configuration.

Java

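import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// ── Non-determinism: different runs, different ordering ──────────────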
for (int run = 0; run < 3; run++) {
    Thread a = new Thread(() -> System.out.print("A"), "thread-A");
    Thread b = new Thread(() -> System.out.print("B"), "thread-B");
    Thread c = new Thread(() -> System.out.print("C"), "thread-C");
    a.start(); b.start(); c.start();
    a.join(); b.join(); c.join();
    System.out.println();
}
// Possible output (one line per loop iteration; it can differ on every program run):
// ABC
// BAC
// ACB
// No order is guaranteed; any permutation is legal on any iteration

// ── Non-determinism is not fixable with sleep() ───────────────────────
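// WRONG: using sleep() to "ensure" ordering (it only makes the race less likely)
Thread producer = new Thread(() -> { /* set shared state */ });
Thread consumer = new Thread(() -> {
    try {
        Thread.sleep(100);   // "wait for producer": BROKEN on slow machines or after a long pause
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // sleep() throws a checked exception in a Runnable
        return;
    }
    /* read shared state */
});
// If the producer is delayed by more than 100ms (a slow machine, a long GC pause),
// sleep(100) is insufficient.
// The only correct fix is synchronization, not timing.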

// ── Virtual threads — JVM-scheduled, not OS-scheduled ─────────────────
// Java 21+: virtual threads use Thread.ofVirtual()
Thread vt = Thread.ofVirtual().name("virtual-1").start(() -> {
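    // Thread.currentThread() is the virtual thread itself; its toString() also names the carrier
    System.out.println("Running as: " + Thread.currentThread());
    try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }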
    // During sleep: virtual thread is unmounted from carrier.
    // Carrier is free to run other virtual threads.
    System.out.println("Resumed on: " + Thread.currentThread());
});
vt.join();

// Very large numbers of virtual threads, impractical with OS threads:
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < 100_000; i++) {
        futures.add(executor.submit(() -> {
            Thread.sleep(1000);   // unmounts during sleep — carrier not consumed
            return null;
        }));
    }
    for (Future<?> f : futures) f.get();
}
System.out.println("100,000 virtual threads completed");
// With OS threads: 100,000 × ~1MB default stack reservation = ~100GB of address space,
//   typically well beyond OS thread limits
// With virtual threads: stacks live in the heap and carriers = number of CPUs, so this is feasible

// ── What the JVM DOES guarantee for scheduling ────────────────────────
// 1. No indefinite starvation in practice: RUNNABLE threads eventually get CPU
//    (an OS property, not a Java specification guarantee: priority boosting on Windows, fair CFS on Linux)

// 2. sleep() minimum duration: Thread.sleep(N) sleeps for AT LEAST N ms
long start = System.nanoTime();
Thread.sleep(50);
long elapsed = (System.nanoTime() - start) / 1_000_000;
System.out.println("Slept at least 50ms: " + (elapsed >= 50));  // expected true, subject to timer precision

// 3. notify() effect: notify() wakes one thread waiting on that monitor, if any is waiting
//    (which waiter is unspecified, and it may not run immediately)

// 4. Happens-before per JMM: synchronized, volatile, join(), start() establish
//    happens-before, which rules out data races in correctly synchronized code

// ── What the JVM does NOT guarantee ──────────────────────────────────
// No ordering guarantee: thread A starting before thread B doesn't mean A runs first
// No latency guarantee: notify() might not wake the waiter for milliseconds
// No throughput guarantee: high-priority thread might not get more CPU on Linux
// No real-time guarantee: sleep(1) might sleep for 50ms under load
// No fairness for synchronized: blocked threads reacquire in unspecified order
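
As a counterpart to the broken sleep()-based ordering shown earlier, the sketch below orders a producer and a consumer with a java.util.concurrent.CountDownLatch. The class name LatchOrderingDemo and the handOff() helper are hypothetical names chosen for this illustration. The consumer is started first on purpose, to show that the result does not depend on start order or on timing.

```java
import java.util.concurrent.CountDownLatch;

class LatchOrderingDemo {
    /** Returns what the consumer read after waiting for the producer's signal. */
    static String handOff() {
        CountDownLatch published = new CountDownLatch(1);
        String[] shared = new String[1];    // written by producer, read by consumer
        String[] seen = new String[1];      // consumer's observation

        Thread consumer = new Thread(() -> {
            try {
                published.await();          // blocks until countDown(); no guessing with sleep()
                seen[0] = shared[0];        // countDown() happens-before await() returning
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "consumer");

        Thread producer = new Thread(() -> {
            shared[0] = "payload";          // set shared state first
            published.countDown();          // then signal that it is ready
        }, "producer");

        consumer.start();                   // started first on purpose
        producer.start();
        try {
            consumer.join();
            producer.join();                // join() makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff());
    }
}
```

The latch provides both things the sleep() version lacked: the consumer cannot proceed before the producer has finished, and the happens-before edge guarantees it sees the producer's write.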
