Thread Scheduler

The thread scheduler is the operating-system component that decides which RUNNABLE thread executes on which CPU core at any given moment. In the HotSpot JVM's default threading model, Java platform threads map one-to-one to native OS threads, so scheduling Java platform threads is OS thread scheduling. The JVM provides no scheduling guarantees beyond what the OS provides, and the OS provides no real-time guarantees in a standard (non-RTOS) environment. The scheduler's decisions are influenced by thread priority, the scheduling policy in force (CFS on Linux, preemptive priority scheduling on Windows), the number of available CPU cores, thread time slices (quanta), and OS-specific heuristics for interactive responsiveness. Java programs interact with the scheduler through a small set of methods (Thread.sleep(), Thread.yield(), Object.wait(), and LockSupport.park()), each of which gives up CPU time in a different way and with different wake-up semantics. Understanding how the scheduler works, which guarantees it does and does not make, and how Java's threading methods interact with it is essential for writing correct concurrent code and for diagnosing performance problems in multithreaded systems.

How the OS Scheduler Works — Time Slicing, Preemption, and the Run Queue

The OS scheduler maintains a run queue (a data structure of RUNNABLE threads waiting for CPU time) and, for each core, the thread currently executing on it. On a machine with N cores, at most N threads execute simultaneously. All other RUNNABLE threads wait in the run queue. The scheduler periodically preempts running threads (removes them from a core, typically at the end of their time slice, or quantum) and selects new threads from the run queue to run next. This cycle of run-preempt-queue-run is how the illusion of concurrent execution is created when there are more threads than cores.

The time slice (quantum) is the maximum duration a thread is permitted to run before the scheduler forcibly preempts it. On Linux with CFS, the quantum is not a fixed duration: CFS allocates proportional CPU time based on a virtual runtime accounting scheme, and a running thread becomes a candidate for preemption once its virtual runtime exceeds that of the lowest-virtual-runtime thread in the run queue. In practice this produces scheduling quanta that typically range from about 0.75ms to 5ms, depending on system load and the number of runnable threads. On Windows, quanta are fixed and counted in clock ticks (about 15.6ms each by default): client editions default to short quanta of two ticks, server editions to long quanta of twelve ticks, and on client editions threads of the foreground process can be given a longer quantum. Priority decides which thread runs next, not how long its quantum is.

Preemption can happen between any two machine instructions, which means in the middle of a single bytecode instruction or Java statement. The JVM has no control over when the OS scheduler decides to preempt a running thread. From the Java program's perspective, any thread can be preempted at any point during execution, including between the read and write of a compound operation like i++, between the null check and the field access of a reference, or in the middle of updating a multi-field object. This is the fundamental source of race conditions: two threads interleaved by the scheduler in arbitrary ways, reading and writing shared state without synchronization.

The scheduler also handles I/O blocking. When a thread issues an OS-level I/O call (read() on a socket, write() to a file, accept() on a server socket) that cannot complete immediately, the OS typically moves the thread out of the run queue into a wait queue for the I/O completion event. While waiting, the thread does not consume CPU and does not count against the OS's runnable thread count. When the I/O completes, the OS moves the thread back to the run queue. This is why blocking I/O doesn't waste CPU (the thread is simply not scheduled while waiting) but does tie up a thread and its stack memory while the I/O is pending. This trade-off is the core motivation for non-blocking I/O and, in modern Java, virtual threads.
Java
// Snippets assume: import java.util.concurrent.atomic.AtomicInteger;
// and an enclosing method declared with "throws InterruptedException".
// ── Preemption is invisible and can happen anywhere ──────────────────
class SharedCounter {
    int value = 0;

    // value++ is NOT atomic: it compiles to read value, add 1, write value.
    // Preemption can occur between any of these three steps:
    void increment() {
        value++;   // read → preempt → other thread reads same value → both write same result
    }
}

SharedCounter counter = new SharedCounter();
Thread[] threads = new Thread[10];
for (int i = 0; i < threads.length; i++) {
    threads[i] = new Thread(() -> {
        for (int j = 0; j < 1_000; j++) counter.increment();
    });
    threads[i].start();
}
for (Thread t : threads) t.join();
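// Expected: 10,000. Actual: any value up to 10,000 (short runs like this one often
// still print 10,000; raise the iteration count to see losses more reliably).
// Preemption between read and write causes lost updates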
System.out.println("Lost updates example: " + counter.value);

// Fix: remove the race condition, not the preemption (you can't stop preemption)
AtomicInteger atomicCounter = new AtomicInteger(0);
Thread[] fixedThreads = new Thread[10];
for (int i = 0; i < fixedThreads.length; i++) {
    fixedThreads[i] = new Thread(() -> {
        for (int j = 0; j < 1_000; j++) atomicCounter.incrementAndGet();
    });
    fixedThreads[i].start();
}
for (Thread t : fixedThreads) t.join();
System.out.println("Correct count: " + atomicCounter.get());  // always 10,000

// ── Thread.sleep() — yielding CPU with a time-based wake-up ──────────
// sleep() requests that the OS remove the thread from the run queue for
// AT LEAST the specified duration. "At least" — not exactly.
long before = System.nanoTime();
Thread.sleep(100);   // request: don't schedule me for 100ms
long actualMs = (System.nanoTime() - before) / 1_000_000;
System.out.printf("Requested 100ms sleep, actual: %dms%n", actualMs);
// Typical: 100–115ms on an unloaded system
// Under load or high timer resolution jitter: 100–200ms

// sleep(0) requests no delay; what it actually does is implementation-specific.
// HotSpot treats it much like yield(): a hint, with no guarantee of a reschedule.
Thread.sleep(0);

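// ── LockSupport.park(): the low-level primitive behind java.util.concurrent blocking
// (requires "import java.util.concurrent.locks.LockSupport;" at the top of the file;
//  synchronized and Object.wait() use the JVM's monitor implementation instead)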

Thread parker = new Thread(() -> {
    System.out.println("Parker going to sleep via park()");
    LockSupport.park();   // blocks until unpark() or interrupt; may also return spuriously
    System.out.println("Parker woke up");
}, "parker");

parker.start();
Thread.sleep(200);                          // let parker reach park()
System.out.println(parker.getState());      // WAITING
LockSupport.unpark(parker);                 // wake up parker
parker.join();

// park() with timeout — produces TIMED_WAITING:
Thread timedParker = new Thread(() -> {
    LockSupport.parkNanos(500_000_000L);    // 500ms in nanoseconds
    System.out.println("Timed park complete");
}, "timed-parker");
timedParker.start();
Thread.sleep(50);
System.out.println(timedParker.getState()); // TIMED_WAITING
timedParker.join();

// ── OS scheduler differences — Linux CFS vs Windows ──────────────────
// On Linux (CFS): each thread gets a "fair" share of CPU proportional to its weight.
// Time slices are dynamic: fewer runnable threads = longer effective quantum.
// Nice values map to CFS weight and affect share, not absolute time. Note that HotSpot on
// Linux ignores Thread.setPriority() by default (see -XX:ThreadPriorityPolicy).

// On Windows: fixed quanta counted in clock ticks (about 15.6ms each by default);
// client editions use short quanta, server editions long ones.
// A higher-priority thread preempts a lower-priority thread as soon as it becomes RUNNABLE.
// This makes Windows scheduling more predictable by priority but also more prone to starvation.

// Neither provides real-time guarantees without RTOS configuration.
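
The run-preempt-queue-run cycle described above can be observed indirectly. The following is a minimal sketch, assuming a preemptive OS scheduler; the class and method names (TimeSlicingDemo, runBusyThreads) are made up for this illustration. It starts twice as many pure-CPU threads as there are cores, none of which ever blocks voluntarily, and checks that every one of them still makes progress, which is only possible if the scheduler takes cores away from running threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLongArray;

final class TimeSlicingDemo {

    /**
     * Starts threadCount busy-looping threads and returns how many loop
     * iterations each one completed. The threads are stopped as soon as
     * every thread has run at least once, or when the timeout expires.
     */
    static long[] runBusyThreads(int threadCount, long timeoutSeconds)
            throws InterruptedException {
        AtomicBoolean stop = new AtomicBoolean(false);
        AtomicLongArray work = new AtomicLongArray(threadCount);
        CountDownLatch everyoneRan = new CountDownLatch(threadCount);
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                boolean reported = false;
                while (!stop.get()) {            // pure CPU work, never sleeps or waits
                    work.incrementAndGet(id);
                    if (!reported) {             // signal the first iteration only
                        everyoneRan.countDown();
                        reported = true;
                    }
                }
            }, "busy-" + i);
            threads[i].setDaemon(true);          // never keep the JVM alive on failure
            threads[i].start();
        }

        // Wait until each thread has been given a core at least once.
        everyoneRan.await(timeoutSeconds, TimeUnit.SECONDS);
        stop.set(true);
        for (Thread t : threads) t.join();

        long[] counts = new long[threadCount];
        for (int i = 0; i < threadCount; i++) counts[i] = work.get(i);
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        long[] counts = runBusyThreads(cores * 2, 10);
        System.out.println(cores + " cores, " + counts.length + " busy threads");
        for (int i = 0; i < counts.length; i++) {
            System.out.println("busy-" + i + " iterations: " + counts[i]);
        }
    }
}
```

The iteration counts usually differ widely between threads and between runs, which is the scheduler's non-determinism showing through; the only thing the sketch relies on is that no count stays at zero.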

Thread.sleep(), Thread.yield(), and Object.wait() — Interacting with the Scheduler

Java provides three main mechanisms for a thread to voluntarily relinquish CPU time, each with different semantics and different use cases. Understanding the difference between them is essential for writing correct concurrent code and for understanding why certain common idioms work or fail.

Thread.sleep(long millis) asks the OS to keep the thread out of the run queue for at least the specified duration. The thread transitions to TIMED_WAITING, releases its CPU core for other threads to use, and will not be scheduled again until at least millis milliseconds have elapsed (unless it is interrupted first). "At least" is the operative phrase: the OS provides no upper bound on how long sleep() might actually take. Under heavy load, a timer interrupt might be delayed; on systems with coarse timer resolution (Windows with its default timer interrupt interval of about 15.6ms), sleeps shorter than one tick may round up to the next timer tick. Thread.sleep() does not release any monitors the thread holds. If the thread holds a synchronized lock and calls sleep(), that lock remains held for the duration of the sleep, blocking any other thread waiting for it. This is a common source of concurrency bugs.

Thread.yield() is a hint to the scheduler that the current thread is willing to give up its remaining time slice. The scheduler may move the thread to the back of the run queue for its priority level, allowing other threads of equal or higher priority to run. The scheduler may also completely ignore the hint. yield() never changes the thread's state from RUNNABLE, does not release any locks, and does not provide any happens-before guarantee. Its correct uses are limited to specific performance scenarios: spin loops where busy-waiting is intentional but should be courteous to other threads, and test code that needs to encourage thread interleaving to expose races. In production application logic, yield() is almost always either unnecessary (the OS scheduler handles CPU sharing automatically) or masking a design problem.

Object.wait() is fundamentally different from sleep() and yield(): it must be called while holding the object's monitor (within a synchronized block or method), and it atomically releases the monitor and transitions the thread to WAITING (TIMED_WAITING for the overloads that take a timeout). This atomic release is critical: no thread can observe the state between "released monitor" and "entered WAITING" because the JVM handles both as a single indivisible operation, so a notify() issued under the same monitor cannot slip through that gap and be lost. A thread in wait() can be awakened by Object.notify() or Object.notifyAll() on the same monitor, by Thread.interrupt(), by the timeout of a timed wait, or spuriously. When awakened, the thread transitions to BLOCKED (re-competing for the monitor it released) before becoming RUNNABLE. The canonical correct usage of wait() is inside a while loop that re-checks the condition: while (!condition) { obj.wait(); }. Using if rather than while is the spurious wakeup bug. A thread can wake from wait() without being notified (the specification of Object.wait() explicitly permits this), and another thread can also change the condition again between the notify() and the moment the waiter reacquires the monitor. Without the while loop, the waiter proceeds on a condition that no longer holds.
Java
// Snippets assume: import java.util.LinkedList; import java.util.Queue;
// import java.util.concurrent.atomic.AtomicBoolean;
// and an enclosing method declared with "throws InterruptedException".
// ── Thread.sleep() — releases CPU, holds locks ───────────────────────
Object lock = new Object();

Thread sleepWithLock = new Thread(() -> {
    synchronized (lock) {
        System.out.println("Acquired lock, going to sleep");
        try {
            Thread.sleep(2000);   // HOLDS LOCK during sleep — other threads blocked!
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println("Woke from sleep, releasing lock");
    }
}, "sleep-with-lock");

Thread blockedOnLock = new Thread(() -> {
    System.out.println("Waiting for lock...");
    synchronized (lock) {   // BLOCKED for 2 seconds while sleepWithLock sleeps
        System.out.println("Finally got lock");
    }
}, "blocked-on-lock");

sleepWithLock.start();
Thread.sleep(50);
blockedOnLock.start();
blockedOnLock.join();
sleepWithLock.join();

// ── Thread.yield() — courteous spin-wait ─────────────────────────────
AtomicBoolean flag = new AtomicBoolean(false);

Thread spinner = new Thread(() -> {
    // Busy-wait with yield — courteous but still burns CPU
    while (!flag.get()) {
        Thread.yield();   // give other threads a chance to set the flag
    }
    System.out.println("Flag set, spinner done");
}, "spinner");

Thread setter = new Thread(() -> {
    try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    flag.set(true);
    System.out.println("Flag set by setter");
}, "setter");

spinner.start();
setter.start();
spinner.join();
setter.join();

// Better alternative for most cases: LockSupport.park() / unpark()
// or blocking queues — no CPU burned at all

// ── Object.wait() — releases lock AND suspends ────────────────────────
class ProducerConsumer {
    private final Queue<String> queue = new LinkedList<>();
    private final int maxSize = 5;

    synchronized void produce(String item) throws InterruptedException {
        // WHILE loop — not IF — to handle spurious wakeups:
        while (queue.size() == maxSize) {
            System.out.println("Queue full, producer waiting");
            wait();   // atomically: releases monitor + enters WAITING
        }
        queue.add(item);
        System.out.println("Produced: " + item);
        notifyAll();  // wake consumers
    }

    synchronized String consume() throws InterruptedException {
        // WHILE loop — guards against spurious wakeup on empty queue:
        while (queue.isEmpty()) {
            System.out.println("Queue empty, consumer waiting");
            wait();   // atomically: releases monitor + enters WAITING
        }
        String item = queue.poll();
        System.out.println("Consumed: " + item);
        notifyAll();  // wake producers
        return item;
    }
}

ProducerConsumer pc = new ProducerConsumer();

Thread producer = new Thread(() -> {
    for (int i = 0; i < 10; i++) {
        try { pc.produce("item-" + i); } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); break;
        }
    }
}, "producer");

Thread consumer = new Thread(() -> {
    for (int i = 0; i < 10; i++) {
        try { pc.consume(); } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); break;
        }
    }
}, "consumer");

producer.start(); consumer.start();
producer.join();  consumer.join();

// ── Spurious wakeup — why while, not if ──────────────────────────────
// The Java spec explicitly permits wait() to return without notify() being called.
// The OS may wake a thread for implementation-defined reasons.
// Code that uses if instead of while is incorrect:

synchronized (lock) {
    if (queue.isEmpty()) {
        lock.wait();               // WRONG: spurious wakeup → proceeds on empty queue
    }
    String item = queue.poll();    // poll() returns null after a spurious wakeup → NullPointerException when item is used
}

synchronized (lock) {
    while (queue.isEmpty()) {
        lock.wait();               // CORRECT: re-check condition after every wakeup
    }
    String item = queue.poll();    // guaranteed: queue is non-empty here
}
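
The state transitions around wait() and notify() described above can be watched from another thread. This is a sketch rather than a definitive test: WaitStateDemo, awaitState, and observeStates are hypothetical names, the polling helper is an assumption of this example, and Thread.getState() is only a diagnostic snapshot. The main thread waits until the waiter reports WAITING, then calls notifyAll() and, while still holding the monitor, looks for the waiter to report BLOCKED as it tries to reacquire the monitor.

```java
import java.util.ArrayList;
import java.util.List;

final class WaitStateDemo {

    /** Polls until the thread reports the wanted state; gives up after about 5 seconds. */
    private static boolean awaitState(Thread t, Thread.State wanted)
            throws InterruptedException {
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (t.getState() != wanted) {
            if (System.nanoTime() - deadline > 0) return false;
            Thread.sleep(1);
        }
        return true;
    }

    /** Returns the waiter states that were actually observed, in order. */
    static List<Thread.State> observeStates() throws InterruptedException {
        final Object lock = new Object();
        final boolean[] ready = {false};        // guarded by lock
        List<Thread.State> seen = new ArrayList<>();

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!ready[0]) {             // while, not if
                    try {
                        lock.wait();            // releases the monitor while waiting
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, "waiter");
        waiter.start();

        // The waiter has released the monitor once it reports WAITING.
        if (awaitState(waiter, Thread.State.WAITING)) seen.add(Thread.State.WAITING);

        synchronized (lock) {
            ready[0] = true;
            lock.notifyAll();
            // We still hold the monitor, so the notified waiter cannot proceed:
            // it has to re-enter the monitor first.
            if (awaitState(waiter, Thread.State.BLOCKED)) seen.add(Thread.State.BLOCKED);
        }                                       // monitor released here

        waiter.join();
        seen.add(waiter.getState());            // TERMINATED after join()
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Observed waiter states: " + observeStates());
    }
}
```

Note that the second poll calls Thread.sleep() while holding the monitor, which is exactly why the waiter stays BLOCKED for that long: sleep() keeps the lock, wait() gives it up.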

Non-Determinism, Virtual Threads, and Scheduling Guarantees

The most important practical consequence of OS thread scheduling for Java programmers is non-determinism: the order in which threads execute is not fixed, not reproducible across runs, and not under the Java program's control. Two runs of the same program on the same machine with the same input can produce different thread interleaving patterns, different execution orders, and, in the presence of race conditions, different results. This non-determinism is not a bug in the JVM or the OS; it is a fundamental property of concurrent systems running on shared hardware. Tests that pass most of the time but occasionally fail ("flaky tests") are very often caused by race conditions exposed by different scheduling decisions across runs.

The Java Memory Model (JMM) provides the formal framework for reasoning about what results are possible given non-deterministic scheduling. The JMM defines which memory operations can be reordered by the compiler, the JVM, and the CPU, and it defines the happens-before relation as the mechanism that prevents specific reorderings when necessary. Correctly synchronized Java programs produce consistent results regardless of scheduling non-determinism because their synchronization establishes the happens-before relationships needed to constrain the set of possible outcomes to the intended ones. Incorrectly synchronized programs contain data races. Unlike C or C++, Java does not make a data race undefined behavior: the JMM still constrains what a racy program can do, and its causality rules specifically forbid values appearing out of thin air. The guarantees are weak, though: a read may see a stale value, writes may appear in a different order to different threads, and a non-volatile long or double may be read half-updated.

Virtual threads (Project Loom, GA in Java 21) change the scheduling model significantly. Virtual threads are lightweight threads scheduled by the JVM rather than the OS. The JVM maintains a pool of carrier threads (by default one per available processor) and multiplexes many virtual threads onto these carriers. When a virtual thread blocks (on I/O, on a lock, on Thread.sleep()) the JVM unmounts it from the carrier thread, stores its stack in heap memory, and mounts another virtual thread that is ready to run. One caveat: in Java 21 through 23, a virtual thread that blocks while inside a synchronized block or method is pinned to its carrier and does not unmount; JDK 24 removes this limitation (JEP 491). Unmounting allows millions of virtual threads to coexist without consuming millions of OS thread stacks. The JVM's virtual thread scheduler is a work-stealing scheduler (based on ForkJoinPool) that aims to keep carrier threads busy. Virtual thread scheduling is still non-deterministic, but the scheduling unit is now JVM-managed, enabling future JVM-level scheduling optimizations impossible with pure OS threads.

What standard Java offers in practice amounts to the following: a RUNNABLE thread will eventually be scheduled (no indefinite starvation, courtesy of OS starvation prevention rather than any promise in the Java specification), sleep() will wait at least the requested duration unless interrupted, notify() will eventually wake one thread that is already waiting on that monitor (though not immediately, and a notify() with no waiter is simply lost), and synchronization operations establish happens-before as specified by the JMM. There are no ordering guarantees, no latency guarantees, no throughput guarantees, and no real-time guarantees without specialized OS and JVM configuration.
Java
// Snippets assume: import java.util.ArrayList; import java.util.List;
// import java.util.concurrent.Executors; import java.util.concurrent.Future;
// and an enclosing method declared with "throws Exception".
// ── Non-determinism — different runs, different ordering ─────────────
for (int run = 0; run < 3; run++) {
    Thread a = new Thread(() -> System.out.print("A"), "thread-A");
    Thread b = new Thread(() -> System.out.print("B"), "thread-B");
    Thread c = new Thread(() -> System.out.print("C"), "thread-C");
    a.start(); b.start(); c.start();
    a.join(); b.join(); c.join();
    System.out.println();
}
// Possible output (one line per loop iteration; varies between runs):
// ABC
// BAC
// ACB
// No order is guaranteed; any permutation is legal

// ── Non-determinism is not fixable with sleep() ───────────────────────
// WRONG: using sleep() to "ensure" ordering (broken — just widens the window)
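Thread producer = new Thread(() -> { /* set shared state */ });
Thread consumer = new Thread(() -> {
    try {
        Thread.sleep(100);   // "wait for producer": BROKEN on slow machines or during a GC pause
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
    }
    /* read shared state */
});
// A 200ms GC pause in producer makes sleep(100) insufficient.
// sleep() also has no happens-before semantics, so even when the timing works out
// the consumer is not guaranteed to see the producer's write.
// The only correct fix is synchronization, not timing.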

// ── Virtual threads — JVM-scheduled, not OS-scheduled ─────────────────
// Java 21+: virtual threads use Thread.ofVirtual()
Thread vt = Thread.ofVirtual().name("virtual-1").start(() -> {
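    // Thread.currentThread() is the virtual thread itself; its toString() also names the carrier
    System.out.println("Running on carrier: " + Thread.currentThread());
    try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }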
    // During sleep: virtual thread is unmounted from carrier.
    // Carrier is free to run other virtual threads.
    System.out.println("Resumed on: " + Thread.currentThread());
});
vt.join();

// Millions of virtual threads — impossible with OS threads:
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < 100_000; i++) {
        futures.add(executor.submit(() -> {
            Thread.sleep(1000);   // unmounts during sleep — carrier not consumed
            return null;
        }));
    }
    for (Future<?> f : futures) f.get();
}
System.out.println("100,000 virtual threads completed");
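// With OS threads: 100,000 × ~1MB of reserved stack space = ~100GB of address space, and most
// systems hit thread-count or memory limits long before that
// With virtual threads: stacks in heap, carriers = num CPUs, so this is entirely feasible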

// ── What the JVM DOES guarantee for scheduling ────────────────────────
// 1. No indefinite starvation: RUNNABLE threads eventually get CPU
//    (OS starvation prevention — priority boosting on Windows, fair CFS on Linux)

// 2. sleep() minimum duration: Thread.sleep(N) sleeps for AT LEAST N ms
long start = System.nanoTime();
Thread.sleep(50);
long elapsed = (System.nanoTime() - start) / 1_000_000;
System.out.println("Slept at least 50ms: " + (elapsed >= 50));  // true in practice; timer precision is platform-dependent

// 3. notify() effect: a thread in wait() will eventually be woken by notify()
//    (not guaranteed to be immediate — could be delayed by scheduler)

// 4. Happens-before per JMM: synchronized, volatile, join(), start() establish
//    happens-before and prevent race conditions on correctly synchronized code

// ── What the JVM does NOT guarantee ──────────────────────────────────
// No ordering guarantee: thread A starting before thread B doesn't mean A runs first
// No latency guarantee: notify() might not wake the waiter for milliseconds
// No throughput guarantee: high-priority thread might not get more CPU on Linux
// No real-time guarantee: sleep(1) might sleep for 50ms under load
// No fairness for synchronized: blocked threads reacquire in unspecified order
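
To make "synchronization, not timing" concrete, here is a minimal sketch of the producer/consumer hand-off with the sleep-based guess replaced by a java.util.concurrent.CountDownLatch. The class name LatchHandoff, the method handOff, and the value 42 are illustrative choices, not part of the original example; any construct that establishes happens-before (a BlockingQueue, a volatile flag with a proper wait, join()) would serve equally well.

```java
import java.util.concurrent.CountDownLatch;

final class LatchHandoff {

    /** Returns the value the consumer observed after waiting for the producer. */
    static int handOff() throws InterruptedException {
        final int[] shared = new int[1];        // plain, non-volatile shared state
        final int[] observed = new int[1];
        CountDownLatch published = new CountDownLatch(1);

        Thread consumer = new Thread(() -> {
            try {
                published.await();              // blocks until the producer counts down
                observed[0] = shared[0];        // the write is visible: countDown() happens-before await() returning
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "consumer");

        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(50);               // simulate a slow producer; the length is irrelevant
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            shared[0] = 42;                     // set shared state
            published.countDown();              // publish it
        }, "producer");

        consumer.start();                       // start order does not matter
        producer.start();
        producer.join();
        consumer.join();                        // join() makes observed[0] visible to the caller
        return observed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Consumer observed: " + handOff());   // prints 42
    }
}
```

However long the producer is delayed, by the simulated sleep, by a GC pause, or by the scheduler, the consumer simply waits longer; correctness no longer depends on any timing assumption.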
