Java Multithreading

Process vs Thread

A process is an independent program in execution with its own isolated memory space, file handles, and system resources, managed by the operating system and separated from all other processes by strict boundaries. A thread is a unit of execution that lives inside a process, sharing that process's memory, heap, and resources with every other thread in the same process. Java programs run inside a JVM process; the JVM itself creates and manages threads, and every Java application starts with at least one thread — the main thread — with additional threads created by the JVM for garbage collection, JIT compilation, signal handling, and other runtime tasks. Understanding the distinction between processes and threads is the foundation for all concurrent programming in Java: it determines what is shared and what is isolated, what is fast and what is expensive, what fails independently and what fails together. This entry covers the OS-level and JVM-level model of processes and threads, the memory model that follows from the shared-versus-isolated distinction, the cost model for creation and context switching, failure isolation and its consequences, inter-process and inter-thread communication mechanisms, and the practical decision of when to use multiple processes versus multiple threads.

Memory Model — Isolation vs Sharing

A process has its own virtual address space, allocated by the operating system. This address space contains the program's code segment, data segment, heap, and stack. No other process can read or write this memory without explicit operating system cooperation (shared memory segments, memory-mapped files, or similar IPC mechanisms). The isolation is enforced by hardware and the OS kernel: each process can address only the pages mapped into its own address space, so a stray pointer produces a segmentation fault or access violation inside the faulting process, never silent corruption of another process's data.

A thread has no independent memory of its own beyond its stack. Every thread inside a process shares the process's heap, its static variables, its open file descriptors, its socket connections, and every object allocated on the shared heap. Two threads can hold references to the same object and both mutate it simultaneously, which is both the primary power and the primary danger of multithreading. The stack is the only memory that is private to a thread: each thread has its own call stack and local variables, along with its own register state, including the program counter that indicates which instruction it is currently executing. Objects that are reachable only from a single thread's local variables (never published to shared state) are inherently thread-safe because no other thread can see them.

The JVM's memory layout maps onto this OS model directly. The JVM runs as a single OS process. The JVM heap, where all Java objects live, is shared among all threads in the JVM. Static fields are on the heap and therefore shared. Instance fields are on the heap and therefore shared whenever the object is accessible from multiple threads. Local variables and method parameters live on the thread stack and are private. This is the fundamental reason that local variables need no synchronization while shared object fields do.

Context switching is the OS mechanism for time-slicing CPU time among multiple threads or processes. When the OS switches from one execution unit to another, it must save the current unit's register state, program counter, and stack pointer, then restore the saved state of the next unit. Switching between threads within the same process is cheaper than switching between processes because threads share the same virtual address space: the CPU's memory management unit does not need to switch page tables or flush the TLB (translation lookaside buffer) for same-process thread switches. The difference is often cited as roughly an order of magnitude (process context switches in the range of 1–10 microseconds, same-process thread switches in the range of 0.1–1 microsecond), but actual figures vary widely with the hardware, the kernel, and how much cached state each switch evicts.

Java

// ── What each thread owns vs what it shares ──────────────────────────

// PRIVATE TO EACH THREAD (on the thread's stack):
public void threadLocalExample() {
    int localVar = 42;                    // private — stack variable
    String localRef = "hello";            // reference is private; "hello" is interned but immutable
    Object localObj = new Object();       // reference private; object on heap but not shared
    // No synchronization needed for any of these — no other thread can see them
}

// SHARED ACROSS ALL THREADS (on the heap):
class SharedState {
    static int staticCounter = 0;         // shared — static field on heap
    int instanceCounter = 0;              // shared — instance field on heap
    List<String> sharedList = new ArrayList<>();  // shared — object on heap

    void unsafeIncrement() {
        staticCounter++;    // NOT thread-safe — read-modify-write is not atomic
        instanceCounter++;  // NOT thread-safe — same issue
    }
}

// ── Demonstrating shared heap — two threads, one object ──────────────
class Counter {
    int value = 0;
    void increment() { value++; }   // unsafe — illustrating sharing, not correctness
}

Counter shared = new Counter();   // ONE object on the heap

Thread t1 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) shared.increment();
});
Thread t2 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) shared.increment();
});

t1.start(); t2.start();
t1.join();  t2.join();

// Result is often less than 2000 because unsynchronized increments can be lost:
System.out.println(shared.value);  // e.g., 1843; can also print 2000 when no updates happen to collide

// ── JVM startup threads — threads you didn't create yourself ─────────
// Typical threads on a recent HotSpot JVM (the exact set varies by version and vendor):
Thread.getAllStackTraces().keySet().forEach(t ->
    System.out.println(t.getName() + " [daemon=" + t.isDaemon() + "]")
);
// main                          [daemon=false]   — your code runs here
// Reference Handler             [daemon=true]    — processes reference queue (GC)
// Finalizer                     [daemon=true]    — runs finalizers
// Signal Dispatcher             [daemon=true]    — handles OS signals (SIGTERM etc.)
// Notification Thread           [daemon=true]    — JVM internal notifications
// Common-Cleaner                [daemon=true]    — java.lang.ref.Cleaner
// (GC and JIT compiler threads also run in the process but are JVM-internal and usually not listed here)

// ── ProcessBuilder — launching a separate OS process ─────────────────
ProcessBuilder pb = new ProcessBuilder("java", "-version");
pb.redirectErrorStream(true);
Process proc = pb.start();

// proc is a SEPARATE OS process with its own heap, stack, and address space.
// No memory is shared between this JVM and the child process.
String output = new String(proc.getInputStream().readAllBytes());
int exitCode  = proc.waitFor();
System.out.println("Child process output: " + output.trim());
System.out.println("Exit code: " + exitCode);
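The lost-update demo above shows what goes wrong when a shared heap object is mutated without coordination. The following is a minimal sketch of two common remedies, an atomic counter and a lock around the read-modify-write. The class name SafeCounterDemo and its helper methods are hypothetical names chosen for illustration, not part of the original entry.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounterDemo {

    // Hypothetical helper: runs the same task on two threads and waits for both.
    static void runOnTwoThreads(Runnable task) {
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Remedy 1: an atomic read-modify-write on a shared heap object.
    static int countWithAtomic(int perThread) {
        AtomicInteger counter = new AtomicInteger();
        runOnTwoThreads(() -> {
            for (int i = 0; i < perThread; i++) counter.incrementAndGet();
        });
        return counter.get();
    }

    // Remedy 2: a lock that makes the read-modify-write mutually exclusive.
    static int countWithLock(int perThread) {
        int[] box = {0};              // the array lives on the shared heap
        Object lock = new Object();
        runOnTwoThreads(() -> {
            for (int i = 0; i < perThread; i++) {
                synchronized (lock) { box[0]++; }
            }
        });
        return box[0];                // join() above makes the final value visible here
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(1000));  // 2000
        System.out.println(countWithLock(1000));    // 2000
    }
}
```

Both variants return exactly 2 × perThread because every increment is either atomic or performed while holding the same lock. Local variables such as the loop index i still need no protection, since each thread has its own copy on its own stack.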

Creation Cost, Failure Isolation, and Process vs Thread Decision

Creating a process is expensive in both time and memory. The OS must allocate a new virtual address space, copy (or mark copy-on-write) the parent's address space, set up file descriptor tables, initialize kernel data structures for the new process, and load the new program image if exec is called. On Linux, a fork() call can take up to several milliseconds, growing with the parent process's memory footprint because the page tables must be duplicated and marked copy-on-write. Launching a new JVM process is particularly expensive because the JVM itself must be loaded, core library classes must be loaded and initialized, and the garbage collector must be set up before any application code runs; this commonly takes 50–500 milliseconds for a minimal program.

Creating a thread is far cheaper. The OS needs to allocate a stack (typically 512KB to 8MB, configurable), set up a thread control block, and add the thread to the scheduler's run queue. On Linux, pthread_create takes roughly 10–50 microseconds. In Java, new Thread(...).start() involves a JVM call to the OS thread API and takes a similar 10–100 microseconds. The JVM thread stack is allocated from the process's existing address space, so no new address space setup is required. This cost difference is why thread pools (pre-allocating threads and reusing them) are standard practice for handling many short-lived concurrent tasks, and why Java 21's virtual threads take the idea even further by multiplexing many virtual threads onto a small number of OS threads.

Failure isolation is the decisive advantage of separate processes. If a thread throws an uncaught exception and dies, the other threads in the process continue running. But if the failure is a JVM crash (native crash, OutOfMemoryError from which the JVM cannot recover, stack overflow in a GC thread, corrupted JVM internal state), the entire JVM process and all its threads die together. A separate process can fail without affecting any other process: the OS cleans up the dead process's resources and other processes continue running. This isolation is why microservices architectures use separate processes (usually separate containers) per service, and why a crash in one service does not bring down others.

The practical decision between multiple processes and multiple threads follows from these properties. Use multiple threads when tasks need to share large amounts of in-memory data efficiently, when tasks communicate very frequently (millions of operations per second), when low latency is critical and even milliseconds of IPC overhead are unacceptable, and when the tasks are expected to succeed or fail together. Use multiple processes when strong fault isolation is required (a crash in one must not affect others), when the tasks are logically independent services with well-defined interfaces, when different components need to run different languages or runtime versions, when resource usage needs to be independently limited via OS-level controls, or when security boundaries are needed between components.
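The thread-pool point above can be made concrete with a small sketch: a fixed pool runs many short tasks on a handful of reused threads instead of paying the creation cost per task. The class name PoolReuseDemo and the method distinctWorkers are hypothetical, and the pool size and task count are arbitrary illustration values.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolReuseDemo {

    // Runs `tasks` short tasks on a fixed pool and returns how many distinct
    // threads actually executed them (never more than poolSize).
    static int distinctWorkers(int poolSize, int tasks) {
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < tasks; i++) {
                // Each task only records which pooled thread ran it.
                pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown();   // stop accepting work; queued tasks still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return workerNames.size();
    }

    public static void main(String[] args) {
        int used = distinctWorkers(4, 100);
        System.out.println("100 tasks ran on " + used + " pooled threads");
    }
}
```

One hundred tasks are served by at most four OS threads, so the per-thread creation cost is paid a few times rather than once per task.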

Java

// ── Thread creation cost vs process creation cost ────────────────────
// Timing thread creation:
long start = System.nanoTime();
Thread t = new Thread(() -> {});
t.start();
t.join();
long threadNanos = System.nanoTime() - start;
System.out.printf("Thread create+start+join: %.2f ms%n", threadNanos / 1e6);
// Typical: 0.05 – 0.5 ms

// Timing process creation:
long pStart = System.nanoTime();
Process p = new ProcessBuilder("true").start();  // Unix "true" — exits immediately
p.waitFor();
long processNanos = System.nanoTime() - pStart;
System.out.printf("Process create+wait: %.2f ms%n", processNanos / 1e6);
// Typical: 2 – 20 ms (much higher for "java -version": 80 – 500 ms)

// ── Failure isolation — thread death vs process death ─────────────────
// Thread death: only that thread stops; other threads continue
Thread fragile = new Thread(() -> {
    System.out.println("Thread starting");
    throw new RuntimeException("Thread failed");
});
fragile.setUncaughtExceptionHandler((thread, ex) ->
    System.err.println("Thread " + thread.getName() + " died: " + ex.getMessage())
);
fragile.start();
fragile.join();

// Main thread and all other threads continue after fragile dies:
System.out.println("Main thread still running");  // this always prints

// JVM-level failure kills everything; no thread can recover:
// Runtime.getRuntime().halt(1);  // terminates the JVM process and ALL threads immediately
// A native crash or an unrecoverable OutOfMemoryError ends the process the same way

// ── Inter-thread communication — shared memory, fast ─────────────────
// Threads share the heap — communication is a field write + field read:
class Message { volatile String text; }
Message msg = new Message();

Thread writer = new Thread(() -> {
    try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    msg.text = "hello from writer";
});
Thread reader = new Thread(() -> {
    while (msg.text == null) Thread.onSpinWait();  // busy-wait for demo
    System.out.println("Received: " + msg.text);
});

writer.start(); reader.start();
writer.join();  reader.join();
// No serialization, no network, no system call — just a heap write and read

// ── Inter-process communication — boundaries and cost ────────────────
// Processes must serialize data to cross the boundary:
// Option 1: Files (high latency, high durability)
// Option 2: Sockets / TCP (flexible, works across machines)
// Option 3: Pipes (fast, same machine only)
// Option 4: Shared memory segments (fast, complex setup, Linux/Unix)

// Simple pipe IPC between parent and child process:
ProcessBuilder childPB = new ProcessBuilder("cat");  // echoes stdin to stdout
childPB.redirectErrorStream(true);
Process child = childPB.start();

// Parent writes to child via pipe:
child.getOutputStream().write("hello process\n".getBytes());
child.getOutputStream().close();

// Parent reads child's output:
String reply = new String(child.getInputStream().readAllBytes());
System.out.println("Child replied: " + reply.trim());  // hello process

// ── Decision matrix ───────────────────────────────────────────────────
// Factor              │ Threads                      │ Processes
// ────────────────────┼──────────────────────────────┼────────────────────────────────────────
// Memory sharing      │ Shared heap                  │ Isolated (explicit IPC)
// Creation cost       │ ~0.1 ms                      │ ~5–500 ms
// Communication       │ Field read/write             │ Serialization + IPC
// Failure isolation   │ None (JVM crash = all die)   │ Strong (one crash ≠ others)
// Security boundary   │ None (same JVM)              │ OS process separation
// Typical use         │ Parallel computation         │ Microservices, plugins, fault tolerance

Related Topics in Multithreading

Thread Basics

A Java thread is an instance of java.lang.Thread that represents an independent path of execution within the JVM process. Every thread has a lifecycle (new, runnable, blocked, waiting, timed-waiting, and terminated) and a set of properties including its name, priority, daemon status, thread group, and uncaught exception handler. The Java memory model specifies what visibility guarantees exist between threads and when writes by one thread are guaranteed to be visible to another. Thread scheduling is controlled by the OS scheduler subject to hints from the JVM via thread priority; the JVM does not provide real-time scheduling guarantees. This entry covers the complete thread lifecycle and its state machine, thread properties and how they affect scheduling and JVM shutdown, the happens-before relationship and why it matters for visibility, daemon threads and their relationship to JVM shutdown, thread interruption as a cooperative cancellation mechanism, and the methods on Thread that every Java developer must understand.
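As a rough illustration of interruption as cooperative cancellation, the sketch below starts a worker that polls its interrupt flag and stops when asked. The class name InterruptDemo, the method cancelWorker, and the 10 ms sleep standing in for real work are all assumptions made for the example.

```java
public class InterruptDemo {

    // Starts a worker that runs until interrupted, then requests cancellation.
    // Returns true if the worker noticed the interrupt and terminated.
    static boolean cancelWorker() {
        boolean[] exitedLoop = {false};
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10);   // stand-in for real work
                } catch (InterruptedException e) {
                    // sleep() clears the interrupt flag when it throws;
                    // restore it so the loop condition can observe it.
                    Thread.currentThread().interrupt();
                }
            }
            exitedLoop[0] = true;
        }, "worker");
        worker.setDaemon(true);   // a daemon thread does not keep the JVM alive
        worker.start();
        worker.interrupt();       // a request only; the worker decides how to stop
        try {
            worker.join();        // join() makes the worker's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return exitedLoop[0] && !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("cancelled cleanly: " + cancelWorker());
    }
}
```

The interrupt does not stop the worker by force. It sets a flag (or wakes a blocking call with InterruptedException), and the worker's own code chooses to exit its loop.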

Creating Threads

Java provides three primary abstractions for defining the work a thread will execute: the Thread class itself (subclassed to override run()), the Runnable interface (a task with no return value and no checked exception), and the Callable interface (a task with a return value and a declared checked exception). Each represents a different contract between the task and the infrastructure that runs it. Thread subclassing couples the task definition to the execution mechanism and is the oldest and least flexible approach. Runnable decouples the task from the thread, allowing the same Runnable to be submitted to thread pools, scheduled executors, or wrapped in Thread objects. Callable extends that decoupling to include a return value and exception propagation: submitting a Callable to an executor returns a Future that allows the caller to retrieve the result or handle exceptions asynchronously. Understanding all three (their contracts, their limitations, and when to use each) is the foundation of concurrent programming in Java before reaching for higher-level constructs.
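The three styles can be compared side by side in a short sketch. The class name TaskStylesDemo, the nested Greeter class, and the returned strings are hypothetical illustration values; only Thread, Runnable, Callable, ExecutorService, and Future come from the JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskStylesDemo {

    // Style 1: subclass Thread; the task and the thread are one object.
    static class Greeter extends Thread {
        String result;
        @Override public void run() { result = "from Thread subclass"; }
    }

    // Hypothetical helper that waits for a thread without a checked exception.
    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    static String viaSubclass() {
        Greeter g = new Greeter();
        g.start();
        joinQuietly(g);            // join() makes g.result visible to this thread
        return g.result;
    }

    // Style 2: a Runnable is just a task; any thread or pool can run it.
    static String viaRunnable() {
        String[] out = new String[1];
        Runnable task = () -> out[0] = "from Runnable";
        Thread t = new Thread(task);
        t.start();
        joinQuietly(t);
        return out[0];
    }

    // Style 3: a Callable returns a value; the executor hands back a Future.
    static int viaCallable() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = () -> 6 * 7;
            Future<Integer> future = pool.submit(task);
            return future.get();   // blocks until the result is ready
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            // An exception thrown inside the task surfaces here as the cause.
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(viaSubclass());
        System.out.println(viaRunnable());
        System.out.println(viaCallable());
    }
}
```

Note that the Runnable version needs a side channel (the array) to hand back a result, whereas the Callable version returns it through the Future.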

Thread Lifecycle

The Java thread lifecycle is the complete sequence of states a thread passes through from the moment a Thread object is constructed to the moment its execution ends. Java defines six states in the Thread.State enum — NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, and TERMINATED — and the JVM transitions threads between these states in response to specific method calls, lock acquisitions, monitor notifications, timeouts, and exceptions. Each state has a precise meaning, a defined set of entry conditions, and a defined set of exit conditions. Understanding the lifecycle in full is prerequisite knowledge for diagnosing deadlocks, thread leaks, performance bottlenecks in thread dumps, and incorrect synchronization — all of which manifest as threads stuck in specific states. This entry covers every state in the lifecycle with its entry and exit conditions, all legal and illegal state transitions, how thread dumps represent each state, the interaction between lifecycle states and interruption, the effect of uncaught exceptions on lifecycle, and how to observe lifecycle transitions programmatically.
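One way to observe lifecycle transitions programmatically is Thread.getState(). The sketch below, with the hypothetical class name LifecycleDemo, parks a thread on a CountDownLatch so that it can be sampled in three of the six states; it is an illustration under those assumptions rather than a complete tour of the state machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class LifecycleDemo {

    // Records the state of one thread before start, while parked, and after exit.
    static List<Thread.State> observe() {
        List<Thread.State> seen = new ArrayList<>();
        CountDownLatch release = new CountDownLatch(1);

        Thread t = new Thread(() -> {
            try {
                release.await();   // parks with no timeout: WAITING
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        seen.add(t.getState());    // NEW: constructed but not started
        t.start();

        // Poll until the thread has parked on the latch.
        Thread.State s = t.getState();
        while (s != Thread.State.WAITING) {
            Thread.onSpinWait();
            s = t.getState();
        }
        seen.add(s);               // WAITING

        release.countDown();       // let the thread finish
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        seen.add(t.getState());    // TERMINATED: run() has returned
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe());   // [NEW, WAITING, TERMINATED]
    }
}
```

RUNNABLE is deliberately not sampled here because the thread passes through it too quickly to observe reliably; BLOCKED and TIMED_WAITING would need a contended monitor and a timed wait respectively.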

Thread Priority

Thread priority in Java is an integer hint to the OS scheduler indicating the relative importance of a thread compared to others. Java defines a scale from Thread.MIN_PRIORITY (1) to Thread.MAX_PRIORITY (10) with Thread.NORM_PRIORITY (5) as the default, and every thread inherits the priority of the thread that created it. The critical word in this definition is hint: thread priority is advisory, not mandatory. The JVM maps Java priorities to native OS thread priorities, and the OS scheduler uses those priorities according to its own scheduling policy, which varies by operating system, scheduler configuration, and system load. On some platforms, priority has a measurable effect on scheduling frequency; on others, it is almost entirely ignored. Priority must never be used as a correctness mechanism — any program that requires a thread to run before another for correctness, rather than merely preferring it, is broken and will fail on any platform where priorities are not honored. This entry covers the priority scale, inheritance, platform mapping, the correctness prohibition, starvation, priority inversion, and the narrow set of cases where priority hints are legitimately useful.
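Priority inheritance and the priority scale can be checked with a few lines. In this sketch the class name PriorityDemo and the method inheritedPriority are hypothetical, and the value 7 is an arbitrary example; it assumes the thread group's maximum priority has not been lowered, since setPriority is capped by that maximum.

```java
public class PriorityDemo {

    // Creates a parent thread with the given priority and reports the priority
    // of a child thread constructed inside it.
    static int inheritedPriority(int parentPriority) {
        int[] childPriority = new int[1];
        Thread parent = new Thread(() -> {
            Thread child = new Thread(() -> { });   // constructed here, never started
            childPriority[0] = child.getPriority(); // taken from the creating thread
        });
        parent.setPriority(parentPriority);         // a scheduling hint, nothing more
        parent.start();
        try {
            parent.join();                          // makes childPriority[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return childPriority[0];
    }

    public static void main(String[] args) {
        System.out.println("scale: " + Thread.MIN_PRIORITY + ".." + Thread.MAX_PRIORITY
                + ", default " + Thread.NORM_PRIORITY);
        System.out.println("child of a priority-7 thread: " + inheritedPriority(7));
    }
}
```

The example shows only what the Java API records. Whether the OS scheduler treats a priority-7 thread any differently from a priority-5 thread is platform-dependent, which is why priority must not be relied on for correctness.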
