
Thread Pool

A thread pool is a managed collection of reusable worker threads that execute submitted tasks from a shared work queue, which avoids creating and destroying a thread for every unit of work. Starting a platform thread typically costs on the order of 50–500 microseconds and reserves stack memory (around 1 MB per thread by default on common 64-bit JVMs, configurable with -Xss); a thread pool amortizes this cost by reusing the same threads across thousands of tasks. ThreadPoolExecutor is the central class in java.util.concurrent that implements thread pool behavior, exposing fine-grained control over core thread count, maximum thread count, idle timeout, work queue type and capacity, thread factory, and rejection policy.

Understanding how ThreadPoolExecutor works internally (how threads are created, how the queue interacts with the maximum thread count, how idle threads are terminated) is essential for configuring pools correctly for CPU-bound versus I/O-bound workloads, diagnosing thread pool exhaustion and starvation, and tuning pool parameters for observed workloads. This entry covers the complete ThreadPoolExecutor configuration parameters and their interactions, the exact algorithm used to decide when to create threads versus queue tasks versus reject, work queue types and their performance characteristics, pool sizing formulas for CPU-bound and I/O-bound tasks, common pathologies (thread starvation, queue unboundedness, idle thread overhead), and the relationship between thread pools and virtual threads in Java 21.
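
The reuse described above can be observed directly. The following is a minimal sketch, not a benchmark: the class name ThreadReuseDemo and the helper distinctWorkerThreads are illustrative names, and the pool size of 4 and task count of 1,000 are arbitrary. It records which threads actually ran the tasks.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ThreadReuseDemo {

    /** Runs taskCount trivial tasks on a fixed pool and returns how many distinct threads executed them. */
    static int distinctWorkerThreads(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < taskCount; i++) {
            // Each task only records the name of the thread that ran it
            pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();   // stop accepting new tasks; already queued tasks still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return workerNames.size();
    }

    public static void main(String[] args) {
        int threads = distinctWorkerThreads(4, 1_000);
        System.out.println("1000 tasks ran on " + threads + " thread(s)");
    }
}
```

The number of distinct threads never exceeds the pool size, however many tasks are submitted; that bound is the amortization the paragraph describes.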

ThreadPoolExecutor Internals — Configuration and the Thread Creation Algorithm

ThreadPoolExecutor is constructed with seven parameters that fully determine its behavior: corePoolSize, maximumPoolSize, keepAliveTime, TimeUnit, BlockingQueue<Runnable>, ThreadFactory, and RejectedExecutionHandler. Understanding each parameter and how they interact is the foundation of all thread pool configuration.

corePoolSize is the number of threads the pool keeps alive even when idle, unless allowCoreThreadTimeOut(true) is set. Threads are not pre-created at construction time (unless prestartAllCoreThreads() is called); they are created on demand as tasks are submitted. Once corePoolSize threads exist, new tasks are placed in the work queue instead of triggering additional threads, even if all core threads are busy.

maximumPoolSize is the upper bound on the number of threads in the pool. Threads beyond corePoolSize are created only when the work queue is full and there is still work to be done. These extra threads are called non-core threads. Once a non-core thread has been idle for keepAliveTime units, it terminates. This creates burst capacity: under sustained load, up to corePoolSize threads handle the work; under sudden spikes, the queue fills, additional threads up to maximumPoolSize are created, and those extra threads terminate after the spike subsides.

The thread creation algorithm is the most commonly misunderstood aspect of ThreadPoolExecutor. When a task is submitted: if the number of running threads is less than corePoolSize, a new thread is created to handle the task immediately, even if other core threads are idle. If the number of running threads is at or above corePoolSize, the task is offered to the work queue. If the queue accepts the task (offer() returns true), no new thread is created. If the queue is full (offer() returns false), a new thread is created if the current thread count is below maximumPoolSize. If the queue is full and the thread count is at maximumPoolSize, the rejection handler is invoked.

This algorithm has a critical implication: maximumPoolSize is only relevant when the queue is full. For an unbounded queue (LinkedBlockingQueue with no capacity limit), the queue never fills, so maximumPoolSize is never reached: the pool never grows beyond corePoolSize threads regardless of the maximumPoolSize parameter. This is why Executors.newFixedThreadPool() sets corePoolSize = maximumPoolSize and uses an unbounded queue: maximumPoolSize is irrelevant but harmless.

keepAliveTime and the TimeUnit determine how long a non-core thread may be idle before being terminated. Core threads are exempt from this timeout by default. Calling allowCoreThreadTimeOut(true) applies the timeout to core threads as well (it requires a keepAliveTime greater than zero), allowing the pool to shrink to zero under sustained idleness. This is appropriate for pools that handle very bursty traffic with long quiet periods.
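
The four-way decision described above can be written down as a small pure function. This is a simplified model for reasoning only, not the JDK's implementation: it ignores shutdown state and the re-checks that ThreadPoolExecutor performs after enqueueing, and the names SubmissionDecisionSketch, Decision, and decide are illustrative.

```java
class SubmissionDecisionSketch {

    enum Decision { CREATE_CORE_THREAD, ENQUEUE, CREATE_NON_CORE_THREAD, REJECT }

    /** Simplified model of what happens to one submitted task, given the pool's current state. */
    static Decision decide(int poolSize, int corePoolSize, int maximumPoolSize,
                           int queuedTasks, int queueCapacity) {
        if (poolSize < corePoolSize) {
            return Decision.CREATE_CORE_THREAD;      // even if other core threads are idle
        }
        if (queuedTasks < queueCapacity) {
            return Decision.ENQUEUE;                 // queue.offer() would succeed
        }
        if (poolSize < maximumPoolSize) {
            return Decision.CREATE_NON_CORE_THREAD;  // queue full, pool still has room to grow
        }
        return Decision.REJECT;                      // queue full and pool at its maximum
    }

    public static void main(String[] args) {
        // core=4, max=16, queue capacity=200
        System.out.println(decide(0, 4, 16, 0, 200));      // CREATE_CORE_THREAD
        System.out.println(decide(4, 4, 16, 0, 200));      // ENQUEUE
        System.out.println(decide(4, 4, 16, 200, 200));    // CREATE_NON_CORE_THREAD
        System.out.println(decide(16, 4, 16, 200, 200));   // REJECT
        // Unbounded queue (capacity Integer.MAX_VALUE): the pool never grows past core
        System.out.println(decide(4, 4, 16, 1_000_000, Integer.MAX_VALUE));  // ENQUEUE
    }
}
```

The last call illustrates the unbounded-queue implication: with an effectively infinite capacity, the third and fourth branches are unreachable.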
Java
// ── Full ThreadPoolExecutor construction ─────────────────────────────
ThreadPoolExecutor pool = new ThreadPoolExecutor(
    4,                          // corePoolSize: always keep 4 threads
    16,                         // maximumPoolSize: allow bursts up to 16
    60L,                        // keepAliveTime: idle non-core threads live 60...
    TimeUnit.SECONDS,           // ...seconds before termination
    new ArrayBlockingQueue<>(200),  // work queue: bounded, holds up to 200 tasks
    new NamedThreadFactory("api-worker", false),  // custom ThreadFactory (application class, not part of the JDK)
    new ThreadPoolExecutor.CallerRunsPolicy()     // backpressure on rejection
);

// ── Thread creation algorithm — step by step ─────────────────────────
// Pool starts: 0 threads, corePoolSize=4, maxPoolSize=16, queue capacity=200

// Submit task 1: threads(0) < corePoolSize(4) → CREATE thread-1, run task-1
// Submit task 2: threads(1) < corePoolSize(4) → CREATE thread-2, run task-2
// Submit task 3: threads(2) < corePoolSize(4) → CREATE thread-3, run task-3
// Submit task 4: threads(3) < corePoolSize(4) → CREATE thread-4, run task-4
// Submit task 5: threads(4) >= corePoolSize(4) → QUEUE task-5 (queue size: 1)
// ...
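// (assuming no task finishes in the meantime)
// Submit task 204: threads(4) >= core → QUEUE task-204 (queue size: 200, now FULL)
// Submit task 205: threads(4) >= core, queue FULL → threads(4) < max(16) → CREATE thread-5
// Submit task 206: threads(5) >= core, queue FULL, threads < max → CREATE thread-6
// ...
// Submit task 216: threads(15) < max(16), queue FULL → CREATE thread-16
// Submit task 217: threads(16) = max, queue FULL → REJECT (CallerRunsPolicy: run in caller)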

// Verify this behavior empirically:
ThreadPoolExecutor observable = new ThreadPoolExecutor(
    2, 6, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(4)
);

CountDownLatch taskLatch = new CountDownLatch(1);

// Submit 10 long-running tasks: 2 core + 4 queue + 4 extra threads:
for (int i = 1; i <= 10; i++) {
    int id = i;
    try {
        observable.execute(() -> {
            System.out.printf("Task %d on %s (pool=%d active=%d queue=%d)%n",
                id, Thread.currentThread().getName(),
                observable.getPoolSize(), observable.getActiveCount(),
                observable.getQueue().size());
            try { taskLatch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
    } catch (RejectedExecutionException e) {
        System.out.println("Task " + id + " REJECTED");
    }
}

Thread.sleep(200);
System.out.printf("Final: pool=%d active=%d queued=%d%n",
    observable.getPoolSize(), observable.getActiveCount(), observable.getQueue().size());
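// pool=6, active=6, queued=4  (tasks 1-2 on core threads, 3-6 queued, 7-10 on extra threads; none rejected)
// An 11th task would be rejected: this pool uses the default AbortPolicy, which throws RejectedExecutionException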

taskLatch.countDown();
observable.shutdown();

// ── allowCoreThreadTimeOut — let pool shrink to zero ─────────────────
ThreadPoolExecutor shrinkable = new ThreadPoolExecutor(
    4, 4, 10L, TimeUnit.SECONDS, new LinkedBlockingQueue<>()
);
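shrinkable.allowCoreThreadTimeOut(true);   // core threads also time out when idle (needs keepAliveTime > 0)
shrinkable.submit(() -> System.out.println("Quick task"));  // creates 1 thread on demand, not 4
// Left idle, that thread terminates after 10 seconds and the pool shrinks to 0 threads
shrinkable.shutdown();   // call only when the pool is no longer needed; it ends the pool right away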

// ── prestartAllCoreThreads — eager thread creation ────────────────────
ThreadPoolExecutor eager = new ThreadPoolExecutor(
    4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>()
);
System.out.println("Before prestart: " + eager.getPoolSize());  // 0
eager.prestartAllCoreThreads();
System.out.println("After prestart:  " + eager.getPoolSize());  // 4
// Useful when first-task latency matters and thread creation delay is unacceptable
eager.shutdown();
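
The claim that an unbounded queue makes maximumPoolSize irrelevant can be checked with a small experiment. The sketch below is illustrative: the class name UnboundedQueueDemo and the numbers (core 2, max 10, 50 tasks) are arbitrary choices, and the tasks block on a latch so that the pool state can be read while all work is still pending.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class UnboundedQueueDemo {

    /** Submits taskCount blocking tasks and returns {poolSize, queuedTasks} observed while they block. */
    static int[] observe(int core, int max, int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            core, max, 30L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>());          // no capacity argument: effectively unbounded
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    try {
                        release.await();           // hold the worker so later tasks must queue
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown();                   // let the blocked and queued tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] observed = observe(2, 10, 50);
        System.out.printf("pool=%d queued=%d%n", observed[0], observed[1]);  // pool=2 queued=48
    }
}
```

Even with 48 tasks waiting and a maximumPoolSize of 10, the pool stays at 2 threads, because the queue never reports itself full.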

Work Queue Types and Their Performance Characteristics

The work queue is the most architecturally significant parameter in ThreadPoolExecutor. The queue type and capacity determine how tasks wait when all core threads are busy, how backpressure is applied, and whether the pool's maximumPoolSize is ever relevant.

SynchronousQueue is a handoff queue with no capacity: every offer() must be matched by a thread that is already waiting to take an element. If no thread is waiting when a task is offered, offer() fails immediately and the pool must either create a new thread or reject the task. Combined with a corePoolSize of 0 and a maximumPoolSize of Integer.MAX_VALUE, this is the configuration of newCachedThreadPool(): tasks are never buffered; each one is either handed to a waiting thread or given a freshly created thread. SynchronousQueue suits short-lived tasks because nothing is buffered and a task never waits behind other queued tasks, but with a very large maximumPoolSize the thread count is effectively unbounded under load.

LinkedBlockingQueue is an optionally-bounded queue backed by linked nodes. When constructed without a capacity argument (new LinkedBlockingQueue()), its capacity is Integer.MAX_VALUE, which is effectively unbounded. Unbounded queues never reject tasks for lack of space and never trigger thread creation beyond corePoolSize, making maximumPoolSize irrelevant. They are dangerous in systems where the submission rate can exceed the processing rate for extended periods, because the queue grows without bound, consuming heap memory until an OutOfMemoryError occurs. Always specify a capacity for production pools: new LinkedBlockingQueue<>(10_000). LinkedBlockingQueue has separate locks for head and tail operations, which typically gives it higher throughput than ArrayBlockingQueue when producers and consumers run concurrently.

ArrayBlockingQueue is a bounded queue backed by a fixed-size array. It has a single lock for both enqueue and dequeue operations, which makes it simpler but typically gives it slightly lower throughput than LinkedBlockingQueue under high concurrency. Its strict capacity bound is an advantage in systems where backpressure and task rejection must be enforced. Because elements are stored directly in a preallocated array, no node object is allocated per enqueued task, which can reduce GC pressure.

PriorityBlockingQueue is an unbounded queue that orders tasks by priority (natural ordering via Comparable, or a Comparator provided at construction). Tasks are dequeued in priority order, not FIFO. This is useful for priority-aware work queues where high-priority tasks (user-facing requests) should run ahead of queued low-priority tasks (background jobs) when threads are scarce; a task that is already running is never preempted. Tasks must be mutually comparable, which in practice means passing them to execute() rather than submit(), because submit() wraps each task in a FutureTask that is not Comparable. Because the queue is unbounded, it shares the memory growth risk of an unbounded LinkedBlockingQueue.

DelayQueue holds elements that implement Delayed and only makes them available after their delay expires. ScheduledThreadPoolExecutor applies the same idea through its own specialized internal delay queue rather than DelayQueue itself. Direct use of DelayQueue in a ThreadPoolExecutor is uncommon; it requires tasks that implement both Runnable and Delayed.
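
To make the DelayQueue behavior concrete, here is a minimal standalone sketch that uses the queue on its own, outside any executor. The element type DelayedItem, its field names, and the delays of 50, 150, and 300 milliseconds are illustrative assumptions, not anything taken from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class DelayQueueDemo {

    /** Hypothetical element type: a named item that becomes available after delayMillis. */
    static final class DelayedItem implements Delayed {
        final String name;
        final long readyAtNanos;

        DelayedItem(String name, long delayMillis) {
            this.name = name;
            this.readyAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        @Override public long getDelay(TimeUnit unit) {
            // Remaining delay; zero or negative means the item is ready to be taken
            return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    /** Adds three items out of order and returns the order in which take() releases them. */
    static List<String> drainInExpiryOrder() {
        DelayQueue<DelayedItem> queue = new DelayQueue<>();
        queue.add(new DelayedItem("slow", 300));
        queue.add(new DelayedItem("fast", 50));
        queue.add(new DelayedItem("medium", 150));
        List<String> order = new ArrayList<>();
        try {
            for (int i = 0; i < 3; i++) {
                order.add(queue.take().name);   // blocks until the head's delay has expired
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(drainInExpiryOrder());   // [fast, medium, slow]
    }
}
```

Insertion order does not matter: take() always releases the element whose delay expires first, and only once it has expired.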
Java
// ── SynchronousQueue — no buffering, immediate handoff ───────────────
// newCachedThreadPool() uses SynchronousQueue with Integer.MAX_VALUE max threads:
ThreadPoolExecutor syncPool = new ThreadPoolExecutor(
    0, Integer.MAX_VALUE,
    60L, TimeUnit.SECONDS,
    new SynchronousQueue<>()    // no buffering — task must find a thread immediately
);

// Under load: each submitted task either finds an idle thread or creates a new one:
for (int i = 0; i < 5; i++) {
    syncPool.execute(() -> {
        try { Thread.sleep(100); } catch (InterruptedException e) {}
    });
}
Thread.sleep(50);
System.out.println("Cached pool threads: " + syncPool.getPoolSize());  // 5 (one per task)
syncPool.shutdown();

// ── LinkedBlockingQueue bounded — production-safe ─────────────────────
ThreadPoolExecutor linkedBounded = new ThreadPoolExecutor(
    4, 4,                           // fixed size: core == max, so the pool never grows past 4
    0L, TimeUnit.MILLISECONDS,
    new LinkedBlockingQueue<>(500)  // ALWAYS specify capacity in production
);
// With all 4 threads busy and 500 tasks queued, the next task is rejected
// (default AbortPolicy throws RejectedExecutionException): explicit backpressure instead of OOM

// ── LinkedBlockingQueue unbounded — DANGEROUS in production ───────────
// DO NOT use in production without queue depth monitoring and circuit breakers:
ThreadPoolExecutor risky = new ThreadPoolExecutor(
    4, 4, 0L, TimeUnit.MILLISECONDS,
    new LinkedBlockingQueue<>()   // Integer.MAX_VALUE capacity — grows forever
);
// Under sustained overload: risky.getQueue().size() → millions → OutOfMemoryError

// ── ArrayBlockingQueue — strict bound, good for backpressure ──────────
ThreadPoolExecutor arrayPool = new ThreadPoolExecutor(
    2, 8,                            // 2 core, burst to 8
    30L, TimeUnit.SECONDS,
    new ArrayBlockingQueue<>(50),    // max 50 queued tasks; extra threads start only once it is full
    Executors.defaultThreadFactory(),
    new ThreadPoolExecutor.CallerRunsPolicy()
);
// Queue fills at 50 tasks → new threads created up to 8 → then CallerRunsPolicy kicks in

// ── PriorityBlockingQueue — priority-ordered execution ────────────────
class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    final int priority;
    final String name;
    PrioritizedTask(int priority, String name) {
        this.priority = priority;
        this.name = name;
    }
    @Override public void run() { System.out.println("Running: " + name + " (p=" + priority + ")"); }
    @Override public int compareTo(PrioritizedTask o) {
        return Integer.compare(o.priority, this.priority);  // higher priority = dequeued first
    }
}

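ThreadPoolExecutor priorityPool = new ThreadPoolExecutor(
    1, 1, 0L, TimeUnit.MILLISECONDS,
    new PriorityBlockingQueue<>()
);

// Occupy the single worker first. A task that starts a new thread bypasses the queue,
// so priority ordering only applies to tasks that actually wait in the queue:
CountDownLatch gate = new CountDownLatch(1);
priorityPool.execute(() -> {
    try { gate.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
});

// Use execute(), not submit(): submit() wraps tasks in a FutureTask, which is not Comparable.
// Submit low-priority first, then high-priority:
priorityPool.execute(new PrioritizedTask(1, "background-job"));
priorityPool.execute(new PrioritizedTask(5, "user-request"));
priorityPool.execute(new PrioritizedTask(3, "analytics"));
priorityPool.execute(new PrioritizedTask(5, "another-user-request"));

gate.countDown();   // release the worker; it now drains the queue in priority order

// Execution order: the two priority-5 tasks first (their relative order is not guaranteed),
// then analytics (3), then background-job (1)
priorityPool.shutdown();
priorityPool.awaitTermination(5, TimeUnit.SECONDS);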

// ── Queue depth monitoring — essential for production ─────────────────
ThreadPoolExecutor prod = new ThreadPoolExecutor(
    8, 8, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(1000)
);

// Export queue depth as a metric every 10 seconds:
ScheduledExecutorService metrics = Executors.newSingleThreadScheduledExecutor();
metrics.scheduleAtFixedRate(() -> {
    int queueDepth = prod.getQueue().size();
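    int poolSize = prod.getPoolSize();
    // Guard against 0/0 (NaN) before the pool has started any thread:
    double utilization = poolSize == 0 ? 0.0 : (double) prod.getActiveCount() / poolSize;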
    System.out.printf("[METRICS] queue=%d utilization=%.1f%%%n",
        queueDepth, utilization * 100);
    if (queueDepth > 800) {
        System.err.println("[ALERT] Queue depth critical: " + queueDepth);
    }
}, 0, 10, TimeUnit.SECONDS);
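
Because the pool's grow-or-reject decision hinges on whether offer() succeeds, it can help to see how each queue type answers offer() when no consumer is waiting. The sketch below is illustrative; the class name QueueOfferDemo and the helper acceptedWithoutConsumer are made-up names, and the capacity of 3 and the 5 attempts are arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

class QueueOfferDemo {

    /** Offers `attempts` tasks with no consumer present and counts how many the queue accepts. */
    static int acceptedWithoutConsumer(BlockingQueue<Runnable> queue, int attempts) {
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            if (queue.offer(() -> { })) {   // non-blocking: returns false instead of waiting
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // No thread is waiting to take, so every handoff fails
        System.out.println("SynchronousQueue:      " + acceptedWithoutConsumer(new SynchronousQueue<>(), 5));    // 0
        // Accepts until the fixed capacity is reached
        System.out.println("ArrayBlockingQueue(3): " + acceptedWithoutConsumer(new ArrayBlockingQueue<>(3), 5)); // 3
        // Effectively unbounded: accepts everything
        System.out.println("LinkedBlockingQueue(): " + acceptedWithoutConsumer(new LinkedBlockingQueue<>(), 5)); // 5
    }
}
```

A failed offer() is what pushes ThreadPoolExecutor toward creating a non-core thread or rejecting; a queue that always accepts never triggers either.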

Pool Sizing, Pathologies, and Virtual Threads

Thread pool sizing is the most consequential and most commonly misconfigured aspect of thread pool deployment. The correct size depends critically on whether the workload is CPU-bound or I/O-bound, and getting the answer wrong in either direction causes significant performance degradation.

For CPU-bound tasks (computation, in-memory processing, mathematical operations with no I/O), the optimal pool size is close to the number of available CPU cores: N = Runtime.getRuntime().availableProcessors(). More threads than cores means threads compete for CPU time, and the resulting context switches and cache thrashing reduce throughput. Fewer threads leave CPU cores idle. The classic formula is N + 1 (one extra thread keeps a core busy when another thread stalls on a page fault or a similar brief pause), but N is usually close enough. Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()) gives the N-thread configuration.

For I/O-bound tasks (database queries, HTTP calls, file reads, any operation that blocks waiting for external systems), threads spend most of their time blocked and not using the CPU. The optimal pool size is much larger than the number of CPU cores. The standard formula is N_threads = N_cores × (1 + wait_time / compute_time), where wait_time is the average time a task spends blocking and compute_time is the average time a task spends on CPU. If tasks spend 90% of their time blocked on I/O (wait_time/compute_time = 9), and there are 8 cores, the formula gives 8 × 10 = 80 threads. In practice, the formula is a starting point and the ideal size is measured empirically under realistic load, because wait_time/compute_time varies with network conditions, database load, and request mix.

Thread pool starvation (also called thread-starvation deadlock) occurs when all threads in a pool are blocked waiting for the results of other tasks that were submitted to the same pool. If tasks A and B are both running on a pool of 2 threads, and each submits a subtask (C and D) to the same pool and then blocks waiting for its result, C and D will never execute: both threads are occupied by A and B, which are waiting on tasks that sit in the queue with no thread to run them. The solution is to use separate pools for tasks that depend on each other, or to use CompletableFuture's composition operators, which do not block pool threads while waiting.

Virtual threads (Java 21) change the calculus of thread pool sizing for I/O-bound workloads. A virtual thread that blocks on I/O is unmounted from its carrier thread, freeing the carrier to run other virtual threads. This means the effective number of concurrent I/O operations is no longer limited by the pool size; it is limited by the number of virtual threads in flight and the capacity of the underlying I/O subsystem. For I/O-bound work, Executors.newVirtualThreadPerTaskExecutor() creates one virtual thread per task with no pooling, and the JVM schedules them across a small number of carrier threads (by default equal to the CPU count). For CPU-bound work, virtual threads provide no advantage because unmounting only helps during blocking: a CPU-bound task does not block, so it does not unmount.
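
The I/O-bound formula above is simple enough to wrap in a helper. The sketch below is one possible encoding, with assumptions stated: the name PoolSizing.ioBoundThreads is made up, the inputs are per-task averages the caller is assumed to have measured, and rounding up is a choice made here rather than part of the formula.

```java
class PoolSizing {

    /**
     * threads = cores * (1 + wait/compute), rounded up and never below 1.
     * waitMillis and computeMillis are per-task averages measured by the caller.
     */
    static int ioBoundThreads(int cores, double waitMillis, double computeMillis) {
        if (cores < 1 || waitMillis < 0 || computeMillis <= 0) {
            throw new IllegalArgumentException("need cores >= 1, wait >= 0, compute > 0");
        }
        double ratio = waitMillis / computeMillis;
        return Math.max(1, (int) Math.ceil(cores * (1 + ratio)));
    }

    public static void main(String[] args) {
        System.out.println(ioBoundThreads(8, 90, 10));    // 90% blocked: 8 * (1 + 9)  = 80
        System.out.println(ioBoundThreads(8, 950, 50));   // 95% blocked: 8 * (1 + 19) = 160
        System.out.println(ioBoundThreads(8, 0, 10));     // no blocking: 8 * (1 + 0)  = 8
    }
}
```

With a wait time of zero the formula collapses to the CPU-bound answer of one thread per core, which is a useful sanity check on any measured inputs.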
Java
// ── CPU-bound sizing — cores + 1 ─────────────────────────────────────
int cores = Runtime.getRuntime().availableProcessors();
ExecutorService cpuPool = new ThreadPoolExecutor(
    cores + 1, cores + 1,        // fixed at cores + 1: the pool never grows beyond that
    0L, TimeUnit.MILLISECONDS,
    new LinkedBlockingQueue<>(10_000),
    new NamedThreadFactory("cpu-worker", false),
    new ThreadPoolExecutor.CallerRunsPolicy()
);

// Parallel computation — each task gets a full core:
List<Future<Long>> cpuFutures = new ArrayList<>();
for (int i = 0; i < cores * 2; i++) {
    final int from = i * 500_000;
    cpuFutures.add(cpuPool.submit(() -> {
        long sum = 0;
        for (int j = from; j < from + 500_000; j++) sum += j;
        return sum;
    }));
}
long total = 0;
for (Future<Long> f : cpuFutures) total += f.get();
System.out.println("Sum: " + total);
cpuPool.shutdown();

// ── I/O-bound sizing — cores × (1 + wait/compute) ────────────────────
// Measured: tasks spend ~95% blocking on DB (wait=950ms, compute=50ms)
// wait/compute = 19; cores = 8; optimal = 8 × 20 = 160 threads
// Start with formula, tune empirically:
ExecutorService ioPool = new ThreadPoolExecutor(
    160, 160,
    60L, TimeUnit.SECONDS,
    new LinkedBlockingQueue<>(5000),
    new NamedThreadFactory("db-worker", false),
    new ThreadPoolExecutor.AbortPolicy()
);

// ── Thread pool starvation — the deadlock pattern ─────────────────────
ExecutorService tinyPool = Executors.newFixedThreadPool(2);
CountDownLatch bothRunning = new CountDownLatch(2);

// Each outer task submits an inner task to the SAME pool and blocks on its result:
Callable<String> outerTask = () -> {
    bothRunning.countDown();
    bothRunning.await();          // wait until both pool threads are held by outer tasks
    Future<String> inner = tinyPool.submit(() -> "inner done");
    return inner.get();           // DEADLOCK: inner is queued, but no thread is free to run it
};
Future<String> outerA = tinyPool.submit(outerTask);
Future<String> outerB = tinyPool.submit(outerTask);

try {
    System.out.println("Outer got: " + outerA.get(2, TimeUnit.SECONDS));
} catch (TimeoutException e) {
    System.out.println("Starved: both threads are blocked waiting for queued inner tasks");
}
tinyPool.shutdownNow();   // interrupts the blocked outer tasks so the pool can terminate

// The inner tasks can NEVER run: both threads are blocked waiting for them
// SOLUTION: use a separate pool for inner tasks, or compose with CompletableFuture.thenCompose()

// ── CompletableFuture avoids starvation (no blocking pool threads) ────
ExecutorService composePool = Executors.newFixedThreadPool(2);
CompletableFuture.supplyAsync(() -> "outer result", composePool)
    .thenComposeAsync(r ->
        CompletableFuture.supplyAsync(() -> r + " + inner result", composePool),
        composePool   // thenComposeAsync doesn't block: the thread is released while inner runs
    )
    .thenAccept(System.out::println)
    .join();          // wait on the calling thread (not a pool thread) before shutting down

composePool.shutdown();

// ── Virtual threads — Java 21, for I/O-bound workloads ───────────────
// No thread pool needed for I/O-bound work with virtual threads:
try (ExecutorService vThreadExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<String>> vFutures = new ArrayList<>();
    for (int i = 0; i < 10_000; i++) {
        int id = i;
        vFutures.add(vThreadExecutor.submit(() -> {
            Thread.sleep(100);   // stands in for blocking I/O: the virtual thread unmounts and frees its carrier
            return "result-" + id;
        }));
    }
    // All 10,000 tasks run concurrently on a handful of carrier threads:
    long done = vFutures.stream().filter(f -> {
        try { f.get(); return true; } catch (Exception e) { return false; }
    }).count();
    System.out.println("Completed: " + done);  // 10,000
}
// With platform threads: 10,000 threads × ~1 MB default stack reserves roughly 10 GB of address space, usually impractical
// With virtual threads: 10,000 virtual threads share a handful of carrier threads
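
The other remedy for starvation mentioned above, a separate pool for dependent tasks, can be sketched the same way. This is an illustrative comparison, not a recommended production structure: the class name SeparatePoolsDemo, the helper outersComplete, the pool sizes of 2, and the 2-second timeout are all assumptions chosen to keep the example small.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SeparatePoolsDemo {

    /**
     * Runs two outer tasks that each block on an inner task.
     * Returns true if both outer tasks finish within the timeout.
     */
    static boolean outersComplete(boolean separateInnerPool) {
        ExecutorService outerPool = Executors.newFixedThreadPool(2);
        ExecutorService innerPool = separateInnerPool ? Executors.newFixedThreadPool(2) : outerPool;
        CountDownLatch bothStarted = new CountDownLatch(2);
        Callable<String> outer = () -> {
            bothStarted.countDown();
            bothStarted.await();   // both outer tasks now hold outerPool's two threads
            return innerPool.submit(() -> "inner done").get();
        };
        try {
            Future<String> a = outerPool.submit(outer);
            Future<String> b = outerPool.submit(outer);
            a.get(2, TimeUnit.SECONDS);
            b.get(2, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;          // starved: the inner tasks never got a thread
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            outerPool.shutdownNow();   // interrupts any outer task that is still blocked
            innerPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("separate pools complete: " + outersComplete(true));   // true
        System.out.println("shared pool completes:   " + outersComplete(false));  // false (after the timeout)
    }
}
```

With a dedicated inner pool, the blocked outer tasks cannot consume the threads their subtasks need; with a shared pool of the same size, the identical code times out.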
