☕ Java

Spliterator

Spliterator (Splittable Iterator) is Java 8's mechanism for parallel and sequential element traversal of sources such as collections, arrays, I/O channels, and generators. Where Iterator is purely sequential and has no concept of size or characteristics, Spliterator carries metadata about its source — estimated size, encounter order, element uniqueness, nullability, and more — enabling the Stream API and fork/join framework to make intelligent decisions about parallelism. A Spliterator can split itself into two halves via trySplit(), allowing one half to be processed on a separate thread while the other continues locally. This entry covers all Spliterator methods, the eight characteristic flags, trySplit semantics and balancing, the four functional traversal methods, how to implement a custom Spliterator, and how the Stream pipeline uses Spliterators under the hood.

Spliterator Methods and Traversal

Spliterator<T> defines four traversal methods and two size-query methods. The primary traversal method tryAdvance(Consumer<? super T> action) processes the next element if one exists, returning true, or returns false if no elements remain. It is the Spliterator analogue of Iterator's hasNext/next pair collapsed into a single call. forEachRemaining(Consumer<? super T> action) processes all remaining elements sequentially — the default implementation repeatedly calls tryAdvance, but implementations override it for efficiency (ArrayList's Spliterator, for example, uses a plain array loop). The splitting method trySplit() attempts to partition the remaining elements into two roughly equal halves, returning a new Spliterator covering approximately the first half while the receiver retains the second half. If the source cannot be split further (too few elements, or the source is inherently non-splittable), trySplit() returns null. The returned Spliterator need not cover exactly half — the contract only requires that both halves together cover exactly the same elements as the original, without overlap or omission. estimateSize() returns an estimate of the number of elements remaining. If the SIZED characteristic is set, this estimate is exact. getExactSizeIfKnown() is a convenience method that returns estimateSize() if SIZED is set, or -1 otherwise. The characteristics() method returns an int bitmask of characteristic flags. hasCharacteristics(int flags) tests whether specific flags are set. The Stream infrastructure calls these methods before deciding whether to parallelize, whether to sort, whether to deduplicate, and how to size result containers.
Java
// ── Basic Spliterator usage ───────────────────────────────────────────
List<String> list = List.of("alpha", "beta", "gamma", "delta", "epsilon");
Spliterator<String> spliterator = list.spliterator();

System.out.println(spliterator.estimateSize());         // 5 (exact, SIZED is set)
System.out.println(spliterator.getExactSizeIfKnown());  // 5

// tryAdvance — process one element at a time
spliterator.tryAdvance(s -> System.out.println("First: " + s));  // First: alpha

// forEachRemaining — consume all remaining elements
spliterator.forEachRemaining(s -> System.out.println("Rest: " + s));
// Rest: beta
// Rest: gamma
// Rest: delta
// Rest: epsilon

// ── trySplit — manual parallel split ─────────────────────────────────
List<Integer> numbers = new ArrayList<>(
    List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));

Spliterator<Integer> original = numbers.spliterator();
Spliterator<Integer> prefix   = original.trySplit();  // ~first half

System.out.println("prefix size:   " + prefix.estimateSize());    // 5
System.out.println("remaining size:" + original.estimateSize());  // 5

// Process each half independently (could be on separate threads):
prefix.forEachRemaining(n -> System.out.print(n + " "));   // 1 2 3 4 5
System.out.println();
original.forEachRemaining(n -> System.out.print(n + " ")); // 6 7 8 9 10

// ── Characteristics bitmask ───────────────────────────────────────────
Spliterator<String> arraySplit = Arrays.asList("x", "y", "z").spliterator();
int chars = arraySplit.characteristics();

System.out.println(Integer.toBinaryString(chars));   // shows set flags

System.out.println(arraySplit.hasCharacteristics(Spliterator.SIZED));    // true
System.out.println(arraySplit.hasCharacteristics(Spliterator.ORDERED));  // true
System.out.println(arraySplit.hasCharacteristics(Spliterator.DISTINCT)); // false

// HashSet: DISTINCT but not ORDERED
Spliterator<String> setSplit = new HashSet<>(List.of("a","b","c")).spliterator();
System.out.println(setSplit.hasCharacteristics(Spliterator.DISTINCT)); // true
System.out.println(setSplit.hasCharacteristics(Spliterator.ORDERED));  // false

// ── Spliterators utility factory methods ──────────────────────────────
// Wrap an Iterator into a Spliterator (characteristics must be specified manually):
Iterator<String> iter = List.of("p", "q", "r").iterator();
Spliterator<String> fromIter = Spliterators.spliteratorUnknownSize(
    iter,
    Spliterator.ORDERED     // we know it's ordered, nothing else guaranteed
);

// Create a Stream from a Spliterator:
Stream<String> stream = StreamSupport.stream(fromIter, false); // false = sequential
stream.forEach(System.out::println);  // p q r

Characteristic Flags — The Eight Constants

Spliterator defines eight characteristic constants as int bit flags. Understanding them is essential for writing correct custom Spliterators and for understanding how the Stream pipeline optimises itself. SIZED (0x00000040) means estimateSize() returns an exact count of the remaining elements. The Stream pipeline uses this to pre-allocate result containers (e.g., toArray(), collect(toList())) and to decide whether parallel splits are worth performing. After trySplit(), both the prefix and the original Spliterator should remain SIZED, and their sizes should sum to the original size. SUBSIZED (0x00004000) means that all Spliterators returned by trySplit() are also SIZED. SUBSIZED implies SIZED. This flag allows the parallel framework to compute exact offsets for each split without coordination, enabling array-filling operations to fill different array segments in parallel without merging. ORDERED (0x00000010) means the Spliterator has a defined encounter order (like a List or array, as opposed to a Set or Map). Operations that depend on order — findFirst(), limit(), sequential collect with order-preserving collectors — must respect this flag. Removing ORDERED via spliterator.withoutCharacteristics(Spliterator.ORDERED) (via StreamSupport) allows parallel unordered operations which can be significantly faster. SORTED (0x00000004) means elements are delivered in a sorted order according to the Comparator returned by getComparator(). If the natural order applies, getComparator() must return null (not Comparator.naturalOrder()). SORTED implies ORDERED. The Stream.sorted() operation can be skipped entirely when this flag is set. DISTINCT (0x00000001) means no two elements returned by the traversal are equal under equals(). HashSet and TreeSet Spliterators carry this flag. The Stream.distinct() operation is a no-op when DISTINCT is already set. NONNULL (0x00000100) guarantees no element will be null. This allows null checks to be elided in operations that would otherwise need to handle null. IMMUTABLE (0x00000400) means the source cannot be structurally modified (elements added, removed, or replaced) during traversal. No ConcurrentModificationException is possible. Arrays and unmodifiable collections carry this flag. CONCURRENT (0x00001000) means the source can be safely modified concurrently without a ConcurrentModificationException. ConcurrentHashMap and ConcurrentLinkedQueue carry this flag. CONCURRENT and IMMUTABLE are mutually exclusive.
Java
// ── Examining characteristics of standard collections ─────────────────
import java.util.Spliterator;

// ArrayList: ORDERED, SIZED, SUBSIZED
List<Integer> arrayList = new ArrayList<>(List.of(1, 2, 3));
Spliterator<Integer> alSplit = arrayList.spliterator();
printCharacteristics("ArrayList", alSplit);

// HashSet: DISTINCT, SIZED
Set<Integer> hashSet = new HashSet<>(List.of(1, 2, 3));
Spliterator<Integer> hsSplit = hashSet.spliterator();
printCharacteristics("HashSet", hsSplit);

// TreeSet: DISTINCT, ORDERED, SIZED, SORTED
Set<Integer> treeSet = new TreeSet<>(List.of(3, 1, 2));
Spliterator<Integer> tsSplit = treeSet.spliterator();
printCharacteristics("TreeSet", tsSplit);
System.out.println("Comparator: " + tsSplit.getComparator()); // null = natural order

// Arrays.asList: ORDERED, SIZED, SUBSIZED
Spliterator<String> arrSplit = Arrays.asList("a","b","c").spliterator();
printCharacteristics("Arrays.asList", arrSplit);

// ConcurrentHashMap.keySet(): DISTINCT, CONCURRENT
Map<String,Integer> chm = new ConcurrentHashMap<>();
chm.put("x", 1); chm.put("y", 2);
Spliterator<String> chmSplit = chm.keySet().spliterator();
printCharacteristics("ConcurrentHashMap.keySet", chmSplit);

// ── Utility method ────────────────────────────────────────────────────
static void printCharacteristics(String name, Spliterator<?> sp) {
    System.out.printf("%s:%n", name);
    System.out.printf("  SIZED:      %b%n", sp.hasCharacteristics(Spliterator.SIZED));
    System.out.printf("  SUBSIZED:   %b%n", sp.hasCharacteristics(Spliterator.SUBSIZED));
    System.out.printf("  ORDERED:    %b%n", sp.hasCharacteristics(Spliterator.ORDERED));
    System.out.printf("  SORTED:     %b%n", sp.hasCharacteristics(Spliterator.SORTED));
    System.out.printf("  DISTINCT:   %b%n", sp.hasCharacteristics(Spliterator.DISTINCT));
    System.out.printf("  NONNULL:    %b%n", sp.hasCharacteristics(Spliterator.NONNULL));
    System.out.printf("  IMMUTABLE:  %b%n", sp.hasCharacteristics(Spliterator.IMMUTABLE));
    System.out.printf("  CONCURRENT: %b%n", sp.hasCharacteristics(Spliterator.CONCURRENT));
}

// ── How the Stream pipeline uses characteristics ──────────────────────
// SIZED → pre-allocate result array without resizing
int[] result = IntStream.range(0, 10).toArray();  // knows size = 10 upfront

// SORTED → skip sort() when source is already sorted
TreeSet<Integer> sorted = new TreeSet<>(List.of(5,3,1,4,2));
List<Integer> sortedResult = sorted.stream()
    .sorted()        // no-op: SORTED already set
    .collect(Collectors.toList());

// DISTINCT → skip distinct() when source already has unique elements
Set<String> distinct = new HashSet<>(List.of("a","b","c"));
long count = distinct.stream()
    .distinct()      // no-op: DISTINCT already set
    .count();

Custom Spliterator Implementation

Implementing a custom Spliterator is needed when wrapping a non-standard data source — a database cursor, a lazy generator, a binary tree, a network stream — for use with the Stream API. The contract has several precise requirements: trySplit() must return null or a Spliterator covering a strict subset of the remaining elements with no overlap; tryAdvance() must return false consistently once exhausted; the characteristic flags must accurately reflect the source. The minimal implementation requires only tryAdvance(), trySplit(), estimateSize(), and characteristics(). For performance, override forEachRemaining() to avoid the per-call overhead of tryAdvance(). For a balanced, splittable source (like an array or indexed list), implement trySplit() by computing a midpoint and returning a Spliterator for the lower half while retaining the upper half. For an unbalanced source (like a linked structure), trySplit() may peel off a fixed number of elements (a batch) and return a fixed Spliterator for that batch. For an unsplittable source, return null from trySplit() — the Stream will fall back to sequential processing. The AbstractSpliterator base class (java.util.Spliterators.AbstractSpliterator) provides a default trySplit() that accumulates elements into an array batch and returns a fixed Spliterator for that batch. This is suitable for sources that can provide elements one at a time but cannot compute a structural midpoint. The batch size starts small (1024) and grows toward MAX_BATCH (1 << 25) to balance overhead against granularity.
Java
// ── Custom Spliterator for a range of integers ───────────────────────
// Demonstrates: SIZED + SUBSIZED + ORDERED + IMMUTABLE + NONNULL
public class RangeSpliterator implements Spliterator.OfInt {
    private int current;
    private final int end;

    public RangeSpliterator(int start, int end) {
        this.current = start;
        this.end     = end;
    }

    @Override
    public boolean tryAdvance(IntConsumer action) {
        if (current < end) {
            action.accept(current++);
            return true;
        }
        return false;
    }

    @Override
    public OfInt trySplit() {
        int mid = (current + end) >>> 1;   // midpoint, avoids overflow
        if (mid <= current) return null;    // too small to split
        int lo = current;
        current = mid;                      // this Spliterator now covers [mid, end)
        return new RangeSpliterator(lo, mid); // prefix covers [lo, mid)
    }

    @Override
    public long estimateSize() {
        return (long)(end - current);
    }

    @Override
    public int characteristics() {
        return SIZED | SUBSIZED | ORDERED | IMMUTABLE | NONNULL;
    }
}

// Use in a sequential stream:
Spliterator.OfInt sp = new RangeSpliterator(0, 10);
StreamSupport.intStream(sp, false)
    .forEach(n -> System.out.print(n + " "));  // 0 1 2 3 4 5 6 7 8 9

// Use in a parallel stream:
Spliterator.OfInt psp = new RangeSpliterator(0, 1_000_000);
long sum = StreamSupport.intStream(psp, true)   // true = parallel
    .asLongStream()
    .sum();
System.out.println(sum);  // 499999500000

// ── Custom Spliterator wrapping a binary tree (pre-order) ─────────────
// Uses AbstractSpliterator for built-in batch splitting:
class TreeNode<T> {
    T value;
    TreeNode<T> left, right;
    TreeNode(T v, TreeNode<T> l, TreeNode<T> r) { value=v; left=l; right=r; }
}

public class TreeSpliterator<T> extends Spliterators.AbstractSpliterator<T> {
    private final Deque<TreeNode<T>> stack = new ArrayDeque<>();

    public TreeSpliterator(TreeNode<T> root) {
        super(Long.MAX_VALUE, ORDERED);  // size unknown → MAX_VALUE; ordered = pre-order
        if (root != null) stack.push(root);
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (stack.isEmpty()) return false;
        TreeNode<T> node = stack.pop();
        action.accept(node.value);
        if (node.right != null) stack.push(node.right);
        if (node.left  != null) stack.push(node.left);
        return true;
    }
    // trySplit() inherited from AbstractSpliterator — batches elements automatically
}

// Build a small tree and stream it:
//        1
//       / \
//      2   3
//     / \
//    4   5
TreeNode<Integer> tree = new TreeNode<>(1,
    new TreeNode<>(2,
        new TreeNode<>(4, null, null),
        new TreeNode<>(5, null, null)),
    new TreeNode<>(3, null, null));

StreamSupport.stream(new TreeSpliterator<>(tree), false)
    .forEach(n -> System.out.print(n + " "));  // 1 2 4 5 3  (pre-order)

Related Topics in Collections Framework

Collections Overview
The Java Collections Framework (JCF) is a unified architecture for representing and manipulating groups of objects. It provides a set of interfaces that define the operations a collection must support, a set of abstract classes that provide partial implementations, and a set of concrete implementations optimised for different use cases. Every Java developer uses collections daily — lists for sequences, sets for uniqueness, maps for key-value pairs, and queues for ordering — and choosing the right implementation for the right use case is one of the most fundamental practical skills in Java.
Iterable
Iterable<E> is the root interface of the Java Collections hierarchy. Any class that implements Iterable can be used in a for-each loop. It declares a single abstract method: iterator(), which returns an Iterator<E> that the for-each loop uses to traverse the elements. Implementing Iterable is all that is required to make a custom data structure work with Java's enhanced for loop, the Stream API, and any method that accepts an Iterable.
Collection Interface
Collection<E> is the root interface of the main collection hierarchy, extending Iterable<E>. It defines the common operations that all collection types must support: adding elements, removing elements, checking containment, querying size, clearing, converting to an array, and bulk operations. List, Set, and Queue all extend Collection. Map does not extend Collection because a map operates on key-value pairs rather than individual elements.
ArrayList
ArrayList is a resizable-array implementation of the List interface. It is the most commonly used collection in Java, providing dynamic sizing on top of a standard array. Elements are stored in contiguous memory, enabling O(1) random access by index. When the internal array is full, ArrayList automatically allocates a larger array and copies all elements. ArrayList is not thread-safe and preserves insertion order, allowing duplicates.