
Stream API

The Stream API, introduced in Java 8, provides a functional, declarative model for processing sequences of elements. A Stream<T> is not a data structure — it carries no storage. It is a pipeline specification: a source that provides elements, zero or more intermediate operations that transform or filter the stream, and exactly one terminal operation that consumes the stream and produces a result or side effect. Streams are lazy: intermediate operations do not execute until the terminal operation is invoked, and execution is fused — elements flow through the entire pipeline one at a time (or in batches for parallel streams), avoiding intermediate collection. Streams are single-use: once a terminal operation has been invoked, the stream is consumed and cannot be reused. Java provides both reference streams (Stream<T>) and primitive streams (IntStream, LongStream, DoubleStream) that avoid boxing overhead. The Stream API covers sources (collections, arrays, files, generators), intermediate operations (filter, map, flatMap, sorted, distinct, limit, skip, peek, mapToInt, mapToObj), terminal operations (forEach, collect, reduce, count, findFirst, findAny, anyMatch, allMatch, noneMatch, toList, min, max), and collectors (toList, toSet, toMap, groupingBy, partitioningBy, joining, counting, summarizing). This entry covers the full lifecycle, every operation class, the short-circuit evaluation, the Spliterator model, collector design, and performance guidance.
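
The single-use rule and the primitive specializations described above can be illustrated with a small sketch. The class and method names (SingleUseDemo, secondUseFails, totalLength) are hypothetical and chosen only for this illustration; the snippet assumes Java 9 or later for List.of.

```java
import java.util.List;
import java.util.stream.Stream;

final class SingleUseDemo {
    /** Returns true if a second terminal operation on the same stream is rejected. */
    static boolean secondUseFails() {
        Stream<String> s = List.of("a", "b", "c").stream();
        s.count();                 // first terminal operation consumes the stream
        try {
            s.count();             // second terminal operation on the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;           // the stream has already been operated upon
        }
    }

    /** Sums string lengths on an IntStream, so no Integer boxing is involved. */
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());                  // true
        System.out.println(totalLength(List.of("ab", "cde"))); // 5
    }
}
```

To run a second query, obtain a fresh stream from the source (for example by calling stream() on the collection again) rather than holding on to a Stream reference.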

Stream Sources, Pipeline Structure, and Lazy Evaluation

A Stream pipeline has three parts: a source, intermediate operations, and a terminal operation. The source provides elements: Collection.stream() or Collection.parallelStream() for collections; Arrays.stream(array) for arrays; Stream.of(elements) for explicit element lists; IntStream.range(start, end) and IntStream.rangeClosed(start, end) for integer ranges; Files.lines(path) for file lines; Stream.generate(Supplier) for infinite generated streams; Stream.iterate(seed, UnaryOperator) for infinite iterated streams; StreamSupport.stream(spliterator, parallel) for custom sources.

Intermediate operations are lazy: they register a stage in the pipeline but do not execute any computation until a terminal operation is invoked. Each returns a new Stream (or IntStream, LongStream, DoubleStream). Intermediate operations are either stateless or stateful. Stateless operations (filter, map, flatMap, peek, mapToInt, mapToObj) process each element independently of the others. Stateful operations (sorted, distinct, limit, skip) carry state across elements: sorted must see every element before emitting the first result; distinct tracks all elements seen so far; limit and skip track a count.

When the terminal operation is invoked, the pipeline is realized. In a sequential stream, elements flow through the pipeline one at a time: each element passes through the chain of stateless stages before the next element is fetched from the source (a stateful stage such as sorted acts as a barrier that buffers its input first). Short-circuiting operations (the intermediate limit and the terminals findFirst, findAny, anyMatch, allMatch, noneMatch) may stop element consumption before all elements are processed. This is a key performance benefit of streams over collect-then-iterate patterns, and it is what allows a pipeline over an infinite source to terminate.

Pipeline fusion means that filter().map().forEach() is not three separate passes over the data. Each element is filtered, then mapped, then consumed in a single traversal. This is equivalent to writing the loop body explicitly, but expressed in composable, declarative form. For large data sets, it avoids creating intermediate collections that would consume memory proportional to the data size.
Java
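// Snippets in this entry assume these imports (and an enclosing method that may throw IOException):
// import java.nio.charset.StandardCharsets; import java.nio.file.*;
// import java.util.*; import java.util.function.Function; import java.util.stream.*;
// ── Stream sources ────────────────────────────────────────────────────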
// From Collection:
List<String> names = List.of("Alice", "Bob", "Carol", "Dave");
Stream<String> fromList = names.stream();

// From array:
String[] arr = {"x", "y", "z"};
Stream<String> fromArray = Arrays.stream(arr);

// From explicit elements:
Stream<Integer> ofElements = Stream.of(1, 2, 3, 4, 5);

// Primitive range (no boxing):
IntStream range = IntStream.range(0, 10);       // 0,1,...,9
IntStream rangeClosed = IntStream.rangeClosed(1, 5); // 1,2,3,4,5

// File lines:
try (Stream<String> lines = Files.lines(Path.of("data.txt"), StandardCharsets.UTF_8)) {
    lines.forEach(System.out::println);
}

// Infinite generator:
Stream<Double> randoms = Stream.generate(Math::random);  // infinite — must limit()

// Infinite iterate:
Stream<Integer> powers = Stream.iterate(1, n -> n * 2);  // 1, 2, 4, 8, ...

// Iterate with predicate (Java 9+):
Stream<Integer> bounded = Stream.iterate(1, n -> n <= 100, n -> n * 2); // stops at 64

// ── Pipeline: lazy evaluation demonstration ───────────────────────────
Stream<String> pipeline = names.stream()
    .filter(s -> {
        System.out.println("Filtering: " + s);  // will print
        return s.length() > 3;
    })
    .map(s -> {
        System.out.println("Mapping: " + s);   // will print
        return s.toUpperCase();
    });
// Nothing printed yet — pipeline is lazy, no computation has occurred

System.out.println("Terminal operation starting:");
List<String> result = pipeline.collect(Collectors.toList());
// NOW the filtering and mapping runs:
// Filtering: Alice   Mapping: Alice
// Filtering: Bob     (Bob filtered out — length = 3, not > 3)
// Filtering: Carol   Mapping: Carol
// Filtering: Dave    Mapping: Dave
System.out.println(result);  // [ALICE, CAROL, DAVE]

// ── Short-circuit: stops early ────────────────────────────────────────
Optional<String> firstLong = names.stream()
    .filter(s -> s.length() > 3)  // filter runs only for Alice, which passes
    .findFirst();                  // stop after finding first match
// Only "Alice" is pulled from the source; "Bob", "Carol", and "Dave" are never examined
System.out.println(firstLong);  // Optional[Alice]
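
The fused, element-at-a-time traversal and the way limit() lets an infinite source terminate can be made visible by logging from inside the stages. The following is a minimal sketch; the class name FusionDemo and the method trace are hypothetical, and the side effects inside filter and map are there only to record the order of evaluation, which is something production pipelines should avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FusionDemo {
    /** Records the order in which the stages of a sequential pipeline see elements. */
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<Integer> out = Stream.iterate(1, n -> n + 1)   // infinite source: 1, 2, 3, ...
            .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
            .map(n -> { log.add("map " + n); return n * 10; })
            .limit(2)                                        // short-circuits the infinite source
            .collect(Collectors.toList());
        log.add("result " + out);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // filter 1
        // filter 2
        // map 2
        // filter 3
        // filter 4
        // map 4
        // result [20, 40]
    }
}
```

The log interleaves filter and map per element instead of showing all filter calls followed by all map calls, and nothing past the fourth element is requested from the source once limit(2) has been satisfied.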

Intermediate Operations — Transforming and Filtering

filter(Predicate<? super T> predicate) keeps the elements for which the predicate returns true. It is stateless and lazy, and it is the first operation to reach for when reducing the number of elements.

map(Function<? super T, ? extends R> mapper) transforms each element to a new value. It is stateless and lazy, and it produces a Stream<R>. For transformations to primitive types, use mapToInt(ToIntFunction), mapToLong(ToLongFunction), or mapToDouble(ToDoubleFunction) to get primitive streams that avoid boxing. mapToObj(IntFunction) on a primitive stream converts back to a reference stream.

flatMap(Function<? super T, ? extends Stream<? extends R>> mapper) maps each element to a stream and flattens all those streams into a single stream. It is the key operation for one-to-many transformations and for unnesting nested collections. flatMapToInt, flatMapToLong, and flatMapToDouble are the primitive variants.

sorted() and sorted(Comparator) are stateful: they must buffer all elements before emitting the first. sorted() requires the elements to implement Comparable (otherwise a ClassCastException may be thrown when the terminal operation runs); sorted(Comparator) uses the provided comparator. For performance, sort after filtering so that fewer elements are buffered and compared. Applying limit before sorted changes the result, so do that only when the first n elements in encounter order are what you want sorted.

distinct() is stateful: it removes duplicates by tracking all elements seen so far, using equals/hashCode. For streams of many elements, the seen-element set may consume significant memory.

limit(long maxSize) is stateful and short-circuiting: it passes at most maxSize elements downstream and then signals upstream to stop producing. skip(long n) is stateful: it discards the first n elements. Together they implement pagination with a zero-based page index: skip(page * pageSize).limit(pageSize).

peek(Consumer<? super T> action) passes each element through unchanged while executing a side effect. It is intended for debugging (logging elements mid-pipeline) and should not carry primary program logic: its execution is not guaranteed for every element, because short-circuit operations may stop the pipeline before peek sees all elements, and an implementation may skip stages it can prove unnecessary (for example, count() on a source of known size since Java 9).
Java
// ── filter: keep elements matching predicate ──────────────────────────
List<String> filtered = names.stream()
    .filter(s -> s.startsWith("A") || s.startsWith("C"))
    .collect(Collectors.toList());
System.out.println(filtered);   // [Alice, Carol]

// ── map: transform each element ───────────────────────────────────────
List<Integer> lengths = names.stream()
    .map(String::length)     // String → Integer (boxed)
    .collect(Collectors.toList());

// mapToInt: avoid boxing ───────────────────────────────────────────────
int totalLength = names.stream()
    .mapToInt(String::length)  // String → int (no boxing)
    .sum();                    // IntStream.sum() — no boxing
System.out.println(totalLength);

// ── flatMap: one-to-many, flatten nested structure ────────────────────
List<List<String>> nested = List.of(
    List.of("a", "b", "c"),
    List.of("d", "e"),
    List.of("f")
);

List<String> flat = nested.stream()
    .flatMap(Collection::stream)  // each inner List → its stream, then flatten
    .collect(Collectors.toList());
System.out.println(flat);  // [a, b, c, d, e, f]

// flatMap for splitting strings into words:
List<String> sentences = List.of("hello world", "foo bar baz");
List<String> words = sentences.stream()
    .flatMap(s -> Arrays.stream(s.split(" ")))
    .collect(Collectors.toList());
System.out.println(words);  // [hello, world, foo, bar, baz]

// ── sorted: stateful, buffers all elements ────────────────────────────
List<String> sortedNames = names.stream()
    .sorted()                          // natural order (Comparable)
    .collect(Collectors.toList());
System.out.println(sortedNames);   // [Alice, Bob, Carol, Dave]

List<String> byLength = names.stream()
    .sorted(Comparator.comparingInt(String::length).thenComparing(Comparator.naturalOrder()))
    .collect(Collectors.toList());

// ── distinct: remove duplicates ────────────────────────────────────────
List<Integer> withDups = List.of(1, 2, 2, 3, 3, 3, 4);
List<Integer> deduped = withDups.stream()
    .distinct()
    .collect(Collectors.toList());
System.out.println(deduped);   // [1, 2, 3, 4]

// ── limit and skip: pagination ────────────────────────────────────────
int page = 1, pageSize = 2;          // zero-based page index: page 1 is the second page
List<String> secondPage = names.stream()
    .skip((long) page * pageSize)    // skip the first page (2 elements)
    .limit(pageSize)                 // take the next 2
    .collect(Collectors.toList());
System.out.println(secondPage);   // [Carol, Dave]

// ── peek: debugging (side-effect, don't use for logic) ────────────────
long count = names.stream()
    .peek(s -> System.out.println("Before filter: " + s))
    .filter(s -> s.length() > 3)
    .peek(s -> System.out.println("After filter:  " + s))
    .count();
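
Two of the points above lend themselves to a quick check: peek only observes elements that are actually pulled through the pipeline, and skip plus limit yields a page for a zero-based page index. This is a sketch under those assumptions; the names PeekShortCircuitDemo, seenByPeek, and page are hypothetical, and peek is used here purely as an observation tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class PeekShortCircuitDemo {
    /** Returns the elements peek observed before findFirst stopped the pipeline. */
    static List<String> seenByPeek(List<String> names) {
        List<String> seen = new ArrayList<>();
        names.stream()
            .peek(seen::add)                // records only elements actually pulled
            .filter(s -> s.length() > 3)
            .findFirst();                   // stops at the first element passing the filter
        return seen;
    }

    /** Returns the page with the given zero-based index; empty if past the end. */
    static List<String> page(List<String> items, int pageIndex, int pageSize) {
        return items.stream()
            .skip((long) pageIndex * pageSize)   // widen to long before multiplying
            .limit(pageSize)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(seenByPeek(List.of("Bob", "Alice", "Carol", "Dave"))); // [Bob, Alice]
        System.out.println(page(List.of("a", "b", "c", "d", "e"), 2, 2));         // [e]
        System.out.println(page(List.of("a", "b", "c", "d", "e"), 3, 2));         // []
    }
}
```

Because "Carol" and "Dave" never reach peek in the first call, any logic that depended on peek running for every element would silently miss them.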

Terminal Operations and Collectors

Terminal operations trigger pipeline execution and produce a result or a side effect. forEach(Consumer) executes the Consumer for each element; the order is not guaranteed for parallel streams. forEachOrdered(Consumer) guarantees encounter order, at the cost of some parallelism. count() returns the number of elements as a long. min(Comparator) and max(Comparator) return Optional<T> for the minimum and maximum elements. findFirst() returns Optional<T> for the first element (short-circuit); findAny() returns any element and is potentially faster for parallel streams. anyMatch, allMatch, and noneMatch take a Predicate and return boolean, short-circuiting as soon as the result is determined.

reduce() accumulates elements into a single value. reduce(identity, BinaryOperator) uses an identity value as the starting accumulator and combines each element with the current accumulator. reduce(BinaryOperator) takes no identity and returns Optional<T>, which is empty for an empty stream. reduce(identity, BiFunction<U, ? super T, U>, BinaryOperator<U>) is the three-argument form for reductions whose result type U differs from the element type T; the third argument combines partial results, which is what allows the reduction to run in parallel.

collect(Collector) is the most powerful terminal operation. It accumulates elements into a mutable result container using the Collector's functions: supplier (creates the container), accumulator (adds an element), combiner (merges containers from parallel sub-streams), and finisher (converts the container into the final result, often the identity). Built-in collectors cover nearly all needs: Collectors.toList(), toSet(), toUnmodifiableList(), toUnmodifiableSet(), toMap(), toConcurrentMap(), groupingBy(), partitioningBy(), joining(), counting(), summingInt/Long/Double(), averagingInt/Long/Double(), summarizingInt/Long/Double(), and toCollection(Supplier).

groupingBy(Function<T, K>) groups elements by a classifier function, producing Map<K, List<T>>. groupingBy(Function, Collector) applies a downstream Collector to each group; for example, groupingBy(classifier, Collectors.counting()) produces Map<K, Long>. partitioningBy(Predicate) groups into Map<Boolean, List<T>>: true for matching elements, false for non-matching. joining(delimiter, prefix, suffix) concatenates String elements. toMap(keyMapper, valueMapper) produces Map<K, V> and throws IllegalStateException on duplicate keys; toMap(keyMapper, valueMapper, mergeFunction) resolves duplicate keys with the merge function.
Java
// ── Terminal operations ───────────────────────────────────────────────
long count = names.stream().filter(s -> s.length() > 3).count();
System.out.println(count);   // 3

Optional<String> shortest = names.stream().min(Comparator.comparingInt(String::length));
Optional<String> longest  = names.stream().max(Comparator.comparingInt(String::length));
System.out.println(shortest.orElse(""));  // Bob
System.out.println(longest.orElse(""));   // Alice and Carol tie at 5; the API does not specify which is returned

boolean anyA = names.stream().anyMatch(s -> s.startsWith("A"));  // true
boolean allA = names.stream().allMatch(s -> s.startsWith("A"));  // false
boolean noneZ = names.stream().noneMatch(s -> s.startsWith("Z")); // true

// reduce: sum lengths
int totalLen = names.stream()
    .mapToInt(String::length)
    .reduce(0, Integer::sum);  // IntStream.reduce(0, IntBinaryOperator)
System.out.println(totalLen);

// reduce on a reference stream (String accumulator):
String longest2 = names.stream()
    .reduce("", (a, b) -> a.length() >= b.length() ? a : b);
System.out.println(longest2);  // Alice (>= keeps the earlier element on a tie)

// ── Collectors: the comprehensive toolkit ─────────────────────────────
// toUnmodifiableList (Java 10+): unmodifiable list preserving encounter order:
List<String> immList = names.stream().collect(Collectors.toUnmodifiableList());
// or, since Java 16, Stream.toList():
List<String> toListResult = names.stream().toList();  // unmodifiable

// toSet:
Set<String> nameSet = names.stream().collect(Collectors.toSet());

// toMap: String → length
Map<String, Integer> nameLengths = names.stream()
    .collect(Collectors.toMap(
        Function.identity(),   // key: the string itself
        String::length         // value: its length
    ));
System.out.println(nameLengths);  // e.g. {Alice=5, Bob=3, Carol=5, Dave=4} (map iteration order is not guaranteed)

// toMap with merge function (handles duplicate keys):
Map<Integer, String> byLength = names.stream()
    .collect(Collectors.toMap(
        String::length,      // key: length
        Function.identity(), // value: name
        (a, b) -> a + "," + b  // merge duplicate keys: "Alice,Carol" for length 5
    ));

// groupingBy: group into Map<K, List<T>>
Map<Integer, List<String>> byLengthGroup = names.stream()
    .collect(Collectors.groupingBy(String::length));
System.out.println(byLengthGroup);
// {3=[Bob], 4=[Dave], 5=[Alice, Carol]}

// groupingBy with downstream collector:
Map<Integer, Long> countByLength = names.stream()
    .collect(Collectors.groupingBy(String::length, Collectors.counting()));
System.out.println(countByLength);  // {3=1, 4=1, 5=2}

Map<Integer, String> joinedByLength = names.stream()
    .collect(Collectors.groupingBy(
        String::length,
        Collectors.joining(", ")  // join names of same length
    ));
System.out.println(joinedByLength);  // {3=Bob, 4=Dave, 5=Alice, Carol}

// partitioningBy: split into true/false
Map<Boolean, List<String>> partition = names.stream()
    .collect(Collectors.partitioningBy(s -> s.length() > 3));
System.out.println(partition.get(true));   // [Alice, Carol, Dave]
System.out.println(partition.get(false));  // [Bob]

// joining: concatenate strings
String joined = names.stream().collect(Collectors.joining(", ", "[", "]"));
System.out.println(joined);  // [Alice, Bob, Carol, Dave]

// summarizingInt: all stats at once
IntSummaryStatistics stats = names.stream()
    .collect(Collectors.summarizingInt(String::length));
System.out.printf("count=%d sum=%d min=%d max=%d avg=%.1f%n",
    stats.getCount(), stats.getSum(), stats.getMin(),
    stats.getMax(), stats.getAverage());
// count=4 sum=17 min=3 max=5 avg=4.3
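
The roles of a collector's supplier, accumulator, combiner, and finisher, the three-argument reduce, and the duplicate-key behaviour of two-argument toMap can be sketched as follows. The class CollectorSketch and its methods are hypothetical illustrations, not part of the JDK; initials() additionally assumes every input string is non-empty.

```java
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class CollectorSketch {
    /** Hypothetical collector: joins the upper-cased first letter of each name with '.'. */
    static Collector<String, StringJoiner, String> initials() {
        return Collector.of(
            () -> new StringJoiner("."),                                           // supplier
            (sj, name) -> sj.add(name.substring(0, 1).toUpperCase(Locale.ROOT)),   // accumulator
            StringJoiner::merge,                                                   // combiner
            StringJoiner::toString);                                               // finisher
    }

    /** Three-argument reduce: result type (Integer) differs from element type (String). */
    static int totalLength(List<String> names) {
        return names.stream().reduce(0, (sum, s) -> sum + s.length(), Integer::sum);
    }

    /** Returns true if two-argument toMap rejects the input because of duplicate keys. */
    static boolean duplicateKeysThrow(List<String> names) {
        try {
            names.stream().collect(Collectors.toMap(String::length, s -> s));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Carol", "Dave");
        System.out.println(Stream.of("alice", "bob", "carol").collect(initials())); // A.B.C
        System.out.println(totalLength(names));                                     // 17
        System.out.println(duplicateKeysThrow(names));                              // true
    }
}
```

For a plain sum of lengths, mapToInt(String::length).sum() is simpler and avoids boxing; the three-argument reduce is shown only to illustrate how the accumulator and combiner divide the work.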
