☕ Java

I/O Basics

Java I/O is built on a small set of abstractions that underlie every I/O operation in the language: streams, readers, writers, channels, and buffers. A stream is a sequential flow of data: bytes moving from a source to a destination one at a time or in chunks. Java organizes I/O around one fundamental distinction: byte I/O (reading and writing raw bytes, the universal representation that everything ultimately reduces to) versus character I/O (reading and writing text in a specific character set, with encoding and decoding handled by the stream classes).

The original java.io package, introduced in Java 1.0, provides stream-based I/O through four abstract base classes: InputStream, OutputStream, Reader, and Writer (the two character-oriented classes arrived in Java 1.1). The java.nio package, introduced in Java 1.4, adds a channel-and-buffer model for non-blocking and memory-mapped I/O. The java.nio.file package, introduced in Java 7 as part of NIO.2, provides a modern, comprehensive file system API that supersedes much of java.io.File.

This entry covers the conceptual model of streams and their abstract base classes, the decorator pattern that underlies the Java I/O class hierarchy, the source-processor-sink taxonomy of stream classes, blocking versus non-blocking I/O, buffering and why it is almost always necessary, the standard I/O streams (System.in, System.out, System.err), and the resource management contract that every I/O class must satisfy.
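
For contrast with the stream classes covered in the rest of this entry, the following is a minimal sketch of the java.nio channel-and-buffer model mentioned above. The class name ChannelSketch, the helper names, and the 16-byte buffer size are illustrative assumptions, not part of any API; the sketch only shows the fill, flip, drain, clear cycle of a ByteBuffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelSketch {

    // Reads a whole file through a FileChannel, one small buffer at a time.
    static byte[] readAll(Path path) {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(16);   // tiny on purpose, to force several reads
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            while (channel.read(buffer) != -1) {           // the channel fills the buffer
                buffer.flip();                             // switch from filling to draining
                collected.write(buffer.array(), 0, buffer.limit());
                buffer.clear();                            // ready to be filled again
            }
            return collected.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes text to a temporary file, reads it back through the channel, then cleans up.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("channel-sketch", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return new String(readAll(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("channels move bytes between a file and a ByteBuffer"));
    }
}
```

Unlike a stream, the buffer is an explicit object that the caller manages, which is what makes non-blocking and memory-mapped variants possible.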

The Stream Model — Sources, Processors, and Sinks

Java I/O is organized around the stream metaphor: data flows from a source to a sink, one element at a time, without random access. A source stream (also called a node stream or raw stream) connects directly to a data source: a file on disk, a socket, a byte array in memory, a pipe between threads. A sink stream connects to a destination. Processor streams (also called filter streams or wrapper streams) sit between a source or sink and the application, transforming the data as it passes through: buffering it for efficiency, compressing or decompressing it, converting bytes to characters, computing checksums, or encrypting and decrypting.

The class hierarchy reflects this taxonomy. Every concrete I/O class ultimately extends one of four abstract base classes: InputStream for byte input, OutputStream for byte output, Reader for character input, Writer for character output. Source classes read from real sources: FileInputStream reads from a file, ByteArrayInputStream reads from a byte array, and the stream returned by Socket.getInputStream() reads from a network socket. Processor classes wrap another stream and add behavior: BufferedInputStream wraps any InputStream and adds buffering, DataInputStream wraps any InputStream and adds methods for reading primitive types, GZIPInputStream wraps any InputStream and decompresses on the fly.

This is the decorator pattern at the level of the Java API. A processing chain is constructed by wrapping streams: FileInputStream is wrapped in BufferedInputStream for buffering, which is wrapped in DataInputStream for structured reads. Each wrapper holds a reference to its wrapped stream (the "inner" stream) and delegates read/write operations to it while adding its own behavior. The wrappers are composable and interchangeable: any BufferedInputStream can wrap any InputStream, regardless of whether the underlying source is a file, a socket, or an in-memory array.

The consequence of this design is that I/O classes proliferate: there are dedicated stream classes for files, byte arrays, pipes, strings, object serialization, compression, encryption, data types, line-oriented text, and more, many of them at both the byte and character level. Learning I/O is largely about learning which class to use for which purpose, and how to combine them. The underlying model, a chain of streams each adding one transformation, never changes.

The distinction between byte streams and character streams is not just a convenience; it reflects a fundamental fact about text processing. A char in Java is a 16-bit UTF-16 code unit, but bytes in a file or on a network are just octets. Converting between them requires a charset (UTF-8, ISO-8859-1, UTF-16). Character streams (Reader and Writer) encapsulate this conversion; byte streams do not. Reading text by casting raw bytes to chars works only when each byte value equals the character's Unicode code point (ASCII and ISO-8859-1), and silently produces garbage for multi-byte encodings such as UTF-8 as soon as non-ASCII characters appear. Always use Reader/Writer for text data; always use InputStream/OutputStream for binary data.
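
The byte-versus-character distinction can be observed without touching the file system. The sketch below decodes the same UTF-8 bytes twice, once by casting raw bytes to chars and once through an InputStreamReader. The class and method names (CharsetDemo, decodeByCasting, decodeWithReader) are hypothetical helpers introduced only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class CharsetDemo {

    // Casts every raw byte to a char: only correct for ASCII / ISO-8859-1 data.
    static String decodeByCasting(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (InputStream in = new ByteArrayInputStream(data)) {
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Lets an InputStreamReader decode the bytes as UTF-8.
    static String decodeWithReader(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent; in UTF-8 the last letter takes two bytes.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);                      // 5 bytes for 4 characters
        System.out.println(decodeByCasting(utf8).length());   // 5 chars: the accent became two wrong chars
        System.out.println(decodeWithReader(utf8).length());  // 4 chars: decoded correctly
    }
}
```

The cast-based version turns the two-byte sequence 0xC3 0xA9 into the two unrelated characters U+00C3 and U+00A9, which is the familiar mojibake seen when UTF-8 text is read as a single-byte encoding.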
Java
// ── The four abstract base classes ───────────────────────────────────
// Byte input:
InputStream  byteIn;   // abstract: read(), read(byte[]), read(byte[], off, len), close()

// Byte output:
OutputStream byteOut;  // abstract: write(int), write(byte[]), write(byte[], off, len), flush(), close()

// Character input:
Reader       charIn;   // abstract: read(), read(char[]), read(char[], off, len), close()

// Character output:
Writer       charOut;  // abstract: write(int), write(char[]), write(String), flush(), close()

// ── Source → Processor → Consumer chain ──────────────────────────────

// Simple: read a file byte by byte (unbuffered — very slow)
try (InputStream raw = new FileInputStream("data.bin")) {
    int b;
    while ((b = raw.read()) != -1) {
        // b is a byte value 0–255; -1 signals end of stream
        System.out.print(b + " ");
    }
}

// Better: file → buffer → consumer (buffered — efficient)
try (InputStream buffered = new BufferedInputStream(new FileInputStream("data.bin"))) {
    byte[] chunk = new byte[4096];
    int bytesRead;
    while ((bytesRead = buffered.read(chunk)) != -1) {
        process(chunk, 0, bytesRead);
    }
}

// Layered: file → buffer → data types → application
try (DataInputStream dis = new DataInputStream(
        new BufferedInputStream(new FileInputStream("data.bin")))) {
    int version = dis.readInt();       // reads 4 bytes as big-endian int
    double value = dis.readDouble();   // reads 8 bytes as IEEE 754 double
    String name  = dis.readUTF();      // reads a length-prefixed string in modified UTF-8 (as written by DataOutputStream.writeUTF)
    System.out.printf("v=%d val=%.2f name=%s%n", version, value, name);
}

// ── Byte stream vs character stream — always use the right one ────────
// WRONG: reading UTF-8 text with a byte stream
try (InputStream is = new FileInputStream("text.txt")) {
    int b;
    StringBuilder sb = new StringBuilder();
    while ((b = is.read()) != -1) {
        sb.append((char) b);   // WRONG: casts raw byte to char — breaks on UTF-8 multi-byte
    }
    System.out.println(sb);   // garbled for any non-ASCII characters
}

// CORRECT: reading UTF-8 text with a Reader
try (Reader reader = new InputStreamReader(new FileInputStream("text.txt"),
        StandardCharsets.UTF_8)) {
    int c;
    StringBuilder sb = new StringBuilder();
    while ((c = reader.read()) != -1) {
        sb.append((char) c);   // c is a Java char — correctly decoded from UTF-8
    }
    System.out.println(sb);
}

// BEST: buffered reader with charset
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream("text.txt"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println(line);
    }
}

// ── The read() contract — the most important detail in all of Java I/O
// read() returns an int (not a byte) to allow -1 as the end-of-stream sentinel.
// If read() returned a byte, -1 could be confused with the valid byte value 0xFF.
// WRONG: a 0xFF data byte becomes -1 after the cast, so a loop testing for -1 stops at the first 0xFF byte:
// byte b = (byte) stream.read();   // 0xFF and EOF (-1) are now indistinguishable

// CORRECT (stream is any open InputStream):
int value = stream.read();
if (value == -1) {
    // end of stream
} else {
    byte b = (byte) value;   // safe cast — value is in range [0, 255]
}
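
Because every wrapper accepts any InputStream or OutputStream, the same chain works over an in-memory source just as it does over a file. The following sketch, with the hypothetical class name GzipChainDemo, compresses text through a Writer, OutputStreamWriter, GZIPOutputStream chain into a byte array and reads it back through the mirror-image input chain. It is one possible arrangement under those assumptions, not the only one.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipChainDemo {

    // chars -> UTF-8 bytes -> gzip -> in-memory byte array
    static byte[] compress(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(sink), StandardCharsets.UTF_8))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Read the array only after close(): closing the outermost writer
        // flushes every layer and lets GZIPOutputStream write its trailer.
        return sink.toByteArray();
    }

    // in-memory byte array -> gunzip -> UTF-8 decode -> chars
    static String decompress(byte[] gzipped) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)),
                StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            char[] chunk = new char[1024];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] gz = compress("line one\nline two\n");
        System.out.println(decompress(gz).equals("line one\nline two\n"));   // true
    }
}
```

Swapping ByteArrayOutputStream for FileOutputStream, or ByteArrayInputStream for a socket's input stream, changes only the innermost constructor call; the wrappers stay the same.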

Buffering, the Decorator Pattern, and Why Unbuffered I/O is Unacceptable

Every I/O operation on an underlying device (a disk, a network socket, a serial port) carries overhead: a system call to the OS kernel, a switch from user mode to kernel mode, and often a wait for the hardware to respond. On modern hardware, a single disk read system call costs roughly 1–10 microseconds plus the actual disk latency (typically 0.1ms for SSD, 5–10ms for HDD). Calling read() or write() for each individual byte makes one system call per byte, which for a 1MB file means 1,048,576 system calls: on the order of 1–10 seconds of system call overhead alone.

Buffering amortizes this overhead. A BufferedInputStream maintains an internal byte array (default 8KB). When the buffer is empty and read() is called, BufferedInputStream fills the entire buffer with a single read() call to the underlying stream (one system call for 8192 bytes), then serves subsequent read() calls from memory until the buffer empties. The result is roughly 8,000 times fewer system calls, reducing I/O overhead from seconds to milliseconds. The same logic applies to BufferedOutputStream and BufferedReader/BufferedWriter. As a default rule, buffer disk I/O and network I/O. The exceptions are cases where you already read and write in large arrays (an extra buffer then only adds a memory copy) or where the underlying stream is already buffered (some socket implementations buffer internally).

BufferedOutputStream has a critical additional property: buffered writes are not passed to the underlying stream until the buffer fills or is flushed. If the program terminates without flushing a BufferedOutputStream, the last unflushed bytes are lost. This is the source of the most common data loss bug in Java I/O code: writing to a file through a BufferedOutputStream or BufferedWriter, not calling flush() or close(), and wondering why the file is missing the last few kilobytes. The close() method of these classes flushes before closing, so using try-with-resources (which calls close() automatically) prevents this. But if you need to verify that data has been written before the stream is closed (for example, to check for disk-full errors mid-write), call flush() explicitly.

The decorator chain must be closed at the outermost level. When you construct new BufferedInputStream(new FileInputStream("file.txt")), you hold a reference only to the BufferedInputStream. Closing the BufferedInputStream calls close() on the FileInputStream it wraps, which closes the file descriptor. If you accidentally close only the inner stream (the FileInputStream), the BufferedInputStream is unaware and may attempt future reads on a closed stream. For output chains the damage is worse: bytes still held in the wrapper's buffer are never written. Always close the outermost wrapper in the chain, never an inner stream directly.
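
The amortization argument can be checked without timing anything by counting how often the underlying stream is actually asked for data. The sketch below uses a hypothetical FilterInputStream subclass, ReadCallCounter, as a stand-in for "one system call"; the in-memory source and all names are assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BufferingDemo {

    // Decorator that counts every read request reaching the wrapped stream.
    static final class ReadCallCounter extends FilterInputStream {
        long calls;

        ReadCallCounter(InputStream in) { super(in); }

        @Override public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    // Reads byte by byte straight from the counted stream.
    static long callsUnbuffered(int size) {
        ReadCallCounter counter = new ReadCallCounter(new ByteArrayInputStream(new byte[size]));
        try {
            while (counter.read() != -1) { /* consume */ }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.calls;
    }

    // Reads byte by byte through a BufferedInputStream placed on top of the counted stream.
    static long callsBuffered(int size) {
        ReadCallCounter counter = new ReadCallCounter(new ByteArrayInputStream(new byte[size]));
        try (InputStream buffered = new BufferedInputStream(counter)) {
            while (buffered.read() != -1) { /* consume */ }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.calls;
    }

    public static void main(String[] args) {
        int oneMiB = 1 << 20;
        System.out.println("unbuffered requests: " + callsUnbuffered(oneMiB));
        // With the default 8 KiB buffer this should be on the order of 130 requests.
        System.out.println("buffered requests:   " + callsBuffered(oneMiB));
    }
}
```

The application-level loop is identical in both methods; only the presence of the buffering decorator changes how many requests reach the source.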
Java
// ── Buffering performance comparison ─────────────────────────────────
// Unbuffered write — one system call per byte (catastrophically slow):
long start = System.nanoTime();
try (FileOutputStream fos = new FileOutputStream("unbuffered.bin")) {
    for (int i = 0; i < 1_000_000; i++) {
        fos.write(i & 0xFF);   // 1,000,000 individual write() system calls
    }
}
System.out.printf("Unbuffered: %.2f ms%n", (System.nanoTime() - start) / 1e6);
// Typical: 2000–10000ms — completely unacceptable

// Buffered write — one system call per 8KB (correct):
start = System.nanoTime();
try (BufferedOutputStream bos = new BufferedOutputStream(
        new FileOutputStream("buffered.bin"))) {
    for (int i = 0; i < 1_000_000; i++) {
        bos.write(i & 0xFF);   // writes to 8KB in-memory buffer
    }
    // bos.flush() called by close() via try-with-resources
}
System.out.printf("Buffered: %.2f ms%n", (System.nanoTime() - start) / 1e6);
// Typical: 3–10ms — 100–1000x faster

// Chunked write — explicitly writing arrays (often fastest):
start = System.nanoTime();
try (FileOutputStream fos = new FileOutputStream("chunked.bin")) {
    byte[] chunk = new byte[65536];   // 64KB chunks
    for (int offset = 0; offset < 1_000_000; offset += chunk.length) {
        int len = Math.min(chunk.length, 1_000_000 - offset);
        fos.write(chunk, 0, len);
    }
}
System.out.printf("Chunked: %.2f ms%n", (System.nanoTime() - start) / 1e6);
// Typical: 2–8ms — comparable to buffered

// ── Flush semantics — data loss without flush/close ──────────────────
// DANGEROUS: data loss if JVM crashes before close()
BufferedWriter writer = new BufferedWriter(new FileWriter("output.txt"));
writer.write("Important data");
// If program crashes here — "Important data" is in the buffer, NOT on disk
// JVM termination without close() loses buffered data permanently

// SAFE: try-with-resources always calls close() → close() calls flush()
try (BufferedWriter safeWriter = new BufferedWriter(new FileWriter("output.txt"))) {
    safeWriter.write("Important data");
    // Even if exception thrown, try-with-resources calls close() → flush() → data saved
}

// EXPLICIT FLUSH: use when you need confirmation of write before close()
try (BufferedOutputStream bos = new BufferedOutputStream(
        new FileOutputStream("critical.bin"))) {
    bos.write(criticalData);
    bos.flush();   // pushes data to the OS, not necessarily to disk; FileChannel.force(true) or FileDescriptor.sync() does that
    verifyDataWasWritten();   // can now check for partial-write errors
}

// ── Custom buffer size — when 8KB default isn't optimal ──────────────
// Larger buffer for bulk I/O operations:
try (BufferedInputStream bis = new BufferedInputStream(
        new FileInputStream("large-file.bin"), 65536)) {   // 64KB buffer
    // Each fill of the buffer reads 64KB in one system call
    byte[] buf = new byte[65536];
    int n;
    while ((n = bis.read(buf)) != -1) {
        process(buf, 0, n);   // same hypothetical helper as above: (array, offset, length)
    }
}

// ── Decorator chain — always close the OUTERMOST wrapper ─────────────
// CORRECT: close the outermost — propagates through the chain:
FileInputStream fis = new FileInputStream("file.bin");
BufferedInputStream bis = new BufferedInputStream(fis);
DataInputStream dis = new DataInputStream(bis);
dis.close();   // closes DataInputStream → BufferedInputStream → FileInputStream

// try-with-resources handles this correctly:
try (DataInputStream dis2 = new DataInputStream(
        new BufferedInputStream(new FileInputStream("file.bin")))) {
    int version = dis2.readInt();
    // dis2.close() called: closes DataInputStream → BufferedInputStream → FileInputStream
}

// WRONG: closing an inner stream leaves the outer wrappers unaware:
// fis.close();   // closes the file descriptor, but bis and dis still look open
// bis.read();    // throws IOException as soon as the buffer needs refilling
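
Both rules, flush before relying on written data and close only the outermost wrapper, can be observed with in-memory streams. The sketch below is a minimal illustration; FlushDemo and TrackingStream are hypothetical names introduced here, and TrackingStream exists only to record whether close() reached the innermost stream.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class FlushDemo {

    // Returns { bytes visible in the sink before flush(), bytes visible after flush() }.
    static int[] visibleBytes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink)) {
            out.write(new byte[] {1, 2, 3, 4, 5});
            int before = sink.size();   // still held in the wrapper's buffer
            out.flush();
            int after = sink.size();    // now handed to the underlying stream
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Innermost stream that remembers whether close() was called on it.
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;

        TrackingStream(byte[] data) { super(data); }

        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Closes only the outermost wrapper and reports whether the innermost stream was closed too.
    static boolean innerClosedByOuterClose() {
        TrackingStream inner = new TrackingStream(new byte[] {0, 0, 0, 42});
        try (DataInputStream outer = new DataInputStream(new BufferedInputStream(inner))) {
            outer.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return inner.closed;
    }

    public static void main(String[] args) {
        int[] sizes = visibleBytes();
        System.out.println(sizes[0] + " -> " + sizes[1]);   // 0 -> 5
        System.out.println(innerClosedByOuterClose());      // true
    }
}
```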

Standard Streams, AutoCloseable, and the try-with-resources Contract

Java provides three pre-opened standard I/O streams through the System class: System.in (an InputStream connected to the process's standard input), System.out (a PrintStream connected to standard output), and System.err (a PrintStream connected to standard error). These streams are opened when the JVM starts, owned by the JVM, and should not be closed by application code. Closing System.out or System.err prevents further output, which is almost never the intent.

PrintStream, the type of System.out and System.err, is a FilterOutputStream that adds print() and println() overloads for the primitive types and for Object (via toString()). PrintStream suppresses IOExceptions internally, converting them to an error flag checkable via checkError(). This makes PrintStream convenient for console output but a poor fit for file or network output where failures must be detected. PrintWriter offers the same print/println API over a Writer (or, through convenience constructors, an OutputStream), but it is no different in this respect: its methods never throw IOException either, and failures are reported only through checkError(). When write failures must surface as exceptions, write through a Writer such as BufferedWriter or FileWriter directly, or call checkError() after printing and throw yourself.

System.setIn(), System.setOut(), and System.setErr() allow replacing the standard streams with custom implementations. This is used in tests to capture console output (replace System.out with a PrintStream over a ByteArrayOutputStream) and to pipe input (replace System.in with a ByteArrayInputStream containing prepared test data). Always restore the original streams after a test to avoid polluting subsequent tests.

AutoCloseable and Closeable are the interfaces that make try-with-resources work. AutoCloseable.close() may throw any Exception. Closeable extends AutoCloseable, narrows the declared exception to IOException, and requires close() to be idempotent (calling it multiple times has the same effect as calling it once). All java.io streams implement Closeable, which means they can be used in try-with-resources.

The try-with-resources construct guarantees that close() is called on every successfully initialized resource, in the reverse order of declaration, even if exceptions occur in the body or in earlier close() calls. It is the most reliable way to manage I/O resources in Java; manually calling close() in finally blocks is verbose and error-prone, because an exception thrown by close() inside finally replaces the exception from the try body.

The suppressed exception mechanism handles the case where both the try body and close() throw exceptions: the exception from close() is suppressed (attached to the primary exception via addSuppressed()) rather than replacing it. This ensures the original exception, the one that likely caused the failure, is not lost. Suppressed exceptions are visible in stack traces and accessible via Throwable.getSuppressed().
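
The difference between an error flag and a propagated exception is easiest to see with a destination that always fails. In the sketch below, FailingWriter is a hypothetical Writer that throws on every write; the demo class and method names are likewise illustrative assumptions. PrintWriter's documented behavior is that its methods never throw I/O exceptions and that checkError() reports them, whereas BufferedWriter lets the IOException through.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;

final class PrintWriterErrorDemo {

    // Hypothetical destination that fails on every write, standing in for a full disk.
    static final class FailingWriter extends Writer {
        @Override public void write(char[] cbuf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }
        @Override public void flush() { /* nothing to flush */ }
        @Override public void close() { /* nothing to close */ }
    }

    // PrintWriter: println() returns normally; only checkError() reveals the failure.
    static boolean printWriterReportsViaFlag() {
        PrintWriter pw = new PrintWriter(new FailingWriter());
        pw.println("hello");              // no exception reaches the caller
        boolean failed = pw.checkError(); // flushes, then returns the error flag
        pw.close();
        return failed;
    }

    // BufferedWriter: the same failure surfaces as an IOException once the buffer is written out.
    static boolean bufferedWriterThrows() {
        try (BufferedWriter bw = new BufferedWriter(new FailingWriter())) {
            bw.write("hello");
            bw.flush();                   // forces the buffered chars to the failing writer
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(printWriterReportsViaFlag());   // true
        System.out.println(bufferedWriterThrows());        // true
    }
}
```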
Java
// ── System standard streams ───────────────────────────────────────────
System.out.println("Standard output");      // PrintStream — no IOException
System.err.println("Standard error");       // PrintStream — writes to stderr
System.out.flush();                         // explicit flush for time-sensitive output

// Reading from System.in: wrap it for usability, but do not close the wrapper,
// because closing it would also close System.in for the rest of the program.
BufferedReader stdin = new BufferedReader(
        new InputStreamReader(System.in, StandardCharsets.UTF_8));
System.out.print("Enter name: ");
String line = stdin.readLine();
System.out.println("Hello, " + line);

// Scanner is a higher-level wrapper around System.in:
Scanner scanner = new Scanner(System.in, StandardCharsets.UTF_8);
String input = scanner.nextLine();
scanner.close();   // but closing Scanner also closes System.in — use with care in tests

// ── Redirecting standard streams for testing ──────────────────────────
PrintStream originalOut = System.out;     // save original
ByteArrayOutputStream captured = new ByteArrayOutputStream();
System.setOut(new PrintStream(captured, true, StandardCharsets.UTF_8));

try {
    System.out.println("This is captured");
    System.out.printf("Value: %d%n", 42);
} finally {
    System.setOut(originalOut);           // ALWAYS restore
}

String output = captured.toString(StandardCharsets.UTF_8);
System.out.println("Captured: " + output.strip());
// Prints (the captured text keeps its own line break, with no added indentation):
// Captured: This is captured
// Value: 42

// ── PrintStream and PrintWriter: both suppress IOException ────────────
// PrintStream: suppresses IOException, sets an error flag
try (PrintStream ps = new PrintStream(
        new FileOutputStream("out.txt"), true, StandardCharsets.UTF_8)) {
    ps.println("hello");
    if (ps.checkError()) {
        System.err.println("Write failed: PrintStream swallowed the exception!");
    }
}

// PrintWriter: same behavior; print/println never throw IOException,
// so check the error flag explicitly and convert it to an exception
try (PrintWriter pw = new PrintWriter(new BufferedWriter(
        new FileWriter("out.txt", StandardCharsets.UTF_8)))) {
    pw.println("hello");
    pw.println("world");
    if (pw.checkError()) throw new IOException("PrintWriter write failed");
}

// ── try-with-resources: the preferred way to manage I/O resources ─────
// FRAGILE: manual try/finally is verbose, and close() failures are easy to mishandle
InputStream is = null;
try {
    is = new FileInputStream("file.txt");
    process(is);
} finally {
    if (is != null) {
        try { is.close(); }
        catch (IOException e) { /* close() failure silently discarded; rethrowing it here would mask any exception from process() */ }
    }
}

// CORRECT: try-with-resources — handles suppressed exceptions properly
try (InputStream is2 = new FileInputStream("file.txt")) {
    process(is2);
}   // is2.close() always called, exceptions properly suppressed

// Multiple resources — closed in reverse declaration order:
try (FileInputStream fis  = new FileInputStream("input.txt");
     FileOutputStream fos = new FileOutputStream("output.txt");
     DataInputStream  dis = new DataInputStream(fis)) {
    // Closed order on exit: dis → fos → fis
    int n = dis.readInt();
    fos.write(n & 0xFF);
}

// ── Suppressed exceptions — what try-with-resources does internally ───
// If process(is) throws exception A, and is.close() throws exception B:
// A is the primary exception (from the body)
// B is suppressed and attached to A
try (InputStream failStream = new InputStream() {
    @Override public int read() throws IOException { throw new IOException("Read failed"); }
    @Override public void close() throws IOException { throw new IOException("Close failed"); }
}) {
    failStream.read();
} catch (IOException e) {
    System.out.println("Primary:   " + e.getMessage());           // Read failed
    System.out.println("Suppressed: " + e.getSuppressed()[0].getMessage()); // Close failed
}
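
The reverse-order guarantee can be made visible with resources that log their own close() calls. The following is a minimal sketch; CloseOrderDemo and its nested Named resource are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {

    // Resource that records its own close() call in a shared log.
    static final class Named implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Named(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override public void close() {
            log.add("close " + name);
        }
    }

    // Normal completion: resources close in reverse declaration order.
    static List<String> normalExit() {
        List<String> log = new ArrayList<>();
        try (Named a = new Named("a", log);
             Named b = new Named("b", log);
             Named c = new Named("c", log)) {
            log.add("body");
        }
        return log;
    }

    // Exceptional exit: resources are still closed, and before the catch block runs.
    static List<String> bodyThrows() {
        List<String> log = new ArrayList<>();
        try (Named a = new Named("a", log);
             Named b = new Named("b", log)) {
            log.add("body");
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(normalExit());   // [body, close c, close b, close a]
        System.out.println(bodyThrows());   // [body, close b, close a, caught boom]
    }
}
```

Closing in reverse order matters for decorator-style dependencies: a resource declared later may wrap or depend on one declared earlier, so it has to be closed first.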

Related Topics in Java I/O

Byte Streams
Byte streams are the fundamental I/O abstraction in Java for reading and writing raw binary data. InputStream and OutputStream are the abstract base classes for all byte-oriented I/O, and their concrete subclasses cover every byte-level data source and destination: files, byte arrays in memory, network sockets, pipes between threads, and process standard streams. The critical read() contract — returning an int from 0 to 255 for valid bytes and -1 for end-of-stream — is the foundation of all stream-based binary processing. Byte streams do not perform character encoding or decoding; every byte is passed through as-is, making them correct for binary formats (images, audio, archives, serialized data, protocol buffers), and incorrect for text unless the encoding is explicitly managed. This entry covers the complete InputStream and OutputStream APIs, every major concrete byte stream class and its use case, DataInputStream and DataOutputStream for structured binary I/O, the mark/reset mechanism, available() and its correct interpretation, skipping and transferTo, and ObjectInputStream and ObjectOutputStream for Java serialization.
Character Streams
Character streams, represented by the Reader and Writer abstract base classes, handle text data by abstracting away the encoding and decoding between Java's internal char/String representation (UTF-16) and the byte encoding used in files and network connections. Where byte streams treat data as raw octets, character streams treat data as Unicode characters, handling multi-byte sequences transparently according to a specified Charset. InputStreamReader and OutputStreamWriter are the bridge classes that connect byte streams to character streams, applying charset encoding on write and decoding on read. BufferedReader adds line-at-a-time reading via readLine() and multi-character buffering. PrintWriter adds print/println/printf formatting output. StringReader and StringWriter enable in-memory character stream operations on String data. This entry covers the complete Reader and Writer APIs, charset handling and the consequences of using the wrong charset, the complete class hierarchy of character streams with the use case for each, BufferedReader.readLine() semantics and the lines() stream, the bridge classes in depth, character encoding best practices, and the interaction between character streams and Java's String.lines() and Files.readString()/writeString() alternatives.
File Handling
File handling in Java spans two generations of API: the legacy java.io.File class introduced in Java 1.0, and the modern java.nio.file package (NIO.2) introduced in Java 7 with its Path interface, Files utility class, and FileSystem abstraction. The File class represents a file or directory path as an abstract pathname and provides methods for querying metadata, listing directory contents, creating and deleting files, and basic path manipulation. Its limitations — no symbolic link support, inconsistent error reporting (methods return boolean instead of throwing exceptions), no atomic operations, limited metadata access, and performance issues for large directory traversals — motivated the complete redesign in NIO.2. The Path interface and Files class cover all functionality of File with better exception handling, symbolic link support, atomic operations, rich metadata via BasicFileAttributes, efficient directory walking with Files.walk() and Files.walkFileTree(), file watching with WatchService, and a provider model for custom file system implementations. This entry covers the complete File API and its limitations, the NIO.2 Path and Files APIs, directory traversal strategies, file watching, temporary files, and best practices for cross-platform path handling.
File Class
The java.io.File class is Java's original file system abstraction, present since Java 1.0. A File object represents an abstract pathname: a string denoting a file or directory that may or may not exist on the file system. File objects are immutable: once constructed, the path string they represent never changes. The class provides a comprehensive set of methods for path manipulation, file system queries, directory operations, and file creation and deletion. File served as the primary file system API for 15 years (1996 to 2011) until NIO.2's Path and Files classes superseded it in Java 7. Understanding File is essential for reading existing Java codebases, working with older APIs that accept File parameters, and understanding why NIO.2 was designed the way it was. This entry covers the complete File API in depth: all constructor forms and path semantics, every query and mutation method with its exact return and failure semantics, the listFiles() filtering API, path resolution and relative path handling, platform-specific behavior differences, the interoperability bridge between File and Path, and a precise catalog of File's deficiencies that motivated NIO.2.