
Buffered Streams

Buffered streams wrap an underlying unbuffered stream with an in-memory buffer, sharply reducing the number of native I/O system calls by batching reads and writes. Without buffering, each read() or write() call typically results in one OS system call, which switches the CPU between user mode and kernel mode, an operation that costs on the order of hundreds to thousands of CPU cycles. With buffering, data is read from or written to the OS in large chunks (8192 bytes or chars by default), and individual application read() and write() calls are served from memory, requiring no system call until the buffer fills (on write) or empties (on read). Java provides four buffered stream classes: BufferedInputStream and BufferedOutputStream for byte-level I/O, and BufferedReader and BufferedWriter for character-level I/O. All four wrap an existing stream, are transparent to the application (the same read/write API), and substantially improve performance for I/O-intensive code. This entry covers the internal buffer mechanics, buffer sizing guidance, the flush contract (when buffered data is actually written), the decorator pattern that makes buffering composable, mark/reset functionality in BufferedInputStream, and the readLine() convenience in BufferedReader.
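As a small illustration of the character-level classes and the readLine() convenience mentioned above, the following sketch round-trips a few lines through BufferedWriter and BufferedReader entirely in memory. The class name LineRoundTrip and its helper methods are hypothetical, chosen only for this example; StringWriter and StringReader stand in for a real file or socket.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative, not part of any library.
final class LineRoundTrip {

    // Writes each line through a BufferedWriter and returns the resulting text.
    static String writeLines(List<String> lines) {
        StringWriter sink = new StringWriter();
        try (BufferedWriter writer = new BufferedWriter(sink)) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');   // fixed separator; newLine() would use the platform default
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();       // safe here: close() has already flushed the buffer
    }

    // Reads the text back line by line. readLine() strips the line terminator
    // and returns null at end of stream.
    static List<String> readLines(String text) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = writeLines(List.of("alpha", "beta", "gamma"));
        System.out.println(readLines(text));   // prints [alpha, beta, gamma]
    }
}
```

The calling code uses the ordinary Writer and Reader methods plus readLine(); the buffering itself is invisible apart from the need to close (or flush) the writer before inspecting the sink.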

Buffer Mechanics — How Buffering Reduces System Calls

Every Java I/O stream ultimately communicates with the OS through system calls. A read() on an unbuffered FileInputStream triggers a read() system call that switches to kernel mode, copies data from the OS page cache, disk, or network, and returns to user mode. That round trip typically costs from hundreds of nanoseconds to several microseconds, even for a single byte. Writing one byte at a time to an unbuffered FileOutputStream produces one system call per byte, turning a 1MB write into roughly one million system calls, which is impractical for any performance-sensitive application.

Buffered streams interpose an in-memory byte array between the application and the OS. On a BufferedInputStream.read(), if the buffer is non-empty, the byte is taken from the buffer at memory speed with no system call. Only when the buffer is empty does the stream make one bulk read() call on the underlying stream to refill the buffer. With the default buffer size of 8192 bytes, reading a file sequentially one byte at a time takes about one system call per 8KB instead of one per byte, a reduction of up to a factor of 8192. On the write side, BufferedOutputStream.write(int) adds the byte to the buffer. Only when the buffer fills, when flush() is called, or when the stream is closed does the stream issue a write on the underlying stream with the buffered contents. For writing many small values (like writing JSON character by character), this batching can reduce system calls by orders of magnitude.

The buffer size can be customized: BufferedInputStream(InputStream in, int size) and BufferedOutputStream(OutputStream out, int size). The default 8192 bytes is appropriate for most use cases. Larger buffers (65536 bytes, 1MB) benefit sequential I/O of large files where the OS can perform efficient large sequential reads. Smaller buffers waste less memory when many streams are open simultaneously. The optimal size depends on the OS filesystem block size and the access pattern. For most networked I/O, the default is appropriate; for disk-bound large file processing, 64KB to 256KB buffers often improve throughput.
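The batching described above can be observed without timing anything. The sketch below wraps an in-memory source and sink in counting streams and reports how many calls actually reach them; each such call stands in for the system call a real FileInputStream or FileOutputStream would make. CallCounter, CountingSink, and CountingSource are hypothetical names invented for this illustration, and a bufferSize of 0 is this sketch's own convention for "no buffering".

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class: counts calls that reach the wrapped stream.
final class CallCounter {

    // Discards data and counts every write call it receives.
    static final class CountingSink extends OutputStream {
        int writes;
        @Override public void write(int b) { writes++; }
        @Override public void write(byte[] b, int off, int len) { writes++; }
    }

    // Counts every read call that returns data (the final end-of-stream call is not counted).
    static final class CountingSource extends FilterInputStream {
        int reads;
        CountingSource(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            int b = super.read();
            if (b != -1) reads++;
            return b;
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            int n = super.read(b, off, len);
            if (n > 0) reads++;
            return n;
        }
    }

    // Writes totalBytes one byte at a time; bufferSize 0 means "no buffering" (sketch convention).
    static int writesReachingSink(int totalBytes, int bufferSize) {
        CountingSink sink = new CountingSink();
        try (OutputStream out = bufferSize > 0 ? new BufferedOutputStream(sink, bufferSize) : sink) {
            for (int i = 0; i < totalBytes; i++) {
                out.write(i & 0xFF);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writes;
    }

    // Reads totalBytes one byte at a time; bufferSize 0 means "no buffering" (sketch convention).
    static int readsReachingSource(int totalBytes, int bufferSize) {
        CountingSource source = new CountingSource(new ByteArrayInputStream(new byte[totalBytes]));
        try (InputStream in = bufferSize > 0 ? new BufferedInputStream(source, bufferSize) : source) {
            while (in.read() != -1) {
                // consume one byte per call, as naive application code might
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.reads;
    }

    public static void main(String[] args) {
        System.out.println(writesReachingSink(1_000_000, 0));       // 1000000
        System.out.println(writesReachingSink(1_000_000, 8192));    // 123
        System.out.println(writesReachingSink(1_000_000, 65_536));  // 16
        System.out.println(readsReachingSource(100_000, 0));        // 100000
        System.out.println(readsReachingSource(100_000, 8192));     // 13
        System.out.println(readsReachingSource(100_000, 65_536));   // 2
    }
}
```

The counts follow directly from ceil(totalBytes / bufferSize). Real streams add one further read to detect end of stream, and actual timings depend on the OS and storage, so this sketch measures call counts only.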

Java

// ── Performance comparison: unbuffered vs buffered ────────────────────
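import java.io.*;
import java.net.Socket;                  // used by the socket examples
import java.util.zip.GZIPOutputStream;   // used by the compression examples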

// Writing 1MB one byte at a time WITHOUT buffering:
long start = System.nanoTime();
try (FileOutputStream fos = new FileOutputStream("test.bin")) {
    for (int i = 0; i < 1_000_000; i++) {
        fos.write(i & 0xFF);   // 1,000,000 system calls — catastrophic
    }
}
System.out.printf("Unbuffered: %.0fms%n", (System.nanoTime() - start) / 1e6);
// Illustrative only: often on the order of a second or more; varies with OS, JVM, and storage

// Writing 1MB one byte at a time WITH buffering:
start = System.nanoTime();
try (BufferedOutputStream bos = new BufferedOutputStream(
        new FileOutputStream("test_buffered.bin"))) {
    for (int i = 0; i < 1_000_000; i++) {
        bos.write(i & 0xFF);   // bytes go to the 8192-byte buffer: ~123 system calls total
    }
}   // close() flushes the remaining buffer with a final system call
System.out.printf("Buffered:   %.0fms%n", (System.nanoTime() - start) / 1e6);
// Illustrative only: typically a few milliseconds, orders of magnitude faster

// ── Default buffer size ───────────────────────────────────────────────
try (BufferedInputStream defaultBIS = new BufferedInputStream(new FileInputStream("test.bin"))) {
    // Default buffer is 8192 bytes; the size is not exposed by the public API
}

// ── Custom buffer size ────────────────────────────────────────────────
// 64KB buffer for large sequential file reads:
int BUFFER_SIZE = 65_536;
try (BufferedInputStream bigBuffer = new BufferedInputStream(
        new FileInputStream("large.bin"), BUFFER_SIZE)) {
    byte[] chunk = new byte[BUFFER_SIZE];
    int bytesRead;
    while ((bytesRead = bigBuffer.read(chunk)) != -1) {
        process(chunk, bytesRead);
    }
}

// Small buffer (1KB) for many small streams where memory is constrained:
try (BufferedInputStream small = new BufferedInputStream(
        new FileInputStream("small.txt"), 1024)) {
    // works correctly — just refills buffer more often
}

// ── System call visualization ─────────────────────────────────────────
// File: 100,000 bytes
// Unbuffered read():  100,000 system calls (read 1 byte each)
// Buffered read():    ~13 system calls    (read 8192 bytes each = ceil(100000/8192))
// Manual bulk read(): usually 1 system call (read(byte[100000]); a single read may return fewer bytes)

// ── The decorator pattern: stacking buffering onto any stream ──────────
// BufferedInputStream wraps ANY InputStream:
InputStream network = socket.getInputStream();
BufferedInputStream bufferedNetwork = new BufferedInputStream(network);

// BufferedOutputStream wraps ANY OutputStream:
OutputStream compressor = new GZIPOutputStream(new FileOutputStream("out.gz"));
BufferedOutputStream bufferedCompressor = new BufferedOutputStream(compressor);
// Writing to bufferedCompressor: buffered → GZIP compressed → file

The Flush Contract and Stream Closing

The flush contract defines when buffered data actually reaches the underlying stream. A BufferedOutputStream (or BufferedWriter) holds data in memory until one of three things happens: the buffer fills (at which point it is automatically flushed to make room for new data), flush() is called explicitly, or the stream is closed via close() (which always flushes first). Between these events, the buffered data is invisible to the underlying stream, the file on disk, or the remote host on a network connection.

This creates a potential data loss scenario: if the process terminates or crashes, or the stream is not properly closed, buffered data that was not flushed is lost. The standard pattern for avoiding this is try-with-resources: wrapping the outermost stream in try (BufferedOutputStream bos = ...) { ... } ensures that close() is called even if an exception is thrown, which flushes the buffer before closing. If try-with-resources is not used, the finally { bos.close(); } pattern provides the same guarantee. Never rely on garbage collection to close and flush streams: GC does not run predictably enough, and finalization is deprecated.

The flush()-then-continue pattern applies when the application needs to ensure that data is visible to a reader on the other end of a network connection (or another process reading a pipe) without closing the stream. After writing a complete protocol message, calling flush() hands it to the OS for sending. For file I/O, flush() ensures the data has been handed to the OS (in the OS buffer); calling the underlying FileOutputStream.getFD().sync() additionally asks the OS to write it to the physical disk.

When multiple streams are stacked, for example BufferedOutputStream wrapping GZIPOutputStream wrapping FileOutputStream, calling flush() or close() on the outermost stream propagates the operation through the chain: BufferedOutputStream flushes its buffer to GZIPOutputStream, which forwards the call to FileOutputStream. Note that GZIPOutputStream.flush() does not by default force out data still pending inside the compressor (that requires constructing it with syncFlush set to true); only close() or finish() guarantees that the complete compressed stream and its trailer are written. On close(), each layer flushes its own data before closing the layer beneath it, so the innermost stream is always closed last, which is the correct order.
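The contract can be checked directly by using an in-memory sink whose size is observable at each step. This is a minimal sketch: FlushVisibility is a hypothetical class name, and ByteArrayOutputStream stands in for a file or socket so that "visible to the underlying stream" can be measured with size().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class showing when buffered bytes become visible to the wrapped stream.
final class FlushVisibility {

    // Returns the sink size after write(), after flush(), and after close(), in that order.
    static int[] visibleSizes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int[] sizes = new int[3];
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 8192)) {
            out.write("Hello".getBytes(StandardCharsets.US_ASCII));
            sizes[0] = sink.size();   // still buffered: nothing has reached the sink
            out.flush();
            sizes[1] = sink.size();   // flush() pushed the 5 bytes down
            out.write(" World".getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        sizes[2] = sink.size();       // close() flushed the remaining 6 bytes
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(visibleSizes()));   // prints [0, 5, 11]
    }
}
```

With a real file or socket the same three moments apply, except that "visible" then means handed to the OS rather than stored in a Java array.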

Java

// ── Flush contract: data is NOT written until flush or close ─────────
File file = new File("output.txt");
BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(file));
bos.write("Hello".getBytes());
// At this point, "Hello" is in the 8192-byte buffer — NOT on disk
// If the process crashes here, the file is empty

bos.flush();   // now "Hello" is handed to the OS — visible to other processes
bos.write(" World".getBytes());
bos.close();   // flush() + close on underlying stream — " World" now on disk

// ── try-with-resources: guaranteed flush and close ────────────────────
// CORRECT — always use try-with-resources for streams:
try (BufferedOutputStream safe = new BufferedOutputStream(
        new FileOutputStream("safe.bin"))) {
    for (int i = 0; i < 100_000; i++) {
        safe.write(i & 0xFF);
    }
}   // close() called automatically — flushes buffer — all 100,000 bytes written

// WRONG — exception before close() loses data:
BufferedOutputStream unsafe = new BufferedOutputStream(
        new FileOutputStream("unsafe.bin"));
for (int i = 0; i < 100_000; i++) {
    unsafe.write(i & 0xFF);
    if (i == 50_000) throw new RuntimeException("Crash!");   // buffer not flushed
}
unsafe.close();   // never reached: the ~850 bytes still in the buffer are lost and the file handle leaks

// ── Stacked streams: close/flush propagates through the chain ─────────
try (BufferedOutputStream bos2 = new BufferedOutputStream(
        new GZIPOutputStream(
            new FileOutputStream("data.gz")))) {

    bos2.write(largeData);
}
// close() chain: BufferedOutputStream.close()
//                → flush buffer to GZIPOutputStream
//                → GZIPOutputStream.close() (writes GZIP trailer)
//                → FileOutputStream.close() (closes file descriptor)

// ── flush() for network protocols ─────────────────────────────────────
try (Socket socket = new Socket("api.example.com", 80);
     BufferedOutputStream out = new BufferedOutputStream(socket.getOutputStream())) {

    String request = "GET / HTTP/1.0\r\nHost: api.example.com\r\n\r\n";
    out.write(request.getBytes());
    out.flush();   // critical: without this, the server never receives the request
                   // — it's still in the 8192-byte buffer on our end

    // Now read the response...
}

// ── sync() for durability guarantees ─────────────────────────────────
try (FileOutputStream fos = new FileOutputStream("critical.dat");
     BufferedOutputStream bos3 = new BufferedOutputStream(fos)) {

    bos3.write(criticalData);
    bos3.flush();          // hand data to OS kernel buffer
    fos.getFD().sync();    // force OS to write kernel buffer to physical disk
    // After sync() returns, the data should survive a crash or power failure
    // (assuming the storage hardware honours the flush request)
}

mark/reset in BufferedInputStream and Composition Patterns

BufferedInputStream supports the mark/reset API that its parent InputStream declares. mark(int readlimit) marks the current position in the stream; reset() returns the stream to the marked position. The readlimit argument specifies how many bytes can be read after mark() before the mark may become invalid. BufferedInputStream maintains the mark by keeping the buffered data from the marked position in memory even after it would normally be discarded. If more than readlimit bytes are read, that data may be discarded, after which reset() throws IOException ("Resetting to invalid mark").

The mark/reset capability is used for look-ahead parsing: read bytes to determine the format or content, then reset and re-read with the appropriate parser. It is also used by format detectors (determine whether a stream is a PNG, JPEG, or GIF by reading the first few bytes, then reset and handle accordingly). BufferedInputStream is one of the few InputStream implementations where markSupported() returns true; raw FileInputStream and network streams do not support mark/reset.

Composing streams effectively: the standard layering for high-performance file processing is BufferedInputStream(FileInputStream) for reading and BufferedOutputStream(FileOutputStream) for writing. For character I/O, BufferedReader(new InputStreamReader(new FileInputStream(path), charset)) correctly handles encoding while buffering. The outermost stream is always the one the application code interacts with; the innermost is the actual data source or sink; intermediate streams add transformations (decompression, encoding conversion, buffering, encryption).

A common composition mistake is wrapping one BufferedOutputStream directly in another. This double buffering is wasteful but not incorrect: data is copied into the outer buffer first, and the inner buffer adds little or nothing before the data reaches the OS. More harmful is placing a DataOutputStream directly on an unbuffered stream. Such classes forward each write() call to whatever stream they wrap, so new DataOutputStream(new BufferedOutputStream(new FileOutputStream(path))) is buffered, whereas new DataOutputStream(new FileOutputStream(path)) sends every small write straight to the file. The same reasoning applies to a PrintWriter built on a Writer: it adds no buffer of its own, so wrap the Writer in a BufferedWriter first.
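The look-ahead use of mark/reset can be reduced to a small "peek" helper. The following is a sketch under stated assumptions: PeekDemo, peek, and peekThenReadAll are hypothetical names, the input is an in-memory array rather than a file, and readNBytes/readAllBytes require Java 9 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: look at the first bytes of a stream without consuming them.
final class PeekDemo {

    // Returns up to n leading bytes, then rewinds the stream to where it was.
    static byte[] peek(InputStream in, int n) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            in.mark(n);                            // mark stays valid for at least n bytes
            byte[] head = new byte[n];
            int got = in.readNBytes(head, 0, n);   // may be less than n at end of stream
            in.reset();                            // rewind to the marked position
            return Arrays.copyOf(head, got);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Peeks n bytes, then reads the whole stream; returns "<peeked>|<everything>".
    static String peekThenReadAll(byte[] data, int n) {
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            String peeked = new String(peek(in, n), StandardCharsets.US_ASCII);
            String all = new String(in.readAllBytes(), StandardCharsets.US_ASCII);
            return peeked + "|" + all;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        System.out.println(peekThenReadAll(data, 4));   // prints %PDF|%PDF-1.7
    }
}
```

Because the helper never reads more than the readlimit it passed to mark(), the reset() call here is always within the contract; reading further before resetting would risk the invalid-mark exception described above.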

Java

// ── mark/reset in BufferedInputStream ────────────────────────────────
byte[] data = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};  // PNG signature (0x89 needs a cast to fit in a byte)
try (BufferedInputStream bis = new BufferedInputStream(
        new ByteArrayInputStream(data))) {

    // Check if markSupported() before using:
    System.out.println("Mark supported: " + bis.markSupported());  // true

    bis.mark(8);   // remember position 0; the mark stays valid for at least 8 bytes
    byte[] header = new byte[4];
    int n = bis.read(header);   // read first 4 bytes: [0x89, 0x50, 0x4E, 0x47]

    // Compare against the PNG magic bytes inline:
    boolean isPng = n == 4 && header[0] == (byte) 0x89
            && header[1] == 'P' && header[2] == 'N' && header[3] == 'G';
    System.out.println("Detected PNG: " + isPng);

    bis.reset();   // return to the mark: the next read starts at position 0 again

    // Now re-read with the proper parser for the detected format:
    if (isPng) {
        parsePng(bis);   // reads from position 0 again
    }
}

// ── Format detection with mark/reset ─────────────────────────────────
public static String detectFormat(BufferedInputStream bis) throws IOException {
    if (!bis.markSupported()) throw new IllegalArgumentException("Stream must support mark");

    bis.mark(16);   // allow reading up to 16 bytes for detection
    byte[] magic = new byte[4];
    int read = bis.readNBytes(magic, 0, magic.length);   // Java 9+: reads until 4 bytes or end of stream
    bis.reset();    // always reset after detection

    if (read < 4) return "UNKNOWN";
    // PNG: 89 50 4E 47
    if (magic[0] == (byte)0x89 && magic[1] == 'P' && magic[2] == 'N' && magic[3] == 'G')
        return "PNG";
    // JPEG: FF D8 FF
    if (magic[0] == (byte)0xFF && magic[1] == (byte)0xD8 && magic[2] == (byte)0xFF)
        return "JPEG";
    // PDF: %PDF
    if (magic[0] == '%' && magic[1] == 'P' && magic[2] == 'D' && magic[3] == 'F')
        return "PDF";
    return "UNKNOWN";
}

// ── Standard composition patterns ────────────────────────────────────
// Byte file I/O — the standard pattern:
try (BufferedInputStream in   = new BufferedInputStream(new FileInputStream("in.bin"));
     BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream("out.bin"))) {
    byte[] buffer = new byte[8192];
    int n;
    while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
    }
}

// Character file I/O with explicit encoding — the standard pattern:
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream("text.txt"), StandardCharsets.UTF_8));
     BufferedWriter writer = new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream("out.txt"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        writer.write(line);
        writer.newLine();
    }
}

// With compression:
try (BufferedInputStream in = new BufferedInputStream(
        new GZIPInputStream(new FileInputStream("data.gz")));
     BufferedOutputStream out = new BufferedOutputStream(
        new FileOutputStream("decompressed.bin"))) {
    byte[] buf = new byte[65536];
    int n;
    while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
}
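The layering point about DataOutputStream made in this section can be checked by counting how many write calls reach the bottom of the stack. This is a sketch only: LayeringDemo and CountingSink are hypothetical names, and the counting sink stands in for a FileOutputStream, where each call would be a system call.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class: compares DataOutputStream with and without a buffer beneath it.
final class LayeringDemo {

    // Discards data and counts every write call it receives.
    static final class CountingSink extends OutputStream {
        int writes;
        @Override public void write(int b) { writes++; }
        @Override public void write(byte[] b, int off, int len) { writes++; }
    }

    // Writes intCount ints and returns how many write calls reached the sink.
    static int sinkWrites(int intCount, boolean buffered) {
        CountingSink sink = new CountingSink();
        OutputStream target = buffered ? new BufferedOutputStream(sink, 8192) : sink;
        try (DataOutputStream out = new DataOutputStream(target)) {
            for (int i = 0; i < intCount; i++) {
                out.writeInt(i);   // 4 bytes per value
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writes;
    }

    public static void main(String[] args) {
        // Unbuffered: at least one call per writeInt (the exact count depends on the JDK version).
        System.out.println(sinkWrites(1000, false));
        // Buffered: 4000 bytes fit in the 8192-byte buffer, so a single write happens at close().
        System.out.println(sinkWrites(1000, true));
    }
}
```

The unbuffered figure is deliberately left unstated because DataOutputStream's internals differ between JDK releases; the buffered case collapses to one write either way.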
