Character Streams

Character streams, represented by the Reader and Writer abstract base classes, handle text data by abstracting away the encoding and decoding between Java's internal char/String representation (UTF-16) and the byte encoding used in files and network connections. Where byte streams treat data as raw octets, character streams treat data as Unicode characters, handling multi-byte sequences transparently according to a specified Charset. InputStreamReader and OutputStreamWriter are the bridge classes that connect byte streams to character streams, applying charset encoding on write and decoding on read. BufferedReader adds line-at-a-time reading via readLine() and multi-character buffering. PrintWriter adds print/println/printf formatting output. StringReader and StringWriter enable in-memory character stream operations on String data. This entry covers the complete Reader and Writer APIs, charset handling and the consequences of using the wrong charset, the complete class hierarchy of character streams with the use case for each, BufferedReader.readLine() semantics and the lines() stream, the bridge classes in depth, character encoding best practices, and the interaction between character streams and Java's String.lines() and Files.readString()/writeString() alternatives.

Reader and Writer Hierarchy, and the Bridge Classes

Reader is the abstract base class for character input streams. Its core method is read(), which returns a single char as an int in the range [0, 65535] (one UTF-16 code unit, not necessarily a whole Unicode code point), or -1 at end of stream. Characters outside the Basic Multilingual Plane therefore arrive as two consecutive values, a surrogate pair. read(char[] cbuf, int off, int len) reads up to len chars into cbuf starting at off, returning the count read or -1. read(char[] cbuf) reads up to cbuf.length chars. Like InputStream.read(byte[]), these methods may return partial fills, so the returned count must be used, not the array length. Reader also defines skip(long n), ready() (which returns true only if the next read() is guaranteed not to block; unlike InputStream.available(), it does not report a count), markSupported(), mark(int readAheadLimit), and reset().

Writer is the abstract base class for character output streams. Its core method is write(int c), which writes a single char (the low 16 bits of the int). write(char[] cbuf, int off, int len) writes len chars from cbuf starting at off. write(String str, int off, int len) writes len chars of str starting at off; OutputStream has no corresponding method. append(CharSequence csq) and append(char c) return the Writer itself, which allows chained calls. flush() and close() have the same semantics as in OutputStream.

InputStreamReader extends Reader and wraps an InputStream, decoding as it reads: bytes from the InputStream are decoded according to the specified Charset into the chars returned by read(). If no Charset is specified, it uses Charset.defaultCharset(), which is UTF-8 on Java 18 and later but was platform-dependent before that, so always specify the charset explicitly. OutputStreamWriter extends Writer and wraps an OutputStream, encoding as it writes: chars written to the OutputStreamWriter are encoded according to the specified Charset into bytes sent to the underlying OutputStream. These two bridge classes are the standard connection points between the byte stream world and the character stream world. Text carried over a network socket must be decoded and encoded with an explicit charset, typically through InputStreamReader and OutputStreamWriter. Text files should be read and written through these classes, or through FileReader and FileWriter with an explicit charset. FileReader and FileWriter are convenience subclasses of the bridge classes that have existed since JDK 1.1, but their Charset constructors were added only in Java 11; before that they always used the platform default charset, which made them treacherous. The combination of FileInputStream + InputStreamReader + BufferedReader is the explicit way to read text files that works on all Java versions and all platforms.

Java

// ── InputStreamReader — byte stream to char stream bridge ───────────
// Always specify charset explicitly — never rely on platform default:
try (Reader reader = new InputStreamReader(
        new FileInputStream("document.txt"), StandardCharsets.UTF_8)) {
    char[] buffer = new char[4096];
    int charsRead;
    while ((charsRead = reader.read(buffer)) != -1) {
        process(buffer, 0, charsRead);   // process only charsRead chars, not buffer.length
    }
}

// WRONG: relies on platform default charset — breaks across environments
try (Reader broken = new InputStreamReader(new FileInputStream("document.txt"))) {
    // Platform default charset may be UTF-8 on Linux, Cp1252 on Windows — inconsistent
}

// ── OutputStreamWriter — char stream to byte stream bridge ────────────
try (Writer writer = new OutputStreamWriter(
        new FileOutputStream("output.txt"), StandardCharsets.UTF_8)) {
    writer.write("Hello, 世界!");   // writes UTF-8 bytes to the underlying stream
    writer.write('\n');
    writer.write(new char[]{'A', 'B', 'C'}, 0, 3);
    writer.write("substring", 3, 6);   // off=3, len=6: writes "string" (indices 3 through 8)
}

// ── Full stack: FileInputStream → InputStreamReader → BufferedReader ──
// This is the most explicit and portable way to read text files:
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream("data.csv"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = br.readLine()) != null) {
        String[] fields = line.split(",");
        processCsvRow(fields);
    }
}

// Java 11+: FileReader with charset (simpler, same result):
try (BufferedReader br = new BufferedReader(
        new FileReader("data.csv", StandardCharsets.UTF_8))) {
    // Equivalent but cleaner
}

// ── Reader hierarchy — all concrete Reader classes ────────────────────
// StringReader: reads chars from a String
try (Reader sr = new StringReader("Hello, Reader!")) {
    char[] buf = new char[5];
    int n = sr.read(buf);    // reads "Hello"
    System.out.println(new String(buf, 0, n));  // Hello
}

// CharArrayReader: reads chars from a char[]
char[] chars = "CharArray".toCharArray();
try (Reader cr = new CharArrayReader(chars)) {
    System.out.println((char) cr.read());  // C
}

// PipedReader/PipedWriter: inter-thread character piping (rarely used directly)
PipedWriter pw = new PipedWriter();
PipedReader pr = new PipedReader(pw, 4096);
new Thread(() -> {
    try { pw.write("Inter-thread text"); pw.close(); }
    catch (IOException e) { e.printStackTrace(); }
}).start();

try (BufferedReader br = new BufferedReader(pr)) {
    System.out.println(br.readLine());  // Inter-thread text
}
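
Because read() hands back UTF-16 code units rather than code points, a character outside the Basic Multilingual Plane arrives as two separate values. The following is a minimal sketch of that behavior; the class name CodeUnitDemo and the helper readAllUnits are hypothetical names chosen for illustration, and StringReader is used so that nothing touches the file system.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class CodeUnitDemo {

    // Collects every value returned by read() until end of stream (-1).
    static List<Integer> readAllUnits(String text) throws IOException {
        List<Integer> units = new ArrayList<>();
        try (Reader reader = new StringReader(text)) {
            int unit;
            while ((unit = reader.read()) != -1) {
                units.add(unit);
            }
        }
        return units;
    }

    public static void main(String[] args) throws IOException {
        // U+1F600 lies outside the BMP, so Java stores it as a surrogate pair.
        String text = "A\uD83D\uDE00";
        for (int unit : readAllUnits(text)) {
            System.out.printf("0x%04X%n", unit);   // 0x0041, 0xD83D, 0xDE00
        }
        // Two code points, but three chars were read above.
        System.out.println("code points: " + text.codePointCount(0, text.length()));
    }
}
```

Code that processes text one char at a time should keep this in mind: splitting or truncating between the two halves of a surrogate pair produces an invalid string.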

BufferedReader, BufferedWriter, PrintWriter — High-Level Text I/O

BufferedReader is the workhorse of text input in Java. It wraps any Reader and adds two critical capabilities: an internal char buffer (8192 chars by default) that amortizes the cost of reads on the underlying stream, and readLine(), which reads a complete line of text and strips the line terminator. readLine() recognizes all three line terminator conventions: carriage return (\r), line feed (\n), and carriage return followed by line feed (\r\n). It returns null at end of stream, because a String-returning method cannot use -1 as a sentinel. The lines() method (Java 8+) returns a lazily populated Stream<String> of the remaining lines, enabling functional processing with filter, map, and collect.

The null sentinel is the most common source of BufferedReader bugs. Code that compares the return value of readLine() to "" to detect end of stream is wrong: an empty line returns "" and end of stream returns null, so such code stops at the first blank line or throws NullPointerException at the end. The correct idiom is while ((line = br.readLine()) != null).

BufferedWriter wraps any Writer and adds buffering plus one extra method, newLine(). newLine() writes the platform line separator (System.lineSeparator()), which is \r\n on Windows and \n on Unix and macOS. This suits files that will be read on the same platform and should follow its line ending convention. For cross-platform files (configuration files shared between systems, files committed to version control), write \n explicitly rather than calling newLine(), to avoid accidentally emitting Windows-style line endings from a Windows build server.

PrintWriter adds the print/println/printf/format API to any Writer, making it the most convenient class for formatted text output to files. It can be constructed from a Writer or OutputStream (with an optional autoFlush flag) or from a File or file name (with an optional charset, accepted as a Charset since Java 10; the file-based constructors take no autoFlush flag). When autoFlush is false (the default), output is not flushed until flush() or close() is called. When autoFlush is true, println(), printf(), and format() flush automatically. Like PrintStream, PrintWriter never throws IOException from its print and write methods; it records the failure internally, and checkError() must be called to detect it.

StringWriter is a Writer backed by an internal StringBuffer. Everything written to it accumulates in the buffer, which is retrieved as a String via toString(). It is used when a method accepts a Writer parameter and you want to capture its output as a String: for testing, for building formatted strings, or for serializing object state to a string.
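
The line terminator rules are easiest to see on an in-memory string that mixes all three conventions. The sketch below assumes a hypothetical class ReadLineDemo with a helper readLines; it uses only BufferedReader and StringReader from java.io.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ReadLineDemo {

    // Reads every line with the standard null-terminated loop.
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {   // null, not "", marks end of stream
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // \r\n, \r and \n each end one line; the consecutive \n\n yields an empty line.
        System.out.println(readLines("a\r\nb\rc\n\nd"));   // [a, b, c, , d]
        // A trailing terminator does not produce an extra empty line.
        System.out.println(readLines("x\n"));              // [x]
    }
}
```

The empty string in the first result is a real line from the input, which is why comparing against "" cannot be used to detect end of stream.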

Java

// ── BufferedReader.readLine() — correct and incorrect usage ──────────
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream("lines.txt"), StandardCharsets.UTF_8))) {
    String line;

    // CORRECT: check for null to detect end-of-stream
    while ((line = br.readLine()) != null) {
        if (!line.isBlank()) processLine(line);
    }

    // WRONG: comparing to "" — empty lines return "", end-of-stream returns null
    // while (!(line = br.readLine()).equals("")) { }  // NullPointerException at end!
}

// ── BufferedReader.lines() — functional stream API (Java 8+) ──────────
try (BufferedReader br = Files.newBufferedReader(
        Path.of("data.txt"), StandardCharsets.UTF_8)) {
    long wordCount = br.lines()
        .filter(line -> !line.isBlank())
        .flatMap(line -> Arrays.stream(line.split("\\s+")))
        .filter(word -> !word.isEmpty())
        .count();
    System.out.println("Word count: " + wordCount);
}

// Lines stream is lazy — reads lines on demand, not all at once
try (Stream<String> lines = Files.lines(Path.of("large.txt"), StandardCharsets.UTF_8)) {
    lines.filter(l -> l.contains("ERROR"))
         .limit(100)
         .forEach(System.out::println);
    // Only reads until 100 ERROR lines found — doesn't read entire file
}

// ── BufferedWriter.newLine() vs \n ────────────────────────────────────
// For platform-specific line endings (e.g., Windows batch files):
try (BufferedWriter bw = new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream("windows.bat"), StandardCharsets.UTF_8))) {
    bw.write("@echo off");
    bw.newLine();        // \r\n on Windows, \n on Unix: matches the platform
    bw.write("echo Hello");
    bw.newLine();
}

// For cross-platform files (config files, source code, version-controlled files):
try (BufferedWriter bw = new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream("config.txt"), StandardCharsets.UTF_8))) {
    bw.write("key=value");
    bw.write("\n");     // always \n, never \r\n, in cross-platform files
    bw.write("other=data");
    bw.write("\n");
}

// ── PrintWriter — convenient formatted text output to files ───────────
// From Writer (explicit charset and buffering control — recommended):
try (PrintWriter pw = new PrintWriter(
        new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream("report.txt"), StandardCharsets.UTF_8)))) {
    pw.println("Report: " + LocalDate.now());
    pw.printf("Total items: %,d%n", 1_234_567);
    pw.printf("Average: %.2f%n", 98.76);
    pw.println("Done");
    if (pw.checkError()) throw new IOException("PrintWriter write failed");
}

// From File (Java 10+: charset as second parameter):
try (PrintWriter pw = new PrintWriter(new File("output.txt"), StandardCharsets.UTF_8)) {
    pw.println("Simple output");
}

// autoFlush=true: println/printf/format flush automatically:
try (PrintWriter autoFlushed = new PrintWriter(
        new FileWriter("live.txt", StandardCharsets.UTF_8), true)) {   // explicit charset (Java 11+)
    autoFlushed.println("Line 1");  // flushed immediately
    autoFlushed.println("Line 2");  // flushed immediately
    // Useful for log files that must be visible while program is running
}

// ── StringWriter — capture Writer output as String ────────────────────
StringWriter sw = new StringWriter();
try (PrintWriter pw = new PrintWriter(sw)) {
    pw.printf("Name: %s%n", "Alice");
    pw.printf("Score: %d%n", 95);
}   // pw closed; sw still valid
String report = sw.toString();
System.out.println(report);
// Name: Alice
// Score: 95

// Testing utility: capture method output that writes to a Writer:
StringWriter capturedOutput = new StringWriter();
generateReport(new PrintWriter(capturedOutput));   // method under test
assertThat(capturedOutput.toString()).contains("Expected Section");
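
Because PrintWriter swallows IOException, a failed write is invisible unless checkError() is consulted. The sketch below illustrates this with a hypothetical FailingWriter that throws on every write, standing in for a full disk or a closed socket; the class and method names are illustrative only.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

public class CheckErrorDemo {

    // Hypothetical Writer whose every write fails.
    static class FailingWriter extends Writer {
        @Override
        public void write(char[] cbuf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Prints one line and reports whether PrintWriter recorded an error.
    static boolean printAndCheck(Writer target) {
        PrintWriter pw = new PrintWriter(target);
        pw.println("this line may be lost");   // never throws, even if the write fails
        return pw.checkError();                 // flushes, then returns the error flag
    }

    public static void main(String[] args) {
        System.out.println(printAndCheck(new FailingWriter()));   // true
        System.out.println(printAndCheck(new StringWriter()));    // false
    }
}
```

If write failures must not go unnoticed, either call checkError() at meaningful points or write through a BufferedWriter directly, whose methods do throw IOException.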

Charset Handling, Encoding Best Practices, and Modern Alternatives

Character encoding is the most common source of silent data corruption in Java I/O. Java strings are sequences of UTF-16 code units; files and network streams are sequences of bytes; the mapping between them is defined by a Charset. Using the wrong charset, or relying on the platform default, causes characters outside the charset's repertoire to be replaced with question marks or other substitution characters, and causes some byte sequences to be decoded as different characters, all without any exception being thrown. For example, the two UTF-8 bytes of é (0xC3 0xA9) decoded as ISO-8859-1 become the two characters Ã©.

The StandardCharsets class provides constants for the six charsets guaranteed to be available on every Java platform: US_ASCII, ISO_8859_1, UTF_8, UTF_16, UTF_16BE, and UTF_16LE. For virtually all text I/O, UTF_8 is the correct choice: it can represent every Unicode character, it is the dominant encoding on the internet, it is backward compatible with ASCII, and it is the default on most modern operating systems. ISO_8859_1 (Latin-1) is sometimes needed for HTTP headers (which are historically Latin-1 encoded), legacy files, or binary data masquerading as text. US_ASCII is appropriate only when you know the data is pure ASCII and want non-ASCII input to surface immediately (as a replacement character, or as an exception when a reporting decoder is used).

Charset.defaultCharset() is UTF-8 on Java 18 and later, where JEP 400 made it the standard default (unless overridden with the file.encoding system property). On earlier JVMs it was platform-dependent: typically UTF-8 on Linux and macOS, and a legacy code page such as Cp1252 on Windows. Code that implicitly relies on the default charset (by using FileReader, FileWriter, new String(bytes), or String.getBytes() without a charset) can produce different results on different platforms. This is the definition of a latent cross-platform bug. Always pass an explicit charset.

Java NIO.2 provides higher-level alternatives to character stream boilerplate for common file operations: Files.readString(Path, Charset) reads an entire text file as a String (Java 11+); Files.writeString(Path, CharSequence, Charset, OpenOption...) writes a String to a file (Java 11+); Files.readAllLines(Path, Charset) reads all lines as a List<String>; Files.write(Path, Iterable<? extends CharSequence>, Charset, OpenOption...) writes a collection of lines. These methods are cleaner than constructing stream chains for simple cases, but they read or write the entire file at once, which makes them unsuitable for very large files. For large files or streaming processing, use a buffered reader instead: either the explicit chain (BufferedReader wrapping InputStreamReader wrapping FileInputStream) or the equivalent Files.newBufferedReader(Path, Charset) and Files.lines(Path, Charset).
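
The silent corruption described above can be reproduced in a few lines without any files. This is a minimal sketch; the class WrongCharsetDemo and its two helper methods are hypothetical names, and the text uses a Unicode escape so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class WrongCharsetDemo {

    // Encodes text as UTF-8, then decodes the bytes with the wrong charset (ISO-8859-1).
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Encodes text as US-ASCII and decodes it again; unmappable chars become '?'.
    static String asciiRoundTrip(String text) {
        byte[] asciiBytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(asciiBytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String original = "caf\u00E9";   // "café", 4 chars

        // The 2-byte UTF-8 sequence for é is read as two Latin-1 characters.
        System.out.println(misdecode(original).length());   // 5

        // é has no ASCII encoding, so it is silently replaced.
        System.out.println(asciiRoundTrip(original));       // caf?
    }
}
```

Neither call throws, which is exactly why this class of bug tends to be discovered only when corrupted text reaches a user.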

Java

// ── Always specify charset — never rely on default ───────────────────
// WRONG: platform-dependent charset
String fromBytes = new String(bytes);               // uses Charset.defaultCharset()
byte[] toBytes   = string.getBytes();               // uses Charset.defaultCharset()
FileReader fr    = new FileReader("file.txt");       // uses Charset.defaultCharset()
FileWriter fw    = new FileWriter("file.txt");       // uses Charset.defaultCharset()

// CORRECT: always explicit
String fromBytesUTF8 = new String(bytes, StandardCharsets.UTF_8);
byte[] toBytesUTF8   = string.getBytes(StandardCharsets.UTF_8);
FileReader frExplicit = new FileReader("file.txt", StandardCharsets.UTF_8);   // Java 11+
FileWriter fwExplicit = new FileWriter("file.txt", StandardCharsets.UTF_8);   // Java 11+

// ── Charset detection — when encoding is unknown ──────────────────────
// If charset is truly unknown: read raw bytes and detect via BOM or external library
try (InputStream is = new FileInputStream("unknown.txt")) {
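    byte[] bom = is.readNBytes(3);   // Java 11+; returns fewer than 3 bytes for a very short file
    Charset charset;
    int skip = 0;
    // Length checks guard against ArrayIndexOutOfBoundsException on files shorter than the BOM
    if (bom.length >= 3 && bom[0] == (byte)0xEF && bom[1] == (byte)0xBB && bom[2] == (byte)0xBF) {
        charset = StandardCharsets.UTF_8; skip = 3;   // UTF-8 BOM
    } else if (bom.length >= 2 && bom[0] == (byte)0xFF && bom[1] == (byte)0xFE) {
        charset = StandardCharsets.UTF_16LE; skip = 2; // UTF-16 LE BOM
    } else if (bom.length >= 2 && bom[0] == (byte)0xFE && bom[1] == (byte)0xFF) {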
        charset = StandardCharsets.UTF_16BE; skip = 2; // UTF-16 BE BOM
    } else {
        charset = StandardCharsets.UTF_8; skip = 0;   // assume UTF-8 (most common)
    }
    // Re-read without BOM bytes consumed:
    InputStream adjusted = new SequenceInputStream(
        new ByteArrayInputStream(bom, skip, bom.length - skip), is);
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(adjusted, charset))) {
        br.lines().forEach(System.out::println);
    }
}

// ── CodingErrorAction — control behavior on invalid byte sequences ────
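Charset utf8 = StandardCharsets.UTF_8;

// Lenient: REPLACE substitutes U+FFFD (the replacement char) for each invalid sequence.
// This is what new InputStreamReader(in, charset) and new String(bytes, charset) do.
CharsetDecoder lenientDecoder = utf8.newDecoder()
    .onMalformedInput(CodingErrorAction.REPLACE)
    .onUnmappableCharacter(CodingErrorAction.REPLACE);

// Strict: REPORT (throws exception on invalid sequence).
// A decoder from newDecoder() already defaults to REPORT; setting it explicitly documents intent.
CharsetDecoder strictDecoder = utf8.newDecoder()
    .onMalformedInput(CodingErrorAction.REPORT)
    .onUnmappableCharacter(CodingErrorAction.REPORT);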

// Using strict decoder with InputStreamReader:
try (Reader reader = new InputStreamReader(
        new FileInputStream("strict.txt"), strictDecoder)) {
    // An invalid UTF-8 sequence raises MalformedInputException (a CharacterCodingException);
    // lines() rethrows it wrapped in UncheckedIOException
    String content = new BufferedReader(reader).lines()
        .collect(Collectors.joining("\n"));
}

// ── NIO.2 alternatives for simple file operations ────────────────────
Path path = Path.of("data.txt");

// Read entire file as String (suitable for small-to-medium files):
String content = Files.readString(path, StandardCharsets.UTF_8);

// Write String to file:
Files.writeString(path, "Hello, World!\nSecond line\n",
    StandardCharsets.UTF_8,
    StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);

// Read all lines as List<String> (loads entire file into memory):
List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);

// Write lines (adds system line separator after each):
Files.write(path, List.of("Line 1", "Line 2", "Line 3"),
    StandardCharsets.UTF_8);

// Stream lines lazily (for large files):
try (Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8)) {
    stream.filter(l -> l.startsWith("ERROR"))
          .forEach(System.err::println);
}

// ── Newline normalization on read ─────────────────────────────────────
// BufferedReader.readLine() strips ALL line terminators (\r\n, \n, \r)
// If you need to preserve original line endings, read with char[] not readLine():
try (Reader reader = new BufferedReader(new FileReader("mixed.txt", StandardCharsets.UTF_8))) {
    StringBuilder sb = new StringBuilder();
    char[] buf = new char[4096];
    int n;
    while ((n = reader.read(buf)) != -1) {
        sb.append(buf, 0, n);  // preserves all original \r and \n characters
    }
    String withOriginalLineEndings = sb.toString();
}
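
The difference between replacing and reporting on invalid input can also be checked directly on a byte array. The sketch below assumes a hypothetical class StrictDecodeDemo; it compares new String(bytes, charset), which replaces malformed input, with a CharsetDecoder configured to report it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {

    // Lenient decoding: malformed sequences become U+FFFD, no exception.
    static String lenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict decoding: malformed sequences raise CharacterCodingException.
    static String strict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(bytes))
            .toString();
    }

    // Convenience check built on the strict decoder.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            strict(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bad = {'a', (byte) 0xFF, 'b'};   // 0xFF never occurs in valid UTF-8
        System.out.println(lenient(bad).equals("a\uFFFDb"));   // true
        System.out.println(isValidUtf8(bad));                   // false
    }
}
```

Strict decoding is a reasonable default for data that will be stored or forwarded, since it surfaces corruption at the boundary instead of persisting replacement characters.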
