LinkedHashSet
LinkedHashSet is a Set implementation that combines the O(1) performance of a hash table with the predictable iteration order of a doubly-linked list. It maintains a linked list running through its entries in the order elements were inserted, so iteration always produces elements in insertion order. LinkedHashSet occupies the middle ground between HashSet (no order guarantee) and TreeSet (sorted order, O(log n) operations) — it provides the ordering guarantee that HashSet lacks at nearly the same performance cost. This entry covers the dual data structure implementation, insertion-order semantics, the memory overhead of the linked list, when LinkedHashSet is the right choice over HashSet and TreeSet, and practical patterns including deduplication with preserved order.
Dual Structure — Hash Table Plus Linked List
// ── Insertion order preserved ─────────────────────────────────────────
Set<String> linked = new LinkedHashSet<>();
linked.add("banana");
linked.add("apple");
linked.add("cherry");
linked.add("date");
// Iteration produces elements in INSERTION order:
for (String s : linked) System.out.print(s + " ");
// banana apple cherry date — always, regardless of hash codes
// Compare with HashSet:
Set<String> hashed = new HashSet<>(linked);
for (String s : hashed) System.out.print(s + " ");
// might print: apple banana date cherry — order unpredictable
// ── Re-insertion does not change order ────────────────────────────────
Set<String> set = new LinkedHashSet<>();
set.add("A");
set.add("B");
set.add("C");
set.add("A"); // already present — ignored, position in list unchanged
for (String s : set) System.out.print(s + " ");
// A B C — A stays at its original insertion position
// ── Dual structure explained ──────────────────────────────────────────
//
// LinkedHashSet with {"banana", "apple", "cherry"}:
//
// Hash table (for O(1) lookup):
// Bucket 2: [banana]
// Bucket 5: [apple]
// Bucket 7: [cherry]
//
// Doubly-linked list (for ordered iteration):
// head ↔ banana ↔ apple ↔ cherry ↔ tail
//
// contains("apple"): hash("apple") → bucket 5 → found → O(1)
// iteration: follow linked list: banana → apple → cherry
// ── Memory overhead per entry ─────────────────────────────────────────
// HashSet entry: { hash, key, value(dummy), next-in-bucket }
// LinkedHashSet entry: { hash, key, value(dummy), next-in-bucket,
// before(linked list prev),
// after(linked list next) }
// Extra: 2 references × 8 bytes = 16 bytes per entry
// For large sets of small objects, this is significant
// For large objects (e.g., complex DTOs), the overhead is negligibleWhen to Choose LinkedHashSet
// ── Deduplication preserving insertion order ─────────────────────────
List<String> events = List.of(
"login", "view", "login", "purchase", "view", "logout", "login");
// HashSet dedup — order lost:
Set<String> unordered = new HashSet<>(events);
// might print: purchase logout login view
// LinkedHashSet dedup — insertion order preserved:
Set<String> ordered = new LinkedHashSet<>(events);
for (String e : ordered) System.out.print(e + " ");
// login view purchase logout — first occurrence order
// ── Preserving CSV column order while deduplicating ───────────────────
String[] columns = {"name", "email", "phone", "email", "name", "city"};
Set<String> uniqueColumns = new LinkedHashSet<>(Arrays.asList(columns));
// uniqueColumns iteration: name, email, phone, city — preserves first-seen order
// ── OrderedHistory: recent items without duplicates ───────────────────
public class RecentItems<T> {
private final Set<T> items = new LinkedHashSet<>();
private final int maxSize;
public RecentItems(int maxSize) { this.maxSize = maxSize; }
public void add(T item) {
items.remove(item); // remove if present (to re-insert at end)
items.add(item); // add at tail of linked list
if (items.size() > maxSize) {
// Remove the oldest item (head of linked list)
items.remove(items.iterator().next());
}
}
public List<T> getRecentItems() {
List<T> result = new ArrayList<>(items);
Collections.reverse(result); // most recent first
return result;
}
}
// Usage:
RecentItems<String> history = new RecentItems<>(5);
history.add("page1");
history.add("page2");
history.add("page3");
history.add("page1"); // moves page1 to end (most recent)
// getRecentItems(): [page1, page3, page2] — most recent first
// ── Performance comparison ────────────────────────────────────────────
// Operation HashSet LinkedHashSet TreeSet
// ─────────────────────────────────────────────────────
// add O(1)* O(1)* O(log n)
// remove O(1)* O(1)* O(log n)
// contains O(1)* O(1)* O(log n)
// iteration O(capacity) O(size) O(size)
// memory/entry baseline +16 bytes +node overhead
//
// * amortised average case
// LinkedHashSet iteration is FASTER than HashSet because:
// - HashSet scans the entire bucket array (capacity slots, mostly empty)
// - LinkedHashSet follows a pointer chain through only live entriesPractical Patterns with LinkedHashSet
// ── Deduplication pipeline step ──────────────────────────────────────
public List<String> deduplicatePreservingOrder(List<String> input) {
return new ArrayList<>(new LinkedHashSet<>(input));
// One line: deduplicate + preserve first-occurrence order + return as List
}
List<String> tags = List.of("java", "spring", "java", "rest", "spring", "api");
List<String> uniqueTags = deduplicatePreservingOrder(tags);
// [java, spring, rest, api] — first occurrence order
// ── Stream-based deduplication with order ─────────────────────────────
// Stream.distinct() also preserves encounter order:
List<String> distinct = tags.stream()
.distinct()
.collect(Collectors.toList());
// [java, spring, rest, api] — same result, but LinkedHashSet is faster for large input
// ── Predictable serialisation ──────────────────────────────────────────
// Configuration keys in defined order:
Set<String> configKeys = new LinkedHashSet<>();
configKeys.add("database.url");
configKeys.add("database.username");
configKeys.add("database.password");
configKeys.add("server.port");
// Serialising always produces the same order:
configKeys.forEach(key ->
System.out.println(key + " = " + config.get(key)));
// database.url = jdbc:...
// database.username = admin
// database.password = ***
// server.port = 8080
// Always in this exact order — predictable for diffs and logs
// ── Testing: predictable assertion order ─────────────────────────────
// HashSet assertion can be flaky:
// assertThat(processResult()).containsExactly("A", "B", "C");
// fails intermittently because HashSet iteration order varies
// LinkedHashSet assertion is stable:
Set<String> result = new LinkedHashSet<>();
result.add("A");
result.add("B");
result.add("C");
// This assertion is always correct:
assertThat(result).containsExactly("A", "B", "C");
// ── Intersection preserving left-operand order ─────────────────────────
public <T> Set<T> orderedIntersection(Set<T> ordered, Set<T> filter) {
Set<T> result = new LinkedHashSet<>(ordered);
result.retainAll(filter); // keeps elements in 'ordered' that are in 'filter'
return result; // preserves 'ordered's insertion order
}
Set<String> preferred = new LinkedHashSet<>(
List.of("cherry", "banana", "apple", "date"));
Set<String> available = Set.of("apple", "cherry", "elderberry");
Set<String> inStock = orderedIntersection(preferred, available);
// [cherry, apple] — preferred order preserved, banana/date excluded