Contents
- HashSet — O(1) Operations
- LinkedHashSet — Insertion Order
- TreeSet — Sorted Order
- Set Algebra — Union, Intersection, Difference
- Choosing the Right Set
HashSet is backed by a HashMap and delivers O(1) average-case performance for add, remove, and contains. Duplicate elements are silently ignored — add() returns false if the element was already present. Iteration order is undefined.
import java.util.*;
// HashSet — backed by HashMap, O(1) average for add/remove/contains
// No guaranteed iteration order
Set<String> set = new HashSet<>();
set.add("apple");
set.add("banana");
set.add("apple"); // duplicate — silently ignored
System.out.println(set.size()); // 2
// contains — O(1)
System.out.println(set.contains("apple")); // true
System.out.println(set.contains("grape")); // false
// remove — O(1)
set.remove("banana");
System.out.println(set); // [apple]
// add returns boolean — true if element was new
boolean added = set.add("cherry"); // true
boolean dup = set.add("cherry"); // false — already present
// Initializing from existing collection (deduplication trick)
List<String> withDups = List.of("a", "b", "a", "c", "b");
Set<String> unique = new HashSet<>(withDups);
System.out.println(unique); // [a, b, c] (order may vary)
System.out.println(unique.size()); // 3
// Iteration — order is unpredictable
for (String s : set) {
System.out.println(s);
}
// HashSet requires proper equals() and hashCode() on elements
// Two objects equal by equals() must have the same hashCode()
For HashSet to work correctly with custom objects, those objects must implement both equals() and hashCode() consistently. Records and many library classes do this automatically.
LinkedHashSet extends HashSet with a doubly-linked list maintaining insertion order. All operations remain O(1), with a small overhead over plain HashSet. It is the idiomatic way to deduplicate a collection while preserving the first-occurrence order:
// LinkedHashSet — preserves insertion order; O(1) operations (slightly slower than HashSet)
Set<String> lhs = new LinkedHashSet<>();
lhs.add("banana");
lhs.add("apple");
lhs.add("cherry");
lhs.add("apple"); // duplicate ignored
System.out.println(lhs); // [banana, apple, cherry] — insertion order preserved
// All Set operations work the same as HashSet
lhs.contains("apple"); // true — O(1)
lhs.remove("apple");
System.out.println(lhs); // [banana, cherry]
// Use case: deduplication while preserving order
List<String> input = List.of("c", "a", "b", "a", "c", "d");
Set<String> ordered = new LinkedHashSet<>(input);
System.out.println(ordered); // [c, a, b, d] — first occurrence order
// Convert back to list — deduplicated, order preserved
List<String> deduped = new ArrayList<>(ordered);
// Useful for LRU-style caches when combined with removeEldestEntry
// LinkedHashSet also offers a constructor for load factor tuning:
Set<String> tuned = new LinkedHashSet<>(16, 0.75f); // initial capacity, load factor
TreeSet implements NavigableSet on top of a red-black tree, keeping elements in sorted order at O(log n) cost. It provides rich range-query operations — floor, ceiling, headSet, tailSet, subSet — and iterates in ascending order by default:
// TreeSet — backed by TreeMap (Red-Black tree), O(log n) for all operations
// Elements must be Comparable OR a Comparator must be provided
Set<String> ts = new TreeSet<>();
ts.add("banana");
ts.add("apple");
ts.add("cherry");
ts.add("avocado");
System.out.println(ts); // [apple, avocado, banana, cherry] — alphabetical
// NavigableSet methods (TreeSet implements NavigableSet)
TreeSet<Integer> nums = new TreeSet<>(List.of(10, 20, 30, 40, 50));
System.out.println(nums.first()); // 10 — smallest
System.out.println(nums.last()); // 50 — largest
System.out.println(nums.floor(25)); // 20 — greatest element <= 25
System.out.println(nums.ceiling(25)); // 30 — smallest element >= 25
System.out.println(nums.lower(30)); // 20 — strictly less than 30
System.out.println(nums.higher(30)); // 40 — strictly greater than 30
// Subsets — all returned views are live (backed by original)
System.out.println(nums.headSet(30)); // [10, 20] — less than 30
System.out.println(nums.tailSet(30)); // [30, 40, 50] — >= 30
System.out.println(nums.subSet(20, 40)); // [20, 30] — [20, 40)
System.out.println(nums.subSet(20, true, 40, true)); // [20, 30, 40] — inclusive
// Reverse order
NavigableSet<Integer> desc = nums.descendingSet();
System.out.println(desc); // [50, 40, 30, 20, 10]
// Custom comparator — sort strings by length, then alphabetically
TreeSet<String> byLength = new TreeSet<>(
Comparator.comparingInt(String::length).thenComparing(Comparator.naturalOrder()));
byLength.addAll(List.of("cat", "elephant", "ox", "dog", "ant"));
System.out.println(byLength); // [ox, ant, cat, dog, elephant]
The Set interface's bulk operations map directly to standard set theory: addAll (union), retainAll (intersection), removeAll (difference). Always operate on a copy to avoid mutating the original sets:
Set<Integer> a = new HashSet<>(Set.of(1, 2, 3, 4, 5));
Set<Integer> b = new HashSet<>(Set.of(3, 4, 5, 6, 7));
// Union — all elements from both sets
Set<Integer> union = new HashSet<>(a);
union.addAll(b);
System.out.println(union); // [1, 2, 3, 4, 5, 6, 7]
// Intersection — elements in both sets
Set<Integer> intersection = new HashSet<>(a);
intersection.retainAll(b);
System.out.println(intersection); // [3, 4, 5]
// Difference (a minus b) — elements in a but not in b
Set<Integer> difference = new HashSet<>(a);
difference.removeAll(b);
System.out.println(difference); // [1, 2]
// Symmetric difference — elements in exactly one set
Set<Integer> symDiff = new HashSet<>(a);
symDiff.addAll(b);
Set<Integer> temp = new HashSet<>(a);
temp.retainAll(b);
symDiff.removeAll(temp);
System.out.println(symDiff); // [1, 2, 6, 7]
// Non-mutating versions (don't touch originals)
Set<Integer> unionClean = Stream.concat(a.stream(), b.stream())
.collect(Collectors.toSet());
// Subset check — is a subset of b?
boolean isSubset = b.containsAll(Set.of(3, 4)); // true — {3,4} ⊆ b
System.out.println(isSubset); // true
// Disjoint check — no elements in common
boolean disjoint = Collections.disjoint(a, Set.of(8, 9));
System.out.println(disjoint); // true
The bulk operations addAll, retainAll, and removeAll modify the set in place. Always copy the set first if you need to preserve the original: new HashSet<>(original).
Default to HashSet for membership testing, LinkedHashSet when iteration order matters, TreeSet for sorted access or range queries, and EnumSet whenever all elements belong to the same enum type. Set.of() is ideal for immutable constants:
// HashSet — O(1) ops, no order — most common default choice
// LinkedHashSet — O(1) ops, insertion order — when you need predictable iteration
// TreeSet — O(log n) ops, sorted order — when you need sorted access/ranges
// Immutable Set — Java 9+
Set<String> immutable = Set.of("a", "b", "c");
// immutable.add("d"); // UnsupportedOperationException
// Set.of does NOT allow null elements; order is unspecified
// Copyable immutable from mutable:
Set<String> copy = Set.copyOf(mutableSet);
// EnumSet — for enum types, uses bit vector internally — very fast
enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }
EnumSet<Day> weekdays = EnumSet.range(Day.MON, Day.FRI);
EnumSet<Day> weekend = EnumSet.of(Day.SAT, Day.SUN);
EnumSet<Day> all = EnumSet.allOf(Day.class);
System.out.println(weekdays); // [MON, TUE, WED, THU, FRI]
// Summary table:
// | Implementation | Null | Order | add/contains | When to use |
// |-----------------|------|-------------|--------------|----------------------|
// | HashSet | Yes | None | O(1) | General purpose |
// | LinkedHashSet | Yes | Insertion | O(1) | Preserve order |
// | TreeSet | No | Sorted | O(log n) | Range queries, sort |
// | EnumSet | No | Enum order | O(1) | Enum flags |
// | Set.of(...) | No | None | O(1) | Immutable constants |
When you only need to check membership (no iteration needed), HashSet is the right choice. If the elements are an enum type, always prefer EnumSet — it is faster than HashSet and uses less memory.