Contents
- Collectors.groupingBy() Basics
- groupingBy with Downstream Collector
- Multi-level groupingBy
- partitioningBy — Split into True / False
- partitioningBy with Downstream Collector
- toUnmodifiableMap Alternative
- Real-world Example — Group Employees by Department
Collectors.groupingBy(classifier) groups the elements of a stream into a Map<K, List<T>>, where each key is the result of applying the classifier function to an element. The current OpenJDK implementation returns a HashMap with ArrayList values, but the specification makes no guarantee about either type; to choose them explicitly, use the three-argument overload:
import java.util.*;
import java.util.stream.*;
record Employee(String name, String department, String city, int salary, boolean active) {}
List<Employee> employees = List.of(
new Employee("Alice", "Engineering", "London", 90_000, true),
new Employee("Bob", "Marketing", "New York", 65_000, true),
new Employee("Carol", "Engineering", "London", 80_000, false),
new Employee("Diana", "HR", "New York", 55_000, true),
new Employee("Ethan", "Marketing", "London", 70_000, true),
new Employee("Fiona", "Engineering", "Berlin", 95_000, true),
new Employee("George", "HR", "Berlin", 50_000, false)
);
// groupingBy(classifier) → Map<K, List<T>>
Map<String, List<Employee>> byDept = employees.stream()
.collect(Collectors.groupingBy(Employee::department));
// Keys: "Engineering", "Marketing", "HR" (iteration order of the default map is not guaranteed)
byDept.forEach((dept, emps) -> {
System.out.println(dept + ": " +
emps.stream().map(Employee::name).collect(Collectors.joining(", ")));
});
// Engineering: Alice, Carol, Fiona
// Marketing: Bob, Ethan
// HR: Diana, George
// Grouping by a derived value: first letter of city
Map<Character, List<Employee>> byCityInitial = employees.stream()
.collect(Collectors.groupingBy(e -> e.city().charAt(0)));
// 'L' -> [Alice, Carol, Ethan], 'N' -> [Bob, Diana], 'B' -> [Fiona, George]
// TreeMap to guarantee sorted key order
Map<String, List<Employee>> sortedByDept = employees.stream()
.collect(Collectors.groupingBy(Employee::department, TreeMap::new, Collectors.toList()));
sortedByDept.keySet().forEach(System.out::println); // Engineering, HR, Marketing
The three-argument overload groupingBy(classifier, mapFactory, downstream) lets you control both the Map implementation (e.g. TreeMap::new for sorted keys) and the accumulation strategy in one call.
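As a minimal sketch of the three-argument overload, the example below swaps in both a different map factory and a different downstream collector. The class name, the trimmed two-field Employee record, and the sample data are assumptions made for illustration, not part of the original dataset.

```java
import java.util.*;
import java.util.stream.*;

class ThreeArgGroupingSketch {
    // Hypothetical record, trimmed to the two fields this sketch needs.
    record Employee(String name, String department) {}

    static final List<Employee> EMPLOYEES = List.of(
        new Employee("Alice", "Engineering"),
        new Employee("Bob", "Marketing"),
        new Employee("Carol", "Engineering"),
        new Employee("Diana", "HR"));

    // TreeMap keeps keys sorted; counting() replaces the default list accumulation.
    static Map<String, Long> sortedCounts() {
        return EMPLOYEES.stream()
            .collect(Collectors.groupingBy(
                Employee::department, TreeMap::new, Collectors.counting()));
    }

    // LinkedHashMap keeps keys in first-encounter order for a sequential stream.
    static Map<String, List<String>> namesInEncounterOrder() {
        return EMPLOYEES.stream()
            .collect(Collectors.groupingBy(
                Employee::department,
                LinkedHashMap::new,
                Collectors.mapping(Employee::name, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(sortedCounts());
        // {Engineering=2, HR=1, Marketing=1}
        System.out.println(namesInEncounterOrder());
        // {Engineering=[Alice, Carol], Marketing=[Bob], HR=[Diana]}
    }
}
```

Choosing LinkedHashMap::new is a reasonable option when report output should follow the order in which groups first appear rather than alphabetical order.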
The two-argument overload groupingBy(classifier, downstream) replaces the default toList() accumulator with any collector — counting(), summingInt(), mapping(), joining(), maxBy(), or even another groupingBy():
import java.util.*;
import java.util.stream.*;
// counting() — how many employees per department
Map<String, Long> countByDept = employees.stream()
.collect(Collectors.groupingBy(Employee::department, Collectors.counting()));
// {Engineering=3, Marketing=2, HR=2}
// summingInt() — total salary per department
Map<String, Integer> totalSalaryByDept = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.summingInt(Employee::salary)));
// {Engineering=265000, Marketing=135000, HR=105000}
// averagingInt() — average salary per department
Map<String, Double> avgSalaryByDept = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.averagingInt(Employee::salary)));
// {Engineering=88333.33, Marketing=67500.0, HR=52500.0}
// mapping() — extract a field after grouping
Map<String, List<String>> namesByDept = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.mapping(Employee::name, Collectors.toList())));
// {Engineering=[Alice, Carol, Fiona], ...}
// joining() — names as a single comma-separated string per department
Map<String, String> joinedNamesByDept = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.mapping(Employee::name, Collectors.joining(", "))));
// {Engineering="Alice, Carol, Fiona", Marketing="Bob, Ethan", HR="Diana, George"}
// toSet() — deduplicate cities per department
Map<String, Set<String>> citiesByDept = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.mapping(Employee::city, Collectors.toSet())));
// {Engineering=[Berlin, London], Marketing=[London, New York], HR=[Berlin, New York]}
// maxBy() — highest earner per department
Map<String, Optional<Employee>> topEarnerByDept = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.maxBy(Comparator.comparingInt(Employee::salary))));
topEarnerByDept.forEach((dept, emp) ->
emp.ifPresent(e -> System.out.println(dept + " top earner: " + e.name())));
// Engineering top earner: Fiona
// Marketing top earner: Ethan
// HR top earner: Diana
Collectors.collectingAndThen(downstream, finisher) wraps any downstream collector and applies a final transformation. Use it to convert a mutable list to an unmodifiable one, or to extract a value from an Optional returned by maxBy / minBy.
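The two uses of collectingAndThen named above could look roughly like this. The class name, the trimmed Employee record, and the sample data are assumptions for illustration. Optional::get is used as the finisher on the assumption that groupingBy only creates a group once it has seen an element, so the Optional produced by maxBy is never empty here.

```java
import java.util.*;
import java.util.stream.*;

class CollectingAndThenSketch {
    // Hypothetical record, trimmed to the fields this sketch needs.
    record Employee(String name, String department, int salary) {}

    static final List<Employee> EMPLOYEES = List.of(
        new Employee("Alice", "Engineering", 90_000),
        new Employee("Bob", "Marketing", 65_000),
        new Employee("Fiona", "Engineering", 95_000));

    // Unwrap the Optional from maxBy so the map holds Employee values directly.
    static Map<String, Employee> topEarnerByDept() {
        return EMPLOYEES.stream()
            .collect(Collectors.groupingBy(
                Employee::department,
                Collectors.collectingAndThen(
                    Collectors.maxBy(Comparator.comparingInt(Employee::salary)),
                    Optional::get)));
    }

    // Wrap each group's list so callers cannot modify it.
    static Map<String, List<Employee>> unmodifiableGroups() {
        return EMPLOYEES.stream()
            .collect(Collectors.groupingBy(
                Employee::department,
                Collectors.collectingAndThen(
                    Collectors.toList(), Collections::unmodifiableList)));
    }

    public static void main(String[] args) {
        System.out.println(topEarnerByDept().get("Engineering").name()); // Fiona
        try {
            unmodifiableGroups().get("Engineering").add(null);
        } catch (UnsupportedOperationException e) {
            System.out.println("group lists are read-only");
        }
    }
}
```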
Passing a second groupingBy() as the downstream collector builds a nested Map<K1, Map<K2, List<T>>>. Each subsequent grouping level can itself carry a downstream collector for further aggregation:
// Nested groupingBy — outer key: department, inner key: city
Map<String, Map<String, List<Employee>>> byDeptAndCity = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.groupingBy(Employee::city)));
byDeptAndCity.forEach((dept, cityMap) -> {
System.out.println("--- " + dept + " ---");
cityMap.forEach((city, emps) ->
System.out.println(" " + city + ": " +
emps.stream().map(Employee::name).collect(Collectors.joining(", "))));
});
// --- Engineering ---
// London: Alice, Carol
// Berlin: Fiona
// --- Marketing ---
// New York: Bob
// London: Ethan
// --- HR ---
// New York: Diana
// Berlin: George
// Two-level grouping with a downstream count
Map<String, Map<String, Long>> countByDeptAndCity = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.groupingBy(Employee::city, Collectors.counting())));
// {Engineering={London=2, Berlin=1}, Marketing={New York=1, London=1}, HR={New York=1, Berlin=1}}
// Two-level grouping, inner level joins names
Map<String, Map<String, String>> namesByDeptCity = employees.stream()
.collect(Collectors.groupingBy(
Employee::department,
Collectors.groupingBy(
Employee::city,
Collectors.mapping(Employee::name, Collectors.joining(", ")))));
// Flatten to a summary line per dept+city
employees.stream()
.collect(Collectors.groupingBy(
e -> e.department() + " / " + e.city(),
Collectors.counting()))
.forEach((key, count) -> System.out.println(key + " → " + count));
Multi-level groupings produce deeply nested Map types. Consider extracting each level into a named variable, or flattening into a single string key (e.g. "dept/city"), when readability matters more than the nested structure.
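One alternative to a string key, sketched here under assumptions, is a small record used as a composite key. Records generate equals and hashCode from their components, so they behave correctly as map keys while keeping the two parts typed and separate. The DeptCity record, the class name, and the trimmed Employee record are hypothetical names introduced for this example.

```java
import java.util.*;
import java.util.stream.*;

class CompositeKeySketch {
    // Hypothetical record, trimmed to the fields this sketch needs.
    record Employee(String name, String department, String city) {}

    // Hypothetical composite key replacing the nested Map<String, Map<String, ...>>.
    record DeptCity(String department, String city) {}

    // One flat map keyed by (department, city) instead of two nested levels.
    static Map<DeptCity, Long> countByDeptCity(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(
                e -> new DeptCity(e.department(), e.city()),
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
            new Employee("Alice", "Engineering", "London"),
            new Employee("Carol", "Engineering", "London"),
            new Employee("Fiona", "Engineering", "Berlin"));

        Map<DeptCity, Long> counts = countByDeptCity(employees);
        System.out.println(counts.get(new DeptCity("Engineering", "London"))); // 2
        System.out.println(counts.get(new DeptCity("Engineering", "Berlin"))); // 1
    }
}
```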
Collectors.partitioningBy(predicate) is a specialised groupingBy that always produces a Map<Boolean, List<T>>. Both true and false keys are always present (even if one list is empty), eliminating the need for null checks:
// partitioningBy(predicate) → always Map<Boolean, List<T>>
// Guaranteed two keys: true and false
Map<Boolean, List<Employee>> activePartition = employees.stream()
.collect(Collectors.partitioningBy(Employee::active));
List<Employee> active = activePartition.get(true);
List<Employee> inactive = activePartition.get(false);
System.out.println("Active: " + active.stream().map(Employee::name).toList());
System.out.println("Inactive: " + inactive.stream().map(Employee::name).toList());
// Active: [Alice, Bob, Diana, Ethan, Fiona]
// Inactive: [Carol, George]
// Partition by salary threshold
Map<Boolean, List<Employee>> highEarners = employees.stream()
.collect(Collectors.partitioningBy(e -> e.salary() >= 75_000));
System.out.println("High earners: " +
highEarners.get(true).stream().map(Employee::name).toList());
// [Alice, Carol, Fiona]
// Partition a list of integers into even / odd
List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
Map<Boolean, List<Integer>> evenOdd = numbers.stream()
.collect(Collectors.partitioningBy(n -> n % 2 == 0));
System.out.println("Even: " + evenOdd.get(true)); // [2, 4, 6, 8, 10]
System.out.println("Odd: " + evenOdd.get(false)); // [1, 3, 5, 7, 9]
// Partition strings by length
List<String> words = List.of("cat", "elephant", "ox", "hippopotamus", "dog", "ant");
Map<Boolean, List<String>> longShort = words.stream()
.collect(Collectors.partitioningBy(w -> w.length() > 4));
System.out.println("Long: " + longShort.get(true)); // [elephant, hippopotamus]
System.out.println("Short: " + longShort.get(false)); // [cat, ox, dog, ant]
partitioningBy always returns a map with both keys present (even if one list is empty), making get(true) and get(false) safe without null checks. This distinguishes it from groupingBy with a boolean classifier, which only populates keys that actually appear.
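The difference is easiest to see on input where one side is empty. The following sketch (class and method names are made up for illustration) runs the same predicate through both collectors on a list that contains only odd numbers.

```java
import java.util.*;
import java.util.stream.*;

class PartitionVsGroupSketch {
    // partitioningBy: both keys are present even when one side has no elements.
    static Map<Boolean, List<Integer>> partitionEvens(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // groupingBy with a boolean classifier: only keys that occur are present.
    static Map<Boolean, List<Integer>> groupEvens(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        List<Integer> odds = List.of(1, 3, 5);

        System.out.println(partitionEvens(odds));           // {false=[1, 3, 5], true=[]}
        System.out.println(groupEvens(odds));               // {false=[1, 3, 5]}
        System.out.println(groupEvens(odds).get(true));     // null
    }
}
```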
Like groupingBy, partitioningBy accepts a downstream collector as its second argument, applying a further reduction to each of the two partitions — useful for counting, averaging, or joining within each partition:
// partitioningBy(predicate, downstream) — apply a second collector to each partition
// Count active vs inactive
Map<Boolean, Long> countActiveInactive = employees.stream()
.collect(Collectors.partitioningBy(Employee::active, Collectors.counting()));
System.out.println(countActiveInactive);
// {false=2, true=5}
// Average salary per partition
Map<Boolean, Double> avgSalaryPartition = employees.stream()
.collect(Collectors.partitioningBy(
e -> e.salary() >= 75_000,
Collectors.averagingInt(Employee::salary)));
System.out.println("High earner avg: " + avgSalaryPartition.get(true));
System.out.println("Others avg: " + avgSalaryPartition.get(false));
// Names per partition as a joined string
Map<Boolean, String> namesByActive = employees.stream()
.collect(Collectors.partitioningBy(
Employee::active,
Collectors.mapping(Employee::name, Collectors.joining(", "))));
System.out.println("Active: " + namesByActive.get(true));
// Active: Alice, Bob, Diana, Ethan, Fiona
System.out.println("Inactive: " + namesByActive.get(false));
// Inactive: Carol, George
// Partition primes vs composites (2..20)
Map<Boolean, List<Integer>> primesComposites = IntStream.rangeClosed(2, 20)
.boxed()
.collect(Collectors.partitioningBy(n -> {
if (n < 2) return false;
for (int i = 2; i * i <= n; i++) {
if (n % i == 0) return false;
}
return true;
}));
System.out.println("Primes: " + primesComposites.get(true));
// Primes: [2, 3, 5, 7, 11, 13, 17, 19]
System.out.println("Composites: " + primesComposites.get(false));
// Composites: [4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20]
When keys are unique and you need a simple Map<K, V> rather than a Map<K, List<V>>, use Collectors.toMap() or Collectors.toUnmodifiableMap(). Without a merge function, a duplicate key causes an IllegalStateException; supply a merge function to decide which value to keep instead:
import java.util.*;
import java.util.stream.*;
// When you need a Map<K, V> (not Map<K, List<V>>) and keys are unique
// Use Collectors.toMap() or toUnmodifiableMap()
List<Employee> employees = List.of(
new Employee("Alice", "Engineering", "London", 90_000, true),
new Employee("Bob", "Marketing", "New York", 65_000, true),
new Employee("Fiona", "Engineering", "Berlin", 95_000, true)
);
// toMap — name → salary (keys must be unique or provide merge function)
Map<String, Integer> salaryMap = employees.stream()
.collect(Collectors.toMap(Employee::name, Employee::salary));
// {Alice=90000, Bob=65000, Fiona=95000}
// toUnmodifiableMap — same but result is unmodifiable (Java 10+)
Map<String, Employee> byName = employees.stream()
.collect(Collectors.toUnmodifiableMap(Employee::name, e -> e));
// With merge function — handle duplicate keys (take higher salary)
List<Employee> withDups = List.of(
new Employee("Alice", "Engineering", "London", 90_000, true),
new Employee("Alice", "Engineering", "Berlin", 95_000, true), // duplicate name
new Employee("Bob", "Marketing", "New York", 65_000, true)
);
Map<String, Integer> maxSalaryByName = withDups.stream()
.collect(Collectors.toMap(
Employee::name,
Employee::salary,
Integer::max)); // merge: keep higher salary on duplicate key
// {Alice=95000, Bob=65000}
// groupingBy vs toMap — pick the right one:
// groupingBy → Map<K, List<T>> (multiple values per key, safe on duplicates)
// toMap → Map<K, V> (one value per key, requires merge on duplicates)
// collectingAndThen wraps the grouped map as unmodifiable (the lists inside it stay mutable)
Map<String, List<Employee>> immutableGroups = employees.stream()
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(Employee::department),
Collections::unmodifiableMap));
Prefer Collectors.toUnmodifiableMap() over wrapping with Collections.unmodifiableMap() when building lookup maps from streams — it is more concise and signals intent clearly at the collection site.
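To illustrate the behaviour on duplicate keys and the read-only result, here is a small sketch. The class name, the two-field Employee record, and the sample data are assumptions for this example only.

```java
import java.util.*;
import java.util.stream.*;

class ToMapSketch {
    // Hypothetical record, trimmed to the fields this sketch needs.
    record Employee(String name, int salary) {}

    static final List<Employee> WITH_DUPS = List.of(
        new Employee("Alice", 90_000),
        new Employee("Alice", 95_000), // duplicate key
        new Employee("Bob", 65_000));

    // Without a merge function, the duplicate "Alice" key raises IllegalStateException.
    static boolean failsOnDuplicateKey() {
        try {
            WITH_DUPS.stream()
                .collect(Collectors.toUnmodifiableMap(Employee::name, Employee::salary));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // With a merge function, the higher salary wins and the result is read-only.
    static Map<String, Integer> maxSalaryByName() {
        return WITH_DUPS.stream()
            .collect(Collectors.toUnmodifiableMap(
                Employee::name, Employee::salary, Integer::max));
    }

    public static void main(String[] args) {
        System.out.println(failsOnDuplicateKey());            // true
        System.out.println(maxSalaryByName().get("Alice"));   // 95000
        try {
            maxSalaryByName().put("Zed", 1);
        } catch (UnsupportedOperationException e) {
            System.out.println("map is read-only");
        }
    }
}
```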
This end-to-end example applies multiple grouping strategies to a realistic employee dataset — headcount, average salary, top earner per department, two-level grouping, and summary statistics — all in a single stream pass per query:
import java.util.*;
import java.util.stream.*;
// Realistic domain model
record Department(String name, String location) {}
record Employee(String name, Department dept, int salary, int yearsExp, boolean active) {}
List<Employee> staff = List.of(
new Employee("Alice", new Department("Engineering", "London"), 90_000, 8, true),
new Employee("Bob", new Department("Marketing", "New York"), 65_000, 4, true),
new Employee("Carol", new Department("Engineering", "London"), 80_000, 5, false),
new Employee("Diana", new Department("HR", "New York"), 55_000, 3, true),
new Employee("Ethan", new Department("Marketing", "London"), 70_000, 6, true),
new Employee("Fiona", new Department("Engineering", "Berlin"), 95_000, 10, true),
new Employee("George", new Department("HR", "Berlin"), 50_000, 2, false),
new Employee("Hannah", new Department("Engineering", "Berlin"), 88_000, 7, true),
new Employee("Ivan", new Department("Marketing", "New York"), 72_000, 5, true)
);
// 1. Active headcount by department
Map<String, Long> headcount = staff.stream()
.filter(Employee::active)
.collect(Collectors.groupingBy(e -> e.dept().name(), Collectors.counting()));
System.out.println("Headcount: " + headcount);
// {Engineering=3, Marketing=3, HR=1}
// 2. Average salary by department, sorted descending
staff.stream()
.collect(Collectors.groupingBy(
e -> e.dept().name(),
Collectors.averagingInt(Employee::salary)))
.entrySet().stream()
.sorted(Map.Entry.<String, Double>comparingByValue().reversed())
.forEach(e -> System.out.printf("%-15s %.0f%n", e.getKey(), e.getValue()));
// Engineering 88250
// Marketing 69000
// HR 52500
// 3. Top earner per department
Map<String, Optional<Employee>> topEarner = staff.stream()
.collect(Collectors.groupingBy(
e -> e.dept().name(),
Collectors.maxBy(Comparator.comparingInt(Employee::salary))));
topEarner.forEach((dept, emp) ->
emp.ifPresent(e -> System.out.println(dept + " → " + e.name() + " (" + e.salary() + ")")));
// Engineering → Fiona (95000)
// Marketing → Ivan (72000)
// HR → Diana (55000)
// 4. Employees grouped by department, then by location
Map<String, Map<String, List<String>>> deptLocationNames = staff.stream()
.collect(Collectors.groupingBy(
e -> e.dept().name(),
Collectors.groupingBy(
e -> e.dept().location(),
Collectors.mapping(Employee::name, Collectors.toList()))));
deptLocationNames.forEach((dept, locMap) -> {
System.out.println(dept + ":");
locMap.forEach((loc, names) ->
System.out.println(" " + loc + ": " + names));
});
// 5. Salary statistics per department using summarizingInt
Map<String, IntSummaryStatistics> salaryStats = staff.stream()
.collect(Collectors.groupingBy(
e -> e.dept().name(),
Collectors.summarizingInt(Employee::salary)));
salaryStats.forEach((dept, stats) ->
System.out.printf("%s — min: %d max: %d avg: %.0f%n",
dept, stats.getMin(), stats.getMax(), stats.getAverage()));
// Engineering — min: 80000 max: 95000 avg: 88250
// Marketing — min: 65000 max: 72000 avg: 69000
// HR — min: 50000 max: 55000 avg: 52500
Collectors.summarizingInt() computes count, sum, min, max, and average in a single pass, which is more efficient than chaining multiple collectors when you need multiple aggregate values for the same grouping key.
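As a quick check of what one summarizingInt pass yields, the sketch below reads all five values from a single IntSummaryStatistics per department. The class name and the two-field Employee record are assumptions; the Engineering and HR salaries mirror the staff list above.

```java
import java.util.*;
import java.util.stream.*;

class SummarizingSketch {
    // Hypothetical record, trimmed to the fields this sketch needs.
    record Employee(String department, int salary) {}

    static final List<Employee> STAFF = List.of(
        new Employee("Engineering", 90_000),
        new Employee("Engineering", 80_000),
        new Employee("Engineering", 95_000),
        new Employee("Engineering", 88_000),
        new Employee("HR", 55_000),
        new Employee("HR", 50_000));

    // One traversal of the stream produces every statistic for every department.
    static Map<String, IntSummaryStatistics> statsByDept(List<Employee> staff) {
        return staff.stream()
            .collect(Collectors.groupingBy(
                Employee::department,
                Collectors.summarizingInt(Employee::salary)));
    }

    public static void main(String[] args) {
        IntSummaryStatistics eng = statsByDept(STAFF).get("Engineering");
        System.out.println(eng.getCount());   // 4
        System.out.println(eng.getSum());     // 353000
        System.out.println(eng.getMin());     // 80000
        System.out.println(eng.getMax());     // 95000
        System.out.println(eng.getAverage()); // 88250.0
    }
}
```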