
Introduction

There are several ways to reduce a Stream, a sequence of input elements, into a single summary result. One of them is to pass an implementation of the Collector interface to the Stream.collect(collector) method. It’s possible to implement this interface explicitly, but it’s better to start by studying the predefined implementations from the Collectors class.
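As a minimal sketch of what implementing the interface explicitly can look like, the collector below is assembled with the Collector.of factory method from its four functions. The names CustomCollectorSketch and joinWith are made up for this illustration; the predefined joining collector does the same job.

```java
import java.util.StringJoiner;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CustomCollectorSketch {

    // Hypothetical helper: a hand-written collector that joins strings with a separator.
    static Collector<String, StringJoiner, String> joinWith(String separator) {
        return Collector.of(
                () -> new StringJoiner(separator), // supplier: creates an empty result container
                StringJoiner::add,                 // accumulator: adds one element to a container
                StringJoiner::merge,               // combiner: merges two containers (parallel streams)
                StringJoiner::toString);           // finisher: converts the container into the result
    }

    public static void main(String[] args) {
        String result = Stream.of("a", "b", "c").collect(joinWith("-"));
        System.out.println(result); // a-b-c
    }
}
```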

Classification of predefined collectors

There are 44 public static factory methods in the Collectors class (as of Java 12) that return predefined implementations of the Collector interface. To understand them better, it helps to divide them into categories, for example:

  • collectors to collections:
      1) regular collectors to collections
      2) collectors to unmodifiable collections
  • downstream-designed collectors:
      1) analogs of stream intermediate operations
      2) analogs of stream terminal operations
      3) analogs of stream reduce operations
  • collectors to maps:
      1) “to-map” collectors to maps:
          1a) regular collectors to maps
          1b) collectors to unmodifiable maps
          1c) concurrent collectors to maps
      2) “grouping-by” collectors to maps:
          2a) grouping collectors to maps
          2b) partitioning collectors to maps
          2c) concurrent grouping collectors to maps
  • other collectors

Collectors to collections

Collectors that reduce input elements into collections are the simplest. They collect streams into a List, a Set, or a specific Collection implementation.

Regular collectors to collections

To collect a Stream into a List, use the collector from the toList method. There are no guarantees about the type, mutability, serializability, or thread-safety of the returned List.

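List<Integer> list = Stream.of(1, 2, 3)
    .collect(toList());
assertThat(list)
    .hasSize(3)
    .containsOnly(1, 2, 3);

To collect a Stream into a Set, use the collector from the toSet method. The same absence of guarantees applies to the returned Set.

Set<Integer> set = Stream.of(1, 1, 2, 2, 3, 3)
    .collect(toSet());
assertThat(set)
    .hasSize(3)
    .containsOnly(1, 2, 3);

To collect a Stream into a specific Collection, use the collector from the toCollection(collectionFactory) method.

List<Integer> list = Stream.of(1, 2, 3)
    .collect(toCollection(ArrayList::new));
assertThat(list)
    .hasSize(3)
    .containsOnly(1, 2, 3)
    .isExactlyInstanceOf(ArrayList.class);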

Collectors to unmodifiable collections

Collections that do not support modification operations are referred to as unmodifiable. Calling any mutator method on such a collection is guaranteed to throw UnsupportedOperationException. An unmodifiable collection can be considered immutable only if its elements are themselves immutable.

Example of a collector from the toUnmodifiableList method.

List<Integer> unmodifiableList = Stream.of(1, 2, 3)
    .collect(toUnmodifiableList());
assertThat(unmodifiableList)
    .hasSize(3)
    .containsOnly(1, 2, 3);
assertThatThrownBy(unmodifiableList::clear)
    .isInstanceOf(UnsupportedOperationException.class);

Example of a collector from the toUnmodifiableSet method.

Set<Integer> unmodifiableSet = Stream.of(1, 1, 2, 2, 3, 3)
    .collect(toUnmodifiableSet());
assertThat(unmodifiableSet)
    .hasSize(3)
    .containsOnly(1, 2, 3);
assertThatThrownBy(unmodifiableSet::clear)
    .isInstanceOf(UnsupportedOperationException.class);
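The difference between unmodifiable and immutable can be illustrated with a small sketch. It assumes StringBuilder as an example of a mutable element type; the class and method names are invented for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UnmodifiableVsImmutableSketch {

    static String mutateElement() {
        // The list itself is unmodifiable, but its elements are mutable StringBuilders.
        List<StringBuilder> list = Stream.of(new StringBuilder("a"), new StringBuilder("b"))
                .collect(Collectors.toUnmodifiableList());

        // Structural changes such as list.add(...) would throw UnsupportedOperationException,
        // yet the state of an element can still be changed through the element's own methods.
        list.get(0).append("!");
        return list.toString();
    }

    public static void main(String[] args) {
        System.out.println(mutateElement()); // [a!, b]
    }
}
```

The list's structure is protected, but its observable content has changed, so this list is unmodifiable without being immutable.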

Downstream-designed collectors

There are collectors that have functionality similar to some Stream operations. These collectors were designed not to duplicate Stream functionality, but to be passed as arguments (downstream collectors) to other collectors to perform multilevel reduction.

Analogs of stream intermediate operations

The collectors from the filtering, mapping, and flatMapping methods have functionality similar to the filter, map, and flatMap intermediate Stream operations. They are all designed to perform the filter and map steps in a filter-map-reduce functional pipeline.

Example of a collector from the filtering method.

List<Integer> listOfOddNumbers = Stream.of(1, 2, 3)
    .collect(filtering(i -> i % 2 != 0, toList()));
assertThat(listOfOddNumbers)
    .hasSize(2)
    .containsOnly(1, 3);

Example of a collector from the mapping method.

List<Integer> listOfSquares = Stream.of(1, 2, 3)
    .collect(mapping(i -> i * i, toList()));
assertThat(listOfSquares)
    .hasSize(3)
    .containsOnly(1, 4, 9);

Example of a collector from the flatMapping method.

List<Integer> list = Stream.of(
        List.of(1, 2),
        List.of(3, 4))
    .collect(flatMapping(List::stream, toList()));
assertThat(list)
    .hasSize(4)
    .containsOnly(1, 2, 3, 4);
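Used on their own, these collectors add little over the matching Stream operations; the difference shows up when they are passed downstream. The sketch below, with made-up class and method names and sample words, compares filtering before grouping with filtering inside a groupingBy collector.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DownstreamFilteringSketch {

    // Filtering before grouping: a group without matching elements never appears.
    static Map<Integer, List<String>> filterThenGroup() {
        return Stream.of("a", "bb", "cc", "ddd")
                .filter(s -> !s.startsWith("a"))
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }

    // Filtering in the downstream collector: every group is kept, possibly with an empty list.
    static Map<Integer, List<String>> groupThenFilter() {
        return Stream.of("a", "bb", "cc", "ddd")
                .collect(Collectors.groupingBy(String::length, TreeMap::new,
                        Collectors.filtering(s -> !s.startsWith("a"), Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(filterThenGroup()); // {2=[bb, cc], 3=[ddd]}
        System.out.println(groupThenFilter()); // {1=[], 2=[bb, cc], 3=[ddd]}
    }
}
```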

Analogs of stream terminal operations

The collectors from the averaging(Int|Long|Double), counting, maxBy, minBy, summing(Int|Long|Double), and summarizing(Int|Long|Double) methods have functionality similar to the average, count, max, min, sum, and summaryStatistics terminal operations of Stream and its primitive specializations. They are all designed to perform specialized reduce steps in a filter-map-reduce functional pipeline.

Example of a collector from the averagingInt method.

double average = Stream.of(1, 2, 3)
    .collect(averagingInt(i -> i));
assertThat(average).isEqualTo(2);

Example of a collector from the counting method.

long count = Stream.of(1, 2, 3)
    .collect(counting());
assertEquals(3L, count);

Example of a collector from the maxBy method.

Optional<Integer> max = Stream.of(1, 2, 3)
    .collect(maxBy(Comparator.naturalOrder()));
assertThat(max)
    .isNotEmpty()
    .hasValue(3);

Example of a collector from the minBy method.

Optional<Integer> min = Stream.of(1, 2, 3)
    .collect(minBy(Comparator.naturalOrder()));
assertThat(min)
    .isNotEmpty()
    .hasValue(1);

Example of a collector from the summingInt method.

int sum = Stream.of(1, 2, 3)
    .collect(summingInt(i -> i));
assertThat(sum).isEqualTo(6);

Example of a collector from the summarizingInt method.

IntSummaryStatistics iss = Stream.of(1, 2, 3)
    .collect(summarizingInt(i -> i));
assertThat(iss.getAverage()).isEqualTo(2);
assertThat(iss.getCount()).isEqualTo(3);
assertThat(iss.getMax()).isEqualTo(3);
assertThat(iss.getMin()).isEqualTo(1);
assertThat(iss.getSum()).isEqualTo(6);
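The same collectors become more useful as downstream collectors, where they compute one statistic per group. A minimal sketch, with sample words and names chosen only for illustration:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DownstreamStatisticsSketch {

    // Number of words per first letter.
    static Map<Character, Long> countByInitial() {
        return Stream.of("apple", "avocado", "banana")
                .collect(Collectors.groupingBy(s -> s.charAt(0), TreeMap::new,
                        Collectors.counting()));
    }

    // Total number of characters per first letter.
    static Map<Character, Integer> totalLengthByInitial() {
        return Stream.of("apple", "avocado", "banana")
                .collect(Collectors.groupingBy(s -> s.charAt(0), TreeMap::new,
                        Collectors.summingInt(String::length)));
    }

    public static void main(String[] args) {
        System.out.println(countByInitial());       // {a=2, b=1}
        System.out.println(totalLengthByInitial()); // {a=12, b=6}
    }
}
```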

Analogs of stream reduce operations

The collectors from the overloaded reducing methods have functionality similar to the reduce Stream operations. They are all designed to perform general reduce steps in a filter-map-reduce functional pipeline.

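The overloaded methods accept the following parameters:

  • identity: the initial value for the reduction; it’s returned as the result when there are no input elements
  • mapper: a Function to apply to each input element
  • op: a BinaryOperator used to reduce the (mapped) input values

Example of a collector from the reducing(op) method. The result is an Optional because there is no identity value to return for an empty stream.

Optional<Integer> sumOptional = Stream.of(1, 2, 3)
    .collect(reducing(Integer::sum));
assertTrue(sumOptional.isPresent());
assertThat(sumOptional.get()).isEqualTo(6);

Example of a collector from the reducing(identity, op) method.

Integer sum = Stream.of(1, 2, 3)
    .collect(reducing(0, Integer::sum));
assertThat(sum).isEqualTo(6);

Example of a collector from the reducing(identity, mapper, op) method.

Integer sumOfSquares = Stream.of(1, 2, 3)
    .collect(reducing(0, element -> element * element, Integer::sum));
assertThat(sumOfSquares).isEqualTo(14);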
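As with the other downstream-designed collectors, reducing is most useful inside another collector. The following sketch (names and sample data are illustrative) computes the sum of squares separately for odd and even numbers.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DownstreamReducingSketch {

    // Groups numbers by parity and reduces each group to the sum of its squares.
    static Map<String, Integer> sumOfSquaresByParity() {
        return Stream.of(1, 2, 3, 4)
                .collect(Collectors.groupingBy(
                        i -> i % 2 == 0 ? "even" : "odd",
                        TreeMap::new,
                        Collectors.reducing(0, i -> i * i, Integer::sum)));
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresByParity()); // {even=20, odd=10}
    }
}
```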

Collectors to maps

Collectors that reduce input elements to maps are much more complicated than collectors to collections. There are two big categories of such collectors:

  • collectors from “to-map” methods (toMap, toUnmodifiableMap, toConcurrentMap)
  • collectors from “grouping-by” methods (groupingBy, partitioningBy, groupingByConcurrent)

“To-map” collectors to maps

Collectors from “to-map” methods reduce input elements to maps whose keys and values are the results of applying key-mapping and value-mapping functions. If several input elements map to the same key, a merge function can combine their values into a single value by binary reduction.

  • keyMapper: a Function to convert input elements into map keys
  • valueMapper: a Function to convert input elements into map values
  • mergeFunction: a BinaryOperator to resolve collisions between values when several input elements are associated with the same key
  • mapFactory: a Supplier of a new empty Map to collect the results into

Regular collectors to maps

Example of a collector from the toMap(keyMapper, valueMapper) method, where the keys are guaranteed not to collide.

Map<Character, String> map = Stream.of("Alpha", "Bravo", "Charlie")
    .collect(toMap(s -> s.charAt(0), Function.identity()));
assertThat(map)
    .hasSize(3)
    .containsEntry('A', "Alpha")
    .containsEntry('B', "Bravo")
    .containsEntry('C', "Charlie");

If keys do collide, the same collector throws IllegalStateException.

assertThrows(IllegalStateException.class, () -> {
    Stream.of(
            "Amsterdam", "Baltimore", "Casablanca",
            "Alpha", "Bravo", "Charlie")
        .collect(toMap(s -> s.charAt(0), Function.identity()));
});

Example of a collector from the toMap(keyMapper, valueMapper, mergeFunction) method, where the merge function keeps the later value.

Map<Character, String> map = Stream.of(
        "Amsterdam", "Baltimore", "Casablanca",
        "Alpha", "Bravo", "Charlie")
    .collect(toMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2));
assertThat(map)
    .hasSize(3)
    .containsEntry('A', "Alpha")
    .containsEntry('B', "Bravo")
    .containsEntry('C', "Charlie");

Example of a collector from the toMap(keyMapper, valueMapper, mergeFunction, mapFactory) method, which collects the results into a TreeMap.

SortedMap<Character, String> map = Stream.of(
        "Amsterdam", "Baltimore", "Casablanca",
        "Alpha", "Bravo", "Charlie")
    .collect(toMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2, TreeMap::new));
assertThat(map)
    .hasSize(3)
    .containsEntry('A', "Alpha")
    .containsEntry('B', "Bravo")
    .containsEntry('C', "Charlie")
    .isExactlyInstanceOf(TreeMap.class);

Collectors to unmodifiable maps

Maps that do not support modification operations are referred to as unmodifiable. Calling any mutator method on such a map is guaranteed to throw UnsupportedOperationException. An unmodifiable map can be considered immutable only if its keys and values are themselves immutable.

Example of a collector from the toUnmodifiableMap(keyMapper, valueMapper) method.

Map<Character, String> unmodifiableMap = Stream.of("Alpha", "Bravo", "Charlie")
    .collect(toUnmodifiableMap(s -> s.charAt(0), Function.identity()));
assertThat(unmodifiableMap)
    .hasSize(3)
    .containsEntry('A', "Alpha")
    .containsEntry('B', "Bravo")
    .containsEntry('C', "Charlie");
assertThatThrownBy(unmodifiableMap::clear)
    .isInstanceOf(UnsupportedOperationException.class);

If keys collide, the same collector throws IllegalStateException.

assertThrows(IllegalStateException.class, () -> {
    Stream.of(
            "Amsterdam", "Baltimore", "Casablanca",
            "Alpha", "Bravo", "Charlie")
        .collect(toUnmodifiableMap(s -> s.charAt(0), Function.identity()));
});

Example of a collector from the toUnmodifiableMap(keyMapper, valueMapper, mergeFunction) method.

Map<Character, String> unmodifiableMap = Stream.of(
        "Amsterdam", "Baltimore", "Casablanca",
        "Alpha", "Bravo", "Charlie")
    .collect(toUnmodifiableMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2));
assertThat(unmodifiableMap)
    .hasSize(3)
    .containsEntry('A', "Alpha")
    .containsEntry('B', "Bravo")
    .containsEntry('C', "Charlie");
assertThatThrownBy(unmodifiableMap::clear)
    .isInstanceOf(UnsupportedOperationException.class);

Concurrent collectors to maps

The difference between collectors from toMap and toConcurrentMap methods is in their behavior during parallel reduction.

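During parallel reduction, collectors from toMap methods (non-concurrent collectors) use:

  • supplier: a function to create a new result container (is called multiple times)
  • accumulator: a function to add a new element into a result container (is called multiple times)
  • combiner: a function to combine two result containers into one (is called multiple times)

During parallel reduction, collectors from toConcurrentMap methods (concurrent collectors) use:

  • supplier: a function to create a new result container (is called once)
  • accumulator: a function to add a new element into a result container (is called multiple times)
  • combiner: a function to combine two result containers into one (is never called)

A concurrent reduction is performed only if all of the following are true:

  • the Stream is parallel
  • the Collector has the characteristic CONCURRENT
  • either the Stream is unordered or the Collector has the characteristic UNORDERED

Example of a collector from the toConcurrentMap(keyMapper, valueMapper) method.

ConcurrentMap<Character, String> map = Stream.of("Alpha", "Bravo", "Charlie")
    .parallel()
    .collect(toConcurrentMap(s -> s.charAt(0), Function.identity()));
assertThat(map)
    .hasSize(3)
    .containsEntry('A', "Alpha")
    .containsEntry('B', "Bravo")
    .containsEntry('C', "Charlie");

If keys collide, the same collector throws IllegalStateException.

assertThrows(IllegalStateException.class, () -> {
    Stream.of(
            "Amsterdam", "Baltimore", "Casablanca",
            "Alpha", "Bravo", "Charlie")
        .parallel()
        .collect(toConcurrentMap(s -> s.charAt(0), Function.identity()));
});

Example of a collector from the toConcurrentMap(keyMapper, valueMapper, mergeFunction) method. Only the keys are asserted, because the order in which values are merged during a concurrent reduction is not deterministic.

ConcurrentMap<Character, String> map = Stream.of(
        "Amsterdam", "Baltimore", "Casablanca",
        "Alpha", "Bravo", "Charlie")
    .parallel()
    .collect(toConcurrentMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2));
assertThat(map)
    .hasSize(3)
    .containsKey('A')
    .containsKey('B')
    .containsKey('C');

Example of a collector from the toConcurrentMap(keyMapper, valueMapper, mergeFunction, mapFactory) method.

ConcurrentMap<Character, String> map = Stream.of(
        "Amsterdam", "Baltimore", "Casablanca",
        "Alpha", "Bravo", "Charlie")
    .parallel()
    .collect(toConcurrentMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2, ConcurrentHashMap::new));
assertThat(map)
    .hasSize(3)
    .containsKey('A')
    .containsKey('B')
    .containsKey('C')
    .isExactlyInstanceOf(ConcurrentHashMap.class);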
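Whether a collector is eligible for concurrent reduction can be checked through its characteristics() method. The sketch below, with illustrative names, compares the characteristics reported by collectors from toMap and toConcurrentMap.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorCharacteristicsSketch {

    static Set<Collector.Characteristics> regular() {
        Collector<String, ?, Map<Integer, String>> collector =
                Collectors.toMap(String::length, Function.identity());
        return collector.characteristics();
    }

    static Set<Collector.Characteristics> concurrent() {
        Collector<String, ?, ConcurrentMap<Integer, String>> collector =
                Collectors.toConcurrentMap(String::length, Function.identity());
        return collector.characteristics();
    }

    public static void main(String[] args) {
        // The regular collector is not CONCURRENT, so a parallel stream falls back
        // to one container per partition plus the combiner.
        System.out.println(regular().contains(Collector.Characteristics.CONCURRENT));    // false
        // The concurrent collector is both CONCURRENT and UNORDERED.
        System.out.println(concurrent().contains(Collector.Characteristics.CONCURRENT)); // true
        System.out.println(concurrent().contains(Collector.Characteristics.UNORDERED));  // true
    }
}
```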

“Grouping-by” collectors to maps

Collectors from “grouping-by” methods reduce input elements to maps whose keys are groups produced by applying a classification function. All values associated with the same key are reduced by a downstream collector into one value.

  • classifier: a Function to map input elements to keys
  • mapFactory: a Supplier of a new empty Map to collect the results into
  • downstream: a Collector to reduce the values associated with the same key

Grouping collectors to maps

Example of a collector from the groupingBy(classifier) method. A downstream collector to List is used implicitly.

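Map<Area, List<City>> citiesPerArea = USA.CITIES.stream()
    .collect(groupingBy(City::getArea));

Example of a collector from the groupingBy(classifier, downstream) method with an explicit downstream collector to Set.

Map<Area, Set<City>> citiesPerArea = USA.CITIES.stream()
    .collect(groupingBy(City::getArea, toSet()));

Example of a collector from the groupingBy(classifier, mapFactory, downstream) method, which collects the results into an EnumMap.

EnumMap<Area, List<City>> citiesPerArea = USA.CITIES.stream()
    .collect(groupingBy(City::getArea, () -> new EnumMap<>(Area.class), toList()));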
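The City, Area, and USA types above come from the article's repository and are not shown here. As a self-contained sketch of multilevel reduction, the example below uses plain strings instead and passes one groupingBy collector as the downstream of another; all names in it are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class TwoLevelGroupingSketch {

    // First level: group by first letter. Second level: group by word length.
    static Map<Character, Map<Integer, List<String>>> groupByInitialThenLength(List<String> words) {
        return words.stream().collect(
                Collectors.groupingBy(
                        word -> word.charAt(0),
                        TreeMap::new,
                        Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "bean");
        System.out.println(groupByInitialThenLength(words));
        // {a={5=[apple], 7=[avocado]}, b={4=[bean], 6=[banana]}}
    }
}
```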

Partitioning collectors to maps

Collectors from partitioningBy methods are a special case of collectors from groupingBy methods. The former use a more specific Predicate as the classifier, while the latter use a more general Function.

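  • predicate: a Predicate to classify input elements into the false and true groups
  • downstream: a Collector to reduce the values associated with the same key

Example of a collector from the partitioningBy(predicate) method. A downstream collector to List is used implicitly.

Map<Boolean, List<Integer>> remainderFromDivisionBy2IsZero = Stream.of(1, 2, 3)
    .collect(partitioningBy(i -> i % 2 == 0));
assertThat(remainderFromDivisionBy2IsZero)
    .hasSize(2)
    .containsEntry(false, List.of(1, 3))
    .containsEntry(true, List.of(2));

Example of a collector from the partitioningBy(predicate, downstream) method. The map contains both keys even though no element matches the predicate.

Map<Boolean, Set<Integer>> remainderFromDivisionBy4IsZero = Stream.of(1, 2, 3)
    .collect(partitioningBy(i -> i % 4 == 0, toSet()));
assertThat(remainderFromDivisionBy4IsZero)
    .hasSize(2)
    .containsEntry(false, Set.of(1, 2, 3))
    .containsEntry(true, Set.of());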
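One practical difference between the two families is worth a sketch: a partitioning collector always produces entries for both false and true, whereas a grouping collector only creates keys that actually occur. The class and method names below are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitioningVsGroupingSketch {

    // partitioningBy: both keys are present, even when one group is empty.
    static Map<Boolean, List<Integer>> partition() {
        return Stream.of(1, 2, 3)
                .collect(Collectors.partitioningBy(i -> i % 4 == 0));
    }

    // groupingBy with the same condition: only the keys that occur are present.
    static Map<Boolean, List<Integer>> group() {
        return Stream.of(1, 2, 3)
                .collect(Collectors.groupingBy(i -> i % 4 == 0));
    }

    public static void main(String[] args) {
        System.out.println(partition()); // {false=[1, 2, 3], true=[]}
        System.out.println(group());     // {false=[1, 2, 3]}
    }
}
```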

Concurrent grouping collectors to maps

The collectors from groupingByConcurrent methods are designed for parallel reduction, in the same way as the collectors from toConcurrentMap methods.

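Example of a collector from the groupingByConcurrent(classifier) method.

ConcurrentMap<Area, List<City>> citiesPerArea = USA.CITIES
    .parallelStream()
    .collect(groupingByConcurrent(City::getArea));

Example of a collector from the groupingByConcurrent(classifier, downstream) method.

ConcurrentMap<Area, Set<City>> citiesPerArea = USA.CITIES
    .parallelStream()
    .collect(groupingByConcurrent(City::getArea, toSet()));

Example of a collector from the groupingByConcurrent(classifier, mapFactory, downstream) method.

ConcurrentMap<Area, List<City>> citiesPerArea = USA.CITIES
    .parallelStream()
    .collect(groupingByConcurrent(City::getArea, ConcurrentHashMap::new, toList()));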
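Since the City and Area types are not shown in this article, here is a self-contained sketch of concurrent grouping on plain strings. It uses counting as the downstream collector because its result does not depend on the order in which elements arrive; the names are illustrative.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ConcurrentGroupingSketch {

    // Counts words per length; all threads accumulate into one shared ConcurrentMap.
    static ConcurrentMap<Integer, Long> countByLength(List<String> words) {
        return words.parallelStream()
                .collect(Collectors.groupingByConcurrent(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Long> counts = countByLength(List.of("a", "bb", "cc", "ddd"));
        System.out.println(counts.get(2)); // 2
    }
}
```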

Other collectors

Some collectors can’t be assigned to any category described above.

Example of a collector from the collectingAndThen method, which applies a finishing function to the result of another collector.

List<Integer> unmodifiableList = Stream.of(1, 2, 3)
    .collect(collectingAndThen(toList(), Collections::unmodifiableList));
assertThat(unmodifiableList)
    .hasSize(3)
    .containsOnly(1, 2, 3);
assertThatThrownBy(unmodifiableList::clear)
    .isInstanceOf(UnsupportedOperationException.class);

Example of a collector from the joining() method.

String result = Stream.of(1, 2, 3)
    .map(String::valueOf)
    .collect(joining());
assertThat(result).isEqualTo("123");

Example of a collector from the joining(delimiter) method.

String result = Stream.of(1, 2, 3)
    .map(String::valueOf)
    .collect(joining(","));
assertThat(result).isEqualTo("1,2,3");

Example of a collector from the joining(delimiter, prefix, suffix) method.

String result = Stream.of(1, 2, 3)
    .map(String::valueOf)
    .collect(joining(",", "[", "]"));
assertThat(result).isEqualTo("[1,2,3]");

Example of a collector from the teeing method (added in Java 12), which passes every element to two downstream collectors and merges their results.

Map.Entry<Optional<Integer>, Optional<Integer>> limits = Stream.of(1, 2, 3)
    .collect(
        teeing(
            minBy(Integer::compareTo),
            maxBy(Integer::compareTo),
            AbstractMap.SimpleImmutableEntry::new
        )
    );
assertNotNull(limits);
Optional<Integer> minOptional = limits.getKey();
assertThat(minOptional)
    .isNotEmpty()
    .hasValue(1);
Optional<Integer> maxOptional = limits.getValue();
assertThat(maxOptional)
    .isNotEmpty()
    .hasValue(3);

Conclusion

Extended code examples are available in the GitHub repository.
