Word count
Here’s another neat streams snippet which, in three lines, handles what would take at least two loops in pre-Java 8 code. The task is to count the occurrences of each unique word in a file. Given the file content shown below, this Bash command line solves it.
cat words | tr ' ' '\n' | sort | uniq -c
File content:
one two three four
two three four
three four four
Shell output:
4 four
1 one
3 three
2 two
The Java code follows a similar flow to the pipes above: read the file, split each line on spaces, and flatten the tokens from all lines into a single stream. Finally, collect() is used with the groupingBy() collector, which groups identical tokens (words) under the token itself (the identity) as key, while counting() counts the members of each group.
Map<String, Long> map = Files.lines(path)
    .flatMap(line -> Stream.of(line.split(" ")))
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
The Java map will contain the following key-value pairs. Here, the words happen to come out sorted alphabetically. However, that order is not guaranteed: groupingBy() makes no promise about the type of map it returns, and in practice it is a HashMap.
{four=4, one=1, three=3, two=2}
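If the sorted order should be guaranteed rather than incidental, groupingBy() has a three-argument overload which takes a map factory. The following is a minimal sketch, assuming alphabetically sorted keys are what is wanted. The class name SortedWordCount and the count() helper are made up for this illustration, and the lines are held in memory instead of being read from the file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, used only for this illustration.
public class SortedWordCount {

  // Counts the words and collects them into a TreeMap, which keeps its keys sorted.
  static Map<String, Long> count(List<String> lines) {
    return lines.stream()
        .flatMap(line -> Stream.of(line.split(" ")))
        .collect(Collectors.groupingBy(
            Function.identity(), TreeMap::new, Collectors.counting()));
  }

  public static void main(String[] args) {
    // Same content as the words file above, held in memory for the example.
    List<String> lines = Arrays.asList(
        "one two three four",
        "two three four",
        "three four four");

    System.out.println(count(lines)); // {four=4, one=1, three=3, two=2}
  }
}
```

With TreeMap::new as the map factory, the key order follows from the natural ordering of String, not from how the hashes happen to fall.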
The full listing:
/* Copyright rememberjava.com. Licensed under GPL 3. See http://rememberjava.com/license */
package com.rememberjava.lambda;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.junit.Test;
public class WordCount {

  // TODO: Fix path
  //@Test
  public void disabled_countWords() throws IOException {
    Path path = Paths.get("com/rememberjava/lambda/words");

    // Files.lines() keeps the file open until the stream is closed,
    // so wrap it in try-with-resources.
    try (Stream<String> lines = Files.lines(path)) {
      Map<String, Long> map = lines
          .flatMap(line -> Stream.of(line.split(" ")))
          .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

      System.out.println(map);
    }
  }

  @Test
  public void test_dummy() {
  }
}
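For comparison, the two-loop version mentioned at the start could look roughly like the sketch below. It is an illustration only, not taken from the original listing: the class name LoopWordCount and the count() helper are made up, and the lines are passed in as a list to keep the example independent of the file.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
public class LoopWordCount {

  static Map<String, Long> count(List<String> lines) {
    Map<String, Long> map = new HashMap<String, Long>();

    // Outer loop: one iteration per line.
    for (String line : lines) {
      // Inner loop: one iteration per space-separated token.
      for (String word : line.split(" ")) {
        Long count = map.get(word);
        map.put(word, count == null ? 1L : count + 1);
      }
    }
    return map;
  }

  public static void main(String[] args) {
    // Same content as the words file, held in memory for the example.
    List<String> lines = Arrays.asList(
        "one two three four",
        "two three four",
        "three four four");

    System.out.println(count(lines));
  }
}
```

The outer loop corresponds to Files.lines(), the inner loop to the flatMap() step, and the manual get/put bookkeeping to groupingBy() with counting().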