The Joys of Guava – Splitter and Joiner

The Guava library, for me, is hands down the best Java utility library out there. It is a shining example of clean code and API design, both in its implementation and documentation. If you’re not already using it in your Java projects, you absolutely should be! Check out the Guava wiki to get started.

In this series I want to highlight different pieces of Guava that I have found particularly useful. Often there’s a Guava utility that fits your need perfectly, so becoming familiar with the different features of the library can be extremely helpful!

Introduction

Whether you’re parsing input data or formatting output data, String manipulation is one of the most common operations you’ll encounter as a developer. The JDK provides some limited utilities for basic operations, but Guava’s Splitter and Joiner help fill in the gaps tremendously.

Splitter

If you’re lucky enough to have not encountered the quirks of Java’s String.split() method (it silently drops trailing empty strings, for one), I would encourage you to avoid it in all circumstances. If you have Error Prone set up in your project, it will even report a warning if you try to use it!

Quirks aside, String.split() returns an array, which is rarely the most convenient format for the client. It also only accepts regular expressions on which to split, which in many cases involves compiling your input into a Pattern under the hood. This is often far more expensive than you need if all you want to do is split on a plain String literal. Fortunately, Guava has our back with an easy-to-use alternative that allows complete control over the behavior and return type of the split operation while avoiding any unnecessary computation.
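To make those quirks concrete, here is a small JDK-only sketch. The class and method names (SplitQuirks, jdkSplit) are made up for illustration; only String.split() and Pattern.quote() come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitQuirks {
    // Hypothetical helper: wraps String.split() in a List so results are easy to print.
    static List<String> jdkSplit(String input, String regex) {
        return Arrays.asList(input.split(regex));
    }

    public static void main(String[] args) {
        // Trailing empty strings are silently dropped...
        System.out.println(jdkSplit("a,b,,", ","));  // prints [a, b]
        // ...but leading empty strings are kept.
        System.out.println(jdkSplit(",a,b", ","));   // prints [, a, b]
        // The argument is a regex, so "." matches every character.
        System.out.println(jdkSplit("a.b.c", "."));  // prints []
        // Splitting on a literal requires quoting the delimiter.
        System.out.println(jdkSplit("a.b.c", Pattern.quote(".")));  // prints [a, b, c]
    }
}
```

None of this is a bug as such, since it is all documented behavior, but it is easy to trip over.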

Splitter instances do exactly what you think: they split Strings. You can get the output as an Iterable, List, or Stream depending on your needs.

String csvData = "One,Two,Three";
String[] badExample = csvData.split(",");  // don't use this

// Iterable
for (String s : Splitter.on(',').split(csvData))
{
    ...
}

// List
List<String> strings = Splitter.on(',').splitToList(csvData);

// Stream
Splitter.on(',').splitToStream(csvData)
        ...

Splitter instances are immutable and thread-safe, so you can safely store them as static final constants to avoid repeatedly creating instances.
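As a rough sketch of what that looks like in practice, the class below keeps one shared Splitter and reuses it for both a List and a Stream result. CsvNumbers, fields, and sum are hypothetical names, and I believe splitToStream needs a reasonably recent Guava release (28.2 or later, as far as I know).

```java
import com.google.common.base.Splitter;
import java.util.List;

public class CsvNumbers {
    // One shared instance; safe because Splitter is immutable and thread-safe.
    private static final Splitter COMMA = Splitter.on(',');

    // Hypothetical helper: returns the raw fields of a comma-separated line.
    static List<String> fields(String line) {
        return COMMA.splitToList(line);
    }

    // Hypothetical helper: parses each field as an int and adds them up.
    static int sum(String line) {
        return COMMA.splitToStream(line)
                .mapToInt(Integer::parseInt)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(fields("One,Two,Three")); // prints [One, Two, Three]
        System.out.println(sum("1,2,3"));            // prints 6
    }
}
```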

Customization

The real power of the Splitter class comes in the many ways in which instances can be customized. The Splitter wiki details these methods, which I’ve included below.

Method | Description | Example
omitEmptyStrings() | Automatically omits empty strings from the result. | Splitter.on(',').omitEmptyStrings().split("a,,c,d") returns "a", "c", "d"
trimResults() | Trims whitespace from the results; equivalent to trimResults(CharMatcher.whitespace()). | Splitter.on(',').trimResults().split("a, b, c, d") returns "a", "b", "c", "d"
trimResults(CharMatcher) | Trims characters matching the specified CharMatcher from results. | Splitter.on(',').trimResults(CharMatcher.is('_')).split("_a ,_b_ ,c__") returns "a ", "b_ ", "c"
limit(int) | Stops splitting after the specified number of strings have been returned. | Splitter.on(',').limit(3).split("a,b,c,d") returns "a", "b", "c,d"

These modifications can be invoked in any order.
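Here is a quick sketch of that order-independence for trimResults() and omitEmptyStrings(). The class and method names are hypothetical; the point is that both chains produce the same list, because pieces are trimmed before the empty check regardless of the order the modifiers were called in.

```java
import com.google.common.base.Splitter;
import java.util.List;

public class ModifierOrder {
    // trimResults() first, then omitEmptyStrings().
    static List<String> trimThenOmit(String input) {
        return Splitter.on(',').trimResults().omitEmptyStrings().splitToList(input);
    }

    // Same modifiers, opposite order.
    static List<String> omitThenTrim(String input) {
        return Splitter.on(',').omitEmptyStrings().trimResults().splitToList(input);
    }

    public static void main(String[] args) {
        String messy = " a, ,b,, c ";
        System.out.println(trimThenOmit(messy)); // prints [a, b, c]
        System.out.println(omitThenTrim(messy)); // prints [a, b, c]
    }
}
```

Note that the whitespace-only piece between the first two commas disappears in both cases: it is trimmed down to an empty string and then omitted.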

Remember that because Splitters are immutable, you must store a reference to the Splitter returned from one of these methods. In other words, do this:

private static final Splitter MY_SPLITTER = Splitter.on(',').trimResults().omitEmptyStrings();

And not this:

Splitter splitter = Splitter.on('/');
splitter.trimResults(); // does nothing! IntelliJ will flag the ignored result
return splitter.split("wrong / wrong / wrong");

Similar to what you see with the Builder pattern or Java Streams, this is a great example of a fluent API.

Creation

In addition to Splitter.on(char) and Splitter.on(String), there are several other useful factories to be aware of. Some of them are tailored to very specific applications, but you never know when one might come in handy. (table copied from the wiki)

Method | Description | Example
Splitter.on(CharMatcher) | Split on occurrences of any character in some category. | Splitter.on(CharMatcher.breakingWhitespace()), Splitter.on(CharMatcher.anyOf(";,."))
Splitter.on(Pattern), Splitter.onPattern(String) | Split on a regular expression. | Splitter.onPattern("\r?\n")
Splitter.fixedLength(int) | Splits strings into substrings of the specified fixed length. The last piece can be smaller than length but will never be empty. | Splitter.fixedLength(3)
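To see these factories in action, here is a minimal sketch with made-up inputs. SplitterFactories and its method names are hypothetical; the Splitter and CharMatcher calls are the ones from the table.

```java
import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import java.util.List;

public class SplitterFactories {
    // Fixed-width chunks; the last one may be shorter.
    static List<String> chunks(String input) {
        return Splitter.fixedLength(3).splitToList(input);
    }

    // Split on any one of several delimiter characters.
    static List<String> onPunctuation(String input) {
        return Splitter.on(CharMatcher.anyOf(";,.")).splitToList(input);
    }

    // Split on a regex that matches both Windows and Unix line endings.
    static List<String> lines(String input) {
        return Splitter.onPattern("\r?\n").splitToList(input);
    }

    public static void main(String[] args) {
        System.out.println(chunks("abcdefgh"));           // prints [abc, def, gh]
        System.out.println(onPunctuation("a;b,c.d"));     // prints [a, b, c, d]
        System.out.println(lines("one\r\ntwo\nthree"));   // prints [one, two, three]
    }
}
```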

Joiner

Going the opposite direction, there’s Joiner. Again, Joiner does exactly what you’d expect it to – it joins Strings. Once you’re comfortable with the Splitter API, using Joiner will feel instantly familiar.

In its basic form, Joiner will accept inputs in array, Iterable, or varargs format.

// array
String[] strings = {"one", "two", "three"};
String joined = Joiner.on(',').join(strings);

// Iterable
List<String> strings = List.of("one", "two", "three");
String joined = Joiner.on(',').join(strings);

// varargs
String joined = Joiner.on(',').join("one", "two", "three");

Objects used as inputs to Joiners will be converted to Strings using their toString methods.
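For instance, a mixed bag of boxed values joins without any manual conversion. This is just a sketch; JoinObjects and joinAll are hypothetical names.

```java
import com.google.common.base.Joiner;
import java.util.Arrays;
import java.util.List;

public class JoinObjects {
    // Hypothetical helper: joins arbitrary objects via their toString() output.
    static String joinAll(List<?> parts) {
        return Joiner.on('-').join(parts);
    }

    public static void main(String[] args) {
        // Integer, Double, Character, and Boolean are each converted with toString().
        System.out.println(joinAll(Arrays.asList(1, 2.5, 'c', true))); // prints 1-2.5-c-true
    }
}
```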

Like Splitter, Joiner instances are always immutable, thread-safe, and usable as a static final constant.

Customization

The modifiers available for Joiners dictate the behavior for null inputs.

Method | Description | Example
skipNulls() | Automatically skips over any provided null elements. | Joiner.on("; ").skipNulls().join("Harry", null, "Ron", "Hermione") returns "Harry; Ron; Hermione"
useForNull(String) | Automatically substitutes the given string for any provided null elements. | Joiner.on("; ").useForNull("Ginny").join("Harry", null, "Ron", "Hermione") returns "Harry; Ginny; Ron; Hermione"

Note that if neither skipNulls() nor useForNull(String) is specified, the joining methods will throw a NullPointerException if any given element is null.
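The three behaviors side by side, as a small sketch. JoinerNulls and its members are hypothetical names, and the "?" placeholder is an arbitrary choice for the example.

```java
import com.google.common.base.Joiner;
import java.util.Arrays;
import java.util.List;

public class JoinerNulls {
    // Arrays.asList is used because List.of rejects null elements.
    static final List<String> NAMES = Arrays.asList("Harry", null, "Ron");

    static String skipping() {
        return Joiner.on("; ").skipNulls().join(NAMES);
    }

    static String substituting() {
        return Joiner.on("; ").useForNull("?").join(NAMES);
    }

    // Returns true if a plain Joiner rejects the null element.
    static boolean defaultThrows() {
        try {
            Joiner.on("; ").join(NAMES);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(skipping());      // prints Harry; Ron
        System.out.println(substituting());  // prints Harry; ?; Ron
        System.out.println(defaultThrows()); // prints true
    }
}
```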

JDK Alternatives

Java 8 added a utility similar to Joiner with the String.join() method. This method will also accept array, Iterable, or varargs inputs, as long as the elements are CharSequences.

// array
String[] strings = {"one", "two", "three"};
String joined = String.join(",", strings);

// Iterable
List<String> strings = List.of("one", "two", "three");
String joined = String.join(",", strings);

// varargs
String joined = String.join(",", "one", "two", "three");

The main thing you lose by not using Joiner here is control over behavior with null inputs: the String "null" is always added. String.join() also only accepts CharSequence elements, whereas Joiner will take any Object. And before you scream that there’s no Object creation involved with the static String.join() method, there actually is under the hood in the form of StringJoiner instances.
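A tiny JDK-only sketch of that null behavior (StringJoinNulls and joinWithNull are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;

public class StringJoinNulls {
    // Hypothetical helper: joins a list that contains a null element.
    static String joinWithNull() {
        List<String> parts = Arrays.asList("a", null, "b");
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // No exception and no way to skip it: the null becomes the text "null".
        System.out.println(joinWithNull()); // prints a,null,b
    }
}
```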

Collectors.joining() also uses the StringJoiner class in the JDK to do its job, but it is designed to work with Streams in a way that Guava’s Joiner cannot.

String joined = List.of("one", "two", "three").stream()
        .map(String::toUpperCase)
        .collect(joining(","));
System.out.println(joined); // prints "ONE,TWO,THREE"

Note that it is customary and wise to statically import all members of Collectors because it makes stream pipelines more readable (Effective Java, Item 46).
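Collectors.joining() also has an overload that takes a prefix and a suffix, which comes in handy for bracketed output. A minimal sketch, with JoiningDemo and bracketed as hypothetical names:

```java
import static java.util.stream.Collectors.joining;

import java.util.List;

public class JoiningDemo {
    // Hypothetical helper: renders numbers as a bracketed, comma-separated list.
    static String bracketed(List<Integer> numbers) {
        return numbers.stream()
                .map(String::valueOf)              // joining() needs CharSequence elements
                .collect(joining(", ", "[", "]")); // delimiter, prefix, suffix
    }

    public static void main(String[] args) {
        System.out.println(bracketed(List.of(1, 2, 3))); // prints [1, 2, 3]
        System.out.println(bracketed(List.of()));        // prints []
    }
}
```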

Use with Maps

Splitter and Joiner have close cousins for use with serializing and deserializing maps, MapSplitter and MapJoiner. Let’s say you have a cache or some other Map that contains data you want to write out to a file or some other location. MapJoiner can help you easily serialize your Map in whatever format you desire.

Map<String, String> data = Map.of("name", "Michael", "role", "admin", "city", "Nashville", "state", "TN");
String serializedForm = Joiner.on('\n').withKeyValueSeparator(" == ").join(data);

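This is what our serialized form looks like (Map.of() makes no guarantee about iteration order, so your lines may come out in a different order):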

state == TN
name == Michael
city == Nashville
role == admin

Clean and human-readable. Not bad for one line of code!

Going the other direction is just as easy using MapSplitter:

Map<String, String> myData = Splitter.on('\n').withKeyValueSeparator(" == ").split(serializedForm);
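Putting both directions together, here is a sketch of a full round trip. MapRoundTrip, serialize, and deserialize are hypothetical names, and I’ve used a LinkedHashMap so the output order is predictable. This assumes no key or value contains the separators themselves; otherwise the round trip will not hold.

```java
import com.google.common.base.Joiner;
import com.google.common.base.Splitter;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapRoundTrip {
    // Both sides must agree on the entry separator and the key-value separator.
    private static final Joiner.MapJoiner JOINER =
            Joiner.on('\n').withKeyValueSeparator(" == ");
    private static final Splitter.MapSplitter SPLITTER =
            Splitter.on('\n').withKeyValueSeparator(" == ");

    static String serialize(Map<String, String> data) {
        return JOINER.join(data);
    }

    // Throws IllegalArgumentException if an entry has no separator or a key repeats.
    static Map<String, String> deserialize(String serializedForm) {
        return SPLITTER.split(serializedForm);
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("name", "Michael");
        data.put("role", "admin");

        String serializedForm = serialize(data);
        System.out.println(serializedForm);                           // two lines: name == Michael, role == admin
        System.out.println(deserialize(serializedForm).equals(data)); // prints true
    }
}
```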

This isn’t the stuff you’ll use every day, but it’s so clean and satisfying when you can put it to use. You’ll likely impress your teammates as well!
