Contents
- String.join() with delimiter and Iterable
- String.repeat() — Java 11+
- strip() vs trim() — Unicode-aware whitespace
- chars() and codePoints() streams
- contains(), startsWith(), endsWith()
- toUpperCase() / toLowerCase() with Locale
- isBlank() — Java 11+
String.join(delimiter, elements) concatenates all elements with the given separator, accepting either varargs or any Iterable<? extends CharSequence>. Null elements are included as the literal string "null" rather than throwing; a null delimiter or a null elements argument throws NullPointerException. For a list join, String.join(",", list) is the simplest option. Inside a stream pipeline where you need a prefix and suffix as well, use Collectors.joining(delimiter, prefix, suffix). For incremental joining in a loop, StringJoiner offers the same prefix/suffix control plus setEmptyValue(), which sets what is returned when no elements have been added.
// String.join(delimiter, elements...) — joins strings with a separator
String csv = String.join(", ", "Alice", "Bob", "Charlie");
System.out.println(csv); // Alice, Bob, Charlie
// Join from a List (or any Iterable<? extends CharSequence>)
List<String> names = List.of("red", "green", "blue");
String joined = String.join(" | ", names);
System.out.println(joined); // red | green | blue
// Empty delimiter — concatenates without separator
String concat = String.join("", "foo", "bar", "baz");
System.out.println(concat); // foobarbaz
// Join path segments with "/" as the delimiter
String path = String.join("/", "home", "user", "docs", "report.txt");
System.out.println(path); // home/user/docs/report.txt
// Join a Set (iteration order is Set-defined)
Set<String> tags = new LinkedHashSet<>(List.of("java", "strings", "api"));
System.out.println(String.join(", ", tags)); // java, strings, api
// StringJoiner (import java.util.StringJoiner): more control over prefix, suffix, empty value
StringJoiner sj = new StringJoiner(", ", "[", "]");
sj.add("one");
sj.add("two");
sj.add("three");
System.out.println(sj.toString()); // [one, two, three]
// setEmptyValue — returned when no elements are added
StringJoiner empty = new StringJoiner(", ", "[", "]");
empty.setEmptyValue("(none)");
System.out.println(empty.toString()); // (none)
// Collectors.joining() — joining in a stream pipeline
String result = Stream.of("a", "b", "c")
.map(String::toUpperCase)
.collect(Collectors.joining(", ", "{", "}"));
System.out.println(result); // {A, B, C}
Use String.join() when you already have the elements. Use Collectors.joining() inside a stream pipeline. Use StringJoiner directly when you need to add elements incrementally (e.g., in a loop) and want prefix/suffix support.
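The null-handling rule described above is easy to overlook, so here is a minimal sketch of it. The class and method names (JoinNullDemo, joinKeepingNulls, joinSkippingNulls) are made up for illustration; only String.join(), Objects.nonNull() and Collectors.joining() are real API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class JoinNullDemo {
    // Joins the elements as-is; a null element becomes the text "null".
    static String joinKeepingNulls(List<String> items) {
        return String.join(", ", items);
    }

    // Drops null elements first, then joins what is left.
    static String joinSkippingNulls(List<String> items) {
        return items.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // Arrays.asList permits null elements; List.of would throw NullPointerException.
        List<String> items = Arrays.asList("a", null, "c");
        System.out.println(joinKeepingNulls(items));  // a, null, c
        System.out.println(joinSkippingNulls(items)); // a, c
    }
}
```

Whether to keep or drop nulls depends on the consumer of the joined string; filtering is shown here only as one possible choice.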
repeat(n), added in Java 11, returns a new string consisting of the receiver repeated n times. Calling it with 0 returns an empty string; a negative count throws IllegalArgumentException. It is clean and readable for building separator lines, padding strings, or generating test data — eliminating the old idiom of new String(new char[n]).replace('\0', c). It also works on multi-character strings, making it useful for pattern repetition.
// String.repeat(int count) — returns the string repeated count times
String dashes = "-".repeat(20);
System.out.println(dashes); // --------------------
// Useful for padding or building separators
String title = "Results";
String line = "=".repeat(title.length());
System.out.println(title);
System.out.println(line); // =======
// Repeat multi-character strings
String block = "ab".repeat(4);
System.out.println(block); // abababab
// Zero count returns empty string
System.out.println("x".repeat(0)); // ""
// Negative count throws IllegalArgumentException
// "x".repeat(-1); // throws IllegalArgumentException
// Practical use: indent levels
String indent = "  "; // 2 spaces
for (int level = 0; level <= 3; level++) {
    System.out.println(indent.repeat(level) + "level " + level);
}
// level 0
//   level 1
//     level 2
//       level 3
// Building a simple ASCII table border
int cols = 5;
String border = ("+" + "-".repeat(10)).repeat(cols) + "+";
System.out.println(border);
// +----------+----------+----------+----------+----------+
// Pre-Java 11 equivalent
// String old = new String(new char[n]).replace('\0', 'x');
// Or: IntStream.range(0, n).mapToObj(i -> "x").collect(Collectors.joining())
Before Java 11, there was no built-in repeat(). Common workarounds were new String(new char[n]).replace('\0', c) or a stream pipeline. On Java 11 or any later release, prefer repeat() directly.
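Padding is mentioned above as a use case, so the following sketch shows one way it might look. padLeft and RepeatPadDemo are hypothetical names, not part of the JDK, and the code assumes Java 11 or later.

```java
class RepeatPadDemo {
    // Hypothetical helper: left-pads s with padChar until it is width chars long.
    static String padLeft(String s, int width, char padChar) {
        int missing = width - s.length();
        // repeat() rejects negative counts, so return early when s is already wide enough.
        if (missing <= 0) {
            return s;
        }
        return String.valueOf(padChar).repeat(missing) + s;
    }

    public static void main(String[] args) {
        System.out.println(padLeft("42", 6, '0'));      // 000042
        System.out.println(padLeft("1234567", 6, '0')); // 1234567
    }
}
```

The early return matters because a string longer than the target width would otherwise produce a negative count and an IllegalArgumentException.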
trim() removes leading and trailing characters with code point <= '\u0020' (ASCII space and control characters). strip(), added in Java 11, uses Character.isWhitespace(), which recognizes a broader set of Unicode whitespace, including em-spaces (\u2003) and ideographic spaces (\u3000). It does not treat non-breaking spaces (\u00A0, \u2007, \u202F) as whitespace, so neither method removes them. stripLeading() and stripTrailing() strip only one side. For any code dealing with internationalized or copy-pasted text, prefer strip(): trim() may silently leave Unicode whitespace that causes subsequent comparisons to fail.
// trim(): removes leading/trailing ASCII control characters and spaces (\u0000–\u0020)
// strip(): removes leading/trailing whitespace per Character.isWhitespace() (Java 11+)
String plain = " hello world ";
System.out.println(plain.trim()); // "hello world"
System.out.println(plain.strip()); // "hello world"
// The difference: Unicode whitespace characters
// \u2003 is an em-space — Unicode whitespace but NOT ASCII \u0020
String unicode = "\u2003hello\u2003";
System.out.println("[" + unicode.trim() + "]"); // [\u2003hello\u2003] — NOT removed
System.out.println("[" + unicode.strip() + "]"); // [hello] — removed
// stripLeading() — only leading whitespace
String leading = " hello ";
System.out.println("[" + leading.stripLeading() + "]"); // [hello ]
// stripTrailing() — only trailing whitespace
System.out.println("[" + leading.stripTrailing() + "]"); // [ hello]
// Practical use: normalising user input
String userInput = " \t Alice \n ";
String cleaned = userInput.strip();
System.out.println("[" + cleaned + "]"); // [Alice]
// When to use trim() vs strip()
// - New code: always prefer strip() — Unicode-aware and clearly named
// - Existing codebases: trim() is fine for pure ASCII input
// - Tabs and line breaks are removed by both methods; they differ only for whitespace above \u0020 such as \u2003
// Checking if stripped result is empty: combine with isBlank()
String ws = " \t \n ";
System.out.println(ws.strip().isEmpty()); // true
System.out.println(ws.isBlank()); // true (Java 11+ shorthand)
Do not use trim() on strings that may contain non-ASCII whitespace, such as international user input or text copied from word processors; use strip() instead. Non-breaking spaces (\u00A0, for example from a decoded &nbsp; entity) are not removed by either method and need explicit handling.
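Non-breaking spaces are a common surprise here: Character.isWhitespace() returns false for U+00A0, so strip() leaves it in place just as trim() does. The sketch below shows that behaviour and one possible workaround. stripIncludingNbsp is a hypothetical helper, not a JDK method, and the code assumes Java 11+ for strip().

```java
class StripNbspDemo {
    // Hypothetical helper: strips leading/trailing chars that are whitespace
    // by either Character.isWhitespace() or Character.isSpaceChar() (covers U+00A0).
    static String stripIncludingNbsp(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isSpace(s.charAt(start))) {
            start++;
        }
        while (end > start && isSpace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    private static boolean isSpace(char c) {
        return Character.isWhitespace(c) || Character.isSpaceChar(c);
    }

    public static void main(String[] args) {
        String padded = "\u00A0hello\u00A0"; // non-breaking space on both sides
        System.out.println(padded.trim().length());              // 7
        System.out.println(padded.strip().length());             // 7
        System.out.println(stripIncludingNbsp(padded).length()); // 5
    }
}
```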
chars() returns an IntStream of UTF-16 code unit values; cast each int to char for character operations. codePoints() returns an IntStream of Unicode code points, which correctly handles supplementary characters (emoji, rare CJK characters) that are stored as two char values (a surrogate pair) in UTF-16. For most ASCII and Latin text, both methods give the same results. Use codePoints() whenever the input may contain characters outside the Basic Multilingual Plane (code points above U+FFFF). Both streams enable stream-pipeline operations like character frequency counts, filtering, and transformation.
// chars() — returns an IntStream of char values (UTF-16 code units)
String s = "Hello";
s.chars().forEach(c -> System.out.print((char) c + " ")); // H e l l o
// Count vowels using a stream
long vowels = "Programming".chars()
.filter(c -> "aeiouAEIOU".indexOf(c) >= 0)
.count();
System.out.println("Vowels: " + vowels); // Vowels: 3
// Convert chars stream back to String
String upper = "hello".chars()
.map(Character::toUpperCase)
.collect(StringBuilder::new, (sb, c) -> sb.append((char) c), StringBuilder::append)
.toString();
System.out.println(upper); // HELLO
// Check if all characters are digits
boolean allDigits = "12345".chars().allMatch(Character::isDigit); // true
boolean hasLetter = "abc12".chars().anyMatch(Character::isLetter); // true
// codePoints() — returns an IntStream of Unicode code points
// Handles supplementary characters (outside BMP, code points > 0xFFFF)
// A single supplementary char is 2 chars (a surrogate pair) in UTF-16
String emoji = "Hello \uD83D\uDE00"; // \uD83D\uDE00 = 😀 (U+1F600)
System.out.println("chars length: " + emoji.chars().count()); // 8 (6 + 2 surrogates)
System.out.println("codePoints length: " + emoji.codePoints().count()); // 7 (6 + 1 emoji)
// Count actual characters (code points), not UTF-16 code units
long cpCount = emoji.codePoints().count();
// Collect code points back to string safely
String rebuilt = emoji.codePoints()
.collect(StringBuilder::new,
StringBuilder::appendCodePoint,
StringBuilder::append)
.toString();
System.out.println(rebuilt.equals(emoji)); // true
// Get distinct characters sorted
"banana".chars()
.distinct()
.sorted()
.mapToObj(c -> String.valueOf((char) c))
.forEach(System.out::print); // abn (sorted unique chars)
System.out.println();
chars() works correctly for the vast majority of text. Use codePoints() when processing text that may contain emoji or other supplementary Unicode characters (code points above U+FFFF), because those are stored as two char values (a surrogate pair) and chars() would split them.
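A character frequency count, mentioned earlier as a typical stream use, could be sketched as follows. It uses codePoints() so that a surrogate pair is counted as one character. The class and method names are illustrative only.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class CodePointFrequencyDemo {
    // Counts occurrences of each code point; every key is a one-code-point String.
    static Map<String, Long> frequencies(String text) {
        return text.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(frequencies("banana")); // {a=3, b=1, n=2}
        // The emoji is a surrogate pair but is counted as a single key.
        System.out.println(frequencies("a\uD83D\uDE00\uD83D\uDE00").size()); // 2
    }
}
```

A TreeMap is used only to make the printed order predictable; a plain groupingBy() with the default map would work equally well when order does not matter.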
These three methods return a plain boolean and are all case-sensitive. contains() scans the entire string for the given CharSequence. startsWith() checks the beginning; its two-argument form startsWith(prefix, offset) checks starting from a specific character position. endsWith() checks the end of the string. None of them support case-insensitive matching directly — convert both sides to the same case first, or use regionMatches(ignoreCase, ...) for substring comparison at a given offset.
String s = "The quick brown fox jumps over the lazy dog";
// contains(CharSequence) — case-sensitive substring check
System.out.println(s.contains("fox")); // true
System.out.println(s.contains("Fox")); // false
System.out.println(s.contains("")); // true — empty string is always contained
// Case-insensitive contains (Locale.ROOT keeps the result independent of the default locale)
String lower = s.toLowerCase(Locale.ROOT);
System.out.println(lower.contains("FOX".toLowerCase(Locale.ROOT))); // true
// startsWith(prefix) / endsWith(suffix)
String filename = "report_2025.csv";
System.out.println(filename.startsWith("report")); // true
System.out.println(filename.endsWith(".csv")); // true
System.out.println(filename.endsWith(".txt")); // false
// startsWith(prefix, toffset) — check from an offset
String url = "https://example.com/path";
System.out.println(url.startsWith("example.com", 8)); // true (skip "https://")
// Practical: file type checks
List<String> files = List.of("data.csv", "readme.md", "config.json", "App.java");
List<String> javaFiles = files.stream()
.filter(f -> f.endsWith(".java"))
.collect(Collectors.toList());
System.out.println(javaFiles); // [App.java]
// Check multiple prefixes — no built-in, use stream
String[] prefixes = {"http://", "https://"};
boolean isUrl = Arrays.stream(prefixes).anyMatch(url::startsWith);
System.out.println(isUrl); // true
// indexOf / lastIndexOf — when you need position too
int idx = s.indexOf("fox"); // 16
int last = s.lastIndexOf("the"); // 31 (the leading "The" differs in case and does not match)
int notFound = s.indexOf("cat"); // -1
// regionMatches — substring comparison at given offsets
// regionMatches(ignoreCase, toffset, other, ooffset, len)
boolean match = "HelloWorld".regionMatches(true, 0, "HELLO", 0, 5);
System.out.println(match); // true
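Since contains() has no ignore-case variant, regionMatches(true, ...) can be slid across the text to get one without building lowercased copies. This is a minimal sketch; containsIgnoreCase is a hypothetical helper, and it inherits the per-character comparison of regionMatches(), so it does not handle full Unicode case folding (for example "ß" against "SS").

```java
class ContainsIgnoreCaseDemo {
    // Hypothetical helper: case-insensitive contains() built on regionMatches().
    static boolean containsIgnoreCase(String text, String needle) {
        int lastStart = text.length() - needle.length();
        // Try every position where needle could still fit inside text.
        for (int i = 0; i <= lastStart; i++) {
            if (text.regionMatches(true, i, needle, 0, needle.length())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String s = "The quick brown fox";
        System.out.println(containsIgnoreCase(s, "FOX")); // true
        System.out.println(containsIgnoreCase(s, "cat")); // false
        System.out.println(containsIgnoreCase(s, ""));    // true
    }
}
```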
The no-argument toUpperCase() and toLowerCase() use the JVM's default locale, which causes portability bugs. The most famous example is the Turkish locale: "i".toUpperCase() produces "İ" (dotted capital I) rather than "I", breaking any code that compares the result against an ASCII string. Always pass Locale.ENGLISH or Locale.ROOT when the result will be stored, compared, used as a map key, or transmitted. Use the user's locale only when the result is strictly for display.
// Basic case conversion
String s = "Hello World";
System.out.println(s.toUpperCase()); // HELLO WORLD
System.out.println(s.toLowerCase()); // hello world
// Always pass a Locale for locale-sensitive characters
// Classic example: Turkish 'i' — toUpperCase() in Turkish locale yields 'İ' (dotted I)
// but in English locale it yields 'I' (undotted I)
String title = "title"; // contains a dotted lowercase 'i'
System.out.println(title.toUpperCase(Locale.ENGLISH)); // TITLE
System.out.println(title.toUpperCase(Locale.forLanguageTag("tr"))); // TİTLE ← note İ
// Always use Locale.ROOT for programmatic/technical strings (URLs, keys, SQL)
String header = "Content-Type";
System.out.println(header.toLowerCase(Locale.ROOT)); // content-type
// Locale.ROOT is culture-neutral — safest for non-display strings
String key = "UserName";
String normalised = key.toLowerCase(Locale.ROOT); // username
// Avoid the no-arg overloads for anything that will be stored, indexed, or compared
// Bad: str.toUpperCase() — uses JVM default locale, non-portable
// Good: str.toUpperCase(Locale.ROOT) — explicit, predictable
// Capitalizing the first letter (no built-in)
String word = "hello";
String capitalised = word.isEmpty()
? word
: Character.toUpperCase(word.charAt(0)) + word.substring(1);
System.out.println(capitalised); // Hello
// Title-case each word
String sentence = "the quick brown fox";
String titleCase = Arrays.stream(sentence.split(" "))
.map(w -> Character.toUpperCase(w.charAt(0)) + w.substring(1))
.collect(Collectors.joining(" "));
System.out.println(titleCase); // The Quick Brown Fox
Never call toUpperCase() or toLowerCase() without a Locale argument on strings that are used programmatically (map keys, SQL identifiers, HTTP headers, file extensions). Always use Locale.ROOT for such values and the user's Locale only for display text.
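To make the failure mode concrete, the sketch below lowercases the same key under Turkish rules and under Locale.ROOT. normaliseKey and LocaleKeyDemo are hypothetical names; the Turkish locale is passed explicitly so the result does not depend on the machine's default locale.

```java
import java.util.Locale;

class LocaleKeyDemo {
    // Hypothetical helper: normalises a lookup key independently of the default locale.
    static String normaliseKey(String raw) {
        return raw.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Under Turkish rules 'I' lowercases to dotless i (U+0131), so the key no longer matches.
        System.out.println("TITLE".toLowerCase(turkish).equals("title")); // false
        System.out.println(normaliseKey("TITLE").equals("title"));        // true
    }
}
```

The first line is what a no-argument toLowerCase() would effectively do on a JVM whose default locale is Turkish.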
isBlank(), added in Java 11, returns true for empty strings and for strings containing only whitespace (using the same Character.isWhitespace() definition as strip()). isEmpty() only returns true for zero-length strings; it returns false for a string of spaces. isBlank() is the right check when validating that a user entered something meaningful. It is more correct than trim().isEmpty(), which misses Unicode whitespace, and it never creates a stripped copy the way strip().isEmpty() can.
// isBlank() — returns true if the string is empty or contains only whitespace
// Uses the same Unicode definition as strip() (Character.isWhitespace())
System.out.println("".isBlank()); // true
System.out.println(" ".isBlank()); // true
System.out.println(" \t\n ".isBlank()); // true
System.out.println(" hi ".isBlank()); // false
System.out.println("0".isBlank()); // false
// isBlank() vs isEmpty()
String empty = "";
String spaces = " ";
System.out.println(empty.isEmpty()); // true
System.out.println(spaces.isEmpty()); // false
System.out.println(empty.isBlank()); // true
System.out.println(spaces.isBlank()); // true — isEmpty() would miss this
// Filtering blank lines from a text block (text blocks require Java 15+)
String text = """
    line one

    line two

    line three
    """;
List<String> nonBlank = text.lines()
.filter(line -> !line.isBlank())
.collect(Collectors.toList());
System.out.println(nonBlank); // [line one, line two, line three]
// Validate user input — null-safe pattern
static boolean hasValue(String s) {
return s != null && !s.isBlank();
}
System.out.println(hasValue(null)); // false
System.out.println(hasValue("")); // false
System.out.println(hasValue(" ")); // false
System.out.println(hasValue("text")); // true
// lines() — Java 11+ — splits on line terminators, returns Stream<String>
// Handles \n, \r\n, \r without regex
long lineCount = "one\ntwo\r\nthree\rfour".lines().count();
System.out.println(lineCount); // 4
// Pre-Java 11 equivalent of isBlank()
// str.trim().isEmpty() — but trim() misses Unicode whitespace
// str.strip().isEmpty() — correct, but requires Java 11 anyway
isBlank() is the idiomatic Java 11+ test for "has no meaningful content". Prefer it over trim().isEmpty() (which misses Unicode spaces) and over strip().isEmpty() (which may build a stripped copy just to test its length). isBlank() performs the check without allocating a new String.
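For code that must still compile on Java 8, a check with the same whitespace definition can be approximated with codePoints(). This is a sketch under that assumption, not the JDK's implementation; isBlankCompat is a hypothetical name, and like isBlank() it expects a non-null argument.

```java
class BlankCompatDemo {
    // Hypothetical Java 8 fallback: true when every code point is
    // Character.isWhitespace(), which is trivially the case for "".
    static boolean isBlankCompat(String s) {
        return s.codePoints().allMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        System.out.println(isBlankCompat(""));       // true
        System.out.println(isBlankCompat(" \t\n ")); // true
        System.out.println(isBlankCompat("\u2003")); // true  (em-space)
        System.out.println(isBlankCompat(" hi "));   // false
        System.out.println(isBlankCompat("\u00A0")); // false (non-breaking space)
    }
}
```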