Parsing strings with custom delimiters
You're building a tool to process log files. Each line looks like ERROR;2023-10-27;User login failed. You need to extract the timestamp and the message. In Python or JavaScript, you'd call .split(';') and get a list of strings. Rust gives you something sharper: an iterator that hands you slices of the original string without copying a single byte.
The split method is the standard way to break a string at a delimiter. It works on &str and String. It returns an iterator, not a collection. This design choice is deliberate. It keeps memory usage low and lets you chain operations efficiently.
How split works
split takes a delimiter and returns an iterator over &str slices. The delimiter can be a string slice, a single char, or another pattern such as a closure that tests each char. When you iterate, split yields views into the original data. It does not allocate new String values for each piece.
Think of the string as a long roll of film. split doesn't cut the film and hand you separate rolls. It gives you a map with coordinates: "Frame 1 to 10", "Frame 12 to 20". The film stays in one piece. You just look at specific windows. This is a large part of why splitting in Rust is cheap: you skip the per-piece allocation and copying that a list-returning split has to perform.
fn main() {
    let data = "apple;banana;cherry";
    // split returns an iterator, not a Vec.
    // Each item is a &str slice pointing to the original data.
    // No new String allocations happen here.
    for item in data.split(";") {
        println!("{}", item);
    }
}
The output is three lines: apple, banana, cherry. The loop consumes the iterator. Each item is a &str that borrows from data, so no slice may outlive data. The borrow checker enforces this at compile time: code that tries to keep a slice after data has gone out of scope does not compile.
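To make the "map with coordinates" picture concrete, here is a small sketch that prints where each slice sits inside the original string. The helper offset_in is a hypothetical name introduced for this illustration, and it assumes its second argument really is a slice of the first.

```rust
/// Byte offset of `part` within `whole`.
/// Hypothetical helper: assumes `part` was sliced out of `whole`.
fn offset_in(whole: &str, part: &str) -> usize {
    part.as_ptr() as usize - whole.as_ptr() as usize
}

fn main() {
    let data = "apple;banana;cherry";
    for item in data.split(';') {
        let start = offset_in(data, item);
        // Each slice is only a (start, length) window into `data`.
        println!("{} -> bytes {}..{}", item, start, start + item.len());
    }
}
```

Every reported range falls inside data, which is the point: the pieces are windows onto one buffer, not copies of it.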
Walking through the mechanics
When you call data.split(";"), Rust creates a Split iterator. The iterator holds a reference to data and remembers the delimiter. It hasn't scanned the string yet. Iterators in Rust are lazy. They do work only when you ask for the next item.
The first call to next() scans from the start until it finds the delimiter. It yields the slice before the delimiter. The second call continues from where the first left off. This pattern repeats until the end of the string.
This lazy behavior matters for performance. If you have a massive string but only need the first three parts, split stops scanning after the third delimiter. It doesn't waste time parsing the rest. In a language that returns a full list, the entire string is processed regardless of how much you use.
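The step-by-step behavior can be sketched with explicit next() calls on the log line from the introduction. The function name timestamp_and_message is an assumption made for this example; it pulls exactly three items and never looks at whatever follows them.

```rust
/// Returns (timestamp, message) from a line shaped like
/// LEVEL;TIMESTAMP;MESSAGE, or None if a field is missing.
/// Hypothetical helper for illustration.
fn timestamp_and_message(line: &str) -> Option<(&str, &str)> {
    let mut parts = line.split(';');
    // Each next() scans only as far as the next delimiter.
    let _level = parts.next()?;
    let timestamp = parts.next()?;
    let message = parts.next()?;
    // The iterator is dropped here; any remaining text is never scanned.
    Some((timestamp, message))
}

fn main() {
    let line = "ERROR;2023-10-27;User login failed;trailing;fields";
    println!("{:?}", timestamp_and_message(line));
    // prints Some(("2023-10-27", "User login failed"))
}
```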
Realistic example: parsing numbers
Real data is messy. Config files have spaces. Logs have inconsistent formatting. You often need to split, clean, and convert. Rust's iterator adapters make this chainable and readable.
fn parse_scores(data: &str) -> Vec<u32> {
    data.split(",")
        // trim() removes surrounding whitespace from each slice.
        // This handles " 100 " -> "100".
        .map(|s| s.trim())
        // parse() returns a Result.
        // filter_map keeps only the Ok values and discards errors.
        // This silently skips malformed entries like "abc".
        .filter_map(|s| s.parse().ok())
        // collect() gathers the iterator into a Vec.
        .collect()
}

fn main() {
    let input = "100, 200, bad, 300";
    let scores = parse_scores(input);
    println!("{:?}", scores); // [100, 200, 300]
}
This code splits by comma, trims whitespace, attempts to parse each part as a u32, and collects the successful results. The filter_map step is idiomatic. It combines filtering and mapping in one pass. If parse fails, filter_map drops the item. If it succeeds, filter_map unwraps the value and keeps it.
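Silently dropping bad entries is not always what you want. As a contrast, here is a sketch of a stricter variant that reports the first malformed entry instead of skipping it. The name parse_scores_strict is an assumption for this example; it relies on the fact that an iterator of Result values can be collected into a Result holding a Vec.

```rust
use std::num::ParseIntError;

/// Hypothetical strict counterpart to parse_scores:
/// fails on the first entry that is not a valid u32.
fn parse_scores_strict(data: &str) -> Result<Vec<u32>, ParseIntError> {
    data.split(',')
        // Trim, then parse; each item becomes a Result<u32, ParseIntError>.
        .map(|s| s.trim().parse::<u32>())
        // Collecting into Result stops at the first Err and returns it.
        .collect()
}

fn main() {
    println!("{:?}", parse_scores_strict("100, 200, 300")); // Ok([100, 200, 300])
    println!("{}", parse_scores_strict("100, bad, 300").is_err()); // true
}
```

Which variant fits depends on the data: skipping suits best-effort log scraping, while failing fast suits config files where a typo should be surfaced.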
Convention aside: prefer split(',') over split(",") for single characters. A char pattern states the intent directly and lets the search look for one character rather than run a substring search. In practice the speed difference is usually small, so treat this mainly as a readability convention.
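For comparison, here is a short sketch of the pattern forms side by side: a char, a one-character string, a multi-character string, and a closure over char. The helper split_fields is a hypothetical name used only here.

```rust
/// Hypothetical helper: splits on either ',' or ';' using a closure pattern.
fn split_fields(line: &str) -> Vec<&str> {
    line.split(|c: char| c == ',' || c == ';').collect()
}

fn main() {
    // A char pattern and a one-character &str pattern yield the same pieces.
    let by_char: Vec<&str> = "a,b,c".split(',').collect();
    let by_str: Vec<&str> = "a,b,c".split(",").collect();
    assert_eq!(by_char, by_str);

    // A multi-character &str pattern must match as a whole substring.
    let by_sep: Vec<&str> = "a, b, c".split(", ").collect();
    println!("{:?}", by_sep); // ["a", "b", "c"]

    // A closure decides per character whether it is a delimiter.
    println!("{:?}", split_fields("a,b;c")); // ["a", "b", "c"]
}
```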
Pitfalls and compiler errors
String splitting trips up beginners in predictable ways. Watch for these traps.
Empty strings are data
split produces empty strings when delimiters are adjacent or at the edges. This is not a bug. It's data.
fn main() {
    let data = "a;;b";
    for item in data.split(";") {
        println!("'{}'", item);
    }
}
Output: 'a', '', 'b'. The empty string between the semicolons is a valid segment. If you don't want empty segments, filter them out.
fn main() {
    let data = "a;;b";
    // Filter out empty strings explicitly.
    for item in data.split(";").filter(|s| !s.is_empty()) {
        println!("'{}'", item);
    }
}
Empty strings are data, not bugs. Handle them or filter them.
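The edge cases deserve a look too. This sketch, using two hypothetical helpers named segments and non_empty_segments, shows what leading and trailing delimiters produce, and what an empty input produces.

```rust
/// Hypothetical helper: every segment, empty ones included.
fn segments(data: &str) -> Vec<&str> {
    data.split(';').collect()
}

/// Hypothetical helper: only the segments that contain something.
fn non_empty_segments(data: &str) -> Vec<&str> {
    data.split(';').filter(|s| !s.is_empty()).collect()
}

fn main() {
    // Leading and trailing delimiters each add an empty segment.
    println!("{:?}", segments(";a;b;")); // ["", "a", "b", ""]
    // An empty input yields one empty segment, not zero segments.
    println!("{:?}", segments("")); // [""]
    println!("{:?}", non_empty_segments(";a;;b;")); // ["a", "b"]
}
```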
Empty delimiters
Calling split("") does not panic, and it is not a compile error. An empty pattern matches at every character boundary, including the start and the end of the string, so "abc".split("") yields "", "a", "b", "c", "". That is rarely what you meant. If you want the individual characters, use chars() instead.
Check your delimiter before splitting if it comes from user input, and reject an empty one explicitly.
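One way to do that check is sketched below. The function split_checked is a hypothetical helper, and returning None for an empty delimiter is a design assumption; a real tool might prefer a descriptive error type.

```rust
/// Hypothetical helper: refuses an empty delimiter up front
/// instead of relying on how split treats it.
fn split_checked<'a>(data: &'a str, delimiter: &str) -> Option<Vec<&'a str>> {
    if delimiter.is_empty() {
        // An empty delimiter is almost never what the user intended.
        return None;
    }
    Some(data.split(delimiter).collect())
}

fn main() {
    println!("{:?}", split_checked("a;b;c", ";")); // Some(["a", "b", "c"])
    println!("{:?}", split_checked("a;b;c", "")); // None
}
```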
Borrowing conflicts
You cannot mutate a string while iterating over its slices. The iterator borrows the string immutably. Mutation requires a mutable borrow. The compiler rejects this with E0502 (cannot borrow as mutable because it is also borrowed as immutable).
fn main() {
    let mut data = String::from("apple;banana");
    // This does not compile. split borrows data immutably
    // for as long as the loop runs.
    // push_str requires a mutable borrow.
    // E0502: cannot borrow `data` as mutable because it is also borrowed as immutable.
    for item in data.split(";") {
        data.push_str(item);
    }
}
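Fix this by writing into a separate buffer, or by copying the pieces into owned String values before you mutate the original.
fn main() {
    let data = String::from("apple;banana");
    let mut result = String::new();
    // The slices in `parts` still borrow `data`. This compiles because
    // the writes go to `result`, a separate buffer, so `data` is never mutated.
    let parts: Vec<&str> = data.split(";").collect();
    for item in parts {
        result.push_str(item);
    }
    println!("{}", result); // applebanana
}
Trust the borrow checker. push_str may reallocate the string's buffer, which would leave every outstanding slice pointing at freed memory. If the compiler won't let you split and mutate at once, it is stopping a use-after-free.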
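If the original string itself has to change, the pieces need to be copied out first so that nothing still borrows it. A minimal sketch, with append_parts as a hypothetical name:

```rust
/// Hypothetical helper: appends every ';'-separated piece of `data`
/// to the end of `data` itself.
fn append_parts(data: &mut String) {
    // to_owned() copies each slice into its own String, so the
    // immutable borrow of `data` ends once collect() returns.
    let parts: Vec<String> = data.split(';').map(str::to_owned).collect();
    for part in &parts {
        // `data` is free to grow now; `parts` owns its own memory.
        data.push_str(part);
    }
}

fn main() {
    let mut data = String::from("apple;banana");
    append_parts(&mut data);
    println!("{}", data); // apple;bananaapplebanana
}
```

This trades the zero-copy property for the freedom to mutate, so it is worth using only when the separate-buffer approach does not fit.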
Decision matrix
Rust provides several splitting methods. Pick the right one for your scenario.
Use split when you need to break a string at every occurrence of a delimiter and keep all resulting segments, including empty ones.
Use split_whitespace when you want to tokenize text by runs of whitespace and ignore the whitespace itself. This method handles spaces, tabs, and newlines automatically.
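Use splitn when you want at most N pieces and the last piece should hold the unsplit remainder, such as separating a key from its value in a key=value pair. The iterator stops searching once it has made N - 1 splits. To split from the right instead, for example to separate a filename from its extension, use rsplitn or rsplit_once.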
Use split_terminator when a trailing delimiter should not create an empty final segment, like parsing a list where the last item might be followed by a separator.
Use split_inclusive when you need to keep the delimiter attached to each segment, useful for processing chunks that include their boundaries.
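The differences are easiest to see side by side. The following sketch runs each method on a small input; key_value is a hypothetical helper showing the splitn case, and split_inclusive assumes a reasonably recent stable toolchain.

```rust
/// Hypothetical helper: splits "key=value" at the first '=' only.
fn key_value(pair: &str) -> Option<(&str, &str)> {
    let mut parts = pair.splitn(2, '=');
    let key = parts.next()?;
    let value = parts.next()?;
    Some((key, value))
}

fn main() {
    // split keeps the empty segment after a trailing delimiter.
    let all: Vec<&str> = "a;b;".split(';').collect();
    println!("{:?}", all); // ["a", "b", ""]

    // split_terminator drops that trailing empty segment.
    let terminated: Vec<&str> = "a;b;".split_terminator(';').collect();
    println!("{:?}", terminated); // ["a", "b"]

    // split_whitespace collapses runs of spaces, tabs, and newlines.
    let words: Vec<&str> = "  a \t b\n".split_whitespace().collect();
    println!("{:?}", words); // ["a", "b"]

    // splitn(2, ..) leaves everything after the first '=' untouched.
    println!("{:?}", key_value("path=/tmp/a=b")); // Some(("path", "/tmp/a=b"))

    // split_inclusive keeps each delimiter at the end of its segment.
    let lines: Vec<&str> = "one\ntwo\n".split_inclusive('\n').collect();
    println!("{:?}", lines); // ["one\n", "two\n"]
}
```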