Why Is String Indexing Not Allowed in Rust?

The compiler rejects `s[0]`

You write let s = "café"; let c = s[1]; expecting the character 'a'. The compiler rejects it with E0277: the type str cannot be indexed by {integer} (or by usize if the index is a typed variable). You're used to strings being arrays of characters. In Python, s[1] works. In JavaScript, s[1] works. Rust says no.

This isn't a missing feature. It's a design choice rooted in how computers actually store text. Rust refuses to guess what you mean when you ask for a numeric index, because a number can point to a byte, a character, or the middle of a character. Returning half a character is nonsense. Rust forces you to be explicit.

Variable-width encoding breaks fixed indexing

UTF-8 is a variable-width encoding. ASCII characters like 'a' take one byte. The character 'é' (U+00E9) takes two bytes. The emoji '🦀' takes four bytes. In "café", the bytes at indices 0, 1, and 2 are 'c', 'a', and 'f', and 'é' occupies indices 3 and 4. If you ask for index 4, you're asking for the fifth byte, which is the second half of 'é'. Returning half a character would produce invalid UTF-8, and Rust strings must always be valid UTF-8. So integer indexing does not compile at all, and range slicing checks at runtime that both ends fall on character boundaries.
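
The byte layout can be inspected directly. This is a minimal sketch: byte_layout is a made-up helper name, and it relies only on char_indices() and char::len_utf8() from the standard library.

```rust
/// Lists each character with its starting byte offset and its encoded width.
/// `byte_layout` is a hypothetical helper used only for this illustration.
fn byte_layout(s: &str) -> Vec<(usize, char, usize)> {
    s.char_indices()
        // len_utf8() reports how many bytes this char occupies in UTF-8.
        .map(|(offset, ch)| (offset, ch, ch.len_utf8()))
        .collect()
}
```

For "café" (with 'é' as the single code point U+00E9) this yields offsets 0, 1, 2, and 3 with widths 1, 1, 1, and 2. No character starts at byte 4, so "café".is_char_boundary(4) is false.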

Think of a string like a roll of film where each frame has a different length. You can't just measure 10 centimeters and expect to be at the start of the 10th frame. You have to count frames. Counting frames requires looking at each frame to see where it ends. Random access is impossible without scanning from the start.

Rust exposes this reality. Some other languages hide the cost by using a different internal representation or by spending extra memory. Rust stores compact UTF-8 bytes and makes you request decoding explicitly through methods like chars(). This keeps strings fast and compact, but it means you can't use a simple integer to jump to a character.

Minimal example: characters vs bytes

/// Demonstrates safe character access and byte slicing.
fn main() {
    let s = "café";
    
    // Indexing by number is forbidden.
    // Rust cannot guarantee index 1 is a character boundary.
    // let c = s[1]; // Error: the type `str` cannot be indexed by `usize`
    
    // Use chars() to iterate over Unicode scalar values.
    // nth(1) finds the second character, skipping bytes as needed.
    let second_char = s.chars().nth(1).unwrap();
    println!("Second char: {}", second_char);
    
    // Use get() for byte slices when you need a substring.
    // This returns an Option to handle out-of-bounds or boundary errors.
    let slice = s.get(0..2);
    println!("First two bytes: {:?}", slice);
    
    // len() returns byte count, not character count.
    // "café" has 5 bytes: c, a, f (1 byte each) and é (2 bytes).
    println!("Byte length: {}", s.len());
}

The chars() method returns an iterator that decodes UTF-8 sequences. Each item is a char, which is a Unicode scalar value. The nth(n) method skips n items and returns the next one, so nth(1) yields the second character. This is O(n): you pay the cost of scanning from the start to find the nth character.

The get() method takes a byte range. It checks that the range is in bounds and that the start and end indices are character boundaries. If so, it returns Some(&str). If not, it returns None instead of panicking. When indices are computed or come from input, prefer get() over slicing with [], because the Option forces you to handle the failure case. Slicing with [] panics if the indices are invalid.

Don't assume len() is character count. It returns bytes. Use chars().count() if you need the number of characters, but be aware that also costs O(n).
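
A quick way to see the difference is to compute both numbers side by side. The helper name sizes is an assumption made for this sketch, not a standard library function.

```rust
/// Returns (byte length, Unicode scalar value count) for a string.
/// `sizes` is a hypothetical helper used only for this illustration.
fn sizes(s: &str) -> (usize, usize) {
    // len() is O(1): it reads the stored byte length.
    // chars().count() is O(n): it decodes the whole string.
    (s.len(), s.chars().count())
}
```

For "café" this returns (5, 4), and for "🦀" it returns (4, 1). Note that chars() counts scalar values, which is not always what a reader sees as one character: 'e' followed by the combining accent U+0301 displays as a single letter but counts as two.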

Walking the string: what happens at runtime

When you call s.chars(), Rust creates an iterator that holds a pointer to the string data and a current position. The iterator reads the first byte of each character to learn its width. If it sees a byte of the form 0xxxxxxx, it knows that's a single-byte ASCII character. If it sees 110xxxxx, one more byte belongs to the same character. The patterns 1110xxxx and 11110xxx announce two and three more bytes. It assembles the bytes into a char and yields it.
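
The lead-byte rule can be written out by hand. This is a simplified sketch for illustration, not the standard library's decoder; utf8_width and char_starts are hypothetical names.

```rust
/// Returns how many bytes a UTF-8 sequence occupies, judged from its first byte.
/// Returns None for continuation bytes (10xxxxxx) and for bytes that can
/// never start a valid sequence.
fn utf8_width(lead: u8) -> Option<usize> {
    match lead {
        0x00..=0x7F => Some(1), // 0xxxxxxx: ASCII
        0xC2..=0xDF => Some(2), // 110xxxxx (0xC0 and 0xC1 are never valid)
        0xE0..=0xEF => Some(3), // 1110xxxx
        0xF0..=0xF4 => Some(4), // 11110xxx (capped at U+10FFFF)
        _ => None,
    }
}

/// Walks the bytes of a string and collects the offset where each character starts.
fn char_starts(s: &str) -> Vec<usize> {
    let bytes = s.as_bytes();
    let mut starts = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        starts.push(i);
        // A &str is guaranteed to be valid UTF-8, so every character's
        // first byte has a known width.
        i += utf8_width(bytes[i]).expect("valid UTF-8 lead byte");
    }
    starts
}
```

For "café🦀" the starts are 0, 1, 2, 3, and 5, which matches the offsets that char_indices() reports.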

Calling nth(1) tells the iterator to skip the first character and yield the second. The iterator advances its internal pointer past the first character's bytes. This requires decoding. You can't jump to byte 2 and assume it's the second character. You have to decode byte 0 to see how many bytes that character occupies.

This is why chars().nth(n) is O(n). You must walk the string. For short strings, this is fast. For long strings, repeated random access is expensive. If you need to access many characters by index, consider collecting into a Vec<char> first. That trades memory for speed.

/// Precomputes characters for fast random access.
fn fast_access(s: &str) -> Vec<char> {
    // Collecting allocates a vector and decodes all characters upfront.
    // This is O(n) once, but allows O(1) access later.
    s.chars().collect()
}
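
As a usage sketch, here is one way to look up several positions after a single decoding pass. The name chars_at is hypothetical.

```rust
/// Looks up several character positions after decoding the string once.
/// `chars_at` is a hypothetical helper used only for this illustration.
fn chars_at(s: &str, positions: &[usize]) -> Vec<Option<char>> {
    // One O(n) pass to decode; every lookup afterwards is O(1).
    let decoded: Vec<char> = s.chars().collect();
    positions
        .iter()
        // Vec::get returns None for positions past the end instead of panicking.
        .map(|&pos| decoded.get(pos).copied())
        .collect()
}
```

Keep the memory cost in mind: a char is always four bytes, so the Vec<char> for mostly-ASCII text is about four times the size of the original string.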

Convention aside: if you only need the first character, use chars().next() instead of chars().nth(0). The two behave the same, but next() states the intent directly, and Clippy's iter_nth_zero lint flags nth(0) for that reason.

Realistic example: mapping character positions to bytes

Text editors and parsers often need to map character positions to byte offsets. You might have a cursor at character position 5 and need to extract the substring starting there. You can't just write &s[5..], because that treats 5 as a byte offset, not a character position. You need to find the byte index for character 5.

/// Finds the byte index for a given character position.
/// A position equal to the character count maps to s.len(), the end of the string.
/// Returns None if the position is past the end.
fn char_to_byte_index(s: &str, char_pos: usize) -> Option<usize> {
    // char_indices() yields (byte_index, char) pairs.
    // Chaining s.len() makes the one-past-the-end position valid,
    // so a range can reach the final character.
    s.char_indices()
        .map(|(byte_idx, _)| byte_idx)
        .chain(std::iter::once(s.len()))
        .nth(char_pos)
}

/// Extracts a substring by character count safely.
fn substring_by_chars(s: &str, start: usize, len: usize) -> Option<&str> {
    // Find byte indices for start and end.
    // checked_add guards against overflow on huge arguments.
    let start_byte = char_to_byte_index(s, start)?;
    let end_byte = char_to_byte_index(s, start.checked_add(len)?)?;

    // Now we have valid byte boundaries.
    // get() checks bounds and returns None if out of range.
    s.get(start_byte..end_byte)
}

The char_indices() method is the tool for this job. It yields tuples of (byte_index, char). You can use it to build a map from character positions to byte offsets. This is essential for UI components that display text and need to handle cursor movements.

Pitfall: walking char_indices() to a position is O(n) each time. If you do that for every position in a loop, you get O(n²) performance. Cache the indices if you need them multiple times.
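
One way to cache the indices is to store every character's byte offset once and reuse the table. This is a sketch under assumptions: the CharOffsets type and its methods are invented for this example, and the table must be rebuilt whenever the text changes.

```rust
/// Caches the byte offset of every character, plus the end of the string.
/// `CharOffsets` is a hypothetical type used only for this illustration.
struct CharOffsets<'a> {
    text: &'a str,
    offsets: Vec<usize>,
}

impl<'a> CharOffsets<'a> {
    /// Builds the table in one O(n) pass.
    fn new(text: &'a str) -> Self {
        let mut offsets: Vec<usize> = text.char_indices().map(|(i, _)| i).collect();
        // The one-past-the-end offset lets a range reach the last character.
        offsets.push(text.len());
        Self { text, offsets }
    }

    /// Number of characters (Unicode scalar values) in the text.
    fn char_len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Returns the characters in [start, end), or None if the range is invalid.
    fn slice(&self, start: usize, end: usize) -> Option<&'a str> {
        // Each lookup is O(1) because the offsets are precomputed.
        let start_byte = *self.offsets.get(start)?;
        let end_byte = *self.offsets.get(end)?;
        // get() returns None if start_byte > end_byte.
        self.text.get(start_byte..end_byte)
    }
}
```

The table costs one usize per character, so it only pays off when the same text is sliced many times.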

Trust get() to keep you safe. It checks boundaries and returns None instead of panicking. Use it whenever you're working with dynamic indices.

Pitfalls and compiler errors

If you try s[0], you get E0277: the type str cannot be indexed by {integer}. The compiler tells you exactly what's wrong. The attached note also suggests using .chars().nth() or .bytes().nth().

If you slice with a range such as &s[0..2] and an endpoint is out of bounds or not on a character boundary, the program panics at runtime. For example, &s[0..4] on "café" panics because byte 4 is the second byte of 'é'. The panic message says the byte index is not a char boundary and names the character it falls inside.

fn main() {
    let s = "café";

    // This panics at runtime.
    // Byte 4 is inside the two-byte character 'é' (bytes 3..5).
    // let bad = &s[0..4];

    // Safe alternative: get() returns None.
    let safe = s.get(0..4);
    println!("{:?}", safe); // None
}
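
When a byte offset comes from outside, such as a length limit, one option is to move it back to the nearest boundary before slicing. The sketch below does that with is_char_boundary(). The helper names floor_boundary and truncate_bytes are made up for this example.

```rust
/// Moves a byte index down to the nearest character boundary at or below it.
/// Indices past the end are clamped to s.len().
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Truncates to at most max_bytes bytes without splitting a character.
fn truncate_bytes(s: &str, max_bytes: usize) -> &str {
    // Slicing with [] is safe here because the index was just verified.
    &s[..floor_boundary(s, max_bytes)]
}
```

With "café" and a limit of 4 bytes, the result is "caf": byte 4 is inside 'é', so the cut moves back to byte 3.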

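Another trap is assuming len() is character count. "🦀".len() is 4. If you loop over 0..s.len() and treat each index as a character position, you will either panic on a slice that splits a character or process fragments of multi-byte characters. Always use chars() for character logic.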

Convention aside: if you only need to know whether a byte offset is safe to slice at, call s.is_char_boundary(i) rather than calling s.get(..i) and discarding the result with let _ =. It states the intent and returns a plain bool.

Counter-intuitive but true: the more you try to force indexing, the more you fight the language. Embrace iteration and byte ranges.

Decision: when to use what

Use chars().nth(n) when you need the nth character by count and the string is short enough that O(n) traversal is acceptable.

Use get(start..end) when you need a substring and want to handle out-of-bounds or boundary errors gracefully without panicking.

Use char_indices() when you must convert character counts to byte offsets for slicing or indexing other data structures.

Use bytes() when you are parsing binary protocols or processing data where character semantics don't matter.

Reach for [] with a range only when you have verified the byte indices align with character boundaries and a panic is the correct failure mode.

Treat byte indices as raw data, not character positions. The compiler rejects integer indexing outright, and the runtime boundary checks on ranges catch the rest.
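
The bytes() case from the list above can be sketched as follows. The task (summing ASCII digits) and the name digit_sum are assumptions chosen for illustration.

```rust
/// Sums the ASCII decimal digits in the input and ignores every other byte.
/// `digit_sum` is a hypothetical helper used only for this illustration.
fn digit_sum(input: &str) -> u32 {
    input
        .bytes()
        // Every byte of a multi-byte UTF-8 sequence is 0x80 or higher,
        // so it can never be mistaken for an ASCII digit.
        .filter(u8::is_ascii_digit)
        .map(|b| u32::from(b - b'0'))
        .sum()
}
```

Scanning bytes is safe here because the code only looks for ASCII values and never slices the string at the positions it visits.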

Summary

Rust strings store text in UTF-8, where characters can take up different amounts of space. Indexing by number assumes every character is the same size, which would break for many languages. Instead, you must iterate through the text to find the character you want, ensuring you never cut a character in half.