The directory walk that doesn't crash
You are writing a backup script. You need to copy every file in a folder, including all subfolders. You write a recursive function. It works on your laptop. You run it on a server with a deeply nested directory structure. Stack overflow. Or worse, you hit a symbolic link that points back to the parent directory, and your script copies the entire drive until the disk fills up.
Writing a directory walker from scratch sounds easy until you realize the edge cases are endless: permission denials, broken symlinks, circular links, and files that vanish between the time you list a directory and the time you try to read it. The walkdir crate handles all of this. It builds on the standard library's directory APIs and turns them into a safe, predictable Rust iterator. You get a stream of entries with robust error handling, and you avoid reinventing the recursion logic.
How walkdir works
walkdir provides a builder pattern. You create a WalkDir object, configure options like depth limits or symlink behavior, and then call into_iter() to start the traversal. The iterator yields Result<DirEntry, Error> items. This design is intentional. Directory traversal is inherently fallible. The OS might deny access, a symlink might be broken, or a file might disappear. The iterator gives you the error so you can decide whether to skip, log, or abort.
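Each DirEntry carries the path, the file type, and the depth at which it was found. The file type is recorded during the directory read, so entry.file_type() makes no system call. Full metadata is a different matter. On Unix-like systems, entry.metadata() queries the filesystem every time you call it (std::fs::symlink_metadata, or std::fs::metadata when the entry is a followed symlink); on Windows the directory read already supplies the metadata, so walkdir may hand it back without another call. Either way, prefer entry.metadata() over std::fs::metadata(entry.path()): it respects your follow_links setting and reports failures through walkdir's error type with the path attached. If you need the metadata more than once, keep the result.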
The walker traverses depth-first by default. Each directory is yielded before its contents, and the walker finishes one subtree before moving to the next sibling. Within a directory, entries arrive in whatever order the OS returns them unless you ask for sorting. You can control the traversal with min_depth and max_depth, or fix the order with sort_by_file_name. The crate also detects symlink loops. If you enable following symlinks, walkdir checks each followed link against the directories on the current path and yields an error if the link points back to an ancestor, preventing infinite recursion.
Minimal example
The basic usage requires adding walkdir to your dependencies and iterating over the entries.
use walkdir::WalkDir;
fn main() {
    // WalkDir returns a Result iterator. filter_map extracts the Ok values.
    // This silently skips entries where the OS denies access or the path is invalid.
    for entry in WalkDir::new("/tmp").into_iter().filter_map(|e| e.ok()) {
        println!("{:?}", entry.path());
    }
}
This code prints every path in /tmp. The filter_map(|e| e.ok()) pattern is common in quick scripts because it removes the Result boilerplate. It discards errors and keeps only successful entries. In production code, you usually want to handle errors explicitly so you don't silently miss files due to permission issues.
Don't swallow errors in production. Handle them or your tool will silently miss half the files.
Walkthrough of the iterator
When you call WalkDir::new, you get a builder. The builder holds configuration but does not touch the filesystem yet. Calling into_iter() consumes the builder and starts the walk. The iterator yields items one by one. Each item is a Result<DirEntry, Error>.
The DirEntry struct provides several methods. entry.path() returns a &Path. This is a borrowed reference. If you need an owned PathBuf, you can call entry.into_path(), which consumes the entry and returns the path. This is useful when you want to collect paths into a vector.
use walkdir::WalkDir;
fn collect_paths(root: &str) -> Vec<std::path::PathBuf> {
    // Collect paths into a vector.
    // into_path() consumes the entry and returns an owned PathBuf.
    WalkDir::new(root)
        .into_iter()
        .filter_map(|e| e.ok())
        .filter(|e| e.file_type().is_file())
        .map(|e| e.into_path())
        .collect()
}
The file_type() method returns a FileType that lets you check whether an entry is a file, directory, or symlink. It is cheap: the type was recorded when the directory was read, so no further system call is needed. The metadata() method returns Result<Metadata> and is not guaranteed to be free. On Unix-like systems it queries the filesystem on each call, and it returns an error if the file has disappeared or you lack permission.
Convention aside: The community prefers entry.path().display() for printing paths. display() returns a lightweight wrapper that formats the path for human output without building an intermediate String. to_string_lossy() returns a Cow<str>: it borrows when the path is valid UTF-8 and allocates only when it has to substitute the replacement character for invalid sequences. Use display() for logs and to_string_lossy() when you need string methods or an owned string for processing.
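The difference between the two calls can be seen with the standard library alone. This is a small sketch; log_line and needed_allocation are hypothetical helper names chosen for illustration.

```rust
use std::borrow::Cow;
use std::path::Path;

/// Formats a path for a log line. `display()` writes straight into the
/// formatter; invalid sequences, if any, are rendered lossily.
fn log_line(path: &Path) -> String {
    format!("visiting {}", path.display())
}

/// Reports whether converting the path to text had to allocate.
/// `to_string_lossy()` yields `Cow::Borrowed` for valid UTF-8 paths.
fn needed_allocation(path: &Path) -> bool {
    matches!(path.to_string_lossy(), Cow::Owned(_))
}
```

For an ordinary path such as /tmp/a.toml, needed_allocation returns false, so to_string_lossy() is not expensive in the common case. The reason to prefer display() in logs is that it states the intent and never hands you a string you might mistake for the real path.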
Lean on file_type() for cheap checks. Call entry.metadata() only when you need sizes or timestamps, and call it once per entry.
Realistic example: scanning with filters
Real tools need to filter entries and handle errors gracefully. This example finds all TOML files, skips hidden directories, and logs errors without stopping the walk.
use walkdir::{DirEntry, WalkDir};
/// Returns true for entries whose name starts with a dot.
fn is_hidden(entry: &DirEntry) -> bool {
    // file_name() returns &OsStr. to_string_lossy() converts it to a Cow<str>.
    entry.file_name().to_string_lossy().starts_with('.')
}
/// Finds all TOML files and prints their size, skipping hidden files and directories.
fn find_toml_files(root: &str) {
    // Configure the walker before starting.
    // min_depth(1) skips the root directory entry itself.
    // max_depth(3) prevents going deeper than 3 levels.
    // filter_entry prunes: when the predicate rejects a directory,
    // the walker never descends into it.
    let walker = WalkDir::new(root)
        .min_depth(1)
        .max_depth(3)
        .into_iter()
        .filter_entry(|e| !is_hidden(e));
    for entry in walker {
        // Match on the Result to handle errors without panicking.
        // A permission error on one folder shouldn't kill the whole scan.
        let entry = match entry {
            Ok(e) => e,
            Err(e) => {
                eprintln!("Warning: {}", e);
                continue;
            }
        };
        // Only regular files are candidates; file_type() needs no system call.
        if !entry.file_type().is_file() {
            continue;
        }
        // extension() returns Option<&OsStr>.
        // Comparing OsStr to str works via PartialEq.
        if entry.path().extension().map_or(false, |ext| ext == "toml") {
            // metadata() can fail (for example if the file vanished); report 0 then.
            let size = entry.metadata().map(|m| m.len()).unwrap_or(0);
            println!("{}: {} bytes", entry.path().display(), size);
        }
    }
}
This code handles errors by logging and continuing. It filters by depth, hidden status, and extension. The extension() check uses map_or to handle the Option safely. Comparing the &OsStr to "toml" compiles because OsStr implements PartialEq<str>.
Filter early, and filter in the right place. A continue inside the loop only hides the entry you are looking at; the walker will still descend into that directory and yield its contents. filter_entry, or skip_current_dir() on the iterator, stops the descent, so the OS never has to list the pruned branch.
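The extension comparison from the example can be isolated into a helper that needs only the standard library. This is a sketch; has_extension is a hypothetical name, and the comparison is case-sensitive by assumption.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Returns true if `path` has exactly the extension `wanted` (case-sensitive).
/// `has_extension` is an illustrative helper, not a std or walkdir API.
fn has_extension(path: &Path, wanted: &str) -> bool {
    // extension() yields Option<&OsStr>. Because OsStr implements
    // PartialEq<str>, the comparison needs no UTF-8 conversion.
    path.extension().map_or(false, |ext: &OsStr| ext == wanted)
}
```

Note that a file named .toml has no extension as far as Path is concerned, so has_extension(Path::new(".toml"), "toml") is false. That matches the usual expectation that dotfiles are names, not extensions.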
Pitfalls and compiler errors
Directory traversal has traps. The most common is symlinks. By default, walkdir does not follow symlinks. This is safe. If you call follow_links(true), the walker follows symbolic links to directories. A hand-written walker would loop forever on a symlink that points to an ancestor directory. walkdir detects the loop and yields an error instead; the error's loop_ancestor() method returns the ancestor that the link points back to. Even so, it is better to avoid following links unless you have a specific reason.
Another pitfall is permissions. On Unix systems, you might not have read permission for some directories. For those, the iterator yields an error whose underlying io::Error has kind PermissionDenied. If you unwrap these, your program panics. Always handle errors, or use filter_map if you accept skipping inaccessible paths.
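A common stumbling block is calling iterator adapters on WalkDir directly. WalkDir is a builder, not an iterator. It implements IntoIterator, so a plain for loop accepts it as-is, but adapters such as filter_map and map are Iterator methods, and calling them on the builder fails with E0599 (the method exists, but its trait bounds are not satisfied). Call into_iter() first.
use walkdir::WalkDir;
fn main() {
    // A for loop calls into_iter() for you, so this compiles.
    for entry in WalkDir::new("/tmp") {
        // Each item is a Result<DirEntry, Error>.
        let _ = entry;
    }
    // This fails with E0599: WalkDir is not an iterator.
    // WalkDir::new("/tmp").filter_map(|e| e.ok());
    // Adapters need an explicit into_iter().
    for entry in WalkDir::new("/tmp").into_iter().filter_map(|e| e.ok()) {
        println!("{}", entry.path().display());
    }
}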
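walkdir does not impose a depth limit of its own: max_depth is unlimited by default, and the walker keeps its state in a heap-allocated stack rather than recursing, so deep trees do not overflow the call stack. What it does limit is open directory handles. max_open controls how many are kept open at once, and the default is 10. When the walk goes deeper than that, walkdir reads the remaining entries of an already-open directory into memory and closes its handle, trading memory for file descriptors. Lower the value if you are close to the process's descriptor limit; raise it if you would rather not buffer large directories.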
Symlinks are the landmines of directory traversal. Keep follow_links off unless you have a specific reason and a loop detector.
When to use walkdir
Use walkdir when you need recursive traversal with robust error handling and symlink safety. It is the standard choice for almost every directory walk in Rust.
Use std::fs::read_dir when you only need to list the contents of a single directory without recursion. It avoids the dependency and overhead of a full walker.
Use the glob crate when you need pattern matching like **/*.rs or src/**/*.toml. glob handles the pattern expansion and recursion together, which is cleaner than filtering paths manually.
Use the ignore crate when you need to respect .gitignore rules, skip hidden files, or cap file sizes. It builds on walkdir and adds the filtering logic that ripgrep uses.
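For the single-directory case mentioned above, the standard library is enough. A minimal sketch, assuming a helper named list_dir (hypothetical) that returns the immediate children of a directory:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists the immediate children of `dir`, sorted by path. No recursion.
/// `list_dir` is an illustrative helper, not a std API.
fn list_dir(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        // Each item is an io::Result<DirEntry>; `?` stops at the first error.
        paths.push(entry?.path());
    }
    // read_dir makes no ordering guarantee, so sort for stable output.
    paths.sort();
    Ok(paths)
}
```

Unlike the walkdir examples, this one aborts on the first error. That is a reasonable default for a single directory, where a failure usually means the whole listing is unusable.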
Pick the tool that matches the shape of your problem. Don't reach for a walker when you just need a list.