When you need to scan a whole tree
You are building a tool that needs to look at every file in a project. Maybe you are writing a linter that checks syntax, a backup script that copies everything, or a utility that calculates total disk usage. In Python, you reach for os.walk. In JavaScript, you chain fs.readdir calls with recursion. In Rust, the standard library gives you std::fs::read_dir, but that function only lists the immediate children of a directory. It stops at the top level. To go deeper, you need recursion.
Writing your own recursive directory walker is tedious. You have to handle io::Error at every step, manage symlinks to avoid infinite loops, and decide whether to follow links or skip them. The Rust community has solved this problem with the walkdir crate. It is the de facto standard for recursive directory traversal. It provides a clean iterator interface, robust error handling, and sensible defaults. Add walkdir to your dependencies and use it to iterate over files and directories safely.
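For comparison, here is a minimal sketch of what a hand-written walker built on std::fs::read_dir tends to look like. The helper name collect_paths is hypothetical. This version assumes you want to stop at the first I/O error and never descend through symlinks, two decisions a real tool would have to revisit.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: recursively collects every path below `dir`.
/// Returns the first I/O error it hits instead of skipping it.
fn collect_paths(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // DirEntry::file_type does not follow symlinks, so a link to a
        // directory is recorded but never descended into.
        let file_type = entry.file_type()?;
        let path = entry.path();
        if file_type.is_dir() {
            collect_paths(&path, out)?;
        }
        out.push(path);
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut paths = Vec::new();
    collect_paths(Path::new("."), &mut paths)?;
    println!("found {} entries", paths.len());
    Ok(())
}
```

Even this short version has three places where an error can surface, and it offers no way to limit depth or report which path failed.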
The inspector analogy
Think of a directory tree like a house with rooms, closets, and boxes. std::fs::read_dir is like standing in the hallway and listing what is on the floor. It tells you there is a box labeled "Documents" and a box labeled "Photos," but it does not open them. It does not tell you what is inside.
walkdir is the inspector who walks through the house. The inspector opens every door, checks every closet, and notes down the contents. The inspector also handles edge cases. If a door is locked, the inspector notes the error and moves on. If a mirror reflects another room, the inspector checks whether to follow the reflection or treat it as a separate entity. The inspector gives you a structured report of everything found, along with metadata like size and type.
Minimal example
Start with the basics. Import WalkDir and iterate over a path. The iterator yields Result values because the operating system can deny access, files can disappear, or permissions can change while you are walking.
use walkdir::WalkDir;

fn main() {
    // WalkDir::new accepts any type that implements AsRef<Path>.
    // This includes &str, String, PathBuf, and &Path.
    for entry in WalkDir::new(".") {
        // Each entry is a Result<DirEntry, Error>.
        // Unwrapping here is acceptable for a quick script,
        // but production code should handle errors gracefully.
        let entry = entry.unwrap();
        // path() returns a borrowed &Path tied to the entry's lifetime.
        println!("{:?}", entry.path());
    }
}
Even in a short example, the Result is worth keeping visible: it makes clear that every step of the walk is an I/O operation that can fail. WalkDir itself is a builder that implements IntoIterator, so the for loop calls into_iter() for you. You do not need to write .into_iter() explicitly in a for loop, although you must call it before chaining adapters such as filter_map, and spelling it out can improve readability for newcomers.
How the iterator works
The iterator yields Result<DirEntry, Error>. Each DirEntry holds the path together with the file type and depth recorded during the walk, so you can read those without making extra system calls.
The path() method returns a &Path. This path is borrowed from the DirEntry. The lifetime of the path is tied to the entry. If you try to store the path after the entry is dropped, the compiler rejects the code. Clone the path to a PathBuf if you need to keep it.
The file_type() method returns the type of the entry: file, directory, symlink, or other. This method is cheap: walkdir records the type while it reads the directory, so the call does not trigger a separate system call. Use file_type() when you only need to distinguish between files and directories.
The metadata() method returns full metadata: size, permissions, modification time, and more. This method performs a system call. It can fail even if the entry exists, for example, if permissions change between listing and stat. Use metadata() when you need details beyond the type.
use walkdir::WalkDir;

fn main() {
    for entry in WalkDir::new("src") {
        let entry = match entry {
            Ok(entry) => entry,
            Err(e) => {
                // Log the error and skip the entry.
                // This prevents the tool from crashing on permission issues.
                eprintln!("Warning: {}", e);
                continue;
            }
        };
        // file_type() is fast and safe to call unconditionally.
        if entry.file_type().is_file() {
            // metadata() is slower and can fail.
            // Only call it when you actually need the data.
            if let Ok(meta) = entry.metadata() {
                println!("{:?} is {} bytes", entry.path(), meta.len());
            }
        }
    }
}
The community convention is to prefer file_type() for filtering and metadata() only when necessary. This reduces system calls and improves performance, especially on network filesystems where metadata lookups are expensive.
Realistic example: summing file sizes
A realistic tool needs to filter files, handle errors, and aggregate data. This example sums the size of all Rust source files in a directory. It demonstrates filtering by extension, error handling, and metadata usage.
use std::path::Path;
use walkdir::WalkDir;

/// Recursively sums the size of all .rs files in a directory.
/// Skips errors gracefully to mimic user-friendly tools.
fn sum_rust_files(dir: &Path) -> u64 {
    let mut total = 0;
    // into_iter() is explicit here for clarity.
    // WalkDir implements IntoIterator, so the for loop works either way.
    for entry in WalkDir::new(dir).into_iter() {
        // Handle errors: skip inaccessible files rather than crashing.
        let entry = match entry {
            Ok(entry) => entry,
            Err(e) => {
                eprintln!("Warning: {}", e);
                continue;
            }
        };
        // Filter for .rs files only.
        // extension() returns Option<&OsStr>.
        // to_str() converts to Option<&str> for easy comparison.
        if entry.path().extension().and_then(|s| s.to_str()) == Some("rs") {
            // metadata() can fail even if the entry exists.
            // Check the result before accessing len().
            if let Ok(meta) = entry.metadata() {
                if meta.is_file() {
                    total += meta.len();
                }
            }
        }
    }
    total
}

fn main() {
    let size = sum_rust_files(Path::new("."));
    println!("Total Rust code size: {} bytes", size);
}
The extension() method returns Option<&OsStr>. Converting to &str requires to_str(), which returns Option<&str> because not all filenames are valid UTF-8. The and_then chain handles both options cleanly. This pattern is common when working with paths in Rust.
The meta.is_file() check is not redundant, even after filtering by extension. Nothing stops a directory from having a name that ends in .rs, and the length reported for a directory is not the size of any source file. The check ensures you only sum file sizes.
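The extension test can also be written without the UTF-8 conversion by comparing OsStr values directly. This is a small sketch using only the standard library; the helper name has_extension is hypothetical, and the comparison is case-sensitive.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical helper: true if `path` has exactly the extension `wanted`.
/// Comparing as OsStr sidesteps to_str(), so non-UTF-8 names are handled
/// without any special case.
fn has_extension(path: &Path, wanted: &str) -> bool {
    path.extension() == Some(OsStr::new(wanted))
}

fn main() {
    for name in ["src/main.rs", "README.md", "Makefile", ".rs", "lib.RS"] {
        println!("{name}: {}", has_extension(Path::new(name), "rs"));
    }
}
```

Note that a file named only .rs has no extension as far as Path is concerned, because a leading dot marks a hidden file rather than a suffix.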
Pitfalls and compiler traps
Walking directories involves I/O, lifetimes, and symlinks. Each introduces potential issues.
Symlinks and cycles. By default, walkdir does not follow symlinks. This prevents infinite loops caused by circular links. If you need to follow symlinks, call .follow_links(true) on the builder. The crate detects cycles and returns an error if a cycle is found. Do not enable symlink following unless you trust the directory structure. Untrusted input can lead to symlink attacks where a link points outside the intended directory.
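Path lifetimes. The entry.path() method returns a borrowed &Path whose lifetime is tied to the DirEntry. If you try to collect paths into a Vec<&Path>, the compiler rejects the code. Pushing entry.path() from inside a for loop fails with E0597 (borrowed value does not live long enough), because the DirEntry is dropped at the end of each iteration. Returning e.path() from a closure, as in the commented-out chain below, fails with E0515 (cannot return value referencing function parameter). If you need to store the path, copy it into a PathBuf with to_path_buf(), or call into_path() to take ownership without copying.
use std::path::PathBuf;
use walkdir::WalkDir;

fn main() {
    // This version fails to compile: the closure returns a reference
    // into `e`, which is dropped when the closure returns.
    // let paths: Vec<&std::path::Path> = WalkDir::new(".").into_iter()
    //     .filter_map(|e| e.ok())
    //     .map(|e| e.path())
    //     .collect();

    // Correct approach: store owned PathBufs.
    let paths: Vec<PathBuf> = WalkDir::new(".")
        .into_iter()
        .filter_map(|e| e.ok())
        .map(|e| e.path().to_path_buf())
        .collect();
    println!("collected {} paths", paths.len());
}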
Metadata cost. Calling metadata() on every entry triggers a system call. On large trees, this adds up. Use file_type() for filtering. Call metadata() only when you need size, permissions, or timestamps. The performance difference is noticeable on network drives.
Depth limits. walkdir traverses the entire tree by default. Use .max_depth(n) to limit recursion. This is useful for tools that only need to scan a few levels. Depth is counted from the root, which has depth 0, so max_depth(1) yields the root and its immediate children, and min_depth(1) skips the root itself.
Error handling. Ignoring errors with unwrap() makes your tool fragile. Real directories contain files with restricted permissions. A robust tool logs errors and continues. Use match or filter_map to handle results. The ignore crate builds on walkdir and adds .gitignore support, which is essential for user-facing tools.
Decision matrix
Choose the right tool based on your needs. Each option has a specific use case.
Use walkdir when you need a reliable, recursive traversal of a directory tree with metadata access. Use walkdir when you want a simple API that handles recursion, symlinks, and errors without reinventing the wheel. Use walkdir when you are building a system tool that needs to scan directories efficiently.
Reach for std::fs::read_dir with manual recursion when you need fine-grained control over the traversal order. Reach for std::fs::read_dir when you want to avoid a dependency for a simple script that only needs top-level listing. Reach for std::fs::read_dir when you are implementing a custom traversal algorithm that walkdir does not support.
Pick the ignore crate when you are building a tool that respects .gitignore files. Pick the ignore crate when you need to skip hidden files and directories by default. Pick the ignore crate when you are writing a user-facing utility like a linter or formatter that should behave like git.
Use the glob crate when you need pattern matching on paths rather than full tree traversal. Use the glob crate when you want to find files matching a wildcard pattern like **/*.rs. Use the glob crate when you are processing configuration files or assets with specific naming conventions.
Trust the iterator. It handles the recursion, the errors, and the symlinks. Focus on your logic, not the traversal mechanics.