How to Use the Dhat or Valgrind Memory Profiler with Rust

When your memory usage refuses to make sense

You write a Rust parser. It handles megabytes of JSON without breaking a sweat. You deploy it to a staging server. After forty-eight hours, the process consumes three gigabytes of RAM and refuses to let go. The borrow checker never complained. Every reference is valid. Every scope closes correctly. Yet the memory usage climbs like a slow leak in a submarine.

Ownership and borrowing guarantee that you never read freed memory or create dangling pointers. They do not guarantee that you allocate efficiently. They do not catch circular references. They do not stop you from holding onto a Vec long after you finished iterating it. When your program runs out of memory, the borrow checker is not the tool you need. You need a profiler.

What memory profiling actually tracks

Memory profiling answers a single question: what is sitting on the heap right now, and who put it there? The heap holds data whose size is not known at compile time or that must outlive the stack frame that created it. The contents of every Box, Vec, String, HashMap, and Rc live there. The profiler intercepts allocation calls, records the size and the call stack of each one, and notes whether the block has been freed by the time the program finishes.

Think of the heap as a warehouse floor. The borrow checker is the manager who ensures every crate gets a shipping label and a return receipt. The profiler is the overhead camera that records exactly how many crates are stacked in Aisle 4 at noon versus midnight. It shows you the shape of your memory usage over time. It tells you whether you are accumulating crates faster than you are shipping them out.
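The interception step can be sketched with nothing but the standard library. The wrapper below is a hypothetical CountingAlloc, not how dhat or Massif is implemented: it forwards every request to the system allocator and keeps running totals. Real profilers also capture a call stack per allocation, which this sketch leaves out. All names here are illustrative assumptions.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts traffic.
struct CountingAlloc;

static TOTAL_BYTES: AtomicUsize = AtomicUsize::new(0);
static TOTAL_BLOCKS: AtomicUsize = AtomicUsize::new(0);
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // Record the request only if the allocation succeeded.
            TOTAL_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
            TOTAL_BLOCKS.fetch_add(1, Ordering::Relaxed);
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Route every heap allocation in the program through the wrapper.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Bytes currently allocated and not yet freed.
fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}

fn main() {
    let before = live_bytes();
    // black_box keeps the optimizer from removing the unused allocation.
    let buffer: Vec<u8> = black_box(Vec::with_capacity(4096));
    let during = live_bytes();
    drop(buffer);
    let after = live_bytes();

    println!("live bytes: before={before} during={during} after={after}");
    println!(
        "total so far: {} bytes in {} blocks",
        TOTAL_BYTES.load(Ordering::Relaxed),
        TOTAL_BLOCKS.load(Ordering::Relaxed)
    );
}
```

The live counter rises while the buffer exists and falls back once it is dropped. That rise and fall, attributed to call stacks, is what a heap profiler reports.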

The quick path: dhat

The dhat crate is a common starting point for Rust developers. It does not attach from outside: you install its allocator as your program's global allocator and create a profiler value in main. It then records every heap allocation and writes a report when the profiler is dropped. It adds little friction, and the report opens in DHAT's browser-based viewer.

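Add the crate to your Cargo.toml as an optional dependency. This keeps profiling out of your production builds. An optional dependency also defines a Cargo feature of the same name, so the dhat feature used below needs no separate [features] entry.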

[dependencies]
dhat = { version = "0.3.3", optional = true }

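Guard the profiler behind a feature flag in your entry point. Two pieces are needed. The #[global_allocator] static routes every allocation through dhat. The Profiler value uses RAII: it starts tracking when created and writes the report when dropped. Bind it to a named variable such as _profiler. A bare _ would drop the value immediately and end profiling before your code runs.

// Route all heap allocations through dhat when the feature is active.
#[cfg(feature = "dhat")]
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

/// Entry point for the application.
/// Conditionally enables heap profiling when the `dhat` feature is active.
fn main() {
    // Kept alive until the end of main; dropping it writes the report.
    #[cfg(feature = "dhat")]
    let _profiler = dhat::Profiler::new_heap();

    // Application logic runs here.
    // All heap allocations are recorded by dhat.
    // The profiler drops at the end of this scope.
    // It writes dhat-heap.json to the current directory.
}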

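Run the binary with the feature enabled. Use the --release flag for meaningful results: debug builds skip the optimizations that remove or merge allocations, so their allocation patterns differ from what you ship. To get readable call stacks from a release build, also set debug = 1 under [profile.release] in Cargo.toml.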

cargo run --release --features dhat

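When the process exits, a file named dhat-heap.json appears in your working directory. The file is raw JSON, and a browser will not render it on its own. Load it into DHAT's viewer, dh_view.html, which ships with Valgrind and runs entirely in the browser. The dhat crate documentation also points to a hosted copy.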

Reading the heap report

The viewer presents the report as a tree of allocation sites. The root node aggregates the whole run: total bytes and blocks allocated, the figures at the moment of peak heap usage (labelled t-gmax), and the figures still live at program exit (labelled t-end). Each node below it carries the same figures for one call stack: how many blocks it allocated, how many bytes it requested, and how many of those bytes were live at the peak and at exit.

Focus on the "At t-end" figures, the bytes still live when the program exited. High allocation counts with nothing live at exit mean you are creating and dropping objects rapidly. That is usually fine for memory, though it can cost time. Many bytes live at exit mean you are holding onto memory. That is where you look for leaks.
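The two profiles are easier to recognize once you have seen code that produces each. The sketch below uses two hypothetical functions: one allocates a temporary String per line and frees it at once, the other pushes every line into a cache that is never trimmed. The function names and sample data are illustrative assumptions.

```rust
/// Hypothetical hot loop: builds a temporary String per line and drops it
/// at the end of each iteration. Many allocations, nothing left live.
fn count_long_lines(lines: &[&str]) -> usize {
    let mut count = 0;
    for line in lines {
        let upper = line.to_uppercase(); // fresh heap allocation
        if upper.len() > 8 {
            count += 1;
        }
        // `upper` is freed here, on every pass through the loop.
    }
    count
}

/// Hypothetical cache with no eviction: every line stays on the heap for
/// as long as the cache itself is alive.
fn remember_lines(cache: &mut Vec<String>, lines: &[&str]) {
    for line in lines {
        cache.push(line.to_string());
    }
}

fn main() {
    let lines = ["GET /index.html", "POST /api", "GET /"];
    let mut cache = Vec::new();

    for _ in 0..1_000 {
        // Churn: allocates and frees three Strings per call.
        let _ = count_long_lines(&lines);
        // Retention: adds three Strings per call and never removes them.
        remember_lines(&mut cache, &lines);
    }

    println!("cache holds {} strings", cache.len());
}
```

Under a heap profiler, the first function would show a large allocation total with nothing live at exit. The second would show its bytes still live, growing with the number of calls.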

Expand any node to walk its call stack. The stack trace shows exactly where the allocation originated. You will often see library frames such as alloc::vec::Vec::push or alloc::string::String::from at the top. Those are where the allocation physically happens. The interesting part is what called them. Trace upward until you find your own code. That is the function responsible for the retention.

Convention aside: the community expects profiling tools to be opt-in via features. Do not ship dhat in production. The overhead is measurable, and the report is written only when the profiler is dropped, so a long-lived daemon that never exits cleanly pays the cost without ever producing a file. Keep it behind #[cfg(feature = "dhat")] and document the flag in your README.

The heavy artillery: Valgrind and Massif

dhat gives you aggregate figures for the whole run. It does not show you how memory usage changes over time. If your process runs for hours and you need to see when the spike happens, you need a timeline profiler. Valgrind's massif tool fills that gap.

Valgrind runs your binary on a synthetic CPU. Massif intercepts every heap allocation and takes periodic snapshots of heap size, keeping detailed call trees for some of them. The slowdown is severe. Expect your program to run ten to fifty times slower. That is the tradeoff for a timeline.

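Compile your binary first, with debug symbols enabled so Massif can name your functions. Then run it through Valgrind with the massif tool. Massif records a call stack for every allocation by default. The --pages-as-heap=no flag, which is also the default, tells it to track heap allocations rather than page mappings. Leave --stacks at its default of no: --stacks=yes adds stack memory to the profile, not stack traces, and slows Massif down considerably.

cargo build --release
valgrind --tool=massif --pages-as-heap=no ./target/release/myapp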

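Valgrind writes a file named massif.out.<pid>. It is plain text but not meant to be read directly. Summarize it with the ms_print utility that ships with Valgrind.

ms_print massif.out.<pid> > massif.txt

Open the text file in a pager or editor. ms_print draws an ASCII chart at the top. The X axis shows time, measured in instructions executed by default. The Y axis shows heap size in bytes. Peaks mark moments where your program held the most memory. Below the chart, the detailed snapshots list the call stacks responsible for the memory held at each of those moments, with the peak snapshot marked. If you prefer a graphical view, the separate massif-visualizer application reads the same file.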

Massif excels at long-running processes. It exposes gradual leaks that are hard to spot in dhat's report, because dhat reports totals, the peak, and the state at exit but no curve in between. Massif shows you the trajectory. It tells you whether memory usage stabilizes or climbs linearly.
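Reading a Massif timeline is easier with a program whose shape you already know. The sketch below is a hypothetical practice target, not part of any tool: it climbs in fixed steps, releases everything, then climbs again to a lower level. The function names and sizes are illustrative assumptions. For a short program like this, Massif's --time-unit=B option, which measures time in bytes allocated and freed, tends to give a clearer chart than the default instruction count.

```rust
use std::hint::black_box;

/// Allocates `steps` chunks of `chunk_bytes` each and keeps them all alive.
fn grow_in_steps(steps: usize, chunk_bytes: usize) -> Vec<Vec<u8>> {
    let mut held = Vec::with_capacity(steps);
    for _ in 0..steps {
        // Each chunk is a separate heap allocation that stays live.
        held.push(vec![1u8; chunk_bytes]);
    }
    held
}

/// Sums the payload bytes currently held.
fn held_bytes(held: &[Vec<u8>]) -> usize {
    held.iter().map(Vec::len).sum()
}

fn main() {
    const MIB: usize = 1024 * 1024;

    // Phase 1: climb to 64 MiB in 1 MiB steps.
    let first = grow_in_steps(64, MIB);
    // black_box keeps the optimizer from discarding the buffers.
    println!("phase 1 holds {} bytes", held_bytes(black_box(&first)));

    // Phase 2: release everything; the timeline should fall back.
    drop(first);

    // Phase 3: a second, smaller climb to 16 MiB.
    let second = grow_in_steps(16, MIB);
    println!("phase 3 holds {} bytes", held_bytes(black_box(&second)));
}
```

A timeline of this program should show a staircase, a drop, and a shorter staircase. A leak in a real service looks like the first staircase without the drop.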

Pitfalls and hidden allocations

Memory profilers lie if you feed them the wrong data. Debug builds allocate differently than release builds. The optimizer inlines functions, eliminates temporary vectors, and reuses stack space. Profiling in debug mode gives you a distorted map. Always profile release builds.

Stack allocations are invisible to heap profilers. If you keep data inline on the stack, for example in fixed-size arrays or through small-vector optimizations, dhat and massif will not see it. They only track heap memory. If your profiler shows low usage but your process still consumes RAM, check for large thread stacks, OS-level memory mappings, and memory the allocator has freed internally but not yet returned to the OS.

Circular references leak silently. Rc and Arc use reference counting. If two objects hold Rc pointers to each other, the count never reaches zero. The borrow checker approves the code. The profiler shows live bytes that never drop. Break the cycle with Weak references or redesign the graph structure.
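A minimal sketch of the Weak fix, assuming a hypothetical parent and child Node type: children are owned through strong Rc pointers, and the back-pointer to the parent is a Weak that does not keep the parent alive. The type and helper names are illustrative.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical tree node: strong links down, weak link back up.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Links `child` under `parent` without creating a strong cycle.
fn attach(parent: &Rc<Node>, child: &Rc<Node>) {
    parent.children.borrow_mut().push(Rc::clone(child));
    // A Weak back-pointer does not raise the parent's strong count.
    *child.parent.borrow_mut() = Rc::downgrade(parent);
}

fn main() {
    let root = new_node("root");
    let leaf = new_node("leaf");
    attach(&root, &leaf);

    println!(
        "{}: strong={} weak={}",
        root.name,
        Rc::strong_count(&root),
        Rc::weak_count(&root)
    );
    println!("{}: strong={}", leaf.name, Rc::strong_count(&leaf));

    // The child can still reach its parent while the parent is alive.
    let parent_name = leaf.parent.borrow().upgrade().map(|p| p.name.clone());
    println!("leaf's parent: {:?}", parent_name);
}
```

Had the parent field been a plain Rc<Node>, root and leaf would each hold a strong count on the other, and neither would be freed when the local variables went out of scope.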

Multithreading skews results. Valgrind runs only one thread at a time, and any profiler that records every allocation adds its own synchronization on top of the allocator's. Your thread contention changes. The numbers you see under profiling will not match production throughput. Treat profiler output as a diagnostic map, not a performance benchmark.

Do not chase every allocation. Some allocation always happens during startup, before your code does real work. Standard library initialization, TLS setup, and crate initialization all touch the heap. Focus on allocations that grow over time or persist after you expect them to drop. Ignore the noise. Hunt the signal.

Which profiler fits your situation

Use dhat when you need a quick summary of which call sites allocate the most memory and want a browser-based report with minimal setup. Use valgrind --tool=massif when you need to see how heap usage changes over the lifetime of a long-running process and can tolerate severe runtime slowdown. Use a custom allocator hook when you are building a game engine or embedded system and need fine-grained tracking that integrates with your own memory pools. Reach for cargo bench when you want to measure the impact of allocations on throughput without generating a detailed heap map. Trust the tool that matches your timeline. Summary for fast iteration. Timeline for stubborn leaks.

Where to go next

Start with dhat. It tracks how your Rust program uses heap memory while it runs and points to the call sites that use too much or never release it. Think of it as a fuel gauge for your program's memory, showing you exactly where the consumption is happening. If the gauge alone does not explain a slow climb, run the same binary under Massif and read the timeline.