How to Write Custom Allocators in Rust

When the default heap isn't enough

You're building a real-time audio plugin. The host environment calls your processing function every 2.5 milliseconds. If you allocate memory inside that callback, the system allocator might trigger a page fault, wait on a lock held by another thread, or spend an unpredictable amount of time searching a fragmented heap. The result is audio glitches that ruin the performance. You need to control exactly where memory comes from and when it goes back.

Or you're targeting a microcontroller with 2KB of RAM and no operating system. Rust's default allocator forwards to a heap manager provided by the OS. That manager doesn't exist. You have to bring your own memory management strategy, or any code that uses Box or Vec won't even build.

In these cases, Rust's default allocator gets in the way. You need to step inside the black box and take control.

The allocator contract

Rust's ownership system manages data lifetimes, but it relies on an allocator to provide the raw bytes. The default allocator is a wrapper around the system's malloc and free (or HeapAlloc on Windows). It works well for the vast majority of programs.

When you write a custom allocator, you implement the GlobalAlloc trait. This trait has two required methods, alloc and dealloc, plus two provided methods, realloc and alloc_zeroed, that you can override. You then mark a static instance of your allocator with the #[global_allocator] attribute. This tells the compiler to route every heap request made through the standard library in your binary through your code.

You become the warehouse manager. Every time a Vec grows, a String expands, or a Box is created, your alloc method runs. You decide if the request is valid, where to store the data, and what pointer to return. When the value is dropped, your dealloc method runs. You must reclaim the memory using the exact same rules you used to allocate it. Mismatching allocation and deallocation strategies causes memory corruption, double frees, or silent data loss.

The compiler requires unsafe for this trait. You are taking responsibility for low-level operations that bypass Rust's safety guarantees. If you return a garbage pointer, the behavior is undefined: a crash if you're lucky, silent corruption if you're not. If you forget to free memory, you leak. If you free memory twice, you corrupt the heap. The borrow checker protects the data. Your allocator protects the heap. Both must agree.

Take the keys. The compiler trusts you with the memory.

Understanding Layout

Before writing the allocator, you need to understand Layout. Every allocation request includes a Layout struct that describes the memory requirements.

Layout contains two pieces of information:

Size: The number of bytes needed.
Alignment: The memory address must be a multiple of this value.

Alignment ensures data is placed at addresses that the CPU can read efficiently. Some types, like SIMD vectors or atomic integers, require specific alignment. If you hand back memory with incorrect alignment, the behavior is undefined: the CPU might raise a fault or the program might produce wrong results.

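The alignment must always be a non-zero power of two, and the size, rounded up to that alignment, must not exceed isize::MAX. Layout::from_size_align checks both rules. If you pass an invalid value, the function returns Err(LayoutError) instead of panicking, so you decide how to handle it. This is a safety net that catches bugs early during development.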

Convention aside: When constructing layouts manually, prefer Layout::from_size_align over Layout::from_size_align_unchecked. The unchecked version is an unsafe function that skips validation, and passing it an invalid size or alignment is undefined behavior. Use the checked version unless you have measured a bottleneck and proven the invariants hold.
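
The rules above can be checked with a few calls. This is a minimal sketch using only std::alloc::Layout; the sizes and alignments are arbitrary values picked for illustration.

```rust
use std::alloc::Layout;

fn main() {
    // Layout::new::<T>() derives size and alignment from a type,
    // so it cannot fail.
    let from_type = Layout::new::<u64>();
    println!("u64: size={} align={}", from_type.size(), from_type.align());

    // A valid manual layout: 64 bytes, 16-byte aligned.
    match Layout::from_size_align(64, 16) {
        Ok(layout) => println!("ok: size={} align={}", layout.size(), layout.align()),
        Err(e) => println!("rejected: {e}"),
    }

    // 3 is not a power of two, so this returns Err instead of panicking.
    match Layout::from_size_align(64, 3) {
        Ok(_) => println!("unexpectedly accepted"),
        Err(e) => println!("rejected: {e}"),
    }

    // Layout::array computes the layout for `n` elements and checks for overflow.
    let ten_u32 = Layout::array::<u32>(10).expect("small array layout is valid");
    println!("[u32; 10]: size={}", ten_u32.size());
}
```

Handling the Err branch explicitly is the point of the checked constructor: a bad layout becomes a value you can inspect, not a crash.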

Minimal custom allocator

The safest way to start is to delegate to the system allocator. This proves your hook works without risking memory corruption.

use std::alloc::{GlobalAlloc, Layout, System};

// Define the struct that will hold your allocator logic.
// This struct can be empty if you don't need state.
struct MyAlloc;

// SAFETY:
// 1. System.alloc returns a valid pointer for the given layout, or null on failure.
// 2. System.dealloc correctly frees memory previously allocated by System.alloc.
// 3. Callers guarantee that the layout passed to dealloc matches the one used for alloc.
// 4. We forward null unchanged, so callers still see allocation failure.
unsafe impl GlobalAlloc for MyAlloc {
    // alloc is called whenever `Box::new`, `Vec::push`, or `String::push`
    // needs fresh heap memory. layout describes the size and alignment requirements.
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Delegate to the system allocator.
        // In a real allocator, you would implement your own logic here.
        System.alloc(layout)
    }

    // dealloc is called when a value is dropped and memory needs to be freed.
    // ptr is the pointer returned by alloc. layout must match the original request.
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
    }
}

// This attribute tells the compiler to use MyAlloc as the global allocator
// for the entire binary. There can be only one #[global_allocator].
#[global_allocator]
static GLOBAL: MyAlloc = MyAlloc;

fn main() {
    // Any allocation in this binary now routes through MyAlloc.
    // The Vec calls GlobalAlloc::alloc, which calls MyAlloc::alloc.
    let _v = vec![1, 2, 3];
    println!("Custom allocator active");
    // When _v drops, GlobalAlloc::dealloc runs.
}

When you run this, vec![1, 2, 3] triggers an allocation. The Vec type asks the global allocator for a chunk of memory. The compiler routes that request to MyAlloc::alloc. Your code runs and delegates to System. The system allocator finds space and returns a pointer. The Vec stores the data.

When _v goes out of scope, Vec calls dealloc. Your dealloc method runs. You pass the pointer and layout to System.dealloc. The memory is freed.

Delegation is the safest way to start. Prove your logic works before replacing the system.
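
GlobalAlloc also has two provided methods, realloc and alloc_zeroed. Their default implementations are built on alloc and dealloc, so a delegating wrapper works without them. Forwarding them as well lets the system allocator use its own resize and zeroing paths. The sketch below assumes a wrapper named Forwarding and a helper named grow, both invented for this example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

// Hypothetical wrapper that forwards all four GlobalAlloc methods.
struct Forwarding;

// SAFETY: every method passes its arguments unchanged to System,
// which upholds the GlobalAlloc contract.
unsafe impl GlobalAlloc for Forwarding {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }

    // Without this override, the default calls alloc and then zeroes the block.
    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc_zeroed(layout) }
    }

    // Without this override, the default allocates a new block, copies the
    // old contents, and frees the old block.
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        unsafe { System.realloc(ptr, layout, new_size) }
    }
}

#[global_allocator]
static GLOBAL: Forwarding = Forwarding;

// Hypothetical helper: pushes `n` items so the Vec has to grow repeatedly.
fn grow(n: usize) -> Vec<usize> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i);
    }
    v
}

fn main() {
    let v = grow(1000);
    let zeros = vec![0u8; 4096];
    println!("grew to {} items, zeroed {} bytes", v.len(), zeros.len());
}
```

If you later replace System with your own logic, decide deliberately whether to keep these overrides. A wrong realloc is as dangerous as a wrong alloc.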

A realistic example: counting allocations

A common use case for custom allocators is debugging. You might want to track how many allocations your program makes to detect leaks or excessive churn.

use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

struct CountingAlloc;

// Track total active allocations for debugging.
static ALLOC_COUNT: AtomicUsize = AtomicUsize::new(0);

// SAFETY:
// 1. alloc and dealloc forward their arguments unchanged to System, which
//    upholds the GlobalAlloc contract.
// 2. Callers guarantee that layouts are non-zero-sized and that dealloc
//    receives a pointer from alloc together with the same layout.
// 3. The counter is atomic and never influences which pointer is returned.
unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = System.alloc(layout);

        // Count only successful allocations. A null return means the request
        // failed, so there is nothing for dealloc to release later.
        // Relaxed ordering is sufficient for a simple counter.
        if !ptr.is_null() {
            ALLOC_COUNT.fetch_add(1, Ordering::Relaxed);
        }

        // No special case for layout.size() == 0 is needed: the GlobalAlloc
        // contract makes a zero-sized layout the caller's error.
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Decrement counter.
        // If this underflows, you have a double-free or mismatched dealloc.
        ALLOC_COUNT.fetch_sub(1, Ordering::Relaxed);
        System.dealloc(ptr, layout);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn main() {
    let _a = String::from("hello");
    let _b = vec![1, 2, 3];

    // At least two allocations are active: the String and the Vec.
    // The runtime may have made a few of its own before main started.
    println!("Active allocations: {}", ALLOC_COUNT.load(Ordering::Relaxed));

    // When _a and _b drop, the count falls by two.
}

This allocator wraps the system allocator and tracks the count. The AtomicUsize ensures thread safety if your program uses multiple threads. The Relaxed ordering is enough because we don't need synchronization, just a count.

Zero-sized layouts are a trap for beginners. The GlobalAlloc contract puts the burden on the caller: passing a zero-sized layout to alloc is undefined behavior, so the standard library never does it. Types like Box<()> and an empty Vec hold a dangling, well-aligned pointer and never reach the allocator. A GlobalAlloc implementation therefore needs no special branch for layout.size() == 0. Delegating every request to System is enough.

Track every byte. If the count doesn't match, you have a leak or a double-free waiting to happen.
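
Absolute counts can be hard to read because the runtime may allocate before main runs. One option is to measure the change across a region of code instead. The sketch below assumes a wrapper named Tally and a helper named allocs_during, both invented for this example, and it tracks live bytes alongside the call count.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

struct Tally;

// Successful alloc calls since startup (never decremented).
static ALLOCS: AtomicUsize = AtomicUsize::new(0);
// Bytes currently allocated and not yet freed.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

// SAFETY: all requests are forwarded unchanged to System; the counters
// never affect which pointer is returned.
unsafe impl GlobalAlloc for Tally {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCS.fetch_add(1, Ordering::Relaxed);
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
    // realloc is not overridden: the default goes through alloc and dealloc,
    // so both counters stay consistent when a collection grows.
}

#[global_allocator]
static GLOBAL: Tally = Tally;

// Hypothetical helper: runs `f` and reports how many successful alloc
// calls happened while it ran. Other threads allocating at the same
// time are counted too.
fn allocs_during<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCS.load(Ordering::Relaxed);
    let out = f();
    let after = ALLOCS.load(Ordering::Relaxed);
    (out, after - before)
}

fn main() {
    let (buf, n) = allocs_during(|| Vec::<u64>::with_capacity(16));
    println!("with_capacity(16): {} alloc call(s), capacity {}", n, buf.capacity());
    println!("live bytes now: {}", LIVE_BYTES.load(Ordering::Relaxed));
}
```

Measuring a delta keeps runtime startup noise out of the number. It still sees allocations from other threads, so run the measured region single-threaded when you need an exact figure.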

Pitfalls and errors

Custom allocators introduce new failure modes. The compiler can't check your memory logic at compile time. You have to test rigorously.

You can only define one global allocator per binary. If two crates in your dependency graph both set #[global_allocator], the build fails with an error saying the allocators conflict. Libraries should normally leave the choice to the final binary. If a dependency sets one anyway, check whether it offers a Cargo feature that turns it off.

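If your alloc method returns null, you are reporting failure. A null pointer is the only error signal the trait gives you; you cannot return an error code. What happens next depends on the caller. Infallible APIs such as Box::new and Vec::push call handle_alloc_error, which aborts the process by default. Fallible APIs such as Vec::try_reserve return an Err that the program can handle. Do not panic inside the allocator itself: unwinding out of a global allocator is undefined behavior.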
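
Here is a sketch of how a null return from alloc surfaces through a fallible collection API. The Capped allocator, its 64 MiB per-request LIMIT, and the try_buffer helper are all assumptions made up for this example, not a recommended policy.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::collections::TryReserveError;

// Hypothetical policy: refuse any single request above 64 MiB.
const LIMIT: usize = 64 * 1024 * 1024;

struct Capped;

// SAFETY: requests within the limit are forwarded unchanged to System;
// larger ones report failure by returning null, as the contract allows.
unsafe impl GlobalAlloc for Capped {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if layout.size() > LIMIT {
            // Null is the trait's only way to say "no".
            return std::ptr::null_mut();
        }
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: Capped = Capped;

// Hypothetical helper: asks for `len` bytes up front and returns an error
// instead of aborting when the allocator refuses.
fn try_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(len)?;
    Ok(buf)
}

fn main() {
    match try_buffer(1024) {
        Ok(buf) => println!("small request ok, capacity {}", buf.capacity()),
        Err(e) => println!("small request refused: {e}"),
    }
    match try_buffer(LIMIT + 1) {
        Ok(_) => println!("large request unexpectedly succeeded"),
        Err(e) => println!("large request refused: {e}"),
    }
}
```

The same oversized request made through Vec::with_capacity would not return an error. It would reach handle_alloc_error and end the process.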

Alignment mismatches are silent killers. If you allocate memory with alignment 1 but the caller expects alignment 8, the program might work on x86 but crash on ARM. Or it might produce wrong results without crashing. Always respect the alignment in Layout. If you implement a pool allocator, ensure your pool slots are aligned to the maximum alignment your program needs.
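
Respecting alignment usually comes down to rounding an address up to the next multiple of layout.align(). A minimal sketch of that arithmetic follows; align_up is a hypothetical helper name, and it relies on the alignment being a power of two, which Layout guarantees.

```rust
/// Rounds `addr` up to the next multiple of `align`.
/// `align` must be a power of two. Returns None if the result would overflow.
fn align_up(addr: usize, align: usize) -> Option<usize> {
    debug_assert!(align.is_power_of_two());
    // For a power of two, `align - 1` is a mask of the low bits to clear.
    let mask = align - 1;
    addr.checked_add(mask).map(|a| a & !mask)
}

fn main() {
    println!("{:?}", align_up(13, 8));
    println!("{:?}", align_up(16, 8));
    println!("{:?}", align_up(usize::MAX, 8));
}
```

An address that is already aligned comes back unchanged, and the checked add turns a wraparound at the top of the address space into None instead of a bogus low address.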

Convention aside: Test your allocator with Miri. Miri is an interpreter for Rust that detects undefined behavior. It will catch invalid pointers, alignment errors, and memory leaks that runtime tests might miss. The community considers Miri testing essential for any unsafe code that touches memory.

A broken allocator crashes the whole process. Test with Miri. Test with AddressSanitizer. Test until you're bored.

When to use a custom allocator

Writing a custom allocator is hard. It requires deep understanding of memory management, alignment, and concurrency. Most programs don't need one.

Use the default system allocator when your program is a standard application, web server, or CLI tool. The system allocator is highly optimized and handles fragmentation better than most custom implementations.

Use a custom allocator when you are targeting embedded systems without an OS, where you must manage a fixed memory region manually.
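
A fixed region is often managed with a bump allocator: keep an offset into the region, round it up for each request, and never free individual blocks. The sketch below is one possible shape, not a production design. The names Bump, Arena, and ARENA_SIZE are assumptions; it uses std so it runs on a desktop, where a no_std target would import from core::alloc instead; and it assumes the target supports atomic compare-and-swap, which some small microcontrollers do not. It is called directly here, not installed with #[global_allocator], because a 64 KiB arena is too small for a hosted program.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical arena size; a real target would take this from its memory map.
const ARENA_SIZE: usize = 64 * 1024;

// The fixed region. align(16) only sets the base alignment; alloc below
// still aligns every request individually.
#[repr(align(16))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

struct Bump {
    arena: Arena,
    // Offset of the first free byte in the arena.
    next: AtomicUsize,
}

impl Bump {
    const fn new() -> Self {
        Bump {
            arena: Arena(UnsafeCell::new([0; ARENA_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

// SAFETY: alloc hands out non-overlapping ranges, each claimed atomically.
unsafe impl Sync for Bump {}

// SAFETY: returned blocks lie inside the arena, satisfy the requested
// alignment, and are never handed out twice.
unsafe impl GlobalAlloc for Bump {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.arena.0.get() as *mut u8;
        let mask = layout.align() - 1;
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up, then convert back to an offset.
            let addr = base as usize + cur;
            let start = match addr.checked_add(mask) {
                Some(a) => (a & !mask) - base as usize,
                None => return ptr::null_mut(),
            };
            let end = match start.checked_add(layout.size()) {
                Some(e) if e <= ARENA_SIZE => e,
                _ => return ptr::null_mut(), // arena exhausted
            };
            // Claim [start, end) only if no other thread moved the offset.
            match self.next.compare_exchange_weak(cur, end, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return unsafe { base.add(start) },
                Err(actual) => cur = actual,
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never reclaims individual blocks.
    }
}

static BUMP: Bump = Bump::new();

fn main() {
    let layout = Layout::from_size_align(24, 8).unwrap();
    let p = unsafe { BUMP.alloc(layout) };
    println!("got block: {}, aligned: {}", !p.is_null(), p as usize % layout.align() == 0);

    let too_big = Layout::from_size_align(ARENA_SIZE + 1, 1).unwrap();
    println!("oversized request refused: {}", unsafe { BUMP.alloc(too_big) }.is_null());
}
```

The trade-off is plain: allocation is a few instructions, but memory only comes back when you reset the whole arena. That suits workloads that allocate at startup and then run forever.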

Use a custom allocator when you need to hook into allocation for profiling, leak detection, or security auditing.

Use a crate like bumpalo or mimalloc when you need performance tuning or arena allocation without writing unsafe code yourself.

Use #[global_allocator] sparingly. Reaching for a custom allocator is rarely the first step for performance. Profile first. If allocation is the bottleneck, try jemalloc or mimalloc via a crate before writing your own.

Write your own allocator only when the problem demands it. Otherwise, stand on the shoulders of giants.

Where to go next

A custom allocator is a piece of code that decides where your program's heap memory comes from and where it goes back, whether that is the operating system or a fixed region you manage yourself. You use one when you need to track every memory operation, handle out-of-memory errors differently, or optimize for specific hardware. Think of it as replacing the default bank teller with your own manager who follows your specific rules for handing out cash.