How to avoid deadlocks

Prevent deadlocks in Rust by using Mutex to guard shared data and ensuring consistent lock ordering so threads do not wait indefinitely.

You have two threads updating a shared database. Thread A holds a lock on the user table and waits for the order table. Thread B holds the order table and waits for the user table. Neither moves. The program freezes. This is a deadlock. It's the silent killer of concurrent code. Rust won't stop you from writing deadlocks at compile time, but it gives you tools to make them rare and easy to spot. Deadlocks are logic bugs, not syntax errors. The compiler trusts you to keep the threads honest.

The hallway standoff

A deadlock happens when two or more threads are each waiting for a resource the other holds. Imagine two people trying to enter two different rooms. Alice holds the key to Room 1 and needs the key to Room 2. Bob holds the key to Room 2 and needs the key to Room 1. They both stand in the hallway, clutching their keys, waiting for the other to move. Neither can proceed. The system is stuck forever.

In Rust, resources are usually locks like Mutex. If Thread A locks Mutex A then tries to lock Mutex B, while Thread B locks Mutex B then tries to lock Mutex A, you get the hallway standoff. The threads are alive, but they are blocked waiting for locks that will never be released. Deadlocks are race conditions in the order of operations. They only happen when the timing lines up just wrong.

Minimal deadlock example

This code creates a deadlock by reversing the lock order in two threads. The sleep calls force the overlap so the deadlock happens reliably, but the bug exists even without them.

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    // Arc lets both threads share ownership of the same two mutexes
    let lock_a = Arc::new(Mutex::new(0));
    let lock_b = Arc::new(Mutex::new(0));

    // Thread 1: Grabs A, then B
    let handle1 = {
        let (lock_a, lock_b) = (Arc::clone(&lock_a), Arc::clone(&lock_b));
        thread::spawn(move || {
            let _a = lock_a.lock().unwrap(); // Acquire A and keep holding it
            thread::sleep(Duration::from_millis(10)); // Give Thread 2 time to grab B
            let mut b = lock_b.lock().unwrap(); // Wait for B (held by Thread 2)
            *b += 1;
        })
    };

    // Thread 2: Grabs B, then A
    let handle2 = thread::spawn(move || {
        let _b = lock_b.lock().unwrap(); // Acquire B and keep holding it
        thread::sleep(Duration::from_millis(10)); // Give Thread 1 time to grab A
        let mut a = lock_a.lock().unwrap(); // Wait for A (held by Thread 1)
        *a += 1;
    });

    handle1.join().unwrap();
    handle2.join().unwrap();
}

Run this code and it hangs. Whichever thread the OS schedules first, the sleep gives each one time to take its first lock before either asks for its second. Thread 1 holds lock_a and Thread 2 holds lock_b. Now Thread 1 tries lock_b and blocks. Thread 2 tries lock_a and blocks. Each is waiting for the other to release a lock, and neither ever will. The join calls in main also block, so the program sits there until you kill it. Without the sleep the race still exists. It just shows up less often.

Walkthrough: why the compiler stays silent

The compiler accepts this without complaint. It sees valid lock acquisitions. lock_a.lock() returns a MutexGuard. lock_b.lock() returns a MutexGuard. The types are correct. The borrows are valid. Rust's type system tracks ownership and borrowing within a single thread's control flow. It does not track the runtime order of lock acquisitions across multiple threads, so it cannot prove that Thread 1 and Thread 2 will never acquire locks in opposite orders. You own the ordering invariant.

Realistic fix: global lock ordering

The standard way to prevent deadlocks is to enforce a global order on lock acquisition. If every thread acquires locks in the same order, the hallway standoff becomes impossible. One thread might wait for another, but the thread it waits for is never waiting on it in turn, so no cycle of waiting threads can form.

Consider a bank transfer between two accounts. Each account has a mutex protecting its balance. A naive implementation locks the source account then the destination account. If two transfers happen simultaneously in opposite directions, you get a deadlock.

use std::sync::{Arc, Mutex};
use std::thread;

struct Account {
    balance: Mutex<i64>,
}

fn transfer(from: &Account, to: &Account, amount: i64) {
    // BAD: Order depends on argument order
    let mut from_bal = from.balance.lock().unwrap();
    let mut to_bal = to.balance.lock().unwrap();

    *from_bal -= amount;
    *to_bal += amount;
}

fn main() {
    // Arc lets both threads share the same two accounts
    let acc1 = Arc::new(Account { balance: Mutex::new(100) });
    let acc2 = Arc::new(Account { balance: Mutex::new(100) });

    let t1 = {
        let (acc1, acc2) = (Arc::clone(&acc1), Arc::clone(&acc2));
        thread::spawn(move || {
            transfer(&acc1, &acc2, 10);
        })
    };

    let t2 = thread::spawn(move || {
        transfer(&acc2, &acc1, 10); // Opposite order: deadlock risk
    });

    t1.join().unwrap();
    t2.join().unwrap();
}

The fix is to order locks by a consistent criterion, such as memory address. This ensures that regardless of which account is the source or destination, the locks are always acquired in the same sequence.

use std::sync::{Arc, Mutex};
use std::thread;

struct Account {
    balance: Mutex<i64>,
}

fn transfer_safe(from: &Account, to: &Account, amount: i64) {
    // Raw pointers are only compared here, never dereferenced
    let from_ptr: *const Mutex<i64> = &from.balance;
    let to_ptr: *const Mutex<i64> = &to.balance;

    if from_ptr == to_ptr {
        return; // Same account: nothing to do, and locking twice would never return
    }

    // Avoid deadlock by always locking the lower address first
    let from_is_first = from_ptr < to_ptr;
    let (first, second) = if from_is_first {
        (&from.balance, &to.balance)
    } else {
        (&to.balance, &from.balance)
    };

    let mut first_bal = first.lock().unwrap();
    let mut second_bal = second.lock().unwrap();

    // Apply transfer logic based on which lock is which
    if from_is_first {
        *first_bal -= amount;
        *second_bal += amount;
    } else {
        *second_bal -= amount;
        *first_bal += amount;
    }
}

fn main() {
    // Arc lets both threads share the same two accounts
    let acc1 = Arc::new(Account { balance: Mutex::new(100) });
    let acc2 = Arc::new(Account { balance: Mutex::new(100) });

    let t1 = {
        let (acc1, acc2) = (Arc::clone(&acc1), Arc::clone(&acc2));
        thread::spawn(move || {
            transfer_safe(&acc1, &acc2, 10);
        })
    };

    let t2 = thread::spawn(move || {
        transfer_safe(&acc2, &acc1, 10); // Safe: ordering is deterministic
    });

    t1.join().unwrap();
    t2.join().unwrap();
}

Raw pointers can be compared directly, so comparing the addresses of the two mutexes gives a total order. This works because every live mutex has a unique address, and the borrows keep both mutexes in place for the duration of the call. The logic inside the critical section must check which lock corresponds to which account, since the order might swap. This adds a small amount of complexity but rules out lock-order deadlocks between transfers, provided every code path that takes both locks follows the same rule. Ordering by memory address is a reliable tie-breaker. It turns arbitrary inputs into a deterministic lock sequence.

Scope and drop discipline

Rust's MutexGuard releases the lock when it goes out of scope. This is your primary defense against holding locks longer than necessary. If you hold a lock while doing expensive work or calling other functions, you increase contention and the window for deadlocks. Keep critical sections short. Drop guards explicitly if you need to release a lock before the end of the function.

use std::sync::Mutex;

fn process_data(data: &Mutex<Vec<i32>>) {
    let guard = data.lock().unwrap();
    
    // Do minimal work here
    let snapshot = guard.clone();
    
    // Drop guard explicitly to release lock early
    drop(guard);
    
    // Expensive processing happens without holding the lock
    let result = expensive_computation(&snapshot);
    println!("Result: {}", result);
}

fn expensive_computation(data: &[i32]) -> i32 {
    data.iter().sum()
}

The drop(guard) call releases the lock immediately. Other threads can acquire the mutex while expensive_computation runs. This pattern reduces contention and shrinks the window in which a deadlock can form, because locks are held for shorter periods.

Convention aside: use lock().expect("mutex poisoned") in production code instead of unwrap(). The error message helps diagnose panics. Also, guards held in local variables drop in reverse order of declaration. If you lock A then B, B drops first, then A. This is usually fine, but be aware if your drop logic has side effects. The borrow checker guarantees a guard never outlives its mutex; how long you hold the guard is up to you. Short critical sections are the cheapest performance optimization for locks.
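A block scope gives the same early release without an explicit drop call. The sketch below is illustrative only; sum_snapshot is a hypothetical helper name, not something from the standard library.

```rust
use std::sync::Mutex;

// Hypothetical helper: copy the data out under the lock, then work on the copy.
fn sum_snapshot(data: &Mutex<Vec<i32>>) -> i32 {
    let snapshot = {
        let guard = data.lock().expect("mutex poisoned");
        guard.clone()
        // `guard` goes out of scope at the closing brace, releasing the lock
    };

    // The lock is no longer held while the (potentially slow) work runs
    snapshot.iter().sum()
}

fn main() {
    let data = Mutex::new(vec![1, 2, 3]);
    println!("sum = {}", sum_snapshot(&data));
    // try_lock succeeds because no guard is still alive
    assert!(data.try_lock().is_ok());
}
```

Whether to use a block or drop is mostly a matter of taste. The block makes the critical section visible in the indentation, which can help during review.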

Pitfalls: re-entrant locks and poison

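Rust's Mutex is not re-entrant. If a thread tries to lock a mutex it already holds, the second lock() call never returns. The standard library leaves the exact behavior unspecified: typically the thread deadlocks, waiting for itself to release a lock it can never release, but the call may panic instead.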

use std::sync::Mutex;

fn main() {
    let m = Mutex::new(0);
    let mut g = m.lock().unwrap();
    *g += 1;
    
    // DEADLOCK (or panic): thread holds the lock and tries to acquire it again
    let mut g2 = m.lock().unwrap();
    *g2 += 1;
}

This code never gets past the second lock() call. Depending on the platform it hangs or panics. There is no compile error. The compiler sees two separate lock calls. It doesn't know they are on the same mutex in the same thread. If you need re-entrant locking, use a crate like parking_lot which provides ReentrantMutex, or refactor your code to avoid recursive locking. Refactoring is usually better. Re-entrant locks hide design flaws and make reasoning about state harder.
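One common way to refactor away recursive locking is to lock once in a public method and pass the already-locked data to private helpers. The sketch below assumes a hypothetical Counter type with made-up method names; it shows the shape of the refactor, not a prescribed design.

```rust
use std::sync::Mutex;

// Hypothetical type, for illustration only.
struct Counter {
    value: Mutex<u32>,
}

impl Counter {
    // Public entry point: takes the lock exactly once, then delegates.
    fn add_twice(&self, n: u32) -> u32 {
        let mut guard = self.value.lock().expect("mutex poisoned");
        Self::add_locked(&mut guard, n);
        Self::add_locked(&mut guard, n);
        *guard
    }

    // Helper works on data that is already locked. It never touches the
    // mutex itself, so it cannot re-lock by accident.
    fn add_locked(value: &mut u32, n: u32) {
        *value += n;
    }
}

fn main() {
    let counter = Counter { value: Mutex::new(0) };
    println!("{}", counter.add_twice(5)); // prints 10
}
```

Because add_locked takes &mut u32 rather than &self, the type system makes it impossible to call it without already holding the guard.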

Another pitfall is a poisoned mutex. If a thread panics while holding a lock, the mutex becomes poisoned. Subsequent lock() calls return Err(PoisonError). This prevents other threads from accessing potentially inconsistent data.

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let m = Arc::new(Mutex::new(0));

    let m_worker = Arc::clone(&m);
    let handle = thread::spawn(move || {
        let mut g = m_worker.lock().unwrap();
        *g += 1;
        panic!("Oops"); // Panics while holding lock
    });

    // join() returns Err because the thread panicked.
    // Calling unwrap() here would make main panic too.
    assert!(handle.join().is_err());

    // This returns Err(PoisonError)
    let result = m.lock();
    match result {
        Ok(_) => println!("Got lock"),
        Err(e) => {
            // Recover the guard if you trust the data
            let g = e.into_inner();
            println!("Recovered value: {}", *g);
        }
    }
}

You can recover the guard using into_inner() if you decide the data is still valid. This is a design decision. A poisoned mutex is a red flag. Your data might be inconsistent. Decide early whether to crash or recover.

Decision: when to use what

Use global lock ordering when you have multiple mutexes and can define a strict hierarchy. Assign an ID to each lock and always acquire them in ascending ID order.
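As a rough sketch of ID-based ordering, the example below gives each account an id field and always locks the lower ID first. The id field and the transfer_by_id name are assumptions made for illustration; the approach only holds if IDs are unique and never change.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical account type: `id` is assumed unique and immutable.
struct Account {
    id: u32,
    balance: Mutex<i64>,
}

fn transfer_by_id(from: &Account, to: &Account, amount: i64) {
    if from.id == to.id {
        return; // same account: locking it twice would never return
    }

    // Always acquire the lock with the lower id first
    let (mut from_bal, mut to_bal) = if from.id < to.id {
        let f = from.balance.lock().expect("mutex poisoned");
        let t = to.balance.lock().expect("mutex poisoned");
        (f, t)
    } else {
        let t = to.balance.lock().expect("mutex poisoned");
        let f = from.balance.lock().expect("mutex poisoned");
        (f, t)
    };

    *from_bal -= amount;
    *to_bal += amount;
}

fn main() {
    let a = Arc::new(Account { id: 1, balance: Mutex::new(100) });
    let b = Arc::new(Account { id: 2, balance: Mutex::new(100) });

    let t1 = {
        let (a, b) = (Arc::clone(&a), Arc::clone(&b));
        thread::spawn(move || {
            for _ in 0..1000 {
                transfer_by_id(&a, &b, 1);
            }
        })
    };
    let t2 = {
        let (a, b) = (Arc::clone(&a), Arc::clone(&b));
        thread::spawn(move || {
            for _ in 0..1000 {
                transfer_by_id(&b, &a, 1); // opposite direction, same lock order
            }
        })
    };

    t1.join().unwrap();
    t2.join().unwrap();
    println!("{} {}", a.balance.lock().unwrap(), b.balance.lock().unwrap());
}
```

Compared with ordering by address, an explicit ID stays stable if the value is moved and is easier to log when debugging.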

Use a single coarse-grained mutex when the critical section is small and contention is low. One lock eliminates ordering issues entirely.

Use try_lock with backoff when you need non-blocking behavior or want to detect potential deadlocks at runtime. This adds complexity and doesn't guarantee progress, so reserve it for specific recovery scenarios.

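Use channels instead of shared state when threads need to exchange data rather than mutate a shared structure. Channels remove the need for locks, and with them lock-ordering deadlocks. Threads can still stall if each one blocks waiting to receive from the other, so keep the message flow one-directional where you can.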

If you can replace a mutex with a channel, do it. Shared mutable state is the root of concurrency bugs; messages are the cure.
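To make the channel alternative concrete, here is a small sketch using std::sync::mpsc. Worker threads send increments and a single owner adds them up, so no mutex is involved. The sum_via_channel function and its parameters are hypothetical names chosen for this example.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: workers send increments, one owner applies them.
fn sum_via_channel(workers: u32, per_worker: u32) -> u32 {
    let (tx, rx) = mpsc::channel::<u32>();

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let tx = tx.clone();
            thread::spawn(move || {
                for _ in 0..per_worker {
                    tx.send(1).expect("receiver dropped");
                }
                // this worker's sender is dropped when the closure ends
            })
        })
        .collect();

    // Drop the original sender so the receive loop ends once all workers finish
    drop(tx);

    // Only this thread ever touches the running total
    let total: u32 = rx.iter().sum();

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    total
}

fn main() {
    println!("total = {}", sum_via_channel(4, 1000));
}
```

The drop(tx) line matters: rx.iter() only finishes when every sender is gone, so forgetting it would leave the receiver waiting forever.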

Where to go next