How to Implement the Repository Pattern in Rust

Implement the Repository Pattern in Rust by defining a trait for the data operations, implementing that trait once per storage backend, and making the business logic generic over the trait.

The seam between logic and storage

You are building a service to manage user accounts. You start with a Vec<User> in memory because it is fast and requires no setup. Your functions take &mut Vec<User> and do their job. Then the product manager asks for persistence. You add a database. Suddenly, every function signature changes. You replace &mut Vec<User> with &mut SqlxPool. You refactor ten files. The compiler screams about mismatched types. You realize you have glued your business logic to your storage details. Changing storage now means touching every line of logic.

You need a seam. You need a way to say, "I don't care where this data lives. I just need to get and save it." In Rust, you create that seam with a trait. The trait defines the operations. The struct implements them. Your business logic depends on the trait. You swap the struct without touching the logic.

The contract, not the container

Rust does not have abstract base classes or inheritance hierarchies. It has traits. A trait is a contract. It lists the methods a type must provide. It says nothing about how those methods work.

Think of a trait like a power outlet. Your laptop charger plugs into the outlet. The laptop does not care if the outlet is connected to a wall socket, a battery backup, or a solar inverter. The laptop cares that the outlet provides 120V AC. The outlet is the trait. The wall socket is one implementation. The battery backup is another. The laptop code stays the same regardless of the source.

The repository pattern in Rust follows this shape. You define a trait with methods like get and save. You implement that trait for an in-memory vector. You implement it again for a database client. Your service struct takes the trait. It calls get and save. It never sees the vector or the database.

Define the trait. Implement the trait. Swap the impl. The rest of the code stays asleep.

Minimal implementation

Start with the trait. Make it generic over the item type so the repository can store anything. Return owned values to keep lifetimes simple.

/// A generic contract for storing and retrieving items by ID.
/// 
/// The trait is generic over T, allowing any type to be stored.
/// Methods return owned values to avoid lifetime complexity.
trait Repository<T> {
    fn get(&self, id: i32) -> Option<T>;
    fn save(&mut self, item: T);
}

/// A simple in-memory storage backed by a vector.
/// 
/// This implementation is useful for tests or prototypes.
struct InMemoryRepo<T> {
    items: Vec<(i32, T)>,
}

impl<T: Clone> Repository<T> for InMemoryRepo<T> {
    // Clone bound is required because get returns an owned T.
    // We must clone the item out of the vector.
    fn get(&self, id: i32) -> Option<T> {
        // Search the vector for the matching ID.
        // Map extracts the item and clones it for the caller.
        self.items.iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, item)| item.clone())
    }

    fn save(&mut self, item: T) {
        // Generate a simple ID based on current size.
        // In production, use the database's auto-increment or a UUID.
        let id = self.items.len() as i32;
        // Push the new item into storage.
        self.items.push((id, item));
    }
}

The T: Clone bound on the impl is an implementation detail. The trait does not require Clone. This particular implementation does. This keeps the trait flexible. A database implementation might return owned values without cloning, perhaps by deserializing fresh data. The trait allows that. The in-memory impl requires cloning because it keeps the data and can only hand out a copy through &self.
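To see the pieces working together, here is a minimal round trip through the in-memory repository. The trait and impl are repeated so the sketch compiles on its own; the new() constructor and the round_trip function are hypothetical helpers added for illustration, not part of the listing above.

```rust
/// Contract from the listing above, repeated so this sketch is self-contained.
trait Repository<T> {
    fn get(&self, id: i32) -> Option<T>;
    fn save(&mut self, item: T);
}

struct InMemoryRepo<T> {
    items: Vec<(i32, T)>,
}

impl<T> InMemoryRepo<T> {
    /// Hypothetical constructor; the original listing does not define one.
    fn new() -> Self {
        Self { items: Vec::new() }
    }
}

impl<T: Clone> Repository<T> for InMemoryRepo<T> {
    fn get(&self, id: i32) -> Option<T> {
        self.items
            .iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, item)| item.clone())
    }

    fn save(&mut self, item: T) {
        // IDs follow insertion order: the first item gets 0, the next 1, and so on.
        let id = self.items.len() as i32;
        self.items.push((id, item));
    }
}

/// Saves two names, then reads back IDs 0, 1, and an ID that was never stored.
fn round_trip() -> (Option<String>, Option<String>, Option<String>) {
    let mut repo: InMemoryRepo<String> = InMemoryRepo::new();
    repo.save(String::from("ada")); // stored under ID 0
    repo.save(String::from("grace")); // stored under ID 1
    (repo.get(0), repo.get(1), repo.get(2))
}
```

Because save does not return the ID it assigned, the caller has to know the numbering scheme. A production trait would probably return the ID from save.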

How the compiler connects the dots

When you write a function that uses the repository, you can make it generic.

/// A service that depends on a repository trait.
/// 
/// The service is generic over R, which must implement Repository<T>.
/// This allows the service to work with any repository type.
fn find_item<R: Repository<T>, T>(repo: &R, id: i32) -> Option<T> {
    repo.get(id)
}

The compiler generates a separate copy of find_item for every concrete type you pass. If you call it with InMemoryRepo<User>, you get one version. If you call it with SqlxRepo<User>, you get another. This is monomorphization. There is no virtual dispatch: the trait call becomes a direct function call, which the optimizer is then free to inline.

You can also use impl Trait syntax for cleaner signatures.

/// Same as above, but with impl Trait syntax.
/// 
/// This is the preferred style for function arguments.
/// It reads as "takes anything that implements Repository<T>".
fn find_item<T>(repo: &impl Repository<T>, id: i32) -> Option<T> {
    repo.get(id)
}

The behavior is identical. The compiler still monomorphizes. The signature is shorter. Use impl Trait in function arguments. Use generic parameters when you need to name the type multiple times, such as in a struct definition.

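Static dispatch has no runtime cost, though every monomorphized copy adds compile time and binary size. Dynamic dispatch costs a pointer indirection per call and usually prevents inlining. Pick static unless you have a reason not to.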
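The difference between the two dispatch styles can be sketched side by side. VecRepo and SingleSlotRepo are hypothetical backends invented for this example; the point is only that &impl Repository<String> is resolved at compile time, while &dyn Repository<String> lets two different repository types sit in one collection.

```rust
trait Repository<T> {
    fn get(&self, id: i32) -> Option<T>;
    fn save(&mut self, item: T);
}

/// Hypothetical backend: stores every item in a vector.
struct VecRepo {
    items: Vec<(i32, String)>,
}

/// Hypothetical backend: remembers only the most recent item, under ID 0.
struct SingleSlotRepo {
    slot: Option<String>,
}

impl Repository<String> for VecRepo {
    fn get(&self, id: i32) -> Option<String> {
        self.items
            .iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, item)| item.clone())
    }

    fn save(&mut self, item: String) {
        let id = self.items.len() as i32;
        self.items.push((id, item));
    }
}

impl Repository<String> for SingleSlotRepo {
    fn get(&self, id: i32) -> Option<String> {
        if id == 0 { self.slot.clone() } else { None }
    }

    fn save(&mut self, item: String) {
        self.slot = Some(item);
    }
}

/// Static dispatch: the compiler emits one copy per concrete repository type.
fn find_static(repo: &impl Repository<String>, id: i32) -> Option<String> {
    repo.get(id)
}

/// Dynamic dispatch: one compiled copy; the call goes through a vtable.
fn find_dynamic(repo: &dyn Repository<String>, id: i32) -> Option<String> {
    repo.get(id)
}

/// Looks up ID 0 statically in one repo, then dynamically in a mixed collection.
fn lookup_all() -> (Option<String>, Vec<Option<String>>) {
    let mut vec_repo = VecRepo { items: Vec::new() };
    vec_repo.save(String::from("ada"));
    let mut slot_repo = SingleSlotRepo { slot: None };
    slot_repo.save(String::from("grace"));

    let via_static = find_static(&vec_repo, 0);

    // Two different concrete types behind the same trait object type.
    let repos: Vec<Box<dyn Repository<String>>> =
        vec![Box::new(vec_repo), Box::new(slot_repo)];
    let via_dynamic = repos.iter().map(|r| find_dynamic(r.as_ref(), 0)).collect();

    (via_static, via_dynamic)
}
```

The mixed Vec is something generics alone cannot express, which is the usual reason to accept the indirection.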

Real-world shape: Services and Async

In a real application, the repository lives inside a service. The service is injected with the repository. This is dependency injection. Rust handles it with generics.

/// A service that manages users using a repository.
/// 
/// The service holds the repository and exposes business logic.
/// It is generic over the repository type.
struct UserService<R: Repository<User>> {
    repo: R,
}

impl<R: Repository<User>> UserService<R> {
    /// Creates a new service with the given repository.
    fn new(repo: R) -> Self {
        Self { repo }
    }

    /// Finds a user by ID.
    /// 
    /// Delegates to the repository.
    /// The service does not know how the repository works.
    fn find_user(&self, id: i32) -> Option<User> {
        self.repo.get(id)
    }

    /// Creates a new user and persists it.
    fn create_user(&mut self, user: User) {
        self.repo.save(user);
    }
}

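This compiles and works, given a User type in scope. But modern Rust applications are often async. Database drivers are usually async. If your repository talks to one, its methods need to be async too.

Rust 1.75 stabilized async fn in traits. You no longer need the #[async_trait] macro for simple cases. You can write async methods directly. The main limitation is that a trait with async fn methods cannot be used as dyn Trait, so the macro (or a hand-written boxed future) is still needed for dynamic dispatch.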

/// An async repository trait.
/// 
/// Methods are async to allow non-blocking I/O.
/// T is bounded by Send + Sync so stored items can move between threads.
trait AsyncRepository<T: Send + Sync> {
    async fn get(&self, id: i32) -> Option<T>;
    async fn save(&mut self, item: T);
}

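The Send + Sync bounds matter here. Multi-threaded runtimes move futures between worker threads, so a future handed to a spawn function such as tokio::spawn must be Send, and a future that holds a non-Send T across an .await is not. The compiler rejects such code with a trait-bound error, reported as E0277 or as "future cannot be sent between threads safely". Bounding T is only half of the fix, though. An async fn in a trait makes no promise that the returned future is Send, so generic code that spawns repo.get(id) can still fail to compile. To make that promise, declare the method as fn get(&self, id: i32) -> impl Future<Output = Option<T>> + Send. Implementations can still write it as async fn.

Convention aside: Many repository traits in async code put Send + Sync on the item type. It saves pain later. If you forget it, the error does not appear at the trait definition. It appears at the call site that spawns the future, often far from the code that caused it. Add the bounds upfront.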
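One way to make the Send requirement checkable at the trait definition is to declare the methods as returning impl Future + Send instead of using async fn in the trait. The sketch below assumes that approach. InMemoryAsyncRepo, assert_send, and the tiny block_on executor are hypothetical helpers written for this example; block_on only busy-polls and stands in for a real runtime, which this sketch avoids so that it needs nothing outside the standard library.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Async contract whose futures are promised to be Send.
trait AsyncRepository<T: Send + Sync> {
    fn get(&self, id: i32) -> impl Future<Output = Option<T>> + Send;
    fn save(&mut self, item: T) -> impl Future<Output = ()> + Send;
}

/// Hypothetical in-memory backend for the async trait.
struct InMemoryAsyncRepo<T> {
    items: Vec<(i32, T)>,
}

impl<T: Clone + Send + Sync> AsyncRepository<T> for InMemoryAsyncRepo<T> {
    // An impl may use `async fn` even though the trait spells out `impl Future`.
    async fn get(&self, id: i32) -> Option<T> {
        self.items
            .iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, item)| item.clone())
    }

    async fn save(&mut self, item: T) {
        let id = self.items.len() as i32;
        self.items.push((id, item));
    }
}

/// Waker that does nothing; enough for futures that never actually wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (assumption: not production quality). It polls in a loop.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

/// Compiles only if the future is Send, which is what a spawn function demands.
fn assert_send<F: Future + Send>(fut: F) -> F {
    fut
}

/// Generic over the repository: the Send guarantee comes from the trait itself.
fn save_then_get<R: AsyncRepository<String>>(repo: &mut R) -> Option<String> {
    block_on(assert_send(repo.save(String::from("ada"))));
    block_on(assert_send(repo.get(0)))
}

fn demo() -> Option<String> {
    let mut repo = InMemoryAsyncRepo { items: Vec::new() };
    save_then_get(&mut repo)
}
```

If the trait used plain async fn instead, the two assert_send calls inside the generic save_then_get would be rejected, because nothing in the trait would say the futures are Send.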

Pitfalls and compiler friction

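Lifetimes are the first trap. If you change get to return a reference, Option<&T>, the returned borrow is tied to the repository.

// Lifetime elision ties the returned reference to &self.
// fn get(&self, id: i32) -> Option<&T>;

You do not have to write the lifetime yourself. Elision infers it from &self. The consequence is that the repository stays borrowed for as long as the caller holds the reference, so nothing can call save in the meantime. This works for in-memory storage. It breaks for a database-backed repository, which has no long-lived T to point at: each row is deserialized into a fresh value that has to be returned by ownership. It also breaks behind a Mutex, because the reference cannot outlive the lock guard. An async fn may return a reference borrowed from its arguments, but the same constraints apply, and the future itself keeps the repository borrowed until it completes.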

If you hit lifetime errors, switch to owned returns. Return Option<T>. Clone the data. The clone cost is usually negligible compared to the database round-trip. If you are doing in-memory lookups on large structs, consider returning references. Accept the lifetime complexity. Measure the impact.
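For the in-memory case, a reference-returning lookup can be sketched as follows. BorrowingRepo and get_ref are hypothetical names used only here; the example shows that no explicit lifetime is needed and that the repository stays borrowed while the reference is alive.

```rust
/// Hypothetical in-memory store that hands out references instead of clones.
struct BorrowingRepo<T> {
    items: Vec<(i32, T)>,
}

impl<T> BorrowingRepo<T> {
    fn new() -> Self {
        Self { items: Vec::new() }
    }

    /// The elided lifetime ties the returned reference to `&self`.
    /// No `T: Clone` bound is needed because nothing is copied.
    fn get_ref(&self, id: i32) -> Option<&T> {
        self.items
            .iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, item)| item)
    }

    fn save(&mut self, item: T) {
        let id = self.items.len() as i32;
        self.items.push((id, item));
    }
}

/// Reads the length of a large stored buffer without cloning it.
fn borrowed_len() -> usize {
    let mut repo: BorrowingRepo<Vec<u8>> = BorrowingRepo::new();
    repo.save(vec![0u8; 1024]);

    let len = {
        let blob: &Vec<u8> = repo.get_ref(0).expect("ID 0 was just saved");
        // repo.save(vec![]); // would not compile here: `repo` is still borrowed by `blob`
        blob.len()
    };

    // The borrow ended with the block above, so mutation is allowed again.
    repo.save(vec![1, 2, 3]);
    len + repo.get_ref(1).map_or(0, |b| b.len())
}
```

Uncommenting the inner save produces a borrow-checker error, which is the cost the text above refers to as lifetime complexity.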

Another pitfall is mutation. The save method takes &mut self. This means the repository cannot be shared while saving. In a concurrent application, you often want multiple tasks to save simultaneously.

The solution is interior mutability. Wrap the storage in a Mutex or RwLock. Change save to take &self.

struct ThreadSafeRepo<T> {
    // Mutex allows mutation behind a shared reference.
    items: std::sync::Mutex<Vec<(i32, T)>>,
}

Now save can take &self. You lock the mutex inside the method. The repository becomes Sync as long as T is Send, so it can be wrapped in an Arc and shared across threads. This is the standard pattern for shared state in Rust.
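A fuller sketch of that pattern, assuming a variant of the trait whose save takes &self and returns the assigned ID. SharedRepository and concurrent_saves are hypothetical names for this example, and returning the ID is an addition the original trait does not have.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Variant of the repository contract where `save` works through `&self`.
/// Returning the ID is an assumption made for this sketch.
trait SharedRepository<T> {
    fn get(&self, id: i32) -> Option<T>;
    fn save(&self, item: T) -> i32;
}

struct ThreadSafeRepo<T> {
    // Mutex allows mutation behind a shared reference.
    items: Mutex<Vec<(i32, T)>>,
}

impl<T: Clone> SharedRepository<T> for ThreadSafeRepo<T> {
    fn get(&self, id: i32) -> Option<T> {
        // unwrap panics if another thread panicked while holding the lock.
        let items = self.items.lock().unwrap();
        items
            .iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, item)| item.clone())
    }

    fn save(&self, item: T) -> i32 {
        // The ID is computed and the item pushed under the same lock,
        // so two threads can never be handed the same ID.
        let mut items = self.items.lock().unwrap();
        let id = items.len() as i32;
        items.push((id, item));
        id
    }
}

/// Spawns four threads that each save one item through a shared Arc.
/// Returns the sorted IDs and how many of them can be read back.
fn concurrent_saves() -> (Vec<i32>, usize) {
    let repo: Arc<ThreadSafeRepo<String>> = Arc::new(ThreadSafeRepo {
        items: Mutex::new(Vec::new()),
    });

    let handles: Vec<_> = (0..4)
        .map(|n| {
            let repo = Arc::clone(&repo);
            thread::spawn(move || repo.save(format!("user-{n}")))
        })
        .collect();

    let mut ids: Vec<i32> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    ids.sort();

    let found = ids.iter().filter(|id| repo.get(**id).is_some()).count();
    (ids, found)
}
```

Which thread receives which ID depends on scheduling, so the sketch only relies on the set of IDs, not on their assignment. An RwLock would fit better if reads heavily outnumber writes.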

Testing is where the pattern shines. Because the repository is a trait, you can implement a mock for tests.

/// A mock repository for testing.
/// 
/// Implements the same trait as the real repository.
/// Returns controlled data for deterministic tests.
struct MockRepo {
    data: Vec<(i32, User)>,
}

impl Repository<User> for MockRepo {
    fn get(&self, id: i32) -> Option<User> {
        self.data.iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, user)| user.clone())
    }

    fn save(&mut self, user: User) {
        let id = self.data.len() as i32;
        self.data.push((id, user));
    }
}

Pass MockRepo to UserService in tests. Verify the service logic without touching a database. The trait makes mocking trivial. No reflection. No framework magic. Just another impl.
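Put together, a test of the service against the mock might look like the sketch below. The User struct is hypothetical (the article never defines it), and service_round_trip is an illustrative helper. In a real project the body would sit in a #[test] function inside a #[cfg(test)] module.

```rust
/// Hypothetical user type; the derives let it be cloned and compared in tests.
#[derive(Clone, Debug, PartialEq)]
struct User {
    name: String,
}

trait Repository<T> {
    fn get(&self, id: i32) -> Option<T>;
    fn save(&mut self, item: T);
}

struct UserService<R: Repository<User>> {
    repo: R,
}

impl<R: Repository<User>> UserService<R> {
    fn new(repo: R) -> Self {
        Self { repo }
    }

    fn find_user(&self, id: i32) -> Option<User> {
        self.repo.get(id)
    }

    fn create_user(&mut self, user: User) {
        self.repo.save(user);
    }
}

/// Mock backend holding controlled data, as in the listing above.
struct MockRepo {
    data: Vec<(i32, User)>,
}

impl Repository<User> for MockRepo {
    fn get(&self, id: i32) -> Option<User> {
        self.data
            .iter()
            .find(|(item_id, _)| *item_id == id)
            .map(|(_, user)| user.clone())
    }

    fn save(&mut self, user: User) {
        let id = self.data.len() as i32;
        self.data.push((id, user));
    }
}

/// Seeds the mock with one user, creates another through the service,
/// then looks up the new user and an ID that does not exist.
fn service_round_trip() -> (Option<User>, Option<User>) {
    let seeded = MockRepo {
        data: vec![(0, User { name: String::from("ada") })],
    };
    let mut service = UserService::new(seeded);
    service.create_user(User { name: String::from("grace") }); // becomes ID 1
    (service.find_user(1), service.find_user(99))
}
```

The service code is unchanged from the production version. Only the type passed to UserService::new differs.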

Trust the borrow checker. It usually has a point. If the compiler rejects your repository signature, check the lifetimes and the mutation strategy. Return owned values when you can. Use interior mutability when you need sharing.

When to reach for this pattern

Use a trait-based repository when you need to swap storage backends without touching business logic. Use impl Trait for repository parameters when you want static dispatch through monomorphization. Use Box<dyn Repository<T>> when you need to store multiple repository types in the same collection or when the concrete type is chosen at runtime. Use owned returns (Option<T>) when the data is small or cloning is cheap. Use reference returns (Option<&T>) when the data is large, lives in memory, and you want to avoid the clone, accepting that the repository stays borrowed while the reference is alive. Use Send + Sync bounds when your repository will be used across threads or in multi-threaded async runtimes. Use interior mutability (Mutex, RwLock) when the repository must be shared and mutated concurrently.

Where to go next