How to Write an Async Runtime in Rust

Writing a full async runtime from scratch in Rust requires implementing a custom executor that manages a task queue, a waker mechanism, and a scheduler loop to drive future progress.

The invisible hand behind await

You write an async fn handle_request for a web server. You call .await on a database query. The function pauses. The CPU switches to another request. When the database responds, your function resumes exactly where it left off.

Who pauses it? Who switches the CPU? Who resumes it?

That invisible hand is the runtime. The async/await syntax is just sugar for a state machine. The runtime is the engine that drives those state machines forward. Building one from scratch strips away the abstractions and reveals how Rust's concurrency model actually works.

Cooperative multitasking and the potluck analogy

Rust's async model uses cooperative multitasking. A task runs until it explicitly yields control at an .await point; the executor cannot preempt it. On a single-threaded executor this makes shared state easier to reason about, because other tasks can only run at those yield points, but it puts the burden on each task to yield often enough.

Think of a potluck dinner where everyone shares a single stove. You are cooking your dish. You chop vegetables and stir the pot. That's CPU work. Then you put something in the oven. The oven takes ten minutes. You cannot stand there staring at the oven door; that blocks the stove for everyone else. You step aside and let the next person cook. You only return when the oven timer dings.

The runtime is the person managing the stove. It keeps a list of cooks. When a cook steps aside, the runtime picks the next one. When the timer dings, the runtime signals the waiting cook to come back.

In Rust terms:

  • The Future is the cook. It's a state machine that represents work in progress.
  • Polling is asking the cook, "Are you done yet?"
  • Pending means the cook is waiting for the oven. The runtime moves on.
  • Ready means the dish is finished. The runtime removes the cook from the list.
  • The Waker is the oven timer. It's a callback that tells the runtime to put the cook back in line.
  • The Context holds the waker. You pass it to the future so the future can register its timer.
  • Pin ensures the cook doesn't move while waiting. Some dishes have ingredients that point to other ingredients inside the same pot. Moving the pot would break those pointers. Pinning locks the pot in place.
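
The vocabulary above can be exercised by hand, without any runtime. The following is a minimal sketch, not a real executor: Countdown, NoopWake, and poll_to_completion are hypothetical names invented for this illustration, and the waker is built with the standard library's std::task::Wake trait so that no unsafe code is needed.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical cook: reports Pending a fixed number of times, then finishes.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("dish is finished");
        }
        self.remaining -= 1;
        // Set the oven timer: ask to be polled again.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// A waker that does nothing. That is enough for a loop that polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` until it is Ready.
/// Returns the output and the number of Pending results seen on the way.
fn poll_to_completion<F: Future + Unpin>(mut future: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    // The Context carries the waker into poll.
    let mut cx = Context::from_waker(&waker);
    let mut pending = 0;
    loop {
        // Countdown is Unpin, so Pin::new is allowed here.
        match Pin::new(&mut future).poll(&mut cx) {
            Poll::Ready(output) => return (output, pending),
            Poll::Pending => pending += 1,
        }
    }
}

fn main() {
    let (output, pending) = poll_to_completion(Countdown { remaining: 2 });
    // prints "dish is finished after 2 Pending results"
    println!("{output} after {pending} Pending results");
}
```

The loop ignores the waker and polls again immediately, which is exactly the busy-waiting a real runtime tries to avoid. It is used here only to make each Pending and Ready result visible.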

Minimal single-threaded runtime

A production runtime handles I/O, timers, multi-threading, and complex scheduling. A minimal runtime needs three things: a task queue, a waker implementation, and a loop that polls tasks.

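The hardest part is the waker. A Waker must be cloneable, and the standard library declares it Send and Sync, so every waker is assumed to be thread-safe. Building one by hand means defining a RawWakerVTable of function pointers, and this is where the unsafe surface lives. The example below backs its waker with Rc to stay single-threaded. That is only sound because no waker ever leaves the thread, so treat it as a teaching shortcut; code that shares wakers across threads would use Arc with the std::task::Wake trait instead.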

use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// A task wraps a future so the run queue and the wakers can share it.
/// The Rc lets each waker keep the task alive while it waits.
struct Task {
    /// The future being polled. Pinned because it may contain self-referential data.
    /// The RefCell provides the mutable access that poll needs through a shared Rc.
    future: RefCell<Pin<Box<dyn Future<Output = ()> + 'static>>>,
}

/// The runtime holds the queue of unfinished tasks.
/// Only the scheduler loop touches the queue, so it needs no interior mutability.
struct Runtime {
    /// Tasks waiting to be polled.
    tasks: VecDeque<Rc<Task>>,
}

impl Runtime {
    fn new() -> Self {
        Runtime {
            tasks: VecDeque::new(),
        }
    }

    /// Spawn a future onto the runtime.
    fn spawn<F>(&mut self, future: F)
    where
        F: Future<Output = ()> + 'static,
    {
        // Pin the future on the heap and erase its concrete type.
        let future: Pin<Box<dyn Future<Output = ()>>> = Box::pin(future);
        // Wrap it in a Task and an Rc for shared ownership.
        // The Rc allows the waker to hold a reference to the task.
        let task = Rc::new(Task {
            future: RefCell::new(future),
        });
        self.tasks.push_back(task);
    }

    /// Block the current thread and run tasks until all are complete.
    fn block_on(&mut self) {
        // Loop until no tasks remain.
        while let Some(task) = self.tasks.pop_front() {
            // Create a waker for this task.
            // The waker owns its own clone of the Rc, which keeps the task alive.
            let waker = create_waker(Rc::clone(&task));

            // Build a context containing the waker.
            // The future uses this to register wake notifications.
            let mut cx = Context::from_waker(&waker);

            // Poll the future.
            // No unsafe is needed: the future was pinned by Box::pin in spawn,
            // and as_mut on a Pin<Box<_>> hands out a Pin<&mut _> safely.
            let poll_result = task.future.borrow_mut().as_mut().poll(&mut cx);

            match poll_result {
                Poll::Ready(()) => {
                    // Task finished. Its future is freed once the last Rc
                    // (this one or a waker's clone) is dropped.
                }
                Poll::Pending => {
                    // Task needs more work.
                    // The waker in this example does not re-queue anything,
                    // so the scheduler re-queues every Pending task itself.
                    // That is busy-polling; a real runtime would wait for wake().
                    self.tasks.push_back(task);
                }
            }
        }
    }
}

/// Create a custom waker that pushes the task back to the runtime queue.
/// This requires implementing a RawWakerVTable, which defines the behavior
/// of clone, drop, and wake operations.
fn create_waker(task: Rc<Task>) -> Waker {
    // Rc::into_raw hands ownership of one strong reference to the raw pointer.
    // The waker owns that reference until wake or drop releases it.
    let raw = Rc::into_raw(task);

    // SAFETY: every pointer the vtable functions receive comes from
    // Rc::into_raw on an Rc<Task>, and each function keeps the strong count
    // balanced. Waker is Send + Sync but Rc is not, so this is only sound
    // while the waker stays on the thread that created it.
    let raw_waker = RawWaker::new(raw as *const (), &VTABLE);
    unsafe { Waker::from_raw(raw_waker) }
}

/// The vtable for our custom waker.
/// Each function receives a raw pointer to the Rc<Task>.
static VTABLE: RawWakerVTable = RawWakerVTable::new(
    clone_waker,
    wake_waker,
    wake_by_ref_waker,
    drop_waker,
);

/// Clone the waker by adding one strong reference to the underlying Rc<Task>.
/// The new waker owns that reference, so the task stays alive.
fn clone_waker(ptr: *const ()) -> RawWaker {
    // SAFETY: ptr came from Rc::into_raw on an Rc<Task>, and the waker being
    // cloned is still alive, so the strong count is at least one.
    // Incrementing it gives the clone its own reference without touching
    // the one the original waker owns.
    unsafe { Rc::increment_strong_count(ptr as *const Task) };
    RawWaker::new(ptr, &VTABLE)
}

/// Wake by value. This call consumes the waker, so it must release the
/// strong reference the waker owns.
/// A wake-driven runtime would push the task onto a ready queue here.
fn wake_waker(ptr: *const ()) {
    // SAFETY: ptr came from Rc::into_raw, and wake consumes the waker,
    // so we take back ownership of exactly one strong reference.
    let task = unsafe { Rc::from_raw(ptr as *const Task) };
    // This minimal example has no ready queue: the scheduler loop re-queues
    // every Pending task by itself. All that is left to do is release
    // the waker's reference.
    drop(task);
}

/// Wake by reference. The waker stays alive after this call, so its
/// reference must not be released here.
fn wake_by_ref_waker(_ptr: *const ()) {
    // Nothing to do: there is no ready queue to push to, and the waker
    // still owns its strong reference. Reusing wake_waker here would
    // decrement the count a second time and free the task while in use.
}

/// Drop the waker by dropping the underlying Rc<Task>.
/// This decrements the reference count.
fn drop_waker(ptr: *const ()) {
    // SAFETY: ptr is a valid Rc<Task>. We reconstruct the Rc to drop it.
    let _task = unsafe { Rc::from_raw(ptr as *const Task) };
    // The Rc drops here, decrementing the count.
    // If this was the last waker, the Task is freed.
}

fn main() {
    let mut rt = Runtime::new();

    rt.spawn(async {
        println!("Task started");
        // Simulate work. In a real runtime, this would yield to I/O.
        println!("Task finished");
    });

    rt.block_on();
}

Convention aside: You will see Rc::clone(&task) and task.clone() used interchangeably in Rust code. Both compile and do the same thing. The community prefers the explicit Rc::clone form because task.clone() looks like a deep clone to developers coming from other languages. The explicit form signals that you are cloning the reference, not the data.

How the loop drives progress

The block_on method is the heart of the runtime. It pops a task, creates a waker, and calls poll.

The poll method asks the future to make progress. The future runs its internal state machine. If it can finish immediately, it returns Poll::Ready. If it needs to wait for something, it returns Poll::Pending.

When the future returns Pending, it must first have arranged for a wake-up. It does this by cloning the waker with cx.waker().clone() and handing the clone to whatever will observe the event. Later, when the event occurs, that event source (an I/O driver, a timer, another task) calls waker.wake(). This invokes the wake function in the waker's vtable (wake_waker in the example), which tells the runtime to re-queue the task.

In the minimal example above, the waker implementation is simplified. Waking only releases the waker's reference to the task, and the scheduler re-queues every Pending task itself. This works for a demo, but it is a busy-polling loop: every task is polled again whether or not its event has happened. A real runtime uses a "ready queue". The waker pushes the task onto the ready queue, the scheduler polls only the tasks in that queue, and it sleeps when the queue is empty. This avoids wasting CPU cycles on sleeping tasks.
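
The ready-queue design can be sketched as follows. This is an illustration under stated assumptions, not how any particular runtime implements it: Task, Executor, ReadyQueue, YieldNow, and demo are hypothetical names, the waker is built from Arc and the std::task::Wake trait rather than a hand-written vtable, and run simply returns when the queue is empty where a real runtime would park the thread.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send>>;
type ReadyQueue = Arc<Mutex<VecDeque<Arc<Task>>>>;

/// A task carries its future plus a handle to the ready queue.
struct Task {
    /// `None` while being polled and after the future has completed.
    future: Mutex<Option<BoxFuture>>,
    ready: ReadyQueue,
}

impl Wake for Task {
    /// Waking a task means putting it back in line.
    fn wake(self: Arc<Self>) {
        let ready = Arc::clone(&self.ready);
        ready.lock().unwrap().push_back(self);
    }
}

struct Executor {
    ready: ReadyQueue,
}

impl Executor {
    fn new() -> Self {
        Executor { ready: Arc::new(Mutex::new(VecDeque::new())) }
    }

    fn spawn(&self, future: impl Future<Output = ()> + Send + 'static) {
        let future: BoxFuture = Box::pin(future);
        let task = Arc::new(Task {
            future: Mutex::new(Some(future)),
            ready: Arc::clone(&self.ready),
        });
        self.ready.lock().unwrap().push_back(task);
    }

    /// Polls ready tasks until the queue is empty and returns the number of polls.
    fn run(&self) -> usize {
        let mut polls = 0;
        loop {
            // Hold the queue lock only while popping so wakers can push during poll.
            let next = self.ready.lock().unwrap().pop_front();
            let Some(task) = next else { break };
            let waker = Waker::from(Arc::clone(&task));
            let mut cx = Context::from_waker(&waker);
            let mut slot = task.future.lock().unwrap();
            if let Some(mut future) = slot.take() {
                polls += 1;
                if future.as_mut().poll(&mut cx).is_pending() {
                    // Keep the future, but do not re-queue: only wake() does that.
                    *slot = Some(future);
                }
            }
        }
        polls
    }
}

/// Hypothetical future that yields once: it wakes its task, then returns Pending.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

/// Runs one task that yields twice and one task whose event never arrives.
fn demo() -> usize {
    let executor = Executor::new();
    executor.spawn(async {
        yield_now().await;
        yield_now().await;
    });
    executor.spawn(std::future::pending::<()>());
    executor.run()
}

fn main() {
    // prints "polls: 4": three for the yielding task, one for the sleeping task
    println!("polls: {}", demo());
}
```

The sleeping task is polled once and never again, because nothing wakes it. Under the re-queue-on-Pending strategy of the minimal example it would be polled forever.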

Pin plays a critical role here. Some futures are self-referential. For example, an async block that holds a reference to one of its own local variables across an .await stores both the variable and the reference inside the same state machine. If the future moved in memory, the reference would point at the old location. Pin prevents the future from moving once it is pinned, and the guarantee is enforced at the type level: for a type that is not Unpin, you cannot get a &mut T, or the value itself, out of a Pin<&mut T> without unsafe.

If you try to move a pinned value out through its Pin, the compiler rejects the code, typically with E0507 (cannot move out of a borrow). This error protects you from creating dangling pointers inside futures.
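
To make the pinning rules concrete, here is a small sketch of a future that borrows its own local across a suspension point, driven once from the stack and once from the heap. The names YieldOnce, NoopWake, sum_after_yield, drive, sum_on_stack, and sum_on_heap are assumptions made up for this example, and the polling loop is deliberately naive.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that returns Pending once, then Ready.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// The state machine for this function stores `data` and `slice` side by side,
/// and `slice` points into `data`. That makes the future self-referential.
async fn sum_after_yield() -> u32 {
    let data = [1u32, 2, 3];
    let slice = &data[..];
    // The borrow is still alive across this suspension point.
    YieldOnce(false).await;
    slice.iter().sum()
}

/// No-op waker; the loop in `drive` polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls an already pinned future to completion.
fn drive<F: Future>(mut fut: Pin<&mut F>) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Pin on the stack: pin! takes ownership and only hands back a Pin<&mut _>,
/// so the future can no longer be moved by safe code.
fn sum_on_stack() -> u32 {
    let fut = pin!(sum_after_yield());
    drive(fut)
}

/// Pin on the heap: the Box pointer may move freely, the future behind it does not.
fn sum_on_heap() -> u32 {
    let boxed = Box::pin(sum_after_yield());
    let mut relocated = boxed; // moves the pointer, not the future
    drive(relocated.as_mut())
}

fn main() {
    println!("stack: {}, heap: {}", sum_on_stack(), sum_on_heap()); // prints "stack: 6, heap: 6"
}
```

Box::pin is the usual choice when the future has to be stored in a queue or a struct, as in a task type. pin! avoids the allocation when the future only needs to live for the duration of one function.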

Realistic waker registration

The minimal example hides waker registration. Real futures must store the waker so they can wake themselves up later.

Here is how a custom future stores and uses a waker:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, Waker};

struct MyFuture {
    /// The waker to call when work is done.
    waker: Option<Waker>,
    /// State flag to simulate async work.
    done: bool,
}

impl Future for MyFuture {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.done {
            return Poll::Ready(());
        }

        // Clone the waker from the context.
        // This is the callback the runtime provides.
        // We store it so we can call wake() later.
        self.waker = Some(cx.waker().clone());

        // Simulate checking for an event.
        // In real code, this might register a callback with an I/O driver.
        // If the event is ready, we return Ready.
        // Otherwise, we return Pending and the runtime will wake us later.
        Poll::Pending
    }
}

The future clones the waker on every poll that returns Pending. This is necessary because the runtime may pass a different waker on each poll, and only the most recent one is guaranteed to reach the task. The stored waker is what lets an event source wake the task later. For example, an I/O driver might call waker.wake() when a socket becomes readable.
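
MyFuture never receives an event, so nothing ever calls its stored waker. The sketch below closes that loop under some assumptions: a background thread stands in for the I/O driver, and EventFuture, Shared, ThreadWaker, and this block_on are hypothetical names for illustration, not part of any library.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

/// State shared between the future and the thread that produces the event.
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

/// Hypothetical future that completes when a background thread flips `done`.
struct EventFuture {
    shared: Arc<Mutex<Shared>>,
}

impl EventFuture {
    fn after(delay: Duration) -> Self {
        let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
        let for_thread = Arc::clone(&shared);
        // Stand-in for an I/O driver: sleep, record the event, wake the task.
        thread::spawn(move || {
            thread::sleep(delay);
            let mut state = for_thread.lock().unwrap();
            state.done = true;
            if let Some(waker) = state.waker.take() {
                waker.wake();
            }
        });
        EventFuture { shared }
    }
}

impl Future for EventFuture {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut state = self.shared.lock().unwrap();
        if state.done {
            return Poll::Ready(());
        }
        // Keep the most recent waker. Skip the clone when the stored one
        // would already wake the same task.
        let up_to_date = state
            .waker
            .as_ref()
            .map_or(false, |stored| stored.will_wake(cx.waker()));
        if !up_to_date {
            state.waker = Some(cx.waker().clone());
        }
        Poll::Pending
    }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: poll, then sleep until woken.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
        // park may return spuriously; the loop simply polls again.
        thread::park();
    }
}

fn main() {
    let message = block_on(async {
        EventFuture::after(Duration::from_millis(20)).await;
        "event observed"
    });
    println!("{message}");
}
```

Both the done flag and the waker live behind the same mutex. That ordering is what prevents a lost wake-up: either poll sees done already set, or the thread finds the waker that poll stored.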

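Convention aside: Waker::clone() is not free. For an Arc-backed waker it is an atomic reference count increment, which is cheap in isolation but adds up in a hot path. Production runtimes reduce the cost with techniques such as atomic notification flags and batched wake notifications. A future can also call Waker::will_wake to skip the clone when its stored waker already targets the same task. If you see waker.clone() in a hot loop, profile it before reaching for a more elaborate waker implementation.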

Pitfalls and compiler errors

Writing a runtime exposes you to the raw edges of Rust's async machinery.

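Pinning violations. If you try to move a future out of a Pin, the compiler stops you. You might see E0507 when attempting to move a field out of a pinned struct. The fix is pin projection: a crate such as pin-project generates it safely, Pin::map_unchecked_mut does it by hand at the cost of an unsafe block, or you can redesign the struct so it holds no self-references. The compiler treats every future generated by an async block as !Unpin, because any of them may hold a borrow across an .await. Treat them as pinned.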

Waker leaks. If a future stores a waker and then drops without calling wake, the waker is dropped. This is fine. But if the runtime holds a reference to the task and the task holds a waker that holds a reference to the task, you create a reference cycle. The memory never frees. Use Weak references or ensure the waker drops when the task drops.
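
The cycle is easier to see with the waker reduced to what matters, namely how it refers to its task. In this sketch, Handle, Task, and freed_after_drop are hypothetical names, and Handle is a stand-in for a waker rather than a real std::task::Waker.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Stand-in for a waker: all that matters here is how it refers to its task.
#[allow(dead_code)]
enum Handle {
    /// Keeps the task alive, like a waker that owns an Rc<Task>.
    Owning(Rc<Task>),
    /// Only observes the task, like a waker built on Weak<Task>.
    Observing(Weak<Task>),
}

struct Task {
    /// Slot where the task keeps the waker it was given.
    stored_waker: RefCell<Option<Handle>>,
}

/// Returns true if the task is freed once the runtime drops its reference.
fn freed_after_drop(use_weak: bool) -> bool {
    let task = Rc::new(Task {
        stored_waker: RefCell::new(None),
    });
    let handle = if use_weak {
        Handle::Observing(Rc::downgrade(&task))
    } else {
        Handle::Owning(Rc::clone(&task))
    };
    // The task now holds a handle that points back at the task itself.
    *task.stored_waker.borrow_mut() = Some(handle);

    // A weak probe lets us check for liveness without keeping the task alive.
    let probe = Rc::downgrade(&task);
    drop(task); // the runtime lets go of its reference
    probe.upgrade().is_none()
}

fn main() {
    println!("owning handle, task freed: {}", freed_after_drop(false)); // false: leaked
    println!("observing handle, task freed: {}", freed_after_drop(true)); // true
}
```

With the owning handle the strong count never reaches zero, so the task and everything it holds stay allocated for the life of the process.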

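Starvation. If a task never yields, the runtime blocks. The scheduler loop is stuck inside that one poll call and no other task runs. This is a logic error in the future, not the runtime. Reaching an .await is not enough on its own, because an awaited future that is already Ready does not suspend. A long-running task must periodically await something that actually returns Pending.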
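
One common remedy is an explicit yield point inside long loops. The sketch below assumes a hand-rolled YieldNow future and counts how often a CPU-heavy task hands control back; yield_now, sum_cooperatively, count_yields, and NoopWake are hypothetical names for this example, and chunk is assumed to be greater than zero.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that returns Pending exactly once.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // Ask to be scheduled again right away, then step aside.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

/// CPU-heavy loop that hands control back every `chunk` items (chunk > 0).
async fn sum_cooperatively(items: &[u64], chunk: usize) -> u64 {
    let mut total = 0u64;
    for (i, item) in items.iter().enumerate() {
        total += *item;
        if (i + 1) % chunk == 0 {
            yield_now().await;
        }
    }
    total
}

/// No-op waker; the loop in `count_yields` polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives the future by hand and returns (sum, number of times it yielded).
fn count_yields(items: &[u64], chunk: usize) -> (u64, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(sum_cooperatively(items, chunk));
    let mut yields = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(total) => return (total, yields),
            // Each Pending is a point where a scheduler could run another task.
            Poll::Pending => yields += 1,
        }
    }
}

fn main() {
    let items: Vec<u64> = (1..=10).collect();
    let (total, yields) = count_yields(&items, 4);
    println!("total = {total}, yielded {yields} times"); // total = 55, yielded 2 times
}
```

Without the yield_now().await line the same future would complete in a single poll, and every other task on a single-threaded executor would wait for the whole loop.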

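Deadlocks. In a single-threaded runtime, calling a blocking block_on from inside a task can deadlock. The inner call waits for work that only the outer scheduler can drive, and the outer scheduler is waiting for the task to return. Some runtimes detect this instead of hanging; tokio, for example, panics. You cannot block the executor thread. Use spawn to run sub-tasks.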

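Error codes. Future::poll takes self as Pin<&mut Self>, so a future has to be pinned before it can be polled. If you try to pin a !Unpin future with Pin::new, or pass it to an API that requires Unpin, you get E0277 (trait bound not satisfied). Pin it with Box::pin or std::pin::pin! instead. If you use a value after moving it into an async move block or a closure, you get E0382 (use of moved value). These errors guide you toward correct ownership patterns.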

Trust the type system on Pin. It catches self-referential bugs that would cause segfaults in C++.

When to write your own runtime

Building a runtime is an educational exercise. It teaches you how Future, Waker, Context, and Pin interact. It reveals the cost of async abstractions. It shows why tokio exists.

Use tokio when you need a production-grade runtime with I/O drivers, timers, multi-threading, and battle-tested scheduling. It handles edge cases you haven't thought of.

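Use async-std when you want an async runtime whose API mirrors the standard library's blocking API. It provides async equivalents of std::net and std::fs. Check its maintenance status before adopting it for new work: the project has been announced as discontinued, with smol suggested as the replacement.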

Write a custom runtime when you are building an embedded system with strict memory constraints, a unikernel where the runtime is the kernel, or a specialized executor for a domain like graphics or real-time control.

Implement a minimal executor for learning when you want to understand the mechanics of polling, waker registration, and pin safety. This knowledge pays off when debugging complex async bugs in larger codebases.

Treat the SAFETY comment as a proof. If you can't write the invariants, you don't have a safe abstraction.

Where to go next