How to write a procedural macro

Write a procedural macro by creating a library crate with `proc-macro = true` and implementing the `TokenStream` conversion logic to generate code at compile time.

The factory that builds your code

You are building a data layer for an application. You define a User struct with id, name, and email. You need to serialize it to JSON, deserialize it from a database row, and validate the email format. You write the implementations. Then you add Post. Then Comment. Then Order. You find yourself copy-pasting the same serialization logic, tweaking field names, and praying you didn't miss a field in one of the impls. The repetition is tedious and error-prone. You want a tool that reads your struct definition and writes the impls automatically.

That tool is a procedural macro.

Procedural macros are functions that run during compilation. They take Rust code as input, analyze it, and return new Rust code. The compiler treats the output as if you typed it yourself. You write the macro once. The compiler runs it for every struct you annotate. The result is less boilerplate, fewer copy-paste bugs, and code that stays in sync with your data structures.

What a procedural macro actually is

A procedural macro is a library crate marked with proc-macro = true. It exports functions annotated with attributes like #[proc_macro_derive] or #[proc_macro_attribute]. When the compiler encounters the attribute on your code, it calls the function. The function receives the code as a stream of tokens, processes it, and returns a new stream of tokens.

Think of a macro as a specialized machine on a factory assembly line. You feed it a raw part, which is your struct definition. The machine inspects the part, stamps it with serial numbers, welds on attachments, and outputs a finished component. The rest of the factory does not care how the component was made. It just uses the finished part. The macro is the machine. The struct is the raw part. The generated impl is the finished component.

The ecosystem provides three crates to make this work. syn parses the input tokens into a structured abstract syntax tree. quote builds output tokens from a template. proc-macro2 provides types that work both in the macro crate and in unit tests. You almost always use all three together.

Setting up the macro crate

Procedural macros must live in a separate crate. The compiler requires a distinct crate type to load the macro at the right time. Create a new library crate and configure it in Cargo.toml.

[lib]
proc-macro = true

[dependencies]
syn = { version = "2.0", features = ["derive", "printing"] }
quote = "1.0"
proc-macro2 = "1.0"

The proc-macro = true flag tells Cargo to compile this crate as a procedural macro library. The dependencies are the standard toolkit. syn handles parsing. quote handles generation. proc-macro2 provides the underlying token types. The derive feature in syn enables parsing of derive input structures.

The minimal derive macro

A derive macro implements a trait for a type. You annotate a struct with #[derive(MyDerive)], and the macro generates an impl block. The function signature takes a TokenStream and returns a TokenStream.

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};

/// Generates an impl block for MyTrait based on the struct name.
#[proc_macro_derive(MyDerive)]
pub fn my_derive(input: TokenStream) -> TokenStream {
    // Parse the input tokens into a structured DeriveInput AST.
    // parse_macro_input converts parse errors into compiler errors with spans.
    let input = parse_macro_input!(input as DeriveInput);

    // Extract the identifier (name) of the struct or enum.
    let name = &input.ident;

    // Generate the output tokens using a quote template.
    // #name interpolates the identifier into the output.
    let expanded = quote! {
        impl MyTrait for #name {
            fn my_method(&self) {
                println!("Hello from {}", stringify!(#name));
            }
        }
    };

    // Return the generated tokens to the compiler.
    TokenStream::from(expanded)
}

The parse_macro_input! macro is the community standard for parsing. It wraps syn::parse and, on failure, returns early from your function with the syn::Error rendered as a compile_error! invocation, so the compiler reports the problem at the offending tokens. If you use syn::parse(input).unwrap(), a parse error panics inside the macro. The compiler then reports only "proc-macro derive panicked" at the derive attribute, with no indication of which tokens were at fault. Always use parse_macro_input! to give users actionable errors.

The quote! macro builds the output. It takes Rust-like syntax and interpolates variables using #. The stringify!(#name) call converts the identifier to a string literal at compile time. The result is a TokenStream that the compiler inserts into the caller's code.

Treat the macro crate as a separate product. Test it. Give it good error messages. The compiler is your user.
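To make the expansion concrete, here is a hand-written sketch of what the compiler would see after my_derive runs on a unit struct. The User type, the body of MyTrait, and the greeting helper are assumptions added for illustration; in a real project the impl block would come from the macro rather than being typed out.

```rust
// Hypothetical trait targeted by the derive. The macro only generates the
// impl, so the trait itself must be defined somewhere the caller can see.
trait MyTrait {
    fn my_method(&self);
}

// What the user writes. The derive line is commented out because this
// sketch does not depend on the macro crate.
// #[derive(MyDerive)]
struct User;

// What the macro would append after the struct: `#name` has been replaced
// by `User` everywhere it appeared in the quote! template.
impl MyTrait for User {
    fn my_method(&self) {
        println!("Hello from {}", stringify!(User));
    }
}

/// Illustrative helper: builds the same string that my_method prints, so
/// the effect of stringify! can be inspected without capturing stdout.
fn greeting() -> String {
    format!("Hello from {}", stringify!(User))
}
```

Calling User.my_method() prints the greeting; the struct definition itself is unchanged by the derive.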

How the compilation pipeline works

When you write #[derive(MyDerive)] on a struct, the compiler follows a specific sequence. First, the compiler parses your code and encounters the derive attribute. It looks up the macro definition in the dependency crate. It collects the tokens of the struct definition and passes them to the macro function.

The macro function runs. It parses the tokens into an AST using syn. It inspects the AST to extract information. It builds new tokens using quote. It returns the tokens. The compiler receives the tokens and appends them after the struct definition; a derive macro adds code and never replaces the item it is attached to. The compiler then continues parsing and type-checking the expanded code.

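The generated code lives in the caller's context. Types, traits, and macros referenced in the macro output are resolved at the call site, so they must be reachable from the caller's crate. If the macro generates impl MyTrait, the caller must have MyTrait in scope. If the macro generates println!, the caller's crate must link std, which rules out #![no_std] crates. This is the opposite of hygiene: procedural macros are unhygienic by default, and their output resolves as if the caller had typed it. Prefer absolute paths such as ::std::fmt::Display in generated code so the output does not depend on the caller's imports.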
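The effect of call-site resolution can be sketched without a macro crate. In the example below, the caller module, its Result alias, and the Point type are all hypothetical; the impl is a hand-written stand-in for macro output. Because every path starts with ::std, the impl compiles even though the module defines its own Result.

```rust
mod caller {
    // The caller shadows a common name. Generated code that wrote a plain
    // `Result` here would pick up this alias instead of the std type.
    #[allow(dead_code)]
    pub type Result = ::std::result::Result<u32, String>;

    pub struct Point {
        pub x: i32,
        pub y: i32,
    }

    // Stand-in for macro output: absolute paths resolve the same way no
    // matter what the surrounding module imports or redefines.
    impl ::std::fmt::Display for Point {
        fn fmt(&self, f: &mut ::std::fmt::Formatter<'_>) -> ::std::fmt::Result {
            write!(f, "({}, {})", self.x, self.y)
        }
    }
}

/// Illustrative helper that formats a Point through the generated impl.
fn render() -> String {
    caller::Point { x: 1, y: 2 }.to_string()
}
```

Had the impl used an unqualified Result as its return type, it would have resolved to the caller's alias and failed to type-check.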

A realistic example: generating Display

A minimal example prints a static message. A realistic macro generates code based on the structure of the type. Consider a macro that implements std::fmt::Display by listing all field names and values. This requires iterating over the struct's fields.

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, DeriveInput, Fields};

/// Implements Display by printing field names and values.
#[proc_macro_derive(DebugDisplay)]
pub fn debug_display(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;

    // This macro only supports structs with named fields. Anything else
    // becomes a compile error spanned to the type name instead of a panic.
    let fields = match &input.data {
        Data::Struct(data_struct) => match &data_struct.fields {
            Fields::Named(fields) => fields,
            _ => {
                return syn::Error::new_spanned(name, "DebugDisplay only supports structs with named fields")
                    .to_compile_error()
                    .into();
            }
        },
        _ => {
            return syn::Error::new_spanned(name, "DebugDisplay only supports structs")
                .to_compile_error()
                .into();
        }
    };

    // Identifiers used for `self.field` access. Named fields always have an ident.
    let field_accessors: Vec<_> = fields.named.iter().map(|f| f.ident.as_ref().unwrap()).collect();
    // The same names as strings, so quote! interpolates them as string literals.
    let field_names: Vec<String> = field_accessors.iter().map(|ident| ident.to_string()).collect();
    // Separator written before each field: nothing for the first, ", " afterwards.
    let separators: Vec<&str> = (0..field_names.len()).map(|i| if i == 0 { "" } else { ", " }).collect();

    // Generate the impl block. Absolute paths keep the output independent
    // of the caller's imports.
    let expanded = quote! {
        impl ::std::fmt::Display for #name {
            fn fmt(&self, f: &mut ::std::fmt::Formatter<'_>) -> ::std::fmt::Result {
                write!(f, "{} {{ ", stringify!(#name))?;
                #(
                    write!(f, "{}{}: {:?}", #separators, #field_names, self.#field_accessors)?;
                )*
                write!(f, " }}")
            }
        }
    };

    TokenStream::from(expanded)
}

This macro extracts the named fields from the struct and returns a spanned compile error for any other kind of type. It builds three parallel vectors: the field identifiers, the field names as strings, and a separator for each field. The quote! block uses the repetition syntax #(...)* to walk all three in lockstep and emit one write! call per field. The #field_names inside the repetition interpolates each name as a string literal. The self.#field_accessors accesses the field value. The #separators value is empty for the first field and ", " for the rest, so no trailing comma is written. Every field type must implement Debug, because the values are printed with {:?}.

The repetition syntax is powerful. #(...)* repeats its body once per element of the interpolated iterators. You can add separators between repetitions with #(...),*. For a struct with fields x and y, the macro generates write!(f, "{}{}: {:?}", "", "x", self.x)?; followed by write!(f, "{}{}: {:?}", ", ", "y", self.y)?;.

Macros are code that writes code. Make the written code readable. If the generated code is hard to debug, the macro is hard to use.
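Reading the expansion is the quickest way to judge whether generated code is readable. The sketch below is one plausible hand-written expansion for a field-listing Display derive applied to a hypothetical User struct with two fields; the exact format string is an assumption, not the output of any published crate.

```rust
// Hypothetical input type. With the macro crate available, this would
// carry #[derive(DebugDisplay)] instead of the hand-written impl below.
struct User {
    id: u32,
    name: String,
}

// One write! call per field, in declaration order. The first argument is
// the separator: empty for the first field, ", " for every later one.
impl ::std::fmt::Display for User {
    fn fmt(&self, f: &mut ::std::fmt::Formatter<'_>) -> ::std::fmt::Result {
        write!(f, "{} {{ ", stringify!(User))?;
        write!(f, "{}{}: {:?}", "", "id", self.id)?;
        write!(f, "{}{}: {:?}", ", ", "name", self.name)?;
        write!(f, " }}")
    }
}

/// Illustrative helper that formats a sample value.
fn sample() -> String {
    let user = User { id: 1, name: String::from("Ada") };
    user.to_string()
}
```

With these assumptions, sample() returns User { id: 1, name: "Ada" }. The tool cargo expand, a separate cargo subcommand, prints the real expansion of a crate if you want to compare.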

Handling generics correctly

Structs often have generic parameters. A macro must propagate those parameters to the generated impl. If you ignore generics, the generated code fails to compile: an impl written for a bare Wrapper when the type is Wrapper<T> is rejected, typically with a missing generics error (E0107) or, when a lifetime parameter is omitted, a lifetime error.

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};

/// Implements a trait while preserving generic parameters.
#[proc_macro_derive(GenericTrait)]
pub fn generic_trait(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;

    // Extract generic parameters, type generics, and where clause.
    // impl_generics includes lifetimes and types for the impl header.
    // ty_generics includes types for the type name.
    // where_clause includes the where predicates.
    let (impl_generics, ty_generics, where_clause) = input.generics.split_for_impl();

    let expanded = quote! {
        impl #impl_generics GenericTrait for #name #ty_generics #where_clause {
            fn process(&self) {
                println!("Processing {}", stringify!(#name));
            }
        }
    };

    TokenStream::from(expanded)
}

The split_for_impl() method is essential. It splits the generics into three parts. For a struct declared as Wrapper<'a, T: Trait> where T: Clone, impl_generics produces <'a, T: Trait>, ty_generics produces <'a, T>, and where_clause produces where T: Clone. The generated impl uses all three parts to reconstruct the full generic signature. Without split_for_impl(), you must manually reconstruct the generics, which is error-prone and misses edge cases like const generics or complex where clauses.

Use split_for_impl() for every macro that touches generics. It handles the syntax correctly. Do not try to parse generics manually.
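The three pieces are easier to see on a concrete type. The following is a hand-expanded sketch for a hypothetical Wrapper struct with a lifetime, a bounded type parameter, a const parameter, and a where clause. The struct, its fields, and the implements_generic_trait helper are assumptions for illustration.

```rust
use std::fmt::Debug;

trait GenericTrait {
    fn process(&self);
}

// Hypothetical input. With the macro crate available, this would carry
// #[derive(GenericTrait)].
#[allow(dead_code)]
struct Wrapper<'a, T: Clone, const N: usize>
where
    T: Debug,
{
    label: &'a str,
    items: [T; N],
}

// impl_generics -> <'a, T: Clone, const N: usize>   (bounds are kept)
// ty_generics   -> <'a, T, N>                       (names only)
// where_clause  -> where T: Debug
impl<'a, T: Clone, const N: usize> GenericTrait for Wrapper<'a, T, N>
where
    T: Debug,
{
    fn process(&self) {
        println!("Processing {}", stringify!(Wrapper));
    }
}

/// Illustrative helper: compiles only if the argument implements the trait.
fn implements_generic_trait<G: GenericTrait>(value: &G) -> bool {
    value.process();
    true
}
```

Note that split_for_impl() copies the bounds the struct already declares. If the generated method needs an extra bound, such as T: Display, the macro has to add it to the generics itself.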

Pitfalls and compiler errors

Procedural macros introduce unique failure modes. The most common issue is panicking. If your macro calls .unwrap(), .expect(), or panic! and the condition fails, the macro aborts and the compiler reports "proc-macro derive panicked". The error points at the derive attribute as a whole, so the user cannot tell which field or token caused it. Always use parse_macro_input! or return TokenStream::from(syn::Error::new(...).to_compile_error()) to provide structured errors.

Another issue is hygiene. Procedural macros are unhygienic by default: tokens written in quote! carry Span::call_site() and resolve as if the caller had typed them. If your macro generates an item, such as a helper function or constant, whose name matches one the caller already defined, the compiler reports E0428 (the name is defined multiple times). Local variables inside generated function bodies do not collide this way; they only shadow. Use quote::format_ident! to derive item names from the input type, or wrap helper items in an anonymous const _: () = { ... }; block so they stay private to the expansion.
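One way to keep generated helper items from colliding is to scope each expansion inside an anonymous constant. The sketch below hand-writes two such expansions; the Describe trait, the helper function, and both structs are hypothetical. Both blocks define an item called helper, yet the file compiles because each const _ block is its own scope.

```rust
trait Describe {
    fn describe(&self) -> String;
}

struct User;
struct Order;

// Hypothetical expansion for User. The helper is visible only inside
// this block, while the impl applies to the whole crate.
const _: () = {
    fn helper() -> &'static str {
        "User"
    }

    impl Describe for User {
        fn describe(&self) -> String {
            format!("type {}", helper())
        }
    }
};

// Hypothetical expansion for Order. Reusing the name `helper` is fine:
// without the const blocks, the two definitions would trigger E0428.
const _: () = {
    fn helper() -> &'static str {
        "Order"
    }

    impl Describe for Order {
        fn describe(&self) -> String {
            format!("type {}", helper())
        }
    }
};
```

In a macro, the same shape comes from wrapping the quote! output in const _: () = { ... }; before returning it.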

A third pitfall is missing dependencies. If your macro generates code that uses a type from an external crate, the caller must list that crate in its own Cargo.toml. If the caller forgets, the compiler typically reports E0433 (failed to resolve a path) or, for an unqualified name, E0412 (cannot find type in this scope). Document the dependencies your macro requires. Add a note in the macro's doc comment listing the crates, traits, or types the caller must provide.

Convention aside: A proc-macro crate can export only macros, so it cannot define or re-export MyTrait itself. The community convention is a facade crate that defines the trait and re-exports the derive macro from a companion macro crate, the way serde re-exports the macros from serde_derive. This reduces friction for users. They depend on one crate and write use my_crate::{MyDerive, MyTrait}; in one line.

Convention: proc-macro2 vs proc_macro

You will see two crates mentioned: proc_macro and proc-macro2. The proc_macro crate ships with the compiler. It provides the types used in the macro function signature, like TokenStream. Its types work only while the compiler is running a procedural macro, so you cannot exercise them in ordinary unit tests.

The proc-macro2 crate is a third-party shim. It provides the same types but works in any crate. syn and quote use proc-macro2 types internally. When you write a macro, you convert between proc_macro::TokenStream and proc_macro2::TokenStream at the boundary. parse_macro_input! accepts the proc_macro::TokenStream directly, and quote! returns a proc_macro2::TokenStream that you convert back with TokenStream::from or .into().

This design allows you to write unit tests for your macro logic. You can test the parsing and generation functions in isolation without compiling a macro crate. This is a significant quality-of-life improvement. Write your core logic using proc-macro2 types. Convert to proc_macro only at the entry point.

Test your macro logic in unit tests. The macro crate is hard to iterate on. Tests make development faster.

Decision: when to use procedural macros

Procedural macros are powerful but add complexity. They require a separate crate, increase compile times, and can produce confusing error messages if not implemented carefully. Choose the right tool for the job.

Use procedural macros when you need to generate code based on the structure of a type, such as implementing traits for structs or enums.

Use declarative macros when you need simple pattern matching and text substitution without parsing a full abstract syntax tree.
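For comparison, here is a small declarative macro that covers a case where no field inspection is needed. The TypeName trait and the impl_type_name! macro are hypothetical names chosen for this sketch. The macro sees only the type names passed to it, which is exactly the limitation that pushes field-aware code generation toward procedural macros.

```rust
trait TypeName {
    fn type_name(&self) -> &'static str;
}

// Matches a comma-separated list of identifiers and stamps out one impl
// per type. It cannot look inside the struct definitions.
macro_rules! impl_type_name {
    ($($ty:ident),*) => {
        $(
            impl TypeName for $ty {
                fn type_name(&self) -> &'static str {
                    stringify!($ty)
                }
            }
        )*
    };
}

struct User;
struct Post;

// One invocation replaces two hand-written impl blocks.
impl_type_name!(User, Post);
```

This lives in the same crate as the types, needs no extra dependencies, and adds no separate compilation step.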

Use a build script when you need to generate code based on external files, system configuration, or non-Rust data sources.

Use a regular function when the logic can be expressed at runtime and does not require compile-time code generation.

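Trust the borrow checker. It usually has a point. Macros do not bypass any checks: generated code goes through the same type checking and borrow checking as handwritten code. The difference is that errors in generated code are reported at the macro invocation, where they are harder to read. Ensure the code your macro emits is correct and safe before users ever see it.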

Where to go next