How to Write a Custom Serde Derive

When #[derive] falls short

You have a Config struct. You want to serialize it as a flat list of key-value pairs for a legacy system, but the struct is nested. Serde's default derive gives you nested JSON. You could write impl Serialize manually, but then you have fifty config structs and you're copy-pasting the same traversal logic. You want to write the logic once and slap #[derive(FlatSerialize)] on any struct.
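To make the copy-paste problem concrete, here is a minimal hand-written sketch of the flattening logic for one nested config. The FlatSerialize trait, its flatten method, the key helper, and the fields of Config and Database are all assumptions made for illustration; none of them come from Serde.

```rust
/// Hypothetical trait: flattens a value into dotted key/value pairs.
trait FlatSerialize {
    fn flatten(&self, prefix: &str, out: &mut Vec<(String, String)>);
}

/// Hypothetical nested config types.
struct Database {
    host: String,
    port: u16,
}

struct Config {
    name: String,
    database: Database,
}

/// Joins a prefix and a field name with a dot, omitting the dot at the root.
fn key(prefix: &str, field: &str) -> String {
    if prefix.is_empty() {
        field.to_string()
    } else {
        format!("{}.{}", prefix, field)
    }
}

// Every struct needs an impl shaped like these two.
// Only the field names change, which is what a derive could generate.
impl FlatSerialize for Database {
    fn flatten(&self, prefix: &str, out: &mut Vec<(String, String)>) {
        out.push((key(prefix, "host"), self.host.clone()));
        out.push((key(prefix, "port"), self.port.to_string()));
    }
}

impl FlatSerialize for Config {
    fn flatten(&self, prefix: &str, out: &mut Vec<(String, String)>) {
        out.push((key(prefix, "name"), self.name.clone()));
        // Nested structs recurse with an extended prefix.
        self.database.flatten(&key(prefix, "database"), out);
    }
}
```

Calling flatten on a Config with an empty prefix yields keys such as name, database.host, and database.port. Multiply the two impl blocks by fifty structs and the case for generating them becomes clear.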

That's a custom derive macro. It lets you define a new #[derive(...)] attribute that generates code at compile time. You write the transformation rules, and the compiler applies them to your types during macro expansion, before the rest of your code is type-checked.

The macro factory

A derive macro lives in a special kind of library, a proc-macro crate, that the compiler loads during the build. When the compiler sees #[derive(MyMacro)] on a struct, it pauses, calls your macro, and passes the struct's definition as a stream of tokens, not as raw text. Your macro parses those tokens, generates new Rust code, and returns it as tokens. The compiler then continues as if you had typed the generated code by hand.

Think of a derive macro as a specialized factory worker. You hand them a raw metal sheet (your struct definition). They inspect the sheet, stamp out the necessary bolts and nuts (the impl block), and weld them onto the sheet before it leaves the factory. By the time the compiler sees the final product, the macro has vanished. Only the generated code remains.

Three crates make this work:

proc_macro: The standard library interface. It defines how your macro talks to the compiler. You get tokens in, you return tokens out.
syn: A parser. It turns the raw tokens into a structured syntax tree you can inspect. Without syn, you'd be walking raw token trees by hand.
quote: A generator. It turns your data structures back into Rust tokens. It lets you write Rust-like syntax in your macro and interpolate variables.

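syn's default features already include derive, which is everything DeriveInput needs: structs, enums, unions, generics, and attributes. The full feature adds syntax tree types for the rest of Rust, such as expressions, statements, and items. Enable it only if your macro parses those, because it increases compile time. The Cargo.toml below turns it on, which is harmless but not required for the examples in this article.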

Minimal example

Start with a macro crate. Create a new library crate and enable proc-macro = true in Cargo.toml. This tells Cargo to compile the crate as a dynamic library that the compiler can load, rather than a normal library.

# my_derive/Cargo.toml
[lib]
proc-macro = true

[dependencies]
syn = { version = "2.0", features = ["full"] }
quote = "1.0"

The macro function takes a TokenStream and returns a TokenStream. Use parse_macro_input! to convert the input into a DeriveInput, which represents a struct, enum, or union.

// my_derive/src/lib.rs
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};

#[proc_macro_derive(CustomSerialize)]
pub fn custom_serialize_derive(input: TokenStream) -> TokenStream {
    // Parse the input tokens into a syntax tree.
    // DeriveInput captures the struct name, generics, and fields.
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;

    // Generate the implementation block.
    // quote! turns Rust-like syntax into a TokenStream.
    // #name interpolates the struct name into the generated code.
    let expanded = quote! {
        impl serde::Serialize for #name {
            fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
            where
                S: serde::Serializer,
            {
                // Serialize the struct as a single string containing its name.
                serializer.serialize_str(stringify!(#name))
            }
        }
    };

    // Return the generated code to the compiler.
    TokenStream::from(expanded)
}

The #[proc_macro_derive(CustomSerialize)] attribute registers the function. The identifier CustomSerialize is what users write in #[derive(...)]. The function name can be anything; only the name inside the attribute matters.

In a consuming crate, you add my_derive as a dependency and use the macro.

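// consumer/src/main.rs
// Cargo.toml must list my_derive, serde, and serde_json as dependencies.
// The generated impl names serde::Serialize by full path, so no import of the trait is needed.
use my_derive::CustomSerialize;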

#[derive(CustomSerialize)]
struct User {
    name: String,
    id: u32,
}

fn main() {
    let user = User { name: "Alice".into(), id: 1 };
    // The macro generated the impl, so this works.
    let json = serde_json::to_string(&user).unwrap();
    println!("{}", json); // Prints: "User"
}

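Later compilation stages never see the macro, only its output. After expansion, the program is the same as if you had written the impl block for User by hand. To inspect the real output instead of expanding it mentally, the third-party cargo-expand tool prints the expanded code.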

How the compiler calls your macro

The process happens in three steps. First, the compiler encounters #[derive(CustomSerialize)] and extracts the tokens for the struct definition. It sends these tokens to your macro crate.

Second, your macro runs. parse_macro_input! uses syn to build an abstract syntax tree. You inspect the tree to learn about the struct. You can read the name, check if it's a struct or enum, iterate over fields, and examine attributes.

Third, you use quote to build the output. quote! takes a quasi-quotation and interpolates values from your variables. The # symbol interpolates a variable. stringify!(#name) is emitted with the identifier substituted, so the string literal is produced when the generated code is compiled. quote! returns a proc_macro2::TokenStream, which TokenStream::from (or .into()) converts into the proc_macro::TokenStream the compiler expects.
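As a rough picture of what interpolation produces, the sketch below shows hand-written code equivalent to a quote! template after #name has been replaced by User. To keep it free of dependencies it uses a hypothetical Label trait as a stand-in for serde::Serialize; the trait and its label method are assumptions, not part of Serde.

```rust
/// Hypothetical stand-in for serde::Serialize so the sketch needs no crates.
trait Label {
    fn label(&self) -> &'static str;
}

#[allow(dead_code)]
struct User {
    name: String,
    id: u32,
}

// Roughly what a template like
//     impl Label for #name { ... stringify!(#name) ... }
// looks like once #name has been replaced by the identifier User.
impl Label for User {
    fn label(&self) -> &'static str {
        // stringify! turns the identifier into the literal "User"
        // when this generated code is compiled.
        stringify!(User)
    }
}
```

Note that stringify! is not evaluated inside the macro crate. The macro only emits the call; the literal appears when the consuming crate compiles the expanded code.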

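The compiler receives the tokens and appends them after the original item; a derive macro can add code but cannot modify the struct it is attached to. Compilation then continues. If the generated code has errors, the diagnostic points at the #[derive(...)] attribute by default, because tokens created by quote! carry the call-site span. Good macros reuse spans from the input (for example with quote_spanned!) so errors point to the specific field or type that caused them.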

Realistic example: iterating fields

Most derive macros need to inspect the structure of the type. A common pattern is iterating over fields to generate code for each one. quote supports repetition with #(...)*, which repeats its body once for each element of an iterator such as a Vec.

This example generates a Serialize impl that serializes a struct as a map of field names to values. It handles named fields and skips fields marked with #[skip].

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, DeriveInput, Fields};

#[proc_macro_derive(MapSerialize, attributes(skip))]
pub fn map_serialize_derive(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;

    // Match only structs with named fields.
    // Enums and tuple structs require different handling.
    let fields = match &input.data {
        Data::Struct(data_struct) => match &data_struct.fields {
            Fields::Named(fields) => fields,
            _ => panic!("MapSerialize only supports structs with named fields"),
        },
        _ => panic!("MapSerialize only supports structs"),
    };

    // Collect field names and filter out skipped fields.
    let mut field_names = Vec::new();
    let mut field_accesses = Vec::new();

    for field in &fields.named {
        // Check for #[skip] attribute.
        let is_skipped = field.attrs.iter().any(|attr| {
            attr.path().is_ident("skip")
        });

        if !is_skipped {
            // Collect the identifier for the field name.
            if let Some(ident) = &field.ident {
                field_names.push(ident);
                // Collect the access expression for the field value.
                field_accesses.push(quote! { self.#ident });
            }
        }
    }

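    // Count the fields up front. A Vec cannot be interpolated outside #(...)*,
    // but a usize can, and it expands to an integer literal.
    let field_count = field_names.len();

    // Generate the implementation.
    // The #(...)* block repeats once per field.
    let expanded = quote! {
        impl serde::Serialize for #name {
            fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
            where
                S: serde::Serializer,
            {
                use serde::ser::SerializeMap;
                let mut map = serializer.serialize_map(Some(#field_count))?;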
                #(
                    map.serialize_entry(stringify!(#field_names), &#field_accesses)?;
                )*
                map.end()
            }
        }
    };

    TokenStream::from(expanded)
}

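The attributes(skip) part of #[proc_macro_derive] registers skip as a helper attribute of the derive. The macro always receives the field attributes as part of its input; registering the helper is what makes the compiler accept #[skip] instead of rejecting it with cannot find attribute skip in this scope.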

The #(...)* repetition is the superpower of quote. It expands the block inside the parentheses for each element in the vectors. If you have three fields, it generates three map.serialize_entry calls. This scales automatically to any number of fields.

Repetition is the key to writing maintainable macros. If you hardcode field counts, your macro breaks on structs with different shapes. Use repetition to make your macro generic.
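The same idea can be tried without a proc-macro crate. The sketch below uses a declarative macro_rules! macro, whose $(...)* repetition is a close cousin of quote's #(...)*; it is an analogy, not the derive macro from this article. The ToPairs trait and the impl_to_pairs! macro are hypothetical names introduced for illustration.

```rust
/// Hypothetical trait used only for this sketch.
trait ToPairs {
    fn to_pairs(&self) -> Vec<(&'static str, String)>;
}

/// Declarative analog of the derive: the field list is passed explicitly
/// because macro_rules! cannot inspect a struct definition on its own.
macro_rules! impl_to_pairs {
    ($name:ident { $($field:ident),* $(,)? }) => {
        impl ToPairs for $name {
            fn to_pairs(&self) -> Vec<(&'static str, String)> {
                let mut pairs = Vec::new();
                // One push per field, like one serialize_entry call per field.
                $(
                    pairs.push((stringify!($field), self.$field.to_string()));
                )*
                pairs
            }
        }
    };
}

struct User {
    name: String,
    id: u32,
    active: bool,
}

// Three fields in, three push statements out.
impl_to_pairs!(User { name, id, active });
```

Adding a fourth field to the invocation adds a fourth push with no change to the macro. A derive macro goes one step further by reading the field list from the struct definition itself.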

Pitfalls and errors

Derive macros introduce complexity. The compiler errors can be cryptic if the macro is poorly written.

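If you forget proc-macro = true in Cargo.toml, Cargo builds the crate as a normal library and the macro crate itself fails to compile: rustc reports that the #[proc_macro_derive] attribute is only usable with crates of the proc-macro crate type. The error cannot find derive macro CustomSerialize in this scope has a different cause: the consuming crate has not imported the macro or does not depend on the macro crate. Check the Cargo.toml of both crates first.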

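parse_macro_input! does not panic when parsing fails; it returns early with a compile error built from the syn::Error. The risk is your own panic! calls. A panic inside a derive macro is reported as proc-macro derive panicked at the #[derive(...)] attribute, with no pointer to the field or item that caused it. Return a syn::Error instead.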

// BAD: panic gives a useless error.
let fields = match &input.data {
    Data::Struct(data) => &data.fields,
    _ => panic!("Only structs supported"),
};

// GOOD: syn::Error points to the user's code.
let fields = match &input.data {
    Data::Struct(data) => &data.fields,
    _ => {
        return syn::Error::new_spanned(&input, "Only structs supported")
            .into_compile_error()
            .into();
    }
};

syn::Error::new_spanned attaches the error message to a specific span in the source code. into_compile_error() converts the error into a token stream that the compiler renders as a diagnostic. The user sees a clear error pointing to their struct.

Treat syn::Error as your friend. A panic is a failure; a compile error is a lesson.

Another common issue is generics. If the struct has generic parameters, the macro must carry them over to the impl. If you ignore generics, the generated impl won't compile for any generic type.

// Extract generics from the input.
let generics = &input.generics;
let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();

// Use split_for_impl() to generate correct generic syntax.
let expanded = quote! {
    impl #impl_generics serde::Serialize for #name #ty_generics #where_clause {
        // ...
    }
};

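generics.split_for_impl() handles the tricky syntax of lifetimes and trait bounds. It returns the three pieces of impl<'a, T: Bound> Trait for Type<'a, T> where ...: the parameter list with bounds, the parameter list without bounds, and the optional where clause. If you omit the generics or build them by hand, expect errors such as E0107 (wrong number of generic arguments) or E0207 (unconstrained type parameter).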
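To see where each piece lands, here is a hand-written impl for a generic struct, annotated with the part of split_for_impl() that would supply it. The Describe trait and the Tagged struct are hypothetical and stand in for serde::Serialize and a user type so the sketch compiles without dependencies.

```rust
use std::fmt::Debug;

/// Hypothetical trait standing in for serde::Serialize.
trait Describe {
    fn describe(&self) -> String;
}

/// Hypothetical user type with a lifetime, an inline bound, and a where clause.
struct Tagged<'a, T: Debug>
where
    T: Clone,
{
    tag: &'a str,
    value: T,
}

// impl_generics: <'a, T: Debug>   (parameters with their inline bounds)
// ty_generics:   <'a, T>          (parameters only, no bounds)
// where_clause:  where T: Clone   (copied from the struct, if present)
impl<'a, T: Debug> Describe for Tagged<'a, T>
where
    T: Clone,
{
    fn describe(&self) -> String {
        format!("{}={:?}", self.tag, self.value)
    }
}
```

Lifetimes come before type parameters in both lists, and the bounds appear only on the impl side. Getting either detail wrong by hand is easy, which is why delegating to split_for_impl() is the safer choice.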

If the generated code references types that don't exist, the compiler reports E0412 (cannot find type). If the generated code violates trait bounds, you'll see E0277 (trait bound not satisfied). These errors happen at the call site, which is correct. The macro should generate code that compiles only when the user's types satisfy the requirements.

When to write a custom derive

Macros are powerful, but they add build time and complexity. Use them judiciously.

Use a custom derive macro when you need to generate repetitive impl blocks for many types and the logic depends on the structure of the type, like iterating fields or handling variants.

Use a manual impl when the serialization logic is unique to one type and doesn't benefit from code generation, or when the logic is complex enough that a macro would obscure the implementation.

Use Serde's built-in attributes like #[serde(rename)] or #[serde(with)] when you only need to tweak the default behavior of a standard derive, rather than replacing the entire implementation.

Use a helper function or trait when you need runtime flexibility that a compile-time macro cannot provide.

Macros are tools, not magic. If the tool makes the code harder to read, put it down.

Recap

A custom Serde derive is a tool you build to automatically generate code that lets your data structures be saved or loaded as text or binary. Instead of writing repetitive conversion code for every struct, you write this tool once, and it does the work for you whenever you tag a struct with a special attribute. Think of it like a template that fills in the blanks for you based on the shape of your data.