Intro to Rust Macros 🦀 | iditya.tech

Introduction#

Rust macros are a powerful tool for metaprogramming, allowing developers to write code that generates other code. This can help reduce boilerplate, implement domain-specific languages (DSLs), and extend the language's capabilities.

Prerequisites#

Before diving into Rust macros, it's essential to have a basic understanding of Rust syntax and concepts, including:

Variables and data types
Functions and control flow
Traits and generics
Modules

You can get the code of this blog here :

github.com

Let's dive into the fascinating world of Rust macros!

Rust has two main types of macros:

Declarative Macros (Macros by Example): These are the most common type and are defined using macro_rules!. They work by matching against patterns of Rust syntax and replacing them with a predefined expansion.
Procedural Macros: These are more powerful and complex. They operate on a TokenStream and allow you to inspect, modify, and generate arbitrary Rust code. Procedural macros come in three flavors:
- Function-like macros: #[proc_macro]
- Derive macros: #[proc_macro_derive]
- Attribute macros: #[proc_macro_attribute]

Let's explore each of these in detail.

Declarative Macros#

macro_rules!

Declarative macros are defined using the macro_rules! construct. They are essentially a set of pattern-matching rules, similar to a match statement. When the macro is invoked, the compiler tries to match the provided input against these rules. If a match is found, the corresponding code is expanded.

Syntax:

macro_rules! my_macro {
    // Rule 1: Pattern => Expansion
    ($input:expr) => {
        // ... code here ...
    };
    // Rule 2: Another Pattern => Another Expansion
    ($($args:expr),*) => {
        // ... more code here ...
    };
    // ... more rules ...
}

Key Components:

macro_rules!: The keyword to define a declarative macro.
Macro Name: The name you'll use to invoke the macro (e.g., my_macro!).
Rules: A list of (Pattern => Expansion); pairs.
- Pattern: Describes the syntax the macro expects. It consists of metavariables and literal Rust syntax.
- Expansion: The code that the macro will generate if the pattern matches.

Metavariables (Fragment Specifiers):

These are special variables prefixed with a dollar sign ($) and followed by an identifier (e.g., $name) and a fragment specifier. The fragment specifier tells the macro how to interpret the input.

Here are the common fragment specifiers:

ident: An identifier (e.g., variable name, function name, struct name).

macro_rules! create_var {
    ($name:ident, $value:expr) => {
        let $name = $value;
    };
}
create_var!(my_variable, 10); // Expands to: let my_variable = 10;

expr: An expression (e.g., 1 + 2, my_func(), some_variable).

macro_rules! print_expr {
    ($e:expr) => {
        println!("The expression evaluates to: {:?}", $e);
    };
}
print_expr!(10 * 5 + 3); // Expands to: println!("The expression evaluates to: {:?}", 10 * 5 + 3);

ty: A type (e.g., i32, Vec<String>, MyStruct).

macro_rules! create_vec {
    ($t:ty) => {
        Vec::<$t>::new()
    };
}
let my_vec: Vec<i32> = create_vec!(i32); // Expands to: Vec::<i32>::new();

block: A block of statements enclosed in curly braces (e.g., { let x = 5; x * 2 }).

macro_rules! run_block {
    ($b:block) => {
        $b
    };
}
let result = run_block!({ let x = 5; x * 2 }); // Expands to: { let x = 5; x * 2 }

pat: A pattern (e.g., Some(x), [a, b, ..], _).

macro_rules! match_option {
    ($option:expr, $pat:pat => $result:expr) => {
        match $option {
            $pat => $result,
            _ => None,
        }
    };
}
let opt = Some(5);
let val = match_option!(opt, Some(x) => Some(x * 2)); // Expands to: match opt { Some(x) => Some(x * 2), _ => None, }

stmt: A statement (e.g., let x = 5;, println!("hello");).

macro_rules! execute_stmt {
    ($s:stmt) => {
        $s
    };
}
execute_stmt!(let x = 10;); // Expands to: let x = 10;

path: A path to an item (e.g., std::collections::HashMap, super::my_module::MyType).
item: A full item (e.g., a function definition, a struct definition, a module).
meta: A meta item (an attribute, like #[test]).
tt: A single token tree (any single token or a group of tokens balanced by delimiters like parentheses, brackets, or curly braces). This is the most general but least specific.

Repetition ($()*, $(,)*, $(;)*):

Macros can handle repetition using $(...) with a separator and a repetition operator.

*: Zero or more repetitions.
+: One or more repetitions.
,: The separator used between repeated items (e.g., $($arg:expr),* means zero or more expressions separated by commas).

Example: vec! macro (simplified)

The standard library's vec! macro is a great example of declarative macros in action.

macro_rules! vec {
    // Case 1: vec![elem; count]
    ($elem:expr; $count:expr) => ({
        let mut temp_vec = Vec::new();
        temp_vec.reserve($count);
        for _ in 0..$count {
            temp_vec.push($elem);
        }
        temp_vec
    });
    // Case 2: vec![a, b, c] or vec![]
    ($($elem:expr),* $(,)?) => ({
        let mut temp_vec = Vec::new();
        $(
            temp_vec.push($elem);
        )*
        temp_vec
    });
}

Explanation of the vec! example:

($elem:expr; $count:expr): This rule handles vec![value; count].
- $elem:expr: Captures the element to repeat.
- $count:expr: Captures the number of repetitions.
- The expansion creates a Vec, reserves capacity, and pushes the element $count times.
($($elem:expr),* $(,)?): This rule handles vec![a, b, c] or vec![].
- $($elem:expr),*: This is the crucial part for repetition.
  - $elem:expr: Captures an expression.
  - , : The separator between repeated expressions.
  - * : Indicates zero or more repetitions.
- $(,)?: This optionally allows a trailing comma (e.g., vec![1, 2, 3,]).
- The expansion iterates over the captured $elems and pushes each into the temp_vec.

Calling a declarative macro:

You invoke declarative macros with an exclamation mark (!) after their name:

let my_vec = vec![1, 2, 3];
let ten_zeros = vec![0; 10];

Advantages of Declarative Macros:

Simpler to write: For many common use cases, they are easier to grasp and implement than procedural macros.
Faster compilation: Generally, they compile faster than procedural macros.
No external crate required: You can define them directly in your project.

Limitations of Declarative Macros:

Syntax-driven: They can only match based on the concrete syntax provided, not the semantic meaning of the code. This means they can't, for example, inspect types or function signatures.
Limited expressiveness: Complex transformations are often difficult or impossible to achieve with macro_rules!.
Debugging can be tricky: Errors within macro expansions can sometimes be less clear.

Procedural Macros#

Procedural macros are much more powerful because they work at the TokenStream level. They are Rust functions that receive a TokenStream as input and produce a TokenStream as output. This allows for arbitrary code analysis and generation.

Procedural macros must reside in their own crate with the proc-macro crate type.

Cargo.toml for a procedural macro crate:

[lib]
proc-macro = true

[dependencies]
syn = { version = "1.0", features = ["full"] } # For parsing Rust syntax
quote = "1.0" # For generating Rust syntax
proc-macro2 = "1.0" # Lower-level token handling (often used by syn and quote)

Key Libraries for Procedural Macros:

proc_macro crate: This is the standard library crate provided by Rust for working with procedural macros. It defines the TokenStream type.
syn crate: A powerful parser for Rust code. It allows you to parse a TokenStream into a structured Abstract Syntax Tree (AST), making it much easier to work with the code's components (structs, enums, functions, expressions, etc.).
quote crate: A library for easily generating TokenStreams from a more Rust-like syntax. It significantly simplifies the process of creating the output code.
proc_macro2 crate: A "shim" crate that provides the TokenStream type (and related types) from proc_macro but makes it available outside of procedural macro contexts, which is useful for testing and development.

Function-like Procedural Macros#

#[proc_macro]

These are invoked like declarative macros (my_macro!) but are implemented as regular Rust functions.

Example:

Let's create a simple function-like macro that reverses the order of a list of expressions.

my_proc_macro_crate/src/lib.rs:

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Expr};

#[proc_macro]
pub fn reverse_exprs(input: TokenStream) -> TokenStream {
    // 1. Parse the input TokenStream into a list of expressions.
    // parse_macro_input! takes the TokenStream and a type to parse it into.
    // It will automatically handle error reporting if parsing fails.
    let mut exprs = parse_macro_input!(input as syn::punctuated::Punctuated<Expr, syn::Token![,]>);

    // 2. Reverse the order of the expressions.
    exprs.reverse();

    // 3. Generate the output TokenStream using `quote!`.
    // quote! allows you to write Rust-like code, and it will automatically
    // convert it into a TokenStream. You can embed variables (like `exprs`)
    // using interpolation.
    let output = quote! {
        // We'll just create a new tuple with the reversed expressions for simplicity
        (#exprs)
    };

    // 4. Convert the generated `proc_macro2::TokenStream` to `proc_macro::TokenStream`
    output.into()
}

my_app_crate/src/main.rs:

use my_proc_macro_crate::reverse_exprs;

fn main() {
    let original = (1, 2, 3);
    println!("Original: {:?}", original);

    let reversed = reverse_exprs!(1, 2, 3);
    println!("Reversed: {:?}", reversed); // Should print (3, 2, 1)

    let more_reversed = reverse_exprs!("a", "b", "c");
    println!("More Reversed: {:?}", more_reversed); // Should print ("c", "b", "a")
}

How it works:

#[proc_macro]: Marks the function reverse_exprs as a function-like procedural macro.
input: TokenStream: The macro receives the raw tokens provided by the user as a TokenStream.
parse_macro_input!(input as syn::punctuated::Punctuated<Expr, syn::Token![,]>):
- syn::punctuated::Punctuated is a syn type for a list of items separated by a specific token (here, Expr separated by ,).
- This parses the input like 1, 2, 3 into a Punctuated list of Expr AST nodes.
exprs.reverse(): We manipulate the AST nodes (reversing their order).
quote! { (#exprs) }: This uses quote! to construct a new TokenStream. The #exprs within quote! interpolates the Punctuated list of expressions back into the generated code, automatically adding the commas. The result is (3, 2, 1).
.into(): Converts the proc_macro2::TokenStream produced by quote! into the proc_macro::TokenStream required by the macro signature.

Derive Macros#

#[proc_macro_derive]

Derive macros generate code for impl blocks based on the definition of a struct or enum. They are used with the #[derive(MyMacro)] attribute.

Example:

Let's create a derive macro that automatically implements a Debug trait-like behavior, printing struct fields. (Note: Debug is already in std, this is just for demonstration.)

my_derive_crate/src/lib.rs:

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, Fields, Ident};

#[proc_macro_derive(MyDebug)]
pub fn my_debug_derive(input: TokenStream) -> TokenStream {
    let ast = parse_macro_input!(input as syn::DeriveInput);

    let name = &ast.ident; // The name of the struct/enum

    // Generate field printing based on struct type
    let gen = match ast.data {
        Data::Struct(data_struct) => {
            let fields = match data_struct.fields {
                Fields::Named(fields) => {
                    fields.named.iter().map(|f| {
                        let field_name = &f.ident;
                        quote! {
                            print!("{}: {:?}, ", stringify!(#field_name), self.#field_name);
                        }
                    }).collect::<Vec<_>>()
                },
                Fields::Unnamed(fields) => {
                    fields.unnamed.iter().enumerate().map(|(i, _f)| {
                        let index = syn::Index::from(i);
                        quote! {
                            print!("{}: {:?}, ", #i, self.#index);
                        }
                    }).collect::<Vec<_>>()
                },
                Fields::Unit => {
                    vec![quote! { print!("(unit struct)"); }]
                }
            };
            quote! {
                impl MyDebug for #name {
                    fn my_fmt(&self) {
                        print!("{}: {{ ", stringify!(#name));
                        #(#fields)*
                        println!("}}");
                    }
                }
            }
        },
        _ => {
            // For simplicity, we'll only handle structs.
            // A real derive macro would handle enums and report errors.
            panic!("MyDebug can only be derived for structs");
        }
    };
    gen.into()
}

// You might also need a trait definition in your library crate or another common crate
// for the generated `impl` block to be meaningful.
// For example, in a `common_traits` crate:
// pub trait MyDebug {
//     fn my_fmt(&self);
// }
// And then in `my_derive_crate` you'd depend on `common_traits`.

my_app_crate/src/main.rs:

use my_derive_crate::MyDebug;

// Assuming `MyDebug` trait is defined and imported
pub trait MyDebug {
    fn my_fmt(&self);
}

#[derive(MyDebug)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(MyDebug)]
struct Coords(f64, f64);

#[derive(MyDebug)]
struct Empty;

fn main() {
    let p = Point { x: 10, y: 20 };
    p.my_fmt(); // Expected output: Point: { x: 10, y: 20, }

    let c = Coords(1.0, 2.5);
    c.my_fmt(); // Expected output: Coords: { 0: 1.0, 1: 2.5, }

    let e = Empty;
    e.my_fmt(); // Expected output: Empty: { (unit struct) }
}

How it works:

#[proc_macro_derive(MyDebug)]: Marks the function my_debug_derive as a derive macro for the MyDebug attribute.
input: TokenStream: Receives the definition of the item (struct, enum) it's deriving for.
parse_macro_input!(input as syn::DeriveInput): Parses the input into a syn::DeriveInput struct, which contains information about the item being derived (its name, generics, and data type).
match ast.data: We inspect whether it's a struct, enum, or union. In this example, we only handle structs.
Field Iteration (data_struct.fields):
- For named fields (struct MyStruct { a: i32 }), we iterate over fields.named.
- For unnamed (tuple) fields (struct MyTuple(i32)), we iterate over fields.unnamed and use syn::Index for access.
- stringify!(#field_name): The stringify! macro (a built-in declarative macro) converts the token field_name into a string literal.
quote! { impl MyDebug for #name { ... } }: Generates the impl block for the MyDebug trait, using the name of the struct and the generated field-printing logic.
#(#fields)*: This is a powerful quote! feature for repetition. It takes a Vec<TokenStream> (or any iterable of ToTokens) and splices them directly into the output.

Attribute Macros#

#[proc_macro_attribute]

Attribute macros can attach to any item (function, struct, module, etc.) and get both the attribute's arguments and the item it's attached to. They can modify, replace, or wrap the item.

Example:

Let's create an attribute macro that logs when a function is called.

my_attribute_crate/src/lib.rs:

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, ItemFn};

#[proc_macro_attribute]
pub fn log_calls(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // 1. Parse the item the attribute is attached to.
    let input_fn = parse_macro_input!(item as ItemFn);

    // 2. Get the function's name and its signature.
    let fn_name = &input_fn.sig.ident;
    let fn_block = &input_fn.block;
    let fn_sig = &input_fn.sig;
    let fn_vis = &input_fn.vis;
    let fn_attrs = &input_fn.attrs;

    // 3. Generate new code that wraps the original function's body
    // with logging statements.
    let expanded = quote! {
        #(#fn_attrs)* // Keep original attributes
        #fn_vis #fn_sig { // Keep original visibility and signature
            println!("Calling function `{}`", stringify!(#fn_name));
            let result = #fn_block; // Execute the original function body
            println!("Function `{}` returned", stringify!(#fn_name));
            result // Return the result
        }
    };

    expanded.into()
}

my_app_crate/src/main.rs:

use my_attribute_crate::log_calls;

#[log_calls]
fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

#[log_calls]
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let greeting = greet("World");
    println!("Greeting: {}", greeting);

    let sum = add(5, 7);
    println!("Sum: {}", sum);
}

Expected Output:

Calling function `greet`
Function `greet` returned
Greeting: Hello, World!
Calling function `add`
Function `add` returned
Sum: 12

Extracted Components:

fn_name: The function's identifier (name)
fn_block: The function's body (the code inside { ... })
fn_sig: The function's complete signature (name, parameters, return type, etc.)
fn_vis: The function's visibility modifier (pub, pub(crate), private, etc.)
fn_attrs: Any attributes applied to the function (like #[inline], #[allow], etc.)

How it works:

#[proc_macro_attribute]: Marks the log_calls function as an attribute macro.
_attr: TokenStream: This parameter receives the tokens inside the attribute (e.g., #[my_attr(arg)] would pass arg). We ignore it here.
item: TokenStream: This parameter receives the tokens of the item the attribute is attached to (e.g., the entire fn greet(...) { ... } definition).
parse_macro_input!(item as ItemFn): Parses the input item into a syn::ItemFn struct, giving us structured access to the function's name, signature, body, etc.
quote! { ... }: We reconstruct the function with enhanced functionality:
- #(#fn_attrs)*: Preserves all original function attributes
- #fn_vis #fn_sig: Maintains the original visibility and signature
- println! statements: Adds logging before and after execution
- #fn_block: Embeds the original function body
- result: Ensures the original return value is preserved and returned

Advantages of Procedural Macros:

Extreme Flexibility: Can perform arbitrary code transformations.
Semantic Awareness: With syn, you can parse the input into an AST, allowing you to understand the structure and components of the code.
Powerful Abstractions: Ideal for creating DSLs, complex code generation, and advanced metaprogramming.

Limitations of Procedural Macros:

Complexity: They are significantly more complex to write and debug than declarative macros.
Dedicated Crate: Require a separate proc-macro crate.
Compilation Time: Can increase compilation times due to the parsing and code generation involved.
Error Reporting: While syn helps, writing good, user-friendly error messages can still be challenging.

Best Practices and Considerations#

Start with Declarative: If you can achieve your goal with macro_rules!, it's generally preferred due to its simplicity and faster compilation.
TokenStream vs. AST(Abstract Syntax Tree): Procedural macros manipulate TokenStream. Use syn to parse TokenStream into an AST for easier manipulation, and quote to generate TokenStream from an AST.
Error Handling: Procedural macros should provide clear error messages to the user if their input is invalid. proc_macro::Diagnostic and syn::Error::to_compile_error() are useful for this.
Debugging: Macros can be hard to debug. You can use tools like cargo-expand to see the expanded code of your macros, which is invaluable.
Hygiene: Rust macros are hygienic. This means that variables and functions generated by a macro won't accidentally clash with variables or functions in the user's code, and vice-versa. This is a huge benefit over older macro systems in other languages.
- However, if you need to refer to something from the user's scope (like a trait they've imported), you might need to use techniques like #[allow(unused_imports)] or fully qualified paths.
Recursive Macros: Declarative macros can be recursive, allowing them to process nested structures. But, the compiler imposes an upper limit of 128 recursive calls. This limit can be raised using the #![recursion_limit="…"] attribute, though it must be done crate-wide.
Avoid Overuse: While powerful, excessive or poorly designed macros can make code harder to read and understand. Use them when they genuinely reduce boilerplate or enable better abstractions.

If you want go deep and learn internals there is good resource The Little Book of Rust Macros.

Rust macros are a deep and rewarding topic. Mastering them allows you to unlock a new level of power and expressiveness in your Rust code. Happy macro writing!