How to Use Rust for ARM Cortex-M Development

You just plugged in your STM32 Blue Pill

The LED is blinking in C, and you want to rewrite it in Rust. You run cargo new, write a fn main(), and try to compile for the chip. The compiler screams. On a microcontroller, nothing exists to call main for you. There is no operating system to call. You are writing the kernel, the drivers, and the application all at once. This is no_std territory.

Standard Rust assumes a host. It expects an OS that provides malloc, print, and a main entry point. Cortex-M chips are bare metal. No OS. No heap by default. No standard library. You need #![no_std] to tell the compiler, "Don't link the standard library. I'll handle the basics." You also need a target triple to tell the compiler, "Generate ARM instructions, not x86."

Think of standard Rust like a library book. The OS provides the shelves, the lighting, the checkout system, and the index. no_std is like writing a book in a cave. You have to bring your own light, your own paper, and define what "page 1" means. The compiler gives you the language, but you bring the environment.

The target triple is a contract

The compiler doesn't know ARM by default. You add the target to your toolchain.

rustup target add thumbv7m-none-eabi

This downloads a precompiled core library for ARMv7-M chips such as the Cortex-M3, which is what the Blue Pill carries. The triple breaks down into four parts that define exactly what code the compiler generates.

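thumb means the Thumb instruction set. Classic ARM cores also have a fixed-width 32-bit instruction set called ARM, but Cortex-M chips execute only Thumb. Thumb uses mostly 16-bit encodings, with 32-bit encodings mixed in where needed. Code density matters on microcontrollers with kilobytes of flash. Thumb instructions pack more logic into fewer bytes.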

v7m specifies the architecture version. The Cortex-M3 implements ARMv7-M. The Cortex-M4 and M7 implement ARMv7E-M, which adds DSP instructions, so they need v7em. If you have a Cortex-M0 or M0+, you need v6m. If you have a Cortex-M23 or M33, you need v8m.base or v8m.main. The compiler uses this to enable or disable specific instructions like hardware divide or DSP extensions.

none means no operating system. There is no libc. There is no dynamic linker. The binary is a flat image that loads directly into memory.

eabi stands for Embedded Application Binary Interface. It defines how functions pass arguments and how the stack is aligned. This ensures Rust code can call C code and vice versa, which matters when you link against vendor SDKs or bootloaders. The eabihf variant passes floating-point arguments in FPU registers and is meant for chips that have an FPU.

Convention aside: the embedded community uses thumbv7m-none-eabi for Cortex-M3 boards such as the STM32F1 series, and thumbv7em-none-eabihf for Cortex-M4F boards such as the STM32F4 series. If you are unsure, check the chip datasheet for the core and whether it has an FPU, then pick the matching triple.
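The mapping from core to triple can be written down as a small lookup. This is a sketch for illustration only: the Core enum and the target_triple function are hypothetical names, not part of any toolchain, and the list covers only the common cores.

```rust
/// Cortex-M cores covered by this sketch (hypothetical enum, not a real API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Core {
    M0,
    M0Plus,
    M3,
    M4,
    M7,
    M23,
    M33,
}

/// Returns the Rust target triple for a core.
/// `has_fpu` is ignored for cores that never ship with an FPU.
pub fn target_triple(core: Core, has_fpu: bool) -> &'static str {
    match (core, has_fpu) {
        // ARMv6-M: no FPU option.
        (Core::M0, _) | (Core::M0Plus, _) => "thumbv6m-none-eabi",
        // ARMv7-M: no FPU option.
        (Core::M3, _) => "thumbv7m-none-eabi",
        // ARMv7E-M: soft-float ABI without an FPU, hard-float ABI with one.
        (Core::M4, false) | (Core::M7, false) => "thumbv7em-none-eabi",
        (Core::M4, true) | (Core::M7, true) => "thumbv7em-none-eabihf",
        // ARMv8-M baseline and mainline.
        (Core::M23, _) => "thumbv8m.base-none-eabi",
        (Core::M33, false) => "thumbv8m.main-none-eabi",
        (Core::M33, true) => "thumbv8m.main-none-eabihf",
    }
}
```

For a Blue Pill (Cortex-M3), target_triple(Core::M3, false) gives the triple installed above.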

Minimal example: the bare metal entry point

A no_std binary needs two attributes and a panic handler. The standard library provides a default panic handler that prints a message and then unwinds or aborts. On bare metal, there is no console to print to. You must provide a panic handler that does something safe, like halting the CPU.

#![no_std] // Disable the standard library. No heap, no OS calls.
#![no_main] // Disable the default main wrapper. We define the entry point.

use panic_halt as _; // On panic, halt the CPU. No stack trace.

#[no_mangle] // Don't mangle the symbol name, so a linker script can refer to "_start".
pub extern "C" fn _start() -> ! {
    // Infinite loop. The CPU never returns from here.
    loop {}
}

The #![no_std] attribute tells the compiler to use the core library instead of std. Core provides fundamental types, traits, and formatting machinery without OS dependencies.

The #![no_main] attribute tells the compiler not to generate the default main wrapper. The standard library wrapper calls OS initialization functions that don't exist on bare metal.

The panic_halt crate provides a panic handler that spins forever. The as _ syntax imports the crate without binding a name, which forces it to be linked even though you never call anything in it. This is a convention in the embedded ecosystem. Inside the crate, one function is marked #[panic_handler], and the compiler requires exactly one such function in the final binary.

The _start function is the entry point in this sketch. The #[no_mangle] attribute prevents the compiler from changing the function name, so a linker script can refer to it as exactly _start. Note that a Cortex-M chip does not look up a symbol by name. At reset it reads the handler address from the vector table, so something must still place that address there. The extern "C" ABI gives the function a fixed, documented calling convention, which is what a handler reached through the vector table needs. The return type ! is the never type. It tells the compiler this function never returns. The CPU loops forever.

Convention aside: panic_halt is about the smallest panic handler you can use. It adds almost no code to the binary. For debugging, panic_probe is better. It logs the panic message through defmt and then faults, which stops the program under the debugger so you can inspect the stack. Use panic_halt for production firmware where size matters. Use panic_probe during development.

The CPU spins forever. That is a valid program.

The runtime handles the boot sequence

Writing _start manually works, but it is painful. You need to provide a vector table with an initial stack pointer, initialize the .data section, zero out the .bss section, and supply handlers for exceptions. The cortex-m-rt crate does all of this. It is the de facto standard runtime for Cortex-M chips.

#![no_std]
#![no_main]

use cortex_m_rt::entry;
use panic_halt as _;

#[entry] // The runtime initializes RAM and then calls this function.
fn main() -> ! {
    // Your application logic starts here.
    loop {}
}

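The cortex-m-rt crate provides the #[entry] macro. You apply it to exactly one function with the signature fn() -> !. Calling it main is a convention, not a requirement. The runtime supplies the vector table and a reset handler named Reset. At power-up the hardware loads the stack pointer from the first vector table entry and jumps to Reset, which copies initialized data from flash to RAM, zeros out uninitialized data, and then calls your function.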

The vector table is a list of 32-bit words at the start of flash. The first entry is the initial stack pointer. The second entry is the address of the reset handler, which cortex-m-rt names Reset. Subsequent entries are exception and interrupt handlers. The runtime fills in the core exception entries, and the device crate for your chip supplies the peripheral interrupt entries.
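To make the layout concrete, here is a host-side sketch that decodes the first two words of a flash image. The names VectorTableHead and parse_vector_table are hypothetical. The sketch assumes a little-endian image and relies on the fact that handler addresses in the table have bit 0 set, because Cortex-M always executes in Thumb state.

```rust
/// The first two words of a Cortex-M vector table (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
pub struct VectorTableHead {
    /// Value the hardware loads into the stack pointer at reset.
    pub initial_sp: u32,
    /// Address of the reset handler, with the Thumb bit cleared.
    pub reset_handler: u32,
}

/// Decodes the first two little-endian words of a flash image.
/// Returns `None` if the image is shorter than 8 bytes or the reset
/// vector does not have the Thumb bit (bit 0) set.
pub fn parse_vector_table(image: &[u8]) -> Option<VectorTableHead> {
    let b = image.get(0..8)?;
    let initial_sp = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    let reset_raw = u32::from_le_bytes([b[4], b[5], b[6], b[7]]);
    if reset_raw & 1 == 0 {
        return None;
    }
    Some(VectorTableHead {
        initial_sp,
        reset_handler: reset_raw & !1,
    })
}
```

Feeding it the first eight bytes of a firmware image is a quick sanity check that the stack pointer lands in RAM and the reset handler lands in flash.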

Convention aside: the embedded community expects cortex-m-rt in almost every Cortex-M project. Don't roll your own runtime unless you are writing a bootloader or a custom OS. The runtime handles details that are easy to get wrong, such as enabling the FPU before any floating-point code runs and installing default exception handlers.

Memory layout and sections

Microcontrollers have a fixed memory map. Flash holds the code and read-only data. RAM holds variables and the stack. The linker places your code into sections that map to these regions.

The .text section holds the executable code. It goes into flash. The CPU executes instructions directly from flash, or from an instruction cache if the chip has one.

The .data section holds initialized global variables. The initial values live in flash. At startup, the runtime copies these values from flash to RAM. After the copy, the RAM holds the current values.

The .bss section holds uninitialized global variables. The runtime zeros out this section in RAM at startup. This saves flash space because you don't need to store zeros.
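The two startup steps can be modeled on the host with plain slices. This is a simulation for illustration, not the code cortex-m-rt actually runs. The function name init_ram and its parameters are hypothetical, and a real reset handler works on addresses taken from linker symbols instead of slices.

```rust
/// Mimics the two memory-initialization steps a reset handler performs.
///
/// * `data_load_image`: initial values of `.data` as stored in flash.
/// * `data`: the RAM region where `.data` lives at run time.
/// * `bss`: the RAM region where `.bss` lives at run time.
///
/// Panics if `data_load_image` and `data` differ in length.
pub fn init_ram(data_load_image: &[u32], data: &mut [u32], bss: &mut [u32]) {
    // .data: copy the initial values from flash into RAM.
    data.copy_from_slice(data_load_image);
    // .bss: nothing is stored in flash, so just zero the region.
    bss.fill(0);
}
```

After the call, data holds the flash image and bss holds zeros, regardless of what the RAM contained at power-up.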

The stack grows downward from the top of RAM. Static data sits at the bottom of RAM, and the heap, if you use one, typically sits above it and grows upward. You must ensure the stack and heap don't collide. The linker script defines the memory regions and section placement. The cortex-m-rt crate provides the main linker script, link.x, and you supply a small memory.x file that describes the flash and RAM regions of your chip.

Convention aside: use cargo size, from the cargo-binutils package, to check the binary size. It shows the size of each section. If .text is too large, you need to optimize for size or reduce dependencies. If .data is too large, you have too many initialized globals. If .bss is too large, you have too many zero-initialized globals or oversized static buffers.

Debugging and logging

Printing to the console doesn't work on bare metal. There is no stdout. You need a way to log messages and debug your code.

The defmt crate is the de facto standard for logging in embedded Rust. It encodes messages into a compact binary format that tooling on the host decodes. It is much smaller and faster than formatting text on the chip.

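#![no_std]
#![no_main]

use defmt::info;
use defmt_rtt as _; // Transport: sends defmt frames over RTT through the debug probe.
use panic_probe as _; // Panic handler that prints via defmt.

#[cortex_m_rt::entry]
fn main() -> ! {
    info!("System started");
    loop {}
}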

The defmt macros like info! and error! encode the message and arguments into a binary stream. A transport crate such as defmt-rtt carries that stream to the host. The panic_probe crate sends panic messages over the debug probe. You need a probe like J-Link or ST-Link connected to the chip.
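The idea behind the compact format can be shown with a toy model: the target sends a small ID and the raw argument, and the host holds the strings. This is a deliberately simplified illustration and not defmt's actual wire format or API. FORMAT_STRINGS, encode, and decode are hypothetical names, and the decode side uses String, so it belongs on the host.

```rust
/// Host-side table. In this toy model the index of a string is its wire ID.
pub const FORMAT_STRINGS: [&str; 2] = ["System started", "ADC reading: {}"];

/// Target side: a frame is a 2-byte string ID followed by a 4-byte argument.
/// No text is formatted or transmitted by the chip.
pub fn encode(id: u16, arg: u32) -> [u8; 6] {
    let mut frame = [0u8; 6];
    frame[..2].copy_from_slice(&id.to_le_bytes());
    frame[2..].copy_from_slice(&arg.to_le_bytes());
    frame
}

/// Host side: look up the string by ID and substitute the argument.
/// Returns `None` for an unknown ID.
pub fn decode(frame: &[u8; 6]) -> Option<String> {
    let id = u16::from_le_bytes([frame[0], frame[1]]) as usize;
    let arg = u32::from_le_bytes([frame[2], frame[3], frame[4], frame[5]]);
    let template = FORMAT_STRINGS.get(id)?;
    Some(template.replacen("{}", &arg.to_string(), 1))
}
```

In this model every frame is six bytes no matter how long the message text is, which is where the size and speed advantage comes from.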

Convention aside: the embedded community uses defmt for most logging. println! is part of std and does not exist in a no_std crate, and formatting text on the chip costs flash and CPU time. defmt works without a heap and integrates with the debug probe. It is the tool of choice for production firmware.

For debugging, probe-rs is the modern tool. It replaces OpenOCD for many users. It supports flashing, debugging, and GDB server mode. It works with VS Code and other IDEs.

cargo install probe-rs-tools
probe-rs run --chip STM32F405RGTx <path-to-elf>

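The probe-rs run command flashes the ELF file you pass it and runs it. It connects to the debug probe, controls the chip, and prints log output from the target. To set breakpoints, inspect variables, and step through code, use the probe-rs GDB server or its VS Code integration.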

Pitfalls and compiler errors

The linker is your friend here. If it complains about missing symbols, you likely forgot a no_std attribute or a runtime crate.

If you drop #![no_std], the compiler tries to use the standard library, which does not exist for this target. The build fails with E0463, can't find crate for `std`, along with a note that the target may not support the standard library. This happens because the standard library expects an operating system underneath it.

If you forget #![no_main], the compiler tries to generate the main wrapper, which lives in the standard library. The compiler rejects the crate. Depending on the compiler version, the error mentions a missing start lang item or says that using fn main requires the standard library.

If you use println! in a no_std crate, the compiler rejects you with an error like cannot find macro `println` in this scope. println! is defined in std and writes to an OS-provided stdout. Formatting itself lives in core::fmt and needs no heap. On no_std, write into your own sink through the core::fmt::Write trait, or use defmt.
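As an illustration of formatting without std or a heap, the sketch below implements core::fmt::Write for a fixed-size buffer. FixedBuf and format_reading are hypothetical names made up for this example. The code uses only core, so the same source should compile in a no_std crate, and on a real board the buffer contents would then go to a UART or similar sink.

```rust
use core::fmt::{self, Write};

/// A fixed-capacity text buffer that needs no allocator (hypothetical type).
pub struct FixedBuf<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    pub const fn new() -> Self {
        Self { bytes: [0; N], len: 0 }
    }

    /// Returns the text written so far.
    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are appended, so the bytes are valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl<const N: usize> Write for FixedBuf<N> {
    /// Appends `s`, or fails without writing anything if it does not fit.
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > N {
            return Err(fmt::Error);
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats a sensor reading into a 32-byte buffer.
pub fn format_reading(channel: u8, millivolts: u32) -> FixedBuf<32> {
    let mut buf = FixedBuf::new();
    // A formatting error here would only mean the text did not fit.
    let _ = write!(buf, "ch{}={}mV", channel, millivolts);
    buf
}
```

The write! macro is available in core, so the only thing std's println! adds is the OS-backed stdout it writes to.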

Stack overflows are silent killers. Cortex-M chips have no MMU, and unless you configure an MPU nothing catches out-of-bounds access. If the stack grows too large, it corrupts other data. The chip might reset randomly or behave erratically. You must estimate the stack usage and leave enough RAM free. With cortex-m-rt, the stack gets whatever RAM remains after static data, and the memory.x file lets you change where the stack starts.
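A back-of-the-envelope budget helps here. The sketch below assumes the stack gets whatever RAM is left after static data and an optional heap. RamUsage and stack_budget are hypothetical names, and the input figures would come from a size report for your own binary.

```rust
/// RAM usage figures in bytes (hypothetical helper type).
pub struct RamUsage {
    /// Total RAM on the chip.
    pub ram_len: u32,
    /// Size of the `.data` section.
    pub data: u32,
    /// Size of the `.bss` section.
    pub bss: u32,
    /// Size reserved for a heap, or 0 if there is none.
    pub heap: u32,
}

/// Bytes left for the stack, or `None` if static data and heap
/// already exceed the available RAM.
pub fn stack_budget(usage: &RamUsage) -> Option<u32> {
    usage
        .ram_len
        .checked_sub(usage.data)?
        .checked_sub(usage.bss)?
        .checked_sub(usage.heap)
}
```

On a chip with 20 KiB of RAM, 512 bytes of .data, and 4 KiB of .bss, this leaves 15872 bytes for the stack. Compare that against your deepest call chain plus interrupt nesting.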

Convention aside: use RTIC (the rtic crate, formerly cortex-m-rtic) or embassy for interrupt-driven applications. They provide a structured way to share data with interrupt handlers without hand-written global state. Hand-written global state requires a Mutex around a RefCell, which adds complexity. The frameworks encapsulate the synchronization logic.
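For the simplest kind of shared state, a single counter, an atomic is enough and needs neither a Mutex nor a framework. The sketch below is hypothetical: on_timer_tick stands in for the body of a timer interrupt handler and elapsed_ticks for a call from the main loop. One assumption to note is that fetch_add needs atomic read-modify-write support, which thumbv6m targets do not provide.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Tick counter shared between an interrupt handler and the main loop.
static TICKS: AtomicU32 = AtomicU32::new(0);

/// Body of a hypothetical timer interrupt handler.
pub fn on_timer_tick() {
    // Relaxed is enough here: the counter does not guard other data.
    TICKS.fetch_add(1, Ordering::Relaxed);
}

/// Called from the main loop to read the current tick count.
pub fn elapsed_ticks() -> u32 {
    TICKS.load(Ordering::Relaxed)
}
```

Anything larger than one word, such as a struct or a peripheral handle, is where a critical-section Mutex or a framework resource becomes the better tool.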

Decision matrix

Use cortex-m-rt when you are starting a new Cortex-M project and want the standard runtime. It provides the #[entry] macro, supplies the vector table and reset handler, and initializes memory sections automatically.

Use embassy when your application benefits from async tasks. It replaces the blocking driver model with an executor that schedules tasks around interrupts and timers.

Pick thumbv7m-none-eabi for Cortex-M3 chips. Pick thumbv7em-none-eabi for Cortex-M4 and M7 chips without a floating-point unit. Pick thumbv7em-none-eabihf for M4 and M7 chips with an FPU. Pick thumbv6m-none-eabi for Cortex-M0 and M0+ chips.

Reach for panic-halt in production builds to minimize binary size. Reach for panic-probe during development to log the panic message and stop under the debugger so you can inspect the state.

Use defmt for logging in all no_std projects. It is smaller and faster than text-based logging. It integrates with the debugger and works without a heap.

Pick probe-rs for flashing and debugging. It supports modern probes and provides a better developer experience than OpenOCD.

Use cortex-m-semihosting when you need to print messages to the host computer without a UART. It uses the debug probe to send data to the host. It is slow but useful for initial bring-up.

Where to go next

Rust on Cortex-M compiles your code into a binary that runs directly on a microcontroller, with no operating system underneath. The program talks straight to the hardware, much as a car's engine computer controls the vehicle without needing Windows or macOS. The same toolchain applies when you build embedded devices like sensors, drones, or smart home gadgets.