The flash is full
You flash the microcontroller. The programmer hangs. The error message says the binary is too large. Your chip has 16 kilobytes of flash. Your binary is 24 kilobytes. You need to cut 8 kilobytes without deleting features. This happens. Embedded Rust fights for every byte. The compiler defaults to safety, debuggability, and portability. Those defaults produce binaries that fit on a laptop but choke on a microcontroller. You have to tell the compiler to specialize.
Why Rust binaries get fat
Rust's default build targets desktop development. It includes debug information, unoptimized code, and the full standard library. Debug information contains line numbers, variable names, and type metadata. This makes stack traces readable. It also adds megabytes to the output file, although most of that lives in ELF sections that are never written to flash. Unoptimized code is verbose. The compiler generates instructions that are easy to debug, not instructions that are compact. This is the part that eats flash. The standard library pulls in threading primitives, file I/O, network stacks, and panic formatting. None of this exists on a bare-metal chip. The compiler generates code that works everywhere. You have to trim it down to fit your target.
The release profile is your first weapon
Debug builds are bloated. A debug build is often several times larger than a release build. Always build with the release profile for embedded targets. The release profile enables optimizations and omits debug info by default. It is the baseline for size reduction.
[profile.release]
opt-level = "z"
lto = true
strip = true
The opt-level key controls optimization. The value "z" tells LLVM to optimize for size. It prefers smaller instruction sequences. It avoids unrolling loops if the unrolled version is too large. It can make code slower. Use "z" when flash is tight. The value "s" optimizes for size with fewer aggressive trade-offs. Start with "s" if you have room. Switch to "z" when you run out. Measure both: "s" occasionally produces the smaller binary.
The lto key enables Link-Time Optimization. Without LTO, each crate compiles in isolation. The compiler must assume every public function might be called, so it cannot optimize across crate boundaries. The linker can discard sections that nothing references, but that is all it can do. With LTO, LLVM sees the entire program at link time. It knows exactly which functions are used. It deletes the rest. It also inlines small functions across crate boundaries. This reduces call overhead and exposes more opportunities for constant folding. LTO increases build time. The compiler re-optimizes everything at the end. On embedded, build time is a trade-off you accept. The size savings are usually worth the wait.
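The strip key removes symbols. Symbols are the names of functions and variables. They help debuggers. They take space in the ELF file. strip = true removes them. You get a smaller file on disk. Flashing tools write only the loadable sections to the chip, so stripping mostly shrinks the ELF rather than the flash image. You lose readable stack traces. On embedded, you often do not have a debugger attached in production. Strip is safe there, but keep an unstripped copy if you expect to debug.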
Build the project with the release profile.
cargo build --release
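Check the size. Note that ls reports the size of the ELF file on disk, which includes metadata that is never flashed, so treat it as a rough first check.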
ls -lh target/thumbv7em-none-eabi/release/your_binary
The size should drop significantly. If it is still too large, you need to dig deeper.
The hidden bloat: panic and allocators
The standard library is a heavy dependency. std includes panic formatting code. The default panic handler prints a message. Printing requires formatting logic. Formatting logic is large. On embedded, you usually want to halt or reset. You do not need a panic message. Replace the default panic handler.
Add panic-halt to your dependencies.
[dependencies]
panic-halt = "0.2"
Use panic-halt in your crate root.
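#![no_std]
#![no_main]
// Link the panic handler; `as _` pulls the crate in without naming anything from it.
use panic_halt as _;
// Marks the function that runs after reset (requires the cortex-m-rt crate).
#[cortex_m_rt::entry]
fn main() -> ! {
    loop {}
}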
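The #![no_std] attribute disables the standard library. You still get core, which has basic types and traits. You lose Vec, String, and println!. The #![no_main] attribute tells the compiler not to expect the standard main entry point; the cortex_m_rt::entry attribute supplies one instead. A no_std binary has no default panic handler, so you must provide exactly one. The panic_halt as _ import links one that just loops. It contains no formatting code, which can save kilobytes compared with a handler that prints the message.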
Dynamic allocation is another source of bloat. Vec and String require a global allocator. If you do not need dynamic allocation, avoid them. Use arrays or heapless. In a no_std crate Vec is not even in scope, so using it fails with an unresolved-name error such as E0433 or E0412. If you import it from the alloc crate without registering a #[global_allocator], the build fails with an error saying that no global memory allocator was found. Fix it by removing Vec or adding an allocator. Removing Vec is the better choice for size.
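To make the "use arrays or heapless" advice concrete, here is a minimal sketch of a fixed-capacity collection built on a plain array. FixedVec is a hypothetical name invented for this example; it is not the heapless API, only a rough illustration of the idea behind it. The capacity is part of the type, so the storage size is known at compile time and no allocator is involved.

```rust
/// Fixed-capacity vector backed by an inline array: no heap, no allocator.
/// `FixedVec` is a hypothetical type for illustration, not part of any crate.
pub struct FixedVec<T: Copy + Default, const N: usize> {
    items: [T; N],
    len: usize,
}

impl<T: Copy + Default, const N: usize> FixedVec<T, N> {
    /// Creates an empty vector; all N slots hold `T::default()` until written.
    pub fn new() -> Self {
        Self { items: [T::default(); N], len: 0 }
    }

    /// Appends `value`, or hands it back in `Err` when the buffer is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    /// Returns the filled part of the buffer as a slice.
    pub fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}
```

A full buffer is reported through the return value instead of a reallocation, so the caller decides what to do when space runs out. The sketch only uses items from core, so it should compile unchanged in a no_std crate.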
Logging without the weight
println! is a trap. It needs std, and the formatting code behind it is large. The embedded community uses defmt for logging. defmt is a framework designed for embedded targets. It encodes format strings and types at compile time. It sends compact binary frames and leaves the formatting to the host. This can cut logging overhead substantially. Replace log or println! with defmt.
[dependencies]
defmt = "0.3"
defmt-rtt = "0.4"
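Use defmt in your code. This fragment extends the crate root shown earlier, so the no_std, no_main, and panic handler lines still apply.
use defmt::info;
// Link the RTT transport so the frames have somewhere to go.
use defmt_rtt as _;
#[cortex_m_rt::entry]
fn main() -> ! {
    info!("Temperature: {}", 42);
    loop {}
}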
The info! macro encodes the format string and types at compile time. The runtime only sends the data. It is tiny. It requires a transport like defmt-rtt to send the data out. The host-side decoder reads the format strings back out of the ELF file, so keep an unstripped ELF around for decoding. The convention in embedded Rust is to use defmt for all logging. It saves space and provides structured data.
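The idea of sending only the data can be modeled in a few lines. The sketch below is a simplified illustration, not defmt's actual wire format or API: FORMATS, encode_frame, and decode_frame are hypothetical names, and the 5-byte frame layout is an assumption made for the example. The device sends a one-byte index plus the raw argument bytes, and the host does the string formatting.

```rust
/// Format strings live in a table that stays on the host side.
/// Hypothetical table for illustration; not defmt's real data layout.
pub const FORMATS: [&str; 2] = ["Temperature: {}", "Voltage: {}"];

/// Device side: emit a 1-byte string index plus the raw argument bytes.
/// No text formatting happens here.
pub fn encode_frame(format_id: u8, value: i32) -> [u8; 5] {
    let mut frame = [0u8; 5];
    frame[0] = format_id;
    frame[1..].copy_from_slice(&value.to_le_bytes());
    frame
}

/// Host side: look the string up and do the expensive formatting there.
pub fn decode_frame(frame: &[u8; 5]) -> String {
    let value = i32::from_le_bytes([frame[1], frame[2], frame[3], frame[4]]);
    FORMATS[frame[0] as usize].replace("{}", &value.to_string())
}
```

In this model the message "Temperature: 42" travels as 5 bytes instead of 15 characters, and the device never links integer-to-text conversion code. Only encode_frame would run on the microcontroller; decode_frame uses String and belongs on the host.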
Analyzing what's taking space
You need to know where the bloat is. Guessing wastes time. Use tools to measure. cargo bloat shows which functions take the most space. Install it with cargo install cargo-bloat. Run it on your release build.
cargo bloat --release --crates
With --crates, the output lists crates and their contribution to the binary size. Without it, the output lists the largest functions. You can see if a dependency is dragging in unused code. You can see if a specific function is too large. Use this data to make decisions. Drop a dependency. Replace a function. Enable LTO.
cargo-binutils provides cargo size. It gives a summary of the binary sections. It shows text, data, and bss sizes. Install it with cargo install cargo-binutils, then add the LLVM tools with rustup component add llvm-tools (llvm-tools-preview on older toolchains). Run it with cargo size --release. It tells you how much code, initialized data, and zero-initialized data you have. High text size means too much code. High data size means too much initialized static data, which costs both flash and RAM. High bss size means too many zero-initialized statics; bss costs RAM, not flash. Use these metrics to guide your optimization.
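The three numbers map onto flash and RAM with simple arithmetic. The sketch below is a small host-side helper, assuming read-only data is counted under text (as Berkeley-style size output usually does) and ignoring stack and heap. SectionSizes and its methods are hypothetical names for this example, not part of cargo-binutils.

```rust
/// Section sizes in bytes, as reported by a tool such as `cargo size`.
/// Hypothetical helper type for illustration.
pub struct SectionSizes {
    pub text: u32,
    pub data: u32,
    pub bss: u32,
}

impl SectionSizes {
    /// Flash holds the code plus the initial values of `.data`.
    pub fn flash_bytes(&self) -> u32 {
        self.text + self.data
    }

    /// RAM holds `.data` (copied in at startup) plus zero-initialized `.bss`.
    pub fn ram_bytes(&self) -> u32 {
        self.data + self.bss
    }

    /// Bytes over the flash budget, or `None` if the image fits.
    pub fn flash_overflow(&self, flash_capacity: u32) -> Option<u32> {
        self.flash_bytes()
            .checked_sub(flash_capacity)
            .filter(|&over| over > 0)
    }
}
```

With the numbers from the opening scenario, an image whose text and data add up to 24 KiB on a 16 KiB part gives flash_overflow(16 * 1024) == Some(8192), which is the 8 kilobytes that have to go. Note that shrinking bss would not help here, because bss never occupies flash.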
Pitfalls and trade-offs
Size optimization has costs. Aggressive size optimization can make code slower. opt-level = "z" may choose smaller instructions that take more cycles. Measure performance if it matters. LTO increases build time. The compiler has to re-optimize everything at the end. Build times can double or triple. Accept the wait. Strip removes debug symbols. You lose readable stack traces. If you need to debug in the field, keep a debug build separate. Do not strip the debug build.
no_std removes familiar types. You lose Vec, String, and HashMap. Option, Result, and the Debug trait stay, because they live in core. For the rest you have to use alternatives. heapless provides fixed-size collections. defmt provides logging. nb provides non-blocking patterns. The ecosystem supports no_std. You just have to learn the crates.
Dependencies can hide bloat. A crate might enable features you do not need. Check the Cargo.toml of your dependencies. Disable features. Use default-features = false. For example, serde enables its std feature by default. On a no_std target, turn it off: serde = { version = "1.0", default-features = false }. Add features = ["derive"] if you need the derive macros. This can save kilobytes. Audit your dependencies. Remove unused ones. Replace heavy ones with lighter alternatives.
Decision: when to use which flag
Use opt-level = "z" when flash is tight and every byte counts, accepting potential speed penalties.
Use opt-level = "s" when you want size reduction without the aggressive trade-offs of "z".
Use lto = true when your binary includes many dependencies and you suspect unused code is dragging in weight.
Use strip = true when you have no need for debug symbols in the release binary and want to reclaim symbol table space.
Use panic = "abort" or a custom panic handler when you cannot afford the formatting code required for panic messages.
Use #![no_std] when you are targeting a bare-metal environment without an operating system.
Use defmt for logging when you need to minimize overhead and want structured binary output.
Use heapless for collections when you need fixed-size data structures without dynamic allocation.