Rust Engineering Practices - Beyond cargo build / Rust 工程实践:超越 cargo build
Speaker Intro / 讲师简介
- Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team / Microsoft SCHIE(Silicon and Cloud Hardware Infrastructure Engineering)团队首席固件架构师
- Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems / 在安全、系统编程(固件、操作系统、虚拟机监控器)、CPU 与平台架构以及 C++ 系统方面经验丰富
- Started programming in Rust in 2017 (@AWS EC2), and have been in love with the language ever since / 2017 年在 AWS EC2 开始使用 Rust,此后长期深度投入
A practical guide to the Rust toolchain features that most teams discover too late: build scripts, cross-compilation, benchmarking, code coverage, and safety verification with Miri and Valgrind. Each chapter uses concrete examples drawn from a real hardware-diagnostics codebase - a large multi-crate workspace - so every technique maps directly to production code.
这是一本聚焦 Rust 工具链实践的实用指南,覆盖许多团队往往接触得太晚的关键能力:构建脚本、交叉编译、基准测试、代码覆盖率,以及借助 Miri 和 Valgrind 做安全验证。每章都基于真实硬件诊断代码库中的具体示例展开,该代码库是一个大型多 crate 工作区,因此书中的每项技巧都能直接映射到生产代码。
How to Use This Book / 如何使用本书
This book is designed for self-paced study or team workshops. Each chapter is largely independent - read them in order or jump to the topic you need.
本书适合自定节奏学习或团队工作坊。各章大体独立,你既可以按顺序阅读,也可以直接跳到当前最需要的主题。
Difficulty Legend / 难度说明
| Symbol / 标记 | Level / 等级 | Meaning / 含义 |
|---|---|---|
| 🟢 | Starter / 入门 | Straightforward tools with clear patterns - useful on day one / 规则清晰、上手直接,第一天就能用到 |
| 🟡 | Intermediate / 中级 | Requires understanding of toolchain internals or platform concepts / 需要理解工具链内部机制或平台概念 |
| 🔶 | Advanced / 高级 | Deep toolchain knowledge, nightly features, or multi-tool orchestration / 涉及更深的工具链知识、nightly 特性或多工具协同 |
Pacing Guide / 学习节奏建议
| Part / 部分 | Chapters / 章节 | Est. Time / 预计时间 | Key Outcome / 关键收获 |
|---|---|---|---|
| I - Build & Ship / 构建与交付 | ch01-ch02 | 3-4 h / 3-4 小时 | Build metadata, cross-compilation, static binaries / 构建元数据、交叉编译、静态二进制 |
| II - Measure & Verify / 度量与验证 | ch03-ch05 | 4-5 h / 4-5 小时 | Statistical benchmarking, coverage gates, Miri/sanitizers / 统计型基准测试、覆盖率门禁、Miri 与 sanitizer |
| III - Harden & Optimize / 加固与优化 | ch06-ch10 | 6-8 h / 6-8 小时 | Supply chain security, release profiles, compile-time tools, no_std, Windows / 供应链安全、发布配置、编译期工具、no_std 与 Windows |
| IV - Integrate / 集成 | ch11-ch13 | 3-4 h / 3-4 小时 | Production CI/CD pipeline, tricks, capstone exercise / 生产级 CI/CD 流水线、实践技巧与综合练习 |
| Total / 合计 | ch01-ch13 | 16-21 h / 16-21 小时 | Full production engineering pipeline / 完整生产工程流水线视角 |
Working Through Exercises / 练习建议
Each chapter contains exercises with difficulty indicators. Solutions are provided in expandable <details> blocks - try the exercise first, then check your work.
每章都包含带难度标记的练习。答案放在可展开的 <details> 区块中,建议先做题,再核对答案。
- 🟢 exercises can often be done in 10-15 minutes / 🟢 练习通常可在 10-15 分钟内完成
- 🟡 exercises require 20-30 minutes and may involve running tools locally / 🟡 练习通常需要 20-30 分钟,并可能需要本地运行工具
- 🔶 exercises require significant setup and experimentation (1+ hour) / 🔶 练习通常需要较多环境准备与实验时间(1 小时以上)
Prerequisites / 前置知识
| Concept / 概念 | Where to learn it / 建议学习位置 |
|---|---|
| Cargo workspace layout / Cargo 工作区结构 | Rust Book ch14.3 |
| Feature flags / Feature 标志 | Cargo Reference - Features |
| #[cfg(test)] and basic testing / #[cfg(test)] 与基础测试 | Rust Patterns ch12 / Rust Patterns 第 12 章 |
| unsafe blocks and FFI basics / unsafe 代码块与 FFI 基础 | Rust Patterns ch10 / Rust Patterns 第 10 章 |
Chapter Dependency Map / 章节依赖图
+-----------------+
| ch00 |
| Intro |
+----+-----+------+
+--------+----+---+--+---+---------+------+
| | | | | |
ch01 ch03 ch04 ch05 ch06 ch09
Build Bench Cov Miri Deps no_std
| | | | | |
| +-------+------+ | |
| | | ch10
ch02 ch07 ch07 Windows
Cross RelProf RelProf
| | | |
| ch08 | |
| CompTime | |
+----------------+----------------+------+
|
ch11
CI/CD Pipeline
|
ch12 ---- ch13
Tricks Quick Ref
Read in any order: ch01, ch03, ch04, ch05, ch06, ch09 are independent.
Read after prerequisites: ch02 (needs ch01), ch07-ch08 (benefit from ch03-ch06), ch10 (benefits from ch09).
Read last: ch11 (ties everything together), ch12 (tricks), ch13 (reference).
可任意顺序阅读:ch01、ch03、ch04、ch05、ch06、ch09 相互独立。
建议在具备前置知识后阅读:ch02(依赖 ch01),ch07-ch08(先学 ch03-ch06 效果更好),ch10(最好先看 ch09)。
建议最后阅读:ch11(综合收束全书)、ch12(技巧汇总)、ch13(参考速查)。
Annotated Table of Contents / 带说明的目录
Part I - Build & Ship / 第一部分:构建与交付
| # | Chapter / 章节 | Difficulty / 难度 | Description / 说明 |
|---|---|---|---|
| 1 | Build Scripts - build.rs in Depth / 构建脚本:深入理解 build.rs | 🟢 | Compile-time constants, compiling C code, protobuf generation, system library linking, anti-patterns / 编译期常量、编译 C 代码、protobuf 生成、系统库链接与反模式 |
| 2 | Cross-Compilation - One Source, Many Targets / 交叉编译:一份源码,多种目标 | 🟡 | Target triples, musl static binaries, ARM cross-compile, cross, cargo-zigbuild, GitHub Actions / 目标三元组、musl 静态二进制、ARM 交叉编译、cross、cargo-zigbuild 与 GitHub Actions |
Part II - Measure & Verify / 第二部分:度量与验证
| # | Chapter / 章节 | Difficulty / 难度 | Description / 说明 |
|---|---|---|---|
| 3 | Benchmarking - Measuring What Matters / 基准测试:衡量真正重要的指标 | 🟡 | Criterion.rs, Divan, perf flamegraphs, PGO, continuous benchmarking in CI / Criterion.rs、Divan、perf 火焰图、PGO 与 CI 中的持续基准测试 |
| 4 | Code Coverage - Seeing What Tests Miss / 代码覆盖率:发现测试遗漏 | 🟢 | cargo-llvm-cov, cargo-tarpaulin, grcov, Codecov/Coveralls CI integration / cargo-llvm-cov、cargo-tarpaulin、grcov 与 Codecov/Coveralls 集成 |
| 5 | Miri, Valgrind, and Sanitizers / Miri、Valgrind 与 Sanitizer | 🔶 | MIR interpreter, Valgrind memcheck/Helgrind, ASan/MSan/TSan, cargo-fuzz, loom / MIR 解释器、Valgrind memcheck/Helgrind、ASan/MSan/TSan、cargo-fuzz 与 loom |
Part III - Harden & Optimize / 第三部分:加固与优化
| # | Chapter / 章节 | Difficulty / 难度 | Description / 说明 |
|---|---|---|---|
| 6 | Dependency Management and Supply Chain Security / 依赖管理与供应链安全 | 🟢 | cargo-audit, cargo-deny, cargo-vet, cargo-outdated, cargo-semver-checks / cargo-audit、cargo-deny、cargo-vet、cargo-outdated 与 cargo-semver-checks |
| 7 | Release Profiles and Binary Size / 发布配置与二进制体积 | 🟡 | Release profile anatomy, LTO trade-offs, cargo-bloat, cargo-udeps / 发布配置结构、LTO 权衡、cargo-bloat 与 cargo-udeps |
| 8 | Compile-Time and Developer Tools / 编译期与开发者工具 | 🟡 | sccache, mold, cargo-nextest, cargo-expand, cargo-geiger, workspace lints, MSRV / sccache、mold、cargo-nextest、cargo-expand、cargo-geiger、工作区 lint 与 MSRV |
| 9 | no_std and Feature Verification / no_std 与特性验证 | 🔶 | cargo-hack, core/alloc/std layering, custom panic handlers, testing no_std code / cargo-hack、core/alloc/std 分层、自定义 panic handler 与 no_std 代码测试 |
| 10 | Windows and Conditional Compilation / Windows 与条件编译 | 🟡 | #[cfg] patterns, windows-sys/windows crates, cargo-xwin, platform abstraction / #[cfg] 模式、windows-sys/windows crate、cargo-xwin 与平台抽象 |
Part IV - Integrate / 第四部分:集成
| # | Chapter / 章节 | Difficulty / 难度 | Description / 说明 |
|---|---|---|---|
| 11 | Putting It All Together - A Production CI/CD Pipeline / 综合实战:生产级 CI/CD 流水线 | 🟡 | GitHub Actions workflow, cargo-make, pre-commit hooks, cargo-dist, capstone / GitHub Actions 工作流、cargo-make、pre-commit hook、cargo-dist 与综合实战 |
| 12 | Tricks from the Trenches / 一线实践技巧 | 🟡 | 10 battle-tested patterns: deny(warnings) trap, cache tuning, dep dedup, RUSTFLAGS, more / 10 个经验证的实战模式:deny(warnings) 陷阱、缓存调优、依赖去重、RUSTFLAGS 等 |
| 13 | Quick Reference Card / 速查卡 | - | Commands at a glance, 60+ decision table entries, further reading links / 命令速览、60+ 条决策表条目以及延伸阅读链接 |
Build Scripts — build.rs in Depth 🟢
What you’ll learn:
- How `build.rs` fits into the Cargo build pipeline and when it runs
- Five production patterns: compile-time constants, C/C++ compilation, protobuf codegen, `pkg-config` linking, and feature detection
- Anti-patterns that slow builds or break cross-compilation
- How to balance traceability with reproducible builds

Cross-references: Cross-Compilation uses build scripts for target-aware builds · `no_std` & Features extends `cfg` flags set here · CI/CD Pipeline orchestrates build scripts in automation
Every Cargo package can include a file named build.rs at the crate root.
Cargo compiles and executes this file before compiling your crate. The build
script communicates back to Cargo through println! instructions on stdout.
What build.rs Is and When It Runs
┌─────────────────────────────────────────────────────────┐
│ Cargo Build Pipeline │
│ │
│ 1. Resolve dependencies │
│ 2. Download crates │
│ 3. Compile build.rs ← ordinary Rust, runs on HOST │
│ 4. Execute build.rs ← stdout → Cargo instructions │
│ 5. Compile the crate (using instructions from step 4) │
│ 6. Link │
└─────────────────────────────────────────────────────────┘
Key facts:
- `build.rs` runs on the host machine, not the target. During cross-compilation, the build script runs on your development machine even when the final binary targets a different architecture.
- The build script’s scope is limited to its own package. It cannot affect how other crates compile — unless the package declares a `links` key in `Cargo.toml`, which enables passing metadata to dependent crates via `cargo::metadata=KEY=VALUE`.
- It runs every time Cargo detects a change — unless you emit `cargo::rerun-if-changed` instructions to limit re-runs.
Note (Rust 1.71+): Since Rust 1.71, Cargo fingerprints the compiled `build.rs` binary — if the binary is identical, it won’t re-run even if source timestamps changed. However, `cargo::rerun-if-changed=build.rs` is still valuable: without any `rerun-if-changed` instruction, Cargo re-runs `build.rs` whenever any file in the package changes (not just `build.rs`). Emitting `cargo::rerun-if-changed=build.rs` limits re-runs to only when `build.rs` itself changes — a significant compile-time saving in large crates.
- It can emit cfg flags, environment variables, linker arguments, and file paths that the main crate consumes.
The minimal Cargo.toml entry:
[package]
name = "my-crate"
version = "0.1.0"
edition = "2021"
build = "build.rs" # default — Cargo looks for build.rs automatically
# build = "src/build.rs" # or put it elsewhere
The Cargo Instruction Protocol
Your build script communicates with Cargo by printing instructions to stdout.
Since Rust 1.77, the preferred prefix is cargo:: (replacing the older
cargo: single-colon form).
| Instruction | Purpose |
|---|---|
| `cargo::rerun-if-changed=PATH` | Only re-run `build.rs` when `PATH` changes |
| `cargo::rerun-if-env-changed=VAR` | Only re-run when environment variable `VAR` changes |
| `cargo::rustc-link-lib=NAME` | Link against native library `NAME` |
| `cargo::rustc-link-search=PATH` | Add `PATH` to the library search path |
| `cargo::rustc-cfg=KEY` | Set a `#[cfg(KEY)]` flag for conditional compilation |
| `cargo::rustc-cfg=KEY="VALUE"` | Set a `#[cfg(KEY = "VALUE")]` flag |
| `cargo::rustc-env=KEY=VALUE` | Set an environment variable accessible via `env!()` |
| `cargo::rustc-cdylib-link-arg=FLAG` | Pass `FLAG` to the linker for cdylib targets |
| `cargo::warning=MESSAGE` | Display a warning during compilation |
| `cargo::metadata=KEY=VALUE` | Store metadata readable by dependent crates |
// build.rs — minimal example
fn main() {
// Only re-run if build.rs itself changes
println!("cargo::rerun-if-changed=build.rs");
// Set a compile-time environment variable
let timestamp = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs().to_string())
.unwrap_or_else(|_| "0".into());
println!("cargo::rustc-env=BUILD_TIMESTAMP={timestamp}");
}
Pattern 1: Compile-Time Constants
The most common use case: baking build metadata into the binary so you can report it at runtime (git hash, build date, CI job ID).
// build.rs
use std::process::Command;
fn main() {
println!("cargo::rerun-if-changed=.git/HEAD");
println!("cargo::rerun-if-changed=.git/refs");
// Git commit hash
let output = Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.expect("git not found");
let git_hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
println!("cargo::rustc-env=GIT_HASH={git_hash}");
// Build profile (debug or release)
let profile = std::env::var("PROFILE").unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=BUILD_PROFILE={profile}");
// Target triple
let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=BUILD_TARGET={target}");
}
#![allow(unused)]
fn main() {
// src/main.rs — consuming the build-time values
fn print_version() {
println!(
"{} {} (git:{} target:{} profile:{})",
env!("CARGO_PKG_NAME"),
env!("CARGO_PKG_VERSION"),
env!("GIT_HASH"),
env!("BUILD_TARGET"),
env!("BUILD_PROFILE"),
);
}
}
Built-in Cargo environment variables you get for free, no `build.rs` needed: `CARGO_PKG_NAME`, `CARGO_PKG_VERSION`, `CARGO_PKG_AUTHORS`, `CARGO_PKG_DESCRIPTION`, `CARGO_MANIFEST_DIR`. See the Cargo Book for the full list.
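A minimal sketch of consuming these built-ins — it uses `option_env!` rather than `env!` so the snippet also compiles outside a Cargo build:

```rust
// Sketch: reading built-in Cargo metadata without any build.rs.
// option_env! (unlike env!) compiles even when a variable is absent.
fn banner() -> String {
    let name = option_env!("CARGO_PKG_NAME").unwrap_or("unknown");
    let version = option_env!("CARGO_PKG_VERSION").unwrap_or("0.0.0");
    format!("{name} v{version}")
}

fn main() {
    println!("{}", banner());
}
```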
Pattern 2: Compiling C/C++ Code with the cc Crate
When your Rust crate wraps a C library or needs a small C helper (common in
hardware interfaces), the cc crate simplifies
compilation inside build.rs.
# Cargo.toml
[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
println!("cargo::rerun-if-changed=csrc/");
cc::Build::new()
.file("csrc/ipmi_raw.c")
.file("csrc/smbios_parser.c")
.include("csrc/include")
.flag("-Wall")
.flag("-Wextra")
.opt_level(2)
.compile("diag_helpers");
// This produces libdiag_helpers.a and emits the right
// cargo::rustc-link-lib and cargo::rustc-link-search instructions.
}
#![allow(unused)]
fn main() {
// src/lib.rs — FFI bindings to the compiled C code
extern "C" {
fn ipmi_raw_command(
netfn: u8,
cmd: u8,
data: *const u8,
data_len: usize,
response: *mut u8,
response_len: *mut usize,
) -> i32;
}
/// Safe wrapper around the raw IPMI command interface.
/// Assumes: enum IpmiError { CommandFailed(i32), ... }
pub fn send_ipmi_command(netfn: u8, cmd: u8, data: &[u8]) -> Result<Vec<u8>, IpmiError> {
let mut response = vec![0u8; 256];
let mut response_len: usize = response.len();
// SAFETY: response buffer is large enough and response_len is correctly initialized.
let rc = unsafe {
ipmi_raw_command(
netfn,
cmd,
data.as_ptr(),
data.len(),
response.as_mut_ptr(),
&mut response_len,
)
};
if rc != 0 {
return Err(IpmiError::CommandFailed(rc));
}
response.truncate(response_len);
Ok(response)
}
}
For C++ code, use .cpp(true) and .flag("-std=c++17"):
// build.rs — C++ variant
fn main() {
println!("cargo::rerun-if-changed=cppsrc/");
cc::Build::new()
.cpp(true)
.file("cppsrc/vendor_parser.cpp")
.flag("-std=c++17")
.flag("-fno-exceptions") // match Rust's no-exception model
.compile("vendor_helpers");
}
Pattern 3: Protocol Buffers and Code Generation
Build scripts excel at code generation — turning .proto, .fbs, or .json
schema files into Rust source at compile time. Here’s the protobuf pattern
using prost-build:
# Cargo.toml
[build-dependencies]
prost-build = "0.13"
// build.rs
fn main() {
println!("cargo::rerun-if-changed=proto/");
prost_build::compile_protos(
&["proto/diagnostics.proto", "proto/telemetry.proto"],
&["proto/"],
)
.expect("Failed to compile protobuf definitions");
}
#![allow(unused)]
fn main() {
// src/lib.rs — include the generated code
pub mod diagnostics {
include!(concat!(env!("OUT_DIR"), "/diagnostics.rs"));
}
pub mod telemetry {
include!(concat!(env!("OUT_DIR"), "/telemetry.rs"));
}
}
`OUT_DIR` is a Cargo-provided directory where build scripts should place generated files. Each crate gets its own `OUT_DIR` under `target/`.
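The same `OUT_DIR` convention works without any codegen crate. Here is a hand-rolled sketch that generates a small Rust source file at build time; the fallback to the temp directory exists only so the sketch runs outside a Cargo build:

```rust
// build.rs sketch: generate a Rust source file into OUT_DIR.
use std::{env, fs, path::Path};

fn generate_table() -> String {
    // Build a small lookup table at compile time instead of hand-writing it.
    let values: Vec<String> = (0..8u64).map(|i| (i * i).to_string()).collect();
    format!("pub const SQUARES: &[u64] = &[{}];\n", values.join(", "))
}

fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    // Cargo sets OUT_DIR; the fallback is only for running this sketch standalone.
    let out_dir = env::var("OUT_DIR")
        .unwrap_or_else(|_| env::temp_dir().display().to_string());
    let dest = Path::new(&out_dir).join("generated.rs");
    fs::write(&dest, generate_table()).expect("failed to write generated.rs");
}
```

The crate then pulls the file in exactly as in the protobuf pattern above: `include!(concat!(env!("OUT_DIR"), "/generated.rs"));`.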
Pattern 4: Linking System Libraries with pkg-config
For system libraries that provide .pc files (systemd, OpenSSL, libpci),
the pkg-config crate probes the system and
emits the right link instructions:
# Cargo.toml
[build-dependencies]
pkg-config = "0.3"
// build.rs
fn main() {
// Probe for libpci (used for PCIe device enumeration)
pkg_config::Config::new()
.atleast_version("3.6.0")
.probe("libpci")
.expect("libpci >= 3.6.0 not found — install pciutils-dev");
// Probe for libsystemd (optional — for sd_notify integration)
if pkg_config::probe_library("libsystemd").is_ok() {
println!("cargo::rustc-cfg=has_systemd");
}
}
#![allow(unused)]
fn main() {
// src/lib.rs — conditional compilation based on pkg-config probing
#[cfg(has_systemd)]
mod systemd_notify {
extern "C" {
fn sd_notify(unset_environment: i32, state: *const std::ffi::c_char) -> i32;
}
pub fn notify_ready() {
let state = std::ffi::CString::new("READY=1").unwrap();
// SAFETY: state is a valid null-terminated C string.
unsafe { sd_notify(0, state.as_ptr()) };
}
}
#[cfg(not(has_systemd))]
mod systemd_notify {
pub fn notify_ready() {
// no-op on systems without systemd
}
}
}
Pattern 5: Feature Detection and Conditional Compilation
Build scripts can probe the compilation environment and set cfg flags that the main crate uses for conditional code paths.
CPU architecture and OS detection (safe — these are compile-time constants):
// build.rs — detect CPU features and OS capabilities
fn main() {
println!("cargo::rerun-if-changed=build.rs");
let target = std::env::var("TARGET").unwrap();
let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap();
// Enable AVX2-optimized paths on x86_64
if target.starts_with("x86_64") {
println!("cargo::rustc-cfg=has_x86_64");
}
// Enable ARM NEON paths on aarch64
if target.starts_with("aarch64") {
println!("cargo::rustc-cfg=has_aarch64");
}
// Detect if /dev/ipmi0 is available (build-time check)
if target_os == "linux" && std::path::Path::new("/dev/ipmi0").exists() {
println!("cargo::rustc-cfg=has_ipmi_device");
}
}
⚠️ Anti-pattern demonstration — The code below shows a tempting but problematic approach. Do not use this in production.
// build.rs — BAD: runtime hardware detection at build time
fn main() {
// ANTI-PATTERN: Binary is baked to the BUILD machine's hardware.
// If you build on a machine with a GPU and deploy to one without,
// the binary silently assumes a GPU is present.
if std::process::Command::new("accel-query")
.arg("--query-gpu=name")
.arg("--format=csv,noheader")
.output()
.is_ok()
{
println!("cargo::rustc-cfg=has_accel_device");
}
}
#![allow(unused)]
fn main() {
// src/gpu.rs — code that adapts based on build-time detection
pub fn query_gpu_info() -> GpuResult {
#[cfg(has_accel_device)]
{
run_accel_query()
}
#[cfg(not(has_accel_device))]
{
GpuResult::NotAvailable("accel-query not found at build time".into())
}
}
}
⚠️ Why this is wrong: Runtime device detection is almost always better than build-time detection for optional hardware. The binary produced above is tied to the build machine’s hardware configuration — it will behave differently on the deployment target. Use build-time detection only for capabilities that are truly fixed at compile time (architecture, OS, library availability). For hardware like GPUs, detect at runtime with
which accel-queryoraccel-mgmtprobing.
Anti-Patterns and Pitfalls
| Anti-Pattern | Why It’s Bad | Fix |
|---|---|---|
| No `rerun-if-changed` | `build.rs` runs on every build, slowing iteration | Always emit at least `cargo::rerun-if-changed=build.rs` |
| Network calls in `build.rs` | Builds fail offline, non-reproducible | Vendor files or use a separate fetch step |
| Writing to `src/` | Cargo doesn’t expect source to change during build | Write to `OUT_DIR` and use `include!()` |
| Heavy computation | Slows every `cargo build` | Cache results in `OUT_DIR`, gate with `rerun-if-changed` |
| Ignoring cross-compilation | Using `Command::new("gcc")` without respecting `$CC` | Use the `cc` crate, which handles cross-compilation toolchains |
| Panicking without context | `unwrap()` gives opaque “build script failed” error | Use `.expect("descriptive message")` or print `cargo::warning=` |
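As a sketch of the “cache results in `OUT_DIR`” fix for the heavy-computation row — `data/tables.csv` and the generation step are hypothetical placeholders, and the temp-dir fallback exists only so the sketch runs outside Cargo:

```rust
// build.rs sketch: cache a heavy artifact in OUT_DIR, gated by rerun-if-changed.
use std::{env, fs, path::Path};

fn expensive_generate() -> Vec<u8> {
    // Stand-in for real work (codegen, compression, table building, ...).
    (0u8..64).collect()
}

fn main() {
    // Re-run only when the input data (or this script) changes.
    println!("cargo::rerun-if-changed=build.rs");
    println!("cargo::rerun-if-changed=data/tables.csv");
    let out_dir = env::var("OUT_DIR")
        .unwrap_or_else(|_| env::temp_dir().display().to_string());
    let cached = Path::new(&out_dir).join("tables.bin");
    if !cached.exists() {
        // The expensive step only runs when the cache is cold.
        fs::write(&cached, expensive_generate()).expect("failed to write cache");
    }
}
```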
Application: Embedding Build Metadata
The project currently uses env!("CARGO_PKG_VERSION") for version
reporting. A build script would extend this with richer metadata:
// build.rs — proposed addition
fn main() {
println!("cargo::rerun-if-changed=.git/HEAD");
println!("cargo::rerun-if-changed=.git/refs");
println!("cargo::rerun-if-changed=build.rs");
// Embed git hash for traceability in diagnostic reports
if let Ok(output) = std::process::Command::new("git")
.args(["rev-parse", "--short=10", "HEAD"])
.output()
{
let hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
println!("cargo::rustc-env=APP_GIT_HASH={hash}");
} else {
println!("cargo::rustc-env=APP_GIT_HASH=unknown");
}
// Embed build timestamp for report correlation
let timestamp = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs().to_string())
.unwrap_or_else(|_| "0".into());
println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");
// Emit target triple — useful in multi-arch deployment
let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=APP_TARGET={target}");
}
#![allow(unused)]
fn main() {
// src/version.rs — consuming the metadata
pub struct BuildInfo {
pub version: &'static str,
pub git_hash: &'static str,
pub build_epoch: &'static str,
pub target: &'static str,
}
pub const BUILD_INFO: BuildInfo = BuildInfo {
version: env!("CARGO_PKG_VERSION"),
git_hash: env!("APP_GIT_HASH"),
build_epoch: env!("APP_BUILD_EPOCH"),
target: env!("APP_TARGET"),
};
impl BuildInfo {
/// Parse the epoch at runtime when needed (const &str → u64 is not
/// possible on stable Rust — there is no const fn for str-to-int).
pub fn build_epoch_secs(&self) -> u64 {
self.build_epoch.parse().unwrap_or(0)
}
}
impl std::fmt::Display for BuildInfo {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(
f,
"DiagTool v{} (git:{} target:{})",
self.version, self.git_hash, self.target
)
}
}
}
Key insight from the project: The codebase has zero `build.rs` files across all of its crates because it’s pure Rust — no C dependencies, no codegen, no system library linking. When you need these, `build.rs` is the tool — but don’t add it “just because.” The absence of build scripts in a large codebase is a positive signal of clean architecture, not a gap. See Dependency Management for how the project manages its supply chain without custom build logic.
Try It Yourself
- Embed git metadata: Create a `build.rs` that emits `APP_GIT_HASH` and `APP_BUILD_EPOCH` as environment variables. Consume them with `env!()` in `main.rs` and print the build info. Verify the hash changes after a commit.
- Probe a system library: Write a `build.rs` that uses `pkg-config` to probe for `libz` (zlib). Emit `cargo::rustc-cfg=has_zlib` if found. In `main.rs`, conditionally print “zlib available” or “zlib not found” based on the cfg flag.
- Observe rerun behavior: Remove the `rerun-if-changed` line from your `build.rs` and observe how many times it reruns during `cargo build` and `cargo test`. Then add it back and compare.
Reproducible Builds
Chapter 1 teaches embedding timestamps and git hashes into binaries. This is useful for traceability, but it conflicts with reproducible builds — the property that building the same source always produces the same binary.
The tension:
| Goal | Achievement | Cost |
|---|---|---|
| Traceability | APP_BUILD_EPOCH in binary | Every build is unique — can’t verify integrity |
| Reproducibility | cargo build --locked always produces same output | No build-time metadata |
Practical resolution:
# 1. Always use --locked in CI (ensures Cargo.lock is respected)
cargo build --release --locked
# Fails if Cargo.lock is missing or outdated — catches "works on my machine"
# 2. For reproducibility-critical builds, set SOURCE_DATE_EPOCH
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) cargo build --release --locked
# Uses the last commit timestamp instead of "now" — same commit = same binary
#![allow(unused)]
fn main() {
// In build.rs: respect SOURCE_DATE_EPOCH for reproducibility
let timestamp = std::env::var("SOURCE_DATE_EPOCH")
.unwrap_or_else(|_| {
std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs().to_string())
.unwrap_or_else(|_| "0".into())
});
println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");
}
Best practice: Use
SOURCE_DATE_EPOCHin build scripts so release builds are reproducible (git-hash + locked deps + deterministic timestamp = same binary), while dev builds still get live timestamps for convenience.
Build Pipeline Decision Diagram
flowchart TD
START["Need compile-time work?"] -->|No| SKIP["No build.rs needed"]
START -->|Yes| WHAT{"What kind?"}
WHAT -->|"Embed metadata"| P1["Pattern 1\nCompile-Time Constants"]
WHAT -->|"Compile C/C++"| P2["Pattern 2\ncc crate"]
WHAT -->|"Code generation"| P3["Pattern 3\nprost-build / tonic-build"]
WHAT -->|"Link system lib"| P4["Pattern 4\npkg-config"]
WHAT -->|"Detect features"| P5["Pattern 5\ncfg flags"]
P1 --> RERUN["Always emit\ncargo::rerun-if-changed"]
P2 --> RERUN
P3 --> RERUN
P4 --> RERUN
P5 --> RERUN
style SKIP fill:#91e5a3,color:#000
style RERUN fill:#ffd43b,color:#000
style P1 fill:#e3f2fd,color:#000
style P2 fill:#e3f2fd,color:#000
style P3 fill:#e3f2fd,color:#000
style P4 fill:#e3f2fd,color:#000
style P5 fill:#e3f2fd,color:#000
🏋️ Exercises
🟢 Exercise 1: Version Stamp
Create a minimal crate with a build.rs that embeds the current git hash and build profile into environment variables. Print them from main(). Verify the output changes between debug and release builds.
Solution
// build.rs
fn main() {
println!("cargo::rerun-if-changed=.git/HEAD");
println!("cargo::rerun-if-changed=build.rs");
let hash = std::process::Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
.unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=GIT_HASH={hash}");
println!("cargo::rustc-env=BUILD_PROFILE={}", std::env::var("PROFILE").unwrap_or_default());
}
// src/main.rs
fn main() {
println!("{} v{} (git:{} profile:{})",
env!("CARGO_PKG_NAME"),
env!("CARGO_PKG_VERSION"),
env!("GIT_HASH"),
env!("BUILD_PROFILE"),
);
}
cargo run # shows profile:debug
cargo run --release # shows profile:release
🟡 Exercise 2: Conditional System Library
Write a build.rs that probes for both libz and libpci using pkg-config. Emit a cfg flag for each one found. In main.rs, print which libraries were detected at build time.
Solution
# Cargo.toml
[build-dependencies]
pkg-config = "0.3"
// build.rs
fn main() {
println!("cargo::rerun-if-changed=build.rs");
if pkg_config::probe_library("zlib").is_ok() {
println!("cargo::rustc-cfg=has_zlib");
}
if pkg_config::probe_library("libpci").is_ok() {
println!("cargo::rustc-cfg=has_libpci");
}
}
// src/main.rs
fn main() {
#[cfg(has_zlib)]
println!("✅ zlib detected");
#[cfg(not(has_zlib))]
println!("❌ zlib not found");
#[cfg(has_libpci)]
println!("✅ libpci detected");
#[cfg(not(has_libpci))]
println!("❌ libpci not found");
}
Key Takeaways
- `build.rs` runs on the host at compile time — always emit `cargo::rerun-if-changed` to avoid unnecessary rebuilds
- Use the `cc` crate (not raw `gcc` commands) for C/C++ compilation — it handles cross-compilation toolchains correctly
- Write generated files to `OUT_DIR`, never to `src/` — Cargo doesn’t expect source to change during builds
- Prefer runtime detection over build-time detection for optional hardware
- Use `SOURCE_DATE_EPOCH` to make builds reproducible when embedding timestamps
Cross-Compilation — One Source, Many Targets 🟡
What you’ll learn:
- How Rust target triples work and how to add them with `rustup`
- Building static musl binaries for container/cloud deployment
- Cross-compiling to ARM (aarch64) with native toolchains, `cross`, and `cargo-zigbuild`
- Setting up GitHub Actions matrix builds for multi-architecture CI

Cross-references: Build Scripts — `build.rs` runs on HOST during cross-compilation · Release Profiles — LTO and strip settings for cross-compiled release binaries · Windows — Windows cross-compilation and `no_std` targets
Cross-compilation means building an executable on one machine (the host) that
runs on a different machine (the target). The host might be your x86_64 laptop;
the target might be an ARM server, a musl-based container, or even a Windows machine.
Rust makes this remarkably feasible because rustc is already a cross-compiler —
it just needs the right target libraries and a compatible linker.
The Target Triple Anatomy
Every Rust compilation target is identified by a target triple (which often has four parts despite the name):
<arch>-<vendor>-<os>-<env>
Examples:
x86_64 - unknown - linux - gnu ← standard Linux (glibc)
x86_64 - unknown - linux - musl ← static Linux (musl libc)
aarch64 - unknown - linux - gnu ← ARM 64-bit Linux
x86_64 - pc      - windows - msvc  ← Windows with MSVC
aarch64 - apple - darwin ← macOS on Apple Silicon
x86_64 - unknown - none ← bare metal (no OS)
List all available targets:
# Show all targets rustc can compile to (~250 targets)
rustc --print target-list | wc -l
# Show installed targets on your system
rustup target list --installed
# Show current default target
rustc -vV | grep host
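A related sketch: a binary can report the target it was compiled for via `std::env::consts`. These values are baked in at compile time, which is exactly why a cross-compiled binary reports the target triple’s components, not the build host’s:

```rust
// Sketch: inspect the compiled-FOR platform from Rust code.
fn compiled_for() -> String {
    format!(
        "{}-{} ({} family)",
        std::env::consts::ARCH,   // e.g. "x86_64", "aarch64"
        std::env::consts::OS,     // e.g. "linux", "windows", "macos"
        std::env::consts::FAMILY, // "unix" or "windows"
    )
}

fn main() {
    println!("{}", compiled_for());
}
```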
Installing Toolchains with rustup
# Add target libraries (Rust std for that target)
rustup target add x86_64-unknown-linux-musl
rustup target add aarch64-unknown-linux-gnu
# Now you can cross-compile:
cargo build --target x86_64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu # needs a linker — see below
What rustup target add gives you: the pre-compiled std, core, and alloc
libraries for that target. It does not give you a C linker or C library. For targets
that need a C toolchain (most gnu targets), you need to install one separately.
# Ubuntu/Debian — install the cross-linker for aarch64
sudo apt install gcc-aarch64-linux-gnu
# Ubuntu/Debian — install musl toolchain for static builds
sudo apt install musl-tools
# Fedora
sudo dnf install gcc-aarch64-linux-gnu
.cargo/config.toml — Per-Target Configuration
Instead of passing --target on every command, configure defaults in
.cargo/config.toml at your project root or home directory:
# .cargo/config.toml
# Default target for this project (optional — omit to keep native default)
# [build]
# target = "x86_64-unknown-linux-musl"
# Linker for aarch64 cross-compilation
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
rustflags = ["-C", "target-feature=+crc"]
# Linker for musl static builds (usually just the system gcc works)
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
rustflags = ["-C", "target-feature=+crc,+aes"]
# ARM 32-bit (Raspberry Pi, embedded)
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"
# Environment variables for all targets
[env]
# Example: set a custom sysroot
# SYSROOT = "/opt/cross/sysroot"
Config file search order (first match wins):
1. `<project>/.cargo/config.toml`
2. `<project>/../.cargo/config.toml` (parent directories, walking up)
3. `$CARGO_HOME/config.toml` (usually `~/.cargo/config.toml`)
Static Binaries with musl
For deploying to minimal containers (Alpine, scratch Docker images) or systems where you can’t control the glibc version, build with musl:
# Install musl target
rustup target add x86_64-unknown-linux-musl
sudo apt install musl-tools # provides musl-gcc
# Build a fully static binary
cargo build --release --target x86_64-unknown-linux-musl
# Verify it's static
file target/x86_64-unknown-linux-musl/release/diag_tool
# → ELF 64-bit LSB executable, x86-64, statically linked
ldd target/x86_64-unknown-linux-musl/release/diag_tool
# → not a dynamic executable
Static vs dynamic trade-offs:
| Aspect | glibc (dynamic) | musl (static) |
|---|---|---|
| Binary size | Smaller (shared libs) | Larger (~5-15 MB increase) |
| Portability | Needs matching glibc version | Runs anywhere on Linux |
| DNS resolution | Full nsswitch support | Basic resolver (no mDNS) |
| Deployment | Needs sysroot or container | Single binary, no deps |
| Performance | Slightly faster malloc | Slightly slower malloc |
| dlopen() support | Yes | No |
For the project: A static musl build is ideal for deployment to diverse server hardware where you can’t guarantee the host OS version. The single-binary deployment model eliminates “works on my machine” issues.
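A minimal sketch of the scratch-container deployment this enables — the binary name `diag_tool` follows the verification example above:

```dockerfile
# Dockerfile sketch — a static musl binary needs no base image at all
FROM scratch
COPY target/x86_64-unknown-linux-musl/release/diag_tool /diag_tool
ENTRYPOINT ["/diag_tool"]
```

Because the binary is statically linked, there is no libc, shell, or package manager in the image — the attack surface and image size are both minimal.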
Cross-Compiling to ARM (aarch64)
ARM servers (AWS Graviton, Ampere Altra, Grace) are increasingly common in data centers. Cross-compiling for aarch64 from an x86_64 host:
# Step 1: Install target + cross-linker
rustup target add aarch64-unknown-linux-gnu
sudo apt install gcc-aarch64-linux-gnu
# Step 2: Configure linker in .cargo/config.toml (see above)
# Step 3: Build
cargo build --release --target aarch64-unknown-linux-gnu
# Step 4: Verify the binary
file target/aarch64-unknown-linux-gnu/release/diag_tool
# → ELF 64-bit LSB executable, ARM aarch64
Running tests for the target architecture requires either:
- An actual ARM machine
- QEMU user-mode emulation
# Install QEMU user-mode (runs ARM binaries on x86_64)
sudo apt install qemu-user qemu-user-static binfmt-support
# Now cargo test can run cross-compiled tests through QEMU
cargo test --target aarch64-unknown-linux-gnu
# (Slow — each test binary is emulated. Use for CI validation, not daily dev.)
Configure QEMU as the test runner in .cargo/config.toml:
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64-static -L /usr/aarch64-linux-gnu"
The cross Tool — Docker-Based Cross-Compilation
The cross tool provides a zero-setup
cross-compilation experience using pre-configured Docker images:
# Install cross (from crates.io — stable releases)
cargo install cross
# Or from git for latest features (less stable):
# cargo install cross --git https://github.com/cross-rs/cross
# Cross-compile — no toolchain setup needed!
cross build --release --target aarch64-unknown-linux-gnu
cross build --release --target x86_64-unknown-linux-musl
cross build --release --target armv7-unknown-linux-gnueabihf
# Cross-test — QEMU included in the Docker image
cross test --target aarch64-unknown-linux-gnu
How it works: cross replaces cargo and runs the build inside a Docker
container that has the correct cross-compilation toolchain pre-installed. Your
source is mounted into the container, and the output goes to your normal target/
directory.
Customizing the Docker image with Cross.toml:
# Cross.toml
[target.aarch64-unknown-linux-gnu]
# Use a custom Docker image with extra system libraries
image = "my-registry/cross-aarch64:latest"
# Pre-install system packages
pre-build = [
"dpkg --add-architecture arm64",
"apt-get update && apt-get install -y libpci-dev:arm64"
]
[target.aarch64-unknown-linux-gnu.env]
# Pass environment variables into the container
passthrough = ["CI", "GITHUB_TOKEN"]
cross requires Docker (or Podman) but eliminates the need to manually install
cross-compilers, sysroots, and QEMU. It’s the recommended approach for CI.
Using Zig as a Cross-Compilation Linker
Zig bundles a C compiler and cross-compilation sysroot for ~40 targets in a single ~40 MB download. This makes it a remarkably convenient cross-linker for Rust:
# Install Zig (single binary, no package manager needed)
# Download from https://ziglang.org/download/
# Or via package manager:
sudo snap install zig --classic --beta # Ubuntu
brew install zig # macOS
# Install cargo-zigbuild
cargo install cargo-zigbuild
Why Zig? The key advantage is glibc version targeting. Zig lets you specify the exact glibc version to link against, ensuring your binary runs on older Linux distributions:
# Build for glibc 2.17 (CentOS 7 / RHEL 7 compatibility)
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
# Build for aarch64 with glibc 2.28 (Ubuntu 18.04+)
cargo zigbuild --release --target aarch64-unknown-linux-gnu.2.28
# Build for musl (fully static)
cargo zigbuild --release --target x86_64-unknown-linux-musl
The .2.17 suffix is a Zig extension — it tells Zig’s linker to use glibc 2.17
symbol versions, so the resulting binary runs on CentOS 7 and later. No Docker,
no sysroot management, no cross-compiler installation.
Comparison: cross vs cargo-zigbuild vs manual:
| Feature | Manual | cross | cargo-zigbuild |
|---|---|---|---|
| Setup effort | High (install toolchain per target) | Low (needs Docker) | Low (single binary) |
| Docker required | No | Yes | No |
| glibc version targeting | No (uses host glibc) | No (uses container glibc) | Yes (exact version) |
| Test execution | Needs QEMU | Included | Needs QEMU |
| macOS → Linux | Difficult | Easy | Easy |
| Linux → macOS | Very difficult | Not supported | Limited |
| Binary size overhead | None | None | None |
CI Pipeline: GitHub Actions Matrix
A production-grade CI workflow that builds for multiple targets:
# .github/workflows/cross-build.yml
name: Cross-Platform Build
on: [push, pull_request]
env:
CARGO_TERM_COLOR: always
jobs:
build:
strategy:
matrix:
include:
- target: x86_64-unknown-linux-gnu
os: ubuntu-latest
name: linux-x86_64
- target: x86_64-unknown-linux-musl
os: ubuntu-latest
name: linux-x86_64-static
- target: aarch64-unknown-linux-gnu
os: ubuntu-latest
name: linux-aarch64
use_cross: true
- target: x86_64-pc-windows-msvc
os: windows-latest
name: windows-x86_64
runs-on: ${{ matrix.os }}
name: Build (${{ matrix.name }})
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- name: Install musl tools
if: matrix.target == 'x86_64-unknown-linux-musl'
run: sudo apt-get install -y musl-tools
- name: Install cross
if: matrix.use_cross
run: cargo install cross
- name: Build (native)
if: "!matrix.use_cross"
run: cargo build --release --target ${{ matrix.target }}
- name: Build (cross)
if: matrix.use_cross
run: cross build --release --target ${{ matrix.target }}
- name: Run tests
if: "!matrix.use_cross"
run: cargo test --target ${{ matrix.target }}
- name: Upload artifact
uses: actions/upload-artifact@v4
with:
name: diag_tool-${{ matrix.name }}
path: target/${{ matrix.target }}/release/diag_tool*
Application: Multi-Architecture Server Builds
The binary currently has no cross-compilation setup. For a hardware diagnostics tool deployed across diverse server fleets, the recommended addition:
my_workspace/
├── .cargo/
│ └── config.toml ← linker configs per target
├── Cross.toml ← cross tool configuration
└── .github/workflows/
└── cross-build.yml ← CI matrix for 3 targets
Recommended .cargo/config.toml:
# .cargo/config.toml for the project
# Release profile optimizations (already in Cargo.toml, shown for reference)
# [profile.release]
# lto = true
# codegen-units = 1
# panic = "abort"
# strip = true
# aarch64 for ARM servers (Graviton, Ampere, Grace)
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
# musl for portable static binaries
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
Recommended build targets:
| Target | Use Case | Deploy To |
|---|---|---|
| x86_64-unknown-linux-gnu | Default native build | Standard x86 servers |
| x86_64-unknown-linux-musl | Static binary, any distro | Containers, minimal hosts |
| aarch64-unknown-linux-gnu | ARM servers | Graviton, Ampere, Grace |
Key insight: The `[profile.release]` in the workspace’s root `Cargo.toml` already has `lto = true`, `codegen-units = 1`, `panic = "abort"`, and `strip = true` — an ideal release profile for cross-compiled deployment binaries (see Release Profiles for the full impact table). Combined with musl, this produces a single ~10 MB static binary with no runtime dependencies.
Troubleshooting Cross-Compilation
| Symptom | Cause | Fix |
|---|---|---|
| linker 'aarch64-linux-gnu-gcc' not found | Missing cross-linker toolchain | sudo apt install gcc-aarch64-linux-gnu |
| cannot find -lssl (musl target) | System OpenSSL is glibc-linked | Use vendored feature: openssl = { version = "0.10", features = ["vendored"] } |
| build.rs runs wrong binary | build.rs runs on HOST, not target | Check CARGO_CFG_TARGET_OS in build.rs, not cfg!(target_os) |
| Tests pass locally, fail in cross | Docker image missing test fixtures | Mount test data via Cross.toml: [build.env] volumes = ["./TestArea:/TestArea"] |
| undefined reference to __cxa_thread_atexit_impl | Old glibc on target | Use cargo-zigbuild with explicit glibc version: --target x86_64-unknown-linux-gnu.2.17 |
| Binary segfaults on ARM | Compiled for wrong ARM variant | Verify target triple matches hardware: aarch64-unknown-linux-gnu for 64-bit ARM |
| GLIBC_2.XX not found at runtime | Build machine has newer glibc | Use musl for static builds, or cargo-zigbuild for glibc version pinning |
Cross-Compilation Decision Tree
flowchart TD
START["Need to cross-compile?"] --> STATIC{"Static binary?"}
STATIC -->|Yes| MUSL["musl target\n--target x86_64-unknown-linux-musl"]
STATIC -->|No| GLIBC{"Need old glibc?"}
GLIBC -->|Yes| ZIG["cargo-zigbuild\n--target x86_64-unknown-linux-gnu.2.17"]
GLIBC -->|No| ARCH{"Target arch?"}
ARCH -->|"Same arch"| NATIVE["Native toolchain\nrustup target add + linker"]
ARCH -->|"ARM/other"| DOCKER{"Docker available?"}
DOCKER -->|Yes| CROSS["cross build\nDocker-based, zero setup"]
DOCKER -->|No| MANUAL["Manual sysroot\napt install gcc-aarch64-linux-gnu"]
style MUSL fill:#91e5a3,color:#000
style ZIG fill:#91e5a3,color:#000
style CROSS fill:#91e5a3,color:#000
style NATIVE fill:#e3f2fd,color:#000
style MANUAL fill:#ffd43b,color:#000
🏋️ Exercises
🟢 Exercise 1: Static musl Binary
Build any Rust binary for x86_64-unknown-linux-musl. Verify it’s statically linked using file and ldd.
Solution
rustup target add x86_64-unknown-linux-musl
cargo new hello-static && cd hello-static
cargo build --release --target x86_64-unknown-linux-musl
# Verify
file target/x86_64-unknown-linux-musl/release/hello-static
# Output: ... statically linked ...
ldd target/x86_64-unknown-linux-musl/release/hello-static
# Output: not a dynamic executable
🟡 Exercise 2: GitHub Actions Cross-Build Matrix
Write a GitHub Actions workflow that builds a Rust project for three targets: x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, and aarch64-unknown-linux-gnu. Use a matrix strategy.
Solution
name: Cross-build
on: [push]
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
target:
- x86_64-unknown-linux-gnu
- x86_64-unknown-linux-musl
- aarch64-unknown-linux-gnu
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- name: Install cross
run: cargo install cross --locked
- name: Build
run: cross build --release --target ${{ matrix.target }}
- uses: actions/upload-artifact@v4
with:
name: binary-${{ matrix.target }}
path: target/${{ matrix.target }}/release/my-binary
Key Takeaways
- Rust’s `rustc` is already a cross-compiler — you just need the right target and linker
- musl produces fully static binaries with zero runtime dependencies — ideal for containers
- `cargo-zigbuild` solves the “which glibc version” problem for enterprise Linux targets
- `cross` is the easiest path for ARM and other exotic targets — Docker handles the sysroot
- Always test with `file` and `ldd` to verify the binary matches your deployment target
Benchmarking — Measuring What Matters 🟡
What you’ll learn:
- Why naive timing with `Instant::now()` produces unreliable results
- Statistical benchmarking with Criterion.rs and the lighter Divan alternative
- Profiling hot spots with `perf`, flamegraphs, and PGO
- Setting up continuous benchmarking in CI to catch regressions automatically
Cross-references: Release Profiles — once you find the hot spot, optimize the binary · CI/CD Pipeline — benchmark job in the pipeline · Code Coverage — coverage tells you what’s tested, benchmarks tell you what’s fast
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” — Donald Knuth
The hard part isn’t writing benchmarks — it’s writing benchmarks that produce meaningful, reproducible, actionable numbers. This chapter covers the tools and techniques that get you from “it seems fast” to “we have statistical evidence that PR #347 regressed parsing throughput by 4.2%.”
Why Not std::time::Instant?
The temptation:
// ❌ Naive benchmarking — unreliable results
use std::time::Instant;
fn main() {
let start = Instant::now();
let result = parse_device_query_output(&sample_data);
let elapsed = start.elapsed();
println!("Parsing took {:?}", elapsed);
// Problem 1: Compiler may optimize away `result` (dead code elimination)
// Problem 2: Single sample — no statistical significance
// Problem 3: CPU frequency scaling, thermal throttling, other processes
// Problem 4: Cold cache vs warm cache not controlled
}
Problems with manual timing:
- Dead code elimination — the compiler may skip the computation entirely if the result isn’t used.
- No warm-up — the first run includes cache misses, JIT effects (irrelevant in Rust, but OS page faults apply), and lazy initialization.
- No statistical analysis — a single measurement tells you nothing about variance, outliers, or confidence intervals.
- No regression detection — you can’t compare against previous runs.
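If you do need a quick number before reaching for a framework, a hand-rolled loop can at least mitigate the last three problems: warm up first, take many samples, wrap input and output in `black_box`, and report the median rather than a single reading. A minimal sketch (the `workload` function is a stand-in, not from the codebase):

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Stand-in workload: any pure function under test would go here.
fn workload(n: u64) -> u64 {
    (0..n).map(|i| i * i).sum()
}

// Warm up, take many samples, and report the median: robust to a few
// outlier samples, though still no substitute for real statistics.
fn time_median(samples: usize, warmup: usize) -> Duration {
    for _ in 0..warmup {
        black_box(workload(black_box(10_000))); // fill caches, fault in pages
    }
    let mut times: Vec<Duration> = (0..samples)
        .map(|_| {
            let start = Instant::now();
            // black_box on input and output defeats constant folding and DCE
            black_box(workload(black_box(10_000)));
            start.elapsed()
        })
        .collect();
    times.sort();
    times[times.len() / 2]
}

fn main() {
    println!("median of 100 samples: {:?}", time_median(100, 10));
}
```

Even this gives you no outlier classification and no run-to-run regression comparison; treat it as a smoke test, not a benchmark.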
Criterion.rs — Statistical Benchmarking
Criterion.rs is the de facto standard for Rust micro-benchmarks. It uses statistical methods to produce reliable measurements and detects performance regressions automatically.
Setup:
# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports", "cargo_bench_support"] }
[[bench]]
name = "parsing_bench"
harness = false # Use Criterion's harness, not the built-in test harness
A complete benchmark:
#![allow(unused)]
fn main() {
// benches/parsing_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
/// Data type for parsed GPU information
#[derive(Debug, Clone)]
struct GpuInfo {
index: u32,
name: String,
temp_c: u32,
power_w: f64,
}
/// The function under test — simulate parsing device-query CSV output
fn parse_gpu_csv(input: &str) -> Vec<GpuInfo> {
input
.lines()
.filter(|line| !line.starts_with('#'))
.filter_map(|line| {
let fields: Vec<&str> = line.split(", ").collect();
if fields.len() >= 4 {
Some(GpuInfo {
index: fields[0].parse().ok()?,
name: fields[1].to_string(),
temp_c: fields[2].parse().ok()?,
power_w: fields[3].parse().ok()?,
})
} else {
None
}
})
.collect()
}
fn bench_parse_gpu_csv(c: &mut Criterion) {
// Representative test data
let small_input = "0, Acme Accel-V1-80GB, 32, 65.5\n\
1, Acme Accel-V1-80GB, 34, 67.2\n";
let large_input = (0..64)
.map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
.collect::<String>();
c.bench_function("parse_2_gpus", |b| {
b.iter(|| parse_gpu_csv(black_box(small_input)))
});
c.bench_function("parse_64_gpus", |b| {
b.iter(|| parse_gpu_csv(black_box(&large_input)))
});
}
criterion_group!(benches, bench_parse_gpu_csv);
criterion_main!(benches);
}
Running and reading results:
# Run all benchmarks
cargo bench
# Run a specific benchmark by name
cargo bench -- parse_64
# Output:
# parse_2_gpus time: [1.2345 µs 1.2456 µs 1.2578 µs]
# ▲ ▲ ▲
# │ confidence interval
# lower 95% median upper 95%
#
# parse_64_gpus time: [38.123 µs 38.456 µs 38.812 µs]
# change: [-1.2345% -0.5678% +0.1234%] (p = 0.12 > 0.05)
# No change in performance detected.
What black_box() does: It’s a compiler hint that prevents dead-code
elimination and over-aggressive constant folding. The compiler cannot see
through black_box, so it must actually compute the result.
Parameterized Benchmarks and Benchmark Groups
Compare multiple implementations or input sizes:
#![allow(unused)]
fn main() {
// benches/comparison_bench.rs
use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId, Throughput};
fn bench_parsing_strategies(c: &mut Criterion) {
let mut group = c.benchmark_group("csv_parsing");
// Test across different input sizes
for num_gpus in [1, 8, 32, 64, 128] {
let input = generate_gpu_csv(num_gpus);
// Set throughput for bytes-per-second reporting
group.throughput(Throughput::Bytes(input.len() as u64));
group.bench_with_input(
BenchmarkId::new("split_based", num_gpus),
&input,
|b, input| b.iter(|| parse_split(input)),
);
group.bench_with_input(
BenchmarkId::new("regex_based", num_gpus),
&input,
|b, input| b.iter(|| parse_regex(input)),
);
group.bench_with_input(
BenchmarkId::new("nom_based", num_gpus),
&input,
|b, input| b.iter(|| parse_nom(input)),
);
}
group.finish();
}
criterion_group!(benches, bench_parsing_strategies);
criterion_main!(benches);
}
Output: Criterion generates an HTML report at target/criterion/report/index.html
with violin plots, comparison charts, and regression analysis — open in a browser.
Divan — A Lighter Alternative
Divan is a newer benchmarking framework that uses attribute macros instead of Criterion’s macro DSL:
# Cargo.toml
[dev-dependencies]
divan = "0.1"
[[bench]]
name = "parsing_bench"
harness = false
// benches/parsing_bench.rs
use divan::black_box;
const SMALL_INPUT: &str = "0, Acme Accel-V1-80GB, 32, 65.5\n\
1, Acme Accel-V1-80GB, 34, 67.2\n";
fn generate_gpu_csv(n: usize) -> String {
(0..n)
.map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
.collect()
}
fn main() {
divan::main();
}
#[divan::bench]
fn parse_2_gpus() -> Vec<GpuInfo> {
parse_gpu_csv(black_box(SMALL_INPUT))
}
#[divan::bench(args = [1, 8, 32, 64, 128])]
fn parse_n_gpus(n: usize) -> Vec<GpuInfo> {
let input = generate_gpu_csv(n);
parse_gpu_csv(black_box(&input))
}
// Divan output is a clean table:
// ╰─ parse_2_gpus fastest │ slowest │ median │ mean │ samples │ iters
// 1.234 µs │ 1.567 µs │ 1.345 µs │ 1.350 µs │ 100 │ 1600
When to choose Divan over Criterion:
- Simpler API (attribute macros, less boilerplate)
- Faster compilation (fewer dependencies)
- Good for quick perf checks during development
When to choose Criterion:
- Statistical regression detection across runs
- HTML reports with charts
- Established ecosystem, more CI integrations
Profiling with perf and Flamegraphs
Benchmarks tell you how fast — profiling tells you where the time goes.
# Step 1: Build with debug info (release speed, debug symbols)
cargo build --release
# Ensure debug info is available:
# [profile.release]
# debug = true # Add this temporarily for profiling
# Step 2: Record with perf
perf record --call-graph=dwarf ./target/release/diag_tool --run-diagnostics
# Step 3: Generate a flamegraph
# Install: cargo install flamegraph
# Optional: cargo install addr2line --features=bin (speeds up cargo-flamegraph)
cargo flamegraph --root -- --run-diagnostics
# Writes an interactive flamegraph.svg (open it in a browser)
# Alternative: use perf + inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg
Reading a flamegraph:
- Width = time spent in that function (wider = slower)
- Height = call stack depth (taller ≠ slower, just deeper)
- Bottom = entry point, Top = leaf functions doing actual work
- Look for wide plateaus at the top — those are your hot spots
Profile-guided optimization (PGO):
# Step 1: Build with instrumentation
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release
# Step 2: Run representative workloads
./target/release/diag_tool --run-full # generates profiling data
# Step 3: Merge profiling data
# Use the llvm-profdata that matches rustc's LLVM version:
# $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-profdata
# Or if llvm-tools is installed: rustup component add llvm-tools
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data/
# Step 4: Rebuild with profiling feedback
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
# Typical improvement: 5-20% for compute-bound code (parsing, crypto, codegen).
# I/O-bound or syscall-heavy code (like a large project) will see much less benefit
# because the CPU is mostly waiting, not executing hot loops.
Tip: Before spending time on PGO, ensure your release profile already has LTO enabled — it typically delivers a bigger win for less effort.
hyperfine — Quick End-to-End Timing
hyperfine benchmarks entire commands,
not individual functions. It’s perfect for measuring overall binary performance:
# Install
cargo install hyperfine
# Or: sudo apt install hyperfine (Ubuntu 23.04+)
# Basic benchmark
hyperfine './target/release/diag_tool --run-diagnostics'
# Compare two implementations
hyperfine './target/release/diag_tool_v1 --run-diagnostics' \
'./target/release/diag_tool_v2 --run-diagnostics'
# Warm-up runs + minimum iterations
hyperfine --warmup 3 --min-runs 10 './target/release/diag_tool --run-all'
# Export results as JSON for CI comparison
hyperfine --export-json bench.json './target/release/diag_tool --run-all'
When to use hyperfine vs Criterion:
- `hyperfine`: whole-binary timing, comparing before/after a refactor, I/O-bound workloads
- Criterion: micro-benchmarks of individual functions, statistical regression detection
Continuous Benchmarking in CI
Detect performance regressions before they ship:
# .github/workflows/bench.yml
name: Benchmarks
on:
pull_request:
paths: ['**/*.rs', 'Cargo.toml', 'Cargo.lock']
jobs:
benchmark:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: Run benchmarks
# Requires criterion = { features = ["cargo_bench_support"] } for --output-format
run: cargo bench -- --output-format bencher | tee bench_output.txt
- name: Store benchmark result
uses: benchmark-action/github-action-benchmark@v1
with:
tool: 'cargo'
output-file-path: bench_output.txt
github-token: ${{ secrets.GITHUB_TOKEN }}
auto-push: true
alert-threshold: '120%' # Alert if 20% slower
comment-on-alert: true
fail-on-alert: true # Block PR if regression detected
Key CI considerations:
- Use dedicated benchmark runners (not shared CI) for consistent results
- Pin the runner to a specific machine type if using cloud CI
- Store historical data to detect gradual regressions
- Set thresholds based on your workload’s tolerance (5% for hot paths, 20% for cold)
Application: Parsing Performance
The project has several performance-sensitive parsing paths that would benefit from benchmarks:
| Parsing Hot Spot | Crate | Why It Matters |
|---|---|---|
| accelerator-query CSV/XML output | device_diag | Called per-GPU, up to 8× per run |
| Sensor event parsing | event_log | Thousands of records on busy servers |
| PCIe topology JSON | topology_lib | Complex nested structures, golden-file validated |
| Report JSON serialization | diag_framework | Final report output, size-sensitive |
| Config JSON loading | config_loader | Startup latency |
Recommended first benchmark — the topology parser, which already has golden-file test data:
#![allow(unused)]
fn main() {
// topology_lib/benches/parse_bench.rs (proposed)
use criterion::{criterion_group, criterion_main, Criterion, Throughput};
use std::fs;
fn bench_topology_parse(c: &mut Criterion) {
let mut group = c.benchmark_group("topology_parse");
for golden_file in ["S2001", "S1015", "S1035", "S1080"] {
let path = format!("tests/test_data/{golden_file}.json");
let data = fs::read_to_string(&path).expect("golden file not found");
group.throughput(Throughput::Bytes(data.len() as u64));
group.bench_function(golden_file, |b| {
b.iter(|| {
topology_lib::TopologyProfile::from_json_str(
criterion::black_box(&data)
)
});
});
}
group.finish();
}
criterion_group!(benches, bench_topology_parse);
criterion_main!(benches);
}
Try It Yourself
- Write a Criterion benchmark: Pick any parsing function in your codebase. Create a `benches/` directory, set up a Criterion benchmark that measures throughput in bytes/second. Run `cargo bench` and examine the HTML report.
- Generate a flamegraph: Build your project with `debug = true` in `[profile.release]`, then run `cargo flamegraph -- <your-args>`. Identify the three widest stacks at the top of the flamegraph — those are your hot spots.
- Compare with `hyperfine`: Install `hyperfine` and benchmark the overall execution time of your binary with different flags. Compare it to the per-function times from Criterion. Where does the time go that Criterion doesn’t see? (Answer: I/O, syscalls, process startup.)
Benchmark Tool Selection
flowchart TD
START["Want to measure performance?"] --> WHAT{"What level?"}
WHAT -->|"Single function"| CRITERION["Criterion.rs\nStatistical, regression detection"]
WHAT -->|"Quick function check"| DIVAN["Divan\nLighter, attribute macros"]
WHAT -->|"Whole binary"| HYPERFINE["hyperfine\nEnd-to-end, wall-clock"]
WHAT -->|"Find hot spots"| PERF["perf + flamegraph\nCPU sampling profiler"]
CRITERION --> CI_BENCH["Continuous benchmarking\nin GitHub Actions"]
PERF --> OPTIMIZE["Profile-Guided\nOptimization (PGO)"]
style CRITERION fill:#91e5a3,color:#000
style DIVAN fill:#91e5a3,color:#000
style HYPERFINE fill:#e3f2fd,color:#000
style PERF fill:#ffd43b,color:#000
style CI_BENCH fill:#e3f2fd,color:#000
style OPTIMIZE fill:#ffd43b,color:#000
🏋️ Exercises
🟢 Exercise 1: First Criterion Benchmark
Create a crate with a function that sorts a Vec<u64> of 10,000 random elements. Write a Criterion benchmark for it, then switch to .sort_unstable() and observe the performance difference in the HTML report.
Solution
# Cargo.toml
[[bench]]
name = "sort_bench"
harness = false
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
rand = "0.8"
#![allow(unused)]
fn main() {
// benches/sort_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use rand::Rng;
fn generate_data(n: usize) -> Vec<u64> {
let mut rng = rand::thread_rng();
(0..n).map(|_| rng.gen()).collect()
}
fn bench_sort(c: &mut Criterion) {
let mut group = c.benchmark_group("sort-10k");
group.bench_function("stable", |b| {
b.iter_batched(
|| generate_data(10_000),
|mut data| { data.sort(); black_box(&data); },
criterion::BatchSize::SmallInput,
)
});
group.bench_function("unstable", |b| {
b.iter_batched(
|| generate_data(10_000),
|mut data| { data.sort_unstable(); black_box(&data); },
criterion::BatchSize::SmallInput,
)
});
group.finish();
}
criterion_group!(benches, bench_sort);
criterion_main!(benches);
}
cargo bench
open target/criterion/sort-10k/report/index.html
🟡 Exercise 2: Flamegraph Hot Spot
Build a project with debug = true in [profile.release], then generate a flamegraph. Identify the top 3 widest stacks.
Solution
# Cargo.toml
[profile.release]
debug = true # Keep symbols for flamegraph
cargo install flamegraph
cargo flamegraph -- <your-args>  # cargo-flamegraph uses the release profile by default
# Open the generated flamegraph.svg in a browser
# The widest stacks at the top are your hot spots
Key Takeaways
- Never benchmark with `Instant::now()` — use Criterion.rs for statistical rigor and regression detection
- `black_box()` prevents the compiler from optimizing away your benchmark target
- `hyperfine` measures wall-clock time for the whole binary; Criterion measures individual functions — use both
- Flamegraphs show where time is spent; benchmarks show how much time is spent
- Continuous benchmarking in CI catches performance regressions before they ship
Code Coverage — Seeing What Tests Miss 🟢
What you’ll learn:
- Source-based coverage with `cargo-llvm-cov` (the most accurate Rust coverage tool)
- Quick coverage checks with `cargo-tarpaulin` and Mozilla’s `grcov`
- Setting up coverage gates in CI with Codecov and Coveralls
- A coverage-guided testing strategy that prioritizes high-risk blind spots
Cross-references: Miri and Sanitizers — coverage finds untested code, Miri finds UB in tested code · Benchmarking — coverage shows what’s tested, benchmarks show what’s fast · CI/CD Pipeline — coverage gate in the pipeline
Code coverage measures which lines, branches, or functions your tests actually execute. It doesn’t prove correctness (a covered line can still have bugs), but it reliably reveals blind spots — code paths that no test exercises at all.
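A covered line with a bug, in two lines: the single assertion below executes 100% of `average`, yet the overflow bug survives (a contrived sketch, not from the codebase):

```rust
// 100% line coverage from one test input — and still buggy:
// `a + b` can overflow u32 (panics in debug builds, wraps in release).
fn average(a: u32, b: u32) -> u32 {
    (a + b) / 2
}

fn main() {
    assert_eq!(average(2, 4), 3); // every line of `average` is now "covered"
    // average(u32::MAX, 2) would still panic in a debug build.
}
```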
With 1,006 tests across many crates, the project has substantial test investment. Coverage analysis answers: “Is that investment reaching the code that matters?”
Source-Based Coverage with llvm-cov
Rust uses LLVM, which provides source-based coverage instrumentation — the most
accurate coverage method available. The recommended tool is
cargo-llvm-cov:
# Install
cargo install cargo-llvm-cov
# Or via rustup component (for the raw llvm tools)
rustup component add llvm-tools-preview
Basic usage:
# Run tests and show per-file coverage summary
cargo llvm-cov
# Generate HTML report (browsable, line-by-line highlighting)
cargo llvm-cov --html
# Output: target/llvm-cov/html/index.html
# Generate LCOV format (for CI integrations)
cargo llvm-cov --lcov --output-path lcov.info
# Workspace-wide coverage (all crates)
cargo llvm-cov --workspace
# Include only specific packages
cargo llvm-cov --package accel_diag --package topology_lib
# Coverage including doc tests
cargo llvm-cov --doctests
Reading the HTML report:
target/llvm-cov/html/index.html
├── Filename │ Function │ Line │ Branch │ Region
├─ accel_diag/src/lib.rs │ 78.5% │ 82.3% │ 61.2% │ 74.1%
├─ sel_mgr/src/parse.rs │ 95.2% │ 96.8% │ 88.0% │ 93.5%
├─ topology_lib/src/.. │ 91.0% │ 93.4% │ 79.5% │ 89.2%
└─ ...
Green = covered Red = not covered Yellow = partially covered (branch)
Coverage types explained:
| Type | What It Measures | Significance |
|---|---|---|
| Line coverage | Which source lines were executed | Basic “was this code reached?” |
| Branch coverage | Which if/match arms were taken | Catches untested conditions |
| Function coverage | Which functions were called | Finds dead code |
| Region coverage | Which code regions (sub-expressions) were hit | Most granular |
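The line-vs-region distinction is easiest to see on a one-line `if`: a single test input yields 100% line coverage but only half the regions (a contrived sketch):

```rust
// One source line, but two coverage regions: the `if` arm and the `else` arm.
fn pick(flag: bool) -> u32 { if flag { 1 } else { 2 } }

fn main() {
    assert_eq!(pick(true), 1);  // line fully covered — yet only one region hit
    assert_eq!(pick(false), 2); // region coverage demands this second input too
}
```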
cargo-tarpaulin — The Quick Path
cargo-tarpaulin is a Linux-specific
coverage tool that’s simpler to set up (no LLVM components needed):
# Install
cargo install cargo-tarpaulin
# Basic coverage report
cargo tarpaulin
# HTML output
cargo tarpaulin --out Html
# With specific options
cargo tarpaulin \
--workspace \
--timeout 120 \
--out Xml Html \
--output-dir coverage/ \
--exclude-files "*/tests/*" "*/benches/*" \
--ignore-panics
# Skip certain crates
cargo tarpaulin --workspace --exclude diag_tool # exclude the binary crate
tarpaulin vs llvm-cov comparison:
| Feature | cargo-llvm-cov | cargo-tarpaulin |
|---|---|---|
| Accuracy | Source-based (most accurate) | Ptrace-based (occasional overcounting) |
| Platform | Any (llvm-based) | Linux only |
| Branch coverage | Yes | Limited |
| Doc tests | Yes | No |
| Setup | Needs llvm-tools-preview | Self-contained |
| Speed | Faster (compile-time instrumentation) | Slower (ptrace overhead) |
| Stability | Very stable | Occasional false positives |
Recommendation: Use cargo-llvm-cov for accuracy. Use cargo-tarpaulin when
you need a quick check without installing LLVM tools.
grcov — Mozilla’s Coverage Tool
grcov is Mozilla’s coverage aggregator.
It consumes raw LLVM profiling data and produces reports in multiple formats:
# Install
cargo install grcov
# Step 1: Build with coverage instrumentation
export RUSTFLAGS="-Cinstrument-coverage"
export LLVM_PROFILE_FILE="target/coverage/%p-%m.profraw"
cargo build --tests
# Step 2: Run tests (generates .profraw files)
cargo test
# Step 3: Aggregate with grcov
grcov target/coverage/ \
--binary-path target/debug/ \
--source-dir . \
--output-types html,lcov \
--output-path target/coverage/report \
--branch \
--ignore-not-existing \
--ignore "*/tests/*" \
--ignore "*/.cargo/*"
# Step 4: View report
open target/coverage/report/html/index.html
When to use grcov: It’s most useful when you need to merge coverage from multiple test runs (e.g., unit tests + integration tests + fuzz tests) into a single report.
Coverage in CI: Codecov and Coveralls
Upload coverage data to a tracking service for historical trends and PR annotations:
# .github/workflows/coverage.yml
name: Code Coverage
on: [push, pull_request]
jobs:
coverage:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: llvm-tools-preview
- name: Install cargo-llvm-cov
uses: taiki-e/install-action@cargo-llvm-cov
- name: Generate coverage
run: cargo llvm-cov --workspace --lcov --output-path lcov.info
- name: Upload to Codecov
uses: codecov/codecov-action@v4
with:
files: lcov.info
token: ${{ secrets.CODECOV_TOKEN }}
fail_ci_if_error: true
# Optional: enforce minimum coverage
- name: Check coverage threshold
run: |
cargo llvm-cov --workspace --fail-under-lines 80
# Fails the build if line coverage drops below 80%
Coverage gates — enforce minimums per crate by reading the JSON output:
# Get per-crate coverage as JSON
cargo llvm-cov --workspace --json | jq '.data[0].totals.lines.percent'
# Fail if below threshold
cargo llvm-cov --workspace --fail-under-lines 80
cargo llvm-cov --workspace --fail-under-functions 70
cargo llvm-cov --workspace --fail-under-regions 60
Coverage-Guided Testing Strategy
Coverage numbers alone are meaningless without a strategy. Here’s how to use coverage data effectively:
Step 1: Triage by risk
High coverage, high risk → ✅ Good — maintain it
High coverage, low risk → 🔄 Possibly over-tested — skip if slow
Low coverage, high risk → 🔴 Write tests NOW — this is where bugs hide
Low coverage, low risk → 🟡 Track but don't panic
Step 2: Focus on branch coverage, not line coverage
#![allow(unused)]
fn main() {
// 100% line coverage, 50% branch coverage — still risky!
pub fn classify_temperature(temp_c: i32) -> ThermalState {
if temp_c > 105 { // ← tested with temp=110 → Critical
ThermalState::Critical
} else if temp_c > 85 { // ← tested with temp=90 → Warning
ThermalState::Warning
} else if temp_c < -10 { // ← NEVER TESTED → sensor error case missed
ThermalState::SensorError
} else {
ThermalState::Normal // ← tested with temp=25 → Normal
}
}
}
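A branch-complete test set for the function above needs one input per arm, including the sensor-error case. Here is a sketch (the `ThermalState` enum is an assumption reconstructed from the variants used above):

```rust
// Hypothetical enum matching the chapter's example.
#[derive(Debug, PartialEq)]
pub enum ThermalState { Critical, Warning, SensorError, Normal }

// Repeated from the listing above so this snippet compiles on its own.
pub fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {
        ThermalState::Critical
    } else if temp_c > 85 {
        ThermalState::Warning
    } else if temp_c < -10 {
        ThermalState::SensorError // the arm the original tests never hit
    } else {
        ThermalState::Normal
    }
}

fn main() {
    // One input per arm = full branch coverage for this function.
    assert_eq!(classify_temperature(110), ThermalState::Critical);
    assert_eq!(classify_temperature(90), ThermalState::Warning);
    assert_eq!(classify_temperature(-40), ThermalState::SensorError);
    assert_eq!(classify_temperature(25), ThermalState::Normal);
}
```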
Step 3: Exclude noise
# Exclude test code from coverage (it's always "covered")
cargo llvm-cov --workspace --ignore-filename-regex 'tests?\.rs$|benches/'
# Exclude generated code
cargo llvm-cov --workspace --ignore-filename-regex 'target/'
In code, mark untestable sections:
#![allow(unused)]
fn main() {
// Coverage tools recognize this pattern
#[cfg(not(tarpaulin_include))] // tarpaulin
fn unreachable_hardware_path() {
// This path requires actual GPU hardware to trigger
}
// For llvm-cov, use a more targeted approach:
// Simply accept that some paths need integration/hardware tests,
// not unit tests. Track them in a coverage exceptions list.
}
Complementary Testing Tools
proptest — Property-Based Testing finds edge cases that hand-written tests miss:
[dev-dependencies]
proptest = "1"
#![allow(unused)]
fn main() {
use proptest::prelude::*;
proptest! {
#[test]
fn parse_never_panics(input in "\\PC*") {
// proptest generates thousands of random strings
// If parse_gpu_csv panics on any input, the test fails
// and proptest minimizes the failing case for you.
let _ = parse_gpu_csv(&input);
}
#[test]
fn temperature_roundtrip(raw in 0u16..4096) {
let temp = Temperature::from_raw(raw);
let md = temp.millidegrees_c();
// Property: millidegrees should always be derivable from raw
assert_eq!(md, (raw as i32) * 625 / 10);
}
}
}
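The roundtrip property above implies a conversion like the following. This is a hypothetical `Temperature` sketch; the 0.0625 °C-per-LSB scale is an assumption chosen to match the `* 625 / 10` formula in the property:

```rust
// Hypothetical sensor type: raw 12-bit reading, 0.0625 °C per LSB.
pub struct Temperature { raw: u16 }

impl Temperature {
    pub fn from_raw(raw: u16) -> Self {
        Temperature { raw }
    }

    /// Millidegrees Celsius: raw * 62.5 m°C, computed in integer math.
    pub fn millidegrees_c(&self) -> i32 {
        (self.raw as i32) * 625 / 10
    }
}

fn main() {
    // 400 * 62.5 = 25,000 m°C = 25 °C
    let t = Temperature::from_raw(400);
    assert_eq!(t.millidegrees_c(), 25_000);
}
```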
insta — Snapshot Testing for large structured outputs (JSON, text reports):
[dev-dependencies]
insta = { version = "1", features = ["json"] }
#![allow(unused)]
fn main() {
#[test]
fn test_der_report_format() {
let report = generate_der_report(&test_results);
// First run: creates a snapshot file. Subsequent runs: compares against it.
// Run `cargo insta review` to accept changes interactively.
insta::assert_json_snapshot!(report);
}
}
When to add proptest/insta: If your unit tests are all “happy path” examples, proptest will find the edge cases you missed. If you’re testing large output formats (JSON reports, DER records), insta snapshots are faster to write and maintain than hand-written assertions.
Application: 1,000+ Tests Coverage Map
The project has 1,000+ tests but no coverage tracking. Adding it reveals the testing investment distribution. Uncovered paths are prime candidates for Miri and sanitizer verification:
Recommended coverage configuration:
# Quick workspace coverage (proposed CI command)
cargo llvm-cov --workspace \
--ignore-filename-regex 'tests?\.rs$' \
--fail-under-lines 75 \
--html
# Per-crate coverage for targeted improvement
for crate in accel_diag event_log topology_lib network_diag compute_diag fan_diag; do
echo "=== $crate ==="
cargo llvm-cov --package "$crate" --json 2>/dev/null | \
jq -r '.data[0].totals | "Lines: \(.lines.percent | round)% Branches: \(.branches.percent | round)%"'
done
Expected high-coverage crates (based on test density):
- `topology_lib` — 922-line golden-file test suite
- `event_log` — registry with `create_test_record()` helpers
- `cable_diag` — `make_test_event()` / `make_test_context()` patterns
Expected coverage gaps (based on code inspection):
- Error handling arms in IPMI communication paths
- GPU hardware-specific branches (require actual GPU)
- `dmesg` parsing edge cases (platform-dependent output)
The 80/20 rule of coverage: Getting from 0% to 80% coverage is straightforward. Getting from 80% to 95% requires increasingly contrived test scenarios. Getting from 95% to 100% requires `#[cfg(not(...))]` exclusions and is rarely worth the effort. Target 80% line coverage and 70% branch coverage as a practical floor.
Troubleshooting Coverage
| Symptom | Cause | Fix |
|---|---|---|
| llvm-cov shows 0% for all files | Instrumentation not applied | Ensure you run `cargo llvm-cov`, not `cargo test` + llvm-cov separately |
| Coverage counts `unreachable!()` as uncovered | Those branches exist in compiled code | Use `#[cfg(not(tarpaulin_include))]` or add to exclusion regex |
| Test binary crashes under coverage | Instrumentation + sanitizer conflict | Don’t combine `cargo llvm-cov` with `-Zsanitizer=address`; run them separately |
| Coverage differs between llvm-cov and tarpaulin | Different instrumentation techniques | Use llvm-cov as source of truth (compiler-native); file issues for large discrepancies |
| `error: profraw file is malformed` | Test binary crashed mid-execution | Fix the test failure first; profraw files are corrupt when the process exits abnormally |
| Branch coverage seems impossibly low | Optimizer creates branches for `match` arms, `unwrap`, etc. | Focus on line coverage for practical thresholds; branch coverage is inherently lower |
Try It Yourself
1. Measure coverage on your project: Run `cargo llvm-cov --workspace --html` and open the report. Find the three files with the lowest coverage. Are they untested, or inherently hard to test (hardware-dependent code)?
2. Set a coverage gate: Add `cargo llvm-cov --workspace --fail-under-lines 60` to your CI. Intentionally comment out a test and verify CI fails. Then raise the threshold to your project’s actual coverage level minus 2%.
3. Branch vs. line coverage: Write a function with a 3-arm `match` and test only 2 arms. Compare line coverage (may show 66%) vs. branch coverage (may show 50%). Which metric is more useful for your project?
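A starting point for the third exercise, using a hypothetical `parse_verb` helper: exercising only two of the three arms leaves one `match` arm (and its line) uncovered, so line and branch coverage diverge:

```rust
#[derive(Debug, PartialEq)]
enum Verb { Get, Set, Unknown }

fn parse_verb(s: &str) -> Verb {
    match s {
        "get" => Verb::Get,
        "set" => Verb::Set,
        _ => Verb::Unknown, // leave this arm untested, then compare metrics
    }
}

fn main() {
    // Only two of three arms exercised — rerun coverage after adding the third.
    assert_eq!(parse_verb("get"), Verb::Get);
    assert_eq!(parse_verb("set"), Verb::Set);
}
```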
Coverage Tool Selection
flowchart TD
START["Need code coverage?"] --> ACCURACY{"Priority?"}
ACCURACY -->|"Most accurate"| LLVM["cargo-llvm-cov\nSource-based, compiler-native"]
ACCURACY -->|"Quick check"| TARP["cargo-tarpaulin\nLinux only, fast"]
ACCURACY -->|"Multi-run aggregate"| GRCOV["grcov\nMozilla, combines profiles"]
LLVM --> CI_GATE["CI coverage gate\n--fail-under-lines 80"]
TARP --> CI_GATE
CI_GATE --> UPLOAD{"Upload to?"}
UPLOAD -->|"Codecov"| CODECOV["codecov/codecov-action"]
UPLOAD -->|"Coveralls"| COVERALLS["coverallsapp/github-action"]
style LLVM fill:#91e5a3,color:#000
style TARP fill:#e3f2fd,color:#000
style GRCOV fill:#e3f2fd,color:#000
style CI_GATE fill:#ffd43b,color:#000
🏋️ Exercises
🟢 Exercise 1: First Coverage Report
Install cargo-llvm-cov, run it on any Rust project, and open the HTML report. Find the three files with the lowest line coverage.
Solution
cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html --open
# The report sorts files by coverage — lowest at the bottom
# Look for files under 50% — those are your blind spots
🟡 Exercise 2: CI Coverage Gate
Add a coverage gate to a GitHub Actions workflow that fails if line coverage drops below 60%. Verify it works by commenting out a test.
Solution
# .github/workflows/coverage.yml
name: Coverage
on: [push, pull_request]
jobs:
coverage:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: llvm-tools-preview
- run: cargo install cargo-llvm-cov
- run: cargo llvm-cov --workspace --fail-under-lines 60
Comment out a test, push, and watch the workflow fail.
Key Takeaways
- `cargo-llvm-cov` is the most accurate coverage tool for Rust — it uses the compiler’s own instrumentation
- Coverage doesn’t prove correctness, but zero coverage proves zero testing — use it to find blind spots
- Set a coverage gate in CI (e.g., `--fail-under-lines 80`) to prevent regressions
- Don’t chase 100% coverage — focus on high-risk code paths (error handling, unsafe, parsing)
- Never combine coverage instrumentation with sanitizers in the same run
Miri, Valgrind, and Sanitizers — Verifying Unsafe Code 🔴
What you’ll learn:
- Miri as a MIR interpreter — what it catches (aliasing, UB, leaks) and what it can’t (FFI, syscalls)
- Valgrind memcheck, Helgrind (data races), Callgrind (profiling), and Massif (heap)
- LLVM sanitizers: ASan, MSan, TSan, LSan with nightly `-Zbuild-std`
- `cargo-fuzz` for crash discovery and `loom` for concurrency model checking
- A decision tree for choosing the right verification tool

Cross-references: Code Coverage — coverage finds untested paths, Miri verifies the tested ones · `no_std` & Features — `no_std` code often requires `unsafe` that Miri can verify · CI/CD Pipeline — Miri job in the pipeline
Safe Rust guarantees memory safety and data-race freedom at compile time. But the
moment you write unsafe — for FFI, hand-rolled data structures, or performance
tricks — those guarantees become your responsibility. This chapter covers the
tools that verify your unsafe code actually upholds the safety contracts it claims.
Miri — An Interpreter for Unsafe Rust
Miri is an interpreter for Rust’s Mid-level Intermediate Representation (MIR). Instead of compiling to machine code, Miri executes your program step-by-step with exhaustive checks for undefined behavior at every operation.
# Install Miri (nightly-only component)
rustup +nightly component add miri
# Run your test suite under Miri
cargo +nightly miri test
# Run a specific binary under Miri
cargo +nightly miri run
# Run a specific test
cargo +nightly miri test -- test_name
How Miri works:
Source → rustc → MIR → Miri interprets MIR
│
├─ Tracks every pointer's provenance
├─ Validates every memory access
├─ Checks alignment at every deref
├─ Detects use-after-free
├─ Detects data races (with threads)
└─ Enforces Stacked Borrows / Tree Borrows rules
What Miri Catches (and What It Cannot)
Miri detects:
| Category | Example | Would Crash at Runtime? |
|---|---|---|
| Out-of-bounds access | ptr.add(100).read() past allocation | Sometimes (depends on page layout) |
| Use after free | Reading a dropped Box through raw pointer | Sometimes (depends on allocator) |
| Double free | Calling drop_in_place twice | Usually |
| Unaligned access | (ptr as *const u32).read() on odd address | On some architectures |
| Invalid values | transmute::<u8, bool>(2) | Silently wrong |
| Dangling references | &*ptr where ptr is freed | No (silent corruption) |
| Data races | Two threads, one writing, no synchronization | Intermittent, hard to reproduce |
| Stacked Borrows violation | Aliasing &mut references | No (silent corruption) |
Miri does NOT detect:
| Limitation | Why |
|---|---|
| Logic bugs | Miri checks memory safety, not correctness |
| Concurrency deadlocks | Miri checks data races, not livelocks |
| Performance issues | Interpretation is 10-100× slower than native |
| OS/hardware interaction | Miri can’t emulate syscalls, device I/O |
| All FFI calls | Can’t interpret C code (only Rust MIR) |
| Exhaustive path coverage | Only tests the paths your test suite reaches |
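To make the first limitation concrete: this hypothetical checksum function is memory-safe and passes Miri cleanly, yet computes the wrong answer because the loop stops one byte early. Miri verifies safety contracts, not specifications:

```rust
// Logic bug: the loop bound drops the final byte.
// There is no UB anywhere, so Miri reports nothing.
fn buggy_checksum(data: &[u8]) -> u32 {
    let mut sum = 0u32;
    for i in 0..data.len().saturating_sub(1) {
        sum = sum.wrapping_add(data[i] as u32);
    }
    sum
}

fn main() {
    // 1 + 2 = 3, but the correct checksum over all three bytes would be 6.
    assert_eq!(buggy_checksum(&[1, 2, 3]), 3);
}
```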
A concrete example — catching unsound code that “works” in practice:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
#[test]
fn test_miri_catches_ub() {
// This "works" in release builds but is undefined behavior
let mut v = vec![1, 2, 3];
let ptr = v.as_ptr();
// Push may reallocate, invalidating ptr
v.push(4);
// ❌ UB: ptr may be dangling after reallocation
// Miri will catch this even if the allocator happens to
// not move the buffer.
// let _val = unsafe { *ptr };
// Error: Miri would report:
// "pointer to alloc1234 was dereferenced after this
// allocation got freed"
// ✅ Correct: get a fresh pointer after mutation
let ptr = v.as_ptr();
let val = unsafe { *ptr };
assert_eq!(val, 1);
}
}
}
Running Miri on a Real Crate
Practical Miri workflow for a crate with unsafe:
# Step 1: Run all tests under Miri
cargo +nightly miri test 2>&1 | tee miri_output.txt
# Step 2: If Miri reports errors, isolate them
cargo +nightly miri test -- failing_test_name
# Step 3: Use Miri's backtrace for diagnosis
MIRIFLAGS="-Zmiri-backtrace=full" cargo +nightly miri test
# Step 4: Choose a borrow model
# Stacked Borrows (default, stricter):
cargo +nightly miri test
# Tree Borrows (experimental, more permissive):
MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test
Miri flags for common scenarios:
# Disable isolation (allow file system access, env vars)
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
# Memory leak detection is ON by default in Miri.
# To suppress leak errors (e.g., for intentional leaks):
# MIRIFLAGS="-Zmiri-ignore-leaks" cargo +nightly miri test
# Seed the RNG for reproducible results with randomized tests
MIRIFLAGS="-Zmiri-seed=42" cargo +nightly miri test
# Enable strict provenance checking
MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test
# Multiple flags
MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-backtrace=full -Zmiri-strict-provenance" \
cargo +nightly miri test
Miri in CI:
# .github/workflows/miri.yml
name: Miri
on: [push, pull_request]
jobs:
miri:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@nightly
with:
components: miri
- name: Run Miri
run: cargo miri test --workspace
env:
MIRIFLAGS: "-Zmiri-backtrace=full"
# Leak checking is on by default.
# Skip tests that use system calls Miri can't handle
# (file I/O, networking, etc.)
Performance note: Miri is 10-100× slower than native execution. A test suite that runs in 5 seconds natively may take 5 minutes under Miri. In CI, run Miri on a focused subset: crates with `unsafe` code only.
Valgrind and Its Rust Integration
Valgrind is the classic C/C++ memory checker. It works on compiled Rust binaries too, checking for memory errors at the machine-code level.
# Install Valgrind
sudo apt install valgrind # Debian/Ubuntu
sudo dnf install valgrind # Fedora
# Build with debug info (Valgrind needs symbols)
cargo build --tests
# or for release with debug info:
# cargo build --release
# [profile.release]
# debug = true
# Run a specific test binary under Valgrind
valgrind --tool=memcheck \
--leak-check=full \
--show-leak-kinds=all \
--track-origins=yes \
./target/debug/deps/my_crate-abc123 --test-threads=1
# Run the main binary
valgrind --tool=memcheck \
--leak-check=full \
--error-exitcode=1 \
./target/debug/diag_tool --run-diagnostics
Valgrind tools beyond memcheck:
| Tool | Command | What It Detects |
|---|---|---|
| Memcheck | --tool=memcheck | Memory leaks, use-after-free, buffer overflows |
| Helgrind | --tool=helgrind | Data races and lock-order violations |
| DRD | --tool=drd | Data races (different detection algorithm) |
| Callgrind | --tool=callgrind | CPU instruction profiling (path-level) |
| Massif | --tool=massif | Heap memory profiling over time |
| Cachegrind | --tool=cachegrind | Cache miss analysis |
Using Callgrind for instruction-level profiling:
# Record instruction counts (more stable than wall-clock time)
valgrind --tool=callgrind \
--callgrind-out-file=callgrind.out \
./target/release/diag_tool --run-diagnostics
# Visualize with KCachegrind
kcachegrind callgrind.out
# or the text-based alternative:
callgrind_annotate callgrind.out | head -100
Miri vs Valgrind — when to use which:
| Aspect | Miri | Valgrind |
|---|---|---|
| Checks Rust-specific UB | ✅ Stacked/Tree Borrows | ❌ Not aware of Rust rules |
| Checks C FFI code | ❌ Can’t interpret C | ✅ Checks all machine code |
| Needs nightly | ✅ Yes | ❌ No |
| Speed | 10-100× slower | 10-50× slower |
| Platform | Any (interprets MIR) | Linux, macOS (runs native code) |
| Data race detection | ✅ Yes | ✅ Yes (Helgrind/DRD) |
| Leak detection | ✅ Yes | ✅ Yes (more thorough) |
| False positives | Very rare | Occasional (especially with allocators) |
Use both:
- Miri for pure-Rust `unsafe` code (Stacked Borrows, provenance)
- Valgrind for FFI-heavy code and whole-program leak analysis
AddressSanitizer, MemorySanitizer, ThreadSanitizer
LLVM sanitizers are compile-time instrumentation passes that insert runtime checks. They’re faster than Valgrind (2-5× overhead vs 10-50×) and catch different classes of bugs.
# Required: install Rust source for rebuilding std with sanitizer instrumentation
rustup component add rust-src --toolchain nightly
# AddressSanitizer (ASan) — buffer overflows, use-after-free, stack overflows
RUSTFLAGS="-Zsanitizer=address" \
cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
# MemorySanitizer (MSan) — uninitialized memory reads
RUSTFLAGS="-Zsanitizer=memory" \
cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
# ThreadSanitizer (TSan) — data races
RUSTFLAGS="-Zsanitizer=thread" \
cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
# LeakSanitizer (LSan) — memory leaks (included in ASan by default)
RUSTFLAGS="-Zsanitizer=leak" \
cargo +nightly test --target x86_64-unknown-linux-gnu
Note: ASan, MSan, and TSan require `-Zbuild-std` to rebuild the standard library with sanitizer instrumentation. LSan does not.
Sanitizer comparison:
| Sanitizer | Overhead | Catches | Nightly? | -Zbuild-std? |
|---|---|---|---|---|
| ASan | 2× memory, 2× CPU | Buffer overflow, use-after-free, stack overflow | Yes | Yes |
| MSan | 3× memory, 3× CPU | Uninitialized reads | Yes | Yes |
| TSan | 5-10× memory, 5× CPU | Data races | Yes | Yes |
| LSan | Minimal | Memory leaks | Yes | No |
Practical example — catching a data race with TSan:
#![allow(unused)]
fn main() {
use std::sync::Arc;
use std::thread;
fn racy_counter() -> u64 {
// ❌ UB: unsynchronized shared mutable state
let data = Arc::new(std::cell::UnsafeCell::new(0u64));
let mut handles = vec![];
for _ in 0..4 {
let data = Arc::clone(&data);
handles.push(thread::spawn(move || {
for _ in 0..1000 {
// SAFETY: UNSOUND — data race!
unsafe {
*data.get() += 1;
}
}
}));
}
for h in handles {
h.join().unwrap();
}
// Value should be 4000 but may be anything due to race
unsafe { *data.get() }
}
// Both Miri and TSan catch this:
// Miri: "Data race detected between (1) write and (2) write"
// TSan: "WARNING: ThreadSanitizer: data race"
//
// Fix: use AtomicU64 or Mutex<u64>
}
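The fix mentioned in the closing comment, sketched with `AtomicU64`: no `unsafe`, no race, so both Miri and TSan stay quiet and the result is always deterministic.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn atomic_counter() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // fetch_add is a single atomic read-modify-write — no data race.
                data.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    data.load(Ordering::Relaxed)
}

fn main() {
    // Always exactly 4 threads × 1000 increments.
    assert_eq!(atomic_counter(), 4000);
}
```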
Related Tools: Fuzzing and Concurrency Verification
cargo-fuzz — Coverage-Guided Fuzzing (finds crashes in parsers and decoders):
# Install
cargo install cargo-fuzz
# Initialize a fuzz target
cargo fuzz init
cargo fuzz add parse_gpu_csv
#![allow(unused)]
fn main() {
// fuzz/fuzz_targets/parse_gpu_csv.rs
#![no_main]
use libfuzzer_sys::fuzz_target;
fuzz_target!(|data: &[u8]| {
if let Ok(s) = std::str::from_utf8(data) {
// The fuzzer generates millions of inputs looking for panics/crashes.
let _ = diag_tool::parse_gpu_csv(s);
}
});
}
# Run the fuzzer (runs until interrupted or crash found)
cargo +nightly fuzz run parse_gpu_csv -- -max_total_time=300 # 5 minutes
# Minimize a crash
cargo +nightly fuzz tmin parse_gpu_csv artifacts/parse_gpu_csv/crash-...
When to fuzz: Any function that parses untrusted/semi-trusted input (sensor output, config files, network data, JSON/CSV). Fuzzing has found real bugs in major Rust parsing crates, including serde, regex, and image.
loom — Concurrency Model Checker (exhaustively tests atomic orderings):
[dev-dependencies]
loom = "0.7"
#![allow(unused)]
fn main() {
#[cfg(loom)]
mod tests {
use loom::sync::atomic::{AtomicUsize, Ordering};
use loom::thread;
#[test]
fn test_counter_is_atomic() {
loom::model(|| {
let counter = loom::sync::Arc::new(AtomicUsize::new(0));
let c1 = counter.clone();
let c2 = counter.clone();
let t1 = thread::spawn(move || { c1.fetch_add(1, Ordering::SeqCst); });
let t2 = thread::spawn(move || { c2.fetch_add(1, Ordering::SeqCst); });
t1.join().unwrap();
t2.join().unwrap();
// loom explores ALL possible thread interleavings
assert_eq!(counter.load(Ordering::SeqCst), 2);
});
}
}
}
When to use `loom`: When you have lock-free data structures or custom synchronization primitives. Loom exhaustively explores thread interleavings — it’s a model checker, not a stress test. Not needed for `Mutex`/`RwLock`-based code.
When to Use Which Tool
Decision tree for unsafe verification:
Is the code pure Rust (no FFI)?
├─ Yes → Use Miri (catches Rust-specific UB, Stacked Borrows)
│ Also run ASan in CI for defense-in-depth
└─ No (calls C/C++ code via FFI)
├─ Memory safety concerns?
│ └─ Yes → Use Valgrind memcheck AND ASan
├─ Concurrency concerns?
│ └─ Yes → Use TSan (faster) or Helgrind (more thorough)
└─ Memory leak concerns?
└─ Yes → Use Valgrind --leak-check=full
Recommended CI matrix:
# Run all tools in parallel for fast feedback
jobs:
miri:
runs-on: ubuntu-latest
steps:
- uses: dtolnay/rust-toolchain@nightly
with: { components: miri }
- run: cargo miri test --workspace
asan:
runs-on: ubuntu-latest
steps:
- uses: dtolnay/rust-toolchain@nightly
- run: |
RUSTFLAGS="-Zsanitizer=address" \
cargo test -Zbuild-std --target x86_64-unknown-linux-gnu
valgrind:
runs-on: ubuntu-latest
steps:
- run: sudo apt-get install -y valgrind
- uses: dtolnay/rust-toolchain@stable
- run: cargo build --tests
- run: |
for test_bin in $(find target/debug/deps -maxdepth 1 -executable -type f ! -name '*.d'); do
valgrind --error-exitcode=1 --leak-check=full "$test_bin" --test-threads=1
done
Application: Zero Unsafe — and When You’ll Need It
The project contains zero unsafe blocks across 90K+ lines of
Rust. This is a remarkable achievement for a systems-level diagnostics tool and
demonstrates that safe Rust is sufficient for:
- IPMI communication (via `std::process::Command` to `ipmitool`)
- GPU queries (via `std::process::Command` to `accel-query`)
- PCIe topology parsing (pure JSON/text parsing)
- SEL record management (pure data structures)
- DER report generation (JSON serialization)
When will the project need unsafe?
The likely triggers for introducing unsafe:
| Scenario | Why unsafe | Recommended Verification |
|---|---|---|
| Direct ioctl-based IPMI | libc::ioctl() bypasses ipmitool subprocess | Miri + Valgrind |
| Direct GPU driver queries | accel-mgmt FFI instead of accel-query parsing | Valgrind (C library) |
| Memory-mapped PCIe config | mmap for direct config-space reads | ASan + Valgrind |
| Lock-free SEL buffer | AtomicPtr for concurrent event collection | Miri + TSan |
| Embedded/no_std variant | Raw pointer manipulation for bare-metal | Miri |
Preparation: Before introducing unsafe, add the verification tools to CI:
# Cargo.toml — add a feature flag for unsafe optimizations
[features]
default = []
direct-ipmi = [] # Enable direct ioctl IPMI instead of ipmitool subprocess
direct-accel-api = [] # Enable accel-mgmt FFI instead of accel-query parsing
#![allow(unused)]
fn main() {
// src/ipmi.rs — gated behind a feature flag
#[cfg(feature = "direct-ipmi")]
mod direct {
//! Direct IPMI device access via /dev/ipmi0 ioctl.
//!
//! # Safety
//! This module uses `unsafe` for ioctl system calls.
//! Verified with: Miri (where possible), Valgrind memcheck, ASan.
use std::os::unix::io::RawFd;
// ... unsafe ioctl implementation ...
}
#[cfg(not(feature = "direct-ipmi"))]
mod subprocess {
//! IPMI via ipmitool subprocess (default, fully safe).
// ... current implementation ...
}
}
Key insight: Keep `unsafe` behind feature flags so it can be verified independently. Run `cargo +nightly miri test --features direct-ipmi` in CI to continuously verify the unsafe paths without affecting the safe default build.
cargo-careful — Extra UB Checks at Near-Native Speed
cargo-careful runs your code
with extra standard library checks enabled — catching some undefined behavior
that normal builds ignore. It requires a nightly toolchain, but runs at
near-native speed rather than Miri’s 10-100× slowdown:
# Install (requires nightly, but runs your code at near-native speed)
cargo install cargo-careful
# Run tests with extra UB checks (catches uninitialized memory, invalid values)
cargo +nightly careful test
# Run a binary with extra checks
cargo +nightly careful run -- --run-diagnostics
What cargo-careful catches that normal builds don’t:
- Reads of uninitialized memory in `MaybeUninit` and `zeroed()`
- Creating invalid `bool`, `char`, or enum values via transmute
- Unaligned pointer reads/writes
- `copy_nonoverlapping` with overlapping ranges
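For the `MaybeUninit` case, here is the sound pattern next to the unsound one (a sketch; the commented-out line is the class of bug cargo-careful flags):

```rust
use std::mem::MaybeUninit;

fn init_slot(value: u32) -> u32 {
    // ❌ The unsound variant — reading uninitialized memory — looks like:
    //     let x: u32 = unsafe { MaybeUninit::<u32>::uninit().assume_init() };
    // cargo-careful (and Miri) reject it; a normal build may appear to tolerate it.

    // ✅ Sound: fully initialize the slot before assume_init.
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(value);
    unsafe { slot.assume_init() }
}

fn main() {
    assert_eq!(init_slot(42), 42);
}
```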
Where it fits in the verification ladder:
Least overhead Most thorough
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤
│ (0× overhead) (~1.5× overhead) (10-100×) (2×) (10-50×) │
│ Safe Rust only Catches some UB Pure-Rust FFI+Rust FFI+Rust │
Recommendation: Add `cargo +nightly careful test` to CI as a fast safety check. It runs at near-native speed (unlike Miri) and catches real bugs that safe Rust abstractions mask.
Troubleshooting Miri and Sanitizers
| Symptom | Cause | Fix |
|---|---|---|
| Miri does not support FFI | Miri is a Rust interpreter; it can’t execute C code | Use Valgrind or ASan for FFI code instead |
| `error: unsupported operation: can't call foreign function` | Miri hit an `extern "C"` call | Mock the FFI boundary or gate behind `#[cfg(miri)]` |
| Stacked Borrows violation | Aliasing rule violation — even if code “works” | Miri is correct; refactor to avoid aliasing `&mut` with `&` |
| Sanitizer says `DEADLYSIGNAL` | ASan detected buffer overflow | Check array indexing, slice operations, and pointer arithmetic |
| `LeakSanitizer: detected memory leaks` | `Box::leak()`, `forget()`, or missing `drop()` | Intentional: suppress with `__lsan_disable()`; unintentional: fix the leak |
| Miri is extremely slow | Miri interprets, doesn’t compile — 10-100× slower | Run only on `--lib` tests or tag slow tests with `#[cfg_attr(miri, ignore)]` |
| TSan: false positive with atomics | TSan doesn’t understand Rust’s atomic ordering model perfectly | Add `TSAN_OPTIONS=suppressions=tsan.supp` with specific suppressions |
Try It Yourself
1. Trigger a Miri UB detection: Write an `unsafe` function that creates two `&mut` references to the same `i32` (aliasing violation). Run `cargo +nightly miri test` and observe the “Stacked Borrows” error. Fix it with `UnsafeCell` or separate allocations.
2. Run ASan on a deliberate bug: Create a test that does `unsafe` out-of-bounds array access. Build with `RUSTFLAGS="-Zsanitizer=address"` and observe ASan’s report. Note how it pinpoints the exact line.
3. Benchmark Miri overhead: Time `cargo test --lib` vs `cargo +nightly miri test --lib` on the same test suite. Calculate the slowdown factor. Based on this, decide which tests to run under Miri in CI and which to skip with `#[cfg_attr(miri, ignore)]`.
Safety Verification Decision Tree
flowchart TD
START["Have unsafe code?"] -->|No| SAFE["Safe Rust — no\nverification needed"]
START -->|Yes| KIND{"What kind?"}
KIND -->|"Pure Rust unsafe"| MIRI["Miri\nMIR interpreter\ncatches aliasing, UB, leaks"]
KIND -->|"FFI / C interop"| VALGRIND["Valgrind memcheck\nor ASan"]
KIND -->|"Concurrent unsafe"| CONC{"Lock-free?"}
CONC -->|"Atomics/lock-free"| LOOM["loom\nModel checker for atomics"]
CONC -->|"Mutex/shared state"| TSAN["TSan or\nMiri -Zmiri-check-number-validity"]
MIRI --> CI_MIRI["CI: cargo +nightly miri test"]
VALGRIND --> CI_VALGRIND["CI: valgrind --leak-check=full"]
style SAFE fill:#91e5a3,color:#000
style MIRI fill:#e3f2fd,color:#000
style VALGRIND fill:#ffd43b,color:#000
style LOOM fill:#ff6b6b,color:#000
style TSAN fill:#ffd43b,color:#000
🏋️ Exercises
🟡 Exercise 1: Trigger a Miri UB Detection
Write an unsafe function that creates two &mut references to the same i32 (aliasing violation). Run cargo +nightly miri test and observe the Stacked Borrows error. Fix it.
Solution
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
#[test]
fn aliasing_ub() {
let mut x: i32 = 42;
let ptr = &mut x as *mut i32;
unsafe {
// BUG: Two &mut references to the same location
let _a = &mut *ptr;
let _b = &mut *ptr; // Miri: Stacked Borrows violation!
}
}
}
}
Fix: use separate allocations or UnsafeCell:
#![allow(unused)]
fn main() {
use std::cell::UnsafeCell;
#[test]
fn no_aliasing_ub() {
let x = UnsafeCell::new(42);
unsafe {
let a = &mut *x.get();
*a = 100;
}
}
}
🔴 Exercise 2: ASan Out-of-Bounds Detection
Create a test with unsafe out-of-bounds array access. Build with RUSTFLAGS="-Zsanitizer=address" on nightly and observe ASan’s report.
Solution
#![allow(unused)]
fn main() {
#[test]
fn oob_access() {
let arr = [1u8, 2, 3, 4, 5];
let ptr = arr.as_ptr();
unsafe {
let _val = *ptr.add(10); // Out of bounds!
}
}
}
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std \
--target x86_64-unknown-linux-gnu -- oob_access
# ASan report: stack-buffer-overflow at <exact address>
Key Takeaways
- Miri is the tool for pure-Rust `unsafe` — it catches aliasing violations, use-after-free, and leaks that compile and pass tests
- Valgrind is the tool for FFI/C interop — it works on the final binary without recompilation
- Sanitizers (ASan, TSan, MSan) require nightly but run at near-native speed — ideal for large test suites
- `loom` is purpose-built for verifying lock-free concurrent data structures
- Run Miri in CI on every push; run sanitizers on a nightly schedule to avoid slowing the main pipeline
Dependency Management and Supply Chain Security 🟢
What you’ll learn:
- Scanning for known vulnerabilities with `cargo-audit`
- Enforcing license, advisory, and source policies with `cargo-deny`
- Supply chain trust verification with Mozilla’s `cargo-vet`
- Tracking outdated dependencies and detecting breaking API changes
- Visualizing and deduplicating your dependency tree

Cross-references: Release Profiles — `cargo-udeps` trims unused dependencies found here · CI/CD Pipeline — audit and deny jobs in the pipeline · Build Scripts — `build-dependencies` are part of your supply chain too
A Rust binary doesn’t just contain your code — it contains every transitive
dependency in your Cargo.lock. A vulnerability, license violation, or
malicious crate anywhere in that tree becomes your problem. This chapter
covers the tools that make dependency management auditable and automated.
cargo-audit — Known Vulnerability Scanning
cargo-audit
checks your Cargo.lock against the RustSec Advisory Database,
which tracks known vulnerabilities in published crates.
# Install
cargo install cargo-audit
# Scan for known vulnerabilities
cargo audit
# Output:
# Crate: chrono
# Version: 0.4.19
# Title: Potential segfault in localtime_r invocations
# Date: 2020-11-10
# ID: RUSTSEC-2020-0159
# URL: https://rustsec.org/advisories/RUSTSEC-2020-0159
# Solution: Upgrade to >= 0.4.20
# Check and fail CI if vulnerabilities exist
cargo audit --deny warnings
# Generate JSON output for automated processing
cargo audit --json
# Fix vulnerabilities by updating Cargo.lock
# (requires installing with: cargo install cargo-audit --features=fix)
cargo audit fix
CI integration:
# .github/workflows/audit.yml
name: Security Audit
on:
schedule:
- cron: '0 0 * * *' # Daily check — advisories appear continuously
push:
paths: ['Cargo.lock']
jobs:
audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: rustsec/audit-check@v2
with:
token: ${{ secrets.GITHUB_TOKEN }}
cargo-deny — Comprehensive Policy Enforcement
cargo-deny goes far beyond
vulnerability scanning. It enforces policies across four dimensions:
- Advisories — known vulnerabilities (like cargo-audit)
- Licenses — allowed/denied license list
- Bans — forbidden crates or duplicate versions
- Sources — allowed registries and git sources
# Install
cargo install cargo-deny
# Initialize configuration
cargo deny init
# Creates deny.toml with documented defaults
# Run all checks
cargo deny check
# Run specific checks
cargo deny check advisories
cargo deny check licenses
cargo deny check bans
cargo deny check sources
Example deny.toml:
# deny.toml
[advisories]
vulnerability = "deny" # Fail on known vulnerabilities
unmaintained = "warn" # Warn on unmaintained crates
yanked = "deny" # Fail on yanked crates
notice = "warn" # Warn on informational advisories
[licenses]
unlicensed = "deny" # All crates must have a license
allow = [
"MIT",
"Apache-2.0",
"BSD-2-Clause",
"BSD-3-Clause",
"ISC",
"Unicode-DFS-2016",
]
copyleft = "deny" # No GPL/LGPL/AGPL in this project
default = "deny" # Deny anything not explicitly allowed
[bans]
multiple-versions = "warn" # Warn if same crate appears at 2 versions
wildcards = "deny" # No path = "*" in dependencies
highlight = "all" # Show all duplicates, not just first
# Ban specific problematic crates
deny = [
# openssl-sys pulls in C OpenSSL — prefer rustls
{ name = "openssl-sys", wrappers = ["native-tls"] },
]
# Allow specific duplicate versions (when unavoidable)
[[bans.skip]]
name = "syn"
version = "1.0" # syn 1.x and 2.x often coexist
[sources]
unknown-registry = "deny" # Only allow crates.io
unknown-git = "deny" # No random git dependencies
allow-registry = ["https://github.com/rust-lang/crates.io-index"]
License enforcement is particularly valuable for commercial projects:
# Check which licenses are in your dependency tree
cargo deny list
# Output:
# MIT — 127 crates
# Apache-2.0 — 89 crates
# BSD-3-Clause — 12 crates
# MPL-2.0 — 3 crates ← might need legal review
# Unicode-DFS — 1 crate
cargo-vet — Supply Chain Trust Verification
cargo-vet (from Mozilla) addresses a
different question: not “does this crate have known bugs?” but “has a trusted
human actually reviewed this code?”
# Install
cargo install cargo-vet
# Initialize (creates supply-chain/ directory)
cargo vet init
# Check which crates need review
cargo vet
# After reviewing a crate, certify it:
cargo vet certify serde 1.0.203
# Records that you've audited serde 1.0.203 for your criteria
# Import audits from trusted organizations
cargo vet import mozilla
cargo vet import google
cargo vet import bytecode-alliance
How it works:
supply-chain/
├── audits.toml ← Your team's audit certifications
├── config.toml ← Trust configuration and criteria
└── imports.lock ← Pinned imports from other organizations
cargo-vet is most valuable for organizations with strict supply-chain
requirements (government, finance, infrastructure). For most teams,
cargo-deny provides sufficient protection.
cargo-outdated and cargo-semver-checks
cargo-outdated — find dependencies that have newer versions:
cargo install cargo-outdated
cargo outdated --workspace
# Output:
# Name Project Compat Latest Kind
# serde 1.0.193 1.0.203 1.0.203 Normal
# regex 1.9.6 1.10.4 1.10.4 Normal
# thiserror 1.0.50 1.0.61 2.0.3 Normal ← major version available
cargo-semver-checks — detect breaking API changes before publishing.
Essential for library crates:
cargo install cargo-semver-checks
# Check if your changes are semver-compatible
cargo semver-checks
# Output:
# ✗ Function `parse_gpu_csv` is now private (was public)
# → This is a BREAKING change. Bump MAJOR version.
#
# ✗ Struct `GpuInfo` has a new required field `power_limit_w`
# → This is a BREAKING change. Bump MAJOR version.
#
# ✓ Function `parse_gpu_csv_v2` was added (non-breaking)
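The second finding above, a new required field, is worth internalizing. A minimal sketch of the hazard and one common mitigation (the struct and field names echo the example output; the constructor is an illustrative assumption, not project code):

```rust
// Adding a required field to a plain `pub` struct is semver-breaking: every
// downstream `GpuInfo { index: 0 }` literal stops compiling. Marking the
// struct #[non_exhaustive] and exposing a constructor leaves room to add
// fields later without a major version bump.
#[non_exhaustive]
#[derive(Debug)]
pub struct GpuInfo {
    pub index: u32,
    pub power_limit_w: u32, // new field: tolerable only because of non_exhaustive
}

impl GpuInfo {
    pub fn new(index: u32) -> Self {
        GpuInfo { index, power_limit_w: 400 }
    }
}

fn main() {
    let info = GpuInfo::new(0);
    assert_eq!(info.power_limit_w, 400);
    println!("{:?}", info);
}
```

With `#[non_exhaustive]`, downstream crates must go through `GpuInfo::new` (or a builder), so `cargo semver-checks` would report the added field as non-breaking.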
cargo-tree — Dependency Visualization and Deduplication
cargo tree is built into Cargo (no installation needed) and is invaluable
for understanding your dependency graph:
# Full dependency tree
cargo tree
# Find why a specific crate is included
cargo tree --invert --package openssl-sys
# Shows all paths from your crate to openssl-sys
# Find duplicate versions
cargo tree --duplicates
# Output:
# syn v1.0.109
# └── serde_derive v1.0.193
#
# syn v2.0.48
# ├── thiserror-impl v1.0.56
# └── tokio-macros v2.2.0
# Show only direct dependencies
cargo tree --depth 1
# Show dependency features
cargo tree --format "{p} {f}"
# Count total dependencies
cargo tree | wc -l
Deduplication strategy: When cargo tree --duplicates shows the same crate
at two major versions, check if you can update the dependency chain to unify them.
Each duplicate adds compile time and binary size.
Application: Multi-Crate Dependency Hygiene
The workspace uses [workspace.dependencies] for centralized
version management — an excellent practice. Combined with
cargo tree --duplicates for size
analysis, this prevents version drift and reduces binary bloat:
# Root Cargo.toml — all versions pinned in one place
[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["preserve_order"] }
regex = "1.10"
thiserror = "1.0"
anyhow = "1.0"
rayon = "1.8"
Recommended additions for the project:
# Add to CI pipeline:
cargo deny init # One-time setup
cargo deny check # Every PR — licenses, advisories, bans
cargo audit --deny warnings # Every push — vulnerability scanning
cargo outdated --workspace # Weekly — track available updates
Recommended deny.toml for the project:
[advisories]
vulnerability = "deny"
yanked = "deny"
[licenses]
allow = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "Unicode-DFS-2016"]
copyleft = "deny" # Hardware diagnostics tool — no copyleft
[bans]
multiple-versions = "warn" # Track duplicates, don't block yet
wildcards = "deny"
[sources]
unknown-registry = "deny"
unknown-git = "deny"
Supply Chain Audit Pipeline
flowchart LR
PR["Pull Request"] --> AUDIT["cargo audit\nKnown CVEs"]
AUDIT --> DENY["cargo deny check\nLicenses + Bans + Sources"]
DENY --> OUTDATED["cargo outdated\nWeekly schedule"]
OUTDATED --> SEMVER["cargo semver-checks\nLibrary crates only"]
AUDIT -->|"Fail"| BLOCK["❌ Block merge"]
DENY -->|"Fail"| BLOCK
SEMVER -->|"Breaking change"| BUMP["Bump major version"]
style BLOCK fill:#ff6b6b,color:#000
style BUMP fill:#ffd43b,color:#000
style PR fill:#e3f2fd,color:#000
🏋️ Exercises
🟢 Exercise 1: Audit Your Dependencies
Run cargo audit and cargo deny init && cargo deny check on any Rust project. How many advisories are found? How many license categories are in your tree?
Solution
cargo audit
# Note any advisories — often chrono, time, or older crates
cargo deny init
cargo deny list
# Shows license breakdown: MIT (N), Apache-2.0 (N), etc.
cargo deny check
# Shows full audit across all four dimensions
🟡 Exercise 2: Find and Eliminate Duplicate Dependencies
Run cargo tree --duplicates on a workspace. Find a crate that appears at two versions. Can you update Cargo.toml to unify them? Measure the compile-time and binary-size impact.
Solution
cargo tree --duplicates
# Typical: syn 1.x and syn 2.x
# Find who pulls in the old version:
cargo tree --invert --package syn@1.0.109
# Output: serde_derive 1.0.xxx -> syn 1.0.109
# Check if a newer serde_derive uses syn 2.x:
cargo update -p serde_derive
cargo tree --duplicates
# If syn 1.x is gone, you've eliminated a duplicate
# Measure impact:
time cargo build --release # Before and after
cargo bloat --release --crates | head -20
Key Takeaways
- `cargo audit` catches known CVEs — run it on every push and on a daily schedule
- `cargo deny` enforces four policy dimensions: advisories, licenses, bans, and sources
- Use `[workspace.dependencies]` to centralize version management across a multi-crate workspace
- `cargo tree --duplicates` reveals bloat; each duplicate adds compile time and binary size
- `cargo-vet` is for high-security environments; `cargo-deny` is sufficient for most teams
Release Profiles and Binary Size 🟡
What you’ll learn:
- Release profile anatomy: LTO, codegen-units, panic strategy, strip, opt-level
- Thin vs Fat vs Cross-Language LTO trade-offs
- Binary size analysis with `cargo-bloat`
- Dependency trimming with `cargo-udeps`, `cargo-machete` and `cargo-shear`

Cross-references: Compile-Time Tools — the other half of optimization · Benchmarking — measure runtime before you optimize · Dependencies — trimming deps reduces both size and compile time
The default cargo build --release is already good. But for production
deployment — especially single-binary tools deployed to thousands of servers —
there’s a significant gap between “good” and “optimized.” This chapter covers
the profile knobs and the tools to measure binary size.
Release Profile Anatomy
Cargo profiles control how rustc compiles your code. The defaults are
conservative — designed for broad compatibility, not maximum performance:
# Cargo.toml — Cargo's built-in defaults (what you get if you specify nothing)
[profile.release]
opt-level = 3 # Optimization level (0=none, 1=basic, 2=good, 3=aggressive)
lto = false # Link-time optimization OFF
codegen-units = 16 # Parallel compilation units (faster compile, less optimization)
panic = "unwind" # Stack unwinding on panic (larger binary, catch_unwind works)
strip = "none" # Keep all symbols and debug info
overflow-checks = false # No integer overflow checks in release
debug = false # No debug info in release
Production-optimized profile (what the project already uses):
[profile.release]
lto = true # Full cross-crate optimization
codegen-units = 1 # Single codegen unit — maximum optimization opportunity
panic = "abort" # No unwinding overhead — smaller, faster
strip = true # Remove all symbols — smaller binary
The impact of each setting:
| Setting | Default → Optimized | Binary Size | Runtime Speed | Compile Time |
|---|---|---|---|---|
| `lto` | `false` → `true` | -10 to -20% | +5 to +20% | 2-5× slower |
| `codegen-units` | `16` → `1` | -5 to -10% | +5 to +10% | 1.5-2× slower |
| `panic` | `"unwind"` → `"abort"` | -5 to -10% | Negligible | Negligible |
| `strip` | `"none"` → `true` | -50 to -70% | None | None |
| `opt-level` | `3` → `"s"` | -10 to -30% | -5 to -10% | Similar |
| `opt-level` | `3` → `"z"` | -15 to -40% | -10 to -20% | Similar |
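One behavioral note on the panic setting: the size win from `panic = "abort"` comes at the cost of `catch_unwind`, which only intercepts panics under the default unwind strategy. A minimal demonstration, compiled with the default profile:

```rust
use std::panic;

fn main() {
    // Under the default panic = "unwind", catch_unwind intercepts the panic
    // and the program keeps running. With panic = "abort" in the profile,
    // this same code would terminate the whole process instead.
    let result = panic::catch_unwind(|| {
        panic!("simulated sensor failure");
    });
    assert!(result.is_err());
    println!("recovered from panic");
}
```

If your tool relies on `catch_unwind` for fault isolation (e.g., surviving a panicking plugin), keep `panic = "unwind"` and accept the size cost.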
Additional profile tweaks:
[profile.release]
# All of the above, plus:
overflow-checks = true # Keep overflow checks even in release (safety > speed)
debug = "line-tables-only" # Minimal debug info for backtraces without full DWARF
rpath = false # Don't embed runtime library paths
incremental = false # Disable incremental compilation (cleaner builds)
# For size-optimized builds (embedded, WASM):
# opt-level = "z" # Optimize for size aggressively
# strip = "symbols" # Strip symbols but keep debug sections
Per-crate profile overrides — optimize hot crates, leave others alone:
# Dev builds: optimize dependencies but not your code (fast recompile)
[profile.dev.package."*"]
opt-level = 2 # Optimize all dependencies in dev mode
# Release builds: override specific crate optimization
[profile.release.package.serde_json]
opt-level = 3 # Maximum optimization for JSON parsing
codegen-units = 1
# Test profile: match release behavior for accurate integration tests
[profile.test]
opt-level = 1 # Some optimization to avoid timeout in slow tests
LTO in Depth — Thin vs Fat vs Cross-Language
Link-Time Optimization lets LLVM optimize across crate boundaries — inlining
functions from serde_json into your parsing code, removing dead code from
regex, etc. Without LTO, each crate is a separate optimization island.
[profile.release]
# Option 1: Fat LTO (default when lto = true)
lto = true
# All code merged into one LLVM module → maximum optimization
# Slowest compile, smallest/fastest binary
# Option 2: Thin LTO
lto = "thin"
# Each crate stays separate but LLVM does cross-module optimization
# Faster compile than fat LTO, nearly as good optimization
# Best trade-off for most projects
# Option 3: No LTO
lto = false
# Only intra-crate optimization
# Fastest compile, larger binary
# Option 4: Off (explicit)
lto = "off"
# Same as false
Fat LTO vs Thin LTO:
| Aspect | Fat LTO (true) | Thin LTO ("thin") |
|---|---|---|
| Optimization quality | Best | ~95% of fat |
| Compile time | Slow (all code in one module) | Moderate (parallel modules) |
| Memory usage | High (all LLVM IR in memory) | Lower (streaming) |
| Parallelism | None (single module) | Good (per-module) |
| Recommended for | Final release builds | CI builds, development |
Cross-language LTO — optimize across Rust and C boundaries:
[profile.release]
lto = true
# Cargo.toml — for crates using the cc crate
[build-dependencies]
cc = "1.0"
// build.rs — enable cross-language (linker-plugin) LTO
fn main() {
// The cc crate respects CFLAGS from the environment.
// For cross-language LTO, compile C code with:
// -flto=thin -O2
cc::Build::new()
.file("csrc/fast_parser.c")
.flag("-flto=thin")
.opt_level(2)
.compile("fast_parser");
}
# Enable linker-plugin LTO (requires compatible LLD or gold linker)
RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" \
cargo build --release
Cross-language LTO allows LLVM to inline C functions into Rust callers and vice versa. This is most impactful for FFI-heavy code where small C functions are called frequently (e.g., IPMI ioctl wrappers).
Binary Size Analysis with cargo-bloat
cargo-bloat answers:
“What functions and crates are taking up the most space in my binary?”
# Install
cargo install cargo-bloat
# Show largest functions
cargo bloat --release -n 20
# Output:
# File .text Size Crate Name
# 2.8% 5.1% 78.5KiB serde_json serde_json::de::Deserializer::parse_...
# 2.1% 3.8% 58.2KiB regex_syntax regex_syntax::ast::parse::ParserI::p...
# 1.5% 2.7% 42.1KiB accel_diag accel_diag::vendor::parse_smi_output
# ...
# Show by crate (which dependencies are biggest)
cargo bloat --release --crates
# Output:
# File .text Size Crate
# 12.3% 22.1% 340KiB serde_json
# 8.7% 15.6% 240KiB regex
# 6.2% 11.1% 170KiB std
# 5.1% 9.2% 141KiB accel_diag
# ...
# Compare two builds (before/after optimization)
cargo bloat --release --crates > before.txt
# ... make changes ...
cargo bloat --release --crates > after.txt
diff before.txt after.txt
Common bloat sources and fixes:
| Bloat Source | Typical Size | Fix |
|---|---|---|
| `regex` (full engine) | 200-400 KB | Use `regex-lite` if you don’t need Unicode |
| `serde_json` (full) | 200-350 KB | Consider `simd-json` or `sonic-rs` if perf matters |
| Generics monomorphization | Varies | Use `dyn Trait` at API boundaries |
| Formatting machinery (`Display`, `Debug`) | 50-150 KB | `#[derive(Debug)]` on large enums adds up |
| Panic message strings | 20-80 KB | `panic = "abort"` removes unwinding, `strip` removes strings |
| Unused features | Varies | Disable default features: `serde = { version = "1", default-features = false }` |
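The monomorphization row deserves a concrete picture. A small sketch of the trade-off: a generic function is duplicated once per concrete type it is instantiated with, while `dyn Trait` keeps a single compiled copy behind a vtable:

```rust
use std::fmt::Display;

// Generic: the compiler stamps out one copy of this body per concrete T it
// is called with (u32, f64, &str, ...) — fast calls, but .text grows with
// every instantiation.
fn render_generic<T: Display>(value: T) -> String {
    format!("[{value}]")
}

// dyn Trait: a single compiled copy, dispatched through a vtable — a small
// per-call cost, but no per-type duplication at an API boundary.
fn render_dyn(value: &dyn Display) -> String {
    format!("[{value}]")
}

fn main() {
    assert_eq!(render_generic(42u32), "[42]");
    assert_eq!(render_dyn(&"gpu0"), "[gpu0]");
    println!("ok");
}
```

A common pattern is generics in hot inner loops and `dyn Trait` at public API boundaries, where the vtable cost is negligible and the monomorphization savings add up.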
Trimming Dependencies with cargo-udeps
cargo-udeps finds dependencies
declared in Cargo.toml that your code doesn’t actually use:
# Install (requires nightly)
cargo install cargo-udeps
# Find unused dependencies
cargo +nightly udeps --workspace
# Output:
# unused dependencies:
# `diag_tool v0.1.0`
# └── "tempfile" (dev-dependency)
#
# `accel_diag v0.1.0`
# └── "once_cell" ← was needed before LazyLock, now dead
Every unused dependency:
- Increases compile time
- Increases binary size
- Adds supply chain risk
- Adds potential license complications
Alternative: cargo-machete — faster, heuristic-based approach:
cargo install cargo-machete
cargo machete
# Faster but may have false positives (heuristic, not compilation-based)
Alternative: cargo-shear — a sweet spot between cargo-udeps and cargo-machete:
cargo install cargo-shear
cargo shear --fix
# Slower than cargo-machete but much faster than cargo-udeps
# Far fewer false positives than cargo-machete
Size Optimization Decision Tree
flowchart TD
START["Binary too large?"] --> STRIP{"strip = true?"}
STRIP -->|"No"| DO_STRIP["Add strip = true\n-50 to -70% size"]
STRIP -->|"Yes"| LTO{"LTO enabled?"}
LTO -->|"No"| DO_LTO["Add lto = true\ncodegen-units = 1"]
LTO -->|"Yes"| BLOAT["Run cargo-bloat\n--crates"]
BLOAT --> BIG_DEP{"Large dependency?"}
BIG_DEP -->|"Yes"| REPLACE["Replace with lighter\nalternative or disable\ndefault features"]
BIG_DEP -->|"No"| UDEPS["cargo-udeps\nRemove unused deps"]
UDEPS --> OPT_LEVEL{"Need smaller?"}
OPT_LEVEL -->|"Yes"| SIZE_OPT["opt-level = 's' or 'z'"]
style DO_STRIP fill:#91e5a3,color:#000
style DO_LTO fill:#e3f2fd,color:#000
style REPLACE fill:#ffd43b,color:#000
style SIZE_OPT fill:#ff6b6b,color:#000
🏋️ Exercises
🟢 Exercise 1: Measure LTO Impact
Build a project with default release settings, then with lto = true + codegen-units = 1 + strip = true. Compare binary size and compile time.
Solution
# Default release
cargo build --release
ls -lh target/release/my-binary
time cargo build --release # Note time
# Optimized release — add to Cargo.toml:
# [profile.release]
# lto = true
# codegen-units = 1
# strip = true
# panic = "abort"
cargo clean
cargo build --release
ls -lh target/release/my-binary # Typically 30-50% smaller
time cargo build --release # Typically 2-3× slower to compile
🟡 Exercise 2: Find Your Biggest Crate
Run cargo bloat --release --crates on a project. Identify the largest dependency. Can you reduce it by disabling default features or switching to a lighter alternative?
Solution
cargo install cargo-bloat
cargo bloat --release --crates
# Output:
# File .text Size Crate
# 12.3% 22.1% 340KiB serde_json
# 8.7% 15.6% 240KiB regex
# For regex — try regex-lite if you don't need Unicode:
# regex-lite = "0.1" # ~10× smaller than full regex
# For serde — disable default features if you don't need std:
# serde = { version = "1", default-features = false, features = ["derive"] }
cargo bloat --release --crates # Compare after changes
Key Takeaways
- `lto = true` + `codegen-units = 1` + `strip = true` + `panic = "abort"` is the production release profile
- Thin LTO (`lto = "thin"`) gives 80% of Fat LTO’s benefit at a fraction of the compile cost
- `cargo-bloat --crates` tells you exactly which dependencies are eating binary space
- `cargo-udeps`, `cargo-machete` and `cargo-shear` find dead dependencies that waste compile time and binary size
- Per-crate profile overrides let you optimize hot crates without slowing the whole build
Compile-Time and Developer Tools 🟡
What you’ll learn:
- Compilation caching with `sccache` for local and CI builds
- Faster linking with `mold` (3-10× faster than the default linker)
- `cargo-nextest`: a faster, more informative test runner
- Developer visibility tools: `cargo-expand`, `cargo-geiger`, `cargo-watch`
- Workspace lints, MSRV policy, and documentation-as-CI
Cross-references: Release Profiles — LTO and binary size optimization · CI/CD Pipeline — these tools integrate into your pipeline · Dependencies — fewer deps = faster compiles
Compile-Time Optimization: sccache, mold, cargo-nextest
Long compile times are the #1 developer pain point in Rust. These tools collectively can cut iteration time by 50-80%:
sccache — Shared compilation cache:
# Install
cargo install sccache
# Configure as the Rust wrapper
export RUSTC_WRAPPER=sccache
# Or set permanently in .cargo/config.toml:
# [build]
# rustc-wrapper = "sccache"
# First build: normal speed (populates cache)
cargo build --release # 3 minutes
# Clean + rebuild: cache hits for unchanged crates
cargo clean && cargo build --release # 45 seconds
# Check cache statistics
sccache --show-stats
# Compile requests 1,234
# Cache hits 987 (80%)
# Cache misses 247
sccache supports shared caches (S3, GCS, Azure Blob) for team-wide and CI
cache sharing.
mold — A faster linker:
Linking is often the slowest phase. mold is 3-5× faster than lld and
10-20× faster than the default GNU ld:
# Install
sudo apt install mold # Ubuntu 22.04+
# Note: mold is for ELF targets (Linux). macOS uses Mach-O, not ELF.
# The macOS linker (ld64) is already quite fast; if you need faster:
# brew install sold # sold = mold for Mach-O (experimental, less mature)
# In practice, macOS link times are rarely a bottleneck.
# Use mold for linking
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
# See https://github.com/rui314/mold/blob/main/docs/mold.md#environment-variables
export MOLD_JOBS=1
# Verify mold is being used
cargo build -v 2>&1 | grep mold
cargo-nextest — A faster test runner:
# Install
cargo install cargo-nextest
# Run tests (parallel by default, per-test timeout, retry)
cargo nextest run
# Key advantages over cargo test:
# - Each test runs in its own process → better isolation
# - Parallel execution with smart scheduling
# - Per-test timeouts (no more hanging CI)
# - JUnit XML output for CI
# - Retry failed tests
# Configuration
cargo nextest run --retries 2 --fail-fast
# Archive test binaries (useful for CI: build once, test on multiple machines)
cargo nextest archive --archive-file tests.tar.zst
cargo nextest run --archive-file tests.tar.zst
# .config/nextest.toml
[profile.default]
retries = 0
slow-timeout = { period = "60s", terminate-after = 3 }
fail-fast = true
[profile.ci]
retries = 2
fail-fast = false
junit = { path = "test-results.xml" }
Combined dev configuration:
# .cargo/config.toml — optimize the development inner loop
[build]
rustc-wrapper = "sccache" # Cache compilation artifacts
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"] # Faster linking
# Dev profile: optimize deps but not your code
# (put in Cargo.toml)
# [profile.dev.package."*"]
# opt-level = 2
cargo-expand and cargo-geiger — Visibility Tools
cargo-expand — see what macros generate:
cargo install cargo-expand
# Expand all macros in a specific module
cargo expand --lib accel_diag::vendor
# Expand a specific derive
# Given: #[derive(Debug, Serialize, Deserialize)]
# cargo expand shows the generated impl blocks
cargo expand --lib --tests
Invaluable for debugging #[derive] macro output, macro_rules! expansions,
and understanding what serde generates for your types.
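As a flavor of what `cargo expand` shows, here is a hand-written approximation of the impl that `#[derive(Debug)]` generates for a small struct (the real expansion uses the same `debug_struct` builder, with minor cosmetic differences; the struct is illustrative):

```rust
struct GpuInfo {
    index: u32,
    name: String,
}

// Roughly what #[derive(Debug)] expands to — cargo expand shows the
// compiler-generated version of this impl.
impl std::fmt::Debug for GpuInfo {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("GpuInfo")
            .field("index", &self.index)
            .field("name", &self.name)
            .finish()
    }
}

fn main() {
    let g = GpuInfo { index: 0, name: "H100".to_string() };
    // → GpuInfo { index: 0, name: "H100" }
    println!("{:?}", g);
}
```

Seeing this expansion makes it obvious why deriving `Debug` on many large types adds measurable code size: each derive is a real, monomorphized `fmt` implementation.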
In addition to cargo-expand, you can also use rust-analyzer to expand macros:
- Move cursor to the macro you want to check.
- Open command palette (e.g.
F1on VSCode). - Search for
rust-analyzer: Expand macro recursively at caret.
cargo-geiger — count unsafe usage across your dependency tree:
cargo install cargo-geiger
cargo geiger
# Output:
# Metric output format: x/y
# x = unsafe code used by the build
# y = total unsafe code found in the crate
#
# Functions Expressions Impls Traits Methods
# 0/0 0/0 0/0 0/0 0/0 ✅ my_crate
# 0/5 0/23 0/2 0/0 0/3 ✅ serde
# 3/3 14/14 0/0 0/0 2/2 ❗ libc
# 15/15 142/142 4/4 0/0 12/12 ☢️ ring
# The symbols:
# ✅ = no unsafe used
# ❗ = some unsafe used
# ☢️ = heavily unsafe
For the project’s zero-unsafe policy, cargo geiger verifies that no
dependency introduces unsafe code into the call graph that your code actually
exercises.
Workspace Lints — [workspace.lints]
Since Rust 1.74, you can configure Clippy and compiler lints centrally in
Cargo.toml — no more #![deny(...)] at the top of every crate:
# Root Cargo.toml — lint configuration for all crates
[workspace.lints.clippy]
unwrap_used = "warn" # Prefer ? or expect("reason")
dbg_macro = "deny" # No dbg!() in committed code
todo = "warn" # Track incomplete implementations
large_enum_variant = "warn" # Catch accidental size bloat
[workspace.lints.rust]
unsafe_code = "deny" # Enforce zero-unsafe policy
missing_docs = "warn" # Encourage documentation
# Each crate's Cargo.toml — opt into workspace lints
[lints]
workspace = true
This replaces scattered #![deny(clippy::unwrap_used)] attributes and ensures
consistent policy across the entire workspace.
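What the `unwrap_used` lint steers you toward, as a small sketch: return a `Result` and let callers propagate with `?` instead of panicking mid-diagnostic (the function is illustrative):

```rust
// Flagged by clippy::unwrap_used:
//     let port: u16 = s.parse().unwrap();
// Preferred: surface the error with context and let the caller decide.
fn parse_port(s: &str) -> Result<u16, String> {
    s.trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {s:?}: {e}"))
}

fn main() -> Result<(), String> {
    let port = parse_port(" 8080 ")?;
    assert_eq!(port, 8080);
    assert!(parse_port("gpu0").is_err());
    println!("port = {port}");
    Ok(())
}
```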
Auto-fixing Clippy warnings:
# Let Clippy automatically fix machine-applicable suggestions
cargo clippy --fix --workspace --all-targets --allow-dirty
# Fix and also apply suggestions that may change behavior (review carefully!)
cargo clippy --fix --workspace --all-targets --allow-dirty -- -W clippy::pedantic
Tip: Run
cargo clippy --fixbefore committing. It handles trivial issues (unused imports, redundant clones, type simplifications) that are tedious to fix by hand.
MSRV Policy and rust-version
Minimum Supported Rust Version (MSRV) ensures your crate compiles on older toolchains. This matters when deploying to systems with frozen Rust versions.
# Cargo.toml
[package]
name = "diag_tool"
version = "0.1.0"
rust-version = "1.75" # Minimum Rust version required
# Verify MSRV compliance
cargo +1.75.0 check --workspace
# Automated MSRV discovery
cargo install cargo-msrv
cargo msrv find
# Output: Minimum Supported Rust Version is 1.75.0
# Verify in CI
cargo msrv verify
MSRV in CI:
jobs:
msrv:
name: Check MSRV
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@master
with:
toolchain: "1.75.0" # Match rust-version in Cargo.toml
- run: cargo check --workspace
MSRV strategy:
- Binary applications (like a large project): Use latest stable. No MSRV needed.
- Library crates (published to crates.io): Set MSRV to oldest Rust version
that supports all features you use. Commonly
N-2(two versions behind current). - Enterprise deployments: Set MSRV to match the oldest Rust version installed on your fleet.
Application: Production Binary Profile
The project already has an excellent release profile:
# Current workspace Cargo.toml
[profile.release]
lto = true # ✅ Full cross-crate optimization
codegen-units = 1 # ✅ Maximum optimization
panic = "abort" # ✅ No unwinding overhead
strip = true # ✅ Remove symbols for deployment
[profile.dev]
opt-level = 0 # ✅ Fast compilation
debug = true # ✅ Full debug info
Recommended additions:
# Optimize dependencies in dev mode (faster test execution)
[profile.dev.package."*"]
opt-level = 2
# Test profile: some optimization to prevent timeout in slow tests
[profile.test]
opt-level = 1
# Keep overflow checks in release (safety)
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
overflow-checks = true # ← add this: catch integer overflows
debug = "line-tables-only" # ← add this: backtraces without full DWARF
Recommended developer tooling:
# .cargo/config.toml (proposed)
[build]
rustc-wrapper = "sccache" # 80%+ cache hit after first build
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"] # 3-5× faster linking
Expected impact on the project:
| Metric | Current | With Additions |
|---|---|---|
| Release binary | ~10 MB (stripped, LTO) | Same |
| Dev build time | ~45s | ~25s (sccache + mold) |
| Rebuild (1 file change) | ~15s | ~5s (sccache + mold) |
| Test execution | cargo test | cargo nextest — 2× faster |
| Dep vulnerability scanning | None | cargo audit in CI |
| License compliance | Manual | cargo deny automated |
| Unused dependency detection | Manual | cargo udeps in CI |
cargo-watch — Auto-Rebuild on File Changes
cargo-watch re-runs a command
every time a source file changes — essential for tight feedback loops:
# Install
cargo install cargo-watch
# Re-check on every save (instant feedback)
cargo watch -x check
# Run clippy + tests on change
cargo watch -x 'clippy --workspace --all-targets' -x 'test --workspace --lib'
# Watch only specific crates (faster for large workspaces)
cargo watch -w accel_diag/src -x 'test -p accel_diag'
# Clear screen between runs
cargo watch -c -x check
Tip: Combine with
mold+sccachefrom above for sub-second re-check times on incremental changes.
cargo doc and Workspace Documentation
For a large workspace, generated documentation is essential for
discoverability. cargo doc uses rustdoc to produce HTML docs from
doc-comments and type signatures:
# Generate docs for all workspace crates (opens in browser)
cargo doc --workspace --no-deps --open
# Include private items (useful during development)
cargo doc --workspace --no-deps --document-private-items
# Check doc-links without generating HTML (fast CI check)
cargo doc --workspace --no-deps 2>&1 | grep -E 'warning|error'
Intra-doc links — link between types across crates without URLs:
/// Runs GPU diagnostics using [`GpuConfig`] settings.
///
/// See [`crate::accel_diag::run_diagnostics`] for the implementation.
/// Returns [`DiagResult`] which can be serialized to the
/// [`DerReport`](crate::core_lib::DerReport) format.
pub fn run_accel_diag(config: &GpuConfig) -> DiagResult {
    // ...
}
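Doc examples are more than prose: rustdoc compiles and runs fenced examples as doctests under `cargo test`, which is what makes documentation part of CI. A minimal sketch (the function and crate path are illustrative):

```rust
/// Converts a millidegree reading (as sysfs thermal zones report) to whole
/// degrees Celsius.
///
/// # Examples
///
/// ```
/// // In a library crate, `cargo test` compiles and runs this block:
/// // assert_eq!(diag::millis_to_celsius(45_500), 45);
/// ```
pub fn millis_to_celsius(millis: i64) -> i64 {
    millis / 1000
}

fn main() {
    assert_eq!(millis_to_celsius(45_500), 45);
    println!("{}", millis_to_celsius(45_500));
}
```

A failing doctest fails CI, so examples in the docs can never silently rot out of sync with the API.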
Show platform-specific APIs in docs:
// Cargo.toml: [package.metadata.docs.rs]
// all-features = true
// rustdoc-args = ["--cfg", "docsrs"]

/// Windows-only: read battery status via Win32 API.
///
/// Only available on `cfg(windows)` builds.
#[cfg(windows)]
#[cfg_attr(docsrs, doc(cfg(windows)))] // Shows "Available on Windows only" badge in docs
pub fn get_battery_status() -> Option<u8> {
    // ...
}
CI documentation check:
# Add to CI workflow
- name: Check documentation
run: RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
# Treats broken intra-doc links as errors
For the project: With many crates,
cargo doc --workspaceis the best way for new team members to discover the API surface. AddRUSTDOCFLAGS="-D warnings"to CI to catch broken doc-links before merge.
Compile-Time Decision Tree
flowchart TD
START["Compile too slow?"] --> WHERE{"Where's the time?"}
WHERE -->|"Recompiling\nunchanged crates"| SCCACHE["sccache\nShared compilation cache"]
WHERE -->|"Linking phase"| MOLD["mold linker\n3-10× faster linking"]
WHERE -->|"Running tests"| NEXTEST["cargo-nextest\nParallel test runner"]
WHERE -->|"Everything"| COMBO["All of the above +\ncargo-udeps to trim deps"]
SCCACHE --> CI_CACHE{"CI or local?"}
CI_CACHE -->|"CI"| S3["S3/GCS shared cache"]
CI_CACHE -->|"Local"| LOCAL["Local disk cache\nauto-configured"]
style SCCACHE fill:#91e5a3,color:#000
style MOLD fill:#e3f2fd,color:#000
style NEXTEST fill:#ffd43b,color:#000
style COMBO fill:#b39ddb,color:#000
🏋️ Exercises
🟢 Exercise 1: Set Up sccache + mold
Install sccache and mold, configure them in .cargo/config.toml, then measure the compile time improvement on a clean rebuild.
Solution
# Install
cargo install sccache
sudo apt install mold # Ubuntu 22.04+
# Configure .cargo/config.toml:
cat > .cargo/config.toml << 'EOF'
[build]
rustc-wrapper = "sccache"
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
EOF
# First build (populates cache)
time cargo build --release # e.g., 180s
# Clean + rebuild (cache hits)
cargo clean
time cargo build --release # e.g., 45s
sccache --show-stats
# Cache hits should be 60-80%+
🟡 Exercise 2: Switch to cargo-nextest
Install cargo-nextest and run your test suite. Compare wall-clock time with cargo test. What’s the speedup?
Solution
cargo install cargo-nextest
# Standard test runner
time cargo test --workspace 2>&1 | tail -5
# nextest (parallel per-test-binary execution)
time cargo nextest run --workspace 2>&1 | tail -5
# Typical speedup: 2-5× for large workspaces
# nextest also provides:
# - Per-test timing
# - Retries for flaky tests
# - JUnit XML output for CI
cargo nextest run --workspace --retries 2
Key Takeaways
- `sccache` with S3/GCS backend shares compilation cache across team and CI
- `mold` is the fastest ELF linker — link times drop from seconds to milliseconds
- `cargo-nextest` runs tests in parallel per-binary with better output and retry support
- `cargo-geiger` counts `unsafe` usage — run it before accepting new dependencies
- `[workspace.lints]` centralizes Clippy and rustc lint configuration across a multi-crate workspace
no_std and Feature Verification 🔶
What you’ll learn:
- Verifying feature combinations systematically with `cargo-hack`
- The three layers of Rust: `core` vs `alloc` vs `std` and when to use each
- Building `no_std` crates with custom panic handlers and allocators
- Testing `no_std` code on host and with QEMU

Cross-references: Windows & Conditional Compilation — the platform half of this topic · Cross-Compilation — cross-compiling to ARM and embedded targets · Miri and Sanitizers — verifying `unsafe` code in `no_std` environments · Build Scripts — `cfg` flags emitted by `build.rs`
Rust runs everywhere from 8-bit microcontrollers to cloud servers. This chapter
covers the foundation: stripping the standard library with #![no_std] and
verifying that your feature combinations actually compile.
Verifying Feature Combinations with cargo-hack
cargo-hack tests all feature
combinations systematically — essential for crates with #[cfg(...)] code:
# Install
cargo install cargo-hack
# Check that every feature compiles individually
cargo hack check --each-feature --workspace
# The nuclear option: test ALL feature combinations (exponential!)
# Only practical for crates with <8 features.
cargo hack check --feature-powerset --workspace
# Practical compromise: test each feature alone + all features + no features
cargo hack check --each-feature --workspace --no-dev-deps
cargo check --workspace --all-features
cargo check --workspace --no-default-features
Why this matters for the project:
If you add platform features (linux, windows, direct-ipmi, direct-accel-api),
cargo-hack catches combinations that break:
# Example: features that gate platform code
[features]
default = ["linux"]
linux = [] # Linux-specific hardware access
windows = ["dep:windows-sys"] # Windows-specific APIs
direct-ipmi = [] # unsafe IPMI ioctl (ch05)
direct-accel-api = [] # unsafe accel-mgmt FFI (ch05)
# Verify all features compile in isolation AND together
cargo hack check --each-feature -p diag_tool
# Catches: "feature 'windows' doesn't compile without 'direct-ipmi'"
# Catches: "#[cfg(feature = \"linux\")] has a typo — it's 'lnux'"
CI integration:
# Add to CI pipeline (fast — just compilation checks)
- name: Feature matrix check
run: cargo hack check --each-feature --workspace --no-dev-deps
Rule of thumb: Run cargo hack check --each-feature in CI for any crate with 2+ features. Run --feature-powerset only for core library crates with <8 features — it’s exponential ($2^n$ combinations).
no_std — When and Why
#![no_std] tells the compiler: “don’t link the standard library.” Your
crate can only use core (and optionally alloc). Why would you want this?
| Scenario | Why no_std |
|---|---|
| Embedded firmware (ARM Cortex-M, RISC-V) | No OS, no heap, no file system |
| UEFI diagnostics tool | Pre-boot environment, no OS APIs |
| Kernel modules | Kernel space can’t use userspace std |
| WebAssembly (WASM) | Minimize binary size, no OS dependencies |
| Bootloaders | Run before any OS exists |
| Shared library with C interface | Avoid Rust runtime in callers |
For hardware diagnostics, no_std becomes relevant when building:
- UEFI-based pre-boot diagnostic tools (before the OS loads)
- BMC firmware diagnostics (resource-constrained ARM SoCs)
- Kernel-level PCIe diagnostics (kernel module or eBPF probe)
core vs alloc vs std — The Three Layers
┌─────────────────────────────────────────────────────────────┐
│ std │
│ Everything in core + alloc, PLUS: │
│ • File I/O (std::fs, std::io) │
│ • Networking (std::net) │
│ • Threads (std::thread) │
│ • Time (std::time) │
│ • Environment (std::env) │
│ • Process (std::process) │
│ • OS-specific (std::os::unix, std::os::windows) │
├─────────────────────────────────────────────────────────────┤
│ alloc (available with #![no_std] + extern crate │
│ alloc, if you have a global allocator) │
│ • String, Vec, Box, Rc, Arc │
│ • BTreeMap, BTreeSet │
│ • format!() macro │
│ • Collections and smart pointers that need heap │
├─────────────────────────────────────────────────────────────┤
│ core (always available, even in #![no_std]) │
│ • Primitive types (u8, bool, char, etc.) │
│ • Option, Result │
│ • Iterator, slice, array, str (slices, not String) │
│ • Traits: Clone, Copy, Debug, Display, From, Into │
│ • Atomics (core::sync::atomic) │
│ • Cell, RefCell (core::cell) — Pin (core::pin) │
│ • core::fmt (formatting without allocation) │
│ • core::mem, core::ptr (low-level memory operations) │
│ • Math: core::num, basic arithmetic │
└─────────────────────────────────────────────────────────────┘
What you lose without std:
- No HashMap (requires a hasher — use BTreeMap from alloc, or hashbrown)
- No println!() (requires stdout — use core::fmt::Write to a buffer)
- No std::error::Error (stabilized in core as core::error::Error since Rust 1.81, but many ecosystems haven’t migrated)
- No file I/O, no networking, no threads (unless provided by a platform HAL)
- No Mutex (use spin::Mutex or platform-specific locks)
Building a no_std Crate
#![allow(unused)]
fn main() {
// src/lib.rs — a no_std library crate
#![no_std]
// Optionally use heap allocation
extern crate alloc;
use alloc::string::String;
use alloc::vec::Vec;
use core::fmt;
/// Temperature reading from a thermal sensor.
/// This struct works in any environment — bare metal to Linux.
#[derive(Clone, Copy, Debug)]
pub struct Temperature {
/// Raw sensor value (0.0625°C per LSB for typical I2C sensors)
raw: u16,
}
impl Temperature {
pub const fn from_raw(raw: u16) -> Self {
Self { raw }
}
/// Convert to degrees Celsius (fixed-point, no FPU required)
pub const fn millidegrees_c(&self) -> i32 {
(self.raw as i32) * 625 / 10 // 0.0625°C resolution
}
pub fn degrees_c(&self) -> f32 {
self.raw as f32 * 0.0625
}
}
impl fmt::Display for Temperature {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let md = self.millidegrees_c();
// Handle sign correctly for values between -0.999°C and -0.001°C
// where md / 1000 == 0 but the value is negative.
if md < 0 && md > -1000 {
write!(f, "-0.{:03}°C", (-md) % 1000)
} else {
write!(f, "{}.{:03}°C", md / 1000, (md % 1000).abs())
}
}
}
/// Parse space-separated temperature values.
/// Uses alloc — requires a global allocator.
pub fn parse_temperatures(input: &str) -> Vec<Temperature> {
input
.split_whitespace()
.filter_map(|s| s.parse::<u16>().ok())
.map(Temperature::from_raw)
.collect()
}
/// Format without allocation — writes directly to a buffer.
/// Works in `core`-only environments (no alloc, no heap).
pub fn format_temp_into(temp: &Temperature, buf: &mut [u8]) -> usize {
use core::fmt::Write;
struct SliceWriter<'a> {
buf: &'a mut [u8],
pos: usize,
}
impl<'a> Write for SliceWriter<'a> {
fn write_str(&mut self, s: &str) -> fmt::Result {
let bytes = s.as_bytes();
let remaining = self.buf.len() - self.pos;
if bytes.len() > remaining {
// Buffer full — signal the error instead of silently truncating.
// Callers can check the returned pos for partial writes.
return Err(fmt::Error);
}
self.buf[self.pos..self.pos + bytes.len()].copy_from_slice(bytes);
self.pos += bytes.len();
Ok(())
}
}
let mut w = SliceWriter { buf, pos: 0 };
let _ = write!(w, "{}", temp);
w.pos
}
}
# Cargo.toml for a no_std crate
[package]
name = "thermal-sensor"
version = "0.1.0"
edition = "2021"
[features]
default = ["alloc"]
alloc = [] # Enable Vec, String, etc.
std = ["alloc"] # Enable full std (implies alloc)
[dependencies]
# Use no_std-compatible crates
serde = { version = "1.0", default-features = false, features = ["derive"] }
# ↑ default-features = false drops std dependency!
Key crate pattern: Many popular crates (serde, log, rand, embedded-hal) support no_std via default-features = false. Always check whether a dependency requires std before using it in a no_std context. Note that some crates (e.g., regex) require at least alloc and don’t work in core-only environments.
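On the lib.rs side, crates with this kind of feature layout typically use the cfg_attr gating idiom so a single crate serves core-only, alloc, and std consumers (a sketch of the common ecosystem pattern, not code from the project):

```rust
// src/lib.rs — apply no_std only when the "std" feature is off
#![cfg_attr(not(feature = "std"), no_std)]

// Pull in the alloc crate only when the "alloc" feature is on
#[cfg(feature = "alloc")]
extern crate alloc;

// alloc-dependent APIs live behind the same gate
#[cfg(feature = "alloc")]
use alloc::vec::Vec;
```

With this header, `cargo check --no-default-features` verifies the core-only surface, while default builds get the full API — which is exactly the matrix cargo-hack exercises.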
Custom Panic Handlers and Allocators
In #![no_std] binaries (not libraries), you must provide a panic handler
and optionally a global allocator:
// src/main.rs — a no_std binary (e.g., UEFI diagnostic)
#![no_std]
#![no_main]
extern crate alloc;
use core::panic::PanicInfo;
// Required: what to do on panic (no stack unwinding available)
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
// In embedded: blink an LED, write to UART, hang
// In UEFI: write to console, halt
// Minimal: just loop forever
loop {
core::hint::spin_loop();
}
}
// Required if using alloc: provide a global allocator
use alloc::alloc::{GlobalAlloc, Layout};
struct BumpAllocator {
// Simple bump allocator for embedded/UEFI
// In practice, use a crate like `linked_list_allocator` or `embedded-alloc`
}
// WARNING: This is a non-functional placeholder! Calling alloc() will return
// null, causing immediate UB (the global allocator contract requires non-null
// returns for non-zero-sized allocations). In real code, use an established
// allocator crate:
// - embedded-alloc (embedded targets)
// - linked_list_allocator (UEFI / OS kernels)
// - talc (general-purpose no_std)
unsafe impl GlobalAlloc for BumpAllocator {
/// # Safety
/// Layout must have non-zero size. Returns null (placeholder — will crash).
unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
// PLACEHOLDER — will crash! Replace with real allocation logic.
core::ptr::null_mut()
}
/// # Safety
/// `_ptr` must have been returned by `alloc` with a compatible layout.
unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
// No-op for bump allocator
}
}
#[global_allocator]
static ALLOCATOR: BumpAllocator = BumpAllocator {};
// Entry point (platform-specific, not fn main)
// For UEFI: #[entry] or efi_main
// For embedded: #[cortex_m_rt::entry]
Testing no_std Code
Tests run on the host machine, which has std. The trick: your library is
no_std, but your test harness uses std:
#![allow(unused)]
fn main() {
// Your crate: #![no_std] in src/lib.rs
// But tests run under std automatically:
#[cfg(test)]
mod tests {
use super::*;
// std is available here — println!, assert!, Vec all work
#[test]
fn test_temperature_conversion() {
let temp = Temperature::from_raw(800); // 50.0°C
assert_eq!(temp.millidegrees_c(), 50000);
assert!((temp.degrees_c() - 50.0).abs() < 0.01);
}
#[test]
fn test_format_into_buffer() {
let temp = Temperature::from_raw(800);
let mut buf = [0u8; 32];
let len = format_temp_into(&temp, &mut buf);
let s = core::str::from_utf8(&buf[..len]).unwrap();
assert_eq!(s, "50.000°C");
}
}
}
Testing on the actual target (when std isn’t available at all):
# Use defmt-test for on-device testing (embedded ARM)
# Use uefi-test-runner for UEFI targets
# Use QEMU for cross-architecture tests without hardware
# Run no_std library tests on host (always works):
cargo test --lib
# Verify no_std compilation against a no_std target:
cargo check --target thumbv7em-none-eabihf # ARM Cortex-M
cargo check --target riscv32imac-unknown-none-elf # RISC-V
no_std Decision Tree
flowchart TD
START["Does your code need\nthe standard library?"] --> NEED_FS{"File system,\nnetwork, threads?"}
NEED_FS -->|"Yes"| USE_STD["Use std\nNormal application"]
NEED_FS -->|"No"| NEED_HEAP{"Need heap allocation?\nVec, String, Box"}
NEED_HEAP -->|"Yes"| USE_ALLOC["#![no_std]\nextern crate alloc"]
NEED_HEAP -->|"No"| USE_CORE["#![no_std]\ncore only"]
USE_ALLOC --> VERIFY["cargo-hack\n--each-feature"]
USE_CORE --> VERIFY
USE_STD --> VERIFY
VERIFY --> TARGET{"Target has OS?"}
TARGET -->|"Yes"| HOST_TEST["cargo test --lib\nStandard testing"]
TARGET -->|"No"| CROSS_TEST["QEMU / defmt-test\nOn-device testing"]
style USE_STD fill:#91e5a3,color:#000
style USE_ALLOC fill:#ffd43b,color:#000
style USE_CORE fill:#ff6b6b,color:#000
🏋️ Exercises
🟡 Exercise 1: Feature Combination Verification
Install cargo-hack and run cargo hack check --each-feature --workspace on a project with multiple features. Does it find any broken combinations?
Solution
cargo install cargo-hack
# Check each feature individually
cargo hack check --each-feature --workspace --no-dev-deps
# If a feature combination fails:
# error[E0433]: failed to resolve: use of undeclared crate or module `std`
# → This means a feature gate is missing a #[cfg] guard
# Check all features + no features + each individually:
cargo hack check --each-feature --workspace
cargo check --workspace --all-features
cargo check --workspace --no-default-features
🔶 Exercise 2: Build a no_std Library
Create a library crate that compiles with #![no_std]. Implement a simple stack-allocated ring buffer. Verify it compiles for thumbv7em-none-eabihf (ARM Cortex-M).
Solution
#![allow(unused)]
fn main() {
// lib.rs
#![no_std]
pub struct RingBuffer<const N: usize> {
data: [u8; N],
head: usize,
len: usize,
}
impl<const N: usize> RingBuffer<N> {
pub const fn new() -> Self {
Self { data: [0; N], head: 0, len: 0 }
}
pub fn push(&mut self, byte: u8) -> bool {
if self.len == N { return false; }
let idx = (self.head + self.len) % N;
self.data[idx] = byte;
self.len += 1;
true
}
pub fn pop(&mut self) -> Option<u8> {
if self.len == 0 { return None; }
let byte = self.data[self.head];
self.head = (self.head + 1) % N;
self.len -= 1;
Some(byte)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn push_pop() {
let mut rb = RingBuffer::<4>::new();
assert!(rb.push(1));
assert!(rb.push(2));
assert_eq!(rb.pop(), Some(1));
assert_eq!(rb.pop(), Some(2));
assert_eq!(rb.pop(), None);
}
}
}
rustup target add thumbv7em-none-eabihf
cargo check --target thumbv7em-none-eabihf
# ✅ Compiles for bare-metal ARM
Key Takeaways
- cargo-hack --each-feature is essential for any crate with conditional compilation — run it in CI
- core → alloc → std are layered: each adds capabilities but requires more runtime support
- Custom panic handlers and allocators are required for bare-metal no_std binaries
- Test no_std libraries on the host with cargo test --lib — no hardware needed
- Run --feature-powerset only for core libraries with <8 features — it’s $2^n$ combinations
Windows and Conditional Compilation 🟡
What you’ll learn:
- Windows support patterns: the windows-sys / windows crates, cargo-xwin
- Conditional compilation with #[cfg] — checked by the compiler, not a preprocessor
- Platform abstraction architecture: when #[cfg] blocks suffice vs when to use traits
- Cross-compiling for Windows from Linux

Cross-references: no_std & Features — cargo-hack and feature verification · Cross-Compilation — general cross-build setup · Build Scripts — cfg flags emitted by build.rs
Windows Support — Platform Abstractions
Rust’s #[cfg()] attributes and Cargo features allow a single codebase to
target both Linux and Windows cleanly. The project already
demonstrates this pattern in platform::run_command:
#![allow(unused)]
fn main() {
// Real pattern from the project — platform-specific shell invocation
pub fn exec_cmd(cmd: &str, timeout_secs: Option<u64>) -> Result<CommandResult, CommandError> {
#[cfg(windows)]
let mut child = Command::new("cmd")
.args(["/C", cmd])
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()?;
#[cfg(not(windows))]
let mut child = Command::new("sh")
.args(["-c", cmd])
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()?;
// ... rest is platform-independent ...
}
}
Available cfg predicates:
#![allow(unused)]
fn main() {
// Operating system
#[cfg(target_os = "linux")] // Linux specifically
#[cfg(target_os = "windows")] // Windows
#[cfg(target_os = "macos")] // macOS
#[cfg(unix)] // Linux, macOS, BSDs, etc.
#[cfg(windows)] // Windows (shorthand)
// Architecture
#[cfg(target_arch = "x86_64")] // x86 64-bit
#[cfg(target_arch = "aarch64")] // ARM 64-bit
#[cfg(target_arch = "x86")] // x86 32-bit
// Pointer width (portable alternative to arch)
#[cfg(target_pointer_width = "64")] // Any 64-bit platform
#[cfg(target_pointer_width = "32")] // Any 32-bit platform
// Environment / C library
#[cfg(target_env = "gnu")] // glibc
#[cfg(target_env = "musl")] // musl libc
#[cfg(target_env = "msvc")] // MSVC on Windows
// Endianness
#[cfg(target_endian = "little")]
#[cfg(target_endian = "big")]
// Combinations with any(), all(), not()
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
#[cfg(any(target_os = "linux", target_os = "macos"))]
#[cfg(not(windows))]
}
The windows-sys and windows Crates
For calling Windows APIs directly:
# Cargo.toml — use windows-sys for raw FFI (lighter, no abstraction)
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [
"Win32_Foundation",
"Win32_System_Services",
"Win32_System_Registry",
"Win32_System_Power",
] }
# NOTE: windows-sys uses semver-incompatible releases (0.48 → 0.52 → 0.59).
# Pin to a single minor version — each release may remove or rename API bindings.
# Check https://github.com/microsoft/windows-rs for the latest version
# before starting a new project.
# Or use the windows crate for safe wrappers (heavier, more ergonomic)
# windows = { version = "0.59", features = [...] }
#![allow(unused)]
fn main() {
// src/platform/windows.rs
#[cfg(windows)]
mod win {
use windows_sys::Win32::System::Power::{
GetSystemPowerStatus, SYSTEM_POWER_STATUS,
};
pub fn get_battery_status() -> Option<u8> {
let mut status = SYSTEM_POWER_STATUS::default();
// SAFETY: GetSystemPowerStatus writes to the provided buffer.
// The buffer is correctly sized and aligned.
let ok = unsafe { GetSystemPowerStatus(&mut status) };
if ok != 0 {
Some(status.BatteryLifePercent)
} else {
None
}
}
}
}
windows-sys vs windows crate:
| Aspect | windows-sys | windows |
|---|---|---|
| API style | Raw FFI (unsafe calls) | Safe Rust wrappers |
| Binary size | Minimal (just extern declarations) | Larger (wrapper code) |
| Compile time | Fast | Slower |
| Ergonomics | C-style, manual safety | Rust-idiomatic |
| Error handling | Raw BOOL / HRESULT | Result<T, windows::core::Error> |
| Use when | Performance-critical, thin wrapper | Application code, ease of use |
Cross-Compiling for Windows from Linux
# Option 1: MinGW (GNU ABI)
rustup target add x86_64-pc-windows-gnu
sudo apt install gcc-mingw-w64-x86-64
cargo build --target x86_64-pc-windows-gnu
# Produces a .exe — runs on Windows, links against msvcrt
# Option 2: MSVC ABI via xwin (for full MSVC compatibility)
cargo install cargo-xwin
cargo xwin build --target x86_64-pc-windows-msvc
# Uses Microsoft's CRT and SDK headers downloaded automatically
# Option 3: Zig-based cross-compilation
cargo zigbuild --target x86_64-pc-windows-gnu
GNU vs MSVC ABI on Windows:
| Aspect | x86_64-pc-windows-gnu | x86_64-pc-windows-msvc |
|---|---|---|
| Linker | MinGW ld | MSVC link.exe or lld-link |
| C runtime | msvcrt.dll (universal) | ucrtbase.dll (modern) |
| C++ interop | GCC ABI | MSVC ABI |
| Cross-compile from Linux | Easy (MinGW) | Possible (cargo-xwin) |
| Windows API support | Full | Full |
| Debug info format | DWARF | PDB |
| Recommended for | Simple tools, CI builds | Full Windows integration |
Conditional Compilation Patterns
Pattern 1: Platform module selection
#![allow(unused)]
fn main() {
// src/platform/mod.rs — compile different modules per OS
#[cfg(target_os = "linux")]
mod linux;
#[cfg(target_os = "linux")]
pub use linux::*;
#[cfg(target_os = "windows")]
mod windows;
#[cfg(target_os = "windows")]
pub use windows::*;
// Both modules implement the same public API:
// pub fn get_cpu_temperature() -> Result<f64, PlatformError>
// pub fn list_pci_devices() -> Result<Vec<PciDevice>, PlatformError>
}
Pattern 2: Feature-gated platform support
# Cargo.toml
[features]
default = ["linux"]
linux = [] # Linux-specific hardware access
windows = ["dep:windows-sys"] # Windows-specific APIs
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [...], optional = true }
#![allow(unused)]
fn main() {
// Compile error if someone tries to build for Windows without the feature:
#[cfg(all(target_os = "windows", not(feature = "windows")))]
compile_error!("Enable the 'windows' feature to build for Windows");
}
Pattern 3: Trait-based platform abstraction
#![allow(unused)]
fn main() {
/// Platform-independent interface for hardware access.
pub trait HardwareAccess {
type Error: std::error::Error;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
fn read_gpu_temperature(&self, gpu_index: u32) -> Result<f64, Self::Error>;
fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error>;
fn send_ipmi_command(&self, cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error>;
}
#[cfg(target_os = "linux")]
pub struct LinuxHardware;
#[cfg(target_os = "linux")]
impl HardwareAccess for LinuxHardware {
type Error = LinuxHwError;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
// Read from /sys/class/thermal/thermal_zone0/temp
let raw = std::fs::read_to_string("/sys/class/thermal/thermal_zone0/temp")?;
Ok(raw.trim().parse::<f64>()? / 1000.0)
}
// ...
}
#[cfg(target_os = "windows")]
pub struct WindowsHardware;
#[cfg(target_os = "windows")]
impl HardwareAccess for WindowsHardware {
type Error = WindowsHwError;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
// Read via WMI (Win32_TemperatureProbe) or Open Hardware Monitor
todo!("WMI temperature query")
}
// ...
}
/// Create the platform-appropriate implementation
pub fn create_hardware() -> impl HardwareAccess {
#[cfg(target_os = "linux")]
{ LinuxHardware }
#[cfg(target_os = "windows")]
{ WindowsHardware }
}
}
Platform Abstraction Architecture
For a project that targets multiple platforms, organize code into three layers:
┌──────────────────────────────────────────────────┐
│ Application Logic (platform-independent) │
│ diag_tool, accel_diag, network_diag, event_log, etc. │
│ Uses only the platform abstraction trait │
├──────────────────────────────────────────────────┤
│ Platform Abstraction Layer (trait definitions) │
│ trait HardwareAccess { ... } │
│ trait CommandRunner { ... } │
│ trait FileSystem { ... } │
├──────────────────────────────────────────────────┤
│ Platform Implementations (cfg-gated) │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Linux impl │ │ Windows impl │ │
│ │ /sys, /proc │ │ WMI, Registry│ │
│ │ ipmitool │ │ ipmiutil │ │
│ │ lspci │ │ devcon │ │
│ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────┘
Testing the abstraction: Mock the platform trait for unit tests:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
struct MockHardware {
cpu_temp: f64,
gpu_temps: Vec<f64>,
}
impl HardwareAccess for MockHardware {
type Error = std::io::Error;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
Ok(self.cpu_temp)
}
fn read_gpu_temperature(&self, index: u32) -> Result<f64, Self::Error> {
self.gpu_temps.get(index as usize)
.copied()
.ok_or_else(|| std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("GPU {index} not found")
))
}
fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error> {
Ok(vec![]) // Mock returns empty
}
fn send_ipmi_command(&self, _cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error> {
Ok(IpmiResponse::default())
}
}
#[test]
fn test_thermal_check_with_mock() {
let hw = MockHardware {
cpu_temp: 75.0,
gpu_temps: vec![82.0, 84.0],
};
let result = run_thermal_diagnostic(&hw);
assert!(result.is_ok());
}
}
}
Application: Linux-First, Windows-Ready
The project is already partially Windows-ready. Use
cargo-hack to verify all feature
combinations, and cross-compile
to test on Windows from Linux:
Already done:
- platform::run_command uses #[cfg(windows)] for shell selection
- Tests use #[cfg(windows)] / #[cfg(not(windows))] for platform-appropriate test commands
Recommended evolution path for Windows support:
Phase 1: Extract platform abstraction trait (current → 2 weeks)
├─ Define HardwareAccess trait in core_lib
├─ Wrap current Linux code behind LinuxHardware impl
└─ All diagnostic modules depend on trait, not Linux specifics
Phase 2: Add Windows stubs (2 weeks)
├─ Implement WindowsHardware with TODO stubs
├─ CI builds for x86_64-pc-windows-msvc (compile check only)
└─ Tests pass with MockHardware on all platforms
Phase 3: Windows implementation (ongoing)
├─ IPMI via ipmiutil.exe or OpenIPMI Windows driver
├─ GPU via accel-mgmt (accel-api.dll) — same API as Linux
├─ PCIe via Windows Setup API (SetupDiEnumDeviceInfo)
└─ NIC via WMI (Win32_NetworkAdapter)
Cross-platform CI addition:
# Add to CI matrix
- target: x86_64-pc-windows-msvc
os: windows-latest
name: windows-x86_64
This ensures the codebase compiles on Windows even before full Windows
implementation is complete — catching cfg mistakes early.
Key insight: The abstraction doesn’t need to be perfect on day one. Start with #[cfg] blocks in leaf functions (like exec_cmd already does), then refactor to traits when you have two or more platform implementations. Premature abstraction is worse than #[cfg] blocks.
Conditional Compilation Decision Tree
flowchart TD
START["Platform-specific code?"] --> HOW_MANY{"How many platforms?"}
HOW_MANY -->|"2 (Linux + Windows)"| CFG_BLOCKS["#[cfg] blocks\nin leaf functions"]
HOW_MANY -->|"3+"| TRAIT_APPROACH["Platform trait\n+ per-platform impl"]
CFG_BLOCKS --> WINAPI{"Need Windows APIs?"}
WINAPI -->|"Minimal"| WIN_SYS["windows-sys\nRaw FFI bindings"]
WINAPI -->|"Rich (COM, etc)"| WIN_RS["windows crate\nSafe idiomatic wrappers"]
WINAPI -->|"None\n(just #[cfg])"| NATIVE["cfg(windows)\ncfg(unix)"]
TRAIT_APPROACH --> CI_CHECK["cargo-hack\n--each-feature"]
CFG_BLOCKS --> CI_CHECK
CI_CHECK --> XCOMPILE["Cross-compile in CI\ncargo-xwin or\nnative runners"]
style CFG_BLOCKS fill:#91e5a3,color:#000
style TRAIT_APPROACH fill:#ffd43b,color:#000
style WIN_SYS fill:#e3f2fd,color:#000
style WIN_RS fill:#e3f2fd,color:#000
🏋️ Exercises
🟢 Exercise 1: Platform-Conditional Module
Create a module with #[cfg(unix)] and #[cfg(windows)] implementations of a get_hostname() function. Verify both compile with cargo check and cargo check --target x86_64-pc-windows-msvc.
Solution
#![allow(unused)]
fn main() {
// src/hostname.rs
#[cfg(unix)]
pub fn get_hostname() -> String {
use std::fs;
fs::read_to_string("/etc/hostname")
.unwrap_or_else(|_| "unknown".to_string())
.trim()
.to_string()
}
#[cfg(windows)]
pub fn get_hostname() -> String {
use std::env;
env::var("COMPUTERNAME").unwrap_or_else(|_| "unknown".to_string())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn hostname_is_not_empty() {
let name = get_hostname();
assert!(!name.is_empty());
}
}
}
# Verify Linux compilation
cargo check
# Verify Windows compilation (cross-check)
rustup target add x86_64-pc-windows-msvc
cargo check --target x86_64-pc-windows-msvc
🟡 Exercise 2: Cross-Compile for Windows with cargo-xwin
Install cargo-xwin and build a simple binary for x86_64-pc-windows-msvc from Linux. Verify the output is a .exe.
Solution
cargo install cargo-xwin
rustup target add x86_64-pc-windows-msvc
cargo xwin build --release --target x86_64-pc-windows-msvc
# Downloads Windows SDK headers/libs automatically
file target/x86_64-pc-windows-msvc/release/my-binary.exe
# Output: PE32+ executable (console) x86-64, for MS Windows
# You can also test with Wine:
wine target/x86_64-pc-windows-msvc/release/my-binary.exe
Key Takeaways
- Start with #[cfg] blocks in leaf functions; refactor to traits only when three or more platforms diverge
- windows-sys is for raw FFI; the windows crate provides safe, idiomatic wrappers
- cargo-xwin cross-compiles to the Windows MSVC ABI from Linux — no Windows machine needed
- Always check --target x86_64-pc-windows-msvc in CI even if you only ship on Linux
- Combine #[cfg] with Cargo features for optional platform support (e.g., feature = "windows")
Putting It All Together — A Production CI/CD Pipeline 🟡
What you’ll learn:
- Structuring a multi-stage GitHub Actions CI workflow (check → test → coverage → security → cross → release)
- Caching strategies with rust-cache and save-if tuning
- Running Miri and sanitizers on a nightly schedule
- Task automation with Makefile.toml and pre-commit hooks
- Automated releases with cargo-dist

Cross-references: Build Scripts · Cross-Compilation · Benchmarking · Coverage · Miri/Sanitizers · Dependencies · Release Profiles · Compile-Time Tools · no_std · Windows
Individual tools are useful. A pipeline that orchestrates them automatically on every push is transformative. This chapter assembles the tools from chapters 1–10 into a cohesive CI/CD workflow.
The Complete GitHub Actions Workflow
A single workflow file that runs all verification stages in parallel:
# .github/workflows/ci.yml
name: CI
on:
push:
branches: [main]
pull_request:
branches: [main]
env:
CARGO_TERM_COLOR: always
RUSTFLAGS: "-Dwarnings" # Treat warnings as errors
# NOTE: Cargo passes --cap-lints allow when compiling non-workspace dependencies,
# so -Dwarnings only fails the build on warnings in your own crates, not in
# third-party code. CARGO_ENCODED_RUSTFLAGS is the 0x1F-separated equivalent
# and takes precedence over RUSTFLAGS if both are set.
jobs:
# ─── Stage 1: Fast feedback (< 2 min) ───
check:
name: Check + Clippy + Format
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: clippy, rustfmt
- uses: Swatinem/rust-cache@v2 # Cache dependencies
- name: Check Cargo.lock
run: cargo fetch --locked
- name: Check doc
run: RUSTDOCFLAGS='-Dwarnings' cargo doc --workspace --all-features --no-deps
- name: Check compilation
run: cargo check --workspace --all-targets --all-features
- name: Clippy lints
run: cargo clippy --workspace --all-targets --all-features -- -D warnings
- name: Formatting
run: cargo fmt --all -- --check
# ─── Stage 2: Tests (< 5 min) ───
test:
name: Test (${{ matrix.os }})
needs: check
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
- name: Run tests
run: cargo test --workspace
- name: Run doc tests
run: cargo test --workspace --doc
# ─── Stage 3: Cross-compilation (< 10 min) ───
cross:
name: Cross (${{ matrix.target }})
needs: check
strategy:
matrix:
include:
- target: x86_64-unknown-linux-musl
os: ubuntu-latest
- target: aarch64-unknown-linux-gnu
os: ubuntu-latest
use_cross: true
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- name: Install musl-tools
if: contains(matrix.target, 'musl')
run: sudo apt-get install -y musl-tools
- name: Install cross
if: matrix.use_cross
uses: taiki-e/install-action@cross
- name: Build (native)
if: "!matrix.use_cross"
run: cargo build --release --target ${{ matrix.target }}
- name: Build (cross)
if: matrix.use_cross
run: cross build --release --target ${{ matrix.target }}
- name: Upload artifact
uses: actions/upload-artifact@v4
with:
name: binary-${{ matrix.target }}
path: target/${{ matrix.target }}/release/diag_tool
# ─── Stage 4: Coverage (< 10 min) ───
coverage:
name: Code Coverage
needs: check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: llvm-tools-preview
- uses: taiki-e/install-action@cargo-llvm-cov
- name: Generate coverage
run: cargo llvm-cov --workspace --lcov --output-path lcov.info
- name: Enforce minimum coverage
run: cargo llvm-cov --workspace --fail-under-lines 75
- name: Upload to Codecov
uses: codecov/codecov-action@v4
with:
files: lcov.info
token: ${{ secrets.CODECOV_TOKEN }}
# ─── Stage 5: Safety verification (< 15 min) ───
miri:
name: Miri
needs: check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@nightly
with:
components: miri
- name: Run Miri
run: cargo miri test --workspace
env:
MIRIFLAGS: "-Zmiri-backtrace=full"
# ─── Stage 6: Benchmarks (PR only, < 10 min) ───
bench:
name: Benchmarks
if: github.event_name == 'pull_request'
needs: check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: Run benchmarks
run: cargo bench -- --output-format bencher | tee bench.txt
- name: Compare with baseline
uses: benchmark-action/github-action-benchmark@v1
with:
tool: 'cargo'
output-file-path: bench.txt
github-token: ${{ secrets.GITHUB_TOKEN }}
alert-threshold: '115%'
comment-on-alert: true
Pipeline execution flow:
┌─────────┐
│ check │ ← clippy + fmt + cargo check (2 min)
└────┬────┘
┌─────────┬──┴──┬──────────┬──────────┐
▼ ▼ ▼ ▼ ▼
┌──────┐ ┌──────┐ ┌────────┐ ┌──────┐ ┌──────┐
│ test │ │cross │ │coverage│ │ miri │ │bench │
│ (2×) │ │ (2×) │ │ │ │ │ │(PR) │
└──────┘ └──────┘ └────────┘ └──────┘ └──────┘
3 min 8 min 8 min 12 min 5 min
Total wall-clock: ~14 min (parallel after check gate)
CI Caching Strategies
Swatinem/rust-cache@v2 is the
standard Rust CI cache action. It caches ~/.cargo and target/ between
runs, but large workspaces need tuning:
# Basic (what we use above)
- uses: Swatinem/rust-cache@v2
# Tuned for a large workspace:
- uses: Swatinem/rust-cache@v2
with:
# Separate caches per job — prevents test artifacts bloating build cache
prefix-key: "v1-rust"
key: ${{ matrix.os }}-${{ matrix.target || 'default' }}
# Only save cache on main branch (PRs read but don't write)
save-if: ${{ github.ref == 'refs/heads/main' }}
# Cache Cargo registry + git checkouts + target dir
cache-targets: true
cache-all-crates: true
Cache invalidation gotchas:
| Problem | Fix |
|---|---|
| Cache grows unbounded (>5 GB) | Set prefix-key: "v2-rust" to force fresh cache |
| Different features pollute cache | Use key: ${{ hashFiles('**/Cargo.lock') }} |
| PR cache overwrites main | Set save-if: ${{ github.ref == 'refs/heads/main' }} |
| Cross-compilation targets bloat | Use separate key per target triple |
Sharing cache between jobs:
The check job saves the cache; downstream jobs (test, cross, coverage)
read it. With save-if on main only, PR runs get the benefit of cached
dependencies without writing stale caches.
Measured impact on large-scale workspace: Cold build ~4 min → cached build ~45 sec. The cache action alone saves ~25 min of CI time per pipeline run (across all parallel jobs).
Makefile.toml with cargo-make
cargo-make provides a portable
task runner that works across platforms (unlike make/Makefile):
# Install
cargo install cargo-make
# Makefile.toml — at workspace root
[config]
default_to_workspace = false
# ─── Developer workflows ───
[tasks.dev]
description = "Full local verification (same checks as CI)"
dependencies = ["check", "test", "clippy", "fmt-check"]
[tasks.check]
command = "cargo"
args = ["check", "--workspace", "--all-targets"]
[tasks.test]
command = "cargo"
args = ["test", "--workspace"]
[tasks.clippy]
command = "cargo"
args = ["clippy", "--workspace", "--all-targets", "--", "-D", "warnings"]
[tasks.fmt]
command = "cargo"
args = ["fmt", "--all"]
[tasks.fmt-check]
command = "cargo"
args = ["fmt", "--all", "--", "--check"]
# ─── Coverage ───
[tasks.coverage]
description = "Generate HTML coverage report"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--html", "--open"]
[tasks.coverage-ci]
description = "Generate LCOV for CI upload"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--lcov", "--output-path", "lcov.info"]
# ─── Benchmarks ───
[tasks.bench]
description = "Run all benchmarks"
command = "cargo"
args = ["bench"]
# ─── Cross-compilation ───
[tasks.build-musl]
description = "Build static binary (musl)"
command = "cargo"
args = ["build", "--release", "--target", "x86_64-unknown-linux-musl"]
[tasks.build-arm]
description = "Build for aarch64 (requires cross)"
command = "cross"
args = ["build", "--release", "--target", "aarch64-unknown-linux-gnu"]
[tasks.build-all]
description = "Build for all deployment targets"
dependencies = ["build-musl", "build-arm"]
# ─── Safety verification ───
[tasks.miri]
description = "Run Miri on all tests"
toolchain = "nightly"
command = "cargo"
args = ["miri", "test", "--workspace"]
[tasks.audit]
description = "Check for known vulnerabilities"
install_crate = "cargo-audit"
command = "cargo"
args = ["audit"]
# ─── Release ───
[tasks.release-dry]
description = "Preview what cargo-release would do"
install_crate = "cargo-release"
command = "cargo"
args = ["release", "--workspace", "--dry-run"]
Usage:
# Equivalent of CI pipeline, locally
cargo make dev
# Generate and view coverage
cargo make coverage
# Build for all targets
cargo make build-all
# Run safety checks
cargo make miri
# Check for vulnerabilities
cargo make audit
Pre-Commit Hooks: Custom Scripts and cargo-husky
Catch issues before they reach CI. The recommended approach is a custom git hook — it’s simple, transparent, and has no external dependencies:
#!/bin/sh
# .githooks/pre-commit
set -e
echo "=== Pre-commit checks ==="
# Fast checks first
echo "→ cargo fmt --check"
cargo fmt --all -- --check
echo "→ cargo check"
cargo check --workspace --all-targets
echo "→ cargo clippy"
cargo clippy --workspace --all-targets -- -D warnings
echo "→ cargo test (lib only, fast)"
cargo test --workspace --lib
echo "=== All checks passed ==="
# Install the hook
git config core.hooksPath .githooks
chmod +x .githooks/pre-commit
Alternative: cargo-husky (auto-installs hooks via build script):
⚠️ Note: `cargo-husky` has not been updated since 2022. It still works but is effectively unmaintained. Prefer the custom hook approach above for new projects.
cargo install cargo-husky
# Cargo.toml — add to dev-dependencies of root crate
[dev-dependencies]
cargo-husky = { version = "1", default-features = false, features = [
"precommit-hook",
"run-cargo-check",
"run-cargo-clippy",
"run-cargo-fmt",
"run-cargo-test",
] }
Release Workflow: cargo-release and cargo-dist
cargo-release — automates version bumping, tagging, and publishing:
# Install
cargo install cargo-release
# release.toml — at workspace root
[workspace]
consolidate-commits = true
pre-release-commit-message = "chore: release {{version}}"
tag-message = "v{{version}}"
tag-name = "v{{version}}"
# Don't publish internal crates
[[package]]
name = "core_lib"
release = false
[[package]]
name = "diag_framework"
release = false
# Only publish the main binary
[[package]]
name = "diag_tool"
release = true
# Preview release
cargo release patch --dry-run
# Execute release (bumps version, commits, tags, optionally publishes)
cargo release patch --execute
# 0.1.0 → 0.1.1
cargo release minor --execute
# 0.1.1 → 0.2.0
cargo-dist — generates downloadable release binaries for GitHub Releases:
# Install
cargo install cargo-dist
# Initialize (creates CI workflow + metadata)
cargo dist init
# Preview what would be built
cargo dist plan
# Generate the release (usually done by CI on tag push)
cargo dist build
# Cargo.toml additions from `cargo dist init`
[workspace.metadata.dist]
cargo-dist-version = "0.28.0"
ci = "github"
targets = [
"x86_64-unknown-linux-gnu",
"x86_64-unknown-linux-musl",
"aarch64-unknown-linux-gnu",
"x86_64-pc-windows-msvc",
]
install-path = "CARGO_HOME"
This generates a GitHub Actions workflow that, on tag push:
- Builds the binary for all target platforms
- Creates a GitHub Release with downloadable `.tar.gz`/`.zip` archives
- Generates shell/PowerShell installer scripts
- Publishes to crates.io (if configured)
Try It Yourself — Capstone Exercise
This exercise ties together every chapter. You will build a complete engineering pipeline for a fresh Rust workspace:
1. Create a new workspace with two crates: a library (`core_lib`) and a binary (`cli`). Add a `build.rs` that embeds the git hash and build timestamp using `SOURCE_DATE_EPOCH` (ch01).
2. Set up cross-compilation for `x86_64-unknown-linux-musl` and `aarch64-unknown-linux-gnu`. Verify both targets build with `cargo zigbuild` or `cross` (ch02).
3. Add a benchmark using Criterion or Divan for a function in `core_lib`. Run it locally and record a baseline (ch03).
4. Measure code coverage with `cargo llvm-cov`. Set a minimum threshold of 80% and verify it passes (ch04).
5. Run `cargo +nightly careful test` and `cargo miri test`. Add a test that exercises `unsafe` code if you have any (ch05).
6. Configure `cargo-deny` with a `deny.toml` that bans `openssl` and enforces MIT/Apache-2.0 licensing (ch06).
7. Optimize the release profile with `lto = "thin"`, `strip = true`, and `codegen-units = 1`. Measure binary size before/after with `cargo bloat` (ch07).
8. Add `cargo hack --each-feature` verification. Create a feature flag for an optional dependency and ensure it compiles alone (ch09).
9. Write the GitHub Actions workflow (this chapter) with all 6 stages. Add `Swatinem/rust-cache@v2` with `save-if` tuning.
Success criteria: Push to GitHub → all CI stages green → cargo dist plan
shows your release targets. You now have a production-grade Rust pipeline.
CI Pipeline Architecture
flowchart LR
subgraph "Stage 1 — Fast Feedback < 2 min"
CHECK["cargo check\ncargo clippy\ncargo fmt"]
end
subgraph "Stage 2 — Tests < 5 min"
TEST["cargo nextest\ncargo test --doc"]
end
subgraph "Stage 3 — Coverage"
COV["cargo llvm-cov\nfail-under 80%"]
end
subgraph "Stage 4 — Security"
SEC["cargo audit\ncargo deny check"]
end
subgraph "Stage 5 — Cross-Build"
CROSS["musl static\naarch64 + x86_64"]
end
subgraph "Stage 6 — Release (tag only)"
REL["cargo dist\nGitHub Release"]
end
CHECK --> TEST --> COV --> SEC --> CROSS --> REL
style CHECK fill:#91e5a3,color:#000
style TEST fill:#91e5a3,color:#000
style COV fill:#e3f2fd,color:#000
style SEC fill:#ffd43b,color:#000
style CROSS fill:#e3f2fd,color:#000
style REL fill:#b39ddb,color:#000
Key Takeaways
- Structure CI as parallel stages: fast checks first, expensive jobs behind gates
- `Swatinem/rust-cache@v2` with `save-if: ${{ github.ref == 'refs/heads/main' }}` prevents PR cache thrashing
- Run Miri and heavier sanitizers on a nightly `schedule:` trigger, not on every push
- `Makefile.toml` (cargo make) bundles multi-tool workflows into a single command for local dev
- `cargo-dist` automates cross-platform release builds, so you stop writing platform matrix YAML by hand
Tricks from the Trenches 🟡
What you’ll learn:
- Battle-tested patterns that don’t fit neatly into one chapter
- Common pitfalls and their fixes — from CI flake to binary bloat
- Quick-win techniques you can apply to any Rust project today
Cross-references: Every chapter in this book — these tricks cut across all topics
This chapter collects engineering patterns that come up repeatedly in production Rust codebases. Each trick is self-contained — read them in any order.
1. The deny(warnings) Trap
Problem: #![deny(warnings)] in source code breaks builds when Clippy
adds new lints — your code that compiled yesterday fails today.
Fix: Use CARGO_ENCODED_RUSTFLAGS in CI instead of a source-level attribute:
# CI: treat warnings as errors without touching source
env:
CARGO_ENCODED_RUSTFLAGS: "-Dwarnings"
Or use [workspace.lints] for finer control:
# Cargo.toml
[workspace.lints.rust]
unsafe_code = "deny"
[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
pedantic = { level = "warn", priority = -1 }
See Compile-Time Tools, Workspace Lints for the full pattern.
2. Compile Once, Test Everywhere
Problem: A plain cargo test run is slow on large workspaces: test binaries run one at a time, and every doc-test example is compiled as its own crate.
Fix: Use cargo nextest for unit/integration tests and run doc-tests separately:
cargo nextest run --workspace # Fast: parallel, per-test process isolation
cargo test --workspace --doc # Doc-tests (nextest can't run these)
See Compile-Time Tools for `cargo-nextest` setup.
3. Feature Flag Hygiene
Problem: A library crate has default = ["std"] but nobody tests
--no-default-features. One day an embedded user reports it doesn’t compile.
Fix: Add cargo-hack to CI:
- name: Feature matrix
run: |
cargo hack check --each-feature --no-dev-deps
cargo check --no-default-features
cargo check --all-features
See `no_std` and Feature Verification for the full pattern.
4. The Lock File Debate — Commit or Ignore?
Rule of thumb:
| Crate Type | Commit Cargo.lock? | Why |
|---|---|---|
| Binary / application | Yes | Reproducible builds |
| Library | No (.gitignore) | Let downstream choose versions |
| Workspace with both | Yes | Binary wins |
Add a CI check to ensure the lock file stays up-to-date:
- name: Check lock file
run: cargo check --locked # Fails if Cargo.lock must change to match Cargo.toml
5. Debug Builds with Optimized Dependencies
Problem: Debug builds are painfully slow because dependencies (especially
serde, regex) aren’t optimized.
Fix: Optimize deps in dev profile while keeping your code unoptimized for fast recompilation:
# Cargo.toml
[profile.dev.package."*"]
opt-level = 2 # Optimize all dependencies in dev mode
This slows the first build slightly but makes runtime dramatically faster during development. Particularly impactful for database-backed services and parsers.
See Release Profiles for per-crate profile overrides.
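Two related profile knobs worth knowing (standard Cargo settings, sketched here rather than taken from the book's workspace): keep host-side code unoptimized so the first build stays cheap, and single out an especially hot dependency beyond the blanket `"*"` override:

```toml
# Cargo.toml
[profile.dev.build-override]    # build scripts and proc-macros
opt-level = 0

[profile.dev.package.regex]     # one hot dependency, tuned above the "*" = 2 blanket
opt-level = 3
```

More specific `package.<name>` entries take precedence over the `package."*"` wildcard.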
6. CI Cache Thrashing
Problem: Swatinem/rust-cache@v2 saves a new cache on every PR, bloating
storage and slowing restore times.
Fix: Only save cache from main, restore from anywhere:
- uses: Swatinem/rust-cache@v2
with:
save-if: ${{ github.ref == 'refs/heads/main' }}
For workspaces with multiple binaries, add a shared-key:
- uses: Swatinem/rust-cache@v2
with:
shared-key: "ci-${{ matrix.target }}"
save-if: ${{ github.ref == 'refs/heads/main' }}
See CI/CD Pipeline for the full workflow.
7. RUSTFLAGS vs CARGO_ENCODED_RUSTFLAGS
Problem: RUSTFLAGS splits its value on spaces, so any flag whose argument contains a space (a -L path, a --cfg with a quoted value) gets mangled. A second, widely repeated misconception: neither variable scopes to your crate alone. Both apply to every crate Cargo compiles for the target, and to host build scripts and proc-macros when you build without an explicit --target.
Fix: Use CARGO_ENCODED_RUSTFLAGS, which separates flags with the 0x1f byte and takes precedence over RUSTFLAGS, whenever a flag may contain spaces. For warnings-as-errors specifically, prefer [workspace.lints]: it applies only to workspace crates, and Cargo already caps lints in non-path dependencies (--cap-lints allow), so a stray warning in a third-party crate will not fail CI either way.
# Space-separated — breaks if a flag argument contains a space
RUSTFLAGS="-Dwarnings" cargo build
# 0x1f-separated — safe for flags with embedded spaces (\037 is octal for 0x1f)
CARGO_ENCODED_RUSTFLAGS="$(printf '%s\037%s' '-Dwarnings' '-Lnative=/opt/my libs')" cargo build
# PREFERRED for warnings-as-errors — workspace lints (Cargo.toml)
[workspace.lints.rust]
warnings = "deny"
8. Reproducible Builds with SOURCE_DATE_EPOCH
Problem: Embedding chrono::Utc::now() in build.rs makes builds
non-reproducible — every build produces a different binary hash.
Fix: Honor SOURCE_DATE_EPOCH:
// build.rs
fn main() {
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or_else(|| chrono::Utc::now().timestamp());
    println!("cargo:rustc-env=BUILD_TIMESTAMP={timestamp}");
}
See Build Scripts for the full build.rs patterns.
9. The cargo tree Deduplication Workflow
Problem: cargo tree --duplicates shows 5 versions of syn and 3 of
tokio-util. Compile time is painful.
Fix: Systematic deduplication:
# Step 1: Find duplicates
cargo tree --duplicates
# Step 2: Find who pulls the old version
cargo tree --invert --package syn@1.0.109
# Step 3: Update the culprit
cargo update -p serde_derive # Might pull in syn 2.x
# Step 4: If no update available, pin in [patch]
# [patch.crates-io]
# old-crate = { git = "...", branch = "syn2-migration" }
# Step 5: Verify
cargo tree --duplicates # Should be shorter
See Dependency Management for `cargo-deny` and supply chain security.
10. Pre-Push Smoke Test
Problem: You push, CI takes 10 minutes, fails on a formatting issue.
Fix: Run the fast checks locally before push:
# Makefile.toml (cargo-make)
[tasks.pre-push]
description = "Local smoke test before pushing"
script = '''
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --lib
'''
cargo make pre-push # < 30 seconds
git push
Or use a git pre-push hook:
#!/bin/sh
# .git/hooks/pre-push
cargo fmt --all -- --check && cargo clippy --workspace -- -D warnings
See CI/CD Pipeline for `Makefile.toml` patterns.
🏋️ Exercises
🟢 Exercise 1: Apply Three Tricks
Pick three tricks from this chapter and apply them to an existing Rust project. Which had the biggest impact?
Solution
Typical high-impact combination:
1. `[profile.dev.package."*"] opt-level = 2`: immediate improvement in dev-mode runtime (2-10× faster for parsing-heavy code)
2. `[workspace.lints]` with `warnings = "deny"`: warnings-as-errors scoped to your own crates, without a brittle source-level `#![deny(warnings)]`
3. `cargo hack --each-feature`: usually finds at least one broken feature combination in any project with 3+ features
# Apply trick 5:
echo '[profile.dev.package."*"]' >> Cargo.toml
echo 'opt-level = 2' >> Cargo.toml
# Apply trick 7: move -D warnings into [workspace.lints.rust]
# Apply trick 3:
cargo install cargo-hack
cargo hack check --each-feature --no-dev-deps
🟡 Exercise 2: Deduplicate Your Dependency Tree
Run cargo tree --duplicates on a real project. Eliminate at least one duplicate. Measure compile-time before and after.
Solution
# Before
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l # Count duplicate lines
# Find and fix one duplicate
cargo tree --duplicates
cargo tree --invert --package <duplicate-crate>@<old-version>
cargo update -p <parent-crate>
# After
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l # Should be fewer
# Typical result: 5-15% compile time reduction per eliminated
# duplicate (especially for heavy crates like syn, tokio)
Key Takeaways
- `[workspace.lints]` scopes warnings-as-errors to your own crates; `CARGO_ENCODED_RUSTFLAGS` exists for flags that contain spaces, not for scoping
- `[profile.dev.package."*"] opt-level = 2` is the single highest-impact dev experience trick
- Cache tuning (`save-if` on main only) prevents CI cache bloat on active repositories
- `cargo tree --duplicates` + `cargo update` is a free compile-time win: do it monthly
- Run fast checks locally with `cargo make pre-push` to avoid CI round-trip waste
Quick Reference Card
Cheat Sheet: Commands at a Glance
# ─── Build Scripts ───
cargo build # Compiles build.rs first, then crate
cargo build -vv # Verbose — shows build.rs output
# ─── Cross-Compilation ───
rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
cross build --release --target aarch64-unknown-linux-gnu
# ─── Benchmarking ───
cargo bench # Run all benchmarks
cargo bench -- parse # Run benchmarks matching "parse"
cargo flamegraph -- &lt;args&gt;           # Flamegraph of the default binary; args after -- are passed to it
perf record -g ./target/release/bin # Record perf data
perf report # View perf data interactively
# ─── Coverage ───
cargo llvm-cov --html # HTML report
cargo llvm-cov --lcov --output-path lcov.info
cargo llvm-cov --workspace --fail-under-lines 80
cargo tarpaulin --out Html # Alternative tool
# ─── Safety Verification ───
cargo +nightly miri test # Run tests under Miri
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
valgrind --leak-check=full ./target/debug/binary
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
# ─── Audit & Supply Chain ───
cargo audit # Known vulnerability scan
cargo audit --deny warnings # Fail CI on any advisory
cargo deny check # License + advisory + ban + source checks
cargo deny list # List all licenses in dep tree
cargo vet # Supply chain trust verification
cargo outdated --workspace # Find outdated dependencies
cargo semver-checks # Detect breaking API changes
cargo geiger # Count unsafe in dependency tree
# ─── Binary Optimization ───
cargo bloat --release --crates # Size contribution per crate
cargo bloat --release -n 20 # 20 largest functions
cargo +nightly udeps --workspace # Find unused dependencies
cargo machete # Fast unused dep detection
cargo expand --lib module::name # See macro expansions
cargo msrv find # Discover minimum Rust version
cargo clippy --fix --workspace --allow-dirty # Auto-fix lint warnings
# ─── Compile-Time Optimization ───
export RUSTC_WRAPPER=sccache # Shared compilation cache
sccache --show-stats # Cache hit statistics
cargo nextest run # Faster test runner
cargo nextest run --retries 2 # Retry flaky tests
# ─── Platform Engineering ───
cargo check --target thumbv7em-none-eabihf # Verify no_std builds
cargo build --target x86_64-pc-windows-gnu # Cross-compile to Windows
cargo xwin build --target x86_64-pc-windows-msvc # MSVC ABI cross-compile
cfg!(target_os = "linux") # Compile-time cfg (evaluates to bool)
# ─── Release ───
cargo release patch --dry-run # Preview release
cargo release patch --execute # Bump, commit, tag, publish
cargo dist plan # Preview distribution artifacts
Decision Table: Which Tool When
| Goal | Tool | When to Use |
|---|---|---|
| Embed git hash / build info | build.rs | Binary needs traceability |
| Compile C code with Rust | cc crate in build.rs | FFI to small C libraries |
| Generate code from schemas | prost-build / tonic-build | Protobuf, gRPC, FlatBuffers |
| Link system library | pkg-config in build.rs | OpenSSL, libpci, systemd |
| Static Linux binary | --target x86_64-unknown-linux-musl | Container/cloud deployment |
| Target old glibc | cargo-zigbuild | RHEL 7, CentOS 7 compatibility |
| ARM server binary | cross or cargo-zigbuild | Graviton/Ampere deployment |
| Statistical benchmarks | Criterion.rs | Performance regression detection |
| Quick perf check | Divan | Development-time profiling |
| Find hot spots | cargo flamegraph / perf | After benchmark identifies slow code |
| Line/branch coverage | cargo-llvm-cov | CI coverage gates, gap analysis |
| Quick coverage check | cargo-tarpaulin | Local development |
| Rust UB detection | Miri | Pure-Rust unsafe code |
| C FFI memory safety | Valgrind memcheck | Mixed Rust/C codebases |
| Data race detection | TSan or Miri | Concurrent unsafe code |
| Buffer overflow detection | ASan | unsafe pointer arithmetic |
| Leak detection | Valgrind or LSan | Long-running services |
| Local CI equivalent | cargo-make | Developer workflow automation |
| Pre-commit checks | cargo-husky or git hooks | Catch issues before push |
| Automated releases | cargo-release + cargo-dist | Version management + distribution |
| Dependency auditing | cargo-audit / cargo-deny | Supply chain security |
| License compliance | cargo-deny (licenses) | Commercial / enterprise projects |
| Supply chain trust | cargo-vet | High-security environments |
| Find outdated deps | cargo-outdated | Scheduled maintenance |
| Detect breaking changes | cargo-semver-checks | Library crate publishing |
| Dependency tree analysis | cargo tree --duplicates | Dedup and trim dep graph |
| Binary size analysis | cargo-bloat | Size-constrained deployments |
| Find unused deps | cargo-udeps / cargo-machete | Trim compile time and size |
| LTO tuning | lto = true or "thin" | Release binary optimization |
| Size-optimized binary | opt-level = "z" + strip = true | Embedded / WASM / containers |
| Unsafe usage audit | cargo-geiger | Security policy enforcement |
| Macro debugging | cargo-expand | Derive / macro_rules debugging |
| Faster linking | mold linker | Developer inner loop |
| Compilation cache | sccache | CI and local build speed |
| Faster tests | cargo-nextest | CI and local test speed |
| MSRV compliance | cargo-msrv | Library publishing |
| no_std library | #![no_std] + default-features = false | Embedded, UEFI, WASM |
| Windows cross-compile | cargo-xwin / MinGW | Linux → Windows builds |
| Platform abstraction | #[cfg] + trait pattern | Multi-OS codebases |
| Windows API calls | windows-sys / windows crate | Native Windows functionality |
| End-to-end timing | hyperfine | Whole-binary benchmarks, before/after comparison |
| Property-based testing | proptest | Edge case discovery, parser robustness |
| Snapshot testing | insta | Large structured output verification |
| Coverage-guided fuzzing | cargo-fuzz | Crash discovery in parsers |
| Concurrency model checking | loom | Lock-free data structures, atomic ordering |
| Feature combination testing | cargo-hack | Crates with multiple #[cfg] features |
| Fast UB checks (near-native) | cargo-careful | CI safety gate, lighter than Miri |
| Auto-rebuild on save | cargo-watch | Developer inner loop, tight feedback |
| Workspace documentation | cargo doc + rustdoc | API discovery, onboarding, doc-link CI |
| Reproducible builds | --locked + SOURCE_DATE_EPOCH | Release integrity verification |
| CI cache tuning | Swatinem/rust-cache@v2 | Build time reduction (cold → cached) |
| Workspace lint policy | [workspace.lints] in Cargo.toml | Consistent Clippy/compiler lints across all crates |
| Auto-fix lint warnings | cargo clippy --fix | Automated cleanup of trivial issues |
Further Reading
A companion reference to Rust Patterns and Type-Driven Correctness.
Version 1.3 — Added cargo-hack, cargo-careful, cargo-watch, cargo doc, reproducible builds, CI caching strategies, capstone exercise, and chapter dependency diagram for completeness.