Comprehensive 6-tier plan to address the 1.5h full-rebuild pathology when making small config changes. Covers content-hash output fingerprinting, per-crate granularity, public API surface tracking, restat / equivalence caching, and developer-experience tools. Synthesizes techniques from Nix, Buildroot, Yocto, GN/Ninja, Cargo, and Bazel adapted to Red Bear OS's Rust cookbook. Triggered by: 2-line edit to local/sources/base/Cargo.toml caused 1.5h full rebuild of redbear-mini. Root cause: cookbook tracks at recipe granularity (one stage.pkgar for 45-member Cargo workspace) instead of crate granularity.
# RED BEAR OS — BUILD SYSTEM ROBUSTNESS PLAN

**Generated**: 2026-06-08

**Trigger**: A 2-line config change (`local/sources/base/Cargo.toml` — added `[patch.crates-io]`
entry, changed one path dep to absolute) caused a full 1.5-hour rebuild of the entire OS image.
That is not normal. The build system must be made of independent packages with surgical
rebuild semantics, and the cookbook must distinguish "source changed" from "no actual output change".

## THE CORE PROBLEM

Red Bear OS's cookbook treats a Cargo workspace as a single recipe:

- `base` is **one** recipe (`recipes/core/base/recipe.toml`) but contains a **45-member Cargo
  workspace** (`local/sources/base/Cargo.toml` with `members = ["audiod", ..., "drivers/pcid",
  ..., "drivers/graphics/driver-graphics", ...]`).
- A 1-line change to `local/sources/base/Cargo.toml` invalidates the recipe (`modified_dir_ignore_git`
  walks the entire source tree).
- Cargo recompiles all 45 workspace members because the workspace config changed.
- The recipe then stages all 45 binaries into one `stage.pkgar`.
- Every package that lists `base` in its `[build] dependencies` sees a newer `.pkgar` mtime and
  rebuilds.
- Result: a 2-line change rebuilds the entire OS.

This violates the "red bear custom work survives changes" principle. We need surgical rebuild
semantics: a change to a single driver should rebuild only that driver, not 45 others, and the
only downstream rebuilds should be packages that actually consume the changed driver's public
output.

## WHAT MATURE SYSTEMS DO

Synthesis of Nix, Buildroot, Yocto, Chromium GN/Ninja, Cargo, and Bazel:

| System | Granularity | Cache key | Cascade behavior |
|---|---|---|---|
| **Nix** | Per-derivation | Hash of all inputs (content-addressed) | Only downstream derivations whose input hash changed rebuild. Quotient hashing avoids mass rebuilds when fixed inputs change |
| **Buildroot** | Per-package (stamp file) | Stamp mtime | Manual: the user must know when to cascade |
| **Yocto** | Per-task (siginfo) | Hash of all recipe variables | sstate cache; equivalence server avoids redundant rebuilds |
| **GN/Ninja** | Per-target (explicit `deps`) | mtime + `restat` + `gn analyze` | `gn analyze` prunes the tree to affected targets; `public_deps` distinguish API vs implementation |
| **Cargo** | Per-unit (fingerprint) | Hash of rustc version + features + target + profile + dep fingerprints | Only units with changed fingerprints rebuild; dep-fingerprint cascade |
| **Bazel** | Per-action (declared inputs) | Hash of action inputs + command line + env | Skyframe does reverse-transitive-closure; "resurrection" reverts invalidation if a rebuild produces identical output |

**The four core techniques Red Bear OS is missing:**

1. **Content-addressed outputs** (Nix, Bazel) — store by hash, not by name
2. **Per-unit fingerprints with dep cascade** (Cargo) — only rebuild units whose fingerprint changes
3. **Public vs private API boundary** (GN) — only propagate dirty when public surface changes
4. **Restat / equivalence caching** (Ninja, Yocto) — if the rebuilt output is byte-identical, treat dependents as clean

## TIER 1 — IMMEDIATE WINS (low effort, high impact)

### T1.1 — Content-hash `stage.pkgar` to detect "no actual change"

**Problem**: When `base` rebuilds, it produces a new `stage.pkgar` (different mtime), even if the
pkgar content is byte-identical to the previous one. Downstream sees the mtime change and
rebuilds.

**Fix**: After rebuild, compute a content hash of the new `stage.pkgar`. If it matches the
previous hash, **do not bump the mtime** (or set the new pkgar's mtime to the old one).
Downstream mtime comparison will see no change → no cascade.

**Implementation** (cookbook, `src/cook/cook_build.rs`):

```rust
// Before packaging: remember the mtime of the previous stage.pkgar, if any.
let old_mtime = std::fs::metadata(&stage_pkgar).and_then(|m| m.modified()).ok();

// ... package stage/ → stage.pkgar ...

// After packaging: compare the new content hash with the stored one.
let new_hash = blake3::hash(&std::fs::read(&stage_pkgar)?).to_hex().to_string();
let hash_path = stage_dir.join("stage.pkgar.hash");
if let (Ok(old_hash), Some(mtime)) = (std::fs::read_to_string(&hash_path), old_mtime) {
    if old_hash.trim() == new_hash {
        // Content unchanged: restore the old mtime so dependents skip the cascade.
        std::fs::File::options().write(true).open(&stage_pkgar)?.set_modified(mtime)?;
        return Ok(());
    }
}
std::fs::write(&hash_path, &new_hash)?;
```

**Impact**: A config change that doesn't affect output (e.g., adding a comment, reordering
members) will no longer cascade. This only holds when packaging itself is deterministic
(see T5.2); otherwise the hash differs even for identical staged files.

**Effort**: 1 day (cookbook + recipe-side `blake3` dep if not present).

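
The mtime-preserving step in T1.1 can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the cookbook's code: `write_preserving_mtime` and `mtime` are hypothetical helper names, a plain byte comparison stands in for the BLAKE3 hash comparison, and `File::set_modified` requires Rust 1.75 or newer.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical helper: read a file's modification time.
fn mtime(path: &Path) -> io::Result<SystemTime> {
    fs::metadata(path)?.modified()
}

/// Hypothetical helper: rewrite `path` with `new_bytes`, but restore the
/// previous mtime when the bytes are unchanged (a byte comparison stands in
/// for the content-hash comparison). Returns true if the content changed.
fn write_preserving_mtime(path: &Path, new_bytes: &[u8]) -> io::Result<bool> {
    // Capture the old content and mtime before the file is overwritten.
    let previous = match fs::read(path) {
        Ok(bytes) => Some((bytes, mtime(path)?)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => None,
        Err(e) => return Err(e),
    };
    fs::write(path, new_bytes)?;
    if let Some((old_bytes, old_mtime)) = previous {
        if old_bytes == new_bytes {
            // Same content: put the old mtime back so mtime-based
            // dependents see no change.
            File::options().write(true).open(path)?.set_modified(old_mtime)?;
            return Ok(false);
        }
    }
    Ok(true)
}
```

The key design point is that the old mtime has to be captured before packaging overwrites the file; afterwards there is nothing left to copy it from.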
### T1.2 — `repo cook --since=<git-ref>` incremental mode

**Problem**: `repo cook` rebuilds everything that's "dirty" by mtime. For a developer iterating
on a single file, this can be over-inclusive.

**Fix**: Add `--since=<ref>` flag that uses `git diff --name-only <ref>..HEAD` to find changed
files, then walks the reverse dep graph to find affected recipes.

**Implementation**: New `src/cook/cook_incremental.rs`:

```rust
// 1. git diff --name-only <ref>..HEAD → list of changed files
// 2. For each changed file, find recipes whose source contains it
// 3. Build reverse dep graph (BFS)
// 4. Build root-first, then dependents
```

**Impact**: A 1-line change in one file rebuilds only that file's recipe + cascade, not the
whole source-modified set.

**Effort**: 3-5 days (git plumbing + BFS + integration with build-redbear.sh).

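
Step 3 of the outline above (the reverse-dependency walk) could look roughly like the following. It is a sketch, assuming the recipe graph is already available as a map from recipe name to its dependencies; `affected_recipes` is a hypothetical name and the recipe names in the usage note are only illustrative.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// `deps` maps each recipe to the recipes it depends on.
/// Returns the changed recipes plus everything that transitively depends on them.
fn affected_recipes(deps: &HashMap<&str, Vec<&str>>, changed: &[&str]) -> BTreeSet<String> {
    // Invert the edges: dependency -> recipes that depend on it.
    let mut reverse: HashMap<&str, Vec<&str>> = HashMap::new();
    for (recipe, ds) in deps {
        for d in ds {
            reverse.entry(*d).or_default().push(*recipe);
        }
    }
    // Breadth-first walk over the reverse edges, starting from the changed set.
    let mut seen: BTreeSet<String> = changed.iter().map(|s| s.to_string()).collect();
    let mut queue: VecDeque<&str> = changed.iter().copied().collect();
    while let Some(recipe) = queue.pop_front() {
        for &dependent in reverse.get(recipe).into_iter().flatten() {
            if seen.insert(dependent.to_string()) {
                queue.push_back(dependent);
            }
        }
    }
    seen
}
```

With `mesa` depending on `libdrm` and `desktop` depending on `mesa`, a change to `libdrm` yields the set `{desktop, libdrm, mesa}`. The set only says what to rebuild; ordering the builds (step 4) still needs a topological sort.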
### T1.3 — Fix cascade script to use Cargo workspace member detection

**Problem**: `local/scripts/rebuild-cascade.sh` uses text grep
(`grep -q "dependencies.*=.*\[.*${target}.*\]"`) which misses:

- Cargo workspace member-to-member dependencies (e.g., `pcid` and `pcid-spawner` in same workspace)
- `dev-dependencies`
- Conditional dependencies behind features

**Fix**: Augment with Cargo workspace member parsing. For each recipe, if it's a Cargo recipe,
parse the `members` array under `[workspace]` in `Cargo.toml` and add member-to-member edges.

**Implementation** (rebuild-cascade.sh, augment with cargo-aware pass):

```bash
# After text-grep pass, add cargo workspace members
find recipes/ local/recipes/ -name "recipe.toml" | while read -r recipe_toml; do
    # toml_get is assumed to be a helper already provided by the script
    source_dir=$(toml_get "$recipe_toml" source.path)
    if [ -f "$source_dir/Cargo.toml" ]; then
        # Parse workspace members (expects one quoted entry per line)
        members=$(grep -A100 '^\[workspace\]' "$source_dir/Cargo.toml" | \
            grep -E '^\s*"[^"]+",?\s*$' | tr -d '",' | xargs)
        for member in $members; do
            # Each member is a potential dependent if it has Cargo deps on the target
            : # ...
        done
    fi
done
```

**Impact**: Cascade detection no longer misses dependents inside the same workspace, which
removes one source of missed rebuilds.

**Effort**: 2-3 days.

### T1.4 — Per-source-hash invalidation in `modified_dir_ignore_git`

**Problem**: `fs.rs:160-167` walks the ENTIRE source tree to find the newest file mtime. A
single `.swp` file or build artifact can invalidate the cache.

**Fix**: Use git tree hash for source modification detection. If the source is a git repo
(most local sources are), use `git rev-parse HEAD:./path` to get a content hash. Only when the
hash changes, mark dirty. Because `HEAD` reflects only committed state, uncommitted edits to
tracked files must be folded into the fingerprint as well; otherwise work-in-progress changes
would never trigger a rebuild.

**Implementation** (fs.rs):

```rust
pub fn source_fingerprint(dir: &Path) -> Result<String> {
    if is_git_repo(dir) {
        let mut hasher = blake3::Hasher::new();
        // `ls-tree -r HEAD` lists tracked files with their blob hashes, so
        // untracked artifacts (.swp, target/) never affect the fingerprint.
        // `diff HEAD -- .` folds in uncommitted edits to tracked files.
        let queries: [&[&str]; 2] = [&["ls-tree", "-r", "HEAD"], &["diff", "HEAD", "--", "."]];
        for args in queries {
            let output = Command::new("git").arg("-C").arg(dir).args(args).output()?;
            hasher.update(&output.stdout);
        }
        Ok(hasher.finalize().to_hex().to_string())
    } else {
        // Fallback: hash the contents of all files.
        // `hash_dir_contents` is a hypothetical helper, not a blake3 API.
        hash_dir_contents(dir)
    }
}
```

**Impact**: Build artifacts (`.swp`, `target/`, `Cargo.lock` if not tracked) no longer trigger
rebuilds. Mtime changes caused by `touch` operations no longer trigger rebuilds either.

**Effort**: 2 days.

## TIER 2 — PER-CRATE GRANULARITY (medium effort, very high impact)

### T2.1 — Split `base` workspace into per-binary sub-recipes

**Problem**: `base` is one recipe with 45 workspace members. A 1-line change rebuilds all 45.

**Fix**: Two options:

**Option A (simpler)**: Keep `base` as a Cargo workspace, but change the cookbook's `build()`
to track per-binary `stage.pkgar` files. Each binary gets its own pkgar; downstream depends on
specific binaries.

**Option B (cleaner)**: Split `base` into one recipe per binary. Each recipe:

- Has its own `recipe.toml` with `template = "cargo"` and a single `-p` filter
- Stages its own binary
- Other packages depend on specific binaries (e.g., `pcid-bin`, `usbhidd-bin`)

**Recommended**: Option A first (smaller diff), then migrate to Option B as cleanup.

**Implementation** (cookbook, `cook_build.rs`):

```rust
// Detect workspace members
let members = parse_workspace_members(&source_manifest)?;
for member in members {
    // Assumes `member` is usable both as a file stem and as a `-p` package name.
    let member_pkgar = stage_dir.join(format!("{member}.pkgar"));
    let member_pkgar_modified = modified(&member_pkgar)?;

    // Per-member source mtime + per-member dependency mtimes
    let member_source_dir = source_dir.join(&member);
    let member_modified = modified_dir_ignore_git(&member_source_dir)?;
    let member_deps = member_dependencies(&member, &source_manifest)?;
    let deps_modified = deps_modified_for(&member_deps)?;

    // Per-member cache check
    if member_pkgar_modified < member_modified || member_pkgar_modified < deps_modified {
        // Rebuild this member only (remaining cargo flags elided: ...)
        Command::new("cargo").args(["build", "-p", member.as_str()]).status()?;
    }
}
```

**Impact**: A change to `audiod` rebuilds only `audiod`, not all 45 base binaries. A change to
`pcid` rebuilds only `pcid`. This is the single biggest win.

**Effort**: 1-2 weeks (cookbook refactor + recipe updates).

### T2.2 — Restructure kernel and mesa recipes similarly

Same pattern as T2.1 for:

- `kernel` recipe (single `kernel.elf` output, but multiple internal stages)
- `mesa` recipe (single `libGL.so` etc., but multiple internal sub-libraries)
- `llvm21` recipe (single `clang` binary, but many internal components)

**Impact**: A change to one mesa component rebuilds only that component, not the whole mesa
build (which takes 20+ minutes).

**Effort**: 1 week each.

### T2.3 — Per-binfmt_pkg output tracking

**Problem**: A recipe's `stage/` directory contains many files, all bundled into one
`stage.pkgar`. The cookbook doesn't know which file in `stage/` corresponds to which binary.

**Fix**: Add `installs = [...]` to recipes (already partially supported), and use it to track
per-output mtime.

**Implementation**: When a recipe declares `installs = ["/usr/bin/foo", "/usr/lib/libbar.so"]`,
the cookbook:

- Tracks mtime per output path
- Computes per-output hash
- Lets downstream depend on specific output paths

**Impact**: When `libdrm.so` changes but `libdrm_intel.so` doesn't, only consumers of
`libdrm.so` rebuild.

**Effort**: 1-2 weeks.

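
The per-output comparison described in T2.3 can be sketched as a diff between two maps of output path to content hash. The types and function names here (`OutputHashes`, `changed_outputs`, `consumer_is_dirty`) are assumptions made for illustration; the hashes are treated as opaque hex strings, however the cookbook ends up computing them.

```rust
use std::collections::BTreeMap;

/// Output path -> content hash (opaque hex string), recorded after a build.
type OutputHashes = BTreeMap<String, String>;

/// Returns the output paths that were added, removed, or whose hash differs.
fn changed_outputs(old: &OutputHashes, new: &OutputHashes) -> Vec<String> {
    let mut changed = Vec::new();
    for (path, hash) in new {
        // New or modified output.
        if old.get(path) != Some(hash) {
            changed.push(path.clone());
        }
    }
    for path in old.keys() {
        // Output that disappeared in the new build.
        if !new.contains_key(path) {
            changed.push(path.clone());
        }
    }
    changed.sort();
    changed
}

/// A consumer is dirty only if it depends on at least one changed output.
fn consumer_is_dirty(consumed: &[&str], changed: &[String]) -> bool {
    consumed
        .iter()
        .any(|c| changed.iter().any(|p| p.as_str() == *c))
}
```

In the `libdrm` example above, a consumer that lists only `/usr/lib/libdrm_intel.so` stays clean when just `/usr/lib/libdrm.so` changes.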
## TIER 3 — OUTPUT FINGERPRINTING (medium-high effort, very high impact)

### T3.1 — Hash the sysroot content of each recipe

**Problem**: Currently the cookbook only checks mtime of `stage.pkgar`, not its content. Two
builds that produce identical pkgar content still cascade downstream.

**Fix**: Compute BLAKE3 hash of the staged sysroot artifacts; cache it; use it as part of
the package fingerprint.

**Implementation** (`src/cook/cook_build.rs`):

```rust
// After staging files into stage/
let stage_fingerprint = compute_stage_fingerprint(&stage_dir)?;
let fp_file = stage_dir.join("stage.fingerprint");
let new_fp = blake3::hash(stage_fingerprint.as_bytes()).to_hex().to_string();

if let Ok(old_fp) = std::fs::read_to_string(&fp_file) {
    if old_fp.trim() == new_fp {
        // Stage contents identical: keep the previous mtimes so nothing cascades
        preserve_mtime_recursive(&stage_dir)?;
    }
}
// Note: keep stage.fingerprint itself out of the fingerprint computation.
std::fs::write(fp_file, new_fp)?;
```

**Impact**: A rebuild that produces identical output (e.g., due to deterministic compiler
output for unchanged sources) doesn't cascade.

**Effort**: 3-5 days.

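
One possible shape for `compute_stage_fingerprint` is sketched below. It is not the cookbook's implementation: the standard library's `DefaultHasher` is swapped in for BLAKE3 so the example has no external dependencies, and its output is not guaranteed to be stable across Rust releases, so a real fingerprint cache should use a fixed hash such as BLAKE3. The helper `collect_files` is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: collect every non-directory entry under `dir`.
fn collect_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_files(&path, out)?;
        } else {
            out.push(path);
        }
    }
    Ok(())
}

/// Sketch of `compute_stage_fingerprint`: hashes relative paths and file
/// contents in sorted order, ignoring mtimes entirely.
fn compute_stage_fingerprint(stage_dir: &Path) -> io::Result<u64> {
    let mut files = Vec::new();
    collect_files(stage_dir, &mut files)?;
    files.sort(); // directory iteration order is not deterministic
    let mut hasher = DefaultHasher::new(); // stand-in for blake3::Hasher
    for file in &files {
        let rel = file.strip_prefix(stage_dir).unwrap_or(file);
        hasher.write(rel.to_string_lossy().as_bytes());
        hasher.write(&[0]); // separator between path and content
        let bytes = fs::read(file)?;
        hasher.write(&(bytes.len() as u64).to_le_bytes());
        hasher.write(&bytes);
    }
    Ok(hasher.finish())
}
```

Sorting the file list and hashing relative rather than absolute paths are what make the result independent of directory iteration order and of where the stage directory lives.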
### T3.2 — Cascade invalidation only when downstream input hash changes

**Problem**: Cascade currently triggers on mtime. Mtime can change without content change.

**Fix**: Instead of the ordered mtime check `stage_modified < deps_modified`, compare hashes
for inequality: `recorded_input_fingerprint != current_input_fingerprint`.

Where:

- Each recipe declares a `fingerprint_inputs = [...]` list (paths it consumes)
- On each rebuild, hash the contents of those paths
- Store the hash as part of the recipe's fingerprint

**Implementation**:

```rust
// Recipe declares (recipe.toml):
//   [package]
//   fingerprint_inputs = ["/usr/lib/libdrm.so", "/usr/include/libdrm/drm.h"]

// Cookbook computes (`hash_paths` is a hypothetical helper, not a blake3 API):
let input_fingerprint = hash_paths(&fingerprint_inputs)?;
```

**Impact**: When the bytes of `libdrm.so` and the listed headers don't change (e.g., the
upstream rebuild reproduced the same files), consumers don't rebuild.

**Effort**: 1-2 weeks.

### T3.3 — Yocto-style equivalence cache for ABI-stable rebuilds

**Problem**: When a recipe's source changes but its output is byte-identical, the recipe
rebuilds but downstream should not.

**Fix**: Implement an "equivalence cache" — a database mapping old content hash → new content
hash for ABI-equivalent outputs. When the new content hash matches an old one (within the
equivalence class), downstream is not invalidated.

**Implementation**: SQLite-backed equivalence cache at `.redbear/equivalence.db`. Keyed by
input hash + build flags; value is the set of "equivalent" output hashes.

**Impact**: Even non-deterministic builds (e.g., embedded timestamps) can be marked equivalent.

**Effort**: 2-3 weeks.

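
The lookup logic of the equivalence cache can be illustrated with an in-memory map in place of the SQLite database. This is only a sketch of the keying scheme described above (input hash plus build flags mapping to a set of equivalent output hashes); the `EquivalenceCache` type and its methods are hypothetical, and how outputs get declared equivalent in the first place is left open.

```rust
use std::collections::{HashMap, HashSet};

/// In-memory stand-in for the SQLite table:
/// (input hash, build flags) -> set of output hashes considered equivalent.
#[derive(Default)]
struct EquivalenceCache {
    classes: HashMap<(String, String), HashSet<String>>,
}

impl EquivalenceCache {
    /// Record `output_hash` as a member of the class for this key.
    fn record(&mut self, input_hash: &str, flags: &str, output_hash: &str) {
        self.classes
            .entry((input_hash.to_string(), flags.to_string()))
            .or_default()
            .insert(output_hash.to_string());
    }

    /// True if downstream can keep using `old_output`: either the hashes are
    /// identical or both belong to the same recorded equivalence class.
    fn equivalent(&self, input_hash: &str, flags: &str, old_output: &str, new_output: &str) -> bool {
        if old_output == new_output {
            return true;
        }
        let key = (input_hash.to_string(), flags.to_string());
        self.classes
            .get(&key)
            .map_or(false, |set| set.contains(old_output) && set.contains(new_output))
    }
}
```

A cascade check would then call `equivalent(...)` before invalidating dependents, and skip the cascade when it returns true.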
## TIER 4 — PUBLIC API TRACKING (high effort, high impact for kernel/relibc)

### T4.1 — Distinguish public headers from internal sources

**Problem**: relibc changes cascade to all C/C++ packages. But only changes to relibc's public
headers (in `local/sources/relibc/include/`) should cascade. Internal changes to
`local/sources/relibc/src/` should not.

**Fix**: Each recipe declares `public_api = [...]` — the list of paths that constitute its
public API. Only mtime/hash changes to those paths trigger cascade.

**Implementation**:

```rust
// Newest mtime across the declared public API paths (None if the list is empty).
let public_api_modified = recipe
    .public_api_paths()
    .iter()
    .map(|p| modified(p))
    .collect::<Result<Vec<_>, _>>()?
    .into_iter()
    .max();

// Cascade only when some public path is newer than the staged output.
if public_api_modified.is_some_and(|m| stage_modified < m) {
    cascade_to_dependents();
}
```

**Impact**: A change to `relibc/src/header/errno/mod.rs` (internal) doesn't cascade.
A change to `relibc/include/sys/errno.h` (public) does.

**Effort**: 1-2 weeks per "API surface" (relibc, kernel, mesa).

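
When the changed files are already known (for example from `git diff --name-only`), the public/internal split reduces to a path-prefix check. The sketch below assumes `public_api` entries are directory or file paths relative to the source root; `touches_public_api` is a hypothetical name.

```rust
use std::path::Path;

/// True if any changed path lies under one of the declared public API paths.
/// `Path::starts_with` compares whole path components, so "include_private/x.h"
/// does not match the prefix "include".
fn touches_public_api(changed: &[&str], public_api: &[&str]) -> bool {
    changed
        .iter()
        .any(|c| public_api.iter().any(|p| Path::new(c).starts_with(p)))
}
```

With `public_api = ["include"]`, a change to `src/header/errno/mod.rs` returns false and a change to `include/sys/errno.h` returns true, matching the two cases in the Impact paragraph above.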
### T4.2 — Track ABI via `.so` version files / SONAME bumps

**Problem**: Even if headers change, if the ABI version is unchanged, downstream can use the
new library without recompilation.

**Fix**: Parse `.so` files for SONAME. Compare SONAME between old and new build. If
SONAME unchanged, no cascade.

**Implementation**: ELF SONAME parser in cookbook. SONAME = `readelf -d *.so | grep SONAME`.

**Impact**: Relibc ABI-preserving changes don't cascade to C/C++ packages.

**Effort**: 1 week.

### T4.3 — Header dependency graph for C/C++ packages

**Problem**: A C/C++ package's includes cascade is "any header in any include path", which is
over-inclusive. The actual cascade should be "headers this file actually includes".

**Fix**: Use `gcc -M` / `clang -M` to generate per-file header dependencies. Record the
headers listed in the resulting `.d` file and hash them. Cascade only when those specific
headers change.

**Impact**: A change to `errno.h` doesn't cascade to packages that don't include `errno.h`.

**Effort**: 1-2 weeks.

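
Extracting the header list from compiler dependency output is mostly string handling. The sketch below parses the Makefile-style rules that `gcc -M` and `clang -M` emit; `parse_depfile` is a hypothetical helper, and it deliberately ignores corner cases such as escaped spaces in paths.

```rust
/// Parse Makefile-style dependency rules ("target: dep dep \" with
/// backslash-newline continuations) and return the sorted, de-duplicated
/// list of prerequisites. Escaped spaces in paths are not handled.
fn parse_depfile(text: &str) -> Vec<String> {
    // Join continuation lines so each rule sits on a single line.
    let joined = text.replace("\\\n", " ");
    let mut deps = Vec::new();
    for line in joined.lines() {
        // Everything after the first ':' is the prerequisite list.
        if let Some((_target, rest)) = line.split_once(':') {
            deps.extend(rest.split_whitespace().map(str::to_string));
        }
    }
    deps.sort();
    deps.dedup();
    deps
}
```

The returned paths are the set whose content hashes would feed the consumer's fingerprint, so a header that never appears in any `.d` file cannot dirty the package.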
## TIER 5 — RESTAT / OUTPUT STABILITY (medium effort, medium impact)

### T5.1 — After rebuild, check if installed files differ from previous

**Problem**: Currently, every rebuild produces a new `stage.pkgar` regardless of whether
content changed.

**Fix**: After `cargo build` and before `pkgar` packaging, diff the new sysroot against the
old sysroot. If all files are byte-identical, carry the old `stage.pkgar` mtime over to the
new archive.

**Implementation**: `diff -r` or content-hash comparison.

**Impact**: Idempotent builds don't cascade.

**Effort**: 3-5 days.

### T5.2 — Idempotent packaging

**Problem**: pkgar files include timestamps, so identical content produces different pkgar
files.

**Fix**: Make pkgar packaging deterministic (sort entries, zero timestamps, fixed compression).

**Impact**: Identical content → identical pkgar → no cascade.

**Effort**: 1 week (upstream pkgar changes).

## TIER 6 — DEPENDENCY GRAPH ANALYSIS (low effort, medium impact)

### T6.1 — Add `repo graph` to show full dependency graph

**Problem**: Hard to know what rebuilds when X changes.

**Fix**: Add `repo graph` that emits the full dep graph in DOT format. Visualize with
`xdot` or similar.

**Implementation**: New `src/bin/repo_graph.rs` (or subcommand in repo.rs).

**Effort**: 2-3 days.

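
Emitting DOT is straightforward once the recipe graph is in memory. The following sketch assumes the graph is a map from recipe name to its dependencies; `to_dot` is a hypothetical function, not an existing cookbook API, and recipe names are assumed not to contain double quotes.

```rust
use std::collections::BTreeMap;

/// Render a recipe -> dependencies map as a Graphviz DOT digraph.
/// A BTreeMap keeps the output order stable between runs.
fn to_dot(deps: &BTreeMap<&str, Vec<&str>>) -> String {
    let mut out = String::from("digraph recipes {\n");
    for (recipe, ds) in deps {
        if ds.is_empty() {
            // Emit isolated nodes so leaf recipes still show up.
            out.push_str(&format!("    \"{recipe}\";\n"));
        }
        for d in ds {
            out.push_str(&format!("    \"{recipe}\" -> \"{d}\";\n"));
        }
    }
    out.push_str("}\n");
    out
}
```

The output can be piped to `xdot` or `dot -Tsvg`. Edges point from a recipe to what it depends on; reversing them would show the rebuild cascade direction instead.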
### T6.2 — Add `repo cook --since=<commit>` to only rebuild affected packages

**Problem**: When you `git pull` or merge a branch, you want to rebuild only what the merge
touched.

**Fix**: Use `git diff --name-only` between old HEAD and new HEAD, walk reverse deps.

**Implementation**: New `--since` flag in `repo cook`. Falls back to `--changed` for tracked
files.

**Effort**: 3-5 days.

### T6.3 — Add `repo why <pkg>` to show what triggers rebuilds

**Problem**: When `pkg` rebuilds, why? What cascaded into it?

**Fix**: Reverse-dep analysis — show the path from each changed source to the target recipe.

**Implementation**: BFS from changed source paths, through recipe deps and Cargo workspace
members, to target recipe.

**Effort**: 2-3 days.

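
The BFS described for `repo why` can record a parent pointer per visited node and walk it backwards once the target is reached. The sketch below works at recipe granularity only and assumes a precomputed map from each recipe to its direct dependents; `why` is a hypothetical function name.

```rust
use std::collections::{HashMap, VecDeque};

/// `dependents` maps a recipe to the recipes that directly depend on it.
/// Returns the shortest cascade path from `changed` to `target`, if any.
fn why(dependents: &HashMap<&str, Vec<&str>>, changed: &str, target: &str) -> Option<Vec<String>> {
    let mut parent: HashMap<&str, &str> = HashMap::new();
    let mut queue = VecDeque::from([changed]);
    while let Some(node) = queue.pop_front() {
        if node == target {
            // Walk the parent pointers back to the changed recipe.
            let mut path = vec![node.to_string()];
            let mut cur = node;
            while let Some(&p) = parent.get(cur) {
                path.push(p.to_string());
                cur = p;
            }
            path.reverse();
            return Some(path);
        }
        for &next in dependents.get(node).into_iter().flatten() {
            // Visit each recipe once; the first visit gives the shortest path.
            if next != changed && !parent.contains_key(next) {
                parent.insert(next, node);
                queue.push_back(next);
            }
        }
    }
    None
}
```

If `relibc` has dependents `base` and `mesa`, and `mesa` has dependent `desktop`, then `why(&graph, "relibc", "desktop")` yields `relibc → mesa → desktop`. Mapping changed source files to their owning recipe would happen before this step.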
## PRIORITIZED ROADMAP

| Item | Effort | Impact | Risk | Priority |
|---|---|---|---|---|
| T1.1 Content-hash stage.pkgar | 1 day | High (catches all no-op rebuilds) | Low | **P0 — DO FIRST** |
| T1.4 Per-source-hash via git tree | 2 days | High (eliminates spurious dirty) | Low | **P0** |
| T1.2 `--since` flag | 3-5 days | High (developer workflow) | Medium | **P1** |
| T1.3 Cascade script cargo-aware | 2-3 days | Medium | Low | **P1** |
| T2.1 Split base per-binary | 1-2 weeks | Very high (45 → 1 rebuild) | Medium (breaking) | **P1** |
| T3.1 Sysroot fingerprint | 3-5 days | High | Low | **P1** |
| T2.2 Split kernel/mesa | 1 week each | High | Medium | **P2** |
| T3.2 Downstream-input hash | 1-2 weeks | Very high | Medium | **P2** |
| T6.1 `repo graph` | 2-3 days | Medium (devx) | Low | **P2** |
| T6.2 `--since` commit | 3-5 days | High (devx) | Low | **P2** |
| T5.1 Restat diff | 3-5 days | Medium | Low | **P3** |
| T3.3 Equivalence cache | 2-3 weeks | High | Medium (cache coherency) | **P3** |
| T4.1 Public API surface | 1-2 weeks | High (relibc) | Medium (semantics) | **P3** |
| T4.2 SONAME tracking | 1 week | Medium | Low | **P4** |
| T4.3 Header dep graph | 1-2 weeks | Medium | Medium | **P4** |
| T5.2 Idempotent pkgar | 1 week | Medium | Medium | **P4** |
| T2.3 Per-binfmt_pkg | 1-2 weeks | High | Medium | **P4** |
| T6.3 `repo why` | 2-3 days | Low (devx) | Low | **P4** |

## PHASED IMPLEMENTATION

### Phase A (1-2 weeks) — Stop the bleeding

- T1.1 content-hash stage.pkgar
- T1.4 git-tree source fingerprint
- T1.3 cargo-aware cascade
- Result: 2-line Cargo.toml change no longer cascades if output is identical

### Phase B (2-4 weeks) — Per-crate granularity

- T2.1 split base workspace (1-2 weeks)
- T2.2 split kernel/mesa (1 week each)
- T3.1 sysroot fingerprint
- T3.2 downstream-input hash cascade
- Result: A change to one driver rebuilds only that driver

### Phase C (2-4 weeks) — API surface tracking

- T4.1 public API surface for relibc + kernel
- T4.2 SONAME tracking
- T4.3 header dep graph
- T5.1 restat diff
- T5.2 idempotent pkgar
- Result: Internal implementation changes don't cascade

### Phase D (1-2 weeks) — Developer experience

- T6.1 `repo graph`
- T6.2 `--since` commit
- T6.3 `repo why`
- T1.2 `--since=<ref>` incremental
- Result: Developer can answer "what rebuilds" instantly

## METRICS & SUCCESS CRITERIA

The build system is healthy when:

| Metric | Current | Target |
|---|---|---|
| 1-line Cargo.toml change rebuilds | Full OS (1.5h) | < 5 min (only changed recipe) |
| `make` after no source change | Full OS (1.5h) | 0 sec (idempotent, no-op) |
| 1-line kernel source change | Full OS (1.5h) | < 10 min (kernel + kernel consumers) |
| 1-line relibc internal change | Full OS (1.5h) | < 5 min (relibc + 0 consumers if API unchanged) |
| `repo cook --since=v0.1.0` | Full OS | < 1 min (1-2 packages) |
| `repo why mesa` | N/A | < 1 sec (printed graph) |

## DESIGN CONSTRAINTS

These constraints are non-negotiable:

1. **Offline-first** — `REPO_OFFLINE=1` must remain default. All changes must work without
   network access.
2. **Determinism** — Outputs must be byte-identical for identical inputs (modulo timestamps).
3. **Backward compat** — Existing recipes must continue to work without modification.
4. **No new build dependencies** — Only use crates already in the workspace.
5. **Performance** — Fingerprint computation must be O(source) or O(staged-output), not O(n²).
6. **Durability** — Fingerprint caches must survive `make distclean` (in `local/.cache/`).

## NON-GOALS

We will NOT:

- Replace Cargo (too invasive, too risky)
- Migrate to Bazel or Nix (would require months of work)
- Add remote artifact caching (out of scope; we have local sstate)
- Rewrite the build in a different language
- Add a distributed build cluster

## WHY THIS PLAN WILL WORK

Mature systems (Nix, Cargo, Yocto) already implement these patterns. The techniques are
proven. Red Bear OS only needs to add **output fingerprinting** and **per-crate granularity**,
both of which are well-understood in the broader build systems literature.

The hardest part is **T2.1 (per-crate granularity for base)** because it requires cookbook
changes. But the rest can be implemented incrementally and tested with `--no-cache` for
correctness.

## NEXT STEPS

1. Implement T1.1 (content-hash stage.pkgar) — 1 day, low risk
2. Implement T1.4 (git-tree source fingerprint) — 2 days, low risk
3. Implement T1.3 (cargo-aware cascade) — 2-3 days, low risk
4. Test: 1-line Cargo.toml change should rebuild only base
5. Implement T2.1 (per-binary base) — 1-2 weeks
6. Test: 1-line pcid source change should rebuild only pcid
7. Implement T3.1, T3.2 (output fingerprinting + cascade-by-hash)
8. Test: rebuild with identical source produces no cascade
9. Phase D devx improvements (graph, why, since)

## REFERENCES

- Nix Quotient Hashing: <https://nix.dev/manual/nix/2.34/store/derivation/outputs/content-address>
- Cargo Fingerprint Module: <https://doc.rust-lang.org/stable/nightly-rustc/cargo/core/compiler/fingerprint/index.html>
- GN Analyze: <https://gn.googlesource.com/gn/+/HEAD/docs/reference.md>
- Yocto sstate: <https://docs.yoctoproject.org/5.0.9/overview-manual/concepts.html>
- Bazel Skyframe: <https://preview.bazel.build/reference/skyframe>
- Buildroot Rebuilding: <https://buildroot.org/downloads/manual/rebuilding-packages.txt>