feat: build system hardening — collision detection, validation gates, init path enforcement

5-phase hardening to prevent silent file-layer collisions (the D-Bus
regression class):

Phase 1: lint-config-paths.sh + make lint-config in depends.mk
Phase 2: CollisionTracker in installer (content-hash comparison)
Phase 3: installs manifests in recipe.toml + validate-file-ownership.sh
Phase 4: validate-init-services.sh + make validate in disk.mk
Phase 5: documentation (AGENTS.md, BUILD-SYSTEM-HARDENING-PLAN.md)

Both redbear-mini and redbear-full build and validate clean.
66 declared install paths in base, zero conflicts.
2026-05-03 22:25:22 +01:00
parent 907d447369
commit 2e764746e7
21 changed files with 1503 additions and 69 deletions
@@ -0,0 +1,403 @@
# Build System Hardening Plan
**Date:** 2026-05-03
**Status:** Implemented
**Scope:** Installer file-layer collision detection, config-layer path enforcement,
recipe file-ownership tracking, validation gates, and architectural documentation.
**Triggering incident:** 40 init service files in `config/redbear-*.toml` used
`/usr/lib/init.d/` paths. The `base` package installs to the same directory.
Package staging silently overwrote config overrides. The init scheduler blocked
on `scheme`-type services that were supposed to be overridden to `oneshot_async`,
preventing D-Bus and 20+ services from ever starting.
**Fix applied:** Changed all config `[[files]]` init service paths from
`/usr/lib/init.d/` to `/etc/init.d/`. The init system's `config_for_dirs()`
BTreeMap gives `/etc/init.d/` priority over `/usr/lib/init.d/` for the same
filename, so config overrides now survive package installation and take effect
at runtime.
**Goal:** Prevent this class of silent file collision from recurring by adding
build-time detection, installer awareness, and architectural documentation.
---
## Phase 1: Config-Layer Path Enforcement (1-2 days)
**Objective:** Ensure config `[[files]]` entries for init services always use
`/etc/init.d/` paths. Detect violations at build time.
### 1.1 Add a build-time lint for init service path violations
Create `scripts/lint-config-paths.sh` that:
- Parses all `config/redbear-*.toml` files
- Finds `[[files]]` entries with `path = "/usr/lib/init.d/..."`
- Reports violations with file, line number, and path
- Returns non-zero if any violations found
- Can be integrated into the build as a pre-build step
**Why a script, not Rust:** Config parsing is already TOML-based and a shell
script with `grep`/`awk` is sufficient for this lint. Adding it to the cookbook
Rust tool would require rebuilding the tool for lint-only changes. A script is
cheaper to iterate on and can run without a Rust toolchain rebuild.
**Acceptance:**
```bash
scripts/lint-config-paths.sh # exits 0 when clean, 1 + report when violations found
```
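The rule the lint enforces is small enough to state exactly. The plan implements it as a shell script with `grep`/`awk`; the Rust sketch below only pins down the intended matching rule and is not the script itself. `Violation` and `find_violations` are hypothetical names, and the scan is deliberately line-based rather than a real TOML parse.

```rust
/// One lint finding: 1-based line number and the offending path value.
/// (Hypothetical type, introduced only for this illustration.)
#[derive(Debug, PartialEq, Eq)]
struct Violation {
    line: usize,
    path: String,
}

/// Scans config text line by line, mirroring a grep/awk style check.
/// Limitation: a multi-line `data` string containing lines that start
/// with `[` would end the `[[files]]` context early.
fn find_violations(toml_text: &str) -> Vec<Violation> {
    let mut in_files_table = false;
    let mut violations = Vec::new();
    for (index, raw) in toml_text.lines().enumerate() {
        let line = raw.trim();
        if line.starts_with('[') {
            // Any table header switches context; only [[files]] is linted.
            in_files_table = line == "[[files]]";
            continue;
        }
        if !in_files_table {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            let value = value.trim().trim_matches('"');
            if key.trim() == "path" && value.starts_with("/usr/lib/init.d/") {
                violations.push(Violation {
                    line: index + 1,
                    path: value.to_string(),
                });
            }
        }
    }
    violations
}
```

A caller would run this over each `config/redbear-*.toml` file and exit non-zero if any file yields a non-empty list.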
### 1.2 Document the init service layer convention
Add to AGENTS.md (project root) a clear rule:
> **Init service file ownership:**
> - Packages own `/usr/lib/init.d/` — the default service files installed by recipe staging
> - Config overrides own `/etc/init.d/` — override files created by `[[files]]` entries
> - The init system's `config_for_dirs()` gives `/etc/init.d/` priority via BTreeMap dedup
> - Config `[[files]]` entries MUST NOT use `/usr/lib/init.d/` paths for init services
### 1.3 Add Makefile integration
In `mk/config.mk` or `mk/depends.mk`, add a pre-build lint step:
```makefile
# Lint config files for init service path violations
.PHONY: lint-config
lint-config:
	@scripts/lint-config-paths.sh

# Hook into the build before repo cook
repo: lint-config
```
---
## Phase 2: Installer Collision Detection (2-3 days)
**Objective:** The installer detects when a config `[[files]]` entry would be
silently overwritten by package staging, and warns or errors accordingly.
### 2.1 Track file provenance during `install_dir()`
Modify `install_dir()` in `installer.rs` to track which layer created each file:
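```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Records which install layer created each destination path.
struct InstallTracker {
    /// Map from destination path to the layer that created it
    files: BTreeMap<PathBuf, FileProvenance>,
}

/// Install layers, in the order `install_dir()` applies them.
enum FileProvenance {
    ConfigPreInstall,  // Created by [[files]] with postinstall=false
    Package,           // Created by install_packages()
    ConfigPostInstall, // Created by [[files]] with postinstall=true
}
```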
Implementation points:
- Before `file.create(&output_dir)`, record the path and layer
- Before `install_packages()`, snapshot existing files
- After `install_packages()`, diff to find new/overwritten files
- After postinstall `[[files]]`, record new files
### 2.2 Detect and report collisions
During the diff after `install_packages()`:
1. If a file existed from `ConfigPreInstall` and was overwritten by `Package`:
   - **WARN** (default): Print a warning showing the collision
   - **ERROR** (strict mode via `STRICT_COLLISION=1` env): Fail the build
2. For init service files specifically (`/usr/lib/init.d/*.service`,
   `/etc/init.d/*.service`):
   - Always **ERROR**: Init service collisions are never acceptable because they
     silently break the boot sequence
3. For other file types:
   - **WARN** by default: Some collisions may be intentional (e.g., default
     configs that packages override with versioned copies)
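The rules above reduce to a small classification step. The sketch below is one hypothetical way to express it, not the installer's actual code: `detect_collisions`, `Severity`, and the `strict` flag (standing in for `STRICT_COLLISION=1`) are illustrative names, and it assumes the caller already has the list of paths written by `install_packages()` instead of computing a snapshot diff.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileProvenance {
    ConfigPreInstall,
    Package,
    ConfigPostInstall,
}

/// Hypothetical severity levels matching the WARN/ERROR rules.
#[derive(Debug, PartialEq, Eq)]
enum Severity {
    Warn,
    Error,
}

/// True for `*.service` files directly under either init.d directory.
fn is_init_service(path: &Path) -> bool {
    let in_init_dir = path.parent().map_or(false, |dir| {
        dir == Path::new("/usr/lib/init.d") || dir == Path::new("/etc/init.d")
    });
    in_init_dir && path.extension().map_or(false, |ext| ext == "service")
}

/// Reports every package-written path that replaces a pre-install config file.
/// `before` is the provenance map recorded ahead of package staging.
fn detect_collisions(
    before: &BTreeMap<PathBuf, FileProvenance>,
    package_files: &[PathBuf],
    strict: bool,
) -> Vec<(PathBuf, Severity)> {
    let mut collisions = Vec::new();
    for path in package_files {
        if before.get(path) == Some(&FileProvenance::ConfigPreInstall) {
            // Init service collisions are always errors; others only in strict mode.
            let severity = if strict || is_init_service(path) {
                Severity::Error
            } else {
                Severity::Warn
            };
            collisions.push((path.clone(), severity));
        }
    }
    collisions
}
```

Under these assumptions the build would fail if any returned entry carries `Severity::Error`, and print the remaining entries as warnings.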
### 2.3 Collision report format
```
[COLLISION] /usr/lib/init.d/10_evdevd.service
Created by: config redbear-mini.toml (pre-install)
Overwritten by: package base
Impact: init service override lost
Fix: Change config [[files]] path from /usr/lib/init.d/ to /etc/init.d/
```
### 2.4 Implementation location
Patch against `recipes/core/installer/source/src/installer.rs`:
- New module `src/tracker.rs` with `InstallTracker`
- Modify `install_dir()` to use tracker
- Patch stored in `local/patches/installer/`
**Acceptance:**
- Build with a known collision (revert the /etc/init.d/ fix temporarily) should
produce clear error output
- Build with current configs should produce zero collisions
---
## Phase 3: Recipe File-Ownership Manifests (3-5 days)
**Objective:** Recipes declare what paths they install, enabling build-time
conflict detection between packages and between packages and config layers.
### 3.1 Add optional `installs` field to recipe.toml
```toml
[package]
# Optional: declare what paths this recipe installs into the image
# Used for collision detection and build validation
installs = [
"/usr/lib/init.d/10_evdevd.service",
"/usr/lib/init.d/11_udev.service",
"/usr/bin/evdevd",
"/usr/lib/libevdev.so",
]
```
This is **optional** — existing recipes without `installs` work as before.
New recipes and frequently-updated recipes should declare their installs.
### 3.2 Build-time ownership registry
The `repo cook` command builds an in-memory registry:
```
path → recipe_name
```
When multiple recipes claim the same path:
- **WARN** for non-critical paths (shared headers, etc.)
- **ERROR** for init service paths (`.service` files in `init.d/`)
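A minimal sketch of such a registry, assuming each recipe's `installs` list has already been parsed into string slices. `build_registry`, `OwnershipConflict`, and `is_init_service_path` are hypothetical names, not part of the cookbook tool; the first recipe to claim a path keeps ownership in this model.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// A path claimed by more than one recipe (illustrative type).
#[derive(Debug, PartialEq, Eq)]
struct OwnershipConflict {
    path: String,
    first_owner: String,
    second_owner: String,
    /// true = ERROR (init service path), false = WARN
    is_error: bool,
}

/// Matches `.service` files located in an `init.d/` directory.
fn is_init_service_path(path: &str) -> bool {
    path.contains("/init.d/") && path.ends_with(".service")
}

/// Builds the path -> recipe_name map and collects conflicting claims.
/// Each input item is (recipe_name, declared installs).
fn build_registry(
    recipes: &[(&str, &[&str])],
) -> (BTreeMap<String, String>, Vec<OwnershipConflict>) {
    let mut registry: BTreeMap<String, String> = BTreeMap::new();
    let mut conflicts = Vec::new();
    for (recipe, installs) in recipes {
        for path in installs.iter() {
            match registry.entry(path.to_string()) {
                Entry::Vacant(slot) => {
                    slot.insert(recipe.to_string());
                }
                Entry::Occupied(slot) if slot.get().as_str() != *recipe => {
                    conflicts.push(OwnershipConflict {
                        path: path.to_string(),
                        first_owner: slot.get().clone(),
                        second_owner: recipe.to_string(),
                        is_error: is_init_service_path(path),
                    });
                }
                // Same recipe listing a path twice is not a conflict.
                Entry::Occupied(_) => {}
            }
        }
    }
    (registry, conflicts)
}
```

Recipes without an `installs` field would simply contribute an empty list, which keeps the field optional as described in 3.1.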
### 3.3 Auto-generation tool
Create `scripts/generate-installs-manifest.sh`:
- Inspects recipe stage directory after build
- Lists all installed files relative to sysroot root
- Outputs suggested `installs = [...]` for recipe.toml
- Can be run as `make manifest.<recipe>`
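The plan calls for a shell script here; the Rust sketch below only illustrates the intended output shape under stated assumptions. `installs_manifest` and `collect_files` are hypothetical helpers, the stage directory is assumed to mirror the image root, and symlinked directories are followed without loop protection.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively collects regular files under `dir` as absolute image paths
/// (paths relative to `root`, prefixed with `/`).
fn collect_files(root: &Path, dir: &Path, out: &mut Vec<String>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_files(root, &path, out)?;
        } else if let Ok(relative) = path.strip_prefix(root) {
            out.push(format!("/{}", relative.display()));
        }
    }
    Ok(())
}

/// Renders a suggested `installs = [...]` block for recipe.toml.
fn installs_manifest(stage_root: &Path) -> io::Result<String> {
    let mut files = Vec::new();
    collect_files(stage_root, stage_root, &mut files)?;
    files.sort(); // stable output regardless of directory iteration order
    let mut text = String::from("installs = [\n");
    for file in &files {
        text.push_str(&format!("    \"{}\",\n", file));
    }
    text.push_str("]\n");
    Ok(text)
}
```

The output is a suggestion for a maintainer to review before pasting into `recipe.toml`, not something to commit blindly.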
### 3.4 Implementation location
Patch against `src/cook/package.rs` and recipe parsing in `src/`:
- Parse `installs` field from `[package]` section
- Build registry during `repo cook --with-package-deps`
- Check for conflicts before staging
---
## Phase 4: Post-Image Validation Gates (2-3 days)
**Objective:** After the image is created, validate that init service files
match expectations and no config overrides were silently lost.
### 4.1 Init service validation script
Create `scripts/validate-init-services.sh`:
```bash
# Mount image, inspect init.d directories, validate:
# 1. Every /etc/init.d/*.service file has different content from /usr/lib/init.d/ counterpart
# (if they exist in both — if identical, the override is redundant)
# 2. No /usr/lib/init.d/*.service file was supposed to be overridden but wasn't
# 3. All scheme-type services have corresponding scheme daemons in the image
# 4. Service dependency graph has no missing dependencies
# 5. Service dependency graph has no cycles
```
Validation checks:
1. **Override verification**: For each file in `/etc/init.d/`, verify it differs
from the corresponding `/usr/lib/init.d/` file (if any). If identical, warn
about redundant override.
2. **Missing override detection**: For each config `[[files]]` entry targeting
`/etc/init.d/`, verify the file actually exists in the mounted image and
matches the config content.
3. **Scheme service audit**: List all services with `type = { scheme = "..." }`.
For each, verify the scheme binary exists in `/usr/bin/`. Warn about scheme
services that may block the scheduler if the daemon isn't guaranteed to start.
4. **Dependency cycle check**: Parse all service files, build a dependency graph,
detect cycles.
5. **Missing dependency check**: For each `requires`/`requires_weak` entry,
verify the referenced target/service file exists.
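Checks 4 and 5 are standard graph operations. The sketch below shows one way to express them, assuming the service files have already been parsed into a map from service name to its `requires`/`requires_weak` entries; `Graph`, `missing_dependencies`, and `has_cycle` are hypothetical names, and the plan's actual implementation is a shell script.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Service name -> names it depends on (assumed pre-parsed from service files).
type Graph = BTreeMap<String, Vec<String>>;

/// Returns (service, dependency) pairs where the dependency is not defined.
fn missing_dependencies(graph: &Graph) -> BTreeSet<(String, String)> {
    let mut missing = BTreeSet::new();
    for (service, deps) in graph {
        for dep in deps {
            if !graph.contains_key(dep) {
                missing.insert((service.clone(), dep.clone()));
            }
        }
    }
    missing
}

#[derive(Clone, Copy, PartialEq)]
enum Mark {
    InProgress, // on the current DFS path
    Done,       // fully explored, known cycle-free
}

/// Depth-first visit; returns true if a cycle is reachable from `node`.
fn visit(node: &str, graph: &Graph, marks: &mut BTreeMap<String, Mark>) -> bool {
    match marks.get(node) {
        Some(Mark::InProgress) => return true, // back edge: cycle found
        Some(Mark::Done) => return false,
        None => {}
    }
    marks.insert(node.to_string(), Mark::InProgress);
    if let Some(deps) = graph.get(node) {
        for dep in deps {
            if visit(dep, graph, marks) {
                return true;
            }
        }
    }
    marks.insert(node.to_string(), Mark::Done);
    false
}

/// True if the dependency graph contains at least one cycle.
fn has_cycle(graph: &Graph) -> bool {
    let mut marks = BTreeMap::new();
    graph.keys().any(|node| visit(node, graph, &mut marks))
}
```

Undefined dependencies are treated as leaf nodes during the cycle search, so the two checks report independently.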
### 4.2 Makefile integration
Add to `mk/disk.mk`:
```makefile
# Validate init services in the built image
.PHONY: validate-init validate
validate-init: $(BUILD)/harddrive.img
	@scripts/validate-init-services.sh $(BUILD)/harddrive.img

# Full validation gate
validate: validate-init
	@echo "Build validation passed"
```
### 4.3 CI integration
No `.gitlab-ci.yml` exists in the repository yet. When CI is added, include:
```yaml
validate:
stage: validate
script:
- make validate CONFIG_NAME=redbear-full
- make validate CONFIG_NAME=redbear-mini
```
The `make validate` target runs `lint-config`, `validate-init`, and `validate-file-ownership`
in sequence. It requires a built image (`harddrive.img`) to exist.
---
## Phase 5: Architectural Documentation (1 day)
**Objective:** Document the file ownership hierarchy, installer ordering, and
init system override mechanism so future contributors understand the constraints.
### 5.1 Update AGENTS.md (project root)
Add a section "Installer File Layering" covering:
1. **Layer ordering during `install_dir()`:**
   ```
   Layer 1: Config pre-install [[files]] (postinstall = false)
   Layer 2: Package staging (install_packages())
   Layer 3: Config post-install [[files]] (postinstall = true)
   Layer 4: User/group creation (passwd, shadow, group)
   ```
2. **Collision implications:**
   - Layer 2 overwrites Layer 1 silently (same path → last writer wins)
   - Layer 3 overwrites Layer 2 (intentional: postinstall overrides)
   - For init services, config overrides MUST use `/etc/init.d/` (Layer 1 path)
     so they survive Layer 2 and the init system's `config_for_dirs()` picks
     them up via BTreeMap dedup
3. **Init system override mechanism:**
   - `config_for_dirs(["/usr/lib/init.d", "/etc/init.d"])` → BTreeMap
   - Same filename: `/etc/init.d/` entry overwrites `/usr/lib/init.d/` entry
   - This is the intended override path: packages own `/usr/lib/init.d/`,
     configs own `/etc/init.d/`
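The override mechanism can be modeled in a few lines. This is a hypothetical stand-in, not the init system's code: the real `config_for_dirs()` signature is not shown in this plan, and `resolve_service_files` is an illustrative name. It relies only on the documented behavior of `BTreeMap::insert`, which replaces the value stored under an existing key.

```rust
use std::collections::BTreeMap;

/// Merges service directories in order, keyed by file name.
/// Each input item is (directory, file names in that directory).
/// Directories listed later override earlier ones for the same file name.
fn resolve_service_files(dirs: &[(&str, &[&str])]) -> BTreeMap<String, String> {
    let mut resolved = BTreeMap::new();
    for (dir, names) in dirs {
        for name in names.iter() {
            // insert() replaces any earlier entry for this file name,
            // which is the "BTreeMap dedup" the plan refers to.
            resolved.insert(name.to_string(), format!("{}/{}", dir, name));
        }
    }
    resolved
}
```

With `/usr/lib/init.d` listed first and `/etc/init.d` second, a service file present in both resolves to the `/etc/init.d/` copy, while files present only in `/usr/lib/init.d` are kept unchanged.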
### 5.2 Update BUILD-SYSTEM-INVARIANTS.md
Add new invariants:
> **Invariant I1: Init Service Path Separation**
>
> Config `[[files]]` entries that create or override init service files MUST use
> `/etc/init.d/` paths. Package-owned service files go in `/usr/lib/init.d/`.
> Without the Phase 2 collision tracker, the installer does not detect file
> collisions between layers, so this path separation is what protects overrides.
> **Invariant I2: Config Override Survival**
>
> Any file created by config `[[files]]` that must survive package installation
> MUST use a path that packages do not install to. The init system's
> `config_for_dirs()` mechanism provides this for init services via the
> `/etc/init.d/` override directory.
> **Invariant I3: Post-Install is the Override Layer**
>
> `[[files]]` entries with `postinstall = true` run AFTER package installation
> and are guaranteed to overwrite any package-provided file. Use this for files
> that must always reflect the config's content regardless of package content.
> Prefer `/etc/` directory overrides over postinstall for init services, because
> postinstall requires all overrides to be explicitly marked and is easy to miss.
### 5.3 Update local/AGENTS.md
Add a "Build System Safety" section referencing this plan and the invariants.
---
## Implementation Order
| Phase | Duration | Dependencies | Risk | Value |
|-------|----------|-------------|------|-------|
| Phase 1 | 1-2 days | None | Low | Prevents recurrence immediately |
| Phase 5 | 1 day | None | Low | Knowledge preservation |
| Phase 2 | 2-3 days | Phase 1 | Medium | Catches future collisions |
| Phase 4 | 2-3 days | Phase 1 | Medium | Validates built images |
| Phase 3 | 3-5 days | Phase 2 | Higher | Full ownership tracking |
**Recommended execution order:** Phase 1 → Phase 5 → Phase 2 → Phase 4 → Phase 3
Phases 1 and 5 are linting and documentation: low risk, immediate value.
Phase 2 is the core installer improvement. Phase 4 adds validation on top.
Phase 3 is the most ambitious and can be deferred.
---
## Quick Wins (Do First)
These can be done immediately without touching installer or cookbook code:
1. **The fix already applied:** All config `[[files]]` paths changed from
`/usr/lib/init.d/` to `/etc/init.d/` — verified working (40 services,
D-Bus operational).
2. **Add lint script** (Phase 1.1): ~30 minutes of work.
3. **Update AGENTS.md** (Phase 5.1): ~1 hour of documentation.
4. **Update BUILD-SYSTEM-INVARIANTS.md** (Phase 5.2): ~30 minutes.
---
## File Change Summary
| File | Change | Phase |
|------|--------|-------|
| `scripts/lint-config-paths.sh` | New — lint for /usr/lib/init.d/ in config files | 1 |
| `mk/depends.mk` | Add lint-config target | 1 |
| `AGENTS.md` | Add installer file layering section | 5 |
| `local/docs/BUILD-SYSTEM-INVARIANTS.md` | Add invariants I1-I3 | 5 |
| `local/patches/installer/collision-detection.patch` | New — installer collision detection | 2 |
| `recipes/core/installer/recipe.toml` | Wire collision detection patch | 2 |
| `scripts/validate-init-services.sh` | New — post-image init validation | 4 |
| `mk/disk.mk` | Add validate-init target | 4 |
| `src/cook/package.rs` | Parse installs field from recipe.toml | 3 |
| `src/recipe.rs` (or equivalent) | Add installs field to recipe struct | 3 |
---
## Scope Boundaries
**In scope:**
- Init service file path enforcement and collision detection
- Installer file-layer collision detection
- Post-image validation for init services
- Recipe file-ownership manifests (optional field)
- Architectural documentation
**Out of scope:**
- Init system redesign (scheduler, service types, dependency resolution)
- Package manager changes (pkgar format, dependency resolution)
- Build system Makefile restructuring
- Runtime validation of service startup order
- General file-conflict detection across all filesystem paths
(init service paths are the critical path; general detection is Phase 3)
---
## Relationship to Existing Plans
- **BUILD-SYSTEM-INVARIANTS.md**: This plan adds invariants I1-I3 to the existing
surface-ownership model. Phases 1-4 implement enforcement of these new invariants.
- **PATCH-GOVERNANCE.md**: Unchanged. Patch governance covers source-tree durability;
this plan covers installer file-layer collisions — orthogonal concerns.
- **CONSOLE-TO-KDE-DESKTOP-PLAN.md**: This plan is infrastructure, not a desktop
feature. It prevents build-system regressions that could block the desktop path.
- **DBUS-INTEGRATION-PLAN.md**: The triggering incident was a D-Bus regression caused
by init service file collisions. This plan prevents recurrence of the root cause.