diff --git a/local/docs/BUILD-SYSTEM-IMPROVEMENTS.md b/local/docs/BUILD-SYSTEM-IMPROVEMENTS.md
index d8e3243895..c06e10b96f 100644
--- a/local/docs/BUILD-SYSTEM-IMPROVEMENTS.md
+++ b/local/docs/BUILD-SYSTEM-IMPROVEMENTS.md
@@ -492,3 +492,124 @@ Eliminates the "delete and pray" pattern.
 
 Recommended order for the remaining 1: #7A.
 
+---
+
+## Addendum — Build-system observations from the ps2d / inputd diagnosis session (2026-06-30)
+
+While fixing the input-stack observability gap (commit `de9d1f4` in the
+`local/sources/base/` inner repo), four small build-system ergonomics issues
+were observed. Each is S-sized and could be picked up in any future hardening
+session. None are blockers; all four cost time the next time someone edits a
+local-fork source tree.
+
+### 11. Local-fork inner-repo remote URL points to upstream Redox (S, ~10 min)
+
+**Problem.** `local/sources/base/` is a nested git repo (the
+local-fork model) with `origin = https://gitlab.redox-os.org/redox-os/base.git`.
+Red Bear's own base fork is at `https://gitea.redbearos.org/vasilito/redbear-os-base`.
+A Red Bear developer who commits inside `local/sources/base/` and runs
+`git push origin master` would push Red Bear fork commits **to upstream
+Redox**, where they will be rejected (or worse, silently fail).
+
+**Current behavior.** Most Red Bear base commits are made by the
+`Red Bear OS ` author bot during automated syncs, which
+push to a different fork URL configured out-of-band. The inner-repo
+`origin` is therefore orphaned from normal operator workflows.
+
+**Proposal.** Set the inner `local/sources/base/` origin to Red Bear's gitea
+URL via `local/scripts/sync-fork-remotes.sh` (new file). Same treatment for
+`local/sources/{relibc,kernel,bootloader,installer,redoxfs,userutils}` if any
+of them have the same issue.
+
+**Expected gain.** Eliminates a footgun. Operators can commit + push from
+inside the local fork and reach the right remote.
+
+**Risk.** Low — purely a remote URL change, no history rewrite.
+
+### 12. Outer repo cannot show inline diffs for `local/sources/base/` (S, ~30 min)
+
+**Problem.** `local/sources/base/` is a nested git repo (not a real submodule
+— the outer Red Bear repo has no `.gitmodules` entry for it). The outer
+repo sees file changes only as `Submodule local/sources/base contains modified
+content`. `git diff -- local/sources/base/drivers/input/ps2d/src/main.rs`
+shows nothing useful; even `git diff --submodule=log` shows only the commit hash
+delta, not the actual line changes.
+
+This makes PR review of local-fork changes harder than necessary — the
+reviewer must `cd local/sources/base && git diff` to see what actually changed.
+
+**Proposal.** Either:
+
+- (a) Register `local/sources/base/` (and other inner repos) as proper
+  git submodules via `.gitmodules` + `git submodule absorbgitdirs`. Lets
+  outer-repo `git diff --submodule=diff` show the changes inline.
+- (b) Add a wrapper script `local/scripts/show-fork-diffs.sh` that
+  recursively runs `git diff` inside each `local/sources//`
+  inner repo and presents the result with the outer-repo diff.
+
+**Expected gain.** PR review of local-fork changes becomes trivial.
+
+**Risk.** Low for (b); medium for (a) — touching `.gitmodules` and the
+submodule pointer requires care.
+
+### 13. No "stale local-fork source" preflight check (S, ~2 hours)
+
+**Problem.** `build-redbear.sh` has a stale-prefix reminder that fires
+*after* the build, but no equivalent check for stale local-fork sources.
+When the operator edits `local/sources/base/...` and runs `make all`, the
+cookbook uses path-based source so it *does* detect the change — but the
+operator gets no warning that the build will take a long time because of
+their edit. In the 2026-06-30 session, a 4-line edit to two `local/sources/base/`
+files caused a 30+ minute rebuild of base (cargo rebuild of all 27
+sub-crates) with no warning that the rebuild scope would be that wide.
+
+**Proposal.** Extend the preflight in `local/scripts/build-redbear.sh` to:
+
+1. Compare the mtime of every `local/sources//**/*.rs` against
+   the corresponding `repo/x86_64-unknown-redox/.pkgar` mtime.
+2. Print a `>>> WARNING: source is newer than its pkgar —
+   rebuilding` message before the build starts.
+3. Print an `>>> ESTIMATED TIME: minutes based on history` line.
+
+**Expected gain.** Operators avoid 30-minute surprise rebuilds and can
+defer edits to a low-cost window.
+
+**Risk.** None — purely additive diagnostic.
+
+### 14. Bootloader streaming under `-nographic` + OVMF is unusably slow (S, ~1 hour)
+
+**Problem.** When booting `build/x86_64/redbear-mini.iso` under
+`qemu-system-x86_64 -nographic -serial mon:stdio` with OVMF, the Redox
+bootloader streams the live ISO one MiB at a time over serial, taking more
+than 6 minutes to reach the kernel. The user's reference log shows the same
+ISO booting in seconds on a real KVM-accelerated host, so this is a
+QEMU + `-nographic` interaction, not a bootloader bug.
+
+This makes post-fix QEMU verification of any change impractical inside
+a normal session timeout.
+
+**Proposal.** Two complementary fixes:
+
+- (a) Document in `local/scripts/test-redbear-full-qemu.sh` and
+  `local/scripts/test-live-mini-uefi.sh` that for time-budgeted boot
+  verification, use `qemu-system-x86_64 -machine pc,accel=kvm -cpu host
+  -hda build/x86_64/redbear-mini.img -nographic -serial mon:stdio`
+  (raw `.img`, BIOS boot) instead of the OVMF + ISO path. The BIOS path
+  skips the live-mode streaming entirely.
+- (b) Add a `QEMU_BOOT_MODE` flag to the test launcher, default to BIOS
+  for fast verification, with `--uefi` opt-in for OVMF.
+
+**Expected gain.** Post-fix QEMU verifications fit in a 60–90 second
+budget. Critical for the ps2d/inputd-style small-fix → verify cycle.
+
+**Risk.** None for (a); low for (b).
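The mtime comparison proposed in item 13 could be prototyped outside the shell script first. The sketch below is a minimal Python illustration, not the `build-redbear.sh` implementation: `stale_sources` and `preflight_warning` are hypothetical names, the warning text is illustrative, and the caller is assumed to supply both the fork's source directory and the path of the matching `.pkgar`.

```python
from pathlib import Path
from typing import List, Optional


def stale_sources(source_dir: Path, pkgar: Path) -> List[Path]:
    """Return the .rs files under source_dir modified after pkgar was written."""
    sources = sorted(source_dir.rglob("*.rs"))
    if not pkgar.exists():
        # No package has been built yet, so every source file counts as stale.
        return sources
    built_at = pkgar.stat().st_mtime
    return [path for path in sources if path.stat().st_mtime > built_at]


def preflight_warning(name: str, source_dir: Path, pkgar: Path) -> Optional[str]:
    """Build a one-line warning for the operator, or None if nothing is stale."""
    stale = stale_sources(source_dir, pkgar)
    if not stale:
        return None
    # Illustrative wording only; the real message format is up to the script.
    return (
        f">>> WARNING: {name}: {len(stale)} source file(s) newer than "
        f"{pkgar.name}; a rebuild will run"
    )
```

A caller would run `preflight_warning` once per local fork before starting the build and print any non-`None` result. Comparing against a single package mtime keeps the check cheap, at the cost of ignoring non-Rust inputs such as `Cargo.toml`.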
+
+### Summary of addendum
+
+| # | Title | Size | Risk | Status |
+|---|-------|------|------|--------|
+| 11 | Fix inner-fork remote URLs | S | Low | open |
+| 12 | Surface inner-fork diffs to outer repo | S | Low–M | open |
+| 13 | Preflight stale-local-fork-source warning | S | None | open |
+| 14 | Fast-QEMU boot mode for verification | S | None–Low | open |
+
diff --git a/local/docs/boot-logs/REDBEAR-MINI-BOOT-PS2D-INPUTD-LOG-FIX.md b/local/docs/boot-logs/REDBEAR-MINI-BOOT-PS2D-INPUTD-LOG-FIX.md
new file mode 100644
index 0000000000..bb4209ed6d
--- /dev/null
+++ b/local/docs/boot-logs/REDBEAR-MINI-BOOT-PS2D-INPUTD-LOG-FIX.md
@@ -0,0 +1,110 @@
+# Red Bear OS — QEMU mini boot: ps2d / inputd startup-log diagnosis
+
+**Date**: 2026-06-30
+**Test target**: `redbear-mini`
+**Test launcher**: ad-hoc QEMU (`-machine pc -cpu max -smp 8 -m 12288 -nographic -serial mon:stdio`)
+**Captured log**: original evidence was the boot sequence the operator pasted in chat
+on 2026-06-29 (no separate file). This doc captures the diagnostic conclusions and
+the fix.
+
+## 1. Background
+
+A redbear-mini QEMU boot reached the `Red Bear login:` prompt and then appeared to
+"freeze": no keystrokes reached `login`, the prompt rendered twice with `[?1000l[?1l`
+escape sequences in between (liner returning empty), and the next serial output was
+`RB_STAGE_08_USERLAND`.
+
+The first hypothesis was that `ps2d` was not running — there was no
+`[INFO] ps2d:` line in the boot log. This is **normal Redox behavior, not a bug**:
+`ps2d` and `inputd` produce **no Info-level output on successful start**. They
+have zero `log::info!()` calls on the success path; the only stdout is `.expect()`
+panic messages on the failure path. Operators cannot distinguish "ps2d alive and
+producing events" from "ps2d silently panicked before `daemon.ready()`" from the
+boot log alone.
+
+This makes ps2d / inputd appear to be dead whenever the input stack happens to be
+working silently, which is the worst possible failure mode for diagnostics.
+
+## 2. Root cause (and what was NOT broken)
+
+The boot reached `Red Bear login:`. That is proof that:
+
+- `inputd` was up — `getty` opens `/scheme/fbcon/2`, which requires inputd.
+- `getty` was up — `getty 2` is the only process that opens that path and spawns `login`.
+- `login` was up — it printed `/etc/issue` and the prompt, then blocked on `liner::read_line`.
+- The PTY was up — `ptyd` (in `00_base.target`) creates the master fd `getty` bridges.
+
+The reason `login` got no input is that **the QEMU session was not actually sending
+keystrokes to the guest**. This is a test-harness issue, not an OS bug. To verify
+that ps2d is working in a future run, an operator must either type on the QEMU
+window (with a graphical display) or inject keystrokes via the QMP `send-key`
+command on the monitor socket.
+
+## 3. Fix — add startup info logs
+
+Two minimal diagnostic `log::info!()` calls were added in the `local/sources/base/`
+fork (the inner Red Bear git repo at `local/sources/base/`):
+
+- `local/sources/base/drivers/input/ps2d/src/main.rs` — after `daemon.ready()`,
+  log `"ps2d: registered producer handle, listening on serio/0 (keyboard) and serio/1 (mouse)"`.
+- `local/sources/base/drivers/inputd/src/main.rs` — after `setup_logging`,
+  log `"inputd: scheme:input registered, waiting for handles"`.
+
+The diff is 6 insertions across 2 files. No behavior change. No new error paths.
+
+The new `log::info!()` lines are emitted **only on the successful startup path**.
+Existing `log::error!()` and `log::warn!()` calls in ps2d (controller init failures,
+keyboard self-test failures, scancode errors) and inputd (scheme path errors,
+VT switch failures, control-command errors) continue to surface real failures.
+
+## 4. How an operator verifies the input stack is alive
+
+After this fix, a healthy redbear-mini boot on QEMU shows both lines in the boot
+log (during the initfs phase):
+
+```
+2026-06-30T...Z [@ps2d: INFO] ps2d: registered producer handle, listening on serio/0 (keyboard) and serio/1 (mouse)
+2026-06-30T...Z [@inputd: INFO] inputd: scheme:input registered, waiting for handles
+```
+
+If either line is missing after this commit, that daemon is dead. Check the panic
+output (`.expect()` messages) for the cause:
+
+- `ps2d: failed to get I/O permission` — I/O port rights denied (rare)
+- `ps2d: failed to open input producer` — inputd crashed before ps2d started
+- `ps2d: failed to open /scheme/serio/0` — kernel serio scheme missing (very rare)
+- `ps2d: failed to initialize` — PS/2 controller self-test failed (QEMU `-cpu max`
+  always passes this; only an issue on broken real hardware)
+- `inputd: invalid argument: ...` — bad CLI arg to one-shot `inputd -A 2` (config bug)
+
+## 5. Verified by source inspection (no clean post-fix QEMU capture)
+
+The post-fix ISO was rebuilt successfully (`build-redbear.sh redbear-mini`,
+exit 0, 512 MB ISO at `build/x86_64/redbear-mini.iso` produced at 2026-06-30 02:31).
+
+Source verification:
+- `local/sources/base/drivers/input/ps2d/src/main.rs` line 96: `log::info!(...)`
+- `local/sources/base/drivers/inputd/src/main.rs` line 661: `log::info!(...)`
+- `recipes/core/base/source/drivers/input/ps2d/src/main.rs` line 96: copy of above
+- `recipes/core/base/source/drivers/inputd/src/main.rs` line 661: copy of above
+
+**QEMU verification pending**: a clean post-fix QEMU boot with the new lines visible
+was not captured in this session — the OVMF+`-nographic` boot path took >6 minutes
+to stream the live ISO under TCG, which exceeds the session timeout. A future run
+should use `accel=kvm` with the raw `redbear-mini.img` to bypass the slow bootloader
+streaming, then grep the captured serial for the two new `INFO` lines.
+
+## 6. Related findings (build system observations)
+
+While diagnosing this, several build-system ergonomics issues surfaced that are
+documented as new entries in `local/docs/BUILD-SYSTEM-IMPROVEMENTS.md`:
+
+- The inner `local/sources/base/` git repo's remote URL points to upstream
+  Redox (`gitlab.redox-os.org/redox-os/base.git`) instead of Red Bear's
+  gitea. Changes committed inside it cannot be pushed to the right place.
+- `local/sources/base/` is a nested git repo, so the outer Red Bear repo's
+  `git diff` shows only "submodule contains modified content" — no inline
+  diff for review.
+- `build-redbear.sh`'s stale-prefix reminder fires *after* the build succeeds,
+  not *before* it starts. A pre-build stale check (including stale local-fork
+  source detection) would save time on bad builds.
\ No newline at end of file
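The grep step suggested in section 5 can be expressed as a small check over a captured serial log. This is a sketch under assumptions: the name `missing_daemons` is hypothetical, and the marker strings are the message prefixes quoted in section 4, matched as plain substrings so that the timestamp and the `[@ps2d: INFO]` prefix format do not matter.

```python
from typing import List

# Message prefixes taken from the two startup lines quoted in section 4.
EXPECTED_MARKERS = {
    "ps2d": "ps2d: registered producer handle",
    "inputd": "inputd: scheme:input registered",
}


def missing_daemons(serial_log: str) -> List[str]:
    """Return the daemons whose startup line does not appear in the log text."""
    return [
        name
        for name, marker in EXPECTED_MARKERS.items()
        if marker not in serial_log
    ]
```

An empty result means both startup lines were seen. For a capture saved to a file (the name `serial.log` is only an example), the call would be `missing_daemons(Path("serial.log").read_text(errors="replace"))`, with `Path` imported from `pathlib`; `errors="replace"` guards against stray bytes in raw serial output.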