docs: capture ps2d/inputd boot-log diagnosis + addendum to BUILD-SYSTEM-IMPROVEMENTS

Two documentation changes:

1. New file local/docs/boot-logs/REDBEAR-MINI-BOOT-PS2D-INPUTD-LOG-FIX.md
   captures the 2026-06-30 diagnosis of why the mini boot appeared to
   freeze at the login prompt. Records:
   - The actual root cause (test harness not injecting keystrokes, not
     an OS bug — ps2d/inputd were working silently).
   - The committed fix (de9d1f4 in local/sources/base/ adds two
     log::info!() startup messages so operators can verify the input
     stack is alive from the boot log).
   - The expected post-fix boot log lines and how to interpret them.
   - Verification status (source-inspected; clean post-fix QEMU boot
     pending due to slow bootloader streaming under -nographic).

2. Addendum appended to local/docs/BUILD-SYSTEM-IMPROVEMENTS.md
   documenting four build-system ergonomics issues observed during
   the diagnosis session:
   - #11: local/sources/base/ inner git repo origin points to
     upstream Redox instead of Red Bear gitea.
   - #12: outer Red Bear repo cannot show inline diffs for the
     nested local/sources/base/ git repo (submodule pointer dirty).
   - #13: no preflight warning for stale local-fork source (a
     4-line edit caused a 30+ min rebuild with no advance notice).
   - #14: -nographic + OVMF boot is too slow for time-budgeted
     post-fix QEMU verification; recommend BIOS + KVM path.

All four issues are S-sized and could be picked up in any future hardening
session. No code changes in this commit.
2026-06-30 02:55:51 +03:00
parent 49363122d1
commit 41045fd2f1
2 changed files with 231 additions and 0 deletions
@@ -492,3 +492,124 @@ Eliminates the "delete and pray" pattern.
Recommended order for the remaining 1: #7A.
---
## Addendum — Build-system observations from the ps2d / inputd diagnosis session (2026-06-30)
While fixing the input-stack observability gap (commit `de9d1f4` in the
`local/sources/base/` inner repo), four small build-system ergonomics issues
were observed. Each is S-sized and could be picked up in any future hardening
session. None are blockers; all four cost time the next time someone edits a
local-fork source tree.
### 11. Local-fork inner-repo remote URL points to upstream Redox (S, ~10 min)
**Problem.** `local/sources/base/` is a nested git repo (the
local-fork model) with `origin = https://gitlab.redox-os.org/redox-os/base.git`.
Red Bear's own base fork is at `https://gitea.redbearos.org/vasilito/redbear-os-base`.
A Red Bear developer who commits inside `local/sources/base/` and runs
`git push origin master` would push Red Bear fork commits **to upstream
Redox**, where they would be rejected (or worse, fail silently).
**Current behavior.** Most Red Bear base commits are made by the
`Red Bear OS <build@redbearos.org>` author bot during automated syncs, which
push to a different fork URL configured out-of-band. The inner-repo
`origin` is therefore orphaned from normal operator workflows.
**Proposal.** Set the inner `local/sources/base/` origin to Red Bear's gitea
URL via `local/scripts/sync-fork-remotes.sh` (new file). Same treatment for
`local/sources/{relibc,kernel,bootloader,installer,redoxfs,userutils}` if any
of them have the same issue.
**Expected gain.** Eliminates a footgun. Operators can commit + push from
inside the local fork and reach the right remote.
**Risk.** Low — purely a remote URL change, no history rewrite.
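A minimal sketch of what the proposed `local/scripts/sync-fork-remotes.sh` could contain. The function names `expected_origin` and `sync_fork_remote` are hypothetical, and only the `base` URL is taken from this document; URLs for the other components are not known here and are deliberately left out.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not an existing Red Bear script.

# Print the expected origin URL for a component; fail if it is not known.
# ASSUMPTION: only the base fork URL is documented, so only base is listed.
expected_origin() {
    case "$1" in
        base) echo "https://gitea.redbearos.org/vasilito/redbear-os-base" ;;
        *)    return 1 ;;
    esac
}

# Point <sources_dir>/<component>'s origin at the expected URL if it differs.
sync_fork_remote() {
    local sources_dir="$1" component="$2"
    local repo="$sources_dir/$component"
    local want have
    want="$(expected_origin "$component")" || { echo "skip $component: no known URL"; return 0; }
    [ -e "$repo/.git" ] || { echo "skip $component: not a git repo"; return 0; }
    # Empty when no origin is configured yet.
    have="$(git -C "$repo" remote get-url origin 2>/dev/null || true)"
    if [ "$have" = "$want" ]; then
        echo "ok $component"
    elif [ -z "$have" ]; then
        git -C "$repo" remote add origin "$want"
        echo "added $component -> $want"
    else
        git -C "$repo" remote set-url origin "$want"
        echo "fixed $component: $have -> $want"
    fi
}
```

Calling `sync_fork_remote local/sources base` from the outer repo root would rewrite only the remote URL; it does not fetch, push, or touch history, which matches the low risk noted above.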
### 12. Outer repo cannot show inline diffs for `local/sources/base/` (S, ~30 min)
**Problem.** `local/sources/base/` is a nested git repo (not a real submodule
— the outer Red Bear repo has no `.gitmodules` entry for it). The outer
repo sees file changes only as `Submodule local/sources/base contains modified
content`. `git diff -- local/sources/base/drivers/input/ps2d/src/main.rs`
shows nothing useful, and even `git diff --submodule=log` shows only the commit
hash delta, not the actual line changes.
This makes PR review of local-fork changes harder than necessary — the
reviewer must `cd local/sources/base && git diff` to see what actually changed.
**Proposal.** Either:
- (a) Register `local/sources/base/` (and other inner repos) as proper
git submodules via `.gitmodules` + `git submodule absorbgitdirs`. This lets
outer-repo `git diff --submodule=diff` show the committed changes inline.
- (b) Add a wrapper script `local/scripts/show-fork-diffs.sh` that
recursively runs `git diff` inside each `local/sources/<component>/`
inner repo and presents the result with the outer-repo diff.
**Expected gain.** PR review of local-fork changes becomes trivial.
**Risk.** Low for (b); medium for (a) — touching `.gitmodules` and the
submodule pointer requires care.
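Option (b) could look roughly like the sketch below. The function name `show_fork_diffs` is hypothetical, and the sketch assumes every inner repo sits directly under `local/sources/` and has at least one commit. It prints only uncommitted changes relative to each inner repo's `HEAD`.

```shell
#!/usr/bin/env bash
# Sketch only: a possible core for local/scripts/show-fork-diffs.sh.

# Print the working-tree diff (against HEAD) of every inner git repo
# found directly under the given sources directory.
show_fork_diffs() {
    local sources_dir="${1:-local/sources}"
    local repo
    for repo in "$sources_dir"/*/; do
        repo="${repo%/}"
        # Skip plain directories and repos that have no commit yet.
        [ -e "$repo/.git" ] || continue
        git -C "$repo" rev-parse --verify -q HEAD >/dev/null || continue
        # `git diff --quiet` exits non-zero when tracked files differ from HEAD.
        if ! git -C "$repo" diff --quiet HEAD; then
            echo "=== $repo ==="
            git -C "$repo" --no-pager diff HEAD
        fi
    done
}
```

A reviewer would run `git diff` in the outer repo and then `show_fork_diffs` to see the inner changes in one terminal session, without changing directory into each fork.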
### 13. No "stale local-fork source" preflight check (S, ~2 hours)
**Problem.** `build-redbear.sh` has a stale-prefix reminder that fires
*after* the build, but no equivalent check for stale local-fork sources.
When the operator edits `local/sources/base/...` and runs `make all`, the
cookbook uses path-based source so it *does* detect the change — but the
operator gets no warning that the build will take a long time because of
their edit. In the 2026-06-30 session, a 4-line edit to two `local/sources/base/`
files caused a 30+ minute rebuild of base (cargo rebuild of all 27
sub-crates) with no warning that the rebuild scope would be that wide.
**Proposal.** Extend the preflight in `local/scripts/build-redbear.sh` to:
1. Compare the mtime of every `local/sources/<component>/**/*.rs` against
the corresponding `repo/x86_64-unknown-redox/<component>.pkgar` mtime.
2. Print a `>>> WARNING: <component> source is newer than its pkgar —
rebuilding` message before the build starts.
3. Print an `>>> ESTIMATED TIME: <N> minutes based on history` line.
**Expected gain.** Operators avoid 30-minute surprise rebuilds and can
defer edits to a low-cost window.
**Risk.** None — purely additive diagnostic.
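Steps 1 and 2 of the proposal can be sketched as a single mtime comparison. The function name `warn_if_stale` is hypothetical, and the sketch relies on `find -newer` and `-quit` as provided by GNU and BSD `find`. Step 3 (the time estimate) is omitted because the format of any build-history data is not described here.

```shell
#!/usr/bin/env bash
# Sketch only: a possible preflight helper for local/scripts/build-redbear.sh.

# Warn when any .rs file under a component's source tree is newer than its pkgar.
# Returns 0 when stale (warning printed), 1 when up to date or the pkgar is missing.
warn_if_stale() {
    local src_dir="$1" pkgar="$2" component="$3"
    [ -f "$pkgar" ] || return 1
    # -newer compares modification times; -quit stops at the first match.
    if [ -n "$(find "$src_dir" -name '*.rs' -newer "$pkgar" -print -quit)" ]; then
        echo ">>> WARNING: $component source is newer than its pkgar, rebuilding"
        return 0
    fi
    return 1
}
```

For the case in this session the call would be `warn_if_stale local/sources/base repo/x86_64-unknown-redox/base.pkgar base`, run before the build starts so the operator can still abort.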
### 14. Bootloader streaming under `-nographic` + OVMF is unusably slow (S, ~1 hour)
**Problem.** When booting `build/x86_64/redbear-mini.iso` under
`qemu-system-x86_64 -nographic -serial mon:stdio` with OVMF, the Redox
bootloader streams the live ISO one MiB at a time over serial, taking more
than 6 minutes to reach the kernel. The user's reference log shows the same
ISO booting in seconds on a real KVM-accelerated host, so this is a
QEMU + `-nographic` interaction, not a bootloader bug.
This makes post-fix QEMU verification of any change impractical inside
a normal session timeout.
**Proposal.** Two complementary fixes:
- (a) Document in `local/scripts/test-redbear-full-qemu.sh` and
`local/scripts/test-live-mini-uefi.sh` that for time-budgeted boot
verification, use `qemu-system-x86_64 -machine pc,accel=kvm -cpu host
-hda build/x86_64/redbear-mini.img -nographic -serial mon:stdio`
(raw `.img`, BIOS boot) instead of the OVMF + ISO path. The BIOS path
skips the live-mode streaming entirely.
- (b) Add `QEMU_BOOT_MODE` flag to the test launcher, default to BIOS
for fast verification, with `--uefi` opt-in for OVMF.
**Expected gain.** Post-fix QEMU verifications fit in a 60 to 90 second
budget. Critical for the ps2d/inputd-style small-fix → verify cycle.
**Risk.** None for (a); low for (b).
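One way the `QEMU_BOOT_MODE` switch from (b) might be wired up is sketched below. The function name `qemu_boot_args` is hypothetical. The BIOS arguments are the ones quoted in (a); the UEFI branch reuses the ad-hoc launcher flags from the diagnosis session, and the `OVMF_CODE` firmware path is an assumption that varies by distribution.

```shell
#!/usr/bin/env bash
# Sketch only: argument selection for a test launcher, one argument per line.

qemu_boot_args() {
    local mode="${QEMU_BOOT_MODE:-bios}"
    # --uefi on the command line overrides the environment default.
    [ "${1:-}" = "--uefi" ] && mode="uefi"
    case "$mode" in
        bios)
            # Raw .img over BIOS: skips the live-ISO streaming entirely.
            printf '%s\n' -machine pc,accel=kvm -cpu host \
                -hda build/x86_64/redbear-mini.img \
                -nographic -serial mon:stdio
            ;;
        uefi)
            # ASSUMPTION: firmware path is a placeholder; set OVMF_CODE to override.
            printf '%s\n' -machine pc -cpu max \
                -bios "${OVMF_CODE:-/usr/share/OVMF/OVMF_CODE.fd}" \
                -cdrom build/x86_64/redbear-mini.iso \
                -nographic -serial mon:stdio
            ;;
        *)
            echo "unknown QEMU_BOOT_MODE: $mode" >&2
            return 2
            ;;
    esac
}
```

A launcher could collect the list with `mapfile -t args < <(qemu_boot_args "$@")` and then run `qemu-system-x86_64 "${args[@]}"`, which keeps each argument intact even if a path contains spaces.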
### Summary of addendum
| # | Title | Size | Risk | Status |
|---|-------|------|------|--------|
| 11 | Fix inner-fork remote URLs | S | Low | open |
| 12 | Surface inner-fork diffs to outer repo | S | Low to Medium | open |
| 13 | Preflight stale-local-fork-source warning | S | None | open |
| 14 | Fast-QEMU boot mode for verification | S | None | open |
@@ -0,0 +1,110 @@
# Red Bear OS — QEMU mini boot: ps2d / inputd startup-log diagnosis
**Date**: 2026-06-30
**Test target**: `redbear-mini`
**Test launcher**: ad-hoc QEMU (`-machine pc -cpu max -smp 8 -m 12288 -nographic -serial mon:stdio`)
**Captured log**: original evidence was the boot sequence the operator pasted in chat
on 2026-06-29 (no separate file). This doc captures the diagnostic conclusions and
the fix.
## 1. Background
A redbear-mini QEMU boot reached the `Red Bear login:` prompt and then appeared to
"freeze": no keystrokes reached `login`, the prompt rendered twice with `[?1000l[?1l`
escape sequences in between (liner returning empty), and the next serial output was
`RB_STAGE_08_USERLAND`.
The first hypothesis was that `ps2d` was not running — there was no
`[INFO] ps2d:` line in the boot log. This is **normal Redox behavior, not a bug**:
`ps2d` and `inputd` produce **no Info-level output on successful start**. They
have zero `log::info!()` calls on the success path; the only output is `.expect()`
panic messages on the failure path. From the boot log alone, operators cannot
distinguish "ps2d alive and producing events" from "ps2d silently panicked
before `daemon.ready()`".
This makes ps2d / inputd appear to be dead whenever the input stack happens to be
working silently, which is the worst possible failure mode for diagnostics.
## 2. Root cause (and what was NOT broken)
The boot reached `Red Bear login:`. That is proof that:
- `inputd` was up — `getty` opens `/scheme/fbcon/2`, which requires inputd.
- `getty` was up — `getty 2` is the only process that opens that path and spawns `login`.
- `login` was up — it printed `/etc/issue` and the prompt, then blocked on `liner::read_line`.
- The PTY was up — `ptyd` (in `00_base.target`) creates the master fd `getty` bridges.
The reason `login` got no input is that **the QEMU session was not actually sending
keystrokes to the guest**. This is a test-harness issue, not an OS bug. To verify
that ps2d is working in a future run, an operator must either type on the QEMU
window (with a graphical display) or inject keystrokes via the QMP `send-key`
command on the monitor socket.
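For the QMP route, the sketch below generates the JSON messages needed to type a short string followed by Enter. The function name `qmp_type_string` is hypothetical and it handles only lowercase letters and digits. QMP presses all keys listed in one `send-key` command together, so the sketch emits one command per character.

```shell
#!/usr/bin/env bash
# Sketch only: emit QMP messages that type a string and then press Enter.

qmp_type_string() {
    local text="$1" i ch
    # Reject anything outside the simple qcodes this sketch knows how to map.
    case "$text" in
        *[!abcdefghijklmnopqrstuvwxyz0123456789]*)
            echo "unsupported character in: $text" >&2
            return 1
            ;;
    esac
    # QMP requires capability negotiation before any other command.
    echo '{"execute":"qmp_capabilities"}'
    for (( i = 0; i < ${#text}; i++ )); do
        ch="${text:i:1}"
        printf '{"execute":"send-key","arguments":{"keys":[{"type":"qcode","data":"%s"}]}}\n' "$ch"
    done
    # "ret" is the qcode for the Enter key.
    printf '{"execute":"send-key","arguments":{"keys":[{"type":"qcode","data":"%s"}]}}\n' ret
}
```

Assuming QEMU was started with `-qmp unix:/tmp/qmp.sock,server,nowait` (the socket path is an arbitrary choice) and `socat` is installed, the output could be delivered with `{ qmp_type_string root; sleep 1; } | socat - UNIX-CONNECT:/tmp/qmp.sock`. Keys injected this way arrive through the emulated PS/2 keyboard, so they exercise `ps2d` rather than the serial console.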
## 3. Fix — add startup info logs
Two minimal diagnostic `log::info!()` calls were added in the inner Red Bear
git repo at `local/sources/base/`:
- `local/sources/base/drivers/input/ps2d/src/main.rs` — after `daemon.ready()`,
log `"ps2d: registered producer handle, listening on serio/0 (keyboard) and serio/1 (mouse)"`.
- `local/sources/base/drivers/inputd/src/main.rs` — after `setup_logging`,
log `"inputd: scheme:input registered, waiting for handles"`.
The diff is 6 insertions across 2 files. No behavior change. No new error paths.
The new `log::info!()` lines are emitted **only on the successful startup path**.
Existing `log::error!()` and `log::warn!()` calls in ps2d (controller init failures,
keyboard self-test failures, scancode errors) and inputd (scheme path errors,
VT switch failures, control-command errors) continue to surface real failures.
## 4. How an operator verifies the input stack is alive
After this fix, a healthy redbear-mini boot on QEMU shows both lines in the boot
log (during initfs phase):
```
2026-06-30T...Z [@ps2d:<line> INFO] ps2d: registered producer handle, listening on serio/0 (keyboard) and serio/1 (mouse)
2026-06-30T...Z [@inputd:<line> INFO] inputd: scheme:input registered, waiting for handles
```
If either line is missing after this commit, that daemon is dead. Check the panic
output (`.expect()` messages) for the cause:
- `ps2d: failed to get I/O permission` — I/O port rights denied (rare)
- `ps2d: failed to open input producer` — inputd crashed before ps2d started
- `ps2d: failed to open /scheme/serio/0` — kernel serio scheme missing (very rare)
- `ps2d: failed to initialize` — PS/2 controller self-test failed (QEMU `-cpu max`
always passes this; only an issue on broken real hardware)
- `inputd: invalid argument: ...` — bad CLI arg to one-shot `inputd -A 2` (config bug)
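The check described in this section can be automated against a captured serial log. The sketch below is one possible helper; the function name `check_input_stack` and the idea of a saved log file are assumptions, while the two marker strings are the ones added by the fix.

```shell
#!/usr/bin/env bash
# Sketch only: report whether both startup markers appear in a boot log.

# Returns 0 when both daemons logged their startup line, 1 otherwise.
check_input_stack() {
    local log="$1" missing=0 marker
    for marker in \
        "ps2d: registered producer handle" \
        "inputd: scheme:input registered"; do
        # -F matches the marker literally, so log prefixes and timestamps are ignored.
        if grep -qF -- "$marker" "$log"; then
            echo "alive:   $marker"
        else
            echo "MISSING: $marker"
            missing=1
        fi
    done
    return "$missing"
}
```

If the serial output was saved to a file such as `boot.log` (a hypothetical name), `check_input_stack boot.log` prints one line per daemon and exits non-zero when either marker is absent, which makes it usable as a pass/fail step in a test launcher.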
## 5. Verified by source inspection (no clean post-fix QEMU capture)
The post-fix ISO was rebuilt successfully (`build-redbear.sh redbear-mini`,
exit 0, 512 MB ISO at `build/x86_64/redbear-mini.iso` produced at 2026-06-30 02:31).
Source verification:
- `local/sources/base/drivers/input/ps2d/src/main.rs` line 96: `log::info!(...)`
- `local/sources/base/drivers/inputd/src/main.rs` line 661: `log::info!(...)`
- `recipes/core/base/source/drivers/input/ps2d/src/main.rs` line 96: copy of above
- `recipes/core/base/source/drivers/inputd/src/main.rs` line 661: copy of above
**QEMU verification pending**: a clean post-fix QEMU boot with the new lines visible
was not captured in this session — the OVMF+`-nographic` boot path took >6 minutes
to stream the live ISO under TCG, which exceeds the session timeout. A future run
should use `accel=kvm` with the raw `redbear-mini.img` to bypass the slow bootloader
streaming, then grep the captured serial for the two new `[INFO]` lines.
## 6. Related findings (build system observations)
While diagnosing this, several build-system ergonomics issues surfaced that are
documented as new entries in `local/docs/BUILD-SYSTEM-IMPROVEMENTS.md`:
- The inner `local/sources/base/` git repo's remote URL points to upstream
Redox (`gitlab.redox-os.org/redox-os/base.git`) instead of Red Bear's
gitea. Changes committed inside it cannot be pushed to the right place.
- `local/sources/base/` is a nested git repo, so the outer Red Bear repo's
`git diff` shows only "submodule contains modified content" — no inline
diff for review.
- `build-redbear.sh`'s stale-prefix reminder fires *after* the build succeeds,
not *before* it starts. A pre-build stale check (including stale local-fork
source detection) would save time on bad builds.