docs: capture ps2d/inputd boot-log diagnosis + addendum to BUILD-SYSTEM-IMPROVEMENTS

Two documentation changes:

1. New file local/docs/boot-logs/REDBEAR-MINI-BOOT-PS2D-INPUTD-LOG-FIX.md
   captures the 2026-06-30 diagnosis of why the mini boot appeared to
   freeze at the login prompt. Records:
   - The actual root cause (the test harness was not injecting
     keystrokes; this was not an OS bug, and ps2d/inputd were working
     silently).
   - The committed fix (de9d1f4 in local/sources/base/ adds two
     log::info!() startup messages so operators can verify the input
     stack is alive from the boot log).
   - The expected post-fix boot log lines and how to interpret them.
   - Verification status (source-inspected; clean post-fix QEMU boot
     pending due to slow bootloader streaming under -nographic).

2. Addendum appended to local/docs/BUILD-SYSTEM-IMPROVEMENTS.md
   documenting four build-system ergonomics issues observed during
   the diagnosis session:
   - #11: local/sources/base/ inner git repo origin points to
     upstream Redox instead of Red Bear gitea.
   - #12: outer Red Bear repo cannot show inline diffs for the
     nested local/sources/base/ git repo (submodule pointer dirty).
   - #13: no preflight warning for stale local-fork source (a
     4-line edit caused a 30+ min rebuild with no advance notice).
   - #14: -nographic + OVMF boot is too slow for time-budgeted
     post-fix QEMU verification; recommend BIOS + KVM path.

All four issues are S-sized and could be picked up in any future
hardening session. No code changes in this commit.
@@ -492,3 +492,124 @@ Eliminates the "delete and pray" pattern.

Recommended order for the remaining 1: #7A.

---

## Addendum — Build-system observations from the ps2d / inputd diagnosis session (2026-06-30)

While fixing the input-stack observability gap (commit `de9d1f4` in the
`local/sources/base/` inner repo), four small build-system ergonomics issues
were observed. Each is S-sized and could be picked up in any future hardening
session. None are blockers; all four cost time the next time someone edits a
local-fork source tree.

### 11. Local-fork inner-repo remote URL points to upstream Redox (S, ~10 min)

**Problem.** `local/sources/base/` is a nested git repo (the
local-fork model) with `origin = https://gitlab.redox-os.org/redox-os/base.git`.
Red Bear's own base fork is at `https://gitea.redbearos.org/vasilito/redbear-os-base`.
A Red Bear developer who commits inside `local/sources/base/` and runs
`git push origin master` would push Red Bear fork commits **to upstream
Redox**, where the push would be rejected (or worse, fail silently).

**Current behavior.** Most Red Bear base commits are made by the
`Red Bear OS <build@redbearos.org>` author bot during automated syncs, which
push to a different fork URL configured out-of-band. The inner-repo
`origin` is therefore orphaned from normal operator workflows.

**Proposal.** Set the inner `local/sources/base/` origin to Red Bear's gitea
URL via `local/scripts/sync-fork-remotes.sh` (new file). Apply the same
treatment to
`local/sources/{relibc,kernel,bootloader,installer,redoxfs,userutils}` if any
of them have the same issue.

**Expected gain.** Eliminates a footgun. Operators can commit + push from
inside the local fork and reach the right remote.

**Risk.** Low: purely a remote URL change, no history rewrite.

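A minimal sketch of what the proposed `sync-fork-remotes.sh` could do for one
inner repo. The function name `set_fork_origin` and its output words are
hypothetical; only the `base` URL pair above comes from this document, so URLs
for the other components would have to be supplied by whoever maintains those
forks.

```shell
#!/usr/bin/env bash
# Sketch only: `set_fork_origin` is a hypothetical helper, not an existing script.

# Point the `origin` remote of the nested repo at $1 to URL $2.
# Prints "changed", "unchanged", or "added".
# Returns 1 if $1 does not contain its own .git entry.
set_fork_origin() {
    local repo_dir=$1 want_url=$2 current_url
    if [ ! -e "$repo_dir/.git" ]; then
        echo "not a git repo: $repo_dir" >&2
        return 1
    fi
    if current_url=$(git -C "$repo_dir" remote get-url origin 2>/dev/null); then
        if [ "$current_url" = "$want_url" ]; then
            echo "unchanged"
        else
            git -C "$repo_dir" remote set-url origin "$want_url"
            echo "changed"
        fi
    else
        # No origin configured yet: create it.
        git -C "$repo_dir" remote add origin "$want_url"
        echo "added"
    fi
}
```

For example, `set_fork_origin local/sources/base https://gitea.redbearos.org/vasilito/redbear-os-base`
would print `changed` on the first run and `unchanged` afterwards, which keeps
the script safe to re-run.
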
### 12. Outer repo cannot show inline diffs for `local/sources/base/` (S, ~30 min)

**Problem.** `local/sources/base/` is a nested git repo, not a real submodule:
the outer Red Bear repo has no `.gitmodules` entry for it. The outer
repo reports file changes only as `Submodule local/sources/base contains modified
content`. `git diff -- local/sources/base/drivers/input/ps2d/src/main.rs`
shows nothing useful, and `git diff --submodule=log` shows only the commit hash
delta, not the actual line changes.

This makes PR review of local-fork changes harder than necessary: the
reviewer must `cd local/sources/base && git diff` to see what actually changed.

**Proposal.** Either:

- (a) Register `local/sources/base/` (and other inner repos) as proper
  git submodules via `.gitmodules` + `git submodule absorbgitdirs`. Lets
  outer-repo `git diff` show the changes inline.
- (b) Add a wrapper script `local/scripts/show-fork-diffs.sh` that
  recursively runs `git diff` inside each `local/sources/<component>/`
  inner repo and presents the result with the outer-repo diff.

**Expected gain.** PR review of local-fork changes becomes trivial.

**Risk.** Low for (b); medium for (a), because touching `.gitmodules` and the
submodule pointer requires care.

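Option (b) could look roughly like the sketch below. The function name
`show_fork_diffs` and the `=== <path> ===` header format are assumptions made
for illustration. It only covers uncommitted work-tree changes in each inner
repo; showing commits the outer repo has not yet recorded would need an
additional step.

```shell
#!/usr/bin/env bash
# Sketch only: a possible core for the proposed local/scripts/show-fork-diffs.sh.

# Print `git diff` for every nested repo directly under $1
# (defaults to local/sources). Plain directories are skipped.
show_fork_diffs() {
    local sources_dir=${1:-local/sources} repo
    for repo in "$sources_dir"/*/; do
        [ -e "${repo}.git" ] || continue
        # `git diff --quiet` exits non-zero when the work tree has changes.
        if ! git -C "$repo" diff --quiet; then
            echo "=== ${repo%/} ==="
            git -C "$repo" --no-pager diff
        fi
    done
}
```

Running `show_fork_diffs` from the outer repo root, followed by a normal
`git diff`, would give a reviewer both views in one terminal session.
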
### 13. No "stale local-fork source" preflight check (S, ~2 hours)

**Problem.** `build-redbear.sh` has a stale-prefix reminder that fires
*after* the build, but no equivalent check for stale local-fork sources.
When the operator edits `local/sources/base/...` and runs `make all`, the
cookbook uses a path-based source, so it *does* detect the change. However, the
operator gets no warning that the build will take a long time because of
their edit. In the 2026-06-30 session, a 4-line edit to two `local/sources/base/`
files caused a 30+ minute rebuild of base (cargo rebuild of all 27
sub-crates) with no warning that the rebuild scope would be that wide.

**Proposal.** Extend the preflight in `local/scripts/build-redbear.sh` to:

1. Compare the mtime of every `local/sources/<component>/**/*.rs` against
   the corresponding `repo/x86_64-unknown-redox/<component>.pkgar` mtime.
2. Print a `>>> WARNING: <component> source is newer than its pkgar —
   rebuilding` message before the build starts.
3. Print an `>>> ESTIMATED TIME: <N> minutes based on history` line.

**Expected gain.** Operators avoid 30-minute surprise rebuilds and can
defer edits to a low-cost window.

**Risk.** None: purely additive diagnostic.

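Steps 1 and 2 of the proposal can be sketched as follows. The function names
are hypothetical, the warning text is an ASCII variant of the proposed message,
and step 3 (the time estimate) is left out because it depends on build-history
data this document does not describe.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for the build-redbear.sh preflight.

# Succeeds (exit 0) if any *.rs file under $1 is newer than the package
# archive $2, or if the archive does not exist yet.
fork_source_is_stale() {
    local src_dir=$1 pkgar=$2
    [ -e "$pkgar" ] || return 0
    # `find -newer` compares modification times against the reference file.
    [ -n "$(find "$src_dir" -type f -name '*.rs' -newer "$pkgar" | head -n 1)" ]
}

# Print the warning for component $1 if its source tree $2 is newer than
# its package archive $3; print nothing otherwise.
preflight_warn_stale() {
    local component=$1 src_dir=$2 pkgar=$3
    if fork_source_is_stale "$src_dir" "$pkgar"; then
        echo ">>> WARNING: $component source is newer than its pkgar, rebuilding"
    fi
}
```

A call such as
`preflight_warn_stale base local/sources/base repo/x86_64-unknown-redox/base.pkgar`
before the build starts would surface the rebuild without changing what gets
built.
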
### 14. Bootloader streaming under `-nographic` + OVMF is unusably slow (S, ~1 hour)

**Problem.** When booting `build/x86_64/redbear-mini.iso` under
`qemu-system-x86_64 -nographic -serial mon:stdio` with OVMF, the Redox
bootloader streams the live ISO one MiB at a time over serial, taking
more than 6 minutes to reach the kernel. The user's reference log shows the
same ISO booting in seconds on a real KVM-accelerated host, so this is a
QEMU + `-nographic` interaction (the slow run was under TCG, without KVM),
not a bootloader bug.

This makes post-fix QEMU verification of any change impractical inside
a normal session timeout.

**Proposal.** Two complementary fixes:

- (a) Document in `local/scripts/test-redbear-full-qemu.sh` and
  `local/scripts/test-live-mini-uefi.sh` that for time-budgeted boot
  verification, use
  `qemu-system-x86_64 -machine pc,accel=kvm -cpu host -hda build/x86_64/redbear-mini.img -nographic -serial mon:stdio`
  (raw `.img`, BIOS boot) instead of the OVMF + ISO path. The BIOS path
  skips the live-mode streaming entirely.
- (b) Add a `QEMU_BOOT_MODE` flag to the test launcher, defaulting to BIOS
  for fast verification, with `--uefi` as an opt-in for OVMF.

**Expected gain.** Post-fix QEMU verifications fit in a 60–90 second
budget. Critical for the ps2d/inputd-style small-fix → verify cycle.

**Risk.** None for (a); low for (b).

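One way the `QEMU_BOOT_MODE` switch from (b) might be wired up is sketched
below. The BIOS arguments are the ones quoted in (a). Everything in the `uefi`
branch is an assumption for illustration: the `OVMF_CODE` variable, its default
firmware path, and the use of `-bios` with `-cdrom` are not taken from the
existing launchers. `BUILD_DIR` and the function name are likewise hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: prints the QEMU command line, one argument per line,
# so a launcher can read it into an array or log it before running.

qemu_boot_args() {
    local mode=${QEMU_BOOT_MODE:-bios}                        # "bios" (default) or "uefi"
    local build_dir=${BUILD_DIR:-build/x86_64}                # hypothetical override
    local ovmf=${OVMF_CODE:-/usr/share/OVMF/OVMF_CODE.fd}     # assumed firmware path
    case "$mode" in
        bios)
            # Raw .img, BIOS boot: skips the live-ISO streaming path.
            printf '%s\n' qemu-system-x86_64 -machine pc,accel=kvm -cpu host \
                -hda "$build_dir/redbear-mini.img" -nographic -serial mon:stdio
            ;;
        uefi)
            # OVMF + ISO path (assumed flags; adjust to the real launcher).
            printf '%s\n' qemu-system-x86_64 -machine pc,accel=kvm -cpu host \
                -bios "$ovmf" -cdrom "$build_dir/redbear-mini.iso" \
                -nographic -serial mon:stdio
            ;;
        *)
            echo "unknown QEMU_BOOT_MODE: $mode" >&2
            return 2
            ;;
    esac
}
```

Printing the arguments instead of executing them keeps the mode selection
testable without a QEMU binary; a launcher could load them with
`mapfile -t args < <(qemu_boot_args)` and then run `"${args[@]}"`.
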
### Summary of addendum

| # | Title | Size | Risk | Status |
|---|-------|------|------|--------|
| 11 | Fix inner-fork remote URLs | S | Low | open |
| 12 | Surface inner-fork diffs to outer repo | S | Low–Medium | open |
| 13 | Preflight stale-local-fork-source warning | S | None | open |
| 14 | Fast-QEMU boot mode for verification | S | None | open |

@@ -0,0 +1,110 @@
# Red Bear OS — QEMU mini boot: ps2d / inputd startup-log diagnosis

**Date**: 2026-06-30
**Test target**: `redbear-mini`
**Test launcher**: ad-hoc QEMU (`-machine pc -cpu max -smp 8 -m 12288 -nographic -serial mon:stdio`)
**Captured log**: original evidence was the boot sequence the operator pasted in chat
on 2026-06-29 (no separate file). This doc captures the diagnostic conclusions and
the fix.

## 1. Background

A redbear-mini QEMU boot reached the `Red Bear login:` prompt and then appeared to
"freeze": no keystrokes reached `login`, the prompt rendered twice with `[?1000l[?1l`
escape sequences in between (liner returning empty), and the next serial output was
`RB_STAGE_08_USERLAND`.

The first hypothesis was that `ps2d` was not running, because there was no
`[INFO] ps2d:` line in the boot log. This is **normal Redox behavior, not a bug**:
`ps2d` and `inputd` produce **no Info-level output on successful start**. They
have zero `log::info!()` calls on the success path; the only output is `.expect()`
panic messages on the failure path. Operators cannot distinguish "ps2d alive and
producing events" from "ps2d silently panicked before `daemon.ready()`" from the
boot log alone.

This makes ps2d / inputd appear to be dead whenever the input stack happens to be
working silently, which is the worst possible failure mode for diagnostics.

## 2. Root cause (and what was NOT broken)

The boot reached `Red Bear login:`. That is proof that:

- `inputd` was up: `getty` opens `/scheme/fbcon/2`, which requires inputd.
- `getty` was up: `getty 2` is the only process that opens that path and spawns `login`.
- `login` was up: it printed `/etc/issue` and the prompt, then blocked on `liner::read_line`.
- The PTY was up: `ptyd` (in `00_base.target`) creates the master fd `getty` bridges.

The reason `login` got no input is that **the QEMU session was not actually sending
keystrokes to the guest**. This is a test-harness issue, not an OS bug. To verify
that ps2d is working in a future run, an operator must either type in the QEMU
window (with a graphical display) or inject keystrokes via the QMP `send-key`
command on the monitor socket.

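As a rough illustration of the QMP route, the sketch below generates the JSON
requests for a sequence of keys. The function name `qmp_type_keys` is
hypothetical. It emits one `send-key` request per key, since several keys in a
single request are pressed together as a chord; key names are QEMU qcodes such
as `r`, `o`, `t`, and `ret`. How the requests reach QEMU (socket path, transport
tool, pacing between keys) is an assumption left to the caller.

```shell
#!/usr/bin/env bash
# Sketch only: prints QMP requests to stdout; it does not talk to QEMU itself.

# Emit the QMP handshake followed by one `send-key` request per qcode argument.
qmp_type_keys() {
    local k
    # QMP requires capability negotiation before any other command.
    printf '%s\n' '{"execute":"qmp_capabilities"}'
    for k in "$@"; do
        printf '{"execute":"send-key","arguments":{"keys":[{"type":"qcode","data":"%s"}]}}\n' "$k"
    done
}
```

Assuming QEMU was started with a QMP socket (for example
`-qmp unix:/tmp/qmp.sock,server=on,wait=off`, a path chosen here for
illustration), the output could be piped to it with a tool such as `socat`:
`qmp_type_keys r o o t ret | socat - UNIX-CONNECT:/tmp/qmp.sock`.
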
## 3. Fix — add startup info logs

Two minimal diagnostic `log::info!()` calls were added in the inner Red Bear
fork repo at `local/sources/base/`:

- `local/sources/base/drivers/input/ps2d/src/main.rs`: after `daemon.ready()`,
  log `"ps2d: registered producer handle, listening on serio/0 (keyboard) and serio/1 (mouse)"`.
- `local/sources/base/drivers/inputd/src/main.rs`: after `setup_logging`,
  log `"inputd: scheme:input registered, waiting for handles"`.

The diff is 6 insertions across 2 files. No behavior change. No new error paths.

The new `log::info!()` lines are emitted **only on the successful startup path**.
Existing `log::error!()` and `log::warn!()` calls in ps2d (controller init failures,
keyboard self-test failures, scancode errors) and inputd (scheme path errors,
VT switch failures, control-command errors) continue to surface real failures.

## 4. How an operator verifies the input stack is alive

After this fix, a healthy redbear-mini boot on QEMU shows both lines in the boot
log (during the initfs phase):

```
2026-06-30T...Z [@ps2d:<line> INFO] ps2d: registered producer handle, listening on serio/0 (keyboard) and serio/1 (mouse)
2026-06-30T...Z [@inputd:<line> INFO] inputd: scheme:input registered, waiting for handles
```

If either line is missing after this commit, that daemon did not complete
startup. Check the panic output (`.expect()` messages) for the cause:

- `ps2d: failed to get I/O permission`: I/O port rights denied (rare)
- `ps2d: failed to open input producer`: inputd crashed before ps2d started
- `ps2d: failed to open /scheme/serio/0`: kernel serio scheme missing (very rare)
- `ps2d: failed to initialize`: PS/2 controller self-test failed (QEMU `-cpu max`
  always passes this; only an issue on broken real hardware)
- `inputd: invalid argument: ...`: bad CLI arg to one-shot `inputd -A 2` (config bug)

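The check described above can be automated against a captured serial log. The
sketch below is one possible form; the function name `input_stack_alive` and the
wording of its diagnostics are hypothetical, while the two matched substrings
are the startup messages quoted in this document.

```shell
#!/usr/bin/env bash
# Sketch only: checks a captured serial log for the two startup lines.

# Return 0 if log file $1 contains both the ps2d and inputd startup messages;
# otherwise report which one is missing on stderr and return 1.
input_stack_alive() {
    local log_file=$1 missing=0
    if ! grep -qF 'ps2d: registered producer handle' "$log_file"; then
        echo "missing ps2d startup line" >&2
        missing=1
    fi
    if ! grep -qF 'inputd: scheme:input registered' "$log_file"; then
        echo "missing inputd startup line" >&2
        missing=1
    fi
    return "$missing"
}
```

A test launcher could call `input_stack_alive serial.log` after the boot
reaches the login prompt and fail the run on a non-zero exit status.
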
## 5. Verified by source inspection (no clean post-fix QEMU capture)

The post-fix ISO was rebuilt successfully (`build-redbear.sh redbear-mini`,
exit 0, 512 MB ISO at `build/x86_64/redbear-mini.iso` produced at 2026-06-30 02:31).

Source verification:

- `local/sources/base/drivers/input/ps2d/src/main.rs` line 96: `log::info!(...)`
- `local/sources/base/drivers/inputd/src/main.rs` line 661: `log::info!(...)`
- `recipes/core/base/source/drivers/input/ps2d/src/main.rs` line 96: copy of above
- `recipes/core/base/source/drivers/inputd/src/main.rs` line 661: copy of above

**QEMU verification pending**: a clean post-fix QEMU boot with the new lines visible
was not captured in this session. The OVMF+`-nographic` boot path took more than
6 minutes to stream the live ISO under TCG, which exceeds the session timeout. A
future run should use `accel=kvm` with the raw `redbear-mini.img` to bypass the slow
bootloader streaming, then grep the captured serial output for the two new `[INFO]`
lines.

## 6. Related findings (build system observations)

While diagnosing this, several build-system ergonomics issues surfaced that are
documented as new entries in `local/docs/BUILD-SYSTEM-IMPROVEMENTS.md`:

- The inner `local/sources/base/` git repo's remote URL points to upstream
  Redox (`gitlab.redox-os.org/redox-os/base.git`) instead of Red Bear's
  gitea. Changes committed inside it cannot be pushed to the right place.
- `local/sources/base/` is a nested git repo, so the outer Red Bear repo's
  `git diff` shows only "submodule contains modified content", with no inline
  diff for review.
- `build-redbear.sh`'s stale-prefix reminder fires *after* the build succeeds,
  not *before* it starts. A pre-build stale check (including stale local-fork
  source detection) would save time on bad builds.
- The OVMF + `-nographic` boot path is too slow for time-budgeted post-fix
  verification (see section 5).
