diff --git a/local/docs/redbear-power-improvement-plan.md b/local/docs/redbear-power-improvement-plan.md
index 06899de31f..02cc464f44 100644
--- a/local/docs/redbear-power-improvement-plan.md
+++ b/local/docs/redbear-power-improvement-plan.md
@@ -5885,17 +5885,296 @@ servers and CI runners.
 
 ### 67.7 What was NOT changed (intentional)
 
-- **Per-thread CPU%** (synthetic) — defer to v1.44 if
-  user demand appears. The Linux kernel only exposes
-  process-total CPU%, not per-thread.
+- **Per-thread CPU%** (synthetic) — defer to v1.45+ if
+  user demand appears. The Linux kernel exposes
+  `/proc/<pid>/task/<tid>/stat`, but the Redox
+  `proc:` scheme (in `local/sources/kernel/src/scheme/proc.rs`)
+  does NOT expose `task/` paths. Until the kernel
+  proc scheme is extended, the feature would work on
+  Linux hosts (CI only) but silently return `None`
+  for every process on the actual Red Bear runtime.
+  This is the same trap as the v1.41
+  `read_thread_io` (`/proc/<pid>/io`) — Linux-only
+  data source. Tracked as a kernel-side follow-up:
+  "kernel proc scheme: add `/proc/[pid]/task/[tid]/stat`
+  + `io` paths to enable per-thread CPU% in
+  redbear-power." Not a v1.44 feature.
 - **CPU affinity setter (taskset-style keypress)** —
-  defer to v1.44. The reader side is in v1.42; the
-  writer side requires an ioctl wrapper that we don't
-  have yet.
-- **History reclaim for `disk_history`** — defer to
-  v1.44 if a use case appears. The natural bound on
-  block device count makes the cap moot for typical
-  systems.
+  v1.44 candidate. The reader side shipped in v1.42.
+  See §68 for the v1.44 plan.
+- **History reclaim for `disk_history`** — defer
+  indefinitely. The natural bound on block device
+  count (~4-8 typically, max ~32 on any realistic
+  machine) is well below any reasonable cap. A
+  32-disk cap on a map that currently holds 4-8
+  entries solves a problem that doesn't occur.
+
+## 68. v1.44 Plan: CPU Affinity Setter (2026-06-21)
+
+This is a planning-only entry.
+The v1.43 scope audit
+identified three candidates for v1.44:
+
+1. CPU affinity setter (taskset-style keypress)
+2. Per-thread CPU% (synthetic)
+3. History reclaim for `disk_history`
+
+This section captures the **feasibility analysis** and
+the **implementation plan** for the chosen candidate
+(#1). The other two were audited but rejected for
+specific reasons documented below.
+
+### 68.1 Feasibility: surprising pre-existing work
+
+The agent's v1.43 audit surfaced a major correction
+to the v1.42 deferred-list assumption:
+
+**The Redox kernel already implements
+`sched_setaffinity` and `sched_getaffinity`.**
+
+Specifically:
+
+- `local/sources/kernel/src/syscall/mod.rs:235-236`
+  dispatches `SYS_SCHED_SETAFFINITY` and
+  `SYS_SCHED_GETAFFINITY`.
+- `local/sources/kernel/src/syscall/process.rs:322-349`
+  (`sched_setaffinity`) and lines 360-382
+  (`sched_getaffinity`) are real working
+  implementations with `RawMask` IO,
+  `ctx.sched_affinity.override_from(&raw_mask)`, etc.
+- `__NR_sched_setaffinity` = 203 (x86_64) / 122
+  (aarch64) and `__NR_sched_getaffinity` = 204 /
+  123 are already exposed in
+  `local/sources/relibc/src/header/sys_syscall/`.
+
+The only piece missing is the **relibc POSIX
+`<sched.h>` wrapper** that the rest of userspace
+links against. v1.44's relibc-only scope is ~80-100
+LoC of new code (the patch carrier scaffolding is
+already mature across P0-P11).
+
+### 68.2 Reuse of P7-pthread-affinity helpers
+
+The existing relibc patch
+`local/patches/relibc/P7-pthread-affinity.patch`
+(231 lines) already provides the helpers we'll need:
+
+- `cpu_set_t` type (1024 bits, 16 × u64)
+- `cpuset_to_u64(&cpu_set_t) -> u64` — convert the
+  bit-set to a u64 mask (sufficient for any
+  realistic machine with ≤ 64 CPUs)
+- `copy_u64_to_cpuset(u64, &mut cpu_set_t)` —
+  inverse
+
+If those helpers are `pub(crate)` from the pthread
+module, v1.44's relibc patch reuses them directly
+(avoiding ~30 LoC of duplication).
+If they're
+`pub(super)` or private, we duplicate the 30 LoC —
+the duplication is acceptable because the cost is
+small and the alternative is a cross-module
+visibility refactor outside v1.44's scope.
+
+### 68.3 Kernel pid=0 limitation (honest UX)
+
+The kernel `sched_setaffinity` syscall only supports
+`pid == 0` (current process). Other PIDs return
+`ESRCH`. This is documented at
+`local/sources/kernel/src/syscall/process.rs:336-338`
+as a TODO ("PID-based lookup not yet supported").
+
+**Implication for v1.44 UX**: the operator can
+highlight a process in the Process panel, but
+pressing `A` will pin **redbear-power's own**
+affinity to that process's CPU list, not the
+highlighted process's. The popup will surface this
+limitation in plain language:
+
+> Set redbear-power's CPU affinity to match this
+> process's list? (Note: pinning another process's
+> affinity requires future kernel support.)
+
+This is honest and operator-friendly. The htop
+"highlight and pin" workflow becomes "highlight to
+inspect, A to pin redbear-power's own TUI". A real
+operator workflow: pin the monitor TUI to a
+housekeeping core (CPU 0) so it doesn't fight for
+time with the workload under measurement.
+
+A future kernel patch (extending
+`sched_setaffinity` to honor non-zero PIDs) would
+unblock the full htop UX. That's a separate kernel
+patch, not v1.44.
+
+### 68.4 Rejected candidates (audit trail)
+
+**Candidate 2 — Per-thread CPU% (synthetic)**:
+rejected for v1.44.
+
+Reason: the Redox `proc:` scheme does not expose
+`/proc/<pid>/task/<tid>/stat`. The feature would
+work on Linux hosts (CI passes) but silently return
+`None` for every process on the actual Red Bear
+runtime. This is the same trap as the v1.41
+`read_thread_io` (which also relies on
+`/proc/<pid>/io` — also not exposed by Redox's
+`proc:` scheme). The Linux `cargo test` would pass;
+the operator on real Red Bear would see a "—" column
+everywhere.
+
+The kernel fix is small (similar shape to the
+existing `status` path in
+`local/sources/kernel/src/scheme/proc.rs:253`) but
+it's a kernel-side change, not a redbear-power
+feature. Tracked as a follow-up kernel patch, not
+v1.44.
+
+**Candidate 3 — History reclaim for
+`disk_history`**: rejected for v1.44.
+
+Reason: `disk_history` is keyed by disk name and
+has a natural bound on block device count
+(~4-8 typically, max ~32 on any realistic
+machine). A 32-disk cap on a map that currently
+holds 4-8 entries solves a problem that doesn't
+exist. Even on a hypothetical server with 32
+disks, alphabetical eviction by
+`BTreeMap::pop_first()` would not correspond to
+"least recently active", so the cap would be
+gamed immediately.
+
+The ~30 LoC of cap code is fine to write, but it's
+not worth a v1.44 slot. Drive-by include it in any
+other patch if convenient; otherwise defer
+indefinitely.
+
+### 68.5 Implementation plan for v1.44 (when approved)
+
+When the user gives the go-ahead:
+
+#### Step 1 — relibc patch `P12-sched-setaffinity.patch`
+
+- `local/sources/relibc/src/header/sched/mod.rs`:
+  add `sched_setaffinity` and `sched_getaffinity`
+  POSIX wrappers (~40 LoC), reusing
+  `cpuset_to_u64`/`copy_u64_to_cpuset` from
+  `pthread/mod.rs` if `pub(crate)`, else
+  duplicating.
+- `local/sources/relibc/src/platform/pal/mod.rs`:
+  add 2 trait methods (`sched_setaffinity`,
+  `sched_getaffinity`) to `Pal` trait (~6 LoC).
+- `local/sources/relibc/src/platform/redox/mod.rs`:
+  add 2 redox impls using
+  `syscall::sched_setaffinity`/`sched_getaffinity`
+  wrappers (~25 LoC). Process-scoped (not
+  thread-scoped), so no FdGuard needed.
+- `local/sources/relibc/src/platform/linux/mod.rs`:
+  add 2 linux impls using raw `syscall!` (~15 LoC).
+- `local/sources/relibc/src/header/sched/cbindgen.toml`:
+  add to `[export].include` (~5 LoC).
+- Tests in `src/header/sched/mod.rs` (~20 LoC):
+  set/get round-trip, NULL mask, EINVAL on bad
+  size, cpuset_to_u64 limits >64.
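The "limits >64" test case in Step 1 can be illustrated with a minimal sketch. `CpuSet`, `cpuset_to_u64_checked`, and `u64_to_cpuset` are hypothetical stand-ins, not the P7 helpers themselves; in particular, the plan lists `cpuset_to_u64` as returning a plain `u64`, whereas this variant returns `Option<u64>` to make the out-of-range case visible.

```rust
/// Hypothetical stand-in for relibc's `cpu_set_t`: 1024 bits as 16 x u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CpuSet {
    pub bits: [u64; 16],
}

/// Collapse a CPU set into a 64-bit mask.
/// Returns `None` if any CPU with index 64 or higher is set,
/// because such a set cannot be represented in a single u64.
pub fn cpuset_to_u64_checked(set: &CpuSet) -> Option<u64> {
    if set.bits[1..].iter().any(|&word| word != 0) {
        return None;
    }
    Some(set.bits[0])
}

/// Inverse direction: place the mask in the low word and clear the rest.
pub fn u64_to_cpuset(mask: u64) -> CpuSet {
    let mut bits = [0u64; 16];
    bits[0] = mask;
    CpuSet { bits }
}
```

A round-trip through `u64_to_cpuset` and `cpuset_to_u64_checked` returns the original mask, while a set with CPU 64 enabled (bit 0 of the second word) yields `None`. Whether the real helper truncates, saturates, or errors in that case is not stated in the plan.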
+
+Total: ~110 LoC, packaged as `P12-sched-setaffinity.patch`,
+wired into the recipe via `local/patches/relibc/P12-...patch`.
+
+#### Step 2 — redbear-power `affinity.rs` module
+
+```rust
+// src/affinity.rs
+pub fn get_current_affinity() -> Result<u64, i32> { ... }
+pub fn set_current_affinity(mask: u64) -> Result<(), i32> { ... }
+```
+
+Wraps `libc::sched_setaffinity` / `sched_getaffinity`.
+Reuses `parse_cpu_list` and `format_cpu_list` from
+v1.42 for mask ↔ display string conversion.
+
+#### Step 3 — Key binding in `main.rs`
+
+Press `A` (capital A) on a Process panel row →
+open PID detail popup with a "Set redbear-power
+affinity" action. Press `Esc` to close without
+acting.
+
+The lower-case `a` key is reserved for a future
+"pin highlighted process" workflow (when the
+kernel extends `sched_setaffinity` to honor
+non-zero PIDs).
+
+#### Step 4 — PID detail popup integration
+
+Extend the existing `[cpu_affinity]` section
+(added in v1.42) with a "Pin redbear-power to
+this CPU list" button (rendered as a selectable
+line). On Enter, calls
+`affinity::set_current_affinity(parsed_mask)`.
+
+The popup surfaces the pid=0 limitation in plain
+text (so the operator understands the action
+applies to the TUI, not the highlighted process).
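The mask ↔ display string conversion that Steps 2 and 4 rely on might look roughly like the sketch below. The function names `cpu_list_to_mask` and `mask_to_cpu_list` are hypothetical; the signatures and error handling of the actual v1.42 `parse_cpu_list` and `format_cpu_list` helpers are not shown in this plan and may differ.

```rust
/// Parse a CPU list such as "0-3,5" into a 64-bit mask.
/// Hypothetical stand-in for the v1.42 `parse_cpu_list` helper.
/// Returns `None` on malformed input or on CPU indices of 64 and above.
pub fn cpu_list_to_mask(list: &str) -> Option<u64> {
    let mut mask = 0u64;
    for part in list.split(',') {
        let part = part.trim();
        if part.is_empty() {
            return None;
        }
        // A part is either a single CPU ("5") or an inclusive range ("0-3").
        let (start, end) = match part.split_once('-') {
            Some((a, b)) => (a.trim().parse::<u32>().ok()?, b.trim().parse::<u32>().ok()?),
            None => {
                let cpu = part.parse::<u32>().ok()?;
                (cpu, cpu)
            }
        };
        if start > end || end >= 64 {
            return None;
        }
        for cpu in start..=end {
            mask |= 1u64 << cpu;
        }
    }
    Some(mask)
}

/// Format a mask back into the compact list form, merging runs into ranges.
/// Hypothetical stand-in for the v1.42 `format_cpu_list` helper.
pub fn mask_to_cpu_list(mask: u64) -> String {
    let mut parts = Vec::new();
    let mut cpu = 0u32;
    while cpu < 64 {
        if mask & (1u64 << cpu) == 0 {
            cpu += 1;
            continue;
        }
        let start = cpu;
        // Extend the run while the next CPU is also set.
        while cpu + 1 < 64 && mask & (1u64 << (cpu + 1)) != 0 {
            cpu += 1;
        }
        if start == cpu {
            parts.push(start.to_string());
        } else {
            parts.push(format!("{}-{}", start, cpu));
        }
        cpu += 1;
    }
    parts.join(",")
}
```

Under these assumptions, the popup action would parse the highlighted process's CPU list, pass the resulting mask to `set_current_affinity`, and use the formatter to display whatever `get_current_affinity` reports afterwards.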
+
+#### Step 5 — Tests in redbear-power
+
+5-7 new tests:
+- `get_current_affinity_returns_nonzero_mask`
+  (sanity check on test runner's own mask)
+- `set_then_get_round_trip` (the canonical
+  round-trip — set a mask, read it back, assert
+  equal)
+- `set_affinity_rejects_kernel_pids_above_zero`
+  (verifies the pid=0 limitation; on Linux where
+  the kernel DOES support non-zero PIDs, this
+  test is skipped or asserts the success case)
+- `pid_detail_popup_shows_pin_action`
+  (render-test the popup text contains the
+  expected action line)
+- `affinity_set_then_kparse_cpu_list_round_trip`
+  (end-to-end: parse "0-3,5" → set mask → get
+  mask → format → assert equal)
+- Integration test: set redbear-power's own
+  affinity to a single CPU, verify
+  `/proc/self/status:Cpus_allowed_list` reflects
+  the change.
+
+#### Step 6 — Doc update
+
+Add §69 to this plan documenting the actual
+v1.44 release (what shipped vs. planned,
+audit findings, deferred list).
+
+### 68.6 Downstream recipe impact (audit)
+
+Confirmed consumers of
+`sched_setaffinity`/`sched_getaffinity` in the
+recipe tree (from the agent's audit):
+
+- `recipes/libs/mesa/source/src/util/u_cpu_detect.c:752`
+  — runtime probe of available CPUs. Already wraps
+  the call in `#ifdef` defensive checks.
+- `recipes/recipes/tools/xz/source/src/common/tuklib_cpucores.c:58`
+  — runtime CPU count. Same defensive pattern.
+
+Both recipes will go from "ENOSYS on Redox" to
+"current process mask works on Redox" — a strict
+improvement. **No recipe will break** because the
+defensive probe patterns already handle the
+failure case; the v1.44 relibc patch just makes
+the success case work.
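The defensive probe pattern described in 68.6 can be sketched in Rust as follows. This is an illustration only, not the mesa or xz code: `usable_cpus` and its `query` parameter are hypothetical, with `query` standing in for a `sched_getaffinity` call, and the `ENOSYS` value of 38 is the Linux number, assumed here for the example.

```rust
/// Linux errno value for "function not implemented" (assumed for this sketch).
pub const ENOSYS: i32 = 38;

/// Count usable CPUs from an affinity query, falling back to 1 when the
/// query fails (for example with ENOSYS) or reports an empty mask.
/// `query` is a hypothetical stand-in for a `sched_getaffinity` call.
pub fn usable_cpus<F>(query: F) -> u32
where
    F: FnOnce() -> Result<u64, i32>,
{
    match query() {
        Ok(mask) if mask != 0 => mask.count_ones(),
        _ => 1,
    }
}
```

With this shape, a caller behaves the same before and after the relibc patch in the failure case, and starts reporting the real CPU count once the query succeeds, which is the "strict improvement" argument made above.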
+
+### 68.7 Effort estimate
+
+- relibc patch: ~3 hours (P7 reuse minimizes new
+  code; cbindgen export update is mechanical;
+  redox + linux impls are straightforward)
+- redbear-power affinity.rs: ~30 minutes
+- redbear-power main.rs + popup integration:
+  ~1.5 hours
+- Tests: ~1.5 hours (the round-trip tests are
+  the bulk)
+- Doc update: ~30 minutes (this section becomes
+  §69 with "what shipped")
+
+**Total: ~1 working day, end-to-end.**
 
 ## See Also