Update local subsystem planning docs

2026-04-20 18:37:35 +01:00
parent 5683ba5dc3
commit 6343f173c9
17 changed files with 1246 additions and 1443 deletions
+12 -4
@@ -13,11 +13,19 @@ P0 ACPI boot-baseline work is **materially complete for the historical boot goal
be read as release-grade ACPI completeness; ownership cleanup, sleep-state support, and bounded
bare-metal validation still remain open. Kernel patch is 574 lines, base/acpid patch is 558 lines.
Where this historical ledger differs from the current source tree, prefer
`local/docs/ACPI-IMPROVEMENT-PLAN.md`. In particular, do **not** read older references here to
typed `acpid` startup hardening as proof that current-tree boot-path hardening is already complete.
Do **not** use this file as the current boot-wiring authority either: initfs lifecycle, `hwd`
`acpid` ad hoc spawning, explicit `RSDP_ADDR` forwarding plus x86 BIOS AML fallback, weak legacy
fallback, and provisional `/scheme/acpi/power` semantics are tracked in
`local/docs/ACPI-IMPROVEMENT-PLAN.md`.
## Crash Reports
| Hardware | Symptom | Root Cause | Status |
|----------|---------|------------|--------|
| Framework Laptop 16 (AMD 7040) | Crash on boot | Unimplemented ACPI function (jackpot51/acpi#3) | ✅ Fixed (RSDP/SDT checksums, MADT NMI types, FADT parse, ACPI init typed errors) | | Framework Laptop 16 (AMD 7040) | Crash on boot | Unimplemented ACPI function (jackpot51/acpi#3) | ✅ Fixed for the historical boot-baseline path (RSDP/SDT checksums, MADT NMI types, FADT parse, related bring-up fixes). `acpid` startup hardening still remains open in the current tree. |
| Lenovo ThinkCentre M83 | `Aml(NoCurrentOp)` panic at acpid acpi.rs:256 | AML interpreter encounters unsupported opcode | Under investigation (upstream AML issue; not resolved by P0 work) |
| HP Compaq nc6120 | Crash after `kernel::acpi` prints APIC info | xAPIC APIC ID read returned raw value, caused page fault on Intel | ✅ Fixed (xAPIC `id()` now shifts `read(0x20) >> 24`) |
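The xAPIC fix in the last row can be illustrated with a minimal sketch. In xAPIC mode the local APIC ID register at offset `0x20` holds the 8-bit ID in bits 31:24, so the raw register value has to be shifted before use. The function name `xapic_id_from_register` is hypothetical and is not the kernel's actual `id()` implementation.

```rust
/// Offset of the local APIC ID register inside the xAPIC MMIO page.
pub const XAPIC_ID_REGISTER: u32 = 0x20;

/// Extracts the 8-bit APIC ID from a raw xAPIC ID register value.
/// Hypothetical helper: in xAPIC mode the ID occupies bits 31:24,
/// so using the raw value directly yields a wildly wrong ID.
pub fn xapic_id_from_register(raw: u32) -> u8 {
    (raw >> 24) as u8
}
```

A raw read of `0x0300_0000` therefore corresponds to APIC ID 3, not to the integer 50331648.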
@@ -106,7 +114,7 @@ platforms, not just AMD.
| `aml_physmem.rs` | 418,423,428 | Mutex create/acquire/release | Upstream | Mainline AML interpreter | **Partially addressed** — real tracked state implemented, not placeholder |
| `ec.rs` | 193+ (8 occurrences) | Proper error types | Upstream | Mainline EC handler | **Partially addressed** — widened accesses implemented via byte transactions |
| `dmar/mod.rs` | 7 | Move DMAR to separate driver | Upstream | Mainline driver refactor | **Partially addressed** — DMAR module present but not wired into startup; ownership remains transitional/orphaned rather than cleanly moved |
| `main.rs` | — | Startup panic/expect handling | Local | Boot-path hardening | **Addressed** — typed `StartupError` enum with explicit error messages and clean exit paths | | `main.rs` | — | Startup panic/expect handling | Local | Boot-path hardening | **Open** — active current-tree `acpid` still contains panic/expect startup paths; see Wave 1 in `local/docs/ACPI-IMPROVEMENT-PLAN.md` |
## P0 Fixes Applied
@@ -137,7 +145,7 @@ platforms, not just AMD.
| # | Fix | Description |
|---|-----|-------------|
| 1 | DMAR iterator fix | `type_bytes` renamed to `len_bytes` bug fix + `len < 4` guard |
| 2 | DMAR init re-enabled | Safe on AMD (no DMAR table = early return, no crash) | | 2 | DMAR parser/runtime safety fixes | Iterator/length guards were repaired so the DMAR carrier no longer crashes merely by existing; this does **not** mean active `acpid` startup ownership was re-established |
| 3 | DMAR not wired into acpid startup | DMAR module present in `dmar/mod.rs` but not imported or called from `main.rs`; this removes active startup ownership from `acpid`, but does not yet establish a clean Intel runtime owner |
| 4 | FADT shutdown | `acpi_shutdown()` using PM1a/PM1b CNT_BLK writes with `\_S5` sleep types |
| 5 | FADT reboot | `acpi_reboot()` using ACPI reset register via GenericAddress |
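For the FADT shutdown row, the value written to the PM1 control blocks can be sketched as below. The ACPI specification places `SLP_TYPx` in bits 10 to 12 and `SLP_EN` in bit 13 of the PM1 control register. The function names and the `has_pm1b` flag are hypothetical; this is not the `acpi_shutdown()` code from the patch, and real code also has to obtain the sleep types by evaluating `\_S5`.

```rust
/// Bit position of the 3-bit SLP_TYPx field in the PM1 control register.
const SLP_TYP_SHIFT: u16 = 10;
/// SLP_EN: setting this bit triggers the sleep-state transition.
const SLP_EN: u16 = 1 << 13;

/// Builds the PM1 control value for one block from a `\_S5` sleep type.
/// Hypothetical helper; only the low 3 bits of `slp_typ` are meaningful.
pub fn pm1_sleep_value(slp_typ: u8) -> u16 {
    (((slp_typ & 0x7) as u16) << SLP_TYP_SHIFT) | SLP_EN
}

/// Returns the values for PM1a and, if the block exists, PM1b.
/// `has_pm1b` is assumed to reflect a non-empty PM1b_CNT_BLK address in the FADT.
pub fn pm1_shutdown_writes(slp_typa: u8, slp_typb: u8, has_pm1b: bool) -> (u16, Option<u16>) {
    let a = pm1_sleep_value(slp_typa);
    let b = if has_pm1b { Some(pm1_sleep_value(slp_typb)) } else { None };
    (a, b)
}
```

Keeping the value computation separate from the port or MMIO write makes the bit layout testable without hardware access.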
@@ -146,7 +154,7 @@ platforms, not just AMD.
| 8 | GenericAddress rename | `GenericAddressStructure` renamed to `GenericAddress` with `is_empty()`, `write_u8()` |
| 9 | Reboot wiring | `reboot_requested` flag in main.rs, scheme path detection |
| 10 | ivrs/mcfg removed | Broken stub references eliminated (deferred to P2+, handled by pcid) |
| 11 | Typed startup errors | `StartupError` enum covers all startup failure paths; no `panic!` on firmware-origin paths; ACPI-absent causes clean `exit(0)` | | 11 | Historical startup-hardening direction | Earlier patch work attempted `StartupError`-style handling, but active current-tree `acpid` still requires Wave 1 boot-path hardening; do **not** treat startup hardening as complete from this ledger alone |
| 12 | AML mutex real state | `AmlMutexState` with handle-based create/acquire/release; `FxHashMap<Handle, bool>` tracking; poisoned-state recovery |
| 13 | EC widened accesses | `read_bytes`/`write_bytes` implement u16/u32/u64 via per-byte transactions; `ensure_access` bounds-checks against u8 addressable range |
| 14 | kstop shutdown eventing | `main.rs` opens `/scheme/kernel.acpi/kstop` and subscribes via `RawEventQueue`; `redbear-sessiond` reads kstop and emits D-Bus `PrepareForShutdown` signal |
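The EC widened-access row (fix 13) can be illustrated with a small sketch. It assumes a byte-level bus trait and little-endian assembly of the bytes; the trait `EcByteBus`, the type `MemoryEc`, and the signatures shown are hypothetical and only borrow the name `ensure_access` from the table, not its real signature.

```rust
#[derive(Debug, PartialEq)]
pub enum EcError {
    OutOfRange,
    Timeout,
}

/// Hypothetical byte-level EC transport: one transaction per byte.
pub trait EcByteBus {
    fn read_byte(&mut self, addr: u8) -> Result<u8, EcError>;
}

/// Rejects accesses that would leave the 256-byte EC address space.
pub fn ensure_access(offset: usize, width: usize) -> Result<(), EcError> {
    match offset.checked_add(width) {
        Some(end) if width > 0 && end <= 256 => Ok(()),
        _ => Err(EcError::OutOfRange),
    }
}

/// Reads a 1..=8 byte value as consecutive byte transactions (little-endian, assumed).
pub fn read_wide<B: EcByteBus>(bus: &mut B, offset: usize, width: usize) -> Result<u64, EcError> {
    if width > 8 {
        return Err(EcError::OutOfRange);
    }
    ensure_access(offset, width)?;
    let mut value = 0u64;
    for i in 0..width {
        // Any per-byte failure (for example a timeout) aborts the whole read.
        let byte = bus.read_byte((offset + i) as u8)?;
        value |= (byte as u64) << (8 * i);
    }
    Ok(value)
}

/// In-memory stand-in for an EC, used only for illustration.
pub struct MemoryEc(pub [u8; 256]);

impl EcByteBus for MemoryEc {
    fn read_byte(&mut self, addr: u8) -> Result<u8, EcError> {
        Ok(self.0[addr as usize])
    }
}
```

Propagating the first per-byte error, rather than substituting a default, matches the plan's rule that an EC timeout should surface as a failure instead of a fabricated value.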
+427 -163
@@ -7,37 +7,49 @@ release-grade complete**.
What is real today:
- kernel early discovery and MADT/xAPIC/x2APIC bring-up are in place, - kernel early ACPI discovery exists and is used,
- `acpid` owns FADT shutdown/reboot, AML execution, DMI exposure, and ACPI power exposure, - MADT / APIC / HPET boot-baseline handling is real,
- `acpid` owns most runtime ACPI policy,
- `/scheme/kernel.acpi/kstop` shutdown eventing exists,
- `redbear-sessiond` consumes that shutdown-prep signal,
- IVRS / AMD-Vi ownership moved out of the broken `acpid` path and into `iommu`,
- `kstop` shutdown eventing exists and is integrated with `redbear-sessiond`. - MCFG-in-`acpid` was removed in favor of the `pcid /config` path,
- `hwd` now forwards `RSDP_ADDR` / `RSDP_SIZE` to `acpid` explicitly when those values are present,
- x86 userspace AML bootstrap now has a bounded BIOS RSDP search fallback when explicit handoff is absent,
- `/scheme/acpi/power` is backed by real AML-driven adapter / battery probing rather than a pure placeholder surface, even though it is still not trustworthy enough for stronger support claims.
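As a rough illustration of the bounded BIOS RSDP search mentioned in the list above: the ACPI specification places the RSDP on a 16-byte boundary, identified by the signature `RSD PTR ` and a checksum over its first 20 bytes. The sketch below scans an already-mapped byte slice under those rules. The function name `find_rsdp` is hypothetical, and this is not the `acpid` implementation; how the BIOS region is mapped is out of scope here.

```rust
const RSDP_SIGNATURE: &[u8; 8] = b"RSD PTR ";
/// Length of the ACPI 1.0 part of the RSDP, covered by the first checksum.
const RSDP_V1_LEN: usize = 20;

/// Returns true when all bytes sum to zero modulo 256.
fn checksum_ok(bytes: &[u8]) -> bool {
    bytes.iter().fold(0u8, |sum, b| sum.wrapping_add(*b)) == 0
}

/// Scans `region` on 16-byte boundaries for a checksum-valid RSDP.
/// The search is bounded by the slice length, so it cannot run past the
/// region the caller chose to map. Returns the offset of the first match.
pub fn find_rsdp(region: &[u8]) -> Option<usize> {
    let mut offset = 0;
    while offset + RSDP_V1_LEN <= region.len() {
        let candidate = &region[offset..offset + RSDP_V1_LEN];
        if candidate.starts_with(RSDP_SIGNATURE) && checksum_ok(candidate) {
            return Some(offset);
        }
        offset += 16;
    }
    None
}
```

A signature match with a bad checksum is skipped rather than trusted, which keeps the fallback from handing a corrupt pointer to the AML bootstrap.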
What is still open:
- sleep-state support beyond `\_S5`, - `acpid` startup is not yet fully hardened,
- AML portability and runtime robustness on real firmware, - userspace AML bootstrap no longer depends solely on `RSDP_ADDR` on x86, but the explicit boot-path handoff contract is still underdocumented and non-BIOS paths remain unresolved,
- clean ownership boundaries across kernel / `acpid` / IOMMU, - normal service ownership is still transitional: `hwd` and `acpid` live on the initfs boot path rather than under a stable long-lived rootfs service contract,
- bounded real-hardware validation on AMD, Intel, and at least one EC-backed platform. - AML readiness is still coupled to PCI registration timing,
- `hwd` still spawns `acpid` ad hoc and the non-ACPI `LegacyBackend` fallback is effectively a TODO no-op,
- failed `/scheme/acpi/register_pci` handoff now uses a bounded retry path before degrading, but the degraded contract is still not strong enough to call Wave 1 closed,
- the `\_S5` / shutdown path is not yet trustworthy enough to call robust,
- `/scheme/acpi/power` is still not a trustworthy runtime power surface,
- sleep-state support beyond `S5` is incomplete,
- Intel DMAR runtime ownership is still unresolved,
- bounded bare-metal validation remains too thin for release-grade claims.
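The bounded retry before degrading, as described for the `/scheme/acpi/register_pci` handoff in the list above, might be shaped like the following sketch. All names here (`Handoff`, `register_with_retry`) are hypothetical, and the attempt count and delay are parameters rather than values taken from the tree.

```rust
use std::time::Duration;

/// Outcome of a bounded handoff attempt.
#[derive(Debug, PartialEq)]
pub enum Handoff<E> {
    Registered,
    /// The caller continues without ACPI integration, but knows why.
    Degraded { attempts: u32, last_error: E },
}

/// Calls `try_register` up to `max_attempts` times (at least once),
/// sleeping `delay` between failures, then degrades explicitly.
pub fn register_with_retry<E>(
    max_attempts: u32,
    delay: Duration,
    mut try_register: impl FnMut() -> Result<(), E>,
) -> Handoff<E> {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match try_register() {
            Ok(()) => return Handoff::Registered,
            Err(err) if attempts >= max_attempts => {
                return Handoff::Degraded { attempts, last_error: err };
            }
            Err(_) => std::thread::sleep(delay),
        }
    }
}
```

Returning the attempt count and the last error makes the degraded state observable, which is the part of the contract the plan still calls too weak.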
This document is therefore a **ULW execution plan** for turning the current ACPI stack from This document is the execution plan for turning the current ACPI stack from historical bring-up
historical bring-up success into a subsystem that is honest, maintainable, and release-grade. success into a subsystem that is correct under failure, explicit about ownership, honest in its
status claims, and backed by bounded runtime evidence.
## Purpose
This plan does **not** replace `local/docs/ACPI-FIXES.md`.
- `local/docs/ACPI-FIXES.md` remains the historical ledger for P0 ACPI bring-up and the current - `local/docs/ACPI-FIXES.md` remains the historical P0 bring-up ledger and implementation snapshot.
table-by-table implementation snapshot. - This file is the forward plan for correctness hardening, ownership cleanup, consumer integration,
- This file is the forward execution plan for closing the remaining ACPI gaps in correctness, and validation closure.
ownership clarity, consumer integration, and validation trust.
The goal is not to maximize the number of parsed ACPI tables. The goal is to make the Red Bear ACPI The goal is not to maximize the number of parsed ACPI tables. The goal is to make the ACPI stack:
stack:
- correct under bad firmware,
- explicit about who owns what,
- observable when it fails,
- and validated enough that status claims are evidence-backed rather than inferred. - honest about what is implemented versus what is validated.
## Scope
@@ -46,7 +58,7 @@ This plan covers the Red Bear ACPI stack and its direct consumers:
- kernel ACPI discovery and early platform setup,
- `acpid` as the main ACPI / AML / FADT / DMI / power daemon,
- `iommu` as the IVRS / AMD-Vi runtime owner,
- `pcid` and `/config` as the PCI config-space path replacing broken MCFG-in-`acpid` stubs, - `pcid` and `/config` as the PCI config-space path,
- DMI-backed quirks flowing through `acpid` and `redox-driver-sys`,
- ACPI consumers such as `redbear-sessiond`, `redbear-info`, and downstream services.
@@ -62,6 +74,7 @@ Read these alongside this plan:
- `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md`
- `local/docs/IOMMU-SPEC-REFERENCE.md`
- `local/docs/QUIRKS-SYSTEM.md`
- `local/docs/LINUX-BORROWING-RUST-IMPLEMENTATION-PLAN.md`
- `docs/02-GAP-ANALYSIS.md`
## Evidence Model
@@ -74,8 +87,8 @@ This plan uses five evidence buckets and does **not** treat them as equivalent:
- **runtime-validated** — behavior has been exercised successfully in boot or runtime
- **negative-result-documented** — failures and platform gaps are explicitly recorded
This distinction matters because the current ACPI stack has already crossed the bring-up threshold, The current ACPI stack has already crossed the bring-up threshold, but there is still meaningful
but still has meaningful distance between **implemented**, **robust**, and **trusted**. distance between **implemented**, **robust**, and **trusted**.
## Status Vocabulary
@@ -95,28 +108,32 @@ bounded-hardware, or release-grade completeness.
### Strong today
- Kernel RSDP / RSDT / XSDT / MADT handling is sufficient for current boot bring-up.
- `acpid` owns FADT parsing, AML integration, DMI exposure, and ACPI-backed power-state exposure. - Kernel ACPI export is intentionally narrow: `rxsdt` and `kstop` are real and used.
- `acpid` startup uses typed `StartupError` and clean exits for several boot-critical failure paths. - `acpid` owns FADT parsing, AML integration, DMI exposure, and ACPI scheme surfaces.
- AML mutex state is real-tracked in `aml_physmem.rs`, not placeholder-only.
- EC width access is implemented via byte-transaction sequences for widened reads and writes.
- IVRS ownership was removed from the broken `acpid` stub path and moved into the `iommu` daemon.
- MCFG handling was removed from `acpid` and replaced with the `pcid /config` path.
- Shutdown eventing via `/scheme/kernel.acpi/kstop` is implemented and consumed by
`redbear-sessiond`.
- `acpid` now models `S1` / `S3` / `S4` / `S5` explicitly in userspace, and the current `_S5` - AML mutex state is real-tracked in `aml_physmem.rs`, not placeholder-only.
shutdown path routes through that model instead of a special-case magic value. - EC width access is implemented via byte-transaction sequences for widened reads and writes.
- `power_snapshot()` performs real AML-backed adapter / battery discovery and the ACPI scheme only exposes `/scheme/acpi/power` when that snapshot path succeeds.
### Weak today
- Sleep-state transitions beyond `\_S5` are unsupported. - `acpid` startup still contains active panic-grade `expect` paths.
- userspace AML bootstrap now has an explicit handoff path plus x86 BIOS fallback, but the producer side of that contract is still underdocumented and non-BIOS fallback remains unresolved.
- service lifecycle is still transitional: `hwd` and `acpid` are primarily initfs-owned and `acpid` is still spawned ad hoc rather than by an explicit long-lived rootfs unit.
- `\_S5` derivation currently depends on AML readiness that is still gated on PCI registration.
- `hwd` owns an ad hoc `acpid` spawn path, while `LegacyBackend` fallback is still a TODO no-op rather than a meaningful degraded probe path.
- `pcid` can continue without ACPI integration after a bounded retry window, so AML readiness still transitions from transient-not-ready to durable degraded mode without a stronger recovery contract.
- post-PCI AML bootstrap failure is now surfaced as an explicit error instead of a quietly empty symbol surface, but that path still needs broader boot-path proof.
- `set_global_s_state()` is effectively `S5`-only.
- Sleep eventing is unsupported.
- `SLP_TYPb` remains incomplete for broader sleep-state handling.
- Non-`S5` sleep targets are now represented explicitly, but they remain groundwork-only and do not - `power_snapshot()` exists, but its bootstrap preconditions and runtime evidence are still too weak to justify stronger `/scheme/acpi/power` trust claims.
imply implemented suspend/resume support yet.
- AML init order is still tied to PCI FD registration timing.
- Some physmem / opregion failure paths are still not explicit enough.
- DMAR remains orphaned in `acpid` source: present, not wired, not fully transferred.
- Repo status language can still blur “implemented” vs “validated”. - Repo status language can still blur “implemented” versus “validated”.
- Bare-metal validation is too thin to justify release-grade claims.
## Ownership Model
@@ -127,9 +144,10 @@ The long-term ownership split should be:
|---|---|---|
| RSDP / RSDT / XSDT early discovery | Kernel | implemented |
| MADT / HPET / early unavoidable platform setup | Kernel | implemented, broader scope still transitional |
| FADT parsing, `\_S5`, PM register writes, reboot | `acpid` | implemented | | FADT parsing, `\_S5`, PM register writes, reboot | `acpid` | implemented, robustness still partial |
| AML execution and opregion handling | `acpid` | implemented, robustness still partial |
| DMI exposure and ACPI power surfaces | `acpid` | implemented | | DMI exposure | `acpid` | implemented |
| ACPI runtime power surface | `acpid` | transitional / incomplete |
| IVRS / AMD-Vi runtime handling | `iommu` | implemented |
| DMAR / Intel VT-d runtime handling | future Intel IOMMU owner | transitional / not fully assigned |
| PCI config-space access | `pcid` | implemented |
@@ -143,35 +161,55 @@ Important ownership truth:
- Do **not** describe Intel DMAR ownership as fully complete until the orphaned `acpid` carrier is
removed or a real Intel runtime owner is implemented and validated.
## Degraded-Mode Contract ## Current Runtime Contract
The ACPI stack must distinguish between **fatal**, **degradable**, and **out-of-scope** failures.
| Condition | Expected behavior today | Classification | | Condition | Expected behavior target | Classification |
|---|---|---|
| ACPI absent / empty root table | `acpid` exits cleanly without ACPI services | degradable |
| Bad SDT checksum | warn, continue best-effort where supported | degradable |
| Bad table length / malformed table | behavior varies too much today; must be normalized | open contract | | Bad table length / malformed table | deterministic reject or degrade policy | open contract |
| AML init failure | `acpid` exits, ACPI scheme unavailable | currently fatal | | Missing or unproven explicit `RSDP_ADDR` producer for userspace AML | kernel ACPI may still boot and x86 AML now has a bounded BIOS fallback, but the explicit producer contract remains incomplete from the repo-visible boot path | open contract |
| AML init failure | explicit failure, not panic | currently too fragile |
| Failed `/scheme/acpi/register_pci` handoff | boot degrades without full ACPI integration after a bounded retry window, but the degraded contract still lacks stronger recovery semantics | degradable |
| ACPI backend fallback to legacy probing | degraded hardware discovery should still be useful, but current legacy fallback is effectively a no-op | known gap |
| EC timeout | AML error path should surface failure, not fabricate success | degradable |
| Missing `\_S5` | shutdown path cannot use PM registers | degradable if fallback exists | | Missing `\_S5` | shutdown path cannot use PM registers | degradable only if failure is explicit |
| Sleep-state transition request | unsupported today | known gap | | Sleep-state transition request beyond `S5` | unsupported today | known gap |
| Missing `kstop` path | no kernel-orchestrated shutdown event contract | fatal for that integration path |
| Missing DMAR on Intel | no Intel VT-d runtime | degradable for non-IOMMU boot |
| Missing IVRS on AMD | no AMD-Vi runtime | degradable for non-IOMMU boot |
Wave 1 must convert the still-fuzzy cases into explicit, table-specific policy. Wave 0 and Wave 1 must turn the still-fuzzy cases into explicit policy.
## ULW Execution Rules ## Execution Rules
These rules govern all work from this plan:
1. **No hidden status inflation.** Status words must match evidence.
2. **No ownership moves without a handoff contract.** “Not wired” is not the same as “cleanly moved.”
3. **No validation laundering.** QEMU success is not bare-metal success.
4. **No Wave 5 shortcuts.** Validation cannot substitute for unfinished architecture. 4. **No runtime fake-success paths.** Empty defaults and fabricated values must not masquerade as real support.
5. **No cross-wave dependency drift.** Later waves must not silently depend on work that was never 5. **No cross-wave dependency drift.** Later waves must not silently depend on work that was never formalized earlier.
formalized in earlier waves.
## Phase Overview Matrix
| Wave | Theme | Current status | Main blocker | Primary closure signal |
|---|---|---|---|---|
| Wave 0 | Contracts / truthfulness | partially complete | doc drift across adjacent ACPI-facing docs | one canonical vocabulary and ownership story across the repo |
| Wave 1 | Startup hardening / parser policy | partially complete | boot-path contract gaps (explicit `RSDP_ADDR` producer ownership, ad hoc `acpid` spawn) plus remaining panic-grade startup and fault paths | firmware-origin startup failures are bounded and typed and AML bootstrap preconditions are explicit |
| Wave 2 | AML ordering / shutdown / sleep scope | partially complete | shutdown/reboot result semantics and broader runtime proof still remain incomplete | deterministic `\_S5` derivation and bounded shutdown behavior |
| Wave 3 | Honest ACPI power surface | open | current power reporting is real but still provisional and under-validated | `/scheme/acpi/power` exposes only behavior that the runtime evidence can honestly support |
| Wave 4 | AML physmem / EC / runtime fault handling | partially complete | placeholder-like runtime error behavior remains in places | no correctness-critical fabricated runtime values |
| Wave 5 | Ownership cleanup / kernel contract | open | DMAR still orphaned and kernel/userspace contract still implicit | explicit long-term ownership map with no orphan carriers |
| Wave 6 | Consumer integration / observability | partially complete | consumers still rely on uneven status surfaces | shutdown/event/power consumers describe and observe reality honestly |
| Wave 7 | Validation closure / release gates | open | bounded evidence set still too thin | release claims backed by a bounded matrix and negative-result capture |
The waves are intentionally ordered. Wave 0 defines truth. Wave 1 makes boot behavior survivable.
Wave 2 fixes the most dangerous runtime correctness problems. Wave 3 stops downstream services from
depending on misleading power semantics. Waves 4 through 6 harden the remaining runtime edges and ownership
boundaries. Wave 7 is where the stronger claims are either earned or denied.
## Wave 0 — Contracts, truthfulness, and degraded-mode policy
@@ -181,18 +219,20 @@ Establish one canonical answer to:
1. who owns what,
2. what counts as degraded but acceptable,
3. and what ACPI status words mean. 3. what ACPI status words mean,
4. and what current ACPI eventing actually covers.
### Why this wave is first
Without a contract, all later hardening work turns into undocumented rewrites and docs drift. Without a contract, later hardening work turns into undocumented rewrites and docs drift.
### Primary files
- `local/docs/ACPI-FIXES.md`
- this file
- `HARDWARE.md`
- `docs/02-GAP-ANALYSIS.md`
- `README.md` and related status surfaces if needed - related status surfaces as needed
### Dependencies
@@ -203,23 +243,50 @@ Without a contract, all later hardening work turns into undocumented rewrites an
- one normalized ACPI vocabulary,
- one degraded-mode contract,
- one canonical ownership statement,
- one explicit statement that current eventing is shutdown-focused,
- removal of doc language that implies subsystem completeness without evidence.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W0.1 | Vocabulary normalization | All ACPI-facing docs use the same status words for implemented / transitional / known gap | grep review across ACPI docs shows no conflicting support language |
| W0.2 | Ownership statement | One canonical statement for kernel / `acpid` / `iommu` / future DMAR ownership | `ACPI-IMPROVEMENT-PLAN.md`, `ACPI-FIXES.md`, and `IOMMU-SPEC-REFERENCE.md` agree |
| W0.3 | Eventing scope truthfulness | `kstop` and shutdown-only semantics become explicit everywhere they are summarized | `DBUS-INTEGRATION-PLAN.md`, `DESKTOP-STACK-CURRENT-STATUS.md`, and `AGENTS.md` stay aligned |
| W0.4 | Evidence-carrier cleanup | validation logs are treated as evidence carriers, not support-policy sources | `BAREMETAL-LOG.md` and `HARDWARE.md` no longer overclaim support |
### Specific tasks
1. Normalize ACPI status language across the canonical plan, historical ledger, hardware summary, and
public status summaries.
2. Keep `kstop` and shutdown-only eventing explicit anywhere login1, D-Bus, or desktop consumers
summarize ACPI behavior.
3. Keep DMAR ownership language transitional until a concrete Intel runtime owner exists.
4. Keep validation logs framed as evidence carriers, not as the source of support policy.
5. Reject any doc wording that implies startup hardening, honest power reporting, or full sleep
lifecycle support before those waves actually close.
### Verification
- documentation review only,
- no contradictory ownership claims across ACPI docs,
- no bare “complete” wording without scope. - no bare “complete” wording without scope,
- no doc claim of startup hardening that the active code does not support.
### Exit criteria
- one canonical ownership statement exists,
- one degraded-mode matrix exists,
- all top-level ACPI docs use the same vocabulary. - all top-level ACPI docs use the same vocabulary,
- current shutdown-only eventing scope is explicit.
### Current status
- partially complete - overall: partially complete
- W0.1 Vocabulary normalization — substantially complete
- W0.2 Ownership statement — substantially complete
- W0.3 Eventing scope truthfulness — substantially complete
- W0.4 Evidence-carrier cleanup — partially complete; core carriers are aligned, but future ACPI-facing summaries must keep using this vocabulary
## Wave 1 — Boot-path hardening and parser strictness
@@ -232,6 +299,13 @@ Remove catastrophic or silent failure behavior from boot-critical ACPI initializ
- `recipes/core/base/source/drivers/acpid/src/main.rs`
- `recipes/core/base/source/drivers/acpid/src/acpi.rs`
- `recipes/core/base/source/drivers/acpid/src/scheme.rs`
- `recipes/core/base/source/drivers/hwd/src/main.rs`
- `recipes/core/base/source/drivers/hwd/src/backend/acpi.rs`
- `recipes/core/base/source/drivers/hwd/src/backend/legacy.rs`
- `recipes/core/base/source/init.initfs.d/40_hwd.service`
- `recipes/core/base/source/init/src/service.rs`
- `recipes/core/base/source/bootstrap/src/exec.rs`
- `recipes/core/kernel/source/src/scheme/sys/mod.rs`
- `recipes/core/kernel/source/src/acpi/mod.rs`
- kernel ACPI submodules as needed
@@ -242,45 +316,65 @@ Remove catastrophic or silent failure behavior from boot-critical ACPI initializ
### Deliverables
- startup paths are typed and explicit,
- AML bootstrap preconditions are explicit and satisfied by an in-tree handoff path or are clearly documented as unresolved,
- boot-path ownership between init, `hwd`, `acpid`, and `pcid` is explicit enough that degraded behavior is diagnosable,
- table rejection policy is documented per table class,
- parser observability is strong enough to reconstruct failures,
-- degraded boot succeeds for all conditions classified as degradable.
+- degraded boot succeeds for all conditions classified as degradable,
- no active firmware-origin startup path still depends on panic-grade behavior.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W1.1 | Startup failure typing | `acpid` startup paths classify clean exit vs fatal vs degraded continue | startup logs and code review show no firmware-path `expect()` dependence |
| W1.2 | Table policy definition | SDT/FADT/root-table reject/warn/degrade rules are written down and implemented | malformed-table tests match the documented policy |
| W1.3 | Parser observability | accepted/rejected tables are logged with enough detail to diagnose boot failures | bounded bad-table boots produce reconstructable logs |
| W1.4 | Degraded boot proof | ACPI-bad but degradable boots continue without panicking | one bounded AMD and one bounded Intel degraded-path proof |
| W1.5 | AML bootstrap contract | the source of `RSDP_ADDR` / `RSDP_SIZE` is made explicit or the contract is replaced with a documented in-tree alternative; x86 fallback remains bounded and honest | boot-path docs, init wiring, and `acpid` startup code agree on how AML bootstrap happens |
### Specific tasks
1. Finish replacing panic-grade startup behavior in active firmware-origin paths.
-2. Define table-specific reject / warn / degrade / fail rules.
-3. Log accepted and rejected tables with enough evidence to debug failures.
+2. Define and validate the userspace AML bootstrap contract, including whether `RSDP_ADDR` / `RSDP_SIZE` remains the intended path.
+3. Define table-specific reject / warn / degrade / fail rules.
4. Log accepted and rejected tables with enough evidence to debug failures.
5. Normalize `acpid` startup into clean exit, fatal error, and degraded-continue classes.
6. Make the boot-path ownership between init, `hwd`, `acpid`, and `pcid` explicit enough that degraded behavior is diagnosable.
### Verification
- malformed checksum / truncated-length tests,
-- QEMU validation with intentionally damaged tables if feasible,
+- QEMU validation with intentionally damaged tables using a documented bounded harness or a retained negative-result record,
- boot-path evidence showing where AML bootstrap parameters come from or an explicit retained blocker stating that the producer remains unresolved,
- one bounded AMD hardware boot recheck,
- one bounded Intel hardware boot recheck,
-- evidence captured in `local/docs/BAREMETAL-LOG.md` or its successor.
+- evidence captured in `local/docs/BAREMETAL-LOG.md`.
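The malformed-checksum and truncated-length checks listed above can be exercised against a small validator. The following is a minimal sketch, not the `acpid` or kernel implementation: `SdtCheck`, `check_sdt`, and `make_table` are hypothetical names introduced here. It relies only on the standard ACPI SDT header layout (36-byte header, little-endian `Length` at offset 4, checksum byte at offset 9, all covered bytes summing to zero modulo 256).

```rust
/// Result of validating one ACPI system description table (SDT).
/// Hypothetical type for illustration; not taken from the `acpid` source.
#[derive(Debug, PartialEq, Eq)]
pub enum SdtCheck {
    Ok,
    /// Fewer bytes were provided than the 36-byte standard header.
    TruncatedHeader { have: usize },
    /// The header's `Length` is smaller than a header or larger than the mapped bytes.
    BadLength { claimed: usize, have: usize },
    /// The bytes covered by `Length` do not sum to zero modulo 256.
    BadChecksum { sum: u8 },
}

/// Size of the standard ACPI SDT header in bytes.
pub const SDT_HEADER_LEN: usize = 36;

/// Validates header presence, the `Length` field, and the byte checksum.
pub fn check_sdt(bytes: &[u8]) -> SdtCheck {
    if bytes.len() < SDT_HEADER_LEN {
        return SdtCheck::TruncatedHeader { have: bytes.len() };
    }
    // `Length` is a little-endian u32 at offset 4 and covers the whole table.
    let claimed = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]) as usize;
    if claimed < SDT_HEADER_LEN || claimed > bytes.len() {
        return SdtCheck::BadLength { claimed, have: bytes.len() };
    }
    let sum = bytes[..claimed].iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    if sum == 0 {
        SdtCheck::Ok
    } else {
        SdtCheck::BadChecksum { sum }
    }
}

/// Test helper (hypothetical): builds a table whose checksum is valid.
pub fn make_table(signature: &[u8; 4], body: &[u8]) -> Vec<u8> {
    let len = SDT_HEADER_LEN + body.len();
    let mut table = vec![0u8; len];
    table[..4].copy_from_slice(signature);
    table[4..8].copy_from_slice(&(len as u32).to_le_bytes());
    table[SDT_HEADER_LEN..].copy_from_slice(body);
    let sum = table.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    // The checksum byte sits at offset 9 and makes the total sum zero.
    table[9] = 0u8.wrapping_sub(sum);
    table
}
```

Returning a distinct variant per failure keeps the reject / warn / degrade decision with the caller, so the per-table-class policy can stay in one place instead of being scattered across the parser.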
### Exit criteria
- no unjustified `panic!/expect()` remains on firmware-origin startup paths,
- AML bootstrap preconditions are explicit and consistent with the in-tree boot path,
- malformed-table decisions are deterministic and documented,
- degraded boot behavior matches Wave 0 classification.
### Current status
-- partially complete
+- overall: partially complete
- W1.1 Startup failure typing — partially complete
- W1.5 AML bootstrap contract — partially complete
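Slice W1.1 asks for startup paths that classify clean exit, fatal error, and degraded continue. A minimal sketch of that split follows. Every name here (`StartupFault`, `StartupClass`, `classify`, `exit_code`) is hypothetical, and the mapping from fault to class is an assumed policy for illustration, not the policy `acpid` implements.

```rust
/// Hypothetical failure causes on the firmware-origin startup path.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StartupFault {
    /// No ACPI root table was offered at all.
    NoRootTable,
    /// The root table was present but failed validation.
    RootTableRejected(String),
    /// AML bootstrap parameters such as `RSDP_ADDR` were not provided.
    AmlBootstrapMissing,
    /// The ACPI scheme could not be registered.
    SchemeRegistrationFailed(String),
}

/// The three startup classes named in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StartupClass {
    CleanExit,
    DegradedContinue,
    Fatal,
}

/// Assumed policy: a missing root table is a clean exit, missing AML
/// bootstrap data degrades, and everything else is fatal.
pub fn classify(fault: &StartupFault) -> StartupClass {
    match fault {
        StartupFault::NoRootTable => StartupClass::CleanExit,
        StartupFault::AmlBootstrapMissing => StartupClass::DegradedContinue,
        StartupFault::RootTableRejected(_) | StartupFault::SchemeRegistrationFailed(_) => {
            StartupClass::Fatal
        }
    }
}

/// Process exit code for a class, or `None` when the daemon keeps running.
pub fn exit_code(class: StartupClass) -> Option<i32> {
    match class {
        StartupClass::CleanExit => Some(0),
        StartupClass::DegradedContinue => None,
        StartupClass::Fatal => Some(1),
    }
}
```

Because the classification is an exhaustive `match`, adding a new fault variant forces a decision about its class at compile time, which is one way to keep `expect()` off the firmware path.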
-## Wave 2 — AML, opregions, EC, and power-state correctness
+## Wave 2 — AML ordering, shutdown correctness, and sleep-state scope
### Goal
-Close the biggest runtime-correctness gaps in the `acpid` layer.
+Close the highest-risk runtime-correctness gaps in the `acpid` layer.
### Primary files
- `recipes/core/base/source/drivers/acpid/src/acpi.rs`
-- `recipes/core/base/source/drivers/acpid/src/aml_physmem.rs`
-- `recipes/core/base/source/drivers/acpid/src/ec.rs`
+- `recipes/core/base/source/drivers/acpid/src/sleep.rs`
+- `recipes/core/base/source/drivers/acpid/src/scheme.rs`
### Dependencies
@@ -288,71 +382,211 @@ Close the biggest runtime-correctness gaps in the `acpid` layer.
### Deliverables
-- real AML synchronization semantics,
-- explicit physmem / opregion failure behavior,
- deterministic AML init order,
- deterministic `\_S5` derivation,
- explicit shutdown success/failure behavior,
- explicit reboot correctness and fallback behavior,
- explicit sleep-state scope,
-- honest EC behavior bounds.
+- honest `SLP_TYPb` status.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W2.1 | `\_S5` derivation timing | `\_S5` is derived at a deterministic valid point instead of accidental fallback timing | logs show when `\_S5` was computed and from what readiness state |
| W2.2 | AML readiness contract | documented split or sequencing between early AML and PCI-dependent AML | code path and docs agree on when AML is considered ready |
| W2.3 | Shutdown and reboot result semantics | shutdown and reboot paths return bounded results, log failures explicitly, and keep fallback behavior honest | QEMU + bounded real-hardware shutdown/reboot proof with failure-path logs |
| W2.4 | Sleep-scope truthfulness | non-`S5` support is either implemented in bounded form or kept explicitly deferred | no docs or APIs imply broader sleep lifecycle support prematurely |
### Specific tasks
-1. Document and stress AML mutex timeout semantics.
-2. Remove silent correctness-critical physmem failure paths.
-3. Finish `AmlSymbols` initialization contract; stop tying AML readiness to fragile PCI timing.
-4. Decide whether sleep support is in-scope now or explicitly deferred.
-5. If in-scope now, implement and validate the missing sleep-state pieces, including `SLP_TYPb`.
+1. Fix the `\_S5` ordering bug by **primarily** recomputing `\_S5` after PCI registration, using an early-AML split only if the recompute path proves insufficient on bounded hardware.
+2. Document and enforce that AML readiness contract explicitly.
+3. Make `set_global_s_state()` return explicit outcomes instead of relying on write-then-spin behavior.
+4. Bound shutdown failure semantics when PM1 writes do not power off the machine.
+5. Document and validate reboot ownership, including reset-register and keyboard-controller fallback behavior.
6. Decide whether non-`S5` sleep support is in scope now or explicitly deferred.
7. If deferred, keep the scope truthful in code and docs.
### Verification
-- targeted AML method execution tests,
+- targeted AML method execution checks,
- shutdown / reboot proof in QEMU and bounded hardware,
-- EC timeout and error-path tests,
-- concurrent ACPI scheme reads while AML methods run,
-- at least one EC-backed platform check if available.
+- induced AML-not-ready path tests,
+- log proof of when `\_S5` was derived,
+- one bounded Intel and one bounded AMD shutdown/reboot recheck.
### Exit criteria
-- AML synchronization is no longer placeholder-grade,
-- physmem failures do not silently fabricate correctness-critical values,
- AML initialization order is reproducible and documented,
-- sleep-state handling is either implemented or explicitly bounded as a known gap,
-- EC behavior is implemented or honestly constrained.
+- `\_S5` is no longer derived through fragile fallback timing,
+- shutdown and reboot failures do not degrade into panic or silent hang only,
- sleep-state handling is either implemented or explicitly bounded as a known gap.
### Current status
-- partially complete
+- overall: partially complete
- W2.1 `\_S5` derivation timing — partially complete
- W2.2 AML readiness contract — partially complete
- W2.3 Shutdown and reboot result semantics — partially complete
- current-tree behavior now defers `\_S5` cleanly until PCI-backed AML readiness, surfaces pre-PCI shutdown as AML-not-ready, preserves shutdown dispatch details on non-completion, and treats reboot dispatch failure/returned reboot attempts as explicit non-success instead of silent success
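The behavior described above (deferring `\_S5` until PCI-backed AML readiness, reporting pre-PCI shutdown as AML-not-ready, and preserving dispatch details on non-completion) can be sketched as follows. This is an illustration under stated assumptions, not the `acpid` code: `PowerCtl`, `S5Values`, `ShutdownError`, and the closure-based register access are hypothetical. The PM1 control layout used (`SLP_TYP` in bits 10 to 12, `SLP_EN` in bit 13) is the standard ACPI one. The sketch writes only PM1a; `SLP_TYPb` handling is deliberately left out.

```rust
/// `SLP_TYPa` / `SLP_TYPb` values as read from the `\_S5` package.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct S5Values {
    pub slp_typa: u8,
    pub slp_typb: u8,
}

/// Explicit non-success outcomes of a shutdown request (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShutdownError {
    /// PCI registration has not happened, so AML is not considered ready.
    AmlNotReady,
    /// AML was ready but `\_S5` could not be evaluated.
    S5Missing,
    /// The PM1a write was issued but the machine kept running.
    NotCompleted { pm1a_value: u16, polls: u32 },
}

#[derive(Debug, Default)]
pub struct PowerCtl {
    pci_registered: bool,
    s5: Option<S5Values>,
}

impl PowerCtl {
    /// Recomputes `\_S5` at the one point where AML is known to be ready.
    pub fn on_pci_registered(&mut self, evaluate_s5: impl FnOnce() -> Option<S5Values>) {
        self.pci_registered = true;
        self.s5 = evaluate_s5();
    }

    /// Issues the PM1a write, then polls a bounded number of times
    /// instead of spinning forever.
    pub fn shutdown(
        &self,
        mut write_pm1a: impl FnMut(u16),
        mut still_running: impl FnMut() -> bool,
        max_polls: u32,
    ) -> Result<(), ShutdownError> {
        if !self.pci_registered {
            return Err(ShutdownError::AmlNotReady);
        }
        let s5 = self.s5.ok_or(ShutdownError::S5Missing)?;
        // SLP_TYP occupies bits 10..=12 and SLP_EN is bit 13.
        let pm1a_value = (u16::from(s5.slp_typa & 0x7) << 10) | (1 << 13);
        write_pm1a(pm1a_value);
        for _ in 0..max_polls {
            if !still_running() {
                return Ok(());
            }
        }
        Err(ShutdownError::NotCompleted { pm1a_value, polls: max_polls })
    }
}
```

Passing the register write and the liveness probe as closures is only a testing convenience here; it lets the induced AML-not-ready and non-completion paths be checked without hardware.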
-## Wave 3 — Ownership cleanup and kernel-surface reduction
+## Wave 3 — Honest runtime power surface
### Goal
-Move from transitional ownership to an architecture that can survive long-term maintenance.
+Stop exposing incomplete runtime power state as if it were implemented.
### Primary files
- `recipes/core/base/source/drivers/acpid/src/acpi.rs`
- `recipes/core/base/source/drivers/acpid/src/scheme.rs`
- downstream consumers such as `local/recipes/system/redbear-upower/source/src/main.rs`
### Dependencies
- Wave 2 runtime ordering and shutdown behavior stable enough that consumers can rely on ACPI state
### Deliverables
- an explicitly reduced and honest `/scheme/acpi/power` surface first,
- current `power_snapshot()` behavior is documented as real but provisional,
- consumer-visible distinction between unsupported, unavailable, and populated power state.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W3.1 | Power-surface decision | explicit primary path to reduce `/scheme/acpi/power` to an honest bounded surface before any expansion | docs and service code describe the same support level |
| W3.2 | Snapshot semantics | adapter/battery state becomes real or explicitly unavailable/unsupported | direct scheme reads show distinct responses for each state |
| W3.3 | Consumer honesty | `redbear-upower` and downstream docs stop overclaiming support | D-Bus/current-state docs match actual scheme behavior |
| W3.4 | Reporting consistency | all public summaries use the same bounded wording for ACPI-backed power | grep review shows no stale “bounded real” UPower claims |
### Specific tasks
1. Reduce or constrain the current `/scheme/acpi/power` surface so empty defaults do not masquerade as support.
2. Ensure downstream consumers can tell unsupported from currently unavailable.
3. Treat the current AML-backed adapter / battery enumeration as provisional until its bootstrap preconditions and bounded hardware evidence are strong enough to trust.
4. Keep all downstream status language pinned to the reduced surface until bounded runtime proof supports stronger claims.
### Verification
- scheme reads on supported and unsupported systems,
- downstream consumer checks,
- log review for unavailable and unsupported cases.
### Exit criteria
- `/scheme/acpi/power` no longer returns misleading empty-success behavior,
- consumers can distinguish unsupported from unavailable,
- power reporting claims in docs match the actual runtime surface.
### Current status
- open
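The distinction between unsupported, unavailable, and populated power state that Wave 3 calls for can be made explicit in the type that backs a scheme read. The sketch below is hypothetical: `PowerReading`, `BatteryState`, `render_battery`, and the `key=value` text encoding are all assumptions for illustration, not the current `/scheme/acpi/power` format.

```rust
/// Three-way result for a power query (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PowerReading<T> {
    /// The platform has no such device or the surface does not implement it.
    Unsupported,
    /// The device may exist, but its state cannot be read right now.
    Unavailable { reason: String },
    /// A real value was obtained.
    Populated(T),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BatteryState {
    pub percent: u8,
    pub charging: bool,
}

/// Renders a reading so that no state collapses into an empty success.
pub fn render_battery(reading: &PowerReading<BatteryState>) -> String {
    match reading {
        PowerReading::Unsupported => "status=unsupported\n".to_string(),
        PowerReading::Unavailable { reason } => {
            format!("status=unavailable\nreason={}\n", reason)
        }
        PowerReading::Populated(b) => {
            format!("status=ok\npercent={}\ncharging={}\n", b.percent, b.charging)
        }
    }
}
```

A consumer such as `redbear-upower` could then branch on the `status` line rather than inferring support from whether the payload happens to be empty.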
## Wave 4 — AML physmem, EC, and runtime fault handling
### Goal
Remove correctness-critical fake values and placeholder runtime behavior.
### Primary files
- `recipes/core/base/source/drivers/acpid/src/aml_physmem.rs`
- `recipes/core/base/source/drivers/acpid/src/ec.rs`
- `recipes/core/base/source/drivers/acpid/src/acpi.rs`
### Dependencies
- Wave 1 startup hardening complete
### Deliverables
- explicit physmem / opregion failure behavior,
- EC error paths that are typed and diagnosable,
- documented AML mutex and timeout semantics,
- runtime failures that propagate clearly to callers.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W4.1 | Physmem failure propagation | correctness-critical reads stop silently returning fabricated values | forced read-failure tests produce explicit errors |
| W4.2 | EC error typing | widened-access and timeout failures are surfaced consistently | EC timeout path tests and log review |
| W4.3 | AML mutex semantics | acquire/release/timeout behavior is documented and reflected in runtime behavior | concurrent AML scheme-read/eval checks stay understandable |
| W4.4 | Runtime fault observability | callers receive clear failure categories instead of placeholder success | operator-visible logs distinguish source and impact |
### Specific tasks
1. Audit `aml_physmem.rs` for all correctness-critical “log then fabricate 0” paths.
2. Convert correctness-critical failures into explicit propagated errors.
3. Finish EC error typing and document widened-access behavior.
4. Document AML mutex timeout behavior and actual guarantees.
### Verification
- induced physmem mapping/read failure tests,
- EC timeout path tests,
- concurrent AML scheme-read and AML-eval checks,
- one EC-backed machine sanity check or one retained documented blocker explaining why that proof is still absent.
### Exit criteria
- correctness-critical runtime paths do not silently fabricate values,
- EC behavior is implemented or explicitly bounded,
- AML synchronization behavior is documented and tested.
### Current status
- overall: partially complete
- W4.1 Physmem failure propagation — partially complete
- W4.2 EC error typing — partially complete
- W4.3 AML mutex semantics — substantially complete in tracked state, still needs clearer runtime-proof coverage
- W4.4 Runtime fault observability — open
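Slice W4.1 targets the "log then fabricate 0" pattern. One way to make a failed read impossible to mistake for a real zero is to return a `Result` from every read. The sketch below uses an in-memory stand-in for a mapped region; `PhysRead`, `PhysmemError`, `FakeRegion`, and `read_u32` are hypothetical names and do not describe the actual `aml_physmem.rs` interface.

```rust
/// Hypothetical error for a physical-memory read that could not be served.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PhysmemError {
    Unmapped { addr: u64 },
}

/// Minimal read interface; a real backend would map physical pages.
pub trait PhysRead {
    fn read_u8(&self, addr: u64) -> Result<u8, PhysmemError>;
}

/// In-memory stand-in for a mapped region, used only for tests.
pub struct FakeRegion {
    pub base: u64,
    pub bytes: Vec<u8>,
}

impl PhysRead for FakeRegion {
    fn read_u8(&self, addr: u64) -> Result<u8, PhysmemError> {
        match addr.checked_sub(self.base) {
            Some(off) if off < self.bytes.len() as u64 => Ok(self.bytes[off as usize]),
            _ => Err(PhysmemError::Unmapped { addr }),
        }
    }
}

/// Reads a little-endian u32 and propagates the first failing byte read
/// instead of substituting zero.
pub fn read_u32(mem: &impl PhysRead, addr: u64) -> Result<u32, PhysmemError> {
    let mut buf = [0u8; 4];
    for (i, slot) in buf.iter_mut().enumerate() {
        let byte_addr = addr
            .checked_add(i as u64)
            .ok_or(PhysmemError::Unmapped { addr })?;
        *slot = mem.read_u8(byte_addr)?;
    }
    Ok(u32::from_le_bytes(buf))
}
```

The error carries the exact failing address, which is the kind of detail the W4.4 observability slice would need in operator-visible logs.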
## Wave 5 — Ownership cleanup and kernel-surface reduction
### Goal
Move from transitional ownership to a durable architecture that can survive long-term maintenance.
### Primary files
- `recipes/core/kernel/source/src/acpi/mod.rs`
- kernel ACPI submodules as needed
- `recipes/core/kernel/source/src/scheme/acpi.rs`
- `recipes/core/base/source/drivers/acpid/src/acpi/dmar/mod.rs`
- `recipes/core/base/source/drivers/acpid/src/scheme.rs`
- `local/recipes/system/iommu/source/src/*`
### Dependencies
-- Wave 1 and Wave 2 are at least partially complete
+- Waves 1 and 2 are at least partially stable
### Deliverables
- a minimum kernel ACPI contract,
-- explicit handoff paths for table discovery and topology,
+- explicit handoff paths for topology and table consumers,
- DMAR no longer orphaned in `acpid`,
- ownership wording that matches the code.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W5.1 | Kernel contract write-down | explicit minimal kernel ACPI contract in docs/comments | kernel/export surfaces match the written contract |
| W5.2 | DMAR carrier cleanup | orphaned `acpid` DMAR carrier is explicitly deferred unless a real Intel runtime owner is ready in the same implementation slice | no doc claims a hidden owner that code does not implement |
| W5.3 | IOMMU ownership alignment | IVRS/DMAR ownership text across `iommu` and ACPI docs becomes stable | `ACPI-IMPROVEMENT-PLAN.md`, `IOMMU-SPEC-REFERENCE.md`, and Linux-borrowing plan agree |
| W5.4 | Regression containment | ownership cleanup does not break existing bring-up paths | before/after boot checks on AMD and Intel remain stable |
### Specific tasks
1. Define the minimum kernel ACPI surface that must remain in early boot.
-2. Document the userspace handoff contract for topology and table consumers.
-3. Remove or relocate the orphaned DMAR carrier in `acpid`.
-4. Do not claim Intel DMAR runtime ownership complete unless a real owner exists and is validated.
+2. Keep `rxsdt` and `kstop` as explicit exported contract until a real replacement exists.
+3. Treat explicit deferral of the orphaned DMAR carrier as the primary path until a real Intel runtime owner exists.
+4. Remove or relocate the orphaned `acpid` DMAR carrier only in the same change set that introduces and validates the replacement owner.
5. Do not claim Intel DMAR runtime ownership complete unless a real owner exists and is validated.
6. Preserve IVRS ownership in `iommu`.
### Verification
@@ -368,9 +602,9 @@ Move from transitional ownership to an architecture that can survive long-term m
### Current status
-- partially complete
+- open
-## Wave 4 — Consumer integration and eventing quality
+## Wave 6 — Consumer integration and eventing quality
### Goal
@@ -379,54 +613,76 @@ Make ACPI consumers correct, observable, and low-friction.
### Primary files
- `local/recipes/system/redbear-sessiond/source/src/acpi_watcher.rs`
- `recipes/core/base/source/drivers/acpid/src/main.rs`
- `recipes/core/base/source/drivers/acpid/src/scheme.rs`
- DMI / quirk consumers in `redox-driver-sys`
- reporting surfaces such as `redbear-info`
### Dependencies
-- Wave 2 runtime behavior is stable enough for downstream consumers to depend on it
+- Waves 2 through 4 stable enough that consumers can depend on ACPI behavior
### Deliverables
-- event-driven core power-session behavior where feasible,
+- shutdown-focused eventing quality as a required consumer contract,
- bounded DMI quirk authority,
- operator-facing observability strong enough to diagnose behavior,
- explicit treatment of unsupported sleep eventing if it remains deferred.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W6.1 | Shutdown consumer contract | `redbear-sessiond` and D-Bus docs describe shutdown-only behavior correctly | `PrepareForShutdown` stays current; `PrepareForSleep` stays future-only |
| W6.2 | DMI quirk authority | quirk precedence and bounds are documented for ACPI/DMI consumers | `QUIRKS-SYSTEM.md` and ACPI plan do not disagree |
| W6.3 | Operator observability | AML readiness, shutdown attempts, and power availability are diagnosable | log review and status outputs distinguish unsupported vs unavailable |
| W6.4 | Consumer wording discipline | adjacent docs stop translating provisional ACPI surfaces into “real” support claims | desktop/D-Bus/Qt status docs remain aligned with the canonical plan |
### Specific tasks
1. Keep shutdown eventing on `kstop` as the canonical shutdown signal.
-2. Improve consumer-facing observability for ACPI state and failures.
+2. Improve consumer-facing observability for AML readiness, PCI registration state, shutdown attempts, and power availability.
3. Define DMI quirk precedence and limits.
4. If sleep eventing remains out-of-scope, document that explicitly and consistently.
### Verification
-- repeated shutdown edge tests,
-- sleep-edge tests if sleep work is in scope,
+- repeated shutdown-edge tests,
+- race checks with multiple simultaneous consumers of `/scheme/acpi/*`,
- DMI quirk application checks on known systems,
-- race checks with multiple simultaneous consumers of `/scheme/acpi/*`.
+- log review that diagnoses unsupported versus unavailable behavior.
### Exit criteria
-- no unnecessary polling remains for core ACPI transitions where eventing is feasible,
+- no misleading consumer contract remains for core ACPI transitions,
- quirk precedence is documented,
- consumer-visible behavior is diagnosable from logs and status outputs.
### Current status
-- partially complete
+- overall: partially complete
- W6.1 Shutdown consumer contract — substantially complete
- W6.2 DMI quirk authority — partially complete
- W6.3 Operator observability — open
- W6.4 Consumer wording discipline — substantially complete
-## Wave 5 — Validation closure and release gates
+## Wave 7 — Validation closure and release gates
### Goal
Turn the current ACPI stack from bring-up evidence into release-grade trust.
### Primary files
- `local/docs/BAREMETAL-LOG.md`
- `HARDWARE.md`
- this file
- `docs/02-GAP-ANALYSIS.md`
- validation scripts such as `local/scripts/test-baremetal.sh` and bounded ACPI-related QEMU / runtime harnesses as they exist
### Dependencies
-- Waves 1 through 4 have produced stable behavior worth validating
+- Waves 1 through 6 have produced stable behavior worth validating
### Required validation matrix
@@ -436,7 +692,37 @@ At minimum:
- one modern AMD machine,
- one modern Intel machine,
- one platform that exercises EC-backed AML behavior,
-- malformed-table or degraded-mode evidence where feasible.
+- malformed-table or degraded-mode evidence, or a retained blocker entry explaining why that proof could not yet be produced.
### Required matrix fields
Each matrix entry should record, at minimum:
- date,
- platform name,
- firmware mode,
- profile / config used,
- kernel / patch baseline,
- key ACPI tables present,
- APIC mode,
- shutdown result,
- reboot result,
- DMI exposure,
- power-surface state,
- AML / EC failures,
- degraded behavior observed,
- evidence location (log, script output, photo, or captured artifact),
- final classification: implemented only / QEMU-validated / bounded real-hardware validated / failed.
### Repetition standard
This plan should treat one successful run as **initial evidence**, not closure.
- QEMU proof should be repeatable at least twice on the same bounded harness.
- Each bounded real-hardware class should have at least one named passing run and one retained
negative-or-regression note if failures were seen during bring-up.
- Gate B claims should rely on repeated evidence across more than one hardware class, not a single
lucky machine.
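The repetition standard above can be checked mechanically once matrix rows are recorded in a structured form. The following sketch is one possible reading of the rule, not a defined project tool: `MatrixEntry`, `PlatformClass`, `Classification`, `entry`, and `repetition_standard_met` are hypothetical, only a subset of the required matrix fields is modeled, and "more than one hardware class" is interpreted as at least two distinct real-hardware classes with a passing run.

```rust
use std::collections::BTreeSet;

/// Hardware classes used for counting (hypothetical subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PlatformClass {
    Qemu,
    AmdHardware,
    IntelHardware,
}

/// Final classification values named in the matrix field list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    ImplementedOnly,
    QemuValidated,
    BoundedRealHardwareValidated,
    Failed,
}

/// One matrix row, reduced to the fields this check needs.
#[derive(Debug, Clone)]
pub struct MatrixEntry {
    pub platform: String,
    pub class: PlatformClass,
    pub classification: Classification,
    pub evidence: String,
}

/// Convenience constructor for examples and tests.
pub fn entry(platform: &str, class: PlatformClass, classification: Classification) -> MatrixEntry {
    MatrixEntry {
        platform: platform.to_string(),
        class,
        classification,
        evidence: String::from("placeholder-evidence-pointer"),
    }
}

/// Assumed reading of the rule: at least two passing QEMU runs and
/// passing runs on at least two distinct real-hardware classes.
pub fn repetition_standard_met(entries: &[MatrixEntry]) -> bool {
    let qemu_passes = entries
        .iter()
        .filter(|e| {
            e.class == PlatformClass::Qemu && e.classification == Classification::QemuValidated
        })
        .count();
    let hardware_classes: BTreeSet<PlatformClass> = entries
        .iter()
        .filter(|e| {
            e.class != PlatformClass::Qemu
                && e.classification == Classification::BoundedRealHardwareValidated
        })
        .map(|e| e.class)
        .collect();
    qemu_passes >= 2 && hardware_classes.len() >= 2
}
```

Failed rows are kept in the input and simply do not count toward the totals, which matches the requirement that negative results stay visible instead of being deleted.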
### Deliverables
@@ -445,13 +731,23 @@ At minimum:
- explicit release gates for both boot-baseline and full ACPI claims,
- docs that distinguish implemented from validated.
### Execution slices
| ID | Work slice | Concrete output | QA evidence |
|---|---|---|---|
| W7.1 | Matrix carrier | one canonical bounded validation matrix exists | `BAREMETAL-LOG.md` holds named platform entries |
| W7.2 | Positive proof set | QEMU + AMD + Intel + EC-backed paths each have bounded proof entries | repeated runs recorded with dates and configs |
| W7.3 | Negative-result discipline | unresolved AML/EC/platform failures stay visible | negative results persist in logs/docs instead of disappearing |
| W7.4 | Release-gate enforcement | stronger ACPI claims are tied to explicit gate passage | summary docs do not exceed the evidence in the matrix |
### Specific tasks
-1. Publish the platform matrix in `local/docs/BAREMETAL-LOG.md` or its successor.
-2. Record for each platform: firmware mode, key ACPI tables, APIC mode, shutdown / reboot,
-DMI / power exposure, AML / EC failures, and notable degraded behavior.
+1. Publish the platform matrix in `local/docs/BAREMETAL-LOG.md`.
+2. Record for each platform: firmware mode, key ACPI tables, APIC mode, shutdown / reboot, DMI / power exposure, AML / EC failures, and notable degraded behavior.
3. Preserve negative results such as unsupported AML opcodes or platform-specific regressions.
4. Require evidence before any stronger ACPI completeness claim is made.
5. Keep a canonical evidence link or artifact pointer in each matrix row so support language can be traced back to an actual run.
6. Refuse Gate B wording unless the repeated-proof standard above is met.
### Verification
@@ -471,6 +767,22 @@ At minimum:
- open
## Recommended PR Sequence
Recommended order:
1. docs/status correction,
2. `acpid` startup hardening,
3. `\_S5` / AML ordering, shutdown, and reboot correctness,
4. honest `/scheme/acpi/power`,
5. AML physmem / EC hardening,
6. DMAR ownership cleanup,
7. kernel/userspace ACPI contract write-down,
8. eventing / consumer contract cleanup,
9. validation matrix and release gates.
This order intentionally follows the wave order: Wave 0 → Wave 1 → Wave 2 → Wave 3 → Wave 4 → Wave 5 → Wave 6 → Wave 7. If a single wave is split across multiple PRs, keep the wave ordering authoritative and treat sub-PR sequencing as an implementation detail rather than a competing plan order.
## Release Gates
### Gate A — Boot-Baseline ACPI Ready
@@ -481,7 +793,7 @@ Require:
- clean boot on bounded QEMU + AMD + Intel validation targets,
- working MADT / APIC initialization on those targets,
-- shutdown / reboot proof where supported,
+- working and bounded shutdown / reboot proof where supported,
- explicit degraded behavior for known firmware-bad cases,
- current docs that distinguish implemented from validated.
@@ -490,85 +802,37 @@ Require:
Do **not** claim this until all of the following are true:
- AML runtime behavior is stable across the bounded matrix,
- shutdown correctness is validated on bounded real hardware,
- sleep-state scope is implemented and validated or explicitly excluded from the release claim,
-- ownership boundaries are clean rather than merely transitional,
+- ownership boundaries are clean rather than transitional,
- consumer integration is observable and race-bounded,
- the platform matrix supports the stronger claim.
-## Upstream vs Red Bear Work Split
-### Prefer upstream for
-- generic `acpid` startup hardening,
-- AML mutex semantics,
-- `SLP_TYPb` completion,
-- EC error typing and generic width behavior,
-- reuse of parsed tables inside `acpid`,
-- DMAR leaving `acpid`,
-- kernel ACPI scope reduction TODOs,
-- generic parser quality in kernel ACPI modules.
-### Red Bear owns
-- honest status language,
-- bounded validation matrices and runbooks,
-- `redbear-sessiond` shutdown-consumer quality,
-- DMI quirk governance and integration policy,
-- patch carriers in `local/patches/*`,
-- coordination across `acpid`, `iommu`, `pcid`, and downstream consumers.
-## Sequencing Constraints
-1. **Wave 0 first** — architecture and wording must stop drifting.
-2. **Wave 1 before Wave 2** — runtime correctness should not sit on fragile startup behavior.
-3. **Wave 2 before Wave 4** — consumer contracts must rely on correct AML / power behavior.
-4. **Wave 3 after Waves 1 and 2 are partially stable** — ownership moves are risky on unstable
-behavior.
-5. **Wave 5 last** — validation closes work; it does not replace architecture.
## Main Risks
- stricter parser behavior may expose machines currently booting only by luck,
-- AML / EC changes may uncover hidden PCI-registration ordering assumptions,
+- AML ordering fixes may reveal hidden PCI-registration assumptions,
- power-surface honesty may break consumers assuming empty means supported,
- reducing kernel scope too early may regress early bring-up,
- careless DMAR cleanup may create Intel-only regressions,
-- DMI quirks can become a crutch if allowed to override runtime facts indiscriminately.
+- QEMU success may continue to hide bare-metal correctness gaps if validation stays too shallow.
-## Non-Goals
-- claiming sleep support that does not exist,
-- calling DMAR ownership “complete” while the orphaned `acpid` module still exists,
-- treating one-machine success as subsystem-level proof,
-- using Wave 5 validation language to hide unfinished ownership work.
-## Deliverable Order
-Recommended order:
-1. truth contract and doc normalization,
-2. startup hardening,
-3. AML / EC / power-state correctness,
-4. ownership cleanup,
-5. consumer / eventing quality,
-6. validation closure and release gate.
## Definition of Done
This plan is substantially complete only when:
-- ownership boundaries are explicit and not contradicted by the code,
-- boot-critical panic / silent-fallback paths are removed or justified,
-- AML and EC behavior are no longer TODO-grade,
-- DMAR and IVRS ownership are no longer described ambiguously,
-- consumers are event-driven or explicitly bounded where eventing is not feasible,
+- startup failure behavior is bounded and non-panic-grade,
+- `\_S5` shutdown behavior is deterministic and validated,
+- exported power and event surfaces are honest,
+- kernel/userspace ownership boundaries are explicit and not contradicted by the code,
+- DMAR and IVRS ownership are not described ambiguously,
- sleep-state handling is implemented or explicitly excluded from the release claim,
- the repo contains bounded platform evidence that supports every status claim.
## Current Truthful Status
> Red Bear ACPI is materially complete for historical boot bring-up, but still under active
-> robustness, ownership, sleep-state, and validation improvement. Shutdown eventing is implemented
-> via `kstop`. Sleep-state transitions remain a known gap. EC widened access is implemented via byte
-> transactions. AML mutex state is real-tracked, not placeholder. DMAR is not initialized by
-> `acpid`, and Intel DMAR runtime ownership is not yet cleanly closed. Bare-metal validation for the
-> full ACPI surface is still outstanding.
+> correctness, ownership, power-surface, sleep-state, and validation improvement. Shutdown eventing
+> is implemented via `kstop`. Current eventing is shutdown-focused, not full sleep lifecycle
+> management. The `acpid` runtime surface still needs startup hardening, deterministic AML ordering,
+> honest power reporting, and explicit Intel DMAR ownership before stronger ACPI claims are justified.
+2 -2
@@ -40,7 +40,7 @@ take 5+ years.
|-----------|--------|--------|
| UEFI boot | ✅ Works | x86_64 UEFI bootloader functional |
| AMD CPUs | ✅ Works | AMD 32/64-bit supported, Ryzen Threadripper verified |
-| ACPI | ✅ Boot-baseline complete | RSDP/SDT checksums, MADT types 0x4/0x5/0x9/0xA, LVT NMI, FADT shutdown/reboot; historical bring-up goal met, but not release-grade complete; see `local/docs/ACPI-IMPROVEMENT-PLAN.md` for remaining ownership, robustness, sleep-state, and validation work |
+| ACPI | ✅ Boot-baseline complete | RSDP/SDT checksums, MADT types 0x4/0x5/0x9/0xA, LVT NMI, FADT shutdown/reboot, explicit `RSDP_ADDR` forwarding into `acpid`, x86 BIOS-search AML fallback, and bounded AML-backed power enumeration exist; historical bring-up goal met, but the explicit AML bootstrap producer contract, startup hardening, sleep-state scope, and validation depth still remain open — see `local/docs/ACPI-IMPROVEMENT-PLAN.md` for the forward plan |
| x2APIC | ✅ Works | Auto-detected via CPUID, APIC/SMP functional |
| HPET | ✅ Works | Timer initialized from ACPI |
| IOMMU | 🚧 In progress | `iommu` daemon now builds, auto-discovers common IVRS table paths, reaches unit detection plus `scheme:iommu` registration in the QEMU/AMD-IOMMU validation path, and now has a guest-driven first-use self-test that initializes both discovered units and drains events successfully in QEMU; real hardware validation is still missing |
@@ -477,7 +477,7 @@ P0 (ACPI boot)
## ANTI-PATTERNS FOR AMD GPU ENABLEMENT ## ANTI-PATTERNS FOR AMD GPU ENABLEMENT
- **DO NOT** attempt a clean Rust rewrite of amdgpu — 6M lines, 5+ years - **DO NOT** attempt a clean Rust rewrite of amdgpu — 6M lines, 5+ years
- **DO NOT** skip the ACPI boot baseline — AMD machines WILL NOT BOOT without the RSDP/SDT/MADT/FADT bring-up path; see `local/docs/ACPI-IMPROVEMENT-PLAN.md` for the separate post-bring-up ownership and robustness work - **DO NOT** treat the historical ACPI boot-baseline work as optional when revisiting AMD bring-up — modern AMD boot depended on the RSDP/SDT/MADT/FADT path that is now already landed; see `local/docs/ACPI-IMPROVEMENT-PLAN.md` for the separate post-bring-up ownership and robustness work
- **DO NOT** forget firmware blobs — amdgpu CANNOT FUNCTION without PSP/GC/SDMA firmware - **DO NOT** forget firmware blobs — amdgpu CANNOT FUNCTION without PSP/GC/SDMA firmware
- **DO NOT** test only in QEMU — AMD GPU behavior differs significantly from VirtIO - **DO NOT** test only in QEMU — AMD GPU behavior differs significantly from VirtIO
- **DO NOT** assume Intel patterns work for AMD — AMD uses different register maps, different firmware flow - **DO NOT** assume Intel patterns work for AMD — AMD uses different register maps, different firmware flow
+17 -14
@@ -1,13 +1,17 @@
# Bare Metal Test Log — AMD Hardware # Bare Metal Validation Log — ACPI and Hardware Evidence
Template for recording test results when booting Redox on AMD hardware. Template for recording bounded bare-metal validation results on AMD and Intel hardware.
Fill one section per test run. Date is ISO 8601. Fill one section per test run. Date is ISO 8601.
This file is an **evidence log**, not the canonical source of support language. For current ACPI
status and ownership truth, use `local/docs/ACPI-IMPROVEMENT-PLAN.md`. For hardware-facing support
language, use `HARDWARE.md`.
## How to Test ## How to Test
```bash ```bash
# 1. Build the image # 1. Build the image
./local/scripts/build-redbear.sh redbear-desktop ./local/scripts/build-redbear.sh redbear-full
# 2. Burn to USB (DANGEROUS — verify target device!) # 2. Burn to USB (DANGEROUS — verify target device!)
./local/scripts/test-baremetal.sh --device /dev/sdX ./local/scripts/test-baremetal.sh --device /dev/sdX
@@ -41,7 +45,7 @@ For boot debugging, connect a serial console before powering on:
**Build:** **Build:**
- Redox version: (git rev-parse --short HEAD) - Redox version: (git rev-parse --short HEAD)
- Config: (e.g., my-amd-desktop) - Config: (e.g., redbear-full)
- Kernel patch version: (checksum of local/patches/kernel/P0-amd-acpi-x2apic.patch) - Kernel patch version: (checksum of local/patches/kernel/P0-amd-acpi-x2apic.patch)
**Result:** Booting / Broken / Recommended **Result:** Booting / Broken / Recommended
@@ -81,17 +85,16 @@ For boot debugging, connect a serial console before powering on:
- Storage: NVMe SSD - Storage: NVMe SSD
**Build:** **Build:**
- Redox version: (pending first test with P0 patches applied) - Redox version: historical note only; fresh rerun needed
- Config: my-amd-desktop - Config: historical pre-rename run; repeat on `redbear-full`
- Kernel patch: P0-amd-acpi-x2apic.patch (with timeout + SIPI fixes) - Kernel patch: historical P0 ACPI bring-up patch set (with timeout + SIPI fixes)
**Result:** PENDING TEST **Result:** Booting
**Known from HARDWARE.md:** **Known from current repo docs:**
- Previous status: **Broken** — crash due to unimplemented ACPI function - Previous status: **Broken** — crash due to unimplemented ACPI function
- Reference: jackpot51/acpi#3 - Historical boot-baseline ACPI fixes moved this machine out of the Broken path
- With P0 patches applied, x2APIC should now work; need to verify the specific - Broader bounded validation is still incomplete; a fresh run should replace this carry-forward note
ACPI function that was missing
--- ---
@@ -113,8 +116,8 @@ For boot debugging, connect a serial console before powering on:
**Analysis:** **Analysis:**
- AML interpreter hits unsupported opcode (`NoCurrentOp`) - AML interpreter hits unsupported opcode (`NoCurrentOp`)
- This is in the userspace acpid, not the kernel - This is in the userspace `acpid`, not the kernel
- Likely needs AML opcode support added to `aml_physmem.rs` or `acpi.rs` - Treat this as an unresolved bare-metal failure record until a fresh validation run disproves it
--- ---
+12 -12
@@ -37,9 +37,9 @@ hardware GPU validation → KWin session bring-up → KDE Plasma session bring-u
Out of scope: USB, Wi-Fi, Bluetooth (covered by their own subsystem plans). Out of scope: USB, Wi-Fi, Bluetooth (covered by their own subsystem plans).
Tracked-default truth: this document is the canonical desktop-path plan, and the tracked default Tracked-default truth: this document is the canonical desktop-path plan, and the tracked desktop-
build now resolves to `CONFIG_NAME?=redbear-kde`. Runtime/session support claims still follow the capable surface is `redbear-full` / `redbear-live-full`. Older names such as `redbear-wayland` and
evidence model below. `redbear-kde` should be read as historical or staging labels, not supported compile targets.
--- ---
@@ -73,7 +73,7 @@ Rules:
| Area | State | Evidence | Notes | | Area | State | Evidence | Notes |
|---|---|---|---| |---|---|---|---|
| AMD bare-metal boot | validated | Boot, ACPI, SMP, x2APIC all work | Bounded to current tested hardware | | AMD bare-metal boot | validated | Boot, ACPI, SMP, x2APIC all work | Bounded to current tested hardware |
| relibc Wayland/Qt unblockers | builds + targeted runtime proof | signalfd, timerfd, eventfd, open_memstream, F_DUPFD_CLOEXEC, MSG_NOSIGNAL, bounded waitid, bounded RLIMIT, bounded eth0 networking, shm_open, bounded sem_open, bounded sys/ipc.h, bounded sys/shm.h | Strict relibc Redox-target runtime proof now exists for the fd-event slice; broader real-consumer semantics still need confirmation | | relibc Wayland/Qt unblockers | builds + targeted runtime proof | signalfd, timerfd, eventfd, open_memstream, F_DUPFD_CLOEXEC, MSG_NOSIGNAL, bounded waitid, bounded RLIMIT, bounded eth0 networking, shm_open, bounded sem_open | Strict relibc Redox-target runtime proof now exists for the fd-event slice; broader real-consumer semantics still need confirmation |
| redox-driver-sys | builds | Driver substrate | | | redox-driver-sys | builds | Driver substrate | |
| linux-kpi | builds | Linux kernel API compatibility layer | | | linux-kpi | builds | Linux kernel API compatibility layer | |
| firmware-loader | builds, boots | scheme:firmware registers at boot | | | firmware-loader | builds, boots | scheme:firmware registers at boot | |
@@ -98,9 +98,9 @@ Rules:
| plasma-wayland-protocols | builds | | | | plasma-wayland-protocols | builds | | |
| kf6-kwayland | builds | | | | kf6-kwayland | builds | | |
| kf6-kcmutils | builds | Widget-only build (QML stripped) | | | kf6-kcmutils | builds | Widget-only build (QML stripped) | |
| `redbear-wayland` profile | builds, boots | Bounded Wayland runtime profile | | | `redbear-wayland` profile | historical / staging | Bounded Wayland validation profile | Not a supported compile target |
| `redbear-full` profile | builds, boots | Broader desktop plumbing profile | Session/network/runtime integration slice | | `redbear-full` profile | builds, boots | Broader desktop plumbing profile | Session/network/runtime integration slice |
| `redbear-kde` profile | builds | KDE session-surface profile | Tracked default KWin direction; KWin only, not plasma-workspace/desktop yet | | `redbear-kde` profile | historical / staging | Older KDE session-surface profile | Not a supported compile target; use `redbear-full` / `redbear-live-full` for the tracked desktop-capable surface |
| bounded compositor validation path | experimental | Reaches xkbcommon init + EGL platform selection in QEMU | No complete session | | bounded compositor validation path | experimental | Reaches xkbcommon init + EGL platform selection in QEMU | No complete session |
| qt6-wayland-smoke | builds, partial | Creates QWindow with colored background, runs 3 seconds | | | qt6-wayland-smoke | builds, partial | Creates QWindow with colored background, runs 3 seconds | |
| QEMU graphics | usable (bounded) | Renderer is llvmpipe | Not hardware acceleration | | QEMU graphics | usable (bounded) | Renderer is llvmpipe | Not hardware acceleration |
@@ -125,12 +125,12 @@ still display-only evidence, not render proof.
The repo has crossed major build-side gates: The repo has crossed major build-side gates:
1. **relibc surface** — signalfd, timerfd, eventfd, open_memstream, F_DUPFD_CLOEXEC, MSG_NOSIGNAL, bounded waitid, bounded RLIMIT, bounded eth0 networking, shm_open, bounded sem_open, bounded sys/ipc.h, bounded sys/shm.h 1. **relibc surface** — signalfd, timerfd, eventfd, open_memstream, F_DUPFD_CLOEXEC, MSG_NOSIGNAL, bounded waitid, bounded RLIMIT, bounded eth0 networking, shm_open, bounded sem_open
2. **Driver substrate** — redox-driver-sys, linux-kpi, firmware-loader, redox-drm (AMD+Intel), amdgpu C port, evdevd, udev-shim 2. **Driver substrate** — redox-driver-sys, linux-kpi, firmware-loader, redox-drm (AMD+Intel), amdgpu C port, evdevd, udev-shim
3. **Wayland/graphics packages** — libwayland, wayland-protocols, Mesa EGL+GBM+GLES2, libdrm, libdrm_amdgpu 3. **Wayland/graphics packages** — libwayland, wayland-protocols, Mesa EGL+GBM+GLES2, libdrm, libdrm_amdgpu
4. **Qt6 + D-Bus** — qtbase (7 libs + 12 plugins), qtdeclarative (11 libs), qtsvg, qtwayland, D-Bus 1.16.2 4. **Qt6 + D-Bus** — qtbase (7 libs + 12 plugins), qtdeclarative (11 libs), qtsvg, qtwayland, D-Bus 1.16.2
5. **KF6 + KDE-facing** — All 32 KF6 frameworks, kdecoration, plasma-wayland-protocols, kf6-kwayland, kf6-kcmutils 5. **KF6 + KDE-facing** — All 32 KF6 frameworks, kdecoration, plasma-wayland-protocols, kf6-kwayland, kf6-kcmutils
6. **Tracked profiles** — redbear-wayland, redbear-full, redbear-kde 6. **Tracked profiles** — redbear-mini, redbear-live-mini, redbear-full, redbear-live-full
### What is runtime-proven (limited scope) ### What is runtime-proven (limited scope)
@@ -314,7 +314,7 @@ compositor + input + Qt client issues before session-shell complexity.
**Duration:** 6–10 weeks (starts after Phase 2) **Duration:** 6–10 weeks (starts after Phase 2)
**Goal:** Turn compositor proof into a real desktop-session substrate centered on KWin. **Goal:** Turn compositor proof into a real desktop-session substrate centered on KWin.
**Profile target:** `redbear-kde` **Profile target:** `redbear-full`
**Renderer:** LLVMpipe (software) — KWin inherits accelerated renderer once Phase 5 lands. **Renderer:** LLVMpipe (software) — KWin inherits accelerated renderer once Phase 5 lands.
#### Blocked dependency set that must be closed #### Blocked dependency set that must be closed
@@ -382,7 +382,7 @@ compositor + input + Qt client issues before session-shell complexity.
**Duration:** 8–12 weeks (starts after Phase 3) **Duration:** 8–12 weeks (starts after Phase 3)
**Goal:** Boot into a KDE Plasma session with essential desktop shell and session services. **Goal:** Boot into a KDE Plasma session with essential desktop shell and session services.
**Profile target:** `redbear-kde` **Profile target:** `redbear-full`
#### Work items #### Work items
@@ -416,7 +416,7 @@ plasma-desktop
#### Exit criteria #### Exit criteria
- [ ] `redbear-kde` boots into a KDE Plasma session (plasmashell process is running) - [ ] `redbear-full` boots into a KDE Plasma session (plasmashell process is running)
- [ ] KWin is the active compositor (`WAYLAND_DISPLAY` owned by KWin) - [ ] KWin is the active compositor (`WAYLAND_DISPLAY` owned by KWin)
- [ ] Plasma panel renders and is interactive (launcher opens, clock visible) - [ ] Plasma panel renders and is interactive (launcher opens, clock visible)
- [ ] An application can be launched from the session and displays a window - [ ] An application can be launched from the session and displays a window
@@ -609,7 +609,7 @@ continuity, not as future work.
| All 32 KF6 frameworks | ✅ Builds complete | Prior to this plan | | All 32 KF6 frameworks | ✅ Builds complete | Prior to this plan |
| Input stack (libevdev, libinput, evdevd, udev-shim) | ✅ Builds complete | Prior to this plan | | Input stack (libevdev, libinput, evdevd, udev-shim) | ✅ Builds complete | Prior to this plan |
| Mesa EGL/GBM/GLES2 + libdrm amdgpu | ✅ Builds complete | Prior to this plan | | Mesa EGL/GBM/GLES2 + libdrm amdgpu | ✅ Builds complete | Prior to this plan |
| Desktop profiles (redbear-wayland, redbear-full, redbear-kde) | ✅ Builds complete | Prior to this plan | | Desktop profiles (`redbear-mini`, `redbear-live-mini`, `redbear-full`, `redbear-live-full`) | ✅ Builds complete | Prior to this plan |
| `local/docs/DBUS-INTEGRATION-PLAN.md` | D-Bus architecture, service dependency map, and phased implementation | | `local/docs/DBUS-INTEGRATION-PLAN.md` | D-Bus architecture, service dependency map, and phased implementation |
| PRIME/DMA-BUF scheme ioctls | ✅ Implemented | Prior to this plan | | PRIME/DMA-BUF scheme ioctls | ✅ Implemented | Prior to this plan |
| KWin recipe with 5 re-enabled features | ✅ Partial build | Prior to this plan | | KWin recipe with 5 re-enabled features | ✅ Partial build | Prior to this plan |
+23 -159
@@ -7,25 +7,6 @@
--- ---
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [Architecture Principles](#2-architecture-principles)
3. [Current State Assessment](#3-current-state-assessment)
4. [Gap Analysis](#4-gap-analysis)
5. [Architecture Design](#5-architecture-design)
6. [Component Specifications](#6-component-specifications)
7. [Phased Implementation](#7-phased-implementation)
8. [Integration with Console-to-KDE Plan](#8-integration-with-console-to-kde-plan)
9. [D-Bus Service Dependency Map](#9-d-bus-service-dependency-map)
10. [Security Model](#10-security-model)
11. [Build Recipe Changes](#11-build-recipe-changes)
12. [Testing and Validation](#12-testing-and-validation)
13. [Risks and Mitigations](#13-risks-and-mitigations)
14. [Qt 6.11 D-Bus Coverage](#14-qt-611-d-bus-coverage)
---
## 1. Executive Summary ## 1. Executive Summary
D-Bus is **mandatory infrastructure** for KDE Plasma 6 — not optional, not deferrable. KDE D-Bus is **mandatory infrastructure** for KDE Plasma 6 — not optional, not deferrable. KDE
@@ -149,7 +130,7 @@ specific schemes it needs. This keeps the architecture honest and avoids a leaky
| **Session tracker** | `org.freedesktop.login1` | Session/seat/device brokering scaffold | KWin (hard requirement for DRM/libinput) | | **Session tracker** | `org.freedesktop.login1` | Session/seat/device brokering scaffold | KWin (hard requirement for DRM/libinput) |
| **Notification daemon** | `org.freedesktop.Notifications` | Notification service scaffold | kf6-knotifications | | **Notification daemon** | `org.freedesktop.Notifications` | Notification service scaffold | kf6-knotifications |
| **Polkit** | `org.freedesktop.PolicyKit1` | Authorization scaffold (always-permit) | KAuth | | **Polkit** | `org.freedesktop.PolicyKit1` | Authorization scaffold (always-permit) | KAuth |
| **UPower** | `org.freedesktop.UPower` | Bounded real ACPI-backed power enumeration | kf6-solid, PowerDevil | | **UPower** | `org.freedesktop.UPower` | Provisional ACPI-backed power service; current backing power surface is still incomplete | kf6-solid, PowerDevil |
| **UDisks2** | `org.freedesktop.UDisks2` | Bounded real `disk.*` / partition enumeration | kf6-solid | | **UDisks2** | `org.freedesktop.UDisks2` | Bounded real `disk.*` / partition enumeration | kf6-solid |
| **D-Bus service files** | `/usr/share/dbus-1/` | Activation is staged and shipped, but only for the current scaffold services | All D-Bus services | | **D-Bus service files** | `/usr/share/dbus-1/` | Activation is staged and shipped, but only for the current scaffold services | All D-Bus services |
| **D-Bus policy files** | `/etc/dbus-1/` | Policy is staged and shipped for the current scaffold services | All D-Bus services | | **D-Bus policy files** | `/etc/dbus-1/` | Policy is staged and shipped for the current scaffold services | All D-Bus services |
@@ -196,7 +177,7 @@ plasma-workspace needs:
``` ```
Complete Plasma needs (after re-enabling disabled components): Complete Plasma needs (after re-enabling disabled components):
org.freedesktop.UPower ✅ bounded real enumeration exists — still needs runtime validation for kf6-solid org.freedesktop.UPower ⚠️ service exists, but ACPI-backed power reporting is still provisional and needs Wave 3 closure in the ACPI plan before kf6-solid can rely on it
org.freedesktop.UDisks2 ✅ bounded real enumeration exists — still needs runtime validation for kf6-solid org.freedesktop.UDisks2 ✅ bounded real enumeration exists — still needs runtime validation for kf6-solid
org.freedesktop.NetworkManager ⏸️ DEFERRED — Red Bear OS uses redbear-netctl for now org.freedesktop.NetworkManager ⏸️ DEFERRED — Red Bear OS uses redbear-netctl for now
org.freedesktop.PolicyKit1 ⚠️ scaffold exists — KAuth still blocked on missing PolkitQt6-1 packaging org.freedesktop.PolicyKit1 ⚠️ scaffold exists — KAuth still blocked on missing PolkitQt6-1 packaging
@@ -285,7 +266,7 @@ Session launch (redbear-kde-session):
4. dbus-daemon --system already running 4. dbus-daemon --system already running
5. eval $(dbus-launch --sh-syntax) → session bus started 5. eval $(dbus-launch --sh-syntax) → session bus started
6. export DBUS_SESSION_BUS_ADDRESS, XDG_SESSION_ID, XDG_SEAT, XDG_RUNTIME_DIR 6. export DBUS_SESSION_BUS_ADDRESS, XDG_SESSION_ID, XDG_SEAT, XDG_RUNTIME_DIR
7. kwin_wayland --replace → registers org.kde.KWin on session bus 7. kwin_wayland_wrapper --drm → launches KWin on the session bus and owns the Wayland socket lifecycle for the current Red Bear session path
8. [later] plasmashell → registers org.kde.plasmashell on session bus 8. [later] plasmashell → registers org.kde.plasmashell on session bus
``` ```
@@ -394,7 +375,7 @@ local/recipes/system/redbear-sessiond/
│ ├── session.rs # org.freedesktop.login1.Session interface │ ├── session.rs # org.freedesktop.login1.Session interface
│ ├── seat.rs # org.freedesktop.login1.Seat interface │ ├── seat.rs # org.freedesktop.login1.Seat interface
│ ├── device_map.rs # major/minor → scheme path resolution (via udev-shim) │ ├── device_map.rs # major/minor → scheme path resolution (via udev-shim)
│ └── acpi_watcher.rs # scheme:acpi → PrepareForSleep/Shutdown signals │ └── acpi_watcher.rs # current kstop-backed shutdown watcher; sleep signaling remains future-only until ACPI sleep eventing exists
``` ```
### 6.2 D-Bus Service Activation Files ### 6.2 D-Bus Service Activation Files
@@ -585,16 +566,16 @@ APIs, which relibc provides.
| # | Task | Acceptance Criteria | | # | Task | Acceptance Criteria |
|---|------|---------------------| |---|------|---------------------|
| 3.1 | Implement `redbear-upower` — minimal UPower D-Bus service | Registers `org.freedesktop.UPower`, enumerates power devices from `scheme:acpi` | | 3.1 | Implement `redbear-upower` — minimal UPower D-Bus service | Registers `org.freedesktop.UPower`, exposes the current ACPI-backed power surface honestly, and does not overclaim unsupported battery/AC detail |
| 3.2 | Implement `redbear-udisks` — minimal UDisks2 D-Bus service | Registers `org.freedesktop.UDisks2`, enumerates block devices from `scheme:` filesystem | | 3.2 | Implement `redbear-udisks` — minimal UDisks2 D-Bus service | Registers `org.freedesktop.UDisks2`, enumerates block devices from `scheme:` filesystem |
| 3.3 | Re-enable D-Bus in kf6-solid (`-DUSE_DBUS=ON`, re-enable UPower backend) | kf6-solid builds with D-Bus enabled, UPower backend active | | 3.3 | Re-enable D-Bus in kf6-solid (`-DUSE_DBUS=ON`, re-enable UPower backend) | kf6-solid builds with D-Bus enabled, UPower backend active |
| 3.4 | Implement ACPI sleep/shutdown integration in `redbear-sessiond` | `PrepareForShutdown` emitted from the current ACPI shutdown event path; `PrepareForSleep` only after ACPI sleep eventing exists | | 3.4 | Implement ACPI sleep/shutdown integration in `redbear-sessiond` | `PrepareForShutdown` emitted from the current ACPI shutdown event path; `PrepareForSleep` only after ACPI sleep eventing exists |
| 3.5 | Validate PowerDevil (if plasma-workspace includes it) | Power management UI shows battery/AC status | | 3.5 | Validate PowerDevil (if plasma-workspace includes it) | PowerDevil only graduates once the underlying ACPI power surface is trustworthy enough for consumer use |
**Exit criteria:** **Exit criteria:**
- [ ] `org.freedesktop.UPower` registers and enumerates devices - [ ] `org.freedesktop.UPower` registers and exposes the current power surface without overclaiming unsupported detail
- [ ] `org.freedesktop.UDisks2` registers and enumerates block devices - [ ] `org.freedesktop.UDisks2` registers and enumerates block devices
- [ ] kf6-solid uses UPower backend for power queries - [ ] kf6-solid uses UPower backend for power queries only after the ACPI power surface is validated enough for consumer use
- [ ] Shutdown signal flows through login1 D-Bus interface now; sleep signal only if ACPI sleep eventing is implemented - [ ] Shutdown signal flows through login1 D-Bus interface now; sleep signal only if ACPI sleep eventing is implemented
**Dependencies:** Phase DB-2 complete, ACPI boot-baseline integration working; see `local/docs/ACPI-IMPROVEMENT-PLAN.md` for the remaining ownership, robustness, and validation work **Dependencies:** Phase DB-2 complete, ACPI boot-baseline integration working; see `local/docs/ACPI-IMPROVEMENT-PLAN.md` for the remaining ownership, robustness, and validation work
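
The shutdown/sleep gating described in task 3.4 can be sketched as a small mapping from ACPI events to login1 signals. This is only an illustration: `AcpiEvent`, `login1_signal_for`, and the `sleep_eventing_available` flag are hypothetical names, not taken from `redbear-sessiond`. Only the signal names `PrepareForShutdown` and `PrepareForSleep` come from the `org.freedesktop.login1.Manager` interface.

```python
from enum import Enum


class AcpiEvent(Enum):
    """Hypothetical event kinds a watcher might read from the ACPI event path."""
    SHUTDOWN = "shutdown"
    SLEEP = "sleep"


def login1_signal_for(event, sleep_eventing_available=False):
    """Return (signal_name, start_flag) to emit, or None if nothing should be sent.

    Shutdown is always forwarded. Sleep is forwarded only when the caller
    states that ACPI sleep eventing actually exists, so the daemon never
    advertises a sleep lifecycle it cannot back.
    """
    if event is AcpiEvent.SHUTDOWN:
        return ("PrepareForShutdown", True)
    if event is AcpiEvent.SLEEP and sleep_eventing_available:
        return ("PrepareForSleep", True)
    return None
```

With the flag left at its default, a sleep event yields `None`, which matches the stated rule that `PrepareForSleep` stays off until ACPI sleep eventing exists.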
@@ -649,7 +630,7 @@ This D-Bus plan maps directly onto the phases in `CONSOLE-TO-KDE-DESKTOP-PLAN.md
| **Phase 1:** Runtime Substrate Validation | (no D-Bus work — substrate is below D-Bus) | — | | **Phase 1:** Runtime Substrate Validation | (no D-Bus work — substrate is below D-Bus) | — |
| **Phase 2:** Wayland Compositor Proof | **DB-1:** KWin MVP | login1 session broker, system/session bus validation | | **Phase 2:** Wayland Compositor Proof | **DB-1:** KWin MVP | login1 session broker, system/session bus validation |
| **Phase 3:** KWin Desktop Session | **DB-1** (completion) + **DB-2** (session services) | kglobalaccel, kded6, notifications, plasmashell D-Bus | | **Phase 3:** KWin Desktop Session | **DB-1** (completion) + **DB-2** (session services) | kglobalaccel, kded6, notifications, plasmashell D-Bus |
| **Phase 4:** KDE Plasma Session | **DB-3** (hardware services) + **DB-4** (network/policy) | UPower, udisks2, NM, polkit, full session | | **Phase 4:** KDE Plasma Session | **DB-3** (hardware services) + **DB-4** (network/policy) | UPower once the ACPI power surface is honest, udisks2, NM, polkit, full session |
| **Phase 5:** Hardware GPU Enablement | (D-Bus not on critical path for GPU) | login1 TakeDevice for GPU fd passing | | **Phase 5:** Hardware GPU Enablement | (D-Bus not on critical path for GPU) | login1 TakeDevice for GPU fd passing |
### Modifications to Console-to-KDE Plan ### Modifications to Console-to-KDE Plan
@@ -675,7 +656,7 @@ Replace the existing task 3.4 with:
| # | Task | Acceptance Criteria | | # | Task | Acceptance Criteria |
|---|------|---------------------| |---|------|---------------------|
| 4.X | D-Bus hardware services operational | `org.freedesktop.UPower` and `org.freedesktop.UDisks2` register on system bus and enumerate devices | | 4.X | D-Bus hardware services operational | `org.freedesktop.UPower` and `org.freedesktop.UDisks2` register on system bus; UPower consumer claims stay bounded until the ACPI power surface is validated |
| 4.X | D-Bus network service operational | Deferred — Red Bear OS uses `redbear-netctl`, not NetworkManager | | 4.X | D-Bus network service operational | Deferred — Red Bear OS uses `redbear-netctl`, not NetworkManager |
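
One way to read "UPower consumer claims stay bounded" is that the service publishes only the values the backing power surface actually provides. The sketch below assumes a hypothetical snapshot dict with optional `on_battery` and `percentage` keys; the real `/scheme/acpi/power` format is not described here. `OnBattery` and `Percentage` are real UPower property names, but they live on different D-Bus objects (the daemon and a device respectively), so the flat dict is a simplification.

```python
def upower_properties(acpi_power):
    """Build a property dict from a hypothetical ACPI power snapshot.

    Unknown or implausible values are omitted rather than replaced with
    defaults, so consumers never see a fabricated battery or AC state.
    """
    props = {}

    on_battery = acpi_power.get("on_battery")
    if isinstance(on_battery, bool):
        props["OnBattery"] = on_battery

    pct = acpi_power.get("percentage")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(pct, (int, float)) and not isinstance(pct, bool) and 0 <= pct <= 100:
        props["Percentage"] = float(pct)

    return props
```

An empty snapshot produces an empty property set, which is the conservative outcome while the ACPI power surface remains provisional.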
--- ---
@@ -695,7 +676,7 @@ org.freedesktop.DBus (dbus-daemon itself — always present)
│ └── Consumed by: kf6-solid (session properties) │ └── Consumed by: kf6-solid (session properties)
├── [Phase DB-3] org.freedesktop.UPower (redbear-upower) ├── [Phase DB-3] org.freedesktop.UPower (redbear-upower)
│ ├── Depends on: scheme:acpi (for battery/power state) │ ├── Depends on: the current `/scheme/acpi/power` surface (still provisional until the ACPI plan's Wave 3 closes)
│ └── Consumed by: kf6-solid, PowerDevil │ └── Consumed by: kf6-solid, PowerDevil
├── [Phase DB-3] org.freedesktop.UDisks2 (redbear-udisks) ├── [Phase DB-3] org.freedesktop.UDisks2 (redbear-udisks)
@@ -787,12 +768,18 @@ service to perform an operation. The broker service holds the scheme capability,
`dbus-daemon` uses the EXTERNAL SASL mechanism, which authenticates clients by their UNIX `dbus-daemon` uses the EXTERNAL SASL mechanism, which authenticates clients by their UNIX
socket credentials (UID/GID via `SCM_CREDENTIALS`). On Redox, this requires: socket credentials (UID/GID via `SCM_CREDENTIALS`). On Redox, this requires:
1. `relibc`'s `SO_PASSCRED` / `SCM_CREDENTIALS` support — verify this works with Redox 1. `relibc`'s `SO_PASSCRED` / `SCM_CREDENTIALS` support on Redox UNIX domain sockets
UNIX domain sockets
2. `getpeereid()` or equivalent — for the bus daemon to verify the connecting process's UID 2. `getpeereid()` or equivalent — for the bus daemon to verify the connecting process's UID
**Watch out:** If Redox UNIX domain sockets do not support credential passing, D-Bus Current repo status:
authentication will fail silently or fall back to cookie-based auth. Test this early.
- relibc now exposes `SO_PASSCRED`, `SO_PEERCRED`, `SCM_CREDENTIALS`, and `getpeereid()` in the
active tree
- the bounded relibc test path now covers peer-credential lookup (`SO_PEERCRED`) and credential
delivery via `recvmsg()` / `SCM_CREDENTIALS` on Redox UNIX domain sockets
That means the remaining D-Bus risk is no longer raw absence of the credential path in relibc; it
is broader desktop/runtime trust and integration with the real bus daemons.
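
For orientation, a peer-credential lookup of the kind the bounded relibc test is described as covering looks roughly like this on Linux. This is a Linux-side sketch, not the relibc test itself; whether Redox uses the same three-integer `ucred` layout is an assumption. The helper names `peer_credentials` and `demo` are illustrative.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer of a connected UNIX domain socket.

    On Linux, SO_PEERCRED yields a struct ucred made of three native ints.
    """
    fmt = "3i"
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def demo():
    """Create a connected socket pair and look up the peer's credentials."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        return peer_credentials(a)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    pid, uid, gid = demo()
    print(pid == os.getpid(), uid == os.geteuid(), gid == os.getegid())
```

A bus daemon using EXTERNAL authentication does the equivalent lookup on each accepted connection and compares the reported UID against the identity the client claims.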
### 10.3 Policy Granularity ### 10.3 Policy Granularity
@@ -986,13 +973,13 @@ dbus-send --session --dest=org.freedesktop.DBus --print-reply \
| Risk | Impact | Likelihood | Mitigation | | Risk | Impact | Likelihood | Mitigation |
|------|--------|-----------|------------| |------|--------|-----------|------------|
| **UNIX socket credential passing not working on Redox** | D-Bus authentication fails | Medium | Test early in Phase DB-1; if broken, fall back to cookie auth or patch relibc | | **UNIX socket credential passing regresses on Redox** | D-Bus authentication fails | Medium | Keep the relibc UDS credential tests in the preserved proof path; if broken again, fall back to cookie auth or patch relibc |
| **KWin login1 expectations exceed our minimal subset** | KWin crashes or refuses to start | Medium | Start with KWin's Noop fallback; add methods incrementally as KWin logs errors | | **KWin login1 expectations exceed our minimal subset** | KWin crashes or refuses to start | Medium | Start with KWin's Noop fallback; add methods incrementally as KWin logs errors |
| **zbus async runtime conflicts with Redox event system** | zbus doesn't build or run | Low | zbus supports multiple async runtimes; test tokio + Redox early | | **zbus async runtime conflicts with Redox event system** | zbus doesn't build or run | Low | zbus supports multiple async runtimes; test tokio + Redox early |
| **D-Bus service activation files not picked up by dbus-daemon** | Services must be started manually | Low | dbus-daemon 1.16.2 supports classic activation; verify search path in redox.patch | | **D-Bus service activation files not picked up by dbus-daemon** | Services must be started manually | Low | dbus-daemon 1.16.2 supports classic activation; verify search path in redox.patch |
| **Device major/minor mapping unstable** | TakeDevice returns wrong device | Medium | Use udev-shim as single source of truth; add validation tests | | **Device major/minor mapping unstable** | TakeDevice returns wrong device | Medium | Use udev-shim as single source of truth; add validation tests |
| **PAM not available for elogind-like session tracking** | Cannot use elogind directly | Certain | That's why we're building redbear-sessiond — no PAM dependency | | **PAM not available for elogind-like session tracking** | Cannot use elogind directly | Certain | That's why we're building redbear-sessiond — no PAM dependency |
| **Peer credential passing (SCM_CREDENTIALS) missing** | System bus policy can't verify UIDs | Medium | Coarse default-allow policy initially; tighten after credential passing works | | **Peer credential path behaves differently under real dbus-daemon load** | System bus policy can't verify UIDs reliably | Medium | The relibc credential path is now present and bounded-tested; next tighten with real dbus-daemon/session-bus runtime validation |
### 13.2 Integration Risks ### 13.2 Integration Risks
@@ -1097,126 +1084,3 @@ convenience layer. The remaining gap is the difference between **shipping minima
implementations** and **shipping full desktop-complete service contracts** for login1, implementations** and **shipping full desktop-complete service contracts** for login1,
Notifications, UPower, UDisks2, and PolicyKit. NetworkManager remains deferred and is not part of Notifications, UPower, UDisks2, and PolicyKit. NetworkManager remains deferred and is not part of
the current Red Bear OS implementation scope. the current Red Bear OS implementation scope.
---
## Appendix A: Why Not elogind?
elogind was considered and rejected as the primary session tracker because:
1. **Too Linux-shaped.** elogind depends on cgroups, PAM, udev, and the Linux device model.
Redox has none of these natively. Porting elogind would mean porting a large chunk of
Linux infrastructure that doesn't fit the scheme-based model.
2. **PAM dependency.** elogind requires `pam_elogind.so` for session tracking. Redox has
no PAM. Building a PAM compatibility layer just for elogind is wasteful when we can
build a targeted Rust daemon instead.
3. **Unnecessary scope.** elogind implements the full `org.freedesktop.login1` interface,
including cgroup management, session resource limits, and multi-seat complexity that
Red Bear OS doesn't need. Our `redbear-sessiond` implements only the subset KWin
actually consumes.
4. **The recipe already exists but doesn't build.** The WIP elogind recipe in
`recipes/wip/services/elogind/` has been "not compiled or tested" since it was added.
This confirms the difficulty of porting it.
**When to reconsider:** If KDE Plasma later adds hard dependencies on elogind-specific
behavior beyond the login1 D-Bus interface, or if a future elogind version becomes easier
to port, reassess.
## Appendix B: Why Not dbus-broker?
dbus-broker was considered and rejected because:
1. **Requires systemd for launcher.** `dbus-broker-launch` depends on `libsystemd-daemon`.
Redox has no systemd. While dbus-broker can run without systemd (as Guix does), it
loses service activation.
2. **No traditional activation.** dbus-broker does not implement the classic `.service`
file activation mechanism. It relies on systemd for service launching. This means we'd
need to write our own activation helper.
3. **dbus-daemon is sufficient.** The reference implementation works on every non-systemd
distro (Alpine, Void, Gentoo/OpenRC). Performance concerns are irrelevant at our scale.
**When to reconsider:** If D-Bus throughput becomes a measurable bottleneck during KDE
Plasma sessions, or if dbus-broker gains first-class non-systemd support.
## Appendix C: Why Not ConsoleKit2?
ConsoleKit2 was considered as a fallback because KWin has a `session_consolekit.cpp` backend.
However:
1. **Legacy.** ConsoleKit2 is unmaintained (last release 2020). freedesktop.org itself
recommends elogind or systemd-logind.
2. **Incomplete API.** ConsoleKit2's D-Bus interface doesn't match `org.freedesktop.login1`.
It uses `org.freedesktop.ConsoleKit` with different method signatures. KWin's ConsoleKit
backend is a lowest-priority fallback, not a primary path.
3. **Doesn't help with broader compatibility.** Other freedesktop services (UPower, udisks2,
NM) all expect `org.freedesktop.login1` for session checks. ConsoleKit2 would only
help KWin, not the rest of the stack.
**Verdict:** Building a login1-compatible daemon (`redbear-sessiond`) serves KWin *and*
every other freedesktop service that checks for session validity.
## Appendix D: File Inventory
### New files to create
| File | Purpose |
|------|---------|
| `local/recipes/system/redbear-sessiond/recipe.toml` | Session broker build recipe |
| `local/recipes/system/redbear-sessiond/source/Cargo.toml` | Rust workspace |
| `local/recipes/system/redbear-sessiond/source/src/main.rs` | Daemon entry point |
| `local/recipes/system/redbear-sessiond/source/src/manager.rs` | login1.Manager interface |
| `local/recipes/system/redbear-sessiond/source/src/session.rs` | login1.Session interface |
| `local/recipes/system/redbear-sessiond/source/src/seat.rs` | login1.Seat interface |
| `local/recipes/system/redbear-sessiond/source/src/device_map.rs` | Major/minor → scheme path |
| `local/recipes/system/redbear-sessiond/source/src/acpi_watcher.rs` | ACPI signal bridge |
| `local/recipes/system/redbear-dbus-services/recipe.toml` | D-Bus service file staging |
| `local/recipes/system/redbear-dbus-services/files/system-services/org.freedesktop.login1.service` | login1 activation |
| `local/recipes/system/redbear-dbus-services/files/system-services/org.freedesktop.UPower.service` | UPower activation |
| `local/recipes/system/redbear-dbus-services/files/system-services/org.freedesktop.UDisks2.service` | UDisks2 activation |
| `local/recipes/system/redbear-dbus-services/files/system-services/org.freedesktop.PolicyKit1.service` | PolicyKit1 activation |
| `local/recipes/system/redbear-dbus-services/files/session-services/org.kde.kglobalaccel.service` | kglobalaccel activation |
| `local/recipes/system/redbear-dbus-services/files/session-services/org.kde.kded6.service` | kded6 activation |
| `local/recipes/system/redbear-dbus-services/files/session-services/org.freedesktop.Notifications.service` | Notifications activation |
| `local/recipes/system/redbear-dbus-services/files/system.d/org.freedesktop.login1.conf` | login1 policy |
| `local/recipes/system/redbear-dbus-services/files/system.d/org.freedesktop.UPower.conf` | UPower policy |
| `local/recipes/system/redbear-dbus-services/files/system.d/org.freedesktop.UDisks2.conf` | UDisks2 policy |
| `local/recipes/system/redbear-dbus-services/files/system.d/org.freedesktop.PolicyKit1.conf` | PolicyKit1 policy |
| `local/recipes/system/redbear-dbus-services/files/session.d/org.redbear.session.conf` | Session policy |
| `local/recipes/system/redbear-notifications/recipe.toml` | Notification daemon recipe |
| `local/recipes/system/redbear-notifications/source/src/main.rs` | org.freedesktop.Notifications (session bus) |
| `local/recipes/system/redbear-upower/recipe.toml` | UPower daemon recipe |
| `local/recipes/system/redbear-upower/source/src/main.rs` | org.freedesktop.UPower (system bus) |
| `local/recipes/system/redbear-udisks/recipe.toml` | UDisks2 daemon recipe |
| `local/recipes/system/redbear-udisks/source/src/main.rs` | org.freedesktop.UDisks2 (system bus) |
| `local/recipes/system/redbear-polkit/recipe.toml` | PolicyKit1 daemon recipe |
| `local/recipes/system/redbear-polkit/source/src/main.rs` | org.freedesktop.PolicyKit1 (system bus) |
| `local/recipes/libs/zbus/recipe.toml` | zbus crate build-ordering marker |
| `local/scripts/test-dbus-qemu.sh` | D-Bus validation script |
### Existing files to modify
| File | Change | Phase |
|------|--------|-------|
| `config/redbear-kde.toml` | Add redbear-sessiond package + init service + env vars | DB-1 |
| `config/redbear-full.toml` | Add redbear-sessiond package + init service (optional) | DB-1 |
| `local/recipes/kde/kf6-knotifications/recipe.toml` | Change `-DUSE_DBUS=OFF` → `ON` | DB-2 |
| `local/recipes/kde/kf6-solid/recipe.toml` | Deferred — Phase 5/Phase 6 validation harness exists, but runtime consumer proof is still blocked while `solid-hardware6` tooling / backend enablement remain off | DB-3 |
| `local/recipes/kde/kf6-kauth/recipe.toml` | Still uses `KAUTH_BACKEND_NAME=FAKE` until PolkitQt6-1 exists in-tree | DB-4 blocker |
| `local/recipes/kde/kf6-kio/recipe.toml` | Change `-DUSE_DBUS=OFF` → `ON` | DB-5 |
| `local/recipes/kde/kf6-kwallet/recipe.toml` | Replace stub with real build | DB-5 |
| `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` | Add D-Bus tasks to Phases 2–4 | DB-1 |
| `AGENTS.md` | Add D-Bus plan reference | DB-1 |
| `local/AGENTS.md` | Add D-Bus plan reference | DB-1 |
---
*This document should be read alongside `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` for the
full desktop execution context. D-Bus phases DB-1 through DB-5 align directly with desktop
plan Phases 2 through 5.*
+15 -15
@@ -58,12 +58,12 @@ greeter/auth/session-launch stack on the `redbear-full` desktop path.
| redbear-greeterd | **builds, experimental** | Root-owned greeter orchestrator; UI/auth socket protocol, bounded restart policy, return-to-greeter daemon logic, crate tests pass; end-to-end runtime proof still pending | | redbear-greeterd | **builds, experimental** | Root-owned greeter orchestrator; UI/auth socket protocol, bounded restart policy, return-to-greeter daemon logic, crate tests pass; end-to-end runtime proof still pending |
| redbear-greeter UI | **builds, experimental** | Qt6/QML unprivileged login surface now ships in-tree; bounded runtime proof remains narrower than a full trusted KDE desktop-login claim | | redbear-greeter UI | **builds, experimental** | Qt6/QML unprivileged login surface now ships in-tree; bounded runtime proof remains narrower than a full trusted KDE desktop-login claim |
| redbear-validation-session | **builds, bounded helper** | Still staged as a validation launcher/helper, but no longer the primary `redbear-full` display-service owner | | redbear-validation-session | **builds, bounded helper** | Still staged as a validation launcher/helper, but no longer the primary `redbear-full` display-service owner |
| Greeter runtime checker | ✅ implemented (bounded checker) | `redbear-greeter-check` asserts greeter binaries, assets, service files, socket reachability, hello protocol, invalid-login handling, and a validation-only successful-login/session-return loop inside the guest; current graphical runtime proof is still blocked below the greeter slice by guest-side Qt shared-plugin parsing | | Greeter runtime checker | ✅ implemented (bounded checker) | `redbear-greeter-check` asserts greeter binaries, assets, service files, socket reachability, hello protocol, invalid-login handling, and a validation-only successful-login/session-return loop inside the guest |
| Greeter QEMU harness | ✅ implemented (bounded harness) | `test-greeter-qemu.sh` boots `redbear-full`, logs in on the fallback console, and runs the in-guest greeter checker for hello, invalid-login, and bounded successful-login return-to-greeter proof; the compositor leg is presently blocked by guest-side Qt plugin loader failure rather than missing greeter artifacts | | Greeter QEMU harness | ✅ implemented (bounded harness) | `test-greeter-qemu.sh` boots `redbear-full`, logs in on the fallback console, and now passes the in-guest greeter checker for hello, invalid-login, and bounded successful-login return-to-greeter proof |
| redbear-notifications | ✅ Scaffold | org.freedesktop.Notifications — logs to stderr, no display integration yet | | redbear-notifications | ✅ Scaffold | org.freedesktop.Notifications — logs to stderr, no display integration yet |
| redbear-upower | ✅ bounded real | org.freedesktop.UPower — enumerates real AC adapters/batteries from `/scheme/acpi/power`; desktop machines with no battery report line power only | | redbear-upower | ⚠️ scaffold / experimental | org.freedesktop.UPower — service exists, and the backing `/scheme/acpi/power` surface now performs real AML-backed enumeration, but its bootstrap preconditions and runtime proof are still too weak to call release-grade or consumer-validated; treat current enumeration as provisional until Wave 3 in `local/docs/ACPI-IMPROVEMENT-PLAN.md` closes |
| redbear-udisks | ✅ bounded real | org.freedesktop.UDisks2 — enumerates real `disk.*` schemes and partitions into read-only D-Bus objects; no fabricated mount/serial metadata | | redbear-udisks | ✅ bounded real | org.freedesktop.UDisks2 — enumerates real `disk.*` schemes and partitions into read-only D-Bus objects; no fabricated mount/serial metadata |
| Phase 5 D-Bus runtime proof | ✅ implemented (bounded QEMU proof) | `redbear-phase5-network-check` + `test-phase5-network-qemu.sh` assert bounded-real UPower/UDisks2 registration and runtime-backed enumeration on `redbear-full`; this is a desktop/network plumbing proof, not a claim that the Wi-Fi plan's later Phase W5 hardware/runtime-reporting exit criteria are complete | | Phase 5 D-Bus runtime proof | ✅ implemented (bounded QEMU proof) | `redbear-phase5-network-check` + `test-phase5-network-qemu.sh` assert bounded QEMU service registration and current runtime plumbing on `redbear-full`; treat UPower as provisional until the ACPI power surface is made honest in `local/docs/ACPI-IMPROVEMENT-PLAN.md` Wave 3 |
| Phase 6 Solid readiness proof | ✅ implemented, blocked | `redbear-phase6-kde-check` + `test-phase6-kde-qemu.sh` now distinguish real Solid validation from blocked states; `kf6-solid` remains disabled until runtime proof + tooling are present | | Phase 6 Solid readiness proof | ✅ implemented, blocked | `redbear-phase6-kde-check` + `test-phase6-kde-qemu.sh` now distinguish real Solid validation from blocked states; `kf6-solid` remains disabled until runtime proof + tooling are present |
| redbear-polkit | ✅ Scaffold | org.freedesktop.PolicyKit1 — always-permit authorization; KAuth still uses FAKE backend because PolkitQt6-1 is not packaged yet | | redbear-polkit | ✅ Scaffold | org.freedesktop.PolicyKit1 — always-permit authorization; KAuth still uses FAKE backend because PolkitQt6-1 is not packaged yet |
| redbear-dbus-services | ✅ Created | D-Bus activation files + policies staged | | redbear-dbus-services | ✅ Created | D-Bus activation files + policies staged |
@@ -130,14 +130,14 @@ Current truth for that slice:
| Piece | Current state | Remaining limitation | | Piece | Current state | Remaining limitation |
|---|---|---| |---|---|---|
| `redbear-authd` | Target-side recipe build proven; unit tests cover passwd/shadow parsing, SHA-crypt verification, lockout, approval checks | No bounded in-guest login proof yet | | `redbear-authd` | Target-side recipe build proven; unit tests cover passwd/shadow parsing, SHA-crypt and Argon2 verification, lockout, approval checks | Remaining risk is no longer auth-format handling, but broader desktop-session stability below the greeter slice |
| `redbear-session-launch` | Target-side recipe build proven; unit tests cover env/runtime-dir/argument handling | Real session handoff still depends on full greeter/runtime proof | | `redbear-session-launch` | Target-side recipe build proven; unit tests cover env/runtime-dir/argument handling, including current session environment contract | Remaining limitation is broader compositor/session stability, not the basic session-launch boundary |
| `redbear-greeterd` | Crate tests cover protocol-facing state strings, installed asset paths, bounded restart policy, and now own successful-login session launch directly after response delivery | Full desktop-login trust still depends on wider KDE runtime proof plus the unresolved guest-side Qt plugin-loader defect | | `redbear-greeterd` | Crate tests cover protocol-facing state strings, installed asset paths, bounded restart policy, and now own successful-login session launch directly after response delivery | Full desktop-login trust still depends on wider KDE runtime proof; the remaining instability is KWin compositor startup, not greeter/auth protocol wiring |
| Greeter validation helpers | `redbear-greeter-check` + `test-greeter-qemu.sh` exist and are wired for bounded runtime proof | The successful-login path is validation-only and does not replace broader KDE session proof; current graphical proof is blocked by guest-side Qt plugin parsing rather than by greeter protocol/packaging gaps | | Greeter validation helpers | `redbear-greeter-check` + `test-greeter-qemu.sh` exist and are wired for bounded runtime proof | The successful-login path is validation-only and does not replace broader KDE session proof, but the bounded QEMU greeter proof now passes |
| `redbear-greeter` packaging | Builds in-tree | Qt/QML UI binary, compositor wrapper, and branded assets are packaged; broader runtime trust still remains experimental because the guest-side Qt plugin loader currently rejects shared platform plugins (`libqminimal.so`, KWin QPA) as invalid ELF during metadata scan | | `redbear-greeter` packaging | Builds in-tree | Qt/QML UI binary, compositor wrapper, branded assets, and a shared login-protocol crate are present; Qt shared-plugin loading now works in the guest, while broader KWin runtime stability still remains experimental |
This means Red Bear now has a credible **build-visible login boundary**, but not yet a runtime-trusted This means Red Bear now has a credible **bounded runtime-visible login boundary**, but not yet a
graphical login surface. runtime-trusted general-purpose graphical login surface.
### 4. KWin reduced build is now dependency-honest, but runtime proof is still missing (desktop-session gate) ### 4. KWin reduced build is now dependency-honest, but runtime proof is still missing (desktop-session gate)
@@ -183,11 +183,11 @@ exercised on real Intel and AMD hardware.
## Bottom Line ## Bottom Line
The Red Bear desktop stack has crossed major build-side gates: The Red Bear desktop stack has crossed major build-side gates and one important bounded runtime gate:
- All Qt6 core modules, all 32 KF6 frameworks, Mesa EGL/GBM/GLES2, and D-Bus build - All Qt6 core modules, all 32 KF6 frameworks, Mesa EGL/GBM/GLES2, and D-Bus build
- Four supported compile targets exist, with desktop/graphics on `redbear-full` and `redbear-live-full` - Four supported compile targets exist, with desktop/graphics on `redbear-full` and `redbear-live-full`
- the non-visual Red Bear-native greeter/login pieces now build and test - the Red Bear-native greeter/login path now has a bounded passing QEMU proof (`GREETER_HELLO=ok`, `GREETER_INVALID=ok`, `GREETER_VALID=ok`)
- relibc compatibility is materially stronger than before - relibc compatibility is materially stronger than before
The remaining work is **runtime validation, greeter/UI completion, session assembly, and the remaining KDE session/runtime proof work**. The remaining work is **broader runtime validation, compositor/session stability, and the remaining KDE session/runtime proof work**.
Phase 1 (Runtime Substrate Validation) remains the immediate broad target, while the new greeter/login path and the KWin reduced path both still need bounded runtime proof before stronger claims are safe. Phase 1 (Runtime Substrate Validation) remains the immediate broad target. The key current boundary is now explicit: the greeter/login slice has crossed its bounded proof gate, the old `kwin_wayland` page-fault path has been removed, and current QEMU now fails lower in the desktop/runtime layer with a clean no-usable-DRM limitation rather than with a compositor crash.
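
The bounded greeter proof above is reported through three markers (`GREETER_HELLO=ok`, `GREETER_INVALID=ok`, `GREETER_VALID=ok`). As a rough illustration of how a wrapper could decide pass/fail from captured harness output, here is a minimal Python sketch. It assumes the markers appear one per line as `KEY=value` in the captured log, which this document does not specify; the function name `greeter_proof_passed` is hypothetical and is not part of `test-greeter-qemu.sh` or `redbear-greeter-check`.

```python
# Markers named in the status text above; all three must report "ok".
REQUIRED_MARKERS = ("GREETER_HELLO", "GREETER_INVALID", "GREETER_VALID")


def greeter_proof_passed(log_text: str) -> bool:
    """Return True only if every required marker is present with value 'ok'.

    Assumption: each marker is printed on its own line as KEY=value.
    """
    seen = {}
    for line in log_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key in REQUIRED_MARKERS:
            seen[key] = value  # a later line for the same key overrides an earlier one
    return all(seen.get(name) == "ok" for name in REQUIRED_MARKERS)
```

A missing marker is treated the same as a failed one, so a run that dies before reaching the successful-login step cannot be mistaken for a pass.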
+54 -58
@@ -1,51 +1,30 @@
# Red Bear OS Greeter / Login Implementation Plan # Red Bear OS Greeter / Login Implementation Plan
**Version:** 1.0 — 2026-04-19 **Version:** 1.0 — 2026-04-19
**Status:** Active plan with experimental implementation in progress on `redbear-full` **Status:** Active plan with bounded greeter/login proof now passing on `redbear-full`; broader desktop-runtime trust still remains experimental
**Scope:** Red Bear-native graphical greeter, authentication boundary, and session handoff for the KDE-on-Wayland desktop path **Scope:** Red Bear-native graphical greeter, authentication boundary, and session handoff for the KDE-on-Wayland desktop path
**Parent plans:** `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` (v2.0), `local/docs/DBUS-INTEGRATION-PLAN.md` **Parent plans:** `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` (v2.0), `local/docs/DBUS-INTEGRATION-PLAN.md`
--- ---
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [Scope and Non-Goals](#2-scope-and-non-goals)
3. [Evidence Model](#3-evidence-model)
4. [Current State Assessment](#4-current-state-assessment)
5. [Decision Record: Login-Manager Direction](#5-decision-record-login-manager-direction)
6. [Architecture Principles](#6-architecture-principles)
7. [Architecture Design](#7-architecture-design)
8. [Component Specifications](#8-component-specifications)
9. [Protocols and Session Contracts](#9-protocols-and-session-contracts)
10. [Phased Implementation](#10-phased-implementation)
11. [Testing and Validation](#11-testing-and-validation)
12. [Risks and Mitigations](#12-risks-and-mitigations)
13. [Relationship to Other Plans](#13-relationship-to-other-plans)
14. [File and Recipe Inventory](#14-file-and-recipe-inventory)
15. [Open Questions](#15-open-questions)
---
## 1. Executive Summary ## 1. Executive Summary
Red Bear OS currently has enough session substrate to start **one fixed KDE Wayland session**, but it does Red Bear OS currently has enough session substrate to start **one fixed KDE Wayland session** and now
not yet have a real graphical login path. has a bounded Red Bear-native graphical greeter/login proof on `redbear-full`, but it does not yet
have a runtime-trusted generally stable desktop login surface.
What exists today: What exists today:
- `dbus-daemon` on the system bus - `dbus-daemon` on the system bus
- `redbear-sessiond` exposing a minimal `org.freedesktop.login1` subset for KWin - `redbear-sessiond` exposing a minimal `org.freedesktop.login1` subset for KWin
- `seatd` as the seat/libseat backend - `seatd` as the seat/libseat backend
- a direct session launcher (`redbear-kde-session`) that starts `kwin_wayland` - a direct session launcher (`redbear-kde-session`) that now starts `kwin_wayland_wrapper --drm`
- fallback text `getty` surfaces on VT2 and `/scheme/debug/no-preserve` - fallback text `getty` surfaces on VT2 and `/scheme/debug/no-preserve`
What does **not** exist today: What does **not** exist today:
- no display manager - no display manager
- no graphical greeter - no runtime-trusted generally stable compositor-backed desktop login surface
- no authentication daemon
- no session-launch privilege boundary
- no PAM-backed or systemd-logind-shaped login stack - no PAM-backed or systemd-logind-shaped login stack
This plan defines the forward path for the missing layer: This plan defines the forward path for the missing layer:
@@ -136,8 +115,8 @@ Rules:
| display VT activation | `29_activate_console.service` in desktop configs | ✅ usable (bounded) | `inputd -A 3` activates desktop VT | | display VT activation | `29_activate_console.service` in desktop configs | ✅ usable (bounded) | `inputd -A 3` activates desktop VT |
| fallback text login | `30_console.service` | ✅ boots | `getty 2` on VT2 | | fallback text login | `30_console.service` | ✅ boots | `getty 2` on VT2 |
| debug console | `31_debug_console.service` | ✅ boots | `getty /scheme/debug/no-preserve -J` | | debug console | `31_debug_console.service` | ✅ boots | `getty /scheme/debug/no-preserve -J` |
| direct KDE session launcher | `/usr/bin/redbear-kde-session` | ✅ builds, experimental | Starts session bus if needed, then `exec kwin_wayland --replace` | | direct KDE session launcher | `/usr/bin/redbear-kde-session` | ✅ builds, experimental | Starts session bus if needed, then `exec kwin_wayland_wrapper --drm` |
| authentication daemon | `local/recipes/system/redbear-authd/` | ✅ builds, experimental | Local-user auth boundary with `/etc/passwd` / `/etc/shadow` / `/etc/group` parsing and SHA-crypt verification | | authentication daemon | `local/recipes/system/redbear-authd/` | ✅ builds, experimental | Local-user auth boundary with `/etc/passwd` / `/etc/shadow` / `/etc/group` parsing plus SHA-crypt and Argon2 verification |
| session launcher boundary | `local/recipes/system/redbear-session-launch/` | ✅ builds, experimental | User-session bootstrap with bounded environment/runtime-dir setup | | session launcher boundary | `local/recipes/system/redbear-session-launch/` | ✅ builds, experimental | User-session bootstrap with bounded environment/runtime-dir setup |
| greeter daemon scaffold | `local/recipes/system/redbear-greeter/` | ✅ builds, experimental | Root-owned greeter orchestrator, socket protocol, bounded restart policy | | greeter daemon scaffold | `local/recipes/system/redbear-greeter/` | ✅ builds, experimental | Root-owned greeter orchestrator, socket protocol, bounded restart policy |
| greeter config fragment | `config/redbear-greeter-services.toml` | ✅ builds, experimental | Adds `19_redbear-authd.service`, `20_greeter.service`, compatibility `20_display.service`, and fallback console dependencies | | greeter config fragment | `config/redbear-greeter-services.toml` | ✅ builds, experimental | Adds `19_redbear-authd.service`, `20_greeter.service`, compatibility `20_display.service`, and fallback console dependencies |
@@ -148,33 +127,29 @@ Rules:
| Component | Status | Gap | | Component | Status | Gap |
|---|---|---| |---|---|---|
| `redbear-sessiond` seat switching | ⚠️ scaffold | `Seat.SwitchTo` is currently logged/delegated externally to `inputd -A` | | `redbear-sessiond` seat switching | ✅ boundedly implemented | `Seat.SwitchTo` now delegates to `inputd -A <vt>`; remaining compositor stability is not blocked on a seat-switch no-op anymore |
| KDE runtime services | ⚠️ partial | D-Bus substrate exists, but broader Plasma session services remain incomplete | | KDE runtime services | ⚠️ partial | D-Bus substrate exists, but broader Plasma session services remain incomplete |
| `redbear-full` greeter flow | ⚠️ experimental | Non-visual pieces are implemented, but final packaged UI and bounded runtime proof are still pending | | `redbear-full` greeter flow | ✅ bounded proof passes | Packaged UI, auth/session plumbing, and bounded compositor-backed greeter proof now work end to end; the old `kwin_wayland` page-fault path is gone, and current QEMU now stops at a clean no-usable-DRM limitation below the greeter slice |
| greeter runtime validation | ⚠️ partial | `redbear-greeter-check` + `test-greeter-qemu.sh` exist, but final proof still depends on the packaged greeter UI | | greeter runtime validation | ✅ bounded proof passes | `redbear-greeter-check` + `test-greeter-qemu.sh` now pass hello, invalid-login, and validation-only successful-login return-to-greeter flow |
### 4.3 What Does Not Exist ### 4.3 What Does Not Exist
| Missing piece | Why it matters | | Missing piece | Why it matters |
|---|---| |---|---|
| packaged graphical greeter UI | no complete user-visible graphical login surface is staged yet |
| bounded end-to-end login proof | build-side pieces exist, but runtime-trusted login/session handoff is not proven yet |
| shared login protocol extraction | current protocol is encoded directly in the first-cut daemon/checker implementations |
| display-manager package integration | no SDDM/greetd/lightdm/ly path in repo | | display-manager package integration | no SDDM/greetd/lightdm/ly path in repo |
### 4.4 Baseline Conclusion ### 4.4 Baseline Conclusion
The current Red Bear desktop path can **start a session**, but it cannot yet **own a login flow**. The current Red Bear desktop path can now **own a bounded login flow**, and the active greeter/login
implementation bar in this plan is substantially met.
The missing work is not “port more KDE packages.” The missing work is the **login boundary**: The remaining blocker to a stronger desktop-runtime claim is now evidence-backed as **below this
greeter slice**: the old `kwin_wayland` crash path has been eliminated, and current QEMU now reaches
clean `No suitable DRM devices have been found` exits instead. That means the follow-on work has
shifted to the parent desktop/Wayland/runtime plans rather than to missing core greeter/auth/session-boundary pieces here.
1. graphical greeter surface, Future work beyond this plan should continue **without** replacing the current seat/session substrate
2. authentication boundary, and without removing existing console recovery paths.
3. session-launch privilege drop,
4. clean handoff into the existing KDE Wayland session path.
That login boundary must be added **without** replacing the current seat/session substrate and
without removing existing console recovery paths.
--- ---
@@ -524,6 +499,7 @@ elsewhere, but the greeter/auth path must interact only with installed runtime a
- `XDG_RUNTIME_DIR=/run/user/$UID` - `XDG_RUNTIME_DIR=/run/user/$UID`
- `XDG_SESSION_TYPE=wayland` - `XDG_SESSION_TYPE=wayland`
- `XDG_CURRENT_DESKTOP=KDE` - `XDG_CURRENT_DESKTOP=KDE`
- `XDG_SESSION_ID=c1`
- `KDE_FULL_SESSION=true` - `KDE_FULL_SESSION=true`
- `WAYLAND_DISPLAY=wayland-0` - `WAYLAND_DISPLAY=wayland-0`
- `XDG_SEAT=seat0` - `XDG_SEAT=seat0`
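
The environment entries visible in this hunk can be summarized as a small mapping keyed by user ID. The sketch below is illustrative only: the real contract lives in `redbear-session-launch`, the hunk may show only part of the full list, and the helper name `build_session_env` is hypothetical.

```python
def build_session_env(uid: int, session_id: str = "c1") -> dict[str, str]:
    """Build the session variables shown in this hunk for one user.

    Assumption: only the variables visible in the diff context are included;
    the complete contract may define more.
    """
    return {
        "XDG_RUNTIME_DIR": f"/run/user/{uid}",  # per-user runtime directory
        "XDG_SESSION_TYPE": "wayland",
        "XDG_CURRENT_DESKTOP": "KDE",
        "XDG_SESSION_ID": session_id,  # "c1" is the value added by this change
        "KDE_FULL_SESSION": "true",
        "WAYLAND_DISPLAY": "wayland-0",
        "XDG_SEAT": "seat0",
    }
```

For example, `build_session_env(1000)["XDG_RUNTIME_DIR"]` evaluates to `/run/user/1000`.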
@@ -571,7 +547,11 @@ Required behavior:
## 10. Phased Implementation ## 10. Phased Implementation
### Phase G0 — Scope Freeze and Wiring Baseline > **Current implementation note:** the repo has now crossed the bounded proof bar through the core
> G0–G4 path and parts of G5. The phase breakdown below remains useful as an ownership and acceptance
> model, but it should be read as an active status ladder rather than as an untouched future-only plan.
### Phase G0 — Scope Freeze and Wiring Baseline (✅ boundedly complete)
**Goal:** Freeze the architectural split and identify the tracked desktop profile(s) that will own the **Goal:** Freeze the architectural split and identify the tracked desktop profile(s) that will own the
greeter path. greeter path.
@@ -588,13 +568,13 @@ greeter path.
- session policy is explicit, - session policy is explicit,
- asset source of truth is explicit. - asset source of truth is explicit.
### Phase G1 — Service Skeleton and Boot Wiring ### Phase G1 — Service Skeleton and Boot Wiring (✅ boundedly complete)
**Goal:** Add daemon/package skeletons and init wiring without claiming a usable login flow. **Goal:** Add daemon/package skeletons and init wiring without claiming a usable login flow.
| # | Task | Acceptance criteria | | # | Task | Acceptance criteria |
|---|---|---| |---|---|---|
| G1.1 | Create recipe skeletons | `redbear-greeter`, `redbear-authd`, `redbear-session-launch`, optional `redbear-login-protocol` build and stage | | G1.1 | Create recipe skeletons | `redbear-greeter`, `redbear-authd`, `redbear-session-launch`, and shared `redbear-login-protocol` build and stage |
| G1.2 | Add config fragment | A tracked config fragment wires `20_greeter.service` and supporting files | | G1.2 | Add config fragment | A tracked config fragment wires `20_greeter.service` and supporting files |
| G1.3 | Replace direct display launch in the chosen profile | Desktop profile starts `redbear-greeterd` instead of directly starting `redbear-kde-session` | | G1.3 | Replace direct display launch in the chosen profile | Desktop profile starts `redbear-greeterd` instead of directly starting `redbear-kde-session` |
| G1.4 | Keep text/debug recovery path | VT2 `getty` and debug `getty` still boot | | G1.4 | Keep text/debug recovery path | VT2 `getty` and debug `getty` still boot |
@@ -606,7 +586,7 @@ greeter path.
- image still boots, - image still boots,
- fallback text surfaces remain reachable. - fallback text surfaces remain reachable.
### Phase G2 — Auth Foundation ### Phase G2 — Auth Foundation (✅ boundedly complete)
**Goal:** Prove the local account/authentication boundary independent of the full greeter UI. **Goal:** Prove the local account/authentication boundary independent of the full greeter UI.
@@ -626,7 +606,7 @@ greeter path.
- no UI process reads auth data, - no UI process reads auth data,
- repeated auth failure behavior is bounded and explicit. - repeated auth failure behavior is bounded and explicit.
### Phase G3 — Greeter UI and Daemon State Machine ### Phase G3 — Greeter UI and Daemon State Machine (✅ boundedly complete)
**Goal:** Bring up the graphical greeter surface and daemon orchestration. **Goal:** Bring up the graphical greeter surface and daemon orchestration.
@@ -646,7 +626,7 @@ greeter path.
- no session starts yet without auth success, - no session starts yet without auth success,
- fallback console path remains reachable under greeter failure. - fallback console path remains reachable under greeter failure.
### Phase G4 — Session Handoff to KDE on Wayland ### Phase G4 — Session Handoff to KDE on Wayland (✅ boundedly complete for the current bounded proof)
**Goal:** Replace direct session startup with authenticated session launch. **Goal:** Replace direct session startup with authenticated session launch.
@@ -664,7 +644,7 @@ greeter path.
- session exit returns to greeter, - session exit returns to greeter,
- fallback VT2 login still works. - fallback VT2 login still works.
### Phase G5 — Desktop Integration and Product Surface Hardening ### Phase G5 — Desktop Integration and Product Surface Hardening (⚠️ partial / follow-on)
**Goal:** Move from “bounded login proof” to a product-quality Red Bear login surface. **Goal:** Move from “bounded login proof” to a product-quality Red Bear login surface.
@@ -682,6 +662,14 @@ greeter path.
- hardening checks pass, - hardening checks pass,
- documentation matches shipped surface. - documentation matches shipped surface.
**Current state:**
- G5.1 is present in bounded form through the greeter power-action path,
- G5.3 is substantially present for the tracked `redbear-full` profile wiring,
- G5.4 exists as `local/scripts/test-greeter-qemu.sh` plus in-target `redbear-greeter-check`,
- the remaining open part is moving from bounded proof to stronger desktop-runtime trust and broader
compositor/session stability evidence.
### Critical Path ### Critical Path
```text ```text
@@ -720,15 +708,16 @@ The first bounded integration proofs should answer these questions in order:
### 11.3 Suggested Validation Commands / Harnesses ### 11.3 Suggested Validation Commands / Harnesses
This plan expects a bounded QEMU harness similar in style to existing Red Bear runtime proofs. This plan now has a bounded QEMU/runtime harness in the repo and should continue to follow the same
proof style as other Red Bear runtime validation flows.
Expected future surfaces: Current surfaces:
- `local/scripts/test-greeter-qemu.sh` - `local/scripts/test-greeter-qemu.sh`
- in-target checker such as `redbear-greeter-check` - in-target checker `redbear-greeter-check`
The exact script names are implementation details, but the proof style should match existing bounded The exact script names are still implementation details, but the proof style should match existing
runtime validation patterns already used elsewhere in the repo. bounded runtime validation patterns already used elsewhere in the repo.
### 11.4 Definition of Done ### 11.4 Definition of Done
@@ -743,6 +732,11 @@ This plan is only substantially complete when **all** of the following are true:
- greeter/UI failure does not trap the machine in an unrecoverable restart loop, - greeter/UI failure does not trap the machine in an unrecoverable restart loop,
- the bounded login/logout proof repeats reliably on the intended target class. - the bounded login/logout proof repeats reliably on the intended target class.
**Current status against this bar:** the bounded QEMU greeter proof now satisfies the greeter/login
implementation bar in this plan. The remaining blocker to stronger desktop-session claims reproduces
under direct `dbus-run-session -- redbear-kde-session` as well, so it no longer points to a missing
greeter/auth/session-boundary implementation inside this plan.
--- ---
## 12. Risks and Mitigations ## 12. Risks and Mitigations
@@ -791,6 +785,7 @@ the login/greeter boundary between a booted desktop substrate and a real KDE ses
```text ```text
local/recipes/system/ local/recipes/system/
├── redbear-authd/ ├── redbear-authd/
├── redbear-login-protocol/
├── redbear-session-launch/ ├── redbear-session-launch/
└── redbear-greeter/ └── redbear-greeter/
``` ```
@@ -800,8 +795,9 @@ Current implementation status:
- `redbear-authd/` — implemented (experimental, target-side recipe build proven) - `redbear-authd/` — implemented (experimental, target-side recipe build proven)
- `redbear-session-launch/` — implemented (experimental, target-side recipe build proven) - `redbear-session-launch/` — implemented (experimental, target-side recipe build proven)
- `redbear-greeter/` — implemented as an experimental bounded surface; daemon, Qt/QML UI, compositor wrapper, staged assets, and bounded runtime checks now exist, while broader KDE runtime trust still remains open - `redbear-greeter/` — implemented as an experimental bounded surface; daemon, Qt/QML UI, compositor wrapper, staged assets, and bounded runtime checks now exist, while broader KDE runtime trust still remains open
- Current blocker after the greeter/UI packaging work: guest-side Qt shared-plugin loading on Red Bear still rejects platform plugins during metadata scan (`libqminimal.so`, `qwayland-org.kde.kwin.qpa.so`) even though the plugin files are present in the image and their on-disk ELF headers read correctly via non-Qt tools. This blocks the bounded graphical compositor proof below the greeter slice. - `redbear-login-protocol/` — implemented as a shared local crate for greeter/auth/checker protocol types
- `redbear-login-protocol/` — optional follow-up extraction, not required for the first bounded runtime proof - The previous guest-side Qt shared-plugin metadata blocker is now fixed: `libqminimal.so` and `qwayland-org.kde.kwin.qpa.so` load successfully in the guest once the Redox toolchain's stale `elf.h` is synchronized with relibc's corrected ELF64 typedefs.
- Current remaining desktop-runtime blocker below the greeter slice: on current QEMU the compositor no longer page-faults, but still exits cleanly when no usable DRM device can be opened; the greeter's bounded QEMU proof still passes through hello, invalid-login, and validation-only successful-login return-to-greeter flow.
### 14.3 Proposed New Runtime Files ### 14.3 Proposed New Runtime Files
+1 -9
@@ -2,15 +2,7 @@
**Purpose**: Implementation-ready hardware register and data structure reference for Red Bear OS IOMMU support. Based on AMD IOMMU Specification 48882 Rev 3.10 and Intel Virtualization Technology for Directed I/O (VT-d) Rev 5.0. **Purpose**: Implementation-ready hardware register and data structure reference for Red Bear OS IOMMU support. Based on AMD IOMMU Specification 48882 Rev 3.10 and Intel Virtualization Technology for Directed I/O (VT-d) Rev 5.0.
**Status**: The `iommu` daemon now builds in-tree, owns AMD-Vi runtime initialization, and also detects the presence of a kernel ACPI `DMAR` table so Intel VT-d runtime ownership can converge here instead of remaining conceptually stranded in `acpid`. Hardware validation is still missing in the AMD-first integration plan (see `AMD-FIRST-INTEGRATION.md`). This document provides the register and data-structure reference for finishing AMD-Vi and Intel VT-d bring-up. **Status**: The `iommu` daemon now builds in-tree, owns AMD-Vi runtime initialization, and also detects the presence of a kernel ACPI `DMAR` table as a convergence seam so Intel VT-d runtime ownership can move here instead of remaining conceptually stranded in `acpid`. That does **not** yet mean Intel VT-d ownership is cleanly closed in the current tree. Hardware validation is still missing in the AMD-first integration plan (see `AMD-FIRST-INTEGRATION.md`). This document provides the register and data-structure reference for finishing AMD-Vi and Intel VT-d bring-up.
---
## Table of Contents
1. [AMD-Vi (AMD IOMMU)](#1-amd-vi-amd-iommu)
2. [Intel VT-d](#2-intel-vt-d)
3. [Rust Struct Definitions](#3-rust-struct-definitions)
--- ---
@@ -5,6 +5,21 @@
This document assesses the current IRQ and low-level controller implementation in Red Bear OS for This document assesses the current IRQ and low-level controller implementation in Red Bear OS for
completeness and quality, then defines the next enhancement plan in execution order. completeness and quality, then defines the next enhancement plan in execution order.
This is the **canonical current implementation plan** for PCI interrupt plumbing, IRQ delivery,
MSI/MSI-X quality, low-level controller runtime proof, and IOMMU/interrupt-remapping follow-up
work.
When another document discusses PCI, IRQ, MSI/MSI-X, `pcid`, `pcid-spawner`, or low-level
controller execution order, prefer this file for:
- the current robustness judgment,
- the current implementation order,
- the current validation/proof expectations,
- and the current language for build-visible vs runtime-proven vs hardware-validated claims.
Other documents may still hold deeper architecture or donor-material detail, but they should point
here instead of acting as competing execution authorities.
It is grounded in the current repository state, especially: It is grounded in the current repository state, especially:
- `local/recipes/drivers/redox-driver-sys/` - `local/recipes/drivers/redox-driver-sys/`
@@ -106,6 +121,73 @@ controller-specific validation.
- Low-level controller quality is uneven: ACPI/APIC are much further along than IOMMU, MSI-X, and - Low-level controller quality is uneven: ACPI/APIC are much further along than IOMMU, MSI-X, and
controller-specific runtime characterization. controller-specific runtime characterization.
## Current Robustness Assessment
### Bottom line
The PCI/IRQ stack is now **architecturally credible and usable for bounded bring-up plus QEMU/runtime
proof**, but it is **not yet release-grade robust end to end**.
The strongest layers are:
- the kernel IRQ substrate,
- the `scheme:irq` delivery model,
- and the shared `redox-driver-sys` PCI/IRQ helper layer.
The weakest layers are:
- `pcid` and `pcid-spawner`,
- the `pcid` driver-interface helper surface,
- the shared `virtio-core` MSI-X setup path,
- and several upstream-owned base drivers that still panic on missing BARs, missing interrupt
handles, impossible feature states, or scheme-operation failures.
### What is materially strong today
- Kernel IRQ ownership is real and active: PIC, IOAPIC, LAPIC/x2APIC, IDT reservation, masking,
EOI, and spurious IRQ accounting all exist in the checked-in kernel.
- `redox-driver-sys` is the strongest PCI/IRQ userspace substrate: typed BAR parsing, quirk-aware
interrupt-support reporting, IRQ handle abstractions, MSI-X table helpers, affinity helpers, and
direct host-runnable substrate tests all exist.
- `redox-drm` consumes the interrupt substrate honestly with MSI-X → MSI → legacy fallback and
quirk-aware downgrade policy.
- `iommu` and the low-level proof scripts provide bounded runtime evidence rather than pretending
broader hardware support exists.
### What is still fragile today
- `pcid` and `pcid-spawner` still assume a trusted environment too often: launch sequencing,
device-enable timing, and several error paths are weaker than the substrate beneath them.
- `pcid` helper files such as `driver_interface/{bar,cap,irq_helpers,msi}.rs` still treat several
malformed-device or unsupported-state cases as invariants rather than recoverable failures.
- `virtio-core` still hard-requires MSI-X in its active x86 path and uses assert/expect/
unreachable semantics for feature/capability assumptions that are acceptable for bounded proof,
but weak for a general PCI substrate.
- A broad set of shipped consumers (`rtl8168d`, `rtl8139d`, `ixgbed`, `ac97d`, `ihdad`, `ided`,
`vboxd`, `virtio-blkd`, `virtio-gpud`, `ihdgd`, and others) still encode panic-grade startup
assumptions around BARs, IRQs, or scheme operations.
### Validation truth
- MSI-X and IOMMU now have **bounded QEMU/runtime proof**.
- xHCI interrupt mode also has bounded QEMU proof.
- That is enough to justify further implementation work and proof tooling.
- It is **not** enough to justify broad hardware robustness claims for PCI/IRQ handling.
## Current Authority Split
For PCI/IRQ planning and current-state language, use the repo doc set this way:
- **This file** — canonical implementation plan and current robustness judgment for PCI/IRQ and
low-level controllers.
- `local/docs/LINUX-BORROWING-RUST-IMPLEMENTATION-PLAN.md` — donor-material and Rust-rewrite
policy only; not the execution authority for PCI/IRQ rollout.
- `local/docs/IOMMU-SPEC-REFERENCE.md` — specification/reference detail for AMD-Vi / VT-d.
- `local/docs/QUIRKS-SYSTEM.md` and `local/docs/QUIRKS-IMPROVEMENT-PLAN.md` — quirk-policy source
of truth and forward convergence work.
- `README.md`, `docs/README.md`, `AGENTS.md`, and `local/AGENTS.md` — public/current-state summary
surfaces that should point here rather than restating competing PCI/IRQ execution plans.
## Architectural Assessment ## Architectural Assessment
### 1. IRQ delivery architecture ### 1. IRQ delivery architecture
@@ -185,7 +267,9 @@ especially under real runtime scenarios.
Strengths: Strengths:
- MADT entries for xAPIC/x2APIC/NMI are handled. - MADT entries for xAPIC/x2APIC/NMI are handled.
- ACPI reboot/shutdown/power methods exist. - ACPI reboot/shutdown/power methods are implemented, but robustness, sleep-state scope beyond
`\_S5`, and bounded validation still remain open as tracked in
`local/docs/ACPI-IMPROVEMENT-PLAN.md`.
- x2APIC and SMP platform bring-up have already crossed the foundational threshold. - x2APIC and SMP platform bring-up have already crossed the foundational threshold.
Open enhancement items: Open enhancement items:
@@ -499,6 +583,185 @@ already restored and QEMU-proven.
## Execution Plan ## Execution Plan
## Detailed Implementation Plan
The remaining work should be executed in six waves. The order matters: shared control-plane and
helper hardening comes before broad driver cleanup, and runtime-proof/observability comes before any
stronger public claim language.
### Wave 1 — Harden `pcid` / `pcid-spawner` orchestration
**Primary targets**
- `recipes/core/base/source/drivers/pcid/src/{main,scheme,driver_handler}.rs`
- `recipes/core/base/source/drivers/pcid-spawner/src/main.rs`
**Implement**
- replace panic-grade launch/setup assumptions with explicit failure states
- make device-discovered / config-matched / driver-launch-attempted / driver-launch-failed /
device-enabled states observable
- stop leaving device state ambiguous when spawn fails after partial setup
- emit explicit logs for chosen driver, config source, interrupt mode, and launch result
**Acceptance**
- no normal launch/config mismatch path depends on `expect`/`unreachable!`
- failed driver launch is bounded and observable
- enable-before-spawn behavior is either removed or explicitly justified and logged
**Verification**
- `cargo test -p pcid`
- `cargo test -p pcid-spawner`
- verify `redbear-info` still reports the expected PCI surfaces after boot
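The observable launch states listed above can be sketched as a small state trail. This is a minimal illustration, not the `pcid-spawner` implementation: `LaunchState`, `launch_device`, and the `spawn` callback are hypothetical names, and the sketch assumes the "enable only after a successful spawn" option rather than a justified enable-before-spawn.

```rust
/// Hypothetical launch states for one PCI device, mirroring the Wave 1 list.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LaunchState {
    Discovered,
    ConfigMatched { driver: String },
    LaunchAttempted { driver: String },
    LaunchFailed { driver: String, reason: String },
    Enabled { driver: String },
}

/// Walk one device through the launch sequence and return every state it
/// passed through, so a caller can log or expose the full trail.
/// `spawn` stands in for the real driver process launch (assumption).
pub fn launch_device<F>(driver: &str, spawn: F) -> Vec<LaunchState>
where
    F: FnOnce(&str) -> Result<(), String>,
{
    let name = driver.to_string();
    let mut trail = vec![
        LaunchState::Discovered,
        LaunchState::ConfigMatched { driver: name.clone() },
        LaunchState::LaunchAttempted { driver: name.clone() },
    ];
    // The device is only marked enabled after the spawn succeeded, so a
    // failed launch never leaves it in an ambiguous half-enabled state.
    match spawn(driver) {
        Ok(()) => trail.push(LaunchState::Enabled { driver: name }),
        Err(reason) => trail.push(LaunchState::LaunchFailed { driver: name, reason }),
    }
    trail
}
```

Returning the whole trail instead of only the final state keeps the failure bounded and gives a log line for each transition.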
### Wave 2 — Fix `pcid` helper contract
**Primary targets**
- `recipes/core/base/source/drivers/pcid/src/driver_interface/{bar,cap,config,irq_helpers,msi,mod}.rs`
**Implement**
- replace panic-style BAR/capability helpers with typed error-returning variants
- treat malformed vendor capabilities as device faults, not invariants
- make IRQ allocation and MSI/MSI-X selection explicit return values
- keep any `expect_*` helpers as thin wrappers only where absolutely necessary
**Acceptance**
- helper layer is error-returning by default
- malformed BAR/capability state no longer aborts bring-up by default
- vector selection and failure reasons are reportable state, not only implicit side effects
**Verification**
- `cargo test -p pcid`
- add unit tests for malformed BARs, malformed vendor caps, and IRQ-allocation failure behavior
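The "error-returning by default, `expect_*` as a thin wrapper" contract could look roughly like the following. It is a sketch under stated assumptions: `Bar`, `BarError`, `try_memory_bar`, and `expect_memory_bar` are illustrative names and do not describe the existing `driver_interface` types.

```rust
/// Hypothetical decoded BAR; the real `pcid` representation may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bar {
    Unused,
    Memory { addr: u64, size: u64 },
    Port { port: u16 },
}

/// Typed reasons a BAR lookup can fail, reported instead of panicking.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BarError {
    OutOfRange { index: usize },
    Missing { index: usize },
    WrongKind { index: usize, expected: &'static str },
    ZeroSized { index: usize },
}

/// Checked accessor: malformed or absent BARs become errors the driver can
/// log and handle, rather than invariants that abort bring-up.
pub fn try_memory_bar(bars: &[Bar; 6], index: usize) -> Result<(u64, u64), BarError> {
    match bars.get(index).copied() {
        None => Err(BarError::OutOfRange { index }),
        Some(Bar::Unused) => Err(BarError::Missing { index }),
        Some(Bar::Port { .. }) => Err(BarError::WrongKind { index, expected: "memory" }),
        Some(Bar::Memory { size: 0, .. }) => Err(BarError::ZeroSized { index }),
        Some(Bar::Memory { addr, size }) => Ok((addr, size)),
    }
}

/// Thin panicking wrapper, kept only for callers that truly cannot continue.
pub fn expect_memory_bar(bars: &[Bar; 6], index: usize) -> (u64, u64) {
    try_memory_bar(bars, index)
        .unwrap_or_else(|e| panic!("BAR {}: {:?}", index, e))
}
```

Keeping the panicking variant as a one-line wrapper over the checked one means both paths share a single decoding rule, and the unit tests for malformed BARs only need to target `try_memory_bar`.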
### Wave 3 — Harden shared `virtio-core` IRQ/MSI-X setup
**Primary targets**
- `recipes/core/base/source/drivers/virtio-core/src/{probe,transport,arch/x86}.rs`
**Implement**
- remove assert/expect assumptions around virtio identity, capability presence, and MSI-X setup
- return typed probe/setup failure instead
- emit explicit logs for “virtio present but unsupported/incomplete” states
- make chosen interrupt mode visible to runtime tools and proof scripts
**Acceptance**
- partial or malformed virtio devices fail probe cleanly
- MSI-X setup failure is a bounded bring-up error instead of a crash path
**Verification**
- `cargo test -p virtio-core`
- re-run virtio-using consumer/runtime tools after the change
### Wave 4 — Convert highest-risk consumers
**Primary consumer batches**
- network/storage/audio: `rtl8168d`, `rtl8139d`, `ixgbed`, `ahcid`, `nvmed`, `virtio-blkd`,
`ided`, `ac97d`, `ihdad`
- graphics/VM: `ihdgd`, `virtio-gpud`, `vboxd`
**Implement**
- move drivers onto checked helper APIs from Wave 2
- remove direct panic-grade assumptions around BARs, legacy IRQs, and scheme operations
- make legacy-only or bounded-mode requirements explicit and logged
**Acceptance**
- consumer startup fails cleanly for unsupported or partial hardware states
- legacy-only requirements are explicit compatibility outcomes, not abrupt aborts
**Verification**
- per-driver crate tests where available
- existing bounded proof scripts still pass after the helper and consumer migration
### Wave 5 — Improve observability and proof
**Primary targets**
- `local/recipes/system/redbear-info/source/src/main.rs`
- `local/recipes/system/redbear-hwutils/source/src/bin/{lspci,redbear-phase-iommu-check}.rs`
- `local/scripts/test-{msix,iommu,xhci-irq,lowlevel-controllers}-qemu.sh`
- `local/docs/{REDBEAR-INFO-RUNTIME-REPORT,SCRIPT-BEHAVIOR-MATRIX}.md`
**Implement**
- expose chosen interrupt mode per device
- expose fallback reason: MSI-X unavailable, quirk-disabled, vector allocation failed, legacy-only,
etc.
- keep QEMU-first proofs explicit and bounded
- separate “installed”, “configured”, “active”, and “runtime-functional” states clearly
**Acceptance**
- operators can answer “what IRQ mode is this device using and why?” from runtime tooling
- proof scripts distinguish architecture proof from hardware proof cleanly
**Verification**
- `cargo test` for `redbear-info`
- `cargo test` for `redbear-hwutils`
- re-run bounded proof scripts and confirm mode/fallback signals appear in output
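One way to make "what IRQ mode is this device using and why?" answerable is to have mode selection return the skipped modes together with their reasons. The sketch below follows the MSI-X → MSI → legacy order described earlier in this plan; `DeviceCaps`, `FallbackReason`, `Selection`, `select_irq_mode`, and the `try_alloc` callback are hypothetical and are not taken from `redox-driver-sys` or `redbear-info`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IrqMode {
    MsiX,
    Msi,
    Legacy,
    None,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FallbackReason {
    MsiXUnavailable,
    MsiUnavailable,
    QuirkDisabled,
    VectorAllocationFailed,
}

/// Hypothetical capability/quirk summary for one device.
#[derive(Debug, Clone, Copy)]
pub struct DeviceCaps {
    pub has_msix: bool,
    pub has_msi: bool,
    pub has_legacy: bool,
    pub quirk_no_msix: bool,
    pub quirk_no_msi: bool,
}

/// The chosen mode plus every preferred mode that was skipped, and why.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Selection {
    pub mode: IrqMode,
    pub skipped: Vec<(IrqMode, FallbackReason)>,
}

/// Try MSI-X, then MSI, then legacy. `try_alloc` stands in for the real
/// vector allocation and returns whether it succeeded (assumption).
pub fn select_irq_mode(
    caps: &DeviceCaps,
    mut try_alloc: impl FnMut(IrqMode) -> bool,
) -> Selection {
    let mut skipped = Vec::new();
    let candidates = [
        (IrqMode::MsiX, caps.has_msix, caps.quirk_no_msix, FallbackReason::MsiXUnavailable),
        (IrqMode::Msi, caps.has_msi, caps.quirk_no_msi, FallbackReason::MsiUnavailable),
    ];
    for (mode, present, quirked, absent_reason) in candidates {
        if !present {
            skipped.push((mode, absent_reason));
        } else if quirked {
            skipped.push((mode, FallbackReason::QuirkDisabled));
        } else if try_alloc(mode) {
            return Selection { mode, skipped };
        } else {
            skipped.push((mode, FallbackReason::VectorAllocationFailed));
        }
    }
    let mode = if caps.has_legacy { IrqMode::Legacy } else { IrqMode::None };
    Selection { mode, skipped }
}
```

A runtime tool could print `mode` and the `skipped` list per device, which gives proof scripts a stable signal to check for both the chosen mode and the fallback reason.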
### Wave 6 — Docs and hardware-evidence sync
**Primary targets**
- `README.md`
- `AGENTS.md`
- `local/AGENTS.md`
- `docs/README.md`
- this file
- `local/docs/{IOMMU-SPEC-REFERENCE,QUIRKS-SYSTEM,QUIRKS-IMPROVEMENT-PLAN,REDBEAR-INFO-RUNTIME-REPORT,SCRIPT-BEHAVIOR-MATRIX}.md`
**Implement**
- sync public/current-state docs to the actual proof scope
- keep QEMU/runtime proof separate from hardware proof
- keep the repo-facing status story aligned with this canonical plan
**Acceptance**
- no doc claims broader PCI/IRQ robustness than tests actually prove
- public/current-state docs all point here for PCI/IRQ execution authority
**Verification**
- manual doc/code cross-check against current runtime tools and proof scripts
### Recommended commit boundaries
1. `pcid` / `pcid-spawner` orchestration hardening
2. `pcid` helper API hardening
3. `virtio-core` MSI-X/probe hardening
4. consumer batch A
5. consumer batch B
6. observability/runtime tools
7. docs/proof sync
### Definition of done
This plan's implementation work is materially complete only when:
- no panic-grade behavior remains on normal PCI/IRQ failure paths in `pcid`, shared helpers, or key
consumers,
- IRQ mode selection is observable,
- bounded proof scripts still pass,
- docs match actual proof scope,
- and the remaining gaps are clearly hardware-evidence gaps rather than hidden code fragility.
### Step A — Establish validation vocabulary in all related docs ### Step A — Establish validation vocabulary in all related docs
For every low-level controller area, use the same four states consistently: For every low-level controller area, use the same four states consistently:
@@ -1,14 +1,12 @@
# Linux Borrowing and Rust Implementation Plan for Red Bear OS # Linux Borrowing and Rust Implementation Plan for Red Bear OS
**Date:** 2026-04-18 **Date:** 2026-04-18
**Status:** Planning authority for Linux-derived borrowing boundaries and Rust rewrite guidance across low-level subsystem work **Status:** Planning authority for Linux-derived borrowing boundaries and Rust rewrite guidance across low-level subsystem work. PCI/IRQ rollout authority lives in `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md`.
**Scope:** Hardware enablement, ACPI including suspend/resume, low-level startup/init, PCI, IRQ/MSI/MSI-X, PS/2 init, IOMMU, USB/xHCI/storage, bounded Wi-Fi transport reuse, and selective GPU/DRM orchestration reuse **Scope:** Hardware enablement, ACPI including suspend/resume, low-level startup/init, PCI, IRQ/MSI/MSI-X, PS/2 init, IOMMU, USB/xHCI/storage, bounded Wi-Fi transport reuse, and selective GPU/DRM orchestration reuse
## Intent ## Intent
This document answers a specific Red Bear question: Which Linux kernel source and Linux documentation already present in this repo should be used as donor material for Red Bear OS, what should be rewritten into Rust, what should remain reference-only, and where should that logic live in Red Bear's architecture?
> Which Linux kernel source and Linux documentation already present in this repo should be used as donor material for Red Bear OS, what should be rewritten into Rust, what should remain reference-only, and where should that logic live in Red Bear's architecture?
This plan is intentionally **Red Bear-native**. It does **not** propose importing Linux subsystem architecture into Red Bear. This plan is intentionally **Red Bear-native**. It does **not** propose importing Linux subsystem architecture into Red Bear.
@@ -17,15 +15,14 @@ This plan is intentionally **Red Bear-native**. It does **not** propose importin
The software-only, bounded slices from this plan that are now implemented in code are: The software-only, bounded slices from this plan that are now implemented in code are:
- **Phase A — PCI / IRQ substrate** - **Phase A — PCI / IRQ substrate**
- shared PCI config-space parsing now preserves capability chains in `redox-driver-sys` - bounded shared substrate slices landed in code (`redox-driver-sys` capability-chain parsing,
- shared quirk-aware interrupt support summary exists (`none` / `legacy` / `msi` / `msix`) interrupt-support summary, and early `pcid` convergence)
- `pcid` now consumes the shared PCI parser in its header path for interrupt-support reporting, - the **canonical execution order, current robustness judgment, and remaining implementation work**
which starts the planned downstream convergence onto the shared substrate instead of keeping all for PCI/IRQ now live in `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md`
capability interpretation local.
- **Phase B — ACPI / IOMMU groundwork** - **Phase B — ACPI / IOMMU groundwork**
- `acpid` now has an explicit userspace sleep-target model for `S1` / `S3` / `S4` / `S5` - `acpid` now carries an explicit userspace sleep-target model naming `S1` / `S3` / `S4` / `S5`
- `_S5` shutdown routes through that model, while non-`S5` targets remain groundwork-only - only `_S5` soft-off is materially wired today; non-`S5` targets remain groundwork-only
- `iommu` now detects kernel ACPI `DMAR` presence, establishing the Intel VT-d ownership seam - `iommu` now detects kernel ACPI `DMAR` presence as a convergence seam, but Intel VT-d runtime ownership is not yet cleanly closed outside `acpid`
- **Phase C — PS/2 / USB / storage** - **Phase C — PS/2 / USB / storage**
- `ps2d` now flushes stale controller output during probe and around core init/self-test - `ps2d` now flushes stale controller output during probe and around core init/self-test
- `xhcid` now tracks active alternate settings and resolves endpoint descriptors through that map - `xhcid` now tracks active alternate settings and resolves endpoint descriptors through that map
@@ -253,7 +250,7 @@ Reason: all of those conflict with the ownership rules that Red Bear already imp
## 5. What Red Bear still materially needs ## 5. What Red Bear still materially needs
- ACPI sleep beyond `_S5` - ACPI sleep beyond `_S5`
- Intel VT-d / DMAR runtime ownership moved out of `acpid` - clean Intel VT-d / DMAR runtime ownership outside `acpid`
- better PCI host bridge / interrupt-link handling - better PCI host bridge / interrupt-link handling
- quirk convergence in `redox-driver-sys` - quirk convergence in `redox-driver-sys`
- USB composite/interface correctness - USB composite/interface correctness
@@ -291,45 +288,39 @@ Keep only:
## 2. Implementation order ## 2. Implementation order
1. PCI / IRQ / quirk substrate For current execution order, priority ranking, and acceptance language:
2. ACPI sleep groundwork
3. IOMMU ownership cleanup - use `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md` for PCI / IRQ / low-level
4. PS/2 hardening deltas controller work,
5. USB maturity - use `local/docs/ACPI-IMPROVEMENT-PLAN.md` for ACPI ownership/robustness/sleep work,
6. Wi-Fi bounded helper extraction - use `local/docs/USB-IMPLEMENTATION-PLAN.md` for USB maturity,
7. GPU/DRM selective orchestration extraction only after hardware proof - and use the Wi-Fi / DRM plans for those later subsystem-specific phases.
This file should keep the **borrowing and rewrite policy** for those phases, not act as a competing
execution roadmap.
## 3. Work package backlog ## 3. Work package backlog
### Phase A — PCI / IRQ / quirk substrate ### Phase A — PCI / IRQ / quirk substrate
**Primary targets** For Phase A execution detail, file targets, acceptance criteria, and validation language, use the
- `local/recipes/drivers/redox-driver-sys/source/src/pci.rs` canonical PCI/IRQ plan:
- `.../src/irq.rs`
- `.../src/quirks/*`
- `recipes/core/base/source/drivers/pcid/src/main.rs`
- `.../src/driver_interface/irq_helpers.rs`
**Implement** - `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md`
- typed PCI capability walkers
- BAR/resource validation helpers
- MSI/MSI-X mode selection helpers
- quirk pass model in Rust
- interrupt mode reporting
**Acceptance** This document keeps only the borrowing-policy summary for Phase A:
- build clean
- unit tests for malformed capability chains and BAR layout
- interrupt mode logged deterministically
**Current implementation progress (2026-04-18)** - borrow Linux capability/fixup/MSI semantics as donor knowledge,
- `redox-driver-sys` fast PCI enumeration now parses capability chains from config bytes in the - reimplement them as typed Rust helpers in `redox-driver-sys` / `pcid`,
read-only path, so enumerated `PciDeviceInfo` records no longer default to empty capability - prefer data-driven quirks over daemon-local special cases,
lists. - and do not import Linux generic IRQ-core ownership into Red Bear.
- `PciDeviceInfo` now exposes a quirk-aware interrupt support summary (`none`, `legacy`, `msi`,
`msix`) that can serve as the common policy input for future `pcid`/driver convergence. Current implementation progress remains true but is intentionally summarized only:
- Host-runnable unit coverage exists for capability-chain parsing, malformed next-pointer handling,
and interrupt-support selection behavior. - shared PCI capability parsing and interrupt-support summarization are already present in
`redox-driver-sys`,
- and the remaining rollout/convergence work now belongs to the canonical IRQ plan rather than this
borrowing-policy document.
### Phase B — ACPI / suspend / IOMMU ### Phase B — ACPI / suspend / IOMMU
@@ -351,17 +342,18 @@ Keep only:
- IOMMU ownership clarified and moved out of `acpid` - IOMMU ownership clarified and moved out of `acpid`
**Current implementation progress (2026-04-18)** **Current implementation progress (2026-04-18)**
- `acpid` now has an explicit `SleepTarget` / `SleepPhase` model in userspace, covering `S1`, `S3`, - `acpid` now has an explicit `SleepTarget` / `SleepPhase` model in userspace, naming `S1`, `S3`,
`S4`, and `S5` as named Red Bear sleep targets. `S4`, and `S5` as Red Bear sleep targets.
- The real shutdown path now routes through that target model, while non-`S5` targets are - The real shutdown path now routes through that target model, while non-`S5` targets are
recognized but reported as groundwork-only rather than silently ignored. recognized but still remain groundwork-only rather than implemented suspend/resume support.
- Unit coverage exists for sleep-target parsing, AML sleep-object naming, and the current - Unit coverage exists for sleep-target parsing, AML sleep-object naming, and the current
Red Bear-native rule that only `S5` is treated as an implemented soft-off path today. Red Bear-native rule that only `S5` is treated as an implemented soft-off path today.
- This is still groundwork only: there is no claim of full suspend/resume or sleep eventing yet, - This is still groundwork only: there is no claim of full suspend/resume or sleep eventing yet,
and Linux suspend sequencing remains reference material rather than imported structure. and Linux suspend sequencing remains reference material rather than imported structure.
- The `iommu` daemon now also detects the presence of a kernel ACPI `DMAR` table and reports that - The `iommu` daemon now also detects the presence of a kernel ACPI `DMAR` table and reports that
Intel VT-d runtime ownership should converge there instead of remaining conceptually attached to Intel VT-d runtime ownership should converge there instead of remaining conceptually attached to
the old transitional `acpid` DMAR code. the old transitional `acpid` DMAR code, but that ownership is not yet cleanly closed in the
current tree.
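The sleep-target rule described in the bullets above (all four targets are named, only `S5` is treated as implemented) can be illustrated with a short sketch. `SleepTarget` is named in this plan, but the methods, `SleepError`, and `request_sleep` below are assumptions for illustration and may not match the actual `acpid` code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SleepTarget {
    S1,
    S3,
    S4,
    S5,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SleepError {
    /// The target is recognized but has no implemented transition yet.
    GroundworkOnly(SleepTarget),
}

impl SleepTarget {
    /// Parse a user-facing name such as "S3" or "s5" (hypothetical format).
    pub fn parse(name: &str) -> Option<Self> {
        match name.to_ascii_uppercase().as_str() {
            "S1" => Some(Self::S1),
            "S3" => Some(Self::S3),
            "S4" => Some(Self::S4),
            "S5" => Some(Self::S5),
            _ => None,
        }
    }

    /// Name of the ACPI sleep-state package object for this target.
    pub fn aml_object(self) -> &'static str {
        match self {
            Self::S1 => "\\_S1",
            Self::S3 => "\\_S3",
            Self::S4 => "\\_S4",
            Self::S5 => "\\_S5",
        }
    }

    /// Only soft-off is treated as implemented today.
    pub fn is_implemented(self) -> bool {
        matches!(self, Self::S5)
    }
}

/// Return the AML object to evaluate, or an explicit groundwork-only error
/// instead of silently ignoring an unsupported target.
pub fn request_sleep(target: SleepTarget) -> Result<&'static str, SleepError> {
    if target.is_implemented() {
        Ok(target.aml_object())
    } else {
        Err(SleepError::GroundworkOnly(target))
    }
}
```

Reporting non-`S5` targets through a typed error keeps the groundwork-only status visible to callers without implying suspend/resume support.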
### Phase C — PS/2 / USB / storage ### Phase C — PS/2 / USB / storage
+2 -2
@@ -27,9 +27,9 @@ USB plan uses:
| Profile | Intent | Key Fragments | Current support language | | Profile | Intent | Key Fragments | Current support language |
|---|---|---|---| |---|---|---|---|
| `redbear-mini` | Console + storage + wired-network baseline | `minimal.toml`, `redbear-legacy-base.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / primary validation baseline / DHCP boot profile enabled / input-runtime substrate wired / USB: daemons built via base and targeted for bounded mini-profile validation | | `redbear-mini` | Console + storage + wired-network baseline | `minimal.toml`, `redbear-legacy-base.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / primary validation baseline / DHCP boot profile enabled / input-runtime substrate wired / USB: daemons built via base and targeted for bounded mini-profile validation |
| `redbear-live-mini` | Live/recovery form of the mini baseline | `redbear-live-minimal.toml`, `redbear-minimal.toml` | builds / live media variant of the mini profile / desktop graphics intentionally absent | | `redbear-live-mini` | Live/recovery form of the mini baseline | `redbear-live-minimal.toml`, `redbear-minimal.toml` | builds / live media variant of the mini profile for real bare metal / desktop graphics intentionally absent |
| `redbear-full` | Desktop/network/session plumbing target | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-legacy-desktop.toml`, `redbear-device-services.toml`, `redbear-netctl.toml`, `redbear-greeter-services.toml` | builds / boots in QEMU / active desktop-capable compile target / support claims remain evidence-qualified | | `redbear-full` | Desktop/network/session plumbing target | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-legacy-desktop.toml`, `redbear-device-services.toml`, `redbear-netctl.toml`, `redbear-greeter-services.toml` | builds / boots in QEMU / active desktop-capable compile target / support claims remain evidence-qualified |
| `redbear-live-full` | Live/recovery form of the full desktop target | `redbear-live-full.toml`, `redbear-full.toml` | builds / live desktop-capable image / inherits the full target surface | | `redbear-live-full` | Live/recovery form of the full desktop target | `redbear-live-full.toml`, `redbear-full.toml` | builds / live desktop-capable image for real bare metal / inherits the full target surface |
## Profile Notes ## Profile Notes
+37 -48
@@ -40,7 +40,7 @@
| **libinput** | ✅ | 1.30.2 with comprehensive redox.patch | | **libinput** | ✅ | 1.30.2 with comprehensive redox.patch |
| **D-Bus** | ✅ | 1.16.2, libdbus-1.so | | **D-Bus** | ✅ | 1.16.2, libdbus-1.so |
| **KF6 Frameworks** | ✅ 32/32 | All frameworks built | | **KF6 Frameworks** | ✅ 32/32 | All frameworks built |
| **KWin** | 🔄 | Reduced recipe path now uses real `libxcvt`, `libepoxy`, `lcms2`, and honest `libudev.so` / `libdisplay-info.so` provider linkage, but runtime/session proof is still incomplete | | **KWin** | 🔄 | Reduced recipe path now uses real `libxcvt`, `libepoxy`, `lcms2`, and honest `libudev.so` / `libdisplay-info.so` provider linkage; guest-side Qt plugin loading is fixed, but broader KWin runtime/session stability is still incomplete |
| **Hardware acceleration** | ❌ | PRIME/DMA-BUF scheme ioctls implemented; blocked on GPU command submission (CS ioctl) | | **Hardware acceleration** | ❌ | PRIME/DMA-BUF scheme ioctls implemented; blocked on GPU command submission (CS ioctl) |
--- ---
@@ -221,54 +221,42 @@ Plus: QML debug plugins, QtQuick/QML modules staged.
--- ---
## relibc Gaps — Complete Inventory ## relibc Status — Qt-facing Summary
### Resolved (workarounds in recipe/patch) The canonical relibc assessment now lives in:
| Gap | Workaround | Location | - `local/docs/RELIBC-COMPLETENESS-AND-ENHANCEMENT-PLAN.md`
|-----|-----------|----------| - `local/docs/RELIBC-IPC-ASSESSMENT-AND-IMPROVEMENT-PLAN.md`
| `sys/statfs.h` missing | Wrapper → `sys/statvfs.h` (typedef, #define) | recipe.toml heredoc |
| `ELFMAG` string missing from `elf.h` | `#define ELFMAG "\177ELF"` prepended to source | recipe.toml printf |
| `resolv.h` availability | Minimal relibc header now exists in-tree | verify downstream consumers against the generated header |
| `unlinkat()`/`linkat()` missing | Inline stubs with `AT_FDCWD` | redox.patch |
| `byteswap.h` missing | Skip include on Redox | redox.patch (brg_endian.h) |
| Float16 soft-fp (`__truncsfhf2` etc.) | Custom IEEE 754 C implementation | redox.patch (qt_float16_shims.c) |
| Half-float comparison (`__eqhf2` etc.) | Custom IEEE 754 C implementation | redox.patch (same file) |
| `openat()` not available | `#ifdef Q_OS_REDOX` guard | redox.patch (qcore_unix_p.h) |
### Networking Surface — Now Present, Still Needs Runtime Validation This Qt status document should use those files as the source of truth instead of carrying a second,
more optimistic relibc inventory.
| Gap | Impact | relibc File to Modify | ### Current Qt-facing relibc position
|-----|--------|----------------------|
| `resolv.h` | Present in relibc as a minimal source-visible header | `recipes/core/relibc/source/src/header/resolv/` |
| `in6_pktinfo` / `ipv6_mreq` | Present in relibc | `recipes/core/relibc/source/src/header/netinet_in/mod.rs` |
| `SIOCGIF*` ioctls | Present for the current Redox `eth0` model | `recipes/core/relibc/source/src/header/sys_ioctl/redox/mod.rs` |
| `::ioctl` path | Present in relibc Redox ioctl implementation | `recipes/core/relibc/source/src/header/sys_ioctl/` |
| `ifreq` / `ifconf` / `ifaddrs` | Present for the current Redox `eth0` model | `recipes/core/relibc/source/src/header/net_if/mod.rs`, `recipes/core/relibc/source/src/header/ifaddrs/mod.rs` |
### Unresolved — Blocks Other Qt Modules/Features | Area | Current state for Qt/KDE work |
|-----|-------------------------------|
| fd-event APIs | The active relibc recipe patch chain provides bounded `eventfd`, `signalfd`, `timerfd`, and `waitid()` support. These should be treated as recipe-applied compatibility, not plain-source upstream convergence. |
| shared memory and semaphores | The active build now includes a bounded named-semaphore path through the relibc recipe surface. Broader SysV shm/sem work remains deferred outside the current concrete wave, and runtime trust for real Qt/KDE consumer paths is still the key remaining question. |
| interface discovery | `ifaddrs` / `net_if` support currently comes from a bounded Red Bear patch layer with a synthetic `loopback` + `eth0` model, not full live interface enumeration. |
| still-missing surfaces | Message queues remain absent, and broader relibc completeness work is still open in the live source tree. |
| Gap | Impact | Module Blocked | ### Qt/KDE-facing relibc gaps that still matter
|-----|--------|---------------|
| Gap | Impact | Module blocked or pressured |
|-----|--------|----------------------------|
| broader networking runtime validation | QtNetwork end-to-end behavior | QtNetwork | | broader networking runtime validation | QtNetwork end-to-end behavior | QtNetwork |
| GPU hardware display validation | Hardware-accelerated rendering | QtOpenGL hardware path | | broader shared-memory validation beyond the existing `shm_open()` path | shared memory confidence | QSharedMemory |
| broader shared-memory validation beyond the existing `shm_open()` path | Shared memory | QSharedMemory | | broader semaphore and SysV-IPC runtime validation | semaphore and IPC confidence | QSystemSemaphore and direct KDE consumers |
| broader semaphore/system-IPC validation beyond the new `sem_open()` path | POSIX semaphores | QSystemSemaphore | | process/runtime validation beyond the bounded `waitid()` path | process-control confidence | QProcess |
| process/runtime validation beyond the new bounded `waitid()` path | QProcess internals | QProcess | | GPU hardware display validation | hardware-accelerated rendering | QtOpenGL hardware path |
Recent relibc implementation progress in this repo now also includes: The practical takeaway is that relibc is no longer blocked on raw fd-event API absence, but Qt/KDE
still depends on more proof and more semantic hardening than the current patch-applied surface alone
provides.
- source-visible plus strict Redox-target runtime-tested `signalfd`, `timerfd`, `eventfd`, `open_memstream`, `F_DUPFD_CLOEXEC`, and `MSG_NOSIGNAL` In the current pass, the active bounded relibc wave was also revalidated through focused relibc
- a bounded `waitid()` path in relibc, replacing the old Qt-side waitid stub workaround tests for `eventfd`, `signalfd`, `timerfd`, `waitid`, named and unnamed semaphores,
- a bounded `eth0`-backed `net_if` / `ifaddrs` path in relibc `open_memstream`, and the bounded `ifaddrs` surface.
- a minimal source-visible `resolv.h` surface in relibc
- bounded `sys/ipc.h` / `sys/shm.h` surfaces for the `IPC_PRIVATE` shared-memory workflow
Current downstream build proof in this repo now includes:
- `libwayland` cooking successfully against the updated relibc surfaces
- qtbase configuring, building, and staging with `process`, `sharedmemory`, and `systemsemaphore` enabled
| Fontconfig | Advanced font selection | QtGui (bundled FreeType works for basic) |
--- ---
@@ -283,19 +271,19 @@ Current downstream build proof in this repo now includes:
- Unblocks: kf6-kdbusaddons, kf6-kservice, kf6-kpackage, kf6-kglobalaccel - Unblocks: kf6-kdbusaddons, kf6-kservice, kf6-kpackage, kf6-kglobalaccel
- D-Bus plan: `local/docs/DBUS-INTEGRATION-PLAN.md` — redbear-sessiond login1 broker + D-Bus service infrastructure for KDE Plasma - D-Bus plan: `local/docs/DBUS-INTEGRATION-PLAN.md` — redbear-sessiond login1 broker + D-Bus service infrastructure for KDE Plasma
**redbear-sessiond:** Implemented. Rust daemon at `local/recipes/system/redbear-sessiond/` using zbus 5, serving `org.freedesktop.login1` Manager/Session/Seat interfaces on the system bus. Maps `TakeDevice(major, minor)` to Redox scheme paths (`/scheme/drm/card0`, `/dev/input/eventN`). Config wired in `config/redbear-kde.toml` with init service at slot 13. **redbear-sessiond:** Implemented. Rust daemon at `local/recipes/system/redbear-sessiond/` using zbus 5, serving `org.freedesktop.login1` Manager/Session/Seat interfaces on the system bus. Maps `TakeDevice(major, minor)` to Redox scheme paths (`/scheme/drm/card0`, `/dev/input/eventN`). Config wired in `config/redbear-full.toml` with init service at slot 13.
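
A minimal sketch of the kind of `TakeDevice(major, minor)` lookup described above. It assumes the conventional Linux device numbers (DRM major 226, input major 13, `eventN` at minor 64 + N); the real table inside `redbear-sessiond` is not shown in this document and may differ, and `take_device_path` is a hypothetical name used only for illustration.

```python
# Assumed Linux-style device numbering; not taken from redbear-sessiond itself.
DRM_MAJOR = 226         # conventional major for DRM character devices
INPUT_MAJOR = 13        # conventional major for input devices
EVDEV_MINOR_BASE = 64   # conventional minor of event0

def take_device_path(major, minor):
    """Map a (major, minor) pair to a device path, or None if unknown."""
    if major == DRM_MAJOR and 0 <= minor < 64:
        return f"/scheme/drm/card{minor}"
    if major == INPUT_MAJOR and minor >= EVDEV_MINOR_BASE:
        return f"/dev/input/event{minor - EVDEV_MINOR_BASE}"
    return None  # the caller decides how to report an unsupported device

print(take_device_path(226, 0))
print(take_device_path(13, 67))
```

Returning `None` for unknown pairs keeps the refusal decision with the caller, which seems the safer default for a session broker.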
**qdbuscpp2xml/qdbusxml2cpp provisioning:** Qt host build has `FEATURE_dbus=OFF` with these tools disabled. KDE recipes provision them via symlinks: kf6-kdbusaddons falls back to `/usr/bin/qdbuscpp2xml` and `/usr/bin/qdbusxml2cpp` from the host system. This works for cross-compilation but is not a long-term solution. Future improvement: enable FEATURE_dbus=ON in host build once D-Bus session bus validation passes. **qdbuscpp2xml/qdbusxml2cpp provisioning:** Qt host build has `FEATURE_dbus=OFF` with these tools disabled. KDE recipes provision them via symlinks: kf6-kdbusaddons falls back to `/usr/bin/qdbuscpp2xml` and `/usr/bin/qdbusxml2cpp` from the host system. This works for cross-compilation but is not a long-term solution. Future improvement: enable FEATURE_dbus=ON in host build once D-Bus session bus validation passes.
**KF6 D-Bus re-enablement roadmap:** 15 KF6 components currently build with `-DUSE_DBUS=OFF`. Re-enablement is gated on D-Bus service availability: kf6-knotifications needs `org.freedesktop.Notifications` (DB-2, now enabled against a stub notification daemon), kf6-solid needs runtime-validated `org.freedesktop.UPower` + `org.freedesktop.UDisks2` enumeration (DB-3, both daemons now expose bounded real enumeration). The runtime proof harness is now in place, but `kf6-solid` still keeps `-DUSE_DBUS=OFF`, `-DBUILD_DEVICE_BACKEND_upower=OFF`, and `-DBUILD_DEVICE_BACKEND_udisks2=OFF` until `solid-hardware6`/Phase 6 validation can confirm the consumer path. kf6-kio and 10 others need full desktop services (DB-5). See `local/docs/DBUS-INTEGRATION-PLAN.md` Section 14 for the complete matrix. **KF6 D-Bus re-enablement roadmap:** 15 KF6 components currently build with `-DUSE_DBUS=OFF`. Re-enablement is gated on D-Bus service availability: kf6-knotifications needs `org.freedesktop.Notifications` (DB-2, now enabled against a stub notification daemon), while kf6-solid needs `org.freedesktop.UPower` + `org.freedesktop.UDisks2` only after those services are consumer-trustworthy. The current UDisks2 path is much closer to that bar than UPower; the ACPI-backed UPower surface remains provisional until Wave 3 in `local/docs/ACPI-IMPROVEMENT-PLAN.md` closes. The runtime proof harness is now in place, but `kf6-solid` still keeps `-DUSE_DBUS=OFF`, `-DBUILD_DEVICE_BACKEND_upower=OFF`, and `-DBUILD_DEVICE_BACKEND_udisks2=OFF` until `solid-hardware6`/Phase 6 validation can confirm the consumer path. kf6-kio and 10 others need full desktop services (DB-5). See `local/docs/DBUS-INTEGRATION-PLAN.md` Section 14 for the complete matrix.
**Key insight:** QtDBus is NOT the gap — Qt6DBus builds and kf6-kdbusaddons provides the KDE convenience layer. The gap is the freedesktop service contracts (login1, Notifications, UPower, UDisks2, PolicyKit) that need Redox-native implementations. NetworkManager is deferred; Red Bear OS uses `redbear-netctl` for now. **Key insight:** QtDBus is NOT the gap — Qt6DBus builds and kf6-kdbusaddons provides the KDE convenience layer. The gap is the freedesktop service contracts (login1, Notifications, UPower, UDisks2, PolicyKit) that need Redox-native implementations. NetworkManager is deferred; Red Bear OS uses `redbear-netctl` for now.
### Phase 2b — qtwayland Module (🔄 Building) ### Phase 2b — qtwayland Module (◐ Client path complete, compositor slice intentionally reduced)
- Recipe at `recipes/wip/qt/qtwayland/recipe.toml` - Recipe at `recipes/wip/qt/qtwayland/recipe.toml`
- Uses redox-toolchain.cmake + host Qt build pattern - Uses redox-toolchain.cmake + host Qt build pattern
- Wayland compositor disabled, client-only build - Wayland client path is built and staged; compositor-side coverage remains intentionally reduced in the recipe
- OpenGL guards applied for software rendering - OpenGL guards applied for software rendering
### Phase 2c — Input Stack (✅ COMPLETE) ### Phase 2c — Input Stack (✅ COMPLETE)
@@ -366,8 +354,9 @@ Current truth for Phase 4:
not proof of a hardware-accelerated desktop session not proof of a hardware-accelerated desktop session
- the current QEMU validation harness is still software-rendered (`llvmpipe`) and should be treated - the current QEMU validation harness is still software-rendered (`llvmpipe`) and should be treated
as a bounded regression/test path, not as the final acceleration proof target as a bounded regression/test path, not as the final acceleration proof target
- the in-repo Phase 4 runtime check currently still fails in `qt6-bootstrap-check` during early Qt - the earlier guest-side Qt bootstrap/plugin blocker is now fixed: bounded guest checks load both
startup, so even the bounded software-path runtime proof remains incomplete `libqminimal.so` and KWin's `qwayland-org.kde.kwin.qpa.so`, but compositor-backed runtime proof
is still incomplete because KWin/session stability remains open
- true hardware-accelerated desktop readiness still requires GPU command submission (CS ioctl) plus real - true hardware-accelerated desktop readiness still requires GPU command submission (CS ioctl) plus real
AMD/Intel hardware validation through the DRM → GBM/EGL → compositor → Qt client path AMD/Intel hardware validation through the DRM → GBM/EGL → compositor → Qt client path
(PRIME/DMA-BUF cross-process buffer sharing is implemented at scheme level) (PRIME/DMA-BUF cross-process buffer sharing is implemented at scheme level)
@@ -444,8 +433,8 @@ Phase 1 ✅ (qtbase + qtdeclarative + qtsvg)
4. **relibc / graphics surface still incomplete for runtime** — the build-side `open_memstream` and Wayland-facing header export path now work, 4. **relibc / graphics surface still incomplete for runtime** — the build-side `open_memstream` and Wayland-facing header export path now work,
and DMA-BUF ioctls plus a bounded private CS surface now exist, but real sync objects/shared fence semantics and broader graphics runtime validation are still unavailable. and DMA-BUF ioctls plus a bounded private CS surface now exist, but real sync objects/shared fence semantics and broader graphics runtime validation are still unavailable.
5. **KDE Plasma does NOT compile or run end-to-end** — KWin, plasma-workspace, plasma-desktop recipes exist, 5. **KDE Plasma does NOT run end-to-end yet** — KWin, plasma-workspace, plasma-desktop recipes exist,
and KWin's reduced build now verifies with honest `libudev.so` / `libdisplay-info.so` linkage, but runtime integration, compositor validation, and broader Plasma session proof are still missing. and KWin's reduced build now verifies with honest `libudev.so` / `libdisplay-info.so` linkage and a successful current `kwin` cook, but runtime integration, compositor validation, and broader Plasma session proof are still missing.
## Honest Status Assessment ## Honest Status Assessment
+12 -1
View File
@@ -36,9 +36,10 @@ are hardware-validated at runtime.
- **Networking** — stack state, connected flag, interface, MAC, IP/CIDR, DNS, default route, - **Networking** — stack state, connected flag, interface, MAC, IP/CIDR, DNS, default route,
active `netctl` profile, visible `network.*` schemes, Wi-Fi control/firmware/transport surfaces, active `netctl` profile, visible `network.*` schemes, Wi-Fi control/firmware/transport surfaces,
and bounded Bluetooth transport/control visibility and bounded Bluetooth transport/control visibility
- **Hardware** — PCI device count, USB controller count, DRM card count, RTL8125 PCI visibility
- **Hardware** — PCI device count, USB controller count, DRM card count, RTL8125 PCI visibility, - **Hardware** — PCI device count, USB controller count, DRM card count, RTL8125 PCI visibility,
VirtIO NIC visibility for VM baselines VirtIO NIC visibility for VM baselines
- **Hardware** — bounded PCI interrupt-support summary (`none`, `legacy`, `msi`, `msix`) derived
from the shared `redox-driver-sys` parser instead of a second local decoder
- **Integrations** — tools, daemons, and integration paths such as `lspci`, `lsusb`, `netctl`, - **Integrations** — tools, daemons, and integration paths such as `lspci`, `lsusb`, `netctl`,
`pcid-spawner`, `smolnetd`, `firmware-loader`, `udev-shim`, `evdevd`, `redox-drm`, `pcid-spawner`, `smolnetd`, `firmware-loader`, `udev-shim`, `evdevd`, `redox-drm`,
`redbear-wifictl`, `redbear-btusb`, `redbear-btctl`, and the native RTL8125 and VirtIO `redbear-wifictl`, `redbear-btusb`, `redbear-btctl`, and the native RTL8125 and VirtIO
@@ -71,5 +72,15 @@ That includes new:
Recent examples include the Wi-Fi control-plane surfaces and the bounded Bluetooth first-slice Recent examples include the Wi-Fi control-plane surfaces and the bounded Bluetooth first-slice
surfaces, both of which extend the runtime report without over-claiming hardware validation. surfaces, both of which extend the runtime report without over-claiming hardware validation.
Recent PCI/IRQ examples now also include:
- aggregate interrupt-support counts in the hardware section,
- bounded `redbear-upower` integration reporting tied to the live `/scheme/acpi/power` surface,
- the distinction between “PCI device visible” and “PCI interrupt mode/runtime proof is still a
bounded claim”,
- and normalized bounded proof outputs from the MSI-X and xHCI QEMU helpers (`IRQ_DRIVER`,
`IRQ_MODE`, `IRQ_REASON`, `IRQ_LOG`) so the current proof surface says which mode was actually
observed and why the driver believed it was using that path.
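
A minimal sketch of consuming the normalized proof fields named above. The source names the four keys but not the line format, so the one-`KEY=value`-per-line layout, the `parse_irq_proof` name, and every sample value are assumptions for illustration only.

```python
REQUIRED_KEYS = ("IRQ_DRIVER", "IRQ_MODE", "IRQ_REASON", "IRQ_LOG")

def parse_irq_proof(text):
    """Collect the IRQ_* fields from helper output; raise if any is missing."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key in REQUIRED_KEYS:
            fields[key] = value
    missing = [k for k in REQUIRED_KEYS if k not in fields]
    if missing:
        raise ValueError("incomplete IRQ proof, missing: " + ", ".join(missing))
    return fields

# Hypothetical helper output; the values are placeholders, not recorded results.
sample = """\
unrelated boot output
IRQ_DRIVER=xhcid
IRQ_MODE=msix
IRQ_REASON=hypothetical reason text
IRQ_LOG=hypothetical/log/path
"""
proof = parse_irq_proof(sample)
print(proof["IRQ_DRIVER"], proof["IRQ_MODE"])
```

Failing hard on a missing key matches the bounded-claim posture: a partial record should not be reported as a proof.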
The goal is for `redbear-info` to remain the first command users run when they need to understand The goal is for `redbear-info` to remain the first command users run when they need to understand
the state of a Red Bear system. the state of a Red Bear system.
@@ -1,278 +1,50 @@
# Red Bear OS relibc Completeness and Enhancement Plan # Red Bear OS relibc Assessment and Improvement Plan
## Purpose ## Purpose
This document assesses relibc in Red Bear OS for **strengths**, **deficiencies**, **subsystem-facing This document is the canonical Red Bear assessment of relibc quality, completeness, and robustness.
gaps**, and **overall quality**, then defines a practical plan for improving it.
The goal is not to treat relibc as a generic libc project. The goal is to describe: It is intentionally stricter than older relibc notes. This pass is grounded in what is visible in:
- what is already strong, - the upstream-owned working tree under `recipes/core/relibc/source/`,
- what still depends on active local overlay state rather than upstream relibc itself, - the active relibc recipe patch list in `recipes/core/relibc/recipe.toml`,
- what is still incomplete or weak, - the durable Red Bear patch carriers under `local/patches/relibc/`,
- what downstream subsystems still depend on relibc improvement, - and the tests added by the active patch chain.
- and what order of work best improves real system capability.
This is a Red Bear-specific document. It is grounded in the current repo state rather than older, It does **not** flatten those evidence types into one generic claim of "implemented".
pre-correction roadmap assumptions.
## Evidence Model ## Evidence model
This plan uses four evidence buckets and does **not** treat them as equivalent: Use these labels consistently when describing relibc in this repository:
- **source-visible** — behavior visible directly in the current relibc source tree - **plain-source-visible**: present in the current upstream-owned `recipes/core/relibc/source/` tree without relying on recipe patch replay
- **patch-carried** — behavior carried in the active `local/patches/relibc/*.patch` recipe inputs rather than upstream relibc itself - **recipe-applied**: added only when the active relibc recipe replays Red Bear patch carriers
- **build-visible downstream** — downstream packages now compile because the libc surface exists - **test-present**: test coverage exists in the source tree or the active patch chain
- **runtime-validated** — behavior has been exercised successfully in real downstream/runtime paths - **documented downstream build evidence**: another in-repo document records downstream build success, but that success was not re-executed as part of this documentation pass
- **runtime-unrevalidated in this pass**: do not describe as runtime-trusted here unless this review actually reran it
This distinction matters because relibc's current problem is often **not** “API absent,” but the gap This distinction matters because the largest relibc documentation problem in the repo was overclaiming
between **implemented**, **patch-carried**, **build-proven**, and **runtime-trusted**. plain-source convergence when the active build still depends on a substantial recipe-applied patch
chain.
## Upstream vs Red Bear ownership ## Ownership boundary
For relibc, the ownership boundary must stay explicit: - `recipes/core/relibc/source/` is an upstream-owned working tree and may be replaced on refresh.
- `recipes/core/relibc/recipe.toml` defines the currently active relibc build surface.
- `local/patches/relibc/` is the durable Red Bear compatibility carrier set.
- `local/docs/` is the durable explanation of what Red Bear currently depends on and why.
- `recipes/core/relibc/source/` is the live upstream-owned working tree used for actual build and For relibc, the honest maintenance target is:
validation
- the active Red Bear-owned durable relibc compatibility carrier is the recipe-replayed
`local/patches/relibc/*.patch` set; in the current tree that active replay has narrowed to
`local/patches/relibc/redox.patch`
- older `local/patches/relibc/P3-*.patch` files are historical bring-up references unless a current
recipe still replays them
- `local/docs/...` is the durable explanation of what those changes mean and how to reapply them
That means a relibc change is not truly preserved until its ownership is explicit in the right > fresh upstream relibc sources can be refetched, the active Red Bear relibc patch chain can be
place: > replayed, and the same intended build surface can be reconstructed.
1. if upstream now owns the behavior, the live relibc source tree is the canonical implementation ## Current implementation assessment
2. if Red Bear still owns a unique delta, it must also exist in the active
`local/patches/relibc/` recipe input set so the same result can be recreated after an upstream
refresh
The repo standard for success is not merely “the current source tree builds.” The standard is: ### 1. Plain source and active build are materially different
> we can fetch fresh upstream relibc sources, reapply the active Red Bear relibc patch carriers, and still The current upstream-owned header tree still contains clear incompleteness markers in
> rebuild the same working result. `recipes/core/relibc/source/src/header/mod.rs`, including:
Any relibc work that exists only under `recipes/core/relibc/source/` should therefore be treated as
validated-but-not-yet-preserved.
Because relibc is also one of the fastest-moving upstream areas, Red Bear should apply one more
rule here:
> if a Red Bear relibc patch solves a problem that upstream has already solved, prefer the upstream
> solution and retire or reduce the local patch.
The goal is durable compatibility, not a permanent relibc fork.
## Current Repo State
> **Implementation note (current Red Bear tree):** this repo pass moved several relibc items from
> patch-carried-only or downstream-workaround status into source-visible libc behavior. The current
> tree now contains source-visible and strict Redox-target runtime-tested `signalfd`, `timerfd`, `eventfd`, `open_memstream`,
> `F_DUPFD_CLOEXEC`, `MSG_NOSIGNAL`, a bounded `waitid()` path, bounded `RLIMIT_NOFILE` /
> `RLIMIT_MEMLOCK` behavior, a bounded `eth0`-backed `net/if.h` / `ifaddrs.h` view, a source-visible
> `resolv.h` plus bounded `res_query()` / `res_search()` compatibility paths with receive/send
> timeout hardening, a first named-semaphore implementation on top of the existing shm path, and
> bounded `sys/ipc.h` / `sys/shm.h` surfaces for the `IPC_PRIVATE` / `shmget` / `shmat` /
> `shmdt` / `shmctl(IPC_RMID)` workflow.
> **Downstream validation note (current Red Bear tree):** `libwayland` now cooks successfully
> against the updated relibc, and qtbase now configures, builds, and stages with
> `FEATURE_process=ON`, `FEATURE_sharedmemory=ON`, and `FEATURE_systemsemaphore=ON` in the current
> tree. The relibc `tests/` harness also now builds focused Redox-target binaries for `eventfd`,
> `waitid`, `res_init`, `res_query`, `sem_open`, and `shmget`, and the host-target variants of those same
> focused tests now execute successfully under the relibc-built host sysroot. That does not mean
> relibc is complete, but it does mean the implementation has crossed real downstream build/stage
> gates and direct execution-level proof rather than remaining an isolated libc-only pass. The
> current host-side `res_query` proof is still bounded: it compiles, runs, and fails fast under the
> relibc sysroot instead of hanging, but it is not yet a runtime-trusted downstream DNS proof.
>
> **Additional downstream proof (current Red Bear tree):** the in-tree `openssh` recipe now cooks
> successfully against the relibc resolver surface after switching the recipe to the rebuilt relibc
> headers/libraries and removing stale Redox-specific resolver fallbacks from the OpenSSH patch.
> That is still build/stage proof rather than runtime SSH validation, but it demonstrates that real
> consumers can now compile and link `res_init`, `res_query`, and `dn_expand` from relibc.
>
> **Fresh revalidation pass (current Red Bear tree):** the focused host-side relibc proofs were
> rerun for `eventfd`, `waitid`, `res_init`, `res_query`, `sem_open`, and `shmget`; the binaries all
> built, and the executions succeeded for `eventfd`, `waitid`, `res_init`, `sem_open`, and `shmget`
> with the bounded `res_query` test still failing fast rather than hanging. The main downstream
> consumers previously used as evidence were also rerun successfully: `CI=1 ./target/release/repo cook libwayland`,
> `CI=1 ./target/release/repo cook qtbase`, and `CI=1 ./target/release/repo cook openssh` now all
> succeed in the current tree.
>
> **Additional focused coverage (current Red Bear tree):** integrated relibc tests were also added
> for `open_memstream`, SysV semaphores via `semget`/`semop`/`semctl`, `timerfd`, `signalfd`, and
> `eventfd`. On the host-side relibc sysroot, `open_memstream`, `semget`, and the bounded SysV shm
> path execute successfully. On the Redox-target runtime path, the repaired `cookbook_redoxer`
> `write-exec` flow now executes the targeted `eventfd`, `signalfd`, and `timerfd` binaries
> successfully against the staged relibc test tree, and those tests now fail hard if the APIs are
> unavailable. That moves the fd-event APIs from source-visible/build-visible status into explicit
> runtime-tested status for the bounded relibc harness.
>
> **Fresh-upstream reapply proof (current Red Bear tree):** a fresh `repo unfetch relibc` →
> `repo fetch relibc` cycle was used to reconstruct the relibc source tree from upstream-owned
> sources, the durable `local/patches/relibc/` carrier set was reapplied to that fresh tree, and the
> resulting rebuild again supported successful downstream `libwayland` and `qtbase` cooks. That is
> the current proof that Red Bear's relibc work is not only buildable in-place, but also recoverable
> after a fresh upstream source refresh.
> **Current reconstructed-state proof set:** with the refreshed source tree rebuilt from the local
> relibc overlay set, the repo now has successful cookbook evidence for all three layers in order:
> `CI=1 ./target/release/repo cook relibc`, then `CI=1 ./target/release/repo cook libwayland`, then
> `CI=1 ./target/release/repo cook qtbase`. This is the strongest current proof that the relibc
> compatibility work is preserved in the right place for long-term maintenance.
>
> **Current patch-carrier note:** the bounded `ifaddrs` / `net_if` work and the bounded
> `arpa/nameser.h` / `resolv.h` compatibility work are now preserved in the tracked
> `local/patches/relibc/redox.patch` carrier instead of separate transient patch files. The durable
> relibc recipe patch chain therefore consists only of tracked local patch files plus
> `recipes/core/relibc/recipe.toml` wiring.
### Summary
relibc is one of Red Bear's strongest foundational subsystems, but it is not complete.
The current repo shows a relibc that is already strong in:
- broad header/libc surface coverage
- real Redox-native platform integration
- source-visible implementations of the historical Wayland-facing P3 APIs, with patch carriers still retained as sync/upstream artifacts
- enough maturity to unlock major build-side progress in Wayland, Qt, and KDE
- a substantial generic upstream-style test tree
The current repo also shows relibc is still weak in:
- shared memory / SysV IPC completeness
- named semaphores
- process/runtime quality for some downstreams
- networking/resolver/interface completeness
- Redox-target and downstream-runtime validation depth
### Status Matrix
| Area | State | Notes |
|---|---|---|
| Core POSIX/header breadth | **strong / partial** | Large header surface exists, but many TODO headers and feature gaps remain |
| Wayland-facing P3 APIs | **implemented / runtime-tested / bounded** | `signalfd`, `timerfd`, `eventfd`, `open_memstream`, socket flags, and `F_DUPFD_CLOEXEC` now exist in the relibc source tree; strict targeted relibc runtime tests now execute on Redox, but broader consumer semantics still need careful documentation |
| Networking/libc socket surface | **usable / partial** | AF_INET/AF_UNIX paths exist, but interface/reporting/resolver behavior remains narrow |
| Qt/KDE downstream unblockers | **build-side improved / multiple gates crossed** | `QProcess`, `QSharedMemory`, and `QSystemSemaphore` now configure, build, and stage on in-tree qtbase; broader runtime validation is still needed |
| Shared memory / semaphore completeness | **partial** | `shm_open` exists through the Redox shm path, but SysV IPC/shared-memory and named semaphore completeness remain open |
| Process/runtime completeness | **partial** | Some process-facing functionality still uses stubs or downstream workarounds |
| Dedicated test surface | **present / Redox-specific coverage still thin** | relibc has a substantial `source/tests/` tree, but the Red Bear-visible Redox/P3/runtime validation story is still weaker than the generic libc test surface |
| Runtime validation against real consumers | **improved / still bounded** | relibc fd-event runtime tests now execute on Redox; broader desktop consumer semantics still need continued confirmation |
## Strong Points
### 1. relibc already exposes a broad libc/header surface
`recipes/core/relibc/source/src/header/mod.rs` shows a broad libc/header tree with networking,
threading, polling, stdio, locale, signal, socket, time, and many Unix-facing modules already
present.
That means Red Bear should treat relibc work as **quality and completeness hardening**, not as a
greenfield libc effort.
### 2. The historical P3 Wayland-facing API bridge is now source-visible
The local relibc patch carriers documented the APIs that historically blocked Wayland and downstream
consumers. In the current preserved tree, those fd-event and adjacent IPC surfaces are now present
in the active upstream relibc source itself, and the relibc-facing recipes no longer replay the old
standalone P3 carrier set for `eventfd`, `signalfd`, `timerfd`, `waitid`, SysV IPC, or their focused
test files. The active Red Bear relibc recipe replay has narrowed back to the shared
`local/patches/relibc/redox.patch` compatibility delta, while the historical P3 patch files remain
useful as prior bring-up evidence rather than current recipe inputs.
### 3. Focused fd-event proof record
The bounded fd-event runtime proof now has a small tracked record here so it does not depend only on
session history.
Preserved command shape:
- rebuild relibc from tracked carriers: `repo unfetch relibc && repo fetch relibc && repo cook relibc`
- rebuild targeted test package: `TESTBIN=sys_eventfd/eventfd CI=1 ./target/release/repo cook relibc-tests-bins`
- execute inside staged Redox target via `cookbook_redbear_redoxer write-exec`
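
The first two bullets above can be sketched as an ordered, stop-on-first-failure sequence. The command strings are copied from the list; the `run_steps` helper and the injected `runner` callable are hypothetical, and the `write-exec` step is left out because its arguments are not recorded here.

```python
import subprocess

REBUILD_STEPS = [
    "repo unfetch relibc",
    "repo fetch relibc",
    "repo cook relibc",
    "TESTBIN=sys_eventfd/eventfd CI=1 ./target/release/repo cook relibc-tests-bins",
]

def run_steps(steps, runner):
    """Run steps in order; return (completed_steps, first_failed_step_or_None)."""
    completed = []
    for step in steps:
        if runner(step) != 0:
            return completed, step
        completed.append(step)
    return completed, None

def shell_runner(step):
    """Execute one step through the shell and return its exit status."""
    return subprocess.run(step, shell=True).returncode

# Dry run with a fake runner so the sketch works outside the repo checkout.
completed, failed = run_steps(REBUILD_STEPS, lambda step: 0)
print(len(completed), failed)
```

Passing `shell_runner` instead of the lambda would execute the commands for real from the repository root.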
Recorded bounded runtime markers from the current pass:
- `eventfd_runtime_finalfinal_ok`
- `signalfd_runtime_finalfinal_ok`
- `timerfd_runtime_finalfinal_ok`
- `eventfd_runtime_kernelreplay_ok`
These markers should be read as proof of the bounded relibc fd-event harness only. They do not by
themselves claim full Linux-equivalent semantics for every downstream desktop consumer.
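
A small sketch of checking captured output for the recorded markers. It assumes each marker appears as a whitespace-separated token in the log, which this document does not confirm; `missing_markers` is a hypothetical helper name.

```python
EXPECTED_MARKERS = (
    "eventfd_runtime_finalfinal_ok",
    "signalfd_runtime_finalfinal_ok",
    "timerfd_runtime_finalfinal_ok",
    "eventfd_runtime_kernelreplay_ok",
)

def missing_markers(log_text):
    """Return the expected markers that do not appear as tokens in log_text."""
    seen = set(log_text.split())
    return [marker for marker in EXPECTED_MARKERS if marker not in seen]

# Hypothetical partial log: only two of the four markers are present.
partial_log = "eventfd_runtime_finalfinal_ok\ntimerfd_runtime_finalfinal_ok\n"
print(missing_markers(partial_log))
```

An empty result would only show that the bounded harness printed its markers, in line with the caveat above about downstream semantics.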
The upstream-first policy still applies here, but the durable patch-carrier set should be trimmed
only when a fresh upstream refetch plus reapply plus downstream rebuild actually proves the upstream
coverage is sufficient. In the current Red Bear tree, `open_memstream`, `F_DUPFD_CLOEXEC`, and the
socket flag work still need to remain in the relibc overlay set because the clean reconstructed
consumer path still depends on them.
This is one of relibc's strongest current points: Red Bear already has the exact P3 compatibility
surface that older docs used to describe as absent.
The local patches still matter as provenance and sync-upstream carriers for the gaps upstream does
not yet solve, but they should be retired as soon as upstream makes them redundant.
### 3. Downstream build progress proves relibc is materially useful
The current docs consistently show that relibc has already enabled substantial downstream progress:
- `docs/02-GAP-ANALYSIS.md` now marks the P3 bridge as implemented in-tree with strict Redox-target runtime proof for the fd-event slice
- `local/docs/WAYLAND-IMPLEMENTATION-PLAN.md` says the build-side relibc/libwayland bridge is restored and that the remaining blocker is runtime validation, not basic POSIX availability
- `local/docs/QT6-PORT-STATUS.md` treats many earlier relibc blockers as moved from “missing” to “present but still needs downstream validation”
This is a major quality signal: relibc is already strong enough to unlock real build-side subsystem work.
### 4. relibc already has a substantial generic test surface
`recipes/core/relibc/source/tests/` is real and large. It already covers many libc-facing areas such
as:
- `fcntl/`
- `net/` and `netdb/`
- `pthread/`
- `stdio/`
- `sys_mman/`
- `sys_socket/`
- `sys_resource/`
- `time/`
- `unistd/`
That is a genuine strength and should be documented as one.
The remaining weakness is narrower: Red Bear still lacks a strong **Redox-target / P3 API /
downstream-runtime** validation story that is as visible and deliberate as this generic relibc test
tree.
### 5. The current relibc problem is no longer one single blocker
The downstream evidence shows that relibc now has **multiple completeness fronts**:
- Wayland-facing POSIX/event APIs
- Qt/KDE shared memory and semaphore support
- process-facing behavior such as `waitid()`
- networking/resolver completeness
- legacy but still-consumed items such as `sigjmp_buf` and locale/runtime edges
That means the right enhancement plan is no longer “finish one missing API and unblock everything.”
The work has to be triaged by downstream impact.
### 6. The Redox networking model is reflected in relibc
`recipes/core/relibc/source/src/platform/redox/socket.rs` shows a real Redox-native socket/path
model instead of a pure stub implementation. That is another strong point: relibc already knows
about Redox-native runtime behavior.
## Deficiencies and Gaps
### 1. Header coverage is still incomplete in visible source
`recipes/core/relibc/source/src/header/mod.rs` still contains a meaningful backlog of TODO or absent
header surfaces, including examples such as:
- `iconv.h`
- `mqueue.h`
@@ -281,379 +53,269 @@ header surfaces, including examples such as:
- `threads.h`
- `wordexp.h`
Some of these are lower-value than others, but they still show that relibc has real completeness work left. The live source tree also still shows relibc areas that are **not** yet plain-source-complete:
### 2. Named semaphores are now source-visible, but still incomplete - `recipes/core/relibc/source/src/header/semaphore/mod.rs` still contains `todo!("named semaphores")`
- `recipes/core/relibc/source/src/header/ifaddrs/mod.rs` still returns `ENOSYS`
- `recipes/core/relibc/source/src/header/mod.rs` still keeps `sys/ipc.h`, `sys/sem.h`, and `sys/shm.h` behind TODO comments
`recipes/core/relibc/source/src/header/semaphore/mod.rs` is still a clear example of partial completeness. That means older wording such as "now source-visible in the current tree" was too strong for much of
the current relibc surface.
Basic unnamed semaphore paths exist (`sem_init`, `sem_post`, `sem_wait`, `sem_timedwait`, etc.), ### 2. The active relibc build relies on a broad patch chain
and the named semaphore path is now source-visible too:
- `sem_open` The active recipe in `recipes/core/relibc/recipe.toml` currently replays more than `redox.patch`.
- `sem_close` The tracked patch list still includes, among others:
- `sem_unlink`
These are now implemented on top of the existing shm path instead of left as raw `todo!()` stubs. - `redox.patch`
- `P0-strtold-cpp-linkage-and-compat.patch`
- `P3-eventfd.patch`
- `P3-signalfd.patch`
- `P3-signalfd-header.patch`
- `P3-timerfd.patch`
- `P3-waitid.patch`
- `P3-semaphore-fixes.patch`
- `P3-socket-cred.patch`
- `P3-elf64-types.patch`
- `P3-open-memstream.patch`
- `P3-ifaddrs-net_if.patch`
- `P3-fd-event-tests.patch`
The remaining weakness is semantic and validation depth, not pure absence: So the active Red Bear relibc story is still **recipe-applied compatibility plus partial upstream
source**, not a nearly converged plain-source state.
- broader POSIX semaphore semantics are still not strongly runtime-validated ### 3. What the active patch chain actually provides
- downstream configure/runtime behavior still needs continued confirmation
- the SysV semaphore surface remains thinner than a full Unix implementation
This directly affects downstream consumers such as `QSystemSemaphore`. Observed directly from the current patch set:
### 3. Shared memory is present, but not complete enough for downstream GUI/runtime work - `P3-eventfd.patch`: adds `sys/eventfd.h` support through `/scheme/event/eventfd/...`
- `P3-signalfd.patch`: adds `signalfd` / `signalfd4` support through `/scheme/event` plus signal-mask handling
- `P3-timerfd.patch`: adds `sys/timerfd.h` support through `/scheme/time/{clockid}`
- `P3-waitid.patch`: adds a bounded `waitid()` implementation plus a focused test
- `P3-semaphore-fixes.patch`: adds named semaphore support on top of `shm_open()` / `mmap()` and fixes unnamed semaphore error behavior
- `P3-open-memstream.patch`: adds `open_memstream()` plus a focused stdio test
- `P3-ifaddrs-net_if.patch`: adds a bounded `ifaddrs` / `net_if` surface that currently synthesizes only `loopback` and `eth0`
- `P3-fd-event-tests.patch`: adds focused `eventfd`, `signalfd`, and `timerfd` tests
The current relibc source already exposes one meaningful shared-memory path: This is meaningful progress, but it is still a patch-carried compatibility layer, not a finished libc
surface.
- `recipes/core/relibc/source/src/header/sys_mman/mod.rs` provides `shm_open()` and `shm_unlink()` ### 4. Fresh bounded-wave verification in this pass
- on Redox, that path resolves to `/scheme/shm/`
- `recipes/core/base/source/ipcd/src/shm.rs` implements the backing shared-memory scheme
That is a real strength and should not be described as “shared memory absent.” This documentation pass also executed a fresh bounded relibc verification cycle against the active
recipe surface:
The real gap is that shared-memory completeness is still insufficient for broader downstream use: - `./target/release/repo unfetch relibc`
- `./target/release/repo fetch relibc`
- `./target/release/repo cook relibc`
- targeted `relibc-tests-bins` executions for:
- `sys_eventfd/eventfd`
- `sys_signalfd/signalfd`
- `sys_timerfd/timerfd`
- `waitid`
- `semaphore/named`
- `semaphore/unnamed`
- `stdio/open_memstream`
- `ifaddrs/getifaddrs`
- the source tree now has visible `sys/shm.h` / `sys/ipc.h` / `sys/sem.h` modules, but they remain bounded rather than comprehensive These are bounded relibc-target proofs, not broad desktop-session runtime proof. They do, however,
- Qt/KDE-facing docs still treat `shm_open()` / `shmget()`-class behavior as unresolved enough to block full `QSharedMemory` confidence move the active concrete-wave surface from documented intent to directly revalidated recipe behavior.
- the current repo still lacks a strong end-to-end validation story for these paths in desktop consumers
### 4. Resolver and interface-networking completeness are still uneven ## Quality assessment
The downstream scan shows that networking-facing userland still hits relibc gaps beyond raw socket ### Strong points
basics.
Examples from downstream recipes and docs: 1. **The patch carriers are explicit and reviewable.**
The relibc recipe points at named patch files instead of hiding Red Bear behavior in an
untracked working tree.
- `recipes/wip/qt/qtbase/recipe.toml` still leaves QtNetwork disabled because of broader networking/runtime concerns such as `in6_pktinfo` and richer interface semantics, even though minimal `resolv.h` and `arpa/nameser.h` surfaces now exist 2. **Several high-value desktop-facing APIs exist in the active build.**
- `recipes/net/openssh/recipe.toml` and its patch history still call out `resolv.h` `eventfd`, `signalfd`, `timerfd`, `waitid`, and named semaphore support are all represented in the
- `recipes/wip/terminal/tmux/redox.patch` comments out `resolv.h` active patch chain instead of remaining vague TODO items.
- `recipes/libs/glib/redox.patch` still touches resolver-facing includes
### 5. The networking surface is narrower than generic Unix software expects 3. **Focused tests now exist for the active concrete-wave surface and were rerun in this pass.**
The current patch chain now covers `eventfd`, `signalfd`, `timerfd`, `waitid`, named and unnamed
semaphores, `open_memstream`, and the bounded `ifaddrs` view.
The current source still shows important limits that should be named directly: 4. **The build integration point is simple and durable.**
The active surface is controlled centrally from `recipes/core/relibc/recipe.toml` and durable
carriers under `local/patches/relibc/`.
- `recipes/core/relibc/source/src/platform/redox/socket.rs` has AF_INET / AF_UNIX socket handling ### Weak points
- `recipes/core/relibc/source/src/header/net_if/mod.rs` now exposes a bounded `eth0`-backed interface view instead of a permanent `stub`
- `recipes/core/relibc/source/src/header/ifaddrs/mod.rs` now provides a bounded `eth0`-backed `getifaddrs()` path instead of pure `ENOSYS`
- source-visible `resolv.h` / `arpa/nameser.h` plus bounded `res_query()` / `res_search()` compatibility are now present, and at least one real downstream (`openssh`) now builds against them, but broader resolver compatibility is still incomplete
That is enough to support the current Red Bear native network path in a bounded sense, but it is not 1. **The repo has drifted between plain-source truth and documentation truth.**
yet strong enough to claim broad interface-aware compatibility for higher-level consumers. Resolver/ Several canonical docs previously described patch-carried functionality as if it already existed
header gaps and interface-model assumptions still show up in ports such as QtNetwork, OpenSSH, in the plain upstream-owned source tree.
tmux, glib, curl, and libuv.
### 6. Process/runtime completeness is still uneven 2. **The active API surface is broader than its semantic maturity.**
The patch chain exposes interfaces, but several of them are bounded compatibility layers rather
than broad Unix-complete implementations.
The repo still has process/runtime unevenness, but one meaningful consumer-facing gap has now moved: 3. **Patch-chain size is still a maintainability risk.**
The active recipe still depends on a substantial set of P3 carriers. That is workable, but it is
not yet the convergence story older docs implied.
- relibc now provides a bounded `waitid()` implementation over the existing `waitpid` path ## Completeness assessment
- the old Qt-side injected `waitid()` stub has been retired from the Qt recipe layer
The source state needs to be classified carefully: ### Plain-source-visible gaps
- `sigjmp_buf` exists in `recipes/core/relibc/source/include/setjmp.h`, so older downstream comments treating it as absent are better read as compatibility/staleness signals rather than primary source truth Still absent or TODO in the live source tree:
- `getgroups()` has a Redox implementation path in `platform/redox/mod.rs`
- `getrlimit()` is no longer a pure placeholder for all consumers: Red Bear now has bounded `RLIMIT_NOFILE` and `RLIMIT_MEMLOCK` behavior, but broader resource-limit completeness is still weak
So process/runtime completeness should be treated as a real subsystem-quality track, but the plan - `mqueue.h`
must distinguish **missing**, **implemented but weak**, and **stale downstream complaint**. - `sys/msg.h`
- named semaphores in `semaphore.h`
- `ifaddrs` plain-source implementation
- plain-source `sys/ipc.h`, `sys/sem.h`, and `sys/shm.h`
### 7. Source quality still contains many TODO / unimplemented branches ### Recipe-applied but bounded surfaces
The current source has a large amount of unfinished or explicitly deferred behavior across: The active build surface includes several features that should be described as **bounded**, not
fully complete:
- `pthread` - `timerfd`: the patch exposes `TFD_TIMER_CANCEL_ON_SET`, but `timerfd_settime()` only accepts
- `time` `TFD_TIMER_ABSTIME`
- `unistd` - `ifaddrs` / `net_if`: current patch-provided interface enumeration is a fixed `loopback` + `eth0`
- `platform/redox` model, not live system discovery
- `epoll` - `open_memstream`: now active in the recipe-applied surface, but still validated here only through
- `ptrace` focused relibc tests rather than broad downstream usage proof
- locale and stdio internals - named semaphores: implemented through `shm_open()` / `mmap()` as a practical compatibility path,
but not yet a broad semantics-proofed story
This does not mean relibc is unusable. It means completeness and quality work now needs a stronger ### Still-missing areas
triage model instead of treating all missing items as equally important.
### 8. Redox-target and downstream validation remain thin relative to subsystem importance The clearest remaining gaps are still real gaps, not just "needs more runtime proof":
The current repo already contains a substantial generic relibc test tree, but the Red Bear-visible - POSIX message queues
validation story is still thin in the areas that matter most for current subsystem unblockers. - SysV message queues
- broader thread / spawn / iconv / wordexp completeness
Right now much of relibc's confidence in the Red Bear docs still comes from: The broader SysV shm/sem carriers still exist under `local/patches/relibc/`, but they are not part
of the active bounded concrete wave implemented in this pass.
- source inspection ## Robustness assessment
- patch carriers
- build-side downstream success
- limited runtime validation via downstream stacks
That is not enough for a component as central as libc, especially for the Redox-target and Robustness is the weakest part of the current relibc story.
downstream-consumer paths Red Bear depends on.
## Downstream-Blocking Gaps by Subsystem The repo now has a meaningful active patch-applied compatibility surface, but several pieces are
still narrow enough that the safest language is:
### Wayland - useful for bounded downstream compatibility,
- not yet broad semantics-proof,
- and not yet safely describable as a plain-source upstream relibc completion story.
The old “basic POSIX APIs are missing” story is no longer the main one. Concretely:
Current state: - fd-event APIs depend on scheme paths such as `/scheme/event` and `/scheme/time`
- `ifaddrs` currently reports a synthetic interface view rather than live network state
- named semaphores remain a bounded shm-backed path rather than a broader semantics-proofed story
- `signalfd`, `timerfd`, `eventfd`, `open_memstream`, bounded `waitid()`, key socket flags, and the adjacent SysV IPC surfaces are now source-visible in the active relibc tree without needing the old standalone P3 replay set ## Recommended support language
- the active Red Bear relibc replay has narrowed to the shared `redox.patch` compatibility delta while those older P3 files remain historical references
- `libwayland` now rebuilds with a much smaller Redox patch
Remaining blocker: Use this language in project docs unless stronger evidence is gathered:
- runtime validation of the full relibc -> libwayland -> compositor path - **Good:** "The active relibc recipe patch chain provides bounded `eventfd` / `signalfd` /
`timerfd` compatibility for current Red Bear consumers."
- **Good:** "Named semaphores and `ifaddrs` currently exist through recipe-applied Red Bear
compatibility patches, not as plain-source upstream relibc convergence."
- **Avoid:** "These surfaces are now source-visible in the current relibc tree."
- **Avoid:** "relibc is complete for desktop consumers."
So the current relibc task for Wayland is primarily **runtime proof and patch reduction**, not just ## Improvement plan
adding obvious libc symbols.
Current Red Bear evidence is stronger than before: `libwayland` now cooks successfully against the ### Phase R0 — Keep the evidence model honest
rebuilt relibc image produced from the current upstream-backed relibc tree plus the active shared
Red Bear compatibility delta, which means the `signalfd`, `timerfd`, `eventfd`, `stdio.h`, and
`sys/socket.h` surfaces are sufficient for at least one major downstream consumer in the current
rebuild model.
### Qt / KDE Goals:
The Qt/KDE-facing relibc backlog is still substantial. - keep plain-source, recipe-applied, and runtime-proof language distinct
- keep canonical relibc docs aligned with `recipes/core/relibc/recipe.toml`
- stop describing patch-carried functionality as already upstream-visible unless it really is
The biggest libc-facing gaps are: Exit criteria:
- shared memory (`shm_open` / `shmget`) for `QSharedMemory` - relibc docs match the active recipe patch list
- named/system semaphores (`sem_open` / `semget`) for `QSystemSemaphore` - repo-level summaries use bounded/evidence-qualified language
- stronger process/runtime behavior for `QProcess`
- runtime validation of QtNetwork against the current relibc networking surface
- resolver/header completeness (`resolv.h`) and network-interface semantics for QtNetwork
- broader process/runtime validation after the new bounded `waitid()` path
This makes Qt/KDE the clearest downstream consumer pushing relibc from “build-capable” toward ### Phase R1 — Make the active patch chain the explicit build contract
“desktop-capable”.
Current Red Bear evidence is stronger than before here too: qtbase now configures, builds, and Goals:
stages with
`FEATURE_process=ON`, `FEATURE_sharedmemory=ON`, and `FEATURE_systemsemaphore=ON` in the current
tree. The remaining work is therefore less about “make the feature visible at all” and more about
runtime semantics, broader compatibility, and downstream cleanup.
### Networking and interface-aware software - treat the current relibc recipe patch list as the build contract for Red Bear relibc behavior
- review that list regularly against upstream relibc changes
- retire carriers only when the recipe no longer needs them
The current relibc networking model is usable, but still narrow enough that higher-level consumers Exit criteria:
keep carrying workarounds or disabled features.
The newer bounded `eth0`-backed `net_if` / `ifaddrs` work improves the source-visible story, but it - every relibc carrier still replayed by the recipe is documented as active
is still only a first Red Bear-shaped interface view, not a full generic Unix interface model. - every historical-but-not-active carrier is clearly marked historical
This is why the plan should treat networking as **usable but still validation-heavy**, not “done”. ### Phase R2 — Strengthen proof for the patch-applied surface
### General userland / server software Goals:
The downstream scan also shows relibc gaps outside graphics: - keep focused tests for `waitid`, semaphores, and other patch-applied APIs
- expand consumer-facing checks for the APIs Red Bear actually depends on
- avoid treating build success alone as semantics proof
- PostgreSQL and some libraries still carry `sigjmp_buf`-related downstream notes that need revalidation against current headers Exit criteria:
- SQLite still notes `getrlimit()` / `getgroups()` gaps, even though the current source state now splits those two differently
- Apache and other ports still touch semaphore or IPC assumptions
That is important because it means relibc completeness is not only about desktop bring-up. It also - each active compatibility surface names its current proof level and missing proof
affects core application/server breadth.
### Desktop/session path ### Phase R3 — Harden bounded compatibility layers
Session and desktop work depends less on one dramatic relibc gap than on overall libc quality: Highest-value targets:
- process semantics - fd-event semantics that current desktop consumers rely on
- IPC completeness - named semaphore behavior beyond the current narrow shm-backed path
- synchronization primitives - `ifaddrs` / `net_if` behavior beyond the synthetic `loopback` + `eth0` model
- runtime interaction with D-Bus/Qt/Wayland consumers
This is why relibc should be treated as a cross-cutting runtime-quality subsystem, not just a POSIX checklist. Exit criteria:
## Quality Assessment - docs no longer need to caveat these areas as merely synthetic or narrowly bounded unless that
boundedness is intentional and accepted
### What relibc is good at now ### Phase R4 — Decide the real SysV IPC contract
- broad visible libc/header coverage The current bounded SysV shm/sem layer is better than raw absence, but it is not a broad final
- practical Redox-native integration rather than fake stubs everywhere design.
- concrete P3 compatibility work for real downstreams
- enough maturity to unlock major subsystem builds
- a substantial generic test tree
### What relibc is bad at now Decision needed:
- uneven implementation depth - either keep a clearly documented bounded compatibility contract,
- too many TODO/unimplemented branches for a component this central - or implement a broader system-backed contract and test it accordingly.
- patch-carried functionality that is still not strongly reflected in visible source snapshots
- too little Redox-target and downstream-runtime validation relative to the generic test tree
- too much downstream confidence still derived from “compiles” instead of “runtime-proven”
## Enhancement Plan Exit criteria:
### Phase R0 — Evidence and Ownership Cleanup - the repo stops implying broad SysV completeness where only a narrow compatibility slice exists
**Goal**: Make relibc status honest before widening scope. ### Phase R5 — Triage the still-missing surfaces
**What to do**: Priority candidates:
- explicitly track relibc claims as `source-visible`, `patch-carried`, `build-proven`, or `runtime-validated` - message queues,
- keep the P3 patch carriers discoverable and documented as canonical until upstreamed - thread/spawn completeness,
- stop describing relibc gaps with outdated “missing basics” language where the code already exists - other TODO headers that block real consumers rather than theoretical completeness checklists.
**Exit criteria**: Exit criteria:
- subsystem docs consistently distinguish between missing, patch-carried, and runtime-proven relibc behavior - each remaining TODO surface is either implemented, explicitly deferred, or removed from misleading
summary language
--- ### Phase R6 — Converge with upstream where possible
### Phase R1 — Stabilize the newly source-visible P3 APIs Goals:
**Goal**: Keep the newly source-visible P3 APIs aligned with their patch-carrier and downstream expectations. - shrink the relibc patch chain whenever upstream absorbs equivalent behavior
- avoid carrying Red Bear-local relibc deltas longer than necessary
**What to do**: Exit criteria:
- keep `signalfd`, `timerfd`, `eventfd`, `open_memstream`, socket flags, and `F_DUPFD_CLOEXEC` visible and maintained as canonical relibc behavior - the active recipe patch chain is smaller for evidence-based reasons, not for documentation optics
- reduce downstream assumptions that these APIs are still absent
- ensure generated/exported headers stay aligned with the source-visible implementation set
**Exit criteria**: ## Bottom line
- the repo consistently treats these P3 APIs as source-visible functionality that now needs validation and downstream cleanup rather than invention relibc in the current Red Bear repo is neither a greenfield libc nor a nearly converged upstream
story.
--- It is a **partially upstream, materially patch-applied compatibility surface** that already covers
important desktop-facing APIs, but still has real completeness gaps, bounded semantics, and a larger
### Phase R2 — Close the shared-memory and semaphore completeness gap patch-chain dependency than older docs admitted.
**Goal**: Unlock the next meaningful Qt/KDE-facing libc surface.
**What to do**:
- keep the existing `shm_open` / `/scheme/shm/` path explicit and documented
- implement the missing SysV IPC/shared-memory side or document a deliberate non-goal if Red Bear does not want full SysV compatibility
- harden and validate the now source-visible named semaphore support (`sem_open`, `sem_close`, `sem_unlink`)
- close the specific `QSharedMemory` and `QSystemSemaphore` blockers identified in the Qt docs
**Exit criteria**:
- the Qt/KDE docs no longer list shared memory and named semaphores as unresolved relibc blockers
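
A focused check of the named-semaphore calls listed above (`sem_open`, `sem_close`, `sem_unlink`) could look roughly like this. It is a minimal sketch assuming standard POSIX semantics. The helper name `named_sem_roundtrip` and the semaphore name used with it are hypothetical, and nothing here is taken from the Red Bear patch carriers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical helper: create a named semaphore with `initial` as its
 * value, post once, wait once, and report the resulting value.
 * Returns 0 and stores the value in *value_out on success, -1 on failure. */
int named_sem_roundtrip(const char *name, unsigned initial, int *value_out)
{
    assert(name != NULL && value_out != NULL);

    /* Drop any stale object left by an earlier run; failure is ignored. */
    sem_unlink(name);

    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, initial);
    if (sem == SEM_FAILED)
        return -1;

    int rc = -1;
    if (sem_post(sem) == 0 &&
        sem_wait(sem) == 0 &&
        sem_getvalue(sem, value_out) == 0)
        rc = 0;

    sem_close(sem);
    sem_unlink(name);
    return rc;
}
```

One post followed by one wait should leave the value where it started, so a caller can compare `*value_out` against `initial`. This covers single-process behavior only. The cross-process sharing that `QSystemSemaphore` relies on would need a separate, multi-process check.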
---
### Phase R3 — Process/runtime correctness for desktop consumers
**Goal**: Reduce downstream process workarounds.
**What to do**:
- strengthen process-facing libc/runtime behavior enough to remove targeted workarounds such as the Qt `waitid()` shim path
- close or intentionally document the remaining `sigjmp_buf` / `getrlimit()` / `getgroups()` quality gaps that still force downstream patches
- validate process semantics against real downstream consumers, not only isolated libc expectations
**Current implementation note:** the bounded `waitid()` path is now source-visible, the old Qt-side
`waitid()` shim is gone, and qtbase now configures/builds/stages with process support enabled. The
remaining work is broader process/runtime validation and cleanup, not the old total absence of `waitid()`.
**Exit criteria**:
- downstream process workarounds are reduced or eliminated for the current desktop stack
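
The process behavior a `waitid()` consumer expects can be illustrated with a small fork-and-reap sketch. It assumes standard POSIX `waitid()` semantics. The helper name `child_exit_status_via_waitid` is hypothetical, and the code is not the focused test shipped with the relibc patch carriers.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` (0..255),
 * reap it with waitid(), and return the exit status reported in siginfo.
 * Returns -1 if fork or waitid fails or the child did not exit normally. */
int child_exit_status_via_waitid(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);              /* child: exit immediately */

    siginfo_t info = {0};
    if (waitid(P_PID, (id_t)pid, &info, WEXITED) != 0)
        return -1;
    if (info.si_code != CLD_EXITED)
        return -1;                /* killed or stopped, not a normal exit */

    return info.si_status;
}
```

A check like this exercises only the `P_PID` plus `WEXITED` path. Options such as `WNOWAIT`, `WSTOPPED`, or `P_PGID` are exactly where a bounded implementation layered over `waitpid` is most likely to differ, so they deserve their own cases.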
---
### Phase R4 — Networking/runtime validation
**Goal**: Turn the current networking surface from “present” into “trusted”.
**What to do**:
- validate QtNetwork and similar consumers against the current relibc socket/ioctl/interface model
- close the highest-value resolver/header gaps such as `resolv.h` where they are still forcing downstream stubs or disabled modules
- evolve the new bounded `eth0`-backed interface-reporting path into a better general Redox interface model where needed
- document which current networking semantics are intentionally Redox-specific and which are intended to mimic broader Unix behavior
**Exit criteria**:
- at least one meaningful higher-level network consumer is validated against the current relibc networking surface
---
### Phase R5 — Dedicated relibc validation expansion
**Goal**: Improve libc confidence without waiting for whole desktop stacks.
**What to do**:
- build a stronger dedicated Redox-target and P3/downstream validation layer on top of the existing generic relibc test tree
- ensure new APIs and bugfixes come with focused libc-level tests where practical
- keep downstream consumer tests, but stop relying on them as the only quality signal
**Exit criteria**:
- relibc has explicit Redox-target and downstream-runtime validation beyond the generic upstream-style test tree
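
As one example of what a focused libc-level test might cover, the sketch below arms a one-shot `timerfd` and reads the expiration count. It assumes Linux-style `timerfd` semantics with a relative timer (`flags == 0`). Whether the bounded Red Bear path accepts this exact flag combination is not established here, and the helper name `timerfd_oneshot_expirations` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: arm a one-shot relative timer for `delay_ms`
 * milliseconds and block until it fires.
 * Returns the expiration count reported by read(), or -1 on failure. */
long timerfd_oneshot_expirations(long delay_ms)
{
    if (delay_ms <= 0)
        return -1;

    int fd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
    if (fd < 0)
        return -1;

    /* it_interval stays zero, so the timer fires once and disarms. */
    struct itimerspec spec = {0};
    spec.it_value.tv_sec = delay_ms / 1000;
    spec.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;

    uint64_t expirations = 0;
    long result = -1;
    if (timerfd_settime(fd, 0, &spec, NULL) == 0 &&
        read(fd, &expirations, sizeof expirations) == (ssize_t)sizeof expirations)
        result = (long)expirations;

    close(fd);
    return result;
}
```

Keeping each helper this small makes it easy to record which flag combinations have actually been exercised on the Redox target, rather than inferring coverage from a downstream build succeeding.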
---
### Phase R6 — General completeness triage
**Goal**: Attack the remaining TODO/unimplemented backlog by priority rather than by random header count.
**What to do**:
- rank remaining TODO/unimplemented items by downstream subsystem impact
- prioritize IPC, synchronization, process, time, and networking correctness over obscure or deprecated headers
- keep deprecated/low-value gaps documented, but do not let them drive the roadmap ahead of higher-value runtime work
**Exit criteria**:
- relibc backlog is organized by real system impact instead of undifferentiated TODO volume
## Recommended Order of Work
The current best order is:
1. evidence cleanup and canonicalization of what already exists
2. shared memory and named semaphores
3. process/runtime correctness
4. networking/runtime validation
5. Redox-target and downstream validation expansion
6. broader backlog triage and cleanup
That order matches the current downstream blocker chain better than a generic “finish all missing headers” strategy.
## Support-Language Guidance
Until the runtime-validation phases are materially complete, Red Bear should avoid saying:
- “relibc POSIX gaps are solved”
- “Qt/Wayland blockers are fully gone”
- “network/process/shared-memory support is complete”
Prefer language such as:
- “consumer-visible P3 APIs are now present, with runtime validation still needed”
- “relibc is materially stronger, but desktop-facing completeness work remains”
- “the remaining relibc problem is now quality and downstream proof, not just symbol absence”
## Summary
relibc is one of Red Bear's strongest foundational subsystems, but it is not complete.
Its strongest current qualities are:
- broad libc/header coverage
- real Redox-native platform integration
- concrete source-visible and patch-backed solutions to the historical P3 Wayland-facing blockers
- clear downstream build progress because of those fixes
- a substantial generic test surface
Its largest remaining weaknesses are:
- incomplete shared memory and named semaphore support
- process/runtime unevenness
- networking/resolver/interface completeness gaps
- too many TODO/unimplemented branches in central paths
- too little Redox-target and downstream-runtime validation relative to the generic test tree
The correct relibc roadmap is therefore **not** “hunt random missing symbols.” It is to turn the
current build-capable libc into a runtime-trusted subsystem by closing the high-value desktop/runtime
gaps, strengthening validation, and reducing patch-carried ambiguity.
@@ -2,412 +2,167 @@
## Purpose
This document assesses the current **IPC-related relibc surface** in Red Bear OS and turns that This document is the IPC-focused companion to
assessment into a concrete improvement plan. `local/docs/RELIBC-COMPLETENESS-AND-ENHANCEMENT-PLAN.md`.
The focus here is narrower than the general relibc plan: Its job is to describe the current IPC-facing relibc surface honestly, especially where the active
Red Bear build depends on recipe-applied compatibility layers rather than plain-source upstream
relibc.
- POSIX shared memory and semaphores ## Evidence model
- System V shared memory and semaphores
- missing System V / POSIX IPC areas such as message queues
- IPC-adjacent descriptor/event primitives that downstream software treats as part of the same
coordination substrate: `eventfd`, `signalfd`, and `timerfd`
- the downstream subsystem pressure created by Qt, KDE, Wayland, and related userland
This is not a generic libc-compliance document. It is grounded in the current repository state. This document uses the same terms as the canonical relibc plan:
## Evidence Model - **plain-source-visible**
- **recipe-applied**
- **test-present**
- **runtime-unrevalidated in this pass**
This assessment distinguishes four evidence levels: Do not collapse those into one generic "implemented" label.
- **source-visible** — behavior exists in relibc source now ## Current IPC inventory
- **test-visible** — behavior is exercised by focused relibc tests
- **build-visible downstream** — real consumers compile/link against it
- **runtime-validated** — behavior has been exercised in real Redox or consumer runtime paths
The key IPC problem in the current tree is not simple absence. It is the gap between | Surface | Plain source | Active build | Notes |
**source-visible**, **bounded**, **build-proven**, and **runtime-trusted**. |---|---|---|---|
| `shm_open()` / `shm_unlink()` | yes | yes | provided through `sys_mman` in the live source tree |
| named POSIX semaphores | no | yes | added by `P3-semaphore-fixes.patch` on top of `shm_open()` / `mmap()` |
| `eventfd` | no | yes | added by `P3-eventfd.patch` through `/scheme/event/eventfd/...` |
| `signalfd` | no | yes | added by `P3-signalfd.patch` through `/scheme/event` plus signal-mask handling |
| `timerfd` | no | yes | added by `P3-timerfd.patch` through `/scheme/time/{clockid}` |
| `waitid()` | no | yes | added by `P3-waitid.patch` |
| `ifaddrs` / `net_if` support used by IPC-adjacent consumers | no | yes | added by `P3-ifaddrs-net_if.patch`; currently synthetic |
| SysV shm (`sys/shm.h`) | no | no | bounded carriers exist locally, but they are not part of the active concrete-wave recipe surface |
| SysV sem (`sys/sem.h`) | no | no | bounded carriers exist locally, but they are not part of the active concrete-wave recipe surface |
| POSIX message queues (`mqueue.h`) | no | no | still TODO in the live source tree |
| SysV message queues (`sys/msg.h`) | no | no | still TODO in the live source tree |
## Upstream vs Red Bear separation ## Observed limitations
For this IPC work, keep the storage model explicit: ### Named POSIX semaphores
- the live implementation under `recipes/core/relibc/source/src/header/` is the working upstream The active patch chain implements named semaphores by storing a `Semaphore` inside shared memory
tree used for builds and tests opened through `shm_open()` and mapped with `mmap()`. That is a useful bounded compatibility path,
- the durable Red Bear ownership boundary is `local/patches/relibc/` plus `local/docs/` but it should still be described as a Red Bear recipe-applied layer, not a plain-source upstream
relibc completion.
So the IPC implementation is only truly safe when: ### fd-event APIs
1. the upstream-owned relibc source tree builds with the change now, and `eventfd`, `signalfd`, and `timerfd` are present in the active build, but they are all scheme-backed
2. the same delta is preserved in `local/patches/relibc/` so a fresh upstream refetch can recover it compatibility layers:
This repo should be able to pull renewed upstream sources every day and still rebuild after - `eventfd` depends on `/scheme/event/eventfd/...`
reapplying the local relibc patch carriers. That requirement is part of the IPC improvement plan, - `signalfd` depends on `/scheme/event` and blocks the supplied mask with `sigprocmask()`
not an afterthought. - `timerfd` depends on `/scheme/time/{clockid}` and currently rejects unsupported flag combinations
The same section also implies an upstream-preference policy: These are real compatibility layers, but they should still be described as bounded until broader
consumer/runtime proof is recorded.
- when upstream relibc already provides the same IPC fix, prefer upstream ### Deferred SysV shm/sem work
- keep Red Bear IPC patches only for gaps that upstream still does not solve adequately
- review patch carriers regularly and delete or shrink ones made obsolete by upstream evolution
## Current Implementation Note Bounded SysV shm/sem carriers still exist under `local/patches/relibc/`, but they were not wired
into the active concrete-wave recipe surface implemented in this pass. They should therefore be
treated as deferred follow-up work, not as active build behavior.
This repo pass did not just assess the IPC surface; it also restored the missing relibc IPC modules ### Interface enumeration used by networking-adjacent consumers
that the drafted Red Bear docs were already assuming existed in-tree.
The current tree now contains source-visible implementations for: The current `P3-ifaddrs-net_if.patch` replaces `ENOSYS`, but it does so with a synthetic two-entry
model:
- `sys/eventfd.h` / `eventfd()` / `eventfd_read()` / `eventfd_write()` - `loopback`
- `sys/timerfd.h` / `timerfd_create()` / `timerfd_settime()` / `timerfd_gettime()` - `eth0`
- `sys/signalfd.h` / `signalfd()` / `signalfd4()`
- `open_memstream()`
- bounded `sys/ipc.h`, `sys/shm.h`, and `sys/sem.h` compatibility layers
- a bounded `waitid()` path sufficient to satisfy current Qt process-side linking
This pass also added focused relibc tests for: That is enough for some bounded consumers, but it should not be described as live full interface
enumeration.
- `stdio/open_memstream` ## Downstream pressure
- `sys_sem/semget`
- `sys_timerfd/timerfd`
- `sys_signalfd/signalfd`
Current manual verification in this repo pass:
- `cargo check --target x86_64-unknown-linux-gnu` passes for relibc
- host-side focused IPC tests execute successfully for `open_memstream` and `semget`
- targeted Redox runtime execution now validates the `timerfd` and `signalfd` tests directly through the repaired `write-exec` path instead of relying on bounded host-side fallback behavior
- `CI=1 ./target/release/repo cook relibc` completes successfully after clearing a stale stage-dir collision
- `CI=1 ./target/release/repo cook qtbase` now succeeds after exporting `eventfd_t` and restoring a bounded `waitid()` path
- a fresh `repo unfetch relibc` → `repo fetch relibc` cycle plus reapplication of
`local/patches/relibc/` again supports successful downstream `libwayland` and `qtbase` builds,
which is the current proof that the relibc IPC overlay is recoverable from refreshed upstream
source, not only from the previously edited working tree
In other words, the current relibc IPC work is no longer just “working in the checked-out source
tree”. It is now proven as an overlay workflow:
1. refresh upstream relibc source
2. reapply the local relibc compatibility overlays
3. rebuild relibc
4. rebuild real downstream consumers (`libwayland`, `qtbase`)
For the current tree, that overlay story now includes the tracked
`local/patches/relibc/redox.patch` carrier owning the bounded interface-enumeration and
resolver-header compatibility deltas (`ifaddrs` / `net_if`, `arpa/nameser.h`, `resolv.h`) rather
than leaving those as standalone transient patch files.
## Scope Map
### In scope in relibc today
| Area | State | Primary evidence |
|---|---|---|
| `shm_open()` / `shm_unlink()` | implemented | `recipes/core/relibc/source/src/header/sys_mman/mod.rs` |
| POSIX unnamed semaphores | implemented | `recipes/core/relibc/source/src/header/semaphore/mod.rs` |
| POSIX named semaphores | implemented but bounded | `recipes/core/relibc/source/src/header/semaphore/mod.rs` |
| SysV shared memory | implemented but bounded | `recipes/core/relibc/source/src/header/sys_shm/mod.rs` |
| SysV semaphores | implemented but bounded | `recipes/core/relibc/source/src/header/sys_sem/mod.rs` |
| `eventfd` | implemented; stronger than the other descriptor-event APIs | `recipes/core/relibc/source/src/header/sys_eventfd/mod.rs` |
| `signalfd` | implemented, but runtime-thin and not broadly Redox-runtime-trusted yet | `recipes/core/relibc/source/src/header/signal/signalfd.rs` |
| `timerfd` | implemented, but semantically narrow and not broadly Redox-runtime-trusted yet | `recipes/core/relibc/source/src/header/sys_timerfd/mod.rs` |
### Explicitly incomplete or absent
| Area | Current state | Evidence |
|---|---|---|
| POSIX message queues | absent | `recipes/core/relibc/source/src/header/mod.rs` still has `TODO: mqueue.h` |
| SysV message queues | absent | `recipes/core/relibc/source/src/header/mod.rs` still has `TODO: sys/msg.h` |
| `threads.h` / other broader libc completeness | outside this IPC focus, still incomplete | `recipes/core/relibc/source/src/header/mod.rs` |
## Current Implementation Assessment
### 1. Strong spots
The strongest IPC-related point is that relibc is no longer missing its core coordination substrate.
The current tree has real, source-visible implementations for POSIX shm, POSIX semaphores, SysV
shared memory, SysV semaphores, `eventfd`, `signalfd`, and `timerfd`. This is already enough to
move several downstreams from patch-side workarounds to actual libc usage.
`shm_open()` and `shm_unlink()` are cleanly tied to the Redox-native `/scheme/shm/` path in
`sys_mman/mod.rs`. That is a good architectural fit: Red Bear is not pretending to have a Linux
kernel IPC model under the hood, but it still exposes familiar libc entry points on top of Redox
schemes.
The second strong point is that the IPC work is not just source-visible anymore. The focused relibc
tests already cover `sem_open`, `shmget`, `open_memstream`, `semget`, `eventfd`, and the targeted
Redox-runtime `timerfd` / `signalfd` cases. The broader relibc plan also records successful
downstream builds for `libwayland`, `qtbase`, and `openssh`, which means real consumers are already
benefiting from this work, but those consumers do **not** all prove IPC depth equally.
### 2. Weak spots
The biggest weakness is **boundedness masquerading as compatibility**. The SysV layers exist, but
they are deliberately thin wrappers over `/scheme/shm/` and relibc-local bookkeeping, not a broad
Unix-complete implementation.
In `sys_shm/mod.rs`, `shmat()` rejects non-null attach addresses with `ENOSYS`, `SHM_RND` is
defined but not meaningfully implemented, and `shmctl()` only meaningfully supports `IPC_RMID` and
`IPC_STAT`. This is good enough for simple `IPC_PRIVATE` workflows and current compile-time
consumers, but it is not strong enough to claim general SysV shared-memory completeness.
In `sys_sem/mod.rs`, `semget()` rejects any `nsems != 1`, so the implementation is effectively a
single-semaphore set model rather than a full semaphore-set model. `semop()` supports multiple
operations in one call, but only for semaphore number 0, and there is no `semtimedop()` support.
`SEM_UNDO` is defined but not actually implemented. Compared with the standard `semop(2)` model,
this means the current layer matches only the narrowest downstream cases.
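
The single-semaphore limit described above can be probed from a consumer's point of view. The following is a minimal sketch, not part of the relibc test suite; the helper name `semget_accepts_nsems` is hypothetical. It assumes only the standard `semget()` / `semctl()` entry points and reports whether a private set of the requested size can be created.

```c
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical probe helper: returns 1 if semget() can create a private
 * set with `nsems` semaphores, 0 if the call is rejected. */
int semget_accepts_nsems(int nsems)
{
    int id = semget(IPC_PRIVATE, nsems, IPC_CREAT | 0600);
    if (id < 0)
        return 0;

    /* Remove the set again so the probe leaves no kernel or libc state behind. */
    semctl(id, 0, IPC_RMID);
    return 1;
}
```

On a Linux host, the probe would normally succeed for both `nsems == 1` and `nsems == 2`. On the bounded layer described above, only the `nsems == 1` case would be expected to succeed.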
Named POSIX semaphores are also present but still bounded. `sem_open()` is implemented on top of
`shm_open()`, which is a practical Redox-native strategy, but the current code comments already mark
it as a bounded Redox path rather than a full Linux/glibc-equivalent semantic model.
The descriptor-event primitives are in a materially better state than before. `eventfd` now has a
real counter-style runtime path instead of only a source-visible wrapper, and the targeted Redox
runtime test harness now executes strict `eventfd`, `signalfd`, and `timerfd` test binaries
successfully through the repaired `write-exec` runner path. The older "unavailable is success"
fallbacks were removed from those focused tests, so these are now actual runtime checks rather than
mere launch proofs.
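
The counter-style behavior mentioned above can be illustrated with a short sketch. This is not the focused relibc test itself; the helper name `eventfd_sum_after_two_writes` is hypothetical, and the example assumes the default (non-semaphore) `eventfd` mode, in which a read returns the accumulated counter and resets it.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: writes two values to a fresh eventfd and returns
 * the counter read back, or 0 if any step fails. */
uint64_t eventfd_sum_after_two_writes(uint64_t a, uint64_t b)
{
    int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (fd < 0)
        return 0;

    eventfd_t value = 0;
    /* In the default mode the two writes accumulate into one counter,
     * and a single read drains it. */
    if (eventfd_write(fd, a) != 0 ||
        eventfd_write(fd, b) != 0 ||
        eventfd_read(fd, &value) != 0)
        value = 0;

    close(fd);
    return value;
}
```

A strict runtime check in this style fails when the descriptor cannot be created, which is the difference between a real pass/fail test and an "unavailable is success" fallback.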
The preserved overlay story for those paths is now simpler than it was during the original bounded
bring-up. The current relibc tree already contains the fd-event implementations and focused tests
upstream, so the active Red Bear recipe replay no longer needs the old standalone
`P3-eventfd.patch`, `P3-signalfd.patch`, `P3-signalfd-header.patch`, `P3-timerfd.patch`, and
`P3-fd-event-tests.patch` carriers. In the current repo, `redox.patch` remains the active shared
Red Bear relibc delta, while the historical P3 files are legacy references rather than recipe inputs.
The remaining caution is semantic breadth, not whether the paths execute at all. `timerfd` is now
runtime-validated for the bounded relibc test harness, but downstream consumers such as KWin still
pressure Linux-oriented details like `TFD_TIMER_CANCEL_ON_SET`, so broad desktop/runtime trust
should still be described as narrower than full Linux equivalence.
### 3. Missing areas
The obvious missing IPC area is message queues. Both `mqueue.h` and `sys/msg.h` remain TODOs in the
header tree, which means relibc currently has no story at all for POSIX message queues or SysV
message queues. That is not necessarily today's highest-value blocker, but it is still a real IPC
gap and should be named directly instead of being buried under generic TODO volume.
## Downstream Subsystem Assessment
### Qt / KDE
Qt and KDE are the clearest subsystem forcing IPC depth rather than just IPC surface area. Qt and KDE remain the strongest pressure on relibc IPC semantics.
`local/docs/QT6-PORT-STATUS.md` already treats `QSharedMemory`, `QSystemSemaphore`, and `QProcess` They do not only need headers to exist. They need the active compatibility layers to behave well
as moved from “missing libc surface” to “present, but still needs runtime validation”. That is the enough for:
right framing. The libc surface is no longer the primary blocker; confidence and semantics are.
The strongest concrete consumers in-tree are: - shared-memory consumers,
- named semaphore consumers,
- direct `eventfd` / `timerfd` users,
- and process-control paths such as `waitid()`.
- `local/recipes/kde/kf6-kservice/source/src/sycoca/kmemfile.cpp` — heavy `QSharedMemory` usage ### Wayland-facing consumers
- `local/recipes/kde/kf6-solid/source/src/solid/devices/backends/udisks2/udisksopticaldisc.cpp`
`QSharedMemory` plus `QSystemSemaphore`
- `local/recipes/kde/kf6-kio/source/src/gui/previewjob.cpp` — direct SysV `shmget` / `shmat`
- `local/recipes/kde/kwin/source/src/utils/xcbutils.cpp` — direct `shmget`
- `local/recipes/kde/kwin/source/src/core/syncobjtimeline.cpp` and kio scoped-process code —
`eventfd`
- `local/recipes/kde/kwin/source/src/plugins/nightlight/clockskewnotifierengine_linux.cpp`
`timerfd` with `TFD_TIMER_CANCEL_ON_SET`
This matters because it shows two different downstream classes: Wayland-facing pressure is strongest on the fd-event side of the IPC story:
1. **Qt abstractions** (`QSharedMemory`, `QSystemSemaphore`) that can tolerate bounded underlying - `eventfd`
libc behavior if their common paths work. - `signalfd`
2. **Direct Unix/Linux-style callers** (KIO/KWin) that expose the places where the current relibc - `timerfd`
SysV and timerfd layers are still semantically narrower than software expects.
### Wayland stack That is a different pressure profile from the SysV and named-semaphore side.
Wayland is less about classic shared-memory IPC completeness now and more about the descriptor-event ## Fresh verification in this pass
side of the same subsystem family. The repo's existing docs correctly show that `signalfd`,
`timerfd`, `eventfd`, and `open_memstream` were the historical blockers and are now source-visible.
`libwayland` cooking successfully is strong build-side proof, but the remaining work is runtime
behavior under a compositor/session stack.
### Secondary consumers: OpenSSH / GLib / tmux This pass revalidated the active concrete-wave IPC-facing surface through the relibc test recipe:
These are weaker IPC drivers and stronger networking/resolver drivers. They still matter because they - `sys_eventfd/eventfd`
show a pattern: once relibc exports the needed surface, downstream recipes can drop fake fallbacks, - `sys_signalfd/signalfd`
but runtime validation still trails source visibility. For an IPC-focused roadmap, they are useful - `sys_timerfd/timerfd`
secondary evidence, not primary IPC blockers. - `waitid`
- `semaphore/named`
- `semaphore/unnamed`
The downstream proof should therefore be read this way: These are bounded relibc-target proofs. They improve confidence in the active fd-event and named
semaphore surface, but they do not change the deferred status of broader SysV shm/sem or message
queues.
- `qtbase` is the strongest IPC-facing downstream because it directly pressures shared memory, ## Improvement plan
semaphores, and process behavior.
- KDE consumers on top of Qt are the strongest subsystem evidence for where IPC semantics still need
runtime trust.
- `libwayland` is strongest as descriptor-event proof (`signalfd`, `timerfd`, `eventfd`,
`open_memstream`) rather than SysV IPC proof.
- `openssh`, `glib`, and `tmux` are useful proof that relibc header/export cleanup is helping real
ports, but they should not be over-counted as core IPC validation.
## Main Blockers ### Phase I1 — Keep IPC claims aligned with the active build surface
### Blocker 1 — SysV layers are intentionally narrower than their API surface suggests - document patch-applied IPC layers as patch-applied
- stop describing them as plain-source-visible unless they move into the live source tree
- keep this doc aligned with `recipes/core/relibc/recipe.toml`
This is the highest-value blocker because it affects both direct consumers and Qt/KDE confidence. ### Phase I2 — Decide the support contract for bounded IPC layers
Current examples: For each major IPC area, choose one of these paths explicitly:
- `semget()` only supports one semaphore per set - bounded compatibility layer with honest documentation,
- `semop()` only supports semaphore number 0 - or broader semantics work with explicit proof targets.
- `SEM_UNDO` is not implemented
- `semtimedop()` is absent
- `shmat()` does not support non-null attach addresses
- `shmctl()` does not cover the broader control matrix
- SysV message queues are absent entirely
None of these invalidate the current build work. But together they mean “API present” is still not This is especially important for:
the same as “subsystem-complete”.
### Blocker 2 — Runtime validation is still shallower than subsystem importance - SysV shm,
- SysV sem,
- named semaphores,
- and `ifaddrs`-driven interface discovery.
The IPC surface is better-tested than before, but runtime validation still trails the subsystem's ### Phase I3 — Add proof where current docs only imply confidence
importance.
Current test story: Highest-value areas:
- host-side focused execution exists for `sem_open`, `shmget`, `open_memstream`, `semget`, and - the fd-event slice used by Wayland-facing consumers,
`eventfd` - shared-memory and named-semaphore behavior used by Qt/KDE,
- targeted Redox runtime execution now exists for `signalfd`, `timerfd`, and `eventfd` via - and the currently synthetic interface-discovery path.
`relibc-tests-bins` and the repaired `cookbook_redoxer write-exec` path, with strict pass/fail
semantics rather than availability fallbacks
- downstream build evidence exists for `libwayland`, `qtbase`, and `openssh`
What is still missing is stronger Redox-target or consumer-runtime proof for Qt/KDE and Wayland ### Phase I4 — Triage message queues directly
paths that actually exercise shared memory, semaphores, and timer/signal descriptor behavior in a
live session.
The strongest safe claim today is therefore: Message queues are still genuine absences, not just bounded implementations.
- **source-visible** across the major IPC surfaces, This doc should keep them visible until Red Bear either:
- **test-visible** for focused host-side and Redox-target fd-event cases,
- **build-visible downstream** for meaningful consumers,
- with **bounded runtime trust on Redox for the relibc fd-event harness**,
- but **not yet broad proof of full Linux-equivalent semantics for every desktop consumer path**.
### Blocker 3 — Descriptor-event semantics are still narrower than Linux-oriented callers expect - implements them,
- proves they are unnecessary for the intended consumer set,
- or explicitly documents them as deferred/non-goals.
KWin's timer code wants `TFD_TIMER_CANCEL_ON_SET`. The current bounded relibc timerfd layer does ### Phase I5 — Converge with upstream deliberately
not claim that full Linux cancel-on-clock-change semantic. The preserved test/runtime slice proves
one-shot behavior and successful `TFD_TIMER_ABSTIME` / bounded flag-surface handling, while broader
Linux-equivalent cancel-on-clock-change semantics remain an explicit downstream expectation gap.
Likewise, `signalfd` support is no longer merely visible/exported; it now passes the targeted When upstream relibc absorbs equivalent IPC functionality, prefer the upstream path and shrink the
Redox-runtime relibc test path. The remaining question is broader consumer semantics and long-tail Red Bear patch chain. Until then, keep the active IPC carrier set explicit and documented.
desktop/runtime confidence, not basic availability.
### Blocker 4 — Message queues remain a completely open IPC front ## Bottom line
`mqueue.h` and `sys/msg.h` are still absent. This is not the first blocker to fix for today's The current Red Bear relibc IPC story is **material patch-applied compatibility, not plain-source
desktop stack, but it is the clearest “IPC truly not implemented yet” gap left in relibc. completion**.
## Current Non-Goals / Not Yet Claimed That is still valuable progress, but the repo should describe it honestly: several important IPC
surfaces exist in the active build, several of them are still bounded, and message queues remain a
The current tree should **not** be described as claiming any of the following: real missing area.
- full SysV semaphore-set semantics
- full SysV shared-memory semantics
- full Linux-equivalent `timerfd` semantics
- broad Redox-runtime trust for `signalfd` or `timerfd`
- any POSIX message queue support
- any SysV message queue support
## Recommended Improvement Plan
### Phase I1 — Reclassify the IPC support language
**Goal:** Make subsystem docs accurately describe the current state.
**Do:**
- describe POSIX shm and semaphores as implemented
- describe SysV shm and semaphores as **bounded compatibility layers**, not comprehensive support
- describe `eventfd` as stronger than `signalfd` / `timerfd`
- describe message queues as still absent
**Exit criteria:** repo docs stop using broad phrases that imply complete IPC compatibility.
### Phase I2 — Harden the bounded SysV compatibility layers
**Goal:** Make the existing SysV support less misleading and more useful.
**Do:**
- decide whether Red Bear wants full semaphore-set support or an intentionally limited single-set model
- if limited, document that choice explicitly in relibc and subsystem docs
- otherwise extend `semget` / `semop` / `semctl` beyond the current semaphore-0-only model
- implement or explicitly reject `SEM_UNDO`
- add `semtimedop()` if downstreams need it
- expand `shmctl()` and `shmat()` support where real consumers need more than the current `IPC_PRIVATE`
attach workflow
**Exit criteria:** the SysV shm/sem layers either become materially broader or are clearly documented
as intentionally bounded Redox compatibility shims.
### Phase I3 — Close the Qt/KDE runtime-proof gap
**Goal:** Move the IPC story from build-visible to desktop-visible.
**Do:**
- validate `QSharedMemory` under real Qt/KDE usage paths
- validate `QSystemSemaphore` in KDE consumers such as Solid
- validate KIO / KWin direct SysV shm paths
- record exactly which Qt/KDE IPC paths are now runtime-trusted versus merely build-capable
**Exit criteria:** Qt/KDE docs stop listing shared memory and semaphore support as unresolved relibc
confidence gaps.
### Phase I4 — Improve descriptor-event completeness for compositor/session code
**Goal:** Turn the current `eventfd` / `signalfd` / `timerfd` set into a more trustworthy runtime layer.
**Do:**
- keep `eventfd` on the current stable path
- validate `signalfd` in real event-loop style consumers
- extend `timerfd` semantics where current downstream code expects more than `TFD_TIMER_ABSTIME`
(notably `TFD_TIMER_CANCEL_ON_SET`)
- build targeted Redox-target tests where host behavior is inherently not representative
**Exit criteria:** at least one meaningful compositor/session consumer is runtime-validated against
the current descriptor-event path.
### Phase I5 — Triage message queues explicitly
**Goal:** Stop leaving message queues as unprioritized TODOs.
**Do:**
- determine whether any current Red Bear subsystem actually needs POSIX or SysV message queues
- if not, mark them as lower-priority completeness debt
- if yes, create a dedicated implementation plan rather than burying them in generic header backlog
**Exit criteria:** `mqueue.h` and `sys/msg.h` are either on a concrete roadmap or explicitly treated
as non-blocking backlog.
## Recommended Order
The current best order is:
1. documentation cleanup and accurate IPC classification
2. SysV shm/sem hardening or explicit non-goal documentation
3. Qt/KDE runtime validation
4. descriptor-event runtime validation and timerfd semantic expansion
5. message queue triage
That order matches the current subsystem pressure better than a generic “finish all missing IPC
headers” strategy.
## Bottom Line
relibc IPC in Red Bear OS is no longer a story of missing primitives. It is now a story of **real
surface area with bounded compatibility depth**.
The strongest parts are POSIX shm, POSIX semaphores, `eventfd`, and the fact that major downstreams
already build. The weakest parts are the narrow SysV semantics, the lack of message queues, and the
runtime-proof gap for the desktop/session stack. The right next step is not random header work; it
is to harden and validate the IPC layers that current Qt/KDE and Wayland-adjacent consumers are
already trying to use.
+4
@@ -26,6 +26,10 @@ The goal is to remove guesswork from the sync/fetch/apply/build workflow.
| `local/scripts/test-drm-display-runtime.sh` | Run the bounded DRM/KMS display checker in a target runtime | invokes the packaged `redbear-drm-display-check` helper for AMD or Intel, proving scheme/card reachability, connector/mode enumeration, and bounded direct modeset proof over the Red Bear DRM ioctl surface when requested | does not prove render command submission, fence semantics, or hardware rendering |
| `local/scripts/test-amd-gpu.sh` | AMD wrapper for the bounded DRM/KMS display checker | runs `test-drm-display-runtime.sh --vendor amd` | still only display-path evidence |
| `local/scripts/test-intel-gpu.sh` | Intel wrapper for the bounded DRM/KMS display checker | runs `test-drm-display-runtime.sh --vendor intel` | still only display-path evidence |
| `local/scripts/test-msix-qemu.sh` | Bounded MSI-X proof in QEMU | validates that the current virtio-net guest path reaches MSI-X-capable interrupt delivery and emits normalized `IRQ_DRIVER`, `IRQ_MODE`, `IRQ_REASON`, and `IRQ_LOG` output for the bounded guest/runtime proof | does not prove broad hardware MSI-X reliability or per-device fallback behavior outside the bounded guest path |
| `local/scripts/test-iommu-qemu.sh` | Bounded IOMMU first-use proof in QEMU | validates guest-visible AMD-Vi initialization and bounded event/drain behavior through the current `iommu` runtime path | does not prove real-hardware interrupt remapping quality or full DMA-remapping correctness |
| `local/scripts/test-xhci-irq-qemu.sh` | Bounded xHCI interrupt-mode proof in QEMU | validates that the xHCI guest path reaches an interrupt-driven mode under the current bounded runtime checker and emits normalized `IRQ_DRIVER`, `IRQ_MODE`, `IRQ_REASON`, and `IRQ_LOG` output | does not prove full USB topology maturity or broad hardware interrupt robustness |
| `local/scripts/test-lowlevel-controllers-qemu.sh` | Aggregate bounded low-level controller proof wrapper | runs MSI-X, xHCI IRQ, IOMMU first-use, PS/2/serio, and monotonic timer proofs in one sequence, defaulting to `redbear-mini` while automatically upgrading only the IOMMU leg to `redbear-full` because that runtime currently ships `/usr/bin/iommu`; if the required `redbear-full` image is absent, that single IOMMU leg is explicitly skipped rather than aborting the rest of the bounded wrapper | does not replace the individual proof helpers and does not prove real-hardware controller quality |
| `local/scripts/prepare-wifi-vfio.sh` | Prepare or restore an Intel Wi-Fi PCI function for passthrough | binds a chosen PCI function to `vfio-pci` or restores it to a specified host driver | does not verify guest Wi-Fi functionality and must be used carefully on a host with a safe detachable target device |
| `local/scripts/validate-wifi-vfio-host.sh` | Check whether a host looks ready for Wi-Fi VFIO testing | validates PCI presence, current driver, UEFI firmware, Red Bear image presence, QEMU/expect availability, VFIO module state, and IOMMU group visibility; exits non-zero when blockers are found | does not bind devices or prove the guest Wi-Fi stack works |
| `local/scripts/run-wifi-passthrough-validation.sh` | End-to-end host-side passthrough validation wrapper | prepares VFIO, runs the packaged in-guest Wi-Fi validation path, captures the guest JSON artifact to the host, writes a host-side metadata sidecar, and restores the host driver afterwards | still depends on real VFIO/hardware support and does not itself guarantee end-to-end Wi-Fi connectivity |