From 2485afa798703010fc2626bab8d92f4ad1440a2e Mon Sep 17 00:00:00 2001 From: Vasilito Date: Sat, 18 Apr 2026 21:38:31 +0100 Subject: [PATCH] Update ACPI and low-level controller docs --- local/docs/ACPI-FIXES.md | 11 +- local/docs/ACPI-IMPROVEMENT-PLAN.md | 823 +++++++++--------- ...D-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md | 27 +- 3 files changed, 431 insertions(+), 430 deletions(-) diff --git a/local/docs/ACPI-FIXES.md b/local/docs/ACPI-FIXES.md index d5cbb939..bc0755c5 100644 --- a/local/docs/ACPI-FIXES.md +++ b/local/docs/ACPI-FIXES.md @@ -9,8 +9,9 @@ Status of ACPI fixes for AMD bare metal boot. Cross-referenced with This file is the **historical P0 bring-up ledger**. The forward-looking ownership, robustness, and validation plan now lives in `local/docs/ACPI-IMPROVEMENT-PLAN.md`. -P0 ACPI boot-baseline work is **materially complete**. Kernel patch is 574 lines, base/acpid patch -is 558 lines. +P0 ACPI boot-baseline work is **materially complete for the historical boot goal**. It should not +be read as release-grade ACPI completeness; ownership cleanup, sleep-state support, and bounded +bare-metal validation still remain open. Kernel patch is 574 lines, base/acpid patch is 558 lines. ## Crash Reports @@ -46,7 +47,7 @@ access. | RSDT/XSDT | `acpi/rsdt.rs`, `acpi/xsdt.rs` | N/A | Root table pointer iteration + SDT checksum validation | | MADT (APIC) | `acpi/madt/` | N/A | xAPIC + x2APIC (type 0x9) + NMI (0x4, 0xA) + address override (0x5) | | HPET | `acpi/hpet.rs` | N/A | Assumes single HPET | -| DMAR (Intel VT-d) | N/A | `acpi/dmar/` (present, not wired) | DMAR table parsing present in `dmar/mod.rs` but not initialized at acpid startup; effectively owned by `iommu` daemon. Iterator bug fixed, re-enabled, safe on AMD (early return) | +| DMAR (Intel VT-d) | N/A | `acpi/dmar/` (present, not wired) | DMAR parsing code remains in `dmar/mod.rs` but is not initialized at `acpid` startup. 
Ownership is still transitional/orphaned from `acpid`, not cleanly transferred to a real Intel runtime owner. Iterator bug fixed, re-enabled, safe on AMD (early return) | | FADT | N/A | `acpi.rs` | Full: PM1a/b CNT, reset register, `\_S5` sleep types, GenericAddress I/O | | Power Methods | N/A | `acpi.rs` | `\_PS0`/`\_PS3`/`\_PPC` AML evaluation for device power control | | SPCR | `acpi/spcr.rs` | N/A | ARM64 serial console | @@ -104,7 +105,7 @@ platforms, not just AMD. | `acpi.rs` | 643 | Handle SLP_TYPb for sleep states | Upstream | Mainline power management | Open (known gap) | | `aml_physmem.rs` | 418,423,428 | Mutex create/acquire/release | Upstream | Mainline AML interpreter | **Partially addressed** — real tracked state implemented, not placeholder | | `ec.rs` | 193+ (8 occurrences) | Proper error types | Upstream | Mainline EC handler | **Partially addressed** — widened accesses implemented via byte transactions | -| `dmar/mod.rs` | 7 | Move DMAR to separate driver | Upstream | Mainline driver refactor | **Partially addressed** — DMAR module present but not wired into startup; effectively deferred to `iommu` daemon | +| `dmar/mod.rs` | 7 | Move DMAR to separate driver | Upstream | Mainline driver refactor | **Partially addressed** — DMAR module present but not wired into startup; ownership remains transitional/orphaned rather than cleanly moved | | `main.rs` | — | Startup panic/expect handling | Local | Boot-path hardening | **Addressed** — typed `StartupError` enum with explicit error messages and clean exit paths | ## P0 Fixes Applied @@ -137,7 +138,7 @@ platforms, not just AMD. 
|---|-----|-------------| | 1 | DMAR iterator fix | `type_bytes` renamed to `len_bytes` bug fix + `len < 4` guard | | 2 | DMAR init re-enabled | Safe on AMD (no DMAR table = early return, no crash) | -| 3 | DMAR not wired into acpid startup | DMAR module present in `dmar/mod.rs` but not imported or called from `main.rs`; effectively deferred to `iommu` daemon ownership | +| 3 | DMAR not wired into acpid startup | DMAR module present in `dmar/mod.rs` but not imported or called from `main.rs`; this removes active startup ownership from `acpid`, but does not yet establish a clean Intel runtime owner | | 4 | FADT shutdown | `acpi_shutdown()` using PM1a/PM1b CNT_BLK writes with `\_S5` sleep types | | 5 | FADT reboot | `acpi_reboot()` using ACPI reset register via GenericAddress | | 6 | Keyboard controller fallback | `Pio::<u8>::new(0x64).write(0xFE)` when reset_reg unavailable | diff --git a/local/docs/ACPI-IMPROVEMENT-PLAN.md b/local/docs/ACPI-IMPROVEMENT-PLAN.md index e82b7635..bdf37e46 100644 --- a/local/docs/ACPI-IMPROVEMENT-PLAN.md +++ b/local/docs/ACPI-IMPROVEMENT-PLAN.md @@ -1,383 +1,330 @@ # Red Bear OS ACPI Improvement Plan +## Truth Statement + +Red Bear ACPI is **boot-baseline complete for the historical P0 bring-up goal**, but it is **not +release-grade complete**. + +What is real today: + +- kernel early discovery and MADT/xAPIC/x2APIC bring-up are in place, +- `acpid` owns FADT shutdown/reboot, AML execution, DMI exposure, and ACPI power exposure, +- IVRS/AMD-Vi ownership moved out of the broken `acpid` path and into `iommu`, +- `kstop` shutdown eventing exists and is integrated with `redbear-sessiond`. + +What is still open: + +- sleep-state support beyond `\_S5`, +- AML portability and runtime robustness on real firmware, +- clean ownership boundaries across kernel / `acpid` / IOMMU, +- bounded real-hardware validation on AMD, Intel, and at least one EC-backed platform. 
+ +This document is therefore a **ULW execution plan** for turning the current ACPI stack from +historical bring-up success into a subsystem that is honest, maintainable, and release-grade. + ## Purpose -This document turns the current ACPI assessment into a concrete execution plan. +This plan does **not** replace `local/docs/ACPI-FIXES.md`. -It does **not** replace `local/docs/ACPI-FIXES.md`. That file remains the historical record for the -P0 bring-up work and the current table-by-table status snapshot. This document is the forward-looking -plan for improving **completeness**, **robustness**, **ownership clarity**, **consumer integration**, -and **validation quality**. +- `local/docs/ACPI-FIXES.md` remains the historical ledger for P0 ACPI bring-up and the current + table-by-table implementation snapshot. +- This file is the forward execution plan for closing the remaining ACPI gaps in correctness, + ownership clarity, consumer integration, and validation trust. -The goal is not to treat ACPI as a generic checklist of table parsers. The goal is to make the Red -Bear ACPI stack: +The goal is not to maximize the number of parsed ACPI tables. The goal is to make the Red Bear ACPI +stack: -- correct enough to survive bad firmware, -- clear enough that ownership boundaries stay maintainable, -- observable enough that failures are diagnosable, -- and validated enough that "complete" means more than "boots on one machine". +- correct under bad firmware, +- explicit about who owns what, +- observable when it fails, +- and validated enough that status claims are evidence-backed rather than inferred. 
## Scope -This plan covers the Red Bear ACPI stack and its direct dependency chain: +This plan covers the Red Bear ACPI stack and its direct consumers: -- kernel ACPI discovery and early table handling, -- `acpid` as the main ACPI/AML/FADT/DMI/power daemon, -- `iommu` as the IVRS/AMD-Vi runtime owner, -- `pcid` / `/config` as the MCFG replacement path, +- kernel ACPI discovery and early platform setup, +- `acpid` as the main ACPI / AML / FADT / DMI / power daemon, +- `iommu` as the IVRS / AMD-Vi runtime owner, +- `pcid` and `/config` as the PCI config-space path replacing broken MCFG-in-`acpid` stubs, - DMI-backed quirks flowing through `acpid` and `redox-driver-sys`, -- ACPI-consuming services such as `redbear-sessiond` and `redbear-info`. +- ACPI consumers such as `redbear-sessiond`, `redbear-info`, and downstream services. -Primary focus is the current x86_64 path, because that is the active Red Bear hardware target and the -area where the current implementation and validation debt is concentrated. ARM64 ACPI support remains -in scope only where kernel ownership decisions or generic parser quality would affect it. - -## Evidence Model - -This plan uses five evidence buckets and does **not** treat them as equivalent: - -- **source-visible** — behavior is visible in the current checked-in source tree -- **patch-carried** — behavior exists through `local/patches/*` rather than plain upstream source -- **build-visible** — code compiles and stages successfully in the current build -- **runtime-validated** — behavior has been exercised successfully in real boot/runtime paths -- **negative-result-documented** — failure modes and platform gaps are recorded explicitly - -This matters because the current ACPI stack has already crossed the bring-up threshold, but still has -meaningful gaps between **implemented**, **robust**, and **trusted**. 
- -## Ownership Model - -The long-term ownership split should be: - -- **Kernel ACPI** — minimum early discovery and unavoidable early platform setup -- **`acpid`** — ACPI table serving, AML execution, FADT power/reboot logic, DMI exposure, - power-state exposure, ACPI table quirk filtering -- **`iommu` daemon** — IVRS runtime parsing and AMD-Vi controller ownership -- **future Intel IOMMU owner** — DMAR runtime handling, not `acpid` -- **`pcid`** — PCI config space access replacing broken MCFG-in-acpid stubs -- **consumers** — query ACPI-exposed services; do not parse ACPI firmware directly unless they are - the designated owner - -This ownership split is **not fully enforced today**. The plan below is designed to move the current -tree from transitional ownership to explicit ownership without destabilizing the working bring-up -path. - -## Current State Summary - -### What is strong today - -- Kernel RSDP/RSDT/XSDT/MADT handling exists and is sufficient for current boot bring-up. -- `acpid` owns FADT parsing, AML integration, DMI exposure, and ACPI-backed power state exposure. -- IVRS was correctly removed from the broken `acpid` stub path and moved to the `iommu` daemon. -- MCFG ownership was correctly removed from `acpid` and replaced with the `pcid /config` path. -- DMI-backed quirks are integrated through `/scheme/acpi/dmi` and `redox-driver-sys`. -- `acpid` startup uses typed `StartupError` with explicit error messages and clean exit paths (Wave 1 - boot-path hardening partially complete). -- AML mutex state has real tracked implementation with handle-based acquire/release semantics in - `aml_physmem.rs` (Wave 2 AML mutex work partially complete). -- EC access width is handled via `read_bytes`/`write_bytes` byte-transaction sequences for u16/u32/u64 - accesses (Wave 2 EC width work partially complete). 
-- DMAR table parsing module exists in `acpid` but is not wired into the startup path; DMAR ownership - is effectively deferred to the `iommu` daemon (Wave 3 DMAR separation partially complete). -- Shutdown eventing uses `/scheme/kernel.acpi/kstop` as the kernel-to-userspace shutdown signal; - `redbear-sessiond` listens on this path for `PrepareForShutdown` D-Bus signals. -- Kernel registers the `kstop` scheme at boot and ACPI subsystem shutdown uses PM1a/PM1b CNT writes - with `\_S5` sleep types. - -### What is still weak today - -- Sleep state transitions (`\_Sx` methods beyond `\_S5`) and sleep eventing remain unsupported; there is - no `/scheme/acpi/sleep` or event-driven sleep contract. -- AML opregion error propagation still has some silent failure paths; not all correctness-critical - reads return error to caller. -- `AmlSymbols` initialization order is still tied to PCI FD registration timing; AML initialization - is not fully deterministic. -- `SLP_TYPb` handling remains unimplemented for sleep states beyond `\_S5`. -- DMAR table parsing module is present but unused; the module itself has not been removed, creating - latent confusion about ownership. -- Docs still risk equating "implemented" with "validated" without explicit evidence qualification. - -### Honest status statement - -Red Bear ACPI is **materially complete for the historical P0 boot goal**, but it is **not yet complete -for robustness, ownership cleanliness, sleep state support, or broad platform confidence**. Sleep -state eventing is a known gap. The shutdown eventing contract via `kstop` is implemented but only -validated in QEMU; bare-metal validation is still outstanding. +Primary focus is the current `x86_64` path. ARM64 remains in scope only where parser quality or +kernel-ownership decisions are shared. 
## Canonical Related Documents Read these alongside this plan: -- `local/docs/ACPI-FIXES.md` — current status ledger and historical P0 fixes -- `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md` — controller-level validation and - quality context -- `local/docs/IOMMU-SPEC-REFERENCE.md` — IVRS/DMAR technical reference -- `local/docs/QUIRKS-SYSTEM.md` — DMI-backed quirks and ACPI table blacklist behavior -- `local/docs/AMD-FIRST-INTEGRATION.md` — historical AMD-first framing and hardware context -- `local/docs/BAREMETAL-LOG.md` — real-machine failure notes and negative results +- `local/docs/ACPI-FIXES.md` +- `local/docs/BAREMETAL-LOG.md` +- `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md` +- `local/docs/IOMMU-SPEC-REFERENCE.md` +- `local/docs/QUIRKS-SYSTEM.md` +- `docs/02-GAP-ANALYSIS.md` -## Work Classification +## Evidence Model -Every task in this plan is tagged by its main purpose: +This plan uses five evidence buckets and does **not** treat them as equivalent: -- **Completeness** — functionality exists but is still missing or partial -- **Robustness** — behavior exists but is too fragile under bad firmware or runtime stress -- **Quality** — ownership, observability, maintainability, or docs are below target +- **source-visible** — behavior is visible in the checked-in source tree +- **patch-carried** — behavior exists through `local/patches/*` +- **build-visible** — code compiles and stages in the current build +- **runtime-validated** — behavior has been exercised successfully in boot or runtime +- **negative-result-documented** — failures and platform gaps are explicitly recorded + +This distinction matters because the current ACPI stack has already crossed the bring-up threshold, +but still has meaningful distance between **implemented**, **robust**, and **trusted**. 
+ +## Status Vocabulary + +All ACPI status claims in Red Bear docs should use one of these meanings: + +- **implemented** — present in code today +- **validated in QEMU** — exercised in QEMU / OVMF only +- **validated on bounded real hardware** — proven on named tested hardware only +- **transitional** — exists, but ownership or architecture is still not clean +- **known gap** — absent, incomplete, or intentionally deferred and documented + +Do **not** use a bare “complete” claim without also saying whether it means boot-baseline, +bounded-hardware, or release-grade completeness. + +## Current State Summary + +### Strong today + +- Kernel RSDP / RSDT / XSDT / MADT handling is sufficient for current boot bring-up. +- `acpid` owns FADT parsing, AML integration, DMI exposure, and ACPI-backed power-state exposure. +- `acpid` startup uses typed `StartupError` and clean exits for several boot-critical failure paths. +- AML mutex state is real-tracked in `aml_physmem.rs`, not placeholder-only. +- EC width access is implemented via byte-transaction sequences for widened reads and writes. +- IVRS ownership was removed from the broken `acpid` stub path and moved into the `iommu` daemon. +- MCFG handling was removed from `acpid` and replaced with the `pcid /config` path. +- Shutdown eventing via `/scheme/kernel.acpi/kstop` is implemented and consumed by + `redbear-sessiond`. + +### Weak today + +- Sleep-state transitions beyond `\_S5` are unsupported. +- Sleep eventing is unsupported. +- `SLP_TYPb` remains incomplete for broader sleep-state handling. +- AML init order is still tied to PCI FD registration timing. +- Some physmem / opregion failure paths are still not explicit enough. +- DMAR remains orphaned in `acpid` source: present, not wired, not fully transferred. +- Repo status language can still blur “implemented” vs “validated”. +- Bare-metal validation is too thin to justify release-grade claims. 
+ +## Ownership Model + +The long-term ownership split should be: + +| Component | Intended owner | Current status | +|---|---|---| +| RSDP / RSDT / XSDT early discovery | Kernel | implemented | +| MADT / HPET / early unavoidable platform setup | Kernel | implemented, broader scope still transitional | +| FADT parsing, `\_S5`, PM register writes, reboot | `acpid` | implemented | +| AML execution and opregion handling | `acpid` | implemented, robustness still partial | +| DMI exposure and ACPI power surfaces | `acpid` | implemented | +| IVRS / AMD-Vi runtime handling | `iommu` | implemented | +| DMAR / Intel VT-d runtime handling | future Intel IOMMU owner | transitional / not fully assigned | +| PCI config-space access | `pcid` | implemented | +| ACPI consumers | downstream services | should consume ACPI-owned surfaces, not firmware directly | + +Important ownership truth: + +- **DMAR is not cleanly transferred today.** +- The `acpi/dmar/mod.rs` module still exists inside `acpid` source, but is not wired into startup. +- `iommu` is the real IVRS runtime owner today. +- Do **not** describe Intel DMAR ownership as fully complete until the orphaned `acpid` carrier is + removed or a real Intel runtime owner is implemented and validated. + +## Degraded-Mode Contract + +The ACPI stack must distinguish between **fatal**, **degradable**, and **out-of-scope** failures. 
+ +| Condition | Expected behavior today | Classification | +|---|---|---| +| ACPI absent / empty root table | `acpid` exits cleanly without ACPI services | degradable | +| Bad SDT checksum | warn, continue best-effort where supported | degradable | +| Bad table length / malformed table | behavior varies too much today; must be normalized | open contract | +| AML init failure | `acpid` exits, ACPI scheme unavailable | currently fatal | +| EC timeout | AML error path should surface failure, not fabricate success | degradable | +| Missing `\_S5` | shutdown path cannot use PM registers | degradable if fallback exists | +| Sleep-state transition request | unsupported today | known gap | +| Missing `kstop` path | no kernel-orchestrated shutdown event contract | fatal for that integration path | +| Missing DMAR on Intel | no Intel VT-d runtime | degradable for non-IOMMU boot | +| Missing IVRS on AMD | no AMD-Vi runtime | degradable for non-IOMMU boot | + +Wave 1 must convert the still-fuzzy cases into explicit, table-specific policy. + +## ULW Execution Rules + +These rules govern all work from this plan: + +1. **No hidden status inflation.** Status words must match evidence. +2. **No ownership moves without a handoff contract.** “Not wired” is not the same as “cleanly moved.” +3. **No validation laundering.** QEMU success is not bare-metal success. +4. **No Wave 5 shortcuts.** Validation cannot substitute for unfinished architecture. +5. **No cross-wave dependency drift.** Later waves must not silently depend on work that was never + formalized in earlier waves. ## Wave 0 — Contracts, truthfulness, and degraded-mode policy ### Goal -Stop treating ACPI as a loose cluster of working code and instead define: +Establish one canonical answer to: 1. who owns what, -2. which failures are fatal versus degradable, -3. and what status words in the docs actually mean. +2. what counts as degraded but acceptable, +3. and what ACPI status words mean. 
### Why this wave is first -Without an explicit contract, later hardening work turns into hidden rewrites and docs drift. +Without a contract, all later hardening work turns into undocumented rewrites and docs drift. -### Scope +### Primary files - `local/docs/ACPI-FIXES.md` -- `local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md` - this file -- related references in `README.md`, `docs/02-GAP-ANALYSIS.md`, and `AGENTS.md` if needed +- `docs/02-GAP-ANALYSIS.md` +- `README.md` and related status surfaces if needed -### Status: Wave 0 execution partially complete +### Dependencies -Tasks 0.1, 0.2, and 0.3 are partially executed in this documentation pass. The degraded-mode matrix -and normalized vocabulary are new in this pass. Ownership boundaries are partially documented below; -the canonical statement still lives in `local/docs/ACPI-FIXES.md`. +- none -### Vocabulary normalization (Task 0.2 — partially executed) +### Deliverables -Replace ambiguous wording such as "complete" with one of: +- one normalized ACPI vocabulary, +- one degraded-mode contract, +- one canonical ownership statement, +- removal of doc language that implies subsystem completeness without evidence. -- **implemented** — behavior exists in the current source tree -- **validated in QEMU** — behavior has been exercised in QEMU/OVMF but not on real hardware -- **validated on bounded real hardware** — behavior verified on specific hardware that was tested -- **still transitional** — behavior exists but ownership or robustness is not yet clean -- **known gap** — functionality is absent or broken; the gap is documented +### Verification -### ACPI degraded-mode matrix (Task 0.1 — new) +- documentation review only, +- no contradictory ownership claims across ACPI docs, +- no bare “complete” wording without scope. -This matrix documents the expected system behavior for ACPI failure cases. All entries reflect -implemented behavior visible in the current source tree. 
+### Exit criteria -| Condition | Kernel behavior | Userspace (`acpid`) behavior | Session impact | -|-----------|----------------|----------------------------|----------------| -| Bad RSDP checksum | Warns, continues with best-effort RSDP parse | No ACPI init if RSDT/XSDT unreadable; exits cleanly | No ACPI services | -| Bad SDT checksum | Logs warning per table, continues | Table skipped; other tables still served | Reduced ACPI surface | -| Truncated FADT | FADT fields fall back to zero defaults | Uses zero defaults for PM registers; `acpi_shutdown` may not fire | Shutdown may fall back to keyboard controller | -| Truncated DMAR | N/A (DMAR not used by kernel) | Logs error, continues without DMAR; `iommu` daemon not started | No Intel IOMMU via DMAR | -| Truncated IVRS | N/A (IVRS not used by kernel) | No effect on `acpid` (IVRS owned by `iommu` daemon) | No AMD-Vi via IVRS | -| AML interpreter init failure | N/A | `acpid` exits with typed error; no ACPI scheme | No AML, no power methods | -| EC timeout | N/A | Returns `AmlError::MutexAcquireTimeout` to AML interpreter | AML opregion access fails gracefully | -| EC unsupported width access | N/A | Wider accesses split into byte transactions via `read_bytes`/`write_bytes` | Works on byte-access ECs only | -| Missing DMAR on Intel | N/A | `acpid` logs DMAR absent; `iommu` daemon not started | No Intel VT-d | -| Missing IVRS on AMD | N/A | No effect (IVRS owned by `iommu` daemon) | No AMD-Vi | -| Missing `\_S5` sleep types | N/A | `acpi_shutdown` logs error and returns without writing PM registers | Shutdown may fall back to keyboard controller | -| Missing `/scheme/kernel.acpi/kstop` | Kernel does not register kstop scheme | `acpid` exits with error on startup | No kernel-orchestrated shutdown | -| Sleep state transition (`\_Sx`) | N/A | Not implemented; no event-driven sleep contract | Sleep states not available | -| `redbear-sessiond` shutdown watcher | Kernel signals `kstop` on shutdown | `acpi_watcher.rs` 
reads kstop and emits D-Bus `PrepareForShutdown` | Login1 session manager informed | +- one canonical ownership statement exists, +- one degraded-mode matrix exists, +- all top-level ACPI docs use the same vocabulary. -### Ownership boundaries (Task 0.3 — partially documented) +### Current status -This section documents the current ownership split as visible in the source tree. Items marked -**transitional** indicate the ownership boundary is not yet fully enforced by code. - -| Component | Owner | Status | -|-----------|-------|--------| -| Early table discovery (RSDP, RSDT, XSDT) | Kernel | implemented | -| MADT, HPET, SPCR, GTDT parsing | Kernel | implemented | -| FADT parsing, `\_S5` sleep types, PM registers | `acpid` | implemented | -| AML interpreter initialization and execution | `acpid` | implemented | -| EC access (byte-wide and widened via byte transactions) | `acpid` | implemented | -| AML mutex state tracking | `acpid` (`aml_physmem.rs`) | implemented (real tracked state, not placeholder) | -| FADT shutdown/reboot via PM1a/PM1b CNT | `acpid` | implemented | -| Keyboard controller fallback reboot | `acpid` | implemented | -| DMAR table parsing | `acpid` (module present) | **transitional** — module not wired; effectively owned by `iommu` daemon | -| IVRS ownership | `iommu` daemon | implemented | -| MCFG/PCI config space | `pcid` `/config` endpoint | implemented | -| DMI exposure and quirks | `acpid` via `/scheme/acpi/dmi` | implemented | -| Power methods (`\_PS0`/`\_PS3`/`\_PPC`) | `acpid` | implemented | -| Sleep state transitions (`\_Sx` beyond `\_S5`) | none | **known gap** | -| Sleep eventing | none | **known gap** | -| Shutdown event via `kstop` | Kernel + `acpid` + `redbear-sessiond` | implemented (QEMU-validated; bare-metal validation outstanding) | -| DMAR runtime ownership (Intel VT-d) | `iommu` daemon | **transitional** — not yet fully separated from `acpid` DMAR module | - -### Acceptance criteria - -- one canonical ownership statement exists 
— **partially met** (this table, plus `ACPI-FIXES.md`) -- one degraded-mode matrix exists — **met** (this pass) -- all high-level ACPI status claims use the same vocabulary — **partially met** (normalized in this pass) - -### Validation - -- doc review only, -- no code changes required for vocabulary and matrix, -- Wave 0 should be treated as ongoing; new evidence may require matrix updates +- partially complete ## Wave 1 — Boot-path hardening and parser strictness ### Goal -Remove catastrophic or silent failure behavior from the boot-critical ACPI path. +Remove catastrophic or silent failure behavior from boot-critical ACPI initialization. -### Main files +### Primary files - `recipes/core/base/source/drivers/acpid/src/main.rs` - `recipes/core/base/source/drivers/acpid/src/acpi.rs` - `recipes/core/base/source/drivers/acpid/src/scheme.rs` - `recipes/core/kernel/source/src/acpi/mod.rs` -- kernel ACPI submodules for RSDP/RSDT/XSDT/MADT/HPET/SPCR/GTDT +- kernel ACPI submodules as needed -### Status: Task 1.1 partially executed +### Dependencies -`acpid` main.rs now uses a typed `StartupError` enum covering: +- Wave 0 ownership and degraded-mode vocabulary in place -- `ReadRootTable` — failed to read `/scheme/kernel.acpi/rxsdt` -- `ParseRootTable` — failed to parse `[R|X]SDT` -- `UnexpectedRootTableSignature` — wrong root table signature -- `MalformedRootTableEntries` — malformed entry area -- `InitializeAcpi` — failed ACPI context init -- port I/O rights acquisition failure -- shutdown pipe open failure (`/scheme/kernel.acpi/kstop`) -- event queue creation failure -- scheme socket creation failure -- event queue subscription failure -- scheme registration failure +### Deliverables -Each failure path logs a human-readable error and calls `std::process::exit(1)`. No `panic!` -remains on these paths. Empty RSDT (no ACPI) causes a clean `exit(0)` after `daemon.ready()`. 
+- startup paths are typed and explicit, +- table rejection policy is documented per table class, +- parser observability is strong enough to reconstruct failures, +- degraded boot succeeds for all conditions classified as degradable. -Tasks 1.2 and 1.3 remain open. +### Specific tasks -### Tasks +1. Finish replacing panic-grade startup behavior in active firmware-origin paths. +2. Define table-specific reject / warn / degrade / fail rules. +3. Log accepted and rejected tables with enough evidence to debug failures. -#### Task 1.1 — Replace panic-grade startup failures in `acpid` — **partially done** +### Verification -Typed `StartupError` enum is implemented in `main.rs`. The following failure classes are now handled: - -- hard fail with typed error message and exit code 1, -- soft fail with degraded behavior (ACPI absent → clean exit 0), -- or early clean exit when `/scheme/kernel.acpi/rxsdt` is empty. - -#### Task 1.2 — Make table rejection policy explicit — **open** - -For kernel and `acpid` table use, define when a bad length/checksum/revision: - -- is logged and ignored, -- is logged and downgraded, -- or is fatal. - -This policy must be table-specific, not one global "warn and continue" convention. - -#### Task 1.3 — Improve parser observability — **open** - -Every accepted or rejected table should leave enough evidence to reconstruct why: - -- table signature, -- physical address if known, -- length/revision/checksum status, -- consumer that requested it, -- fallback path chosen. - -### Acceptance criteria - -- no `panic!/expect()` remains on firmware-origin or optional-service startup paths in `acpid` — **partially met** (Tasks 1.2 and 1.3 still open), -- malformed-table decisions are deterministic and documented — **open**, -- degraded boot still succeeds in all cases classified as degradable by Wave 0 — **open**. 
- -### Validation - -- negative tests for malformed checksums and table lengths, +- malformed checksum / truncated-length tests, - QEMU validation with intentionally damaged tables if feasible, -- one AMD and one Intel bounded hardware boot recheck, -- evidence captured in `local/docs/BAREMETAL-LOG.md` or a successor log. +- one bounded AMD hardware boot recheck, +- one bounded Intel hardware boot recheck, +- evidence captured in `local/docs/BAREMETAL-LOG.md` or its successor. + +### Exit criteria + +- no unjustified `panic!/expect()` remains on firmware-origin startup paths, +- malformed-table decisions are deterministic and documented, +- degraded boot behavior matches Wave 0 classification. + +### Current status + +- partially complete ## Wave 2 — AML, opregions, EC, and power-state correctness ### Goal -Close the biggest runtime-correctness gaps in the ACPI stack. +Close the biggest runtime-correctness gaps in the `acpid` layer. -### Main files +### Primary files - `recipes/core/base/source/drivers/acpid/src/acpi.rs` - `recipes/core/base/source/drivers/acpid/src/aml_physmem.rs` - `recipes/core/base/source/drivers/acpid/src/ec.rs` -### Status: Tasks 2.1, 2.2, and 2.5 partially executed +### Dependencies -#### Task 2.1 — Remove placeholder AML mutex behavior — **partially done** +- Wave 1 startup paths hardened enough that runtime work is not sitting on a fragile base -`AmlMutexState` in `aml_physmem.rs` implements real tracked state: +### Deliverables -- `AmlMutexState::create_handle()` generates unique handles via incrementing `next_handle` -- `AmlMutexState::states` is a `FxHashMap` tracking locked/unlocked state -- `lock_aml_mutexes()` wraps the state map with proper `Mutex` guard and poisoned-state recovery -- The `acquire()` method looks up the handle in the map, sets it to `true` on success, and returns - `AmlError::MutexAcquireTimeout` on timeout or unknown handle +- real AML synchronization semantics, +- explicit physmem / opregion failure behavior, +- 
deterministic AML init order, +- explicit sleep-state scope, +- honest EC behavior bounds. -This is no longer a placeholder implementation. Remaining work: timeout semantics documentation -and concurrent acquire/release stress testing. +### Specific tasks -#### Task 2.2 — Eliminate silent zero-on-failure physical reads — **partially done** +1. Document and stress AML mutex timeout semantics. +2. Remove silent correctness-critical physmem failure paths. +3. Finish `AmlSymbols` initialization contract; stop tying AML readiness to fragile PCI timing. +4. Decide whether sleep support is in-scope now or explicitly deferred. +5. If in-scope now, implement and validate the missing sleep-state pieces, including `SLP_TYPb`. -EC reads via `read_bytes` now propagate `AmlError::MutexAcquireTimeout` on EC timeout rather than -returning zero. Kernel-physmem reads still have some silent failure paths; this task is not fully -closed. - -#### Task 2.5 — Decide and validate EC width behavior — **partially done** - -`ec.rs` now implements `read_u16`, `read_u32`, `read_u64`, `write_u16`, `write_u32`, and `write_u64` -on `Ec` via byte-transaction sequences in `read_bytes`/`write_bytes`: - -- `ensure_access()` validates the access fits in a u8 addressable range -- `read_bytes` loops over individual byte reads with timeout per byte -- `write_bytes()` loops over individual byte writes with timeout per byte - -Wider accesses are emulated through byte transactions rather than being rejected. This is implemented -behavior, not a placeholder. Validation on real EC hardware remains outstanding. - -### Tasks still open - -#### Task 2.3 — Finish `AmlSymbols` initialization contract — **open** - -`AmlSymbols` initialization order is still tied to PCI FD registration timing. AML initialization -is not fully deterministic. The upstream TODO to "use these parsed tables for the rest of acpid" -remains. 
-
-#### Task 2.4 — Fix power-state completeness gaps — **open**
-
-`SLP_TYPb` handling remains unimplemented. Sleep state transitions beyond `\_S5` are not supported.
-Sleep eventing is not implemented. These are documented as known gaps.
-
-### Acceptance criteria
-
-- AML synchronization is no longer placeholder-driven — **partially met** (2.1 done; 2.3 open),
-- physmem failures do not silently fabricate correctness-critical values — **partially met** (EC done; kernel-physmem still open),
-- AML initialization order is reproducible and documented — **open** (Task 2.3),
-- sleep-state handling is explicit for both implemented and out-of-scope states — **open** (Task 2.4),
-- EC behavior is either implemented or honestly bounded — **met** (byte transactions for wider widths)
-
-### Validation
+### Verification

 - targeted AML method execution tests,
-- shutdown/reboot proof on QEMU and bounded real hardware,
-- EC timeout/error-path tests where possible,
-- concurrent ACPI scheme reads while AML methods run.
+- shutdown / reboot proof in QEMU and bounded hardware,
+- EC timeout and error-path tests,
+- concurrent ACPI scheme reads while AML methods run,
+- at least one EC-backed platform check if available.

-## Wave 3 — Ownership cleanup: reduce kernel ACPI scope and remove DMAR from `acpid`
+### Exit criteria
+
+- AML synchronization is no longer placeholder-grade,
+- physmem failures do not silently fabricate correctness-critical values,
+- AML initialization order is reproducible and documented,
+- sleep-state handling is either implemented or explicitly bounded as a known gap,
+- EC behavior is implemented or honestly constrained.
+
+### Current status
+
+- partially complete
+
+## Wave 3 — Ownership cleanup and kernel-surface reduction

 ### Goal

-Move from transitional ownership to architecture that is easier to maintain.
+Move from transitional ownership to an architecture that can survive long-term maintenance.
-### Main files
+### Primary files

 - `recipes/core/kernel/source/src/acpi/mod.rs`
 - kernel ACPI submodules as needed
@@ -385,203 +332,239 @@ Move from transitional ownership to architecture that is easier to maintain.
 - `recipes/core/base/source/drivers/acpid/src/scheme.rs`
 - `local/recipes/system/iommu/source/src/*`

-### Status: Tasks 3.1 and 3.2 partially executed
+### Dependencies

-#### Task 3.1 — Define the minimum kernel ACPI surface — **open**
+- Wave 1 and Wave 2 are at least partially complete

-The kernel still carries TODOs for kernel ACPI scope reduction. No staged migration contract has
-been written yet.
+### Deliverables

-#### Task 3.2 — Move DMAR to the correct owner — **partially done**
+- a minimum kernel ACPI contract,
+- explicit handoff paths for table discovery and topology,
+- DMAR no longer orphaned in `acpid`,
+- ownership wording that matches the code.

-The `acpi/dmar/mod.rs` module remains present in `acpid` source but is not imported or called from
-`main.rs` startup. The DMAR parsing code itself is not executed at daemon startup. However, the
-module has not been removed from the source tree, creating latent confusion about ownership.
+### Specific tasks

-The `iommu` daemon is responsible for IVRS/DMAR runtime handling. DMAR is not initialized by
-`acpid`. The exit path from `acpid` for DMAR is therefore effectively achieved, but the cleanup
-(task: remove the unused module or move it to the `iommu` crate) is not complete.
+1. Define the minimum kernel ACPI surface that must remain in early boot.
+2. Document the userspace handoff contract for topology and table consumers.
+3. Remove or relocate the orphaned DMAR carrier in `acpid`.
+4. Do not claim Intel DMAR runtime ownership complete unless a real owner exists and is validated.

-#### Task 3.3 — Ensure handoff paths are explicit — **open**
+### Verification

-Handoff paths for table discovery and CPU/topology are not yet documented as a staged migration
-contract.
+- before / after boot regressions,
+- Intel-specific validation for any DMAR ownership move,
+- AMD regression checks showing IVRS ownership remains isolated in `iommu`.

-### Acceptance criteria
+### Exit criteria

-- the minimal kernel ACPI contract is written down — **open**,
-- DMAR has a concrete exit path from `acpid` — **partially met** (not wired; module still present),
-- ownership reductions are staged and do not break current bring-up — **open**
+- the minimum kernel ACPI contract is written down,
+- DMAR has a concrete, non-ambiguous owner or is explicitly deferred,
+- ownership reductions do not regress current bring-up.

-### Validation
+### Current status

-- before/after boot regressions,
-- Intel-specific validation for DMAR path changes,
-- AMD regression checks proving IVRS ownership remains isolated in `iommu`.
+- partially complete

 ## Wave 4 — Consumer integration and eventing quality

 ### Goal

-Make ACPI consumers correct and low-friction, not just functional.
+Make ACPI consumers correct, observable, and low-friction.
-### Main files
+### Primary files

 - `local/recipes/system/redbear-sessiond/source/src/acpi_watcher.rs`
 - `recipes/core/base/source/drivers/acpid/src/scheme.rs`
-- DMI/quirk consumers under `redox-driver-sys` and reporting surfaces
+- DMI / quirk consumers in `redox-driver-sys`
+- reporting surfaces such as `redbear-info`

-### Status: Task 4.1 partially executed; sleep eventing still a gap
+### Dependencies

-#### Task 4.1 — Replace polling-based ACPI state consumption — **partially done**
+- Wave 2 runtime behavior is stable enough for downstream consumers to depend on it

-Shutdown eventing is now event-driven via `/scheme/kernel.acpi/kstop`:
+### Deliverables

-- `acpid` opens `kstop` at startup and subscribes to it via `RawEventQueue`
-- When the kernel triggers shutdown, `acpid` receives an event on the `kstop` file descriptor
-- `redbear-sessiond`'s `acpi_watcher.rs` opens `kstop` and reads one byte in a blocking
- `spawn_blocking` call, then emits D-Bus `PrepareForShutdown(true)` signal
+- event-driven core power-session behavior where feasible,
+- bounded DMI quirk authority,
+- operator-facing observability strong enough to diagnose behavior,
+- explicit treatment of unsupported sleep eventing if it remains deferred.

-Sleep eventing (`\_Sx` transitions) remains unsupported. There is no `/scheme/acpi/sleep` surface
-and no event-driven sleep contract. This is a known gap.
+### Specific tasks

-#### Task 4.2 — Bound DMI quirk authority — **open**
+1. Keep shutdown eventing on `kstop` as the canonical shutdown signal.
+2. Improve consumer-facing observability for ACPI state and failures.
+3. Define DMI quirk precedence and limits.
+4. If sleep eventing remains out-of-scope, document that explicitly and consistently.
-#### Task 4.3 — Improve operator-facing observability — **open**
+### Verification

-### Acceptance criteria
-
-- no periodic polling remains for core ACPI power/session transitions if eventing is feasible — **partially met** (shutdown done; sleep still polling/absent),
-- quirk precedence is documented — **open**,
-- consumer-visible behavior is diagnosable from logs and status outputs — **open**
-
-### Validation
-
-- repeated shutdown/sleep edge tests,
+- repeated shutdown edge tests,
+- sleep-edge tests if sleep work is in scope,
 - DMI quirk application checks on known systems,
 - race checks with multiple simultaneous consumers of `/scheme/acpi/*`.

-## Wave 5 — Validation closure and release gate
+### Exit criteria
+
+- no unnecessary polling remains for core ACPI transitions where eventing is feasible,
+- quirk precedence is documented,
+- consumer-visible behavior is diagnosable from logs and status outputs.
+
+### Current status
+
+- partially complete
+
+## Wave 5 — Validation closure and release gates

 ### Goal

-Convert the current implementation from bring-up evidence into release-grade trust.
+Turn the current ACPI stack from bring-up evidence into release-grade trust.

-### Validation matrix
+### Dependencies

-At minimum, require:
+- Waves 1 through 4 have produced stable behavior worth validating

-- QEMU/OVMF boot with ACPI active,
-- modern AMD hardware,
-- modern Intel hardware,
+### Required validation matrix
+
+At minimum:
+
+- QEMU / OVMF boot with ACPI active,
+- one modern AMD machine,
+- one modern Intel machine,
 - one platform that exercises EC-backed AML behavior,
 - malformed-table or degraded-mode evidence where feasible.

-### Tasks
+### Deliverables

-#### Task 5.1 — Publish a platform matrix
+- a bounded platform matrix,
+- negative-result capture,
+- explicit release gates for both boot-baseline and full ACPI claims,
+- docs that distinguish implemented from validated.
-For each validated platform, record:
+### Specific tasks

-- firmware mode,
-- key ACPI tables detected,
-- APIC mode,
-- whether shutdown/reboot worked,
-- whether DMI and power exposure worked,
-- whether any AML/EC failure was observed.
+1. Publish the platform matrix in `local/docs/BAREMETAL-LOG.md` or its successor.
+2. Record for each platform: firmware mode, key ACPI tables, APIC mode, shutdown / reboot,
+ DMI / power exposure, AML / EC failures, and notable degraded behavior.
+3. Preserve negative results such as unsupported AML opcodes or platform-specific regressions.
+4. Require evidence before any stronger ACPI completeness claim is made.

-#### Task 5.2 — Capture negative results
+### Verification

-Do not hide unsupported AML opcodes, partial EC behavior, or platform-specific regressions behind a
-generic "works on tested hardware" label.
+- repeated QEMU proof,
+- bounded repeated bare-metal proof on AMD and Intel,
+- one EC-heavy platform check,
+- cross-check docs so claims match recorded evidence.

-#### Task 5.3 — Define the ACPI release gate
+### Exit criteria

-Before calling ACPI complete for current Red Bear goals, require:
-
-- clean boot on the bounded matrix,
-- explicit degraded-mode behavior for known bad firmware cases,
-- documented ownership state,
-- and current docs that distinguish implemented vs validated.
-
-### Acceptance criteria
-
-- one bounded but honest validation matrix exists,
+- one bounded but honest platform matrix exists,
 - negative results are documented,
-- ACPI status claims are tied to explicit evidence rather than inference.
+- ACPI status claims are tied to explicit evidence,
+- release gates are defined and followed.
+
+### Current status
+
+- open
+
+## Release Gates
+
+### Gate A — Boot-Baseline ACPI Ready
+
+This is the strongest claim the repo can make before sleep and broader ownership cleanup are done.
+
+Require:
+
+- clean boot on bounded QEMU + AMD + Intel validation targets,
+- working MADT / APIC initialization on those targets,
+- shutdown / reboot proof where supported,
+- explicit degraded behavior for known firmware-bad cases,
+- current docs that distinguish implemented from validated.
+
+### Gate B — Full ACPI / Power-Management Ready
+
+Do **not** claim this until all of the following are true:
+
+- AML runtime behavior is stable across the bounded matrix,
+- sleep-state scope is implemented and validated or explicitly excluded from the release claim,
+- ownership boundaries are clean rather than merely transitional,
+- consumer integration is observable and race-bounded,
+- the platform matrix supports the stronger claim.

 ## Upstream vs Red Bear Work Split

-### Upstream-first work
+### Prefer upstream for

-These are generic ACPI correctness or architecture issues and should be solved upstream whenever
-possible, with temporary Red Bear patch carriers only if necessary:
+- generic `acpid` startup hardening,
+- AML mutex semantics,
+- `SLP_TYPb` completion,
+- EC error typing and generic width behavior,
+- reuse of parsed tables inside `acpid`,
+- DMAR leaving `acpid`,
+- kernel ACPI scope reduction TODOs,
+- generic parser quality in kernel ACPI modules.
-- `acpid` startup hardening
-- AML mutex semantics in `aml_physmem.rs`
-- `SLP_TYPb` completion
-- EC error typing and possibly EC access-width handling
-- using parsed tables for the rest of `acpid`
-- DMAR leaving `acpid`
-- kernel ACPI scope reduction TODOs
-- generic parser quality for kernel ACPI modules
+### Red Bear owns

-### Red Bear-owned work
-
-These remain Red Bear responsibilities even if upstream code improves:
-
-- honest status/phase documentation
-- bounded validation matrix and operator runbooks
-- `redbear-sessiond` event consumption quality
-- DMI quirk governance and integration policy
-- temporary durable patch carriers in `local/patches/*`
-- coordination between `acpid`, `iommu`, `pcid`, and downstream consumers
+- honest status language,
+- bounded validation matrices and runbooks,
+- `redbear-sessiond` shutdown-consumer quality,
+- DMI quirk governance and integration policy,
+- patch carriers in `local/patches/*`,
+- coordination across `acpid`, `iommu`, `pcid`, and downstream consumers.

 ## Sequencing Constraints

-1. **Wave 0 must come first** so later changes do not drift architecturally.
-2. **Wave 1 must come before Wave 2** so runtime correctness sits on a hardened startup path.
-3. **Wave 2 should come before Wave 4** because consumer contracts should depend on correct AML and
- power behavior.
-4. **Wave 3 should not start until Waves 1 and 2 are at least partially complete**; ownership moves
- are dangerous if the runtime behavior is still fragile.
-5. **Wave 5 closes the work**; it must not be used as a substitute for architecture.
+1. **Wave 0 first** — architecture and wording must stop drifting.
+2. **Wave 1 before Wave 2** — runtime correctness should not sit on fragile startup behavior.
+3. **Wave 2 before Wave 4** — consumer contracts must rely on correct AML / power behavior.
+4. **Wave 3 after Waves 1 and 2 are partially stable** — ownership moves are risky on unstable
+ behavior.
+5. **Wave 5 last** — validation closes work; it does not replace architecture.

 ## Main Risks

-- stricter parser/error handling may expose machines that currently boot only by luck,
-- AML/EC changes may uncover hidden ordering assumptions with PCI registration,
-- reducing kernel ownership too early may regress early platform bring-up,
-- moving DMAR out of `acpid` may create Intel-only regressions if the replacement contract is vague,
-- DMI quirks can become a crutch if they are allowed to override runtime facts indiscriminately.
+- stricter parser behavior may expose machines currently booting only by luck,
+- AML / EC changes may uncover hidden PCI-registration ordering assumptions,
+- reducing kernel scope too early may regress early bring-up,
+- careless DMAR cleanup may create Intel-only regressions,
+- DMI quirks can become a crutch if allowed to override runtime facts indiscriminately.
+
+## Non-Goals
+
+- claiming sleep support that does not exist,
+- calling DMAR ownership “complete” while the orphaned `acpid` module still exists,
+- treating one-machine success as subsystem-level proof,
+- using Wave 5 validation language to hide unfinished ownership work.

 ## Deliverable Order

-If work from this plan is executed, the recommended order is:
+Recommended order:

-1. documentation and degraded-mode contract,
+1. truth contract and doc normalization,
 2. startup hardening,
-3. AML/EC correctness,
+3. AML / EC / power-state correctness,
 4. ownership cleanup,
-5. consumer/eventing quality,
-6. validation closure.
+5. consumer / eventing quality,
+6. validation closure and release gate.
-## Definition of Done for the Current ACPI Plan
+## Definition of Done

-This plan can be considered substantially complete only when:
+This plan is substantially complete only when:

-- ownership boundaries are explicit — **partially met** (this doc; module-level cleanup still needed)
-- boot-critical panic/silent-fallback paths are removed or justified — **partially met** (Task 1.1 done; Tasks 1.2 and 1.3 open)
-- AML and EC behavior are no longer TODO-grade — **partially met** (mutex state and EC width done; AML init order and SLP_TYPb open)
-- DMAR and IVRS ownership are cleanly separated — **partially met** (DMAR not wired; module still present in acpid)
-- ACPI consumers are event-driven or explicitly bounded — **partially met** (shutdown done via kstop; sleep not implemented)
-- sleep state transitions and eventing are implemented or explicitly documented as known gaps — **open**
-- the repo contains platform evidence that supports its status claims — **open** (QEMU validated; bare-metal evidence still needed)
+- ownership boundaries are explicit and not contradicted by the code,
+- boot-critical panic / silent-fallback paths are removed or justified,
+- AML and EC behavior are no longer TODO-grade,
+- DMAR and IVRS ownership are no longer described ambiguously,
+- consumers are event-driven or explicitly bounded where eventing is not feasible,
+- sleep-state handling is implemented or explicitly excluded from the release claim,
+- the repo contains bounded platform evidence that supports every status claim.

-Current truthful status for Red Bear ACPI:
+## Current Truthful Status

-> materially complete for historical bring-up, but still under active robustness, ownership,
-> sleep-state, and validation improvement. Shutdown eventing is implemented via kstop. Sleep state
-> transitions are a known gap. EC width support is implemented via byte transactions. AML mutex
-> state is real-tracked, not placeholder. DMAR is not initialized by acpid. Bare-metal validation
-> for the full ACPI surface is still outstanding.
+> Red Bear ACPI is materially complete for historical boot bring-up, but still under active
+> robustness, ownership, sleep-state, and validation improvement. Shutdown eventing is implemented
+> via `kstop`. Sleep-state transitions remain a known gap. EC widened access is implemented via byte
+> transactions. AML mutex state is real-tracked, not placeholder. DMAR is not initialized by
+> `acpid`, and Intel DMAR runtime ownership is not yet cleanly closed. Bare-metal validation for the
+> full ACPI surface is still outstanding.
diff --git a/local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md b/local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md
index 19122baf..88c85892 100644
--- a/local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md
+++ b/local/docs/IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md
@@ -74,10 +74,20 @@ Red Bear OS already has a meaningful low-level controller and interrupt foundati
 MSI-X table mapping, and IRQ affinity control.
 - `linux-kpi` exposes Linux-style IRQ, PCI, memory, and synchronization APIs on top of `redox-driver-sys`.
+- `redox-driver-sys` now has direct host-runnable unit coverage for pure PCI/IRQ substrate rules,
+ including PCI scheme-entry parsing bounds, I/O BAR port conversion safety, and MSI-X BAR window
+ helper validation. This should be treated as **source + host-test evidence**, not as runtime
+ controller proof.
 - `redox-drm` already contains a shared interrupt abstraction with MSI-X-first and legacy-IRQ fallback paths for GPU drivers.
 - The AMD-Vi / Intel VT-d reference material and the in-tree `iommu` daemon establish a serious implementation direction for IOMMU and interrupt-remapping work.
+- the repo now has a bounded timer proof path via `redbear-phase-timer-check` and
+ `local/scripts/test-timer-qemu.sh --check`, which verifies the monotonic time scheme is present
+ and advances across two reads inside a guest runtime
+- the bounded low-level controller proof hooks can now be run together through
+ `local/scripts/test-lowlevel-controllers-qemu.sh`, which sequences xHCI, IOMMU, PS/2, and timer
+ runtime checks on the desktop validation image

 ### What is still weak

@@ -161,7 +171,7 @@ especially under real runtime scenarios.

 ### ACPI / APIC / x2APIC

-**State**: materially complete for current platform bring-up goals.
+**State**: materially complete for the historical boot-baseline bring-up goals, but not release-grade complete.

 **Important source note**: the checked-in MADT parser in `recipes/core/kernel/source/src/acpi/madt/mod.rs` visibly handles `LocalApic`, `IoApic`,
@@ -179,6 +189,7 @@ Open enhancement items:

 - Better controller/runtime characterization on diverse hardware.
 - Clearer documentation for what is kernel-complete versus only tested on limited platforms.
+- Keep sleep-state support beyond `\_S5`, DMAR ownership cleanup, and bounded validation visible as open ACPI work rather than implying subsystem closure.
 ### IOAPIC / interrupt source override routing

@@ -268,6 +279,9 @@ Current implementation improvement:
 - a guest-driven self-test path now exists (`/usr/bin/iommu --self-test-init` via `redbear-phase-iommu-check` / `test-iommu-qemu.sh`) and now proves first-use unit initialization and event-drain completion in QEMU
+- the self-test output now includes structured discovery diagnostics (`discovery_source`,
+ `kernel_acpi_status`, `ivrs_path`) so zero-unit failures can be distinguished from kernel-ACPI
+ fallback and missing-IVRS cases without changing the IOMMU MMIO path itself

 ### Legacy IRQ ownership and dispatch map

@@ -311,6 +325,9 @@ Open enhancement items:

 - keep validation language explicit about the PS/2 path versus the later generic input stack
 - add platform notes for systems that still rely on PS/2 keyboard/mouse delivery
+- the repo now has a bounded PS/2 runtime-proof path via `redbear-phase-ps2-check` and
+ `local/scripts/test-ps2-qemu.sh --check`, which proves serio node presence and a successful
+ handoff into the existing Phase 3 input-path checker inside a guest

 ### USB xHCI controller interrupt path

@@ -323,8 +340,8 @@ Concrete checked-in owner:
 Current behavior:

 - xHCI has MSI/MSI-X and legacy INTx detection logic in source
-- the hardwired polling override in `xhcid` has been removed, and the driver now uses the existing
- MSI-X / MSI / INTx selection logic again
+- the checked-in `xhcid` source now calls the existing `get_int_method` path again instead of
+ hardwiring polling, and it logs whether it selected MSI/MSI-X, legacy INTx, or polling
 - `local/scripts/test-xhci-irq-qemu.sh --check` now provides a repo-visible runtime proof path by booting a Red Bear image in QEMU and checking the xHCI interrupt-mode log output
 - `redox-driver-sys` now logs allocated MSI-X vectors so interrupt selection is more observable in
@@ -359,7 +376,7 @@ Open enhancement items:
- The repository already has serious implementation artifacts, not just speculative plans. - The low-level controller work is documented more deeply than many higher-level desktop areas. -- ACPI and early-platform work is significantly more mature than the rest of the low-level stack. +- ACPI and early-platform work are significantly more mature than the rest of the low-level stack, but that maturity is still best read as boot-baseline progress with bounded validation rather than subsystem-complete closure. ### Weak points @@ -537,7 +554,7 @@ runtime-evidence surface: - `local/scripts/test-xhci-irq-qemu.sh --check` — xHCI interrupt-mode proof from QEMU boot logs - `local/scripts/test-msix-qemu.sh` — live MSI-X proof via `virtio-net` - `local/scripts/test-iommu-qemu.sh --check` — AMD IOMMU device visibility plus guest boot reachability -- `local/scripts/test-usb-storage-qemu.sh` — USB mass-storage autospawn probe +- `local/scripts/test-usb-storage-qemu.sh` — USB mass-storage autospawn plus bounded sector-0 readback proof ## Bottom Line