Red Bear OS — Low-Level Infrastructure Reassessment & Updated Plan
Version: 1.0 (2026-05-21)
Supersedes: Fragmentary assessments in COMPREHENSIVE-SYSTEM-ASSESSMENT-AND-IMPROVEMENT-PLAN.md §2–§4 for ACPI/IRQ/PCI/driver topics
Canonical adjacent plans (remain authoritative for subsystem detail):
- ACPI-IMPROVEMENT-PLAN.md — ACPI waves W0–W7
- IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md — PCI/IRQ/MSI-X waves W1–W6
- BOOT-PROCESS-HARDWARE-DETECTION-PLAN.md — Boot detection waves W0–W6
- SMP-SCHEDULER-IMPROVEMENT-PLAN.md — SMP bottlenecks B1–B7
1. Executive Summary
This document is a code-grounded reassessment of four interdependent low-level subsystems: ACPI/acpid, IRQ/PCI, enumeration/driver binding, and driver infrastructure. It is based on direct source inspection (file paths and line numbers provided throughout), cross-referenced against existing plans.
Bottom-line verdict
| Subsystem | Verdict | Blocking Bare Metal? |
|---|---|---|
| ACPI boot | Boot-baseline complete, not release-grade | Partial — shutdown timing fragile |
| ACPI shutdown | S5 derivation works, timing-dependent on PCI | Yes — pre-PCI shutdown degrades weakly |
| ACPI thermal/fan | Discovery exists, no runtime backend | No — thermal safety gap |
| ACPI C-states | Discovery exists, no kernel cpuidle | Yes — root cause of heat |
| IRQ delivery | Architecturally strong, QEMU-proven only | Partial — no HW validation |
| MSI/MSI-X | Code complete, IOMMU validation stubbed | Yes — iommu_validate_msi_irq() returns true |
| PCI enumeration | Userspace-only (correct), pcid complete | No |
| Driver binding | Manual class-code matching, no ACPI _HID/_CID | Partial — limited device coverage |
| redox-driver-sys | Production quality, zero stubs | No |
| linux-kpi | Structurally complete for GPU+Wi-Fi | No |
| GPU drivers | Compile-only, synthetic EDID everywhere | Yes — no real display detection |
| Wi-Fi | Compile+host-test only | Yes — no HW validation |
| USB | xhcid only, no EHCI/UHCI/OHCI | Yes — legacy USB keyboards unreachable |
What changed since last assessment (2026-05-20)
- Critical stub discovered: iommu_validate_msi_irq() at kernel/src/scheme/irq.rs:231 unconditionally returns true — this was not flagged as a blocking item in the IRQ enhancement plan (all 6 waves marked "complete").
- Critical stub discovered: aml_physmem.rs:195 and :274 fabricate zero values on physical memory access failure — affects all AML runtime evaluation.
- Dual AML interpreter architecture identified as a maintenance risk — kernel acpi_ext crate and userspace acpi crate parse DSDT/SSDT independently.
- APIC timer disabled (local_apic.rs:81) — not flagged in any existing plan as a blocker.
- Synthetic EDID used in all GPU drivers — blocks real display detection on bare metal.
- 40 total TODOs in ACPI code (16 kernel + 24 userspace) — higher than previously documented.
2. ACPI / acpid Reassessment
2.1 Architecture
The ACPI subsystem has three operational levels:
Bootloader → KernelArgs.hwdesc_base (RSDP pointer)
│
▼
Kernel ACPI (src/acpi/ + src/scheme/acpi.rs + src/arch/x86_shared/sleep.rs)
├── RSDP→RSDT/XSDT→SDT enumeration (MADT, SRAT, SLIT, HPET)
├── Export via /scheme/kernel.acpi/{rxsdt, kstop, sleep}
└── Kernel-side AML interpreter (acpi_ext crate) for S3/S5 sleep
│
▼
Userspace acpid (drivers/acpid/src/)
├── Reads rxsdt, loads SDTs from physical memory
├── Userspace AML interpreter (acpi crate) — SEPARATE from kernel's
├── Exports /scheme/acpi/{dmi, tables, symbols, thermal, fan, cstates}
└── Shutdown via kstop pipe + PM1a/PM1b write
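The final step in the diagram, the PM1a/PM1b write, can be sketched as follows. In the ACPI specification the PM1 control register carries SLP_TYPx in bits 10–12 and SLP_EN in bit 13. The helper name pm1_sleep_value and the read-modify-write policy are illustrative assumptions, not acpid's actual code.

```rust
/// Bit position of the 3-bit SLP_TYPx field in the PM1 control register.
const SLP_TYP_SHIFT: u16 = 10;
/// SLP_EN: setting this bit triggers the transition to the selected state.
const SLP_EN: u16 = 1 << 13;

/// Hypothetical helper (not acpid's API): merge a SLP_TYP value taken from
/// the _S5 package into the current PM1 control value, keeping other bits.
/// Returns None if `slp_typ` does not fit in the 3-bit field.
pub fn pm1_sleep_value(current: u16, slp_typ: u8) -> Option<u16> {
    if slp_typ > 0b111 {
        return None;
    }
    // Clear any previous SLP_TYP bits before inserting the new value.
    let cleared = current & !(0b111 << SLP_TYP_SHIFT);
    Some(cleared | (u16::from(slp_typ) << SLP_TYP_SHIFT) | SLP_EN)
}
```

A SLP_TYP outside the 3-bit range is rejected rather than truncated, so a malformed _S5 package cannot silently select an unintended sleep type.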
2.2 What Is Working
| Component | File | Evidence |
|---|---|---|
| RSDP discovery + dual checksum | acpi/rsdp.rs | ACPI 1.0 + 2.0+ validation, 62 lines |
| MADT parsing (10 entry types) | acpi/madt/mod.rs | Types 0x0–0xA + aarch64 GICC/GICD, 340 lines |
| x2APIC support | acpi/madt/mod.rs | Types 0x9/0xA, P20–P22 patches |
| IOAPIC init from MADT | device/ioapic.rs | GSI resolution, source overrides, affinity, 502 lines |
| LAPIC/x2APIC | device/local_apic.rs | MSR + MMIO dual path, 312 lines |
| SRAT/SLIT NUMA | acpi/srat.rs, acpi/slit.rs | Affinity + distance matrix |
| HPET timer | acpi/hpet.rs | Init from ACPI tables |
| Kernel scheme export | scheme/acpi.rs | rxsdt, kstop, sleep — 398 lines |
| acpid SDT loading | acpid/src/acpi.rs:162–217 | Page-span handling, PhysmapGuard |
| acpid FADT parsing | acpid/src/acpi.rs:965–1122 | ACPI 2.0 extended fields |
| acpid EC handler | acpid/src/ec.rs | Full protocol (RD_EC/WR_EC/BE_EC/BD_EC/QR_EC), 317 lines |
| acpid S5 derivation | acpid/src/acpi.rs:754–813 | FADT + AML _S5, cached |
| acpid DMI | acpid/src/dmi.rs | SMBIOS 32/64-bit entry points, 350 lines |
| acpid thermal/fan/cstate discovery | thermal.rs, fan.rs, cstate.rs | AML-backed _TZ, _PR namespace |
| hwd ACPI backend | hwd/backend/acpi.rs | _CID/_HID device discovery, 119 lines |
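The "dual checksum" in the RSDP row can be illustrated with a small sketch. It follows the ACPI-defined layout (20-byte ACPI 1.0 structure, Length field at offset 20, full structure at least 36 bytes for revision 2 and later). The function name rsdp_is_valid is hypothetical and is not taken from acpi/rsdp.rs.

```rust
/// A valid ACPI checksum region sums to zero modulo 256.
fn sums_to_zero(bytes: &[u8]) -> bool {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b)) == 0
}

/// Hypothetical validator: checks the signature, the ACPI 1.0 checksum over
/// the first 20 bytes and, for revision >= 2, the extended checksum over
/// the number of bytes given by the Length field.
pub fn rsdp_is_valid(rsdp: &[u8]) -> bool {
    const V1_LEN: usize = 20;
    const V2_MIN_LEN: usize = 36;
    if rsdp.len() < V1_LEN || &rsdp[..8] != b"RSD PTR " {
        return false;
    }
    if !sums_to_zero(&rsdp[..V1_LEN]) {
        return false;
    }
    // The Revision byte sits at offset 15; 0 means ACPI 1.0.
    if rsdp[15] < 2 {
        return true;
    }
    if rsdp.len() < V2_MIN_LEN {
        return false;
    }
    let len = u32::from_le_bytes([rsdp[20], rsdp[21], rsdp[22], rsdp[23]]) as usize;
    len >= V2_MIN_LEN && len <= rsdp.len() && sums_to_zero(&rsdp[..len])
}
```

Checking the Length field against the buffer size before summing keeps a corrupt firmware value from driving an out-of-bounds read.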
2.3 Critical Stubs
| Location | Line | Issue | Severity |
|---|---|---|---|
| acpid/src/aml_physmem.rs | 195 | read_phys_or_fault() returns T::zero() on failure — fabricates data | 🔴 CRITICAL |
| acpid/src/aml_physmem.rs | 274 | map_physical_region() falls back to zero page on failure — writes lost | 🔴 CRITICAL |
| kernel/src/arch/x86_shared/sleep.rs | 257–276 | read_pci_u8/u16/u32 always return 0; write_pci_* are no-ops | 🔴 CRITICAL |
| kernel/src/arch/x86_shared/sleep.rs | 275 | nanos_since_boot() returns 0 — broken AML timing | 🟠 HIGH |
| kernel/src/arch/x86_shared/sleep.rs | 294–298 | acquire()/release() for AML mutexes are no-ops | 🟠 HIGH |
| acpid/src/acpi.rs | 545 | Dmar::init(&this) commented out — "TODO (hangs on real hardware)" | 🟠 HIGH |
| hwd/backend/legacy.rs | 13 | LegacyBackend::probe() is a TODO no-op | 🟠 HIGH |
| acpid/src/acpi.rs | 820–822 | set_global_s_state(state) returns Ok for any state != 5 | 🟡 MEDIUM |
2.4 Architectural Risks
- Dual AML interpreters: Kernel sleep.rs uses acpi_ext crate; userspace acpid uses acpi crate. They parse the same DSDT/SSDT independently with different handler implementations. Bug fixes in one do not affect the other.
- RSDP_ADDR contract: acpid AML init requires RSDP_ADDR environment variable (from hwd via KernelArgs.hwdesc_base). x86 has BIOS fallback; non-x86 paths are unresolved.
- S5 derivation timing: Depends on AML readiness which depends on PCI registration. Pre-PCI shutdown falls back gracefully but the degraded contract is weak.
- DMAR orphaned: 533 lines of Intel VT-d parsing code exist but are not wired into startup.
2.5 TODO Inventory
- Kernel ACPI: 16 TODOs (madt arch variants, hpet x86 assumption, spcr type support, scheme/acpi context switch, gtdt)
- Userspace acpid: 24 TODOs (acpi.rs: 10, dmar/: 9, main.rs: 3, scheme.rs: 1, aml_physmem.rs: 1)
- Total: 40 TODOs
2.6 Alignment with ACPI-IMPROVEMENT-PLAN.md
| Wave | Plan Status | Code Reality | Delta |
|---|---|---|---|
| W0 Contracts | ~80% | Truth statement accurate | — |
| W1 Startup hardening | ~60% | P19 patch removed panic-grade expects; remaining expect() in firmware-origin paths | Underdocumented |
| W2 AML ordering/shutdown | ~50% | S5 derivation improved (P24); explicit error types exist; timing still coupled to PCI | Underdocumented |
| W3 Honest power surface | Open | Battery/AC probing exists but not trustworthy; thermal/fan discovery real but no backend action | — |
| W4 Physmem/EC/fault handling | ~40% | Two critical stubs at lines 195, 274 not flagged in plan | New finding |
| W5 Ownership cleanup | Open | DMAR still orphaned; dual interpreters unresolved | — |
| W6 Consumer integration | ~60% | kstop→sessiond path works | — |
| W7 Validation closure | Open | No bare-metal validation matrix executed | — |
3. IRQ / PCI Reassessment
3.1 Architecture
PCI Device → MSI/MSI-X message (address 0xFEE0_0xxx + data)
│
▼
APIC (local or I/O) → Vector delivery to target CPU
│
▼
Kernel IDT → generic_irq handler (vec 32–255)
│
▼
scheme/irq.rs → irq_trigger(irq, token)
├── iommu_validate_msi_irq(irq) ← STUB: returns true unconditionally
├── increment COUNTS[irq]
├── walk HANDLES for matching fd
└── trigger EVENT_READ
│
▼
Userspace driver → IrqHandle::wait() returns with count
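The first step of this flow, the MSI message a device writes, can be sketched as below. On x86 the message address is 0xFEE0_0000 with the destination APIC ID in bits 12–19, and the low byte of the data word is the vector. The names MsiMessage and compose_msi are hypothetical; this is not the code in msi.rs, and it assumes fixed delivery mode with edge triggering.

```rust
/// Address/data pair a PCI function writes to raise an MSI (x86 format).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MsiMessage {
    pub address: u32,
    pub data: u32,
}

/// Hypothetical composer: fixed delivery mode, edge-triggered, physical
/// destination. Rejects destinations that do not fit the 8-bit field and
/// vectors below 32, which are reserved for CPU exceptions.
pub fn compose_msi(dest_apic_id: u32, vector: u8) -> Option<MsiMessage> {
    if dest_apic_id > 0xFF || vector < 32 {
        return None;
    }
    Some(MsiMessage {
        address: 0xFEE0_0000 | (dest_apic_id << 12),
        // Delivery mode bits 8-10 are zero (fixed), bit 15 zero (edge).
        data: u32::from(vector),
    })
}
```

The rejection of destination IDs above 255 is the same 8-bit limit that the irq_helpers.rs FIXME refers to: without interrupt remapping there is no room in the message for a wider ID.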
3.2 What Is Working
| Component | File | Evidence |
|---|---|---|
| IDT (256 entries) | arch/x86_shared/idt.rs | 224 generic vectors, legacy IRQ bindings, IPI handlers, 374 lines |
| 8259 PIC | arch/x86_shared/device/pic.rs | Master/slave init, mask, ack, ISR query, 98 lines |
| I/O APIC | arch/x86_shared/device/ioapic.rs | MADT-parsed, GSI resolution, affinity reprogramming, 502 lines |
| LAPIC/x2APIC | arch/x86_shared/device/local_apic.rs | MMIO + MSR dual path, IPI, EOI, ESR, 312 lines |
| IRQ dispatch | arch/x86_shared/interrupt/irq.rs | PIC/APIC switching, spurious accounting, 352 lines |
| IRQ scheme | scheme/irq.rs | Registration, delivery, affinity, per-CPU listing, 650 lines |
| MSI kernel code | arch/x86_shared/device/msi.rs | Message composition, validation, capability parsing, 183 lines |
| Vector allocator | arch/x86_shared/device/vector.rs | CAS bitmap for 224 vectors, 53 lines |
| redox-driver-sys IRQ | redox-driver-sys/src/irq.rs | MSI-X table mapping, vector allocation, affinity, 491 lines, zero TODOs |
| redox-driver-sys PCI | redox-driver-sys/src/pci.rs | Config space, BAR probing, MSI-X enable, 1446 lines, zero TODOs |
| pcid daemon | drivers/pcid/src/ | Enumeration, scheme:pci, driver spawn, ~1400 lines total |
| driver-manager | driver-manager/src/main.rs | PciBus + AcpiBus binding, boot timeline, 553 lines |
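To make the "CAS bitmap for 224 vectors" entry concrete, here is a minimal sketch of a lock-free bitmap allocator for vectors 32–255. It is not the contents of vector.rs: the type name VectorBitmap, the linear scan, and the use of std atomics (a kernel would use core::sync::atomic) are all assumptions for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const FIRST_VECTOR: usize = 32;
const VECTOR_COUNT: usize = 224;

/// Hypothetical allocator: one bit per vector, claimed with compare-exchange.
pub struct VectorBitmap {
    words: [AtomicU64; 4],
}

impl VectorBitmap {
    pub const fn new() -> Self {
        Self {
            words: [AtomicU64::new(0), AtomicU64::new(0), AtomicU64::new(0), AtomicU64::new(0)],
        }
    }

    /// Claim the lowest free vector, or None when all 224 are in use.
    pub fn alloc(&self) -> Option<u8> {
        for index in 0..VECTOR_COUNT {
            let word = &self.words[index / 64];
            let bit = 1u64 << (index % 64);
            let mut current = word.load(Ordering::Relaxed);
            // Retry while the bit is still clear; stop if another CPU took it.
            while current & bit == 0 {
                match word.compare_exchange_weak(current, current | bit, Ordering::AcqRel, Ordering::Relaxed) {
                    Ok(_) => return Some((FIRST_VECTOR + index) as u8),
                    Err(observed) => current = observed,
                }
            }
        }
        None
    }

    /// Release a vector; values below 32 are ignored.
    pub fn free(&self, vector: u8) {
        if let Some(index) = (vector as usize).checked_sub(FIRST_VECTOR) {
            self.words[index / 64].fetch_and(!(1u64 << (index % 64)), Ordering::Release);
        }
    }
}
```

The per-bit scan keeps the sketch short; a production allocator would more likely search whole words with trailing_ones() before attempting the exchange.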
3.3 Critical Stubs
| Location | Line | Issue | Severity |
|---|---|---|---|
| kernel/src/scheme/irq.rs | 231 | iommu_validate_msi_irq(_irq) -> bool { true } — zero IOMMU validation | 🔴 CRITICAL |
| kernel/src/arch/x86_shared/device/local_apic.rs | 81 | //self.setup_timer(); — APIC timer disabled | 🟠 HIGH |
| kernel/src/arch/x86_shared/interrupt/irq.rs | 307 | println!("Local apic timer interrupt"); — debug artifact | 🟡 MEDIUM |
| kernel/src/arch/x86_shared/device/ioapic.rs | 329–331 | .unwrap() on cpuid — panic risk | 🟡 MEDIUM |
| drivers/pcid/src/driver_interface/irq_helpers.rs | — | "FIXME for cpu_id >255 need IOMMU IRQ remapping" | 🟠 HIGH |
| drivers/pcid/src/driver_interface/irq_helpers.rs | — | "FIXME allow allocating multiple interrupt vectors" | 🟠 HIGH |
3.4 Patch-Backed Code
The following kernel code does not exist in upstream — it is entirely Red Bear patches:
- msi.rs (+183 lines) — added by P8-msi.patch (281 lines, 12 hunks)
- vector.rs (+53 lines) — added by P8-msi.patch
- IOAPIC affinity — P9-ioapic-irq-affinity.patch
- IRQ affinity wiring — P10-irq-affinity-wiring.patch
- x2APIC ICR fix — P20-x2apic-icr-mode-fix.patch
- x2APIC SMP fix — P21-x2apic-smp-fix.patch
- x2APIC MADT fallback — P22-x2apic-madt-fallback.patch
Risk: If upstream kernel rebases, these patches must be rebased. The MSI/MSI-X subsystem is entirely patch-dependent.
3.5 Alignment with IRQ Enhancement Plan
The plan reports all 6 waves as ✅ Complete. Code inspection confirms that the waves addressed panic hardening and code quality. However, 6 priority areas remain entirely open, and the plan does not flag the following:
- iommu_validate_msi_irq() stub (CRITICAL — not mentioned)
- APIC timer disabled (not mentioned)
- Single-vector-per-device limit (mentioned as FIXME but not prioritized)
4. Enumeration / Driver Binding Reassessment
4.1 Current Flow
pcid enumerates PCI bus → /scheme/pci/{segment}--{bus}--{device}.{function}/
│
▼
driver-manager (or pcid-spawner legacy) reads /scheme/pci/
│
▼
For each device: query config space (vendor, device, class, subclass)
│
▼
Match against driver config (PCI class/vendor/device ID lookup)
│
▼
Spawn driver daemon with PCID_CLIENT_CHANNEL env var
│
▼
Driver opens /scheme/pci/{addr}/config and /scheme/irq/{irq}
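The matching step in this flow can be sketched as a rule lookup over the fields read from config space. The types PciId and DriverRule, the function select_driver, and the "most constrained rule wins" tie-break are assumptions for illustration; the actual driver-manager config format is not reproduced here.

```rust
/// Identity fields read from PCI config space, as in the flow above.
#[derive(Debug, Clone, Copy)]
pub struct PciId {
    pub vendor: u16,
    pub device: u16,
    pub class: u8,
    pub subclass: u8,
}

/// Hypothetical driver rule; `None` means "match anything" for that field.
#[derive(Debug)]
pub struct DriverRule {
    pub driver: &'static str,
    pub vendor: Option<u16>,
    pub device: Option<u16>,
    pub class: Option<u8>,
    pub subclass: Option<u8>,
}

impl DriverRule {
    fn matches(&self, id: &PciId) -> bool {
        self.vendor.map_or(true, |v| v == id.vendor)
            && self.device.map_or(true, |d| d == id.device)
            && self.class.map_or(true, |c| c == id.class)
            && self.subclass.map_or(true, |s| s == id.subclass)
    }

    /// Number of constrained fields, used to prefer exact ID matches.
    fn specificity(&self) -> usize {
        [self.vendor.is_some(), self.device.is_some(), self.class.is_some(), self.subclass.is_some()]
            .iter()
            .filter(|set| **set)
            .count()
    }
}

/// Return the name of the most specific matching driver, if any.
pub fn select_driver(rules: &[DriverRule], id: &PciId) -> Option<&'static str> {
    rules
        .iter()
        .filter(|rule| rule.matches(id))
        .max_by_key(|rule| rule.specificity())
        .map(|rule| rule.driver)
}
```

A class-only rule (serial bus class 0x0C, USB subclass 0x03) acts as the generic fallback, while a vendor/device rule overrides it for a specific controller.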
4.2 Limitations
- No ACPI _HID/_CID matching: Non-PCI devices (ACPI-enumerated GPIO, I2C, etc.) are not bound through the driver-manager.
- No modalias generation: Drivers are matched by simple class-code or vendor/device ID — no automatic alias generation from PCI class/subclass/prog-if.
- LegacyBackend is a stub: hwd/backend/legacy.rs:13 — "TODO: handle driver spawning from legacy backend" — any non-ACPI, non-DTB platform gets no hardware discovery.
- Initfs transitional: hwd and acpid live on initfs boot path, not under stable rootfs service contract.
4.3 Alignment with Boot-Process-Hardware-Detection-Plan.md
| Wave | Plan Status | Code Reality |
|---|---|---|
| W0 Boot stage definitions | ✅ Done | Config-only |
| W1 ACPI bus in driver-manager | ✅ Done | AcpiBus exists |
| W2 Resource parser (_CRS, _PRT) | ✅ Done | Parsed |
| W2b ACPI device binding | ✅ Done | Wired |
| W2c GPIO/I2C configs | Partial | Runtime _CRS evaluation not started |
| W3 Service rewiring | ✅ Done | Stage targets wired |
| W4 Dead /etc/pcid.d/ removal | ✅ Done | Removed |
| W5 Deferred probing | ✅ Already had | Scheme-aware |
| W6 USB topology enumeration | Not started | Depends on xHCI IRQ stability |
5. Driver Infrastructure Reassessment
5.1 redox-driver-sys
Status: ✅ Production quality, zero stubs, zero TODOs
- Schemes: memory (physical mapping, cache type control), irq (registration, wait, affinity), pci (enumeration, config space, BARs, MSI-X)
- Quirks: 3-layer (compiled-in 11 entries + TOML runtime + DMI/SMBIOS 8 rules), 22 PCI flags, 21 USB flags
- MSI-X: Full MsixTable with validated x86 message programming, vector allocation, CPU round-robin
- DMA: DmaBuffer (phys-contiguous), IommuDmaAllocator (MAP/UNMAP protocol)
- Tests: 30+ unit tests in pci.rs
5.2 linux-kpi
Status: ✅ Structurally complete for GPU + Wi-Fi, 119 tests passing, zero stubs
- 17 Rust modules, 32 C headers
- Full implementations: pci (777 lines), net (809), wireless (1002), mac80211 (959), irq (228), firmware (277), drm_shim (374)
- No todo!()/unimplemented!() in any audited module
- C header coverage: pci.h, skbuff.h, interrupt.h, firmware.h, netdevice.h, ieee80211.h, nl80211.h, cfg80211.h, mac80211.h, drm*.h, atomic.h, spinlock.h, mutex.h, workqueue.h, timer.h, wait.h, list.h, slab.h, mm.h, io.h, types.h, errno.h, compiler.h, export.h, printk.h, module.h, refcount.h, jiffies.h, kernel.h, idr.h, bug.h
5.3 firmware-loader
Status: ✅ Production quality
- scheme:firmware daemon with SchemeSync impl
- MANIFEST generation (BLAKE3),
--probe,--request-nowait - Path traversal prevention, 64MB blob cap, cache with source signature validation
- AMD GPU: 17 firmware keys expected; Intel: per-generation DMC firmware
5.4 GPU Drivers
| Driver | Status | Key Gap |
|---|---|---|
| redox-drm (AMD) | 🟡 Compiles, 616 lines | synthetic_edid() fallback — no real DDC/I²C |
| redox-drm (Intel) | 🟡 Compiles, 693 lines | synthetic_edid() fallback — no real DDC/I²C |
| redox-drm (VirtIO) | 🟡 Compiles | synthetic_edid() fallback |
| amdgpu (C port) | 🟡 Compiles, ~1487 lines | Hardcoded 4 connector descriptors, no real HPD |
All three GPU drivers use synthetic_edid() at redox-drm/src/kms/connector.rs:35 — a hardcoded 128-byte EDID 1.4 block for 1920×1080@60Hz. This blocks real display detection on bare metal.
5.5 Wi-Fi
Status: 🟡 Compiles + host-tested, zero hardware validation
- redbear-iwlwifi: C transport layer (~2450 lines) + Rust daemon (~1550 lines)
- 8 host tests pass
- Commands time out without real firmware — by design
- No Intel Wi-Fi device ever exercised
5.6 USB
Status: 🟡 xhcid builds + QEMU proofs pass, bare-metal incomplete
- xhcid: Red Bear patched, QEMU IRQ delivery proven
- usbscsid: USB mass storage with inline quirks (214 storage quirks)
- usbhubd: Hub port management
- Gap: No EHCI, UHCI, or OHCI drivers — legacy USB keyboards on companion controllers are unreachable on bare metal
6. Cross-Cutting Critical Gaps (Updated Priority)
Gap 1 — IOMMU MSI Validation (CRITICAL)
File: kernel/src/scheme/irq.rs:231
fn iommu_validate_msi_irq(_irq: u8) -> bool {
    true
}
Every MSI/MSI-X interrupt bypasses IOMMU remapping validation. This is a security and correctness gap. The hook exists but has zero logic.
Root cause: IOMMU daemon (iommu) provides AMD-Vi runtime but no Intel VT-d. The validation function needs remapping table data from the IOMMU daemon, or validation must move to userspace via a scheme call.
Action: Implement real validation against IOMMU remapping tables, or explicitly document that MSI/MSI-X without IOMMU is only safe on trusted buses.
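One way to make the two options in the action explicit is a validator whose policy must be chosen deliberately, so that "accept everything" is a documented mode rather than a silent default. This is a sketch under assumptions: MsiPolicy, MsiValidator, and the per-IRQ allow table are hypothetical and stand in for whatever remapping data the iommu daemon would actually provide.

```rust
/// Hypothetical policy: how MSI delivery is validated on this system.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MsiPolicy {
    /// No IOMMU available: accept MSIs, relying on the bus being trusted.
    TrustedBusNoIommu,
    /// IOMMU present: accept only IRQs with a registered remapping entry.
    Remapped,
}

/// Hypothetical validator holding one "allowed" flag per IRQ number.
pub struct MsiValidator {
    policy: MsiPolicy,
    allowed: [bool; 256],
}

impl MsiValidator {
    pub fn new(policy: MsiPolicy) -> Self {
        Self { policy, allowed: [false; 256] }
    }

    /// Record that the IOMMU has a remapping entry for `irq`.
    pub fn allow(&mut self, irq: u8) {
        self.allowed[irq as usize] = true;
    }

    /// Fail closed under `Remapped`: unknown IRQs are rejected.
    pub fn validate(&self, irq: u8) -> bool {
        match self.policy {
            MsiPolicy::TrustedBusNoIommu => true,
            MsiPolicy::Remapped => self.allowed[irq as usize],
        }
    }
}
```

With this shape, the current behavior corresponds to TrustedBusNoIommu, and the Phase 2 gate ("no unconditional true returns") corresponds to running with Remapped whenever an IOMMU is present.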
Gap 2 — AML Physical Memory Stubs (CRITICAL)
Files: acpid/src/aml_physmem.rs:195, :274
- read_phys_or_fault() returns T::zero() on failure — fabricates data
- map_physical_region() falls back to zero page — silent data loss
Impact: Any AML method accessing a physical memory region that fails to map will see fabricated zeroes. This can cause:
- Incorrect battery/thermal readings
- Silent EC communication failures
- Wrong power state transitions
Action: Propagate Result<T> errors to AML evaluation callers instead of fabricating values.
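The action above can be sketched as a read helper that returns an error instead of a fabricated zero. Everything named here is hypothetical: PhysMapper stands in for acpid's real mapping layer, FakeRam exists only to exercise the code, and PhysmemError is an illustrative error type rather than the one in aml_physmem.rs.

```rust
/// Hypothetical error surfaced to AML evaluation instead of a zero value.
#[derive(Debug, PartialEq, Eq)]
pub enum PhysmemError {
    MapFailed { addr: usize, len: usize },
}

/// Stand-in for the physical-memory mapping layer (an assumption).
pub trait PhysMapper {
    fn map(&self, addr: usize, len: usize) -> Option<&[u8]>;
}

/// Read a little-endian u32, propagating mapping failure to the caller.
pub fn read_phys_u32(mapper: &impl PhysMapper, addr: usize) -> Result<u32, PhysmemError> {
    let bytes = mapper
        .map(addr, 4)
        .filter(|b| b.len() >= 4)
        .ok_or(PhysmemError::MapFailed { addr, len: 4 })?;
    Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Test double: a small window of "physical memory" starting at `base`.
pub struct FakeRam {
    pub base: usize,
    pub data: Vec<u8>,
}

impl PhysMapper for FakeRam {
    fn map(&self, addr: usize, len: usize) -> Option<&[u8]> {
        let start = addr.checked_sub(self.base)?;
        let end = start.checked_add(len)?;
        self.data.get(start..end)
    }
}
```

A caller evaluating a battery or thermal method can then distinguish "the register reads zero" from "the region could not be mapped", which is exactly the distinction the current stub erases.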
Gap 3 — Kernel Sleep Path PCI Stubs (CRITICAL)
File: kernel/src/arch/x86_shared/sleep.rs:257–276
- read_pci_u8/u16/u32 always return 0
- write_pci_* are no-ops
Impact: Any AML code using PCI config space access in the kernel S3/S5 sleep path gets fabricated values. This is only safe if the sleep path guarantees no PCI-dependent AML methods are evaluated.
Action: Either wire real PCI config space access in the kernel sleep path, or explicitly scope the kernel AML interpreter to exclude PCI-dependent methods.
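If the first option (real config space access) is chosen, the address arithmetic is small. The sketch below shows the legacy x86 configuration mechanism: a 32-bit value written to port 0xCF8 selects bus, device, function, and dword-aligned register, and the data is then read from port 0xCFC. The function names are hypothetical, the port I/O itself is omitted, and whether port access is acceptable in the kernel sleep path is an open design question, not something this sketch settles.

```rust
/// Build the CONFIG_ADDRESS value for the legacy port-based mechanism.
/// Returns None for device numbers above 31 or function numbers above 7.
pub fn config_address(bus: u8, device: u8, function: u8, offset: u8) -> Option<u32> {
    if device > 31 || function > 7 {
        return None;
    }
    Some(
        0x8000_0000 // enable bit
            | (u32::from(bus) << 16)
            | (u32::from(device) << 11)
            | (u32::from(function) << 8)
            | (u32::from(offset) & 0xFC), // register number, dword aligned
    )
}

/// Extract the byte at `offset` from the dword read back from the data port.
pub fn extract_u8(dword: u32, offset: u8) -> u8 {
    (dword >> ((offset & 3) * 8)) as u8
}
```

A read_pci_u8 built on these helpers would write the address, read the dword, and return extract_u8(dword, offset) instead of a constant 0.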
Gap 4 — APIC Timer Disabled (HIGH)
File: kernel/src/arch/x86_shared/device/local_apic.rs:81
- setup_timer() commented out
- System uses PIT fallback for all timer interrupts
Impact: No per-CPU timer interrupts (all CPUs share PIT on BSP), no TSC deadline mode for modern CPUs, potential timer skew on SMP.
Action: Re-enable APIC timer with calibration against PIT or TSC. Required for per-CPU timer distribution.
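The calibration step amounts to letting the LAPIC timer count down during a window timed by a known reference (PIT or TSC) and dividing. A minimal sketch of that arithmetic follows; the function name and the choice to return None on implausible results (so the caller can keep the PIT fallback) are assumptions.

```rust
/// Compute LAPIC timer ticks per millisecond from one calibration window.
///
/// `initial_count` is the value loaded into the timer, `remaining` is the
/// current-count register after the window, and `window_us` is the window
/// length in microseconds as measured by the reference clock.
pub fn apic_ticks_per_ms(initial_count: u32, remaining: u32, window_us: u32) -> Option<u32> {
    if window_us == 0 || remaining > initial_count {
        return None;
    }
    // The timer counts down, so elapsed ticks = initial - remaining.
    let elapsed = u64::from(initial_count - remaining);
    let per_ms = elapsed * 1000 / u64::from(window_us);
    if per_ms == 0 || per_ms > u64::from(u32::MAX) {
        None // implausible result: caller should stay on the PIT
    } else {
        Some(per_ms as u32)
    }
}
```

Returning None instead of a guessed rate matches the mitigation listed for risk R3: detect calibration failure and degrade to the PIT path.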
Gap 5 — Synthetic EDID in All GPU Drivers (HIGH)
File: redox-drm/src/kms/connector.rs:35
- All three drivers (AMD, Intel, VirtIO) use hardcoded EDID
- No real DDC/I²C display detection
Impact: Display will not work on bare metal with non-1080p panels, multi-monitor setups, or displays with non-standard timings.
Action: Implement I²C-over-DDC EDID retrieval in redox-drm, or at minimum implement a real connector detection path that queries HPD + DDC before falling back to synthetic.
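The "real first, synthetic as fallback" ordering can be sketched independently of the I²C transport. A base EDID block is 128 bytes, starts with the fixed header 00 FF FF FF FF FF FF 00, and its bytes sum to zero modulo 256. The names below (edid_block_is_valid, EdidSource, choose_edid) are hypothetical and do not describe the current connector.rs code.

```rust
const EDID_HEADER: [u8; 8] = [0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00];

/// Check length, fixed header, and checksum of a base EDID block.
pub fn edid_block_is_valid(block: &[u8]) -> bool {
    block.len() == 128
        && block[..8] == EDID_HEADER
        && block.iter().fold(0u8, |acc, b| acc.wrapping_add(*b)) == 0
}

/// Where the EDID handed to the KMS layer came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdidSource {
    Ddc,
    Synthetic,
}

/// Prefer a valid block read over DDC; otherwise fall back to the
/// synthetic block and report that the fallback was used.
pub fn choose_edid<'a>(ddc: Option<&'a [u8]>, synthetic: &'a [u8]) -> (EdidSource, &'a [u8]) {
    match ddc {
        Some(block) if edid_block_is_valid(block) => (EdidSource::Ddc, block),
        _ => (EdidSource::Synthetic, synthetic),
    }
}
```

Returning the source alongside the bytes lets the driver log when it is running on the synthetic block, which makes the Gate D condition ("fallback, not primary") observable in a boot log.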
Gap 6 — Dual AML Interpreters (HIGH)
Files: kernel/src/arch/x86_shared/sleep.rs (acpi_ext crate) + acpid/src/acpi.rs (acpi crate)
- Two independent parsers for the same DSDT/SSDT
- Different handler implementations (kernel has PCI stubs, userspace has physmem stubs)
- Bug fixes in one do not affect the other
Impact: Maintenance risk, correctness divergence, two surfaces for AML security issues.
Action: Converge on a single canonical interpreter. Recommendation: userspace (acpid) since all drivers are userspace per project model. Kernel sleep path should delegate to userspace or use a shared, read-only AML namespace.
Gap 7 — No EHCI/UHCI/OHCI Drivers (HIGH)
Impact: Legacy USB keyboards on companion controller paths unreachable on bare metal. Only xHCI-native USB devices work.
Action: Implement EHCI driver (highest priority — covers most USB 2.0 controllers with xHCI companion). UHCI/OHCI are lower priority (very old hardware).
Gap 8 — No C-State Kernel Backend (HIGH)
Impact: CPUs run at full frequency constantly on bare metal. Thermal throttling only.
Action: Implement cpuidle/cpufreq kernel backend using MWAIT or HLT. Discovery exists in acpid (cstate.rs) but kernel has no idle driver.
Gap 9 — DMAR Orphaned (MEDIUM)
File: acpid/src/acpi.rs:545
- 533 lines of Intel VT-d parsing code
- Dmar::init() commented out — "hangs on real hardware"
Action: Either fix the hang and assign a runtime owner (iommu daemon), or remove the orphaned code until ready.
Gap 10 — >256 CPU MSI Remapping (MEDIUM)
File: drivers/pcid/src/driver_interface/irq_helpers.rs
- 8-bit APIC destination field limits MSI target selection
- IOMMU interrupt remapping required for >256 CPUs
Action: Gated on IOMMU maturity (Gap 1).
7. Updated Execution Plan
Phase 1: Critical Stub Removal (2–3 weeks)
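Goal: Remove the CRITICAL-severity AML physmem and kernel sleep-path stubs before any hardware validation. The remaining CRITICAL stub, iommu_validate_msi_irq(), is handled in Phase 2.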
| # | Task | File | Effort | Owner |
|---|---|---|---|---|
| 1.1 | Fix read_phys_or_fault() zero-return | acpid/src/aml_physmem.rs:195 | 2 days | — |
| 1.2 | Fix map_physical_region() zero-page fallback | acpid/src/aml_physmem.rs:274 | 2 days | — |
| 1.3 | Fix kernel sleep path PCI read stubs | kernel/src/arch/x86_shared/sleep.rs:257–276 | 3 days | — |
| 1.4 | Document kernel PCI stub scope | sleep.rs | 1 day | — |
| 1.5 | Remove println! debug artifact | kernel/src/arch/x86_shared/interrupt/irq.rs:307 | 1 hour | — |
Gate: CRITICAL stubs covered by tasks 1.1–1.4 removed or explicitly scoped out + cargo check clean on affected modules.
Phase 2: IOMMU + MSI Validation (3–4 weeks)
Goal: Make MSI/MSI-X delivery trustworthy.
| # | Task | File | Effort | Owner |
|---|---|---|---|---|
| 2.1 | Implement iommu_validate_msi_irq() real logic | kernel/src/scheme/irq.rs:231 | 1 week | — |
| 2.2 | Wire IOMMU remapping table read into kernel | iommu daemon ↔ scheme/irq | 1 week | — |
| 2.3 | QEMU validation: MSI-X with IOMMU enabled | test-msix-qemu.sh | 2 days | — |
| 2.4 | Fix or remove orphaned DMAR code | acpid/src/acpi.rs:545 | 2 days | — |
Gate: test-msix-qemu.sh passes with IOMMU enabled + no iommu_validate_msi_irq() stub.
Phase 3: Timer + CPU Power (2–3 weeks)
Goal: Enable per-CPU timers and basic CPU idle.
| # | Task | File | Effort | Owner |
|---|---|---|---|---|
| 3.1 | Re-enable APIC timer with calibration | kernel/src/arch/x86_shared/device/local_apic.rs:81 | 3 days | — |
| 3.2 | Implement kernel cpuidle backend (MWAIT/HLT) | New file: kernel/src/arch/x86_shared/cpuidle.rs | 1 week | — |
| 3.3 | Wire acpid C-state discovery to kernel idle | acpid/src/cstate.rs → kernel | 3 days | — |
| 3.4 | QEMU validation: timer + idle | test-timer-qemu.sh | 2 days | — |
Gate: test-timer-qemu.sh passes with APIC timer + CPU idle active.
Phase 4: Display Detection (4–6 weeks)
Goal: Replace synthetic EDID with real display detection.
| # | Task | File | Effort | Owner |
|---|---|---|---|---|
| 4.1 | Implement I²C-over-DDC EDID retrieval | redox-drm/src/kms/ddc.rs (new) | 2 weeks | — |
| 4.2 | Wire HPD interrupt to connector detection | redox-drm/src/drivers/amd/mod.rs, intel/mod.rs | 1 week | — |
| 4.3 | Replace synthetic_edid() with real → fallback | redox-drm/src/kms/connector.rs:35 | 3 days | — |
| 4.4 | QEMU validation: EDID readback | test-drm-display-runtime.sh | 2 days | — |
| 4.5 | Bare-metal validation: AMD GPU display | test-amd-gpu.sh | 1 week | — |
| 4.6 | Bare-metal validation: Intel GPU display | test-intel-gpu.sh | 1 week | — |
Gate: Real EDID retrieved from at least one display on bare metal (AMD or Intel).
Phase 5: USB Legacy Controllers (3–4 weeks)
Goal: Enable USB keyboard on non-xHCI paths.
| # | Task | File | Effort | Owner |
|---|---|---|---|---|
| 5.1 | Implement EHCI host controller driver | local/recipes/drivers/ehcid/ (new) | 2 weeks | — |
| 5.2 | Wire EHCI into driver-manager PCI binding | driver-manager/src/main.rs | 3 days | — |
| 5.3 | QEMU validation: EHCI keyboard | test-usb-qemu.sh | 2 days | — |
| 5.4 | UHCI/OHCI assessment | — | 1 week | — |
Gate: USB keyboard works via EHCI in QEMU.
Phase 6: AML Convergence (3–4 weeks)
Goal: Resolve dual AML interpreter risk.
| # | Task | File | Effort | Owner |
|---|---|---|---|---|
| 6.1 | Evaluate kernel sleep.rs → userspace delegation | kernel/src/arch/x86_shared/sleep.rs | 1 week | — |
| 6.2 | Implement kernel→userspace S3/S5 sleep RPC | scheme/kernel.acpi/sleep → acpid | 1 week | — |
| 6.3 | Remove kernel acpi_ext crate if delegated | kernel/src/arch/x86_shared/sleep.rs | 3 days | — |
| 6.4 | QEMU validation: sleep/wake cycle | test-sleep-qemu.sh | 2 days | — |
Gate: S5 shutdown works with single AML interpreter (userspace only).
Phase 7: Hardware Validation Matrix (4–6 weeks, parallel with Phases 4–6)
Goal: Evidence-based support claims.
| # | Task | Hardware | Effort |
|---|---|---|---|
| 7.1 | Class A1 validation (AMD desktop + discrete GPU) | Ryzen 5000/7000 + AMD GPU | 1 week |
| 7.2 | Class A2 validation (Intel desktop + iGPU) | Core 12th–14th Gen | 1 week |
| 7.3 | Class A3 validation (AMD laptop) | Ryzen Mobile | 1 week |
| 7.4 | Class A4 validation (Intel laptop) | Core Mobile | 1 week |
| 7.5 | Regression test suite on all 4 classes | All | 2 weeks |
Gate: All 4 hardware classes pass boot, shutdown, USB keyboard, and display detection.
8. Timeline Synthesis
Week 1–3: Phase 1 — Critical stub removal
Week 4–7: Phase 2 — IOMMU + MSI validation
Week 7–9: Phase 3 — Timer + CPU power (parallel with Phase 2 week 7)
Week 10–15: Phase 4 — Display detection (parallel with Phase 5)
Week 10–13: Phase 5 — USB legacy controllers (parallel with Phase 4)
Week 14–17: Phase 6 — AML convergence
Week 14–19: Phase 7 — Hardware validation matrix (parallel with Phase 6)
Total: 19 weeks (≈4.5 months) with 2 developers
What the existing plans said vs this plan
| Plan | Claimed Timeline | Reality |
|---|---|---|
| COMPREHENSIVE P1 (bare-metal hardening) | 6–8 weeks | Understated — no critical stub removal phase |
| COMPREHENSIVE P2 (USB) | 4–6 weeks | Realistic for EHCI only |
| COMPREHENSIVE P3 (IRQ/IOMMU) | 4–6 weeks | Realistic if focused on Gap 1 only |
| IRQ plan Waves 1–6 | "Complete" | Code quality complete, validation not started |
| ACPI plan Waves 0–7 | W0–W4 partial, W5–W7 open | Accurate, but two critical stubs not flagged |
| SMP plan bottlenecks | 11–18 days | Realistic for B1–B2 only |
Dependencies
Phase 1 (stub removal)
│
├── required by ──► Phase 2 (IOMMU validation)
│
├── required by ──► Phase 3 (timer + idle)
│
└── required by ──► Phase 4 (display detection)
Phase 2 (IOMMU)
└── required by ──► Phase 7 (hardware validation — safe MSI)
Phase 3 (timer + idle)
└── required by ──► Phase 7 (hardware validation — no overheating)
Phase 4 (display)
└── required by ──► Phase 7 (hardware validation — working console)
Phase 5 (USB EHCI)
└── required by ──► Phase 7 (hardware validation — keyboard input)
Phase 6 (AML convergence)
└── not blocking ──► Phase 7 (can validate with dual interpreters)
9. Risk Register
| # | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| R1 | aml_physmem stub fix reveals deeper AML memory access issues | Medium | High | Fix with comprehensive error propagation; add fallback to kernel scheme for problematic regions |
| R2 | IOMMU validation implementation requires kernel ABI change | Medium | High | Prototype in userspace first via scheme:iommu call; only promote to kernel if performance requires it |
| R3 | APIC timer calibration fails on specific CPU models | Medium | Medium | Keep PIT fallback path; detect calibration failure and degrade gracefully |
| R4 | DDC/I²C implementation requires GPIO/I2C subsystem not yet built | High | High | Scope Phase 4 to "query EDID via ACPI _DDC method first, then direct I²C"; fallback to synthetic still acceptable for initial bring-up |
| R5 | EHCI driver requires IRQ/MSI-X fixes first | Medium | Medium | Phase 5 starts after Phase 2 gate; use legacy IRQ for EHCI if MSI-X not ready |
| R6 | AML convergence breaks S3 sleep path | Medium | High | Keep kernel sleep.rs as fallback during transition; remove only after S3 validated via userspace path |
| R7 | No bare-metal hardware available for validation | Medium | Critical | Prioritize QEMU proofs for all phases; document "QEMU-validated" vs "bare-metal-validated" per subsystem |
10. Verification Gates
Gate A: Boot-Baseline Ready (end of Phase 1)
- aml_physmem.rs:195 returns Result<T> instead of T::zero()
- aml_physmem.rs:274 propagates mapping errors instead of zero-page fallback
- sleep.rs:257–276 either wired to real PCI or explicitly scoped out
- cargo check clean on acpid, kernel, redox-drm
- repo validate-patches kernel passes
- repo validate-patches base passes
Gate B: IRQ/IOMMU Trustworthy (end of Phase 2)
- iommu_validate_msi_irq() performs real validation
- test-msix-qemu.sh passes with IOMMU enabled
- test-iommu-qemu.sh passes
- No unconditional true returns in IRQ validation path
Gate C: Timer + Power (end of Phase 3)
- APIC timer fires and calibrates correctly in QEMU
- CPU idle backend enters C1/C2 via MWAIT or HLT
- test-timer-qemu.sh passes
- No PIT-only fallback in boot log
Gate D: Display Detection (end of Phase 4)
- synthetic_edid() is fallback, not primary
- Real EDID retrieved from at least one display in QEMU
- test-drm-display-runtime.sh passes
Gate E: USB Legacy (end of Phase 5)
- EHCI driver enumerates devices in QEMU
- USB keyboard functional via EHCI in QEMU
- test-usb-qemu.sh passes
Gate F: Single AML Interpreter (end of Phase 6)
- S5 shutdown works with userspace AML only
- Kernel acpi_ext crate removed or explicitly deprecated
- test-sleep-qemu.sh passes (S3 + S5)
Gate G: Hardware Validation (end of Phase 7)
- Class A1 (AMD desktop) boots, shuts down, displays, accepts USB keyboard
- Class A2 (Intel desktop) boots, shuts down, displays, accepts USB keyboard
- Class A3 (AMD laptop) boots, shuts down, displays, accepts USB keyboard
- Class A4 (Intel laptop) boots, shuts down, displays, accepts USB keyboard
- Validation artifacts committed to local/docs/HARDWARE-VALIDATION-MATRIX.md
11. Appendix: Key File Reference
ACPI
- recipes/core/kernel/source/src/acpi/mod.rs — Kernel ACPI orchestrator
- recipes/core/kernel/source/src/acpi/rsdp.rs — RSDP discovery
- recipes/core/kernel/source/src/acpi/madt/mod.rs — MADT parser
- recipes/core/kernel/source/src/scheme/acpi.rs — Kernel ACPI scheme
- recipes/core/kernel/source/src/arch/x86_shared/sleep.rs — Kernel AML interpreter for sleep
- recipes/core/kernel/source/src/arch/x86_shared/stop.rs — Shutdown orchestrator
- recipes/core/base/source/drivers/acpid/src/main.rs — acpid daemon entry
- recipes/core/base/source/drivers/acpid/src/acpi.rs — Core ACPI context
- recipes/core/base/source/drivers/acpid/src/aml_physmem.rs — AML physmem handler (stubs at :195, :274)
- recipes/core/base/source/drivers/acpid/src/ec.rs — Embedded Controller handler
- recipes/core/base/source/drivers/acpid/src/thermal.rs — Thermal zone discovery
- recipes/core/base/source/drivers/acpid/src/fan.rs — Fan device discovery
- recipes/core/base/source/drivers/acpid/src/cstate.rs — C-state discovery
- recipes/core/base/source/drivers/acpid/src/dmi.rs — SMBIOS DMI parser
- recipes/core/base/source/drivers/hwd/src/backend/acpi.rs — hwd ACPI backend
- recipes/core/base/source/drivers/hwd/src/backend/legacy.rs — LegacyBackend stub (:13)
IRQ / PCI
- recipes/core/kernel/source/src/scheme/irq.rs — IRQ scheme (stub at :231)
- recipes/core/kernel/source/src/arch/x86_shared/interrupt/irq.rs — IRQ dispatch
- recipes/core/kernel/source/src/arch/x86_shared/device/ioapic.rs — I/O APIC
- recipes/core/kernel/source/src/arch/x86_shared/device/local_apic.rs — LAPIC (timer disabled at :81)
- recipes/core/kernel/source/src/arch/x86_shared/device/msi.rs — MSI code (patch-based)
- recipes/core/kernel/source/src/arch/x86_shared/device/vector.rs — Vector allocator (patch-based)
- recipes/core/kernel/source/src/arch/x86_shared/device/pic.rs — 8259 PIC
- recipes/core/kernel/source/src/arch/x86_shared/idt.rs — IDT setup
- local/recipes/drivers/redox-driver-sys/source/src/irq.rs — Userspace IRQ handling
- local/recipes/drivers/redox-driver-sys/source/src/pci.rs — Userspace PCI abstraction
- recipes/core/base/source/drivers/pcid/src/main.rs — pcid daemon
- recipes/core/base/source/drivers/pcid/src/scheme.rs — PciScheme
- recipes/core/base/source/drivers/pcid/src/driver_interface/irq_helpers.rs — IRQ helper FIXMEs
- local/recipes/system/driver-manager/source/src/main.rs — Driver manager
Driver Infrastructure
- local/recipes/drivers/redox-driver-sys/source/src/lib.rs — Core library
- local/recipes/drivers/redox-driver-sys/source/src/quirks/mod.rs — Quirks API
- local/recipes/drivers/linux-kpi/source/src/lib.rs — linux-kpi crate
- local/recipes/drivers/linux-kpi/source/src/rust_impl/pci.rs — PCI KPI (777 lines)
- local/recipes/drivers/linux-kpi/source/src/rust_impl/drm_shim.rs — DRM GEM shim
- local/recipes/drivers/linux-kpi/source/src/rust_impl/mac80211.rs — mac80211 KPI (959 lines)
- local/recipes/drivers/linux-kpi/source/src/rust_impl/wireless.rs — cfg80211 KPI (1002 lines)
- local/recipes/system/firmware-loader/source/src/main.rs — firmware-loader daemon
- local/recipes/gpu/redox-drm/source/src/main.rs — DRM daemon
- local/recipes/gpu/redox-drm/source/src/drivers/amd/mod.rs — AMD GPU driver
- local/recipes/gpu/redox-drm/source/src/drivers/intel/mod.rs — Intel GPU driver
- local/recipes/gpu/redox-drm/source/src/kms/connector.rs — Connector + synthetic EDID (:35)
- local/recipes/gpu/amdgpu/source/amdgpu_redox_main.c — Bounded AMD display C port
- local/recipes/gpu/amdgpu/source/redox_glue.h — Linux→Redox C glue
- local/recipes/gpu/amdgpu/source/redox_stubs.c — Kernel emulation stubs
Patches
- local/patches/kernel/redbear-consolidated.patch — Consolidated mega-patch
- local/patches/kernel/P8-msi.patch — MSI + vector allocator
- local/patches/kernel/P9-ioapic-irq-affinity.patch — IRQ affinity
- local/patches/kernel/P10-irq-affinity-wiring.patch — Affinity wiring
- local/patches/kernel/P20-x2apic-icr-mode-fix.patch — x2APIC ICR
- local/patches/kernel/P21-x2apic-smp-fix.patch — x2APIC SMP
- local/patches/kernel/P22-x2apic-madt-fallback.patch — x2APIC MADT fallback
- local/patches/kernel/P24-cstate-mwait-idle.patch — C-state MWAIT
- local/patches/kernel/P25-cpuidle-deep-cstates.patch — Deep C-states
- local/patches/base/P19-acpid-startup-hardening.patch — acpid startup
- local/patches/base/P24-acpi-s5-derivation-shutdown-semantics.patch — S5 derivation
- local/patches/base/P44-acpid-thermal-zones.patch — Thermal zones
- local/patches/base/P48-acpid-fan-support.patch — Fan support
- local/patches/base/P52-acpid-cstates.patch — C-state discovery
12. Document Authority
This document is a cross-cutting reassessment that references but does not replace the canonical subsystem plans:
- For ACPI wave-level execution detail, see ACPI-IMPROVEMENT-PLAN.md
- For IRQ/PCI wave-level execution detail, see IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md
- For boot detection wave detail, see BOOT-PROCESS-HARDWARE-DETECTION-PLAN.md
- For SMP bottleneck detail, see SMP-SCHEDULER-IMPROVEMENT-PLAN.md
- For desktop path blockers, see CONSOLE-TO-KDE-DESKTOP-PLAN.md
When this document conflicts with a canonical subsystem plan, the canonical plan wins on subsystem-specific details, and this document wins on cross-cutting prioritization and inter-subsystem dependencies.
This document should be updated after each phase gate is reached, or when new critical stubs are discovered.