LOWLEVEL plan v1.1: comprehensive Linux 7.1 cross-reference audit
Cross-referenced every stub/gap claim in v1.0 against actual code and Linux 7.1 reference (local/reference/linux-7.1/). Four parallel audits.

Key corrections to v1.0:
- kernel/src/arch/x86_shared/sleep.rs:257-276 does NOT exist; real PCI stubs are in acpid/aml_physmem.rs:375-398 (root cause: pcid never sends fd to acpid)
- EHCI is ALREADY implemented (1538+ lines); the stubs are OHCI and UHCI
- aml_physmem.rs:195, :274 line numbers were wrong; actual stubs at :213-232 (map_physical_region panic) and :241-280 (read returns 0)
- MSI stub at irq.rs:231 was fixed 2026-06-08 (this audit's first task)

New gaps added (v1.1):
- Gap 11: IOMMU daemon->kernel IRQ integration missing (kernel has set_iommu_remapping_active() but daemon never calls it)
- Gap 12: MSI multi-vector not exposed (blocks xhcid, nvmed, ixgbed, redox-drm)

Other corrections:
- DMAR init should move to iommu daemon, not acpid
- >255 CPU ID is a panic (u8::try_from().expect()), not deferred
- hwd legacy backend stub is acceptable (graceful no-op fallback)

Added new sections:
- Section 13: Concrete Fix List (v1.1, ready to execute) with exact file paths, line numbers, current code, target code, Linux reference
- Section 14: v1.1 Audit Methodology documenting the cross-reference approach

All execution plan phases updated with corrected tasks, owners, and verification gates.
@@ -1,12 +1,13 @@
|
||||
# Red Bear OS — Low-Level Infrastructure Reassessment & Updated Plan
|
||||
|
||||
**Version**: 1.0 (2026-05-21)
|
||||
**Version**: 1.1 (2026-06-08) — comprehensive code audit against Linux 7.1 reference
|
||||
**Supersedes**: Fragmentary assessments in `COMPREHENSIVE-SYSTEM-ASSESSMENT-AND-IMPROVEMENT-PLAN.md` §2–§4 for ACPI/IRQ/PCI/driver topics
|
||||
**Canonical adjacent plans** (remain authoritative for subsystem detail):
|
||||
- `ACPI-IMPROVEMENT-PLAN.md` — ACPI waves W0–W7
|
||||
- `IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md` — PCI/IRQ/MSI-X waves W1–W6
|
||||
- `BOOT-PROCESS-HARDWARE-DETECTION-PLAN.md` — Boot detection waves W0–W6
|
||||
- `SMP-SCHEDULER-IMPROVEMENT-PLAN.md` — SMP bottlenecks B1–B7
|
||||
- `local/reference/linux-7.1/` — Linux 7.1 reference source for cross-validation
|
||||
|
||||
---
|
||||
|
||||
@@ -42,6 +43,24 @@ This document is a **code-grounded reassessment** of four interdependent low-lev
|
||||
6. **40 total TODOs** in ACPI code (16 kernel + 24 userspace) — higher than previously documented.
|
||||
7. **linux-kpi wireless layer verified real** (2026-06-08): Comprehensive code audit confirmed all Wi-Fi headers (`cfg80211.h`, `mac80211.h`, `netdevice.h`, `skbuff.h`) are real implementations backed by 2770 lines of Rust code (`wireless.rs` 1002 lines, `mac80211.rs` 959 lines, `net.rs` 809 lines). No TODO/FIXME/STUB markers found in wireless code. The `amdgpu_stubs.h` stub file is GPU-specific and does not affect Wi-Fi.
|
||||
|
||||
### What changed in v1.1 (2026-06-08) — Linux 7.1 cross-reference audit
|
||||
|
||||
8. **`kernel/src/arch/x86_shared/sleep.rs:257–276` does not exist**: The kernel has no `sleep.rs` file. The sleep path is entirely in userspace (`acpid`). The actual PCI config access stubs are in `acpid/src/aml_physmem.rs:375–398` (read_pci_u8/u16/u32, write_pci_u8/u16/u32) where `pci_fd` is always `None` because `pcid` never sends its fd to `acpid`. The fix is to wire pcid's fd to acpid via the `RegisterPci` scheme handle, not to modify a non-existent kernel file.
|
||||
|
||||
9. **EHCI is already implemented** (2026-06-08): `local/recipes/drivers/ehcid/source/src/main.rs` is 1538+ lines with full EHCI spec implementation (device enumeration, control/bulk/interrupt transfers, DMA, port reset). The plan's "no EHCI driver" gap was inaccurate. The actual stubs are **OHCI** (`ohcid/source/src/main.rs:16–34`, ~19 lines) and **UHCI** (`uhcid/source/src/main.rs:16–34`, ~19 lines) — both just log BAR reads and enter a sleep loop with no enumeration, no transfers, no port management.
|
||||
|
||||
10. **MSI/MSI-X stub FIXED** (2026-06-08): The `iommu_validate_msi_irq()` blind `true` return at `kernel/src/scheme/irq.rs:231` was replaced with proper IOMMU remapping state tracking. Now uses `IOMMU_REMAPPING_ACTIVE: AtomicBool` + public `set_iommu_remapping_active()` API. The kernel trusts the IOMMU hardware (when active) to validate interrupt remapping. The daemon→kernel coordination is not yet wired (Gap 11).
|
||||
|
||||
11. **IOMMU daemon→kernel IRQ integration missing** (2026-06-08): The kernel now has `set_iommu_remapping_active()`, but the `iommu` daemon never calls it. The MSI validation gate will only see the real remapping state once the daemon writes to a new `/scheme/irq/remapping` file. Spec: the kernel adds `Handle::RemappingControl` plus a write handler; the daemon writes `"1"` after `INIT_UNITS` succeeds and the IRTE tables are set up.
|
||||
|
||||
12. **MSI multi-vector allocation is a real blocker** (2026-06-08): `pci_allocate_interrupt_vector` in `pcid/src/driver_interface/irq_helpers.rs:307` only allocates single vectors. xhcid (USB 3.0), nvmed (NVMe), ixgbed (10GbE), and redox-drm (GPU) all need multiple vectors. `allocate_aligned_interrupt_vectors` already supports `count` parameter; the fix is to expose it. `multi_message_enable` field in MSI capability is always set to `Some(0)` (single vector).
|
||||
|
||||
13. **>255 CPU truncation is a panic** (2026-06-08): `irq_helpers.rs:89` `u8::try_from(cpu_id).expect("usize cpu ids not implemented yet")` panics for CPU IDs > 255. Must be converted to `io::Error` return. The x2APIC path supports 32-bit APIC IDs. Not a blocker for current hardware (AMD Threadripper 128-thread = 128 CPUs) but must be fixed before >256 CPU systems are tested.
|
||||
|
||||
14. **DMAR init should move to iommu daemon** (2026-06-08): The 533 lines of Intel VT-d parsing in `acpid/src/acpi/dmar/mod.rs` (`Dmar::init()` at line 55) only log register values without initializing hardware. The `acpi.rs:545` call is commented out. Per microkernel design, DMAR init belongs in the `iommu` daemon, not acpid. Linux 7.1 reference: `drivers/iommu/intel/dmar.c` (intel-iommu init pattern).
|
||||
|
||||
15. **APIC timer disabled confirmed** (2026-06-08): `local_apic.rs:81` contains the commented-out call `//self.setup_timer();`, but no `setup_timer()` method exists, so uncommenting the call alone would not compile. All timer infrastructure (LVT timer, divider config, count registers) is present but unconnected. Re-enabling requires three steps: implement `setup_timer()` (TSC deadline mode for modern CPUs, periodic with divide-by-16 fallback), add PM-timer/TSC-based calibration, and wire the call into the init sequence. Safe on QEMU; needs calibration on bare metal.
|
||||
|
||||
---
|
||||
|
||||
## 2. ACPI / acpid Reassessment
|
||||
@@ -338,141 +357,218 @@ Every MSI/MSI-X interrupt bypasses IOMMU remapping validation. This is a securit
|
||||
|
||||
**Action**: Propagate `Result<T>` errors to AML evaluation callers instead of fabricating values.
|
||||
|
||||
### Gap 3 — Kernel Sleep Path PCI Stubs (CRITICAL)
|
||||
**File**: `kernel/src/arch/x86_shared/sleep.rs:257–276`
|
||||
- `read_pci_u8/u16/u32` always return 0
|
||||
- `write_pci_*` are no-ops
|
||||
### Gap 3 — AML PCI Access Stubs in acpid (CRITICAL, corrected v1.1)
|
||||
**Files**: `acpid/src/aml_physmem.rs:375–398` (NOT `kernel/src/arch/x86_shared/sleep.rs:257–276` which does not exist)
|
||||
- `read_pci_u8/u16/u32` and `write_pci_u8/u16/u32` in `AmlPhysMemHandler`
|
||||
- When `pci_fd` is `None` (always, currently): `read_pci()` logs error, returns untouched `value` array (all zeros from `let mut value = [0u8]`); `write_pci()` silently does nothing
|
||||
- Root cause: `pcid` never sends its PCI scheme fd to `acpid` via the `RegisterPci` scheme handle (scheme.rs:447–480)
|
||||
|
||||
**Impact**: Any AML code using PCI config space access in the kernel S3/S5 sleep path gets fabricated values. This is only safe if the sleep path guarantees no PCI-dependent AML methods are evaluated.
|
||||
**Impact**: Any AML method that accesses PCI config space (OpRegion with `ACPI_ADR_SPACE_PCI_CONFIG`) gets fabricated zero data. S5 shutdown works by accident because `set_global_s_state(5)` writes to PM1a port directly, but `\_PTS`, `\_WAK`, and any PCI-dependent `\_S5` methods get wrong data.
|
||||
|
||||
**Action**: Either wire real PCI config space access in the kernel sleep path, or explicitly scope the kernel AML interpreter to exclude PCI-dependent methods.
|
||||
**Action**: Wire pcid's PCI scheme fd to acpid:
|
||||
1. `pcid` opens `/scheme/acpi/register_pci` and sends its pci scheme fd via `on_sendfd` on startup
|
||||
2. `acpid` scheme stores the fd in `AmlPhysMemHandler::pci_fd`
|
||||
3. `aml_eval()` in `acpi.rs:394` passes `self.pci_fd.as_ref()` to `aml_context_mut` instead of `None`
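
The intended end state of the steps above can be sketched as follows. This is a minimal illustration, not the acpid code: `PciConfigSpace`, `AmlPciHandler`, and `FakePci` are hypothetical names standing in for the real handler and for the fd that pcid would send. The point it shows is that an unregistered `pci_fd` should surface as an error rather than as fabricated zeros.

```rust
use std::io;

/// Hypothetical abstraction over the PCI scheme fd that pcid would send to acpid.
trait PciConfigSpace {
    fn read(&self, offset: u16, buf: &mut [u8]) -> io::Result<()>;
}

/// Hypothetical stand-in for the handler that owns `pci_fd`.
struct AmlPciHandler<P: PciConfigSpace> {
    pci_fd: Option<P>,
}

impl<P: PciConfigSpace> AmlPciHandler<P> {
    /// Reads one byte of config space. When pcid never registered its fd,
    /// this returns an error instead of an untouched zero buffer.
    fn read_pci_u8(&self, offset: u16) -> io::Result<u8> {
        let pci = self.pci_fd.as_ref().ok_or_else(|| {
            io::Error::new(io::ErrorKind::NotConnected, "pcid has not registered its fd")
        })?;
        let mut value = [0u8; 1];
        pci.read(offset, &mut value)?;
        Ok(value[0])
    }
}

/// Test double (assumption): echoes the offset back as data.
struct FakePci;

impl PciConfigSpace for FakePci {
    fn read(&self, offset: u16, buf: &mut [u8]) -> io::Result<()> {
        for (i, b) in buf.iter_mut().enumerate() {
            *b = (offset as u8).wrapping_add(i as u8);
        }
        Ok(())
    }
}
```

With this shape, the AML caller decides how to handle a missing PCI backend, which is the same error-propagation direction recommended for the physmem stubs.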
|
||||
|
||||
### Gap 4 — APIC Timer Disabled (HIGH)
|
||||
**File**: `kernel/src/arch/x86_shared/device/local_apic.rs:81`
|
||||
- `setup_timer()` commented out
|
||||
- `//self.setup_timer();` — the method `setup_timer()` does not exist; all timer infrastructure (LVT timer, divider config, count registers) is present but unconnected
|
||||
- System uses PIT fallback for all timer interrupts
|
||||
|
||||
**Impact**: No per-CPU timer interrupts (all CPUs share PIT on BSP), no TSC deadline mode for modern CPUs, potential timer skew on SMP.
|
||||
**Impact**: No per-CPU timer interrupts (all CPUs share PIT on BSP), no TSC deadline mode for modern CPUs, potential timer skew on SMP, and a likely contributor to excess heat on bare metal.
|
||||
|
||||
**Action**: Re-enable APIC timer with calibration against PIT or TSC. Required for per-CPU timer distribution.
|
||||
**Action** (per Linux 7.1 `arch/x86/kernel/apic/apic.c:277–321`):
|
||||
1. Implement `setup_timer()` method: TSC deadline mode for modern CPUs (Intel Haswell+, AMD Zen+), periodic with divide-by-16 fallback
|
||||
2. Add PM-timer or TSC-based calibration (Linux: `lapic_cal_handler`)
|
||||
3. Wire into `init_ap()` after `setup_error_int()`
|
||||
4. Calibrate against PIT initially, switch to TSC-deadline or APIC periodic after calibration
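
The calibration arithmetic behind steps 2 and 4 can be sketched as below. This is an illustration under stated assumptions, not kernel code: `apic_initial_count` and its parameters are hypothetical names, and the measurement itself (counting APIC timer ticks while a reference timer such as the PIT measures a fixed window) is assumed to have happened already.

```rust
use std::convert::TryFrom;

/// Derives the APIC timer initial-count value for a periodic tick.
///
/// `elapsed_ticks` is the number of APIC timer ticks (at the chosen divider)
/// observed while the reference timer measured `window_us` microseconds.
/// `tick_hz` is the desired periodic interrupt rate.
/// Returns `None` for degenerate input or when the result does not fit the
/// 32-bit initial-count register.
fn apic_initial_count(elapsed_ticks: u64, window_us: u64, tick_hz: u64) -> Option<u32> {
    if window_us == 0 || tick_hz == 0 {
        return None;
    }
    // Scale the measured window up to one second.
    let ticks_per_sec = elapsed_ticks.checked_mul(1_000_000)? / window_us;
    let count = ticks_per_sec / tick_hz;
    if count == 0 {
        // Calibration produced nothing usable; the caller should keep the PIT.
        return None;
    }
    u32::try_from(count).ok()
}
```

For example, 625,000 ticks observed over a 10 ms window corresponds to 62.5 million ticks per second, so a 1000 Hz periodic tick needs an initial count of 62,500. A `None` result is the natural place to fall back to the PIT path.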
|
||||
|
||||
### Gap 5 — Synthetic EDID in All GPU Drivers (HIGH)
|
||||
**File**: `redox-drm/src/kms/connector.rs:35`
|
||||
- All three drivers (AMD, Intel, VirtIO) use hardcoded EDID
|
||||
**File**: `redox-drm/src/kms/connector.rs:35–84`
|
||||
- All three drivers (AMD, Intel, VirtIO) use hardcoded EDID via `synthetic_edid()`
|
||||
- No real DDC/I²C display detection
|
||||
|
||||
**Impact**: Display will not work on bare metal with non-1080p panels, multi-monitor setups, or displays with non-standard timings.
|
||||
|
||||
**Action**: Implement I²C-over-DDC EDID retrieval in `redox-drm`, or at minimum implement a real connector detection path that queries HPD + DDC before falling back to synthetic.
|
||||
**Action** (per Linux 7.1 `drivers/gpu/drm/drm_edid.c` and `drm_dp_helper.c`):
|
||||
1. Implement I²C-over-AUX infrastructure in redox-drm for DisplayPort connectors (DDC address 0x50)
|
||||
2. Replace `synthetic_edid()` with real EDID fetch via AUX CH
|
||||
3. Keep fallback to standard CEA/CTA modes if AUX CH fails (not a single hardcoded mode)
|
||||
4. For HDMI/VGA: implement separate DDC I²C bus access paths
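
Whichever transport fetches the EDID (AUX CH or DDC I²C), the fallback decision in step 3 needs a way to tell a good read from a bad one. A minimal sketch of that check, assuming a hypothetical helper name `edid_base_block_is_valid`: the base EDID block is 128 bytes, starts with a fixed 8-byte header, and its bytes sum to zero modulo 256.

```rust
const EDID_BLOCK_LEN: usize = 128;
const EDID_HEADER: [u8; 8] = [0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00];

/// Returns true when `block` looks like a well-formed EDID base block:
/// correct length, fixed header pattern, and a checksum that sums to zero.
fn edid_base_block_is_valid(block: &[u8]) -> bool {
    if block.len() != EDID_BLOCK_LEN || block[..8] != EDID_HEADER {
        return false;
    }
    // All 128 bytes, including the trailing checksum byte, must sum to 0 mod 256.
    block.iter().fold(0u8, |sum, b| sum.wrapping_add(*b)) == 0
}
```

A connector path could call this on each fetched block and drop to the standard-mode fallback only when the check fails, rather than returning synthetic data unconditionally.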
|
||||
|
||||
### Gap 6 — Dual AML Interpreters (HIGH)
|
||||
**Files**: `kernel/src/arch/x86_shared/sleep.rs` (acpi_ext crate) + `acpid/src/acpi.rs` (acpi crate)
|
||||
**Files**: `kernel` uses `acpi_ext` crate (kernel-side); `acpid/src/acpi.rs` uses `acpi` crate (userspace)
|
||||
- Two independent parsers for the same DSDT/SSDT
|
||||
- Different handler implementations (kernel has PCI stubs, userspace has physmem stubs)
|
||||
- Different handler implementations
|
||||
- Bug fixes in one do not affect the other
|
||||
|
||||
**Impact**: Maintenance risk, correctness divergence, two surfaces for AML security issues.
|
||||
|
||||
**Action**: Converge on a single canonical interpreter. Recommendation: userspace (acpid) since all drivers are userspace per project model. Kernel sleep path should delegate to userspace or use a shared, read-only AML namespace.
|
||||
**Action**: Converge on a single canonical interpreter. Recommendation: userspace (acpid) since all drivers are userspace per project model. The kernel `sleep.rs` path was expected but doesn't exist in this codebase — the actual AML eval path is entirely in acpid. Future kernel S3 support should delegate to userspace.
|
||||
|
||||
### Gap 7 — No EHCI/UHCI/OHCI Drivers (HIGH)
|
||||
**Impact**: Legacy USB keyboards on companion controller paths unreachable on bare metal. Only xHCI-native USB devices work.
|
||||
### Gap 7 — No OHCI/UHCI Drivers (HIGH, corrected v1.1)
|
||||
**Files**:
|
||||
- `local/recipes/drivers/ohcid/source/src/main.rs:16–34` — STUB: reads PCI BAR, enters sleep loop
|
||||
- `local/recipes/drivers/uhcid/source/src/main.rs:16–34` — STUB: reads I/O port BAR, enters sleep loop
|
||||
- `local/recipes/drivers/ehcid/source/src/main.rs` — **ALREADY IMPLEMENTED** (1538+ lines, full EHCI spec) — NOT a gap
|
||||
|
||||
**Action**: Implement EHCI driver (highest priority — covers most USB 2.0 controllers with xHCI companion). UHCI/OHCI are lower priority (very old hardware).
|
||||
**Impact**: Legacy USB keyboards on companion controller paths are unreachable on bare metal. Only devices attached through xHCI or EHCI controllers work.
|
||||
|
||||
**Action** (per Linux 7.1 `drivers/usb/host/ohci-hcd.c` and `uhci-hcd.c`):
|
||||
1. **OHCI first** (MMIO-based, simpler than UHCI): 3–4 weeks
|
||||
- HCCA (Host Controller Communications Area) for interrupt transfers
|
||||
- Control/bulk/isochronous transfer descriptors
|
||||
- Periodic schedule via the 32-entry interrupt table in the HCCA (OHCI has no 1024-entry frame list; that structure belongs to UHCI)
|
||||
- Port power and reset control
|
||||
2. **UHCI second** (I/O port-based, more complex): 3–4 weeks
|
||||
- Transfer descriptors (TD) and queue heads (QH)
|
||||
- Frame list base address register in I/O port space (1024-entry frame list)
|
||||
- Port reset and suspend control
|
||||
|
||||
### Gap 8 — No C-State Kernel Backend (HIGH)
|
||||
**Impact**: CPUs run at full frequency constantly on bare metal. Thermal throttling only.
|
||||
**Impact**: CPUs run at full frequency constantly on bare metal. Thermal throttling only. A likely contributor to excess heat on AMD64.
|
||||
|
||||
**Action**: Implement `cpuidle`/`cpufreq` kernel backend using MWAIT or HLT. Discovery exists in acpid (`cstate.rs`) but kernel has no idle driver.
|
||||
**Action** (per Linux 7.1 `drivers/idle/intel_idle.c` and `arch/x86/include/asm/mwait.h`):
|
||||
1. Kernel: add `mwait()`/`mwaitx()` helper functions + C-state hint MSR read/write
|
||||
2. ACPI: parse `_CST` in acpid, expose C-state info via `scheme:cpuidle`
|
||||
3. Implement idle loop using MWAIT with sub-state hints (Linux pattern: `intel_idle.c:67–107` idle_cpu struct)
|
||||
4. Optional: `cpuidled` daemon to coordinate C-state selection
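
The "sub-state hints" in step 3 refer to the value passed to MWAIT in EAX. A small sketch of how such a hint is composed, assuming a hypothetical helper `mwait_hint`: bits 7:4 carry the MWAIT C-state index minus one (so 0 means C1) and bits 3:0 carry the sub-state. These are MWAIT C-state numbers, which do not necessarily match ACPI C-state numbers; how `_CST` entries map onto them is left open here.

```rust
/// Composes the EAX hint for MWAIT.
///
/// `mwait_c_state` is the MWAIT C-state index (1 = C1, 2 = C2, ...), and
/// `sub_state` selects a sub-state within it. The encoding 0xF in bits 7:4
/// means C0, so valid C-state indices here are 1 through 15.
fn mwait_hint(mwait_c_state: u8, sub_state: u8) -> Option<u32> {
    if mwait_c_state == 0 || mwait_c_state > 15 || sub_state > 0xF {
        return None;
    }
    Some((((mwait_c_state - 1) as u32) << 4) | sub_state as u32)
}
```

For example, C1 with sub-state 0 encodes as `0x00`, C1 with sub-state 1 as `0x01`, and MWAIT C3 with sub-state 0 as `0x20`.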
|
||||
|
||||
### Gap 9 — DMAR Orphaned (MEDIUM)
|
||||
**File**: `acpid/src/acpi.rs:545`
|
||||
- 533 lines of Intel VT-d parsing code
|
||||
- `Dmar::init()` commented out — "hangs on real hardware"
|
||||
### Gap 9 — DMAR Init in Wrong Owner (MEDIUM, corrected v1.1)
|
||||
**Files**:
|
||||
- `acpid/src/acpi/dmar/mod.rs:7` — TODO comment: "Move this code to a separate driver as well?"
|
||||
- `acpid/src/acpi/dmar/mod.rs:55–90` — `Dmar::init()` only logs register values, never initializes hardware
|
||||
- `acpid/src/acpi.rs:545` — `Dmar::init(&this)` call commented out
|
||||
- The iommu daemon is the correct owner: `local/recipes/system/iommu/`
|
||||
|
||||
**Action**: Either fix the hang and assign a runtime owner (iommu daemon), or remove the orphaned code until ready.
|
||||
**Impact**: 533 lines of orphaned DMAR parsing in acpid. No Intel VT-d initialization anywhere.
|
||||
|
||||
### Gap 10 — >256 CPU MSI Remapping (MEDIUM)
|
||||
**File**: `drivers/pcid/src/driver_interface/irq_helpers.rs`
|
||||
- 8-bit APIC destination field limits MSI target selection
|
||||
- IOMMU interrupt remapping required for >256 CPUs
|
||||
**Action** (per Linux 7.1 `drivers/iommu/intel/dmar.c:408–456`):
|
||||
1. Remove `Dmar::init()` from acpid — acpid should only expose raw ACPI table data
|
||||
2. Move DMAR parsing to `iommu` daemon: parse via `/scheme/acpi`, initialize IOMMU hardware (program RT, set up context entries, enable GCMD, configure fault handling)
|
||||
3. Or: remove the orphaned code until ready (lower-effort path)
|
||||
|
||||
**Action**: Gated on IOMMU maturity (Gap 1).
|
||||
### Gap 10 — >256 CPU MSI Truncation Panic (MEDIUM)
|
||||
**File**: `drivers/pcid/src/driver_interface/irq_helpers.rs:89`
|
||||
- `let cpu_id = u8::try_from(cpu_id).expect("usize cpu ids not implemented yet");` — PANICS for CPU IDs > 255
|
||||
- x2APIC supports 32-bit APIC IDs (up to 4 billion CPUs)
|
||||
|
||||
**Impact**: Any pcid-spawned driver on a system with >256 CPUs will panic. Not a blocker for current hardware (Threadripper 128-thread = 128 CPUs) but must be fixed before >256 CPU systems are tested.
|
||||
|
||||
**Action**:
|
||||
1. Change `u8::try_from(cpu_id)` to `u32::try_from(cpu_id).map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "cpu_id > u32::MAX"))?`
|
||||
2. Update kernel `/scheme/irq/cpu-{:02x}` to `/scheme/irq/cpu-{:08x}` for x2APIC
|
||||
3. Add unit test for u32::MAX cpu_id path
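
Steps 1 and 2 can be sketched together as below. The function names `cpu_id_to_apic_id` and `irq_cpu_path` are hypothetical and only illustrate the conversion and the widened path format proposed above; they are not the existing `irq_helpers.rs` API.

```rust
use std::convert::TryFrom;
use std::io;

/// Converts a CPU id to a 32-bit x2APIC id, returning an error instead of panicking.
fn cpu_id_to_apic_id(cpu_id: usize) -> io::Result<u32> {
    u32::try_from(cpu_id)
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "cpu_id > u32::MAX"))
}

/// Builds the per-CPU IRQ scheme path using the 8-hex-digit format from step 2.
fn irq_cpu_path(cpu_id: usize) -> io::Result<String> {
    Ok(format!("/scheme/irq/cpu-{:08x}", cpu_id_to_apic_id(cpu_id)?))
}
```

CPU id 300, which panics with the current `u8` conversion, would map to `/scheme/irq/cpu-0000012c`. On 64-bit targets, ids above `u32::MAX` produce an `InvalidInput` error that the driver can report.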
|
||||
|
||||
### Gap 11 — IOMMU Daemon→Kernel IRQ Integration Missing (MEDIUM, new in v1.1)
|
||||
**Files**:
|
||||
- Kernel has `set_iommu_remapping_active()` (added 2026-06-08)
|
||||
- `iommu` daemon never calls it
|
||||
|
||||
**Impact**: The MSI validation gate works correctly in code, but `IOMMU_REMAPPING_ACTIVE` always stays `false`, so the one-time warning always fires and the kernel never gets informed of hardware remapping state.
|
||||
|
||||
**Action**:
|
||||
1. Kernel: add `Handle::RemappingControl` variant in `scheme/irq.rs`, detect path `"remapping"` in `kopenat()`, parse `"0"`/`"1"` in `kwrite()` and call `set_iommu_remapping_active()`
|
||||
2. iommu daemon: after `INIT_UNITS` succeeds and IRTE tables are set up, write `"1"` to `/scheme/irq/remapping`
|
||||
3. On shutdown: iommu daemon writes `"0"` before exit
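
The write-handler logic in step 1 can be sketched as follows. This is a standalone illustration, not the kernel's `scheme/irq.rs`: the static and the setter mirror the names given in this plan, while `write_remapping_control`, its error type, and the tolerance for a trailing newline are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the kernel flag named in this plan.
static IOMMU_REMAPPING_ACTIVE: AtomicBool = AtomicBool::new(false);

/// Stand-in for the public setter named in this plan.
fn set_iommu_remapping_active(active: bool) {
    IOMMU_REMAPPING_ACTIVE.store(active, Ordering::Release);
}

/// Hypothetical body of the `kwrite()` arm for the remapping control handle.
/// Accepts exactly "0" or "1" (a trailing newline is tolerated) and returns
/// the number of bytes consumed. Anything else leaves the flag unchanged.
fn write_remapping_control(buf: &[u8]) -> Result<usize, &'static str> {
    let text = std::str::from_utf8(buf).map_err(|_| "not UTF-8")?;
    match text.trim_end_matches('\n') {
        "0" => set_iommu_remapping_active(false),
        "1" => set_iommu_remapping_active(true),
        _ => return Err("expected \"0\" or \"1\""),
    }
    Ok(buf.len())
}
```

Rejecting unknown input without touching the flag keeps a malformed write from silently enabling or disabling the validation gate.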
|
||||
|
||||
### Gap 12 — MSI Multi-Vector Not Exposed (MEDIUM, new in v1.1)
|
||||
**File**: `pcid/src/driver_interface/irq_helpers.rs:307`
|
||||
- `pci_allocate_interrupt_vector` only allocates single vector
|
||||
- `allocate_aligned_interrupt_vectors` already supports `count` parameter but is not exposed
|
||||
- `multi_message_enable` field always set to `Some(0)` (single vector)
|
||||
|
||||
**Impact**: xhcid, nvmed, ixgbed, redox-drm cannot use multiple MSI vectors. Falls back to shared IRQ with degraded performance.
|
||||
|
||||
**Action**:
|
||||
1. Add `pci_allocate_interrupt_vectors(pcid_handle, driver, count)` to pcid
|
||||
2. For MSI: set `multi_message_enable` to `log2(count)`, allocate contiguous aligned vectors
|
||||
3. For MSI-X: loop calling `allocate_single_interrupt_vector_for_msi()` per vector
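
The MSI encoding in step 2 can be sketched as below. Both function names are hypothetical. The sketch relies on two properties of MSI: the vector count must be a power of two no larger than 32, and the base vector must be aligned to the count because the device varies the low-order bits of the message data to select a vector.

```rust
/// Encodes a requested MSI vector count as the Multiple Message Enable value
/// (log2 of the count). Counts that are not a power of two in 1..=32 are rejected.
fn msi_multi_message_enable(count: u8) -> Option<u8> {
    if count == 0 || count > 32 || !count.is_power_of_two() {
        return None;
    }
    Some(count.trailing_zeros() as u8)
}

/// Rounds `candidate` up to the next base vector aligned to `count`.
/// Returns `None` when the count is invalid or the block would run past vector 255.
fn align_base_vector(candidate: u8, count: u8) -> Option<u8> {
    msi_multi_message_enable(count)?;
    let mask = count - 1;
    let aligned = candidate.checked_add(mask)? & !mask;
    // The last vector of the block must still be a valid 8-bit vector number.
    aligned.checked_add(mask)?;
    Some(aligned)
}
```

For instance, a request for 8 vectors encodes as `multi_message_enable = 3`, and a candidate base of `0x31` for 4 vectors is rounded up to `0x34`. A request for 3 vectors is rejected, so the caller would round up to 4 or fall back to a single vector.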
|
||||
|
||||
---
|
||||
|
||||
## 7. Updated Execution Plan
|
||||
## 7. Updated Execution Plan (v1.1)
|
||||
|
||||
### Phase 1: Critical Stub Removal (2–3 weeks)
|
||||
**Goal**: Remove all CRITICAL-severity stubs before any hardware validation.
|
||||
|
||||
| # | Task | File | Effort | Owner |
|
||||
|---|------|------|--------|-------|
|
||||
| 1.1 | Fix `read_phys_or_fault()` zero-return | `acpid/src/aml_physmem.rs:195` | 2 days | — |
|
||||
| 1.2 | Fix `map_physical_region()` zero-page fallback | `acpid/src/aml_physmem.rs:274` | 2 days | — |
|
||||
| 1.3 | Fix kernel sleep path PCI read stubs | `kernel/src/arch/x86_shared/sleep.rs:257–276` | 3 days | — |
|
||||
| 1.4 | Document kernel PCI stub scope | `sleep.rs` | 1 day | — |
|
||||
| 1.5 | Remove `println!` debug artifact | `kernel/src/arch/x86_shared/interrupt/irq.rs:307` | 1 hour | — |
|
||||
| 1.1 | Fix `read_u8/u16/u32/u64` zero-return on failure (fabricate data) | `acpid/src/aml_physmem.rs:241–280` | 2 days | — |
|
||||
| 1.2 | Fix `map_physical_region()` `.expect()` panic | `acpid/src/aml_physmem.rs:213–232` | 2 days | — |
|
||||
| 1.3 | Wire pcid fd → acpid `RegisterPci` handle (root cause of Gap 3) | `pcid/main.rs` + `acpid/scheme.rs` + `acpid/acpi.rs:400` | 3 days | — |
|
||||
| 1.4 | Remove `println!` debug artifact | `kernel/src/arch/x86_shared/interrupt/irq.rs:307` | 1 hour | — |
|
||||
| 1.5 | Replace `cpu_id` u8 truncation panic with error return (Gap 10) | `pcid/src/driver_interface/irq_helpers.rs:89` | 1 day | — |
|
||||
|
||||
**Gate**: All CRITICAL stubs removed + `cargo check` clean on affected modules.
|
||||
**Gate**: All CRITICAL stubs removed + `cargo check` clean on affected modules + pcid→acpid fd wiring tested.
|
||||
|
||||
### Phase 2: IOMMU + MSI Validation (3–4 weeks)
|
||||
**Goal**: Make MSI/MSI-X delivery trustworthy.
|
||||
|
||||
| # | Task | File | Effort | Owner |
|
||||
|---|------|------|--------|-------|
|
||||
| 2.1 | Implement `iommu_validate_msi_irq()` real logic | `kernel/src/scheme/irq.rs:231` | 1 week | — |
|
||||
| 2.2 | Wire IOMMU remapping table read into kernel | `iommu` daemon ↔ `scheme/irq` | 1 week | — |
|
||||
| 2.3 | QEMU validation: MSI-X with IOMMU enabled | `test-msix-qemu.sh` | 2 days | — |
|
||||
| 2.4 | Fix or remove orphaned DMAR code | `acpid/src/acpi.rs:545` | 2 days | — |
|
||||
| 2.1 | **DONE** (2026-06-08): `iommu_validate_msi_irq()` real implementation | `kernel/src/scheme/irq.rs:231` | ✅ | committed |
|
||||
| 2.2 | Add `/scheme/irq/remapping` control file (Gap 11) | `kernel/src/scheme/irq.rs` | 1 day | — |
|
||||
| 2.3 | iommu daemon: write `"1"` to remapping after IRTE init | `iommu/source/src/main.rs` | 2 days | — |
|
||||
| 2.4 | iommu daemon: write `"0"` to remapping on shutdown | `iommu/source/src/main.rs` | 1 day | — |
|
||||
| 2.5 | QEMU validation: MSI-X with IOMMU enabled | `test-msix-qemu.sh` | 2 days | — |
|
||||
| 2.6 | Move DMAR init from acpid to iommu daemon (Gap 9) | `acpid/dmar/` → `iommu/` | 1 week | — |
|
||||
| 2.7 | QEMU validation: DMAR discovery + iommu | `test-iommu-qemu.sh` | 2 days | — |
|
||||
|
||||
**Gate**: `test-msix-qemu.sh` passes with IOMMU enabled + no `iommu_validate_msi_irq()` stub.
|
||||
**Gate**: `test-msix-qemu.sh` passes with IOMMU enabled + remapping gate works + no DMAR init in acpid.
|
||||
|
||||
### Phase 3: Timer + CPU Power (2–3 weeks)
|
||||
**Goal**: Enable per-CPU timers and basic CPU idle.
|
||||
|
||||
| # | Task | File | Effort | Owner |
|
||||
|---|------|------|--------|-------|
|
||||
| 3.1 | Re-enable APIC timer with calibration | `kernel/src/arch/x86_shared/device/local_apic.rs:81` | 3 days | — |
|
||||
| 3.2 | Implement kernel cpuidle backend (MWAIT/HLT) | New file: `kernel/src/arch/x86_shared/cpuidle.rs` | 1 week | — |
|
||||
| 3.3 | Wire acpid C-state discovery to kernel idle | `acpid/src/cstate.rs` → kernel | 3 days | — |
|
||||
| 3.4 | QEMU validation: timer + idle | `test-timer-qemu.sh` | 2 days | — |
|
||||
| 3.1 | Implement `setup_timer()` method (TSC deadline + periodic fallback) | `kernel/src/arch/x86_shared/device/local_apic.rs` | 1 week | — |
|
||||
| 3.2 | Add PM-timer/TSC-based calibration | `kernel/src/arch/x86_shared/device/local_apic.rs` | 1 week | — |
|
||||
| 3.3 | Wire `setup_timer()` into `init_ap()` after `setup_error_int()` | `local_apic.rs:81` | 1 day | — |
|
||||
| 3.4 | Implement kernel cpuidle backend (MWAIT/HLT) | New file: `kernel/src/arch/x86_shared/cpuidle.rs` | 1 week | — |
|
||||
| 3.5 | ACPI `_CST` parsing in acpid | `acpid/src/cstate.rs` (new) | 1 week | — |
|
||||
| 3.6 | QEMU validation: timer + idle | `test-timer-qemu.sh` | 2 days | — |
|
||||
|
||||
**Gate**: `test-timer-qemu.sh` passes with APIC timer + CPU idle active.
|
||||
**Gate**: `test-timer-qemu.sh` passes with APIC timer + CPU idle active + C1/C2 entry observed.
|
||||
|
||||
### Phase 4: Display Detection (4–6 weeks)
|
||||
**Goal**: Replace synthetic EDID with real display detection.
|
||||
|
||||
| # | Task | File | Effort | Owner |
|
||||
|---|------|------|--------|-------|
|
||||
| 4.1 | Implement I²C-over-DDC EDID retrieval | `redox-drm/src/kms/ddc.rs` (new) | 2 weeks | — |
|
||||
| 4.2 | Wire HPD interrupt to connector detection | `redox-drm/src/drivers/amd/mod.rs`, `intel/mod.rs` | 1 week | — |
|
||||
| 4.3 | Replace `synthetic_edid()` with real → fallback | `redox-drm/src/kms/connector.rs:35` | 3 days | — |
|
||||
| 4.4 | QEMU validation: EDID readback | `test-drm-display-runtime.sh` | 2 days | — |
|
||||
| 4.5 | Bare-metal validation: AMD GPU display | `test-amd-gpu.sh` | 1 week | — |
|
||||
| 4.6 | Bare-metal validation: Intel GPU display | `test-intel-gpu.sh` | 1 week | — |
|
||||
| 4.1 | Implement I²C-over-AUX infrastructure (DP connectors) | `redox-drm/src/kms/aux.rs` (new) | 2 weeks | — |
|
||||
| 4.2 | Implement DDC I²C bus for HDMI/VGA | `redox-drm/src/kms/ddc.rs` (new) | 1 week | — |
|
||||
| 4.3 | Wire HPD interrupt to connector detection | `redox-drm/src/drivers/amd/mod.rs`, `intel/mod.rs` | 1 week | — |
|
||||
| 4.4 | Replace `synthetic_edid()` with real → CEA fallback | `redox-drm/src/kms/connector.rs:38–84` | 3 days | — |
|
||||
| 4.5 | QEMU validation: EDID readback | `test-drm-display-runtime.sh` | 2 days | — |
|
||||
| 4.6 | Bare-metal validation: AMD GPU display | `test-amd-gpu.sh` | 1 week | — |
|
||||
| 4.7 | Bare-metal validation: Intel GPU display | `test-intel-gpu.sh` | 1 week | — |
|
||||
|
||||
**Gate**: Real EDID retrieved from at least one display on bare metal (AMD or Intel).
|
||||
|
||||
### Phase 5: USB Legacy Controllers (3–4 weeks)
|
||||
**Goal**: Enable USB keyboard on non-xHCI paths.
|
||||
### Phase 5: USB Legacy Controllers — OHCI/UHCI (6–8 weeks)
|
||||
**Goal**: Enable USB keyboard on non-xHCI paths (EHCI already done).
|
||||
|
||||
| # | Task | File | Effort | Owner |
|
||||
|---|------|------|--------|-------|
|
||||
| 5.1 | Implement EHCI host controller driver | `local/recipes/drivers/ehcid/` (new) | 2 weeks | — |
|
||||
| 5.2 | Wire EHCI into driver-manager PCI binding | `driver-manager/src/main.rs` | 3 days | — |
|
||||
| 5.3 | QEMU validation: EHCI keyboard | `test-usb-qemu.sh` | 2 days | — |
|
||||
| 5.4 | UHCI/OHCI assessment | — | 1 week | — |
|
||||
| 5.1 | Implement OHCI host controller driver | `local/recipes/drivers/ohcid/source/src/main.rs` | 3–4 weeks | — |
|
||||
| 5.2 | Wire OHCI into driver-manager PCI binding | `driver-manager/src/main.rs` | 3 days | — |
|
||||
| 5.3 | QEMU validation: OHCI keyboard | `test-usb-qemu.sh` | 2 days | — |
|
||||
| 5.4 | Implement UHCI host controller driver | `local/recipes/drivers/uhcid/source/src/main.rs` | 3–4 weeks | — |
|
||||
| 5.5 | Wire UHCI into driver-manager PCI binding | `driver-manager/src/main.rs` | 3 days | — |
|
||||
| 5.6 | QEMU validation: UHCI keyboard | `test-usb-qemu.sh` | 2 days | — |
|
||||
| 5.7 | MSI multi-vector support (Gap 12) | `pcid/src/driver_interface/irq_helpers.rs:307` | 1 week | — |
|
||||
|
||||
**Gate**: USB keyboard works via EHCI in QEMU.
|
||||
**Gate**: USB keyboard works via OHCI/UHCI in QEMU + multi-vector MSI for xhcid/nvmed/ixgbed.
|
||||
|
||||
### Phase 6: AML Convergence (3–4 weeks)
|
||||
**Goal**: Resolve dual AML interpreter risk.
|
||||
|
||||
| # | Task | File | Effort | Owner |
|
||||
|---|------|------|--------|-------|
|
||||
| 6.1 | Evaluate kernel sleep.rs → userspace delegation | `kernel/src/arch/x86_shared/sleep.rs` | 1 week | — |
|
||||
| 6.2 | Implement kernel→userspace S3/S5 sleep RPC | `scheme/kernel.acpi/sleep` → `acpid` | 1 week | — |
|
||||
| 6.1 | Audit kernel `acpi_ext` crate usage (does kernel still use it?) | `kernel/src/arch/x86_shared/sleep.rs` (verify exists) | 2 days | — |
|
||||
| 6.2 | Evaluate kernel→userspace S3/S5 sleep delegation | `scheme/kernel.acpi/sleep` → `acpid` | 1 week | — |
|
||||
| 6.3 | Implement kernel→userspace sleep RPC if S3 is needed | `scheme/kernel.acpi/sleep` | 1 week | — |
|
||||
| 6.3 | Remove kernel `acpi_ext` crate if delegated | `kernel/src/arch/x86_shared/sleep.rs` | 3 days | — |
|
||||
| 6.4 | QEMU validation: sleep/wake cycle | `test-sleep-qemu.sh` | 2 days | — |
|
||||
|
||||
@@ -547,56 +643,76 @@ Phase 6 (AML convergence)
|
||||
|
||||
---
|
||||
|
||||
## 9. Risk Register
|
||||
## 9. Risk Register (v1.1)
|
||||
|
||||
| # | Risk | Likelihood | Impact | Mitigation |
|
||||
|---|------|-----------|--------|------------|
|
||||
| R1 | `aml_physmem` stub fix reveals deeper AML memory access issues | Medium | High | Fix with comprehensive error propagation; add fallback to kernel scheme for problematic regions |
|
||||
| R2 | IOMMU validation implementation requires kernel ABI change | Medium | High | Prototype in userspace first via `scheme:iommu` call; only promote to kernel if performance requires it |
|
||||
| R3 | APIC timer calibration fails on specific CPU models | Medium | Medium | Keep PIT fallback path; detect calibration failure and degrade gracefully |
|
||||
| R4 | DDC/I²C implementation requires GPIO/I2C subsystem not yet built | High | High | Scope Phase 4 to "query EDID via ACPI _DDC method first, then direct I²C"; fallback to synthetic still acceptable for initial bring-up |
|
||||
| R5 | EHCI driver requires IRQ/MSI-X fixes first | Medium | Medium | Phase 5 starts after Phase 2 gate; use legacy IRQ for EHCI if MSI-X not ready |
|
||||
| R6 | AML convergence breaks S3 sleep path | Medium | High | Keep kernel sleep.rs as fallback during transition; remove only after S3 validated via userspace path |
|
||||
| R7 | No bare-metal hardware available for validation | Medium | Critical | Prioritize QEMU proofs for all phases; document "QEMU-validated" vs "bare-metal-validated" per subsystem |
|
||||
| R1 | `aml_physmem` stub fix requires `acpi` crate trait modification | High | High | Fork acpi crate to local/recipes/, or use sentinel-value + error-flag workaround that doesn't require trait change |
|
||||
| R2 | IOMMU daemon→kernel integration needs new scheme file | Low | Medium | Kernel side is ~20 lines (`Handle::RemappingControl` + write handler); daemon side is ~5 lines. Both well-understood. |
|
||||
| R3 | APIC timer calibration fails on specific CPU models | Medium | Medium | Keep PIT fallback path; detect calibration failure and degrade gracefully. TSC deadline mode is simpler and does not need APIC timer frequency calibration, although it still requires a known TSC frequency. |
|
||||
| R4 | DDC/I²C implementation requires AUX CH for DisplayPort | High | High | Phase 4 split: implement AUX CH for DP first (covers AMD/Intel), DDC I²C for HDMI/VGA later. Synthetic EDID as fallback always. |
|
||||
| R5 | OHCI/UHCI implementation is high-effort (6–8 weeks total) | Medium | Medium | Phase 5 spans two cycles: OHCI first (MMIO-based, simpler), UHCI second (I/O port-based, more complex) |
|
||||
| R6 | AML convergence depends on whether kernel still uses `acpi_ext` | Unknown | Medium | Phase 6.1 audit: verify if `kernel/src/arch/x86_shared/sleep.rs` exists. If it does NOT exist, the dual-AML concern is moot (kernel has no AML interpreter). |
|
||||
| R7 | MSI multi-vector breaks drivers that use shared IRQ assumptions | Low | Medium | Gate behind Phase 5; ship single-vector path as default; multi-vector is opt-in per driver |
|
||||
| R8 | DMAR move from acpid to iommu daemon changes module ownership | Low | Medium | Refactor only; no new hardware interaction. iommu daemon already has the register-programming infrastructure. |
|
||||
| R9 | pcid→acpid fd passing uses a Redox-specific mechanism | Medium | Medium | Verify fd-passing via `on_sendfd` works between pcid and acpid schemes. Add test in pcid. |
|
||||
| R10 | No bare-metal hardware available for validation | Medium | Critical | Prioritize QEMU proofs for all phases; document "QEMU-validated" vs "bare-metal-validated" per subsystem |
|
||||
|
||||
---
|
||||
|
||||
## 10. Verification Gates
## 10. Verification Gates (v1.1)

### Gate A: Boot-Baseline Ready (end of Phase 1)
- [ ] `aml_physmem.rs:195` returns `Result<T>` instead of `T::zero()`
- [ ] `aml_physmem.rs:274` propagates mapping errors instead of zero-page fallback
- [ ] `sleep.rs:257–276` either wired to real PCI or explicitly scoped out
- [ ] `cargo check` clean on `acpid`, `kernel`, `redox-drm`
- [ ] `aml_physmem.rs:241–280` read_u* methods no longer fabricate zeros on failure
- [ ] `aml_physmem.rs:213–232` `map_physical_region()` no longer panics on physmap failure
- [ ] pcid sends fd to acpid via `/scheme/acpi/register_pci`; acpid `pci_fd` is `Some` after init
- [ ] `acpi.rs:400` `aml_eval()` passes `self.pci_fd.as_ref()` instead of `None`
- [ ] `irq_helpers.rs:89` returns `io::Error` instead of panic for >255 CPU IDs
- [ ] `cargo check` clean on `acpid`, `kernel`, `redox-drm`, `pcid`
- [ ] `repo validate-patches kernel` passes
- [ ] `repo validate-patches base` passes

### Gate B: IRQ/IOMMU Trustworthy (end of Phase 2)
- [ ] `iommu_validate_msi_irq()` performs real validation
- [x] `iommu_validate_msi_irq()` performs real validation (done 2026-06-08)
- [ ] `/scheme/irq/remapping` exists and is writable
- [ ] iommu daemon writes `"1"` to remapping after IRTE init
- [ ] iommu daemon writes `"0"` to remapping on shutdown
- [ ] DMAR init removed from acpid
- [ ] DMAR init lives in iommu daemon
- [ ] `test-msix-qemu.sh` passes with IOMMU enabled
- [ ] `test-iommu-qemu.sh` passes
- [ ] No unconditional `true` returns in IRQ validation path
- [ ] Boot log does not show the "MSI before IOMMU" warning when IOMMU is configured

### Gate C: Timer + Power (end of Phase 3)
- [ ] `setup_timer()` method exists in `local_apic.rs`
- [ ] APIC timer fires and calibrates correctly in QEMU
- [ ] CPU idle backend enters C1/C2 via MWAIT or HLT
- [ ] `test-timer-qemu.sh` passes
- [ ] No PIT-only fallback in boot log

### Gate D: Display Detection (end of Phase 4)
- [ ] AUX CH infrastructure exists in `redox-drm/src/kms/aux.rs`
- [ ] DDC I²C infrastructure exists in `redox-drm/src/kms/ddc.rs`
- [ ] `synthetic_edid()` is fallback, not primary
- [ ] Real EDID retrieved from at least one display in QEMU
- [ ] `test-drm-display-runtime.sh` passes

### Gate E: USB Legacy (end of Phase 5)
- [ ] EHCI driver enumerates devices in QEMU
- [ ] USB keyboard functional via EHCI in QEMU
- [ ] OHCI driver enumerates devices in QEMU
- [ ] UHCI driver enumerates devices in QEMU
- [ ] USB keyboard functional via OHCI in QEMU
- [ ] USB keyboard functional via UHCI in QEMU
- [ ] MSI multi-vector exposed via `pci_allocate_interrupt_vectors(pcid_handle, driver, count)`
- [ ] xhcid, nvmed, ixgbed updated to use multi-vector MSI where appropriate
- [ ] `test-usb-qemu.sh` passes

### Gate F: Single AML Interpreter (end of Phase 6)
- [ ] S5 shutdown works with userspace AML only
- [ ] Kernel `acpi_ext` crate removed or explicitly deprecated
- [ ] `test-sleep-qemu.sh` passes (S3 + S5)
- [ ] Audit: confirm whether `kernel/src/arch/x86_shared/sleep.rs` exists
  - [ ] If it exists: evaluate kernel→userspace sleep delegation
  - [ ] If it does NOT exist: dual-AML concern is moot, document this
- [ ] S5 shutdown works via userspace AML only
- [ ] `test-shutdown-qemu.sh` passes (S5 only; S3 is not a current target)

### Gate G: Hardware Validation (end of Phase 7)
- [ ] Class A1 (AMD desktop) boots, shuts down, displays, accepts USB keyboard
@@ -688,6 +804,214 @@ This document is a **cross-cutting reassessment** that references but does not r
- For SMP bottleneck detail, see `SMP-SCHEDULER-IMPROVEMENT-PLAN.md`
- For desktop path blockers, see `CONSOLE-TO-KDE-DESKTOP-PLAN.md`

## 13. Concrete Fix List (v1.1, Ready to Execute)

The following items are **ready to implement immediately**: each has been fully audited against the Linux 7.1 reference, the root cause is understood, and the fix is specified. Each item has been promoted to a tracked task.

### 1.1.a: acpid `read_u8/u16/u32/u64` data fabrication (Gap 2)

**File**: `local/sources/base/drivers/acpid/src/aml_physmem.rs:241–280`
**Severity**: 🔴 CRITICAL
**Linux reference**: `local/reference/linux-7.1/drivers/acpi/acpica/evregion.c:302–316` (`acpi_ev_address_space_dispatch()` checks the handler return status and logs an exception; it never fabricates data)

**Current code** (representative; the same pattern appears in all 4 read methods):
```rust
fn read_u8(&self, address: usize) -> u8 {
    if let Ok(mut page_cache) = self.page_cache.lock() {
        if let Ok(value) = page_cache.read_from_phys::<u8>(address) {
            return value;
        }
    }
    log::error!("failed to read u8 {:#x}", address);
    0 // FABRICATES DATA
}
```

**Target code** (sentinel-value approach, since the `acpi` crate's `Handler` trait returns raw `u8`):
```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts reads that returned the sentinel value instead of real data.
static READ_FABRICATION_FLAG: AtomicUsize = AtomicUsize::new(0);

fn read_u8(&self, address: usize) -> u8 {
    if let Ok(mut page_cache) = self.page_cache.lock() {
        match page_cache.read_from_phys::<u8>(address) {
            Ok(value) => return value,
            Err(e) => log::error!("read u8 {:#x} failed: {:?}", address, e),
        }
    }
    // Lock or read failed: record the fabrication, then return the sentinel.
    READ_FABRICATION_FLAG.fetch_add(1, Ordering::SeqCst);
    0
}

/// Number of reads that have returned the sentinel so far.
pub fn read_fabrication_count() -> usize {
    READ_FABRICATION_FLAG.load(Ordering::SeqCst)
}
```

**Note**: Full `Result<T, AmlError>` propagation requires forking the `acpi` crate and modifying the `Handler` trait. The sentinel+flag approach is the minimum-viable fix that doesn't require a crate fork.

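One way a caller could consume the counter is to snapshot it around an AML evaluation and reject the result if it moved. The following is only a sketch under assumptions: `record_fabricated_read()` and `checked_eval()` are hypothetical names that do not exist in acpid, and the static here is a stand-in for the one in `aml_physmem.rs`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-in for the counter in aml_physmem.rs (same name as in the plan).
static READ_FABRICATION_FLAG: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical helper: what a failing read_u* method would call.
pub fn record_fabricated_read() {
    READ_FABRICATION_FLAG.fetch_add(1, Ordering::SeqCst);
}

pub fn read_fabrication_count() -> usize {
    READ_FABRICATION_FLAG.load(Ordering::SeqCst)
}

/// Hypothetical wrapper: runs `eval` and returns Err(n) if n reads were
/// fabricated while it ran, so the caller can discard the result.
pub fn checked_eval<T>(eval: impl FnOnce() -> T) -> Result<T, usize> {
    let before = read_fabrication_count();
    let value = eval();
    let fabricated = read_fabrication_count().wrapping_sub(before);
    if fabricated == 0 {
        Ok(value)
    } else {
        Err(fabricated)
    }
}
```

Because the counter is process-global, this attribution is only reliable if AML evaluations do not run concurrently; a per-handler counter would be needed otherwise.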
### 1.1.b: acpid `map_physical_region` panic (Gap 2)

**File**: `local/sources/base/drivers/acpid/src/aml_physmem.rs:213–232`
**Severity**: 🔴 CRITICAL
**Linux reference**: `local/reference/linux-7.1/drivers/acpi/acpica/exregion.c:145–153` (returns `AE_NO_MEMORY` status on map failure)

**Current code**:
```rust
let virt_page = common::physmap(...).expect("failed to map physical region") as usize;
```

**Target code**:
```rust
let virt_page = match common::physmap(...) {
    Ok(v) => v as usize,
    Err(e) => {
        log::error!("physmap failed at {:#x}+{:#x}: {:?}", phys_page, map_size, e);
        return PhysicalMapping {
            physical_start: phys,
            virtual_start: NonNull::dangling(),
            region_length: size,
            mapped_length: 0, // 0 length signals invalid
            handler: self.clone(),
        };
    }
};
```
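The zero-length convention only helps if every consumer checks it before dereferencing `virtual_start`. A minimal sketch of such a guard follows; `Mapping` is a simplified, hypothetical stand-in for the `acpi` crate's `PhysicalMapping` (only the two relevant fields), and `read_first_byte()` is an illustrative name, not an existing function.

```rust
use std::ptr::NonNull;

/// Simplified, hypothetical stand-in for PhysicalMapping.
pub struct Mapping {
    pub virtual_start: NonNull<u8>,
    pub mapped_length: usize,
}

impl Mapping {
    /// A failed physmap is encoded as mapped_length == 0.
    pub fn is_valid(&self) -> bool {
        self.mapped_length != 0
    }
}

/// Reads the first mapped byte, or returns None for a failed mapping.
pub fn read_first_byte(mapping: &Mapping) -> Option<u8> {
    if !mapping.is_valid() {
        // Never touch the dangling pointer of a failed mapping.
        return None;
    }
    // SAFETY (assumed invariant of this sketch): a non-zero mapped_length
    // means virtual_start points to at least that many readable bytes.
    Some(unsafe { mapping.virtual_start.as_ptr().read_volatile() })
}
```

Whether the real consumers inside the `acpi` crate perform an equivalent check is not established by the plan and would need to be verified before relying on this convention.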

### 1.3: Wire pcid→acpid fd (Gap 3)

**Files**:
- `local/sources/base/drivers/pcid/src/main.rs` (add fd send)
- `local/sources/base/drivers/acpid/src/scheme.rs` (handle `RegisterPci`)
- `local/sources/base/drivers/acpid/src/acpi.rs:400` (pass pci_fd to aml_context_mut)

**Implementation sketch**:
```rust
// In pcid/src/main.rs, after PCI bus init, before event loop:
let acpi_register = File::open("/scheme/acpi/register_pci")?;
let pci_scheme_fd = /* get from pcid's internal pci scheme handle */;
send_fd(acpi_register, pci_scheme_fd)?;

// In acpi.rs line 400, replace:
let interpreter = symbols.aml_context_mut(None)?;
// with:
let interpreter = symbols.aml_context_mut(self.pci_fd.as_ref())?;
```

### 1.5: Replace u8 CPU ID panic (Gap 10)

**File**: `local/sources/base/drivers/pcid/src/driver_interface/irq_helpers.rs:89`
**Severity**: 🟠 HIGH (panic on >255 CPU systems)
**Current code**:
```rust
let cpu_id = u8::try_from(cpu_id).expect("usize cpu ids not implemented yet");
```
**Target code**:
```rust
let cpu_id = u32::try_from(cpu_id)
    .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "cpu_id > u32::MAX"))?;
```
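The `?` in the target code requires an enclosing function that returns `io::Result`. A self-contained sketch of the same conversion is shown here; the function name `cpu_id_to_u32` is hypothetical and only serves to make the snippet compile on its own.

```rust
use std::io;

/// Hypothetical wrapper around the target conversion: widens the accepted
/// range from u8 to u32 and reports out-of-range IDs as an error instead
/// of panicking.
pub fn cpu_id_to_u32(cpu_id: usize) -> io::Result<u32> {
    u32::try_from(cpu_id)
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "cpu_id > u32::MAX"))
}
```

With this shape, a CPU ID such as 300 is accepted, and only values that do not fit in 32 bits produce `ErrorKind::InvalidInput`. Callers that previously relied on the `u8` type need to be checked for places where the ID is narrowed again.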

### 2.2: Add `/scheme/irq/remapping` control file (Gap 11)

**File**: `local/sources/kernel/src/scheme/irq.rs`
**Severity**: 🟡 MEDIUM
**Linux reference**: `local/reference/linux-7.1/include/linux/pci.h` (`pci_write_config_byte` is the equivalent scheme pattern in Redox)

**Implementation**:
1. Add `Handle::RemappingControl` variant
2. In `kopenat()`, detect path `"remapping"` and return `OpenResult::Other` with this handle
3. In `kwrite()`, parse `"0"` or `"1"` and call `set_iommu_remapping_active()`
4. Document in `irqs.md` (or scheme doc)

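The parsing in step 3 can be sketched as a pure function that the write handler would call before `set_iommu_remapping_active()`. This is an illustration, not kernel code: `parse_remapping_write` and `InvalidRemappingValue` are hypothetical names, and accepting one trailing newline is an assumption made so that shell-style writes also work.

```rust
/// Hypothetical error type; a kernel handler would likely map it to EINVAL.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidRemappingValue;

/// Parses the payload written to the control file.
/// "1" enables remapping, "0" disables it; anything else is rejected.
pub fn parse_remapping_write(buf: &[u8]) -> Result<bool, InvalidRemappingValue> {
    // Assumption: tolerate a single trailing newline.
    let trimmed = match buf {
        [rest @ .., b'\n'] => rest,
        other => other,
    };
    match trimmed {
        [b'0'] => Ok(false),
        [b'1'] => Ok(true),
        _ => Err(InvalidRemappingValue),
    }
}
```

Rejecting every other payload, rather than treating it as "0", keeps a malformed write from silently disabling remapping.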

### 2.3-2.4: iommu daemon writes to `/scheme/irq/remapping` (Gap 11)

**File**: `local/recipes/system/iommu/source/src/main.rs`
**Severity**: 🟡 MEDIUM
**Implementation**:
```rust
use std::io::Write; // brings write_all() into scope

// After successful INIT_UNITS and IRTE setup:
let mut remapping = std::fs::File::create("/scheme/irq/remapping")?;
remapping.write_all(b"1")?;

// On shutdown signal:
let mut remapping = std::fs::File::create("/scheme/irq/remapping")?;
remapping.write_all(b"0")?;
```

### 2.6: Move DMAR init from acpid to iommu daemon (Gap 9)

**Files**:
- Remove: `local/sources/base/drivers/acpid/src/acpi/dmar/mod.rs` (533 lines of orphaned code)
- Add: DMAR parsing to `local/recipes/system/iommu/source/src/intel.rs` (new file)
- Add: DMAR init wired into `local/recipes/system/iommu/source/src/main.rs` `INIT_UNITS` path

**Linux reference**: `local/reference/linux-7.1/drivers/iommu/intel/dmar.c:408–456` (`dmar_parse_one_drhd`)

### 3.1-3.3: Re-enable APIC timer (Gap 4)

**File**: `local/sources/kernel/src/arch/x86_shared/device/local_apic.rs`
**Severity**: 🟠 HIGH
**Linux reference**: `local/reference/linux-7.1/arch/x86/kernel/apic/apic.c:277–321` (`__setup_APIC_LVTT`)

**Implementation**:
1. Implement `setup_timer()` method (TSC deadline mode first, periodic fallback)
2. Add PM-timer or TSC calibration (`lapic_cal_handler` pattern, `apic.c:662–688`)
3. Uncomment line 81: `self.setup_timer();`
4. Verify with `test-timer-qemu.sh`

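Steps 1 and 2 come down to two small computations: encoding the LVT Timer register and turning a calibration measurement into a tick rate. The sketch below assumes the standard x86 local APIC layout (vector in bits 0 to 7, mask in bit 16, timer mode in bits 17 to 18); the names `TimerMode`, `lvt_timer_value` and `apic_ticks_per_ms` are hypothetical and not taken from the kernel source.

```rust
/// Timer modes as encoded in bits 17..=18 of the LVT Timer register.
#[derive(Clone, Copy)]
pub enum TimerMode {
    OneShot = 0b00,
    Periodic = 0b01,
    TscDeadline = 0b10,
}

/// Builds the 32-bit LVT Timer register value.
pub fn lvt_timer_value(vector: u8, mode: TimerMode, masked: bool) -> u32 {
    (vector as u32) | ((masked as u32) << 16) | ((mode as u32) << 17)
}

/// Converts a calibration sample into APIC timer ticks per millisecond.
/// `ticks_elapsed` is how far the APIC counter moved while a reference
/// clock (PM timer, PIT, or TSC) measured `interval_us` microseconds.
pub fn apic_ticks_per_ms(ticks_elapsed: u32, interval_us: u32) -> Option<u32> {
    if interval_us == 0 {
        return None; // calibration failed; caller should keep the fallback timer
    }
    let per_ms = (ticks_elapsed as u64) * 1000 / (interval_us as u64);
    u32::try_from(per_ms).ok()
}
```

Returning `None` on a failed measurement matches the R3 mitigation of degrading gracefully to the PIT path. In TSC deadline mode the calibration step is not needed, since the deadline is expressed in TSC units.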

### 5.1-5.6: OHCI and UHCI drivers (Gap 7)

**Files**:
- `local/recipes/drivers/ohcid/source/src/main.rs` (currently 19-line stub)
- `local/recipes/drivers/uhcid/source/src/main.rs` (currently 19-line stub)

**Linux reference**:
- `local/reference/linux-7.1/drivers/usb/host/ohci-hcd.c` (full reference impl)
- `local/reference/linux-7.1/drivers/usb/host/uhci-hcd.c` (full reference impl)

**Implementation order**:
1. **OHCI first** (3–4 weeks): MMIO register access, HCCA, transfer descriptors, frame list, port management
2. **UHCI second** (3–4 weeks): I/O port register access, QH/TD management, FLBASEADD, port control

### 5.7: MSI multi-vector allocation (Gap 12)

**File**: `local/sources/base/drivers/pcid/src/driver_interface/irq_helpers.rs:307`
**Severity**: 🟡 MEDIUM
**Linux reference**: `local/reference/linux-7.1/drivers/pci/msi/api.c` (`pci_alloc_irq_vectors()`)

**Implementation**:
1. Add `pci_allocate_interrupt_vectors(pcid_handle, driver, count)` to pcid
2. For MSI: set `multi_message_enable = log2(count)`, allocate contiguous aligned vectors
3. For MSI-X: loop calling `allocate_single_interrupt_vector_for_msi()` per vector
4. Update xhcid, nvmed, ixgbed, redox-drm to use multi-vector where appropriate

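The MSI arithmetic in step 2 can be sketched as follows. The plan only states `log2(count)`; rounding a non-power-of-two request up to the next power of two is an assumption here, based on MSI granting vectors in powers of two up to 32. Both function names are hypothetical.

```rust
/// Computes the Multiple Message Enable field for a requested vector count.
/// Returns None for 0 or for more than the 32 vectors plain MSI can express.
pub fn msi_multi_message_enable(count: u32) -> Option<u8> {
    if count == 0 || count > 32 {
        return None;
    }
    // Assumption: round up to a power of two, then take log2.
    Some(count.next_power_of_two().trailing_zeros() as u8)
}

/// Checks that a base vector is aligned to the granted block size, which
/// is required because the device varies the low `mme` bits of the
/// message data to select a vector.
pub fn msi_base_is_aligned(base_vector: u8, mme: u8) -> bool {
    let block = 1u32 << mme;
    (base_vector as u32) & (block - 1) == 0
}
```

A request for 3 vectors therefore yields a field value of 2 (a block of 4), and the allocator has to find 4 contiguous vectors whose base is a multiple of 4. MSI-X has no such constraint, which is why step 3 can allocate each vector independently.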

## 14. v1.1 Audit Methodology

The v1.1 corrections were made by:

1. **Reading** the source files at the locations the v1.0 plan claimed contained stubs
2. **Discovering** that several locations don't exist (`kernel/src/arch/x86_shared/sleep.rs:257–276`)
3. **Finding** the actual stubs at different locations
4. **Cross-referencing** against Linux 7.1 reference at `local/reference/linux-7.1/` for each fix
5. **Verifying** through grep + read that the line numbers in the v1.0 plan were sometimes off
6. **Checking** git history of `local/sources/base/` and `local/sources/kernel/` to ensure fixes target the correct durable location

### Findings of the audit

| v1.0 claim | v1.1 reality |
|---|---|
| Gap 3: kernel `sleep.rs:257–276` PCI stubs | **Does not exist**: sleep path is in `acpid/aml_physmem.rs:375–398` |
| Gap 7: no EHCI driver | **EHCI is implemented** (1538+ lines); stubs are OHCI + UHCI |
| Gap 1: MSI stub at `kernel/scheme/irq.rs:231` | **Fixed 2026-06-08** (this audit's first deliverable) |
| Gap 2: AML stubs at `aml_physmem.rs:195, :274` | **Wrong line numbers**: actual stubs are at `:241–280` (reads) and `:213–232` (map) |
| Gap 4: APIC timer disabled | **Confirmed**: `setup_timer()` method doesn't even exist |
| Gap 6: Dual AML interpreters | **Confirmed, but reduced scope**: kernel may not have AML interpreter at all |
| Gap 8: No C-state backend | **Confirmed**: no `cpuidle` exists, no `cstate.rs` in acpid |
| Gap 9: DMAR orphaned | **Confirmed, but ownership wrong**: should be in iommu daemon, not acpid |
| Gap 10: >256 CPU MSI | **Confirmed, but is a panic, not a deferred case**: `u8::try_from(...).expect(...)` |
| New Gap 11: IOMMU→kernel integration | **New finding**: kernel has `set_iommu_remapping_active()` but daemon never calls it |
| New Gap 12: MSI multi-vector | **New finding**: required by xhcid, nvmed, ixgbed, redox-drm |

---

**When this document conflicts with a canonical subsystem plan**, the **canonical plan** wins on subsystem-specific details, and this document wins on cross-cutting prioritization and inter-subsystem dependencies.

**This document should be updated** after each phase gate is reached, or when new critical stubs are discovered.