feat: IOMMU-aware DmaAllocator + comprehensive DMA/thread audit

dma.rs: IommuDmaAllocator (145 lines)
- New struct wires existing IOMMU daemon (1003 lines) to existing DmaBuffer (261)
- allocate(): phys-contiguous alloc via scheme:memory, then MAP through IOMMU domain
- unmap(): sends UNMAP to IOMMU domain, releases IOVA
- Inlined IOMMU protocol constants — no new crate dependency
- encode_iommu_request/decode_iommu_response for scheme write/read cycle

Documentation updates:
- IMPLEMENTATION-MASTER-PLAN.md: K2 DMA/IOMMU section expanded from 3-line gap
  list to full audit with component inventory, gap analysis, implementation plan
  (D2.1-D2.5), Linux reference table. Added K2b thread/fork audit.
- CPU-DMA-IRQ-MSI-SCHEDULER-FIX-PLAN.md: Phase 1 (MSI) marked complete with
  per-task status. Phase 2 (DMA) re-scoped from 'create' to 'wire' based on
  audit. Phase 3 (scheduler) marked mostly done.
- IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md: kernel MSI support noted
  as materially strong with P8-msi.patch reference.

Audit findings:
- IOMMU daemon is solid: 1003-line lib.rs with full scheme protocol,
  427-line amd_vi.rs, host-runnable tests. Needs wiring, not rewriting.
- DmaBuffer exists but is IOMMU-unaware — IommuDmaAllocator bridges this.
- relibc rlct_clone is correct for threads (shares addr space implicitly).
  '3 IPC hops' claim is microkernel-architectural, not a real perf issue.
- No stale docs to archive at this time.
This commit is contained in:
2026-05-04 18:18:04 +01:00
parent 678980521c
commit 029472d5e3
4 changed files with 285 additions and 81 deletions
+59 -7
View File
@@ -149,15 +149,67 @@ Stream.rs exists (387 lines). NOT runtime-validated.
| TSC calibration | `arch/x86/kernel/tsc.c:1186` | 1,612 |
| APIC timer calibration | `arch/x86/kernel/apic/apic.c:294` | 2,694 |
| Vector allocation | `arch/x86/kernel/apic/vector.c` | 1,387 |
| MSI/MSI-X | `arch/x86/kernel/apic/msi.c` | 391 |
| MSI/MSI-X | `arch/x86/kernel/apic/msi.c` | 391 | ✅ DONE — P8-msi.patch (msi.rs, vector.rs, scheme/irq.rs, driver-sys) |
### K2: DMA / Memory
### K2: DMA / IOMMU (Audited 2026-05-04)
| Gap | Linux Ref | Lines |
|-----|-----------|-------|
| Coherent DMA | `kernel/dma/mapping.c` | 1,016 |
| Scatter-gather | `lib/scatterlist.c` | — |
| SWIOTLB | `kernel/dma/swiotlb.c` | — |
**Current State — Thorough Audit:**
| Component | Location | Lines | Status |
|---|---|---|---|
| IOMMU scheme daemon | `local/recipes/system/iommu/source/src/lib.rs` | 1,003 | ✅ REAL — full AMD-Vi protocol: domain CRUD, MAP/UNMAP/TRANSLATE, device assignment, event drain, IRQ remapping. Host-runnable tests pass. |
| AMD-Vi unit driver | `local/recipes/system/iommu/source/src/amd_vi.rs` | 427 | ✅ REAL — IVRS parsing, MMIO mapping, device table programming, command buffer, event log, page table init |
| Domain page tables | `local/recipes/system/iommu/source/src/page_table.rs` | — | ✅ REAL — multi-level page table, IOVA allocation, mapping flags (R/W/X/coherent/user) |
| DMA buffer (alloc+phys) | `local/recipes/drivers/redox-driver-sys/source/src/dma.rs` | 261 | ✅ REAL — `DmaBuffer` with physically contiguous allocation via scheme:memory, virt-to-phys translation, heap fallback |
| linux-kpi DMA headers | `local/recipes/drivers/linux-kpi/source/` | — | ✅ dma-mapping.h, dma-direction.h, scatterlist.h ported |
| IOMMU←→driver wiring | — | — | ❌ **GAP**`DmaBuffer` does NOT pass through IOMMU domains. GPU/NIC/NVMe drivers allocate DMA directly, not through IOMMU-isolated domains |
| Streaming DMA | — | — | ❌ **GAP** — no `dma_map_single`/`dma_unmap_single` for bounce-buffer ops |
| SWIOTLB | — | — | ❌ **GAP** — no bounce buffer for devices with limited DMA range |
**Implementation Plan — DMA/IOMMU Integration (Week 3-5):**
| Task | Description | Lines | Priority |
|---|---|---|---|
| **D2.1: IommuDmaAllocator** | New type in driver-sys: takes an IOMMU domain handle, allocates DmaBuffer through it. Uses `scheme:iommu/domain/N` MAP opcode. | ~150 | P0 |
| **D2.2: GPU DMA pass-through** | Wire `redox-drm` to use `IommuDmaAllocator` for GTT/VRAM allocations. Requires amdgpu/ihdgd to open IOMMU device handle. | ~80 | P0 |
| **D2.3: NVMe DMA pass-through** | Wire `ahcid`/`nvmed` PRP lists through `IommuDmaAllocator`. | ~60 | P1 |
| **D2.4: Streaming DMA** | `dma_map_single`/`dma_unmap_single` in linux-kpi. Allocates temp buffer, copies data, maps through IOMMU. | ~120 | P1 |
| **D2.5: SWIOTLB** | Bounce buffer allocation for DMA-limited devices. Linux ref: `kernel/dma/swiotlb.c`. | ~200 | P2 |
**Linux Reference Summary (from `local/reference/linux-7.0/`):**
| Linux API | Purpose | Red Bear Equivalent |
|---|---|---|
| `dma_alloc_coherent()` | Allocate physically contiguous, uncached DMA buffer | `DmaBuffer::allocate()` + `IommuDmaAllocator` (planned) |
| `dma_map_single()` | Map a single buffer for device DMA (cache sync) | Not yet — D2.4 |
| `dma_map_sg()` | Map scatter-gather list | Not yet |
| `iommu_domain_alloc()` | Create IOMMU translation domain | `IommuScheme` CREATE_DOMAIN opcode |
| `iommu_map()` | Map physical pages into domain | `IommuScheme` MAP opcode |
| `iommu_attach_device()` | Assign device to domain | `IommuScheme` ASSIGN_DEVICE opcode |
### K2b: Thread Creation / fork() (Audited 2026-05-04)
**Current State:**
| Component | Location | Lines | Status |
|---|---|---|---|
| Kernel `context::spawn` | `recipes/core/kernel/source/src/context/mod.rs:217` | ~25 | ✅ Creates new context with NEW address space, kernel stack, initial call frame |
| `scheme:user` process spawn | `recipes/core/kernel/source/src/scheme/user.rs:723` | — | ✅ Userspace writes process params → kernel spawns |
| relibc `rlct_clone` | `recipes/core/relibc/source/src/platform/redox/mod.rs:1154` | ~10 | ✅ Thread creation via `redox_rt::thread::rlct_clone_impl` — lightweight: shares address space, TCB, signal state |
| `pthread_create` | `recipes/core/relibc/source/src/pthread/mod.rs:105` | ~100 | ✅ Allocates stack via mmap, creates TCB, calls rlct_clone |
| Thread stack allocation | mmap-based (line 130-143) | — | ✅ MAP_PRIVATE | MAP_ANONYMOUS, correct |
**Gap Analysis:**
| Gap | Severity | Detail |
|---|---|---|
| No `clone()` syscall | MEDIUM | Redox uses `rlct_clone` for threads and `scheme:user` for processes. This is architecturally correct for a microkernel — no gap. |
| No `CLONE_VM` flag | N/A | `rlct_clone` implicitly shares address space (it's a THREAD clone, not a process clone). Process creation via `scheme:user` creates new address space. Correct semantics. |
| No `CLONE_FILES` | N/A | File descriptors are shared via the `scheme:user` write protocol. Re-layout possible but functional. |
| "3 IPC hops" slower than Linux | LOW | Measured: 1) mmap stack, 2) rlct_clone syscall, 3) synchronization mutex unlock. Linux `clone()` does all three in kernel. Acceptable for a microkernel. |
| No `posix_spawn()` fast-path | MEDIUM | Currently goes through `fork`-equivalent → `exec`. Linux has `posix_spawn` via `vfork`+`exec`. Not yet in Redox. |
**Overall verdict on DMA/IOMMU**: IOMMU daemon is the most complete userspace component — it needs wiring, not rewriting. DmaBuffer exists but is IOMMU-unaware. The implementation tasks (D2.1-D2.5) are wiring tasks connecting an already-working IOMMU to already-working driver allocators.
### K3: Virtio