# Driver Discovery and Dynamic Hardware Mapping Plan

**Status**: Draft (implementation pending)
**Date**: 2026-05-27
**Supersedes**: Ad-hoc pcid-spawner + hardcoded lived disk paths
**Author**: Red Bear OS team

---

## 1. Problem Statement

Red Bear OS has two critical gaps in hardware discovery:

1. **lived's disk fallback is broken**: The live ISO boot daemon (`lived`) tries the hardcoded paths `/scheme/disk/0` and `/scheme/usbscsi/0` to find the physical boot disk. But no disk driver registers those exact scheme names; they register `disk.pci-00-1F-2_ahci`, `disk.usb-xhci+1-scsi`, etc. The fallback **never works**.

2. **No dynamic hardware mapping**: The system does not distinguish between "hardware present" and "driver needed." On bare metal with no virtio devices, the system should not try to load `virtio-blkd`. On QEMU with no real AHCI controller, the system should not try to load `ahcid`. Today, the driver-manager loads whatever matches its static config files regardless of whether the hardware exists.

Linux solves both problems with a two-stage model:

- **Stage 1 (initramfs)**: Enumerate the PCI bus, load ONLY the storage driver matching the boot controller, mount rootfs.
- **Stage 2 (rootfs)**: Full enumeration; udev + modprobe dynamically load all remaining drivers based on actual hardware.

---

## 2. Current Architecture

### 2.1 Boot Sequence (Initfs Phase)

```
Bootstrap (PID 1) → init → services start in dependency order:
  00_runtime.target       randd, nulld, zerod, rtcd, logd
  10_inputd.service       VT input multiplexer
  10_lived.service        Live disk daemon (RAM preload + disk fallback)
  20_graphics.target      vesad (FB handoff), fbcond, fbbootlogd
  41_acpid.service        ACPI interpreter → scheme:acpi
  40_hwd.service          Hardware manager → spawns pcid internally
    pcid → enumerates PCI bus → registers scheme:pci
  00_driver-manager-initfs.service (if P26 applied)
    Loads /scheme/initfs/lib/drivers.d/00-storage.toml
    Only: ahcid, ided, nvmed, virtio-blkd
  40_drivers.target       All initfs drivers
  50_rootfs.service       Mount rootfs (hard dep on drivers.target)
  90_initfs.target        Trigger switchroot
```

### 2.2 Driver Registration Contract

All disk drivers using `driver_block::DiskScheme` register schemes starting with `"disk"`:

| Driver | Scheme Name Pattern | Match Criteria |
|--------|---------------------|----------------|
| ided | `disk.pci-XX-XX-X_ide` | PCI class 0x01, subclass 0x01 |
| ahcid | `disk.pci-XX-XX-X_ahci` | PCI class 0x01, subclass 0x06 |
| nvmed | `disk.pci-XX-XX-X-nvme` | PCI class 0x01, subclass 0x08 |
| virtio-blkd | `disk.pci-XX-XX-X_virtio_blk` | PCI vendor 0x1AF4, device 0x1001 |
| usbscsid | `disk.usb-xhci+PORT-scsi` | USB SCSI transport |
| lived | `disk.live` | RAM-backed (our daemon) |

The `DiskScheme::new()` assertion (`assert!(scheme_name.starts_with("disk"))`) is the **contract** that enables dynamic discovery: any consumer can find all disk schemes by listing `/scheme/` and filtering for the `"disk"` prefix.
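
As an illustration of the prefix contract, the following is a minimal sketch of a consumer that lists a scheme root and keeps only disk schemes. The function names `filter_disk_schemes` and `list_disk_schemes` are hypothetical and are not part of `driver-block` or any existing Red Bear OS crate; the sketch assumes only that scheme names appear as directory entries under `/scheme/`.

```rust
use std::fs;
use std::io;

/// Keep only names that follow the `disk` prefix contract.
/// The result is sorted so callers see a stable order.
/// (Hypothetical helper, not an existing driver-block API.)
fn filter_disk_schemes<I, S>(names: I) -> Vec<String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut found = Vec::new();
    for name in names {
        let s: &str = name.as_ref();
        if s.starts_with("disk") {
            found.push(s.to_string());
        }
    }
    found.sort();
    found
}

/// List `scheme_root` (normally "/scheme") and return the disk schemes in it.
/// Any I/O error from reading the directory is passed back to the caller.
fn list_disk_schemes(scheme_root: &str) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(scheme_root)? {
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    Ok(filter_disk_schemes(names))
}
```

Keeping the filter separate from the directory read means the contract can be checked against a plain list of names, without a running scheme namespace.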

### 2.3 The Two Driver-Loading Paths

| Path | Mechanism | Config Source | Drivers |
|------|-----------|---------------|---------|
| **Initfs** | `driver-manager --initfs` | `/scheme/initfs/lib/drivers.d/00-storage.toml` | Storage only (4 drivers) |
| **Rootfs** | `driver-manager --hotplug` | `/lib/drivers.d/*.toml` | All categories (40+ drivers) |

### 2.4 How Linux Does It (Reference)

Linux uses a two-tier ordering:

**Tier 1: Initcall levels** (include/linux/init.h):

```
Level 0: pure_initcall      (architecture setup)
Level 2: postcore_initcall  (PCI subsystem registers here)
Level 4: subsys_initcall    (SCSI, networking subsystems)
Level 6: device_initcall    (module_init → all built-in drivers)
Level 7: late_initcall      (late-stage platform drivers)
```

**Tier 2: Link order** within device_initcall (drivers/Makefile):

```
Line 49: obj-y += virtio/   # VirtIO before block
Line 76: obj-y += block/    # Block devices (storage)
Line 84: obj-y += nvme/     # NVMe
Line 85: obj-y += ata/      # ATA/AHCI
Line 92: obj-y += net/      # Network
Line 68: obj-y += gpu/      # GPU comes AFTER storage
```

**The critical principle**: Storage must load before GPU not because of PCI ordering, but because GPU drivers need firmware blobs from `/lib/firmware/`, which requires a mounted filesystem. Storage drivers are needed to mount that filesystem.

**Dynamic loading** (after rootfs mount): `MODULE_DEVICE_TABLE` entries in every driver generate `modules.alias` patterns. udev receives kernel uevents with `MODALIAS=pci:v00001AF4d00001001...` and calls `modprobe`, which looks up the alias and loads the matching `.ko` module.

---

## 3. Design: Two-Stage Dynamic Hardware Discovery

### 3.1 Stage 1: Initfs Boot (Storage-Only)

**Goal**: Load exactly the storage driver(s) needed to mount the root filesystem. No more, no less.

**Mechanism**: driver-manager `--initfs` already exists and does PCI class/vendor matching.
The missing piece is that the P26 patch (which creates `00_driver-manager-initfs.service` and `initfs-storage.toml`) is wired into `recipe.toml` but still needs to be applied.

**Initfs driver config** (`initfs-storage.toml`):

```toml
# Only storage drivers: needed to mount rootfs
# GPU/display deliberately excluded (handled by rootfs DRM/KMS stack)

[[driver]]
name = "nvmed"
description = "NVMe storage driver"
priority = 100
command = ["/scheme/initfs/lib/drivers/nvmed"]

[[driver.match]]
bus = "pci"
class = 1
subclass = 8

[[driver]]
name = "ahcid"
description = "AHCI SATA driver"
priority = 100
command = ["/scheme/initfs/lib/drivers/ahcid"]

[[driver.match]]
bus = "pci"
class = 1
subclass = 6

[[driver]]
name = "ided"
description = "PATA IDE driver"
priority = 100
command = ["/scheme/initfs/lib/drivers/ided"]

[[driver.match]]
bus = "pci"
class = 1
subclass = 1

[[driver]]
name = "virtio-blkd"
description = "VirtIO block device driver"
priority = 100
command = ["/scheme/initfs/lib/drivers/virtio-blkd"]

[[driver.match]]
bus = "pci"
vendor = 0x1AF4
device = 0x1001
```

**How this is already dynamic**: The driver-manager only spawns a driver when the PCI bus actually reports a matching device. If QEMU has no AHCI controller, `ahcid` is never spawned. If bare metal has no VirtIO devices, `virtio-blkd` is never spawned. The TOML match table is a **candidate list**, not a **must-load list**.

**What's needed**: Ensure P26 is applied, ensure `virtio-blkd` is in the BINS list, and ensure the initfs binary staging includes all 4 storage drivers.

### 3.2 Stage 2: Rootfs (Full Hardware Discovery)

**Goal**: After rootfs is mounted, dynamically discover and load ALL remaining drivers based on actual hardware.

**Mechanism**: `driver-manager --hotplug` already reads `/lib/drivers.d/*.toml` (8 config files, 40+ drivers), enumerates the PCI and ACPI buses, and spawns matching drivers. It also runs a hotplug loop for device add/remove.
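
The candidate-list behaviour used in both stages can be sketched roughly as below. This is an illustration under stated assumptions, not driver-manager's actual code: the types `PciDevice`, `MatchRule`, and `DriverEntry` and the functions `storage_table` and `drivers_to_spawn` are hypothetical, and the rules simply mirror the four `[[driver.match]]` tables from `initfs-storage.toml`. A rule field left as `None` is treated as a wildcard.

```rust
/// A device as reported by a PCI bus scan (hypothetical shape).
#[derive(Debug, Clone, Copy)]
struct PciDevice {
    vendor: u16,
    device: u16,
    class: u8,
    subclass: u8,
}

/// One `[[driver.match]]` table; `None` means "do not care".
#[derive(Debug, Clone, Copy, Default)]
struct MatchRule {
    vendor: Option<u16>,
    device: Option<u16>,
    class: Option<u8>,
    subclass: Option<u8>,
}

impl MatchRule {
    /// True when every field that is set equals the device's value.
    fn matches(&self, dev: &PciDevice) -> bool {
        self.vendor.map_or(true, |v| v == dev.vendor)
            && self.device.map_or(true, |d| d == dev.device)
            && self.class.map_or(true, |c| c == dev.class)
            && self.subclass.map_or(true, |s| s == dev.subclass)
    }
}

/// One `[[driver]]` entry with its match rules.
struct DriverEntry {
    name: &'static str,
    rules: Vec<MatchRule>,
}

/// The four initfs storage candidates, mirroring initfs-storage.toml.
fn storage_table() -> Vec<DriverEntry> {
    let class_rule = |subclass| MatchRule {
        class: Some(1),
        subclass: Some(subclass),
        ..MatchRule::default()
    };
    vec![
        DriverEntry { name: "nvmed", rules: vec![class_rule(8)] },
        DriverEntry { name: "ahcid", rules: vec![class_rule(6)] },
        DriverEntry { name: "ided", rules: vec![class_rule(1)] },
        DriverEntry {
            name: "virtio-blkd",
            rules: vec![MatchRule {
                vendor: Some(0x1AF4),
                device: Some(0x1001),
                ..MatchRule::default()
            }],
        },
    ]
}

/// Return the drivers that have at least one matching device present.
/// Drivers with no matching hardware are never returned, so never spawned.
fn drivers_to_spawn(devices: &[PciDevice], table: &[DriverEntry]) -> Vec<&'static str> {
    table
        .iter()
        .filter(|drv| {
            devices
                .iter()
                .any(|dev| drv.rules.iter().any(|rule| rule.matches(dev)))
        })
        .map(|drv| drv.name)
        .collect()
}
```

With a device list containing only an IDE controller (class 1, subclass 1) and a network card, this sketch selects `ided` alone, which is the "candidate list, not must-load list" property described above.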

**The existing driver configs are already data-driven and dynamic**:

| Config File | Category | Priority | Matching |
|-------------|----------|----------|----------|
| `00-storage.toml` | Storage | 100 | PCI class-based |
| `10-network.toml` | Network | 50 | PCI vendor + class |
| `20-usb.toml` | USB | 80 | PCI class + prog_if |
| `30-graphics.toml` | GPU/Display | 60 | PCI class 0x03 |
| `40-input.toml` | Input | 40 | Sentinel (vendor=0xFFFF) |
| `50-audio.toml` | Audio | 40 | PCI vendor + class |
| `60-gpio-i2c.toml` | GPIO/I2C | 30 | ACPI bus matching |
| `70-usb-class.toml` | USB class | 20 | Sentinel (vendor=0xFFFF) |

**Key property**: Priority ordering ensures storage (100) > USB (80) > GPU (60) > network (50) > audio (40). This mirrors Linux's link-order principle.

### 3.3 lived Disk Fallback Fix

**Current bug**: `lived` tries `/scheme/disk/0`, but real schemes are named like `disk.pci-00-1F-2_ahci`, never just `disk`.

**Fix**: Replace the hardcoded paths with RedoxFS-style dynamic scheme discovery (same pattern as `filesystem_by_uuid` in `redoxfs/src/bin/mount.rs`):

```rust
use std::fs::File;

fn try_open_disk(&self) -> Result<File, String> {
    for attempt in 0..DISK_OPEN_MAX_RETRIES {
        // List /scheme/ to find all registered disk schemes
        if let Ok(entries) = std::fs::read_dir("/scheme") {
            for entry in entries.flatten() {
                let name = entry.file_name();
                let name_str = name.to_string_lossy();
                // All disk schemes start with "disk." (driver-block contract)
                // Skip our own "disk.live" scheme
                if name_str.starts_with("disk.") && name_str != "disk.live" {
                    // Try opening disk 0 on this scheme
                    let path = format!("/scheme/{}/0", name_str);
                    if let Ok(file) = File::open(&path) {
                        eprintln!("lived: opened physical disk at {} (attempt {})",
                                  path, attempt + 1);
                        return Ok(file);
                    }
                }
            }
        }
        // Sleep between attempts, but not after the final one
        if attempt < DISK_OPEN_MAX_RETRIES - 1 {
            std::thread::sleep(std::time::Duration::from_millis(
                DISK_OPEN_RETRY_INTERVAL_MS
            ));
        }
    }
    Err(format!("no disk scheme found after {} retries", DISK_OPEN_MAX_RETRIES))
}
```

**This is the exact pattern RedoxFS uses** in `filesystem_by_uuid()`. It:

1. Lists `/scheme/` (all registered schemes)
2. Filters to names starting with `"disk."` (the `driver-block` contract)
3. Skips `disk.live` (our own RAM-backed scheme)
4. Tries opening disk 0 on each discovered scheme

**Boot timing**: lived starts at service 10, before the disk drivers. The retry loop (60 × 500ms = 30s) gives driver-manager and the storage drivers time to load and register their schemes. As soon as ANY storage driver registers `disk.*`, lived finds it.

---

## 4. What Needs to Change

### 4.1 Patches Required

| Component | Patch | What It Does |
|-----------|-------|--------------|
| **base** | P60 (new) | Add `virtio-blkd` to BINS + staged files; update lived's `try_open_disk()` with dynamic scheme discovery |
| **kernel** | P26 (existing) | DebugDisplay scrolling fix (already done) |
| **base** | P26-driver-manager-initfs-conversion.patch (existing, wired but needs application verification) | Replaces pcid-spawner with driver-manager in initfs |

### 4.2 Changes to `recipes/core/base/recipe.toml`

1. **Add `virtio-blkd` to BINS** (already done in working tree)
2. **Add `virtio-blkd` to staged files list** (already done in working tree)
3. **No changes to driver configs**: `initfs-storage.toml` already lists all 4 storage drivers

### 4.3 Changes to `recipes/core/base/source/drivers/storage/lived/src/main.rs`

Replace the hardcoded `candidates` array in `try_open_disk()` with `/scheme/` directory enumeration that discovers disk schemes dynamically.

### 4.4 No Changes Needed

- **driver-manager**: already does dynamic PCI matching
- **initfs-storage.toml**: already has the right 4 storage drivers
- **Driver configs** (`/lib/drivers.d/*.toml`): already data-driven with vendor/class matching
- **pcid**: already enumerates the PCI bus correctly
- **Boot service order**: already correct (lived at 10, driver-manager-initfs at 00, rootfs at 50)

---

## 5. Verification Plan

### 5.1 QEMU with IDE (default)

```bash
timeout 60 qemu-system-x86_64 \
  -drive file=build/x86_64/redbear-full.iso,format=raw \
  -m 4G -smp 4 -serial stdio -no-reboot
```

Expected: lived finds the `disk.pci-00-01-1_ide` scheme from `ided`, mounts rootfs.

### 5.2 QEMU with virtio-blk

```bash
timeout 60 qemu-system-x86_64 \
  -device virtio-blk-pci,drive=drive0 \
  -drive id=drive0,file=build/x86_64/redbear-full.iso,format=raw,if=none \
  -m 4G -smp 4 -serial stdio -no-reboot
```

Expected: lived finds the `disk.pci-00-XX-X_virtio_blk` scheme from `virtio-blkd`, mounts rootfs.

### 5.3 Bare Metal USB Boot

Expected: lived finds the `disk.usb-xhci+PORT-scsi` scheme from `usbscsid`, mounts rootfs.

### 5.4 No Unnecessary Drivers

On QEMU with only virtio-blk (no AHCI), `ahcid` should NOT be spawned. Verify via the boot log:

```
driver-manager: no driver found for pci 0000:00:01.1  # IDE controller: no match
driver-manager: bound: 0000:00:04.0 -> virtio-blkd     # VirtIO block: matched
```

---

## 6. PCI Class Code Reference

From Linux `include/linux/pci_ids.h` and our driver configs:

| Class | Subclass | Prog IF | Device Type | Red Bear Driver |
|-------|----------|---------|-------------|-----------------|
| 0x01 | 0x01 | - | IDE/PATA | `ided` |
| 0x01 | 0x06 | 0x01 | AHCI SATA | `ahcid` |
| 0x01 | 0x08 | 0x02 | NVMe | `nvmed` |
| 0x01 | 0x00 | - | VirtIO Block (vendor 0x1AF4, device 0x1001) | `virtio-blkd` |
| 0x02 | - | - | Ethernet | `e1000d`, `rtl8168d`, etc. |
| 0x03 | - | - | Display/GPU | `redox-drm` |
| 0x04 | 0x03 | - | Audio (HDA) | `ihdad` |
| 0x0C | 0x03 | 0x30 | xHCI USB | `xhcid` |
| 0x0C | 0x03 | 0x00 | UHCI USB | `uhcid` |
| 0x0C | 0x03 | 0x10 | OHCI USB | `ohcid` |
| 0x0C | 0x03 | 0x20 | EHCI USB | `ehcid` |

---

## 7. Boot Timeline (Target State)

```
T+0ms     Bootstrap starts, creates initfs/procmgr/namespace schemes
T+50ms    init starts, launches 00_randd → 00_logd → 00_runtime.target
T+200ms   lived starts (service 10), loads 128 MiB preload
T+300ms   vesad starts (FB handoff for text console)
T+400ms   acpid starts → ACPI interpreter → scheme:acpi
T+500ms   hwd starts → spawns pcid → PCI bus scan → scheme:pci
          driver-manager --initfs starts:
            Loads 00-storage.toml (4 storage drivers)
            Enumerates PCI bus via /scheme/pci/
            QEMU: finds 8086:7010 (IDE) → spawns ided
                  finds 1234:1111 (QEMU standard VGA) → no storage match, skipped
                  finds 1AF4:1000 (virtio-net) → no storage match, skipped
T+1500ms  ided registers disk.pci-00-01-1_ide
          lived discovers disk.pci-00-01-1_ide via /scheme/ enumeration
          lived disk fallback succeeds
T+2000ms  redoxfs mounts rootfs from lived
T+2500ms  switchroot → rootfs init starts
T+3000ms  driver-manager --hotplug starts (rootfs):
            Loads all /lib/drivers.d/*.toml configs
            Detects ided already bound → skips
            Finds 1234:1111 (display class 0x03) → spawns redox-drm
            Finds 8086:100E (network class 0x02) → spawns e1000d
            Finds 1AF4:1000 (virtio-net) → spawns virtio-netd
T+5000ms  All drivers bound, system fully operational
```

---

## 8. Principles

1. **Data-driven, not hardcoded**: Driver matching via TOML configs with vendor/device/class fields. No binary name hardcoding, no path guessing.
2. **Enumerate first, match second**: The PCI bus scan produces ALL devices. Driver matching filters to supported ones. Unknown hardware is logged but doesn't block boot.
3. **Priority ordering**: Storage (100) before USB (80) before GPU (60) before network (50) before audio (40). Mirrors Linux's link-order principle.
4. **Stage 1 = minimum viable set**: Initfs loads ONLY storage drivers. Everything else waits for rootfs.
5. **Dynamic scheme discovery**: lived discovers disk schemes by reading `/scheme/` and filtering for the `"disk."` prefix, the same contract that `driver-block` enforces.
6. **No unnecessary drivers**: If hardware doesn't exist, the driver is never spawned. `driver-manager` only calls `probe()` for devices that actually exist on the PCI/ACPI bus.
7. **Deferred retry for timing**: Drivers that start before their dependencies are ready get retried (3 times in initfs, 5 times in hotplug). After max retries, the device is permanently skipped with a logged reason.
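
The deferred-retry principle can be sketched as a bounded loop that records why a device was finally skipped. This is a minimal illustration, not driver-manager's implementation: `Outcome` and `probe_with_retries` are hypothetical names, the limit is interpreted here as a total attempt count (the plan does not say whether "3 times" includes the first try), and a real loop would also wait between attempts.

```rust
/// Result of trying to bind one device (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The probe succeeded on the given attempt (1-based).
    Bound { attempts: u32 },
    /// Every attempt failed; `reason` is the last error seen.
    Skipped { attempts: u32, reason: String },
}

/// Call `probe` up to `max_attempts` times, stopping at the first success.
/// `probe` receives the 1-based attempt number.
fn probe_with_retries<F>(max_attempts: u32, mut probe: F) -> Outcome
where
    F: FnMut(u32) -> Result<(), String>,
{
    let mut last_error = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        match probe(attempt) {
            Ok(()) => return Outcome::Bound { attempts: attempt },
            // Remember the failure so the skip can be logged with a reason.
            Err(e) => last_error = e,
        }
    }
    Outcome::Skipped {
        attempts: max_attempts,
        reason: last_error,
    }
}
```

Returning the last error alongside the skip keeps the "logged reason" requirement in one place: the caller decides how to log it, and the retry helper stays free of I/O.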