Red Bear OS: Boot Process & Hardware Detection Improvement Plan

Version: 1.5 (2026-05-15)
Reference: Linux 7.1-rc3 (local/reference/linux-7.1/)
Status: Canonical plan for boot efficiency, hardware detection completeness, and init ordering

Implementation Status (2026-05-15)

Approach changed: Instead of creating a separate redbear-hwdetect daemon, we are enhancing the existing driver-manager with ACPI bus support and boot stage targets. This builds on the existing redox-driver-core device model (DeviceId, DeviceInfo, Bus, Driver, DeviceManager) rather than duplicating it.

Completed

| Wave | Status | What was done |
|------|--------|---------------|
| Wave 0 | Done | Created config/redbear-boot-stages.toml with 4 stage targets (02_early_hw, 04_drivers, 06_services, 08_userland) + serial boot markers |
| Wave 1 | Done | Created local/recipes/drivers/redox-driver-acpi/ with AcpiBus that enumerates ACPI devices from /scheme/acpi/symbols/. Registered in driver-manager alongside PciBus. Added _HID-based device classification (maps ~40 ACPI hardware IDs to PCI-equivalent class/subclass/vendor). 15 unit tests pass. |
| Wave 2 | Done | Created resource.rs, an ACPI resource descriptor parser (raw byte buffers → typed structs for IRQ, MMIO, I/O port, DMA, address spaces). Covers all 25 ACPI resource types (types 0-25). Created prt.rs, a _PRT PCI IRQ routing table resolver (parses RON-serialized Package-of-Packages, resolves static GSI and dynamic link device routing). Fixed bus.rs to use child symbol lookup for _HID/_CID (the RON Device variant is unit; properties are separate namespace children). Added query_device_resources() API to AcpiBus. 20+ new unit tests across all modules. |
| Wave 2b | Done | Extended driver-manager/config.rs probe() to handle ACPI device binding alongside PCI. ACPI devices get ACPI_DEVICE_PATH, ACPI_DEVICE_NAME, ACPI_MMIO_N, ACPI_IRQ_N, ACPI_IO_N env vars passed to spawned drivers. PCI devices continue using PCID_CLIENT_CHANNEL/PCID_DEVICE_PATH. Updated scheme.rs to accept ACPI device names in the scheme namespace (relaxed PCI-only validation). main.rs now notifies bound devices for both buses. |
| Wave 2c | Done | Created ACPI driver config with match criteria in 60-gpio-i2c.toml: Intel I2C (class=0x0C/sub=0x05/vendor=0x8086 → dw-acpi-i2cd), AMD I2C (class=0x0C/sub=0x05/vendor=0x1022 → amd-mp2-i2cd), Intel GPIO (class=0x0C/sub=0x80/vendor=0x8086 → intel-gpiod). Wired into redbear-device-services.toml as /lib/drivers.d/60-gpio-i2c.toml. Infrastructure daemons (i2cd, gpiod) remain as init services (scheme providers); controller drivers are dual-pathed (init fallback + driver-manager matching). |
| Wave 3 | Done | Rewired all services in redbear-device-services.toml, redbear-mini.toml, redbear-full.toml to use stage targets instead of flat 00_base.target |
| Wave 4 | Done | Removed dead /etc/pcid.d/ entries from redbear-mini.toml and redbear-full.toml. Confirmed no runtime binary reads /etc/pcid.d/. All driver matching now uses /lib/drivers.d/. |
| Wave 5 | Already present | driver-manager/config.rs already has scheme-aware deferred probing via check_scheme_available() and the depends_on field |

Not Yet Started

| Wave | Status | What remains |
|------|--------|--------------|
| Wave 2c | Not started | Runtime _CRS evaluation via ACPI scheme call() interface for Method-type _CRS (currently only Buffer-type _CRS is parsed). Link device _CRS resolution for dynamic _PRT entries. Full image build verification. |

Config consistency verified (2026-05-15)

All requires_weak references in config files resolve to valid targets or services:

  • 00_base.target — staged by base package at /usr/lib/init.d/00_base.target
  • Stage targets (02_early_hw through 08_userland) — defined in redbear-boot-stages.toml
  • 12_boot-late.target — compat alias defined in redbear-device-services.toml
  • 05_boot-essential.target — defined in redbear-full.toml and redbear-greeter-services.toml
  • All service dependencies have corresponding [[files]] entries or package-staged definitions

Purpose

This document is the execution plan for making the Red Bear OS boot process stellar: efficient, complete, and — above all — featuring perfect hardware detection and initialization.

It is grounded in a comprehensive study of Linux 7.1-rc3's boot flow (init/main.c, drivers/base/, drivers/pci/, drivers/acpi/) and maps Linux's proven patterns to Red Bear OS's microkernel architecture.

Honest Current State

What works today

  • UEFI boot on x86_64 (bootloader → kernel → initfs → init → login)
  • ACPI boot-baseline: RSDP/SDT/MADT/FADT/HPET parsing in kernel
  • PCI enumeration via pcid + driver matching via driver-manager
  • Wired networking (e1000d, rtl8168d, virtio-netd) in QEMU
  • PS/2 keyboard/mouse via kernel serio scheme
  • Framebuffer text console via vesad
  • Multi-core x2APIC/SMP works
  • Greeter/login QEMU proof passes on redbear-full

What is broken or missing (THESE ARE THE GAPS)

| Gap | Linux equivalent | RedBear status |
|-----|------------------|----------------|
| No unified hardware detection | start_kernel() → driver_init() → initcalls | Fragmented across pcid, acpid, hwd, driver-manager |
| No device model | struct device, struct device_driver, struct bus_type | No common device/driver/bus abstraction |
| No ACPI device enumeration | acpi_bus_scan() walks namespace, creates platform devices | acpid parses tables but doesn't enumerate devices |
| No deferred probe with real semantics | -EPROBE_DEFER + retry queue in driver_deferred_probe_trigger() | driver-manager has a 30-retry loop but no dependency graph |
| No device resource tracking | request_region(), request_irq(), ioremap() with resource tree | BARs mapped ad-hoc per driver, no global resource registry |
| No boot-stage ordering | initcall levels (core → postcore → arch → subsys → device → late) | Flat requires_weak everywhere; no semantic stages |
| PCI enumeration too late | PCI scanned at subsys_initcall level (level 4) | driver-manager is a userspace service with no hard dependency |
| No platform/I2C/SPI device discovery | ACPI _HID/_CID creates platform/i2c/spi devices | I2C/SPI daemons exist but no device enumeration from ACPI |
| No USB device enumeration | usb_new_device() → device descriptor → class matching | xHCI controller starts but no USB topology enumeration |
| No sysfs/udev equivalent | /sys/devices/ tree + udev rules | udev-shim exists but is minimal |
| Silent service failures | Kernel oops if critical subsystem fails | requires_weak + oneshot_async → failures are invisible |

Architecture: What Linux Does That We Must Reimplement

Linux Boot Flow (from init/main.c)

start_kernel()
  ├── setup_arch()          → arch-specific: page tables, early param parsing
  ├── trap_init()           → IDT/exception vectors
  ├── mm_init()             → memory management, slab allocator
  ├── sched_init()          → scheduler
  ├── early_irq_init()      → early IRQ descriptors
  ├── init_IRQ()            → architecture IRQ controllers (IOAPIC, LAPIC)
  ├── time_init()           → HPET/PIT/timers
  ├── console_init()        → early console
  └── rest_init()
      └── kernel_init()
          └── do_basic_setup()
              ├── driver_init()   → device model core (kobject, sysfs, bus, class)
              └── do_initcalls()
                  ├── level 0 (pure):     initialization with no dependencies
                  ├── level 1 (core):     kobject, debugfs, kernel core
                  ├── level 2 (postcore): driver core, workqueue
                  ├── level 3 (arch):     arch-specific devices
                  ├── level 4 (subsys):   PCI, ACPI, network stack
                  ├── level 5 (fs):       filesystems
                  ├── level 6 (device):   device drivers
                  └── level 7 (late):     late drivers, networking

Linux Device Model (from drivers/base/)

Three core abstractions:

  1. struct bus_type — PCI, ACPI, platform, USB, I2C, SPI
  2. struct device — represents hardware, has parent, bus, driver, resources
  3. struct device_driver — probe/remove/shutdown callbacks, ID table

Binding flow:

bus->probe(dev) → driver->probe(dev, id) → device bound to driver

Deferred probing (drivers/base/dd.c):

driver_probe_device() returns -EPROBE_DEFER
  → device added to deferred_probe_pending_list
  → driver_deferred_probe_trigger() retries on schedule
  → wake_up_all() after each successful bind

Linux ACPI Device Discovery (from drivers/acpi/scan.c)

acpi_init()
  └── acpi_bus_scan()
      └── acpi_walk_namespace()
          ├── Read _HID (hardware ID)
          ├── Read _CID (compatible IDs)
          ├── Read _STA (status: present, enabled, functional)
          ├── Read _CRS (current resource settings: IRQ, MMIO, I/O ports)
          └── Create device:
              ├── PCI root bridge → pci_scan_child_bus()
              ├── I2C controller → i2c_register_adapter()
              ├── SPI controller → spi_register_controller()
              ├── GPIO controller → gpiochip_add()
              ├── Platform device → platform_device_register()
              └── Thermal zone → thermal_zone_device_register()
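
An integer-valued _HID holds a 32-bit compressed EISA ID, so a userspace scanner has to turn it into text before it can be compared with strings such as "PNP0A03". The following is a minimal sketch of that decoding; `decode_eisa_id` is a hypothetical helper name and is not taken from acpid or redox-driver-acpi.

```rust
/// Decode a 32-bit compressed EISA ID (the integer form of _HID) into its
/// seven-character text form, for example 0x030AD041 -> "PNP0A03".
/// `decode_eisa_id` is an illustrative name, not an existing API.
pub fn decode_eisa_id(id: u32) -> String {
    // The value is stored byte-swapped relative to the bit layout used below.
    let be = id.swap_bytes();
    // Three 5-bit fields hold the vendor letters as offsets from '@' (0x40).
    let letter = |shift: u32| ((((be >> shift) & 0x1f) as u8) + 0x40) as char;
    // The low 16 bits are the product number, printed as four hex digits.
    format!(
        "{}{}{}{:04X}",
        letter(26),
        letter(21),
        letter(16),
        be & 0xffff
    )
}
```

String-valued _HID objects (for example "INT33C3") need no decoding and can be matched directly.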

Linux PCI Enumeration (from drivers/pci/probe.c)

pci_scan_child_bus(bus)
  for devfn in 0, 8, 16, ..., 248:    (one pci_scan_slot() call per slot; functions are walked inside)
    pci_scan_slot(bus, devfn)
      ├── Read PCI_VENDOR_ID → skip if 0xFFFFFFFF
      ├── Read PCI_HEADER_TYPE → multifunction?
      ├── Read PCI_CLASS, PCI_REVISION
      ├── Read BARs (6 base address registers)
      ├── Parse capability chain (MSI, MSI-X, PCIe, power management)
      ├── Assign IRQ (from ACPI _PRT or BIOS)
      ├── If PCI bridge: recursively scan subordinate bus
      └── Register device → driver core → bus_probe_device()
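
The slot walk above relies on two small encodings: a devfn byte packs the device number in its upper five bits and the function number in its lower three, and bit 7 of the header-type register marks a multifunction device. A minimal sketch of both follows; the function names are illustrative and do not come from pcid or driver-manager.

```rust
/// Split a PCI devfn byte into (device, function).
pub fn split_devfn(devfn: u8) -> (u8, u8) {
    (devfn >> 3, devfn & 0x07)
}

/// Bit 7 of the header-type register marks a multifunction device.
pub fn is_multifunction(header_type: u8) -> bool {
    (header_type & 0x80) != 0
}

/// Functions worth probing in one slot, given the header type read from
/// function 0 (`None` when function 0 did not respond).
pub fn functions_to_probe(function0_header: Option<u8>) -> std::ops::Range<u8> {
    match function0_header {
        None => 0..0,                              // empty slot
        Some(h) if is_multifunction(h) => 0..8,    // scan all eight functions
        Some(_) => 0..1,                           // single-function device
    }
}
```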

Design: RedBear OS Hardware Detection Architecture

Core Principle

RedBear OS is a microkernel. Unlike Linux, where drivers run in kernel space, RedBear OS runs all drivers as userspace daemons that access hardware through schemes.

This means our "device model" lives in userspace, not in the kernel. The kernel provides:

  • scheme:irq — interrupt delivery
  • scheme:memory — physical memory mapping
  • scheme:pci — PCI config space access
  • scheme:acpi — ACPI table access
  • scheme:serio — PS/2 controller

Everything else — device discovery, driver matching, resource allocation — is userspace.
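
Because discovery lives in userspace, a daemon can check which schemes it depends on before it starts probing. The sketch below assumes that a registered scheme shows up as an entry under a root directory such as /scheme, which is the same test the deferred-probe code later in this plan uses; `missing_schemes` is a hypothetical helper, not an existing driver-manager function.

```rust
use std::path::Path;

/// Return the names in `required` that have no entry under `scheme_root`.
/// Assumption: a registered scheme is visible as `<scheme_root>/<name>`.
pub fn missing_schemes(scheme_root: &Path, required: &[&str]) -> Vec<String> {
    let mut missing = Vec::new();
    for name in required {
        if !scheme_root.join(name).exists() {
            missing.push((*name).to_string());
        }
    }
    missing
}
```

A caller could pass `Path::new("/scheme")` and the scheme names listed above, then defer its work while the returned list is non-empty.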

Proposed Architecture

┌─────────────────────────────────────────────────────────┐
│                    Kernel (microkernel)                   │
│  schemes: irq, memory, pci, acpi, serio, event, time    │
│  ACPI early: RSDP, MADT (LAPIC/IOAPIC), HPET            │
│  x2APIC/SMP: AP startup, interrupt routing               │
└───────────────────────┬─────────────────────────────────┘
                        │ scheme IPC
┌───────────────────────▼─────────────────────────────────┐
│              redbear-hwdetect (NEW DAEMON)                │
│  unified hardware detection & device registry             │
│                                                           │
│  1. PCI bus walk (via scheme:pci)                         │
│     → enumerate all devices, parse BARs/caps/IRQ         │
│     → build device tree with parent-child relationships  │
│                                                           │
│  2. ACPI device scan (via scheme:acpi + acpid)            │
│     → walk ACPI namespace for _HID/_CID/_STA/_CRS        │
│     → create platform/I2C/SPI devices from ACPI          │
│     → resolve PCI IRQ routing via _PRT                   │
│                                                           │
│  3. USB topology (via xhcid scheme)                      │
│     → enumerate USB devices on each controller            │
│     → match by class/vendor/product                      │
│                                                           │
│  4. Driver matching                                       │
│     → match devices to /lib/drivers.d/*.toml             │
│     → spawn driver daemons with correct resources        │
│     → deferred retry with real dependency tracking       │
│                                                           │
│  5. Device registry (scheme:hwdetect)                     │
│     → /scheme/hwdetect/devices → list all detected HW    │
│     → /scheme/hwdetect/pci/{bdf} → per-device info      │
│     → /scheme/hwdetect/acpi/{path} → per-ACPI device    │
│     → /scheme/hwdetect/drivers → driver status           │
│     → JSON output for diagnostics                        │
│                                                           │
│  Registers scheme: hwdetect                               │
└─────────────────────────────────────────────────────────┘

Why Enhance driver-manager Instead of Creating a New Daemon

Decision (2026-05-15): We chose to enhance the existing driver-manager instead of creating redbear-hwdetect. The redox-driver-core crate already provides a solid device model (DeviceId, DeviceInfo, Bus trait, Driver trait, DeviceManager with deferred probing), and driver-manager already uses it for PCI enumeration. Adding ACPI bus support as a second Bus implementation follows the established pattern and avoids duplicating the device model, driver matching, and deferred probe logic.

Before this work, driver-manager did PCI matching but had these gaps:

  • No ACPI device enumeration
  • No USB topology
  • No device tree
  • No resource tracking
  • No parent-child relationships
  • Deferred retry is naive (fixed interval, no dependency graph)

The original plan avoided bolting more onto driver-manager: redbear-hwdetect would become the single source of truth for hardware state, and driver-manager would shrink to a thin consumer of its device registry. Because redox-driver-core already provides the device model abstractions, we instead enhance driver-manager by registering additional Bus implementations (ACPI, and eventually USB).

Implementation Plan

Wave 0: Boot Stage Definitions (config-only, zero code)

Goal: Replace the flat requires_weak service model with explicit boot stages.

Current problem: Every service uses requires_weak = ["00_base.target"], which gives no real ordering guarantee. Services can start in any order and fail silently.

Linux equivalent: initcall levels (core → postcore → arch → subsys → device → late)

Proposed boot stages:

Stage 0: PLATFORM   — kernel schemes ready (irq, memory, pci, acpi, serio)
Stage 1: CORE       — tmpdir, logging, random, null/zero
Stage 2: EARLY_HW   — acpid (ACPI tables), pcid (PCI bus access)
Stage 3: BUS_ENUM   — redbear-hwdetect (PCI walk, ACPI scan, USB topology)
Stage 4: DRIVERS    — driver spawning (storage, network, GPU, audio, USB class)
Stage 5: LATE_HW    — IOMMU, firmware loading, NUMA topology
Stage 6: SERVICES   — D-Bus, session broker, seat management
Stage 7: USERLAND   — console, greeter, desktop

Implementation: Add target files:

# /etc/init.d/00_platform.target
[unit]
description = "Platform stage: kernel schemes ready"

# /etc/init.d/01_core.target
[unit]
description = "Core stage: basic services"
requires = ["00_platform.target"]

# /etc/init.d/02_early_hw.target
[unit]
description = "Early hardware: ACPI + PCI bus access"
requires = ["01_core.target"]

# /etc/init.d/03_bus_enum.target
[unit]
description = "Bus enumeration: PCI walk + ACPI scan"
requires = ["02_early_hw.target"]

# /etc/init.d/04_drivers.target
[unit]
description = "Driver spawning stage"
requires = ["03_bus_enum.target"]

# /etc/init.d/05_late_hw.target
[unit]
description = "Late hardware: firmware, IOMMU, NUMA"
requires = ["04_drivers.target"]

# /etc/init.d/06_services.target
[unit]
description = "System services: D-Bus, session broker"
requires = ["05_late_hw.target"]

# /etc/init.d/07_userland.target
[unit]
description = "User-facing: console, greeter, desktop"
requires = ["06_services.target"]

Key change: Use requires (hard dependency, blocks if not met) instead of requires_weak for stages. Services within a stage use requires_weak against their stage target.
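
With hard requires edges, the stage targets form a dependency graph whose start order can be computed and checked before boot. The sketch below is a depth-first topological sort that reports unknown units and cycles; it illustrates the ordering idea only and is not the init implementation. `start_order` and `visit` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Compute a start order in which every unit comes after all units it requires.
/// Input is a list of (unit name, hard requirements).
pub fn start_order(units: &[(&str, Vec<&str>)]) -> Result<Vec<String>, String> {
    let deps: BTreeMap<&str, &[&str]> = units
        .iter()
        .map(|(name, requires)| (*name, requires.as_slice()))
        .collect();
    let mut order = Vec::new();
    let mut done = BTreeSet::new();
    let mut visiting = BTreeSet::new();
    // BTreeMap iteration is sorted, so the result is the same on every run.
    for name in deps.keys().copied() {
        visit(name, &deps, &mut visiting, &mut done, &mut order)?;
    }
    Ok(order)
}

fn visit<'a>(
    name: &'a str,
    deps: &BTreeMap<&'a str, &'a [&'a str]>,
    visiting: &mut BTreeSet<&'a str>,
    done: &mut BTreeSet<&'a str>,
    order: &mut Vec<String>,
) -> Result<(), String> {
    if done.contains(name) {
        return Ok(()); // shared dependency that was already placed
    }
    if !visiting.insert(name) {
        return Err(format!("dependency cycle at {name}"));
    }
    let requires = deps
        .get(name)
        .ok_or_else(|| format!("unknown unit {name}"))?;
    for dep in requires.iter().copied() {
        visit(dep, deps, visiting, done, order)?;
    }
    visiting.remove(name);
    done.insert(name);
    order.push(name.to_string());
    Ok(())
}
```

Tracking `done` separately from `visiting` matters: a unit reached twice through shared dependencies is skipped quietly, and only a unit that is still on the current path counts as a cycle.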

Wave 1: redbear-hwdetect — The Unified Hardware Detection Daemon

Goal: Create a single daemon that discovers ALL hardware, builds a device tree, and manages driver lifecycle.

Source location: local/recipes/system/redbear-hwdetect/source/

Cargo.toml:

[package]
name = "redbear-hwdetect"
version = "0.1.0"
edition = "2024"

[dependencies]
redox-daemon = "0.1"
redox-scheme = "0.11"
libredox = "0.1"
redox_syscall = "0.7"
serde = { version = "1", features = ["derive"] }
bitflags = { version = "2", features = ["serde"] }   # used by ResourceFlags
serde_json = "1"
toml = "0.8"
log = "0.4"

[features]
default = []

Module structure:

redbear-hwdetect/source/src/
├── main.rs              — daemon entry, scheme registration, event loop
├── device.rs            — Device trait, DeviceInfo, DeviceType, DeviceStatus
├── registry.rs          — DeviceRegistry: HashMap<DeviceId, DeviceInfo>
├── pci/
│   ├── mod.rs           — PciEnumerator: bus walk via scheme:pci
│   ├── config.rs        — PCI config space reader
│   ├── capability.rs    — PCI capability chain parser (MSI, MSI-X, PCIe, PM)
│   └── resource.rs      — BAR parsing, IRQ assignment, resource allocation
├── acpi/
│   ├── mod.rs           — AcpiScanner: device enumeration from ACPI tables
│   ├── namespace.rs     — ACPI namespace walker (via acpid)
│   ├── resource.rs      — _CRS parser (IRQ, MMIO, I/O port resources)
│   └── pci_routing.rs   — _PRT (PCI IRQ routing table) resolver
├── usb/
│   ├── mod.rs           — UsbScanner: USB topology via xHCI schemes
│   └── descriptor.rs    — USB device/class descriptor parsing
├── driver/
│   ├── mod.rs           — DriverMatcher: load /lib/drivers.d/*.toml
│   ├── match.rs         — Device-driver matching (class, vendor, subclass)
│   └── spawn.rs         — Driver process spawning with resource handoff
├── deferred.rs          — Deferred probe queue with dependency graph
└── scheme.rs            — scheme:hwdetect handler

Key data structures:

use bitflags::bitflags;
use serde::{Deserialize, Serialize};

/// Unique device identifier
#[derive(Debug, Clone, Hash, Eq, PartialEq, Serialize, Deserialize)]
pub enum DeviceId {
    Pci { domain: u16, bus: u8, device: u8, function: u8 },
    Acpi { path: String },       // ACPI namespace path (e.g., "\_SB.PCI0.I2C0")
    Usb { controller: u8, port: u8, address: u8 },
    Platform { name: String, id: u32 },
}

/// Device information
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DeviceInfo {
    pub id: DeviceId,
    pub device_type: DeviceType,
    pub status: DeviceStatus,
    pub vendor_id: Option<u16>,
    pub device_id: Option<u16>,
    pub class_code: Option<u8>,
    pub subclass_code: Option<u8>,
    pub prog_if: Option<u8>,
    pub revision: Option<u8>,
    pub parent: Option<DeviceId>,
    pub resources: Vec<Resource>,
    pub driver: Option<DriverInfo>,
    pub quirks: Vec<String>,
    pub description: String,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum DeviceType {
    PciDevice,
    PciBridge,
    AcpiDevice,
    UsbController,
    UsbDevice,
    PlatformDevice,
    I2cController,
    I2cDevice,
    SpiController,
    SpiDevice,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum DeviceStatus {
    Detected,        // Found during scan, not yet probed
    Probing,         // Driver probe in progress
    Bound,           // Driver successfully bound
    Deferred,        // Probe deferred (dependency not ready)
    Failed(String),  // Probe failed permanently
    NoDriver,        // No matching driver found
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Resource {
    pub resource_type: ResourceType,
    pub base: u64,
    pub size: u64,
    pub flags: ResourceFlags,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum ResourceType {
    Mmio,       // Memory-mapped I/O
    IoPort,     // I/O port range
    Irq,        // Interrupt (GSI number)
    Dma,        // DMA channel/range
    Firmware,   // Required firmware blob
}

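bitflags! {
    // Resource derives Debug and Clone, so the flags type must provide them too.
    // Serialize/Deserialize here assume bitflags 2.x with its `serde` feature.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
    pub struct ResourceFlags: u32 {
        const PREFETCHABLE = 0x01;
        const CACHEABLE    = 0x02;
        const SHARED       = 0x04;
        const MSI          = 0x08;
        const MSI_X        = 0x10;
    }
}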

/// Driver match rule (from /lib/drivers.d/*.toml)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DriverMatch {
    pub vendor: Option<u16>,
    pub device: Option<u16>,
    pub class: Option<u8>,
    pub subclass: Option<u8>,
    pub prog_if: Option<u8>,
}

/// Driver configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DriverConfig {
    pub name: String,
    pub description: String,
    pub priority: u32,
    pub command: Vec<String>,
    pub depends_on: Vec<String>,   // scheme names that must exist before spawn
    pub matches: Vec<DriverMatch>,
}
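
The match rule above can be read as a set of optional constraints. The sketch below assumes that an unset field acts as a wildcard and that a rule with more constrained fields should win over a broader one; both are assumptions about the intended semantics, and `MatchRule`, `PciIds`, `matches`, and `specificity` are illustrative names, not the driver-manager types.

```rust
/// Standalone copy of the match-rule shape, for illustration only.
#[derive(Debug, Clone, Copy, Default)]
pub struct MatchRule {
    pub vendor: Option<u16>,
    pub device: Option<u16>,
    pub class: Option<u8>,
    pub subclass: Option<u8>,
    pub prog_if: Option<u8>,
}

/// Identification fields read from a device's config space.
#[derive(Debug, Clone, Copy)]
pub struct PciIds {
    pub vendor: u16,
    pub device: u16,
    pub class: u8,
    pub subclass: u8,
    pub prog_if: u8,
}

impl MatchRule {
    /// A rule matches when every field it sets equals the device's value.
    pub fn matches(&self, ids: &PciIds) -> bool {
        fn field_ok<T: PartialEq>(rule: Option<T>, actual: T) -> bool {
            rule.map_or(true, |wanted| wanted == actual)
        }
        field_ok(self.vendor, ids.vendor)
            && field_ok(self.device, ids.device)
            && field_ok(self.class, ids.class)
            && field_ok(self.subclass, ids.subclass)
            && field_ok(self.prog_if, ids.prog_if)
    }

    /// Number of constrained fields; a higher value means a narrower rule.
    pub fn specificity(&self) -> usize {
        [
            self.vendor.is_some(),
            self.device.is_some(),
            self.class.is_some(),
            self.subclass.is_some(),
            self.prog_if.is_some(),
        ]
        .iter()
        .filter(|set| **set)
        .count()
    }
}
```

One possible tie-break is to sort candidate drivers by priority first and by `specificity()` second, so that an exact device-ID rule beats a class-wide rule of equal priority.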

PCI enumeration flow (from Linux drivers/pci/probe.c):

impl PciEnumerator {
    /// Walk all PCI buses (mirrors Linux pci_scan_child_bus)
    pub fn scan_all_buses(&mut self) -> Result<Vec<DeviceInfo>> {
        let mut devices = Vec::new();

        // Read from scheme:pci — get all PCI devices
        for entry in self.read_pci_scheme()? {
            let domain = entry.domain;
            let bus = entry.bus;
            let dev = entry.device;
            let func = entry.function;

            // Read config space (mirrors Linux pci_scan_slot)
            let config = self.read_config(domain, bus, dev, func)?;

            // Skip invalid devices (vendor 0xFFFF)
            if config.vendor_id == 0xFFFF {
                continue;
            }

            // Parse device info (mirrors Linux pci_setup_device)
            let mut device = DeviceInfo {
                id: DeviceId::Pci { domain, bus, device: dev, function: func },
                device_type: if config.is_bridge() {
                    DeviceType::PciBridge
                } else {
                    DeviceType::PciDevice
                },
                status: DeviceStatus::Detected,
                vendor_id: Some(config.vendor_id),
                device_id: Some(config.device_id),
                class_code: Some(config.class_code),
                subclass_code: Some(config.subclass_code),
                prog_if: Some(config.prog_if),
                revision: Some(config.revision),
                parent: self.find_parent_bridge(domain, bus),
                resources: Vec::new(),
                driver: None,
                quirks: Vec::new(),
                description: format!("PCI {:04x}:{:02x}:{:02x}.{} [{:04x}:{:04x}]",
                    domain, bus, dev, func, config.vendor_id, config.device_id),
            };

            // Parse BARs (mirrors Linux pci_read_bases)
            for bar_idx in 0..6 {
                if let Some(resource) = self.parse_bar(domain, bus, dev, func, bar_idx)? {
                    device.resources.push(resource);
                }
            }

            // Parse capability chain (mirrors Linux pci_init_capabilities)
            self.parse_capabilities(&mut device, domain, bus, dev, func)?;

            // Assign IRQ (from ACPI _PRT or IOAPIC routing)
            if let Some(irq) = self.assign_irq(&device)? {
                device.resources.push(Resource {
                    resource_type: ResourceType::Irq,
                    base: irq as u64,
                    size: 1,
                    flags: ResourceFlags::empty(),
                });
            }

            // Apply quirks
            self.apply_quirks(&mut device)?;

            devices.push(device);
        }

        Ok(devices)
    }
}
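
The parse_bar step above is left abstract. As a minimal sketch of what it has to decode for a 32-bit BAR register: bit 0 selects I/O versus memory space, bits 2:1 give the memory type, bit 3 marks prefetchable memory, and the size comes from the value read back after writing all ones. `BarKind`, `decode_bar_kind`, and `memory_bar_size` are hypothetical names, and I/O BAR sizing and the upper half of 64-bit BARs are deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BarKind {
    Io,
    Memory32 { prefetchable: bool },
    Memory64 { prefetchable: bool },
    Reserved,
}

/// Classify a raw BAR register value.
pub fn decode_bar_kind(raw: u32) -> BarKind {
    if (raw & 0x1) != 0 {
        return BarKind::Io; // bit 0 set: I/O space
    }
    let prefetchable = (raw & 0x8) != 0;
    match (raw >> 1) & 0x3 {
        0b00 => BarKind::Memory32 { prefetchable },
        0b10 => BarKind::Memory64 { prefetchable }, // next BAR holds the upper 32 bits
        _ => BarKind::Reserved,
    }
}

/// Size in bytes of a 32-bit memory BAR, given the value read back after
/// writing 0xFFFF_FFFF to it. Returns 0 for an unimplemented BAR.
pub fn memory_bar_size(readback: u32) -> u32 {
    let mask = readback & !0xf; // drop the type and prefetch bits
    if mask == 0 {
        0
    } else {
        (!mask).wrapping_add(1)
    }
}
```

A real implementation also has to save the original BAR value before the sizing write and restore it afterwards.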

Deferred probe with real dependency graph (from Linux drivers/base/dd.c):

pub struct DeferredQueue {
    /// Devices waiting for dependencies
    pending: HashMap<DeviceId, Vec<String>>,  // device → missing dependencies
    /// Maximum retries per device
    max_retries: u32,
    /// Retry interval in ms
    retry_interval: u64,
}

impl DeferredQueue {
    /// Add a deferred device (mirrors Linux driver_deferred_probe_add)
    pub fn add(&mut self, device_id: DeviceId, missing_deps: Vec<String>) {
        self.pending.insert(device_id, missing_deps);
    }

    /// Retry all deferred devices (mirrors Linux driver_deferred_probe_trigger)
    pub fn retry_cycle(&mut self, registry: &mut DeviceRegistry) -> Vec<DeviceInfo> {
        let mut resolved = Vec::new();

        // Check each deferred device
        let pending_ids: Vec<DeviceId> = self.pending.keys().cloned().collect();
        for id in &pending_ids {
            if let Some(missing) = self.pending.get(id) {
                // Check if all dependencies are now available
                let all_ready = missing.iter().all(|dep| {
                    // Check if the scheme/file exists
                    std::path::Path::new(&format!("/scheme/{}", dep)).exists()
                        || std::path::Path::new(&format!("/bin/{}", dep)).exists()
                });

                if all_ready {
                    let deps = self.pending.remove(id).unwrap();
                    log::info!("Deferred device {:?} resolved (deps: {:?})", id, deps);
                    if let Some(device) = registry.get_mut(id) {
                        device.status = DeviceStatus::Detected; // Reset to retry
                        resolved.push(device.clone());
                    }
                }
            }
        }

        resolved
    }
}
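
The queue above stores max_retries but retry_cycle never consults it, so a device whose dependency never appears would be retried forever. One way to bound that, sketched under the assumption that each deferral is counted per device and that a device is dropped once it exceeds the budget; `RetryBudget` and `RetryDecision` are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum RetryDecision {
    RetryLater,
    GiveUp,
}

/// Counts deferrals per device and stops retrying after `max_retries`.
pub struct RetryBudget {
    max_retries: u32,
    attempts: HashMap<String, u32>,
}

impl RetryBudget {
    pub fn new(max_retries: u32) -> Self {
        RetryBudget { max_retries, attempts: HashMap::new() }
    }

    /// Record one more deferral for `device` and decide whether to retry.
    pub fn record_deferral(&mut self, device: &str) -> RetryDecision {
        let count = self.attempts.entry(device.to_string()).or_insert(0);
        *count += 1;
        if *count > self.max_retries {
            RetryDecision::GiveUp
        } else {
            RetryDecision::RetryLater
        }
    }

    /// True once a device has used up its budget; such a device should be
    /// skipped by later probe passes instead of being re-queued.
    pub fn is_exhausted(&self, device: &str) -> bool {
        self.attempts.get(device).map_or(false, |n| *n > self.max_retries)
    }
}
```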

Wave 2: ACPI Device Enumeration

Goal: Walk the ACPI namespace to discover non-PCI devices (I2C, SPI, GPIO, thermal, battery, AC adapter, platform devices).

Linux reference: drivers/acpi/scan.c::acpi_bus_scan()

Implementation in redbear-hwdetect:

use std::fs::File;

impl AcpiScanner {
    /// Enumerate ACPI devices (mirrors Linux acpi_bus_scan)
    pub fn scan(&mut self) -> Result<Vec<DeviceInfo>> {
        let mut devices = Vec::new();

        // Connect to acpid via scheme:acpi
        let acpi = File::open("/scheme/acpi")?;

        // Walk ACPI namespace (read device entries)
        // Linux does: acpi_walk_namespace(ACPI_TYPE_DEVICE, ...)
        // RedBear: read entries from acpid's device enumeration
        for entry in self.enumerate_acpi_devices(&acpi)? {
            let hid = self.read_hid(&entry)?;
            let cid = self.read_cid(&entry)?;
            let sta = self.read_sta(&entry)?;

            // Skip if not present (mirrors Linux acpi_bus_check_add)
            if !sta.present {
                continue;
            }

            // Parse _CRS resources (mirrors Linux acpi_walk_resources)
            let resources = self.parse_crs(&entry)?;

            // Determine device type from _HID/_CID
            let device_type = match hid.as_str() {
                "PNP0A03" | "PNP0A08" => DeviceType::PciBridge,   // PCI root bridge
                "INT33C3" | "INT3433" | "AMDI0010" => DeviceType::I2cController,
                "INT33C0" | "INT3430" | "AMDI0061" => DeviceType::SpiController,
                _ => DeviceType::PlatformDevice,
            };

            let device = DeviceInfo {
                id: DeviceId::Acpi { path: entry.path.clone() },
                device_type,
                status: DeviceStatus::Detected,
                vendor_id: None,
                device_id: None,
                class_code: None,
                subclass_code: None,
                prog_if: None,
                revision: None,
                parent: Some(DeviceId::Acpi { path: entry.parent.clone() }),
                resources,
                driver: None,
                quirks: Vec::new(),
                description: format!("ACPI device {} ({})", entry.path, hid),
            };

            devices.push(device);
        }

        Ok(devices)
    }
}
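
The read_sta call above is expected to yield a structure with a present flag. A minimal sketch of decoding the _STA return value follows: bit 0 is present, bit 1 enabled, bit 2 shown in UI, bit 3 functioning, and a device without a _STA object is treated as if all four bits were set. The `Sta` type and its names are illustrative, not the redox-driver-acpi API.

```rust
/// Decoded _STA status bits for an ACPI device.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Sta {
    pub present: bool,
    pub enabled: bool,
    pub shown_in_ui: bool,
    pub functioning: bool,
}

impl Sta {
    /// Value to assume when a device has no _STA object.
    pub const ASSUMED_IF_ABSENT: u32 = 0x0F;

    pub fn from_bits(bits: u32) -> Self {
        Sta {
            present: (bits & 0x1) != 0,
            enabled: (bits & 0x2) != 0,
            shown_in_ui: (bits & 0x4) != 0,
            functioning: (bits & 0x8) != 0,
        }
    }
}
```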

Wave 3: Service Ordering Fix

Goal: Replace the current flat requires_weak model with stage-based ordering.

Changes to config files:

  1. Add stage targets to config/redbear-device-services.toml (shared fragment)
  2. Rewire services to depend on their stage target instead of 00_base.target

New service wiring example:

# acpid: early hardware stage
[[files]]
path = "/etc/init.d/02_acpid.service"
data = """
[unit]
description = "ACPI daemon"
requires = ["02_early_hw.target"]

[service]
cmd = "acpid"
type = { scheme = "acpi" }
"""

# redbear-hwdetect: bus enumeration stage
[[files]]
path = "/etc/init.d/03_redbear-hwdetect.service"
data = """
[unit]
description = "Hardware detection and device registry"
requires = ["03_bus_enum.target", "02_acpid.service"]

[service]
cmd = "redbear-hwdetect"
type = { scheme = "hwdetect" }
"""

# driver-manager: driver spawning stage (now consumes hwdetect registry)
[[files]]
path = "/etc/init.d/04_driver-manager.service"
data = """
[unit]
description = "Driver manager (consumes hwdetect registry)"
requires = ["04_drivers.target", "03_redbear-hwdetect.service"]

[service]
cmd = "driver-manager"
type = "oneshot_async"
"""

Wave 4: Driver Config Unification

Goal: Consolidate /etc/pcid.d/ and /lib/drivers.d/ into a single config format.

Current problem: Two config systems exist:

  • /etc/pcid.d/*.toml — legacy pcid format
  • /lib/drivers.d/*.toml — driver-manager format

Solution: Use only /lib/drivers.d/*.toml (driver-manager format). Remove all /etc/pcid.d/ config file generation from TOML configs.

Updated driver config format (enhanced from current):

[[driver]]
name = "e1000d"
description = "Intel Gigabit Ethernet"
priority = 50
command = ["/usr/lib/drivers/e1000d"]
depends_on = ["pci"]           # scheme dependencies (NEW)
capabilities = ["net"]         # declares what it provides (NEW)

[[driver.match]]
vendor = 0x8086
class = 0x02
subclass = 0x00

# Optional: specific device IDs for better matching
[[driver.match]]
vendor = 0x8086
device = 0x100e               # 82540EM
class = 0x02

Wave 5: Boot Diagnostics

Goal: Make boot failures visible and diagnosable.

Implementation:

  1. redbear-hwdetect --status — print detected hardware and driver status
  2. Boot marker on serial: echo "STAGE_03_BUS_ENUM_COMPLETE" at each stage
  3. Device failure logging — every deferred/failed probe logged with reason
  4. JSON diagnostic output: redbear-hwdetect --json for automated testing
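
Stage markers are only useful to automated tests if they can be produced and recognised consistently. The sketch below assumes the marker shape from the example in the list above, STAGE_<two-digit index>_<NAME>_COMPLETE; `stage_marker` and `parse_stage_marker` are hypothetical helpers.

```rust
/// Build a serial boot marker such as "STAGE_03_BUS_ENUM_COMPLETE".
pub fn stage_marker(index: u8, name: &str) -> String {
    format!("STAGE_{:02}_{}_COMPLETE", index, name.to_ascii_uppercase())
}

/// Parse a serial log line back into (stage index, stage name).
/// Returns None for lines that are not stage markers.
pub fn parse_stage_marker(line: &str) -> Option<(u8, String)> {
    let body = line
        .trim()
        .strip_prefix("STAGE_")?
        .strip_suffix("_COMPLETE")?;
    // The index is everything before the first underscore.
    let (index, name) = body.split_once('_')?;
    Some((index.parse().ok()?, name.to_string()))
}
```

A QEMU test harness could scan the serial log with `parse_stage_marker` and report the highest stage reached when a boot stalls.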

Wave 6: USB Topology Enumeration

Goal: Discover USB devices beyond just the xHCI controller.

Linux reference: drivers/usb/core/hub.c::hub_events()

This is a later wave because it depends on xHCI IRQ stability (per the blocker chain).

Implementation approach:

  • Query each xHCI controller for its device list
  • Parse USB device descriptors
  • Match USB class drivers (HID, mass storage, audio, CDC ACM)
  • Register in device registry

Execution Order

| Wave | Duration | Deliverable | Depends on |
|------|----------|-------------|------------|
| Wave 0 | 1 day | Boot stage targets in config | Nothing |
| Wave 1 | 2-3 weeks | redbear-hwdetect daemon with PCI enumeration | Wave 0 |
| Wave 2 | 1-2 weeks | ACPI device enumeration in hwdetect | Wave 1 |
| Wave 3 | 1 week | Service rewiring to stage targets | Wave 0 |
| Wave 4 | 3-5 days | Driver config unification | Wave 1 |
| Wave 5 | 3-5 days | Boot diagnostics | Wave 1 |
| Wave 6 | 2-3 weeks | USB topology enumeration | Wave 1, xHCI IRQ stability |

Total estimate: 6-10 weeks for waves 0-5 (core boot and hardware detection). Wave 6 (USB) follows the blocker chain after low-level controller quality.

Acceptance Criteria

Boot process is "stellar" when:

  1. Boot completes from power-on to login in < 10 seconds on QEMU
  2. Every PCI device is enumerated and logged with full info (vendor, device, class, BARs, IRQ)
  3. Every ACPI device with a present status is discovered
  4. Every device that has a matching driver is bound within 3 seconds of enumeration
  5. Deferred probes resolve within 5 seconds of dependency availability
  6. Boot failures are visible on serial console with stage markers
  7. redbear-hwdetect --status shows complete hardware state
  8. No requires_weak remains for critical boot-path services
  9. Service ordering is deterministic: same order on every boot
  10. Missing hardware does not cause panics or hangs

Hardware detection is "perfect" when:

  1. PCI: all devices on all buses enumerated, including behind bridges
  2. PCI: BARs parsed correctly (type, size, prefetchable)
  3. PCI: capabilities parsed (MSI, MSI-X, PCIe, power management, vendor-specific)
  4. PCI: IRQ assigned from ACPI _PRT or IOAPIC routing
  5. ACPI: all devices with _STA present enumerated
  6. ACPI: _CRS resources parsed (IRQ, MMIO, I/O ports, DMA)
  7. USB: all devices on all controllers discovered (Wave 6)
  8. Platform: I2C/SPI/GPIO controllers discovered from ACPI (Wave 2)
  9. Quirks: hardware-specific quirks applied automatically
  10. Hotplug: new devices detected and drivers spawned in < 2 seconds

Relationship to Other Plans

| Plan | Relationship |
|------|--------------|
| ACPI-IMPROVEMENT-PLAN.md | ACPI robustness is prerequisite for Wave 2 |
| IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md | IRQ quality is prerequisite for hardware detection reliability |
| USB-IMPLEMENTATION-PLAN.md | USB topology (Wave 6) depends on USB maturity |
| CONSOLE-TO-KDE-DESKTOP-PLAN.md | Desktop path benefits from better boot/hardware detection |
| QUIRKS-SYSTEM.md | Quirks integrated into hwdetect's device discovery |

Linux 7.1 Reference Files

Key files to consult when implementing:

| RedBear component | Linux 7.1 reference |
|-------------------|---------------------|
| PCI enumeration | drivers/pci/probe.c, drivers/pci/setup-bus.c |
| PCI driver matching | drivers/pci/pci-driver.c |
| ACPI device scan | drivers/acpi/scan.c, drivers/acpi/bus.c |
| ACPI resource parsing | drivers/acpi/resource.c |
| PCI IRQ routing | drivers/acpi/pci_irq.c, drivers/acpi/pci_link.c |
| Device model core | drivers/base/core.c, drivers/base/bus.c, drivers/base/dd.c |
| Deferred probing | drivers/base/dd.c |
| Boot initcalls | init/main.c, include/linux/init.h |
| IRQ management | kernel/irq/manage.c, kernel/irq/chip.c |
| Resource management | kernel/resource.c |
| DMA mapping | kernel/dma/mapping.c |