# Red Bear OS — Driver & Hardware Improvement Plan

**Date**: 2026-05-04

**Status**: In Progress — Phase 0 ✅, Phase 1 ✅, Phase 2 ✅, Phase 3 ✅, Phase 4 partial, Phase 5 ✅, Addendum A + B added (kernel + daemon audit with precise Linux 7.0 line counts)

**Authority**: This plan defines improvements for subsystems NOT covered by existing plans. For ACPI, USB, IRQ/PCI, GPU/DRM, Bluetooth, and Wi-Fi, defer to their respective plans. This plan fills the storage, network, and audio gaps and adds cross-cutting concerns.

**Source of truth**: Linux kernel 7.0 (`local/reference/linux-7.0/`). When in doubt, Linux behavior is authoritative. Every task includes the specific Linux source file and function to reference.

---

## Relationship to Existing Plans

This plan is **subordinate** to the following plans for their respective subsystems. Tasks here do not duplicate, override, or conflict with them:

| Plan Document | Subsystem | Status |
|---------------|-----------|--------|
| `ACPI-IMPROVEMENT-PLAN.md` | ACPI sleep, thermal, EC, power states | Active |
| `IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md` | PCI IRQ, MSI-X, IOMMU, controllers | Active |
| `USB-IMPLEMENTATION-PLAN.md` | xHCI, EHCI, device lifecycle | Active |
| `DRM-MODERNIZATION-EXECUTION-PLAN.md` | GPU/DRM display, KMS, Mesa | Active |
| `BLUETOOTH-IMPLEMENTATION-PLAN.md` | BT host/controller | Active |
| `WIFI-IMPLEMENTATION-PLAN.md` | Wi-Fi control plane | Active |
| `CONSOLE-TO-KDE-DESKTOP-PLAN.md` | Desktop/KDE path | Active |

**New coverage by this plan**: Storage drivers (AHCI, NVMe), Network drivers (e1000, r8168), Audio drivers (HDA, AC97), Input completeness (PS/2, HID), and cross-cutting driver quality (error handling, logging, lifecycle).


---

## Validation States

All tasks use these validation levels, consistent with existing plans:

- **builds** — compiles without error against the target toolchain
- **enumerates** — discovers hardware and reports it through scheme interfaces
- **usable** — works in a bounded real scenario (QEMU or bare metal)
- **validated** — passes explicit acceptance tests with captured evidence
- **hardware-validated** — proven on real bare metal, not just QEMU

---

## Phase 0: Cross-Cutting Driver Quality (Weeks 1-2)

These improvements apply to ALL drivers and must be done first to establish the quality baseline for subsequent phases.

### T0.1: Driver Error Handling Audit

**Problem**: Many drivers use `unwrap()`/`expect()` on hardware operations (I/O port reads, MMIO, PCI config space). Hardware failures produce panics instead of graceful degradation.

**Task**: Audit all drivers in `recipes/core/base/source/drivers/` and `local/recipes/drivers/` for:

1. `unwrap()`/`expect()` on hardware I/O — replace with proper `Result` propagation
2. Missing error logging for hardware failures — add `log::error!()` before error returns
3. Infinite retry loops without backoff — add bounded retry with exponential backoff

**Linux reference**: `drivers/ata/libata-eh.c` — `ata_eh_link_autopsy()` for error classification pattern. Linux distinguishes transient errors (retry), permanent errors (fail), and protocol errors (reset).

**File paths**:

- `recipes/core/base/source/drivers/storage/ahcid/src/main.rs`
- `recipes/core/base/source/drivers/net/e1000d/src/device.rs`
- `recipes/core/base/source/drivers/net/rtl8168d/src/device.rs`
- `recipes/core/base/source/drivers/audio/ihdad/src/main.rs`
- `recipes/core/base/source/drivers/audio/ac97d/src/device.rs`
- `local/recipes/drivers/ehcid/source/src/`, `ohcid/`, `uhcid/`

**Acceptance**: `grep -r 'unwrap()' recipes/core/base/source/drivers/` returns zero matches for hardware I/O paths.
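
A minimal sketch of the bounded-retry pattern from task item 3, combined with the transient/permanent/protocol split that the Linux reference describes. The names `HwError` and `retry_with_backoff` are hypothetical and not taken from any Red Bear driver. `eprintln!` stands in for `log::error!()` so the block compiles without external crates, and the sleep function is injected so the logic can be exercised without real delays.

```rust
use std::time::Duration;

/// Error classes mirroring the Linux split: retry, fail, or reset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HwError {
    Transient, // worth retrying after a delay
    Permanent, // fail immediately
    Protocol,  // caller should reset the device
}

/// Run `op` up to `max_attempts` times, doubling the delay after each
/// transient failure. Non-transient errors, and the final transient one,
/// are logged and returned. `sleep` is injected so callers can record
/// delays instead of blocking.
pub fn retry_with_backoff<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, HwError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, HwError> {
    let mut delay = base_delay;
    for attempt in 1..=max_attempts {
        match op() {
            Ok(value) => return Ok(value),
            Err(HwError::Transient) if attempt < max_attempts => {
                sleep(delay);
                delay = delay.saturating_mul(2);
            }
            Err(err) => {
                // Stand-in for log::error!(): log before the error return.
                eprintln!("hardware op failed on attempt {}: {:?}", attempt, err);
                return Err(err);
            }
        }
    }
    // Only reachable when max_attempts == 0.
    Err(HwError::Transient)
}
```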
Each `unwrap()` removal includes a `log::error!()` before the error return.

### T0.2: Driver Logging Standardization

**Problem**: Drivers use inconsistent logging — some use `println!`, some `eprintln!`, some `log::info!`, some no logging at all. This makes debugging hardware issues on bare metal nearly impossible.

**Task**: Standardize all drivers to use the `log` crate with logd integration:

1. Replace `println!`/`eprintln!` with `log::info!`/`log::warn!`/`log::error!`
2. Log every hardware initialization step (PCI probe, BAR mapping, IRQ registration)
3. Log every error with the hardware register values that caused it
4. Add `log::debug!` for register read/write traces (behind a feature flag or compile-time config)

**Linux reference**: `drivers/net/ethernet/intel/e1000e/netdev.c` — `e_err()` macro with per-driver message prefix. Linux uses `netdev_err()`, `netdev_warn()`, `netdev_info()` with device context.

**Acceptance**: Every driver produces at minimum: one `info!` on start, one `info!` on successful init, one `error!` per failure path with register dump. Verified by booting in QEMU and checking serial output.

### T0.3: Driver Lifecycle Documentation

**Problem**: No documentation exists for driver initialization sequences, required resources, or expected behavior. New contributors cannot understand or debug drivers.

**Task**: For each driver category (storage, network, audio), create a brief `DRIVERS.md` in the driver directory documenting:

1. Hardware initialization sequence (PCI probe → BAR mapping → device reset → capability enumeration → ready)
2. Required kernel schemes (scheme:memory, scheme:irq, scheme:pci)
3. Known hardware quirks
4. Linux source file(s) to cross-reference

**Acceptance**: `DRIVERS.md` exists in `recipes/core/base/source/drivers/storage/`, `drivers/net/`, `drivers/audio/` with the above sections.

---

## Phase 1: Storage Drivers (Weeks 2-6)

### T1.1: AHCI NCQ Support

**Problem**: ahcid is 109 lines, only basic PIO/DMA read/write. No NCQ.
SSD throughput is 3-5x slower than possible.

**Linux reference**: `drivers/ata/libata-sata.c:35` — `sata_fsl_host_intr()` with NCQ error handling. `drivers/ata/ahci.c:1423` — `ahci_qc_prep()` for FIS/command table setup.

**Implementation**:

1. Add command queue structure to `ahcid/src/ahci/` — track up to 32 pending commands per port
2. Implement `ahci_qc_issue()` modeled on Linux `ata_qc_issue()`:
   - Allocate command slot from device command table
   - Fill command FIS (Frame Information Structure) with READ/WRITE FPDMA command
   - Set PRDT (Physical Region Descriptor Table) for DMA scatter-gather
   - Issue command via PxCI (Port Command Issue) register write
3. Implement `ahci_port_intr()` modeled on Linux `ahci_port_intr()`:
   - Read PxIS (Port Interrupt Status)
   - Handle D2H Register FIS (command completion)
   - Handle SDB FIS (NCQ completion with per-tag status)
   - Handle PIO Setup FIS (for ATAPI)
   - Handle Device-to-Host FIS errors
4. Add per-tag completion tracking using `PxSACT` (SActive) register

**Files to modify/create**:

- `recipes/core/base/source/drivers/storage/ahcid/src/main.rs` — NCQ enable in `ahci_init()`
- `recipes/core/base/source/drivers/storage/ahcid/src/ahci/` — new `ncq.rs`, `fis.rs`

**Acceptance**:

- `fio` random read test on SSD shows ≥3x improvement over current PIO-only
- NCQ depth 32 verified via `PxSACT` register dump in debug output
- QEMU with `-device ahci,id=ahci` and `-drive file=...,if=none,id=drive0` produces NCQ completions

### T1.2: AHCI Power Management

**Problem**: No power management. Laptops drain battery with disk constantly powered.

**Linux reference**: `drivers/ata/libata-eh.c:3682` — `ata_eh_handle_port_suspend()`. `drivers/ata/ahci.c` — `ahci_set_lpm()` for Partial/Slumber link power management.

**Implementation**:

1. Add link power management to `ahci_init()`:
   - Set PxCMD.ICC (Interface Communication Control) to Slumber after idle
   - Set PxSCTL.DET to disable PHY when port is idle
   - Restore on new command arrival
2. Add ALPM (Aggressive Link Power Management):
   - Check AHCI_HOST_CAP2.SDS (Supports Device Sleep, a read-only capability bit) and use device sleep only if it is set
   - Enable HIPM (Host Initiated Power Management) and DIPM (Device Initiated)
3. Add device sleep (DevSlp) for SATA 3.2+ devices

**Acceptance**: After 5 seconds of idle, PxSSTS.DET reports 0x4 (PHY offline). New command wakes the link within 100ms. Verified on bare metal with SATA SSD.

### T1.3: AHCI TRIM/Discard

**Problem**: SSDs degrade over time without TRIM. Write amplification increases.

**Linux reference**: `drivers/ata/libata-scsi.c` — `ata_scsi_unmap_xlat()` maps SCSI UNMAP to ATA DATA SET MANAGEMENT with TRIM bit.

**Implementation**:

1. Add TRIM command support using ATA DATA SET MANAGEMENT (opcode 0x06) with TRIM bit
2. Implement range list construction (LBA + sector count per entry, up to 64 entries)
3. Wire into filesystem TRIM/discard path via scheme discard operation

**Acceptance**: `fstrim /` (or redoxfs equivalent) issues DATA SET MANAGEMENT commands visible in AHCI debug output. SSD wear leveling counters show improvement after TRIM.

### T1.4: NVMe Multiple Queue Support

**Problem**: NVMe driver uses single I/O queue. NVMe supports up to 64K queues for parallelism.

**Linux reference**: `drivers/nvme/host/pci.c` — `nvme_reset_work()` for controller initialization with queue count negotiation.

**Implementation**:

1. Implement `nvme_create_io_queues()` modeled on Linux:
   - Read controller capabilities for maximum queue count
   - Create one admin submission + completion queue pair
   - Create N I/O submission + completion queue pairs
   - Configure interrupt vectors for MSI-X per-queue
2. Implement round-robin queue selection for I/O submission

**Acceptance**: NVMe device in QEMU reports ≥4 I/O queues. `fio` shows throughput scaling with queue count.

---

## Phase 2: Network Drivers (Weeks 4-8)

### T2.1: e1000 Interrupt Moderation + Checksum Offload

**Problem**: e1000d is 458 lines with no hardware offloads. Every packet triggers an interrupt.
Throughput is limited by interrupt rate (~10K pps max).

**Linux reference**: `drivers/net/ethernet/intel/e1000e/netdev.c:4200` — `e1000_configure_itr()`. `e1000e/netdev.c` — `e1000_tx_csum()`, `e1000_rx_checksum()`.

**Implementation**:

1. **Interrupt moderation** (ITR):
   - Program E1000_ITR register with dynamic moderation
   - Implement `e1000_update_itr()` modeled on Linux: increase ITR under high load, decrease under low load
   - Target: reduce interrupts from 10K/s to 1K/s under full load
2. **TX checksum offload**:
   - Set E1000_TXD_CMD_IPCSS/TUCMD_IPCSS for IP header checksum
   - Set E1000_TXD_CMD_TCP/UDP for TCP/UDP pseudo-header checksum
   - Set context descriptor for checksum parameters
3. **RX checksum offload**:
   - Parse E1000_RXD_STAT_IPCS/TCPCS status bits
   - Pass checksum status to netstack

**Files to modify**:

- `recipes/core/base/source/drivers/net/e1000d/src/device.rs` — add ITR, checksum methods
- `recipes/core/base/source/drivers/net/e1000d/src/main.rs` — wire into TX/RX paths

**Acceptance**: `iperf3` TCP throughput ≥5x improvement. Interrupt rate drops from ~10K/s to ≤2K/s under load. Wireshark capture shows valid checksums on TX packets.

### T2.2: e1000 TSO/GSO

**Problem**: TCP segmentation is done in software. Large sends require per-packet overhead.

**Linux reference**: `drivers/net/ethernet/intel/e1000e/netdev.c:5305` — `e1000_tso()`.

**Implementation**:

1. Implement `e1000_tso()` modeled on Linux:
   - Parse GSO descriptor from netstack
   - Set E1000_TXD_CMD_TSE (TCP Segmentation Enable)
   - Set MSS (Maximum Segment Size) in context descriptor
   - Set header length in context descriptor
   - Hardware will segment one large buffer into MSS-sized packets
2. Implement `e1000_tx_csum()` for combined TSO + checksum offload

**Acceptance**: TCP send of 64KB buffer produces hardware-segmented packets (verified via virtio-net capture on host side). Throughput for large sends ≥2x improvement.

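
For the interrupt moderation step in T2.1, the arithmetic behind the 10K/s to 1K/s target can be sketched as follows. The sketch assumes the ITR interval field is 16 bits wide and counts in 256 ns units, which is how Linux e1000e derives the register value (`1000000000 / (itr * 256)`). The `Load` classes, their thresholds, and all function names are illustrative assumptions, not the Linux `e1000_update_itr()` policy.

```rust
/// Assumed ITR granularity: the interval field counts in 256 ns units.
const ITR_UNIT_NS: u64 = 256;
/// Assumed width of the interval field: 16 bits.
const ITR_MAX: u64 = 0xFFFF;

/// Convert a target interrupt rate (interrupts per second) into an ITR
/// register value. A rate of 0 is treated as "no moderation" here.
pub fn itr_from_rate(ints_per_sec: u32) -> u32 {
    if ints_per_sec == 0 {
        return 0;
    }
    let interval = 1_000_000_000u64 / (u64::from(ints_per_sec) * ITR_UNIT_NS);
    interval.min(ITR_MAX) as u32
}

/// Hypothetical traffic classes for one measurement window.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Load {
    Light,
    Moderate,
    Bulk,
}

/// Classify the last window from packet and byte counts.
/// Thresholds are placeholders to be tuned on real traffic.
pub fn classify(packets: u32, bytes: u32) -> Load {
    if packets == 0 {
        return Load::Light;
    }
    let avg_len = bytes / packets;
    if packets >= 64 && avg_len >= 1024 {
        Load::Bulk
    } else if packets >= 8 {
        Load::Moderate
    } else {
        Load::Light
    }
}

/// Map a traffic class to a target interrupt rate: fewer interrupts
/// (a larger ITR interval) as the load grows.
pub fn target_rate(load: Load) -> u32 {
    match load {
        Load::Light => 10_000,
        Load::Moderate => 4_000,
        Load::Bulk => 1_000,
    }
}
```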

### T2.3: r8169 PHY Configuration

**Problem**: rtl8168d has no per-chip PHY initialization. Works on QEMU's default r8169 but fails on many real chips.

**Linux reference**: `drivers/net/ethernet/realtek/r8169_phy_config.c` (1,354 lines of per-chip init sequences).

**Implementation**:

1. Identify chip version from MAC0-MAC4 registers (Linux: `rtl8169_get_mac_version()`)
2. Add PHY init sequences for common chip versions:
   - RTL_GIGA_MAC_VER_34 (RTL8168EP/8111EP)
   - RTL_GIGA_MAC_VER_44 (RTL8168FP/8111FP)
   - RTL_GIGA_MAC_VER_51 (RTL8168H/8111H)
3. Implement MDIO register read/write for PHY access
4. Add PHY status polling for link detection

**Files to modify**:

- `recipes/core/base/source/drivers/net/rtl8168d/src/device.rs` — chip detection, PHY init
- `recipes/core/base/source/drivers/net/rtl8168d/src/main.rs` — init sequence

**Acceptance**: RTL8168 NIC in real hardware enumerates, links up, and passes `ping`. Multiple chip versions tested.

### T2.4: Jumbo Frame Support (e1000 + r8169)

**Problem**: MTU limited to 1500. Jumbo frames (9000 bytes) reduce per-packet overhead for bulk transfers.

**Linux reference**: `e1000e/netdev.c` — `e1000_change_mtu()`. `r8169_main.c:4352` — `rtl_jumbo_config()`.

**Implementation**:

1. Configure RX buffer size for jumbo frames (up to 9KB)
2. Set MAX_FRAME_SIZE register
3. Update TX descriptor buffer size
4. Expose MTU configuration through scheme interface

**Acceptance**: `ifconfig eth0 mtu 9000` succeeds. `iperf3` with 9KB MTU shows reduced CPU usage per Gbps.

---

## Phase 3: Audio Drivers (Weeks 6-10)

### T3.1: HDA Codec Auto-Detection

**Problem**: ihdad (143 lines) has no codec detection. Audio works on zero real machines.

**Linux reference**: `sound/hda/hda_codec.c` — `snd_hda_codec_new()` for codec discovery. `sound/hda/hda_generic.c` for generic codec parser.

**Implementation**:

1. Implement HDA controller initialization:
   - Read GCAP (Global Capabilities) register for stream/IRQ info
   - Reset controller via GCTL.CRST
   - Set CORB/RIRB (Command/Response Ring Buffers) for codec communication
2. Implement codec discovery:
   - Read STATESTS register for codec presence bitmap
   - For each present codec, send GET_PARAMETER verb to read:
     - Vendor/Device ID (F00)
     - Subsystem ID (F20)
     - Revision ID (F02)
     - Node count (F04)
     - Function group type (F05)
3. Implement codec parsing:
   - Walk widget tree starting from AFG (Audio Function Group) node
   - Parse each widget's parameters (amp capabilities, connection list, pin config)
   - Build internal topology representation
4. Add codec table for common codecs:
   - Realtek ALC887/ALC888/ALC892 (most common desktop)
   - Realtek ALC269/ALC282/ALC283 (most common laptop)
   - Conexant CX20561/CX20585
   - IDT 92HD73C1/92HD81B1C5

**Files to modify/create**:

- `recipes/core/base/source/drivers/audio/ihdad/src/main.rs` — controller init
- `recipes/core/base/source/drivers/audio/ihdad/src/hda/` — new `codec.rs`, `widget.rs`, `codecs/`
- `recipes/core/base/source/drivers/audio/ihdad/src/hda/registers.rs` — register definitions

**Acceptance**: Real hardware with Intel HDA controller enumerates codecs. `lspci` shows HD Audio device with driver attached. Codec dump shows vendor/device IDs matching known codecs.

### T3.2: HDA Mixer Controls + Jack Detection

**Problem**: No volume control, no muting, no jack detection. Audio output is fixed-volume or silent.

**Linux reference**: `sound/hda/hda_generic.c` — `create_mute_volume_ctl()`. `sound/hda/hda_jack.c` — `snd_hda_jack_detect()`.

**Implementation**:

1. Add mixer controls for each output path:
   - Volume control (AMP-OUT mute + gain on pin widget)
   - Capture control (AMP-IN mute + gain on ADC widget)
   - Master volume (combined output volume)
2. Implement jack detection:
   - Enable unsolicited response for jack-sense pin widgets
   - Handle unsolicited response in CORB/RIRB interrupt
   - Report jack state (plugged/unplugged) via scheme
3. Wire mixer controls to audiod for system-wide volume management

**Files to modify**:

- `recipes/core/base/source/drivers/audio/ihdad/src/hda/codec.rs` — mixer controls
- `recipes/core/base/source/drivers/audio/ihdad/src/hda/jack.rs` — jack detection (new)
- `recipes/core/base/source/drivers/audio/audiod/src/scheme.rs` — volume interface

**Acceptance**: Volume control changes audible output level. Plugging/unplugging headphones triggers jack event (visible in debug output). Headphone and speaker paths are independent.

### T3.3: HDA Stream Setup and PCM Playback

**Problem**: No actual PCM audio output. HDA hardware configured but no audio data flows.

**Linux reference**: `sound/hda/hda_controller.c` — `azx_pcm_open()` / `azx_pcm_prepare()` / `azx_pcm_trigger()`.

**Implementation**:

1. Implement stream (PCM) management:
   - Allocate stream descriptor from controller (SD0-SDn)
   - Configure stream format (sample rate, bits, channels)
   - Set BDL (Buffer Descriptor List) for DMA
   - Set stream position in buffer (LPIB register)
2. Implement PCM playback path:
   - `pcm_open(format)` — allocate stream, configure format
   - `pcm_write(data)` — write audio samples to DMA buffer
   - `pcm_start()` — set RUN bit in stream control
   - `pcm_stop()` — clear RUN bit
3. Implement CORB/RIRB interrupt handling for unsolicited responses
4. Implement stream interrupt handling for buffer completion (BCIS)

**Files to modify**:

- `recipes/core/base/source/drivers/audio/ihdad/src/hda/stream.rs` — stream management (new)
- `recipes/core/base/source/drivers/audio/ihdad/src/hda/dma.rs` — BDL setup (new)
- `recipes/core/base/source/drivers/audio/audiod/src/` — PCM routing

**Acceptance**: `aplay` (or redox equivalent) plays a WAV file and produces audible output. `parec` captures from microphone.
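
The "configure stream format" step in T3.3 comes down to packing sample rate, sample size, and channel count into the 16-bit stream descriptor format register. A hedged sketch, assuming the field layout from the Intel HD Audio specification (bit 14 base rate, bits 13:11 multiplier, bits 10:8 divisor, bits 6:4 bits per sample, bits 3:0 channels minus one). `stream_format` and `FormatError` are hypothetical names, and only a few common rates are covered.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FormatError {
    UnsupportedRate(u32),
    UnsupportedBits(u8),
    UnsupportedChannels(u8),
}

/// Encode a PCM format into the assumed HDA stream format layout.
pub fn stream_format(rate_hz: u32, bits: u8, channels: u8) -> Result<u16, FormatError> {
    // (base, mult, div): base 0 = 48 kHz, 1 = 44.1 kHz;
    // mult and div are stored as "factor minus one".
    let (base, mult, div): (u16, u16, u16) = match rate_hz {
        44_100 => (1, 0, 0),
        48_000 => (0, 0, 0),
        88_200 => (1, 1, 0),
        96_000 => (0, 1, 0),
        192_000 => (0, 3, 0),
        other => return Err(FormatError::UnsupportedRate(other)),
    };
    // Bits-per-sample code for the 6:4 field.
    let bits_code: u16 = match bits {
        8 => 0,
        16 => 1,
        20 => 2,
        24 => 3,
        32 => 4,
        other => return Err(FormatError::UnsupportedBits(other)),
    };
    // The channel field holds channels - 1 in four bits.
    if channels == 0 || channels > 16 {
        return Err(FormatError::UnsupportedChannels(channels));
    }
    Ok((base << 14) | (mult << 11) | (div << 8) | (bits_code << 4) | u16::from(channels - 1))
}
```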
Loopback (output → input) works without distortion.

### T3.4: AC97 Multiple Codec + Mixer Support

**Problem**: ac97d supports only single codec at fixed configuration. No volume/mute.

**Linux reference**: `sound/pci/ac97/ac97_codec.c` (3,134 lines) — multi-codec architecture.

**Implementation**:

1. Add codec slot detection (AC97 supports up to 4 codecs on one controller)
2. Add mixer register read/write for volume/mute
3. Add record source selection

**Acceptance**: Desktop with AC97 audio codec produces audible output with adjustable volume.

---

## Phase 4: Input Completeness (Weeks 3-5)

### T4.1: PS/2 i8042 Controller Reset

**Problem**: ps2d assumes controller is ready. Real hardware may need reset sequence.

**Linux reference**: `drivers/input/serio/i8042.c:522` — `i8042_controller_check()`.

**Implementation**:

1. Add controller self-test: Write 0xAA to command register, expect 0x55 response
2. Add controller initialization: disable devices, flush buffer, enable
3. Add AUX (mouse) port detection
4. Add timeout handling for missing ACK from controller

**Files to modify**:

- `recipes/core/base/source/drivers/input/ps2d/src/controller.rs`

**Acceptance**: PS/2 keyboard and mouse work on real hardware after cold boot. No "LED command ACK timeout" warnings.

### T4.2: Touchpad Protocol Detection

**Problem**: USB HID touchpads work as basic mice. No multi-touch, no gestures.

**Linux reference**: `drivers/input/mouse/synaptics.c` for Synaptics protocol. `drivers/input/mouse/alps.c` for ALPS.

**Implementation**:

1. Add PS/2 touchpad protocol detection for Synaptics/ALPS/Elantech
2. Parse multi-touch data from HID digitizer reports
3. Expose gesture events through evdevd scheme

**Acceptance**: Laptop touchpad supports two-finger scroll. Multi-touch coordinates reported correctly.

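
The controller self-test from T4.1 (write 0xAA, expect 0x55, with a bounded wait) might look like the sketch below. The `PortIo` trait, `I8042Error`, and `MockController` are hypothetical names introduced for illustration and are not taken from ps2d. A real driver would back `PortIo` with access to ports 0x64 and 0x60, and the status bits assumed here are the conventional OBF (bit 0) and IBF (bit 1). The poll count stands in for a real timeout.

```rust
/// Hypothetical abstraction over the i8042 status, data, and command ports.
pub trait PortIo {
    fn read_status(&mut self) -> u8;
    fn read_data(&mut self) -> u8;
    fn write_command(&mut self, cmd: u8);
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum I8042Error {
    InputBufferStuck,
    NoResponse,
    SelfTestFailed(u8),
}

const STATUS_OBF: u8 = 0x01; // output buffer full: a byte is waiting
const STATUS_IBF: u8 = 0x02; // input buffer full: controller is busy
const CMD_SELF_TEST: u8 = 0xAA;
const SELF_TEST_OK: u8 = 0x55;

/// Poll the status register up to `max_polls` times until `ready` holds.
fn poll_status<P: PortIo>(io: &mut P, max_polls: u32, ready: impl Fn(u8) -> bool) -> bool {
    (0..max_polls).any(|_| ready(io.read_status()))
}

/// Flush stale output, issue the self-test, and check the reply.
pub fn self_test<P: PortIo>(io: &mut P, max_polls: u32) -> Result<(), I8042Error> {
    for _ in 0..max_polls {
        if io.read_status() & STATUS_OBF == 0 {
            break;
        }
        let _ = io.read_data(); // discard stale byte
    }
    if !poll_status(io, max_polls, |s| s & STATUS_IBF == 0) {
        return Err(I8042Error::InputBufferStuck);
    }
    io.write_command(CMD_SELF_TEST);
    if !poll_status(io, max_polls, |s| s & STATUS_OBF != 0) {
        return Err(I8042Error::NoResponse);
    }
    match io.read_data() {
        SELF_TEST_OK => Ok(()),
        other => Err(I8042Error::SelfTestFailed(other)),
    }
}

/// Test double: answers the self-test with a fixed byte, or stays silent.
pub struct MockController {
    reply: Option<u8>,
    output: Option<u8>,
}

impl MockController {
    pub fn new(reply: Option<u8>) -> Self {
        Self { reply, output: None }
    }
}

impl PortIo for MockController {
    fn read_status(&mut self) -> u8 {
        if self.output.is_some() { STATUS_OBF } else { 0 }
    }
    fn read_data(&mut self) -> u8 {
        self.output.take().unwrap_or(0)
    }
    fn write_command(&mut self, cmd: u8) {
        if cmd == CMD_SELF_TEST {
            self.output = self.reply;
        }
    }
}
```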

---

## Phase 5: Validation & Documentation (Weeks 1-12, parallel)

### T5.1: Per-Driver Test Harnesses

**Task**: Create QEMU-based test scripts for each driver category:

- `local/scripts/test-storage-qemu.sh` — boots with virtio-blk + AHCI, runs fio
- `local/scripts/test-network-qemu.sh` — boots with e1000 + r8169, runs iperf3
- `local/scripts/test-audio-qemu.sh` — boots with HDA + AC97, plays test tone

**Acceptance**: Each script exits 0 on success, produces captured serial output with test results.

### T5.2: Hardware Validation Matrix

**Task**: Create `local/docs/HARDWARE-VALIDATION-MATRIX.md` documenting tested hardware configurations:

- CPU/chipset combinations tested
- Storage controllers (AHCI, NVMe) tested
- Network chips (e1000, r8169 variants) tested
- Audio codecs (HDA, AC97) tested
- Known-broken configurations

**Acceptance**: Matrix has at least one verified entry per driver category on real hardware.

---

## Execution Order & Dependencies

```
Phase 0 (Cross-cutting) ───────────────────────────────────────────┐
  T0.1 Error handling   T0.2 Logging   T0.3 Documentation          │
  │                                                                │
  ├── Phase 1 (Storage) ───────────────────────────────────────┐   │
  │   T1.1 AHCI NCQ ──► T1.3 TRIM ──► T1.2 PM ──► T1.4 NVMe    │   │
  │                                                            │   │
  ├── Phase 2 (Network) ───────────────────────────────────┐   │   │
  │   T2.1 ITR+Checksum ──► T2.2 TSO ──► T2.3 PHY ──► T2.4 │   │   │
  │                                                        │   │   │
  ├── Phase 3 (Audio) ─────────────────────────────────┐   │   │   │
  │   T3.1 CodecDetect ──► T3.3 Stream ──► T3.2 Mixer  │   │   │   │
  │   T3.4 AC97 (parallel)                             │   │   │   │
  │                                                    │   │   │   │
  └── Phase 4 (Input) ────────────────────────────┐    │   │   │   │
      T4.1 PS/2 reset ──► T4.2 Touchpad           │    │   │   │   │
                                                  │    │   │   │   │
Phase 5 (Validation) ◄────────────────────────────┴────┴───┴───┴───┘
  T5.1 Test harnesses   T5.2 Hardware matrix
```

**Phase 0 is prerequisite for all other phases.**
**Phases 1-4 are independent of each other and can run in parallel.**
**Phase 5 runs concurrently with all phases, finalizing as each completes.**

## Timeline

| Phase | Tasks | Duration | Cumulative |
|-------|-------|----------|------------|
| Phase 0 | T0.1, T0.2, T0.3 | Weeks 1-2 | Week 2 |
| Phase 1 | T1.1, T1.2, T1.3, T1.4 | Weeks 2-6 | Week 6 |
| Phase 2 | T2.1, T2.2, T2.3, T2.4 | Weeks 4-8 | Week 8 |
| Phase 3 | T3.1, T3.2, T3.3, T3.4 | Weeks 6-10 | Week 10 |
| Phase 4 | T4.1, T4.2 | Weeks 3-5 | Week 5 |
| Phase 5 | T5.1, T5.2 | Weeks 1-12 (parallel) | Week 12 |

**Total**: 12 weeks with 2 developers working in parallel (Phase 1 and Phase 3 on separate tracks).

---

## Linux Reference Map

Every task references specific Linux source. Here is the complete map:

| Task | Primary Reference | File Size | Function Focus |
|------|-------------------|-----------|----------------|
| T1.1 (NCQ) | `drivers/ata/libata-sata.c` | 1,365 lines | `ata_qc_issue()`, FIS construction |
| T1.2 (AHCI PM) | `drivers/ata/libata-eh.c` | 3,915 lines | `ata_eh_handle_port_suspend()` |
| T1.3 (TRIM) | `drivers/ata/libata-scsi.c` | 4,504 lines | `ata_scsi_unmap_xlat()` |
| T1.4 (NVMe) | `drivers/nvme/host/pci.c` | 3,146 lines | `nvme_reset_work()`, queue creation |
| T2.1 (ITR) | `e1000e/netdev.c` | 7,240 lines | `e1000_configure_itr()`, checksum |
| T2.2 (TSO) | `e1000e/netdev.c` | 7,240 lines | `e1000_tso()` |
| T2.3 (PHY) | `r8169_phy_config.c` | 1,354 lines | per-chip PHY init sequences |
| T3.1 (Codec) | `sound/hda/hda_codec.c` | 5,598 lines | `snd_hda_codec_new()`, widget parsing |
| T3.2 (Mixer) | `sound/hda/hda_generic.c` | 5,982 lines | `create_mute_volume_ctl()` |
| T3.3 (Stream) | `sound/hda/hda_controller.c` | 1,900 lines | `azx_pcm_open/prepare/trigger()` |
| T3.4 (AC97) | `sound/pci/ac97/ac97_codec.c` | 3,134 lines | multi-codec, mixer regs |
| T4.1 (PS/2) | `drivers/input/serio/i8042.c` | 1,254 lines | `i8042_controller_check()` |
| T4.2 (Touchpad) | `drivers/input/mouse/synaptics.c` | 1,707 lines | protocol detection |

---

## Scope Boundaries

**In scope**:

- Storage driver enhancements (AHCI NCQ, PM, TRIM; NVMe queues)
- Network driver enhancements (e1000 offload, r8169 PHY, jumbo frames)
- Audio driver enhancements (HDA codec, mixer, streams; AC97 multi-codec)
- Input driver enhancements (PS/2 reset, touchpad protocols)
- Cross-cutting driver quality (error handling, logging, documentation)

**Out of scope** (covered by existing plans):

- ACPI S3/S4 sleep, thermal, EC — see `ACPI-IMPROVEMENT-PLAN.md`
- PCI IRQ, MSI-X depth, IOMMU — see `IRQ-AND-LOWLEVEL-CONTROLLERS-ENHANCEMENT-PLAN.md`
- USB controller completeness, device lifecycle — see `USB-IMPLEMENTATION-PLAN.md`
- GPU/DRM display, KMS, Mesa — see `DRM-MODERNIZATION-EXECUTION-PLAN.md`
- Bluetooth — see `BLUETOOTH-IMPLEMENTATION-PLAN.md`
- Wi-Fi — see `WIFI-IMPLEMENTATION-PLAN.md`
- Desktop/KDE — see `CONSOLE-TO-KDE-DESKTOP-PLAN.md`

---

## Addendum A: Kernel Substrate Audit (2026-05-04 deep re-assessment)

### A.1 CPU / SMP / Timer Initialization

**Red Bear**: Kernel arch/x86_64 (502 lines) + arch/x86_shared + time.rs

**Linux**: `arch/x86/kernel/smpboot.c` (1,511) + `arch/x86/kernel/apic/apic.c` (2,694) + `arch/x86/kernel/tsc.c` (1,612) + `kernel/time/tick-common.c` (595) = 6,412 lines (subset)

**What Red Bear has**:

- Basic x86_64 boot (GDT, IDT, page tables)
- x2APIC/SMP detected from MADT
- HPET timer

**What Linux has that Red Bear is missing**:

- ❌ BSP/AP handoff protocol — Linux: `smpboot.c:895` `do_boot_cpu()`
- ❌ CPU hotplug (online/offline) — Linux: `smpboot.c:1312` `cpu_up()` / `cpu_down()`
- ❌ TSC calibration and synchronization — Linux: `tsc.c:1186` `check_tsc_sync_source()`
- ❌ APIC timer calibration and per-CPU timers — Linux: `apic.c:294` `calibrate_APIC_clock()`
- ❌ Interrupt affinity and vector allocation — Linux: `kernel/irq/manage.c` (2,803 lines)
- ❌ IPI (Inter-Processor Interrupt) routing — Linux: `apic/ipi.c`
- ❌ CPU idle states (C-states) — Linux: `arch/x86/kernel/acpi/cstate.c`
- ❌ Clock source rating and switching — Linux: `kernel/time/clocksource.c`

**Priority**: SMP bring-up stability and TSC sync are critical for multi-core correctness.
Without APIC timer calibration, scheduler tick is unreliable.

### A.2 DMA / Memory / IOMMU Substrate

**Red Bear**: kernel memory/mod.rs (1,266 lines) + iommu daemon (4,411 lines)

**Linux**: `kernel/dma/mapping.c` (1,016) + `drivers/iommu/` (~30K) + `mm/` subsystem

**What Red Bear has**:

- Physical memory mapping via scheme:memory
- Basic IOMMU daemon (4,411 lines — substantial, AMD-Vi + Intel VT-d)
- Page table management in iommu daemon

**What Linux has that Red Bear is missing**:

- ❌ Coherent DMA API — Linux: `kernel/dma/mapping.c` `dma_alloc_coherent()`
- ❌ Streaming DMA API — Linux: `kernel/dma/mapping.c` `dma_map_single()`
- ❌ Scatter-gather DMA — Linux: `lib/scatterlist.c`
- ❌ DMA pool/zone management
- ❌ SWIOTLB bounce buffering — Linux: `kernel/dma/swiotlb.c`
- ❌ IOMMU DMA remapping per-device — the iommu daemon exists but Linux handles this in-kernel with `iommu_dma_ops`
- ❌ DMA debug and error injection — Linux: `kernel/dma/debug.c`

**Priority**: DMA API is prerequisite for any driver doing scatter-gather. Without coherent DMA, drivers must manually manage cache coherency.

### A.3 Virtio Completeness

**Red Bear**: virtio-core (1,545 lines) + virtio-blkd + virtio-netd + virtio-gpud

**Linux**: `drivers/virtio/virtio.c` (730) + `virtio_ring.c` (3,940) + `virtio_pci_modern.c` (1,301) + blk/net/gpu drivers (14,957 total)

**What Red Bear has**:

- Basic virtio PCI transport (legacy)
- Split virtqueue with basic ring management
- virtio-blk, virtio-net, virtio-gpu drivers

**What Linux has that Red Bear is missing**:

- ❌ **Virtio 1.0 modern PCI transport** — Linux: `virtio_pci_modern.c` (1,301 lines). Red Bear only uses legacy.
- ❌ **Packed virtqueue** (Virtio 1.1) — Linux: `virtio_ring.c` supports both split and packed
- ❌ **Multiqueue support** — Linux: virtio-net supports up to 16 TX/RX queue pairs via MSI-X
- ❌ **Virtio feature negotiation** — Red Bear hardcodes features; Linux does dynamic negotiation
- ❌ **Device reset protocol** — Linux: `virtio.c:237` `virtio_reset_device()`
- ❌ **Virtio-MMIO transport** (for ARM/RISC-V VMs)
- ❌ **Virtio-balloon** (memory ballooning)

**Priority**: Modern PCI transport is required for QEMU machine types `q35` and newer. Packed virtqueues improve throughput. Multiqueue is critical for network performance.

### A.4 CPU Frequency / Thermal / Power

**Red Bear**: cpufreqd (176 lines — real implementation with governors), thermald (837 lines), hwrngd (534 lines), redbear-upower, redbear-acmd, redbear-ecmd

**Linux**: `drivers/cpufreq/cpufreq.c` (3,081) + `drivers/thermal/thermal_core.c` (1,956) + `drivers/char/hw_random/core.c` (739)

**cpufreqd status**: 176 lines with ondemand/performance/powersave governors, MSR-based P-state control via IA32_PERF_CTL, and CPU load measurement via `/scheme/sys`. Still missing vs Linux:

- ❌ Governor framework (performance, powersave, ondemand, schedutil)
- ❌ ACPI P-state (_PSS) integration
- ❌ Intel P-state / HWP driver
- ❌ AMD CPPC driver

**thermald status**: 837 lines — basic thermal monitoring exists but missing:

- ❌ Thermal zone trip points (passive/active/critical)
- ❌ Cooling device registration
- ❌ Fan speed control via ACPI

**hwrngd status**: 534 lines — reasonable random number daemon. Missing:

- ❌ Entropy estimation per FIPS 140-2
- ❌ Multiple entropy source mixing (CPU jitter, TPM, RDRAND)
- ❌ `/dev/hwrng` interface

**Priority**: cpufreqd has basic governor support but still needs ACPI P-state integration, Intel HWP, and AMD CPPC for full functionality.

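
To make the governor behavior described for cpufreqd concrete, here is a minimal sketch of how the three governors could pick a P-state from a load sample. It is not the cpufreqd source: `select_pstate`, `perf_ctl_value`, and both thresholds are hypothetical. It assumes P-states are indexed fastest first, as ACPI `_PSS` lists them, and that the target ratio goes in bits 15:8 of IA32_PERF_CTL (MSR 0x199), which holds on many Intel Core parts but should be checked per CPU family.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Governor {
    Performance,
    Powersave,
    Ondemand,
}

/// Illustrative thresholds (percent CPU load); real values need tuning.
const UP_THRESHOLD: u8 = 80;
const DOWN_THRESHOLD: u8 = 30;

/// Pick the next P-state index. Index 0 is the fastest state and
/// `n_states - 1` the slowest.
pub fn select_pstate(gov: Governor, load_pct: u8, current: usize, n_states: usize) -> usize {
    if n_states == 0 {
        return 0;
    }
    let slowest = n_states - 1;
    let current = current.min(slowest);
    match gov {
        Governor::Performance => 0,
        Governor::Powersave => slowest,
        // Ondemand: jump straight to the fastest state under high load,
        // step down one state at a time when the CPU is mostly idle.
        Governor::Ondemand if load_pct >= UP_THRESHOLD => 0,
        Governor::Ondemand if load_pct < DOWN_THRESHOLD => (current + 1).min(slowest),
        Governor::Ondemand => current,
    }
}

/// MSR address of IA32_PERF_CTL.
pub const IA32_PERF_CTL: u32 = 0x199;

/// Encode a target ratio for IA32_PERF_CTL, assuming the ratio occupies
/// bits 15:8 of the MSR value.
pub fn perf_ctl_value(ratio: u8) -> u64 {
    u64::from(ratio) << 8
}
```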

### A.5 Block Layer / Filesystem Integration

**Red Bear**: No dedicated block layer — each storage driver handles I/O directly via DiskScheme

**Linux**: `block/blk-mq.c` (5,309) + `block/blk-flush.c` (540) + `block/genhd.c` + `block/elevator.c`

**What Linux has that Red Bear is missing**:

- ❌ Multi-queue block I/O — Linux: `blk-mq.c` — per-CPU queues + tag sets
- ❌ I/O scheduling (mq-deadline, kyber, bfq) — Linux: `block/mq-deadline.c`
- ❌ Flush/FUA semantics — Linux: `block/blk-flush.c`
- ❌ I/O merging and sorting
- ❌ Request timeout and retry — Linux: `block/blk-mq.c` `blk_mq_check_expired()`
- ❌ Block device partitioning (MBR/GPT handled by partitionlib library)
- ❌ Queue depth management and back-pressure

**Red Bear storage drivers** (nvmed 1,318 lines; usbscsid 1,622 lines; ided 773 lines) all implement their own I/O dispatch. The lack of a shared block layer means each driver reinvents queuing, timeout, and retry logic.

**Priority**: Block layer is prerequisite for NCQ, NVMe multi-queue, TRIM propagation, and crash consistency.

---

## Revised Execution Priority (incorporating kernel substrate)

| Tier | Subsystem | Effort |
|------|-----------|--------|
| **T0** (kernel) | SMP bring-up stability, TSC calibration, interrupt affinity | 4-6 weeks |
| **T0** (kernel) | DMA API + scatter-gather | 2-3 weeks |
| **T1** | AHCI NCQ + block layer | 3-4 weeks |
| **T1** | Virtio modern PCI + multiqueue | 2-3 weeks |
| **T1** | cpufreqd (governor + P-state) | 2-3 weeks |
| **T2** | Network offloads (Phase 2) | 3-4 weeks |
| **T2** | HDA codec detection (Phase 3) | 3-4 weeks |
| **T3** | thermald trip points + fan control | 1-2 weeks |
| **T3** | NVMe multi-queue | 2-3 weeks |
| **T4** | Audio streams + mixer (Phase 3 remainder) | 3-4 weeks |

**Total**: 24-36 weeks (T0-T2 minimum viable), 40-52 weeks (full).


---

## Addendum B: Daemon & Subsystem Audit (2026-05-04, updated with precise Linux 7.0 line counts)

### B.1 ACPI Subsystem — Deep Linux Cross-Reference

**Red Bear**: acpid (2,187 lines) + kernel ACPI (727 lines) = 2,914 total

**Linux 7.0** (key files): `sleep.c` (1,152) + `thermal.c` (1,067) + `battery.c` (1,331) + `ec.c` (2,380) + `arch/x86/kernel/acpi/sleep.c` (202) + `processor_perflib.c` + `acpi_video.c` + `pci_irq.c` + `apei/` = **~60,000+ total**

| Linux File | Lines | Feature | Red Bear Status |
|------------|-------|---------|-----------------|
| `drivers/acpi/sleep.c` | 1,152 | S3/S4 suspend, NVS save/restore, wakeup vector | ❌ S3/S4 missing |
| `drivers/acpi/thermal.c` | 1,067 | Thermal zones, trip points, cooling | ❌ Missing |
| `drivers/acpi/battery.c` | 1,331 | Battery status, charge, ACPI _BIF/_BST | ❌ Missing |
| `drivers/acpi/ec.c` | 2,380 | Embedded Controller runtime, commands, GPE | ❌ Missing (redbear-ecmd is stub) |
| `drivers/acpi/fan.c` | ~400 | Fan speed control | ❌ Missing |
| `arch/x86/kernel/acpi/sleep.c` | 202 | x86-specific sleep, wakeup vector, trampoline | ❌ Missing |
| `drivers/acpi/processor_perflib.c` | ~800 | _PSS/_PPC performance states | ❌ Missing |
| `drivers/acpi/pci_irq.c` | ~500 | PCI IRQ routing overrides (_PRT) | ❌ Missing |
| `drivers/acpi/apei/` | ~3,000 | ACPI Platform Error Interface | ❌ Missing |

**Priority**: S3/S4 sleep and thermal zones are critical for laptop/desktop use. EC support needed for modern laptops.


### B.2 IRQ / MSI / Timer Subsystem — Precise Line Counts

**Red Bear**: kernel irq.rs (570) + local_apic.rs (272) + ioapic.rs (427) + ipi.rs (53) + time.rs (36) = 1,358 total

**Linux 7.0** (key files): `kernel/irq/manage.c` (2,803) + `apic/vector.c` (1,387) + `apic/msi.c` (391) + `tsc.c` (1,612) + `tick-common.c` (595) = **6,788 lines (subset)**

| Linux File | Lines | Feature | Red Bear Status |
|------------|-------|---------|-----------------|
| `kernel/irq/manage.c` | 2,803 | IRQ management, affinity, threading, spurious | ❌ Basic only |
| `arch/x86/kernel/apic/vector.c` | 1,387 | Vector allocation matrix, CPU assignment | ❌ Missing |
| `arch/x86/kernel/apic/msi.c` | 391 | MSI address/data composition, mask bits | ❌ Missing |
| `arch/x86/kernel/tsc.c` | 1,612 | TSC calibration, sync, clocksource rating | ❌ Missing |
| `kernel/time/tick-common.c` | 595 | Tick management, NO_HZ, broadcast | ❌ Missing |

**Priority**: MSI/MSI-X blocks modern GPU/NVMe/network. TSC calibration needed for accurate time.

### B.3 cpufreqd — Confirmed 26-line Stub

cpufreqd is **26 lines** — logs messages, sleeps forever. No MSR access, no governor, no P-state control. A 176-line implementation was written and saved as `local/patches/base/P6-cpufreqd-real-impl.patch` (177 lines) but the source was reverted. Needs re-application.

### B.4 Stale Documentation Cleanup

27 docs archived total. BOOT-PROCESS-FIX-SUMMARY and GRAPHICAL-BOOT-ASSESSMENT moved to archive (superseded by this plan).