# Hardware Quirks Improvement Plan

## Purpose

This plan replaces vague “quirks support” follow-up work with a concrete path to:

1. keep quirks data and reporting honest,
2. integrate quirks into real runtime driver behavior,
3. reduce duplicated quirk logic,
4. leave DMI and USB device quirks in a maintainable state.

## Current status snapshot

Completed from this plan:

- runtime DMI TOML loading in `redox-driver-sys`,
- subsystem-gated PCI TOML matching in both the canonical path and `pcid-spawner`,
- shipped DMI TOML overrides in the brokered `pcid-spawner` env-var path,
- direct canonical `redox-driver-sys` quirk lookup from `pcid-spawner` instead of a separate in-tree PCI quirk engine,
- real USB device quirk consumption in `xhcid`,
- first real linux-kpi quirk consumption in the Red Bear amdgpu path,
- canonical GPU quirk policy moved to the Rust driver boundary in `redox-drm`, so Intel and AMD now consume one shared quirk source for init-time policy,
- PCI quirk extraction upgraded from handler-name guessing to explicit handler-body evidence in `local/scripts/extract-linux-quirks.py`.

Still open after this implementation wave:

- document the provenance of existing AMD `need_firmware` entries in `quirks.d/10-gpu.toml`,
- keep AMD device-specific GPU quirk growth review-gated on Linux-backed evidence,
- keep Intel GPU quirk expansion deferred until Red Bear has a real Intel-side firmware/runtime policy surface that can honestly consume additional flags.

Current naming/source split:

- PCI vendor/device **names** now come from the shipped `pciids` database (`/usr/share/misc/pci.ids`).
- PCI/USB/storage **quirk flags** still come from Red Bear’s canonical quirk path: compiled tables, shipped TOML files, and conservative Linux-source extraction where applicable.
- The `extract-linux-quirks.py` script remains a quirk-mining tool, not the source of human-readable PCI device names.

The runtime-behavior milestone from this plan is now implemented.
The remaining work is maintenance, validation depth, and future refinement rather than missing quirks behavior for the shipped paths.

This plan is based on the current in-tree state of:

- `redox-driver-sys` as the canonical quirks library,
- `pcid-spawner` as an upstream-owned PCI launch broker that now brokers canonical quirks,
- `redox-drm`, `xhcid`, and the amdgpu Redox glue/runtime path as real runtime PCI quirk consumers,
- `lspci`, `lsusb`, and `redbear-info` as reporting surfaces.

## Reassessment Summary

### What is real today

- `redox-driver-sys` owns the canonical PCI/USB quirk flag definitions and lookup helpers.
- `redox-drm` consumes PCI quirks for interrupt fallback and `DISABLE_ACCEL`.
- `xhcid` consumes PCI controller quirks via `PCI_QUIRK_FLAGS` for IRQ mode selection and reset delay.
- `linux-kpi` exposes `pci_get_quirk_flags()` / `pci_has_quirk()` for C drivers, and amdgpu now consumes them in its Redox init path.
- `lspci` and `lsusb` surface active PCI/USB quirk flags for discovered devices.
- `redbear-info --quirks` reports configured TOML entries and DMI rule counts.

### What is still weak

- USB quirks now have a first real runtime consumer in `xhcid`, but broader USB-driver adoption is still missing.
- The `linux-kpi` bridge now has a first real in-tree C consumer: amdgpu uses it for quirk-aware IRQ expectation logging. Broader C-driver adoption is still missing.
- `pcid-spawner` still synthesizes a partial `PciDeviceInfo` instead of reusing a richer canonical PCI object, because it operates as an upstream-owned broker with a narrow interface.

### What should not be “fixed” in the wrong layer

- `firmware-loader` should stay a generic scheme service. `NEED_FIRMWARE` belongs in device driver policy, not in the firmware scheme daemon.
- `redbear-info` should describe configured and observable state; it should not pretend to prove runtime quirk application.
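
The interrupt fallback that `redox-drm` and `xhcid` derive from PCI quirks can be sketched as below. This is a minimal sketch, not the in-tree code: the bit values, the `IrqMode` enum, and `select_irq_mode` are hypothetical names chosen for illustration. Only the flag names `NO_MSI` and `NO_MSIX` come from this plan.

```rust
/// Hypothetical bit assignments; the real values live in `redox-driver-sys`.
pub const NO_MSI: u64 = 1 << 0;
pub const NO_MSIX: u64 = 1 << 1;

/// Interrupt delivery modes, listed from most to least preferred.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IrqMode {
    MsiX,
    Msi,
    Legacy,
}

/// Pick the best interrupt mode that the device advertises and that no
/// quirk flag forbids. Falls back to legacy interrupts otherwise.
pub fn select_irq_mode(quirks: u64, has_msix: bool, has_msi: bool) -> IrqMode {
    if has_msix && (quirks & NO_MSIX) == 0 {
        IrqMode::MsiX
    } else if has_msi && (quirks & NO_MSI) == 0 {
        IrqMode::Msi
    } else {
        IrqMode::Legacy
    }
}
```

Keeping the decision in one pure function makes the fallback order easy to unit test without real hardware, which fits the plan's emphasis on validation depth.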
## Target Architecture

### Upstream-preference policy

When upstream Redox already provides the same functionality, the upstream path wins by default unless the Red Bear-local implementation is materially better. For quirks and driver support, this means the canonical path should converge on `redox-driver-sys` instead of preserving lower-quality duplicate quirk engines as a steady state.

### Canonical rule

`redox-driver-sys` remains the authoritative quirks model:

- flag definitions,
- compiled-in tables,
- TOML parsing semantics,
- DMI matching behavior.

All other code should either:

1. call the canonical lookup directly, or
2. receive lookup results from a single broker that is guaranteed to use the same semantics.

### Driver integration rule

- **Rust PCI drivers using `redox-driver-sys`** should call `info.quirks()` directly.
- **C drivers using `linux-kpi`** should call `pci_has_quirk()` / `pci_get_quirk_flags()` directly in probe/init paths.
- **Upstream base drivers that cannot depend on `redox-driver-sys`** may continue using brokered quirk bits from `pcid-spawner`, but only if that broker is made semantically identical to the canonical library.
- **USB device quirks** should be consumed inside `xhcid` device enumeration/configuration logic, not only in tooling.

## Concrete Work Plan

### Wave 1 — Cleanup and truthfulness

#### Task 1.1: Keep docs and reporting surfaces honest

Scope:

- `local/docs/QUIRKS-SYSTEM.md`
- `local/recipes/system/redbear-info/source/src/main.rs`
- related AGENTS references if needed

Goals:

- separate reporting surfaces from real runtime consumers,
- remove claims that imply driver integration where only tooling exists,
- keep “not yet implemented” items explicit.

QA:

- `cargo test` in `local/recipes/system/redbear-info/source`
- review `redbear-info --help` text and `--quirks` output strings

#### Task 1.2: Remove stale equivalence claims from extraction/documentation

Scope:

- `local/scripts/extract-linux-quirks.py`
- `local/docs/QUIRKS-SYSTEM.md`

Goals:

- avoid mapping Linux flags to incorrect Red Bear flags,
- clearly mark the supported explicit PCI extraction patterns and the limits of unsupported handlers.

QA:

- run the script on a small synthetic USB/PCI input sample,
- confirm output omits unsupported PCI flag mappings instead of inventing equivalents.

Current state:

- `local/scripts/extract-linux-quirks.py` no longer guesses PCI quirks from handler names.
- PCI extraction now maps only explicit handler-body evidence for supported `PCI_DEV_FLAGS_*` assignments plus `pci_d3cold_disable(...)`.
- Running the upgraded extractor on Linux 7.0 `drivers/pci/quirks.c` currently yields only a very small high-confidence PCI subset and no directly usable modern Intel/AMD DRM GPU entries.
- This is intentional: false negatives are preferred over wrong GPU quirk claims.
- The existing AMD `need_firmware` entries in `quirks.d/10-gpu.toml` are manually reviewed policy entries, not extractor-produced Linux facts. Future extraction runs will not refresh those flags automatically.
- Intel firmware classes should be treated explicitly: DMC for display power management, GuC for scheduling/power, HuC for media offload, and GSC for newer authentication flows.
- Red Bear now has a bounded Intel DMC startup manifest/preload path for the first supported Intel device families, but Intel `need_firmware` must still stay out of the canonical GPU quirk set until the broader Intel runtime policy surface is real and validated.
- AMD device-specific GPU quirk growth remains review-gated on explicit Linux-backed evidence.
- Intel GPU quirk expansion is deferred until Red Bear has a real Intel-side firmware/runtime policy surface that can honestly consume additional flags.

### Wave 2 — Unify PCI quirk semantics

#### Task 2.1: Eliminate semantic drift between `pcid-spawner` and `redox-driver-sys`

Constraint:

- `pcid-spawner` is upstream-owned base code, so any convergence work must be implemented as upstream-base changes carried by Red Bear patching until upstream absorbs them.

Best approach:

- make `pcid-spawner` consume generated/shared quirk data instead of hand-maintained duplicated tables and flag maps.

Preferred implementation options, in order:

1. **Shared generated data module** used by both `redox-driver-sys` and `pcid-spawner`.
2. **Protocol extension** where a single canonical broker calculates quirk bits and hands them to drivers.
3. Keep duplication only as a short-term fallback if generation is not yet practical.

Do **not** continue manually editing two separate PCI quirk engines long-term.

Success criteria:

- one authoritative source for compiled PCI quirk entries and flag name mapping,
- subsystem matching behavior aligned,
- explicit decision on whether DMI is brokered by `pcid-spawner` or left to driver-local lookup.

QA:

- compare quirk outputs for the same synthetic PCI info through both paths,
- verify `PCI_QUIRK_FLAGS` emitted by `pcid-spawner` matches canonical lookup for representative devices.

#### Task 2.2: Decide DMI ownership clearly

Decision needed:

- either `pcid-spawner` becomes DMI-aware and brokers the final PCI quirk bitmask,
- or `pcid-spawner` remains PCI/TOML-only and DMI stays driver-local in `redox-driver-sys` consumers.

Recommendation:

- near term: document the split clearly,
- medium term: move toward one brokered result for upstream base drivers.

QA:

- one design note added to the docs explaining the chosen ownership model.
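
The parity check from the Task 2.1 QA list could be approached with a small harness like the one below. It is a sketch under stated assumptions: `PciId`, `QuirkEntry`, `lookup_quirks`, and the hexadecimal text encoding assumed for `PCI_QUIRK_FLAGS` are hypothetical stand-ins, not the actual `redox-driver-sys` or `pcid-spawner` interfaces.

```rust
/// Synthetic PCI identity used to compare lookup paths (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct PciId {
    pub vendor: u16,
    pub device: u16,
    pub subsystem_vendor: u16,
    pub subsystem_device: u16,
}

/// One quirk table row; `None` subsystem fields act as wildcards.
#[derive(Debug, Clone, Copy)]
pub struct QuirkEntry {
    pub vendor: u16,
    pub device: u16,
    pub subsystem_vendor: Option<u16>,
    pub subsystem_device: Option<u16>,
    pub flags: u64,
}

impl QuirkEntry {
    /// Vendor/device must match; subsystem IDs only gate when present.
    fn matches(&self, id: &PciId) -> bool {
        self.vendor == id.vendor
            && self.device == id.device
            && self.subsystem_vendor.map_or(true, |v| v == id.subsystem_vendor)
            && self.subsystem_device.map_or(true, |d| d == id.subsystem_device)
    }
}

/// OR together the flags of every matching entry.
pub fn lookup_quirks(table: &[QuirkEntry], id: &PciId) -> u64 {
    table
        .iter()
        .filter(|entry| entry.matches(id))
        .fold(0, |acc, entry| acc | entry.flags)
}

/// Assumed broker encoding: the bitmask as `0x`-prefixed hexadecimal text.
pub fn encode_env(flags: u64) -> String {
    format!("{:#x}", flags)
}

/// Decode the assumed encoding; returns `None` for malformed input.
pub fn decode_env(value: &str) -> Option<u64> {
    let digits = value.strip_prefix("0x")?;
    u64::from_str_radix(digits, 16).ok()
}

/// True when a brokered env value equals the canonical lookup result.
pub fn brokered_matches_canonical(table: &[QuirkEntry], id: &PciId, env_value: &str) -> bool {
    decode_env(env_value) == Some(lookup_quirks(table, id))
}
```

Feeding the same synthetic `PciId` values through both paths and asserting `brokered_matches_canonical` would turn the "no semantic drift" success criterion into a repeatable test rather than a manual comparison.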
### Wave 3 — Real driver integration

#### Task 3.1: Integrate USB device quirks in `xhcid`

Best integration points:

- after device descriptor read,
- before SetConfiguration,
- before enabling LPM/U1/U2 or USB3-specific behavior,
- after reset paths where extra delay or reset-after-probe is needed.

Minimum runtime behaviors to wire first:

- `NO_SET_CONFIG`
- `NEED_RESET`
- `NO_LPM`
- `NO_U1U2`
- `BAD_DESCRIPTOR`

Success criteria:

- `xhcid` calls `lookup_usb_quirks()` for enumerated devices,
- these flags alter runtime behavior in concrete branches,
- tooling and runtime logic agree on the same device-level quirks.

QA:

- unit/integration tests for selector logic where possible,
- manual logging proof that a known vendor/product entry triggers the expected path.

#### Task 3.2: Consume linux-kpi quirks in `amdgpu`

Best integration points:

- probe path,
- IRQ mode selection,
- firmware gating,
- memory/power-management setup.

First flags to consume:

- `NO_MSI`
- `NO_MSIX`
- `NEED_FIRMWARE`
- `NO_ASPM`
- `NEED_IOMMU`

Success criteria:

- at least one real C driver uses `pci_has_quirk()` in production code,
- runtime logs show quirk-informed decision making.

Current state:

- `local/recipes/gpu/amdgpu/source/amdgpu_redox_main.c` now queries linux-kpi PCI quirks in the real Redox runtime path,
- logs now show the active quirk bitmask plus the implied IRQ fallback policy,
- firmware policy has been pulled back to the Rust-side driver boundary so the C backend does not re-enforce `NEED_FIRMWARE` independently.

QA:

- `grep` shows real in-tree call sites in amdgpu,
- build passes for linux-kpi + amdgpu recipe path.

#### Task 3.3: Keep firmware policy in drivers, not firmware-loader

Action:

- when a driver has `NEED_FIRMWARE`, the driver should gate initialization until the firmware load succeeds.
- `firmware-loader` remains a transport/provider only.

Success criteria:

- docs stop implying that firmware-loader interprets quirk flags,
- driver init paths own the policy decision.
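
The driver-owned firmware policy from Task 3.3 can be sketched as follows. This is an illustration under assumptions: the bit value of `NEED_FIRMWARE`, the `InitError` type, and `gate_on_firmware` are hypothetical, and the `load` closure stands in for whatever request a driver makes to `firmware-loader`.

```rust
/// Hypothetical bit value; the canonical definition lives in `redox-driver-sys`.
pub const NEED_FIRMWARE: u64 = 1 << 4;

#[derive(Debug, PartialEq, Eq)]
pub enum InitError {
    /// A quirk-required blob could not be loaded; carries the blob name.
    FirmwareMissing(String),
}

/// Driver-side policy: the driver, not the loader, decides whether a failed
/// load is fatal. `load` is only a transport and knows nothing about quirks.
/// Returns the number of blobs that were loaded successfully.
pub fn gate_on_firmware<F>(quirks: u64, blobs: &[&str], mut load: F) -> Result<usize, InitError>
where
    F: FnMut(&str) -> Option<Vec<u8>>,
{
    let required = (quirks & NEED_FIRMWARE) != 0;
    let mut loaded = 0;
    for &name in blobs {
        match load(name) {
            Some(_) => loaded += 1,
            // Quirk says firmware is mandatory: refuse to initialize.
            None if required => return Err(InitError::FirmwareMissing(name.to_string())),
            // Firmware is optional for this device: continue without it.
            None => {}
        }
    }
    Ok(loaded)
}
```

Passing the loader in as a closure keeps the transport replaceable in tests, so the gating decision can be exercised without a running firmware scheme.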

QA:

- driver code path shows firmware gating tied to quirks or explicit device rules.

Current state:

- `local/recipes/gpu/redox-drm/source/src/drivers/intel/mod.rs` now reads the canonical `info.quirks()` policy during init, rejects `DISABLE_ACCEL`, and explicitly warns if `NEED_FIRMWARE` appears on Intel instead of silently ignoring quirk policy.
- `local/recipes/gpu/redox-drm/source/src/main.rs` now makes firmware preload expectations explicit at the Rust-side driver boundary, reports whether preload is quirk-required, and summarizes missing candidate blobs when preload cannot satisfy the current policy.
- `local/recipes/gpu/amdgpu/source/amdgpu_redox_main.c` still consumes linux-kpi quirks for runtime expectations, but it no longer owns the final firmware gating decision.

### Wave 4 — DMI completion

#### Task 4.1: DMI TOML runtime loading

Scope:

- `toml_loader.rs` parses `[[dmi_system_quirk]]`,
- matching uses live DMI info served by `acpid` at `/scheme/acpi/dmi`,
- resulting PCI quirk overrides flow through the canonical `redox-driver-sys` DMI path.

Success criteria:

- `50-system.toml` entries are no longer config-only,
- runtime DMI TOML behavior is testable and documented through the live `acpid` DMI scheme.

QA:

- tests for TOML parsing,
- one mock DMI input path proving a TOML DMI rule applies flags.

#### Task 4.2: ACPI blacklist/override layer

Current state:

- `acpid` now supports narrow `[[acpi_table_quirk]]` skip rules, optionally gated by the same DMI-style `match.*` fields used elsewhere.
- The implementation is intentionally limited to table suppression during ACPI table load; it is not a broad AML patching or firmware replacement framework.

## Suggested Immediate Deliverables

If work resumes right away, the next concrete implementation sequence should be:

1. clean remaining stale quirks docs/reporting text,
2. write a design note for canonical PCI quirk ownership,
3. integrate `lookup_usb_quirks()` into `xhcid` enumeration/configuration,
4. add first real `pci_has_quirk()` use in `amdgpu`,
5. validate and extend shipped DMI TOML coverage as needed.

## Exit Criteria For The Next Quirks Milestone

The next milestone is complete when all are true:

- `pcid-spawner` and `redox-driver-sys` no longer drift semantically,
- `xhcid` consumes USB device quirks at runtime,
- at least one real C driver consumes linux-kpi quirks,
- docs distinguish clearly between reporting, infrastructure, and true runtime behavior,
- DMI TOML entries are either runtime-applied or removed from shipped config.
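
For the `xhcid` exit criterion, the Task 3.1 flags could steer enumeration along these lines. This is a sketch, not the `xhcid` implementation: the bit values, `EnumPlan`, and `plan_enumeration` are hypothetical, the reading of `BAD_DESCRIPTOR` as "tolerate descriptor errors" is an assumption, and in-tree code would obtain the bitmask from `lookup_usb_quirks()` rather than take it as a parameter.

```rust
/// Hypothetical bit assignments for the first five USB device quirk flags.
pub const NO_SET_CONFIG: u64 = 1 << 0;
pub const NEED_RESET: u64 = 1 << 1;
pub const NO_LPM: u64 = 1 << 2;
pub const NO_U1U2: u64 = 1 << 3;
pub const BAD_DESCRIPTOR: u64 = 1 << 4;

/// Decisions taken once the device descriptor has been read.
#[derive(Debug, PartialEq, Eq)]
pub struct EnumPlan {
    pub send_set_configuration: bool,
    pub reset_after_probe: bool,
    pub enable_lpm: bool,
    pub enable_u1_u2: bool,
    pub tolerate_descriptor_errors: bool,
}

/// Translate a device-level quirk bitmask into concrete enumeration branches.
pub fn plan_enumeration(quirks: u64, is_usb3: bool) -> EnumPlan {
    let has = |flag: u64| (quirks & flag) != 0;
    EnumPlan {
        send_set_configuration: !has(NO_SET_CONFIG),
        reset_after_probe: has(NEED_RESET),
        enable_lpm: !has(NO_LPM),
        // U1/U2 link states only exist on USB3 links.
        enable_u1_u2: is_usb3 && !has(NO_U1U2),
        tolerate_descriptor_errors: has(BAD_DESCRIPTOR),
    }
}
```

Because the plan is a plain value, the same function could back both the runtime path and a tooling view, which is one way to keep `lsusb` output and driver behavior in agreement.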