Hardware Quirks Improvement Plan
Purpose
This plan replaces vague “quirks support” follow-up work with a concrete path to:
- keep quirks data and reporting honest,
- integrate quirks into real runtime driver behavior,
- reduce duplicated quirk logic,
- leave DMI and USB device quirks in a maintainable state.
Current status snapshot
Completed from this plan:
- runtime DMI TOML loading in redox-driver-sys,
- subsystem-gated PCI TOML matching in both the canonical path and pcid-spawner,
- shipped DMI TOML overrides in the brokered pcid-spawner env-var path,
- direct canonical redox-driver-sys quirk lookup from pcid-spawner instead of a separate in-tree PCI quirk engine,
- real USB device quirk consumption in xhcid,
- first real linux-kpi quirk consumption in the Red Bear amdgpu path,
- canonical GPU quirk policy moved to the Rust driver boundary in redox-drm, so Intel and AMD now consume one shared quirk source for init-time policy,
- PCI quirk extraction upgraded from handler-name guessing to explicit handler-body evidence in local/scripts/extract-linux-quirks.py.
Still open after this implementation wave:
- document the provenance of existing AMD need_firmware entries in quirks.d/10-gpu.toml,
- keep AMD device-specific GPU quirk growth review-gated on Linux-backed evidence,
- keep Intel GPU quirk expansion deferred until Red Bear has a real Intel-side firmware/runtime policy surface that can honestly consume additional flags.
Current naming/source split:
- PCI vendor/device names now come from the shipped pciids database (/usr/share/misc/pci.ids).
- PCI/USB/storage quirk flags still come from Red Bear’s canonical quirk path: compiled tables, shipped TOML files, and conservative Linux-source extraction where applicable.
- The extract-linux-quirks.py script remains a quirk-mining tool, not the source of human-readable PCI device names.
The runtime-behavior milestone from this plan is now implemented. The remaining work is maintenance, validation depth, and future refinement rather than missing quirks behavior for the shipped paths.
This plan is based on the current in-tree state of:
- redox-driver-sys as the canonical quirks library,
- pcid-spawner as an upstream-owned PCI launch broker that now brokers canonical quirks,
- redox-drm, xhcid, and the amdgpu Redox glue/runtime path as real runtime PCI quirk consumers,
- lspci, lsusb, and redbear-info as reporting surfaces.
Reassessment Summary
What is real today
- redox-driver-sys owns the canonical PCI/USB quirk flag definitions and lookup helpers.
- redox-drm consumes PCI quirks for interrupt fallback and DISABLE_ACCEL.
- xhcid consumes PCI controller quirks via PCI_QUIRK_FLAGS for IRQ mode selection and reset delay.
- linux-kpi exposes pci_get_quirk_flags() / pci_has_quirk() for C drivers, and amdgpu now consumes them in its Redox init path.
- lspci and lsusb surface active PCI/USB quirk flags for discovered devices.
- redbear-info --quirks reports configured TOML entries and DMI rule counts.
What is still weak
- USB quirks now have a first real runtime consumer in xhcid, but broader USB-driver adoption is still missing.
- The linux-kpi bridge now has a first real in-tree C consumer: amdgpu uses it for quirk-aware IRQ expectation logging. Broader C-driver adoption is still missing.
- pcid-spawner still synthesizes a partial PciDeviceInfo instead of reusing a richer canonical PCI object, because it operates as an upstream-owned broker with a narrow interface.
What should not be “fixed” in the wrong layer
- firmware-loader should stay a generic scheme service. NEED_FIRMWARE belongs in device driver policy, not in the firmware scheme daemon.
- redbear-info should describe configured and observable state; it should not pretend to prove runtime quirk application.
Target Architecture
Upstream-preference policy
When upstream Redox already provides the same functionality, the upstream path wins by default
unless the Red Bear-local implementation is materially better. For quirks and driver support,
this means the canonical path should converge on redox-driver-sys instead of preserving
lower-quality duplicate quirk engines as a steady state.
Canonical rule
redox-driver-sys remains the authoritative quirks model:
- flag definitions,
- compiled-in tables,
- TOML parsing semantics,
- DMI matching behavior.
All other code should either:
- call the canonical lookup directly, or
- receive lookup results from a single broker that is guaranteed to use the same semantics.
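The canonical rule can be illustrated with a minimal sketch of a single lookup function that every consumer shares. The flag bit positions, the PciId and QuirkEntry types, the lookup_pci_quirks name, and the sample table contents are all assumptions made for illustration; the real definitions live in redox-driver-sys and may differ.

```rust
// Hypothetical flag bits; the real values are defined in redox-driver-sys.
pub const NO_MSI: u64 = 1 << 0;
pub const NO_MSIX: u64 = 1 << 1;
pub const NEED_FIRMWARE: u64 = 1 << 2;

/// Identity of a PCI function as seen by the lookup (illustrative subset).
#[derive(Clone, Copy, Debug)]
pub struct PciId {
    pub vendor: u16,
    pub device: u16,
    pub subsystem_vendor: u16,
    pub subsystem_device: u16,
}

/// One quirk table entry. `None` in a subsystem field means "match any".
pub struct QuirkEntry {
    pub vendor: u16,
    pub device: u16,
    pub subsystem_vendor: Option<u16>,
    pub subsystem_device: Option<u16>,
    pub flags: u64,
}

impl QuirkEntry {
    fn matches(&self, id: &PciId) -> bool {
        self.vendor == id.vendor
            && self.device == id.device
            && self.subsystem_vendor.map_or(true, |v| v == id.subsystem_vendor)
            && self.subsystem_device.map_or(true, |d| d == id.subsystem_device)
    }
}

/// OR together the flags of every entry that matches the device.
pub fn lookup_pci_quirks(table: &[QuirkEntry], id: &PciId) -> u64 {
    table
        .iter()
        .filter(|entry| entry.matches(id))
        .fold(0, |acc, entry| acc | entry.flags)
}

/// Made-up sample data: one ungated entry and one subsystem-gated entry.
pub fn sample_table() -> Vec<QuirkEntry> {
    vec![
        QuirkEntry {
            vendor: 0x1234,
            device: 0x0001,
            subsystem_vendor: None,
            subsystem_device: None,
            flags: NO_MSI,
        },
        QuirkEntry {
            vendor: 0x1234,
            device: 0x0001,
            subsystem_vendor: Some(0xabcd),
            subsystem_device: Some(0x0002),
            flags: NEED_FIRMWARE,
        },
    ]
}
```

Because both the direct callers and the broker would go through the same function and the same table, subsystem gating cannot drift between them in this model.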
Driver integration rule
- Rust PCI drivers using redox-driver-sys should call info.quirks() directly.
- C drivers using linux-kpi should call pci_has_quirk() / pci_get_quirk_flags() directly in probe/init paths.
- Upstream base drivers that cannot depend on redox-driver-sys may continue using brokered quirk bits from pcid-spawner, but only if that broker is made semantically identical to the canonical library.
- USB device quirks should be consumed inside xhcid device enumeration/configuration logic, not only in tooling.
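As a rough illustration of what "consuming" a PCI quirk in a probe/init path means, the sketch below picks an interrupt mode from a quirk bitmask. The bit positions, the IrqCaps struct, and the select_irq_mode function are hypothetical; a real Rust driver would obtain the bitmask from info.quirks() and a brokered driver from the value handed over by pcid-spawner.

```rust
// Hypothetical flag bits; the real values are defined in redox-driver-sys.
pub const NO_MSI: u64 = 1 << 0;
pub const NO_MSIX: u64 = 1 << 1;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum IrqMode {
    MsiX,
    Msi,
    Legacy,
}

/// Interrupt capabilities advertised by the device (illustrative).
pub struct IrqCaps {
    pub msix: bool,
    pub msi: bool,
}

/// Prefer MSI-X, then MSI, then legacy INTx, skipping any mode a quirk forbids.
pub fn select_irq_mode(caps: &IrqCaps, quirks: u64) -> IrqMode {
    if caps.msix && (quirks & NO_MSIX) == 0 {
        IrqMode::MsiX
    } else if caps.msi && (quirks & NO_MSI) == 0 {
        IrqMode::Msi
    } else {
        IrqMode::Legacy
    }
}
```

The point of the sketch is that the flag changes a concrete branch in the driver rather than only appearing in tool output.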
Concrete Work Plan
Wave 1 — Cleanup and truthfulness
Task 1.1: Keep docs and reporting surfaces honest
Scope:
- local/docs/QUIRKS-SYSTEM.md
- local/recipes/system/redbear-info/source/src/main.rs
- related AGENTS references if needed
Goals:
- separate reporting surfaces from real runtime consumers,
- remove claims that imply driver integration where only tooling exists,
- keep “not yet implemented” items explicit.
QA:
- cargo test in local/recipes/system/redbear-info/source
- review redbear-info --help text and --quirks output strings
Task 1.2: Remove stale equivalence claims from extraction/documentation
Scope:
- local/scripts/extract-linux-quirks.py
- local/docs/QUIRKS-SYSTEM.md
Goals:
- avoid mapping Linux flags to incorrect Red Bear flags,
- clearly mark the supported explicit PCI extraction patterns and the limits of unsupported handlers.
QA:
- run the script on a small synthetic USB/PCI input sample,
- confirm output omits unsupported PCI flag mappings instead of inventing equivalents.
Current state:
- local/scripts/extract-linux-quirks.py no longer guesses PCI quirks from handler names.
- PCI extraction now maps only explicit handler-body evidence for supported PCI_DEV_FLAGS_* assignments plus pci_d3cold_disable(...).
- Running the upgraded extractor on Linux 7.0 drivers/pci/quirks.c currently yields only a very small high-confidence PCI subset and no directly usable modern Intel/AMD DRM GPU entries.
- This is intentional: false negatives are preferred over wrong GPU quirk claims.
- The existing AMD need_firmware entries in quirks.d/10-gpu.toml are manually reviewed policy entries, not extractor-produced Linux facts. Future extraction runs will not refresh those flags automatically.
- Intel firmware classes should be treated explicitly: DMC for display power management, GuC for scheduling/power, HuC for media offload, and GSC for newer authentication flows.
- Red Bear now has a bounded Intel DMC startup manifest/preload path for the first supported Intel device families, but Intel need_firmware must still stay out of the canonical GPU quirk set until the broader Intel runtime policy surface is real and validated.
- AMD device-specific GPU quirk growth remains review-gated on explicit Linux-backed evidence.
- Intel GPU quirk expansion is deferred until Red Bear has a real Intel-side firmware/runtime policy surface that can honestly consume additional flags.
Wave 2 — Unify PCI quirk semantics
Task 2.1: Eliminate semantic drift between pcid-spawner and redox-driver-sys
Constraint:
- pcid-spawner is upstream-owned base code, so any convergence work must be implemented as upstream-base changes carried by Red Bear patching until upstream absorbs them.
Best approach:
- make pcid-spawner consume generated/shared quirk data instead of hand-maintained duplicated tables and flag maps.
Preferred implementation options, in order:
- Shared generated data module used by both redox-driver-sys and pcid-spawner.
- Protocol extension where a single canonical broker calculates quirk bits and hands them to drivers.
- Keep duplication only as a short-term fallback if generation is not yet practical.
Do not continue manually editing two separate PCI quirk engines long-term.
Success criteria:
- one authoritative source for compiled PCI quirk entries and flag name mapping,
- subsystem matching behavior aligned,
- explicit decision on whether DMI is brokered by pcid-spawner or left to driver-local lookup.
QA:
- compare quirk outputs for the same synthetic PCI info through both paths,
- verify PCI_QUIRK_FLAGS emitted by pcid-spawner matches canonical lookup for representative devices.
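A parity check of this kind could be sketched as follows. The wire format is an assumption: this document does not specify how PCI_QUIRK_FLAGS is encoded, so the sketch treats it as a 0x-prefixed hexadecimal string, and the helper names encode_quirk_flags, decode_quirk_flags, and paths_agree are hypothetical.

```rust
/// Encode a quirk bitmask the way the broker is ASSUMED to export it
/// (0x-prefixed lowercase hex). The real PCI_QUIRK_FLAGS format may differ.
pub fn encode_quirk_flags(flags: u64) -> String {
    format!("{:#x}", flags)
}

/// Decode the assumed env-var format; returns None on malformed input
/// instead of silently treating it as "no quirks".
pub fn decode_quirk_flags(value: &str) -> Option<u64> {
    let value = value.trim();
    let digits = value
        .strip_prefix("0x")
        .or_else(|| value.strip_prefix("0X"))?;
    u64::from_str_radix(digits, 16).ok()
}

/// True when the brokered value decodes to exactly the canonical bitmask.
pub fn paths_agree(canonical: u64, brokered_env: &str) -> bool {
    decode_quirk_flags(brokered_env) == Some(canonical)
}
```

A QA run would feed the same synthetic PCI info through the canonical lookup and through the broker, then assert paths_agree for each representative device.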
Task 2.2: Decide DMI ownership clearly
Decision needed:
- either pcid-spawner becomes DMI-aware and brokers the final PCI quirk bitmask,
- or pcid-spawner remains PCI/TOML-only and DMI stays driver-local in redox-driver-sys consumers.
Recommendation:
- near term: document the split clearly,
- medium term: move toward one brokered result for upstream base drivers.
QA:
- one design note added to the docs explaining the chosen ownership model.
Wave 3 — Real driver integration
Task 3.1: Integrate USB device quirks in xhcid
Best integration points:
- after device descriptor read,
- before SetConfiguration,
- before enabling LPM/U1/U2 or USB3-specific behavior,
- after reset paths where extra delay or reset-after-probe is needed.
Minimum runtime behaviors to wire first:
- NO_SET_CONFIG
- NEED_RESET
- NO_LPM
- NO_U1U2
- BAD_DESCRIPTOR
Success criteria:
- xhcid calls lookup_usb_quirks() for enumerated devices,
- tooling and runtime logic agree on the same device-level quirks.
QA:
- unit/integration tests for selector logic where possible,
- manual logging proof that a known vendor/product entry triggers the expected path.
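The integration points above can be sketched as a pure selector function, which is also the kind of logic that is easy to unit test. The flag bit positions, the Step enum, and plan_enumeration are hypothetical, and the interpretation of BAD_DESCRIPTOR as "read descriptors tolerantly" is an assumption, not documented xhcid behavior.

```rust
// Hypothetical flag bits; the real values are defined in redox-driver-sys.
pub const NO_SET_CONFIG: u32 = 1 << 0;
pub const NEED_RESET: u32 = 1 << 1;
pub const NO_LPM: u32 = 1 << 2;
pub const NO_U1U2: u32 = 1 << 3;
pub const BAD_DESCRIPTOR: u32 = 1 << 4;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Step {
    ReadDescriptor,
    /// Assumed meaning of BAD_DESCRIPTOR: accept short or odd descriptors.
    ReadDescriptorTolerant,
    ResetAfterProbe,
    SetConfiguration,
    EnableLpm,
    EnableU1U2,
}

/// Turn a device's quirk bitmask into the ordered steps enumeration will run.
pub fn plan_enumeration(quirks: u32, is_usb3: bool) -> Vec<Step> {
    let has = |flag: u32| (quirks & flag) != 0;
    let mut steps = Vec::new();

    steps.push(if has(BAD_DESCRIPTOR) {
        Step::ReadDescriptorTolerant
    } else {
        Step::ReadDescriptor
    });
    if has(NEED_RESET) {
        steps.push(Step::ResetAfterProbe);
    }
    if !has(NO_SET_CONFIG) {
        steps.push(Step::SetConfiguration);
    }
    if !has(NO_LPM) {
        steps.push(Step::EnableLpm);
    }
    if is_usb3 && !has(NO_U1U2) {
        steps.push(Step::EnableU1U2);
    }
    steps
}
```

Keeping the decision in a side-effect-free function lets the selector be tested without hardware, while the manual logging proof covers the real enumeration path.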
Task 3.2: Consume linux-kpi quirks in amdgpu
Best integration points:
- probe path,
- IRQ mode selection,
- firmware gating,
- memory/power-management setup.
First flags to consume:
- NO_MSI
- NO_MSIX
- NEED_FIRMWARE
- NO_ASPM
- NEED_IOMMU
Success criteria:
- at least one real C driver uses pci_has_quirk() in production code,
- runtime logs show quirk-informed decision making.
Current state:
- local/recipes/gpu/amdgpu/source/amdgpu_redox_main.c now queries linux-kpi PCI quirks in the real Redox runtime path,
- logs now show the active quirk bitmask plus the implied IRQ fallback policy,
- firmware policy has been pulled back to the Rust-side driver boundary so the C backend does not re-enforce NEED_FIRMWARE independently.
QA:
- grep shows real in-tree call sites in amdgpu,
- build passes for linux-kpi + amdgpu recipe path.
Task 3.3: Keep firmware policy in drivers, not firmware-loader
Action:
- when a driver has NEED_FIRMWARE, the driver should gate initialization until the firmware load succeeds.
- firmware-loader remains a transport/provider only.
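The split between policy and transport might look like the following sketch. The FirmwareProvider trait, StaticProvider, InitError, and gate_on_firmware are hypothetical stand-ins, and the flag bit position is made up; the provider here only returns bytes and never inspects quirk flags, which is the property the action above asks for.

```rust
// Hypothetical flag bit; the real value is defined in redox-driver-sys.
pub const NEED_FIRMWARE: u64 = 1 << 2;

#[derive(Debug, PartialEq, Eq)]
pub enum InitError {
    /// Carries the candidate blob names that could not be loaded.
    FirmwareRequired(String),
}

/// Transport only: hands back bytes for a name, knows nothing about quirks.
pub trait FirmwareProvider {
    fn load(&self, name: &str) -> Option<Vec<u8>>;
}

/// In-memory provider used for illustration and tests.
pub struct StaticProvider(pub Vec<(String, Vec<u8>)>);

impl FirmwareProvider for StaticProvider {
    fn load(&self, name: &str) -> Option<Vec<u8>> {
        self.0
            .iter()
            .find(|(known, _)| known.as_str() == name)
            .map(|(_, blob)| blob.clone())
    }
}

/// Driver-side policy: refuse to continue init when firmware is required
/// by quirk policy but none of the candidate blobs can be loaded.
pub fn gate_on_firmware<P: FirmwareProvider>(
    quirks: u64,
    provider: &P,
    candidates: &[&str],
) -> Result<Option<Vec<u8>>, InitError> {
    if (quirks & NEED_FIRMWARE) == 0 {
        return Ok(None); // Firmware is not quirk-required; init may proceed.
    }
    for name in candidates {
        if let Some(blob) = provider.load(name) {
            return Ok(Some(blob));
        }
    }
    Err(InitError::FirmwareRequired(candidates.join(", ")))
}
```

The error value lists the missing candidates so the driver can report them, while the decision to abort stays entirely on the driver side.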
Success criteria:
- docs stop implying that firmware-loader interprets quirk flags,
- driver init paths own the policy decision.
QA:
- driver code path shows firmware gating tied to quirks or explicit device rules.
Current state:
- local/recipes/gpu/redox-drm/source/src/drivers/intel/mod.rs now reads the canonical info.quirks() policy during init, rejects DISABLE_ACCEL, and explicitly warns if NEED_FIRMWARE appears on Intel instead of silently ignoring quirk policy.
- local/recipes/gpu/redox-drm/source/src/main.rs now makes firmware preload expectations explicit at the Rust-side driver boundary, reports whether preload is quirk-required, and summarizes missing candidate blobs when preload cannot satisfy the current policy.
- local/recipes/gpu/amdgpu/source/amdgpu_redox_main.c still consumes linux-kpi quirks for runtime expectations, but it no longer owns the final firmware gating decision.
Wave 4 — DMI completion
Task 4.1: DMI TOML runtime loading
Scope:
- toml_loader.rs parses [[dmi_system_quirk]],
- matching uses live DMI info served by acpid at /scheme/acpi/dmi,
- resulting PCI quirk overrides flow through the canonical redox-driver-sys DMI path.
Success criteria:
- 50-system.toml entries are no longer config-only,
- runtime DMI TOML behavior is testable and documented through the live acpid DMI scheme.
QA:
- tests for TOML parsing,
- one mock DMI input path proving a TOML DMI rule applies flags.
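A mock-DMI test of this kind could be built around matching logic like the sketch below. The DmiInfo and DmiRule field names, exact-string matching, and the apply_dmi_rules function are assumptions for illustration; the actual [[dmi_system_quirk]] schema and matching rules are defined by toml_loader.rs and may differ.

```rust
/// DMI strings as they might be read from the live DMI source (illustrative).
#[derive(Default, Clone, Debug)]
pub struct DmiInfo {
    pub sys_vendor: String,
    pub product_name: String,
    pub board_name: String,
}

/// One parsed DMI rule. Field names are assumed, not the shipped schema.
#[derive(Default)]
pub struct DmiRule {
    pub sys_vendor: Option<String>,
    pub product_name: Option<String>,
    pub board_name: Option<String>,
    pub flags: u64,
}

fn field_matches(want: &Option<String>, have: &str) -> bool {
    // An absent field places no constraint on the system.
    want.as_deref().map_or(true, |w| w == have)
}

impl DmiRule {
    pub fn matches(&self, dmi: &DmiInfo) -> bool {
        // A rule with no match fields would apply to every machine; reject it.
        let constrained = self.sys_vendor.is_some()
            || self.product_name.is_some()
            || self.board_name.is_some();
        constrained
            && field_matches(&self.sys_vendor, &dmi.sys_vendor)
            && field_matches(&self.product_name, &dmi.product_name)
            && field_matches(&self.board_name, &dmi.board_name)
    }
}

/// OR the flags of every matching rule into the device's base quirk bitmask.
pub fn apply_dmi_rules(base: u64, rules: &[DmiRule], dmi: &DmiInfo) -> u64 {
    rules
        .iter()
        .filter(|rule| rule.matches(dmi))
        .fold(base, |acc, rule| acc | rule.flags)
}
```

Taking DmiInfo as a plain value means a test can construct mock DMI data directly instead of reading it from the running system.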
Task 4.2: ACPI blacklist/override layer
Current state:
- acpid now supports narrow [[acpi_table_quirk]] skip rules, optionally gated by the same DMI-style match.* fields used elsewhere.
- The implementation is intentionally limited to table suppression during ACPI table load; it is not a broad AML patching or firmware replacement framework.
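The narrow scope of table suppression can be pictured with a small filter. Everything in this sketch is hypothetical: the TableSkipRule type, gating on a single sys_vendor string, and the tables_to_load function are illustrative only and are not the acpid implementation.

```rust
/// A skip rule for one ACPI table signature, optionally gated on the
/// system vendor string (a simplified stand-in for the match.* fields).
pub struct TableSkipRule {
    pub signature: [u8; 4],
    pub sys_vendor: Option<String>,
}

impl TableSkipRule {
    fn applies(&self, signature: &[u8; 4], sys_vendor: &str) -> bool {
        self.signature == *signature
            && self.sys_vendor.as_deref().map_or(true, |v| v == sys_vendor)
    }
}

/// Return the discovered tables that should still be loaded. Tables are
/// only ever dropped here; nothing is patched or replaced.
pub fn tables_to_load(
    found: &[[u8; 4]],
    rules: &[TableSkipRule],
    sys_vendor: &str,
) -> Vec<[u8; 4]> {
    found
        .iter()
        .copied()
        .filter(|sig| !rules.iter().any(|rule| rule.applies(sig, sys_vendor)))
        .collect()
}
```

The placeholder signatures used when exercising this sketch (such as AAAA and BBBB) are not real ACPI tables.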
Suggested Immediate Deliverables
If work resumes right away, the next concrete implementation sequence should be:
- clean remaining stale quirks docs/reporting text,
- write a design note for canonical PCI quirk ownership,
- integrate lookup_usb_quirks() into xhcid enumeration/configuration,
- add first real pci_has_quirk() use in amdgpu,
- validate and extend shipped DMI TOML coverage as needed.
Exit Criteria For The Next Quirks Milestone
The next milestone is complete when all are true:
- pcid-spawner and redox-driver-sys no longer drift semantically,
- xhcid consumes USB device quirks at runtime,
- at least one real C driver consumes linux-kpi quirks,
- docs distinguish clearly between reporting, infrastructure, and true runtime behavior,
- DMI TOML entries are either runtime-applied or removed from shipped config.