Refresh project documentation

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-17 00:05:20 +01:00
parent 2c7659fe3a
commit 6689f751d9
23 changed files with 2428 additions and 1720 deletions
+3
@@ -1,5 +1,8 @@
# ACPI Fixes — P0 Phase Tracker
> **Numbering note:** "P0" refers to the historical hardware-enablement phase (ACPI boot),
> not the v2.0 desktop plan phases in `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md`.
Status of ACPI fixes for AMD bare metal boot. Cross-referenced with
`HARDWARE.md` crash reports and kernel/acpid source TODOs.
+38 -39
@@ -1,9 +1,16 @@
# AMD-FIRST REDOX OS — MASTER INTEGRATION PLAN
# AMD-FIRST REDOX OS — AMD-SPECIFIC INTEGRATION PLAN
> **Status note (2026-04-14):** This document remains the detailed AMD-focused hardware roadmap,
> but it is no longer the repository-wide platform-priority policy. Red Bear OS should now treat
> AMD and Intel machines as equal-priority targets. Read this file as the deeper AMD-specific plan,
> not as a statement that Intel is secondary going forward.
> **Status note (2026-04-16):** This document remains the detailed AMD-focused hardware roadmap.
> It is no longer the canonical desktop path plan — see
> `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` for that role. This file is now scoped to AMD-specific
> hardware integration detail only.
>
> The P0–P6 section headings below refer to the historical hardware-enablement sequence, not the
> v2.0 desktop plan phases (Phase 1–5). Where numbering conflicts with the v2.0 plan, the v2.0 plan
> takes precedence.
>
> Red Bear OS now treats AMD and Intel machines as equal-priority targets. Read this file as the
> deeper AMD-specific technical plan, not as a platform-priority statement.
**Target**: AMD64 bare metal machine with AMD GPU (RDNA2/RDNA3), within an overall Red Bear OS
hardware policy that treats AMD and Intel machines as equal-priority targets.
@@ -186,7 +193,6 @@ local/recipes/gpu/redox-drm/
│ │ ├── encoder.rs # Encoder management
│ │ └── plane.rs # Primary/cursor planes
│ ├── gem.rs # GEM buffer objects
│ ├── dmabuf.rs # DMA-BUF export/import
│ └── drivers/
│ ├── mod.rs # trait GpuDriver
│ └── amd/
@@ -257,7 +263,7 @@ ONLY the display/modesetting portion first, using linux-kpi headers.
| GTT manager | ✅ | `local/recipes/gpu/redox-drm/source/src/drivers/amd/gtt.rs` |
| Ring buffer | ✅ | `local/recipes/gpu/redox-drm/source/src/drivers/amd/ring.rs` |
| GEM buffer mgmt | ✅ | `local/recipes/gpu/redox-drm/source/src/gem.rs` |
| DMA-BUF | ✅ | `local/recipes/gpu/redox-drm/source/src/dmabuf.rs` |
| DMA-BUF | ✅ | `local/recipes/gpu/redox-drm/source/src/scheme.rs` (PRIME export/import via opaque tokens) |
| Intel driver | ✅ | `local/recipes/gpu/redox-drm/source/src/drivers/intel/mod.rs` + `display.rs` |
### Build Verification
@@ -327,7 +333,9 @@ smithay/src/backend/
### P4-2: libdrm AMD Backend
Currently libdrm has `-Damdgpu=disabled`. Enable it once redox-drm exists.
libdrm now builds with `-Damdgpu=enabled` and `-Dintel=enabled`. The amdgpu and Intel
backends are present in the built sysroot. Runtime hardware validation through real GPU
hardware is still pending.
**Patches**: `local/patches/libdrm/`
@@ -335,9 +343,10 @@ Currently libdrm has `-Damdgpu=disabled`. Enable it once redox-drm exists.
## PHASE 5: AMD GPU ACCELERATION (16-24 weeks, parallel with P4)
> Note: this AMD-first Phase 5 is a hardware-driver track. It is **not** the same thing as the
> canonical public `docs/07` Phase 5, which is about wired networking and desktop/session
> integration.
> Note: this AMD-first Phase 5 is a hardware-driver track. In the v2.0 desktop plan
> (`local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md`), hardware GPU enablement is also Phase 5, so the
> numbering happens to align. The P0–P6 labels in this document refer to the historical
> hardware-enablement sequence, not the current desktop-plan phases.
### P5-1: Full amdgpu Port via LinuxKPI
@@ -397,39 +406,29 @@ P0 (ACPI boot)
---
## WHAT NEEDS TO BE DOCUMENTED
## DOCUMENT STATUS
### New Documents to Create
> **Note (2026-04-16):** Most documents and scripts listed below have been created since this plan
> was originally written. This section is retained as a checklist rather than a to-do list.
| Document | Location | Purpose |
|----------|----------|---------|
| This file | `local/docs/AMD-FIRST-INTEGRATION.md` | Master plan |
| ACPI fix guide | `local/docs/ACPI-FIXES.md` | What ACPI functions are missing |
| Firmware loading spec | `local/docs/FIRMWARE-LOADING.md` | How AMD firmware loading works |
| AMD GPU register notes | `local/docs/AMD-GPU-NOTES.md` | Hardware programming notes |
| Bare metal testing log | `local/docs/BAREMETAL-LOG.md` | Hardware test results |
| Build guide (AMD) | `local/docs/BUILD-GUIDE-AMD.md` | How to build for AMD hardware |
| Overlay usage guide | `local/AGENTS.md` | How to use local/ overlay |
### Documents — Creation Status
### Existing Documents to Update
| Document | Location | Status |
|----------|----------|--------|
| This file | `local/docs/AMD-FIRST-INTEGRATION.md` | ✅ Created |
| ACPI fix guide | `local/docs/ACPI-FIXES.md` | ✅ Created |
| Bare metal testing log | `local/docs/BAREMETAL-LOG.md` | ✅ Created |
| Overlay usage guide | `local/AGENTS.md` | ✅ Created |
| Desktop path plan | `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` | ✅ Created |
| Document | Change |
|----------|--------|
| `AGENTS.md` (root) | Keep equal-priority AMD/Intel hardware policy visible; keep local/ overlay refs |
| `recipes/core/AGENTS.md` | Add AMD boot requirements, IOMMU note |
| `recipes/wip/AGENTS.md` | Add AMD GPU driver WIP section |
| `docs/AGENTS.md` | Add reference to local/docs/ |
| `docs/04-LINUX-DRIVER-COMPAT.md` | Add AMD-specific porting notes |
| `docs/02-GAP-ANALYSIS.md` | Add P0 bare metal boot layer |
### Config Files and Scripts — Creation Status
### Config Files to Create
| File | Purpose |
|------|---------|
| `local/config/my-amd-desktop.toml` | AMD desktop build config |
| `local/scripts/fetch-firmware.sh` | Download AMD firmware blobs |
| `local/scripts/build-amd.sh` | Build wrapper for AMD target |
| `local/scripts/test-baremetal.sh` | Burn + test on real hardware |
| File | Status |
|------|--------|
| `local/scripts/fetch-firmware.sh` | ✅ Created |
| `local/scripts/build-redbear.sh` | ✅ Created (replaces build-amd.sh) |
| `local/scripts/test-baremetal.sh` | ✅ Created |
| `config/redbear-desktop.toml` | ✅ Created (replaces my-amd-desktop.toml) |
---
+1 -1
@@ -74,7 +74,7 @@ one more driver.” The feasible first target is a deliberately small subsystem
| Area | State | Notes |
|---|---|---|
| Bluetooth controller support | **experimental bounded slice** | `redbear-btusb` provides explicit-startup USB transport probing/status plus a daemon path that is now exercised repeatedly in QEMU for the bounded Battery Level slice |
| Bluetooth controller support | **experimental, USB discovery real** | `redbear-btusb` now probes `/scheme/usb/` for real Bluetooth class devices (USB class 0xE0/0x01/0x01 with vendor-ID fallback), parses descriptor files, assigns `hciN` names deterministically, and writes real adapter metadata into the status file. Daemon re-probes periodically. 8 tests pass including mock-filesystem USB discovery tests. |
| Bluetooth host stack | **experimental bounded slice** | `redbear-btctl` provides a BLE-first CLI/scheme surface with stub-backed scan plus bounded connect/disconnect control for stored bond IDs; the packaged checker now reruns that slice repeatedly and covers daemon-restart honesty in QEMU |
| Pairing / bond database | **experimental bounded slice** | `redbear-btctl` now persists conservative stub bond records under `/var/lib/bluetooth/<adapter>/bonds/`; connect/disconnect control targets those records, and the checker now verifies cleanup honesty, but this is still storage/control plumbing only, not real pairing or generic reconnect validation |
| Desktop Bluetooth API | **missing** | D-Bus exists generally, but no Bluetooth API/service exists |
File diff suppressed because it is too large
+73 -97
@@ -1,146 +1,122 @@
# Red Bear OS Desktop Stack — Current Status
**Last updated:** 2026-04-16
**Canonical plan:** `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` (v2.0)
## Purpose
This document is the **current build/runtime truth summary** for the Red Bear desktop stack.
It is intentionally narrower than the historical Wayland and KDE roadmap docs. Its job is to answer:
- what the current desktop stack actually builds,
- what the tracked desktop profiles currently expose,
Its job is to answer:
- what the desktop stack actually builds,
- what the tracked profiles currently expose,
- what is only build-visible,
- what is runtime-proven,
- and what still blocks a trustworthy Wayland/KDE session claim.
Use this document as the current-state summary. Use `docs/03-WAYLAND-ON-REDOX.md` and
`docs/05-KDE-PLASMA-ON-REDOX.md` mainly as design history, rationale, and deeper porting notes.
For the execution plan (phases, timelines, acceptance criteria), see the canonical plan above.
For historical design rationale, see `docs/03-WAYLAND-ON-REDOX.md` and `docs/05-KDE-PLASMA-ON-REDOX.md`.
## Current State Summary
## Where We Are in the Plan
The Red Bear desktop stack is no longer blocked on basic Qt/Wayland package availability.
The canonical desktop plan uses a three-track model:
The repo currently proves:
- **Track A (Phase 1–2):** Runtime Substrate → Software Compositor (**Phase 1 is the current target**)
- **Track B (Phase 3–4):** KWin Session → KDE Plasma (**blocked on Track A**)
- **Track C (Phase 5):** Hardware GPU (**can start after Phase 1**)
- `libwayland` builds successfully against the current relibc/Red Bear compatibility surface
- Qt6 core modules build (`qtbase`, `qtdeclarative`, `qtsvg`, `qtwayland`)
- the current relibc overlay and its fresh-source reapply workflow are strong enough to support the
rebuilt Qt/Wayland stack
- D-Bus builds and is wired into desktop-facing profiles
- `seatd` builds and is wired into the KDE-facing runtime profile
- the `redbear-wayland`, `redbear-full`, and `redbear-kde` profiles exist as real tracked product
surfaces
The repo does **not** yet prove a generally trustworthy desktop runtime.
The main gap is no longer “can we build the packages?” The main gap is “which parts of the desktop
stack are runtime-trusted rather than just build-visible?”
**Current position:** Build-side gates are crossed. Phase 1 (Runtime Substrate Validation) is the
next work target. The repo has not yet started systematic runtime validation.
## Status Matrix
| Area | Current state | What that means |
| Area | Evidence class | Detail |
|---|---|---|
| `libwayland` | **builds** | relibc/Wayland-facing compatibility is materially better than before |
| Qt6 core stack | **builds** | `qtbase`, `qtdeclarative`, `qtsvg`, `qtwayland` are in-tree build surfaces |
| KF6 frameworks | **mixed but strong build progress** | many frameworks build; some higher-level pieces still rely on bounded or reduced recipes |
| KWin / Plasma session | **experimental / incomplete runtime** | recipe/config wiring exists, but runtime trust still trails build success |
| Mesa / hardware graphics path | **partial** | software path exists; current QEMU validation still shows llvmpipe, and hardware-validated Wayland graphics path still lags |
| Input stack | **build-visible and partly wired** | `evdevd`, `libevdev`, `libinput`, `seatd` are present, but runtime trust is still narrower than full desktop support |
| D-Bus session/system plumbing | **builds / wired into profiles** | present in desktop-facing profiles, but not equal to full desktop integration completeness |
| Qt6 core stack | **builds** | `qtbase` (7 libs + 12 plugins), `qtdeclarative`, `qtsvg`, `qtwayland` |
| KF6 frameworks | **builds** | All 32/32; some higher-level pieces use bounded/reduced recipes (kf6-kio heavy shim, kirigami stub-only) |
| KWin | **experimental** | Recipe exists; 5 features re-enabled; 4 stub deps block honest build; 9 feature switches still disabled |
| plasma-workspace | **experimental** | Recipe exists; stub deps (kf6-knewstuff, kf6-kwallet) unresolved |
| plasma-desktop | **experimental** | Recipe exists; depends on plasma-workspace |
| Mesa EGL+GBM+GLES2 | **builds** | Software path via LLVMpipe proven in QEMU; hardware path not proven |
| libdrm amdgpu | **builds** | Package-level success only |
| Input stack | **builds, enumerates** | evdevd, libevdev, libinput, seatd present; evdevd registers scheme at boot |
| D-Bus | **builds, usable (bounded)** | System bus wired in `redbear-full` |
| DRM/KMS | **builds** | redox-drm scheme daemon; no hardware runtime validation |
| GPU acceleration | **blocked** | PRIME/DMA-BUF ioctls implemented; GPU CS ioctl missing |
| smallvil compositor | **experimental** | Reaches early init in QEMU; no complete session |
| `redbear-wayland` profile | **builds, boots** | Bounded Wayland runtime profile |
| `redbear-full` profile | **builds, boots** | Broader desktop plumbing profile |
| `redbear-kde` profile | **builds** | KDE session-surface profile |
## Profile View
### `redbear-wayland`
Role:
- narrow runtime validation profile for Wayland bring-up
Current truth:
- it is the current first-class profile for a bounded Wayland runtime path
- it should be used for small-scope compositor/runtime validation, not broad desktop claims
- **Role:** Phase 2 Wayland compositor validation target
- **Current truth:** Builds and boots in QEMU; smallvil reaches early init but no complete session
- **Use for:** Compositor/runtime regression testing, not broad desktop claims
### `redbear-full`
Role:
- broader desktop/network/session plumbing profile
Current truth:
- it carries D-Bus and broader desktop integration pieces
- it is stronger than `redbear-wayland` for general integration, but still not the same as a stable
KDE session claim
- **Role:** Broader desktop/network/session plumbing
- **Current truth:** Carries D-Bus and broader integration pieces; VirtIO networking works in QEMU
- **Use for:** Desktop integration testing beyond the narrow Wayland slice
### `redbear-kde`
Role:
- KDE/Plasma session-surface profile
Current truth:
- it carries the KWin/session wiring and the KDE-facing package set
- it should still be described as experimental until runtime evidence catches up with build success
- **Role:** Phase 3–4 KDE/Plasma session bring-up
- **Current truth:** Carries KWin/session wiring and KDE-facing package set; experimental
- **Use for:** KDE session surface testing once Phase 2 completes
## Current Blockers
### 1. Runtime trust still trails build success
### 1. Runtime trust trails build success (Phase 1 gate)
The project now has real build-visible desktop progress, but build success still exceeds runtime
confidence.
The repo has real build-visible desktop progress, but build success exceeds runtime confidence.
Phase 1 exists specifically to close this gap.
That gap is the main thing older docs sometimes blur.
### 2. No complete compositor session (Phase 2 gate)
### 2. Graphics/runtime validation is still thinner than package progress
smallvil reaches early initialization but does not complete a usable Wayland compositor session.
This blocks all desktop session work.
The software-rendered stack is much further along than the hardware-validated stack.
### 3. KWin blocked by stub dependencies (Phase 3 gate)
Current QEMU truth:
Four stub cmake targets must become real builds:
- the tracked `redbear-wayland` test harness still uses `-vga std`
- the live compositor path currently reports `GL Renderer: "llvmpipe (LLVM 21.1.2, 256 bits)"`
- therefore the current QEMU Wayland proof is still a software-rendered runtime slice, not a
hardware-accelerated desktop proof
- QEMU should be treated as a bounded regression/test harness for Wayland/Qt bring-up, not as the
primary proof target for the final hardware-accelerated desktop claim
| Stub | Real library exists? | Path to resolve | Difficulty |
|---|---|---|---|
| `libepoxy-stub` | Yes — `recipes/wip/libs/gnome/libepoxy/` (meson, has redox.patch) | Port real libepoxy; currently needs full X11/GLX stack | Medium |
| `libudev-stub` | Partial — `recipes/wip/services/eudev/` (broken: POSIX headers missing) | Fix eudev compilation; `udev-shim` is a binary not a C library | Medium-Hard |
| `lcms2-stub` | Yes — `recipes/wip/libs/other/liblcms/` (compiled, untested) | Test and integrate real lcms2; depends on libtiff | Low |
| `libdisplay-info-stub` | **No** — not in recipe tree at all | New port from freedesktop.org; full EDID/CTA/DisplayID parser | Hard |
The real hardware-accelerated acceptance target remains the bare-metal/runtime-driver path:
Additionally, two packages need honest builds: kirigami (stub-only), kf6-kio (heavy shim).
- `redox-drm` detects and drives the target GPU family
- Mesa GBM/EGL/GLES2 uses that runtime graphics path
- the compositor and Qt Wayland clients run stably on top of it
- runtime evidence shows the desktop path is using the real accelerated stack rather than a
software fallback
### 4. Hardware acceleration missing GPU CS ioctl (Phase 5 gate)
The desktop stack therefore should not over-claim hardware-ready Wayland/KDE support yet.
### 3. KDE build progress is ahead of session maturity
KDE package and framework progress is real, but the session surface should still be described as an
experimental bring-up target rather than a broadly working desktop.
### 4. Input and seat management are present but not yet a final confidence story
`libinput`, `seatd`, and related runtime pieces matter, but they should still be treated as part of
the runtime-proof gap rather than as already-settled desktop infrastructure.
PRIME/DMA-BUF buffer sharing is implemented at the scheme level, but GPU command submission
does not exist. This blocks hardware-accelerated rendering.
## Canonical Document Roles
Use the desktop-related docs this way:
- `local/docs/DESKTOP-STACK-CURRENT-STATUS.md` — current build/runtime truth summary
- `local/docs/QT6-PORT-STATUS.md` — Qt/KF6 package-level build/status truth
- `docs/03-WAYLAND-ON-REDOX.md` — historical Wayland implementation path + deeper rationale
- `docs/05-KDE-PLASMA-ON-REDOX.md` — historical KDE implementation path + deeper rationale
- `local/docs/PROFILE-MATRIX.md` — profile role and support-language reference
| Document | Role |
|---|---|
| `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` | Canonical desktop path plan (v2.0, Phase 1–5) |
| This document | Current build/runtime truth summary |
| `local/docs/QT6-PORT-STATUS.md` | Qt/KF6/KWin package-level build status |
| `local/docs/AMD-FIRST-INTEGRATION.md` | AMD-specific hardware/driver detail |
| `docs/03-WAYLAND-ON-REDOX.md` | Historical Wayland design rationale |
| `docs/05-KDE-PLASMA-ON-REDOX.md` | Historical KDE design rationale |
| `local/docs/PROFILE-MATRIX.md` | Profile roles and support-language reference |
## Bottom Line
The current Red Bear desktop stack is in a transition phase:
The Red Bear desktop stack has crossed major build-side gates:
- All Qt6 core modules, all 32 KF6 frameworks, Mesa EGL/GBM/GLES2, and D-Bus build
- Three tracked desktop profiles exist and at least boot in QEMU
- relibc compatibility is materially stronger than before
- no longer blocked on basic Qt/Wayland package availability,
- materially stronger on relibc/Wayland-facing compatibility than before,
- but still short of a broad runtime-trusted desktop claim.
That is the current truth this repo should present.
The remaining work is **runtime validation and session assembly**, not more package porting.
Phase 1 (Runtime Substrate Validation) is the immediate next target.
+432
@@ -0,0 +1,432 @@
# Red Bear OS: DMA-BUF Improvement Plan
**Date**: 2026-04-16
**Status**: v1 COMPLETE (Steps 1-6a implemented, Oracle-verified through 8 rounds). Step 6b blocked on GPU command submission. Stale token cleanup verified across all GEM destruction paths.
**Scope**: Cross-process GPU buffer sharing for hardware-accelerated KDE Plasma on Wayland
## Bottom Line
Redox kernel already has the three primitives needed for DMA-BUF-style cross-process buffer
sharing:
1. **`Provider::FmapBorrowed`** + **`Grant::borrow_fmap()`** — kernel mechanism for borrowing
pages from a scheme into another process's address space, mapping the same physical frames
(zero-copy). Source: `kernel/source/src/context/memory.rs:1157`, `memory.rs:1401`.
2. **`sendfd`** syscall — passes file descriptors between processes via scheme IPC. Both
processes hold the same `Arc<LockedFileDescription>`. Source: `kernel/source/src/syscall/fs.rs:415`.
3. **`PhysBorrow`** in `scheme:memory` — maps physical addresses directly into process space
(already used for GPU registers/BARs). Source: `kernel/source/src/scheme/memory.rs`.
No new kernel syscalls or scheme types are needed for v1. The work is entirely in userspace:
redox-drm scheme daemon, libdrm, and Mesa.
## Architecture Principle
**DMA-BUF is a sharing and lifetime contract, not a global allocator.**
Linux `dma_buf` is an exporter/importer contract. The exporter owns allocation and controls
lifetime. The importer gets shared access. Red Bear OS follows the same model:
- **Allocation stays with `redox-drm`** (the exporter). `DmaBuffer::allocate()` in `gem.rs`
already allocates physically-contiguous system RAM.
- **Sharing uses scheme-backed fds + `sendfd`**. No synthetic fd numbers. No global registry.
- **Mapping uses `FmapBorrowed`**. The kernel maps the same physical pages into the importer's
address space — zero-copy.
## Data Flow
```
Process A (GPU client, e.g. Mesa/radeonsi)
1. open("/scheme/drm/card0")
2. DRM_IOCTL_GEM_CREATE → allocate GPU buffer ← EXISTS
3. DRM_IOCTL_PRIME_HANDLE_TO_FD → get opaque export token ← IMPLEMENTED
4. open("/scheme/drm/card0/dmabuf/{token}") → get scheme fd ← IMPLEMENTED
5. sendfd(socket, fd) → pass fd to compositor ← KERNEL EXISTS
Process B (compositor, e.g. KWin)
6. recvfd(socket) → receive the fd ← KERNEL EXISTS
7. DRM_IOCTL_PRIME_FD_TO_HANDLE → import as local GEM ← IMPLEMENTED
8. mmap(fd, size) → kernel uses FmapBorrowed ← KERNEL EXISTS
9. Both processes see same physical pages ← ZERO-COPY
```
Steps 1-2 were already working before this plan. Steps 3, 4, and 7 are the redox-drm changes and
are now implemented. Steps 5, 6, and 8 use existing kernel mechanisms and produce the zero-copy
result in step 9.
## Current State
### What Exists
| Component | Status | Detail |
|-----------|--------|--------|
| GEM_CREATE ioctl | ✅ Working | `DmaBuffer::allocate()` in `gem.rs`, physically contiguous system RAM |
| GEM_CLOSE ioctl | ✅ Working | Ownership tracking, reference counting, safe cleanup |
| GEM_MMAP ioctl | ✅ Working | Returns virtual address for mmap_prep |
| KMS/modesetting ioctls | ✅ Working | 16 KMS ioctls, CRTC/connector/encoder/plane |
| Kernel FmapBorrowed | ✅ Exists | `Provider::FmapBorrowed` at `memory.rs:1157`, `Grant::borrow_fmap()` at `memory.rs:1401` |
| Kernel sendfd | ✅ Exists | `SYS_SENDFD` at `syscall/fs.rs:415`, passes `Arc<LockedFileDescription>` |
| Kernel PhysBorrow | ✅ Exists | `scheme:memory` physical address mapping |
| libdrm `__redox__` | ✅ Full | Opens `/scheme/drm`, dispatches KMS + PRIME ioctls via `redox_fpath` |
### What Was Missing (Status as of v1)
| Component | Status | Impact |
|-----------|--------|--------|
| PRIME_HANDLE_TO_FD | ✅ Implemented | Opaque export tokens via prime_exports map |
| PRIME_FD_TO_HANDLE | ✅ Implemented | Token lookup via prime_exports, adds to owned_gems |
| libdrm PRIME/GEM dispatch | ✅ Implemented | __redox__ wrappers in drmPrimeHandleToFD/drmPrimeFDToHandle |
| Mesa Redox winsys | 🚧 Scaffolding | Stubs compile but do not render — blocked on GPU CS |
| GPU command submission | ❌ Not implemented | No CS ioctl, no ring buffer programming |
| GPU fence/signaling | ❌ Not implemented | No GPU completion notification |
### What Was Cleaned Up (Previous Session)
The old fake PRIME implementation used synthetic fd numbers starting at 10,000 that were not real
kernel file descriptors. Other processes could not resolve them. Oracle caught this across 4
verification rounds. The cleanup:
- Removed `exported_dmafds` tracking from Handle struct
- Removed `imported_gems` from Handle
- Removed DMA-BUF methods from `GpuDriver` trait and AMD/Intel driver impls
- Removed `DmabufManager` from `GemManager`
- Removed `mod dmabuf` from `main.rs`
- Removed PRIME wire structs (`DrmPrimeHandleToFdWire`, `DrmPrimeFdToHandleWire`)
- PRIME handlers → EOPNOTSUPP (honest, not fake)
- Removed all `#[allow(dead_code)]` from fake bookkeeping
## Phased Implementation
### v1: System RAM, Linear, Single GPU (Target: working PRIME)
**Goal**: A compositor (KWin) can import a buffer rendered by a GPU client (Mesa) and display it.
All buffers in system RAM, linear layout, single GPU.
**Duration estimate**: 6-10 weeks (2 developers)
#### Step 1: Delete dead dmabuf.rs
Remove `local/recipes/gpu/redox-drm/source/src/dmabuf.rs`. It is dead code — `mod dmabuf` was
removed from `main.rs` but the file still exists.
**Effort**: trivial
#### Step 2: Implement PRIME export in redox-drm
When `PRIME_HANDLE_TO_FD` is called:
1. Look up the GEM handle in the calling fd's `owned_gems`
2. Validate ownership (same as GEM_MMAP check)
3. Generate an opaque export token and store `prime_exports[token] = gem_handle`
4. Return the token to the caller (NOT a scheme fd or GEM handle)
The client then opens `/scheme/drm/card0/dmabuf/{token}` to get a real scheme fd. The open
handler validates the token against `prime_exports`, creates a `NodeKind::DmaBuf` scheme handle,
and bumps the GEM export refcount. When that scheme fd is closed, the refcount is dropped.
Key design: export tokens are opaque identifiers, not synthetic fd numbers or raw GEM handles.
The `prime_exports` map resolves tokens to GEM handles. Tokens are cleaned up when the last
export ref for a GEM handle is dropped.
**Changes to `scheme.rs`**:
- Add `NodeKind::DmaBuf { gem_handle, export_token }` variant
- Add `prime_exports: BTreeMap<u32, GemHandle>` and `next_export_token: u32`
- `PRIME_HANDLE_TO_FD` handler: validate ownership → generate token → store in prime_exports → return token
- `PRIME_FD_TO_HANDLE` handler: receive token → look up in prime_exports → add GEM to caller's `owned_gems`
- `open()` handler: accept `"card0/dmabuf/{token}"` path → validate token → create DmaBuf node → bump export ref
- `mmap_prep()` handler: for DmaBuf nodes, return GEM physical address
**Changes to `driver.rs`**:
- No changes needed. GEM operations stay on the trait as-is. PRIME is a scheme-level concern,
not a driver-level concern.
**Effort**: 1-2 weeks
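
The token bookkeeping described in this step can be sketched as follows. This is a minimal model, not the code in `scheme.rs`: the `PrimeExports` struct, its method names, the `GemHandle = u32` alias, and the choice to start tokens at 1 are all assumptions made for illustration. Only the `prime_exports` and `next_export_token` field names and the `card0/dmabuf/{token}` path come from the plan.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the driver's GEM handle type.
type GemHandle = u32;

/// Minimal model of the export-token table (field names follow the plan).
#[derive(Default)]
struct PrimeExports {
    prime_exports: BTreeMap<u32, GemHandle>,
    next_export_token: u32,
}

impl PrimeExports {
    /// PRIME_HANDLE_TO_FD: the caller must own `handle`; returns an opaque token.
    fn export(&mut self, owned_gems: &[GemHandle], handle: GemHandle) -> Option<u32> {
        if !owned_gems.contains(&handle) {
            return None; // ownership check failed
        }
        // Monotonic counter. Assumption: tokens start at 1 and the export
        // is refused on overflow instead of wrapping around.
        self.next_export_token = self.next_export_token.checked_add(1)?;
        let token = self.next_export_token;
        self.prime_exports.insert(token, handle);
        Some(token)
    }

    /// open("card0/dmabuf/{token}"): resolve a scheme path to the exported GEM handle.
    fn resolve_path(&self, path: &str) -> Option<GemHandle> {
        let token: u32 = path.strip_prefix("card0/dmabuf/")?.parse().ok()?;
        self.prime_exports.get(&token).copied()
    }
}
```

Keeping the token separate from the GEM handle means a process that guesses a handle number still cannot open a DmaBuf node unless the owner exported it first.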
#### Step 3: Add reference counting for shared GEM objects
When a GEM buffer is exported via PRIME, multiple scheme fds may reference it. The `close()` path
must only call `driver.gem_close()` when ALL references (original GEM + all exported fds) are gone.
**Changes**:
- Add `gem_refcounts: BTreeMap<GemHandle, usize>` to `DrmScheme`
- Increment on export, decrement on close of DmaBuf fd
- `gem_close()` checks refcount before calling driver
**Effort**: 3-5 days
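
A minimal sketch of the export refcount, assuming it tracks only DmaBuf scheme fds and that the owner's own GEM reference is handled separately by the existing `GEM_CLOSE` path. The `GemRefs` struct and its method names are hypothetical; only the `gem_refcounts` field comes from the plan.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the driver's GEM handle type.
type GemHandle = u32;

/// Counts open DmaBuf scheme fds per GEM handle.
#[derive(Default)]
struct GemRefs {
    gem_refcounts: BTreeMap<GemHandle, usize>,
}

impl GemRefs {
    /// Called when a DmaBuf scheme fd is opened for `handle`.
    fn export_ref(&mut self, handle: GemHandle) {
        *self.gem_refcounts.entry(handle).or_insert(0) += 1;
    }

    /// Called when a DmaBuf scheme fd is closed. Returns the remaining count.
    fn export_unref(&mut self, handle: GemHandle) -> usize {
        let remaining = match self.gem_refcounts.get(&handle) {
            Some(&n) => n.saturating_sub(1),
            None => return 0, // unknown handle: nothing to drop
        };
        if remaining == 0 {
            self.gem_refcounts.remove(&handle);
        } else {
            self.gem_refcounts.insert(handle, remaining);
        }
        remaining
    }

    /// True when no exported fd still refers to the buffer, so the
    /// driver-level close may run once the owner has also released it.
    fn can_destroy(&self, handle: GemHandle) -> bool {
        !self.gem_refcounts.contains_key(&handle)
    }
}
```

Removing the map entry at zero keeps `can_destroy` a plain key lookup and avoids leaving stale zero-count entries behind.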
#### Step 4: Validate with a two-process reproducer
Build a minimal test that:
1. Process A opens `/scheme/drm/card0`, creates a GEM buffer, writes a pattern
2. Process A exports via PRIME_HANDLE_TO_FD
3. Process A sends the fd to Process B via `sendfd` (or equivalent scheme IPC)
4. Process B receives the fd, imports via PRIME_FD_TO_HANDLE
5. Process B mmaps the imported handle and reads the pattern
6. Verify both processes see the same physical pages (same data, zero-copy)
This validates the full chain: redox-drm → scheme fd → sendfd → import → mmap → FmapBorrowed.
**Effort**: 1 week
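
The pattern write in item 1 and the read-back check in item 5 could use helpers like the ones below. This sketch covers only the data check on a byte slice; the DRM ioctls, `sendfd`, and the mapping itself are not shown. The function names and the pattern formula are illustrative assumptions.

```rust
/// Expected byte at offset `i` for a given seed (arbitrary, position-dependent).
fn pattern_byte(i: usize, seed: u8) -> u8 {
    seed.wrapping_add((i as u8).wrapping_mul(31))
}

/// Process A: fill the mapped buffer with the pattern.
fn fill_pattern(buf: &mut [u8], seed: u8) {
    for (i, b) in buf.iter_mut().enumerate() {
        *b = pattern_byte(i, seed);
    }
}

/// Process B: return the offset of the first mismatching byte, or None if all match.
fn first_mismatch(buf: &[u8], seed: u8) -> Option<usize> {
    buf.iter()
        .enumerate()
        .position(|(i, &b)| b != pattern_byte(i, seed))
}
```

A position-dependent pattern catches offset and stride mistakes that a constant fill would hide. To show that the two mappings share pages instead of holding a copy, Process B can write a second seed after Process A has finished and Process A can read it back.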
#### Step 5: libdrm Redox PRIME/GEM dispatch
libdrm already has `__redox__` conditionals. Add dispatch for:
- `drmPrimeHandleToFD()` → send `PRIME_HANDLE_TO_FD` ioctl to `/scheme/drm`
- `drmPrimeFDToHandle()` → send `PRIME_FD_TO_HANDLE` ioctl
- `drmPrimeClose()` → close the exported/imported fd
- `drmGemHandleToPrimeFD()` / `drmPrimeFDToGemHandle()` — aliases for the above
The libdrm WIP recipe is at `recipes/wip/x11/libdrm/`. The `__redox__` handling already opens
`/scheme/drm` and has ioctl dispatch infrastructure. The gap is PRIME/GEM-specific ioctl codes.
**Effort**: 1-2 weeks
#### Step 6: Mesa Redox winsys (compile-time scaffolding)
Add `src/gallium/winsys/redox/` to Mesa that:
- Opens the DRM scheme
- Allocates GEM buffers via `GEM_CREATE`
- Exports them via `PRIME_HANDLE_TO_FD`
- Imports shared buffers via `PRIME_FD_TO_HANDLE`
- Maps them via `mmap` (which triggers `FmapBorrowed`)
Pattern: similar to `winsys/amdgpu/drm/` but using Redox scheme IPC. This is scaffolding — it
compiles but cannot render without GPU command submission (out of scope for this plan; see `HARDWARE-3D-ASSESSMENT.md`).
Split into:
- **6a**: Compile-time winsys structure, buffer allocation, PRIME export/import
- **6b**: Runtime buffer-sharing enablement (depends on step 4 validation)
**Effort**: 3-4 weeks
### v2: VRAM/GTT Placement, Tiling, Multi-GPU
**Goal**: Buffers can live in VRAM with GTT aperture access. Tiled/modifier support for
scanout-optimized layouts. Multi-GPU buffer sharing.
**Duration estimate**: 8-12 weeks (after v1)
- AMD GTT/VRAM placement via `amdgpu_gtt_mgr` / `amdgpu_vram_mgr` equivalents
- Intel GGTT/PPGTT population for imported buffers
- DRM format modifiers: `DRM_FORMAT_MOD_LINEAR` + vendor-specific tiling
- Multi-GPU: each GPU has its own `redox-drm` instance, PRIME between them
- This tier requires the AMD/Intel driver GTT programming that is currently partial
### v3: Fencing, Explicit Sync, Vulkan
**Goal**: GPU fence objects for render/scanout synchronization. Explicit sync protocol for
Wayland. Vulkan driver support.
**Duration estimate**: 12-16 weeks (after v2)
- `dma_fence` equivalent: kernel waitable event per page-flip or command submission
- `sync_file` equivalent: fd-backed fence that can be passed between processes
- Wayland `zwp_linux_explicit_synchronization_v1` protocol in compositor
- Vulkan `VK_KHR_external_memory` / `VK_KHR_external_semaphore` backed by DMA-BUF fds
- AMD: fence through ring buffer writeback + IRQ
- Intel: fence through seqno writeback + IRQ
## Dependency Graph
```
Step 1 (delete dmabuf.rs)
→ no dependency, do immediately
Step 2 (PRIME export/import in scheme)
→ depends on: nothing
→ enables: steps 3, 4, 5
Step 3 (refcount for shared GEM)
→ depends on: step 2
→ enables: step 4
Step 4 (two-process reproducer)
→ depends on: steps 2, 3
→ validates: the full chain works
Step 5 (libdrm dispatch)
→ depends on: step 2 (ioctl protocol defined)
→ can start in parallel with steps 3-4
Step 6 (Mesa winsys)
→ depends on: step 5 (libdrm API available)
→ 6a can start once step 2 protocol is defined
→ 6b should wait for step 4 validation
```
Steps 5 and 6a can proceed in parallel with steps 3-4 once step 2 is done.
## What This Does NOT Cover
This plan covers **cross-process buffer sharing** (the DMA-BUF/PRIME contract). It does not cover:
| Out of scope | Where it lives |
|-------------|----------------|
| GPU command submission (CS ioctl) | `HARDWARE-3D-ASSESSMENT.md` Tier 2 |
| GPU fence/signaling | `HARDWARE-3D-ASSESSMENT.md` Tier 2 |
| Mesa hardware Gallium driver (radeonsi/iris) | `HARDWARE-3D-ASSESSMENT.md` Tier 1 |
| AMD ring buffer programming | `local/recipes/gpu/amdgpu/` |
| Intel render ring programming | `local/recipes/gpu/redox-drm/source/src/drivers/intel/` |
| Mesa EGL platform extension for DRM | `HARDWARE-3D-ASSESSMENT.md` Tier 3 |
PRIME/DMA-BUF is a **prerequisite** for hardware-accelerated rendering, but it is not sufficient
by itself. The render pipeline (command submission + fencing + Mesa driver) is tracked separately
in `HARDWARE-3D-ASSESSMENT.md`.
## Why Not a Kernel DMA-BUF Scheme
Linux has a global `dma-buf` kernel subsystem with its own fd type. Red Bear OS does NOT need this
because:
1. **`redox-drm` IS the exporter.** In Linux, any kernel subsystem can export a dma-buf. In Redox,
only the DRM scheme exports GPU buffers. There is no need for a generic kernel dma-buf layer.
2. **Scheme fds ARE the sharing mechanism.** In Linux, dma-buf has its own fd type with special
mmap semantics. In Redox, scheme file descriptors already support `fmap_prep` → `FmapBorrowed`.
The kernel maps the same physical pages. No new fd type needed.
3. **`sendfd` IS the fd passing mechanism.** In Linux, fd passing uses SCM_RIGHTS over Unix
sockets. In Redox, `sendfd` passes `Arc<LockedFileDescription>` via scheme IPC. Same result.
If a future use case requires sharing non-DRM buffers (e.g., camera frames, video decode output),
a separate `scheme:dmabuf` could be created. But for GPU buffer sharing, the DRM scheme is
sufficient.
## Wire Protocol Design
### PRIME_HANDLE_TO_FD
Request (from libdrm client):
```c
struct DrmPrimeHandleToFdWire {
uint32_t handle; // GEM handle to export
uint32_t flags; // DRM_CLOEXEC | DRM_RDWR (hints, not critical for v1)
};
```
Response:
```c
struct DrmPrimeHandleToFdResponseWire {
int32_t fd; // opaque export token (NOT a process fd or GEM handle)
uint32_t _pad;
};
```
The scheme internally:
1. Validates handle ownership
2. Generates an opaque export token (monotonically increasing counter)
3. Stores `prime_exports[token] = gem_handle`
4. Returns the token as `fd`
The client then opens `/scheme/drm/card0/dmabuf/{token}` to get a real scheme fd.
The open handler validates the token, creates a DmaBuf scheme handle, and bumps
`gem_export_refs`. When that scheme fd is closed, the ref is dropped.
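The export side described above can be sketched as a small state machine. This is a simplified model, not the code in `scheme.rs`: the `PrimeState` type, its method names, and the use of `u32` for both handles and tokens are assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type GemHandle = u32;
pub type ExportToken = u32;

/// Hypothetical scheme-side bookkeeping for PRIME exports.
#[derive(Default)]
pub struct PrimeState {
    next_token: ExportToken,
    pub prime_exports: BTreeMap<ExportToken, GemHandle>,
    pub gem_export_refs: BTreeMap<GemHandle, usize>,
}

impl PrimeState {
    /// PRIME_HANDLE_TO_FD: validate ownership, then mint a fresh token.
    pub fn handle_to_token(
        &mut self,
        owned_gems: &BTreeSet<GemHandle>,
        handle: GemHandle,
    ) -> Option<ExportToken> {
        if !owned_gems.contains(&handle) {
            return None; // caller does not own this GEM handle
        }
        // Monotonically increasing counter; refuse to wrap around.
        self.next_token = self.next_token.checked_add(1)?;
        self.prime_exports.insert(self.next_token, handle);
        Some(self.next_token)
    }

    /// open("card0/dmabuf/{token}"): validate the token, bump the export ref.
    pub fn open_dmabuf(&mut self, token: ExportToken) -> Option<GemHandle> {
        let handle = *self.prime_exports.get(&token)?;
        *self.gem_export_refs.entry(handle).or_insert(0) += 1;
        Some(handle)
    }
}
```

Keeping the token separate from both the GEM handle and the process fd means a client cannot guess another client's handle from a token value alone, assuming ownership is checked as in `handle_to_token`.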
### PRIME_FD_TO_HANDLE
Request (from libdrm client):
```c
struct DrmPrimeFdToHandleWire {
int32_t fd; // opaque export token (extracted via redox_fpath on dmabuf fd)
uint32_t _pad;
};
```
Response:
```c
struct DrmPrimeFdToHandleResponseWire {
uint32_t handle; // GEM handle for the imported buffer
uint32_t _pad;
};
```
The scheme internally:
1. Looks up the export token in `prime_exports` → gets the GEM handle
2. Validates the token exists
3. Adds the GEM handle to the caller's `owned_gems`
4. Returns the GEM handle
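A minimal sketch of the import steps above, assuming the token table is a `BTreeMap` and each client tracks its handles in a set. `Client`, `ImportError`, and `fd_to_handle` are hypothetical names, and the liveness check the real handler performs is omitted here.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type GemHandle = u32;
pub type ExportToken = u32;

/// Hypothetical per-client state: the GEM handles this client may use.
#[derive(Default)]
pub struct Client {
    pub owned_gems: BTreeSet<GemHandle>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ImportError {
    UnknownToken,
}

/// PRIME_FD_TO_HANDLE: resolve the token, then grant the caller ownership.
pub fn fd_to_handle(
    prime_exports: &BTreeMap<ExportToken, GemHandle>,
    client: &mut Client,
    token: ExportToken,
) -> Result<GemHandle, ImportError> {
    // Lookup and validation are one step: a missing entry is an error.
    let handle = *prime_exports
        .get(&token)
        .ok_or(ImportError::UnknownToken)?;
    // Importing twice is harmless because the set deduplicates.
    client.owned_gems.insert(handle);
    Ok(handle)
}
```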
### open() path extension
```rust
// Existing paths:
"card0"                 → NodeKind::Card
"card0Connector/{id}"   → NodeKind::Connector(id)
// Export token path (validated against prime_exports):
"card0/dmabuf/{token}"  → NodeKind::DmaBuf { gem_handle, export_token: token }
```
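The path mapping above could be parsed along these lines. This sketch only classifies the path string; `ParsedPath` and `parse_path` are illustrative names, and resolving the token to a `gem_handle` through `prime_exports` is assumed to happen in a later step.

```rust
/// Simplified stand-in for the scheme's node kinds.
#[derive(Debug, PartialEq, Eq)]
pub enum ParsedPath {
    Card,
    Connector(u32),
    DmaBuf { export_token: u32 },
}

/// Classifies a path relative to the scheme root; None means "no such node".
pub fn parse_path(path: &str) -> Option<ParsedPath> {
    if path == "card0" {
        return Some(ParsedPath::Card);
    }
    if let Some(id) = path.strip_prefix("card0Connector/") {
        // Non-numeric connector ids are rejected.
        return id.parse().ok().map(ParsedPath::Connector);
    }
    if let Some(token) = path.strip_prefix("card0/dmabuf/") {
        // The token must still be validated against prime_exports by the caller.
        return token
            .parse()
            .ok()
            .map(|export_token| ParsedPath::DmaBuf { export_token });
    }
    None
}
```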
### redox_fpath() for DmaBuf
```rust
NodeKind::DmaBuf { export_token, .. } => format!("drm:card0/dmabuf/{export_token}")
```
### Token cleanup
When the last export ref for a GEM handle is dropped:
```rust
fn drop_export_ref(&mut self, gem_handle: GemHandle) {
// ... decrement refcount ...
if remove_entry {
self.gem_export_refs.remove(&gem_handle);
self.prime_exports.retain(|_, &mut h| h != gem_handle);
}
}
```
When a GEM is destroyed via any path (GEM_CLOSE, DESTROY_DUMB, handle close, fb reap),
`prime_exports` entries are pruned:
- `maybe_close_gem()`: central helper prunes tokens on successful `driver.gem_close()`
- `GEM_CLOSE` / `DESTROY_DUMB`: explicit `prime_exports.retain()` after direct `driver.gem_close()`
- `PRIME_FD_TO_HANDLE`: `gem_size()` liveness check removes stale token on failure
- `open("card0/dmabuf/{token}")`: `gem_size()` liveness check removes stale token on failure
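A self-contained version of the refcount-and-prune logic is sketched below. It fills in the elided decrement step with one plausible implementation, so treat the body as an assumption; `ExportTable` is a hypothetical container for the two maps.

```rust
use std::collections::BTreeMap;

pub type GemHandle = u32;
pub type ExportToken = u32;

/// Hypothetical container for the two export-tracking maps.
#[derive(Default)]
pub struct ExportTable {
    pub prime_exports: BTreeMap<ExportToken, GemHandle>,
    pub gem_export_refs: BTreeMap<GemHandle, usize>,
}

impl ExportTable {
    /// Drops one export ref. Returns true when it was the last one,
    /// in which case every token pointing at the handle is pruned.
    pub fn drop_export_ref(&mut self, gem_handle: GemHandle) -> bool {
        let remove_entry = match self.gem_export_refs.get_mut(&gem_handle) {
            Some(count) => {
                *count = count.saturating_sub(1);
                *count == 0
            }
            // No refs recorded: nothing to drop, nothing to prune.
            None => false,
        };
        if remove_entry {
            self.gem_export_refs.remove(&gem_handle);
            // Keep only tokens that refer to other GEM handles.
            self.prime_exports.retain(|_, h| *h != gem_handle);
        }
        remove_entry
    }
}
```

Pruning by value with `retain` is linear in the number of live tokens; that is likely acceptable for small token tables, though a reverse index would avoid the scan.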
## Files to Modify
| File | Change | Status |
|------|--------|--------|
| `local/recipes/gpu/redox-drm/source/src/dmabuf.rs` | **DELETED** | ✅ |
| `local/recipes/gpu/redox-drm/source/src/scheme.rs` | DmaBuf nodes, opaque export tokens, PRIME handlers, refcount cleanup, stale token cleanup | ✅ |
| `local/recipes/gpu/redox-drm/source/src/gem.rs` | No changes (GEM operations unchanged) | — |
| `local/recipes/gpu/redox-drm/source/src/driver.rs` | No changes (PRIME is scheme-level) | — |
| `local/recipes/gpu/redox-drm/source/src/main.rs` | No changes (already clean) | — |
| `recipes/wip/x11/libdrm/source/xf86drm.c` | `redox_fpath()` + export token dmabuf path + `sys/redox.h` | ✅ |
| `recipes/libs/mesa/source/src/gallium/winsys/redox/drm/` | 4 scaffolding files (compile-time only) | ✅ |
| `local/recipes/tests/redox-drm-prime-test/` | Test reproducer recipe + Rust binary (incl. stale token test) | ✅ |
| `local/docs/HARDWARE-3D-ASSESSMENT.md` | PRIME status updated | ✅ |
| `local/docs/DMA-BUF-IMPROVEMENT-PLAN.md` | Implementation status updated | ✅ |
## Implementation Status (2026-04-16)
| Step | Status | Deliverable |
|------|--------|-------------|
| 1. Delete dead dmabuf.rs | ✅ Done | File removed |
| 2. PRIME export/import in scheme | ✅ Done | DmaBuf nodes, export refcounting, mmap_prep, open/close/fpath |
| 3. Reference counting for shared GEM | ✅ Done | gem_export_refs, bump/drop, gem_can_close, maybe_close_gem |
| 4. Two-process reproducer | ✅ Recipe created | `local/recipes/tests/redox-drm-prime-test/` (runtime validation pending) |
| 5. libdrm Redox dispatch | ✅ Done | `__redox__` wrappers in `drmPrimeHandleToFD` and `drmPrimeFDToHandle` |
| 6a. Mesa winsys scaffolding | ✅ Done | `src/gallium/winsys/redox/drm/` (4 files, compiles but does not render) |
| 6b. Mesa runtime buffer sharing | ⏳ Blocked | Requires GPU command submission (not yet implemented) |
**Stale token cleanup**: All GEM destruction paths now prune `prime_exports`. Central cleanup
in `maybe_close_gem()`, explicit cleanup in `GEM_CLOSE`/`DESTROY_DUMB`, liveness checks in
`PRIME_FD_TO_HANDLE` and `open("card0/dmabuf/{token}")` that remove stale tokens on failure.
Verified by Oracle across 8 rounds.
**Protocol note**: PRIME uses opaque export tokens. PRIME_HANDLE_TO_FD returns a monotonically
increasing token stored in `prime_exports`. The client opens `/scheme/drm/card0/dmabuf/{token}`
to get a real scheme fd. `redox_fpath()` on that fd reveals the token. PRIME_FD_TO_HANDLE
accepts the export token and resolves it via `prime_exports`. Tokens are cleaned up when the
last export ref is dropped.
## Relationship to Other Plans
- `local/docs/HARDWARE-3D-ASSESSMENT.md` — broader hardware 3D status (command submission, fencing,
Mesa driver enablement). This document is the DMA-BUF-specific deep dive.
- `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` — canonical desktop path plan. DMA-BUF is a
prerequisite for the hardware-accelerated rendering phase.
- `local/docs/AMD-FIRST-INTEGRATION.md` — AMD-specific GPU details including GTT/VRAM programming.
- `docs/04-LINUX-DRIVER-COMPAT.md` — linux-kpi architecture reference for driver porting.
@@ -0,0 +1,163 @@
# Red Bear OS: Hardware-Accelerated 3D Assessment
**Date**: 2026-04-16
**Scope**: AMD + Intel GPU hardware OpenGL/Vulkan for KDE Plasma desktop
## Bottom Line
PRIME/DMA-BUF cross-process buffer sharing is **now implemented** at the scheme level. GEM
allocation, PRIME export/import, and zero-copy mmap via FmapBorrowed all work through the
redox-drm scheme daemon and libdrm. The remaining gaps for hardware 3D are GPU command
submission (CS ioctl), GPU fence/signaling, and Mesa hardware Gallium driver enablement.
These gaps are detailed under "What Remains" below; the PRIME/DMA-BUF work itself is tracked in
`local/docs/DMA-BUF-IMPROVEMENT-PLAN.md`.
## Capability Stack
```
Application (KDE Plasma / Qt6 / Wayland compositor)
EGL / GBM / Wayland protocol
Mesa (Gallium state tracker → hardware driver) ← ONLY swrast (CPU), Redox winsys scaffolding exists
libdrm (userspace DRM wrapper) ← __redox__ PRIME dispatch ✅, opens /scheme/drm
DRM scheme ioctls (GEM, PRIME, render) ← GEM ✅, PRIME ✅ (DmaBuf nodes), render ❌
redox-drm (userspace DRM/KMS daemon) ← display ✅, buffer sharing ✅, render ❌
Kernel (FmapBorrowed, sendfd, GPU interrupts) ← buffer sharing ✅, GPU fences ❌
GPU hardware (AMD RDNA / Intel Gen)
```
## Layer-by-Layer Status
### 1. GPU Hardware Drivers (redox-drm + amdgpu + linux-kpi)
| Component | Status | Lines | What's Implemented |
|-----------|--------|-------|-------------------|
| DRM/KMS modesetting | ✅ Code complete | ~500 | 16 KMS ioctls, CRTC/connector/encoder/plane |
| AMD Display Core | ✅ Compiles | ~1400 | DC init, CRTC programming, firmware loading, HPD |
| Intel Display Driver | ✅ Compiles | ~800 | Display pipe, GGTT, forcewake |
| GEM buffer management | ✅ Full | ~350 | create/close/mmap with DmaBuffer |
| GEM scheme ioctls | ✅ Wired | ~100 | GEM_CREATE, GEM_CLOSE, GEM_MMAP |
| PRIME scheme ioctls | ✅ Implemented | ~120 | PRIME_HANDLE_TO_FD + PRIME_FD_TO_HANDLE via DmaBuf nodes + export refcounting |
| libdrm PRIME dispatch | ✅ Implemented | ~30 | `__redox__` wrappers: open dmabuf path + fpath-based GEM handle extraction |
| Mesa Redox winsys | 🚧 Scaffolding | ~4 files | Directory structure + stubs in src/gallium/winsys/redox/drm/ |
| Render command submission | ❌ Missing | 0 | No CS ioctl, no ring buffer programming |
| GPU context management | ❌ Missing | 0 | No context create/destroy |
| Fence/sync objects | ❌ Missing | 0 | No GPU fence signaling |
| AMD ring buffer | ⚠️ Partial | ~100 | Page flip only, no general command submission |
### 2. Mesa Build Configuration
| Setting | Current Value | Needed for HW 3D |
|---------|--------------|-------------------|
| `gallium-drivers` | `swrast` | `swrast,radeonsi` (AMD) or `swrast,iris` (Intel) |
| `vulkan-drivers` | `swrast` | `swrast,amd` (RADV) or `swrast,intel` (ANV) |
| `platforms` | `redox` | `redox` (same) |
| EGL | enabled | enabled (same) |
| GBM | enabled | enabled (same) |
| `gallium-winsys` | none (swrast doesn't need one) | New Redox winsys for radeonsi/iris |
| `egl/platform_redox.c` | 540 lines, Orbital-backed | Needs DRM backend for HW buffers |
### 3. Kernel Infrastructure
| Feature | Status | Impact |
|---------|--------|--------|
| PCI enumeration | ✅ | GPU devices discovered |
| Memory scheme (phys mmap) | ✅ | GPU register access works |
| IRQ scheme (MSI-X) | ✅ | GPU interrupts can be delivered |
| DMA-BUF fd passing | ✅ Scheme-level | FmapBorrowed + sendfd + DmaBuf nodes enable zero-copy cross-process sharing |
| GPU fence/wait | ❌ | No GPU completion signaling |
| IOMMU/GPU page tables for imports | ❌ | Imported buffers can't be mapped into GPU GTT |
## The Render Path Gap
For hardware OpenGL, the data path is:
```
Mesa Gallium (radeonsi)
→ libdrm open("drm:card0")
→ DRM_IOCTL_GEM_CREATE (allocate GPU buffer) ← EXISTS
→ DRM_IOCTL_PRIME_HANDLE_TO_FD (export for sharing) ← ✅ IMPLEMENTED (DmaBuf node + scheme fd)
→ DRM_IOCTL_AMDGPU_CS (submit commands to GPU) ← DOES NOT EXIST
→ fence wait (GPU completion) ← DOES NOT EXIST
→ present via KMS (PAGE_FLIP) ← EXISTS
```
GEM allocation and PRIME export (the first two ioctls above) now have full scheme ioctl support,
with cross-process buffer sharing via DmaBuf scheme nodes, sendfd, and FmapBorrowed. Command
submission and fence waits remain the critical gaps. With the buffer-sharing foundation in place,
compositors and clients can share GPU buffers zero-copy; what is missing is GPU command
submission for actual rendering.
## What Was Implemented
| Change | Before | After |
|--------|--------|-------|
| `DRM_IOCTL_GEM_CREATE` | Not in scheme | Full ioctl handler: allocate GEM buffer, track ownership |
| `DRM_IOCTL_GEM_CLOSE` | Not in scheme | Full ioctl handler with ownership check |
| `DRM_IOCTL_GEM_MMAP` | Not in scheme | Full ioctl handler: return virtual address |
| `DRM_IOCTL_PRIME_HANDLE_TO_FD` | EOPNOTSUPP | Full implementation: opaque export tokens, prime_exports map, dmabuf fd creation |
| `DRM_IOCTL_PRIME_FD_TO_HANDLE` | EOPNOTSUPP | Full implementation: accepts export token (from redox_fpath), resolves via prime_exports |
| `libdrm __redox__ PRIME` | Not present | drmPrimeHandleToFD opens dmabuf path via export token; drmPrimeFDToHandle extracts token via redox_fpath |
| `NodeKind::DmaBuf` | Not present | DmaBuf node with mmap_prep returning GEM virtual address (enables FmapBorrowed) |
| `gem_export_refs` tracking | Not present | BTreeMap refcount for shared GEM objects, prevents premature gem_close |
| Mesa winsys scaffolding | Not present | src/gallium/winsys/redox/drm/ stub directory structure |
## What Remains (Ordered by Dependency)
### Tier 1: Can be done without kernel changes
1. **Mesa Gallium hardware driver enablement** — Change recipe from `-Dgallium-drivers=swrast` to
include `radeonsi` or `iris`. This will fail to build without a winsys, but the attempt reveals
the exact Mesa-side gaps.
2. **Redox Mesa winsys** — Scaffolding exists at `src/gallium/winsys/redox/drm/` (compile-time
stubs). Needs real implementation of buffer allocation, PRIME export/import, and mmap.
PRIME ioctls are now implemented in redox-drm and libdrm has `__redox__` dispatch.
3. **libdrm Redox backend** — libdrm already has `__redox__` conditional handling, opens
`/scheme/drm`, and dispatches PRIME ioctls via `redox_fpath()` and dmabuf path opening.
The remaining gap is GPU-family-specific command submission ioctls.
### Tier 2: Requires kernel work
4. **GPU command submission** — The amdgpu and Intel drivers need ring buffer programming for
3D command submission, not just page flip. This is GPU-family-specific:
- AMD: GFX ring, compute ring, SDMA ring
- Intel: render ring, blitter ring
5. **GPU fence/signaling** — After submitting commands, the kernel needs to signal completion
back to userspace. This requires IRQ handling that maps GPU interrupts to fence objects.
### Tier 3: Requires significant new code
6. **GTT/PPGTT population for imported buffers** — When Mesa imports a DMA-BUF into the GPU,
the buffer's physical pages must be mapped into the GPU's address space. Currently only
internally-allocated GEM objects get GTT mappings.
7. **Mesa EGL platform extension** — `platform_redox.c` currently uses Orbital for buffer
management. It needs an alternative path that uses DRM GEM for hardware-accelerated
surfaces.
## Estimated Effort (2 developers)
| Tier | Duration | Deliverable |
|------|----------|-------------|
| Tier 1 (userspace) | 8-16 weeks | Mesa builds with radeonsi, winsys talks to DRM scheme |
| Tier 2 (kernel/driver) | 12-20 weeks | GPU command submission, fences, VRAM placement |
| Tier 3 (integration) | 6-12 weeks | Hardware-accelerated OpenGL applications |
| **Total** | **26-48 weeks** | **Hardware 3D on AMD** |
Intel (iris) is expected to be faster to enable than AMD (radeonsi is ~6M lines versus ~400k for
iris), but both are equal-priority Red Bear OS targets. The order of enablement is driven by
driver complexity, not platform priority.
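The total row in the effort table is the sum of the per-tier ranges, which holds only under the assumption that the tiers run strictly one after another. A small sketch of that arithmetic, with `TIERS` and `total_weeks` as illustrative names:

```rust
/// (label, minimum weeks, maximum weeks), copied from the effort table.
pub const TIERS: [(&str, u32, u32); 3] = [
    ("Tier 1 (userspace)", 8, 16),
    ("Tier 2 (kernel/driver)", 12, 20),
    ("Tier 3 (integration)", 6, 12),
];

/// Sums the lower and upper bounds, assuming the tiers do not overlap in time.
pub fn total_weeks(tiers: &[(&str, u32, u32)]) -> (u32, u32) {
    tiers
        .iter()
        .fold((0, 0), |(lo, hi), (_, min, max)| (lo + min, hi + max))
}
```

If Tier 1 and Tier 2 work were overlapped across the two developers, the real span could be shorter than this sum.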
## Relationship to Other Plans
- `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` — Phase 5 covers hardware GPU enablement
- `local/docs/AMD-FIRST-INTEGRATION.md` — AMD-specific GPU driver details
- `local/docs/P2-AMD-GPU-DISPLAY.md` — Display driver code-complete status
- `docs/04-LINUX-DRIVER-COMPAT.md` — linux-kpi architecture reference
@@ -123,6 +123,11 @@ pcid: local/config/pcid.d/amd_gpu.toml
- Rust side passes real PciDeviceInfo (vendor, device, revision, IRQ, BAR0/BAR2) to C via FFI
- C layer validates the struct is populated before `amdgpu_redox_init()` uses it
### linux-kpi quirk consumption (current)
- `redox-drm` now also passes the real PCI BDF into the amdgpu C glue so linux-kpi quirk lookups resolve against the actual GPU, not a guessed location
- `amdgpu_redox_main.c` now calls `pci_get_quirk_flags()` / `pci_has_quirk()` in the live Redox init path
- `PCI_QUIRK_NEED_FIRMWARE` now gates DMCUB firmware loading as a hard requirement when present, while logs also spell out quirk-driven IRQ expectations (`NO_MSI`, `NO_MSIX`, `FORCE_LEGACY`)
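How quirk flags might translate into firmware and IRQ expectations can be sketched as below. Everything here is an assumption for illustration: the bit positions, the `PCI_QUIRK_NO_MSI`, `PCI_QUIRK_NO_MSIX`, and `PCI_QUIRK_FORCE_LEGACY` constant spellings, and the fallback order are not taken from linux-kpi.

```rust
// Hypothetical bit assignments; the real values live in the linux-kpi headers.
pub const PCI_QUIRK_NO_MSI: u64 = 1 << 0;
pub const PCI_QUIRK_NO_MSIX: u64 = 1 << 1;
pub const PCI_QUIRK_FORCE_LEGACY: u64 = 1 << 2;
pub const PCI_QUIRK_NEED_FIRMWARE: u64 = 1 << 3;

#[derive(Debug, PartialEq, Eq)]
pub enum IrqMode {
    MsiX,
    Msi,
    Legacy,
}

/// True when `quirk` is set in `flags`.
pub fn has_quirk(flags: u64, quirk: u64) -> bool {
    (flags & quirk) != 0
}

/// Assumed preference order: MSI-X, then MSI, then legacy interrupts.
pub fn expected_irq_mode(flags: u64) -> IrqMode {
    if has_quirk(flags, PCI_QUIRK_FORCE_LEGACY) {
        return IrqMode::Legacy;
    }
    if !has_quirk(flags, PCI_QUIRK_NO_MSIX) {
        return IrqMode::MsiX;
    }
    if !has_quirk(flags, PCI_QUIRK_NO_MSI) {
        return IrqMode::Msi;
    }
    IrqMode::Legacy
}

/// Firmware loading becomes a hard requirement when the quirk is present.
pub fn firmware_required(flags: u64) -> bool {
    has_quirk(flags, PCI_QUIRK_NEED_FIRMWARE)
}
```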
### Intel GPU support (T4-T5)
- Intel driver switched to shared `InterruptHandle` (MSI-X + legacy)
- Added `local/config/pcid.d/intel_gpu.toml` for auto-detection (vendor 0x8086, class 0x03)
@@ -1,5 +1,9 @@
# Red Bear OS Phase 0–3 Reassessment
> **DEPRECATED (2026-04-16):** This one-time reconciliation document has been absorbed into the
> updated `CONSOLE-TO-KDE-DESKTOP-PLAN.md` and `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md`. It is
> retained for historical reference only. Do not use it as a current planning source.
## Purpose
This document reconciles the current public execution plan in `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md`
@@ -256,7 +260,11 @@ For Phase 03 work, prefer closing validation gaps and documentation drift bef
The early-phase codebase is in a much better structural state now; the main quality risk is no
longer missing packages, but overstating readiness before runtime evidence exists.
## Phase 4 Handoff Note
## Phase 4 Handoff Note (historical P0–P6 numbering)
> This section uses the old P0–P6 phase numbering. In the v2.0 desktop plan
> (`local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md`), the "Phase 4" below corresponds to Phase 2
> (Wayland Compositor Proof).
Phase 4 should begin from the existing `wayland.toml` profile, not by jumping straight to KWin.
The current repo already contains the `smallvil`, `cosmic-comp`, `qtwayland`, and Mesa software
@@ -20,15 +20,19 @@ USB plan uses:
## Tracked Profiles
> **Phase numbering note:** phase labels below use the v2.0 desktop plan phases from
> `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md`. Scripts and older docs may reference the
> historical P0–P6 hardware-enablement sequence — those are not the same numbering.
| Profile | Intent | Key Fragments | Current support language |
|---|---|---|---|
| `redbear-minimal` | Console + storage + wired-network baseline | `minimal.toml`, `redbear-legacy-base.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / primary validation baseline / DHCP boot profile enabled / input-runtime substrate wired |
| `redbear-bluetooth-experimental` | First bounded Bluetooth validation profile | `redbear-bluetooth-experimental.toml`, `redbear-minimal.toml` | builds / boots in QEMU / validated bounded Battery Level slice via `redbear-bluetooth-battery-check` and `test-bluetooth-qemu.sh --check` / explicit-startup USB BLE-first only / repeated helper + restart cleanup covered / not generic GATT / not USB-class-autospawn |
| `redbear-bluetooth-experimental` | First bounded Bluetooth validation profile | `redbear-bluetooth-experimental.toml`, `redbear-bluetooth-services.toml`, `redbear-minimal.toml` | builds / boots in QEMU / validated bounded Battery Level slice via `redbear-bluetooth-battery-check` and `test-bluetooth-qemu.sh --check` / explicit-startup USB BLE-first only / repeated helper + restart cleanup covered / not generic GATT / not USB-class-autospawn |
| `redbear-wifi-experimental` | First bounded Intel Wi-Fi validation profile | `redbear-wifi-experimental.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / experimental bounded Intel Wi-Fi slice / driver + control/profile/reporting stack present / packaged in-target validation and capture commands available / real hardware connectivity still unproven |
| `redbear-desktop` | Main Red Bear desktop integration profile without KDE-specific session wiring | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / input-runtime substrate wired / runtime reporting installed |
| `redbear-wayland` | Phase 4 Wayland runtime validation profile | `wayland.toml` | builds / boots in QEMU / experimental software-path graphics-runtime slice / not QEMU hardware-acceleration proof |
| `redbear-full` | Phase 5 desktop/network plumbing profile | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-legacy-desktop.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / boots in QEMU / D-Bus system bus wired / experimental runtime path |
| `redbear-kde` | Phase 6 KDE session-surface profile | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-legacy-desktop.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / experimental desktop path / D-Bus+seatd+KWin session surface wired |
| `redbear-wayland` | v2.0 Phase 2 Wayland compositor validation profile | `wayland.toml` | builds / boots in QEMU / experimental software-path graphics-runtime slice / not QEMU hardware-acceleration proof |
| `redbear-full` | Broader desktop/network/session plumbing (spans v2.0 Phases 2–3) | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-legacy-desktop.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / boots in QEMU / D-Bus system bus wired / experimental runtime path |
| `redbear-kde` | v2.0 Phases 3–4 KDE Plasma session-surface profile | `desktop.toml`, `redbear-legacy-base.toml`, `redbear-legacy-desktop.toml`, `redbear-device-services.toml`, `redbear-netctl.toml` | builds / experimental desktop path / D-Bus+seatd+KWin session surface wired |
| `redbear-live` | Live and recovery image layered on desktop | `redbear-desktop.toml` | builds |
## Profile Notes
@@ -37,7 +41,7 @@ USB plan uses:
- First place to validate repository discipline and profile reproducibility.
- Should stay smaller and less assumption-heavy than the graphics profiles.
- Enables the shared `wired-dhcp` netctl profile by default for the Phase 2 VM/wired baseline.
- Enables the shared `wired-dhcp` netctl profile by default for the VM/wired baseline.
- Ships the shared firmware/input runtime service prerequisites so the early substrate can be tested on the smallest profile as well.
### `redbear-bluetooth-experimental`
@@ -73,15 +77,15 @@ USB plan uses:
### `redbear-wayland`
- Wraps the repo's existing `wayland.toml` into a first-class Red Bear build target.
- Serves as the Phase 4 runtime-validation surface for `orbital-wayland` and `smallvil`.
- Serves as the v2.0 Phase 2 compositor validation surface for `orbital-wayland` and `smallvil`.
- Current verified path: QEMU/UEFI boot to login prompt plus guest-side `redbear-phase4-wayland-check`, with `smallvil` reaching xkbcommon initialization and EGL platform selection on Redox.
- Current QEMU renderer evidence is still software-based (`llvmpipe` on the current `-vga std` harness), so this profile must not be described as a hardware-accelerated desktop proof yet.
- Treat this profile as the bounded Phase 4 Wayland/Qt regression harness; the final hardware-desktop claim still belongs to the bare-metal accelerated graphics path.
- Treat this profile as the bounded Wayland/Qt regression harness; the final hardware-desktop claim still belongs to the bare-metal accelerated graphics path.
### `redbear-full`
- Used for broader desktop/session plumbing after the narrower `redbear-wayland` validation slice.
- Current Phase 5 role: carry D-Bus system-bus plumbing together with the native Red Bear network stack.
- Current role: carry D-Bus system-bus plumbing together with the native Red Bear network stack (spans v2.0 Phases 2–3).
- Current verified path: QEMU/UEFI boot to login prompt plus guest-side `redbear-phase5-network-check`, with functional VirtIO networking and `DBUS_SYSTEM_BUS=present`.
- Should not be described as fully supported until runtime validation is evidence-backed.
@@ -89,7 +93,7 @@ USB plan uses:
- Dedicated profile for Plasma/KWin session bring-up.
- Keep KDE-specific service wiring here instead of leaking it into the generic desktop profile.
- Current Phase 6 role: carry the KWin session launch surface and its D-Bus/seatd dependencies in one image.
- Current role: carry the KWin session launch surface and its D-Bus/seatd dependencies in one image (v2.0 Phases 3–4).
### `redbear-live`
@@ -1,5 +1,9 @@
# Project Documentation Assessment
> **DEPRECATED (2026-04-16):** This one-time assessment was completed and its findings applied to the
> documentation set. It is retained for historical reference only. For current documentation status,
> see `docs/README.md` and the document-status matrix there.
## Purpose
This document assesses the current Red Bear OS documentation set after the repository-model and WIP
@@ -1,11 +1,21 @@
# Qt6 Port — Red Bear OS
**Last updated:** 2026-04-14
**Last updated:** 2026-04-16
**Qt version:** 6.11.0
**Target:** x86_64-unknown-redox (cross-compiled from Linux x86_64 host)
**Phase 1 status:** ✅ COMPLETE — Qt6 core stack + OpenGL/EGL + D-Bus + Wayland
**Phase 2 status:** ✅ COMPLETE — All 32 KF6 frameworks built
**Phase 3 status:** 🔄 IN PROGRESS — KWin + KDE Plasma build
> **Phase numbering note:** The phases below (Phase 1–6) are this document's internal Qt porting
> phases, not the canonical desktop plan phases. For the project-wide desktop execution plan
> (Phase 1: Runtime Substrate → Phase 5: Hardware GPU), see
> `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` (v2.0).
**Qt Phase 1 status:** ✅ COMPLETE — Qt6 core stack + OpenGL/EGL + D-Bus + Wayland
**Qt Phase 2 status:** ✅ COMPLETE — All 32 KF6 frameworks built
**Qt Phase 3 status:** 🔄 IN PROGRESS — KWin + KDE Plasma build
> **Execution note (2026-04-16):** The current Redox-applicable Qt 6.11 recipe expansion wave has
> real-cook verification for `qtimageformats`, `qt5compat`, `qttools`, `qttranslations`, and
> `qtshadertools`.
## Current Status Summary
@@ -15,16 +25,36 @@
| **qtdeclarative** | ✅ | 11 libs, QML JIT disabled |
| **qtsvg** | ✅ | 2 libs |
| **qtwayland** | ✅ | Wayland client + compositor |
| **qtimageformats** | ✅ | Real recipe + cook verified |
| **qt5compat** | ✅ | Real recipe + cook verified |
| **qttools** | ✅ | Redox-scoped tooling slice cook verified |
| **qttranslations** | ✅ | Translation catalogs cook verified |
| **qtshadertools** | ✅ | Real recipe + cook verified |
| **Mesa EGL+GBM** | ✅ | libEGL, libgbm, libGLESv2, swrast DRI |
| **libdrm** | ✅ | libdrm + libdrm_amdgpu |
| **libinput** | ✅ | 1.30.2 with comprehensive redox.patch |
| **D-Bus** | ✅ | 1.16.2, libdbus-1.so |
| **KF6 Frameworks** | ✅ 32/32 | All frameworks built |
| **KWin** | 🔄 | Recipe ready, now using real `libxcvt`, but still blocked by remaining shimmed/stubbed deps and incomplete runtime path |
| **Hardware acceleration** | ❌ | Requires kernel DMA-BUF (future work) |
| **Hardware acceleration** | ❌ | PRIME/DMA-BUF scheme ioctls implemented; blocked on GPU command submission (CS ioctl) |
---
## Wave 1 — Redox-applicable Qt 6.11 module expansion
The first post-core Qt 6.11 coverage wave is now real-cook verified:
| Module | Status | Verification |
|--------|--------|--------------|
| `qtimageformats` | ✅ | `CI=1 ./target/release/repo cook qtimageformats` |
| `qt5compat` | ✅ | `CI=1 ./target/release/repo cook qt5compat` |
| `qttools` | ✅ | `CI=1 ./target/release/repo cook qttools` |
| `qttranslations` | ✅ | `CI=1 ./target/release/repo cook qttranslations` |
| `qtshadertools` | ✅ | `CI=1 ./target/release/repo cook qtshadertools` |
This means the repo now has real Qt 6.11 recipes for the first high-yield Redox-applicable
expansion set, all verified by actual `repo cook` runs.
## Scope Definition
**Phase 1 scope**: qtbase, qtdeclarative, qtsvg — the foundational Qt6 stack.
@@ -32,8 +62,8 @@ Qt6 consists of many modules — each is a separate source package. Phase 2 (qtw
follows in the next step.
**User-agreed scope constraints:**
- OpenGL: software/shm only, no EGL — get Qt compiling first
- Disabled features: process, sharedmemory, systemsemaphore, testlib, sql, printsupport
- OpenGL: now enabled (GLES 2.0 software path via Mesa/LLVMpipe); hardware acceleration still future work
- Still-disabled features: process, testlib, sql, and printsupport remain out of scope for the current iteration
- Iterative approach: enable modules incrementally, re-enable disabled features later
## Build Status
@@ -99,32 +129,32 @@ Plus: QML debug plugins, QtQuick/QML modules staged.
### Disabled Modules — Full Blocker Analysis
| Module | Blocker | Root Cause | Re-enable Path |
|--------|---------|------------|----------------|
| QtNetwork | Runtime validation still pending | the relibc header/ioctl surface is now present in-tree, but downstream QtNetwork behavior still needs end-to-end validation on Redox | Validate QtNetwork against the updated relibc networking surface |
| QtOpenGL | No EGL, no GPU driver runtime validation | amdgpu/intel DRM drivers compile but haven't been validated on hardware; no Mesa EGL build | Validate GPU drivers on HW → build Mesa with EGL → enable QtOpenGL |
| QtOpenGLWidgets | Gated by `QT_FEATURE_opengl` | Same as QtOpenGL | Same as QtOpenGL |
| QtDBus | D-Bus IPC system not ported to Redox | No D-Bus daemon or libdbus on Redox | Port libdbus → enable QtDBus |
| QtSql | User-agreed scope exclusion | Not needed for initial GUI | Add sqlite/odbc recipe → enable QtSql |
| QtPrintSupport | User-agreed scope exclusion | No printing subsystem on Redox | Port cups/filters → enable QtPrintSupport |
| Module | Status | Blocker | Re-enable Path |
|--------|--------|---------|----------------|
| QtNetwork | ❌ Disabled | relibc networking runtime semantics still incomplete (DNS resolver, IPv6 multicast) | Validate QtNetwork against the updated relibc networking surface |
| QtSql | ❌ Disabled | User-agreed scope exclusion | Add sqlite/odbc recipe → enable QtSql |
| QtPrintSupport | ❌ Disabled | User-agreed scope exclusion, no printing subsystem on Redox | Port cups/filters → enable QtPrintSupport |
> **Previously disabled, now enabled:** QtOpenGL (✅ Phase 4b), QtOpenGLWidgets (✅ Phase 4b), and
> QtDBus (✅ Phase 2a) were disabled in earlier builds but have since been enabled and built
> successfully. See Phase 4b and Phase 2a sections below for details.
### Disabled Features — Full Blocker Analysis
| Feature | CMake Flag | Blocker | Re-enable Path |
|---------|-----------|---------|----------------|
| OpenGL | `-DFEATURE_opengl=OFF` | No EGL, no GPU runtime validation | Validate GPU drivers → Mesa EGL → enable |
| EGL | `-DFEATURE_egl=OFF` | No libEGL from Mesa | Mesa EGL build → enable |
| XCB/Xlib | `-DFEATURE_xcb=OFF -DFEATURE_xlib=OFF` | No X11 on Redox | Not applicable — Redox uses Wayland |
| Vulkan | `-DFEATURE_vulkan=OFF` | No Vulkan runtime | Port Mesa Vulkan ICD → enable |
| OpenSSL | `-DFEATURE_openssl=OFF` | OpenSSL3 port in WIP but not validated | Validate openssl3 recipe → enable |
| D-Bus | `-DFEATURE_dbus=OFF` | No D-Bus on Redox | Port libdbus → enable |
| Process | `-DFEATURE_process=ON` | relibc now provides a bounded `waitid()` path and qtbase configures, builds, and stages with process support enabled | Validate QProcess on Redox |
| Shared Memory | `-DFEATURE_sharedmemory=ON` | relibc now provides `shm_open()` plus bounded SysV shared-memory surfaces and qtbase configures, builds, and stages with shared memory enabled | Validate QSharedMemory on Redox |
| System Semaphore | `-DFEATURE_systemsemaphore=ON` | relibc now provides `sem_open()`/`sem_close()`/`sem_unlink()` and qtbase configures, builds, and stages with system semaphore support enabled | Validate QSystemSemaphore on Redox |
| qmake | `-DFEATURE_qmake=OFF` | Build tool, not needed with CMake | Enable if downstream needs qmake |
| SQL | `-DFEATURE_sql=OFF` | User-agreed scope exclusion | Add sqlite/odbc → enable |
| Print Support | `-DFEATURE_printsupport=OFF` | User-agreed scope exclusion | Port cups → enable |
| QML JIT | `-DFEATURE_qml_jit=OFF` | Does not compile for Redox | Fix in upstream or disable permanently |
| Feature | CMake Flag | Status | Notes |
|---------|-----------|--------|-------|
| XCB/Xlib | `-DFEATURE_xcb=OFF -DFEATURE_xlib=OFF` | ❌ Disabled | Not applicable — Redox uses Wayland, not X11 |
| Vulkan | `-DFEATURE_vulkan=OFF` | ❌ Disabled | No Vulkan runtime on Redox |
| OpenSSL | `-DFEATURE_openssl=OFF` | ❌ Disabled | OpenSSL3 port in WIP but not validated |
| qmake | `-DFEATURE_qmake=OFF` | ❌ Disabled | Build tool, not needed with CMake |
| SQL | `-DFEATURE_sql=OFF` | ❌ Disabled | User-agreed scope exclusion |
| Print Support | `-DFEATURE_printsupport=OFF` | ❌ Disabled | User-agreed scope exclusion |
| QML JIT | `-DFEATURE_qml_jit=OFF` | ❌ Disabled | Does not compile for Redox |
> **Previously disabled, now enabled:** OpenGL (`-DFEATURE_opengl=ON`), EGL (`-DFEATURE_egl=ON`),
> and D-Bus (`-DFEATURE_dbus=ON`) were disabled in earlier builds but have since been enabled and
> built successfully. Process, shared memory, and system semaphore were also enabled after relibc
> improvements. See respective Phase sections for details.
---
@@ -197,8 +227,8 @@ Plus: QML debug plugins, QtQuick/QML modules staged.
| Gap | Impact | Module Blocked |
|-----|--------|---------------|
| D-Bus IPC | QtDBus, KDE components | QtDBus |
| GPU display validation | Hardware-accelerated rendering | QtOpenGL |
| broader networking runtime validation | QtNetwork end-to-end behavior | QtNetwork |
| GPU hardware display validation | Hardware-accelerated rendering | QtOpenGL hardware path |
| broader shared-memory validation beyond the existing `shm_open()` path | Shared memory | QSharedMemory |
| broader semaphore/system-IPC validation beyond the new `sem_open()` path | POSIX semaphores | QSystemSemaphore |
| process/runtime validation beyond the new bounded `waitid()` path | QProcess internals | QProcess |
@@ -305,8 +335,9 @@ Current truth for Phase 4:
as a bounded regression/test path, not as the final acceleration proof target
- the in-repo Phase 4 runtime check currently still fails in `qt6-bootstrap-check` during early Qt
startup, so even the bounded software-path runtime proof remains incomplete
- true hardware-accelerated desktop readiness still requires kernel DMA-BUF fd passing plus real
- true hardware-accelerated desktop readiness still requires GPU command submission (CS ioctl) plus real
AMD/Intel hardware validation through the DRM → GBM/EGL → compositor → Qt client path
(PRIME/DMA-BUF cross-process buffer sharing is implemented at scheme level)
### Phase 4b — Qt6 OpenGL Enablement (✅ build-side complete, 🚧 runtime incomplete)
@@ -322,11 +353,29 @@ KDE Plasma packages built:
- plasma-wayland-protocols ✅ BUILT (protocol XMLs for kf6-kwayland)
- kdecoration ✅ BUILT (KDecoration3 window decoration library)
KWin recipe updated with dependencies (all KF6 + Mesa + libdrm + libinput + qtwayland):
- All KF6 deps built (kconfigwidgets, kxmlgui, kglobalaccel, kidletime, kio, etc.)
- Mesa EGL+GBM ✅
- libinput ✅
- libdrm ✅
plasma-workspace stub dependencies partially resolved:
- kf6-knewstuff ✅ STUB ONLY (KF6NewStuff cmake INTERFACE IMPORTED targets for plasma-workspace dep resolution)
- kf6-kwallet ✅ STUB ONLY (KF6Wallet cmake INTERFACE IMPORTED targets for plasma-workspace dep resolution)
- kf6-prison ✅ REAL RECIPE (real cmake build against libqrencode; dmtx/ZXing disabled; not yet compiled)
qt6-wayland-smoke improved to create a visible QWindow:
- Creates a 320x240 colored window (red background, "Red Bear OS - Qt6 Wayland Smoke Test" text)
- Uses QBackingStore for software rendering
- Runs for 3 seconds (previously 1 second, no window)
- This turns the smoke test from a bootstrap check into a real Wayland surface proof target
KWin recipe updated — features re-enabled where deps are satisfied:
- KWIN_BUILD_DECORATIONS=ON (kdecoration builds ✅)
- KWIN_BUILD_GLOBALSHORTCUTS=ON (kglobalaccel builds ✅)
- KWIN_BUILD_RUNNERS=ON (kf6-kio builds ✅)
- KWIN_BUILD_NOTIFICATIONS=ON (knotifications builds ✅)
- USE_DBUS=ON (D-Bus 1.16.2 builds ✅)
- Still disabled (9): KCMS, screen locking, tabbox, effects, X11, QML, running-in-kde,
signing docs, screenlocker
- Stub deps remaining: libepoxy-stub, libudev-stub, lcms2-stub, libdisplay-info-stub
New dependency library:
- libqrencode 4.1.1 ✅ BUILT (QR code encoder, dependency of kf6-prison)
- kf6-kwayland ✅
- seatd builds separately (runtime dependency, not needed for compilation)
@@ -357,7 +406,8 @@ Phase 1 ✅ (qtbase + qtdeclarative + qtsvg)
validated more broadly. QML network access is also affected.
3. **No GPU hardware acceleration** — Qt6 OpenGL/EGL and Mesa EGL+GBM now build, but they are still validated only on the software/LLVMpipe path.
True hardware acceleration (radeonsi or equivalent) still requires kernel DMA-BUF fd passing and real hardware validation.
True hardware acceleration (radeonsi or equivalent) still requires GPU command submission and real hardware validation.
PRIME/DMA-BUF cross-process buffer sharing is implemented at the scheme level.
4. **relibc / graphics surface still incomplete for runtime** — the build-side `open_memstream` and Wayland-facing header export path now work,
but DMA-BUF ioctls, sync objects, and broader graphics runtime validation are still unavailable.
@@ -370,7 +420,11 @@ Phase 1 ✅ (qtbase + qtdeclarative + qtsvg)
The Qt6/KF6 build stack is substantially further along than the earlier "~50%" estimate implied:
- Qt6, QtWayland, Mesa EGL+GBM, Qt6 OpenGL, libdrm amdgpu, and all 32 KF6 frameworks now build
- the remaining blockers are concentrated in KWin/Plasma runtime integration and in the still-shimmed or stub-only packages such as Kirigami, libepoxy, libudev, lcms2, and libdisplay-info
- hardware acceleration still requires kernel DMA-BUF work and real hardware validation
- hardware acceleration still requires GPU command submission and real hardware validation (PRIME/DMA-BUF buffer sharing is implemented)
- a successful build stack is not yet the same thing as a working KDE Plasma session
(Updated 2026-04-14 — status reconciled after relibc/libwayland bridge fixes; build-side progress is real, runtime remains incomplete)
For the canonical execution plan from this state to a working KDE Plasma desktop, see
`local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` (v2.0). The Qt work described here maps to
pre-Phase work (builds complete) and Phase 3 (KWin desktop session) in the canonical plan.
(Updated 2026-04-16 — aligned with CONSOLE-TO-KDE-DESKTOP-PLAN.md v2.0)
+303
@@ -0,0 +1,303 @@
# Hardware Quirks Improvement Plan
## Purpose
This plan replaces vague “quirks support” follow-up work with a concrete path to:
1. keep quirks data and reporting honest,
2. integrate quirks into real runtime driver behavior,
3. reduce duplicated quirk logic,
4. leave DMI and USB device quirks in a maintainable state.
## Current status snapshot
Completed from this plan:
- runtime DMI TOML loading in `redox-driver-sys`,
- subsystem-gated PCI TOML matching in both the canonical path and `pcid-spawner`,
- shipped DMI TOML overrides in the brokered `pcid-spawner` env-var path,
- direct canonical `redox-driver-sys` quirk lookup from `pcid-spawner` instead of a separate in-tree PCI quirk engine,
- real USB device quirk consumption in `xhcid`,
- first real linux-kpi quirk consumption in the Red Bear amdgpu path.
Still open after this implementation wave:
- no remaining implementation items in the current quirks scope.
The runtime-behavior milestone from this plan is now implemented. The remaining work is
maintenance, validation depth, and future refinement rather than missing quirks behavior for the
shipped paths.
It is based on the current in-tree state of:
- `redox-driver-sys` as the canonical quirks library,
- `pcid-spawner` as an upstream-owned PCI launch broker that now brokers canonical quirks,
- `redox-drm`, `xhcid`, and the amdgpu Redox glue/runtime path as real runtime PCI quirk consumers,
- `lspci`, `lsusb`, and `redbear-info` as reporting surfaces.
## Reassessment Summary
### What is real today
- `redox-driver-sys` owns the canonical PCI/USB quirk flag definitions and lookup helpers.
- `redox-drm` consumes PCI quirks for interrupt fallback and `DISABLE_ACCEL`.
- `xhcid` consumes PCI controller quirks via `PCI_QUIRK_FLAGS` for IRQ mode selection and reset delay.
- `linux-kpi` exposes `pci_get_quirk_flags()` / `pci_has_quirk()` for C drivers, and amdgpu now consumes them in its Redox init path.
- `lspci` and `lsusb` surface active PCI/USB quirk flags for discovered devices.
- `redbear-info --quirks` reports configured TOML entries and DMI rule counts.
### What is still weak
- USB quirks now have a first real runtime consumer in `xhcid`, but broader USB-driver adoption is still missing.
- The `linux-kpi` bridge now has a first real in-tree C consumer: amdgpu uses it for firmware gating and quirk-aware IRQ expectation logging. Broader C-driver adoption is still missing.
- `pcid-spawner` still synthesizes a partial `PciDeviceInfo` instead of reusing a richer canonical PCI object, because it operates as an upstream-owned broker with a narrow interface.
### What should not be “fixed” in the wrong layer
- `firmware-loader` should stay a generic scheme service. `NEED_FIRMWARE` belongs in device driver policy, not in the firmware scheme daemon.
- `redbear-info` should describe configured and observable state; it should not pretend to prove runtime quirk application.
## Target Architecture
### Upstream-preference policy
When upstream Redox already provides the same functionality, the upstream path wins by default
unless the Red Bear-local implementation is materially better. For quirks and driver support,
this means the canonical path should converge on `redox-driver-sys` instead of preserving
lower-quality duplicate quirk engines as a steady state.
### Canonical rule
`redox-driver-sys` remains the authoritative quirks model:
- flag definitions,
- compiled-in tables,
- TOML parsing semantics,
- DMI matching behavior.
All other code should either:
1. call the canonical lookup directly, or
2. receive lookup results from a single broker that is guaranteed to use the same semantics.
### Driver integration rule
- **Rust PCI drivers using `redox-driver-sys`** should call `info.quirks()` directly.
- **C drivers using `linux-kpi`** should call `pci_has_quirk()` / `pci_get_quirk_flags()` directly in probe/init paths.
- **Upstream base drivers that cannot depend on `redox-driver-sys`** may continue using brokered quirk bits from `pcid-spawner`, but only if that broker is made semantically identical to the canonical library.
- **USB device quirks** should be consumed inside `xhcid` device enumeration/configuration logic, not only in tooling.
## Concrete Work Plan
### Wave 1 — Cleanup and truthfulness
#### Task 1.1: Keep docs and reporting surfaces honest
Scope:
- `local/docs/QUIRKS-SYSTEM.md`
- `local/recipes/system/redbear-info/source/src/main.rs`
- related AGENTS references if needed
Goals:
- separate reporting surfaces from real runtime consumers,
- remove claims that imply driver integration where only tooling exists,
- keep “not yet implemented” items explicit.
QA:
- `cargo test` in `local/recipes/system/redbear-info/source`
- review `redbear-info --help` text and `--quirks` output strings
#### Task 1.2: Remove stale equivalence claims from extraction/documentation
Scope:
- `local/scripts/extract-linux-quirks.py`
- `local/docs/QUIRKS-SYSTEM.md`
Goals:
- avoid mapping Linux flags to incorrect Red Bear flags,
- clearly mark heuristic extraction limits for PCI handler-name mode.
QA:
- run the script on a small synthetic USB/PCI input sample,
- confirm output omits unsupported PCI flag mappings instead of inventing equivalents.
### Wave 2 — Unify PCI quirk semantics
#### Task 2.1: Eliminate semantic drift between `pcid-spawner` and `redox-driver-sys`
Constraint:
- `pcid-spawner` is upstream-owned base code, so any convergence work must be implemented as upstream-base changes carried by Red Bear patching until upstream absorbs them.
Best approach:
- make `pcid-spawner` consume generated/shared quirk data instead of hand-maintained duplicated tables and flag maps.
Preferred implementation options, in order:
1. **Shared generated data module** used by both `redox-driver-sys` and `pcid-spawner`.
2. **Protocol extension** where a single canonical broker calculates quirk bits and hands them to drivers.
3. Keep duplication only as a short-term fallback if generation is not yet practical.
Do **not** continue manually editing two separate PCI quirk engines long-term.
Success criteria:
- one authoritative source for compiled PCI quirk entries and flag name mapping,
- subsystem matching behavior aligned,
- explicit decision on whether DMI is brokered by `pcid-spawner` or left to driver-local lookup.
QA:
- compare quirk outputs for the same synthetic PCI info through both paths,
- verify `PCI_QUIRK_FLAGS` emitted by `pcid-spawner` matches canonical lookup for representative devices.
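The comparison in this QA step could be automated roughly as follows. This is a minimal sketch, not the project's test code: `find_drift`, `Drift`, and the two closure parameters are hypothetical stand-ins for the canonical `redox-driver-sys` lookup and for whatever `pcid-spawner` emits as `PCI_QUIRK_FLAGS`, and a device is reduced to a vendor/device pair.

```rust
/// A mismatch between the two lookup paths for one synthetic device.
#[derive(Debug, PartialEq)]
struct Drift {
    vendor: u16,
    device: u16,
    canonical: u64,
    brokered: u64,
}

/// Run both lookups over the same samples and report every disagreement.
/// `canonical` and `brokered` are placeholders for the real lookup paths.
fn find_drift<A, B>(samples: &[(u16, u16)], canonical: A, brokered: B) -> Vec<Drift>
where
    A: Fn(u16, u16) -> u64,
    B: Fn(u16, u16) -> u64,
{
    samples
        .iter()
        .filter_map(|&(vendor, device)| {
            let c = canonical(vendor, device);
            let b = brokered(vendor, device);
            // Keep only the devices where the two paths disagree.
            (c != b).then(|| Drift { vendor, device, canonical: c, brokered: b })
        })
        .collect()
}
```

An empty result for a representative sample set is the property the success criteria ask for; any returned `Drift` names the device and both bitmasks for triage.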
#### Task 2.2: Decide DMI ownership clearly
Decision needed:
- either `pcid-spawner` becomes DMI-aware and brokers the final PCI quirk bitmask,
- or `pcid-spawner` remains PCI/TOML-only and DMI stays driver-local in `redox-driver-sys` consumers.
Recommendation:
- near term: document the split clearly,
- medium term: move toward one brokered result for upstream base drivers.
QA:
- one design note added to the docs explaining the chosen ownership model.
### Wave 3 — Real driver integration
#### Task 3.1: Integrate USB device quirks in `xhcid`
Best integration points:
- after device descriptor read,
- before SetConfiguration,
- before enabling LPM/U1/U2 or USB3-specific behavior,
- after reset paths where extra delay or reset-after-probe is needed.
Minimum runtime behaviors to wire first:
- `NO_SET_CONFIG`
- `NEED_RESET`
- `NO_LPM`
- `NO_U1U2`
- `BAD_DESCRIPTOR`
Success criteria:
- `xhcid` calls `lookup_usb_quirks()` for enumerated devices,
- these flags alter runtime behavior in concrete branches,
- tooling and runtime logic agree on the same device-level quirks.
QA:
- unit/integration tests for selector logic where possible,
- manual logging proof that a known vendor/product entry triggers the expected path.
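One way to picture "flags alter runtime behavior in concrete branches" is a function that turns a quirk bitmask into an enumeration plan. The sketch below is an assumption-heavy illustration: the bit values, the `Step` enum, and `enumeration_plan` are all hypothetical and do not reflect the actual `xhcid` code or the real `UsbQuirkFlags` layout.

```rust
// Illustrative bit values only; the real UsbQuirkFlags layout is not given here.
const NO_SET_CONFIG: u32 = 1 << 0;
const NEED_RESET: u32 = 1 << 1;
const NO_LPM: u32 = 1 << 2;
const NO_U1U2: u32 = 1 << 3;
const BAD_DESCRIPTOR: u32 = 1 << 4;

/// Hypothetical enumeration steps, in the order the integration points are listed.
#[derive(Debug, PartialEq)]
enum Step {
    ReadDescriptor { tolerate_bad_lengths: bool },
    SetConfiguration,
    EnableLpm,
    EnableU1U2,
    ResetAfterProbe,
}

/// Build the list of steps to run for a device with the given quirk bits.
fn enumeration_plan(quirks: u32) -> Vec<Step> {
    let has = |flag: u32| quirks & flag != 0;
    let mut plan = vec![Step::ReadDescriptor {
        tolerate_bad_lengths: has(BAD_DESCRIPTOR),
    }];
    if !has(NO_SET_CONFIG) {
        plan.push(Step::SetConfiguration);
    }
    if !has(NO_LPM) {
        plan.push(Step::EnableLpm);
    }
    if !has(NO_U1U2) {
        plan.push(Step::EnableU1U2);
    }
    if has(NEED_RESET) {
        plan.push(Step::ResetAfterProbe);
    }
    plan
}
```

Keeping the decision in one pure function makes the selector logic unit-testable without hardware, which matches the first QA bullet above.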
#### Task 3.2: Consume linux-kpi quirks in `amdgpu`
Best integration points:
- probe path,
- IRQ mode selection,
- firmware gating,
- memory/power-management setup.
First flags to consume:
- `NO_MSI`
- `NO_MSIX`
- `NEED_FIRMWARE`
- `NO_ASPM`
- `NEED_IOMMU`
Success criteria:
- at least one real C driver uses `pci_has_quirk()` in production code,
- runtime logs show quirk-informed decision making.
Current state:
- `local/recipes/gpu/amdgpu/source/amdgpu_redox_main.c` now queries linux-kpi PCI quirks in the real Redox runtime path,
- `PCI_QUIRK_NEED_FIRMWARE` turns missing DMCUB firmware into an init failure instead of a warning-only fallback,
- logs now show the active quirk bitmask plus the implied IRQ fallback policy.
QA:
- `grep` shows real in-tree call sites in amdgpu,
- build passes for linux-kpi + amdgpu recipe path.
#### Task 3.3: Keep firmware policy in drivers, not firmware-loader
Action:
- when a driver has `NEED_FIRMWARE`, the driver should gate initialization until the firmware load succeeds.
- `firmware-loader` remains a transport/provider only.
Success criteria:
- docs stop implying that firmware-loader interprets quirk flags,
- driver init paths own the policy decision.
QA:
- driver code path shows firmware gating tied to quirks or explicit device rules.
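The policy split described in this task can be sketched as a small decision function that lives in the driver, not in `firmware-loader`. Everything named here (`gate_on_firmware`, `InitOutcome`, `InitError`, and the bit value of `NEED_FIRMWARE`) is a hypothetical illustration, not the amdgpu code path.

```rust
// Illustrative bit value; the canonical definition lives in redox-driver-sys.
const NEED_FIRMWARE: u64 = 1 << 11;

#[derive(Debug, PartialEq)]
enum InitOutcome {
    Ready,
    /// Firmware was missing, but no quirk made it mandatory.
    ReadyWithoutFirmware,
}

#[derive(Debug, PartialEq)]
enum InitError {
    FirmwareRequired,
}

/// Driver-side policy: the firmware service only reports whether the load
/// succeeded; the driver decides whether a failure is fatal.
fn gate_on_firmware(quirks: u64, firmware_loaded: bool) -> Result<InitOutcome, InitError> {
    let required = quirks & NEED_FIRMWARE != 0;
    match (firmware_loaded, required) {
        (true, _) => Ok(InitOutcome::Ready),
        (false, true) => Err(InitError::FirmwareRequired),
        (false, false) => Ok(InitOutcome::ReadyWithoutFirmware),
    }
}
```

The `(false, true)` arm corresponds to the behavior described for amdgpu, where the quirk turns a missing firmware image into an init failure instead of a warning.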
### Wave 4 — DMI completion
#### Task 4.1: DMI TOML runtime loading
Scope:
- `toml_loader.rs` parses `[[dmi_system_quirk]]`,
- matching uses live DMI info served by `acpid` at `/scheme/acpi/dmi`,
- resulting PCI quirk overrides flow through the canonical `redox-driver-sys` DMI path.
Success criteria:
- `50-system.toml` entries are no longer config-only,
- runtime DMI TOML behavior is testable and documented through the live `acpid` DMI scheme.
QA:
- tests for TOML parsing,
- one mock DMI input path proving a TOML DMI rule applies flags.
#### Task 4.2: ACPI blacklist/override layer
Current state:
- `acpid` now supports narrow `[[acpi_table_quirk]]` skip rules, optionally gated by the same
DMI-style `match.*` fields used elsewhere.
- The implementation is intentionally limited to table suppression during ACPI table load; it is
not a broad AML patching or firmware replacement framework.
## Suggested Immediate Deliverables
If work resumes right away, the next concrete implementation sequence should be:
1. clean remaining stale quirks docs/reporting text,
2. write a design note for canonical PCI quirk ownership,
3. integrate `lookup_usb_quirks()` into `xhcid` enumeration/configuration,
4. add first real `pci_has_quirk()` use in `amdgpu`,
5. validate and extend shipped DMI TOML coverage as needed.
## Exit Criteria For The Next Quirks Milestone
The next milestone is complete when all are true:
- `pcid-spawner` and `redox-driver-sys` no longer drift semantically,
- `xhcid` consumes USB device quirks at runtime,
- at least one real C driver consumes linux-kpi quirks,
- docs distinguish clearly between reporting, infrastructure, and true runtime behavior,
- DMI TOML entries are either runtime-applied or removed from shipped config.
+339
@@ -0,0 +1,339 @@
# Red Bear OS Hardware Quirks System
## Overview
Red Bear OS implements a data-driven hardware quirks system inspired by Linux's
PCI/USB/DMI quirk infrastructure, adapted for Redox's microkernel/userspace-driver
architecture.
Quirks handle known hardware defects that cannot be fixed by correct driver code
alone. They override default driver behavior for specific devices, revisions, or
entire system models.
For the current follow-up cleanup and integration roadmap, see
`local/docs/QUIRKS-IMPROVEMENT-PLAN.md`.
## Architecture
```
Driver probes device
  └─ PciDeviceInfo::quirks()
       ├─ Layer 1: Compiled-in table (pci_table.rs, usb_table.rs)
       ├─ Layer 2: TOML files from /etc/quirks.d/*.toml
       ├─ Layer 3: DMI-based system rules
       └─ Returns: PciQuirkFlags (bitwise OR of all matching entries)
```
All matching entries accumulate via bitwise OR, so broad rules (e.g., "all AMD GPUs
need firmware") and narrow rules (e.g., "this specific revision has broken MSI-X")
compose naturally.
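The accumulation rule can be illustrated with a few lines of plain Rust. This is a sketch under stated assumptions: `Rule`, `accumulate`, and the bit values are hypothetical stand-ins and are not the `redox-driver-sys` types.

```rust
// Illustrative bit values; not taken from the real PciQuirkFlags definition.
const NEED_FIRMWARE: u64 = 1 << 0;
const NO_MSIX: u64 = 1 << 1;
const NO_ASPM: u64 = 1 << 2;

/// One rule from any layer (compiled-in, TOML, or DMI).
struct Rule {
    vendor: u16,
    /// `None` matches every device of the vendor (a broad rule).
    device: Option<u16>,
    flags: u64,
}

/// OR together the flags of every rule, in every layer, that matches the device.
fn accumulate(layers: &[&[Rule]], vendor: u16, device: u16) -> u64 {
    layers
        .iter()
        .flat_map(|layer| layer.iter())
        .filter(|r| r.vendor == vendor && r.device.map_or(true, |d| d == device))
        .fold(0, |acc, r| acc | r.flags)
}
```

Because OR is commutative and idempotent, the result does not depend on layer order and a flag set twice is harmless. The trade-off is that a later layer cannot clear a flag set by an earlier one.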
## Quirk Sources
### 1. Compiled-in Tables
Location: `local/recipes/drivers/redox-driver-sys/source/src/quirks/`
Critical quirks that must be available before the root filesystem is mounted.
Defined as `const` arrays in Rust:
- `pci_table.rs`: `PCI_QUIRK_TABLE: &[PciQuirkEntry]`
- `usb_table.rs`: `USB_QUIRK_TABLE: &[UsbQuirkEntry]`
Each entry specifies:
- Vendor/device/subsystem match fields (0xFFFF = wildcard)
- Revision range (lo..hi inclusive)
- Class code mask and match value
- `PciQuirkFlags` bitmask
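The match fields listed above suggest a predicate of roughly the following shape. The field names, the `Dev` struct, and the 24-bit class-code layout are assumptions made for illustration; the real `PciQuirkEntry` may differ.

```rust
/// Wildcard value for ID fields, as described above.
const ANY: u16 = 0xFFFF;

#[derive(Clone, Copy)]
struct Entry {
    vendor: u16,
    device: u16,
    subvendor: u16,
    subdevice: u16,
    rev_lo: u8,
    rev_hi: u8, // inclusive upper bound
    class_mask: u32,
    class_match: u32,
    flags: u64,
}

struct Dev {
    vendor: u16,
    device: u16,
    subvendor: u16,
    subdevice: u16,
    revision: u8,
    class: u32,
}

fn id_matches(want: u16, have: u16) -> bool {
    want == ANY || want == have
}

/// True when every selector in the entry accepts the device.
fn entry_matches(e: &Entry, d: &Dev) -> bool {
    id_matches(e.vendor, d.vendor)
        && id_matches(e.device, d.device)
        && id_matches(e.subvendor, d.subvendor)
        && id_matches(e.subdevice, d.subdevice)
        && (e.rev_lo..=e.rev_hi).contains(&d.revision)
        && (d.class & e.class_mask) == e.class_match
}
```

In this sketch a `class_mask` of `0` with a `class_match` of `0` accepts every class, and a revision range of `0x00..=0xFF` accepts every revision, so an entry that only names a vendor stays broad.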
### 2. TOML Quirk Files
Location: `/etc/quirks.d/*.toml` (shipped by the `redbear-quirks` package)
Extensible at runtime without recompiling drivers. Format:
```toml
[[pci_quirk]]
vendor = 0x1002
device = 0x73BF
flags = ["need_firmware", "no_d3cold"]
[[pci_quirk]]
vendor = 0x10EC
device = 0x8125
flags = ["no_aspm"]
[[usb_quirk]]
vendor = 0x0A12
flags = ["bad_descriptor", "no_set_config"]
```
Files are loaded alphabetically from `/etc/quirks.d/`. Recommended naming:
`00-core.toml`, `10-gpu.toml`, `20-usb.toml`, `30-net.toml`, `40-storage.toml`,
`50-system.toml`.
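The alphabetical load order can be reproduced with the standard library alone. The sketch below only collects and orders the file paths; `quirk_files` is a hypothetical helper, and whether the real loader sorts in exactly this way is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return every `*.toml` file in `dir`, sorted by path so that numeric
/// prefixes such as `00-` and `10-` decide the load order.
fn quirk_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files: Vec<PathBuf> = fs::read_dir(dir)?
        .filter_map(|entry| entry.ok()) // skip unreadable directory entries
        .map(|entry| entry.path())
        .filter(|path| path.extension().map_or(false, |ext| ext == "toml"))
        .collect();
    files.sort();
    Ok(files)
}
```

Since matching entries accumulate via bitwise OR, the order mainly matters for predictability and log readability, not for the final flag set.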
Runtime TOML loading now also supports `[[dmi_system_quirk]]` entries. Those
entries are applied when `acpid` is running and serving live DMI data from
`/scheme/acpi/dmi`.
### 3. DMI-Based System Quirks
Match by SMBIOS fields (sys_vendor, board_name, product_name) to apply
system-wide quirk overrides. Eight compiled-in rules exist for known systems,
and `/etc/quirks.d/*.toml` can now add `[[dmi_system_quirk]]` rules with
`match.*` keys plus optional `pci_vendor` / `pci_device` selectors. Runtime use
now reads live SMBIOS strings from `acpid` via `/scheme/acpi/dmi`.
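A DMI rule of this kind can be modeled as a set of optional field constraints that must all hold. The sketch below uses hypothetical `DmiInfo` and `DmiRule` types and assumes exact string equality; the real matcher in `dmi.rs` may compare differently (for example by substring).

```rust
/// Live SMBIOS strings; in the real system these come from `/scheme/acpi/dmi`.
struct DmiInfo<'a> {
    sys_vendor: &'a str,
    board_name: &'a str,
    product_name: &'a str,
}

/// One rule; `None` means the field is not constrained.
#[derive(Default)]
struct DmiRule<'a> {
    sys_vendor: Option<&'a str>,
    board_name: Option<&'a str>,
    product_name: Option<&'a str>,
}

fn field_matches(want: Option<&str>, have: &str) -> bool {
    want.map_or(true, |w| w == have)
}

/// A rule applies only if every constrained field matches.
fn dmi_rule_matches(rule: &DmiRule, info: &DmiInfo) -> bool {
    field_matches(rule.sys_vendor, info.sys_vendor)
        && field_matches(rule.board_name, info.board_name)
        && field_matches(rule.product_name, info.product_name)
}
```

Note that in this sketch a rule with no constraints matches every system, so a loader built on it would probably want to reject empty `match.*` sections.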
## Available Quirk Flags
### PCI Quirks (PciQuirkFlags)
| Flag | Meaning |
|------|---------|
| `NO_MSI` | MSI capability broken; use MSI-X or legacy |
| `NO_MSIX` | MSI-X capability broken; use MSI or legacy |
| `FORCE_LEGACY_IRQ` | Must use INTx interrupts |
| `NO_PM` | Disable all power management |
| `NO_D3COLD` | Cannot recover from D3cold power state |
| `NO_ASPM` | Active State Power Management broken |
| `NEED_IOMMU` | Requires IOMMU isolation |
| `NO_IOMMU` | Must NOT be behind IOMMU |
| `DMA_32BIT_ONLY` | Only supports 32-bit DMA |
| `RESIZE_BAR` | BAR sizing reports incorrectly |
| `DISABLE_BAR_SIZING` | Use firmware BAR values as-is |
| `NEED_FIRMWARE` | Requires firmware files to initialize |
| `DISABLE_ACCEL` | Disable hardware acceleration |
| `FORCE_VRAM_ONLY` | No GTT/system memory fallback |
| `NO_USB3` | Force USB 2.0 mode |
| `RESET_DELAY_MS` | Needs extra post-reset delay |
| `NO_STRING_FETCH` | Do not fetch string descriptors |
| `BAD_EEPROM` | EEPROM unreliable; use hardcoded values |
| `BUS_MASTER_DELAY` | Needs delay after bus-master enable |
| `WRONG_CLASS` | Reports incorrect class code |
| `BROKEN_BRIDGE` | PCI bridge forwarding bug |
| `NO_RESOURCE_RELOC` | Do not relocate PCI resources |
### USB Quirks (UsbQuirkFlags)
| Flag | Meaning |
|------|---------|
| `NO_STRING_FETCH` | Do not fetch string descriptors |
| `RESET_DELAY` | Needs extra reset delay |
| `NO_USB3` | Disable USB 3.x |
| `NO_SET_CONFIG` | Cannot handle SetConfiguration |
| `NO_SUSPEND` | Broken suspend/resume |
| `NEED_RESET` | Needs reset after probe |
| `BAD_DESCRIPTOR` | Wrong descriptor sizes |
| `NO_LPM` | Disable Link Power Management |
| `NO_U1U2` | Disable U1/U2 link transitions |
## Driver Integration
### For Rust Drivers (using redox-driver-sys)
```rust
use redox_driver_sys::quirks::PciQuirkFlags;
fn probe(info: &PciDeviceInfo) -> Result<(), DriverError> {
    let quirks = info.quirks();
    if quirks.contains(PciQuirkFlags::NO_MSIX) {
        // Skip MSI-X, try MSI or legacy
    }
    if quirks.contains(PciQuirkFlags::NEED_FIRMWARE) {
        // Load firmware before initializing device
    }
    if quirks.contains(PciQuirkFlags::DISABLE_ACCEL) {
        // Skip hardware probe, let software renderer take over
        return Err(DriverError::QuirkDisabled);
    }
    Ok(())
}
```
### For C Drivers (using linux-kpi)
The `linux-kpi` crate exposes two FFI functions for C drivers to query quirks:
```c
#include <linux/pci.h>
// After pci_enable_device() in your probe callback:
static int my_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
    u64 quirks = pci_get_quirk_flags(dev);
    if (quirks & PCI_QUIRK_NO_MSIX) {
        // Skip MSI-X, fall back to MSI or legacy IRQ
    }
    if (pci_has_quirk(dev, PCI_QUIRK_NEED_FIRMWARE)) {
        // Load firmware before initializing hardware
    }
    return 0;
}
```
The amdgpu Redox glue/runtime path is now the first in-tree production C consumer
of this interface: it queries `pci_get_quirk_flags()` during AMD DC init, logs the
resulting IRQ expectations, and treats `PCI_QUIRK_NEED_FIRMWARE` as a hard failure
instead of a warn-and-continue path when that quirk is active.
Available C quirk flag macros (defined in `linux/pci.h`):
| Macro | Bit | Meaning |
|-------|-----|---------|
| `PCI_QUIRK_NO_MSI` | 0 | MSI interrupts broken |
| `PCI_QUIRK_NO_MSIX` | 1 | MSI-X interrupts broken |
| `PCI_QUIRK_FORCE_LEGACY` | 2 | Must use legacy INTx |
| `PCI_QUIRK_NO_PM` | 3 | Power management broken |
| `PCI_QUIRK_NO_D3COLD` | 4 | D3cold state broken |
| `PCI_QUIRK_NO_ASPM` | 5 | ASPM broken |
| `PCI_QUIRK_NEED_IOMMU` | 6 | Requires IOMMU |
| `PCI_QUIRK_DMA_32BIT_ONLY` | 8 | DMA limited to 32-bit |
| `PCI_QUIRK_NEED_FIRMWARE` | 11 | Requires firmware load |
| `PCI_QUIRK_DISABLE_ACCEL` | 12 | Disable hardware acceleration |
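When reading a raw `PCI_QUIRK_FLAGS` value from a log, the bit positions in the table can be turned back into names. The following is a small decoding sketch: `C_QUIRK_BITS` and `quirk_names` are hypothetical helpers, and the sketch assumes each macro equals `1 << bit` for the bit listed in the table.

```rust
/// (bit position, macro name) pairs copied from the table above.
const C_QUIRK_BITS: &[(u32, &str)] = &[
    (0, "PCI_QUIRK_NO_MSI"),
    (1, "PCI_QUIRK_NO_MSIX"),
    (2, "PCI_QUIRK_FORCE_LEGACY"),
    (3, "PCI_QUIRK_NO_PM"),
    (4, "PCI_QUIRK_NO_D3COLD"),
    (5, "PCI_QUIRK_NO_ASPM"),
    (6, "PCI_QUIRK_NEED_IOMMU"),
    (8, "PCI_QUIRK_DMA_32BIT_ONLY"),
    (11, "PCI_QUIRK_NEED_FIRMWARE"),
    (12, "PCI_QUIRK_DISABLE_ACCEL"),
];

/// Names of all table entries whose bit is set in `flags`, in table order.
fn quirk_names(flags: u64) -> Vec<&'static str> {
    C_QUIRK_BITS
        .iter()
        .filter(|&&(bit, _)| (flags & (1u64 << bit)) != 0)
        .map(|&(_, name)| name)
        .collect()
}
```

Bits that the table does not list (7, 9, and 10, for example) are ignored by this helper, so an empty result does not prove that the bitmask itself is zero.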
## Adding New Quirks
### To the compiled-in table
Edit `local/recipes/drivers/redox-driver-sys/source/src/quirks/pci_table.rs`:
```rust
const F_MY_FLAGS: PciQuirkFlags = PciQuirkFlags::from_bits_truncate(
PciQuirkFlags::NEED_FIRMWARE.bits() | PciQuirkFlags::NO_ASPM.bits(),
);
PciQuirkEntry {
vendor: 0xVENDOR,
device: 0xDEVICE,
flags: F_MY_FLAGS,
..PciQuirkEntry::WILDCARD
},
```
### To a TOML file
Create or edit a file in `local/recipes/system/redbear-quirks/source/quirks.d/`:
```toml
[[pci_quirk]]
vendor = 0xVENDOR
device = 0xDEVICE
flags = ["need_firmware", "no_aspm"]
[[dmi_system_quirk]]
pci_vendor = 0xVENDOR
flags = ["disable_accel"]
match.sys_vendor = "Example Vendor"
match.product_name = "Example Model"
[[acpi_table_quirk]]
signature = "DMAR"
match.sys_vendor = "Example Vendor"
match.product_name = "Example Model"
```
### Choosing where to add
- **Compiled-in**: Boot-critical quirks, anything needed before root mount
- **TOML**: Everything else — easier to update, no recompilation needed
- **DMI rule**: System-specific workarounds that apply to specific laptop models
## File Layout
```
local/recipes/drivers/redox-driver-sys/source/src/quirks/
├── mod.rs # Public API: lookup_pci_quirks(), PciQuirkFlags, PciQuirkEntry
├── pci_table.rs # Compiled-in PCI quirk table
├── usb_table.rs # Compiled-in USB quirk table
├── dmi.rs # DMI/SMBIOS matching and system-level quirk rules
└── toml_loader.rs # /etc/quirks.d/*.toml parser
local/recipes/system/redbear-quirks/
├── recipe.toml # Custom build: copies TOML files to /etc/quirks.d/
└── source/quirks.d/
    ├── 00-core.toml
    ├── 10-gpu.toml
    ├── 20-usb.toml
    ├── 30-net.toml
    ├── 40-storage.toml
    └── 50-system.toml
```
## Relationship to Linux Quirks
| Linux Pattern | Red Bear Equivalent |
|---------------|-------------------|
| `DECLARE_PCI_FIXUP_HEADER(v, d, fn)` | `PciQuirkEntry { vendor: v, device: d, flags: ... }` |
| `pci_dev->dev_flags \|= PCI_DEV_FLAGS_NO_BUS_RESET` | No direct equivalent — future flag candidate |
| `USB_QUIRK_STRING_FETCH` | `UsbQuirkFlags::NO_STRING_FETCH` |
| `DMI_MATCH(DMI_SYS_VENDOR, "Lenovo")` | `DmiMatchRule { sys_vendor: Some("Lenovo") }` |
| `acpi_black_listed()` | `[[acpi_table_quirk]] signature = "...."` with skip semantics in `acpid` |
## Testing
Run quirks unit tests:
```bash
cd local/recipes/drivers/redox-driver-sys/source
cargo test
```
## Implementation Status
| Phase | Component | Status |
|-------|-----------|--------|
| Q1 | Core types (PciQuirkFlags, PciQuirkEntry, UsbQuirkFlags) | ✅ Done |
| Q1 | Compiled-in PCI/USB quirk tables | ✅ Done |
| Q1 | Lookup API (quirks(), has_quirk()) | ✅ Done |
| Q1 | Subsystem (subvendor/subdevice) fields | ✅ Done — compiled and TOML PCI matching both apply subsystem selectors |
| Q2 | TOML loader for /etc/quirks.d/ | ✅ Done |
| Q2 | redbear-quirks data package | ✅ Done |
| Q3 | redox-drm integration (MSI-X/MSI/legacy + DISABLE_ACCEL) | ✅ Done |
| Q3 | xhcid PCI controller quirks (interrupt + reset delay) | ✅ Done |
| Q3 | xhcid USB device quirks (descriptor/configuration/BOS handling) | ✅ Done |
| Q3 | pcid-spawner quirk passthrough | ✅ Done |
| Q3 | linux-kpi quirk flag bridge | ✅ Done |
| Q3 | amdgpu linux-kpi quirk consumption | ✅ Done |
| Q3 | redbear-info --quirks display | ✅ Done |
| Q4 | DMI/SMBIOS compiled-in rules | ✅ Done — 8 system rules (const table) |
| Q4 | DMI/SMBIOS TOML runtime loading | ✅ Done — `dmi_system_quirk` uses live `/scheme/acpi/dmi` data from `acpid` |
| Q4 | ACPI table blacklist/override | ✅ Done — `acpid` applies `[[acpi_table_quirk]]` skip rules during table load |
| Q5 | lspci quirk display | ✅ Done — shows active quirks per device |
| Q5 | lsusb quirk display | ✅ Done — shows active quirks per device |
| Q5 | Linux quirk extraction tool | ✅ Script exists — PCI mode uses heuristic name matching, USB mode works for table entries |
Quirk flags span data definition, infrastructure wiring, and driver consumption.
Most flags are defined but not yet consumed at runtime; the lists below give
the honest breakdown.
**Flags consumed by drivers (runtime checks in production code):**
- redox-drm: `NO_MSIX`, `NO_MSI`, `FORCE_LEGACY_IRQ`, `DISABLE_ACCEL` (interrupt setup + driver probe)
- xhcid: `RESET_DELAY_MS`, `NO_MSI`, `NO_MSIX`, `FORCE_LEGACY_IRQ` (interrupt selection + port reset delay)
- xhcid (USB device path): `NO_SET_CONFIG`, `NO_STRING_FETCH`, `BAD_DESCRIPTOR`, `NO_USB3`, `NO_LPM`, `NO_U1U2` (enumeration/configuration/BOS handling)
- amdgpu: `NEED_FIRMWARE` (hard firmware gate), with real quirk-aware logging for `NO_ASPM`, `NEED_IOMMU`, `NO_MSI`, `NO_MSIX`
**Infrastructure (data flows, reporting, and partial integration):**
- pcid-spawner: computes `PCI_QUIRK_FLAGS` by calling the canonical `redox-driver-sys` lookup on synthesized `PciDeviceInfo`, then passes the env var onward
- linux-kpi: `pci_get_quirk_flags()` / `pci_has_quirk()` C FFI is available for C drivers and is now consumed by the Red Bear amdgpu path
- redbear-info: `--quirks` reads `/etc/quirks.d/*.toml` and reports configured PCI/USB/DMI entries
- lspci: shows active quirk flags per PCI device (via redox-driver-sys lookup)
- lsusb: shows active quirk flags per USB device (via redox-driver-sys lookup)
- DMI compiled-in rules: 8 entries match systems by vendor/product/board (served through `acpid` at `/scheme/acpi/dmi`)
**Observed/logged but not yet strongly enforced in runtime policy:**
- `NO_ASPM`, `NEED_IOMMU`, `NO_MSI`, `NO_MSIX` in the amdgpu path are surfaced in quirk-aware logs before broader driver policy exists.
**Defined but not yet consumed by any real driver path:**
- `NO_PM`, `NO_D3COLD`, `DMA_32BIT_ONLY`, `BUS_MASTER_DELAY`, `NO_IOMMU`, etc.
`firmware-loader` itself does not interpret `NEED_FIRMWARE`; that policy is now enforced in the amdgpu driver path instead.
`NEED_RESET` remains defined for USB devices but is not yet consumed by a runtime USB driver path.
**Remaining infrastructure work:**
- none in the current quirks scope
`pcid-spawner` now brokers quirks through the canonical `redox-driver-sys` lookup instead of carrying a separate in-tree PCI quirk engine.
+146 -42
@@ -26,41 +26,91 @@ This repo should not treat **builds** or **enumerates** as equivalent to **valid
### Summary
USB in Red Bear OS is **present but incomplete**.
USB in Red Bear OS is **present and improving**.
The current repo supports a real host-side USB path built around the userspace `xhcid` controller
daemon, hub and HID class spawning, native USB observability (`lsusb`, `usbctl`, `redbear-info`),
and a low-level userspace client API through `xhcid_interface`.
The current limitations are material:
Completed work:
- BOS/SuperSpeed descriptor fetching wired up — `xhcid` fetches and parses BOS capability
descriptors during device enumeration, with bounds-checked slicing and graceful USB 2 fallback
- Speed detection for hub child devices — `usbhubd` extracts child device speed from hub port
status via `UsbSpeed` enum (`#[repr(u8)]` with `TryFrom<u8>`) and passes it through
`attach_with_speed()` protocol; server maps to PSIV via `lookup_speed_category()`
- Interrupt-driven operation restored — `main.rs` calls `get_int_method()` instead of hard-coded
`(None, Polling)`; MSI/MSI-X/INTx paths re-enabled
- Event ring growth implemented — `grow_event_ring()` doubles ring size (up to 4096 cap),
allocates new DMA ring, preserves dequeue pointer, updates ERDP/ERSTBA hardware registers
- USB 3 hub endpoint configuration — `SET_INTERFACE` always sent; stall on `(0,0)` tolerated
with debug log and graceful continuation
- Hub interrupt EP1 status change detection replacing full polling loop in `usbhubd`
- Hub change bit clearing on all port paths — `clear_port_changes` sends
`ClearFeature(C_PORT_CONNECTION, C_PORT_ENABLE, C_PORT_RESET, C_PORT_OVER_CURRENT)` plus
USB3-specific features (`C_PORT_LINK_STATE`, `C_PORT_CONFIG_ERROR`) after every port status read
- Runtime panic reduction across USB daemons — `device_enumerator.rs`, `irq_reactor.rs`,
`mod.rs`, `scheme.rs`, `usbhubd/main.rs`, `usbhidd/main.rs` converted from `panic!/expect`
to `log + continue/return` or `ok_or` in most hot paths; mutex poison recovery on all hot-path
locks; `scsi/mod.rs` block descriptor parsing returns errors instead of panicking;
`xhci/scheme.rs` uses `ok_or` for device descriptor and DMA buffer access
- `usbhidd` no longer panics on malformed report data — proper `Result` propagation
- `usbscsid` panic paths eliminated in BOT transport — all 4 `panic!()` calls replaced with
stall recovery (`clear_stall` + `reset_recovery`) and `ProtocolError` returns; SCSI
`get_mode_sense10` failure returns error instead of panicking; `main.rs` uses
`unwrap_or_else` with `eprintln` + `exit(1)` instead of `expect()`; startup sector read
failure logs and continues instead of panicking; event loop handles errors gracefully
- Empty UAS module stub removed from `usbscsid`; `protocol::setup` returns `None` gracefully
for unsupported protocols instead of unwrapping
- BOT transport correctness fixes — `CLEAR_FEATURE(ENDPOINT_HALT)` now uses USB endpoint
address from descriptor (`bEndpointAddress`) instead of driver endpoint index; `get_max_lun`
sends correct interface number; `early_residue` correctly computes `expected - transferred`
for short packets; CSW read uses iterative bounded loop instead of unbounded recursion
- USB validation harness (`test-usb-qemu.sh`) with 6-check QEMU validation
- In-guest USB checker binary (`redbear-usb-check`) walking scheme tree
- USB validation runbook for operators
- All changes mirrored to `local/patches/base/redox.patch` for upstream refresh survival
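The event-ring growth policy from the list above (double the size, capped at 4096) reduces to a small sizing rule. The sketch below shows only that arithmetic; `next_event_ring_size` is a hypothetical helper and does not touch DMA allocation or the ERDP/ERSTBA registers.

```rust
/// Upper bound on the ring size quoted in the list above.
const EVENT_RING_CAP: usize = 4096;

/// Size to grow to, or `None` when the ring cannot grow any further.
/// A size of zero is treated as invalid input in this sketch.
fn next_event_ring_size(current: usize) -> Option<usize> {
    if current == 0 || current >= EVENT_RING_CAP {
        return None;
    }
    // Double, but never exceed the cap (also covers non-power-of-two sizes).
    Some(current.saturating_mul(2).min(EVENT_RING_CAP))
}
```

Returning `None` at the cap gives the caller an explicit point to log that the ring is full and cannot grow, instead of silently reallocating at the same size.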
The remaining limitations are:
- xHCI no longer hard-forces polling; it uses the existing interrupt-mode selection path again, but
interrupt-driven behavior is still only lightly validated under runtime load
- checked-in event-ring growth support now exists, but it still needs stronger runtime validation
- USB support varies by machine, including known `xhcid` panic cases
- hub/topology handling is partial
- HID is still wired through the legacy mixed-stream `inputd` path
- USB mass storage exists in-tree and now autospawns successfully in the current QEMU validation
path, but broader runtime stability and wider class/topology validation are still open.
- SuperSpeedPlus differentiation requires Extended Port Status (not yet implemented)
- TTT (Think Time) in Slot Context hardcoded to 0 — needs parent hub descriptor propagation
- Composite devices and non-default alternate settings use first-match only (`//TODO: USE ENDPOINTS FROM ALL INTERFACES`)
- `grow_event_ring()` swaps to a new ring but does not copy pending TRBs from the old one; under sustained event-ring-full conditions this may lose in-flight events
- `usbhubd` startup exits gracefully via `unwrap_or_else` instead of panicking, and per-child-port handle creation skips failed ports with error logging, so a hub may come up with only a subset of its ports
- there is no evidence of validated support for broader USB classes or modern USB-C / dual-role
scope
### Identified Correctness Issues (from audit)
A comprehensive audit of the xHCI driver identified these correctness issues. Fixes are being
applied through `local/patches/base/redox.patch`:
- **ERDP read pointer bug** (`event.rs`): `erdp()` returns the software producer pointer from the
ring state instead of reading the actual hardware dequeue pointer from the ERDP runtime register.
Per xHCI spec §4.9.3, the ERDP must reflect where hardware has finished reading, not where
software enqueues new entries. This causes the event ring dequeue pointer to be incorrect after
processing events, potentially leading to missed or double-processed events.
- **Mutex poisoning panics**: ~37 `unwrap()` calls on mutex locks across `mod.rs`, `irq_reactor.rs`,
`scheme.rs`, and `ring.rs` will panic if a thread holding the lock panics. All should use
`unwrap_or_else(|e| e.into_inner())` for poisoning recovery. Additionally, ~22 `expect()` calls
need proper error handling.
- **Ring `panic!()` in `trb_phys_ptr()`**: `ring.rs` contains a direct `panic!()` on invalid state
instead of returning an error.
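The two panic-related items above follow a common pattern, sketched here under stated assumptions. `lock_recover` uses the standard-library `PoisonError::into_inner` call named in the audit; `RingError` and `trb_phys_ptr_checked` are hypothetical stand-ins and do not reflect the real `ring.rs` signature.

```rust
use std::sync::{Mutex, MutexGuard};

/// Lock a mutex, recovering the guard even if another thread panicked
/// while holding it (the "poisoned" case).
fn lock_recover<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Hypothetical error type standing in for whatever the driver returns.
#[derive(Debug, PartialEq)]
enum RingError {
    InvalidIndex(usize),
}

/// Illustrative replacement for a `panic!()` on invalid ring state:
/// return an error instead. xHCI TRBs are 16 bytes each.
fn trb_phys_ptr_checked(base: u64, len: usize, index: usize) -> Result<u64, RingError> {
    if index >= len {
        return Err(RingError::InvalidIndex(index));
    }
    Ok(base + (index as u64) * 16)
}
```

Recovering a poisoned lock keeps the daemon alive, but the protected state may be partially updated, so callers should still treat it with suspicion.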
### Current Status Matrix
| Area | State | Notes |
|---|---|---|
| Host mode | **usable / experimental** | Real host-side stack exists, but not broadly validated |
| xHCI controller | **builds / enumerates / usable on some hardware** | Interrupt-mode selection restored, hardware-variable, event-ring growth exists in-tree but still needs stronger runtime validation |
| Hub handling | **builds / partial usable** | `usbhubd` exists, USB 3 hub limitations remain |
| HID | **builds / usable in narrow path** | `usbhidd` handles keyboard/mouse/button/scroll via legacy input path |
| Mass storage | **builds / autospawns in QEMU** | `usbscsid` now spawns from the xHCI class-driver table, but runtime stability past spawn still needs work |
| Native tooling | **builds / enumerates** | `lsusb`, `usbctl`, `redbear-info` provide partial observability |
| Low-level userspace API | **builds** | `xhcid_interface` exists, but not a mature general userspace USB story |
| libusb | **builds / experimental** | WIP, compiled but not tested |
| usbutils | **broken / experimental** | WIP, compilation error |
| EHCI/OHCI/UHCI | **absent / undocumented** | No evidence present in-tree |
| USB networking/audio/video/Bluetooth classes | **partial / experimental** | Broad class support remains incomplete, but one bounded explicit-startup USB-attached Bluetooth slice now exists |
| Device mode / OTG / dual-role / USB-C / PD / alt-modes / USB4 | **absent / undocumented** | No evidence present |
| Host mode | **usable / experimental** | Real host-side stack exists, interrupt-driven, not broadly validated on hardware |
| xHCI controller | **builds / usable on some hardware** | Interrupt delivery restored (MSI/MSI-X/INTx), event ring growth, CLEAR_FEATURE uses USB endpoint address; mutex poison recovery on all hot-path locks in scheme.rs and mod.rs |
| Hub handling | **builds / improving** | `usbhubd` uses interrupt EP1, change bits cleared, USB 3 speed-aware attach |
| HID | **builds / usable in narrow path** | `usbhidd` handles keyboard/mouse/button/scroll via legacy input path, no panics in report loop |
| Mass storage | **builds / improving** | `usbscsid` BOT transport has graceful error handling; endpoint addresses corrected; event loop handles errors; `plain::from_bytes`/`slice_from_bytes` error mapping in bot.rs and scsi/mod.rs block descriptors with bounds checks; runtime I/O validation still needed |
| Native tooling | **builds / enumerates** | `lsusb`, `usbctl`, `redbear-info`, `redbear-usb-check` provide observability |
| Low-level userspace API | **builds** | `xhcid_interface` with `UsbSpeed` enum, `attach_with_speed()` |
| Validation | **builds** | `test-usb-qemu.sh` + `redbear-usb-check` + USB-VALIDATION-RUNBOOK.md |
## Evidence Already In Tree
@@ -101,19 +151,17 @@ The current limitations are material:
Current repo-visible issues include:
- partially restored interrupt-driven behavior without complete event-ring growth support
- incorrect or incomplete speed handling for child devices
- TODOs around configuration choice and alternate settings
- TODOs around endpoint selection across interfaces
- incomplete BOS / SuperSpeed / SuperSpeedPlus handling
- TTT (Think Time) hardcoded to 0 in Slot Context — needs parent hub descriptor propagation
This means the current stack is more than a bring-up stub, but still below the bar for a reliable,
future-proof USB controller foundation.
### 2. Topology and hotplug maturity are partial
The stack can enumerate ports and descendants, but the code still carries explicit TODOs around hub
behavior and USB 3 hub handling.
The stack can enumerate ports and descendants. USB 3 hub endpoint configuration now works without
stalling, and child device speed detection is correct when devices attach through hubs.
The current repo does not justify a claim that attach, detach, reset, reconfigure, and hub-chained
topologies are runtime-proven in a broad sense.
@@ -126,14 +174,17 @@ However, the current HID path is still tied to the older anonymous `inputd` prod
`local/docs/INPUT-SCHEME-ENHANCEMENT.md` already defines the needed next step: named producers,
per-device streams, and explicit hotplug events.
### 4. Storage is present in-tree but not a current support claim
### 4. Storage is present in-tree, improving, but not yet validated
`usbscsid` is a real driver and the xHCI class-driver table now spawns it again during QEMU USB
storage validation. The current blocker is not matching or spawn, but transport/runtime stability
after spawn.
`usbscsid` is a real driver and the xHCI class-driver table spawns it during QEMU USB storage
validation. All BOT transport `panic!()` paths have been replaced with proper stall recovery and
error returns. The `main.rs` initialization path uses graceful error handling instead of `expect()`.
That means Red Bear should document USB storage as **implemented in-tree but not currently enabled
as a default working class path**.
The remaining gap is runtime validation: proving that stall recovery actually works under real
device I/O, and that multi-LUN devices configure correctly.
Red Bear should document USB storage as **implemented in-tree with improved error handling, but not yet
runtime-validated on hardware**.
### 5. The userspace USB story is still low-level
@@ -199,6 +250,21 @@ not a recommendation to bypass Red Bear's overlay/patch discipline.
### Phase U1 — xHCI Controller Baseline
**Status**: Partially complete.
**Completed**:
- BOS/SuperSpeed descriptor fetching wired up in `get_desc()`: `fetch_bos_desc()` is called, the
`bos_capability_descs()` iterator is parsed, and `supports_superspeed`/`supports_superspeedplus`
are stored in `DevDesc`
- Speed detection for hub child devices fixed — `UsbSpeed` enum with `from_v2_port_status()` and
`from_v3_port_status()` mapping, passed via `attach_with_speed()` protocol from `usbhubd`
- `attach_device_with_speed()` accepts optional speed override byte, maps to PSIV via
`lookup_speed_category()`
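The hub child speed mapping described above can be illustrated with a small sketch. The enum and method names mirror those mentioned in the text, but the bodies are assumptions based on the USB hub port status layout (USB 2.0 `wPortStatus` bit 9 for low speed and bit 10 for high speed; USB 3 hub speed field in bits 12:10, where 0 means 5 Gbps). The in-tree `UsbSpeed` may differ in variants and signatures.

```rust
/// Illustrative speed enum; the in-tree `UsbSpeed` may have more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UsbSpeed {
    Low,
    Full,
    High,
    Super,
}

const PORT_LOW_SPEED: u16 = 1 << 9;
const PORT_HIGH_SPEED: u16 = 1 << 10;

impl UsbSpeed {
    /// USB 2.0 hub port status: low-speed and high-speed bits,
    /// with neither bit set meaning full speed.
    fn from_v2_port_status(status: u16) -> Self {
        if (status & PORT_LOW_SPEED) != 0 {
            UsbSpeed::Low
        } else if (status & PORT_HIGH_SPEED) != 0 {
            UsbSpeed::High
        } else {
            UsbSpeed::Full
        }
    }

    /// USB 3 hub port status: bits 12:10 carry the negotiated speed.
    /// Only value 0 (5 Gbps) is mapped here; other values return `None`.
    fn from_v3_port_status(status: u16) -> Option<Self> {
        match (status >> 10) & 0x7 {
            0 => Some(UsbSpeed::Super),
            _ => None,
        }
    }
}
```

Returning `Option` for the USB 3 case keeps unknown speed encodings visible to the caller instead of silently defaulting.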
**Remaining**:
- Validate one controller family as the first real support target
- Tighten controller-state correctness under sustained load
**Goal**: Turn `xhcid` from partial bring-up into a dependable baseline on at least one controller
family.
@@ -225,11 +291,24 @@ family.
### Phase U2 — Topology, Configuration, and Hotplug Correctness
**Status**: Partially complete.
**Completed**:
- USB 3 hub endpoint configuration stall handled — `SET_INTERFACE` is always sent; stall on
`(0, 0)` is tolerated with debug log and graceful continuation
- `usbhubd` now passes `interface_desc` and `alternate_setting` to `configure_endpoints`
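A rough sketch of the stall tolerance described above, assuming `(0, 0)` means interface 0 with alternate setting 0. `XferError` and `set_interface_tolerant` are hypothetical names, and the control transfer is modelled as a closure rather than the real `xhcid_interface` call.

```rust
/// Hypothetical transfer error type for illustration only.
#[derive(Debug, PartialEq)]
enum XferError {
    Stall,
    Other(String),
}

/// Always send SET_INTERFACE, but treat a stall on the default
/// (interface 0, alternate setting 0) pair as non-fatal.
fn set_interface_tolerant<F>(interface: u8, alt: u8, mut send: F) -> Result<(), XferError>
where
    F: FnMut(u8, u8) -> Result<(), XferError>,
{
    match send(interface, alt) {
        Err(XferError::Stall) if interface == 0 && alt == 0 => {
            // Devices with only a default setting may reject the request.
            eprintln!("SET_INTERFACE(0, 0) stalled; continuing with default setting");
            Ok(())
        }
        other => other,
    }
}
```

A stall on any other interface or alternate setting is still returned as an error, since in that case the device really did refuse a non-default configuration.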
**Remaining**:
- validate repeated attach/detach/reset behavior
- support non-default configurations and alternate settings where needed
- improve composite-device handling and endpoint selection across interfaces
- separate "enumerates" from "stays correct under topology changes"
**Goal**: Make the USB tree and device configuration path correct enough for real-world devices.
**What to do**:
- fix USB 3 hub stall cases and other known hub limitations
- USB 3 hub stall handling completed — SET_INTERFACE always sent with (0,0) stall tolerance
- validate repeated attach/detach/reset behavior
- support non-default configurations and alternate settings where needed
- improve composite-device handling and endpoint selection across interfaces
@@ -251,6 +330,17 @@ family.
### Phase U3 — HID Modernization
**Status**: Partially complete.
**Completed**:
- `usbhidd` error handling improved — `assert_eq!` replaced with `anyhow::bail!`, `.expect()` in
main loop replaced with `match` + `continue` for graceful recovery
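The `match` + `continue` recovery pattern can be sketched as below. This is an illustration only: it uses a plain `Result<_, String>` in place of `anyhow`, and the three-byte layout follows the HID boot-protocol mouse report (buttons, X, Y). The function names are hypothetical and do not come from `usbhidd`.

```rust
/// Validate and decode a boot-protocol mouse report.
/// Returns an error for malformed input instead of asserting.
fn parse_boot_mouse_report(report: &[u8]) -> Result<(u8, i8, i8), String> {
    if report.len() < 3 {
        return Err(format!("short report: {} bytes", report.len()));
    }
    Ok((report[0], report[1] as i8, report[2] as i8))
}

/// Process a batch of reports, skipping malformed ones.
/// Returns (handled, skipped) counts.
fn process_reports(reports: &[&[u8]]) -> (usize, usize) {
    let mut handled = 0;
    let mut skipped = 0;
    for report in reports {
        let (_buttons, _dx, _dy) = match parse_boot_mouse_report(report) {
            Ok(decoded) => decoded,
            Err(err) => {
                // Log and keep the loop alive instead of panicking.
                eprintln!("ignoring malformed HID report: {}", err);
                skipped += 1;
                continue;
            }
        };
        handled += 1;
    }
    (handled, skipped)
}
```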
**Remaining**:
- migrate `usbhidd` toward named producers and per-device streams
- expose hotplug add/remove behavior cleanly to downstream consumers
- align USB HID with the `inputd` enhancement design already documented in-tree
**Goal**: Move USB HID from legacy mixed-stream input to a modern per-device runtime path.
**What to do**:
@@ -333,6 +423,18 @@ implicit forever.
### Phase U6 — Validation Slices and Support Claims
**Status**: Partially complete.
**Completed**:
- `local/scripts/test-usb-qemu.sh` — Full USB stack validation harness that boots with xHCI +
keyboard + tablet + mass storage, then checks for xHCI interrupt mode, HID spawn, SCSI spawn,
BOS processing, and no crash-class errors
**Remaining**:
- add hardware-matrix coverage for target controllers and class families
- extend `redbear-info` only where passive probing can be honest
- tie support claims to a concrete profile or package-group slice
**Goal**: Turn USB from a collection of partial capabilities into an evidence-backed support story.
**What to do**:
@@ -368,23 +470,25 @@ Prefer language such as:
- “xHCI host support is present but experimental”
- “USB enumeration and HID-adjacent host paths exist in-tree”
- “USB support remains controller-variable”
- “USB storage support exists in-tree and is QEMU-proven for the current validation path, but is
not yet a broad hardware support claim”
- “USB storage support exists in-tree with improved error handling, but is not yet a broad hardware support claim”
## Summary
USB in Red Bear today is not missing. It is a real userspace host-side subsystem with meaningful
enumeration, runtime observability, hub/HID infrastructure, and a low-level userspace API.
It is also not complete. The current gaps are no longer “does Red Bear have any USB code at all?”
but rather:
Recent work has closed several specific gaps: BOS/SuperSpeed descriptor handling, hub child speed
detection, USB 3 hub configuration stalls, HID error handling, and a comprehensive QEMU validation
harness.
- controller correctness and interrupt maturity
- topology and configuration correctness
- HID modernization
- re-enabling and validating storage
The remaining gaps are:
- controller interrupt maturity under sustained load
- topology and configuration correctness under attach/detach stress
- HID modernization toward named producers and per-device streams
- validating storage runtime stability under real device I/O
- defining a coherent userspace USB API strategy
- deciding how much modern USB scope Red Bear actually wants
- building a real USB validation surface
- building broader USB validation coverage
That is the correct framing for a modern, future-proof USB implementation plan in this repo.
+221 -682
@@ -2,740 +2,279 @@
## Purpose
This document defines the current Wi-Fi state in Red Bear OS and lays out the recommended path for
integrating Wi-Fi drivers and a usable wireless control plane.
This document describes the current Wi-Fi state in Red Bear OS and the path from the existing
bounded Intel bring-up scaffold to validated wireless connectivity.
The goal is not to imply that working Wi-Fi already exists. The goal is to describe what the repo
currently proves, what `linux-kpi` can and cannot realistically provide, and how Red Bear can grow
from a **bounded experimental Intel Wi-Fi scaffold** to one experimental, validated Wi-Fi path that
fits the existing Redox / Red Bear architecture.
Wi-Fi is currently **not working connectivity**. What exists is a structurally complete,
host-tested Intel transport layer and native control plane, awaiting real hardware + firmware
validation.
## Validation States
- **builds** — code exists in-tree and is expected to compile
- **boots** — image or service path reaches a usable runtime state
- **reports** — runtime surfaces can honestly report current wireless state
- **validated** — behavior has been exercised with real evidence for the claimed scope
- **experimental** — available for bring-up, but not support-promised
- **missing** — no in-tree implementation path is currently present
| State | Meaning |
|---|---|
| **builds** | Compiles in-tree |
| **host-tested** | Tests pass on Linux host with synthesized fixtures |
| **validated** | Behavior confirmed with real hardware evidence |
| **experimental** | Available for bring-up, not support-promised |
| **missing** | No in-tree implementation |
This repo should not treat planned wireless scope as equivalent to implemented support.
## Current State
## Current Repo State
### Status Matrix
### Summary
Wi-Fi is currently **not supported as working connectivity** in Red Bear OS.
There is still no complete in-tree cfg80211/mac80211/nl80211-compatible surface, no supplicant
path, and no profile that can honestly claim working Wi-Fi support. What now exists in-tree is a
bounded Intel bring-up slice: a driver-side package, a Wi-Fi control daemon/scheme, profile
plumbing, and host-validated LinuxKPI/CLI scaffolding below the real association boundary.
What the repo *does* have is a meaningful set of prerequisites:
- userspace drivers and schemes as the standard architectural model
- `redox-driver-sys` for PCI/MMIO/IRQ/DMA primitives
- `linux-kpi` as a limited low-level C-driver compatibility layer
- `firmware-loader` for blob-backed devices
- a working native wired network path through `network.*`, `smolnetd`, `dhcpd`, and `netcfg`
- profile/package-group discipline, including the reserved `net-wifi-experimental` slice
### Current Status Matrix
| Area | State | Notes |
| Area | State | Detail |
|---|---|---|
| Wi-Fi controller support | **experimental bounded slice exists** | `redbear-iwlwifi` provides an Intel-only bounded driver-side package, not validated Wi-Fi connectivity |
| Linux wireless stack compatibility | **early compatibility scaffolding exists** | `linux-kpi` now carries `cfg80211` / `wiphy` / `mac80211` registration, station-mode scaffolding, channel/band/rate/BSS definitions, and RX/TX data-path structures (24 tests pass), but not a complete Linux wireless stack |
| Firmware loading | **partial prerequisite exists** | `firmware-loader` can serve firmware blobs generically |
| Wireless control plane | **experimental bounded slice exists** | `redbear-wifictl` and `redbear-netctl` expose bounded prepare/init/activate/scan orchestration, not real association support |
| Post-association IP path | **present** | Native `smolnetd` / `netcfg` / `dhcpd` / `redbear-netctl` path exists |
| Desktop Wi-Fi API | **missing** | No NetworkManager-like or D-Bus Wi-Fi surface |
| Runtime diagnostics | **experimental bounded slice exists** | `redbear-info` and runtime helpers expose Wi-Fi state surfaces, but not real Wi-Fi functionality proof |
## Evidence Already In Tree
### Direct current-state caution about supported connectivity
- `HARDWARE.md` says broad Wi-Fi and Bluetooth hardware support is still incomplete even though
bounded in-tree scaffolding now exists
- `local/docs/AMD-FIRST-INTEGRATION.md` now treats `Wi-Fi/BT` as in progress with bounded wireless
scaffolding present but validated connectivity still incomplete
### Positive driver-side prerequisites
- `docs/04-LINUX-DRIVER-COMPAT.md` documents `redox-driver-sys`, `linux-kpi`, and
`firmware-loader`
- `local/recipes/drivers/redox-driver-sys/` provides userspace PCI/MMIO/IRQ/DMA primitives
- `local/recipes/drivers/linux-kpi/` provides a limited Linux-style compatibility subset
- `local/recipes/system/firmware-loader/` provides `scheme:firmware`
### Positive network/control-plane prerequisites
- `local/docs/NETWORKING-RTL8125-NETCTL.md` documents the native wired path:
`pcid-spawner` → NIC daemon → `network.*` → `smolnetd` → `dhcpd` / `netcfg`
- `recipes/core/base/source/netstack/src/scheme/netcfg/mod.rs` shows route/address/resolver state
is already exposed through a native control scheme
- `local/recipes/system/redbear-netctl/source/src/main.rs` shows Red Bear already uses a native
network profile tool, even though it is currently wired-only
- `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md` reserves `net-wifi-experimental` as a package-group
slot for future wireless work
## Feasibility Constraints
### 1. Wi-Fi is not just a driver
Wi-Fi in Red Bear cannot be treated as a single hardware daemon.
At minimum, a working Wi-Fi path needs:
- hardware transport and firmware bring-up
- scan/discovery
- authentication and association state
- link-state and disconnect handling
- credential storage
- post-association handoff into the native IP stack
- later desktop/user-facing integration if the repo wants it
This makes Wi-Fi more like a complete subsystem than a simple wired NIC driver.
### 2. `linux-kpi` is feasible only below the wireless control-plane boundary
Current `linux-kpi` is suitable for low-level driver-enablement work such as:
- PCI / IRQ / DMA / MMIO access
- firmware request glue
- workqueue-style helper logic
- C-driver compatibility for narrow hardware bring-up
Current `linux-kpi` is **not** a complete Wi-Fi architecture because the repo still lacks complete
in-tree implementations of:
- cfg80211
- mac80211
- nl80211
- wiphy model
- supplicant/control-plane compatibility layer
So `linux-kpi` is feasible only as a **partial low-level aid**, not as the primary Red Bear Wi-Fi
stack.
### 3. The current Red Bear control plane is Ethernet-specific
The current native network stack is useful, but not yet Wi-Fi-ready.
`redbear-netctl` now has a first Wi-Fi-facing profile layer, but only at the profile/orchestration
boundary.
Current `redbear-netctl` support now includes:
- `Connection=ethernet`
- `Connection=wifi`
- arbitrary `Interface=` values at the profile layer (for example `eth0`, `wlan0`)
- DHCP/static address, route, and DNS control after association
- Wi-Fi profile fields for `SSID`, `Security`, and `Key`/`Passphrase`
- a bounded native handoff to a future `/scheme/wifictl` control surface
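To make the profile-layer fields above concrete, here is a minimal parsing sketch. It assumes a simple line-oriented `Key=Value` file with `#` comments, which is an assumption about the format and not something this document confirms. The `Profile` struct, `parse_profile` function, and the accepted `Connection` values are likewise illustrative.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Connection {
    Ethernet,
    Wifi,
}

/// Hypothetical in-memory view of a profile.
#[derive(Debug, PartialEq)]
struct Profile {
    connection: Connection,
    interface: String,
    ssid: Option<String>,
    security: Option<String>,
    key: Option<String>,
}

/// Parse `Key=Value` lines; blank lines and `#` comments are skipped (assumed syntax).
fn parse_profile(text: &str) -> Result<Profile, String> {
    let mut fields: HashMap<&str, &str> = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("malformed line: {}", line))?;
        fields.insert(key.trim(), value.trim());
    }
    let connection = match fields.get("Connection").copied() {
        Some("ethernet") => Connection::Ethernet,
        Some("wifi") => Connection::Wifi,
        other => return Err(format!("unsupported Connection: {:?}", other)),
    };
    let interface = fields
        .get("Interface")
        .copied()
        .ok_or_else(|| "missing Interface".to_string())?
        .to_string();
    let ssid = fields.get("SSID").map(|s| s.to_string());
    if connection == Connection::Wifi && ssid.is_none() {
        return Err("Wi-Fi profile requires SSID".to_string());
    }
    // Accept either `Key` or `Passphrase` for the credential field.
    let key = fields
        .get("Key")
        .or_else(|| fields.get("Passphrase"))
        .map(|s| s.to_string());
    let security = fields.get("Security").map(|s| s.to_string());
    Ok(Profile { connection, interface, ssid, security, key })
}
```

In this sketch the parser only validates shape; scan, authentication, and association stay with the Wi-Fi control surface, in keeping with the boundary this section describes.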
The repo now also contains the first bounded implementation of that control surface:
- `local/recipes/system/redbear-wifictl/` provides a `redbear-wifictl` daemon and `/scheme/wifictl`
scheme
- the current daemon supports a stub backend for end-to-end validation and an Intel-oriented backend
boundary that detects Intel wireless-class PCI devices
- the current Intel backend is now firmware-aware: it reports candidate firmware families, selected
firmware blobs when present, and supports a bounded `prepare` step before connect
- this is still not a full Intel association path, but it turns the control-plane contract into a
real in-tree interface rather than a placeholder
This means `redbear-netctl` can now represent and start a Wi-Fi profile without pretending Wi-Fi is
just an Ethernet profile, but it still does **not** own scan/auth/association itself.
`netcfg` is no longer hard-wired to a single `eth0` node in the control scheme. The native control
surface can now expose per-device interface nodes dynamically from the current device list, which is
the first required step for post-association Wi-Fi handoff.
That means Red Bear can reuse its native IP plumbing **after association**, but not as the radio
control plane itself.
### 4. Intel target changes the first-driver strategy
The original version of this plan preferred a FullMAC-first path to avoid recreating Linux wireless
subsystem boundaries.
That is still the simplest architecture in the abstract, but the project target has now changed:
Red Bear must target **Intel Wi-Fi for Arrow Lake and older Intel client chips**.
That means the first realistic driver family is now Intel `iwlwifi`-class hardware rather than an
unspecified FullMAC family.
This changes the implementation burden materially:
- Intel `iwlwifi` is not a simple FullMAC path
- current Linux support is tightly coupled to `mac80211` / `cfg80211`
- firmware loading remains necessary but is not the hard part by itself
- Red Bear must plan for a bounded compatibility layer below the user-facing control plane
So the practical first target is now:
- **Intel `iwlwifi`-class devices, Arrow Lake and older**, with the understanding that this is a
harder first driver family than a generic FullMAC-first strategy would have been
## Recommended Architecture
The best current Red Bear Wi-Fi architecture for the Intel target is:
1. **native Red Bear wireless control plane above the driver boundary**
2. **Intel-first low-level driver work below that boundary**
3. **reuse `firmware-loader` and `redox-driver-sys` wherever possible**
4. **accept bounded `linux-kpi` growth where Intel transport/firmware glue requires it**
5. **reuse the existing native IP path only after association**
This is still a native-first architecture at the control-plane level, but it is no longer a pure
FullMAC-first plan.
### Build-note for the current Intel control-plane code
The earlier Redox-target source-level compile failure in `redbear-wifictl`'s Intel backend is now
fixed in-tree. If `cargo build --target x86_64-unknown-redox` still reports that
`x86_64-unknown-redox-gcc` is missing, check whether the repo-provided cross toolchain under
`prefix/x86_64-unknown-redox/sysroot/bin/` is on `PATH` before treating it as a fresh source-level
regression.
For repeatable local builds, use `local/scripts/build-redbear-wifictl-redox.sh`, which wires that
repo-provided toolchain path into the build invocation explicitly.
### Subsystem boundary
The Wi-Fi subsystem should be split into these pieces:
- one **device transport / driver daemon** for the Intel target family
- one **firmware loading path** via `firmware-loader`
- one **Wi-Fi control daemon** for scan/auth/association/link state
- one **user-facing control tool** (`wifictl` or equivalent)
- one **post-association handoff** into `smolnetd` / `netcfg` / `dhcpd`
- one **later desktop shim** only if KDE/user-facing workflows require it
`redbear-netctl` should **not** become the supplicant. It can own profile orchestration and the
post-association IP handoff, but scan/auth/association should still live in a dedicated Wi-Fi
control daemon or scheme.
The current implementation now matches that boundary more closely:
- `redbear-netctl` can parse Wi-Fi profiles and hand credentials/intent to a native Wi-Fi control
surface (`/scheme/wifictl`)
- `redbear-netctl` now also has a host-side CLI proof that starting a Wi-Fi profile drives the
bounded driver/control actions and preserves the surfaced bounded connect metadata in status
output; this is not yet proof of verified prepare/init/activate/connect execution order on a real
associated link
- `redbear-netctl` stop now also drives the bounded disconnect path, so the current profile-manager
slice covers start and stop instead of start-only behavior
- `redbear-wifictl` now exposes bounded connect and disconnect CLI flows, and the runtime checker
now exercises the bounded connect step through the scheme surface
- the native IP path can address a non-`eth0` interface name after association
- `redbear-netctl` now also performs interface-specific DHCP handoff for Wi-Fi profiles and waits
for the selected interface to receive an address in the bounded host/runtime validation path
- `local/recipes/system/redbear-netctl-console/` now adds a terminal UI client on top of the same
`/scheme/wifictl` + `/etc/netctl` contract, so scan/select/edit/save/connect/disconnect workflows
can be exercised without introducing a new daemon or bypassing profile semantics
- `local/scripts/test-wifi-baremetal-runtime.sh` now provides the strongest in-repo runtime
validation path for this Wi-Fi slice on a real Red Bear OS target: driver probe, control probe,
bounded connect/disconnect, profile start/stop, and `redbear-info --json` lifecycle reporting
- `redbear-phase5-wifi-check` now packages that bounded in-target validation flow as a first-class
guest/runtime command, instead of leaving it only as a shell script
- that packaged runtime proof currently defaults to the bounded open-profile path; WPA2-PSK remains
implemented and host/unit-verified elsewhere in-repo rather than equally packaged/runtime-validated
- `redbear-phase5-wifi-capture` now packages the corresponding runtime evidence bundle, so target
runs can produce a single JSON artifact for debugging real hardware/passthrough failures;
that bundle now includes command outputs, Wi-Fi scheme state, `netctl` profile state, active
profile contents, interface listings, and `lspci` output
- `test-wifi-baremetal-runtime.sh` now writes that capture bundle to `/tmp/redbear-phase5-wifi-capture.json`
as part of the target-side bounded validation flow
- `local/scripts/test-wifi-passthrough-qemu.sh` now provides the corresponding VFIO/QEMU harness for
exercising the same bounded runtime path when an Intel Wi-Fi PCI function can be passed through to
a Red Bear guest, including optional host-side extraction of the packaged Wi-Fi capture bundle
- `local/scripts/prepare-wifi-vfio.sh` now provides the matching host-side bind/unbind helper for
moving an Intel Wi-Fi PCI function onto `vfio-pci` before passthrough validation and restoring it
afterwards
- `local/scripts/run-wifi-passthrough-validation.sh` now wraps the whole host-side passthrough flow:
bind to `vfio-pci`, run the packaged in-guest Wi-Fi validation path, collect the host-visible
capture bundle, and restore the original host driver afterwards
- `local/scripts/validate-wifi-vfio-host.sh` now provides a read-only preflight for the same flow:
PCI presence, current binding, UEFI firmware, image availability, QEMU/expect presence, VFIO
module state, and visible IOMMU groups
- `local/docs/WIFI-VALIDATION-RUNBOOK.md` now ties the bare-metal path, VFIO path, packaged
validators, and capture artifacts together into one operator runbook
- the control daemon exists now, and the first bounded driver-side package now exists as
`local/recipes/drivers/redbear-iwlwifi/`
- `redbear-iwlwifi` now supports bounded `--probe` and `--prepare` driver-side actions for the
current Intel family set
- `redbear-iwlwifi` now also supports bounded `--init-transport` and `--activate-nic` actions for
the current Intel family set
- `redbear-iwlwifi` now also supports bounded `--scan` and `--retry` actions for the current Intel
family set
- `redbear-iwlwifi` now also carries a first bounded `--connect` path that runs through the new
LinuxKPI wireless compatibility scaffolding instead of stopping immediately at a hardcoded
transport/association error
- `redbear-iwlwifi` now also carries a bounded `--disconnect` path so the current station-mode
lifecycle is not connect-only anymore
- `redbear-iwlwifi --status` now reports the current bounded driver-side view directly
- the bounded driver-side action set can be exercised through the dedicated helper script
`local/scripts/test-iwlwifi-driver-runtime.sh`
- on Redox targets, `redbear-iwlwifi` now also begins to use a `linux-kpi` C shim for firmware
request and PCI/MMIO-facing prepare/transport actions instead of keeping those paths purely in
Rust fallback code
### Port vs rewrite decision
For Arrow Lake-and-lower Intel Wi-Fi, the current repo direction is:
- **do not** attempt a full Linux `mac80211` / `cfg80211` / `nl80211` port first,
- **do** create a bounded Intel driver/transport package below the native Red Bear Wi-Fi control
plane,
- **do** accept limited `linux-kpi` growth only where it materially reduces transport/firmware glue
cost,
- keep `redbear-netctl` and `redbear-wifictl` as the native control-plane/user-facing layers above
that driver boundary.
That means the repo is now following a **bounded transport-layer port with native control-plane
rewrite above it**, not a full Linux wireless stack port and not a pure greenfield driver rewrite.
### What this means in practical porting terms
The currently feasible interpretation of “use the real Linux Intel driver through `linux-kpi`” is:
- port and reuse **transport-layer and firmware-facing logic** where that lowers cost materially,
- keep the **native Red Bear control plane** above that boundary,
- and avoid treating a full `cfg80211` / `mac80211` / `nl80211` / `wiphy` port as the immediate
first milestone.
In other words, Red Bear should not try to import the whole Linux wireless stack in one step.
Red Bear should instead pull over the **device-facing part** of the Intel stack in bounded layers.
### Boundary where `linux-kpi` is helpful
`linux-kpi` is most useful for:
- PCI helper semantics
- MMIO/IRQ/DMA glue
- firmware request/load glue
- workqueue-style deferred execution
- timer, mutex, and IRQ-critical-section helpers that transport-facing Linux Wi-Fi code expects
- low-level transport and reset sequences
- early packet-buffer / `net_device` / `wiphy` / registration scaffolding when Red Bear begins the
first real Linux wireless-subsystem compatibility slice
That is the boundary where “run Linux driver code on Red Bear” is currently realistic.
The current tree now has the first explicit step in that direction as well:
- `linux-kpi` now carries initial `sk_buff`, `net_device`, `cfg80211`/`wiphy`, and `mac80211`
registration scaffolding alongside the earlier firmware/timer/mutex/IRQ helpers
- that scaffolding now also includes the first station-mode compatibility types and hooks used by
the bounded Intel scan/connect path: SSID/connect/station parameter structs plus basic
`cfg80211_connect_bss` / ready-on-channel and `mac80211` VIF/STA/BSS-conf surfaces
- the bounded station-mode slice now also preserves real private-allocation sizes, exposes the
common `sk_buff` reserve/push/pull/headroom/tailroom helpers, tracks `net_device`
registration/setup, keeps carrier down until connect success, and routes
`ieee80211_queue_work()` through the bounded LinuxKPI workqueue instead of silently dropping
deferred work
- the wireless scaffolding now also includes channel/band/rate definitions
(`Ieee80211Channel`, `Ieee80211Rate`, `Ieee80211SupportedBand` with NL80211 band constants
and IEEE80211 channel/rate flags), BSS information reporting (`Cfg80211Bss`,
`cfg80211_inform_bss`/`get_bss`/`put_bss`), RX/TX data-path structures (`Ieee80211RxStatus`,
`Ieee80211TxInfo` with RX/TX flag constants, `ieee80211_rx_irqsafe`/`tx_status`),
channel definition creation (`ieee80211_chandef_create`), and STA state-transition constants
(`IEEE80211_STA_NOTEXIST` through `IEEE80211_STA_AUTHORIZED`)
- all scaffolding is compile- and host-test-validated inside the `linux-kpi` crate (24 tests pass)
- this is still **not** a claim that Red Bear now has a working Linux wireless stack
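The reserve/push/pull/headroom/tailroom helpers mentioned above follow the usual Linux `sk_buff` cursor model. The following is a minimal sketch of that model, not the actual `linux-kpi` implementation: `SkBuffSketch` and its field names are hypothetical, and `put` is included only because appending payload (Linux `skb_put`) is needed to make the example self-contained.

```rust
/// Hypothetical stand-in for a Linux-style socket buffer; not the real `linux-kpi` type.
#[derive(Debug)]
pub struct SkBuffSketch {
    buf: Vec<u8>, // backing storage, fixed at allocation time
    data: usize,  // offset of the first valid byte
    tail: usize,  // offset one past the last valid byte
}

impl SkBuffSketch {
    pub fn alloc(size: usize) -> Self {
        Self { buf: vec![0; size], data: 0, tail: 0 }
    }

    /// Free bytes in front of the payload.
    pub fn headroom(&self) -> usize {
        self.data
    }

    /// Free bytes behind the payload.
    pub fn tailroom(&self) -> usize {
        self.buf.len() - self.tail
    }

    pub fn len(&self) -> usize {
        self.tail - self.data
    }

    /// Reserve headroom; only valid while the buffer is still empty.
    pub fn reserve(&mut self, n: usize) -> bool {
        if self.len() != 0 || n > self.tailroom() {
            return false;
        }
        self.data += n;
        self.tail += n;
        true
    }

    /// Append payload at the tail (Linux `skb_put`).
    pub fn put(&mut self, bytes: &[u8]) -> bool {
        if bytes.len() > self.tailroom() {
            return false;
        }
        self.buf[self.tail..self.tail + bytes.len()].copy_from_slice(bytes);
        self.tail += bytes.len();
        true
    }

    /// Prepend a header into the headroom (Linux `skb_push`).
    pub fn push(&mut self, bytes: &[u8]) -> bool {
        if bytes.len() > self.headroom() {
            return false;
        }
        self.data -= bytes.len();
        self.buf[self.data..self.data + bytes.len()].copy_from_slice(bytes);
        true
    }

    /// Strip `n` bytes from the front (Linux `skb_pull`).
    pub fn pull(&mut self, n: usize) -> bool {
        if n > self.len() {
            return false;
        }
        self.data += n;
        true
    }

    pub fn payload(&self) -> &[u8] {
        &self.buf[self.data..self.tail]
    }
}
```

Returning `false` instead of panicking on overflow is a choice made for this sketch; Linux treats the equivalent conditions as driver bugs.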
### Boundary where a full Linux port becomes too expensive
A full Linux-style `iwlwifi` port becomes dramatically more expensive as soon as the code path
depends on the Linux wireless subsystem proper:
- `cfg80211`
- `mac80211`
- `nl80211`
- `wiphy` model and callbacks
- Linux regulatory integration
- Linux station/BSS bookkeeping and userspace-facing wireless semantics
The repo now has the earliest pieces of those subsystem layers, but still not anything close to a
complete Linux wireless stack. Building them out far enough to host Intel WiFi as a true Linux-like
solution still turns the effort from a bounded driver port into a much larger compatibility-stack
port.
### Chosen direction
The chosen direction for Arrow Lake-and-lower Intel WiFi is therefore:
1. keep the **native Red Bear control plane** (`redbear-netctl` + `redbear-wifictl`),
2. keep pushing the **hardware-facing Intel path** down into `redbear-iwlwifi`,
3. use `linux-kpi` for the low-level Linux-facing transport/runtime glue where that reduces effort,
4. avoid promising or attempting a full Linux wireless-stack port as the first milestone.
The current code now matches that decision more closely than before: `redbear-wifictl` remains the
native control plane, while `redbear-iwlwifi` is the place where Linux-facing firmware/PCI/MMIO
driver logic is starting to accumulate.
The current tree also now pushes more of that bounded Intel path through the actual LinuxKPI
surface instead of bespoke C declarations alone:
- `linux-kpi` now exports direct and async firmware request helpers for firmware-family workflows
- timer and IRQ save/restore bindings are exported through the Linux-facing headers instead of
remaining header-only stubs
- `mutex_trylock()` is available to transport-facing code that needs bounded serialization without
pretending the full Linux scheduler model exists
- the current `redbear-iwlwifi` C transport shim now includes the LinuxKPI headers directly and
uses Linux-style firmware, timer, mutex, and IRQ helper entry points for
prepare/probe/init/activate steps
This remains a bounded transport-layer port. It does **not** change the rule that
cfg80211/mac80211/nl80211 remain out of scope for the current milestone.
### Current validation status for this bounded LinuxKPI slice
The current validation story for this slice is intentionally narrow and should be described that
way:
- the `linux-kpi` host-side test suite now runs cleanly in this repo (24 tests pass), including
the WiFi-facing helper changes in this slice: `request_firmware_direct`,
`request_firmware_nowait`, `mutex_trylock`, IRQ-depth tracking, variable private-allocation
lifetime tracking, station-mode scan/connect/disconnect lifecycle assertions,
workqueue-backed `ieee80211_queue_work()`, `sk_buff` headroom/tailroom helpers, channel/band
creation and flag tests, RX status default and flag combination tests, `ieee80211_get_tid` null
safety, and the existing memory tests
- `redbear-iwlwifi` host-side tests now smoke-test the bounded
firmware/transport/activation/scan/retry actions used by the current Intel path
- `redbear-iwlwifi` also now has a binary-level host-side CLI smoke test for the current bounded
Intel path against temporary PCI/firmware fixtures; this is not the same as a chained real-target
transport→activation→association proof
- `redbear-wifictl` host-side tests pass for the bounded control-plane state propagation above that
Intel path
- the packaged target-side Wi-Fi validators now also accept bounded
`status=associating`/pending-connect output, so the in-target/runtime checks stay aligned with
the current honest connect semantics instead of requiring a fake associated/connected result
- the default packaged bounded runtime profile is now `wifi-open-bounded`, separating lifecycle
validation from the later DHCP-on-real-association gate
This does **not** mean Red Bear has validated a full Linux WiFi driver stack. The validated claim
is narrower: this repo now has tested, bounded LinuxKPI support for the current Intel
transport-facing helper slice, plus host-tested bounded CLI/control flows above it. Current bounded
connect results should still be read as pending/experimental lifecycle state, not proof of real AP
association.
In the current host environment used for this hardening pass, the Intel-specific VFIO runtime path
also remains blocked by prerequisites outside the repo changes themselves: the host validator sees a
MediaTek MT7921K (`14c3:0608`) instead of an Intel `iwlwifi` device on the available WiFi slot,
and `vfio_pci` is not loaded. That means the repo-side bounded runtime harness is present and the
Red Bear image/QEMU/OVMF/`expect` prerequisites are available, but a literal Intel passthrough run
still requires compatible host hardware and VFIO binding before it can be executed.
That is the current feasibility conclusion grounded in the codebase.
| Area | State | Notes |
|---|---|---|
| Intel PCIe transport | **builds, host-tested** | `redbear-iwlwifi`: ~2450 lines C transport + ~1550 lines Rust CLI. Real 802.11 RX frame parsing, DMA ring management, TX reclaim, ISR/tasklet dispatch, command response parsing, mac80211 ops, station state transitions, key management. Commands time out without real firmware — by design. |
| LinuxKPI compatibility | **builds, host-tested** | `linux-kpi`: 17 Rust modules, 93 tests. cfg80211/wiphy/mac80211 registration, ieee80211_ops 12-callback dispatch, PCI MSI/MSI-X, DMA pool, sk_buff, NAPI poll, list_head, atomic_t, completion, IO barriers, BSS/channel/band/rate, scan/connect/disconnect events, BSS registry with reference release. |
| IRQ dispatch | **builds, host-tested** | `request_irq`/`free_irq`/`disable_irq`/`enable_irq` fully implemented with real `scheme:irq/{}` integration, thread-based dispatch, and mask/unmask support. |
| Test coverage | **119 tests pass** | 93 linux-kpi + 8 redbear-iwlwifi + 18 redbear-wifictl. No production `unwrap()` in Wi-Fi daemon request loop (startup uses `expect()`). Host-tested; Redox-only C transport paths are compile-tested but not directly exercised by host tests. |
| Firmware loading | **partial** | `firmware-loader` can serve blobs generically. |
| Control plane | **host-tested** | `redbear-wifictl` daemon + `/scheme/wifictl` scheme with stub and Intel backends, state-machine enforcement, firmware-family reporting. Daemon request loop has graceful shutdown on socket errors. |
| Profile orchestration | **host-tested** | `redbear-netctl` Wi-Fi profiles (SSID/Security/Key), bounded prepare→init-transport→activate-nic→connect→disconnect flow, DHCP handoff. |
| Runtime diagnostics | **host-tested** | `redbear-info` Wi-Fi surfaces, packaged validators (`redbear-phase5-wifi-check/run/capture/analyze`). |
| Real hardware validation | **missing** | No Intel Wi-Fi device has been exercised. Transport is structurally correct but functionally unproven. |
| Desktop Wi-Fi API | **missing** | No NetworkManager-like or D-Bus Wi-Fi surface. |
### Transport Quality (from hardening pass)
The iwlwifi transport has been hardened with these specific improvements:
- **Atomic command state**: `command_complete`, `last_cmd_id`, `last_cmd_cookie`, `last_cmd_status` use `__atomic_store_n`/`__atomic_load_n` with `__ATOMIC_SEQ_CST` — no torn reads between ISR and command submission.
- **Stale response sentinel** (0xFFFF): After command timeout, the response fields are poisoned so a late-arriving firmware response cannot be misattributed to the next command.
- **Command queue space management**: `iwl_pcie_send_cmd` reclaims completed TX descriptors before submitting each command. If the command queue is still full after reclaim, the command fails immediately rather than entering the overflow queue — commands are synchronous and one-at-a-time, so overflow queuing would create ownership ambiguity.
- **DMA read barrier**: `rmb()` added after `dma_sync_single_for_cpu()` and before parsing RX frame data — ensures correct ordering on weakly-ordered architectures.
- **TX queue selection safety**: `rb_iwlwifi_choose_txq()` returns -1 when no data queue is active instead of falling back to the command queue — data frames never use the command queue.
- **TX error handling**: `iwl_ops_tx` now properly frees the skb on failure and logs warnings instead of silently swallowing errors.
- **Association BSSID guard**: BSSID from association-response frames is only copied to transport state when `trans->connecting` is set — prevents stale frames from corrupting connection state.
- **TXQ stuck detection fix**: Removed `trans->irq <= 0` from stuck detection — queue stuckness is independent of IRQ allocation state.
- **RX drain**: Parses 802.11 frame_control type/subtype before freeing — distinguishes data, management, and control frames instead of blind disposal.
- **RX restock**: Write pointer pushed to hardware in both restock and start_dma paths — prevents DMA ring starvation.
- **TX reclaim**: Full DMA unmap cycle — no leaked mappings.
- **BSS registry cleanup**: `cfg80211_put_bss()` now removes entries from the BSS registry and cleans up associated IEs — no memory leak on repeated scans.
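The atomic command state and the stale-response sentinel described above can be sketched in Rust as follows. The real transport is C and uses `__atomic_store_n`/`__atomic_load_n`; this is only an analogue. `CmdState`, its methods, and the rule that a response is accepted only when its id matches the armed command are assumptions made for illustration; only the `0xFFFF` sentinel value and the sequentially consistent ordering come from the text.

```rust
use std::sync::atomic::{AtomicBool, AtomicU16, Ordering::SeqCst};

/// Sentinel written into the command-id slot after a timeout (value from the text).
pub const STALE_CMD_ID: u16 = 0xFFFF;

/// Hypothetical command-state block shared by the submit path and the ISR.
pub struct CmdState {
    complete: AtomicBool,
    last_cmd_id: AtomicU16,
    last_cmd_status: AtomicU16,
}

impl CmdState {
    pub fn new() -> Self {
        Self {
            complete: AtomicBool::new(false),
            last_cmd_id: AtomicU16::new(STALE_CMD_ID),
            last_cmd_status: AtomicU16::new(0),
        }
    }

    /// Submit path: arm the state for exactly one in-flight command.
    pub fn arm(&self, cmd_id: u16) {
        self.complete.store(false, SeqCst);
        self.last_cmd_status.store(0, SeqCst);
        self.last_cmd_id.store(cmd_id, SeqCst);
    }

    /// ISR path: accept a response only if it matches the armed command.
    pub fn on_response(&self, cmd_id: u16, status: u16) -> bool {
        if cmd_id == STALE_CMD_ID || self.last_cmd_id.load(SeqCst) != cmd_id {
            return false; // late or unknown response: ignore it
        }
        self.last_cmd_status.store(status, SeqCst);
        self.complete.store(true, SeqCst);
        true
    }

    /// Timeout path: poison the slot so a late response cannot be misattributed.
    pub fn poison(&self) {
        self.last_cmd_id.store(STALE_CMD_ID, SeqCst);
        self.complete.store(false, SeqCst);
    }

    /// Status of the armed command, if its response has arrived.
    pub fn result(&self) -> Option<u16> {
        if self.complete.load(SeqCst) {
            Some(self.last_cmd_status.load(SeqCst))
        } else {
            None
        }
    }
}
```

Because commands are synchronous and one-at-a-time, a single slot is enough here; a design with several commands in flight would need per-command tracking instead.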
### LinuxKPI Compat Layer Improvements
The linux-kpi compatibility layer has been enhanced with real frame delivery and statistics:
- **RX callback mechanism**: `ieee80211_register_rx_handler(hw, callback)` registers a per-hw
callback that receives drained RX frames. When `ieee80211_rx_drain` processes queued frames,
it delivers them to the registered callback instead of logging and freeing. This allows the
upper layer (e.g., a Redox wireless daemon) to consume frames in real time.
- **TX statistics tracking**: `ieee80211_get_tx_stats(hw)` returns per-hw TX completion counters
(total, acked, nacked). `ieee80211_tx_status` increments these on every TX completion.
- **Full frame data in cfg80211 events**: `cfg80211_rx_mgmt` now stores complete frame data (not
just metadata) in the wireless event state, enabling later consumption by the native wireless
stack. `cfg80211_mgmt_tx_status` similarly stores full TX frame data.
- **IRQ dispatch confirmed real**: `request_irq`/`free_irq`/`disable_irq`/`enable_irq` use real
`scheme:irq/{}` integration with thread-based dispatch and mask/unmask support — not stubs.
- **119 tests pass**: 93 linux-kpi + 8 redbear-iwlwifi + 18 redbear-wifictl.
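As a rough illustration of the RX callback and TX statistics behaviour listed above, the sketch below models a per-hw object with a registered handler and completion counters. It is an assumption-laden stand-in: `HwSketch`, `RxHandler`, `queue_rx`, and the counter layout are hypothetical names, and dropping frames when no handler is registered is a simplification chosen for this example.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Callback type for drained RX frames (hypothetical Rust shape).
pub type RxHandler = Box<dyn FnMut(&[u8]) + Send>;

/// Snapshot of the per-hw TX completion counters named in the text.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct TxStats {
    pub total: u64,
    pub acked: u64,
    pub nacked: u64,
}

/// Hypothetical per-hw state; the real `linux-kpi` layout is not shown here.
#[derive(Default)]
pub struct HwSketch {
    rx_queue: Mutex<VecDeque<Vec<u8>>>,
    rx_handler: Mutex<Option<RxHandler>>,
    tx_total: AtomicU64,
    tx_acked: AtomicU64,
    tx_nacked: AtomicU64,
}

impl HwSketch {
    pub fn register_rx_handler(&self, handler: RxHandler) {
        *self.rx_handler.lock().unwrap() = Some(handler);
    }

    /// Queue a received frame for the next drain pass.
    pub fn queue_rx(&self, frame: Vec<u8>) {
        self.rx_queue.lock().unwrap().push_back(frame);
    }

    /// Deliver all queued frames to the handler; returns how many were delivered.
    pub fn rx_drain(&self) -> usize {
        let frames: Vec<Vec<u8>> = self.rx_queue.lock().unwrap().drain(..).collect();
        let mut handler = self.rx_handler.lock().unwrap();
        match handler.as_mut() {
            Some(cb) => {
                for frame in &frames {
                    cb(frame.as_slice());
                }
                frames.len()
            }
            None => 0, // no consumer registered: frames are dropped in this sketch
        }
    }

    /// Record one TX completion.
    pub fn tx_status(&self, acked: bool) {
        self.tx_total.fetch_add(1, Ordering::Relaxed);
        let counter = if acked { &self.tx_acked } else { &self.tx_nacked };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    pub fn tx_stats(&self) -> TxStats {
        TxStats {
            total: self.tx_total.load(Ordering::Relaxed),
            acked: self.tx_acked.load(Ordering::Relaxed),
            nacked: self.tx_nacked.load(Ordering::Relaxed),
        }
    }
}
```

A consumer such as a wireless daemon would register one handler per hw and poll `tx_stats()` for diagnostics.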
### Honest Assessment
Without real hardware + firmware:
- Command submission times out (no firmware alive response)
- Scan returns no results (no firmware scan response)
- Association does not complete
- RX frames are never processed
The code reports these states honestly (timeout, no results) rather than fabricating success.
Hardware runtime validation is the required next gate.
## Architecture
### Subsystem Boundaries
```
User-facing
redbear-netctl (profiles, CLI)
redbear-netctl-console (ncurses TUI)
/scheme/wifictl (redbear-wifictl daemon)
│ scan / auth / association / link state / credentials
redbear-iwlwifi (driver daemon)
│ PCIe transport / firmware / DMA / IRQ
linux-kpi (compatibility glue)
│ PCI / MMIO / IRQ / DMA / sk_buff / mac80211 ops
redox-driver-sys (scheme:memory, scheme:irq, scheme:pci)
firmware-loader (scheme:firmware)
Kernel: scheme-based primitives only
Post-association IP path:
smolnetd → netcfg → dhcpd → redbear-netctl
```
### Key Design Decisions
1. **Native control plane above the driver**: `redbear-wifictl` owns scan/auth/association, not `redbear-netctl`.
2. **Bounded Intel transport port below that boundary** — reuse Linux-facing firmware/PCI/MMIO logic where it lowers cost.
3. **No full Linux wireless stack port** — cfg80211/mac80211/nl80211 are out of scope for the first milestone.
4. **`redbear-netctl` is the profile manager, not the supplicant** — it hands off to `/scheme/wifictl`, which hands off to the driver.
### Port vs Rewrite
The chosen approach is a **bounded transport-layer port with native control-plane rewrite above it**:
- Port and reuse transport-layer and firmware-facing logic from Linux `iwlwifi`
- Keep the native Red Bear control plane above that boundary
- Do not import the whole Linux wireless stack in one step
## Hardware Strategy
### Target hardware scope
- **Target**: Intel Wi-Fi chips on Arrow Lake and older Intel client platforms
- **Driver family**: `iwlwifi`-class (7000/8000/9000/AX210/BZ)
- **Security scope**: Open networks + WPA2-PSK only (phase 1)
- **Out of scope**: WPA3, 802.1X, AP mode, roaming, monitor mode, suspend/resume, multi-BSS
The target scope for this plan is now:
- **Intel Wi-Fi chips used on Arrow Lake and older Intel client platforms**
That includes the practical `iwlwifi` family boundary, not an abstract FullMAC-first family chosen
for architectural neatness.
### What this means for phase 1
Phase 1 is no longer “pick any convenient Wi-Fi family.”
Phase 1 is now:
- prove one bounded Intel client Wi-Fi path,
- keep the support language experimental,
- and avoid promising the entire Linux wireless stack up front.
## Security Scope Freeze
### Phase-1 supported security
- open networks
- WPA2-PSK
### Explicitly out of initial scope
- WPA3
- 802.1X / enterprise Wi-Fi
- AP mode
- roaming
- monitor mode
- suspend/resume guarantees
- multi-BSS support
- sophisticated regulatory-domain handling
This scope freeze is required to keep the first milestone honest and achievable.
## Implementation Phases
### Phase W0 — Scope Freeze ✅ Complete
- Intel target scope frozen
- Security scope frozen (open + WPA2-PSK)
- `redbear-wifi-experimental` config slice defined (`config/redbear-wifi-experimental.toml`)
- Unsupported features documented
### Phase W1 — Intel Driver Substrate Fit ✅ Complete (build-side)
- Intel device family mapped onto `redox-driver-sys` primitives
- Firmware naming/fetch path wired through `firmware-loader`
- Minimum `linux-kpi` additions identified and implemented (93 tests)
- All additions stay below the wireless control-plane boundary
**Exit criteria met (build-side)**: Intel target device can be discovered, initialized, and paired
with its firmware-loading path — in compiled/host-tested code. Real hardware validation still pending.
### Phase W2 — Native Wireless Control Plane ✅ Complete (host-tested)
- `redbear-wifictl` daemon with `/scheme/wifictl` scheme
- Stub backend for end-to-end control-plane validation
- Intel backend: device detection, firmware-family reporting, transport-readiness, state machine
- `redbear-netctl` Wi-Fi profile support (SSID/Security/Key)
- Bounded prepare→init-transport→activate-nic→scan→connect→disconnect flow
- `redbear-netctl-console` ncurses TUI client
**Exit criteria met (host-tested)**: Daemon reports scan results and link state honestly in
host-side tests. Runtime validation pending.
### Phase W3 — Network Stack for Post-Association Handoff ✅ Complete (build-side)
- `netcfg` exposes per-device interface nodes dynamically (not hard-coded `eth0`)
- `redbear-netctl` performs DHCP handoff for Wi-Fi profiles
- Native IP plumbing can consume a post-association Wi-Fi interface
**Exit criteria met (build-side)**: A connected Wi-Fi link can be handed off to the existing IP
path without treating it as raw Ethernet. Runtime validation pending.
### Phase W4 — First Association Milestone 🚧 Not started (blocked on hardware)
## Comprehensive Full Plan
## Current Implementation Progress
### Already landed in-tree
The current repo now contains a **bounded Phase W0/W2/W3 slice**:
- the plan target is explicitly Intel Arrow Lake and older Intel Wi-Fi chips
- `redbear-netctl` now supports WiFi profiles with `Connection=wifi`, `Interface=...`, `SSID`,
`Security`, and `Key` / `Passphrase`
- `netctl` now performs a bounded `prepare` → `init-transport` → `connect` handoff into
`/scheme/wifictl`
- that user-facing path now also includes a bounded `activate-nic` step before `connect`
- `netctl scan <profile|iface>` now uses the same `prepare` → `init-transport` ordering before the
active `scan` action
- `netcfg` no longer hard-codes a single `eth0` interface node and can expose interfaces from the
current device list dynamically
- `redbear-wifictl` now exists as a real package/daemon/scheme with:
  - a stub backend for end-to-end control-plane validation
  - an Intel-oriented backend boundary for Arrow Lake-and-lower families
  - firmware-family and firmware-presence reporting
  - a bounded `prepare` step before `connect`
  - transport-readiness reporting derived from PCI command/BAR/IRQ state
  - a bounded PCI transport-prep action that enables memory-space and bus-master bits before connect
  - a bounded `scan` action with a working stub path and a bounded Intel scan/reporting path rather
    than the older explicit `not implemented yet` result
  - a bounded `init-transport` state boundary after preparation and before any future association path
  - a bounded `activate-nic` state boundary after `init-transport`
  - state-machine enforcement so Intel scan/connect refuse to proceed before `init-transport`
- `redbear-info` and the runtime helper scripts now expose the WiFi control-plane surfaces
- `redbear-info` now reports WiFi firmware status, transport status, activation status, and scan results from the
primary WiFi control interface
- `redbear-info` and the runtime helper also now expose `transport-init-status`, which separates
simple transport probing from an actual transport-initialization attempt
- on Redox runtime builds where `/usr/lib/drivers/redbear-iwlwifi` is present **and** at least one
Intel Wi-Fi candidate is actually detectable, `redbear-wifictl` now auto-selects the Intel backend
instead of silently falling back to the stub backend
- if the Intel driver package is present but no Intel Wi-Fi candidate is detected, `redbear-wifictl`
now exposes a dedicated no-device fallback rather than a synthetic stub `wlan0`, so the runtime
does not pretend the Intel path is usable
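The backend auto-selection rule in the last two bullets can be written down as a small decision function. This is a sketch under stated assumptions, not the `redbear-wifictl` code: `PciFunction`, `Backend`, and `select_backend` are hypothetical names, treating "Intel vendor ID `0x8086` with PCI class `0x02`, subclass `0x80`" as the candidate test is an assumption, and falling back to the stub backend when the driver package is absent is inferred rather than stated.

```rust
/// Minimal PCI function description; field names are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct PciFunction {
    pub vendor: u16,
    pub device: u16,
    pub class: u8,
    pub subclass: u8,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    /// Driver package present and an Intel Wi-Fi candidate detected.
    Intel,
    /// Driver package present, but nothing the Intel path could drive.
    NoDevice,
    /// No Intel driver package installed (assumed fallback).
    Stub,
}

const INTEL_VENDOR: u16 = 0x8086;
const CLASS_NETWORK: u8 = 0x02;
const SUBCLASS_OTHER_NETWORK: u8 = 0x80;

/// Assumed candidate test: Intel vendor plus the "other network controller" class.
pub fn is_intel_wifi_candidate(f: &PciFunction) -> bool {
    f.vendor == INTEL_VENDOR && f.class == CLASS_NETWORK && f.subclass == SUBCLASS_OTHER_NETWORK
}

/// `driver_present` stands for "`/usr/lib/drivers/redbear-iwlwifi` exists".
pub fn select_backend(driver_present: bool, functions: &[PciFunction]) -> Backend {
    if !driver_present {
        return Backend::Stub;
    }
    if functions.iter().any(is_intel_wifi_candidate) {
        Backend::Intel
    } else {
        Backend::NoDevice
    }
}
```

With this rule, a host whose only Wi-Fi function is the MediaTek `14c3:0608` device mentioned earlier would yield `Backend::NoDevice` when the driver package is installed, rather than a synthetic stub interface.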
### What this means
This does **not** mean Red Bear has working Intel WiFi connectivity yet.
It means the repo now has:
- a real WiFi profile model,
- a real WiFi control-plane daemon and scheme,
- a first dedicated Intel WiFi driver-side package (`redbear-iwlwifi`),
- a runtime helper for the bounded Intel driver probe path (`local/scripts/test-iwlwifi-driver-runtime.sh`),
- a runtime check that the WiFi control daemon selects the Intel backend only when Intel WiFi
candidates are actually present,
- a native post-association IP handoff path that can address non-`eth0` interfaces,
- a firmware-aware, transport-aware Intel backend boundary,
- a bounded active scan surface,
- and a bounded transport-initialization surface.
The current bounded implementation is therefore no longer just static plumbing. It now has a real
user-facing WiFi orchestration flow through `netctl`, a real control daemon state machine, and a
real Intel-targeted firmware/transport preparation boundary.
That is the first substantial WiFi bring-up slice, but not the final result.
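The control daemon state machine referred to above can be sketched as a transition function over the bounded actions. The `Stage` and `Action` names are hypothetical, and the exact transition table is an assumption reconstructed from the ordering described in this section: scan is refused before `init-transport`, connect requires `activate-nic`, and a connect attempt ends in a pending "associating" stage rather than a claimed association.

```rust
/// Hypothetical lifecycle stages, ordered from least to most initialized.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Idle,
    Prepared,
    TransportInit,
    NicActive,
    Associating,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Prepare,
    InitTransport,
    ActivateNic,
    Scan,
    Connect,
    Disconnect,
}

/// Apply one action; out-of-order requests are refused instead of being faked.
pub fn step(stage: Stage, action: Action) -> Result<Stage, &'static str> {
    use Action::*;
    use Stage::*;
    match (action, stage) {
        (Prepare, Idle) => Ok(Prepared),
        (InitTransport, Prepared) => Ok(TransportInit),
        (ActivateNic, TransportInit) => Ok(NicActive),
        // Scan needs an initialized transport and does not change the stage.
        (Scan, s) if s >= TransportInit => Ok(s),
        // Connect only reaches a pending state; real association is unproven.
        (Connect, NicActive) => Ok(Associating),
        (Disconnect, Associating) => Ok(NicActive),
        _ => Err("action not allowed in current stage"),
    }
}
```

For example, `step(Stage::Idle, Action::Scan)` returns an error, while running `Prepare`, `InitTransport`, `ActivateNic`, and `Connect` in order ends in `Stage::Associating`.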
### Still missing after the current slice
- real Intel transport initialization
- actual firmware loading/prepare action on Redox target hardware
- scan implementation against real hardware
- authentication and association
- WPA2 key negotiation on a real link
- DHCP/static IP handoff on a real associated wireless interface
- runtime validation on Intel hardware or a realistic guest path
### Phase W0 — Scope Freeze and Package-Group Definition
**Goal**: Define the first Wi-Fi milestone precisely before implementation starts.
**Goal**: One real Wi-Fi connection under phase-1 scope.
**What to do**:
1. Obtain an Intel Wi-Fi device (iwlwifi-class) for bare-metal or VFIO passthrough testing
2. Boot Red Bear on hardware with the Intel Wi-Fi PCI function visible
3. Verify firmware loads via `firmware-loader`
4. Verify transport init succeeds (command queue alive, firmware responds)
5. Scan for one real SSID
6. Join one test network (open or WPA2-PSK)
7. Hand off to DHCP or static IP
8. Confirm bidirectional connectivity
- freeze the target scope to Intel Arrow Lake and older Intel Wi-Fi chips
- freeze security scope to open + WPA2-PSK
- define `net-wifi-experimental` as the package/config slice for first Wi-Fi support
- document unsupported wireless features explicitly
**Exit criteria**: One Intel device family reaches usable network connectivity on a real network.
**Exit criteria**:
**Prerequisites**:
- Intel Wi-Fi PCI device available for testing
- `low-level controller` / IRQ quality validated (current blocker chain)
- Firmware blobs for the target device family
- Intel target scope is explicit
- support language and non-goals are written down
- the repo has a standalone tracked Wi-Fi experimental profile (`config/redbear-wifi-experimental.toml`) extending the minimal Red Bear baseline
### Phase W5 — Runtime Reporting and Recovery (After W4)
---
- Extend `redbear-info` with real Wi-Fi runtime evidence (not just bounded surfaces)
- Reconnect after disconnect
- Failure-state reporting and retry
- `redbear-phase5-wifi-check/run/capture/analyze` validated against real hardware
### Phase W1 — Intel Driver Substrate Fit
**Goal**: Prove the Intel target family can fit Red Bear's existing driver primitives and identify
the minimum additional compatibility surface required.
**What to do**:
- map the Intel target family onto `redox-driver-sys`
- verify firmware naming and fetch path through `firmware-loader`
- identify exactly which `linux-kpi` additions are mandatory for Intel transport/firmware bring-up
- keep those additions below the wireless control-plane boundary
**Exit criteria**:
- one Intel target device can be discovered, initialized, and paired with its firmware-loading path
---
### Phase W2 — Native Wireless Control Plane
**Goal**: Add a Red Bear-native wireless daemon and control interface.
**What to do**:
- implement a Wi-Fi daemon that owns:
  - scan state
  - auth/association state
  - link state
  - disconnect/retry behavior
  - credential ownership
- add a user-facing `wifictl`-style control surface
**What not to do**:
- do not push supplicant logic into `redbear-netctl`
- do not model Wi-Fi as “just another Ethernet profile” at this phase
**Exit criteria**:
- the daemon can report scan results and current link state honestly
---
### Phase W3 — Network Stack Refactor for Post-Association Handoff
**Goal**: Make the native IP stack accept Wi-Fi as a first-class post-association interface.
**What to do**:
- generalize current `eth0` / Ethernet assumptions where needed
- allow the native stack to consume a post-association Wi-Fi interface state
- keep route/address/DNS handling in native `netcfg` / `smolnetd` plumbing after association
**Exit criteria**:
- a connected Wi-Fi link can be handed off to the existing IP path without pretending it is merely a
raw Ethernet control-plane object
---
### Phase W4 — First Association Milestone
**Goal**: Achieve one real Wi-Fi connection under the frozen phase-1 scope.
**What to do**:
- scan for one real SSID
- join one test network
- complete open or WPA2-PSK association
- hand off to DHCP or static IP configuration
**Exit criteria**:
- one chosen device family reaches usable network connectivity on a real network
---
### Phase W5 — Runtime Reporting and Recovery
**Goal**: Make Wi-Fi support diagnosable and honest.
**What to do**:
- extend `redbear-info` with Wi-Fi-specific runtime reporting
- add reconnect and failure-state reporting
- keep all support labels experimental
**Exit criteria**:
- users can see whether hardware is present, firmware is loaded, scans succeed, and association has
succeeded or failed
---
**Exit criteria**: Users can see whether hardware is present, firmware is loaded, scans succeed,
and association has succeeded or failed — backed by real hardware evidence.
### Phase W6 — Desktop Compatibility (Later)
**Goal**: Add desktop-oriented control only after native Wi-Fi works.
- If KDE or desktop workflows require it, add a compatibility shim over the native Wi-Fi service
- Keep the shim above the native control plane, not in place of it
**What to do**:
### Phase W7 — Broader Hardware Reassessment (Later)
- if KDE or desktop workflows require it, add a small compatibility shim over the native Wi-Fi
service
- keep that shim above the native control plane, not in place of it
**Exit criteria**:
- desktop Wi-Fi workflows become possible without changing the native subsystem boundaries
---
### Phase W7 — Broader Hardware and `linux-kpi` Reassessment
**Goal**: Reassess whether Red Bear wants to widen WiFi support after one bounded Intel path works.
**What to do**:
- only after one bounded Intel transport/association path is validated, decide whether a wider
multi-family or deeper `linux-kpi` path is worth the cost
- do not assume this is automatically justified
**Exit criteria**:
- Red Bear either keeps the narrow native-first architecture, or consciously chooses a larger Linux
wireless-compat effort with full awareness of the cost
- After one bounded Intel path is validated, reassess whether wider multi-family or deeper
`linux-kpi` growth is justified
- Do not assume this is automatically warranted
## Validation Gates
Wi-Fi should not be described as supported until these gates are passed in order:
Wi-Fi should not be described as supported until these gates pass in order:
1. hardware is detected
2. firmware loads successfully
3. the driver/daemon initializes and reports link state
4. scan sees a real SSID
5. association succeeds for one supported network type
6. DHCP or static IP handoff succeeds through the native network stack
7. reconnect works after disconnect or reboot
8. `redbear-info` and profile docs report supported and unsupported states honestly
1. ✅ Hardware detected via PCI scheme
2. 🚧 Firmware loads successfully
3. 🚧 Driver/daemon initializes and reports link state
4. 🚧 Scan sees a real SSID
5. 🚧 Association succeeds for one supported network type
6. 🚧 DHCP or static IP handoff succeeds
7. 🚧 Reconnect works after disconnect or reboot
8. 🚧 `redbear-info` reports all states honestly with real evidence
Until then, support language should remain under `net-wifi-experimental` only.
Until all gates pass, support language stays under `redbear-wifi-experimental`.
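Because the gates must pass in order, a later gate that happens to pass does not count while an earlier one is still open. A minimal sketch of that rule follows; the `GATES` strings paraphrase the list above, and the function names and the literal `"supported"` label are assumptions for illustration only.

```rust
/// The eight validation gates, in the order they must pass.
pub const GATES: [&str; 8] = [
    "hardware detected",
    "firmware loads",
    "driver/daemon initializes and reports link state",
    "scan sees a real SSID",
    "association succeeds",
    "DHCP or static IP handoff succeeds",
    "reconnect works",
    "redbear-info reports states honestly",
];

/// Index of the first gate that has not passed, or `None` if all have.
pub fn first_failing_gate(passed: &[bool; 8]) -> Option<usize> {
    passed.iter().position(|&ok| !ok)
}

/// Support label implied by the gate results (label strings are illustrative).
pub fn support_label(passed: &[bool; 8]) -> &'static str {
    match first_failing_gate(passed) {
        None => "supported",
        Some(_) => "redbear-wifi-experimental",
    }
}
```

With only the first gate passed, `first_failing_gate` returns `Some(1)`, which names "firmware loads" as the next gate to clear.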
## Support-Language Guidance
## Current Blockers
Until the validation gates above are passed, Red Bear should use language such as:
1. **No Intel Wi-Fi hardware available for testing** — the current host has a MediaTek MT7921K
(`14c3:0608`), not an Intel `iwlwifi` device
2. **Low-level controller / IRQ quality** — must be validated before driver bring-up is reliable
3. **VFIO not loaded on current host** — passthrough path requires `vfio_pci` module and compatible IOMMU groups
- “Wi-Fi is not supported yet”
- “Wi-Fi remains experimental and hardware-specific”
- “The current wireless path is an experimental Intel bounded-transport bring-up”
## Scripts and Validation Tools
Avoid language such as:
| Script | Purpose |
|---|---|
| `test-iwlwifi-driver-runtime.sh` | Bounded Intel driver lifecycle check in target runtime |
| `test-wifi-control-runtime.sh` | Bounded Wi-Fi control/profile runtime check |
| `test-wifi-baremetal-runtime.sh` | Strongest in-repo Wi-Fi runtime check on real Red Bear target |
| `test-wifi-passthrough-qemu.sh` | QEMU/VFIO Wi-Fi validation with in-guest checks |
| `validate-wifi-vfio-host.sh` | Host-side VFIO passthrough readiness check |
| `prepare-wifi-vfio.sh` | Bind/unbind Intel Wi-Fi PCI function for VFIO |
| `run-wifi-passthrough-validation.sh` | One-shot host wrapper for full passthrough validation |
| `package-wifi-validation-artifacts.sh` | Package validation artifacts into host-side tarball |
| `summarize-wifi-validation-artifacts.sh` | Summarize captured artifacts for quick triage |
| `finalize-wifi-validation-run.sh` | Analyze capture bundle and package final evidence set |
- “Linux WiFi drivers are supported”
- “wireless support works”
- “Wi-Fi is generally available”
Packaged validators (inside target runtime):
- `redbear-phase5-wifi-check` — bounded in-target Wi-Fi validation
- `redbear-phase5-wifi-run` — run bounded Wi-Fi lifecycle
- `redbear-phase5-wifi-capture` — capture runtime evidence bundle
- `redbear-phase5-wifi-analyze` — analyze captured evidence
- `redbear-phase5-wifi-link-check` — link-level validation
unless profile-scoped validation evidence exists.
## Related Documents
- `local/docs/WIFI-VALIDATION-RUNBOOK.md` — canonical operator runbook for bare-metal and VFIO validation
- `local/docs/WIFI-VALIDATION-ISSUE-TEMPLATE.md` — issue template for validation failures
- `local/docs/WIFICTL-SCHEME-REFERENCE.md`: `/scheme/wifictl` protocol reference
- `docs/04-LINUX-DRIVER-COMPAT.md` — linux-kpi and redox-driver-sys architecture
## Summary
The best Red Bear Wi-Fi path is **native-first**:
- native wireless control plane
- one experimental bounded Intel family path first
- Native wireless control plane (`redbear-wifictl` + `redbear-netctl`)
- One experimental Intel family path first (`redbear-iwlwifi`)
- `firmware-loader` + `redox-driver-sys` underneath
- optional narrow `linux-kpi` glue only where useful
- native `smolnetd` / `netcfg` / `redbear-netctl` reused only after association
- Narrow `linux-kpi` glue only where useful (93 tests, 17 modules)
- Native `smolnetd` / `netcfg` / `dhcpd` reused after association
`linux-kpi` is therefore **feasible only in a narrow sense**. It is useful as a low-level helper
for driver bring-up, but it is not currently a viable full WiFi architecture for Red Bear OS.
That is the most realistic way to integrate WiFi into Red Bear while keeping the design aligned
with the repo's current userspace-driver and profile-based architecture.
The codebase has 119 tests passing (93 linux-kpi + 8 redbear-iwlwifi + 18 redbear-wifictl), no production `unwrap()` in the Wi-Fi daemon request loop (startup uses `expect()`), atomic command
handling, proper timer cancellation, honest timeout reporting, and real 802.11 frame parsing.
The structural skeleton is solid. The next required step is **real hardware validation** with an
Intel Wi-Fi device — everything else is gated on that.