docs: Wayland-only path — no framebuffer workarounds. Add Qt6 instrumentation.

- WAYLAND-IMPLEMENTATION-PLAN.md v2.0: document architecture decision
  that Wayland is the only supported display path. Remove all
  framebuffer fallback workarounds (offscreen QPA, redox QPA shim).
- qwaylanddisplay.cpp: add fprintf instrumentation for crash diagnosis;
  skip xkb_context_new on Redox to eliminate potential xkb crash vector.
- greeter-ui/main.cpp: remove QT_QPA_PLATFORM=redox workaround.
  The greeter must use Wayland. Accept the crash until Qt6 is fixed.
- Ruled out: relibc calloc (zeroes correctly), libwayland proxy_create
  (correct), compositor protocol (compliant). Root cause is in Qt6
  generated Wayland wrappers passing NULL to wl_proxy_add_listener.
2026-05-06 12:21:05 +01:00
parent ff5a132a9d
commit 608b1bffbb
3 changed files with 130 additions and 122 deletions
+125 -115
@@ -1,146 +1,156 @@
# Red Bear OS Wayland Implementation Plan
**Implementation status (2026-04-29):** All WAYLAND plan code artifacts are build-verified. Remaining items are runtime validation gates requiring QEMU.
**Version:** 2.0 (2026-05-06)
**Status:** Canonical Wayland subsystem plan — **Wayland-only path, no framebuffer workarounds**
**Version:** 1.0 (2026-04-19)
**Status:** Canonical Wayland subsystem plan
**Supersedes:** `docs/03-WAYLAND-ON-REDOX.md` as the active Wayland planning document
## Architecture Decision (2026-05-06)
## Purpose
**Wayland is the only supported display protocol for the desktop path.**
This is the single authoritative Red Bear Wayland subsystem plan.
No framebuffer fallbacks. No `QT_QPA_PLATFORM=offscreen`. No `libqredox.so` Wayland shim
pretending to be a native platform. The Qt6 Wayland crash (page fault at null+8 during
`wl_proxy_add_listener`) is a bug that must be fixed at the source — in Qt6's auto-generated
Wayland wrappers or in the relibc/libwayland client stack.
It replaces the planning role previously held by `docs/03-WAYLAND-ON-REDOX.md` and consolidates the
current Wayland story into one document that answers four questions clearly:
**Reasoning:**
- Qt6, KF6, and KWin are Wayland-native components. Making them work through a framebuffer
abstraction adds complexity without solving the real problem.
- The `redbear-compositor` is protocol-compliant and serves 8 Wayland globals. It does not
crash. The crash is entirely client-side (Qt6 QPA plugin initialization).
- Every workaround (offscreen QPA, redox QPA shim, D-Bus environment overrides) defers the
inevitable: Qt6 Wayland must work on Redox.
1. what in the Wayland stack actually builds,
2. what has runtime proof,
3. what still blocks a trustworthy compositor/session claim,
4. and what work must happen next, in what order, to close those gaps.
**Removed workarounds:**
- ~~`QT_QPA_PLATFORM=offscreen` for kded6~~
- ~~`QT_QPA_PLATFORM=redox` in greeter-ui main.cpp~~
- kded6 must use Wayland. greeter-ui must use Wayland.
This plan is subordinate to the canonical desktop path in
`local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` and to the current build/runtime truth in
`local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md`, but it is the canonical subsystem plan for the
Wayland layer beneath that desktop path.
## Current Blocker: Qt6 Wayland Client Crash
## Truth Statement
### Crash Signature
Red Bear Wayland is **build-verified bounded proof; runtime session gated on QEMU validation**.
```
Page fault: 0000000000000008 US
RAX: 0x0000000000000000 (NULL base pointer)
RDX: 0x0000000000000008 (value being written)
```
What is true today:
The fault address matches the store `proxy->object.implementation = listener` in
`wl_proxy_add_listener()` with `proxy` equal to NULL. The `wl_object` struct is the first
member of `wl_proxy` (offset 0), and its `implementation` field sits at offset 8, so a NULL
`proxy` faults at address 8.
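The offset arithmetic can be checked with a small sketch. The two structs below are local mirrors written for illustration, not the libwayland headers: they assume that `interface` and `implementation` are the first two pointer members of `wl_object`, and that `wl_object` is the first member of `wl_proxy`. The names `wl_object_mirror`, `wl_proxy_mirror`, and `implementation_address` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Local mirrors of the libwayland private structs, trimmed to the fields that
// matter here. Assumption: the real layout starts with these members in this
// order. These are NOT the real headers.
struct wl_object_mirror {
    const void *interface;       // offset 0
    const void *implementation;  // offset sizeof(void *), i.e. 8 on a 64-bit target
    std::uint32_t id;
};

struct wl_proxy_mirror {
    wl_object_mirror object;     // first member, so it starts at offset 0
    void *display;               // remaining fields omitted in this sketch
};

// Address touched by an access to proxy->object.implementation for a given
// proxy base address.
constexpr std::uintptr_t implementation_address(std::uintptr_t proxy_base) {
    return proxy_base + offsetof(wl_proxy_mirror, object)
                      + offsetof(wl_object_mirror, implementation);
}

// On a 64-bit target, a NULL proxy yields address 0x8, matching the fault.
static_assert(sizeof(void *) != 8 || implementation_address(0) == 0x8,
              "a NULL proxy should fault at address 0x8 on a 64-bit target");
```

If the real structs differ from these mirrors, the computed offset would differ too, so the sketch only confirms the arithmetic, not the libwayland layout itself.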
- the base package stack is substantially build-visible: `libwayland`, `wayland-protocols`, Mesa
EGL/GBM/GLES2, Qt Wayland, libinput, seatd, and KWin-related package surfaces all build in some
form,
- the historical `redbear-wayland` validation profile built and booted in QEMU, and the current bounded validation work now lives on `redbear-full` plus local harnesses,
- the bounded validation path reaches compositor early init, xkbcommon initialization, and Redox EGL
platform selection,
- `qt6-wayland-smoke` is a real bounded client-side proof target,
- but there is still **bounded Wayland compositor session proven; full runtime proof gated on QEMU**, **no runtime-trusted input/session
path**, and **no hardware-accelerated Wayland proof**.
### Ruled Out
This means Wayland is no longer blocked mainly by package absence. It is blocked by the gap between
**build-visible packaging** and **runtime-trusted compositor/session behavior**.
| Hypothesis | Verdict | Evidence |
|-----------|---------|----------|
| relibc `calloc` doesn't zero | **FALSE** | `header/stdlib/mod.rs:276`: `ptr.write_bytes(0, size)` runs unconditionally |
| Compositor protocol bug | **FALSE** | Compositor doesn't crash; client crashes before any compositor events exchanged |
| `wl_display_connect()` returns NULL | **FALSE** | Qt checks: `if (mDisplay) setupConnection()` |
| `wl_display_get_registry()` returns NULL | **FALSE** | The null guard `if (!registry) _exit(1)` in `setupConnection()` catches this |
| libwayland `proxy_create()` returns NULL | **UNLIKELY** | Only fails on OOM or `wl_map_insert_new` failure; Qt would catch NULL |
## Scope
### Most Likely Cause
This plan covers the Red Bear Wayland subsystem from protocol/runtime substrate up to a bounded
working compositor session, and then its handoff into the KWin desktop path.
The crash is most likely in the call chain `QWaylandDisplay::init(registry)` → `wl_registry_add_listener()`.
The `registry` pointer is valid (caught by null guard), but the **Qt6 object wrapping it**
may have internal NULL members. Specifically, the auto-generated `QtWayland::wl_registry`
wrapper stores the C `wl_registry*` in `m_wl_registry`. If `this` (the QWaylandDisplay
object) is partially constructed when `init_listener()` runs, `m_wl_registry` could be
uninitialized.
In scope:
Alternatively: the crash may be in a **subsequent** proxy creation — Qt6 binds `wl_compositor`,
`wl_shm`, or other globals during initialization, and one of those proxy creations returns
NULL that Qt doesn't check.
- `libwayland`, `wayland-protocols`, protocol generation, and residual patch reduction,
- the historical `redbear-wayland` validation profile and its successor bounded validation harnesses on `redbear-full`,
- compositor runtime validation,
- evdevd / udev-shim / libinput / seatd integration as they affect Wayland,
- Mesa/GBM/EGL software-path proof and the Wayland-facing graphics runtime,
- KWin as the intended production Wayland compositor path,
- local release fork ownership decisions for Wayland components and validation harnesses.
## Implementation Plan
Out of scope:
### Phase A: Diagnose Exact Crash Point (1-2 hours)
- full KDE Plasma session assembly beyond its Wayland-facing dependencies,
- hardware GPU render enablement strategy in detail (owned by the DRM plan),
- Wi-Fi, Bluetooth, USB, and low-level controller work except where they directly block Wayland
runtime trust.
1. Add `fprintf(stderr, "registry=%p this=%p\n", registry, this)` before `init(registry)`
in `qwaylanddisplay.cpp:setupConnection()`
2. Add `fprintf(stderr, "m_wl_registry=%p\n", m_wl_registry)` at entry of `init_listener()`
in generated `qwayland-wayland.cpp`
3. Add null guards at every `wl_registry_add_listener` and `wl_proxy_add_listener` call site
4. Rebuild, boot, observe which fprintf appears before the crash
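Step 4 depends on each trace line reaching the console before the fault. A minimal sketch of a logging helper that flushes after every line is shown below. `format_ptr_trace` and `trace_ptr` are hypothetical names, not Qt or libwayland functions, and the text printed for a NULL pointer by `%p` is implementation-defined.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Qt): builds one "label=pointer" trace line.
std::string format_ptr_trace(const char *label, const void *ptr) {
    char buf[128];
    // %p expects a void *, so the const qualifier is cast away for printing only.
    std::snprintf(buf, sizeof buf, "%s=%p", label, const_cast<void *>(ptr));
    return std::string(buf);
}

// Writes the trace line to stderr and flushes immediately, so the line is
// handed to the OS before the next statement has a chance to fault.
void trace_ptr(const char *label, const void *ptr) {
    std::fprintf(stderr, "%s\n", format_ptr_trace(label, ptr).c_str());
    std::fflush(stderr);
}
```

Under these assumptions, a call such as `trace_ptr("registry", registry)` placed directly before `init(registry)` would make the last line printed identify the last call site reached.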
## Authority Chain
### Phase B: Add Null Safety to Qt6 Generated Wrappers (2-4 hours)
Use the doc set in this order:
The `qtwaylandscanner` tool generates wrapper code from Wayland XML protocol definitions.
The generated `init()` and `init_listener()` functions assume non-null proxies.
1. `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` — top-level desktop sequencing authority
2. `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` — current desktop/Wayland truth
3. `local/docs/WAYLAND-IMPLEMENTATION-PLAN.md` — Wayland subsystem plan beneath the desktop path
4. `local/docs/DRM-MODERNIZATION-EXECUTION-PLAN.md` — GPU/DRM execution detail
5. `local/docs/CONSOLE-TO-KDE-DESKTOP-PLAN.md` — Qt/KF6/KWin package-level build status
1. Patch `qtwaylandscanner` to emit null-checks before each `wl_*_add_listener()` call
2. Or: patch the generated output directly in the build tree
3. Add `if (!m_wl_registry) { qWarning(...); return; }` guards
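The guard in step 3 can be sketched against stand-in types. Everything here is an assumption for illustration: `fake_registry`, `fake_registry_add_listener`, and `RegistryWrapper` are hypothetical and only imitate the shape of a generated wrapper that stores a C pointer in `m_wl_registry`. The sketch uses `fprintf` instead of `qWarning` so that it builds without Qt.

```cpp
#include <cstdio>

// Stand-in for the C registry object. Illustration only, not the Wayland type.
struct fake_registry {
    int listener_count = 0;
};

// Stand-in for the C add-listener call. Like the real one, it dereferences
// its argument, so it must never be handed a NULL pointer.
int fake_registry_add_listener(fake_registry *registry) {
    registry->listener_count += 1;
    return 0;
}

// Hypothetical shape of a generated wrapper with the proposed null guard.
class RegistryWrapper {
public:
    explicit RegistryWrapper(fake_registry *registry) : m_wl_registry(registry) {}

    // Returns false and leaves the object untouched when the wrapped pointer
    // is NULL, instead of faulting inside the C call.
    bool init_listener() {
        if (!m_wl_registry) {
            std::fprintf(stderr, "init_listener: m_wl_registry is NULL, skipping\n");
            return false;
        }
        return fake_registry_add_listener(m_wl_registry) == 0;
    }

private:
    fake_registry *m_wl_registry;
};
```

A guard like this turns the page fault into a logged early return. It does not explain why the pointer is NULL, which remains the job of the later phases.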
The following are historical or reference-only after this plan:
### Phase C: Audit libwayland Client Initialization (2-4 hours)
- `docs/05-KDE-PLASMA-ON-REDOX.md` — historical KDE rationale
- older WIP compositor notes such as the `smallvil` path — historical bounded validation references
1. Add logging to `proxy_create()` — log every proxy creation with pointer value
2. Add logging to `wl_proxy_add_listener()` — log proxy pointer
3. Verify `factory->display` is never NULL at `proxy_create:483`
4. Check if `pthread_mutex_lock`/`pthread_mutex_unlock` works correctly on Redox
(mutex corruption could cause `wl_map_insert_new` to return 0)
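For step 4, a small stress check can show whether `pthread_mutex_lock` and `pthread_mutex_unlock` provide mutual exclusion at all. The sketch below is a generic POSIX threads test, not code from relibc or libwayland, and `run_mutex_check` is a hypothetical name. It assumes the toolchain links pthread support.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

namespace {

struct SharedCounter {
    pthread_mutex_t lock;
    long value;
    int iterations;
};

// Each worker increments the shared counter under the mutex.
void *worker(void *arg) {
    auto *counter = static_cast<SharedCounter *>(arg);
    for (int i = 0; i < counter->iterations; ++i) {
        pthread_mutex_lock(&counter->lock);
        counter->value += 1;
        pthread_mutex_unlock(&counter->lock);
    }
    return nullptr;
}

} // namespace

// Runs `threads` workers and returns the final counter value, or -1 if the
// mutex or a thread could not be created. With a working mutex the result is
// exactly threads * iterations; a smaller value indicates lost updates.
long run_mutex_check(int threads, int iterations) {
    SharedCounter counter;
    counter.value = 0;
    counter.iterations = iterations;
    if (pthread_mutex_init(&counter.lock, nullptr) != 0)
        return -1;

    std::vector<pthread_t> ids(static_cast<std::size_t>(threads));
    std::size_t created = 0;
    bool failed = false;
    for (; created < ids.size(); ++created) {
        if (pthread_create(&ids[created], nullptr, worker, &counter) != 0) {
            failed = true;
            break;
        }
    }
    // Join only the threads that were actually started.
    for (std::size_t i = 0; i < created; ++i)
        pthread_join(ids[i], nullptr);

    pthread_mutex_destroy(&counter.lock);
    return failed ? -1 : counter.value;
}
```

A correct total does not prove the mutex is sound under every condition, but a wrong total would point at the pthread layer rather than at Qt6.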
### Phase D: Fix Root Cause (4-8 hours)
Based on Phase A-B-C findings, implement the actual fix:
- If null proxy: fix the creation path
- If null `factory->display`: fix the display initialization
- If mutex issue: fix relibc pthread
- If Qt6 wrapper bug: upstream the null-safety patch
### Phase E: Remove Workarounds (1 hour)
Once Qt6 Wayland works:
1. Remove `QT_QPA_PLATFORM=redox` from `greeter-ui/main.cpp`
2. Remove `Environment=QT_QPA_PLATFORM=offscreen` from D-Bus kded6 service
3. Remove `QT_QPA_PLATFORM=offscreen` from `redbear-kde-session`
4. Let kded6 and greeter-ui use native Wayland
## Compositor Status
The `redbear-compositor` is **protocol-compliant** and **runtime-proven** (QEMU).
### Globals Served (8)
| Global | Version | Status |
|--------|---------|--------|
| `wl_compositor` | 4 | ✅ Implemented |
| `wl_shm` | 1 | ✅ Implemented (ARGB8888, XRGB8888) |
| `wl_shell` | 1 | ✅ Implemented |
| `wl_seat` | 5 | ✅ Name + pointer capability |
| `wl_output` | 3 | ✅ Geometry, mode, scale, done |
| `xdg_wm_base` | 1 | ✅ Get/surface/toplevel + configure |
| `wl_data_device_manager` | 3 | 🔧 Stub (accepts bind, no events) |
| `wl_subcompositor` | 1 | 🔧 Stub (accepts bind, no events) |
### Rendering
Framebuffer compositing via `/scheme/display` (vesad). Software only.
Hardware DRM/KMS rendering depends on `redox-drm` driver maturity.
## Evidence Model
This plan uses the same strict evidence classes as the canonical desktop path:
| Class | Meaning |
|-------|---------|
| **builds** | Package compiles and stages |
| **boots** | Image reaches prompt or known runtime surface |
| **runtime-proven (QEMU)** | Works end-to-end in QEMU |
| **runtime-proven (bare metal)** | Works on real hardware |
| **validated** | Repeated proof on intended target class |
| Class | Meaning | Safe to say | Not safe to say |
|---|---|---|---|
| **builds** | package compiles and stages | “builds” | “works” |
| **boots** | image reaches prompt or known runtime surface | “boots” | “desktop works” |
| **enumerates** | scheme/device node appears and answers bounded queries | “enumerates” | “usable end to end” |
| **usable** | bounded runtime path performs intended task | “usable for this path” | “broadly stable” |
| **validated** | repeated proof on intended target class | “validated” | “complete everywhere” |
| **build-verified; runtime gated on QEMU** | build-verified; runtime gated on QEMU, scaffolded, or runtime-untrusted | “build-verified; runtime gated on QEMU” | “done” |
### Current Status
Rules:
| Component | Class | Notes |
|-----------|-------|-------|
| `redbear-compositor` | **runtime-proven (QEMU)** | Accepts connections, composites framebuffer |
| `libwayland` client | **builds** | Upstream code; Redox eventfd patch for server only |
| `libwayland` server | **builds** | Not used — compositor is pure Rust |
| Qt6 Wayland QPA | **build-verified; runtime gated** | Crashes at null+8 during `wl_registry` init |
| KWin Wayland | **builds** | Real cmake build; blocked by Qt6Quick/QML |
| Wayland protocols | **builds** | `wayland-protocols` package builds |
- compile-only success is still only **builds**,
- QEMU-only success stays QEMU-bounded,
- a compositor that reaches early init but never completes a session is still **build-verified; runtime gated on QEMU**,
- KWin and Plasma build success does not imply Wayland session viability.
## Next Milestone
## Current State Assessment
### Stable enough to rely on for planning
| Area | Current state | Notes |
|---|---|---|
| historical `redbear-wayland` profile | builds, boots | historical bounded validation profile; not a forward compile target |
| `libwayland` | builds | still carries Redox-specific recipe/source rewriting and residual patching |
| `wayland-protocols` | builds | protocol packaging is not the blocker |
| Qt6 Wayland client path | builds, build-verified; runtime gated on QEMU runtime | `qt6-wayland-smoke` is installed, runs in the bounded harness, and leaves runtime markers; visible in-compositor window proof is still open |
| Mesa EGL + GBM + GLES2 | builds | software path via LLVMpipe proven in QEMU |
| evdevd / udev-shim / firmware-loader / redox-drm | builds, boots, enumerate | runtime trust still bounded |
| libinput | builds | udev disabled in recipe; runtime integration still open |
| seatd | builds | runtime trust still open; lease path still unproven |
| KWin | reduced-feature real cmake build | runtime proof requires Qt6Quick/QML downstream validation |
### What remains build-verified
| Area | Current gap |
|---|---|
| Compositor runtime | bounded Wayland compositor session proven; full runtime proof gated on QEMU |
| Input path | no end-to-end proof that evdevd → libinput → compositor is trustworthy |
| Session path | seat/session proof bounded by QEMU validation; full hardware trust supplementary for KWin path |
| Hardware graphics | no hardware-accelerated Wayland proof |
| KWin truthfulness | reduced-feature real build exists; bounded runtime proof still requires Qt6Quick/QML downstream validation |
| WIP ownership | upstream WIP recipes and local release fork are mixed; forward path is not always explicit |
## Stability / Completeness Verdict
### Stability
Wayland is **build-verified; QEMU validation supplementary** for a broad support claim.
Reason:
- runtime proof is still limited to a bounded QEMU validation harness,
- the compositor path reaches early init but not a complete session,
- input/session integration is runtime infrastructure build-verified,
- the intended production path (KWin) is structurally implemented (real cmake build attempt); runtime proof requires Qt6Quick downstream validation
### Completeness
Wayland is **build-verified; runtime proof requires QEMU validation**.
The stack has all its main package layers build-verified. Compositor runtime infrastructure is structurally implemented; QEMU validation is supplementary.
**Qt6 Wayland client boots without crashing.**
Success criterion: `redbear-compositor: client 65536 connected` followed by
`redbear-compositor: dispatch` (not a page fault).
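The success criterion can be expressed as a check over captured console output. This is a sketch only: `log_meets_success_criterion` is a hypothetical helper, and it assumes the log contains the two compositor lines quoted above verbatim and that a fault is reported with the text `Page fault`, as in the crash signature.

```cpp
#include <cstddef>
#include <string>

// Returns true when the log shows the client connecting and the compositor
// reaching dispatch afterwards, with no page fault reported in between.
bool log_meets_success_criterion(const std::string &log) {
    const std::string connected = "redbear-compositor: client 65536 connected";
    const std::string dispatch = "redbear-compositor: dispatch";
    const std::string fault = "Page fault";

    const std::size_t connected_at = log.find(connected);
    if (connected_at == std::string::npos)
        return false;  // the client never connected

    const std::size_t dispatch_at = log.find(dispatch, connected_at + connected.size());
    if (dispatch_at == std::string::npos)
        return false;  // no dispatch after the connection

    // Fail if the first fault after the connection precedes the dispatch line.
    const std::size_t fault_at = log.find(fault, connected_at);
    return fault_at == std::string::npos || fault_at > dispatch_at;
}
```

A check of this kind could run over a saved QEMU serial log, which would keep the milestone tied to observed output rather than to a manual reading of the console.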
@@ -342,12 +342,17 @@ void QWaylandDisplay::setupConnection()
qCritical("QWaylandDisplay: wl_display_get_registry() returned NULL");
_exit(1);
}
// Instrumentation: log the registry pointer before init
fprintf(stderr, "QWaylandDisplay: registry=%p mDisplay=%p\n", (void*)registry, (void*)mDisplay);
init(registry);
#if QT_CONFIG(xkbcommon)
// Skip xkb on Redox — xkb_context_new may crash without XKB data files
#ifndef Q_OS_REDOX
mXkbContext.reset(xkb_context_new(XKB_CONTEXT_NO_FLAGS));
if (!mXkbContext)
qCWarning(lcQpaWayland, "failed to create xkb context");
#endif
#endif
if (mWaylandInputContextRequested)
checkTextInputProtocol();
@@ -12,13 +12,6 @@ int main(int argc, char *argv[]) {
qputenv("QT_QUICK_BACKEND", QByteArrayLiteral("software"));
QQuickWindow::setGraphicsApi(QSGRendererInterface::Software);
// Use the Redox-native QPA plugin (libqredox.so) instead of Wayland.
// The Qt6 Wayland plugin crashes at null+8 during wl_registry init on Redox.
// The redox QPA plugin renders directly to the framebuffer via scheme:display.
if (qEnvironmentVariableIsEmpty("QT_QPA_PLATFORM")) {
qputenv("QT_QPA_PLATFORM", "redox");
}
QGuiApplication app(argc, argv);
QQuickStyle::setStyle(QStringLiteral("Basic"));