RedBear-OS/local/docs/WAYLAND-IMPLEMENTATION-PLAN.md
vasilito b66431cbbe fix: restore libwayland redox.patch to working state, update docs
- Reverted redox.patch to original 39-line version (build-tested)
- Documented libwayland→qtbase→kded6 build dependency chain
- Updated WAYLAND-IMPLEMENTATION-PLAN.md to v2.1
- Deleted 45 stale .bak patch files
- pkgar restored from packages/ backup
2026-05-06 14:39:55 +01:00


# Red Bear OS Wayland Implementation Plan
**Version:** 2.1 (2026-05-06)
**Status:** Canonical Wayland subsystem plan — **Wayland-only path, no framebuffer workarounds**
## Architecture Decision (2026-05-06)
**Wayland is the only supported display protocol for the desktop path.**
No framebuffer fallbacks. No `QT_QPA_PLATFORM=offscreen`. No `libqredox.so` Wayland shim
pretending to be a native platform. The Qt6 Wayland crash (page fault at null+8 during
`wl_proxy_add_listener`) is a bug that must be fixed at the source — in Qt6's auto-generated
Wayland wrappers or in the relibc/libwayland client stack.
**Reasoning:**
- Qt6, KF6, and KWin are Wayland-native components. Making them work through a framebuffer
abstraction adds complexity without solving the real problem.
- The `redbear-compositor` is protocol-compliant and serves 8 Wayland globals. It does not
crash. The crash is entirely client-side (Qt6 QPA plugin initialization).
- Every workaround (offscreen QPA, redox QPA shim, D-Bus environment overrides) defers the
inevitable: Qt6 Wayland must work on Redox.
**Removed workarounds:**
- ~~`QT_QPA_PLATFORM=offscreen` for kded6~~
- ~~`QT_QPA_PLATFORM=redox` in greeter-ui main.cpp~~
- kded6 must use Wayland. greeter-ui must use Wayland.
## Current Blocker: Qt6 Wayland Client Crash
### Crash Signature
```
Page fault: 0000000000000008 US
RAX: 0x0000000000000000 (NULL base pointer)
RDX: 0x0000000000000008 (value being written)
```
This corresponds to `proxy->object.implementation = listener` in `wl_proxy_add_listener()`
with `proxy` equal to NULL. `wl_object` is the first member of `wl_proxy` (offset 0), and
its `implementation` field sits at offset 8, so the access lands at address 8.
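
The offset arithmetic can be checked with a small C sketch. The two structs below are simplified mirrors of libwayland's private `wl_object` and `wl_proxy` definitions. The `mirror_` names are hypothetical, only the leading fields are reproduced, and the field order is assumed to match upstream. On a 64-bit target the `implementation` member lands at offset 8, which matches the faulting address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified mirror of libwayland's private wl_object.
 * Assumption: field order matches the upstream definition. */
struct mirror_wl_object {
    const void *interface;      /* offset 0 */
    const void *implementation; /* offset sizeof(void *) */
    uint32_t id;
};

/* Only the leading member matters here; the rest is omitted. */
struct mirror_wl_proxy {
    struct mirror_wl_object object;
    void *display;
};

/* The embedded object must sit at the very start of the proxy. */
static_assert(offsetof(struct mirror_wl_proxy, object) == 0,
              "object is expected to be the first member");

/* Address touched by an access to proxy->object.implementation
 * when proxy == NULL. */
static size_t null_proxy_fault_offset(void)
{
    return offsetof(struct mirror_wl_proxy, object)
         + offsetof(struct mirror_wl_object, implementation);
}
```

On x86_64, `null_proxy_fault_offset()` evaluates to `sizeof(void *)`, i.e. 8.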
### Ruled Out
| Hypothesis | Verdict | Evidence |
|-----------|---------|----------|
| relibc `calloc` doesn't zero | **FALSE** | `header/stdlib/mod.rs:276`: `ptr.write_bytes(0, size)` runs unconditionally |
| Compositor protocol bug | **FALSE** | Compositor doesn't crash; client crashes before any compositor events are exchanged |
| `wl_display_connect()` returns NULL | **FALSE** | Qt checks: `if (mDisplay) setupConnection()` |
| `wl_display_get_registry()` returns NULL | **FALSE** | The added null guard `if (!registry) _exit(1)` would exit before the fault |
| libwayland `proxy_create()` returns NULL | **UNLIKELY** | Only fails on OOM or `wl_map_insert_new` failure; Qt would catch NULL |
### Most Likely Cause
The crash most likely occurs in `QWaylandDisplay::init(registry)` → `wl_registry_add_listener()`.
The `registry` pointer is valid (it passes the null guard), but the **Qt6 object wrapping it**
may have internal NULL members. Specifically, the auto-generated `QtWayland::wl_registry`
wrapper stores the C `wl_registry*` in `m_wl_registry`. If `this` (the QWaylandDisplay
object) is partially constructed when `init_listener()` runs, `m_wl_registry` could be
uninitialized.
Alternatively: the crash may be in a **subsequent** proxy creation — Qt6 binds `wl_compositor`,
`wl_shm`, or other globals during initialization, and one of those proxy creations returns
NULL that Qt doesn't check.
## Implementation Plan
### Phase A: Diagnose Exact Crash Point (1-2 hours)
1. Add `fprintf(stderr, "registry=%p this=%p\n", registry, this)` before `init(registry)`
in `qwaylanddisplay.cpp:setupConnection()`
2. Add `fprintf(stderr, "m_wl_registry=%p\n", m_wl_registry)` at entry of `init_listener()`
in generated `qwayland-wayland.cpp`
3. Add null guards at every `wl_registry_add_listener` and `wl_proxy_add_listener` call site
4. Rebuild, boot, observe which fprintf appears before the crash
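
The checkpoints in steps 1 and 2 could share one helper, sketched below in plain C. `trace_ptr` and `TRACE_PTR` are hypothetical names, not part of Qt or libwayland. The helper prints the source location and the pointer value, flushes, and reports whether the pointer is non-NULL, so the last line on the console identifies the last checkpoint reached before the fault.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print a labelled pointer to stderr.
 * stderr is normally unbuffered; the explicit flush is only a
 * precaution in case the stream was reconfigured.
 * Returns 1 when ptr is non-NULL, 0 otherwise. */
static int trace_ptr(const char *file, int line,
                     const char *label, const void *ptr)
{
    assert(file != NULL && label != NULL);
    fprintf(stderr, "%s:%d %s=%p\n", file, line, label, (void *)ptr);
    fflush(stderr);
    return ptr != NULL;
}

/* Stringifies the expression so the log names the variable. */
#define TRACE_PTR(p) trace_ptr(__FILE__, __LINE__, #p, (const void *)(p))
```

A call site would then read `if (!TRACE_PTR(registry)) return;`, which both logs and guards in one statement.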
### Phase B: Add Null Safety to Qt6 Generated Wrappers (2-4 hours)
The `qtwaylandscanner` tool generates wrapper code from Wayland XML protocol definitions.
The generated `init()` and `init_listener()` functions assume non-null proxies.
1. Patch `qtwaylandscanner` to emit null-checks before each `wl_*_add_listener()` call
2. Or: patch the generated output directly in the build tree
3. Add `if (!m_wl_registry) { qWarning(...); return; }` guards
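
The shape of such a guard can be sketched in C without libwayland headers. The function pointer type below has the same parameter list as `wl_proxy_add_listener()`; `guarded_add_listener`, `fake_add_listener`, and `fake_add_calls` are hypothetical names used only for illustration, and the stub stands in for the real call so the sketch compiles on its own.

```c
#include <assert.h>
#include <stdio.h>

struct wl_proxy; /* opaque here; the real definition is private to libwayland */

/* Same parameter list as wl_proxy_add_listener(). */
typedef int (*add_listener_fn)(struct wl_proxy *proxy,
                               void (**implementation)(void),
                               void *data);

/* Hypothetical guard: log and return -1 instead of dereferencing
 * a NULL proxy or a NULL listener table. */
static int guarded_add_listener(add_listener_fn add,
                                struct wl_proxy *proxy,
                                void (**implementation)(void),
                                void *data,
                                const char *what)
{
    if (add == NULL || proxy == NULL || implementation == NULL) {
        fprintf(stderr, "%s: refusing add_listener, proxy=%p\n",
                what, (void *)proxy);
        return -1;
    }
    return add(proxy, implementation, data);
}

/* Stub standing in for the real libwayland call; counts invocations. */
static int fake_add_calls = 0;

static int fake_add_listener(struct wl_proxy *proxy,
                             void (**implementation)(void),
                             void *data)
{
    (void)proxy;
    (void)implementation;
    (void)data;
    fake_add_calls++;
    return 0;
}
```

Returning -1 keeps the failure visible to the caller; whether the generated Qt wrappers should warn and continue or abort initialization is a separate design decision.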
### Phase C: Audit libwayland Client Initialization (2-4 hours)
1. Add logging to `proxy_create()` — log every proxy creation with pointer value
2. Add logging to `wl_proxy_add_listener()` — log proxy pointer
3. Verify `factory->display` is never NULL at `proxy_create:483`
4. Check if `pthread_mutex_lock`/`pthread_mutex_unlock` works correctly on Redox
(mutex corruption could cause `wl_map_insert_new` to return 0)
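
For step 4, a minimal single-threaded smoke test is sketched below, using only standard POSIX calls. `mutex_smoke_test` is a hypothetical name. It checks that a held `PTHREAD_MUTEX_NORMAL` mutex reports `EBUSY` to `pthread_mutex_trylock` and that a released one can be reacquired. It does not exercise contention between threads, so a pass here does not rule out every mutex bug.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Returns 0 when the mutex behaves as POSIX specifies,
 * otherwise the number of the first failing step. */
static int mutex_smoke_test(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int failed = 0;

    if (pthread_mutexattr_init(&attr) != 0)
        return 1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL) != 0) {
        pthread_mutexattr_destroy(&attr);
        return 2;
    }
    if (pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return 3;
    }
    pthread_mutexattr_destroy(&attr);

    if (pthread_mutex_lock(&m) != 0)
        failed = 4;
    else if (pthread_mutex_trylock(&m) != EBUSY)
        failed = 5; /* a held mutex must report EBUSY */
    else if (pthread_mutex_unlock(&m) != 0)
        failed = 6;
    else if (pthread_mutex_trylock(&m) != 0)
        failed = 7; /* a released mutex must be acquirable */
    else if (pthread_mutex_unlock(&m) != 0)
        failed = 8;

    /* Only destroy a mutex known to be unlocked. */
    if (failed == 0)
        pthread_mutex_destroy(&m);
    return failed;
}
```

A non-zero return names the step that misbehaved, which narrows the search inside relibc's pthread implementation.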
### Phase D: Fix Root Cause (4-8 hours)
Based on the findings from Phases A through C, implement the actual fix:
- If null proxy: fix the creation path
- If null `factory->display`: fix the display initialization
- If mutex issue: fix relibc pthread
- If Qt6 wrapper bug: upstream the null-safety patch
### Phase E: Remove Workarounds (1 hour)
Once Qt6 Wayland works:
1. Remove `QT_QPA_PLATFORM=redox` from `greeter-ui/main.cpp`
2. Remove `Environment=QT_QPA_PLATFORM=offscreen` from D-Bus kded6 service
3. Remove `QT_QPA_PLATFORM=offscreen` from `redbear-kde-session`
4. Let kded6 and greeter-ui use native Wayland
## Compositor Status
The `redbear-compositor` is **protocol-compliant** and **runtime-proven** (QEMU).
### Globals Served (8)
| Global | Version | Status |
|--------|---------|--------|
| `wl_compositor` | 4 | ✅ Implemented |
| `wl_shm` | 1 | ✅ Implemented (ARGB8888, XRGB8888) |
| `wl_shell` | 1 | ✅ Implemented |
| `wl_seat` | 5 | ✅ Name + pointer capability |
| `wl_output` | 3 | ✅ Geometry, mode, scale, done |
| `xdg_wm_base` | 1 | ✅ Get/surface/toplevel + configure |
| `wl_data_device_manager` | 3 | 🔧 Stub (accepts bind, no events) |
| `wl_subcompositor` | 1 | 🔧 Stub (accepts bind, no events) |
### Rendering
Framebuffer compositing via `/scheme/display` (vesad). Software only.
Hardware DRM/KMS rendering depends on `redox-drm` driver maturity.
## Evidence Model
| Class | Meaning |
|-------|---------|
| **builds** | Package compiles and stages |
| **boots** | Image reaches prompt or known runtime surface |
| **runtime-proven (QEMU)** | Works end-to-end in QEMU |
| **runtime-proven (bare metal)** | Works on real hardware |
| **validated** | Repeated proof on intended target class |
### Current Status
| Component | Class | Notes |
|-----------|-------|-------|
| `redbear-compositor` | **runtime-proven (QEMU)** | Accepts connections, composites framebuffer |
| `libwayland` client | **builds** | Upstream code; Redox eventfd patch for server only |
| `libwayland` server | **builds** | Not used — compositor is pure Rust |
| Qt6 Wayland QPA | **build-verified; runtime gated** | Crashes at null+8 during `wl_registry` init |
| KWin Wayland | **builds** | Real cmake build; blocked by Qt6Quick/QML |
| Wayland protocols | **builds** | `wayland-protocols` package builds |
## Next Milestone
**Qt6 Wayland client boots without crashing.**
Success criterion: `redbear-compositor: client 65536 connected` followed by
`redbear-compositor: dispatch` (not a page fault).
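
The success criterion can be checked mechanically against a captured boot log. The sketch below is a loose matcher, and `log_meets_milestone` is a hypothetical name. It accepts any client id rather than only 65536 and treats any occurrence of `Page fault` as a failure; both choices are assumptions about how the log will be used.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when the log has a compositor "client ... connected"
 * message followed later by a "dispatch" message and contains no
 * "Page fault" text; returns 0 otherwise.
 * Loose match: the client id is not parsed, and " connected" is
 * not required to be on the same line as "client". */
static int log_meets_milestone(const char *log)
{
    const char *connected;

    assert(log != NULL);

    if (strstr(log, "Page fault") != NULL)
        return 0;

    connected = strstr(log, "redbear-compositor: client ");
    if (connected == NULL)
        return 0;

    connected = strstr(connected, " connected");
    if (connected == NULL)
        return 0;

    return strstr(connected, "redbear-compositor: dispatch") != NULL;
}
```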
### Build Dependency Chain (Critical, discovered 2026-05-06)
When modifying libwayland source or `local/patches/libwayland/redox.patch`, the
full chain must be rebuilt:
```
libwayland recipe → repo/x86_64-unknown-redox/libwayland.pkgar
qtbase recipe → copies pkgar into sysroot → links Qt6 Wayland QPA
kf6-kded6 recipe → links against qtbase/sysroot/libwayland-client.a
```
**Failure mode**: Modifying only libwayland does NOT update qtbase's static copy
of `libwayland-client.a`. The old binary persists in qtbase's sysroot. All
three recipes must be force-rebuilt (delete `target/` directories) AND the
repo pkgar cache must be regenerated.
**Recovery**: If `repo/x86_64-unknown-redox/libwayland.pkgar` is deleted, a
backup exists at `packages/x86_64-unknown-redox/libwayland.pkgar`.