Add remaining planning docs
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@@ -0,0 +1,560 @@
# Red Bear OS IRQ and Low-Level Controllers Enhancement Plan

## Purpose

This document assesses the current IRQ and low-level controller implementation in Red Bear OS for
completeness and quality, then defines the next enhancement plan in execution order.

It is grounded in the current repository state, especially:

- `local/recipes/drivers/redox-driver-sys/`
- `local/recipes/drivers/linux-kpi/`
- `local/recipes/gpu/redox-drm/`
- `local/recipes/system/iommu/`
- `recipes/core/kernel/source/src/acpi/`
- `recipes/core/base/source/drivers/acpid/`
- `local/docs/IOMMU-SPEC-REFERENCE.md`
- `local/docs/ACPI-FIXES.md`
- `docs/04-LINUX-DRIVER-COMPAT.md`

The goal is not to restate that these pieces compile, but to separate:

- what exists architecturally,
- what is only build-validated,
- what is runtime-validated,
- and what still needs focused enhancement work.

## Evidence Model

This plan uses four different evidence buckets and does **not** treat them as equivalent:

- **Checked-in source**: what is visible directly in the current source tree.
- **Local patch state**: behavior carried by `local/patches/*` that may not be visible in the
  unpacked upstream source snapshot until patches are applied.
- **Build-validated**: code or recipes compile successfully.
- **Runtime-validated**: behavior has been exercised in a real boot/runtime path.

Where a statement depends on local patches instead of the visible source snapshot, that is called
out explicitly below.

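
The four evidence buckets can be kept apart mechanically rather than by convention. The following is a minimal sketch, not code from the repository: `Evidence` and `AreaEvidence` are hypothetical names introduced here only to illustrate that a runtime claim should require recorded runtime evidence.

```rust
use std::collections::BTreeSet;

/// The four evidence buckets named in the Evidence Model section.
/// `Ord` is derived only so the values can live in a `BTreeSet`;
/// it is not meant to rank the buckets.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Evidence {
    CheckedInSource,
    LocalPatchState,
    BuildValidated,
    RuntimeValidated,
}

/// Evidence recorded for one controller area (hypothetical record type).
pub struct AreaEvidence {
    pub area: &'static str,
    pub evidence: BTreeSet<Evidence>,
}

impl AreaEvidence {
    pub fn new(area: &'static str, evidence: &[Evidence]) -> Self {
        Self {
            area,
            evidence: evidence.iter().copied().collect(),
        }
    }

    /// A runtime claim is only allowed when runtime evidence is recorded.
    pub fn is_runtime_validated(&self) -> bool {
        self.evidence.contains(&Evidence::RuntimeValidated)
    }

    /// True when the recorded behavior depends on `local/patches/*`.
    pub fn depends_on_local_patches(&self) -> bool {
        self.evidence.contains(&Evidence::LocalPatchState)
    }
}
```

A record built only from `CheckedInSource` and `BuildValidated` answers `false` to `is_runtime_validated()`, which is the distinction this plan asks every related doc to preserve.
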
## Controller Inventory and Ownership

| Area | Primary owner | Main entry points | Current evidence class |
|---|---|---|---|
| LAPIC / xAPIC / x2APIC | kernel | `recipes/core/kernel/source/src/acpi/madt/`, `arch/x86_shared/device/local_apic.rs` | source + local patch + boot/runtime evidence |
| IOAPIC / IRQ overrides | kernel | `recipes/core/kernel/source/src/arch/x86_shared/device/ioapic.rs`, MADT ISO parsing | source |
| Legacy PIC | kernel | `arch/x86_shared/device/pic.rs` | source |
| ACPI power/reset methods | userspace `acpid` | `recipes/core/base/source/drivers/acpid/src/acpi.rs` plus local base patch | source + local patch + runtime evidence |
| HPET / timer tables | kernel | `recipes/core/kernel/source/src/acpi/hpet.rs` | source |
| PIT fallback timer | kernel | `recipes/core/kernel/source/src/arch/x86_shared/device/mod.rs`, `pit.rs` | source |
| PCI interrupt plumbing | userspace `pcid` / driver layer | `recipes/core/base/source/drivers/pci/`, `scheme:irq`, `scheme:pci` | source + runtime evidence |
| Driver IRQ abstraction | `redox-driver-sys` | `local/recipes/drivers/redox-driver-sys/source/src/irq.rs` | source |
| Linux IRQ compatibility | `linux-kpi` | `local/recipes/drivers/linux-kpi/source/` headers | source |
| GPU MSI/MSI-X usage | `redox-drm` | `local/recipes/gpu/redox-drm/source/` | source + build evidence |
| IOMMU / interrupt remapping | `iommu` daemon | `local/recipes/system/iommu/source/src/main.rs`, `local/docs/IOMMU-SPEC-REFERENCE.md` | source + build evidence |
| Kernel serio / PS2 path | kernel `serio` + userspace `ps2d` | `recipes/core/kernel/source/src/scheme/serio.rs`, `recipes/core/base/source/drivers/input/ps2d/src/main.rs` | source |
| Input controller path | `inputd` / `evdevd` / `udev-shim` | base driver + local system recipes | source + runtime evidence |
| USB xHCI host controller | userspace `xhcid` | `recipes/core/base/source/drivers/usb/xhcid/src/main.rs` | source + build evidence |
| Port I/O / legacy controller access | kernel + `redox-driver-sys` | `iopl`, `io.rs`, legacy driver code | source |
| Legacy IRQ dispatch / ownership map | kernel | `recipes/core/kernel/source/src/arch/x86_shared/interrupt/irq.rs` | source |

## Current State Summary

### What is already in place

Red Bear OS already has a meaningful low-level controller and interrupt foundation:

- ACPI boot, FADT power control, visible MADT parsing for LAPIC/IOAPIC/interrupt overrides, and
  HPET initialization are in place in the checked-in source.
- Additional MADT x2APIC / NMI / power-method handling exists in the local patch set and in prior
  runtime validation notes, but that behavior should not be conflated with the unpatched source
  snapshot.
- `redox-driver-sys` provides userspace driver primitives for MMIO, DMA, PCI access, IRQ handles,
  MSI-X table mapping, and IRQ affinity control.
- `linux-kpi` exposes Linux-style IRQ, PCI, memory, and synchronization APIs on top of
  `redox-driver-sys`.
- `redox-drm` already contains a shared interrupt abstraction with MSI-X-first and legacy-IRQ
  fallback paths for GPU drivers.
- The AMD-Vi / Intel VT-d reference material and the in-tree `iommu` daemon establish a serious
  implementation direction for IOMMU and interrupt-remapping work.

### What is still weak

The dominant weakness is not missing abstractions. It is missing runtime proof and uneven
controller-specific validation.

- MSI-X support exists architecturally but is still weak on hardware validation.
- IOMMU support is specification-rich and code-rich, but still unvalidated on real hardware.
- IRQ routing quality-of-service remains primitive: raw wait handles exist, but balancing,
  coalescing, and validation of affinity behavior remain thin.
- Input stacks (`inputd`, `evdevd`, `udev-shim`) now exist as a runtime substrate, but the exact
  end-to-end interrupt-to-consumer path still needs sustained validation discipline.
- Low-level controller quality is uneven: ACPI/APIC are much further along than IOMMU, MSI-X, and
  controller-specific runtime characterization.

## Architectural Assessment

### 1. IRQ delivery architecture

The project’s IRQ delivery model is fundamentally sound.

- Kernel/platform side routes interrupts through APIC/x2APIC infrastructure.
- Userspace consumes interrupts through `scheme:irq` handles.
- MSI-X vector allocation is already modeled per CPU via the IRQ scheme.

This is the right design for Red Bear OS. The main enhancement need is validation and quality, not
an architectural rewrite.

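
To make the "userspace consumes interrupts through `scheme:irq` handles" step concrete, here is a minimal sketch of a wait-and-acknowledge loop body. It is written against any `Read + Write` handle so it can be exercised off-target. The framing (one native-endian 8-byte counter per read, acknowledged by writing the same bytes back) is an assumption of this sketch, and `wait_and_ack` is a hypothetical helper, not a `redox-driver-sys` API.

```rust
use std::io::{self, Read, Write};

/// Block until the IRQ handle reports an interrupt, then acknowledge it.
///
/// Assumption: the handle yields one native-endian 8-byte counter per read
/// and is acknowledged by writing the same bytes back to the handle.
pub fn wait_and_ack<H: Read + Write>(handle: &mut H) -> io::Result<u64> {
    let mut counter = [0u8; 8];
    // Blocks until the scheme has an interrupt to report.
    handle.read_exact(&mut counter)?;
    // Writing the counter back acknowledges the interrupt so it can fire again.
    handle.write_all(&counter)?;
    Ok(u64::from_ne_bytes(counter))
}
```

On a real system the handle would be a file opened on the IRQ scheme; the exact path layout is not specified in this plan, so it is deliberately left out of the sketch.
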
### 2. PCI and MSI/MSI-X

The PCI and MSI-X model is one of the strongest parts of the current stack.

- Config-space access exists.
- Capability parsing exists.
- MSI-X table mapping exists.
- GPU drivers already use the abstraction.

The gap is that the repository still talks too often in “compiles” language instead of “validated on
hardware with real interrupts firing” language.

Current runtime-proof entrypoint now present in-tree:

- `local/scripts/test-msix-qemu.sh`: QEMU/UEFI boot path that verifies live `virtio-net`
  initialization reporting `virtio: using MSI-X`

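
The log-based proof above reduces to searching captured boot output for one marker string. The sketch below shows that check in isolation; the marker text is taken from this plan, while `find_msix_evidence` is a hypothetical helper and does not describe how `test-msix-qemu.sh` is actually implemented.

```rust
/// Marker that this plan says `virtio-net` reports when MSI-X is in use.
pub const MSIX_MARKER: &str = "virtio: using MSI-X";

/// Return the first captured log line that contains the MSI-X marker, if any.
/// Returning the line (not just a bool) lets a validation note quote the evidence.
pub fn find_msix_evidence(log: &str) -> Option<&str> {
    log.lines().find(|line| line.contains(MSIX_MARKER))
}
```

A passing check only shows that the driver selected MSI-X during initialization; it says nothing about sustained interrupt delivery, which is why real-hardware runs remain a separate deliverable.
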
### 3. IOMMU and interrupt remapping

IOMMU is the most important low-level controller area that is still incomplete in practice.

- The implementation direction is correct.
- The data structures and register model are already documented deeply.
- The hardware-validation story is still effectively open, and daemon discovery is only partially
  integrated: the daemon now searches common IVRS table locations automatically, but full
  platform-native discovery is not yet in place.
- The current QEMU path now reaches AMD-Vi unit detection and `scheme:iommu` registration without
  crashing at daemon startup, but unit initialization is still deferred and real hardware validation
  remains open.
- The current guest-driven first-use proof now reaches AMD-Vi MMIO reads in QEMU (`control=0x0`,
  `status=0x0`), but still dies during the completion path with a CPU-side page fault while touching
  the completion-store region. That narrows the remaining blocker to DMA mapping/page-coverage
  behavior rather than to missing discovery, missing scheme wiring, or unreadable MMIO registers.

This makes IOMMU the highest-value long-term controller enhancement area after basic MSI-X runtime
validation.

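
One way to test the "page-coverage" hypothesis for the completion-store fault is to check whether the bytes the CPU touches actually lie inside the region that was mapped. The sketch below is illustrative only: the 4 KiB page size, the helper names, and the idea of an 8-byte store value are assumptions of this example, not facts taken from the `iommu` daemon.

```rust
/// Page size assumed by this sketch (4 KiB).
pub const PAGE_SIZE: u64 = 4096;

/// Number of pages touched by the byte range `[addr, addr + len)`.
pub fn pages_spanned(addr: u64, len: u64) -> u64 {
    if len == 0 {
        return 0;
    }
    let first = addr / PAGE_SIZE;
    // Saturate instead of overflowing on ranges that reach the top of memory.
    let last = addr.saturating_add(len - 1) / PAGE_SIZE;
    last - first + 1
}

/// True when `[addr, addr + len)` lies entirely inside the mapped region
/// `[map_base, map_base + map_len)`. Ranges that overflow are rejected.
pub fn is_covered(addr: u64, len: u64, map_base: u64, map_len: u64) -> bool {
    match (addr.checked_add(len), map_base.checked_add(map_len)) {
        (Some(end), Some(map_end)) => addr >= map_base && end <= map_end,
        _ => false,
    }
}
```

For example, an 8-byte store placed at offset `0xffc` of a single mapped page spans two pages and is not covered, which is one shape of bug consistent with a CPU-side fault after MMIO reads succeed. This is a diagnostic aid, not a diagnosis.
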
### 4. Input/controller path

The input/controller path is no longer missing. It is now a quality and observability problem.

- `inputd` exists.
- `evdevd` exists.
- `udev-shim` exists.
- Phase 3 validation helpers exist.

The enhancement task is to keep turning these from “service present” into “interrupt path proven,”
especially under real runtime scenarios.

## Completeness Assessment by Area

### ACPI / APIC / x2APIC

**State**: materially complete for current platform bring-up goals.

**Important source note**: the checked-in MADT parser in
`recipes/core/kernel/source/src/acpi/madt/mod.rs` visibly handles `LocalApic`, `IoApic`,
`IntSrcOverride`, `Gicc`, and `Gicd`. Additional x2APIC/NMI support referenced elsewhere in the
repo is currently evidenced through the local patch set and prior validation notes rather than the
plain source snapshot alone.

Strengths:

- MADT entries for xAPIC are handled in the checked-in source; x2APIC/NMI entries are handled
  through the local patch set, per the source note above.
- ACPI reboot/shutdown/power methods exist.
- x2APIC and SMP platform bring-up have already crossed the foundational threshold.

Open enhancement items:

- Better controller/runtime characterization on diverse hardware.
- Clearer documentation for what is kernel-complete versus only tested on limited platforms.

### IOAPIC / interrupt source override routing

**State**: present in ACPI parsing, but less explicitly validated than LAPIC/x2APIC paths.

Concrete checked-in owner:

- `recipes/core/kernel/source/src/arch/x86_shared/device/ioapic.rs`
- `recipes/core/kernel/source/src/acpi/madt/mod.rs`

Open enhancement items:

- explicit validation of interrupt source overrides on more real machines
- repo-visible test notes for IOAPIC routing behavior

### HPET / timer controller surface

**State**: present, but still thinly characterized.

Concrete checked-in owner:

- `recipes/core/kernel/source/src/acpi/hpet.rs`

Open enhancement items:

- runtime verification beyond “initialized from ACPI”
- clearer single-HPET limitation documentation

### PIT fallback timer path

**State**: explicit checked-in fallback controller path.

Concrete checked-in owner:

- `recipes/core/kernel/source/src/arch/x86_shared/device/mod.rs`
- `recipes/core/kernel/source/src/arch/x86_shared/device/pit.rs`

Current behavior:

- the kernel prefers HPET when available
- if HPET initialization fails or is unavailable, it falls back to PIT
- PIT interrupt ticks currently drive timeout and scheduler timing paths

Open enhancement items:

- document runtime characterization of PIT-only boots
- clarify timer-source selection evidence in validation notes

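
The HPET-first, PIT-fallback rule listed under "Current behavior" can be written down as a small decision function, which also gives validation notes a precise vocabulary for "which timer source was selected." This is a sketch of the rule as described in this plan, with hypothetical names; it is not the code in `device/mod.rs`.

```rust
/// Timer sources discussed in this plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimerSource {
    Hpet,
    Pit,
}

/// Decide which timer drives timeouts and scheduling.
///
/// `hpet` is `None` when no HPET is available, `Some(Err(_))` when HPET
/// initialization failed, and `Some(Ok(()))` when it succeeded.
pub fn select_timer<E>(hpet: Option<Result<(), E>>) -> TimerSource {
    match hpet {
        Some(Ok(())) => TimerSource::Hpet,
        // Both "unavailable" and "failed to initialize" fall back to the PIT.
        Some(Err(_)) | None => TimerSource::Pit,
    }
}
```

Logging the returned value once at boot would provide the timer-source selection evidence that the open enhancement items ask for.
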
### PCI interrupt plumbing / MSI / MSI-X

**State**: architecturally strong, validation-incomplete.

Open enhancement items:

- real hardware MSI-X proof for AMD and Intel GPU paths
- controller-level observability for vector allocation and affinity behavior
- testable records of fallback behavior between MSI-X and legacy IRQs

Current runtime-validation surface now present in-tree:

- `local/scripts/test-msix-qemu.sh`: boots a Red Bear image and confirms a live MSI-X path via
  `virtio-net` log evidence in QEMU

### IOMMU / interrupt remapping

**State**: the biggest completeness gap.

Concrete checked-in owner:

- `local/recipes/system/iommu/source/src/main.rs`
- `local/docs/IOMMU-SPEC-REFERENCE.md`

Open enhancement items:

- real AMD-Vi initialization validation
- event log and fault-path validation
- interrupt remapping validation under device load
- explicit distinction between “daemon builds” and “controller works”
- replacement of path-based IVRS discovery (common table paths plus the `IOMMU_IVRS_PATH`
  override) with real system discovery/integration
- diagnosis/fix for the remaining QEMU first-use blocker where completion-store CPU access faults
  even after MMIO reads and multiple completion-store placement strategies succeed structurally

Current implementation improvement:

- the daemon no longer depends only on `IOMMU_IVRS_PATH`; it now searches common IVRS table paths
  automatically before falling back to the environment variable override
- daemon startup now defers AMD-Vi unit initialization until first scheme use, which keeps the
  QEMU validation path alive long enough to prove detection plus `scheme:iommu` registration
- a guest-driven self-test path now exists (`/usr/bin/iommu --self-test-init` via
  `redbear-phase-iommu-check` / `test-iommu-qemu.sh`) and proves that the remaining failure is in
  runtime completion/DMA-page handling, not in daemon startup or bare MMIO readability

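
The discovery order described above (common IVRS table paths first, then the environment override) can be sketched as follows. The candidate paths are passed in by the caller because this plan does not list them, and `discover_ivrs` plus the injected `exists` predicate are hypothetical names used only for illustration.

```rust
use std::path::{Path, PathBuf};

/// Return the first IVRS table location that exists.
///
/// Order follows the plan's description: every common candidate path is tried
/// first, and the environment override is consulted only as a fallback.
/// `exists` is injected so the search order can be tested without a filesystem.
pub fn discover_ivrs<F>(
    candidates: &[&str],
    env_override: Option<&str>,
    exists: F,
) -> Option<PathBuf>
where
    F: Fn(&Path) -> bool,
{
    candidates
        .iter()
        .copied()
        .chain(env_override)
        .map(PathBuf::from)
        .find(|path| exists(path.as_path()))
}
```

In a daemon, the override argument could come from `std::env::var("IOMMU_IVRS_PATH").ok()` and the predicate from `Path::exists`. Replacing this path search with platform-native discovery remains the open item listed above.
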
### Legacy IRQ ownership and dispatch map

**State**: explicit checked-in kernel ownership exists, but it is under-documented in higher-level
controller discussions.

Concrete checked-in owner:

- `recipes/core/kernel/source/src/arch/x86_shared/interrupt/irq.rs`

Current covered paths include:

- PIT timer interrupt handling
- keyboard and mouse interrupt delivery
- serial COM1/COM2 delivery
- PIC/APIC mask, acknowledge, and EOI behavior
- spurious IRQ accounting for IRQ7 and IRQ15

Open enhancement items:

- document legacy IRQ ownership and routing expectations explicitly in validation notes
- record PIC-vs-APIC runtime behavior on more hardware classes

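
For readers documenting the IRQ7/IRQ15 accounting, the conventional 8259A rule is: an IRQ7 or IRQ15 is spurious when bit 7 of the owning PIC's in-service register (ISR) is clear. A spurious IRQ7 gets no EOI, while a spurious IRQ15 still needs an EOI to the master because the master saw a real cascade interrupt. The sketch below encodes that textbook rule with hypothetical names; it is not a transcription of the kernel's `irq.rs`, and it applies to PIC mode only.

```rust
/// Which PICs should receive an end-of-interrupt for a legacy IRQ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Eoi {
    None,
    Master,
    MasterAndSlave,
}

/// Classify a legacy IRQ (0..=15) as `(spurious, eoi_target)` from the
/// in-service registers of the master and slave 8259A.
/// Values above 15 are not legacy lines and are reported as `(false, Eoi::None)`.
pub fn classify_irq(irq: u8, master_isr: u8, slave_isr: u8) -> (bool, Eoi) {
    const IN_SERVICE_BIT7: u8 = 1 << 7;
    match irq {
        // Spurious on the master: nothing is in service, so send no EOI.
        7 if (master_isr & IN_SERVICE_BIT7) == 0 => (true, Eoi::None),
        // Spurious on the slave: the master still saw the cascade line fire.
        15 if (slave_isr & IN_SERVICE_BIT7) == 0 => (true, Eoi::Master),
        0..=7 => (false, Eoi::Master),
        8..=15 => (false, Eoi::MasterAndSlave),
        _ => (false, Eoi::None),
    }
}
```

Counting the `true` results per line is one simple way to produce the spurious-IRQ numbers that PIC-vs-APIC validation notes could record.
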
### Kernel `serio` / PS2 controller path

**State**: present and important, but easy to miss if input work is described only in terms of the
later `evdevd`/`udev-shim` stack.

Concrete checked-in owner:

- `recipes/core/kernel/source/src/scheme/serio.rs`
- `recipes/core/base/source/drivers/input/ps2d/src/main.rs`

Current behavior:

- the kernel owns the serio byte queues to avoid PS/2 controller races
- `ps2d` consumes `/scheme/serio/0` and `/scheme/serio/1`
- that path then feeds the broader input producer chain

Open enhancement items:

- keep validation language explicit about the PS/2 path versus the later generic input stack
- add platform notes for systems that still rely on PS/2 keyboard/mouse delivery

### USB xHCI controller interrupt path

**State**: present, but not yet demonstrably interrupt-complete in the checked-in source.

Concrete checked-in owner:

- `recipes/core/base/source/drivers/usb/xhcid/src/main.rs`

Current behavior:

- xHCI has MSI/MSI-X and legacy INTx detection logic in source
- the hardwired polling override in `xhcid` has been removed, and the driver now uses the existing
  MSI-X / MSI / INTx selection logic again
- `local/scripts/test-xhci-irq-qemu.sh --check` now provides a repo-visible runtime proof path by
  booting a Red Bear image in QEMU and checking the xHCI interrupt-mode log output
- `redox-driver-sys` now logs allocated MSI-X vectors so interrupt selection is more observable in
  runtime logs

Open enhancement items:

- validate the restored interrupt path beyond early boot/logging, especially event-ring behavior
- validate the checked-in event-ring growth path under sustained runtime/device activity

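
The MSI-X / MSI / INTx preference order mentioned above can be illustrated with a small selection function. This is a sketch under stated assumptions: `IntCaps`, `IntMode`, and `select_int_mode` are hypothetical names, the final polling fallback is an assumption of this example, and none of it is the actual `get_int_method` code in `xhcid`.

```rust
/// Interrupt delivery modes, most preferred first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IntMode {
    MsiX,
    Msi,
    Intx,
    Polling,
}

/// What a PCI function advertises (hypothetical summary type).
#[derive(Debug, Clone, Copy, Default)]
pub struct IntCaps {
    pub msix: bool,
    pub msi: bool,
    /// Legacy interrupt line, if one is routed.
    pub intx_line: Option<u8>,
}

/// Pick the most capable mode the device offers: MSI-X, then MSI, then INTx.
/// Polling is only chosen when no interrupt mechanism is available at all.
pub fn select_int_mode(caps: IntCaps) -> IntMode {
    if caps.msix {
        IntMode::MsiX
    } else if caps.msi {
        IntMode::Msi
    } else if caps.intx_line.is_some() {
        IntMode::Intx
    } else {
        IntMode::Polling
    }
}
```

Logging the selected mode next to the advertised capabilities makes a silent regression to polling visible in the same boot logs that the QEMU check script already inspects.
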
### Port I/O / legacy controller support

**State**: exists, but under-characterized.

Concrete current consumers/owners include:

- legacy PIC handling in `recipes/core/kernel/source/src/arch/x86_shared/device/pic.rs`
- port-I/O wrappers in `local/recipes/drivers/redox-driver-sys/source/src/io.rs`
- ACPI reset fallback via keyboard-controller port writes in the base/acpid patch path documented in
  `local/docs/ACPI-FIXES.md`

Open enhancement items:

- determine which real devices still need the port-I/O path
- validate that the current wrappers are sufficient for those devices

## Quality Assessment

### Strong points

- The layering is correct: kernel/platform routing below, userspace schemes and driver wrappers
  above.
- The repository already has serious implementation artifacts, not just speculative plans.
- The low-level controller work is documented more deeply than many higher-level desktop areas.
- ACPI and early-platform work is significantly more mature than the rest of the low-level stack.

### Weak points

- Validation language is still inconsistent across docs. “builds” and “validated” are too often
  treated as adjacent states when they are not.
- IOMMU progress is easy to overread because the spec reference is detailed, but the runtime proof
  and discovery story are not there yet.
- Some controller areas are rich in abstractions but poor in operator-facing validation procedures.
- Hardware-controller quality is still under-documented in terms of negative results and known
  failure modes.
- Earlier summaries in the repo can blur checked-in source, local patches, and validated runtime
  behavior; this document should be used to keep those categories separate.
- Broad category labels can hide concrete controller owners unless PIT, `serio`/PS2, legacy IRQ
  dispatch, and xHCI are named explicitly.

## Enhancement Priorities

### Priority 1: MSI-X runtime validation on real devices

Goal: move MSI-X from “implemented abstraction” to “repeatedly proven behavior.”

Deliverables:

- explicit AMD GPU MSI-X validation notes
- explicit Intel GPU MSI-X validation notes
- verified fallback behavior to legacy IRQs when MSI-X is unavailable
- logged CPU/vector affinity behavior in real runs

Why first:

This is the lowest-level controller feature that already exists in the main runtime driver path and
blocks confidence in GPU/display work above it.

### Priority 2: IOMMU hardware bring-up and fault-path validation

Goal: move IOMMU from spec-driven implementation to actual controller bring-up.

Deliverables:

- validated AMD-Vi daemon initialization on real hardware
- device table / command buffer / event log validation
- explicit interrupt-remapping validation notes
- negative-result documentation if hardware still fails

Why second:

It is the largest remaining low-level completeness gap, and it affects the safety and correctness of
userspace driver DMA.

### Priority 3: IRQ quality-of-service and observability

Goal: make IRQ behavior easier to reason about in production.

Deliverables:

- better logging/telemetry around allocated IRQs and vectors
- explicit affinity-validation procedures
- measured notes on whether current userspace IRQ wait behavior is good enough for display/input
  latency needs

Why third:

This improves reliability without changing the underlying architecture.

### Priority 4: input/controller runtime proof

Goal: continue turning the existing input substrate into a well-proven low-level controller path.

Deliverables:

- sustained validation of `inputd` → `evdevd` → consumer path
- documentation of real interrupt-backed input evidence, not only service existence
- explicit known limitations for consumer nodes and path expectations

Why fourth:

The architecture is there. What remains is proof quality.

### Priority 5: timer/controller characterization

Goal: reduce uncertainty around HPET/APIC-timer behavior and controller assumptions.

Deliverables:

- a compact validation note for HPET behavior on real hardware
- notes on timer-controller assumptions and known limits

Why fifth:

Important, but less immediately blocking than MSI-X and IOMMU.

### Priority 6: xHCI interrupt restoration

This is Priority 6 **within the low-level controller plan itself**, not within the repository-wide
subsystem order. At the repo-wide level, low-level controller quality remains ahead of
USB/Wi-Fi/Bluetooth because these later subsystems depend on the controller/runtime proof work
documented here.

Goal: move USB host-controller operation from polling back to real interrupt-driven behavior.

Deliverables:

- restore the actual `get_int_method` path in `xhcid` (the completeness assessment above records
  the hardwired polling override as already removed; confirm that state in runtime logs)
- validate MSI/MSI-X or INTx behavior for xHCI on real hardware and/or QEMU
- update docs so USB controller quality is not overstated while polling remains active

Why sixth:

This is a real completeness gap in an important low-level controller, but it is narrower in scope
than the cross-cutting MSI-X and IOMMU priorities above.

## Execution Plan

### Step A: Establish validation vocabulary in all related docs

For every low-level controller area, use the same four states consistently:

- builds
- boots
- validated
- experimental

Do not mark controller infrastructure “complete” unless the claimed runtime behavior is actually
proven.

### Step B: Add dedicated validation notes for MSI-X and IOMMU

The project already has enough code to justify dedicated runtime-validation docs for:

- GPU MSI-X behavior
- IOMMU bring-up and fault handling

There is now also an in-tree generic MSI-X runtime proof helper:

- `local/scripts/test-msix-qemu.sh`

These should record both successful and failed hardware runs.

### Step C: Expand runtime-proof tooling where signal is weak

The project already has a good pattern for this in the Phase 3/4/5 validation helpers.

Use the same pattern for low-level controllers:

- one host-side launcher/check path
- one guest-side runtime check path
- one doc entry that records what “passing” actually means

### Step D: Keep the controller plan separate from higher-level desktop work

Do not let IRQ/IOMMU/controller planning get absorbed into generic Wayland/KDE roadmaps.

Controller quality must remain measurable at its own layer.

## Recommended New Documentation Work

The current project docs should eventually include dedicated runtime-validation companion documents
for:

- MSI-X validation
- IOMMU bring-up and fault validation
- timer/controller characterization
- input/controller runtime evidence

This document is the umbrella enhancement plan; those would be the execution/validation companions.

## Current Validation Entry Points

The following in-tree validation paths now exist and should be treated as the current controller
runtime-evidence surface:

- `local/scripts/test-xhci-irq-qemu.sh --check`: xHCI interrupt-mode proof from QEMU boot logs
- `local/scripts/test-msix-qemu.sh`: live MSI-X proof via `virtio-net`
- `local/scripts/test-iommu-qemu.sh --check`: AMD IOMMU device visibility plus guest boot reachability
- `local/scripts/test-usb-storage-qemu.sh`: USB mass-storage autospawn probe (currently still an
  active blocker path)

## Bottom Line

Red Bear OS does **not** need a new IRQ/controller architecture.

It already has the correct architectural direction:

- scheme-based userspace IRQ delivery
- safe Rust driver wrappers
- PCI/MSI-X support
- IOMMU direction
- ACPI/APIC groundwork

What it needs now is disciplined completion work in this order:

1. MSI-X runtime proof
2. IOMMU hardware validation
3. IRQ observability and affinity proof
4. input/controller runtime evidence
5. timer/controller characterization
6. xHCI interrupt-path validation

The main quality risk is no longer missing design. It is over-claiming readiness before low-level
controller runtime evidence exists.

@@ -0,0 +1,264 @@
# Red Bear OS Phase 0–3 Reassessment

## Purpose

This document reconciles the current public execution plan in `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md`
with the older hardware-oriented roadmap in `local/docs/AMD-FIRST-INTEGRATION.md`.

The goal is to make Phase 0 through Phase 3 readable in terms of **what is built**, **what is
boot/runtime wired**, and **what is actually validated**.

## Validation States

- **builds**: code or profile compiles successfully
- **boots**: image or service path reaches a usable boot/runtime state
- **validated**: behavior has been exercised with real evidence for the claimed scope
- **experimental**: available for bring-up but not support-promised

This repo should not treat “compiles” as equivalent to “validated”.

## Why this reassessment exists

Two active documents describe the early Red Bear roadmap differently:

- `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md` is the canonical public execution plan.
- `local/docs/AMD-FIRST-INTEGRATION.md` is the older AMD-first technical roadmap.

They are both useful, but they number phases differently:

- `docs/07` uses a product-enablement framing (`Phase 1` repository/profile structure, `Phase 2`
  minimal-system baseline, `Phase 3` driver/runtime substrate).
- `AMD-FIRST` uses a hardware-enablement framing (`P0` ACPI boot, `P1` driver infrastructure,
  `P2` AMD display, `P3` input + POSIX).

This document is the bridge for Phase 0–3 discussions.

## Phase 0: Bare-metal boot and ACPI baseline

### Source of truth

- `local/docs/AMD-FIRST-INTEGRATION.md`
- Root `AGENTS.md` status summary

### Scope

- AMD bare-metal bootability
- ACPI checksums and table handling
- shutdown/reboot/power-method support
- SMP/x2APIC-era platform readiness

### Current status

- **builds**: yes
- **boots**: yes
- **validated**: yes, at the platform/boot level described in the AMD-first notes

### Notes

Phase 0 is not part of the public `docs/07` numbering, but it remains a real prerequisite in the
AMD-first implementation history and should stay visible when discussing early Red Bear progress.

## Phase 1: Repository discipline and profile reproducibility

### Source of truth

- `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md`
- `local/docs/repo-governance.md`
- `local/docs/PROFILE-MATRIX.md`

### Scope

- tracked profile definitions
- shared config fragments instead of duplicated wiring
- helper scripts aligned with tracked profiles
- support-language and validation-language rules

### Current status

- **builds**: yes
- **boots**: indirectly supported by later profile builds
- **validated**: partially, in the sense that `redbear-minimal` and `redbear-desktop` were used as
  reproducibility targets during the Phase 1 cleanup

### Implemented evidence

- `config/redbear-*.toml` shared fragment refactor
- `local/docs/repo-governance.md`
- `local/docs/PROFILE-MATRIX.md`
- `local/scripts/build-redbear.sh` profile coverage updates

### Remaining caution

Phase 1 is structurally in good shape, but support labels still need to be used consistently in
phase-level docs.

## Phase 2: Minimal-system baseline

### Source of truth

- `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md`
- `local/docs/NETWORKING-RTL8125-NETCTL.md`
- `local/docs/REDBEAR-INFO-RUNTIME-REPORT.md`

### Scope

- bootable minimal profile
- package-management baseline
- VM networking baseline

### Current status

- **builds**: yes
- **boots**: helper and validation surfaces now exist for the VM path
- **validated**: partially; the repo now has explicit validation helpers, but this still needs
  continued real runtime use to graduate from baseline bring-up to stronger support claims

### Implemented evidence

- `redbear-minimal` enables `wired-dhcp` by default
- `redbear-info` reports VirtIO VM networking visibility
- `local/scripts/validate-vm-network-baseline.sh`
- `local/scripts/test-vm-network-qemu.sh`
- `local/scripts/test-vm-network-runtime.sh`

### Remaining caution

Phase 2 should continue to be described as a **baseline**. It now has build-time, launch-time, and
runtime check paths, but that is still not the same as broad hardware validation.

## Phase 3: Driver and runtime substrate

### Source of truth

- `docs/07-RED-BEAR-OS-IMPLEMENTATION-PLAN.md`
- `local/docs/AMD-FIRST-INTEGRATION.md`

### Correct framing

The public plan's wording is the correct top-level framing:

> **Driver and runtime substrate**

The AMD-first wording remains useful as a lower-level technical breakdown:

> **Input + POSIX**

These are not competing scopes. The second explains the concrete components that fulfill the first.

### Scope

- shared driver substrate already built in-tree
- firmware loading available as runtime infrastructure
- input/runtime prerequisites such as `evdevd` and `udev-shim`
- relibc POSIX surfaces required by downstream consumers

### Current status

- **builds**: yes for the major in-tree Phase 3 components
- **boots**: partially wired via profile/service configuration
- **validated**: not yet at the level needed to call the substrate runtime-proven end to end

### Built evidence already in tree

- `local/recipes/drivers/redox-driver-sys/`
- `local/recipes/drivers/linux-kpi/`
- `local/recipes/system/firmware-loader/`
- `local/recipes/system/evdevd/`
- `local/recipes/system/udev-shim/`
- `local/patches/relibc/P3-*.patch`

### Real remaining work

The main remaining Phase 3 task is not “invent the substrate”; it already exists in-tree. The
real gap is **runtime and downstream-consumer validation**:

- prove the relibc POSIX surfaces against actual consumers
- prove the input path from Redox input sources through `evdevd` and `udev-shim`
- keep Phase 3 distinct from later graphics/Wayland/KDE work

### Current runtime-validation helpers

- `./local/scripts/test-phase3-runtime-substrate.sh`: in-guest runtime check for
  `firmware-loader`, `udev-shim`, `evdevd`, and their scheme surfaces
- `redbear-info --verbose`: passive runtime evidence for installed/active integrations

### Runtime evidence gathered during reassessment

- `redbear-desktop` was booted successfully in QEMU with x86_64 UEFI firmware and reached a real
  login prompt over the serial console.
- `pcid-spawner` successfully spawned `virtio-netd` during the guest boot sequence.
- `firmware-loader` registered `scheme:firmware` without crashing, even with an empty
  `/usr/firmware/` directory.
- `evdevd` registered `scheme:evdev` and `udev-shim` registered `scheme:udev` during the same
|
||||
guest boot.
|
||||
- `redbear-info --json` inside the guest reported `virtio_net_present: true`, a configured
|
||||
`eth0` address, and live firmware/udev integration evidence.
|
||||
|
||||
## Recommended interpretation going forward
|
||||
|
||||
When discussing the roadmap publicly:
|
||||
|
||||
- use `docs/07` phase numbering as canonical
|
||||
- treat `AMD-FIRST` phase numbering as historical hardware-roadmap context
|
||||
- always attach validation language (`builds`, `boots`, `validated`, `experimental`) to claims
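
The last rule can be checked mechanically. The sketch below is illustrative only: the function name `missing_validation_label` and the sample claims are assumptions, and only the four labels come from this document.

```python
import re

# The four validation labels named in this document.
VALIDATION_LABELS = ("builds", "boots", "validated", "experimental")

# Match a label as a whole word, ignoring case and markdown emphasis markers.
_LABEL_RE = re.compile(r"\b(" + "|".join(VALIDATION_LABELS) + r")\b", re.IGNORECASE)


def missing_validation_label(claims):
    """Return the claims that carry none of the validation labels."""
    return [claim for claim in claims if not _LABEL_RE.search(claim)]


if __name__ == "__main__":
    sample = [
        "Phase 2 VM networking: builds, boots, partially validated",
        "Phase 3 substrate is ready",  # no label, so it is flagged
    ]
    for claim in missing_validation_label(sample):
        print("unlabelled claim:", claim)
```

A check like this only confirms that a label is present; it cannot tell whether the label is the right one for the evidence.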
|
||||
|
||||
## Summary
|
||||
|
||||
Phase 0 is the AMD-first bare-metal boot foundation.
|
||||
|
||||
Phase 1 is structurally implemented and largely cleaned up.
|
||||
|
||||
Phase 2 now has an actual VM-network baseline with repo, launch, and in-guest validation helpers.
|
||||
|
||||
One practical caveat surfaced during reassessment: the QEMU launch helper also depends on usable
|
||||
x86_64 UEFI firmware on the host. When that firmware is missing, the failure mode is a host-side
|
||||
SeaBIOS/iPXE fallback rather than a guest-side Red Bear runtime failure, so the helper now checks
|
||||
for that prerequisite explicitly.
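
A host-side prerequisite check of this kind could look like the sketch below. It is not the helper's actual code: the function names are hypothetical, and the candidate firmware paths are assumptions about common host layouts rather than paths taken from the repo.

```python
import os

# Assumed host locations for x86_64 UEFI firmware images. These are
# illustrative guesses; the real helper may look elsewhere.
DEFAULT_CANDIDATES = (
    "/usr/share/OVMF/OVMF_CODE.fd",
    "/usr/share/edk2/ovmf/OVMF_CODE.fd",
    "/usr/share/edk2/x64/OVMF_CODE.fd",
)


def find_uefi_firmware(candidates=DEFAULT_CANDIDATES):
    """Return the first readable firmware image, or None if none is usable."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.R_OK):
            return path
    return None


def require_uefi_firmware(candidates=DEFAULT_CANDIDATES):
    """Fail early on the host instead of letting the launch fall back silently."""
    path = find_uefi_firmware(candidates)
    if path is None:
        raise FileNotFoundError(
            "no x86_64 UEFI firmware found; checked: " + ", ".join(candidates)
        )
    return path
```

Failing before the guest starts keeps a host configuration problem from being misread as a Red Bear runtime failure.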
|
||||
|
||||
Phase 3 should be understood as **runtime-substrate validation and wiring**, not as a brand-new
|
||||
infrastructure buildout from zero.
|
||||
|
||||
## Quality Assessment
|
||||
|
||||
### Planning quality
|
||||
|
||||
**Strong points**
|
||||
|
||||
- The public plan in `docs/07` is clearer and more execution-oriented than the older roadmap.
|
||||
- Phase 1 and Phase 2 now have concrete helper scripts and docs instead of relying on implicit
|
||||
operator knowledge.
|
||||
- The profile matrix and governance docs substantially reduce ambiguity about what each tracked
|
||||
profile is supposed to represent.
|
||||
|
||||
**Weak points**
|
||||
|
||||
- Historical phase numbering from `AMD-FIRST-INTEGRATION.md` still differs from the newer public
|
||||
plan, which can confuse progress reporting if the bridge document is not consulted.
|
||||
- Some status language across the repo still tends to overvalue “builds” relative to “validated”.
|
||||
|
||||
### Implementation quality
|
||||
|
||||
**Strong points**
|
||||
|
||||
- Shared Red Bear config fragments reduced duplication in tracked profiles.
|
||||
- The VM-network baseline now has layered validation surfaces: repo-level, launcher-level, and
|
||||
in-guest runtime checks.
|
||||
- `redbear-info` remains aligned with real integration changes instead of becoming stale.
|
||||
|
||||
**Weak points**
|
||||
|
||||
- Runtime validation is still thinner than build validation across the early phases.
|
||||
- Some local operating docs needed follow-up cleanup to reflect the newer scripts and profile set.
|
||||
|
||||
### Recommendation
|
||||
|
||||
For Phase 0–3 work, prefer closing validation gaps and documentation drift before adding new scope.
|
||||
The early-phase codebase is in a much better structural state now; the main quality risk is no
|
||||
longer missing packages, but overstating readiness before runtime evidence exists.
|
||||
|
||||
## Phase 4 Handoff Note
|
||||
|
||||
Phase 4 should begin from the existing `wayland.toml` profile, not by jumping straight to KWin.
|
||||
The current repo already contains the `smallvil`, `cosmic-comp`, `qtwayland`, and Mesa software
|
||||
rendering pieces; the highest-value next work is validating the `orbital-wayland` → `smallvil`
|
||||
runtime path on QEMU/VirtIO and only then widening to heavier compositor/session stacks.
|
||||
@@ -0,0 +1,631 @@
|
||||
# Red Bear OS relibc Completeness and Enhancement Plan
|
||||
|
||||
## Purpose
|
||||
|
||||
This document assesses relibc in Red Bear OS for **strengths**, **deficiencies**,
**subsystem-facing gaps**, and **overall quality**, then defines a practical plan for improving it.
|
||||
|
||||
The goal is not to treat relibc as a generic libc project. The goal is to describe:
|
||||
|
||||
- what is already strong,
|
||||
- what exists only through local patch carriers,
|
||||
- what is still incomplete or weak,
|
||||
- what downstream subsystems still depend on relibc improvement,
|
||||
- and what order of work best improves real system capability.
|
||||
|
||||
This is a Red Bear-specific document. It is grounded in the current repo state rather than older,
|
||||
pre-correction roadmap assumptions.
|
||||
|
||||
## Evidence Model
|
||||
|
||||
This plan uses four evidence buckets and does **not** treat them as equivalent:
|
||||
|
||||
- **source-visible** — behavior visible directly in the current relibc source tree
|
||||
- **patch-carried** — behavior carried in `local/patches/relibc/P3-*.patch`
|
||||
- **build-visible downstream** — downstream packages now compile because the libc surface exists
|
||||
- **runtime-validated** — behavior has been exercised successfully in real downstream/runtime paths
|
||||
|
||||
This distinction matters because relibc’s current problem is often **not** “API absent,” but the gap
between **source-visible**, **patch-carried**, **build-visible downstream**, and **runtime-validated**.
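
The four buckets can be modelled as independent flags rather than a single ladder, since one API can be source-visible and patch-carried at the same time. The sketch below is an assumption-laden illustration: the `Evidence` type and `status_line` helper are hypothetical names, not part of any Red Bear tool.

```python
from enum import Flag, auto


class Evidence(Flag):
    """The four evidence buckets; a claim can sit in several at once."""
    SOURCE_VISIBLE = auto()
    PATCH_CARRIED = auto()
    BUILD_VISIBLE_DOWNSTREAM = auto()
    RUNTIME_VALIDATED = auto()


# Display labels use the wording of this plan.
_BUCKETS = (
    (Evidence.SOURCE_VISIBLE, "source-visible"),
    (Evidence.PATCH_CARRIED, "patch-carried"),
    (Evidence.BUILD_VISIBLE_DOWNSTREAM, "build-visible downstream"),
    (Evidence.RUNTIME_VALIDATED, "runtime-validated"),
)


def status_line(api, evidence):
    """Render a status string that never hides a missing runtime proof."""
    labels = [label for bucket, label in _BUCKETS if bucket in evidence]
    summary = ", ".join(labels) if labels else "no evidence recorded"
    if Evidence.RUNTIME_VALIDATED not in evidence:
        summary += " (not runtime-validated)"
    return f"{api}: {summary}"


if __name__ == "__main__":
    print(status_line("timerfd", Evidence.SOURCE_VISIBLE | Evidence.PATCH_CARRIED))
```

Keeping the buckets as flags avoids implying that patch-carried evidence is automatically weaker or stronger than source-visible evidence.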
|
||||
|
||||
## Upstream vs Red Bear ownership
|
||||
|
||||
For relibc, the ownership boundary must stay explicit:
|
||||
|
||||
- `recipes/core/relibc/source/` is the live upstream-owned working tree used for actual build and
|
||||
validation
|
||||
- `local/patches/relibc/P3-*.patch` is the Red Bear-owned durable carrier for relibc changes
|
||||
- `local/docs/...` is the durable explanation of what those changes mean and how to reapply them
|
||||
|
||||
That means a relibc change is not truly preserved until it exists in **both** places:
|
||||
|
||||
1. the live relibc source tree, so the current build can prove it
|
||||
2. the `local/patches/relibc/` carrier set, so the same result can be recreated after an upstream refresh
|
||||
|
||||
The repo standard for success is not merely “the current source tree builds.” The standard is:
|
||||
|
||||
> we can fetch fresh upstream relibc sources, reapply the Red Bear relibc patch carriers, and still
|
||||
> rebuild the same working result.
|
||||
|
||||
Any relibc work that exists only under `recipes/core/relibc/source/` should therefore be treated as
|
||||
validated-but-not-yet-preserved.
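
The two-place rule can be approximated with a crude text search. The sketch below is only a heuristic under stated assumptions: the helper names are hypothetical, the `*.rs` and `*.patch` patterns are guesses about where a symbol would appear, and the `carried-but-not-in-live-tree` label is invented here for the case this document does not name.

```python
from pathlib import Path


def mentions(root, needle, pattern):
    """True if any file under root whose name matches pattern contains needle."""
    for path in Path(root).rglob(pattern):
        if path.is_file() and needle in path.read_text(errors="ignore"):
            return True
    return False


def preservation_state(symbol, source_root, patch_dir):
    """Classify a symbol by where it is found: live tree, carrier set, or both."""
    in_tree = mentions(source_root, symbol, "*.rs")
    in_carrier = mentions(patch_dir, symbol, "*.patch")
    if in_tree and in_carrier:
        return "preserved"
    if in_tree:
        return "validated-but-not-yet-preserved"
    if in_carrier:
        return "carried-but-not-in-live-tree"  # label assumed for this sketch
    return "absent"
```

In the repo layout described above, the two arguments would be `recipes/core/relibc/source/` and `local/patches/relibc/`. A substring match can produce false positives, so a hit is a prompt to look, not a proof.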
|
||||
|
||||
Because relibc is also one of the fastest-moving upstream areas, Red Bear should apply one more
|
||||
rule here:
|
||||
|
||||
> if a Red Bear relibc patch solves a problem that upstream has already solved, prefer the upstream
|
||||
> solution and retire or reduce the local patch.
|
||||
|
||||
The goal is durable compatibility, not a permanent relibc fork.
|
||||
|
||||
## Current Repo State
|
||||
|
||||
> **Implementation note (current Red Bear tree):** this repo pass moved several relibc items from
|
||||
> patch-carried-only or downstream-workaround status into source-visible libc behavior. The current
|
||||
> tree now contains source-visible `signalfd`, `timerfd`, `eventfd`, `open_memstream`,
|
||||
> `F_DUPFD_CLOEXEC`, `MSG_NOSIGNAL`, a bounded `waitid()` path, bounded `RLIMIT_NOFILE` /
|
||||
> `RLIMIT_MEMLOCK` behavior, a bounded `eth0`-backed `net/if.h` / `ifaddrs.h` view, a source-visible
|
||||
> `resolv.h` plus bounded `res_query()` / `res_search()` compatibility paths with receive/send
|
||||
> timeout hardening, a first named-semaphore implementation on top of the existing shm path, and
|
||||
> bounded `sys/ipc.h` / `sys/shm.h` surfaces for the `IPC_PRIVATE` / `shmget` / `shmat` /
|
||||
> `shmdt` / `shmctl(IPC_RMID)` workflow.
|
||||
|
||||
> **Downstream validation note (current Red Bear tree):** `libwayland` now cooks successfully
|
||||
> against the updated relibc, and qtbase now configures, builds, and stages with
|
||||
> `FEATURE_process=ON`, `FEATURE_sharedmemory=ON`, and `FEATURE_systemsemaphore=ON` in the current
|
||||
> tree. The relibc `tests/` harness also now builds focused Redox-target binaries for `eventfd`,
|
||||
> `waitid`, `res_init`, `res_query`, `sem_open`, and `shmget`, and the host-target variants of those same
|
||||
> focused tests now execute successfully under the relibc-built host sysroot. That does not mean
|
||||
> relibc is complete, but it does mean the implementation has crossed real downstream build/stage
|
||||
> gates and direct execution-level proof rather than remaining an isolated libc-only pass. The
|
||||
> current host-side `res_query` proof is still bounded: it compiles, runs, and fails fast under the
|
||||
> relibc sysroot instead of hanging, but it is not yet a runtime-trusted downstream DNS proof.
|
||||
>
|
||||
> **Additional downstream proof (current Red Bear tree):** the in-tree `openssh` recipe now cooks
|
||||
> successfully against the relibc resolver surface after switching the recipe to the rebuilt relibc
|
||||
> headers/libraries and removing stale Redox-specific resolver fallbacks from the OpenSSH patch.
|
||||
> That is still build/stage proof rather than runtime SSH validation, but it demonstrates that real
|
||||
> consumers can now compile and link `res_init`, `res_query`, and `dn_expand` from relibc.
|
||||
>
|
||||
> **Fresh revalidation pass (current Red Bear tree):** the focused host-side relibc proofs were
|
||||
> rerun for `eventfd`, `waitid`, `res_init`, `res_query`, `sem_open`, and `shmget`; the binaries all
|
||||
> built, and the executions succeeded for `eventfd`, `waitid`, `res_init`, `sem_open`, and `shmget`
|
||||
> with the bounded `res_query` test still failing fast rather than hanging. The main downstream
|
||||
> consumers previously used as evidence were also rerun successfully: `CI=1 ./target/release/repo cook libwayland`,
|
||||
> `CI=1 ./target/release/repo cook qtbase`, and `CI=1 ./target/release/repo cook openssh` now all
|
||||
> succeed in the current tree.
|
||||
>
|
||||
> **Additional focused coverage (current Red Bear tree):** integrated relibc tests were also added
|
||||
> for `open_memstream`, SysV semaphores via `semget`/`semop`/`semctl`, `timerfd`, and `signalfd`.
|
||||
> On the host-side relibc sysroot, `open_memstream` and `semget` execute successfully, while the
|
||||
> `timerfd` and `signalfd` tests currently report bounded unavailability in that host environment
|
||||
> rather than hanging or crashing. That still falls short of Redox runtime proof for those two
|
||||
> non-POSIX APIs, but it moves them from source-visible-only status into the explicit test harness.
|
||||
>
|
||||
> **Fresh-upstream reapply proof (current Red Bear tree):** a fresh `repo unfetch relibc` →
|
||||
> `repo fetch relibc` cycle was used to reconstruct the relibc source tree from upstream-owned
|
||||
> sources, the durable `local/patches/relibc/` carrier set was reapplied to that fresh tree, and the
|
||||
> resulting rebuild again supported successful downstream `libwayland` and `qtbase` cooks. That is
|
||||
> the current proof that Red Bear’s relibc work is not only buildable in-place, but also recoverable
|
||||
> after a fresh upstream source refresh.
|
||||
|
||||
> **Current reconstructed-state proof set:** with the refreshed source tree rebuilt from the local
|
||||
> relibc overlay set, the repo now has successful cookbook evidence for all three layers in order:
|
||||
> `CI=1 ./target/release/repo cook relibc`, then `CI=1 ./target/release/repo cook libwayland`, then
|
||||
> `CI=1 ./target/release/repo cook qtbase`. This is the strongest current proof that the relibc
|
||||
> compatibility work is preserved in the right place for long-term maintenance.
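
The ordered proof set above can be driven by a small wrapper that stops at the first failing layer. This is a sketch, not an existing script: the `cook_in_order` name and the injectable `run` parameter are assumptions, while the `repo cook` invocations and the `CI=1` setting are the ones quoted above.

```python
import os
import subprocess

REPO = "./target/release/repo"
# Layer order from the proof set: libc first, then its consumers.
COOK_ORDER = ("relibc", "libwayland", "qtbase")


def cook_in_order(run=subprocess.run, order=COOK_ORDER):
    """Cook each recipe in order and return the ones that succeeded.

    Stops at the first failure so a broken relibc is never masked by
    stale downstream build output.
    """
    env = dict(os.environ, CI="1")
    cooked = []
    for recipe in order:
        result = run([REPO, "cook", recipe], env=env)
        if result.returncode != 0:
            break
        cooked.append(recipe)
    return cooked
```

Passing a fake `run` makes the ordering logic checkable without a cookbook checkout; the step that reapplies the carriers is deliberately left out because its mechanism is not described here.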
|
||||
|
||||
### Summary
|
||||
|
||||
relibc is one of Red Bear’s strongest foundational subsystems, but it is not complete.
|
||||
|
||||
The current repo shows a relibc that is already strong in:
|
||||
|
||||
- broad header/libc surface coverage
|
||||
- real Redox-native platform integration
|
||||
- source-visible implementations of the historical Wayland-facing P3 APIs, with patch carriers still retained as sync/upstream artifacts
|
||||
- enough maturity to unlock major build-side progress in Wayland, Qt, and KDE
|
||||
- a substantial generic upstream-style test tree
|
||||
|
||||
The current repo also shows relibc is still weak in:
|
||||
|
||||
- shared memory / SysV IPC completeness
|
||||
- named semaphores
|
||||
- process/runtime quality for some downstreams
|
||||
- networking/resolver/interface completeness
|
||||
- Redox-target and downstream-runtime validation depth
|
||||
|
||||
### Status Matrix
|
||||
|
||||
| Area | State | Notes |
|
||||
|---|---|---|
|
||||
| Core POSIX/header breadth | **strong / partial** | Large header surface exists, but many TODO headers and feature gaps remain |
|
||||
| Wayland-facing P3 APIs | **implemented / source-visible / runtime-unproven** | `signalfd`, `timerfd`, `eventfd`, `open_memstream`, socket flags, and `F_DUPFD_CLOEXEC` now exist in the relibc source tree; runtime proof still trails build integration |
|
||||
| Networking/libc socket surface | **usable / partial** | AF_INET/AF_UNIX paths exist, but interface/reporting/resolver behavior remains narrow |
|
||||
| Qt/KDE downstream unblockers | **build-side improved / multiple gates crossed** | `QProcess`, `QSharedMemory`, and `QSystemSemaphore` now configure, build, and stage on in-tree qtbase; broader runtime validation is still needed |
|
||||
| Shared memory / semaphore completeness | **partial** | `shm_open` exists through the Redox shm path, but SysV IPC/shared-memory and named semaphore completeness remain open |
|
||||
| Process/runtime completeness | **partial** | Some process-facing functionality still uses stubs or downstream workarounds |
|
||||
| Dedicated test surface | **present / Redox-specific coverage still thin** | relibc has a substantial `source/tests/` tree, but the Red Bear-visible Redox/P3/runtime validation story is still weaker than the generic libc test surface |
|
||||
| Runtime validation against real consumers | **insufficient** | Still weaker than build-side evidence |
|
||||
|
||||
## Strong Points
|
||||
|
||||
### 1. relibc already exposes a broad libc/header surface
|
||||
|
||||
`recipes/core/relibc/source/src/header/mod.rs` shows a broad libc/header tree with networking,
|
||||
threading, polling, stdio, locale, signal, socket, time, and many Unix-facing modules already
|
||||
present.
|
||||
|
||||
That means Red Bear should treat relibc work as **quality and completeness hardening**, not as a
|
||||
greenfield libc effort.
|
||||
|
||||
### 2. The historical P3 Wayland-facing API bridge is now source-visible
|
||||
|
||||
The local relibc patch carriers documented the APIs that historically blocked Wayland and downstream
consumers. Some of those fixes are still Red Bear-owned overlays; others are now present upstream and
should no longer be carried locally. The carrier files discussed here include:
|
||||
|
||||
- `local/patches/relibc/P3-signalfd.patch`
|
||||
- `local/patches/relibc/P3-timerfd.patch`
|
||||
- `local/patches/relibc/P3-eventfd.patch`
|
||||
- `local/patches/relibc/P3-waitid.patch`
|
||||
|
||||
The remaining Red Bear-owned relibc carriers currently add or complete:
|
||||
|
||||
- `signalfd` / `signalfd4`
|
||||
- `timerfd_create` / `timerfd_settime` / `timerfd_gettime`
|
||||
- `eventfd` / `eventfd_read` / `eventfd_write`
|
||||
- bounded `waitid()`
|
||||
- bounded `sys/ipc.h`, `sys/sem.h`, and `sys/shm.h` compatibility layers
|
||||
- focused relibc IPC tests needed to keep those overlays validated after upstream refresh
|
||||
|
||||
The upstream-first policy still applies here, but the durable patch-carrier set should be trimmed
|
||||
only when a fresh upstream refetch plus reapply plus downstream rebuild actually proves the upstream
|
||||
coverage is sufficient. In the current Red Bear tree, `open_memstream`, `F_DUPFD_CLOEXEC`, and the
|
||||
socket flag work still need to remain in the relibc overlay set because the clean reconstructed
|
||||
consumer path still depends on them.
|
||||
|
||||
This is one of relibc’s strongest current points: Red Bear already has the exact P3 compatibility
|
||||
surface that older docs used to describe as absent.
|
||||
|
||||
The local patches still matter as provenance and sync-upstream carriers for the gaps upstream does
|
||||
not yet solve, but they should be retired as soon as upstream makes them redundant.
|
||||
|
||||
### 3. Downstream build progress proves relibc is materially useful
|
||||
|
||||
The current docs consistently show that relibc has already enabled substantial downstream progress:
|
||||
|
||||
- `docs/02-GAP-ANALYSIS.md` now marks the P3 bridge as implemented in-tree, with runtime validation still pending
|
||||
- `docs/03-WAYLAND-ON-REDOX.md` says the build-side relibc/libwayland bridge is restored and that the remaining blocker is runtime validation, not basic POSIX availability
|
||||
- `local/docs/QT6-PORT-STATUS.md` treats many earlier relibc blockers as moved from “missing” to “present but still needs downstream validation”
|
||||
|
||||
This is a major quality signal: relibc is already strong enough to unlock real build-side subsystem work.
|
||||
|
||||
### 4. relibc already has a substantial generic test surface
|
||||
|
||||
`recipes/core/relibc/source/tests/` is real and large. It already covers many libc-facing areas such
|
||||
as:
|
||||
|
||||
- `fcntl/`
|
||||
- `net/` and `netdb/`
|
||||
- `pthread/`
|
||||
- `stdio/`
|
||||
- `sys_mman/`
|
||||
- `sys_socket/`
|
||||
- `sys_resource/`
|
||||
- `time/`
|
||||
- `unistd/`
|
||||
|
||||
That is a genuine strength and should be documented as one.
|
||||
|
||||
The remaining weakness is narrower: Red Bear still lacks a strong
**Redox-target / P3 API / downstream-runtime** validation story that is as visible and deliberate as
this generic relibc test tree.
|
||||
|
||||
### 5. The current relibc problem is no longer one single blocker
|
||||
|
||||
The downstream evidence shows that relibc now has **multiple completeness fronts**:
|
||||
|
||||
- Wayland-facing POSIX/event APIs
|
||||
- Qt/KDE shared memory and semaphore support
|
||||
- process-facing behavior such as `waitid()`
|
||||
- networking/resolver completeness
|
||||
- legacy but still-consumed items such as `sigjmp_buf` and locale/runtime edges
|
||||
|
||||
That means the right enhancement plan is no longer “finish one missing API and unblock everything.”
|
||||
The work has to be triaged by downstream impact.
|
||||
|
||||
### 6. The Redox networking model is reflected in relibc
|
||||
|
||||
`recipes/core/relibc/source/src/platform/redox/socket.rs` shows a real Redox-native socket/path
|
||||
model instead of a pure stub implementation. That is another strong point: relibc already knows
|
||||
about Redox-native runtime behavior.
|
||||
|
||||
## Deficiencies and Gaps
|
||||
|
||||
### 1. Header coverage is still incomplete in visible source
|
||||
|
||||
`recipes/core/relibc/source/src/header/mod.rs` still contains a meaningful backlog of TODO or absent
|
||||
header surfaces, including examples such as:
|
||||
|
||||
- `iconv.h`
|
||||
- `mqueue.h`
|
||||
- `spawn.h`
|
||||
- `sys/msg.h`
|
||||
- `threads.h`
|
||||
- `wordexp.h`
|
||||
|
||||
Some of these are lower-value than others, but they still show that relibc has real completeness work left.
|
||||
|
||||
### 2. Named semaphores are now source-visible, but still incomplete
|
||||
|
||||
`recipes/core/relibc/source/src/header/semaphore/mod.rs` is still a clear example of partial completeness.
|
||||
|
||||
Basic unnamed semaphore paths exist (`sem_init`, `sem_post`, `sem_wait`, `sem_timedwait`, etc.),
|
||||
and the named semaphore path is now source-visible too:
|
||||
|
||||
- `sem_open`
|
||||
- `sem_close`
|
||||
- `sem_unlink`
|
||||
|
||||
These are now implemented on top of the existing shm path instead of left as raw `todo!()` stubs.
|
||||
|
||||
The remaining weakness is semantic and validation depth, not pure absence:
|
||||
|
||||
- broader POSIX semaphore semantics are still not strongly runtime-validated
|
||||
- downstream configure/runtime behavior still needs continued confirmation
|
||||
- the SysV semaphore surface remains thinner than a full Unix implementation
|
||||
|
||||
This directly affects downstream consumers such as `QSystemSemaphore`.
|
||||
|
||||
### 3. Shared memory is present, but not complete enough for downstream GUI/runtime work
|
||||
|
||||
The current relibc source already exposes one meaningful shared-memory path:
|
||||
|
||||
- `recipes/core/relibc/source/src/header/sys_mman/mod.rs` provides `shm_open()` and `shm_unlink()`
|
||||
- on Redox, that path resolves to `/scheme/shm/`
|
||||
- `recipes/core/base/source/ipcd/src/shm.rs` implements the backing shared-memory scheme
|
||||
|
||||
That is a real strength and should not be described as “shared memory absent.”
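
The create, attach-by-name, and unlink workflow that consumers expect from this path can be shown on a POSIX host. The sketch below uses CPython's `multiprocessing.shared_memory`, which goes through `shm_open()` and `shm_unlink()` on POSIX systems; it exercises the host libc rather than relibc, and the `shm_roundtrip` name is hypothetical.

```python
from multiprocessing import shared_memory


def shm_roundtrip(payload: bytes) -> bytes:
    """Create a named segment, attach to it by name, and read the payload back."""
    size = max(1, len(payload))  # a segment must be at least one byte
    owner = shared_memory.SharedMemory(create=True, size=size)
    try:
        owner.buf[:len(payload)] = payload
        # A second handle opened by name stands in for another process.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(peer.buf[:len(payload)])
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()  # remove the name so the segment can be reclaimed


if __name__ == "__main__":
    print(shm_roundtrip(b"red bear"))
```

A Redox-target equivalent would still be needed before treating this as evidence about `/scheme/shm/`.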
|
||||
|
||||
The real gap is that shared-memory completeness is still insufficient for broader downstream use:
|
||||
|
||||
- the source tree now has visible `sys/shm.h` / `sys/ipc.h` / `sys/sem.h` modules, but they remain bounded rather than comprehensive
|
||||
- Qt/KDE-facing docs still treat `shm_open()` / `shmget()`-class behavior as unresolved enough to block full `QSharedMemory` confidence
|
||||
- the current repo still lacks a strong end-to-end validation story for these paths in desktop consumers
|
||||
|
||||
### 4. Resolver and interface-networking completeness are still uneven
|
||||
|
||||
The downstream scan shows that networking-facing userland still hits relibc gaps beyond raw socket
|
||||
basics.
|
||||
|
||||
Examples from downstream recipes and docs:
|
||||
|
||||
- `recipes/wip/qt/qtbase/recipe.toml` still leaves QtNetwork disabled because of broader networking/runtime concerns such as `in6_pktinfo` and richer interface semantics, even though minimal `resolv.h` and `arpa/nameser.h` surfaces now exist
|
||||
- `recipes/net/openssh/recipe.toml` and its patch history still call out `resolv.h`
|
||||
- `recipes/wip/terminal/tmux/redox.patch` comments out `resolv.h`
|
||||
- `recipes/libs/glib/redox.patch` still touches resolver-facing includes
|
||||
|
||||
### 5. The networking surface is narrower than generic Unix software expects
|
||||
|
||||
The current source still shows important limits that should be named directly:
|
||||
|
||||
- `recipes/core/relibc/source/src/platform/redox/socket.rs` has AF_INET / AF_UNIX socket handling
|
||||
- `recipes/core/relibc/source/src/header/net_if/mod.rs` now exposes a bounded `eth0`-backed interface view instead of a permanent `stub`
|
||||
- `recipes/core/relibc/source/src/header/ifaddrs/mod.rs` now provides a bounded `eth0`-backed `getifaddrs()` path instead of pure `ENOSYS`
|
||||
- source-visible `resolv.h` / `arpa/nameser.h` plus bounded `res_query()` / `res_search()` compatibility are now present, and at least one real downstream (`openssh`) now builds against them, but broader resolver compatibility is still incomplete
|
||||
|
||||
That is enough to support the current Red Bear native network path in a bounded sense, but it is not
yet strong enough to claim broad interface-aware compatibility for higher-level consumers.
Resolver/header gaps and interface-model assumptions still show up in ports such as QtNetwork,
OpenSSH, tmux, glib, curl, and libuv.
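
Interface-aware consumers typically start by enumerating interfaces and mapping between names and indices, which is the `net/if.h` surface discussed above. The sketch below is a host-side probe using CPython's `socket` wrappers for `if_nameindex()`, `if_nametoindex()`, and `if_indextoname()`; it runs against the host libc, and the `check_interface_roundtrip` name is an assumption.

```python
import socket


def check_interface_roundtrip():
    """Return (name, index) pairs after verifying name and index agree both ways."""
    pairs = []
    for index, name in socket.if_nameindex():
        if socket.if_nametoindex(name) != index:
            raise RuntimeError(f"name lookup disagrees for {name!r}")
        if socket.if_indextoname(index) != name:
            raise RuntimeError(f"index lookup disagrees for {index}")
        pairs.append((name, index))
    return pairs


if __name__ == "__main__":
    for name, index in check_interface_roundtrip():
        print(index, name)
```

On a bounded single-interface view, a probe like this would be expected to report one consistent entry; anything a consumer needs beyond that is where the remaining gaps sit.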
|
||||
|
||||
### 6. Process/runtime completeness is still uneven
|
||||
|
||||
The repo still has process/runtime unevenness, but one meaningful consumer-facing gap has now narrowed:
|
||||
|
||||
- relibc now provides a bounded `waitid()` implementation over the existing `waitpid` path
|
||||
- the old Qt-side injected `waitid()` stub has been retired from the Qt recipe layer
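
The general idea of layering a `waitid()`-style result on top of `waitpid` can be sketched as a translation from a wait status to `si_code` and `si_status` values. This is not relibc's implementation: it is a host-side Python illustration, the `waitid_like` name is hypothetical, and it covers only exited and signalled children.

```python
import os
import signal


def waitid_like(pid):
    """Reap pid with waitpid and return (si_code, si_status) as waitid reports them."""
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        # Normal exit: si_status carries the exit code.
        return os.CLD_EXITED, os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # Killed by a signal: si_status carries the signal number.
        code = os.CLD_DUMPED if os.WCOREDUMP(status) else os.CLD_KILLED
        return code, os.WTERMSIG(status)
    raise ValueError(f"unhandled wait status {status:#x}")


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(7)
    print(waitid_like(child))
```

A bounded implementation in this style leaves out stopped and continued children and the `WNOWAIT` option, which is the kind of limit that broader process/runtime validation should probe.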
|
||||
|
||||
The source state needs to be classified carefully:
|
||||
|
||||
- `sigjmp_buf` exists in `recipes/core/relibc/source/include/setjmp.h`, so older downstream comments treating it as absent are better read as compatibility/staleness signals rather than primary source truth
|
||||
- `getgroups()` has a Redox implementation path in `platform/redox/mod.rs`
|
||||
- `getrlimit()` is no longer a pure placeholder for all consumers: Red Bear now has bounded `RLIMIT_NOFILE` and `RLIMIT_MEMLOCK` behavior, but broader resource-limit completeness is still weak
|
||||
|
||||
So process/runtime completeness should be treated as a real subsystem-quality track, but the plan
|
||||
must distinguish **missing**, **implemented but weak**, and **stale downstream complaint**.
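
For the resource-limit case, "implemented but weak" can be probed by checking that the two bounded limits return coherent soft and hard values. The sketch below is a host-side illustration using CPython's `resource` module; the helper names are assumptions, and it says nothing about the values relibc itself reports.

```python
import resource

# The two limits this plan describes as having bounded behavior.
BOUNDED = ("RLIMIT_NOFILE", "RLIMIT_MEMLOCK")


def bounded_limits():
    """Return {limit name: (soft, hard)} for the two bounded limits."""
    return {name: resource.getrlimit(getattr(resource, name)) for name in BOUNDED}


def is_sane(soft, hard):
    """A soft limit must not exceed the hard limit; infinity needs special care."""
    inf = resource.RLIM_INFINITY
    if soft == inf:
        return hard == inf
    if hard == inf:
        return soft >= 0
    return 0 <= soft <= hard


if __name__ == "__main__":
    for name, (soft, hard) in bounded_limits().items():
        print(name, soft, hard, "ok" if is_sane(soft, hard) else "suspicious")
```

A consumer such as SQLite only needs answers that are coherent, so a check of this shape is a reasonable first gate before deeper resource-limit work.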
|
||||
|
||||
### 7. Source quality still contains many TODO / unimplemented branches
|
||||
|
||||
The current source has a large amount of unfinished or explicitly deferred behavior across:
|
||||
|
||||
- `pthread`
|
||||
- `time`
|
||||
- `unistd`
|
||||
- `platform/redox`
|
||||
- `epoll`
|
||||
- `ptrace`
|
||||
- locale and stdio internals
|
||||
|
||||
This does not mean relibc is unusable. It means completeness and quality work now needs a stronger
|
||||
triage model instead of treating all missing items as equally important.
|
||||
|
||||
### 8. Redox-target and downstream validation remain thin relative to subsystem importance
|
||||
|
||||
The current repo already contains a substantial generic relibc test tree, but the Red Bear-visible
|
||||
validation story is still thin in the areas that matter most for current subsystem unblockers.
|
||||
|
||||
Right now much of relibc’s confidence in the Red Bear docs still comes from:
|
||||
|
||||
- source inspection
|
||||
- patch carriers
|
||||
- build-side downstream success
|
||||
- limited runtime validation via downstream stacks
|
||||
|
||||
That is not enough for a component as central as libc, especially for the Redox-target and
|
||||
downstream-consumer paths Red Bear depends on.
|
||||
|
||||
## Downstream-Blocking Gaps by Subsystem
|
||||
|
||||
### Wayland
|
||||
|
||||
The old “basic POSIX APIs are missing” story is no longer the main one.
|
||||
|
||||
Current state:
|
||||
|
||||
- `signalfd`, `timerfd`, `eventfd`, `open_memstream`, and key socket flags are now source-visible in relibc and still tracked by patch carriers for sync/upstream purposes
|
||||
- the current bounded `waitid()` path is also preserved as a relibc patch carrier so it can be reapplied after upstream refresh
|
||||
- `libwayland` now rebuilds with a much smaller Redox patch
|
||||
|
||||
Remaining blocker:
|
||||
|
||||
- runtime validation of the full relibc → libwayland → compositor path
|
||||
|
||||
So the current relibc task for Wayland is primarily **runtime proof and patch reduction**, not just
|
||||
adding obvious libc symbols.
|
||||
|
||||
Current Red Bear evidence is stronger than before: `libwayland` now cooks successfully against the
|
||||
updated relibc tree, which means the generated `sys/signalfd.h`, `sys/timerfd.h`, `sys/eventfd.h`,
|
||||
and `stdio.h`/`sys/socket.h` surfaces are now sufficient for at least one major downstream consumer.
|
||||
|
||||
### Qt / KDE
|
||||
|
||||
The Qt/KDE-facing relibc backlog is still substantial.
|
||||
|
||||
The biggest libc-facing gaps are:
|
||||
|
||||
- shared memory (`shm_open` / `shmget`) for `QSharedMemory`
|
||||
- named/system semaphores (`sem_open` / `semget`) for `QSystemSemaphore`
|
||||
- stronger process/runtime behavior for `QProcess`
|
||||
- runtime validation of QtNetwork against the current relibc networking surface
|
||||
- resolver/header completeness (`resolv.h`) and network-interface semantics for QtNetwork
|
||||
- broader process/runtime validation after the new bounded `waitid()` path
|
||||
|
||||
This makes Qt/KDE the clearest downstream consumer pushing relibc from “build-capable” toward
|
||||
“desktop-capable”.
|
||||
|
||||
Current Red Bear evidence is stronger than before here too: qtbase now configures, builds, and
stages with `FEATURE_process=ON`, `FEATURE_sharedmemory=ON`, and `FEATURE_systemsemaphore=ON` in the
current tree. The remaining work is therefore less about “make the feature visible at all” and more
about runtime semantics, broader compatibility, and downstream cleanup.
|
||||
|
||||
### Networking and interface-aware software
|
||||
|
||||
The current relibc networking model is usable, but still narrow enough that higher-level consumers
|
||||
keep carrying workarounds or disabled features.
|
||||
|
||||
The newer bounded `eth0`-backed `net_if` / `ifaddrs` work improves the source-visible story, but it
|
||||
is still only a first Red Bear-shaped interface view, not a full generic Unix interface model.
|
||||
|
||||
This is why the plan should treat networking as **usable but still validation-heavy**, not “done”.
|
||||
|
||||
### General userland / server software
|
||||
|
||||
The downstream scan also shows relibc gaps outside graphics:
|
||||
|
||||
- PostgreSQL and some libraries still carry `sigjmp_buf`-related downstream notes that need revalidation against current headers
|
||||
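- SQLite still notes `getrlimit()` / `getgroups()` gaps, even though the current source now treats the two differently: `getgroups()` has a Redox implementation path, while `getrlimit()` is bounded to `RLIMIT_NOFILE` and `RLIMIT_MEMLOCK` behavior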
|
||||
- Apache and other ports still touch semaphore or IPC assumptions
|
||||
|
||||
That is important because it means relibc completeness is not only about desktop bring-up. It also
|
||||
affects core application/server breadth.
|
||||
|
||||
### Desktop/session path
|
||||
|
||||
Session and desktop work depends less on one dramatic relibc gap than on overall libc quality:
|
||||
|
||||
- process semantics
|
||||
- IPC completeness
|
||||
- synchronization primitives
|
||||
- runtime interaction with D-Bus/Qt/Wayland consumers
|
||||
|
||||
This is why relibc should be treated as a cross-cutting runtime-quality subsystem, not just a POSIX checklist.
|
||||
|
||||
## Quality Assessment
|
||||
|
||||
### What relibc is good at now
|
||||
|
||||
- broad visible libc/header coverage
|
||||
- practical Redox-native integration rather than fake stubs everywhere
|
||||
- concrete P3 compatibility work for real downstreams
|
||||
- enough maturity to unlock major subsystem builds
|
||||
- a substantial generic test tree
|
||||
|
||||
### What relibc is bad at now
|
||||
|
||||
- uneven implementation depth
|
||||
- too many TODO/unimplemented branches for a component this central
|
||||
- patch-carried functionality that is still not strongly reflected in visible source snapshots
|
||||
- too little Redox-target and downstream-runtime validation relative to the generic test tree
|
||||
- too much downstream confidence still derived from “compiles” instead of “runtime-proven”
|
||||
|
||||
## Enhancement Plan
|
||||
|
||||
### Phase R0 — Evidence and Ownership Cleanup
|
||||
|
||||
**Goal**: Make relibc status honest before widening scope.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- explicitly track relibc claims as `source-visible`, `patch-carried`, `build-proven`, or `runtime-validated`
|
||||
- keep the P3 patch carriers discoverable and documented as canonical until upstreamed
|
||||
- stop describing relibc gaps with outdated “missing basics” language where the code already exists
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- subsystem docs consistently distinguish between missing, patch-carried, and runtime-proven relibc behavior
|
||||
|
||||
---
|
||||
|
||||
### Phase R1 — Stabilize the newly source-visible P3 APIs
|
||||
|
||||
**Goal**: Keep the newly source-visible P3 APIs aligned with their patch-carrier and downstream expectations.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- keep `signalfd`, `timerfd`, `eventfd`, `open_memstream`, socket flags, and `F_DUPFD_CLOEXEC` visible and maintained as canonical relibc behavior
|
||||
- reduce downstream assumptions that these APIs are still absent
|
||||
- ensure generated/exported headers stay aligned with the source-visible implementation set
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- the repo consistently treats these P3 APIs as source-visible functionality that now needs validation and downstream cleanup rather than invention
|
||||
|
||||
---
|
||||
|
||||
### Phase R2 — Close the shared-memory and semaphore completeness gap
|
||||
|
||||
**Goal**: Unlock the next meaningful Qt/KDE-facing libc surface.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- keep the existing `shm_open` / `/scheme/shm/` path explicit and documented
|
||||
- implement the missing SysV IPC/shared-memory side or document a deliberate non-goal if Red Bear does not want full SysV compatibility
|
||||
- harden and validate the now source-visible named semaphore support (`sem_open`, `sem_close`, `sem_unlink`)
|
||||
- close the specific `QSharedMemory` and `QSystemSemaphore` blockers identified in the Qt docs
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- the Qt/KDE docs no longer list shared memory and named semaphores as unresolved relibc blockers
|
||||
|
||||
---
|
||||
|
||||
### Phase R3 — Process/runtime correctness for desktop consumers
|
||||
|
||||
**Goal**: Reduce downstream process workarounds.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- strengthen process-facing libc/runtime behavior enough to remove targeted workarounds such as the Qt `waitid()` shim path
|
||||
- close or intentionally document the remaining `sigjmp_buf` / `getrlimit()` / `getgroups()` quality gaps that still force downstream patches
|
||||
- validate process semantics against real downstream consumers, not only isolated libc expectations
|
||||
|
||||
**Current implementation note:** the bounded `waitid()` path is now source-visible, the old Qt-side
|
||||
`waitid()` shim is gone, and qtbase now configures/builds/stages with process support enabled. The
|
||||
remaining work is broader process/runtime validation and cleanup, not the old total absence of `waitid()`.
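
A focused libc-level check for the `waitid()` path could look like the sketch below. It only uses standard POSIX calls; `waitid_exit_status` is a hypothetical helper name, and whether the bounded relibc path fills in every `siginfo_t` field shown here is an assumption to be confirmed on a Redox target.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with `code`, reaps it with waitid(), and
 * returns the exit status reported in siginfo_t, or -1 on any failure. */
int waitid_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: exit immediately */

    siginfo_t info;
    memset(&info, 0, sizeof info);

    if (waitid(P_PID, (id_t)pid, &info, WEXITED) != 0)
        return -1;

    /* A normally exited child must be reported as CLD_EXITED. */
    if (info.si_code != CLD_EXITED || info.si_pid != pid)
        return -1;

    return info.si_status;
}
```

On a Linux host, `waitid_exit_status(7)` returns 7. Running the same check on Redox separates "qtbase links against `waitid()`" from "the reported child status is actually correct".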
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- downstream process workarounds are reduced or eliminated for the current desktop stack
|
||||
|
||||
---
|
||||
|
||||
### Phase R4 — Networking/runtime validation
|
||||
|
||||
**Goal**: Turn the current networking surface from “present” into “trusted”.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- validate QtNetwork and similar consumers against the current relibc socket/ioctl/interface model
|
||||
- close the highest-value resolver/header gaps such as `resolv.h` where they are still forcing downstream stubs or disabled modules
|
||||
- evolve the new bounded `eth0`-backed interface-reporting path into a better general Redox interface model where needed
|
||||
- document which current networking semantics are intentionally Redox-specific and which are intended to mimic broader Unix behavior
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- at least one meaningful higher-level network consumer is validated against the current relibc networking surface
|
||||
|
||||
---
|
||||
|
||||
### Phase R5 — Dedicated relibc validation expansion
|
||||
|
||||
**Goal**: Improve libc confidence without waiting for whole desktop stacks.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- build a stronger dedicated Redox-target and P3/downstream validation layer on top of the existing generic relibc test tree
|
||||
- ensure new APIs and bugfixes come with focused libc-level tests where practical
|
||||
- keep downstream consumer tests, but stop relying on them as the only quality signal
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- relibc has explicit Redox-target and downstream-runtime validation beyond the generic upstream-style test tree
|
||||
|
||||
---
|
||||
|
||||
### Phase R6 — General completeness triage
|
||||
|
||||
**Goal**: Attack the remaining TODO/unimplemented backlog by priority rather than by random header count.
|
||||
|
||||
**What to do**:
|
||||
|
||||
- rank remaining TODO/unimplemented items by downstream subsystem impact
|
||||
- prioritize IPC, synchronization, process, time, and networking correctness over obscure or deprecated headers
|
||||
- keep deprecated/low-value gaps documented, but do not let them drive the roadmap ahead of higher-value runtime work
|
||||
|
||||
**Exit criteria**:
|
||||
|
||||
- relibc backlog is organized by real system impact instead of undifferentiated TODO volume
|
||||
|
||||
## Recommended Order of Work
|
||||
|
||||
The current best order is:
|
||||
|
||||
1. evidence cleanup and canonicalization of what already exists
|
||||
2. shared memory and named semaphores
|
||||
3. process/runtime correctness
|
||||
4. networking/runtime validation
|
||||
5. Redox-target and downstream validation expansion
|
||||
6. broader backlog triage and cleanup
|
||||
|
||||
That order matches the current downstream blocker chain better than a generic “finish all missing headers” strategy.
|
||||
|
||||
## Support-Language Guidance
|
||||
|
||||
Until the runtime-validation phases are materially complete, Red Bear should avoid saying:
|
||||
|
||||
- “relibc POSIX gaps are solved”
|
||||
- “Qt/Wayland blockers are fully gone”
|
||||
- “network/process/shared-memory support is complete”
|
||||
|
||||
Prefer language such as:
|
||||
|
||||
- “consumer-visible P3 APIs are now present, with runtime validation still needed”
|
||||
- “relibc is materially stronger, but desktop-facing completeness work remains”
|
||||
- “the remaining relibc problem is now quality and downstream proof, not just symbol absence”
|
||||
|
||||
## Summary
|
||||
|
||||
relibc is one of Red Bear’s strongest foundational subsystems, but it is not complete.
|
||||
|
||||
Its strongest current qualities are:
|
||||
|
||||
- broad libc/header coverage
|
||||
- real Redox-native platform integration
|
||||
- concrete source-visible and patch-backed solutions to the historical P3 Wayland-facing blockers
|
||||
- clear downstream build progress because of those fixes
|
||||
- a substantial generic test surface
|
||||
|
||||
Its largest remaining weaknesses are:
|
||||
|
||||
- incomplete shared memory and named semaphore support
|
||||
- process/runtime unevenness
|
||||
- networking/resolver/interface completeness gaps
|
||||
- too many TODO/unimplemented branches in central paths
|
||||
- too little Redox-target and downstream-runtime validation relative to the generic test tree
|
||||
|
||||
The correct relibc roadmap is therefore **not** “hunt random missing symbols.” It is to turn the
|
||||
current build-capable libc into a runtime-trusted subsystem by closing the high-value desktop/runtime
|
||||
gaps, strengthening validation, and reducing patch-carried ambiguity.
|
||||
@@ -0,0 +1,393 @@
|
||||
# Red Bear OS relibc IPC Assessment and Improvement Plan
|
||||
|
||||
## Purpose
|
||||
|
||||
This document assesses the current **IPC-related relibc surface** in Red Bear OS and turns that
|
||||
assessment into a concrete improvement plan.
|
||||
|
||||
The focus here is narrower than the general relibc plan:
|
||||
|
||||
- POSIX shared memory and semaphores
|
||||
- System V shared memory and semaphores
|
||||
- missing System V / POSIX IPC areas such as message queues
|
||||
- IPC-adjacent descriptor/event primitives that downstream software treats as part of the same
|
||||
coordination substrate: `eventfd`, `signalfd`, and `timerfd`
|
||||
- the downstream subsystem pressure created by Qt, KDE, Wayland, and related userland
|
||||
|
||||
This is not a generic libc-compliance document. It is grounded in the current repository state.
|
||||
|
||||
## Evidence Model
|
||||
|
||||
This assessment distinguishes four evidence levels:
|
||||
|
||||
- **source-visible** — behavior exists in relibc source now
|
||||
- **test-visible** — behavior is exercised by focused relibc tests
|
||||
- **build-visible downstream** — real consumers compile/link against it
|
||||
- **runtime-validated** — behavior has been exercised in real Redox or consumer runtime paths
|
||||
|
||||
The key IPC problem in the current tree is not simple absence. It is the gap between what is
**source-visible** (often in deliberately bounded form), what is **build-visible downstream**, and
what is **runtime-validated**.
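
The evidence levels can be made mechanical so that status wording follows from recorded evidence. This is a minimal sketch under stated assumptions: the levels are treated as ordered from weakest to strongest in the order listed above, and `struct ipc_status` and the helper functions are hypothetical names, not types from relibc or the repo tooling.

```c
#include <assert.h>
#include <stddef.h>

/* Evidence levels from the list above, assumed ordered weakest to strongest. */
enum evidence_level {
    EV_SOURCE_VISIBLE = 1,
    EV_TEST_VISIBLE,
    EV_BUILD_VISIBLE_DOWNSTREAM,
    EV_RUNTIME_VALIDATED
};

/* Hypothetical status record for one IPC surface. */
struct ipc_status {
    const char *api;            /* e.g. "timerfd" */
    enum evidence_level level;  /* strongest evidence recorded so far */
    int bounded;                /* nonzero if deliberately narrower than the API suggests */
};

/* Maps a level to the label used in the docs. */
const char *evidence_label(enum evidence_level level)
{
    switch (level) {
    case EV_SOURCE_VISIBLE:           return "source-visible";
    case EV_TEST_VISIBLE:             return "test-visible";
    case EV_BUILD_VISIBLE_DOWNSTREAM: return "build-visible downstream";
    case EV_RUNTIME_VALIDATED:        return "runtime-validated";
    }
    return "unknown";
}

/* Runtime-trust wording is only allowed at the strongest level. */
int may_claim_runtime_trust(const struct ipc_status *s)
{
    assert(s != NULL);
    return s->level == EV_RUNTIME_VALIDATED;
}

/* "Complete" wording additionally requires that the layer is not bounded. */
int may_claim_complete(const struct ipc_status *s)
{
    return may_claim_runtime_trust(s) && !s->bounded;
}
```

Keeping `bounded` separate from the evidence level reflects the point of this section: a surface can be well evidenced and still intentionally narrow.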
|
||||
|
||||
## Upstream vs Red Bear separation
|
||||
|
||||
For this IPC work, keep the storage model explicit:
|
||||
|
||||
- the live implementation under `recipes/core/relibc/source/src/header/` is the working upstream
|
||||
tree used for builds and tests
|
||||
- the durable Red Bear ownership boundary is `local/patches/relibc/` plus `local/docs/`
|
||||
|
||||
So the IPC implementation is only truly safe when:
|
||||
|
||||
1. the upstream-owned relibc source tree builds with the change now, and
|
||||
2. the same delta is preserved in `local/patches/relibc/` so a fresh upstream refetch can recover it
|
||||
|
||||
This repo should be able to pull renewed upstream sources every day and still rebuild after
|
||||
reapplying the local relibc patch carriers. That requirement is part of the IPC improvement plan,
|
||||
not an afterthought.
|
||||
|
||||
This storage model also implies an upstream-preference policy:
|
||||
|
||||
- when upstream relibc already provides the same IPC fix, prefer upstream
|
||||
- keep Red Bear IPC patches only for gaps that upstream still does not solve adequately
|
||||
- review patch carriers regularly and delete or shrink ones made obsolete by upstream evolution
|
||||
|
||||
## Current Implementation Note
|
||||
|
||||
This repo pass did not just assess the IPC surface; it also restored the missing relibc IPC modules
|
||||
that the drafted Red Bear docs were already assuming existed in-tree.
|
||||
|
||||
The current tree now contains source-visible implementations for:
|
||||
|
||||
- `sys/eventfd.h` / `eventfd()` / `eventfd_read()` / `eventfd_write()`
|
||||
- `sys/timerfd.h` / `timerfd_create()` / `timerfd_settime()` / `timerfd_gettime()`
|
||||
- `sys/signalfd.h` / `signalfd()` / `signalfd4()`
|
||||
- `open_memstream()`
|
||||
- bounded `sys/ipc.h`, `sys/shm.h`, and `sys/sem.h` compatibility layers
|
||||
- a bounded `waitid()` path sufficient to satisfy current Qt process-side linking
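
As an illustration of the `eventfd` surface named in this list, the sketch below writes two increments and reads the counter back. It uses only `eventfd()`, `eventfd_write()`, and `eventfd_read()`; `eventfd_roundtrip` is a hypothetical helper name, and matching counter semantics on Redox is an assumption that a Redox-target test would need to confirm.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Writes two increments to a fresh eventfd and reads the counter back.
 * Returns the value read, or -1 on any failure. */
long eventfd_roundtrip(uint64_t first, uint64_t second)
{
    int fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (fd < 0)
        return -1;

    eventfd_t value = 0;
    long result = -1;

    /* In the default (non-semaphore) mode a read returns the accumulated
     * counter and resets it to zero. */
    if (eventfd_write(fd, first) == 0 &&
        eventfd_write(fd, second) == 0 &&
        eventfd_read(fd, &value) == 0)
        result = (long)value;

    close(fd);
    return result;
}
```

On a Linux host, `eventfd_roundtrip(2, 3)` returns 5.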
|
||||
|
||||
This pass also added focused relibc tests for:
|
||||
|
||||
- `stdio/open_memstream`
|
||||
- `sys_sem/semget`
|
||||
- `sys_timerfd/timerfd`
|
||||
- `sys_signalfd/signalfd`
|
||||
|
||||
Current manual verification in this repo pass:
|
||||
|
||||
- `cargo check --target x86_64-unknown-linux-gnu` passes for relibc
|
||||
- host-side focused IPC tests execute successfully for `open_memstream` and `semget`
|
||||
- host-side focused `timerfd` and `signalfd` tests report bounded unavailability rather than hanging
|
||||
- `CI=1 ./target/release/repo cook relibc` completes successfully after clearing a stale stage-dir collision
|
||||
- `CI=1 ./target/release/repo cook qtbase` now succeeds after exporting `eventfd_t` and restoring a bounded `waitid()` path
|
||||
- a fresh `repo unfetch relibc` → `repo fetch relibc` cycle plus reapplication of
|
||||
`local/patches/relibc/` again supports successful downstream `libwayland` and `qtbase` builds,
|
||||
which is the current proof that the relibc IPC overlay is recoverable from refreshed upstream
|
||||
source, not only from the previously edited working tree
|
||||
|
||||
In other words, the current relibc IPC work is no longer just “working in the checked-out source
|
||||
tree”. It is now proven as an overlay workflow:
|
||||
|
||||
1. refresh upstream relibc source
|
||||
2. reapply the local relibc compatibility overlays
|
||||
3. rebuild relibc
|
||||
4. rebuild real downstream consumers (`libwayland`, `qtbase`)
|
||||
|
||||
## Scope Map
|
||||
|
||||
### In scope in relibc today
|
||||
|
||||
| Area | State | Primary evidence |
|---|---|---|
| `shm_open()` / `shm_unlink()` | implemented | `recipes/core/relibc/source/src/header/sys_mman/mod.rs` |
| POSIX unnamed semaphores | implemented | `recipes/core/relibc/source/src/header/semaphore/mod.rs` |
| POSIX named semaphores | implemented but bounded | `recipes/core/relibc/source/src/header/semaphore/mod.rs` |
| SysV shared memory | implemented but bounded | `recipes/core/relibc/source/src/header/sys_shm/mod.rs` |
| SysV semaphores | implemented but bounded | `recipes/core/relibc/source/src/header/sys_sem/mod.rs` |
| `eventfd` | implemented; stronger than the other descriptor-event APIs | `recipes/core/relibc/source/src/header/sys_eventfd/mod.rs` |
| `signalfd` | implemented, but runtime-thin and not broadly Redox-runtime-trusted yet | `recipes/core/relibc/source/src/header/signal/signalfd.rs` |
| `timerfd` | implemented, but semantically narrow and not broadly Redox-runtime-trusted yet | `recipes/core/relibc/source/src/header/sys_timerfd/mod.rs` |
|
||||
|
||||
### Explicitly incomplete or absent
|
||||
|
||||
| Area | Current state | Evidence |
|---|---|---|
| POSIX message queues | absent | `recipes/core/relibc/source/src/header/mod.rs` still has `TODO: mqueue.h` |
| SysV message queues | absent | `recipes/core/relibc/source/src/header/mod.rs` still has `TODO: sys/msg.h` |
| `threads.h` / other broader libc completeness | outside this IPC focus, still incomplete | `recipes/core/relibc/source/src/header/mod.rs` |
|
||||
|
||||
## Current Implementation Assessment
|
||||
|
||||
### 1. Strong spots
|
||||
|
||||
The strongest IPC-related point is that relibc is no longer missing its core coordination substrate.
|
||||
The current tree has real, source-visible implementations for POSIX shm, POSIX semaphores, SysV
|
||||
shared memory, SysV semaphores, `eventfd`, `signalfd`, and `timerfd`. This is already enough to
|
||||
move several downstreams from patch-side workarounds to actual libc usage.
|
||||
|
||||
`shm_open()` and `shm_unlink()` are cleanly tied to the Redox-native `/scheme/shm/` path in
|
||||
`sys_mman/mod.rs`. That is a good architectural fit: Red Bear is not pretending to have a Linux
|
||||
kernel IPC model under the hood, but it still exposes familiar libc entry points on top of Redox
|
||||
schemes.
|
||||
|
||||
The second strong point is that the IPC work is not just source-visible anymore. The focused relibc
|
||||
tests already cover `sem_open`, `shmget`, `open_memstream`, `semget`, `eventfd`, and the bounded
|
||||
host-side `timerfd` / `signalfd` cases. The broader relibc plan also records successful downstream
|
||||
builds for `libwayland`, `qtbase`, and `openssh`, which means real consumers are already benefiting
|
||||
from this work, but those consumers do **not** all prove IPC depth equally.
|
||||
|
||||
### 2. Weak spots
|
||||
|
||||
The biggest weakness is **boundedness masquerading as compatibility**. The SysV layers exist, but
|
||||
they are deliberately thin wrappers over `/scheme/shm/` and relibc-local bookkeeping, not a broad
|
||||
Unix-complete implementation.
|
||||
|
||||
In `sys_shm/mod.rs`, `shmat()` rejects non-null attach addresses with `ENOSYS`, `SHM_RND` is
|
||||
defined but not meaningfully implemented, and `shmctl()` only meaningfully supports `IPC_RMID` and
|
||||
`IPC_STAT`. This is good enough for simple `IPC_PRIVATE` workflows and current compile-time
|
||||
consumers, but it is not strong enough to claim general SysV shared-memory completeness.
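
The "simple `IPC_PRIVATE` workflow" that the current layer is described as supporting can be written down as a probe. This is a sketch, not the relibc test itself: `shm_private_roundtrip` is a hypothetical helper, and it only touches the operations the paragraph above lists as meaningfully supported (kernel-chosen attach address, `IPC_STAT`, `IPC_RMID`).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Creates a private segment, attaches it at a system-chosen address,
 * writes to it, queries it, detaches, and removes it.
 * Returns 0 if every step succeeds, otherwise the errno of the first
 * failing step. */
int shm_private_roundtrip(size_t size)
{
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0)
        return errno;

    int rc = 0;
    void *mem = shmat(id, NULL, 0);      /* NULL: let the system pick the address */
    if (mem == (void *)-1) {
        rc = errno;
    } else {
        struct shmid_ds ds;

        memset(mem, 0xA5, size);         /* the mapping must be writable */
        if (shmctl(id, IPC_STAT, &ds) != 0)
            rc = errno;
        else if (ds.shm_segsz < size)
            rc = EINVAL;                 /* reported size smaller than requested */

        if (shmdt(mem) != 0 && rc == 0)
            rc = errno;
    }

    /* Always try to remove the segment so failed runs do not leak it. */
    if (shmctl(id, IPC_RMID, NULL) != 0 && rc == 0)
        rc = errno;

    return rc;
}
```

A caller-supplied attach address or `SHM_RND` is deliberately left out: per the text above those paths are rejected today, so they belong in a separate negative test.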
|
||||
|
||||
In `sys_sem/mod.rs`, `semget()` rejects any `nsems != 1`, so the implementation is effectively a
|
||||
single-semaphore set model rather than a full semaphore-set model. `semop()` supports multiple
|
||||
operations in one call, but only for semaphore number 0, and there is no `semtimedop()` support.
|
||||
`SEM_UNDO` is defined but not actually implemented. Compared with the standard `semop(2)` model,
|
||||
this means the current layer matches only the narrowest downstream cases.
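
The limits described above are easy to pin down with two small probes. The sketch below uses only standard SysV calls; `probe_semget` and `probe_semop_sem0` are hypothetical helper names. The expectation that a two-semaphore request is rejected on the current relibc layer comes from the description above and is not verified here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* Returns 0 if a private set with `nsems` semaphores can be created
 * (the set is removed again), otherwise the errno from the failing call. */
int probe_semget(int nsems)
{
    int id = semget(IPC_PRIVATE, nsems, IPC_CREAT | 0600);
    if (id < 0)
        return errno;
    if (semctl(id, 0, IPC_RMID) != 0)
        return errno;
    return 0;
}

/* Returns 0 if a single semop() call can apply two operations to
 * semaphore number 0 (post, then non-blocking wait), otherwise errno. */
int probe_semop_sem0(void)
{
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id < 0)
        return errno;

    struct sembuf ops[2] = {
        { .sem_num = 0, .sem_op = 1,  .sem_flg = 0 },
        { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT },
    };

    int rc = (semop(id, ops, 2) == 0) ? 0 : errno;

    semctl(id, 0, IPC_RMID);             /* best-effort cleanup */
    return rc;
}
```

Comparing `probe_semget(1)` with `probe_semget(2)` on a Redox target documents the single-semaphore model as an observed result rather than a reading of the source.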
|
||||
|
||||
Named POSIX semaphores are also present but still bounded. `sem_open()` is implemented on top of
|
||||
`shm_open()`, which is a practical Redox-native strategy, but the current code comments already mark
|
||||
it as a bounded Redox path rather than a full Linux/glibc-equivalent semantic model.
|
||||
|
||||
The descriptor-event primitives are in a similar state. `eventfd` is in comparatively good shape,
|
||||
including a host fallback for Linux test execution. `signalfd` and `timerfd` are weaker. The host
|
||||
tests for both currently report bounded unavailability instead of successful execution, which is
|
||||
better than a hang or crash but still leaves them short of runtime trust. `timerfd` in particular
|
||||
supports only `TFD_CLOEXEC`, `TFD_NONBLOCK`, and `TFD_TIMER_ABSTIME`; Linux-style
|
||||
`TFD_TIMER_CANCEL_ON_SET` semantics are still absent, and downstream KWin code explicitly wants
|
||||
that flag.
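
Whether a given `timerfd` flag combination is accepted can be checked without waiting for a timer to fire. This is a minimal sketch: `probe_timerfd_flags` is a hypothetical helper, and the fallback definition of `TFD_TIMER_CANCEL_ON_SET` uses the Linux value (`1 << 1`) on the assumption that a header lacking the flag would adopt the same value.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

#ifndef TFD_TIMER_CANCEL_ON_SET
#define TFD_TIMER_CANCEL_ON_SET (1 << 1)  /* Linux value; assumed if the header lacks it */
#endif

/* Arms an absolute CLOCK_REALTIME timer one hour ahead, adding
 * `extra_flags` to TFD_TIMER_ABSTIME. Returns 0 if the flags are
 * accepted, otherwise the errno from the failing call. */
int probe_timerfd_flags(int extra_flags)
{
    int fd = timerfd_create(CLOCK_REALTIME, TFD_CLOEXEC | TFD_NONBLOCK);
    if (fd < 0)
        return errno;

    struct itimerspec its = {0};          /* it_interval stays zero: one-shot */
    int rc = 0;

    if (clock_gettime(CLOCK_REALTIME, &its.it_value) != 0) {
        rc = errno;
    } else {
        its.it_value.tv_sec += 3600;      /* far enough ahead that it never fires here */
        if (timerfd_settime(fd, TFD_TIMER_ABSTIME | extra_flags, &its, NULL) != 0)
            rc = errno;
    }

    close(fd);
    return rc;
}
```

Calling `probe_timerfd_flags(0)` and `probe_timerfd_flags(TFD_TIMER_CANCEL_ON_SET)` side by side shows whether the gap is in `timerfd` as a whole or only in the cancel-on-set flag that the KWin code wants.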
|
||||
|
||||
### 3. Missing areas
|
||||
|
||||
The obvious missing IPC area is message queues. Both `mqueue.h` and `sys/msg.h` remain TODOs in the
|
||||
header tree, which means relibc currently has no story at all for POSIX message queues or SysV
|
||||
message queues. That is not necessarily today’s highest-value blocker, but it is still a real IPC
|
||||
gap and should be named directly instead of being buried under generic TODO volume.
|
||||
|
||||
## Downstream Subsystem Assessment
|
||||
|
||||
### Qt / KDE
|
||||
|
||||
Qt and KDE are the clearest subsystem forcing IPC depth rather than just IPC surface area.
|
||||
|
||||
`local/docs/QT6-PORT-STATUS.md` already treats `QSharedMemory`, `QSystemSemaphore`, and `QProcess`
|
||||
as moved from “missing libc surface” to “present, but still needs runtime validation”. That is the
|
||||
right framing. The libc surface is no longer the primary blocker; confidence and semantics are.
|
||||
|
||||
The strongest concrete consumers in-tree are:
|
||||
|
||||
- `local/recipes/kde/kf6-kservice/source/src/sycoca/kmemfile.cpp` — heavy `QSharedMemory` usage
|
||||
- `local/recipes/kde/kf6-solid/source/src/solid/devices/backends/udisks2/udisksopticaldisc.cpp` —
|
||||
`QSharedMemory` plus `QSystemSemaphore`
|
||||
- `local/recipes/kde/kf6-kio/source/src/gui/previewjob.cpp` — direct SysV `shmget` / `shmat`
|
||||
- `local/recipes/kde/kwin/source/src/utils/xcbutils.cpp` — direct `shmget`
|
||||
- `local/recipes/kde/kwin/source/src/core/syncobjtimeline.cpp` and kio scoped-process code —
|
||||
`eventfd`
|
||||
- `local/recipes/kde/kwin/source/src/plugins/nightlight/clockskewnotifierengine_linux.cpp` —
|
||||
`timerfd` with `TFD_TIMER_CANCEL_ON_SET`
|
||||
|
||||
This matters because it shows two different downstream classes:
|
||||
|
||||
1. **Qt abstractions** (`QSharedMemory`, `QSystemSemaphore`) that can tolerate bounded underlying
|
||||
libc behavior if their common paths work.
|
||||
2. **Direct Unix/Linux-style callers** (KIO/KWin) that expose the places where the current relibc
|
||||
SysV and timerfd layers are still semantically narrower than software expects.
|
||||
|
||||
### Wayland stack
|
||||
|
||||
Wayland is less about classic shared-memory IPC completeness now and more about the descriptor-event
|
||||
side of the same subsystem family. The repo’s existing docs correctly show that `signalfd`,
|
||||
`timerfd`, `eventfd`, and `open_memstream` were the historical blockers and are now source-visible.
|
||||
`libwayland` cooking successfully is strong build-side proof, but the remaining work is runtime
|
||||
behavior under a compositor/session stack.
|
||||
|
||||
### Secondary consumers: OpenSSH / GLib / tmux
|
||||
|
||||
These are weaker IPC drivers and stronger networking/resolver drivers. They still matter because they
|
||||
show a pattern: once relibc exports the needed surface, downstream recipes can drop fake fallbacks,
|
||||
but runtime validation still trails source visibility. For an IPC-focused roadmap, they are useful
|
||||
secondary evidence, not primary IPC blockers.
|
||||
|
||||
The downstream proof should therefore be read this way:
|
||||
|
||||
- `qtbase` is the strongest IPC-facing downstream because it directly pressures shared memory,
|
||||
semaphores, and process behavior.
|
||||
- KDE consumers on top of Qt are the strongest subsystem evidence for where IPC semantics still need
|
||||
runtime trust.
|
||||
- `libwayland` is strongest as descriptor-event proof (`signalfd`, `timerfd`, `eventfd`,
|
||||
`open_memstream`) rather than SysV IPC proof.
|
||||
- `openssh`, `glib`, and `tmux` are useful proof that relibc header/export cleanup is helping real
|
||||
ports, but they should not be over-counted as core IPC validation.
|
||||
|
||||
## Main Blockers
|
||||
|
||||
### Blocker 1 — SysV layers are intentionally narrower than their API surface suggests
|
||||
|
||||
This is the highest-value blocker because it affects both direct consumers and Qt/KDE confidence.
|
||||
|
||||
Current examples:
|
||||
|
||||
- `semget()` only supports one semaphore per set
|
||||
- `semop()` only supports semaphore number 0
|
||||
- `SEM_UNDO` is not implemented
|
||||
- `semtimedop()` is absent
|
||||
- `shmat()` does not support non-null attach addresses
|
||||
- `shmctl()` does not cover the broader control matrix
|
||||
- SysV message queues are absent entirely
|
||||
|
||||
None of these invalidate the current build work. But together they mean “API present” is still not
|
||||
the same as “subsystem-complete”.
|
||||
|
||||
### Blocker 2 — Runtime validation is still shallower than subsystem importance
|
||||
|
||||
The IPC surface is better-tested than before, but runtime validation still trails the subsystem’s
|
||||
importance.
|
||||
|
||||
Current test story:
|
||||
|
||||
- host-side focused execution exists for `sem_open`, `shmget`, `open_memstream`, `semget`, and
|
||||
`eventfd`
|
||||
- `signalfd` and `timerfd` are in the test harness, but host execution currently reports bounded
|
||||
unavailability
|
||||
- downstream build evidence exists for `libwayland`, `qtbase`, and `openssh`
|
||||
|
||||
What is still missing is stronger Redox-target or consumer-runtime proof for Qt/KDE and Wayland
|
||||
paths that actually exercise shared memory, semaphores, and timer/signal descriptor behavior in a
|
||||
live session.
|
||||
|
||||
The strongest safe claim today is therefore:
|
||||
|
||||
- **source-visible** across the major IPC surfaces,
|
||||
- **test-visible** for focused host-side cases,
|
||||
- **build-visible downstream** for meaningful consumers,
|
||||
- but **not yet broadly runtime-trusted on Redox**.
|
||||
|
||||
### Blocker 3 — Descriptor-event semantics are still narrower than Linux-oriented callers expect
|
||||
|
||||
KWin’s timer code wants `TFD_TIMER_CANCEL_ON_SET`. The current relibc timerfd layer does not support
|
||||
that flag. This is a concrete example of a downstream expectation gap that is not solved by simply
|
||||
having `timerfd_create()` present.
|
||||
|
||||
Likewise, `signalfd` support is visible and exported, but its current confidence story is still too
|
||||
thin for broad claims about desktop/runtime readiness.
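
A small event-loop-style check for `signalfd` is to block a signal, raise it, and read it back through the descriptor. The sketch below uses only standard calls; `signalfd_roundtrip` is a hypothetical helper name, and matching behavior on Redox is exactly what still needs runtime validation.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Blocks `signo`, raises it, and reads it back through a signalfd.
 * Returns the signal number read, or a negative errno value on failure. */
int signalfd_roundtrip(int signo)
{
    sigset_t mask, old;
    sigemptyset(&mask);
    sigaddset(&mask, signo);

    /* The signal must be blocked, or it is delivered normally instead. */
    if (sigprocmask(SIG_BLOCK, &mask, &old) != 0)
        return -errno;

    int result;
    int fd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
    if (fd < 0) {
        result = -errno;
    } else if (raise(signo) != 0) {
        result = -errno;
        close(fd);
    } else {
        struct signalfd_siginfo info;
        ssize_t n = read(fd, &info, sizeof info);

        if (n == (ssize_t)sizeof info) {
            result = (int)info.ssi_signo;
        } else {
            result = (n < 0) ? -errno : -EIO;
            /* Drain the still-pending signal so that restoring the mask
             * below does not deliver it with its default action. */
            struct timespec zero = {0, 0};
            sigtimedwait(&mask, NULL, &zero);
        }
        close(fd);
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

On a Linux host, `signalfd_roundtrip(SIGUSR1)` returns `SIGUSR1`. A negative return on a Redox target records which step failed, which is more useful than a bare "unavailable".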
|
||||
|
||||
### Blocker 4 — Message queues remain a completely open IPC front
|
||||
|
||||
`mqueue.h` and `sys/msg.h` are still absent. This is not the first blocker to fix for today’s
|
||||
desktop stack, but it is the clearest “IPC truly not implemented yet” gap left in relibc.
|
||||
|
||||
## Current Non-Goals / Not Yet Claimed
|
||||
|
||||
The current tree should **not** be described as claiming any of the following:
|
||||
|
||||
- full SysV semaphore-set semantics
|
||||
- full SysV shared-memory semantics
|
||||
- full Linux-equivalent `timerfd` semantics
|
||||
- broad Redox-runtime trust for `signalfd` or `timerfd`
|
||||
- any POSIX message queue support
|
||||
- any SysV message queue support
|
||||
|
||||
## Recommended Improvement Plan
|
||||
|
||||
### Phase I1 — Reclassify the IPC support language
|
||||
|
||||
**Goal:** Make subsystem docs accurately describe the current state.
|
||||
|
||||
**Do:**
|
||||
|
||||
- describe POSIX shm and semaphores as implemented
|
||||
- describe SysV shm and semaphores as **bounded compatibility layers**, not comprehensive support
|
||||
- describe `eventfd` as stronger than `signalfd` / `timerfd`
|
||||
- describe message queues as still absent
|
||||
|
||||
**Exit criteria:** repo docs stop using broad phrases that imply complete IPC compatibility.
|
||||
|
||||
### Phase I2 — Harden the bounded SysV compatibility layers
|
||||
|
||||
**Goal:** Make the existing SysV support less misleading and more useful.
|
||||
|
||||
**Do:**
|
||||
|
||||
- decide whether Red Bear wants full semaphore-set support or an intentionally limited single-set model
|
||||
- if limited, document that choice explicitly in relibc and subsystem docs
|
||||
- otherwise extend `semget` / `semop` / `semctl` beyond the current semaphore-0-only model
|
||||
- implement or explicitly reject `SEM_UNDO`
|
||||
- add `semtimedop()` if downstreams need it
|
||||
- expand `shmctl()` and `shmat()` support where real consumers need more than the current `IPC_PRIVATE`
|
||||
attach workflow
|
||||
|
||||
**Exit criteria:** the SysV shm/sem layers either become materially broader or are clearly documented
|
||||
as intentionally bounded Redox compatibility shims.
|
||||
|
||||
### Phase I3 — Close the Qt/KDE runtime-proof gap
|
||||
|
||||
**Goal:** Move the IPC story from build-visible to desktop-visible.
|
||||
|
||||
**Do:**
|
||||
|
||||
- validate `QSharedMemory` under real Qt/KDE usage paths
|
||||
- validate `QSystemSemaphore` in KDE consumers such as Solid
|
||||
- validate KIO / KWin direct SysV shm paths
|
||||
- record exactly which Qt/KDE IPC paths are now runtime-trusted versus merely build-capable
|
||||
|
||||
**Exit criteria:** Qt/KDE docs stop listing shared memory and semaphore support as unresolved relibc
|
||||
confidence gaps.
|
||||
|
||||
### Phase I4 — Improve descriptor-event completeness for compositor/session code
|
||||
|
||||
**Goal:** Turn the current `eventfd` / `signalfd` / `timerfd` set into a more trustworthy runtime layer.
|
||||
|
||||
**Do:**
|
||||
|
||||
- keep `eventfd` on the current stable path
|
||||
- validate `signalfd` in real event-loop style consumers
|
||||
- extend `timerfd` semantics where current downstream code expects more than `TFD_TIMER_ABSTIME`
|
||||
(notably `TFD_TIMER_CANCEL_ON_SET`)
|
||||
- build targeted Redox-target tests where host behavior is inherently not representative
|
||||
|
||||
**Exit criteria:** at least one meaningful compositor/session consumer is runtime-validated against
|
||||
the current descriptor-event path.
|
||||
|
||||
### Phase I5 — Triage message queues explicitly
|
||||
|
||||
**Goal:** Stop leaving message queues as unprioritized TODOs.
|
||||
|
||||
**Do:**
|
||||
|
||||
- determine whether any current Red Bear subsystem actually needs POSIX or SysV message queues
|
||||
- if not, mark them as lower-priority completeness debt
|
||||
- if yes, create a dedicated implementation plan rather than burying them in generic header backlog
|
||||
|
||||
**Exit criteria:** `mqueue.h` and `sys/msg.h` are either on a concrete roadmap or explicitly treated
|
||||
as non-blocking backlog.
|
||||
|
||||
## Recommended Order
|
||||
|
||||
The current best order is:
|
||||
|
||||
1. documentation cleanup and accurate IPC classification
|
||||
2. SysV shm/sem hardening or explicit non-goal documentation
|
||||
3. Qt/KDE runtime validation
|
||||
4. descriptor-event runtime validation and timerfd semantic expansion
|
||||
5. message queue triage
|
||||
|
||||
That order matches the current subsystem pressure better than a generic “finish all missing IPC
|
||||
headers” strategy.
|
||||
|
||||
## Bottom Line
|
||||
|
||||
relibc IPC in Red Bear OS is no longer a story of missing primitives. It is now a story of **real
|
||||
surface area with bounded compatibility depth**.
|
||||
|
||||
The strongest parts are POSIX shm, POSIX semaphores, `eventfd`, and the fact that major downstreams
|
||||
already build. The weakest parts are the narrow SysV semantics, the lack of message queues, and the
|
||||
runtime-proof gap for the desktop/session stack. The right next step is not random header work; it
|
||||
is to harden and validate the IPC layers that current Qt/KDE and Wayland-adjacent consumers are
|
||||
already trying to use.
|
||||