# Live ISO Mount: Architecture, Failure Analysis, and Fix Plan

**Date:** 2026-05-27
**Status:** Draft (fixes not yet implemented)
**Scope:** Bootloader live preload, lived daemon, RedoxFS mount chain

---

## 1. Current Architecture

### 1.1 Boot Flow (Live ISO)

```
UEFI firmware
  → Bootloader (recipes/core/bootloader/source/src/main.rs)
      1. Find RedoxFS partition on disk
      2. Read filesystem header → get total filesystem size (e.g., 4093 MiB)
      3. Live preload: read first N MiB of filesystem into RAM
         - Cap: max_preload = 1024 MiB (line 559)
         - Set env: DISK_LIVE_ADDR=<phys addr>, DISK_LIVE_SIZE=<preload size>
         - Set env: REDOXFS_BLOCK=0 (start of partition)
      4. Load kernel from RedoxFS into memory
      5. Load initfs from RedoxFS into memory
      6. Set up paging, pass env to kernel
      7. Jump to kernel entry point

Kernel
  → bootstrap (initfs)
    → init daemon
      → lived daemon (10_lived.service)
          - Reads DISK_LIVE_ADDR + DISK_LIVE_SIZE from env
          - Maps preloaded RAM as LiveDisk via /scheme/memory/physical
          - Registers scheme:disk.live
          - LiveDisk.size() = preloaded size (1024 MiB)
          - LiveDisk.block_size() = PAGE_SIZE (4096) [P6 patch changes to 512]

      → redoxfs daemon (50_rootfs.service)
          - Opens /scheme/disk.live/0 as DiskFile
          - Calls FileSystem::open(disk, password, block=0, cleanup=true)
          - Reads header at block 0 (inside preloaded region → works)
          - Calls fs.reset_allocator() → walks the allocation tree
          - Calls fs.cleanup() → may read blocks across the entire filesystem
          - FAILURE: any read beyond preloaded size returns EINVAL
```

### 1.2 Component Map

| Component | Source | Role |
|-----------|--------|------|
| **Bootloader** | `recipes/core/bootloader/source/src/main.rs` | Preloads filesystem into RAM, passes env vars |
| **lived** | `recipes/core/base/source/drivers/storage/lived/src/main.rs` | Maps preloaded RAM as `scheme:disk.live` |
| **RedoxFS mount** | `recipes/core/redoxfs/source/src/bin/mount.rs` | Opens disk scheme, calls FileSystem::open |
| **RedoxFS lib** | `recipes/core/redoxfs/source/src/filesystem.rs` | Reads header, walks allocator tree |
| **driver-block** | `recipes/core/base/source/drivers/storage/driver-block/src/lib.rs` | DiskWrapper with block_size alignment checks |
| **P6 patch** | `local/patches/base/P6-lived-block-size-512.patch` | Changes block_size from PAGE_SIZE to 512 |

### 1.3 The Preload Cap

```rust
// bootloader/src/main.rs:559
let max_preload: u64 = 1024 * MIBI as u64; // 1 GiB hard cap
let preload_size = if size > max_preload {
    max_preload // Cap at 1 GiB
} else {
    size // Preload entire filesystem if ≤ 1 GiB
};
```

- redbear-full (4093 MiB filesystem): 1024 MiB is preloaded; the remaining 3069 MiB must come from disk.
- redbear-mini (1533 MiB filesystem): 1024 MiB is preloaded; the remaining 509 MiB must come from disk.

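The split between preloaded and on-disk bytes follows directly from the cap. The helper below is a minimal sketch that reproduces the arithmetic for both images; `preload_split` and `MIB` are illustrative names, not identifiers from the bootloader.

```rust
/// One mebibyte in bytes.
const MIB: u64 = 1024 * 1024;

/// Splits a filesystem of `fs_size` bytes into the part that is preloaded
/// into RAM and the part that stays on disk, given a preload cap.
/// Hypothetical helper that mirrors the `if size > max_preload` branch.
fn preload_split(fs_size: u64, max_preload: u64) -> (u64, u64) {
    let preloaded = fs_size.min(max_preload);
    (preloaded, fs_size - preloaded)
}
```

For example, `preload_split(4093 * MIB, 1024 * MIB)` yields `(1024 * MIB, 3069 * MIB)`, matching the redbear-full figures above.
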
### 1.4 The lived Disk

```rust
// lived/src/main.rs - LiveDisk (current, unpatched source)
fn block_size(&self) -> u32 {
    PAGE_SIZE as u32 // P6 changes this to 512
}

fn size(&self) -> u64 {
    self.original.len() as u64 // This is the PRELOADED size, not total filesystem size
}

async fn read(&mut self, mut block: u64, buffer: &mut [u8]) -> syscall::Result<usize> {
    let mut offset = (block as usize) * PAGE_SIZE;
    if offset + buffer.len() > self.original.len() {
        return Err(syscall::Error::new(EINVAL)); // ← THIS IS THE FAILURE POINT
    }
    // ... read from preloaded buffer
}
```

**The fundamental problem:** `lived` only has the preloaded buffer (1024 MiB). It has no
access to the remaining filesystem data on the physical disk. When RedoxFS tries to read
beyond 1024 MiB, `lived` returns EINVAL.

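The failure can be reproduced in isolation with a toy model. The sketch below is not the `lived` source: `PreloadOnlyDisk` is a hypothetical stand-in that only knows its preloaded bytes and uses a plain `i32` errno instead of `syscall::Error`.

```rust
/// errno value for "Invalid argument", matching the EINVAL returned by lived.
const EINVAL: i32 = 22;

/// Toy stand-in for the current `LiveDisk`: it only knows the preloaded bytes.
struct PreloadOnlyDisk {
    preload: Vec<u8>,
    block_size: usize,
}

impl PreloadOnlyDisk {
    /// Copies `buffer.len()` bytes starting at `block`, or fails with EINVAL
    /// when any part of the request lies beyond the preloaded region.
    fn read(&self, block: u64, buffer: &mut [u8]) -> Result<usize, i32> {
        let offset = (block as usize)
            .checked_mul(self.block_size)
            .ok_or(EINVAL)?;
        let end = offset.checked_add(buffer.len()).ok_or(EINVAL)?;
        if end > self.preload.len() {
            return Err(EINVAL);
        }
        buffer.copy_from_slice(&self.preload[offset..end]);
        Ok(buffer.len())
    }
}
```

With an 8 KiB preload and 4 KiB blocks, reading block 1 succeeds and reading block 2 fails with `EINVAL`, which is the same shape of failure RedoxFS hits at the 1024 MiB boundary.
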
---

## 2. Failure Analysis

### 2.1 Why Does the Mini ISO Work?

The mini ISO (1533 MiB) also has 509 MiB beyond the preload. However:

1. RedoxFS `FileSystem::open` reads the header at block 0 (within preload) → OK
2. `reset_allocator` walks the free block tree. For a 1533 MiB filesystem with minimal
   contents, the allocator metadata is concentrated near the start → likely within 1024 MiB
3. `cleanup` reads extent nodes; for a small filesystem, these are also near the start

For the full ISO (4093 MiB) with hundreds of packages:
- The allocator tree and extent nodes span the entire 4093 MiB range
- RedoxFS needs to read blocks at offsets > 1024 MiB during `FileSystem::open`
- lived rejects those reads → mount fails

**The mini ISO works by luck:** its metadata happens to fit within the preload window.
This is not a reliable design.

### 2.2 The Exact Error Chain

```
RedoxFS FileSystem::open
  → disk.read_at(block_N, &mut header)
    → DiskFile::read_at(buffer, block_N * BLOCK_SIZE)
      → syscall::read(scheme:disk.live/0, offset=block_N * BLOCK_SIZE)
        → lived::LiveDisk::read(block_N, buffer)
          → offset = block_N * PAGE_SIZE  // or 512 with P6
          → if offset + buffer.len() > self.original.len():
              return Err(EINVAL)  // ← HERE
```

The error propagates:
- lived → EINVAL
- DiskFile → "RedoxFS: IO ERROR: Invalid argument (os error 22)"
- FileSystem::open → Err(EINVAL)
- mount.rs → "not able to mount uuid ..."

### 2.3 The P6 Block Size Patch

The P6 patch (`local/patches/base/P6-lived-block-size-512.patch`) fixes a different but
related issue: the original `block_size()` returned `PAGE_SIZE` (4096), but RedoxFS can
issue reads as small as 512 bytes (its `BLOCK_SIZE` is 4096, but individual reads may be
512 bytes). The `DiskWrapper` in `driver-block` rejects misaligned reads. Changing the
block size to 512 fixes alignment but does NOT fix the size/out-of-bounds problem.

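To make the alignment issue concrete, here is a minimal sketch of such a check. It assumes the `DiskWrapper` test amounts to requiring both offset and length to be multiples of the reported block size; that has not been checked against the `driver-block` source, and `is_block_aligned` is a hypothetical name.

```rust
/// Returns true when a byte range starts on a `block_size` boundary and its
/// length is a whole number of blocks.
/// Assumed model of the alignment check, not the driver-block implementation.
fn is_block_aligned(offset: u64, len: u64, block_size: u64) -> bool {
    block_size != 0 && offset % block_size == 0 && len % block_size == 0
}
```

Under this model a 512-byte read at offset 512 is rejected when the disk reports 4096-byte blocks and accepted when it reports 512-byte blocks. Whole-page reads are aligned either way, which is why P6 is safe for them.
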
**Note:** The current source tree (`recipes/core/base/source/drivers/storage/lived/src/main.rs`)
does NOT have the P6 patch applied; it still shows `PAGE_SIZE as u32`. The P6 patch is
applied during `repo fetch base` and only exists durably in `local/patches/base/`.

---

## 3. Fix Strategy

### 3.1 Design Principle

> Preload the minimum needed to boot the kernel + initfs. Once the OS is running, mount
> the filesystem from the actual disk device, not from the RAM preload.

The bootloader already loads kernel + initfs from RedoxFS before switching to live mode.
After that, the running OS has access to the AHCI driver (ahcid) and can mount the
filesystem directly from the physical disk.

### 3.2 Two-Phase Approach

**Phase A: Bootloader Changes** (bootloader is UEFI code, runs before the OS)

1. **Reduce preload to the minimum needed for kernel + initfs discovery**
   - The bootloader needs to read the RedoxFS superblock + directory tree to find
     `usr/lib/boot/kernel` and `usr/lib/boot/initfs`. This requires reading the header,
     the root node, and walking directory entries.
   - Instead of preloading a fixed 1024 MiB, preload only what's needed to locate and
     read these two files. In practice, this is the first few MiB of the filesystem.
   - Fallback: if the filesystem is small enough (≤ 64 MiB?), preload everything.

2. **Pass the physical disk location to the kernel**
   - Set a `DISK_PHYS_BLOCK` env var with the partition's block offset on the physical disk
   - Keep `DISK_LIVE_ADDR` / `DISK_LIVE_SIZE` for the minimal preload
   - Add `REDOXFS_FULL_SIZE` so the OS knows the true filesystem extent

**Phase B: lived Daemon Changes** (OS-level, patchable via `local/patches/base/`)

1. **Accept the full filesystem size as an additional env var**
   - Read `REDOXFS_FULL_SIZE` or derive from the RedoxFS header
   - Report `LiveDisk::size()` as the FULL filesystem size, not just the preload

2. **Fall through to the physical disk for reads beyond the preload**
   - When `read(block, buffer)` is called with an offset beyond `self.original.len()`:
     - Open the underlying block device (e.g., `/scheme/disk/0` after ahcid starts)
     - Read the data from the physical disk
     - Cache the result in the overlay HashMap
   - This makes lived act as a read-through cache: preload in RAM, fallback to disk

3. **Alternative simpler approach: bypass lived entirely for large images**
   - After ahcid starts and registers `/scheme/disk/0`, the init system could mount
     RedoxFS directly from `/scheme/disk/0` instead of `/scheme/disk.live/0`
   - The preload would only be used by the bootloader to load kernel + initfs
   - Once the OS boots, lived is unnecessary and the mount comes from the real disk

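The fall-through in Phase B comes down to splitting each request at the preload boundary. The following is a minimal sketch of that split working on byte ranges only; `split_read` is a hypothetical helper, not part of lived.

```rust
use std::ops::Range;

/// Splits the byte range `[offset, offset + len)` into the part that can be
/// served from the RAM preload and the part that must come from the physical
/// disk. Either range may be empty.
fn split_read(offset: u64, len: u64, preload_len: u64) -> (Range<u64>, Range<u64>) {
    let end = offset.saturating_add(len);
    // Bytes below `preload_len` are in RAM; bytes at or above it are on disk.
    let boundary = preload_len.clamp(offset, end);
    (offset..boundary, boundary..end)
}
```

A request that straddles the boundary produces two non-empty ranges, so a real implementation would need to handle a single read being served from both sources.
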
---

## 4. Concrete Fix Plan

### 4.1 Fix 1: Reduce Bootloader Preload (bootloader patch)

**File:** `recipes/core/bootloader/source/src/main.rs`

**Current:**
```rust
let max_preload: u64 = 1024 * MIBI as u64;
```

**Proposed change:**
```rust
// Only preload the metadata region:
// - RedoxFS header + allocator (first ~1 MiB)
// - Root directory tree (typically first 32-64 MiB)
// 64 MiB is generous for the metadata region of any reasonable filesystem.
// The kernel and initfs are loaded separately via fs.disk.read_at() directly
// from the physical disk, so they don't need to be in the preload.
let max_preload: u64 = 64 * MIBI as u64;
```

This is only valid if the kernel and initfs really do not need to be inside the preload.
The bootloader reads them from the RedoxFS filesystem using
`load_to_memory(os, &mut fs, "usr/lib/boot/kernel", ...)`. If the bootloader had already
switched the disk to the live buffer at that point, the kernel and initfs would have to be
within the preload, OR the bootloader would have to load them before switching to live mode.

**Looking at the actual bootloader flow:**
```
1. Open RedoxFS from physical disk → fs
2. Preload first N MiB into RAM buffer
3. Set LIVE_OPT = Some((fs.block, buffer))
4. Load kernel from fs (still using physical disk? or from buffer?)
5. Load initfs from fs
6. Pass LIVE_OPT to kernel env
```

The live buffer is set in `LIVE_OPT` at line 625, but the kernel and initfs are loaded
at lines 642-663, AFTER the live preload. The `load_to_memory` function uses `fs`, which
still uses the original disk handle. So the kernel and initfs are read from the physical
disk, not from the live buffer.

**This means the preload doesn't need to include kernel or initfs at all.** The preload
exists solely so that `lived` can serve the filesystem to the running OS via `scheme:disk.live`.

**Revised Fix 1:** Reduce max_preload to a small value (e.g., 4-64 MiB) that covers just
the RedoxFS metadata needed for initial mount, then rely on the disk fallback for the rest.

However, this only works if `lived` can fall through to the physical disk for out-of-bounds
reads. Without the fallback, reducing the preload makes the problem worse.

### 4.2 Fix 2: lived Disk Fallback (base patch)

**File:** `recipes/core/base/source/drivers/storage/lived/src/main.rs`

This is the core fix. Make `lived` aware of the full filesystem and able to read from
the physical disk when the preload doesn't cover the requested region.

**Design:**

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::os::unix::fs::FileExt; // provides read_exact_at

struct LiveDisk {
    // Preloaded RAM buffer (may be smaller than total filesystem)
    preload: &'static [u8],
    // Full filesystem size (from RedoxFS header or env var)
    total_size: u64,
    // Physical disk offset where the filesystem starts, in block_size() units
    disk_block: u64,
    // Handle to the physical disk (opened lazily, after ahcid starts)
    disk_handle: Option<File>,
    // Write overlay (same as before)
    overlay: HashMap<u64, Box<[u8]>>,
}

impl LiveDisk {
    // Inherent helper (not a Disk trait method): read one block from the physical disk.
    fn read_from_disk(&mut self, block: u64, buffer: &mut [u8]) -> syscall::Result<()> {
        // Try to open the physical disk if not already open
        if self.disk_handle.is_none() {
            // Try common disk scheme paths
            for path in &["/scheme/disk/0", "/scheme/disk/1"] {
                if let Ok(file) = OpenOptions::new().read(true).open(path) {
                    self.disk_handle = Some(file);
                    break;
                }
            }
        }

        // Byte offset on the physical disk (accounting for partition offset)
        let byte_offset = (self.disk_block + block) * u64::from(self.block_size());

        match self.disk_handle.as_ref() {
            // read_exact_at turns a short read into an error instead of
            // leaving the buffer partially filled
            Some(disk) => disk
                .read_exact_at(buffer, byte_offset)
                .map_err(|_| syscall::Error::new(EIO)),
            // No disk available yet (e.g. ahcid has not started)
            None => Err(syscall::Error::new(EIO)),
        }
    }
}

impl Disk for LiveDisk {
    fn block_size(&self) -> u32 { 512 }

    fn size(&self) -> u64 { self.total_size }

    async fn read(&mut self, block: u64, buffer: &mut [u8]) -> syscall::Result<usize> {
        let bs = self.block_size() as usize;
        let offset = (block as usize) * bs;

        // Reject reads beyond the full filesystem, not just beyond the preload
        if offset + buffer.len() > self.total_size as usize {
            return Err(syscall::Error::new(EINVAL));
        }

        let preload_bytes = self.preload.len();

        for (i, chunk) in buffer.chunks_mut(bs).enumerate() {
            let block_i = block + i as u64;
            let offset_i = offset + i * bs;

            // Check overlay first
            if let Some(overlay) = self.overlay.get(&block_i) {
                chunk.copy_from_slice(&overlay[..chunk.len()]);
                continue;
            }

            if offset_i + chunk.len() <= preload_bytes {
                // Within preload → read from RAM
                chunk.copy_from_slice(&self.preload[offset_i..offset_i + chunk.len()]);
            } else {
                // Beyond preload → read from physical disk
                self.read_from_disk(block_i, chunk)?;
            }
        }
        Ok(buffer.len())
    }
}
```

**Problem with this approach:** `lived` starts before `ahcid` (it's at priority 10 in
init.initfs.d, while ahcid is at priority 40). So when lived first starts, there is no
`/scheme/disk/0` to fall back to. The disk fallback only works after ahcid initializes.

### 4.3 Fix 3: Two-Stage Mount (Recommended)

The cleanest fix is to change the init sequence:

**Stage 1: lived serves the preloaded buffer (for early boot)**
- lived starts as before, serves the preload via `scheme:disk.live`
- RedoxFS does NOT mount from `disk.live` for the root filesystem
- The initfs has everything needed for early boot (lived, ahcid, basic tools)

**Stage 2: Mount from physical disk (after drivers start)**
- After `40_drivers.target` completes, ahcid has registered `/scheme/disk/0`
- Init runs `redoxfs --uuid $REDOXFS_UUID file $REDOXFS_BLOCK` pointing to `/scheme/disk/0`
- This reads the full filesystem from the physical disk

**Current init flow:**
```
10_lived.service   → starts lived (scheme:disk.live)
40_drivers.target  → starts ahcid (scheme:disk/0)
50_rootfs.service  → redoxfs mounts from... whichever disk scheme it finds first
                     (scans all /scheme/disk/* for matching UUID)
```

**The issue is scan order, not timing:** `50_rootfs.service` requires `40_drivers.target`,
so it should wait for ahcid. But `redoxfs` scans ALL disk schemes, and `disk.live` matches
first (since lived starts earlier). RedoxFS finds the UUID in `disk.live` and tries to mount
from it, but `disk.live` only exposes the preloaded region, which is too small.

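The first-match behaviour described above can be modelled in a few lines. This is an illustrative sketch only: `DiskEntry` and `first_match` are hypothetical, and the assumption that `disk.live` comes first in scan order is taken from the text, not from the `redoxfs` source.

```rust
/// One mebibyte in bytes.
const MIB: u64 = 1024 * 1024;

/// A disk scheme entry as seen by the root filesystem scan (hypothetical model).
struct DiskEntry {
    path: &'static str,
    uuid: &'static str,
    /// Number of bytes this scheme can actually serve.
    size: u64,
}

/// Returns the first entry whose UUID matches, in scan order.
/// Later entries with the same UUID are never considered.
fn first_match<'a>(entries: &'a [DiskEntry], uuid: &str) -> Option<&'a DiskEntry> {
    entries.iter().find(|entry| entry.uuid == uuid)
}
```

With `disk.live` (1024 MiB) listed before `/scheme/disk/0` (4093 MiB) and both carrying the same UUID, `first_match` returns the `disk.live` entry even though only the second one can serve the whole filesystem.
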
**Proposed fix:**

1. **Make lived report the full filesystem size** by reading the RedoxFS header from
   the preload buffer to determine `total_size`. Report that as `size()`.

2. **Make lived fall through to disk reads** for out-of-bounds regions. Use a lazy-open
   approach: when a read goes beyond the preload and no disk handle is open yet, try
   to open `/scheme/disk/0`. If it fails, return EIO (which RedoxFS will retry).

3. **Reduce the preload** in the bootloader. Since lived now handles disk fallback,
   we can preload much less (e.g., 4-64 MiB). The preload just needs to cover the
   RedoxFS header and enough metadata for the initial mount.

---

## 5. Recommended Implementation Order

### Step 1: Fix lived to report full size + disk fallback (base patch)

Create `local/patches/base/P59-lived-disk-fallback.patch`:

1. Add `total_size: u64` field to LiveDisk
2. Parse RedoxFS header from preload buffer to determine total filesystem size
3. Report `total_size` from `size()` instead of `preload.len()`
4. For reads beyond preload: attempt to open and read from `/scheme/disk/0`
5. Keep overlay for writes
6. Keep block_size = 512 (from P6)
7. Add env var `DISK_PHYS_BLOCK` for the partition offset on the physical disk

### Step 2: Reduce bootloader preload cap (bootloader patch)

Create `local/patches/bootloader/P1-reduce-live-preload.patch`:

1. Change `max_preload` from 1024 MiB to a calculated minimum:
   - Read the RedoxFS header to determine filesystem size
   - Calculate the minimum preload needed: max(header + allocator extent, 4 MiB)
   - Cap at 128 MiB (generous upper bound for metadata region)
2. Add `DISK_PHYS_BLOCK` env var so lived knows where the partition starts on disk

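The calculated minimum in Step 2 can be sketched as a clamp. In the snippet below `metadata_extent` stands for "header + allocator extent" in bytes; how the bootloader would measure it is left open, and all names are illustrative.

```rust
/// One mebibyte in bytes.
const MIB: u64 = 1024 * 1024;
/// Lower bound from the plan above: never preload less than 4 MiB.
const MIN_PRELOAD: u64 = 4 * MIB;
/// Upper bound from the plan above: cap the preload at 128 MiB.
const MAX_PRELOAD: u64 = 128 * MIB;

/// Computes the preload size for a filesystem of `fs_size` bytes whose
/// metadata region spans `metadata_extent` bytes (hypothetical input).
/// The result never exceeds the filesystem itself.
fn preload_size(fs_size: u64, metadata_extent: u64) -> u64 {
    metadata_extent
        .clamp(MIN_PRELOAD, MAX_PRELOAD)
        .min(fs_size)
}
```

The final `min(fs_size)` keeps the existing behaviour for very small images, which are still preloaded in full.
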
### Step 3: Verify

1. Build and test redbear-full ISO in QEMU with virtio-gpu
2. Verify RedoxFS mounts the full 4093 MiB filesystem
3. Verify login prompt appears
4. Verify KDE desktop loads (or at minimum, the greeter starts)

---

## 6. Risk Assessment

| Risk | Impact | Mitigation |
|------|--------|------------|
| RedoxFS header format changes between versions | lived parses header incorrectly | Use the same header parsing code as RedoxFS lib |
| ahcid not started when lived first needs disk | Opening `/scheme/disk/0` fails (ENOENT), so the read returns EIO | Retry with backoff; RedoxFS mount retries automatically |
| Physical disk block offset wrong | Read corrupt data | Pass exact block offset from bootloader via env var |
| Preload too small for RedoxFS to find header | Mount fails immediately | Keep minimum preload at 4 MiB (covers any superblock) |
| Mini ISO regression | Small images broken | Test mini ISO after every change |

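The "retry with backoff" mitigation could look roughly like the sketch below. It is a generic helper using a blocking `std::thread::sleep`; whether blocking is acceptable inside lived's request loop is an open question, and `retry_with_backoff` is a hypothetical name.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` up to `attempts` times (at least once), doubling the delay
/// after each failure. Returns the first success or the last error.
fn retry_with_backoff<T, E>(
    attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut tries = 0;
    loop {
        tries += 1;
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if tries >= attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
            }
        }
    }
}
```

In lived, `op` would be the attempt to open the physical disk scheme; the attempt count and initial delay would need tuning against how long ahcid takes to register.
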
---

## 7. Alternative Approach: Mount From Physical Disk Directly

Instead of fixing lived, we could modify the init sequence to skip `disk.live` entirely
for the root filesystem mount:

1. Bootloader preloads just enough for kernel + initfs (no change needed)
2. lived starts but is only used for early boot I/O
3. `50_rootfs.service` is changed to explicitly mount from `/scheme/disk/0` (via ahcid)
   instead of scanning all disk schemes
4. This requires passing the disk path and block offset from bootloader to init

**Pros:** Simpler lived (no disk fallback), cleaner architecture
**Cons:** Requires knowing which disk scheme serves the boot device; may not work if
the AHCI driver assigns a different number to the boot disk

**Verdict:** Fix 3 (lived with disk fallback) is more robust because it works regardless
of which disk scheme is assigned. The lived approach acts as a transparent cache layer.

---

## 8. Implementation Notes

### Bootloader env vars (current)

```
DISK_LIVE_ADDR=<hex phys addr of preload buffer>
DISK_LIVE_SIZE=<hex size of preload buffer>
REDOXFS_BLOCK=0  (always 0 for live mode)
REDOXFS_UUID=<uuid>
```

### Bootloader env vars (proposed additions)

```
DISK_PHYS_BLOCK=<hex block offset of partition on physical disk>
REDOXFS_FULL_SIZE=<hex total filesystem size>
```

### lived env vars (current)

```
DISK_LIVE_ADDR → phys addr to mmap
DISK_LIVE_SIZE → size to mmap (= preload size, NOT total filesystem size)
```

### lived env vars (proposed)

```
DISK_LIVE_ADDR    → phys addr of preload buffer
DISK_LIVE_SIZE    → size of preload buffer
DISK_PHYS_BLOCK   → block offset for disk fallback reads
REDOXFS_FULL_SIZE → total filesystem size (for size() reporting)
```

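Since all of these values are passed as hex strings, lived needs a small parser for the new variables. The sketch below shows one possible shape; whether the bootloader emits a `0x` prefix is not stated above, so the parser accepts both forms. `parse_hex_u64` and `hex_env` are hypothetical helpers.

```rust
/// Parses a hex value such as "0x40000000" or "40000000" into a u64.
/// Accepting an optional "0x" prefix is an assumption, not confirmed behaviour.
fn parse_hex_u64(value: &str) -> Option<u64> {
    let trimmed = value.trim();
    let digits = trimmed
        .strip_prefix("0x")
        .or_else(|| trimmed.strip_prefix("0X"))
        .unwrap_or(trimmed);
    u64::from_str_radix(digits, 16).ok()
}

/// Reads a hex-encoded environment variable, returning None when it is
/// unset or malformed.
fn hex_env(name: &str) -> Option<u64> {
    std::env::var(name).ok().and_then(|v| parse_hex_u64(&v))
}
```

Returning `None` for a missing `REDOXFS_FULL_SIZE` would let lived fall back to the preload size, which keeps the current behaviour when booted by an unpatched bootloader.
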