# Live ISO Mount — Architecture, Failure Analysis, and Fix Plan
**Date:** 2026-05-27
**Status:** Draft — fixes not yet implemented
**Scope:** Bootloader live preload, lived daemon, RedoxFS mount chain
---
## 1. Current Architecture
### 1.1 Boot Flow (Live ISO)
```
UEFI firmware
  → Bootloader (recipes/core/bootloader/source/src/main.rs)
      1. Find RedoxFS partition on disk
      2. Read filesystem header → get total filesystem size (e.g., 4093 MiB)
      3. Live preload: read first N MiB of filesystem into RAM
         - Cap: max_preload = 1024 MiB (line 559)
         - Set env: DISK_LIVE_ADDR=<phys addr>, DISK_LIVE_SIZE=<preload size>
         - Set env: REDOXFS_BLOCK=0 (start of partition)
      4. Load kernel from RedoxFS into memory
      5. Load initfs from RedoxFS into memory
      6. Set up paging, pass env to kernel
      7. Jump to kernel entry point
Kernel
  → bootstrap (initfs)
    → init daemon
      → lived daemon (10_lived.service)
          - Reads DISK_LIVE_ADDR + DISK_LIVE_SIZE from env
          - Maps preloaded RAM as LiveDisk via /scheme/memory/physical
          - Registers scheme:disk.live
          - LiveDisk.size() = preloaded size (1024 MiB)
          - LiveDisk.block_size() = PAGE_SIZE (4096) [P6 patch changes to 512]
      → redoxfs daemon (50_rootfs.service)
          - Opens /scheme/disk.live/0 as DiskFile
          - Calls FileSystem::open(disk, password, block=0, cleanup=true)
          - Reads header at block 0 (inside preloaded region → works)
          - Calls fs.reset_allocator() → walks the allocation tree
          - Calls fs.cleanup() → may read blocks across the entire filesystem
          - FAILURE: any read beyond preloaded size returns EINVAL
```
### 1.2 Component Map
| Component | Source | Role |
|-----------|--------|------|
| **Bootloader** | `recipes/core/bootloader/source/src/main.rs` | Preloads filesystem into RAM, passes env vars |
| **lived** | `recipes/core/base/source/drivers/storage/lived/src/main.rs` | Maps preloaded RAM as `scheme:disk.live` |
| **RedoxFS mount** | `recipes/core/redoxfs/source/src/bin/mount.rs` | Opens disk scheme, calls FileSystem::open |
| **RedoxFS lib** | `recipes/core/redoxfs/source/src/filesystem.rs` | Reads header, walks allocator tree |
| **driver-block** | `recipes/core/base/source/drivers/storage/driver-block/src/lib.rs` | DiskWrapper with block_size alignment checks |
| **P6 patch** | `local/patches/base/P6-lived-block-size-512.patch` | Changes block_size from PAGE_SIZE to 512 |
### 1.3 The Preload Cap
```rust
// bootloader/src/main.rs:559
let max_preload: u64 = 1024 * MIBI as u64; // 1 GiB hard cap
let preload_size = if size > max_preload {
    max_preload // Cap at 1 GiB
} else {
    size // Preload entire filesystem if ≤ 1 GiB
};
```
For redbear-full (4093 MiB filesystem): preloads 1024 MiB, 3069 MiB must come from disk.
For redbear-mini (1533 MiB filesystem): preloads 1024 MiB, 509 MiB must come from disk.
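The arithmetic above can be reproduced with a small helper. This is an illustrative sketch only: `preload_split` and `MIB` are hypothetical names, not bootloader code.

```rust
/// One mebibyte in bytes (hypothetical constant for this sketch).
const MIB: u64 = 1024 * 1024;

/// Split a filesystem of `fs_size` bytes into the part the bootloader
/// preloads into RAM and the part that stays on disk, given a preload cap.
/// Returns `(preloaded_bytes, remaining_on_disk_bytes)`.
fn preload_split(fs_size: u64, max_preload: u64) -> (u64, u64) {
    let preloaded = fs_size.min(max_preload);
    (preloaded, fs_size - preloaded)
}
```

With a 1024 MiB cap, any image larger than 1 GiB leaves a non-zero remainder that `lived` currently cannot serve.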
### 1.4 The lived Disk
```rust
// lived/src/main.rs - LiveDisk::read (CURRENT, unpatched source)
fn block_size(&self) -> u32 {
    PAGE_SIZE as u32 // P6 changes this to 512
}

fn size(&self) -> u64 {
    self.original.len() as u64 // This is the PRELOADED size, not total filesystem size
}

async fn read(&mut self, mut block: u64, buffer: &mut [u8]) -> syscall::Result<usize> {
    let mut offset = (block as usize) * PAGE_SIZE;
    if offset + buffer.len() > self.original.len() {
        return Err(syscall::Error::new(EINVAL)); // ← THIS IS THE FAILURE POINT
    }
    // ... read from preloaded buffer
}
```
**The fundamental problem:** `lived` only has the preloaded buffer (1024 MiB). It has no
access to the remaining filesystem data on the physical disk. When RedoxFS tries to read
beyond 1024 MiB, lived returns EINVAL.
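To make the boundary concrete, here is a standalone model of that bounds check. It is a sketch under assumptions: `check_read` is a hypothetical function, and `EINVAL` and `PAGE_SIZE` are redefined locally so the snippet compiles without the Redox crates.

```rust
/// Local stand-ins for the values used by lived (assumed: errno 22, 4 KiB pages).
const EINVAL: i32 = 22;
const PAGE_SIZE: usize = 4096;

/// Model of the unpatched bounds check: a read succeeds only if it ends
/// inside the preloaded buffer. Overflowing arithmetic is also rejected.
fn check_read(preload_len: usize, block: u64, buf_len: usize) -> Result<usize, i32> {
    let offset = (block as usize).checked_mul(PAGE_SIZE).ok_or(EINVAL)?;
    let end = offset.checked_add(buf_len).ok_or(EINVAL)?;
    if end > preload_len {
        return Err(EINVAL); // same failure point as in lived
    }
    Ok(buf_len)
}
```

For a 1024 MiB preload, the last readable page is block 262143; block 262144 is the first one rejected.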
---
## 2. Failure Analysis
### 2.1 Why Does the Mini ISO Work?
The mini ISO (1533 MiB) also has 509 MiB beyond the preload. However:
1. RedoxFS `FileSystem::open` reads the header at block 0 (within preload) → OK
2. `reset_allocator` walks the free block tree. For a 1533 MiB filesystem with minimal
contents, the allocator metadata is concentrated near the start → likely within 1024 MiB
3. `cleanup` reads extent nodes — for a small filesystem, these are also near the start
For the full ISO (4093 MiB) with hundreds of packages:
- The allocator tree and extent nodes span the entire 4093 MiB range
- RedoxFS needs to read blocks at offsets > 1024 MiB during `FileSystem::open`
- lived rejects those reads → mount fails
**The mini ISO works by luck** — its metadata happens to fit within the preload window.
This is not a reliable design.
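Whether a given image mounts therefore depends only on where its metadata blocks happen to sit. A minimal sketch of that condition, assuming the metadata offsets are known (the function name and the offsets used with it are hypothetical):

```rust
/// One mebibyte in bytes (hypothetical constant for this sketch).
const MIB: u64 = 1024 * 1024;

/// Return the first metadata block offset (in bytes) that is not fully
/// covered by the preload window, or `None` if every block fits.
fn first_miss(metadata_offsets: &[u64], block_len: u64, preload_len: u64) -> Option<u64> {
    metadata_offsets
        .iter()
        .copied()
        .find(|&off| off + block_len > preload_len)
}
```

An image whose metadata all sits below the cap yields `None` and mounts; a single block past the cap is enough to fail the mount.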
### 2.2 The Exact Error Chain
```
RedoxFS FileSystem::open
  → disk.read_at(block_N, &mut header)
    → DiskFile::read_at(buffer, block_N * BLOCK_SIZE)
      → syscall::read(scheme:disk.live/0, offset=block_N * 512)
        → lived::LiveDisk::read(block_N, buffer)
          → offset = block_N * PAGE_SIZE // or 512 with P6
          → if offset + buffer.len() > self.original.len():
              return Err(EINVAL) // ← HERE
```
The error propagates:
- lived → EINVAL
- DiskFile → "RedoxFS: IO ERROR: Invalid argument (os error 22)"
- FileSystem::open → Err(EINVAL)
- mount.rs → "not able to mount uuid ..."
### 2.3 The P6 Block Size Patch
The P6 patch (`local/patches/base/P6-lived-block-size-512.patch`) fixes a different but
related issue. The original `block_size()` returned `PAGE_SIZE` (4096). RedoxFS uses
`BLOCK_SIZE = 4096`, but individual reads may be as small as 512 bytes, and the
`DiskWrapper` in `driver-block` rejects reads that are not aligned to the reported block
size. Reporting 512 fixes the alignment problem but does NOT fix the size/out-of-bounds
problem.
**Note:** The current source tree (`recipes/core/base/source/drivers/storage/lived/src/main.rs`)
does NOT have the P6 patch applied — it still shows `PAGE_SIZE as u32`. The P6 patch is
applied during `repo fetch base` and only exists durably in `local/patches/base/`.
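The alignment rule that P6 works around can be illustrated with a small predicate. This is a hypothetical stand-in for the check in `DiskWrapper`, not its actual code; it assumes both the offset and the length must be multiples of the reported block size.

```rust
/// Hypothetical alignment predicate: a request is accepted only if its byte
/// offset and its length are both multiples of the reported block size.
fn is_aligned(offset: u64, len: usize, block_size: u32) -> bool {
    let bs = u64::from(block_size);
    offset % bs == 0 && (len as u64) % bs == 0
}
```

Under this rule a 512-byte read at offset 512 is rejected when the disk reports 4096-byte blocks and accepted when it reports 512-byte blocks, while page-aligned reads pass either way.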
---
## 3. Fix Strategy
### 3.1 Design Principle
> Preload the minimum needed to boot the kernel + initfs. Once the OS is running, mount
> the filesystem from the actual disk device, not from the RAM preload.
The bootloader already loads kernel + initfs from RedoxFS before switching to live mode.
After that, the running OS has access to the AHCI driver (ahcid) and can mount the
filesystem directly from the physical disk.
### 3.2 Two-Phase Approach
**Phase A: Bootloader Changes** (bootloader is UEFI code, runs before the OS)
1. **Reduce preload to the minimum needed for kernel + initfs discovery**
- The bootloader needs to read the RedoxFS superblock + directory tree to find
`usr/lib/boot/kernel` and `usr/lib/boot/initfs`. This requires reading the header,
the root node, and walking directory entries.
- Instead of preloading a fixed 1024 MiB, preload only what's needed to locate and
read these two files. In practice, this is the first few MiB of the filesystem.
- Fallback: if the filesystem is small enough (≤ 64 MiB?), preload everything.
2. **Pass the physical disk location to the kernel**
- Set `DISK_PHYS_ADDR` and `DISK_PHYS_SIZE` env vars with the full disk geometry
- Keep `DISK_LIVE_ADDR` / `DISK_LIVE_SIZE` for the minimal preload
- Add `REDOXFS_FULL_SIZE` so the OS knows the true filesystem extent
**Phase B: lived Daemon Changes** (OS-level, patchable via `local/patches/base/`)
1. **Accept the full filesystem size as an additional env var**
- Read `REDOXFS_FULL_SIZE` or derive from the RedoxFS header
- Report `LiveDisk::size()` as the FULL filesystem size, not just the preload
2. **Fall through to the physical disk for reads beyond the preload**
- When `read(block, buffer)` is called with an offset beyond `self.original.len()`:
- Open the underlying block device (e.g., `/scheme/disk/0` after ahcid starts)
- Read the data from the physical disk
- Cache the result in the overlay HashMap
- This makes lived act as a read-through cache: preload in RAM, fallback to disk
3. **Alternative simpler approach: bypass lived entirely for large images**
- After ahcid starts and registers `/scheme/disk/0`, the init system could mount
RedoxFS directly from `/scheme/disk/0` instead of `/scheme/disk.live/0`
- The preload would only be used by the bootloader to load kernel + initfs
- Once the OS boots, lived is unnecessary — mount from the real disk
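The fall-through behaviour in Phase B can be modelled entirely in memory. The sketch below is an illustration under assumptions, not the lived implementation: `FallbackDisk` is a hypothetical type, a `Vec<u8>` stands in for the physical disk, and a separate `cache` map is used instead of the write overlay.

```rust
use std::collections::HashMap;

/// Block size used by this model (512, matching the P6 patch).
const BLOCK: usize = 512;

/// In-memory model: `preload` is the RAM copy of the start of the image,
/// `disk` stands in for the physical device that holds the full image.
struct FallbackDisk {
    preload: Vec<u8>,
    disk: Vec<u8>,
    cache: HashMap<u64, [u8; BLOCK]>,
    disk_reads: usize, // counts simulated device accesses
}

impl FallbackDisk {
    /// `preload_len` must not exceed `disk.len()`.
    fn new(disk: Vec<u8>, preload_len: usize) -> Self {
        let preload = disk[..preload_len].to_vec();
        Self { preload, disk, cache: HashMap::new(), disk_reads: 0 }
    }

    /// Read whole blocks starting at `block`.
    /// Returns `None` for misaligned lengths or reads past the full image.
    fn read(&mut self, block: u64, buf: &mut [u8]) -> Option<usize> {
        let start = (block as usize).checked_mul(BLOCK)?;
        let end = start.checked_add(buf.len())?;
        if buf.len() % BLOCK != 0 || end > self.disk.len() {
            return None;
        }
        for (i, chunk) in buf.chunks_mut(BLOCK).enumerate() {
            let blk = block + i as u64;
            let off = start + i * BLOCK;
            if off + BLOCK <= self.preload.len() {
                // Within the preload: serve from RAM
                chunk.copy_from_slice(&self.preload[off..off + BLOCK]);
            } else if let Some(cached) = self.cache.get(&blk) {
                // Previously fetched from the device
                chunk.copy_from_slice(cached);
            } else {
                // Beyond the preload: fetch from the device and remember it
                let mut data = [0u8; BLOCK];
                data.copy_from_slice(&self.disk[off..off + BLOCK]);
                self.disk_reads += 1;
                self.cache.insert(blk, data);
                chunk.copy_from_slice(&data);
            }
        }
        Some(buf.len())
    }
}
```

A read that straddles the preload boundary is served partly from RAM and partly from the simulated device, and repeating it does not touch the device again.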
---
## 4. Concrete Fix Plan
### 4.1 Fix 1: Reduce Bootloader Preload (bootloader patch)
**File:** `recipes/core/bootloader/source/src/main.rs`
**Current:**
```rust
let max_preload: u64 = 1024 * MIBI as u64;
```
**Proposed change:**
```rust
// Only preload what the bootloader actually needs:
// - RedoxFS header + allocator (first ~1 MiB)
// - Root directory tree (typically first 32-64 MiB)
// - kernel and initfs files (loaded separately after preload)
// 64 MiB is generous for the metadata region of any reasonable filesystem.
// The kernel and initfs are loaded separately via fs.disk.read_at() directly
// from the physical disk, so they don't need to be in the preload.
let max_preload: u64 = 64 * MIBI as u64;
```
Caveat: this only works if the kernel and initfs are not read through the live buffer.
The bootloader reads them from the RedoxFS filesystem using
`load_to_memory(os, &mut fs, "usr/lib/boot/kernel", ...)`. If the bootloader had already
switched the disk to the live buffer by that point, the kernel and initfs would have to
be within the preload, OR the bootloader would have to load them before switching to
live mode.
**Looking at the actual bootloader flow:**
```
1. Open RedoxFS from physical disk → fs
2. Preload first N MiB into RAM buffer
3. Set LIVE_OPT = Some((fs.block, buffer))
4. Load kernel from fs (still using physical disk? or from buffer?)
5. Load initfs from fs
6. Pass LIVE_OPT to kernel env
```
The live buffer is set in `LIVE_OPT` at line 625, but the kernel and initfs are loaded
at lines 642-663, AFTER the live preload. The `load_to_memory` function uses `fs` which
still uses the original disk handle. So the kernel and initfs are read from the physical
disk, not from the live buffer.
**This means the preload doesn't need to include kernel or initfs at all.** The preload
exists solely so that `lived` can serve the filesystem to the running OS via `scheme:disk.live`.
**Revised Fix 1:** Reduce max_preload to a small value (e.g., 4-64 MiB) that covers just
the RedoxFS metadata needed for initial mount, then rely on the disk fallback for the rest.
BUT: this only works if `lived` can fall through to the physical disk for out-of-bounds
reads. Without the fallback, reducing preload makes the problem worse.
### 4.2 Fix 2: lived Disk Fallback (base patch)
**File:** `recipes/core/base/source/drivers/storage/lived/src/main.rs`
This is the core fix. Make `lived` aware of the full filesystem and able to read from
the physical disk when the preload doesn't cover the requested region.
**Design:**
```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::os::unix::fs::FileExt; // provides read_exact_at

use syscall::{EINVAL, EIO};
// `Disk` is the block-device trait from driver-block (import omitted in this sketch)

struct LiveDisk {
    // Preloaded RAM buffer (may be smaller than total filesystem)
    preload: &'static [u8],
    // Full filesystem size (from RedoxFS header or env var)
    total_size: u64,
    // Physical disk offset where the filesystem starts.
    // Assumed here to be in block_size() units; convert first if the
    // bootloader reports it in RedoxFS blocks.
    disk_block: u64,
    // Handle to the physical disk (opened after ahcid starts)
    disk_handle: Option<File>,
    // Write overlay (same as before)
    overlay: HashMap<u64, Box<[u8]>>,
}

impl Disk for LiveDisk {
    fn block_size(&self) -> u32 { 512 }
    fn size(&self) -> u64 { self.total_size }

    async fn read(&mut self, block: u64, buffer: &mut [u8]) -> syscall::Result<usize> {
        let bs = self.block_size() as usize;
        let offset = (block as usize) * bs;
        if (offset + buffer.len()) as u64 > self.total_size {
            return Err(syscall::Error::new(EINVAL));
        }
        let preload_bytes = self.preload.len();
        for (i, chunk) in buffer.chunks_mut(bs).enumerate() {
            let block_i = block + i as u64;
            let offset_i = offset + i * bs;
            // Check overlay first
            if let Some(overlay) = self.overlay.get(&block_i) {
                chunk.copy_from_slice(&overlay[..chunk.len()]);
                continue;
            }
            if offset_i + chunk.len() <= preload_bytes {
                // Within preload → read from RAM
                chunk.copy_from_slice(&self.preload[offset_i..offset_i + chunk.len()]);
            } else {
                // Beyond preload → read from physical disk
                self.read_from_disk(block_i, chunk)?;
            }
        }
        Ok(buffer.len())
    }
}

impl LiveDisk {
    // Helper kept in an inherent impl because it is not part of the `Disk` trait
    fn read_from_disk(&mut self, block: u64, buffer: &mut [u8]) -> syscall::Result<()> {
        // Try to open the physical disk if not already open
        if self.disk_handle.is_none() {
            // Try common disk scheme paths
            for path in &["/scheme/disk/0", "/scheme/disk/1"] {
                if let Ok(file) = OpenOptions::new().read(true).open(path) {
                    self.disk_handle = Some(file);
                    break;
                }
            }
        }
        // Byte offset on the physical disk (accounting for partition offset).
        // Computed before borrowing the handle to keep the borrow checker happy.
        let byte_offset = (self.disk_block + block) * self.block_size() as u64;
        match &self.disk_handle {
            // read_exact_at fails on a short read instead of leaving the buffer partly filled
            Some(disk) => disk
                .read_exact_at(buffer, byte_offset)
                .map_err(|_| syscall::Error::new(EIO)),
            // No disk available yet (e.g. ahcid has not started): report the failure
            None => Err(syscall::Error::new(EIO)),
        }
    }
}
```
**Problem with this approach:** `lived` starts before `ahcid` (it's at priority 10 in
init.initfs.d, while ahcid is at priority 40). So when lived first starts, there IS no
`/scheme/disk/0` to fall back to. The disk fallback would only work after ahcid initializes.
### 4.3 Fix 3: Two-Stage Mount (Recommended)
The cleanest fix is to change the init sequence:
**Stage 1: lived serves the preloaded buffer (for early boot)**
- lived starts as before, serves the preload via `scheme:disk.live`
- RedoxFS does NOT mount from `disk.live` for the root filesystem
- The initfs has everything needed for early boot (lived, ahcid, basic tools)
**Stage 2: Mount from physical disk (after drivers start)**
- After `40_drivers.target` completes, ahcid has registered `/scheme/disk/0`
- Init runs `redoxfs --uuid $REDOXFS_UUID file $REDOXFS_BLOCK` pointing to `/scheme/disk/0`
- This reads the full filesystem from the physical disk
**Current init flow:**
```
10_lived.service → starts lived (scheme:disk.live)
40_drivers.target → starts ahcid (scheme:disk/0)
50_rootfs.service → redoxfs mounts from... whichever disk scheme it finds first
(scans all /scheme/disk/* for matching UUID)
```
**The issue is timing:** `50_rootfs.service` requires `40_drivers.target`, so it should
wait for ahcid. But `redoxfs` scans ALL disk schemes, and `disk.live` matches first
(since lived starts earlier). RedoxFS finds the UUID in `disk.live` and tries to mount
from it, but the disk is too small.
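The first-match behaviour described above can be sketched as follows. This is a simplified model, not the `redoxfs` scan code: `Candidate` and `first_match` are hypothetical, and scan order is represented by slice order.

```rust
/// A disk that the scan has found, with the filesystem UUID read from it.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    path: &'static str,
    uuid: &'static str,
}

/// Return the first candidate, in scan order, whose UUID matches.
fn first_match<'a>(scan_order: &'a [Candidate], uuid: &str) -> Option<&'a Candidate> {
    scan_order.iter().find(|c| c.uuid == uuid)
}
```

Because `disk.live` and the physical disk expose the same filesystem header, both carry the same UUID, and whichever is scanned first wins.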
**Proposed fix:**
1. **Make lived report the full filesystem size** by reading the RedoxFS header from
the preload buffer to determine `total_size`. Report that as `size()`.
2. **Make lived fall through to disk reads** for out-of-bounds regions. Use a lazy-open
approach: when a read goes beyond the preload and no disk handle is open yet, try
to open `/scheme/disk/0`. If it fails, return EIO (which RedoxFS will retry).
3. **Reduce the preload** in the bootloader. Since lived now handles disk fallback,
we can preload much less (e.g., 4-64 MiB). The preload just needs to cover the
RedoxFS header and enough metadata for the initial mount.
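The lazy-open idea in item 2 can be sketched independently of the disk code. `LazyHandle` is a hypothetical helper: it calls an opener on each access until the opener succeeds, then keeps the result. In lived the opener would attempt to open the physical disk scheme; here it is an arbitrary closure.

```rust
/// Lazily opened resource: `opener` is retried on every access until it
/// returns `Some`, after which the handle is reused.
struct LazyHandle<T, F: FnMut() -> Option<T>> {
    opener: F,
    handle: Option<T>,
}

impl<T, F: FnMut() -> Option<T>> LazyHandle<T, F> {
    fn new(opener: F) -> Self {
        Self { opener, handle: None }
    }

    /// Return the handle, trying to open it first if it is not open yet.
    fn get(&mut self) -> Option<&mut T> {
        if self.handle.is_none() {
            self.handle = (self.opener)();
        }
        self.handle.as_mut()
    }
}
```

A caller that receives `None` would report an I/O error for that read and simply try again on the next out-of-range request, by which time the driver may have registered its scheme.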
---
## 5. Recommended Implementation Order
### Step 1: Fix lived to report full size + disk fallback (base patch)
Create `local/patches/base/P59-lived-disk-fallback.patch`:
1. Add `total_size: u64` field to LiveDisk
2. Parse RedoxFS header from preload buffer to determine total filesystem size
3. Report `total_size` from `size()` instead of `preload.len()`
4. For reads beyond preload: attempt to open and read from `/scheme/disk/0`
5. Keep overlay for writes
6. Keep block_size = 512 (from P6)
7. Add env var `DISK_PHYS_BLOCK` for the partition offset on the physical disk
### Step 2: Reduce bootloader preload cap (bootloader patch)
Create `local/patches/bootloader/P1-reduce-live-preload.patch`:
1. Change `max_preload` from 1024 MiB to a calculated minimum:
- Read the RedoxFS header to determine filesystem size
- Calculate the minimum preload needed: max(header + allocator extent, 4 MiB)
- Cap at 128 MiB (generous upper bound for metadata region)
2. Add `DISK_PHYS_BLOCK` env var so lived knows where the partition starts on disk
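The calculated minimum in item 1 amounts to a clamp. A minimal sketch, assuming the size of the metadata region is already known in bytes (`preload_budget` and the constants are hypothetical names; the 4 MiB floor and 128 MiB cap come from the list above):

```rust
/// One mebibyte in bytes (hypothetical constant for this sketch).
const MIB: u64 = 1024 * 1024;
const MIN_PRELOAD: u64 = 4 * MIB;
const MAX_PRELOAD: u64 = 128 * MIB;

/// Preload at least 4 MiB, at most 128 MiB, and never more than the
/// filesystem itself.
fn preload_budget(metadata_bytes: u64, fs_size: u64) -> u64 {
    metadata_bytes.clamp(MIN_PRELOAD, MAX_PRELOAD).min(fs_size)
}
```

Taking the minimum with `fs_size` last keeps tiny images from requesting a preload larger than the filesystem.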
### Step 3: Verify
1. Build and test redbear-full ISO in QEMU with virtio-gpu
2. Verify RedoxFS mounts the full 4093 MiB filesystem
3. Verify login prompt appears
4. Verify KDE desktop loads (or at minimum, the greeter starts)
---
## 6. Risk Assessment
| Risk | Impact | Mitigation |
|------|--------|------------|
| RedoxFS header format changes between versions | lived parses header incorrectly | Use the same header parsing code as RedoxFS lib |
| ahcid not started when lived first needs disk | Opening the disk fails (ENOENT), so the read returns EIO | Retry with backoff; RedoxFS mount retries automatically |
| Physical disk block offset wrong | Read corrupt data | Pass exact block offset from bootloader via env var |
| Preload too small for RedoxFS to find header | Mount fails immediately | Keep minimum preload at 4 MiB (covers any superblock) |
| Mini ISO regression | Small images broken | Test mini ISO after every change |
---
## 7. Alternative Approach: Mount From Physical Disk Directly
Instead of fixing lived, we could modify the init sequence to skip `disk.live` entirely
for the root filesystem mount:
1. Bootloader preloads just enough for kernel + initfs (no change needed)
2. lived starts but is only used for early boot I/O
3. `50_rootfs.service` is changed to explicitly mount from `/scheme/disk/0` (via ahcid)
instead of scanning all disk schemes
4. This requires passing the disk path and block offset from bootloader to init
**Pros:** Simpler lived (no disk fallback), cleaner architecture
**Cons:** Requires knowing which disk scheme serves the boot device; may not work if
the AHCI driver assigns a different number to the boot disk
**Verdict:** Fix 3 (lived with disk fallback) is more robust because it works regardless
of which disk scheme is assigned. The lived approach acts as a transparent cache layer.
---
## 8. Implementation Notes
### Bootloader env vars (current)
```
DISK_LIVE_ADDR=<hex phys addr of preload buffer>
DISK_LIVE_SIZE=<hex size of preload buffer>
REDOXFS_BLOCK=0 (always 0 for live mode)
REDOXFS_UUID=<uuid>
```
### Bootloader env vars (proposed additions)
```
DISK_PHYS_BLOCK=<hex block offset of partition on physical disk>
REDOXFS_FULL_SIZE=<hex total filesystem size>
```
### lived env vars (current)
```
DISK_LIVE_ADDR → phys addr to mmap
DISK_LIVE_SIZE → size to mmap (= preload size, NOT total filesystem size)
```
### lived env vars (proposed)
```
DISK_LIVE_ADDR → phys addr of preload buffer
DISK_LIVE_SIZE → size of preload buffer
DISK_PHYS_BLOCK → block offset for disk fallback reads
REDOXFS_FULL_SIZE → total filesystem size (for size() reporting)
```
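Parsing these variables could look like the sketch below. It is an assumption-laden illustration: `LiveEnv`, `parse_hex`, and `from_lookup` are hypothetical names, values are assumed to be hexadecimal with an optional `0x` prefix, and the two proposed variables are treated as optional so that an older bootloader still works.

```rust
/// Parse a hexadecimal value, accepting an optional `0x`/`0X` prefix.
fn parse_hex(value: &str) -> Option<u64> {
    let v = value.trim();
    let digits = v
        .strip_prefix("0x")
        .or_else(|| v.strip_prefix("0X"))
        .unwrap_or(v);
    u64::from_str_radix(digits, 16).ok()
}

/// Values lived would need under the proposed scheme.
#[derive(Debug, PartialEq)]
struct LiveEnv {
    live_addr: u64,
    live_size: u64,
    phys_block: Option<u64>, // proposed: DISK_PHYS_BLOCK
    full_size: Option<u64>,  // proposed: REDOXFS_FULL_SIZE
}

impl LiveEnv {
    /// Build from any key lookup (for example a wrapper around the process
    /// environment). Returns `None` if a required variable is missing or
    /// malformed.
    fn from_lookup<'a>(get: impl Fn(&str) -> Option<&'a str>) -> Option<Self> {
        Some(Self {
            live_addr: parse_hex(get("DISK_LIVE_ADDR")?)?,
            live_size: parse_hex(get("DISK_LIVE_SIZE")?)?,
            phys_block: get("DISK_PHYS_BLOCK").and_then(parse_hex),
            full_size: get("REDOXFS_FULL_SIZE").and_then(parse_hex),
        })
    }
}
```

Taking a lookup closure instead of reading the environment directly keeps the parsing testable without touching process state.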