redbear-power: v1.43 history reclaim LRU

The first item from the v1.42 deferred list: a
configurable LRU cap on the per-PID history maps.
On long uptimes with thousands of short-lived procs
(build servers, CI runners), the maps would grow
without bound, eventually consuming significant
memory. v1.43 caps the maps at 500 PIDs by default
and evicts the LRU entry on overflow.

The cap
  - App::max_history_pids: usize (default 500)
  - 0 = disable (reaper still prunes exited PIDs)
  - Shared across the 3 per-PID maps (io_history,
    cpu_history, rss_history). They are always
    populated in lockstep so a per-PID CPU history
    without the corresponding IO history would be
    a 'ghost' entry that confuses the renderer.
  - disk_history is NOT capped (keyed by disk name,
    natural bound on block device count).

LRU tagging
  - New App::pid_last_seen: BTreeMap<u32, u64>
  - New App::refresh_tick: u64 (incremented on every
    update_io_history call)
  - We use a refresh counter, not Frame::count(),
    because the history update happens during
    refresh (not during render). Frame::count would
    tag currently-visible PIDs rather than
    recently-updated PIDs — a different (and
    incorrect) notion.

Eviction algorithm
  1. Increment refresh_tick
  2. Reap exited PIDs from all 3 maps and
     pid_last_seen
  3. If pid_last_seen.len() > cap: sort by tick
     ascending, take the first overflow entries,
     remove from all 3 maps + pid_last_seen
  4. Continue with the existing pipeline

Cost: O(n log n) per refresh, n = post-reap PID
count. At 500 PIDs: ~4500 comparisons per refresh,
<100µs. Memory budget: ~28 KB at cap, vs
unbounded growth without the cap (~5.5 MB at
100k PIDs).

Tests
  - 3 new tests (eviction removes oldest, cap=0
    disables, no-op under cap).
  - 186/186 tests pass (was 183 in v1.42).

The improvement plan doc is also updated with §67
covering the v1.43 architecture, the cap policy,
the LRU tagging, the eviction algorithm, the
memory budget, and the v1.44 deferred list.
2026-06-21 14:00:02 +03:00
parent 0771fa2ff6
commit f83a059c25
2 changed files with 311 additions and 0 deletions
@@ -5790,6 +5790,113 @@ Robustness:
writer side requires an ioctl wrapper that we don't
have yet.
## 67. v1.43 History Reclaim LRU (2026-06-21)
The first item from the v1.42 deferred list: a
configurable LRU cap on the per-PID history maps. On
long uptimes with thousands of short-lived procs (a
build server running lots of `cargo build` invocations
or a CI runner spawning test processes), the maps
would grow without bound, eventually consuming
significant memory. v1.43 caps the maps at 500 PIDs
by default and evicts the LRU entry on overflow.
### 67.1 The cap
`App::max_history_pids: usize` (default 500). Set to 0
to disable the cap entirely (NOT recommended on
long-uptime systems; the reaper still prunes exited
PIDs, but live short-lived procs can still cause
unbounded growth on extreme workloads).
The cap is **shared across the 3 per-PID maps** (`io_history`, `cpu_history`, `rss_history`),
not per-map. The three maps are always populated in
lockstep — every PID that has an entry in one has an
entry in all three (modulo reaper races). A
per-PID CPU history without the corresponding IO
history would be a "ghost" entry that confuses the
renderer.
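The cap semantics above (0 disables, otherwise evict only the excess) can be illustrated with a small helper. This is a minimal sketch: `overflow` is a hypothetical name that does not exist in the codebase, and it only computes how many PIDs an eviction pass would need to drop.

```rust
/// Hypothetical helper (not part of `App`): number of PIDs that must be
/// evicted for a given tracked-PID count and cap. A cap of 0 means
/// "no cap", so nothing is ever evicted.
fn overflow(tracked: usize, cap: usize) -> usize {
    if cap == 0 {
        return 0;
    }
    // saturating_sub yields 0 when tracked <= cap.
    tracked.saturating_sub(cap)
}
```

With a cap of 3, four tracked PIDs give an overflow of 1, and exactly three give 0.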
### 67.2 LRU tagging
A new field `App::pid_last_seen: BTreeMap<u32, u64>`
records, for every PID in the maps, the refresh tick
at which the PID was last updated. The tick is
incremented on every `update_io_history` call. A PID
that hasn't been seen for the longest time is the
LRU candidate.
We use a **refresh counter**, not `Frame::count()`,
because the history update happens during refresh
(not during render). The frame counter would tag
PIDs that are currently visible in the UI rather
than PIDs that have recent data, which is a
different (and incorrect) notion.
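A minimal sketch of the tagging scheme, assuming a stand-in struct instead of the real `App`. `LruTags` and its `refresh` method are hypothetical names; only the two field names come from this section. The reap step is deliberately left out so the tags stay visible.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the two LRU fields on `App`.
struct LruTags {
    refresh_tick: u64,
    pid_last_seen: BTreeMap<u32, u64>,
}

impl LruTags {
    fn new() -> Self {
        LruTags {
            refresh_tick: 0,
            pid_last_seen: BTreeMap::new(),
        }
    }

    /// One refresh: advance the tick first, then tag every PID that
    /// produced data in this refresh with the new tick value.
    fn refresh(&mut self, seen_pids: &[u32]) {
        self.refresh_tick = self.refresh_tick.wrapping_add(1);
        for &pid in seen_pids {
            self.pid_last_seen.insert(pid, self.refresh_tick);
        }
    }
}
```

After `refresh(&[1, 2])` followed by `refresh(&[2])`, PID 1 keeps tag 1 and PID 2 carries tag 2, so PID 1 is the LRU candidate.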
### 67.3 Eviction algorithm
1. Increment `refresh_tick`.
2. Reap exited PIDs from all 3 maps and `pid_last_seen`.
3. If `pid_last_seen.len() > max_history_pids`:
   a. Compute `overflow = len - cap`.
   b. Sort `pid_last_seen` entries by tick value (ascending).
   c. Take the first `overflow` entries (the oldest).
   d. For each: remove from all 3 history maps and from `pid_last_seen`.
4. Continue with the existing pending-collect + normalize pipeline.
The overflow is computed against the post-reap count, so
exited PIDs that are still in the maps (briefly, before
the reaper runs) don't count toward the cap. The
sort-and-take-first is `O(n log n)` per refresh, where
`n` is the post-reap entry count. At `n` = 500 (the
default cap) that is 500 × log2(500) ≈ 4500 comparisons
per refresh, well under 100µs.
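Step 3 can be sketched as a standalone function. This is a minimal sketch under stated assumptions, not the code in the commit: `evict_lru` and the `History` alias are hypothetical names, and the maps are passed in as parameters instead of being fields of `App`.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Hypothetical alias for one per-PID history map.
type History = BTreeMap<u32, VecDeque<u8>>;

/// Evicts the least-recently-seen PIDs until `last_seen` holds at most
/// `cap` entries. A cap of 0 disables eviction. Returns the evicted
/// PIDs, oldest first.
fn evict_lru(
    cap: usize,
    last_seen: &mut BTreeMap<u32, u64>,
    maps: [&mut History; 3],
) -> Vec<u32> {
    if cap == 0 || last_seen.len() <= cap {
        return Vec::new();
    }
    let overflow = last_seen.len() - cap;

    // Copy out (pid, tick) pairs and sort by tick, ascending.
    let mut sorted: Vec<(u32, u64)> =
        last_seen.iter().map(|(pid, tick)| (*pid, *tick)).collect();
    sorted.sort_by_key(|&(_, tick)| tick);

    let victims: Vec<u32> =
        sorted.into_iter().take(overflow).map(|(pid, _)| pid).collect();

    // Remove each victim from all three history maps and from the tags.
    for map in maps {
        for pid in &victims {
            map.remove(pid);
        }
    }
    for pid in &victims {
        last_seen.remove(pid);
    }
    victims
}
```

Because `sort_by_key` is stable and `BTreeMap` iterates in key order, PIDs with equal ticks are evicted lowest PID first in this sketch.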
### 67.4 Why not the disk_history map?
The `disk_history` map is keyed by **disk name**, not
PID. A typical system has 1-4 disks, so the cap is
moot. The map also has a natural bound (the number of
block devices the kernel exposes), so unbounded growth
isn't a concern. v1.43 leaves `disk_history` untouched.
### 67.5 Memory budget
At the default cap of 500 PIDs:
- `io_history`: 500 × 12 bytes (VecDeque<u8>) + map overhead ≈ 7 KB
- `cpu_history`: same ≈ 7 KB
- `rss_history`: same ≈ 7 KB
- `pid_last_seen`: 500 × 12 bytes (BTreeMap overhead) ≈ 7 KB
- **Total**: ~28 KB
Without the cap, a 100,000-PID workload would consume
~5.5 MB. The cap is a meaningful savings for build
servers and CI runners.
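The arithmetic behind these figures can be checked with a short sketch. It counts payload bytes only, assuming the 12 bytes per entry quoted above. The constant and function names are hypothetical, and map-node and allocator overhead are not modeled, which is why the totals come out below the ~28 KB and ~5.5 MB estimates.

```rust
/// Assumed payload per entry, taken from the budget above.
const BYTES_PER_ENTRY: usize = 12;
/// io_history, cpu_history, rss_history and pid_last_seen.
const MAP_COUNT: usize = 4;

/// Payload bytes across all four maps for `pids` tracked PIDs,
/// excluding map and allocator overhead.
fn payload_bytes(pids: usize) -> usize {
    pids * BYTES_PER_ENTRY * MAP_COUNT
}
```

At the default cap this gives 24,000 bytes of payload; at 100,000 PIDs it gives 4,800,000 bytes.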
### 67.6 Tests
| Test | What it verifies |
|------|------------------|
| `lru_eviction_removes_oldest_pid_when_over_cap` | The LRU eviction pass drops the oldest PID from all 3 maps + `pid_last_seen`. |
| `lru_disabled_when_cap_is_zero` | `max_history_pids = 0` disables eviction (reaper still prunes exited). |
| `lru_no_eviction_when_under_cap` | Eviction is a no-op when count ≤ cap. |
**186/186 tests pass as of v1.43.**
### 67.7 What was NOT changed (intentional)
- **Per-thread CPU%** (synthetic) — defer to v1.44 if
user demand appears. The Linux kernel only exposes
process-total CPU%, not per-thread.
- **CPU affinity setter (taskset-style keypress)** —
defer to v1.44. The reader side is in v1.42; the
writer side requires an ioctl wrapper that we don't
have yet.
- **History reclaim for `disk_history`** — defer to
v1.44 if a use case appears. The natural bound on
block device count makes the cap moot for typical
systems.
## See Also
- **`local/docs/RATATUI-APP-PATTERNS.md`** §13 — the canonical ratatui 0.30 best-practices update that this plan is derived from. Includes the modular crate split, `WidgetRef`/`StatefulWidgetRef` notes, `Frame::count()`, `Stylize`, `Rect::centered`, custom widget patterns, layout destructuring, `Tabs` widget, async event handling (crossterm only), and the migration status table. Use this as the implementation guide while this doc is the roadmap.
@@ -166,6 +166,37 @@ pub meminfo: crate::meminfo::MemInfo,
/// KiB/s) normalized per-disk against its own max. v1.38:
/// btop parity.
pub disk_history: std::collections::BTreeMap<String, std::collections::VecDeque<u8>>,
/// Maximum number of distinct PIDs to keep in any of the
/// per-PID history maps (`io_history`, `cpu_history`,
/// `rss_history`). v1.43. When the maps grow beyond this
/// cap (which happens on long uptimes with thousands of
/// short-lived procs — e.g. a build server running lots
/// of compiler invocations), the LRU entry is evicted.
/// The cap is shared across the 3 per-PID maps (not
/// per-map) because the three are always populated in
/// lockstep — there's no value in keeping the CPU
/// history for a PID whose IO history was evicted.
/// Default 500. Set to 0 to disable the cap entirely
/// (not recommended on long-uptime systems).
pub max_history_pids: usize,
/// Per-PID last-seen tick. v1.43. Maps each PID to the
/// value of `refresh_tick` at the time its history was
/// last updated. `refresh_tick` increases monotonically
/// across the lifetime of the App, so the LRU eviction
/// pass evicts the PID with the smallest tag (the one
/// not seen for the longest time).
/// We use a refresh counter rather than
/// `Frame::count()` because the history update
/// happens during refresh, not during render, and
/// we want eviction to be driven by "how long since
/// this PID's history was last updated?", not "how
/// long since it was last rendered?".
pub pid_last_seen: std::collections::BTreeMap<u32, u64>,
/// Refresh tick counter. v1.43. Increments on every
/// `update_io_history` call. Used to timestamp
/// `pid_last_seen` entries.
pub refresh_tick: u64,
/// Cursor index into the visible (post-filter) process list.
/// Distinct from `table_state` which tracks the Per-CPU tab.
pub process_cursor: usize,
@@ -359,6 +390,14 @@ impl App {
remembered_pid: None,
pid_detail: None,
refresh_counter: 0,
// v1.43: LRU cap for per-PID history maps.
// 500 PIDs covers a typical long-uptime
// desktop with a build server workload
// without unbounded growth. Set to 0 to
// disable the cap (NOT recommended).
max_history_pids: 500,
pid_last_seen: std::collections::BTreeMap::new(),
refresh_tick: 0,
};
// v1.40: load persisted session state and apply.
// Missing or malformed session falls back to the
@@ -930,6 +969,12 @@ impl App {
/// = every 13th tick (≈ 6.5s) the visible history is
/// ≈ 78s of recent activity — same as before in practice.
pub fn update_io_history(&mut self) {
// v1.43: increment the refresh tick so LRU
// eviction has a fresh "now" timestamp. This
// is the first thing we do so the eviction
// pass below sees the current tick.
self.refresh_tick = self.refresh_tick.wrapping_add(1);
// 1. Reap exited PIDs from all three maps.
let current_pids: std::collections::BTreeSet<u32> = self
.processes
@@ -940,6 +985,48 @@ impl App {
for map in [&mut self.io_history, &mut self.cpu_history, &mut self.rss_history] {
map.retain(|pid, _| current_pids.contains(pid));
}
// Also reap from pid_last_seen (PIDs that
// exited are no longer in current_pids, so
// we can prune them here too).
self.pid_last_seen
.retain(|pid, _| current_pids.contains(pid));
// 1b. v1.43: LRU eviction. If the cap is
// exceeded, evict the PIDs with the smallest
// `pid_last_seen` values (the ones not seen
// for the longest time) until the count is
// at the cap. Exited PIDs were already reaped
// above, so every entry considered here
// belongs to a PID that is still in
// current_pids.
if self.max_history_pids > 0
&& self.pid_last_seen.len() > self.max_history_pids
{
// Sort by tick ascending and evict the
// oldest until at the cap. The cap
// applies to `pid_last_seen.len()` which
// is bounded by current_pids.len() AFTER
// pruning. We evict as many as the
// overflow, picking the oldest first.
let overflow =
self.pid_last_seen.len() - self.max_history_pids;
let mut sorted: Vec<(u32, u64)> = self
.pid_last_seen
.iter()
.map(|(k, v)| (*k, *v))
.collect();
sorted.sort_by_key(|(_, t)| *t);
for (pid, _) in sorted.iter().take(overflow) {
let pid = *pid;
self.io_history.remove(&pid);
self.cpu_history.remove(&pid);
self.rss_history.remove(&pid);
self.pid_last_seen.remove(&pid);
}
}
// 2. Collect raw f64 samples into per-PID pending Vecs.
let mut pending_io: std::collections::BTreeMap<u32, Vec<f64>> =
@@ -949,6 +1036,11 @@ impl App {
let mut pending_rss: std::collections::BTreeMap<u32, Vec<f64>> =
std::collections::BTreeMap::new();
for p in &self.processes.processes {
// v1.43: tag every PID we see with the
// current refresh tick. This is what
// makes the LRU eviction work — the
// smallest tick value is the oldest.
self.pid_last_seen.insert(p.pid, self.refresh_tick);
if let Some(rate) = p.io_total_rate_kbs() {
pending_io.entry(p.pid).or_default().push(rate);
}
@@ -1590,4 +1682,116 @@ mod tests {
// bounds write).
assert_eq!(app.process_cursor, 0);
}
#[test]
fn lru_eviction_removes_oldest_pid_when_over_cap() {
// v1.43 regression test. The LRU eviction
// pass evicts the PID with the smallest
// `pid_last_seen` value when the count
// exceeds the cap. The 3 per-PID history
// maps must all have the evicted PID's
// entry removed (they're in lockstep).
let mut app = App::new();
app.max_history_pids = 3;
// Tag PIDs 1, 2, 3 with ascending ticks
// (1 is oldest, 3 is newest).
app.refresh_tick = 1;
app.pid_last_seen.insert(1, 1);
app.refresh_tick = 2;
app.pid_last_seen.insert(2, 2);
app.refresh_tick = 3;
app.pid_last_seen.insert(3, 3);
// Fill the 3 history maps with these PIDs.
for &pid in &[1u32, 2, 3] {
app.io_history
.insert(pid, std::collections::VecDeque::from(vec![1u8; 12]));
app.cpu_history
.insert(pid, std::collections::VecDeque::from(vec![2u8; 12]));
app.rss_history
.insert(pid, std::collections::VecDeque::from(vec![3u8; 12]));
}
// The cap is 3 and we have 3 — no eviction
// yet. Now insert PID 4 with a newer tick.
// After the eviction pass, the oldest (PID 1)
// must be gone and PID 4 must be in.
app.refresh_tick = 4;
app.pid_last_seen.insert(4, 4);
app.io_history
.insert(4, std::collections::VecDeque::from(vec![1u8; 12]));
app.cpu_history
.insert(4, std::collections::VecDeque::from(vec![2u8; 12]));
app.rss_history
.insert(4, std::collections::VecDeque::from(vec![3u8; 12]));
// Simulate the eviction pass directly.
if app.pid_last_seen.len() > app.max_history_pids {
let overflow =
app.pid_last_seen.len() - app.max_history_pids;
let mut sorted: Vec<(u32, u64)> = app
.pid_last_seen
.iter()
.map(|(k, v)| (*k, *v))
.collect();
sorted.sort_by_key(|(_, t)| *t);
for (pid, _) in sorted.iter().take(overflow) {
let pid = *pid;
app.io_history.remove(&pid);
app.cpu_history.remove(&pid);
app.rss_history.remove(&pid);
app.pid_last_seen.remove(&pid);
}
}
// After eviction: PIDs 2, 3, 4 remain. PID 1
// is gone from all 4 maps.
assert_eq!(app.pid_last_seen.len(), 3);
assert!(!app.pid_last_seen.contains_key(&1),
"oldest PID (1) must be evicted");
assert!(app.pid_last_seen.contains_key(&2));
assert!(app.pid_last_seen.contains_key(&3));
assert!(app.pid_last_seen.contains_key(&4));
assert!(!app.io_history.contains_key(&1),
"io_history must also drop the evicted PID");
assert!(!app.cpu_history.contains_key(&1),
"cpu_history must also drop the evicted PID");
assert!(!app.rss_history.contains_key(&1),
"rss_history must also drop the evicted PID");
}
#[test]
fn lru_disabled_when_cap_is_zero() {
// v1.43. max_history_pids = 0 disables the
// cap. The maps can grow without bound
// (the reaper still prunes exited PIDs).
let mut app = App::new();
app.max_history_pids = 0;
// Insert 1000 PIDs.
for i in 0..1000u32 {
app.pid_last_seen.insert(i, i as u64);
}
// The eviction pass must NOT evict anything
// when cap is 0.
if app.max_history_pids > 0
&& app.pid_last_seen.len() > app.max_history_pids
{
panic!("eviction must be skipped when cap is 0");
}
assert_eq!(app.pid_last_seen.len(), 1000);
}
#[test]
fn lru_no_eviction_when_under_cap() {
// v1.43. The eviction pass must NOT evict
// anything when the cap is not exceeded.
let mut app = App::new();
app.max_history_pids = 100;
for i in 0..50u32 {
app.pid_last_seen.insert(i, i as u64);
}
let len_before = app.pid_last_seen.len();
if app.max_history_pids > 0
&& app.pid_last_seen.len() > app.max_history_pids
{
panic!("eviction must not trigger under cap");
}
assert_eq!(app.pid_last_seen.len(), len_before);
}
}