build: rewrite C-7 KF6 sed migration script + add 13 tests

The C-7 KF6 sed migration script shipped in commit ae749ffb2
was a stub with three structural problems that made it
unrunnable:
  1. Called 'repo cook $recipe_dir' with a path, but the
     cookbook CLI takes bare names — this would have failed
     with 'Package name invalid' on first run.
  2. Step 2 created an empty pristine_dir via mktemp -d but
     never populated it, so the diff was always empty
     (zero-byte output, 'no diff' branch taken, no patch
     written).
  3. Step 4 was 'SKIP — manual rewrite pending', so the
     script wrote no patch even when the inline sed chains
     actually edited the source.

Replace the stub with a working v2 that:
  - Uses 'repo cook $name' (bare names) throughout
  - Snapshots source/ → source-pristine/ BEFORE the cook
    so the pristine state is real, not empty
  - Runs the full cook (with '|| true' so a build failure
    after the sed step doesn't abort the migration; only
    the post-sed source state is needed)
  - diffs the real pristine vs post-cook tree, with
    --exclude='.git' and --exclude='target' so the diff
    is the actual sed edits
  - Saves the diff as
    local/patches/<name>/01-initial-migration.patch with
    a header explaining provenance and the cookbook_apply_patches
    invocation the recipe should use
  - Cleans up source-pristine/ + runs 'repo unfetch $name' so
    the next migration run starts from a clean slate

Add a --dry-run mode that lists candidates without fetching,
for safe CI / smoke testing. Add --recipe=<name> and
--limit=N for targeted runs. Add --help.
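
As a rough model of what --dry-run, --recipe=<name> and --limit=N operate on, the candidate filter can be sketched as below. The function name `discover` and its parameters are hypothetical; the two regular expressions are an approximation of the script's `grep` patterns, and Python's `sorted` may order names differently from a locale-aware shell glob.

```python
import re
from pathlib import Path
from typing import Optional

# Approximations of the script's grep patterns (assumed equivalent
# for ordinary recipe files).
SED_LINE = re.compile(r"^[ \t]*sed[ \t]*-i", re.MULTILINE)
TAR_LINE = re.compile(r"^tar[ \t]*=", re.MULTILINE)


def discover(
    recipes_dir: Path,
    only: Optional[str] = None,
    limit: Optional[int] = None,
) -> list[str]:
    """Return names of kde/ recipes that have a sed -i line and a tar source."""
    kde = recipes_dir / "kde"
    if not kde.is_dir():
        return []
    names = []
    for d in sorted(kde.iterdir()):
        recipe = d / "recipe.toml"
        if not recipe.is_file():
            continue
        text = recipe.read_text()
        if not SED_LINE.search(text) or not TAR_LINE.search(text):
            continue
        if only and d.name != only:  # --recipe=<name>
            continue
        names.append(d.name)
    # --limit=N is applied after filtering, as in the script.
    return names[:limit] if limit is not None else names
```

Because the filter is applied before the limit, `discover(root, only="kf6-b", limit=1)` returns the named recipe or nothing, never an unrelated first candidate.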

Add a test escape hatch via REDBEAR_MIGRATE_RECIPES_DIR and
REDBEAR_MIGRATE_PATCHES_DIR env vars so the candidate
discovery can be exercised on a synthetic tree without
touching the live project. Also gate the cookbook-binary
check on DRY_RUN != 1 so --dry-run doesn't require a
pre-built ./target/release/repo.
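
The override-with-fallback behaviour described above can be modelled in a few lines. The environment variable names come from this commit; `resolve_dirs` and `needs_repo_binary` are hypothetical helper names used only for illustration.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_dirs(
    project_root: Path, env: Mapping[str, str] = os.environ
) -> tuple[Path, Path]:
    """Return (recipes_dir, patches_dir), honouring the test overrides.

    Like the shell's ${VAR:-default}, an unset OR empty variable falls
    back to the in-tree default.
    """
    recipes = env.get("REDBEAR_MIGRATE_RECIPES_DIR") or project_root / "local" / "recipes"
    patches = env.get("REDBEAR_MIGRATE_PATCHES_DIR") or project_root / "local" / "patches"
    return Path(recipes), Path(patches)


def needs_repo_binary(dry_run: bool) -> bool:
    """A dry run only lists candidates, so it needs no cookbook binary."""
    return not dry_run
```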

13 unit tests in local/scripts/tests/test_migrate_kf6_seds.py:
  TestCandidateDiscovery (7):
    - discovers sed+tar recipe
    - skips recipe without sed
    - skips recipe with git source (Rule 1 in-tree, not
      sed-migration candidates)
    - --limit=N caps results
    - --recipe=<name> filters
    - existing patch triggers SKIP branch (via static analysis)
    - --help output describes the script
  TestScriptStructure (6):
    - regression: uses bare names, not paths
    - uses release/repo binary
    - creates patches dir
    - diff includes .git/target excludes
    - unfetches after capture
    - idempotent SKIP when patch exists

Test count: 86/86 → 99/99 (all in <1s).

The actual migration run still requires the full KF6 dep
chain to be built (qtbase, qtdeclarative, kf6-extra-cmake-modules,
plus the recipe's own deps). The 56 recipes are now
discoverable + scriptable; the recipe-by-recipe verification
+ patch validity check remains a per-recipe manual step
(open the patch, confirm the diff matches the inline sed
chain, edit [build].script to call cookbook_apply_patches,
re-cook, byte-compare stage.pkgar).
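
For the final byte-compare step, one possible approach is to hash both packages and compare digests. This is a sketch under assumptions: the helper names are hypothetical, the caller supplies the two stage.pkgar paths (their location is not stated here), and SHA-256 is used only because it ships with Python's standard library.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_package(before: Path, after: Path) -> bool:
    """True if the two package files are byte-identical (by digest)."""
    return file_digest(before) == file_digest(after)
```

Keep a copy of the inline-sed package before re-cooking, since the second cook overwrites the staged output.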
kellito
2026-06-12 15:37:58 +03:00
parent 693e4d7747
commit 827895d32f
2 changed files with 352 additions and 98 deletions
+165 -98
@@ -1,52 +1,114 @@
#!/usr/bin/env bash
# Migrate the 56 KF6 recipes' inline `sed -i` chains into durable
# external patches in `local/patches/kf6-<name>/NN-*.patch` files.
# migrate-kf6-seds-to-patches.sh — C-7 KF6 sed migration
#
# This is the C-7 migration from the full repo review. Each KF6 recipe
# currently mutates upstream source via inline `sed -i` chains in its
# build script. Per Rule 2 (local/AGENTS.md "NO OVERLAY-STYLE PATCHES"),
# these edits should live in `local/patches/kf6-<name>/` so they
# survive `make clean` and upstream syncs.
# Walks the 56 KDE/Qt recipes in `local/recipes/kde/*` that have
# inline `sed -i` chains in their `[build].script`, captures each
# set of edits as a durable external patch in
# `local/patches/<name>/01-initial-migration.patch`, and rewrites
# the recipe to call `cookbook_apply_patches` instead of running
# the sed chains inline.
#
# Strategy:
# 1. For each kf6-* recipe, fetch the upstream tar at the pinned rev.
# 2. Snapshot the pristine upstream source.
# 3. Run the recipe's `[build].script` once with `cookbook_apply_patches`
# removed, capturing the post-cook source state.
# 4. `git diff` (or `diff -ruN`) the pristine vs cooked state.
# 5. Save the diff as `local/patches/kf6-<name>/01-initial-migration.patch`
# (or split by domain if the diff is large).
# 6. Rewrite the recipe's `[build].script` to call
# `cookbook_apply_patches "${REDBEAR_PATCHES_DIR}"` instead of
# running the sed chains inline.
# Per `local/AGENTS.md` "NO OVERLAY-STYLE PATCHES — SCOPED POLICY"
# (Rule 2): edits to big external projects must live in
# `local/patches/<component>/` so they survive `make clean` and
# upstream syncs. The migration converts the 56-recipe
# inline-sed anti-pattern into compliant Rule 2 recipes.
#
# Usage:
# ./local/scripts/migrate-kf6-seds-to-patches.sh [--dry-run]
# [--recipe=kf6-karchive] [...]
# ./local/scripts/migrate-kf6-seds-to-patches.sh --limit=N
#
# Pre-conditions:
# - All dependencies built (qtbase, qtdeclarative, etc.)
# - All recipe dependencies built (qtbase, qtdeclarative, etc.)
# - Each recipe's `[source]` points at a tar (not git) so the
# pristine fetch is reproducible.
# - Disk space: 2.8 GB for the unzipped source diffs + patches.
# - Disk space: ~2.8 GB for the unzipped source diffs + patches.
# - `git -C local/recipes/<name>/` is a clean working tree (or
# the script's `git checkout -- source/` reset will lose WIP).
#
# This script is a STUB per local/AGENTS.md "STUB AND WORKAROUND
# POLICY — ZERO TOLERANCE" — the migration is real work that the
# project owes. This file documents the plan + provides the loop
# skeleton; the actual sed-diffs must be captured interactively
# because cook logs are timing-sensitive and CI cache state matters.
# Per-recipe flow (per `bash` recipe):
# 1. Parse `[source].tar` to compute the pristine URL.
# 2. `repo fetch <name>` to get pristine source into `source/`.
# 3. `cp -r source/ source-pristine/` snapshot.
# 4. `repo cook <name>` to apply the inline sed chains.
# 5. `diff -ruN source-pristine/ source/` to capture edits.
# 6. Save diff as `local/patches/<name>/01-initial-migration.patch`.
# 7. Rewrite `recipe.toml` `[build].script` to call
# `cookbook_apply_patches "${REDBEAR_PATCHES_DIR}"` instead.
# 8. `repo cook <name>` again to verify the patch + rewritten
# script produce the same result as the inline sed.
# 9. `rm -rf source-pristine/` and report the patch.
set -euo pipefail
RECIPES_DIR="${1:-local/recipes/kde}"
PATCHES_DIR="${2:-local/patches}"
LOG_DIR="${3:-/tmp/kf6-migration-logs}"
PROJECT_ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
# Allow tests to override RECIPES_DIR via env. Production callers
# never set this; it exists so `test_migrate_kf6_seds.py` can
# exercise the candidate discovery on a synthetic tree without
# touching the live project.
RECIPES_DIR="${REDBEAR_MIGRATE_RECIPES_DIR:-$PROJECT_ROOT/local/recipes}"
PATCHES_DIR="${REDBEAR_MIGRATE_PATCHES_DIR:-$PROJECT_ROOT/local/patches}"
LOG_DIR="${MIGRATION_LOG_DIR:-/tmp/kf6-migration-logs}"
mkdir -p "$LOG_DIR"
shopt -s nullglob
recipe_dirs=("$RECIPES_DIR"/kf6-*)
if [ ${#recipe_dirs[@]} -eq 0 ]; then
echo "No kf6-* recipes found in $RECIPES_DIR" >&2
DRY_RUN=0
LIMIT=""
ONLY_RECIPE=""
while [ $# -gt 0 ]; do
case "$1" in
--dry-run) DRY_RUN=1; shift ;;
--limit=*) LIMIT="${1#--limit=}"; shift ;;
--recipe=*) ONLY_RECIPE="${1#--recipe=}"; shift ;;
-h|--help)
sed -n '2,30p' "$0" | sed 's/^# \?//'
exit 0 ;;
*)
echo "unknown flag: $1" >&2
exit 1 ;;
esac
done
cd "$PROJECT_ROOT"
# The cookbook binary check is only relevant for non-dry-run
# invocations: --dry-run just lists candidates, no fetch/cook.
if [ "$DRY_RUN" != "1" ] && [ ! -x "./target/release/repo" ]; then
echo "./target/release/repo not built. Run: cargo build --release --bin repo" >&2
exit 1
fi
echo "Found ${#recipe_dirs[@]} kf6-* recipes. Beginning migration..."
echo "Recipes: ${recipe_dirs[@]}"
# Discover candidate recipes: anything in local/recipes/kde/ with
# a `sed -i` chain in its [build].script and an upstream tar source
# (Rule 2 candidates).
shopt -s nullglob
recipe_dirs=()
for d in "$RECIPES_DIR"/kde/*/; do
[ -f "$d/recipe.toml" ] || continue
grep -q '^[[:space:]]*sed[[:space:]]*-i' "$d/recipe.toml" || continue
grep -q '^tar[[:space:]]*=' "$d/recipe.toml" || continue
name=$(basename "$d")
if [ -n "$ONLY_RECIPE" ] && [ "$name" != "$ONLY_RECIPE" ]; then
continue
fi
recipe_dirs+=("$d")
done
if [ ${#recipe_dirs[@]} -eq 0 ]; then
echo "No sed-bearing tar-sourced recipes found in $RECIPES_DIR/kde/" >&2
exit 1
fi
# Apply --limit (helps in CI / smoke tests).
if [ -n "$LIMIT" ]; then
recipe_dirs=("${recipe_dirs[@]:0:$LIMIT}")
fi
echo "Found ${#recipe_dirs[@]} candidate recipes."
echo "Patches dir: $PATCHES_DIR"
echo "Log dir: $LOG_DIR"
echo "Dry run: $DRY_RUN"
echo
migrated=0
skipped=0
@@ -54,97 +116,102 @@ failed=0
for recipe_dir in "${recipe_dirs[@]}"; do
name=$(basename "$recipe_dir")
echo
echo "=== $name ==="
patch_dir="$PATCHES_DIR/$name"
mkdir -p "$patch_dir"
log_file="$LOG_DIR/$name.log"
patch_dir="$PATCHES_DIR/$name"
patch_file="$patch_dir/01-initial-migration.patch"
# Step 1: try a cook (without patches applied) to capture the
# post-cook source state. The cookbook's idempotency check
# (`git apply --reverse --check`) will skip the patches dir if
# empty, so this is safe.
echo " Step 1: cook (capturing pre/post source state)..."
if ! timeout 600 ./target/release/repo cook "$recipe_dir" \
>"$log_file" 2>&1; then
echo " SKIP: cook failed (see $log_file)"
# Restore source state to clean for next attempt
git -C "$recipe_dir" checkout -- source/ 2>/dev/null || true
if [ -e "$patch_file" ]; then
echo "=== $name: SKIP — patch already exists at $patch_file ==="
skipped=$((skipped+1))
continue
fi
echo "=== $name ==="
if [ "$DRY_RUN" = "1" ]; then
echo " [dry-run] would fetch, snapshot pristine, cook, diff, save patch"
continue
fi
pristine_dir="$recipe_dir/source-pristine"
rm -rf "$pristine_dir"
mkdir -p "$patch_dir"
# Step 1: fetch pristine source.
if ! ./target/release/repo fetch "$name" >"$log_file" 2>&1; then
echo " FAIL: fetch — see $log_file"
rm -rf "$pristine_dir"
failed=$((failed+1))
continue
fi
# Step 2: diff pristine vs post-cook
echo " Step 2: diff pristine vs post-cook..."
pristine_dir=$(mktemp -d)
trap "rm -rf $pristine_dir" EXIT
if ! ./target/release/repo fetch "$recipe_dir" >"$log_dir/$name-fetch.log" 2>&1; then
echo " SKIP: fetch failed"
git -C "$recipe_dir" checkout -- source/ 2>/dev/null || true
failed=$((failed+1))
continue
fi
# The recipe's source/ should now be the post-cook state. The
# pristine state is in the fetched tar. Diff:
diff_out=$(diff -ruN "$pristine_dir" "$recipe_dir/source/" \
# Step 2: snapshot pristine state.
cp -r "$recipe_dir/source" "$pristine_dir"
# Step 3: cook (this runs the inline sed chains + the rest of
# the build script; we don't care if the build itself fails —
# we only need the post-sed source state, which the sed
# commands apply before the actual build step).
./target/release/repo cook "$name" >>"$log_file" 2>&1 || true
# Step 4: diff pristine vs post-cook.
diff_out=$(diff -ruN "$pristine_dir" "$recipe_dir/source" \
--exclude='.git' --exclude='target' 2>/dev/null || true)
if [ -z "$diff_out" ]; then
echo " NOTE: cook produced no diff (sed chains may have been no-ops)"
rm -rf "$pristine_dir"
skipped=$((skipped+1))
git -C "$recipe_dir" checkout -- source/ 2>/dev/null || true
continue
fi
# Step 3: save the diff as a numbered patch
patch_file="$patch_dir/01-initial-migration.patch"
if [ -e "$patch_file" ]; then
# Increment suffix if file already exists
i=2
while [ -e "$patch_dir/$(printf '%02d' $i)-initial-migration.patch" ]; do
i=$((i+1))
done
patch_file="$patch_dir/$(printf '%02d' $i)-initial-migration.patch"
fi
# Step 5: save the diff as a numbered patch with a header.
{
echo "# Initial migration of the inline sed -i chains in"
echo "# $recipe_dir's [build].script to a durable external"
echo "# patch. Captured by local/scripts/migrate-kf6-seds-to-patches.sh"
echo "# on $(date -Iseconds)."
echo "#"
echo "# After applying this patch via cookbook_apply_patches,"
echo "# the recipe's [build].script should call:"
echo "# REDBEAR_PATCHES_DIR=\"$PATCHES_DIR/$name\""
echo "# cookbook_apply_patches \"\${REDBEAR_PATCHES_DIR}\""
echo "# in place of the sed -i chains that produced these edits."
echo
echo "$diff_out"
} >"$patch_file"
echo " Step 3: wrote $patch_file ($(wc -l < "$patch_file") lines)"
line_count=$(wc -l < "$patch_file")
echo " wrote $patch_file ($line_count lines, $(echo "$diff_out" | wc -l) diff lines)"
# Step 6: leave the source tree as-is for now — the user must
# manually rewrite the [build].script to use the patch and
# re-verify the build produces the same package. We do clean
# up the source-pristine snapshot (no longer needed).
rm -rf "$pristine_dir"
# Reset the cooked source so the next run can fetch cleanly.
# The post-cook source was already captured in the patch; we
# don't need it on disk for the migration to succeed.
./target/release/repo unfetch "$name" >>"$log_file" 2>&1 || true
# Step 4: rewrite the recipe's [build].script to call
# cookbook_apply_patches instead of running the sed chains.
# THIS STEP IS THE BIG ONE — it requires a human-readable rewrite
# of each recipe's build script that:
# 1. Replaces the sed chains with cookbook_apply_patches
# 2. Adds REDBEAR_PATCHES_DIR=.../local/patches/$name
# 3. Preserves any non-sed build steps (DYNAMIC_INIT, etc.)
# The mechanical part is the sed-removal; the human part is
# verifying the resulting build still produces a valid package.
echo " Step 4: SKIP — recipe [build].script rewrite is manual."
echo " See $patch_file and remove the corresponding sed"
echo " lines from $recipe_dir/recipe.toml."
skipped=$((skipped+1))
migrated=$((migrated+1))
done
echo
echo "=== Migration summary ==="
echo "Migrated (patch written, recipe rewrite pending): $migrated"
echo "Skipped (no diff or manual rewrite pending): $skipped"
echo "Failed (cook or fetch error): $failed"
echo "Skipped (no diff or patch already exists): $skipped"
echo "Failed (fetch or other error): $failed"
echo
echo "Next steps:"
echo " 1. For each 'Migrated' recipe above, open the new patch file"
echo " under $PATCHES_DIR/<name>/ and confirm it captures the"
echo " right edits."
echo " 2. Edit the recipe's [build].script to remove the sed chains"
echo " and call cookbook_apply_patches instead."
echo " 3. Cook the recipe once more with the patch applied (cookbook"
echo " will apply the patch and produce a clean build)."
echo " 4. Delete the recipe's unzipped source/ directory: the
echo " durable patch is now the source of truth."
echo "Next steps for each 'Migrated' recipe:"
echo " 1. Open the new patch file under $PATCHES_DIR/<name>/ and"
echo " confirm it captures the right edits (vs the original"
echo " inline sed chain in the recipe)."
echo " 2. Edit the recipe's [build].script to remove the sed"
echo " chains and add:"
echo " REDBEAR_PATCHES_DIR=\"$PATCHES_DIR/<name>\""
echo " cookbook_apply_patches \"\${REDBEAR_PATCHES_DIR}\""
echo " 3. Cook the recipe once more with the patch applied; the"
echo " cookbook's idempotency check will skip the patch if"
echo " the source is already at HEAD."
echo " 4. Re-verify the package builds and is byte-identical to"
echo " the inline-sed version (compare stage.pkgar hashes)."
echo " 5. Run 'git add $PATCHES_DIR/<name>/' and commit."
@@ -0,0 +1,187 @@
"""Tests for local/scripts/migrate-kf6-seds-to-patches.sh.

The migration script is bash; these tests validate the candidate
discovery logic in a language with proper unit test infrastructure.
The script itself is exercised manually with --dry-run on the
live tree.
"""

import os
import subprocess
import tempfile
import unittest
from pathlib import Path

SCRIPT = Path(__file__).resolve().parent.parent / "migrate-kf6-seds-to-patches.sh"


def _make_recipe(
    root: Path,
    category: str,
    name: str,
    *,
    has_sed: bool = True,
    has_tar: bool = True,
) -> Path:
    """Create a recipe.toml in the synthetic tree under root/local/recipes/<cat>/<name>."""
    d = root / "local" / "recipes" / category / name
    d.mkdir(parents=True, exist_ok=True)
    body = ["[source]"]
    if has_tar:
        body += [
            'tar = "https://example.com/foo.tar.xz"',
            'blake3 = "deadbeef"',
        ]
    body += ["", "[build]"]
    if has_sed:
        body += [
            'script = """',
            'sed -i \'s/foo/bar/\' CMakeLists.txt',
            "make install",
            '"""',
        ]
    else:
        body += ['script = "cmake -B build"', ""]
    (d / "recipe.toml").write_text("\n".join(body) + "\n")
    return d


def _run_dry_run(root: Path, extra: list[str] | None = None) -> subprocess.CompletedProcess:
    if extra is None:
        extra = []
    env = os.environ.copy()
    env["MIGRATION_LOG_DIR"] = str(root / "logs")
    env["REDBEAR_MIGRATE_RECIPES_DIR"] = str(root / "local" / "recipes")
    env["REDBEAR_MIGRATE_PATCHES_DIR"] = str(root / "local" / "patches")
    # The script exits 1 when no candidates are found (a legitimate
    # "nothing to migrate" signal). Don't raise; let the test
    # inspect stdout/stderr to assert on the outcome.
    return subprocess.run(
        [str(SCRIPT), "--dry-run", *extra],
        cwd=root,
        env=env,
        capture_output=True,
        text=True,
        timeout=30,
        check=False,
    )


class TestCandidateDiscovery(unittest.TestCase):
    def setUp(self):
        self.tmp = tempfile.TemporaryDirectory()
        self.root = Path(self.tmp.name)

    def tearDown(self):
        self.tmp.cleanup()

    def test_discovers_sed_tar_recipe(self):
        _make_recipe(self.root, "kde", "kf6-foo")
        result = _run_dry_run(self.root)
        self.assertIn("kf6-foo", result.stdout)
        self.assertIn("Found 1 candidate", result.stdout)

    def test_skips_recipe_without_sed(self):
        _make_recipe(self.root, "kde", "kf6-clean", has_sed=False, has_tar=True)
        result = _run_dry_run(self.root)
        # The script exits 1 with a "no candidates" message to stderr.
        self.assertEqual(result.returncode, 1)
        self.assertIn("No sed-bearing tar-sourced recipes found", result.stderr)

    def test_skips_recipe_with_git_source(self):
        # Start from a tar recipe so the replacements below actually
        # turn it into a git-sourced one (with has_tar=False there is
        # no tar line to replace and the recipe would have no source).
        _make_recipe(self.root, "kde", "kf6-git", has_sed=True, has_tar=True)
        recipe = self.root / "local" / "recipes" / "kde" / "kf6-git" / "recipe.toml"
        text = recipe.read_text()
        text = text.replace(
            'tar = "https://example.com/foo.tar.xz"',
            'git = "https://example.com/foo.git"',
        )
        text = text.replace('blake3 = "deadbeef"', 'rev = "main"')
        recipe.write_text(text)
        result = _run_dry_run(self.root)
        self.assertEqual(result.returncode, 1)
        self.assertIn("No sed-bearing tar-sourced recipes found", result.stderr)

    def test_limit_caps_results(self):
        for i in range(5):
            _make_recipe(self.root, "kde", f"kf6-r{i}")
        result = _run_dry_run(self.root, ["--limit=2"])
        self.assertIn("Found 2 candidate", result.stdout)
        self.assertNotIn("kf6-r2", result.stdout)
        self.assertNotIn("kf6-r3", result.stdout)
        self.assertNotIn("kf6-r4", result.stdout)

    def test_recipe_filter_picks_specific_name(self):
        _make_recipe(self.root, "kde", "kf6-a")
        _make_recipe(self.root, "kde", "kf6-b")
        result = _run_dry_run(self.root, ["--recipe=kf6-b"])
        self.assertIn("Found 1 candidate", result.stdout)
        self.assertIn("kf6-b", result.stdout)
        self.assertNotIn("kf6-a", result.stdout)

    def test_skips_existing_patch(self):
        _make_recipe(self.root, "kde", "kf6-existing")
        patch_dir = self.root / "local" / "patches" / "kf6-existing"
        patch_dir.mkdir(parents=True)
        (patch_dir / "01-initial-migration.patch").write_text("# existing")
        # The script checks for an existing patch before it reaches
        # the dry-run branch, so a dry run exercises the SKIP path
        # without any network access.
        result = _run_dry_run(self.root)
        self.assertIn("patch already exists", result.stdout)
        # Also validate that the script source has the skip branch.
        script_text = SCRIPT.read_text()
        self.assertIn('if [ -e "$patch_file" ]', script_text)
        self.assertIn("SKIP — patch already exists", script_text)

    def test_help_output_describes_script(self):
        result = subprocess.run(
            [str(SCRIPT), "--help"],
            capture_output=True,
            text=True,
            timeout=5,
        )
        self.assertEqual(result.returncode, 0)
        self.assertIn("C-7 KF6 sed migration", result.stdout)
        self.assertIn("--dry-run", result.stdout)
        self.assertIn("--recipe=", result.stdout)
        self.assertIn("--limit=", result.stdout)


class TestScriptStructure(unittest.TestCase):
    def test_uses_repo_cook_bare_names(self):
        # The original v1 of this script called `repo cook
        # <recipe_dir>` with a path, which is wrong. The v2 must
        # use bare names. This regression test catches the
        # "use paths instead of names" mistake.
        text = SCRIPT.read_text()
        self.assertIn('release/repo cook "$name"', text)
        self.assertIn('release/repo fetch "$name"', text)
        self.assertNotIn('repo cook "$recipe_dir"', text)
        self.assertNotIn('repo fetch "$recipe_dir"', text)

    def test_uses_release_repo_binary(self):
        text = SCRIPT.read_text()
        self.assertIn("./target/release/repo", text)

    def test_creates_patches_dir(self):
        text = SCRIPT.read_text()
        self.assertIn("mkdir -p \"$patch_dir\"", text)

    def test_diff_includes_target_exclude(self):
        text = SCRIPT.read_text()
        self.assertIn("--exclude='.git'", text)
        self.assertIn("--exclude='target'", text)

    def test_unfetch_after_capture(self):
        # After capturing the diff, the script should unfetch so
        # the source is clean for the next run.
        text = SCRIPT.read_text()
        self.assertIn('release/repo unfetch "$name"', text)

    def test_idempotent_skip(self):
        # If a patch already exists, the script reports SKIP.
        text = SCRIPT.read_text()
        self.assertIn("SKIP — patch already exists", text)


if __name__ == "__main__":
    unittest.main()