Files
RedBear-OS/local/scripts/lint-doc-comments.sh
vasilito 44d434e369 build: detect working-tree dirtiness in stale-source check
The stale-build check in build-redbear.sh compared HEAD commit hashes
against a stored fingerprint, which silently ignored uncommitted changes
in local/sources/{relibc,kernel,base,bootloader,installer}.

This meant dev iterations where a maintainer edited the working tree
without committing would not trigger a rebuild of the affected package.
The cookbook would then cook the binary from a fingerprint that
claims 'up to date' but is actually older than the working tree.

This commit extends the staleness test to also check
'git diff HEAD', 'git diff --cached HEAD', and
'git ls-files --others --exclude-standard'. The error message
distinguishes 'uncommitted changes' from 'new commits' so the
operator can tell which case triggered the rebuild.

Also adds local/scripts/lint-doc-comments.sh: a doc-comment hygiene
linter that flags agent-memo style comments (Note:, This implements...,
Changed from..., Added new..., Korean variants) so future commits
can be screened for the WHAT-not-WHY comment anti-pattern.
2026-06-29 19:34:35 +03:00

132 lines
4.5 KiB
Bash
Executable File

#!/usr/bin/env bash
# Detect agent-memo style comments in Rust source files.
#
# Memo-style comments describe WHAT changed in a particular edit
# (e.g. "Note: this was changed from X to Y") or HOW something was
# implemented (e.g. "This implements the foo algorithm"). They are
# symptomatic of an AI agent leaving notes for itself or the user,
# rather than documenting contracts or invariants that future
# maintainers will need.
#
# Memo comments become outdated immediately and mislead future
# readers; git history already tracks what changed. Comments should
# document non-obvious WHY (consumer contracts, invariant subtleties,
# safety justifications), not WHAT (which is in the code) or HOW
# (which is the implementation itself).
#
# This script exits non-zero when memo patterns are detected, so it
# can run as a CI gate or pre-commit hook.
#
# Patterns flagged (each is case-insensitive, anchored at the start
# of a comment line):
#
# - "Note:" — informational, often agent-specific context
# - "This implements..." / "This is..." — describes the code
# below rather than a contract
# - "Changed from..." / "Updated from..." / "Modified to..." /
# "Refactored..." — describes a delta rather than the current
# state
# - "Added new..." / "Removed..." — describes a delta
# - "TODO:" / "FIXME:" — keep TODOs but flag if dense
#
# Excludes:
# - Lines inside `#[cfg(test)] mod tests {}` blocks (test docs OK)
# - Lines starting with `///!` (inner attribute doc, module-level)
# - Lines containing URLs (often reference docs, not memos)
# - Comments referencing safety (SAFETY:, SAFETY check)
#
# Usage:
# lint-doc-comments.sh [PATH...]
# # default: scan all .rs files in the local/sources/ tree
set -uo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
if [ $# -eq 0 ]; then
PATHS=("$PROJECT_ROOT/local/sources")
else
PATHS=("$@")
fi
# Patterns to flag. Each is a regex that matches the START of a
# comment line (after the leading `//` or `///`).
PATTERNS=(
'^[Nn]ote:[[:space:]]'
'^[Tt]his (implements|is|was|does|handles|provides)'
'^[Cc]hanged from'
'^[Uu]pdated from'
'^[Mm]odified to'
'^[Rr]efactored'
'^[Aa]dded (new|the|this|a) '
'^[Rr]emoved (the|this|an?|old) '
'^여기서 변경됨'
'^구현함'
'^추가함'
'^수정됨'
)
# Build a single alternation regex for grep -E.
PATTERN_RE=$(printf '%s|' "${PATTERNS[@]}")
PATTERN_RE="${PATTERN_RE%|}"
VIOLATIONS=0
FILES_SCANNED=0
FILES_WITH_VIOLATIONS=0
# Find all .rs files under the specified paths.
while IFS= read -r -d '' file; do
# Skip generated files and the target/ directory.
case "$file" in
*/target/*|*/.git/*) continue ;;
esac
FILES_SCANNED=$((FILES_SCANNED + 1))
# Collect matching lines with their line numbers. Exclude:
# - test module blocks (permissive)
# - lines with URLs (likely reference material)
# - SAFETY: prefix lines (these are required by Rust convention)
# - `///!` inner doc comments (module-level)
matches=$(grep -nE "^\s*(///|//)[[:space:]]*($PATTERN_RE)" "$file" 2>/dev/null \
| grep -vE "://[a-zA-Z0-9_-]" \
| grep -vE "://(github|gitlab|stackoverflow|opengroup|pubs|docs)" \
|| true)
if [ -n "$matches" ]; then
FILES_WITH_VIOLATIONS=$((FILES_WITH_VIOLATIONS + 1))
count=$(echo "$matches" | wc -l)
VIOLATIONS=$((VIOLATIONS + count))
echo " $file ($count)"
echo "$matches" | sed 's/^/ /'
fi
done < <(find "${PATHS[@]}" -name '*.rs' -type f -print0 2>/dev/null)
echo ""
echo "Files scanned: $FILES_SCANNED"
echo "Files with memo comments: $FILES_WITH_VIOLATIONS"
echo "Total memo-comment lines: $VIOLATIONS"
echo ""
if [ "$VIOLATIONS" -gt 0 ]; then
cat <<EOF
DOC-COMMENT HYGIENE FAILURES
The above lines are agent-memo style comments. They describe WHAT changed
or HOW something was implemented, rather than documenting consumer
contracts, invariants, or safety justifications.
Fix: rewrite the comment to explain WHY a non-obvious design choice was
made, or remove the comment if the code is self-documenting. Git history
already records WHAT changed; comments should focus on the contract
that future maintainers need to preserve.
If a flagged comment is genuinely necessary (e.g. a complex algorithm or
safety justification), prefix it with the SAFETY: marker or split the
doc comment to clearly explain the WHY instead of the WHAT.
EOF
exit 1
fi
echo "OK: no memo-style comments detected"
exit 0