configng: refresh clone once, early — fix matrix-prep cache fingerprint#9866
configng: refresh clone once, early — fix matrix-prep cache fingerprint#9866igorpecovnik wants to merge 8 commits into
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughWhen BUILD_DESKTOP=yes, artifact_rootfs_config_dump refreshes or initializes the local armbian-configng worktree via fetch_from_repo (saving/restoring PWD), falls back with a warning if fetch fails, and—if a .git directory exists—computes CONFIGNG_DESKTOPS_HASH from git and extracts the commit subject, emitting an informational breadcrumb. ChangesConfigng desktops hash derivation
🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested Labels
Suggested Reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
442f7d8 to
18a52a5
Compare
18a52a5 to
fbb6537
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/functions/artifacts/artifact-rootfs.sh`:
- Around line 46-51: The code runs git log against configng_dir even when
fetch_from_repo failed (it only logs a warning), which can reuse stale on-disk
state; modify the flow so git is only consulted when the fetch actually
succeeded: capture the fetch result (e.g. a boolean like fetch_ok from
fetch_from_repo), or check for a successful clone rather than just the presence
of configng_dir, and only set configng_desktops_hash="$(git -C "${configng_dir}"
...)" when fetch_ok is true; otherwise set configng_desktops_hash="unknown" (or
export an empty/explicit sentinel) and avoid touching configng_dir/.git (or
remove/unset configng_dir) so stale hashes are not used; update references to
fetch_from_repo, display_alert, configng_dir, and configng_desktops_hash
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fcc6bf82-c452-4342-9579-8dce0b545caa
📒 Files selected for processing (1)
lib/functions/artifacts/artifact-rootfs.sh
fbb6537 to
7eaba04
Compare
The current error path runs result.stderr through parse_log_lines_from_stderr (which drops anything not formatted as `type::msg`) and prints only the last 5 surviving lines. When config-dump-json subprocesses are failing mysteriously, that filter hides the actual cause — see PR #9866 chasing BUILD_DESKTOP=no/BUILD_MINIMAL=yes targets dying silently in matrix-prep on GHA but working locally. Dump full raw stdout + stderr (Python `repr` so newlines/escapes stay visible) on the failure path. Trim this once matrix-prep is stable.
Three coupled changes that together fix "I pushed to configng but the new image still has the old YAML": 1) New helper `fetch_armbian_configng()` in lib/functions/configuration/config-desktop.sh. Idempotent wrapper around `fetch_from_repo` (mirror-friendly via GITHUB_SOURCE), returns early if BUILD_DESKTOP != yes, save/restores PWD, declares the locals fetch_from_repo writes to so they don't leak. 2) Call the helper from BOTH prep_conf_main_build_single (line 31, right after do_main_configuration) AND prep_conf_main_minimal_ni (line 73, same position). Previously only the full-build path ran the fetch via interactive_desktop_main_configuration; the json-info / matrix-prep path (cli-jsoninfo) silently used a stale or absent clone. 3) Revert artifact_rootfs_config_dump to a read-only consumer of the now-guaranteed-fresh on-disk clone. No fetch in there, no mid-flow side effects per artifact. Keeps the info-level breadcrumb that logs short-hash + commit subject. Also fixes a latent bug in git_ensure_safe_directory (lib/functions/general/git.sh): the gate on `[[ -e "$1/.git" ]]` caused safe.directory to be silently skipped when fetch_from_repo prepared a fresh work tree before `git init`. The subsequent checkout/clean then failed with "fatal: detected dubious ownership" in Docker-volume-mounted setups where the parent directory has a non-matching uid. Dropping the gate makes the add unconditional; setting safe.directory on a non-git path is harmless. Net effect: every code path that reads CONFIGNG_DESKTOPS_HASH now sees the actual HEAD of armbian-configng, the artifact_full_oci_target resolves correctly, CI's remote-cache lookup hits the right (or correctly-missing) tag, and full builds under Docker stop hitting the safe-directory wall.
…loss)
armbian_run_command_and_parse_json_from_stdout (used by the matrix-prep
JSON info path) was building its subprocess env from scratch with only
5 keys: CONFIG_DEFS_ONLY, ANSI_COLOR, WRITE_EXTENSIONS_METADATA,
ALLOW_ROOT, PRE_PREPARED_HOST. HOME was not in the dict.
Effect: the subprocess git invocations ran without HOME, couldn't
locate ~/.gitconfig, and missed the safe.directory entry that the
parent process had just added for cache/sources/armbian-configng.
Every per-build config-dump-json call then exploded with:
fatal: detected dubious ownership in repository at
'/armbian/cache/sources/armbian-configng'
git --no-pager checkout -f -q ...
git --no-pager clean -q -d -f ...
JSON output missing → matrix-prep fails to parse → "Expecting value:
line 1 column 1" → cascade.
Fix is one diff: env={**os.environ, ...overrides...}. Inherit the
parent's HOME, PATH, GIT_* vars, locale, etc., then layer the
build-specific knobs on top.
… git race)
info-gatherer-image.py fans matrix-prep across up to 128 parallel
config-dump-json subprocesses. Each one ran prep_conf_main_minimal_ni
→ fetch_armbian_configng → fetch_from_repo on the same shared on-disk
clone — every worker raced for .git/index.lock, `git checkout` blew
up with exit 128 ("fatal: Unable to create '.git/index.lock'"), the
subprocess produced no JSON, and the gatherer raised "Expecting
value: line 1 column 1 (char 0)" for every desktop target in the
matrix.
The parent has already populated the clone before launching the
gatherer (cli-jsoninfo.sh's explicit fetch on the matrix-prep path,
or prep_conf_main_build_single's fetch on the single-image path), so
subprocesses just need to read what's on disk. Gate the fetch on
CONFIG_DEFS_ONLY!=yes, which is the existing reliable marker that
armbian_run_command_and_parse_json_from_stdout sets in the
subprocess env.
The current error path runs result.stderr through parse_log_lines_from_stderr (which drops anything not formatted as `type::msg`) and prints only the last 5 surviving lines. When config-dump-json subprocesses are failing mysteriously, that filter hides the actual cause — see PR #9866 chasing BUILD_DESKTOP=no/BUILD_MINIMAL=yes targets dying silently in matrix-prep on GHA but working locally. Dump full raw stdout + stderr (Python `repr` so newlines/escapes stay visible) on the failure path. Trim this once matrix-prep is stable.
On GHA the matrix-prep parent runs with CI=true. When we inherited that via **os.environ, start_logging_section (section-logging.sh:29) emitted `::group::[🥑] Group <name>` markers to subprocess stdout, preceding the JSON output. json.loads then choked at line 1 col 1 with "Expecting value (char 0)" for every BUILD_DESKTOP=no/ BUILD_MINIMAL=yes target — and only on GHA, never locally, because locally CI is unset. Drop CI and GITHUB_ACTIONS before layering the build-specific overrides, so config-dump-json subprocesses always behave like a plain non-CI invocation and stdout is pure JSON regardless of where the parent matrix-prep runs.
ef37e4b to
8282e96
Compare
…onf_main The BUILD_DESKTOP=yes gate meant fetch_armbian_configng was effectively a no-op for the matrix-prep parent (which has no specific board, so BUILD_DESKTOP is never "yes" at that level). The clone then depended on cli-jsoninfo.sh:131's secondary fetch, which is itself gated on the inventory file's absence — so on CLEAN_INFO=no the clone is never refreshed and CONFIGNG_DESKTOPS_HASH resolves against stale state. Fetch unconditionally instead. The CONFIG_DEFS_ONLY=yes skip stays — that's the one that prevents 128 parallel config-dump-json workers from racing on .git/index.lock. Cost of always-fetching from prep_conf_main_minimal_ni is one ls-remote per build entry, which is the right tradeoff against silent cache miss.
The bug
The CI matrix-prep path (`cli-jsoninfo` → `prep_conf_main_minimal_ni`) never called `interactive_desktop_main_configuration` — which is where the `fetch_from_repo` for armbian-configng lives. That path read `git log -1` against an absent or stale clone, computed the wrong `CONFIGNG_DESKTOPS_HASH`, resolved `artifact_full_oci_target` to a stale OCI tag, and CI pulled the pre-change rootfs from `ghcr.io/armbian/os/`. New image ships old config.
Additionally, full image builds in Docker volume-mounted environments were also blowing up — `git_ensure_safe_directory` had an ordering bug that surfaces the moment ownership-mismatched clones meet a fresh `git init`.
The fix — three coupled changes + one orthogonal fix
1. New helper
fetch_armbian_configng()`lib/functions/configuration/config-desktop.sh`. Idempotent wrapper around `fetch_from_repo` (mirror-friendly via `GITHUB_SOURCE`). Early-returns on `BUILD_DESKTOP != yes`, save/restores PWD, declares the locals `fetch_from_repo` writes to.
2. Call the helper from both build entry points
`lib/functions/main/config-prepare.sh`. Both `prep_conf_main_build_single` (full builds) AND `prep_conf_main_minimal_ni` (json-info / matrix-prep) now invoke `fetch_armbian_configng` right after `do_main_configuration` — i.e. as soon as `BUILD_DESKTOP` is authoritative.
3. `artifact_rootfs_config_dump` becomes a read-only consumer
`lib/functions/artifacts/artifact-rootfs.sh`. No more fetch logic per artifact — just `git log -1` against the clone that the prep_conf_main layer already refreshed. Keeps the info-level breadcrumb logging short-hash + commit subject.
4. `git_ensure_safe_directory` ordering fix
`lib/functions/general/git.sh`. Dropped the `-e "$1/.git"` gate. `fetch_from_repo` calls `git_ensure_safe_directory` BEFORE running `git init` on a freshly-prepared work tree — the gate caused the safe-directory add to be silently skipped, then subsequent `regular_git checkout/clean` blew up with "fatal: detected dubious ownership" on Docker volume mounts where the parent dir has a non-matching uid. Setting `safe.directory` on a non-git path is harmless; git only consults the entry when an actual repo is accessed there.
Why git and not the GitHub API
Considered `curl + jq` against `/repos/armbian/configng/commits?path=tools/modules/desktops&per_page=1`. Faster, simpler. Rejected because: operators behind corporate proxies set `GITHUB_SOURCE=https://git.internal/...\` which `fetch_from_repo` respects via URL rewrite; `api.github.com` is usually blocked or unrewritten in such environments. The mirror-compatibility win outweighs the one-time clone cost.
Test plan