fix(ci): repair rustup proxy AFTER cache restore on macOS release builds#254
Merged
Merged
Conversation
Release pipeline #97 (v0.5.98) failed on the aarch64-apple-darwin build with: error: unexpected argument 'build' found Usage: rustup-init[EXE] [OPTIONS] Stack backtrace: rustup_init::run_rustup_inner ... PR #245 already added a proxy-repair line ('rustup default ...') to the 'Install pinned nightly' step, and the in-step 'cargo --version' smoke check still passes — but the very next step, 'Swatinem/rust-cache', restores '~/.cargo/bin/' from a prior run's cache. On macos-latest that cache holds a poisoned 'cargo' symlink pointing at 'rustup-init' (the installer) rather than the toolchain proxy. The restore overwrites the freshly-repaired '~/.cargo/bin/cargo', and the subsequent 'cargo build' picks up the poisoned proxy and exits with the rustup-init backtrace before reaching any cargo subcommand. Empirical proof from the failing job (76219153458): 16:36:34Z cargo --version -> cargo 1.97.0-nightly (proxy healthy) 16:36:35Z Swatinem/rust-cache: restoring ~/.cargo/bin/ ... 16:36:54Z cargo build -> rustup-init backtrace (proxy poisoned) Fix: add a second proxy-repair step *after* the cache-restore step and *before* the build step. Re-running 'rustup default "$(rustup show active-toolchain | awk '{print $1}')"' rewrites the proxy binaries in '~/.cargo/bin' so the build resolves to the active toolchain's cargo even when the restored cache was poisoned. On the next successful run, 'Swatinem/rust-cache' saves a clean cache and the poison clears itself. Both 'release.yml::build-release-binaries' and 'release-cache-warm.yml::warm' get the same step (with matching commentary) — without the fix in the warm workflow, every macOS warm run would re-save the poisoned cache and perpetuate the regression. Step is gated on 'runner.os == macOS' so Linux and Windows legs (which do not exhibit this symptom) are unaffected. Pre-cache repair line is intentionally kept so the in-step 'cargo --version' smoke check still fails fast on a broken runner-image proxy.
This was referenced May 15, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Release pipeline #97 (v0.5.98) failed on the aarch64-apple-darwin matrix leg with:
```
error: unexpected argument 'build' found
Usage: rustup-init[EXE] [OPTIONS]
Stack backtrace: rustup_init::run_rustup_inner ...
```
The Linux leg passed for the first time since v0.5.96 — confirming PR #251 fixed the "Show binary sizes" regression and we are now hitting a different failure mode on macOS specifically.
Diagnosis
PR #245 added a proxy-repair line (`rustup default "$(rustup show active-toolchain | awk '{print $1}')"`) inside the Install pinned nightly step. That line works — the in-step `cargo --version` smoke check passes — but the very next step, `Swatinem/rust-cache`, restores `~/.cargo/bin/` from the cache. On `macos-latest` that cache holds a poisoned `cargo` symlink pointing at `rustup-init` (the installer) rather than the toolchain proxy. Restoring it overwrites the freshly-repaired proxy, and the next `cargo build` picks up the bad one and exits with the rustup-init backtrace before reaching any cargo subcommand.
Empirical proof from the failing job (`76219153458`):
So PR #245's repair is too early. It happens before the cache restore that re-poisons the proxy.
Fix
Add a second proxy-repair step after the cache-restore step and before the build step:
```yaml
- name: Repair rustup proxy after cache restore (macOS guard)
if: runner.os == 'macOS'
shell: bash
run: |
rustup default "$(rustup show active-toolchain | awk '{print $1}')"
cargo --version
```
Re-running `rustup default` rewrites the proxy binaries in `~/.cargo/bin` again, so the subsequent `cargo build` resolves to the active toolchain's real cargo even when the restored cache was poisoned. On the next successful run, `Swatinem/rust-cache` will save a clean cache and the poison clears itself.
Both `release.yml::build-release-binaries` and `release-cache-warm.yml::warm` get the same step (with matching commentary). Without the fix in the warm workflow, every macOS warm run would re-save the poisoned cache and perpetuate the regression.
Gated on `runner.os == 'macOS'` so the Linux and Windows legs — which do not exhibit this symptom — are unaffected.
Why not just disable `cache-bin` in Swatinem/rust-cache?
`Swatinem/rust-cache` has a `cache-bin: false` switch that would skip `~/.cargo/bin/` entirely and prevent the poisoning at its source. Trade-off: every cache miss would re-install `cargo-cyclonedx`, `cargo-deny`, `cargo-machete`, etc. The post-cache repair preserves that caching benefit and is surgical to the macOS rustup-proxy symptom.
Why not `gh cache delete` the macOS cache key right now?
The poisoned cache will be re-saved on the next macOS warm run regardless — without the workflow fix in this PR, deleting the cache is a one-shot bandage. This PR is needed in both cases. Once it merges, you can optionally evict the bad cache (`gh cache delete `) to skip one cycle of self-healing, but the workflow will heal on its own after one successful macOS run on either workflow.
Verification
Sequencing for the v0.5.98 ship
The currently-running pipeline #97 will finish failing the macOS leg (and skip the publish job). After this PR merges, re-dispatch the same v0.5.98 release pipeline manually — same tag, same artifacts, the macOS leg should now build cleanly. No version bump needed because v0.5.98 was never published.