feat: S3-backed cache-s3 siblings (golang/rust/node-pnpm) for Hetzner RGW #187
mike-ainsel wants to merge 4 commits into
…RGW)

Add -s3 sibling cache actions mirroring the existing GitHub-cache ones, following the turborepo/cache-s3 precedent. They restore/save toolchain caches to the Hetzner RadosGW (s3.hz.platforma.bio) via tespkg/actions-cache (pinned to v1.10.2 / e07e2d49) instead of the GitHub-hosted cache.

Purpose: on the hz self-hosted runners, drop the shared-hostPath node caches (a cross-job contamination surface) and let each job restore/save its own copy from RGW; node-local NVMe is left for the ephemeral per-job workdir only.

- Additive only; the existing golang/cache, rust/cache, node/cache-pnpm are untouched, so AWS-runner jobs keep using the GitHub-hosted cache.
- Identical key schemes to their siblings.
- use-fallback=true → resilient to a transient RGW outage.
- Linux only (the hz fleet is Linux/x86).
2d1d99d to
4a5c0ed
Compare
| runs: | ||
| using: "composite" |
retry-count may be silently ignored
retry is a documented input for tespkg/actions-cache, but retry-count does not appear in the published action interface. GitHub Actions silently discards unrecognised with: keys, so all three actions would retry with the action's default count (likely 1–3) rather than the intended value of 3. If you want a deterministic retry count, verify this input is supported at the pinned commit; otherwise remove it to avoid misleading configuration. The same pattern appears in rust/cache-s3 and node/cache-pnpm-s3.
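One way to carry out the suggested verification is a throwaway workflow step that downloads the action's metadata at the pinned commit and looks for the input name. This is only a sketch under assumptions: it presumes the metadata file is named `action.yml` at the repository root (it could equally be `action.yaml`), and the step name and local file name are made up for illustration.

```yaml
# Hypothetical one-off step; not part of this PR.
- name: Check whether retry-count is a declared input
  shell: bash
  env:
    # Commit SHA copied from the `uses:` line under review.
    PINNED_SHA: e07e2d4953dc8c020d447363e5064e36d04f3cf9
  run: |
    set -euo pipefail
    # Assumption: the metadata file is action.yml at the repo root.
    url="https://raw.githubusercontent.com/tespkg/actions-cache/${PINNED_SHA}/action.yml"
    # Download first so a failed fetch stops the step instead of looking like "not declared".
    curl -fsSL "$url" -o action-metadata.yml
    if grep -qE '^[[:space:]]+retry-count:' action-metadata.yml; then
      echo "retry-count is declared at the pinned commit"
    else
      echo "retry-count is NOT declared at the pinned commit"
    fi
```

If the input turns out to be undeclared, dropping the `retry-count` line from all three actions keeps the configuration honest about what actually takes effect.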
| uses: tespkg/actions-cache@e07e2d4953dc8c020d447363e5064e36d04f3cf9 # v1.10.2 | ||
| with: | ||
| endpoint: ${{ inputs.endpoint }} | ||
| region: ${{ inputs.region }} | ||
| bucket: ${{ inputs.bucket }} | ||
| insecure: ${{ inputs.insecure }} | ||
| accessKey: ${{ inputs.access-key }} | ||
| secretKey: ${{ inputs.secret-key }} | ||
| use-fallback: ${{ inputs.use-fallback }} | ||
| retry: 'true' | ||
| retry-count: '3' | ||
| path: | | ||
| ~/.cache/go-build | ||
| ~/go/pkg/mod | ||
| key: ${{ runner.os }}-${{ runner.arch }}-cache-go-${{ inputs.cache-version }}-${{ hashFiles(inputs.cache-dependency-hashfiles-path) }} | ||
| restore-keys: | | ||
| ${{ runner.os }}-${{ runner.arch }}-cache-go-${{ inputs.cache-version }}- |
No save-always equivalent; cache not saved on job failure
The sibling golang/cache defaults cache-save-always: true, which passes save-always: true to actions/cache so the cache is written even when a later step fails. tespkg/actions-cache does not document a save-always input, so the S3 action only saves on clean-run success. On the hz fleet, where jobs are more likely to see flaky infra failures, this means a partially-built Go module cache is silently discarded and the next run has to start cold. Consider documenting this divergence in the action description, or adding a separate post-step / save-always-like mechanism if the action supports it.
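A save-always-like flow could be sketched at the workflow level by splitting restore and save into separate steps and guarding the save with `if: always()`. This assumes the pinned tespkg/actions-cache commit exposes `restore` and `save` sub-actions in the style of actions/cache; that is not confirmed here and should be checked before use. The `v1` cache-version segment, the `**/go.sum` hash pattern, the build command, and the expanded secret name `HZ_CI_CACHE_S3_SECRET_KEY` are illustrative placeholders.

```yaml
# Sketch only: assumes tespkg/actions-cache ships restore/ and save/ sub-actions
# at the pinned commit. Verify before relying on it.
steps:
  - name: Restore Go build cache from RGW
    uses: tespkg/actions-cache/restore@e07e2d4953dc8c020d447363e5064e36d04f3cf9 # v1.10.2
    with:
      endpoint: s3.hz.platforma.bio
      bucket: ci-actions-cache
      accessKey: ${{ secrets.HZ_CI_CACHE_S3_ACCESS_KEY }}
      secretKey: ${{ secrets.HZ_CI_CACHE_S3_SECRET_KEY }}
      path: ~/.cache/go-build
      key: ${{ runner.os }}-${{ runner.arch }}-cache-go-v1-${{ hashFiles('**/go.sum') }}
      restore-keys: |
        ${{ runner.os }}-${{ runner.arch }}-cache-go-v1-

  - name: Build and test
    run: go test ./...

  # always() makes this step run even when the build/test step above failed,
  # so a partially warmed cache is still uploaded.
  - name: Save Go build cache
    if: always()
    uses: tespkg/actions-cache/save@e07e2d4953dc8c020d447363e5064e36d04f3cf9 # v1.10.2
    with:
      endpoint: s3.hz.platforma.bio
      bucket: ci-actions-cache
      accessKey: ${{ secrets.HZ_CI_CACHE_S3_ACCESS_KEY }}
      secretKey: ${{ secrets.HZ_CI_CACHE_S3_SECRET_KEY }}
      path: ~/.cache/go-build
      key: ${{ runner.os }}-${{ runner.arch }}-cache-go-v1-${{ hashFiles('**/go.sum') }}
```

The split has to live in the calling workflow rather than inside the composite action, because steps of a composite action run inline at the point of the `uses:` call and a save placed there would fire before the build has produced anything.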
…ache

tespkg/actions-cache extracts with 'tar --keep-old-files' (unlike stock actions/cache, which overwrites), so the read-only (0444) Go module cache trips 'Cannot open: File exists'. Cache only ~/.cache/go-build (the compile-time win); modules repopulate from GOPROXY.
cache-backend=s3 routes the Go build cache to an S3/RGW bucket (tespkg) instead of the Azure-backed GitHub Actions cache, which self-hosted hz runners can't reach. Linux-only, go-build only (modules repopulate from GOPROXY). Default stays 'github' so other repos/runners are unchanged. prepare threads the new inputs through and pins golang/cache to this branch (folds to @v4 on merge).
…le to prepare Consolidate on the golang/cache-s3 sibling action for the s3/RGW Go cache (consistent with node/cache-pnpm-s3 + rust/cache-s3). golang/cache returns to its original GitHub actions/cache form; golang/prepare gains cache-enabled (default true) so a caller can skip the built-in GitHub cache and use the golang/cache-s3 sibling instead (no double-cache).
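A caller following this consolidation might look roughly like the sketch below. It is an illustration, not a tested workflow: the runner labels, the `golang/prepare` path (inferred by analogy with the other action paths in this PR), the build command, and the full secret name `HZ_CI_CACHE_S3_SECRET_KEY` are assumptions, and `golang/prepare` may require further inputs that are not shown.

```yaml
# Hypothetical caller workflow for an hz self-hosted runner.
jobs:
  build:
    runs-on: [self-hosted, hz] # assumed labels
    steps:
      - uses: actions/checkout@v4

      # Skip the built-in GitHub cache so the Go cache is not stored twice.
      - uses: milaboratory/github-ci/actions/golang/prepare@v4
        with:
          cache-enabled: 'false'

      # Restore/save the Go build cache via the S3/RGW sibling instead.
      - uses: milaboratory/github-ci/actions/golang/cache-s3@v4
        with:
          access-key: ${{ secrets.HZ_CI_CACHE_S3_ACCESS_KEY }}
          secret-key: ${{ secrets.HZ_CI_CACHE_S3_SECRET_KEY }}

      - run: go build ./...
```

Leaving `cache-enabled` at its default of true keeps the previous behaviour, so repositories that do not opt in are unaffected.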
Adds -s3 sibling cache actions for the hz self-hosted runner fleet, following the existing turborepo/cache-s3 precedent.

What
- actions/golang/cache-s3
- actions/rust/cache-s3
- actions/node/cache-pnpm-s3

Each mirrors its GitHub-cache sibling (same key scheme) but uses tespkg/actions-cache (pinned to e07e2d49 = v1.10.2) against an S3 endpoint, defaulting to the Hetzner RadosGW s3.hz.platforma.bio (split-DNS → in-cluster RGW on hz, public elsewhere), bucket ci-actions-cache.

Why
On hz runners we are dropping the shared-hostPath toolchain caches (a cross-job contamination surface) and letting each job restore/save its own copy from RGW; node-local NVMe is left purely for the ephemeral per-job workdir. Caches now live in object storage exactly like the turbo cache.

Safety
- golang/cache, rust/cache, and node/cache-pnpm are untouched, so AWS-runner jobs are unaffected.
- use-fallback: true → falls back to the GitHub-hosted cache if RGW is briefly unreachable.
- … HZ_CI_CACHE_S3_ACCESS_KEY/_SECRET_KEY), never baked.

Consumption (next, in the per-repo migration PRs)
hz jobs call e.g. milaboratory/github-ci/actions/golang/cache-s3@v4 with the org secrets; on rl8, rust/cache-s3 takes cargo-home: /opt/rust/cargo (the image bakes CARGO_HOME there).

Greptile Summary
Adds three additive cache-s3 composite actions (golang/cache-s3, rust/cache-s3, node/cache-pnpm-s3) that mirror their GitHub-cache siblings but route cache I/O through the Hetzner RadosGW via tespkg/actions-cache (pinned by full commit SHA). Existing actions and AWS-runner jobs are completely untouched.
- … use-fallback: true) works seamlessly.
- rust/cache-s3 adds a cargo-home input (default ~/.cargo) to handle the hz-rl8 runner image where CARGO_HOME=/opt/rust/cargo is baked in.
- insecure defaults to false (TLS on).

Confidence Score: 4/5
All three actions are additive-only; no existing workflows are touched and the changes are straightforward YAML wrappers around a pinned third-party action.
The actions are well-structured mirrors of their siblings. Two observations temper a clean bill of health: retry-count is not a documented input of tespkg/actions-cache and is likely silently ignored, and golang/cache-s3 drops the save-always guarantee present in golang/cache so Go caches will not be written when a job fails mid-run.
All three files share the same retry-count concern; actions/golang/cache-s3/action.yaml additionally warrants a second look for the missing save-on-failure behaviour.
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Job as hz Runner Job
    participant Action as cache-s3 Action
    participant RGW as Hetzner RadosGW
    participant GHCache as GitHub-hosted Cache
    Job->>Action: invoke with access-key / secret-key
    Action->>RGW: restore cache (key lookup)
    alt RGW reachable
        RGW-->>Action: cache hit / miss
    else RGW unreachable and use-fallback true
        Action->>GHCache: restore cache (same key)
        GHCache-->>Action: cache hit / miss
    end
    Action-->>Job: cache restored (or cold build)
    Note over Job: build / test steps run
    alt Job succeeds
        Job->>Action: post-step: save cache
        Action->>RGW: upload cache artifact
        RGW-->>Action: saved
    else Job fails
        Note over Action: cache NOT saved to RGW
    end
Reviews (1): Last reviewed commit: "feat: S3-backed cache-s3 siblings for go..."