refresh-dwarf bot: runtime-verify each generated JSON via cephadm #129
Open
taodd wants to merge 2 commits into
The refresh bot previously only proved a generated JSON parses and links into the embedded header -- never that the tools actually trace meaningfully through it. Add a parallel per-version runtime verification:

- Split the workflow into three jobs: generate -> verify (matrix) -> open-pr.
- verify fans out one runner per generated version, rebuilds osdtrace + radostrace with the new JSON embedded, provisions a single-host cephadm cluster on quay.io/ceph/ceph:v<version> (whose ceph-osd build_id matches the el9 RPM the JSON was extracted from), drives an S3 workload, and traces a live OSD + radosgw through the EMBEDDED path.
- open-pr includes only versions that passed; failures are dropped from the PR and listed for retry next run.

functional-test-cephadm-rgw.sh gains two opt-in knobs (existing matrix behaviour unchanged when unset):

- CEPH_IMAGE: pin an exact point-release image instead of the per-major latest.
- REQUIRE_EMBEDDED=1: make the 'Using embedded DWARF data' marker mandatory, so a silent fall-back to live DWARF parsing fails the test.
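As a rough illustration of how the two opt-in knobs could behave, here is a minimal sketch. It is not the code in functional-test-cephadm-rgw.sh: the helper names check_embedded_marker and resolve_image, the log-file argument, and the default image passed in by the caller are all hypothetical. Only the CEPH_IMAGE and REQUIRE_EMBEDDED variables and the 'Using embedded DWARF data' marker string come from the commit message above.

```shell
# Hypothetical helper: pick the container image to deploy.
# CEPH_IMAGE (if set) wins; otherwise use the caller-supplied default.
resolve_image() {
    local default_image="$1"
    printf '%s\n' "${CEPH_IMAGE:-$default_image}"
}

# Hypothetical helper: inspect a tracer log for the embedded-DWARF marker.
# With REQUIRE_EMBEDDED=1 a missing marker is a hard failure; otherwise
# it is only reported, so existing matrix behaviour stays unchanged.
check_embedded_marker() {
    local log="$1"
    if grep -qF 'Using embedded DWARF data' "$log"; then
        echo "embedded DWARF path engaged"
        return 0
    fi
    if [ "${REQUIRE_EMBEDDED:-0}" = "1" ]; then
        echo "ERROR: marker missing; tool fell back to live DWARF parsing" >&2
        return 1
    fi
    echo "marker not found (tolerated: REQUIRE_EMBEDDED is not 1)"
    return 0
}
```

Keeping both knobs as "unset means old behaviour" lets the same script serve the existing PR matrix and the stricter refresh-bot verification without branching on the caller.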
actions/upload-artifact@v4 rejects ':' in file paths, and the JSON filenames embed the package epoch (osd-2:19.2.2-0.el9_dwarf.json), so the raw-file upload failed. Bundle the new JSONs into a colon-free tarball (alongside the manifest TSVs) in generate, and untar in the verify and open-pr jobs. No other logic change.
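The bundling step can be sketched roughly as follows. This is an illustrative stand-in, not the workflow's actual commands: the directory layout, the archive name new-dwarf-json.tar.gz, and the manifest file name are assumptions. The point is only that the colon lives in a member name inside the archive, while the path handed to the artifact upload is colon-free.

```shell
set -euo pipefail

# Hypothetical staging area standing in for the generate job's output.
workdir="$(mktemp -d)"
mkdir -p "$workdir/new-json" "$workdir/unpacked"
printf '{}\n' > "$workdir/new-json/osd-2:19.2.2-0.el9_dwarf.json"
printf 'osd\t19.2.2\n' > "$workdir/new-json/manifest.tsv"

# generate job: pack the JSONs and manifest into one archive whose own
# file name contains no ':' and can therefore be uploaded as an artifact.
tar -C "$workdir/new-json" -czf "$workdir/new-dwarf-json.tar.gz" .

# verify / open-pr jobs: unpack after downloading the artifact; the
# original epoch-bearing file names are restored from the archive.
tar -C "$workdir/unpacked" -xzf "$workdir/new-dwarf-json.tar.gz"
ls "$workdir/unpacked"
```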
Summary
The embedded-DWARF refresh bot previously only proved a generated JSON parses and links into the embedded header — never that osdtrace/radostrace actually trace meaningfully through it. This adds per-version runtime trace verification that exercises the embedded path against a real cluster of the matching version.
What changed
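Workflow split into three jobs, generate -> verify (matrix) -> open-pr (.github/workflows/refresh-embedded-dwarf.yaml):
- verify fans out one runner per generated version, rebuilds osdtrace + radostrace with the new JSON embedded, provisions a single-host cephadm cluster on quay.io/ceph/ceph:v<version>, drives an S3 workload, and traces a live OSD + radosgw through the embedded path. REQUIRE_EMBEDDED=1 makes the 'Using embedded DWARF data' marker mandatory, so a silent fall-back to live DWARF parsing fails the cell.
- open-pr includes only versions that passed; failures are dropped from the PR and listed for retry next run.
tests/functional-test-cephadm-rgw.sh gains two opt-in knobs (existing PR matrix behavior unchanged when unset):
- CEPH_IMAGE: pin an exact point-release image instead of the per-major latest.
- REQUIRE_EMBEDDED=1: make the embedded-DWARF boot marker mandatory.
Why a cephadm cluster of the exact version works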
Verified empirically: for tagged point releases the quay.io image ships the same binary as the el9 RPM the JSON is generated from. The ceph-osd build_id for v19.2.2 matched the el9 RPM's build_id exactly (702d13c4…). So the embedded JSON matches by build_id and the embedded path genuinely engages.
Test plan
- bash -n + actionlint clean.
- workflow_dispatch run to validate the three-job flow end-to-end (spins up one cephadm cluster per generated version).
🤖 Generated with Claude Code
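For anyone wanting to repeat the build_id comparison described under "Why a cephadm cluster of the exact version works", the following is one possible sketch. It assumes the build ID is read with readelf -n from binutils; the helper names build_id_from_notes and same_build_id are hypothetical, and the PR does not say which tool was actually used.

```shell
# Hypothetical helper: read `readelf -n` output on stdin and print the
# hex string from its "Build ID:" line.
build_id_from_notes() {
    awk '/Build ID:/ { print $NF; exit }'
}

# Hypothetical helper: succeed only if both IDs are non-empty and equal,
# ignoring hex letter case.
same_build_id() {
    local a b
    a="$(printf '%s' "$1" | tr 'A-F' 'a-f')"
    b="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

A typical use would be to run readelf -n on the ceph-osd binary from the container image and on the one unpacked from the el9 RPM, pipe each through build_id_from_notes, and pass the two results to same_build_id. The binary paths depend on how each is extracted and are left to the reader.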