Skip to content

docs: retrofit controller-crash + ship-readiness guides with real PR #10191 evidence#38

Merged
weicao merged 1 commit intomainfrom
alice/retrofit-real-chaos-evidence
May 3, 2026
Merged

docs: retrofit controller-crash + ship-readiness guides with real PR #10191 evidence#38
weicao merged 1 commit intomainfrom
alice/retrofit-real-chaos-evidence

Conversation

@weicao
Copy link
Copy Markdown
Contributor

@weicao weicao commented May 3, 2026

Summary

Allen's quality dashboard v2 flagged addon-controller-crash-resilience-guide.md and addon-ship-readiness-multi-phase-validation-guide.md as prose-heavy / 0 evidence / 0 code block. apecloud/kubeblocks PR #10191 just sealed and provides exactly the kind of substantive evidence both guides lacked: a controller-restart chaos gate that triggers on InstanceSet annotation transition (not log scrape), and a five-layer validation matrix that turned a clean 49-sample baseline into a four-contract-gap race finder.

This PR adds two engine-specific case appendices (Valkey / PR #10191) without touching the methodology bodies.

  • addon-controller-crash-resilience-guide.md: Valkey RebuildInstance cleanup-timing window appendix with the exact polling loop, before/after annotation snapshot pattern, and the four-property terminal verification list.
  • addon-ship-readiness-multi-phase-validation-guide.md: five-layer matrix appendix (10 + 4 + 35 + 1 + 35 = 84+ samples) showing how the last two layers exposed the four contract gaps that baseline N=10 did not. Includes trial-loop wrapper, forensics grep predicate, and PVC / PV invariant queries.

Both appendices add cross-refs to addon-pvc-rebind-via-workload-intent-guide.md and addon-design-contract-review-during-xp-guide.md so the four guides form a connected sub-graph on the RebuildInstance race fix.

Test plan

…d command-line examples

Allen's quality dashboard v2 flagged
addon-controller-crash-resilience-guide.md and
addon-ship-readiness-multi-phase-validation-guide.md as
prose-heavy / 0 evidence / 0 code block. Both guides predated the
RebuildInstance race fix work in apecloud/kubeblocks PR #10191,
which is now sealed and gives concrete cross-engine evidence:
controller restart inside a precise cleanup-timing window did
exercise the design assumption, and a five-layer validation matrix
beyond the baseline three-segment ship matrix did surface four
contract-level race gaps.

Two new case appendices, both engine-specific (Valkey / PR #10191)
following the project convention that engine details live in case
appendices and the body remains methodology:

- addon-controller-crash-resilience-guide.md: a Valkey RebuildInstance
  cleanup-timing-window appendix with the exact polling loop that
  triggers the chaos restart on InstanceSet annotation transition
  + pod Ready=False (not on a controller log scrape), the
  before/after annotation-snapshot capture pattern, and the
  four-property terminal verification list.

- addon-ship-readiness-multi-phase-validation-guide.md: a
  five-layer matrix appendix (baseline 10 + extension 4 +
  same-cluster dense initial 35 + chaos gate 1 + same-cluster
  dense follow-up 35) showing how the last two layers turned a
  clean 49-sample baseline into a 4-contract-gap race finder.
  Includes the trial-loop wrapper, the forensics grep predicate,
  and the PVC / PV invariant queries.

Both appendices add direct cross-refs to addon-pvc-rebind-via-
workload-intent-guide.md and addon-design-contract-review-during-xp-
guide.md so the four guides form a connected sub-graph on the
RebuildInstance race fix.

No changes to the guide bodies; appendices are additive only.
@weicao weicao merged commit a5e383d into main May 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant