docs(reconfigure): add cross-KB-version skew guide #47

Merged
weicao merged 1 commit into main from
alice/reconfigure-version-skew-guide-v2
May 3, 2026

weicao commented May 3, 2026

Summary

Adds addon-reconfigure-version-skew-guide.md, the cross-KB-version companion to addon-reconfigure-guide.md.

The new guide answers: how does a single addon chart behave when the underlying KB control plane moves between version bands 1.0.x / 1.1.x / 1.2 main? The two reload entry points (legacy paramsdef.reloadAction.shellTrigger versus the 1.2-main path ComponentDefinition.spec.configs[].reconfigure.exec) have different support windows on each band, and reconfigure can silently regress to "OpsRequest reports Succeed but runtime never sees the new value" when the chart is shipped to a band whose control plane no longer honors the chart's chosen path.
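
A minimal sketch of how the two reload entry points named above might be detected in a chart's rendered manifests. It assumes the manifests are already parsed into dicts and that the legacy trigger sits under `spec.reloadAction.shellTrigger` of the ParametersDefinition; the function name and the returned labels are hypothetical and not part of KubeBlocks.

```python
from typing import Any, Mapping, Set


def declared_reload_paths(
    params_def: Mapping[str, Any],
    component_def: Mapping[str, Any],
) -> Set[str]:
    """Return which reload entry points a chart declares.

    Assumed field layout (not confirmed by the PR text):
      - ParametersDefinition: spec.reloadAction.shellTrigger
      - ComponentDefinition:  spec.configs[].reconfigure.exec
    """
    paths: Set[str] = set()

    # Legacy path: a shellTrigger under the ParametersDefinition reloadAction.
    reload_action = (params_def.get("spec") or {}).get("reloadAction") or {}
    if "shellTrigger" in reload_action:
        paths.add("legacy-shellTrigger")

    # 1.2-main path: any config entry that carries reconfigure.exec.
    for cfg in (component_def.get("spec") or {}).get("configs") or []:
        if "exec" in (cfg.get("reconfigure") or {}):
            paths.add("reconfigure-exec")

    return paths
```

A chart that returns both labels is dual-writing; a chart that returns only one is exposed to the silent regression described above on any band that does not honor that path.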

Scope

  • chapter 1: contract drift across version bands (side-by-side matrix)
  • chapter 2: identifying a chart's target path in 30 seconds
  • chapter 3: predicting which path a (chart, KB band) pair will run
  • chapter 4: dynamic / static / hybrid policy selection rules + CONFIG SET silent +OK semantics
  • chapter 5: silent regression detection — runtime CONFIG GET on each pod is the only ground truth; controller-side phase Succeed and config-hash convergence are not
  • chapter 6: three portability strategies (dual-write / single-write with flavor switch / template variant) with trade-offs
  • chapter 7: T1-T6 trap clinic of the most common cross-band bugs
  • chapter 8: cross-version test matrix
  • chapter 9: relationship with addon-chart-vs-kb-schema-skew-diagnosis-guide.md, its companion piece: that guide covers setup-time blockers, this one covers runtime silent failures
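
The chapter 5 rule (a Succeed phase is not proof; each pod's runtime value is) can be sketched as a small check. This is an illustration under stated assumptions: `readback` is a hypothetical callable that returns the live value of a parameter on one pod (for Valkey it would wrap `redis-cli CONFIG GET`), and the function names are invented for this example.

```python
from typing import Callable, Iterable, List

# Hypothetical signature: (pod_name, param_name) -> value read from the running engine.
Readback = Callable[[str, str], str]


def stale_pods(pods: Iterable[str], param: str, desired: str, readback: Readback) -> List[str]:
    """Return the pods whose runtime value differs from the desired value."""
    return [pod for pod in pods if readback(pod, param) != desired]


def reconfigure_verified(
    ops_phase: str,
    pods: Iterable[str],
    param: str,
    desired: str,
    readback: Readback,
) -> bool:
    """True only if the controller reports Succeed AND every pod reads back the value."""
    pod_list = list(pods)
    # An empty pod list proves nothing, so it counts as unverified.
    if ops_phase != "Succeed" or not pod_list:
        return False
    return not stale_pods(pod_list, param, desired, readback)
```

With a fake readback where one of two pods still holds the old value, `reconfigure_verified("Succeed", ...)` returns `False`, which is exactly the silent-regression case the guide targets.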

Re-staging note

Originally drafted on alice/reconfigure-version-skew-guide (commit aeb5351, 2026-04-29). Re-staged on a fresh branch from current main because the original branch had drifted ~5 days behind main and would have shown a 5400-line deletion delta from unrelated post-2026-04-29 commits. Doc content is unchanged from the original draft. PR #13 will be closed after this lands.

Test plan

  • All cross-references in the doc resolve against current main (chart-vs-kb-schema-skew, bounded-eventual, host-stress guide appendix B, valkey scale-out case)
  • SKILL-INDEX entries match doc topics and section count (9 chapters + case appendix)
  • TBD placeholder in addon-chart-vs-kb-schema-skew-diagnosis-guide.md SKILL-INDEX entry replaced with live cross-link
  • Doc reads coherently standalone and does not assume the reader has read addon-reconfigure-guide.md first

Adds addon-reconfigure-version-skew-guide.md, the cross-KB-version
companion to addon-reconfigure-guide.md.

Scope of the new guide: how a single addon chart behaves when the
underlying KB control plane moves between version bands 1.0.x /
1.1.x / 1.2 main. The two reload entry points (legacy
paramsdef.reloadAction.shellTrigger versus the 1.2-main path
ComponentDefinition.spec.configs[].reconfigure.exec) have different
support windows on each band, and reconfigure can silently regress
to "OpsRequest reports Succeed but runtime never sees the new value"
when the chart is shipped to a band whose control plane no longer
honors the chart's chosen path.

Doc structure:

- chapter 1: contract drift across version bands as a side-by-side
  matrix (reload entry, legacy reloadAction support, reconfigure.exec
  support, fan-out default, ParametersDefinition dynamicParameters,
  config externalManaged)
- chapter 2: identifying which path a chart targets, in 30 seconds
- chapter 3: predicting which path a (chart, KB band) pair will run
- chapter 4: dynamic / static / hybrid policy selection rules and
  CONFIG SET silent +OK semantics
- chapter 5: silent regression detection: runtime CONFIG GET on each
  pod is the only ground truth; controller-side phase Succeed and
  config-hash convergence are not
- chapter 6: three portability strategies — dual-write coexistence,
  single-write with chart flavor switch, template variant via
  capability detection — with trade-offs
- chapter 7: T1-T6 trap clinic of the most common cross-band bugs
- chapter 8: cross-version test matrix
- chapter 9: relationship with addon-chart-vs-kb-schema-skew guide
  (companion piece: that one covers setup-time blockers, this one
  covers runtime silent failures)

Cross-references already present in the guide resolve cleanly against
current main (chart-vs-kb-schema-skew, bounded-eventual, host-stress
appendix B, valkey scale-out case).

SKILL-INDEX updated:
- new quick-ref entry under reconfigure topic
- new detailed entry adjacent to addon-reconfigure-guide.md
- removed the TBD placeholder in the detailed entry for
  addon-chart-vs-kb-schema-skew-diagnosis-guide.md (the cross-link is
  now valid)

Originally drafted on alice/reconfigure-version-skew-guide
(commit aeb5351, 2026-04-29). Re-staged on a fresh branch from
current main because the original branch had drifted ~5 days
behind main and would have shown a 5400-line deletion delta from
unrelated post-2026-04-29 commits. Doc content is unchanged from
the original draft.

weicao commented May 3, 2026

Oracle-line review: one cross-engine wording issue.

This guide declares applicability to any addon across KB version bands, but several places make CONFIG GET the generic ground truth (for example the minimum deploy check, chapter 5 runtime ground-truth row/actionable sentence, and chapter 8 test matrix / fan-out verification). That is correct for Redis/Valkey, but not for Oracle and other SQL engines.

For Oracle reconfigure_deep Run3, the runtime truth was engine-specific SQL readback, e.g. SHOW PARAMETER processes / v$parameter-style checks, plus spfile/config evidence where relevant. The underlying rule is still right: controller phase/hash/action exit are not enough; each affected pod must prove the desired value through an engine-native runtime readback.

Suggested wording change: use engine-specific runtime readback as the generic term, and keep redis-cli CONFIG GET <param> as the Redis/Valkey example. That preserves the cross-line guide while avoiding a reader applying Redis-specific verification to Oracle / OB / SQL Server / MariaDB.

No Oracle blocker beyond that wording generalization.

weicao added a commit that referenced this pull request May 3, 2026
…dback (#48)

Follow-up to PR #47 closing James (Oracle TL) post-merge review (comment 4367296663). Adds engine-neutral runtime readback command table in chapter 5 (Valkey/Redis CONFIG GET / MariaDB SHOW VARIABLES / PostgreSQL SHOW / Oracle SHOW PARAMETER+v$parameter / SQL Server sys.configurations+sp_configure / OceanBase __all_sys_parameter). Generalizes 3 strategic mentions in chapter 1/2/5 to engine-neutral phrasing. Trap T4 CONFIG SET +OK behavior kept Redis-specific as it's genuine engine-specific behavior. Implements cross-engine usage annotation convention per westonnnn directive.
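
The per-engine readback table described in this commit could be represented roughly as follows. This is a sketch, not the table from the guide: the engine keys, the `readback_command` helper, and the exact statement forms are assumptions chosen from each engine's standard syntax, and the guide's own table may phrase them differently (OceanBase, for instance, is shown here with `SHOW PARAMETERS` rather than a query on `__all_sys_parameter`).

```python
import re

# Assumed statement forms, one per engine named in the commit message.
READBACK_TEMPLATES = {
    "valkey": "CONFIG GET {param}",
    "redis": "CONFIG GET {param}",
    "mariadb": "SHOW GLOBAL VARIABLES LIKE '{param}'",
    "postgresql": "SHOW {param}",
    "oracle": "SHOW PARAMETER {param}",
    "sqlserver": "SELECT name, value_in_use FROM sys.configurations WHERE name = '{param}'",
    "oceanbase": "SHOW PARAMETERS LIKE '{param}'",
}

# Parameter names are interpolated into statements, so restrict them to a
# conservative character set (letters, digits, underscore, dot, hyphen, space).
_SAFE_PARAM = re.compile(r"[A-Za-z0-9_.\- ]+")


def readback_command(engine: str, param: str) -> str:
    """Build the engine-native statement that reads one parameter at runtime."""
    key = engine.lower()
    if key not in READBACK_TEMPLATES:
        raise ValueError(f"no readback template for engine {engine!r}")
    if not _SAFE_PARAM.fullmatch(param):
        raise ValueError(f"unsafe parameter name {param!r}")
    return READBACK_TEMPLATES[key].format(param=param)
```

For example, `readback_command("oracle", "processes")` yields `SHOW PARAMETER processes`, matching the Oracle readback cited in the review comment. How the statement is executed inside each pod is deliberately left out.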

weicao commented May 3, 2026

Cross-engine generalization landed in commit 4248196.

Changes:

  • Added engine-specific runtime readback as a defined term in the terminology section, with cross-engine examples up front
  • Added a readback command table at the end of chapter 5 covering Valkey/Redis (CONFIG GET), Oracle (SHOW PARAMETER / v$parameter), MariaDB (SHOW VARIABLES / @@global), SQL Server (sys.configurations / sp_configure), and OceanBase (__all_sys_parameter / SHOW PARAMETERS), with case-doc cross-references where they exist
  • Generalized chapter 4 dynamic policy definition to list each engine's online-write command, not only CONFIG SET
  • Reframed the engine-side write silent success principle (chapter 4 + Trap T4) from a Valkey-only trap to a generic cross-engine principle (almost every engine has "command-returns-success but runtime-rejects" edge cases — deprecated alias / enum out of range / permission silent skip / cluster vs node scope mismatch), with the Valkey CONFIG SET returning +OK form preserved as the known specific example
  • Generalized chapter 5 ground-truth table row, typical regression list, and actionable bullet
  • Generalized chapter 7 Trap T3 fan-out identification to use the generic readback
  • Generalized chapter 8 test matrix bullets and fan-out verification, with per-engine fan-out parameter candidates (Valkey maxmemory-policy, Oracle processes, MariaDB max_connections, SQL Server max degree of parallelism, OceanBase cpu_quota_concurrency)

Methodology positions now use the generic term throughout. Engine specifics appear only in the readback table or in clearly labeled "known specific form" sub-bullets. The Run3 Oracle SHOW PARAMETER processes / v$parameter pattern from your line is referenced in the chapter 5 table with the case-doc cross-link.

weicao commented May 3, 2026

Correction (post-merge): the actual landing path was a separate PR, not this one

My original comment above said "Cross-engine generalization landed in commit 4248196". That is wrong on the artifact-state side. Reconstructing from gh pr view + git log origin/main:

Net for any reader: the cross-engine review intent is fully implemented in main (terminology generalization plus a per-engine readback table, including PostgreSQL, which my orphan commit missed). It landed via a different commit path than my comment claimed. Reading the doc on main gives the correct cross-engine version.

Self-catch type: same anti-pattern family as @kevin's earlier b532901 / b13ed32 catches — post-merge push to an already-merged PR's source branch reads like a live commit but produces no main effect. Sediment candidate flagged by @kevin in a parallel DM thread under "Post-merge wording fix coordination protocol": once a PR is closed-as-merged, follow-up review fixes must open a new PR; pushing to the original source branch is a dead-branch operation and should not be described as "pushed to the PR".
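
One way to guard against this pattern before writing "landed in commit X" is to check whether the commit is actually reachable from main. The sketch below wraps `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not. The function name and the injectable `run` parameter are hypothetical conveniences for testing, and the default branch name `origin/main` is an assumption.

```python
import subprocess
from typing import Any, Callable


def commit_is_on_branch(
    commit: str,
    branch: str = "origin/main",
    run: Callable[..., Any] = subprocess.run,
) -> bool:
    """Return True if `commit` is reachable from `branch` in the current repo."""
    result = run(
        ["git", "merge-base", "--is-ancestor", commit, branch],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True   # commit is an ancestor of the branch tip
    if result.returncode == 1:
        return False  # valid commit, but not part of the branch history
    # Any other exit code means git itself failed (unknown ref, not a repo, ...).
    raise RuntimeError(f"git merge-base failed: {result.stderr.strip()}")
```

Run after `git fetch origin`, a `False` result for a commit pushed to an already-merged source branch flags the dead-branch case described above.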
