feat!(cozystack): wizard chain + 2-plugin consolidation (breaking) #11
Draft
lexfrei wants to merge 38 commits into main from feat/cozystack-install-skill

Commits (38)
c0bec97 (lexfrei) feat(skills): add cozystack-install skill
a9c4de0 (lexfrei) refactor!: unify five cozystack skills into one cozystack plugin
13524f4 (lexfrei) refactor(cozystack): clarify skill names with cluster-/package- prefixes
52bedf8 (lexfrei) refactor(linstor): rename drbd-recovery plugin to linstor:recover
46288e1 (lexfrei) feat(cluster-install): early storage discovery + LVM thin pool provis…
78f288c (lexfrei) feat(cozystack): wizard orchestrator + k3s-bootstrap + linux-prep + t…
53c2df7 (lexfrei) fix(cluster-install): break OIDC chicken-and-egg + domain ownership gate
b911f3b (lexfrei) feat(cluster-install): extractedprism enabled by default for generic …
ff0a4b3 (lexfrei) refactor(cozystack): 3-route wizard + ubuntu-bootstrap ansible wrappe…
ebd33b3 (lexfrei) feat(cozystack): sops opt-in for secret files in cluster config dir
653c4cc (lexfrei) chore(cozystack): bump plugin version 1.2.0 → 1.3.0 (sops opt-in)
ec37d54 (lexfrei) feat(cozystack): debug skill — investigate, classify, fix, draft upst…
fc37154 (lexfrei) feat(cozystack): wizard free-form intro + match operator language acr…
8956f55 (lexfrei) feat(cozystack): one-path principle + talos-bootstrap actually runs talm
1256237 (lexfrei) feat(cozystack): front-load every interview into one consolidated intake
1c18ea1 (lexfrei) fix(cozystack): layer-pure operator output — no wizard mentions in sk…
98c2f99 (lexfrei) fix(cozystack): retrospective-driven fixes from /tmp/stage-dev17 run
bb75653 (lexfrei) fix(cozystack): branch-review fixes — cross-refs, --context disciplin…
b6802db (lexfrei) chore(review): address review-2 blockers
e5f4c06 (lexfrei) fix(review): land round-3 blockers
19a7ba9 (lexfrei) fix(review): extractedprism endpoints schema + layer-pure refusal text
8e649fe (lexfrei) fix(review): inline sops-missing recovery options in cluster-install
4870e5d (lexfrei) fix(review): drop stale LinstorSatelliteConfiguration references
1b46e36 (lexfrei) feat(wizard): full chain front-load + state contract enforcement
dd7a280 (lexfrei) fix(talos-bootstrap): apply learnings from a real install run
3266a4f (lexfrei) fix(cluster-install): NAT-aware publishing, inline tenant patch, real…
965af6f (lexfrei) fix(cluster-install): real Talos storage path + actually-implemented …
eb87b14 (lexfrei) docs(cluster-install): add provider-pitfalls reference + bump 1.12.0
3331445 (lexfrei) fix(review): address post-batch-B blockers
7f426fe (lexfrei) fix(cluster-install): doc-structure fixes from review
f805c4e (lexfrei) feat(wizard): seed operator-contributed case-studies knowledge base +…
9852abd (lexfrei) Revert "feat(wizard): seed operator-contributed case-studies knowledg…
49f3e0c (lexfrei) feat(wizard): Phase 4.5 active research — runtime, skeptical, source-…
339fd53 (lexfrei) fix(talos-bootstrap): cert-SAN trap guardrail + Talos 1.12 probe + mu…
addb314 (lexfrei) fix(cluster-install,talos-bootstrap): auto-upgrade to tuned + OCI ver…
3a09106 (lexfrei) feat(wizard,cluster-install): CIDRs from source + variant split + bat…
757d91f (lexfrei) feat(talos-reset): new skill for cloud-provider terminate+relaunch re…
0f12037 (lexfrei) fix(review): platform_variant enum + CIDR contradiction + field-confu…
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| name: validate | ||
|
|
||
| on: | ||
| push: | ||
| branches: [main] | ||
| pull_request: | ||
|
|
||
| jobs: | ||
| jq: | ||
| name: jq lint manifests | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: Validate marketplace.json | ||
| run: jq . .claude-plugin/marketplace.json > /dev/null | ||
| - name: Validate every plugin.json | ||
| run: | | ||
| set -euo pipefail | ||
| fail=0 | ||
| while IFS= read -r f; do | ||
| if ! jq . "$f" > /dev/null; then | ||
| echo "FAIL: $f is not valid JSON" >&2 | ||
| fail=1 | ||
| fi | ||
| done < <(find plugins -name plugin.json -type f) | ||
| exit "$fail" | ||
|
|
||
| cross-refs: | ||
| name: cross-reference validator | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: tools/check-refs.sh | ||
| run: bash tools/check-refs.sh |
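
The "Validate every plugin.json" step above can be reproduced locally before pushing. The following is a minimal sketch in Python rather than the workflow's own bash; `find_invalid_manifests` is a hypothetical helper name, not something this PR adds, and `json.loads` may be stricter than `jq .` (for example, it rejects an empty file).

```python
import json
from pathlib import Path


def find_invalid_manifests(root: Path) -> list[Path]:
    """Return every plugin.json under `root` that does not parse as JSON.

    Hypothetical local counterpart of the workflow's find + jq loop.
    """
    invalid = []
    for manifest in sorted(root.rglob("plugin.json")):
        if not manifest.is_file():
            continue  # skip directories that happen to be named plugin.json
        try:
            json.loads(manifest.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            # Same message shape as the workflow step, with the parse error appended.
            print(f"FAIL: {manifest} is not valid JSON ({err})")
            invalid.append(manifest)
    return invalid
```

Calling `find_invalid_manifests(Path("plugins"))` from the repository root and exiting non-zero when the returned list is non-empty would approximate the CI job's `exit "$fail"` behaviour.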
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -14,20 +14,80 @@ Add the marketplace: | ||
| Install a plugin: | ||
|
|
||
| ```text | ||
| /plugin install <plugin-name>@cozystack-claude-plugins | ||
| /plugin install cozystack@cozystack-claude-plugins | ||
| /plugin install linstor@cozystack-claude-plugins | ||
| ``` | ||
|
|
||
| ## Plugins | ||
|
|
||
| ### Skills | ||
| ### cozystack | ||
|
|
||
| | Plugin | Description | | ||
| Platform skills bundle. One install gives you nine skills, invoked as `/cozystack:<name>`. Start with `/cozystack:wizard` — it asks Talos / Ubuntu / Existing and picks the chain. | ||
|
|
||
| | Skill | Description | | ||
| | --- | --- | | ||
| | **/cozystack:wizard** | Entry point. Opens with a free-form "tell me about your setup and goal" question so context comes through in the operator's own words and pre-fills the structured questions. Then asks Talos / Ubuntu / Existing, builds a chain, dispatches downstream skills via a cluster config directory the operator picks. Artifacts (inventory, kubeconfig, state, platform-package YAML) all live there — operator manages git on their own. Every skill in the chain matches the operator's natural language. | | ||
| | **/cozystack:talos-bootstrap** | Bootstrap Talos nodes via talm. Default: probe nodes for maintenance mode (Talos-1.12-aware `get disks --insecure`); if ready, write per-node multidoc machine-config stubs with VIP-link static IPv4, run NAT-provider cert-SAN guardrail (auto-populates `values.yaml.certSANs` with public IPs before first `talm apply`), then `talm apply` + `talosctl bootstrap` + kubeconfig fetch + cozystack-tuned shape verification with auto-upgrade to tuned image when nodes booted from base Talos. Opt-in boot-method picker (OCI Custom Image / boot-to-talos / ISO / PXE) only when nodes aren't yet imaged. | | ||
| | **/cozystack:talos-reset** | Cloud-provider recovery helper when Talos nodes are unrecoverable from inside the cluster (cert-SAN trap, broken machine-config, lost talosconfig). Wraps `oci` / `aws` / `gcloud` / `hcloud` to terminate + relaunch from the cozystack-tuned image while preserving block volumes, secondary VNICs, NSG memberships. Sequential per-node to maintain etcd quorum. Hands off to `cozystack:talos-bootstrap` for re-bootstrap. | | ||
| | **/cozystack:ubuntu-bootstrap** | Bootstrap Ubuntu / Debian nodes by wrapping `cozystack/ansible-cozystack/examples/ubuntu/` — OS prep, drbd-dkms for Secure Boot, ZFS + KubeVirt modules, k3s install with cozystack-compatible flags, kubeconfig retrieval. Stops before Cozystack itself. | | ||
| | **/cozystack:cluster-install** | Cozystack on a ready cluster — node-readiness validation, variant picker, interactive values, per-node ZFS pool provisioning, extractedprism for kube-apiserver HA, cozy-installer chart, Platform Package apply, root Tenant ingress patch, wait until every HelmRelease is Ready, NOTES summary. | | ||
| | **/cozystack:debug** | Investigate a stuck or broken Cozystack install. Gathers symptoms, classifies (operator error / config drift / upstream bug / not-yet-supported), applies fixes or workarounds, drafts upstream issues with diagnostic bundle on approval. Never opens PRs or files silently. Auto-dispatched by the wizard when any chain step fails. | | ||
| | **/cozystack:cluster-upgrade** | Guided upgrade of a running Cozystack v1.x cluster — release-notes analysis, prechecks, stop gates, helm upgrade, targeted post-upgrade verification, known-failure recovery. | | ||
| | **/cozystack:package-deploy** | Deploy a single Cozystack package to a dev cluster via make + cozyhr — handles fresh install and dev-loop iteration with ExternalArtifact support. | | ||
| | **/cozystack:package-bump** | Bump a single package inside the cozystack monorepo — reads upstream changelog, adapts to breaking changes, regenerates schema, optionally deploys to a dev cluster. | | ||
| | **/cozystack:external-app-create** | Scaffold a new Cozystack external app package with dependency integration (managed CNPG Postgres, external secret references). | | ||
|
|
||
| Chains the wizard builds: | ||
|
|
||
| | Target | Chain | | ||
| | ----------- | ----------- | | ||
| | Bare-metal Talos | `talos-bootstrap` → `cluster-install` | | ||
| | Bare-metal Ubuntu / Debian | `ubuntu-bootstrap` → `cluster-install` | | ||
| | Existing Kubernetes (self-managed or managed) | `cluster-install` | | ||
| | Existing Cozystack | refuse → `cozystack:cluster-upgrade` | | ||
|
|
||
| ### linstor | ||
|
|
||
| LINSTOR / DRBD operations bundle. Useful on any Kubernetes cluster that runs piraeus-operator / LINSTOR, not just on Cozystack. | ||
|
|
||
| | Skill | Description | | ||
| | --- | --- | | ||
| | **cozy-deploy** | Deploy a Cozystack package to a dev cluster via make + cozyhr | | ||
| | **cozy-external-app** | Scaffold a new Cozystack external app package with dependency integration | | ||
| | **drbd-recovery** | Diagnose and recover DRBD/LINSTOR storage issues in Kubernetes clusters | | ||
| | **cozystack-upgrade** | Guided upgrade of a running Cozystack v1.x cluster to a newer v1.x patch or minor version | | ||
| | **cozy-bump** | Bump a cozystack monorepo package — reads upstream changelog, adapts to breaking changes, regenerates schema, optionally deploys to a dev cluster | | ||
| | **/linstor:recover** | Diagnose and recover broken DRBD resources — handles StandAlone, DELETING, Inconsistent, Diskless, quorum loss, bitmap errors, and other common failure modes. | | ||
|
|
||
| ## Third-party dependencies | ||
|
|
||
| `cozystack:cluster-install` default-installs [extractedprism](https://github.com/lexfrei/extractedprism) on `generic` variant clusters (k3s / kubeadm / RKE2). extractedprism is a per-node TCP load balancer that gives generic Linux Kubernetes the same `localhost:7445` kube-apiserver shape Talos has built-in (KubePrism), so Cilium and KubeOVN can dial a stable local address regardless of which control-plane node is up. | ||
|
|
||
| Project metadata: | ||
|
|
||
| - Source: `https://github.com/lexfrei/extractedprism` (BSD-3-Clause). | ||
| - Helm chart: `oci://ghcr.io/lexfrei/charts/extractedprism`. | ||
| - Maintained independently by a Cozystack contributor; reviewed and approved by the Cozystack platform team for use as the generic-variant HA proxy. | ||
|
|
||
| Operators can opt out with `--no-extractedprism` and supply their own `--api-host=<ip>` (external LB, VIP, or single CP IP with the SPOF caveat) — see `cozystack:cluster-install` Phase 4. Talos and hosted variants do not need extractedprism. | ||
|
|
||
| ## Repository layout | ||
|
|
||
| ```text | ||
| plugins/ | ||
| cozystack/ # platform bundle (9 skills) | ||
| .claude-plugin/plugin.json | ||
| skills/ | ||
| wizard/ # entry point: interview + chain dispatcher | ||
| talos-bootstrap/ # Talos node prep | ||
| talos-reset/ # cloud-provider terminate+relaunch helper | ||
| ubuntu-bootstrap/ # Ubuntu/Debian via ansible-cozystack wrapper | ||
| cluster-install/ # Cozystack on a ready cluster | ||
| debug/ # investigate + classify + workaround + issue draft | ||
| cluster-upgrade/ # v1.x patch/minor upgrade | ||
| package-deploy/ # dev-loop deploy of a single package | ||
| package-bump/ # bump a monorepo package | ||
| external-app-create/ # scaffold a new external-apps package | ||
| linstor/ # storage bundle (1 skill) | ||
| .claude-plugin/plugin.json | ||
| skills/ | ||
| recover/ | ||
| ``` | ||
|
|
||
| ## License | ||
|
|
||
|
|
||
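
The "Chains the wizard builds" table in the README diff above is effectively a lookup from install target to an ordered list of skills. A minimal sketch of that mapping follows, assuming hypothetical target keys (`talos`, `ubuntu`, `existing-kubernetes`, `existing-cozystack`) and a hypothetical `build_chain` helper; the PR does not show how the wizard represents its routes internally.

```python
# Hypothetical target keys; the skill names are taken from the README table.
CHAINS: dict[str, list[str]] = {
    "talos": ["talos-bootstrap", "cluster-install"],
    "ubuntu": ["ubuntu-bootstrap", "cluster-install"],
    "existing-kubernetes": ["cluster-install"],
}


def build_chain(target: str) -> list[str]:
    """Return the ordered skill chain for a target, per the README table."""
    if target == "existing-cozystack":
        # The table's last row: refuse and point at the upgrade skill instead.
        raise ValueError(
            "refuse: cluster already runs Cozystack; use cozystack:cluster-upgrade"
        )
    try:
        return list(CHAINS[target])  # copy so callers cannot mutate the table
    except KeyError:
        raise ValueError(f"unknown target: {target!r}") from None
```

For example, `build_chain("ubuntu")` yields `["ubuntu-bootstrap", "cluster-install"]`, while `build_chain("existing-cozystack")` raises instead of returning a chain.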
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| { | ||
| "name": "cozystack", | ||
| "version": "1.14.1", | ||
| "description": "Cozystack platform skills bundle. Start with cozystack:wizard — it begins with a free-form 'tell me about your setup and goal' question, parses hints, then asks Talos / Ubuntu / Existing, builds a chain, and dispatches downstream skills via a cluster config directory the operator picks (every artifact lives there: inventory.yml, kubeconfig, .state.yaml, cozystack-platform-package.yaml — operator manages git on their own; optional sops opt-in encrypts secret files in-tree). Skills, invoked as cozystack:<name>: wizard (orchestrator + 3-route dispatcher + Phase 4.5 active research + auto-dispatches debug on any failed_at), talos-bootstrap (Talos node prep via talm — Talos-1.12-aware maintenance probe, NAT-provider cert-SAN guardrail before first talm apply, multidoc machine-config with per-node VIP-link IPv4 stubs, etcd bootstrap, kubeconfig fetch, cozystack-tuned shape verification with Phase 11.5 auto-upgrade), talos-reset (cloud-provider terminate+relaunch helper for OCI/AWS/GCP/Hetzner when nodes are unrecoverable from inside; preserves block volumes + secondary VNICs + NSG memberships), ubuntu-bootstrap (wraps cozystack/ansible-cozystack — OS prep + k3s install in one go), cluster-install (Cozystack on a ready cluster — node-readiness, ZFS pool provisioning via privileged DaemonSet on Talos with hostNetwork, extractedprism for kube-apiserver HA, OCI-tag-normalized cozy-installer chart, Platform Package, inline tenants/root ingress patch + LINSTOR pool registration during watch loop with combined HRs-Ready + pools-registered gate, Phase 8.6 default StorageClasses for v1.3.x, Phase 9.1 end-to-end reachability probe), debug (investigate a stuck or broken install — gathers symptoms, classifies operator error / config drift / upstream bug / not-yet-supported, applies fixes or workarounds, drafts upstream issues on approval; never opens PRs or files silently), cluster-upgrade (v1.x patch/minor upgrade with release-notes analysis), package-deploy (dev-loop deploy of a single package with ExternalArtifact support), package-bump (single-package version bump with changelog adaptation), external-app-create (scaffold a new external-apps package). All skills match the operator's natural language detected from conversation context — code identifiers, commands, file paths, and GitHub-public text stay canonical. All skills follow the same gate-and-confirm discipline: read-only lookups run freely; any mutation needs explicit per-step approval.", | ||
| "author": { | ||
| "name": "Cozystack", | ||
| "url": "https://github.com/cozystack" | ||
| } | ||
| } |
The description for the `cozystack` plugin states that it provides nine skills, but it actually includes ten. This should be updated for accuracy.
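
One way to catch this kind of drift is to derive the count from the directory layout and compare it with the number the documentation states. A minimal sketch, assuming each skill is a subdirectory of `plugins/<plugin>/skills/` as in the README's repository layout; `count_skills` and `stated_count` are hypothetical helpers, not part of this PR.

```python
import re
from pathlib import Path
from typing import Optional


def count_skills(plugin_dir: Path) -> int:
    """Count skill directories under <plugin_dir>/skills (0 if it is absent)."""
    skills = plugin_dir / "skills"
    if not skills.is_dir():
        return 0
    return sum(1 for entry in skills.iterdir() if entry.is_dir())


def stated_count(text: str) -> Optional[int]:
    """Extract N from a '(N skills)' or '(N skill)' annotation, if present."""
    match = re.search(r"\((\d+) skills?\)", text)
    return int(match.group(1)) if match else None
```

Comparing `count_skills(Path("plugins/cozystack"))` with `stated_count(...)` applied to the README's layout line would flag a nine-versus-ten mismatch like the one described in this comment.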