Draft
38 commits
c0bec97
feat(skills): add cozystack-install skill
lexfrei May 15, 2026
a9c4de0
refactor!: unify five cozystack skills into one cozystack plugin
lexfrei May 15, 2026
13524f4
refactor(cozystack): clarify skill names with cluster-/package- prefixes
lexfrei May 15, 2026
52bedf8
refactor(linstor): rename drbd-recovery plugin to linstor:recover
lexfrei May 15, 2026
46288e1
feat(cluster-install): early storage discovery + LVM thin pool provis…
lexfrei May 15, 2026
78f288c
feat(cozystack): wizard orchestrator + k3s-bootstrap + linux-prep + t…
lexfrei May 15, 2026
53c2df7
fix(cluster-install): break OIDC chicken-and-egg + domain ownership gate
lexfrei May 15, 2026
b911f3b
feat(cluster-install): extractedprism enabled by default for generic …
lexfrei May 15, 2026
ff0a4b3
refactor(cozystack): 3-route wizard + ubuntu-bootstrap ansible wrappe…
lexfrei May 15, 2026
ebd33b3
feat(cozystack): sops opt-in for secret files in cluster config dir
lexfrei May 15, 2026
653c4cc
chore(cozystack): bump plugin version 1.2.0 → 1.3.0 (sops opt-in)
lexfrei May 15, 2026
ec37d54
feat(cozystack): debug skill — investigate, classify, fix, draft upst…
lexfrei May 15, 2026
fc37154
feat(cozystack): wizard free-form intro + match operator language acr…
lexfrei May 15, 2026
8956f55
feat(cozystack): one-path principle + talos-bootstrap actually runs talm
lexfrei May 15, 2026
1256237
feat(cozystack): front-load every interview into one consolidated intake
lexfrei May 15, 2026
1c18ea1
fix(cozystack): layer-pure operator output — no wizard mentions in sk…
lexfrei May 15, 2026
98c2f99
fix(cozystack): retrospective-driven fixes from /tmp/stage-dev17 run
lexfrei May 15, 2026
bb75653
fix(cozystack): branch-review fixes — cross-refs, --context disciplin…
lexfrei May 15, 2026
b6802db
chore(review): address review-2 blockers
lexfrei May 15, 2026
e5f4c06
fix(review): land round-3 blockers
lexfrei May 15, 2026
19a7ba9
fix(review): extractedprism endpoints schema + layer-pure refusal text
lexfrei May 15, 2026
8e649fe
fix(review): inline sops-missing recovery options in cluster-install
lexfrei May 15, 2026
4870e5d
fix(review): drop stale LinstorSatelliteConfiguration references
lexfrei May 15, 2026
1b46e36
feat(wizard): full chain front-load + state contract enforcement
lexfrei May 15, 2026
dd7a280
fix(talos-bootstrap): apply learnings from a real install run
lexfrei May 15, 2026
3266a4f
fix(cluster-install): NAT-aware publishing, inline tenant patch, real…
lexfrei May 15, 2026
965af6f
fix(cluster-install): real Talos storage path + actually-implemented …
lexfrei May 15, 2026
eb87b14
docs(cluster-install): add provider-pitfalls reference + bump 1.12.0
lexfrei May 15, 2026
3331445
fix(review): address post-batch-B blockers
lexfrei May 15, 2026
7f426fe
fix(cluster-install): doc-structure fixes from review
lexfrei May 15, 2026
f805c4e
feat(wizard): seed operator-contributed case-studies knowledge base +…
lexfrei May 16, 2026
9852abd
Revert "feat(wizard): seed operator-contributed case-studies knowledg…
lexfrei May 16, 2026
49f3e0c
feat(wizard): Phase 4.5 active research — runtime, skeptical, source-…
lexfrei May 16, 2026
339fd53
fix(talos-bootstrap): cert-SAN trap guardrail + Talos 1.12 probe + mu…
lexfrei May 16, 2026
addb314
fix(cluster-install,talos-bootstrap): auto-upgrade to tuned + OCI ver…
lexfrei May 16, 2026
3a09106
feat(wizard,cluster-install): CIDRs from source + variant split + bat…
lexfrei May 16, 2026
757d91f
feat(talos-reset): new skill for cloud-provider terminate+relaunch re…
lexfrei May 16, 2026
0f12037
fix(review): platform_variant enum + CIDR contradiction + field-confu…
lexfrei May 16, 2026
32 changes: 7 additions & 25 deletions .claude-plugin/marketplace.json
@@ -1,40 +1,22 @@
{
"$schema": "https://anthropic.com/claude-code/marketplace.schema.json",
"name": "cozystack-claude-plugins",
"description": "Claude Code plugins for the Cozystack ecosystem — deployment skills, development workflows, and infrastructure tools",
"description": "Claude Code plugins for the Cozystack ecosystem — cozystack:* platform skills (wizard, talos-bootstrap, talos-reset, ubuntu-bootstrap, cluster-install, debug, cluster-upgrade, package-deploy, package-bump, external-app-create) plus linstor:* DRBD/LINSTOR operations",
"owner": {
"name": "Cozystack",
"url": "https://github.com/cozystack"
},
"plugins": [
{
"name": "cozy-deploy",
"description": "Deploy a Cozystack package to a dev cluster via make + cozyhr — handles fresh install and dev-loop iteration with ExternalArtifact support",
"source": "./skills/cozy-deploy",
"name": "cozystack",
"description": "Cozystack platform skills bundle — wizard (entry-point orchestrator that interviews + dispatches the chain), talos-bootstrap (Talos node prep via talm with maintenance-mode probe, cert-SAN NAT guardrail, multidoc machine-config, and opt-in boot-method picker), talos-reset (cloud-provider terminate+relaunch helper for unrecoverable Talos nodes — OCI/AWS/GCP/Hetzner; preserves disks + VNICs + NSGs), ubuntu-bootstrap (Ubuntu/Debian k3s bootstrap wrapping ansible-cozystack), cluster-install (Cozystack on a ready cluster — node-readiness, ZFS pool provisioning, extractedprism HA proxy, all-HRs-Ready + storage-pools-registered gate), debug (investigate stuck installs — classify operator-error/config-drift/upstream-bug/not-supported, apply fixes or workarounds, draft upstream issues on approval), cluster-upgrade (release-notes-driven v1.x patch/minor upgrade), package-deploy (dev-loop deploy with ExternalArtifact support), package-bump (single-package version bump with changelog adaptation), external-app-create (scaffold a new external-apps package). Invoked as cozystack:wizard, cozystack:talos-bootstrap, cozystack:talos-reset, cozystack:ubuntu-bootstrap, cozystack:cluster-install, cozystack:debug, cozystack:cluster-upgrade, cozystack:package-deploy, cozystack:package-bump, cozystack:external-app-create.",
"source": "./plugins/cozystack",
"category": "infrastructure"
},
{
"name": "cozy-external-app",
"description": "Scaffold a new Cozystack external app package — generates chart skeleton, ApplicationDefinition, and handles dependency integration (e.g. Immich → Postgres) via managed CNPG clusters or external secret references",
"source": "./skills/cozy-external-app",
"category": "infrastructure"
},
{
"name": "drbd-recovery",
"description": "Diagnose and recover DRBD/LINSTOR storage issues in Kubernetes clusters — handles StandAlone, DELETING, Inconsistent, Diskless, quorum loss, bitmap errors, and other common failure modes",
"source": "./skills/drbd-recovery",
"category": "infrastructure"
},
{
"name": "cozystack-upgrade",
"description": "Guided upgrade of a running Cozystack v1.x cluster to a newer v1.x patch or minor version — release-notes analysis, prechecks, stop gates, helm upgrade, targeted post-upgrade verification, known failure recovery",
"source": "./skills/cozystack-upgrade",
"category": "infrastructure"
},
{
"name": "cozy-bump",
"description": "Bump a cozystack monorepo package — reads upstream changelog, adapts to breaking changes, regenerates schema, optionally deploys to a dev cluster",
"source": "./skills/cozy-bump",
"name": "linstor",
"description": "LINSTOR / DRBD operations bundle for Kubernetes — invoked as linstor:recover for diagnosing and recovering broken DRBD resources (StandAlone, DELETING, Inconsistent, Diskless, quorum loss, bitmap errors). Useful on any Kubernetes cluster that runs piraeus-operator / LINSTOR.",
"source": "./plugins/linstor",
"category": "infrastructure"
}
]
34 changes: 34 additions & 0 deletions .github/workflows/validate.yml
@@ -0,0 +1,34 @@
name: validate

on:
push:
branches: [main]
pull_request:

jobs:
jq:
name: jq lint manifests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Validate marketplace.json
run: jq . .claude-plugin/marketplace.json > /dev/null
- name: Validate every plugin.json
run: |
set -euo pipefail
fail=0
while IFS= read -r f; do
if ! jq . "$f" > /dev/null; then
echo "FAIL: $f is not valid JSON" >&2
fail=1
fi
done < <(find plugins -name plugin.json -type f)
exit "$fail"

cross-refs:
name: cross-reference validator
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: tools/check-refs.sh
run: bash tools/check-refs.sh
69 changes: 57 additions & 12 deletions CLAUDE.md
@@ -4,22 +4,67 @@ This file provides guidance to Claude Code when working with this repository.

## What This Is

Cozystack Claude Plugins (CCP) — an external marketplace repository for Claude Code
plugins for the Cozystack ecosystem.
Cozystack Claude Plugins (CCP) — an external marketplace repository for Claude Code plugins for the Cozystack ecosystem.

## Repository Structure

- **`agents/`** — Agent definitions
- **`skills/`** — Skill definitions (SKILL.md with frontmatter)
- **`mcp/`** — MCP server definitions
- **`hooks/`** — Hook plugins
```text
plugins/
<plugin-name>/
.claude-plugin/plugin.json # plugin metadata
skills/
<skill-name>/
SKILL.md # skill spec (frontmatter + workflow)
references/ # supporting docs the skill reads
.claude-plugin/
marketplace.json # registry; lists every plugin + description
tools/
check-refs.sh # cross-reference validator (CI gate)
.github/workflows/
validate.yml # PR validation: jq + check-refs.sh
README.md # operator-facing skill catalogue
CLAUDE.md # this file — contributor guidance
```

Registry: `.claude-plugin/marketplace.json`
Two plugins ship today:

- `plugins/cozystack/` — platform bundle (10 skills: wizard, talos-bootstrap, talos-reset, ubuntu-bootstrap, cluster-install, debug, cluster-upgrade, package-deploy, package-bump, external-app-create).
- `plugins/linstor/` — storage-recovery (1 skill: recover).

Multi-skill plugin shape: every plugin has one `.claude-plugin/plugin.json` at its root, and one directory per skill under `skills/`. Skills are addressed by Claude Code as `/<plugin>:<skill>` (e.g. `/cozystack:wizard`).
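
As an illustration of the addressing scheme above, here is a minimal sketch that maps a `/<plugin>:<skill>` identifier to the directory it is expected to live in. The function name `skill_path` is hypothetical and not part of this repository; only the `plugins/<plugin>/skills/<skill>` layout comes from the text.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: not shipped in this repository.
# skill_path <id>: print the directory a skill identifier should resolve to.
skill_path() {
  local id="${1#/}"   # drop the optional leading slash
  case "$id" in
    *:*) ;;           # looks like <plugin>:<skill>
    *) echo "not a <plugin>:<skill> identifier: $1" >&2; return 1 ;;
  esac
  # Everything before the first colon is the plugin, the rest is the skill.
  printf 'plugins/%s/skills/%s\n' "${id%%:*}" "${id#*:}"
}
```

For example, `skill_path /cozystack:wizard` prints `plugins/cozystack/skills/wizard`.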

## Adding a New Skill to an Existing Plugin

1. `mkdir -p plugins/<plugin>/skills/<new-skill>/references` (the `references/` directory is optional).
2. Write `plugins/<plugin>/skills/<new-skill>/SKILL.md` with YAML frontmatter (`name:`, `description:`, optional `argument-hint:`).
3. Update `plugins/<plugin>/.claude-plugin/plugin.json` `description` to mention the new skill (the cross-reference checker in `tools/check-refs.sh` enforces this).
4. Update `.claude-plugin/marketplace.json` `plugins[].description` for the parent plugin — list every skill the plugin ships.
5. Update `README.md` skills table.
6. `bash tools/check-refs.sh` locally before commit.
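
Steps 1 and 2 above could be scripted roughly as follows. This is a sketch under assumptions: `scaffold_skill` is a hypothetical name, the placeholder description text is invented, and steps 3 to 6 (descriptions, README, `tools/check-refs.sh`) still have to be done by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: not shipped in this repository.
# scaffold_skill <repo-root> <plugin> <skill>: create the skill directory,
# its optional references/ directory, and a stub SKILL.md with frontmatter.
scaffold_skill() {
  local root="$1" plugin="$2" skill="$3"
  local dir="$root/plugins/$plugin/skills/$skill"
  mkdir -p "$dir/references"
  # Never overwrite an existing spec.
  if [ ! -e "$dir/SKILL.md" ]; then
    cat > "$dir/SKILL.md" <<EOF
---
name: $skill
description: TODO one-line summary of what $skill does
---

# $skill
EOF
  fi
}
```

A call such as `scaffold_skill . cozystack my-skill` would leave a stub at `plugins/cozystack/skills/my-skill/SKILL.md`; `argument-hint:` can be added to the frontmatter when the skill takes arguments.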

## Adding a New Plugin

1. Create directory under the appropriate type
2. Add `.claude-plugin/plugin.json` with metadata
3. Add content files (SKILL.md, agent .md, .mcp.json, or hooks.json)
4. Register in `.claude-plugin/marketplace.json`
5. Update README.md
1. `mkdir -p plugins/<plugin>/{.claude-plugin,skills}`.
2. Write `plugins/<plugin>/.claude-plugin/plugin.json` with `name`, `version`, `description` (mentioning every skill).
3. Add one or more skills per the section above.
4. Register the plugin in `.claude-plugin/marketplace.json` `plugins[]` with `name`, `description`, `source: ./plugins/<plugin>`, `category`.
5. Update `README.md`.
6. `bash tools/check-refs.sh`.

## Cross-reference discipline

The skills lean heavily on each other (`cozystack:wizard` dispatches `cozystack:talos-bootstrap` etc.), and skill bodies reference sibling skills and `references/<file>.md` documents. Stale paths and renamed skill identifiers cause silent breakage — operators type a skill name that no longer exists, or follow a link to a file that's been moved. `tools/check-refs.sh` walks the plugin tree and validates:

- Every `references/<file>.md` mentioned in a SKILL.md exists on disk.
- Every `cozystack:<skill>` / `linstor:<skill>` mention resolves to an actual directory under `plugins/<plugin>/skills/`.
- Every plugin's `description` in `marketplace.json` and in its own `plugin.json` mentions every skill present under `plugins/<plugin>/skills/`.

Run before any commit that touches skill names, references, or descriptions.
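
To make the third check concrete, here is a minimal sketch of how "every skill directory is mentioned in its plugin's `plugin.json`" could be verified. It is not the actual `tools/check-refs.sh`, whose contents are not shown in this diff; the function name and the plain-text `grep` match are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the real tools/check-refs.sh.
# check_skill_mentions <repo-root>: return 0 when every directory under
# plugins/<plugin>/skills/ is named somewhere in that plugin's plugin.json.
check_skill_mentions() {
  local root="$1" fail=0 plugin_dir skill_dir manifest skill
  for plugin_dir in "$root"/plugins/*/; do
    [ -d "$plugin_dir" ] || continue          # glob matched nothing
    manifest="${plugin_dir}.claude-plugin/plugin.json"
    for skill_dir in "${plugin_dir}skills"/*/; do
      [ -d "$skill_dir" ] || continue
      skill="$(basename "$skill_dir")"
      # Substring match on the whole file; a stricter checker would parse
      # the JSON and inspect only the description field.
      if ! grep -qF -- "$skill" "$manifest" 2>/dev/null; then
        echo "FAIL: $skill not mentioned in $manifest" >&2
        fail=1
      fi
    done
  done
  return "$fail"
}
```

Because the match is a substring test, a short skill name such as `recover` would also be satisfied by a word like "recovering", so treat this as a rough local pre-check rather than a replacement for the CI gate.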

## Versioning

`plugin.json` `version` follows semver. Bump:

- patch — text-only fixes (typos, doc cleanup).
- minor — new skills, new features, schema additions.
- major — breaking changes for installed users (renames, removals, layout shifts).
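
The bump rules above can be expressed as a small helper. This is a sketch assuming plain `MAJOR.MINOR.PATCH` versions with no pre-release or build suffix; `bump_version` is a hypothetical name and nothing in the repository is known to provide it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: not shipped in this repository.
# bump_version <version> <patch|minor|major>: print the next semver version.
bump_version() {
  local version="$1" part="$2" major minor patch
  IFS=. read -r major minor patch <<< "$version"
  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;  # breaking change
    minor) minor=$((minor + 1)); patch=0 ;;           # new skill or feature
    patch) patch=$((patch + 1)) ;;                    # text-only fix
    *) echo "unknown part: $part" >&2; return 1 ;;
  esac
  printf '%s.%s.%s\n' "$major" "$minor" "$patch"
}
```

For instance, `bump_version 1.2.0 minor` prints `1.3.0`, which matches the sops opt-in bump in the commit list.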
76 changes: 68 additions & 8 deletions README.md
@@ -14,20 +14,80 @@ Add the marketplace:
Install a plugin:

```text
/plugin install <plugin-name>@cozystack-claude-plugins
/plugin install cozystack@cozystack-claude-plugins
/plugin install linstor@cozystack-claude-plugins
```

## Plugins

### Skills
### cozystack

| Plugin | Description |
Platform skills bundle. One install gives you nine skills, invoked as `/cozystack:<name>`. Start with `/cozystack:wizard` — it asks Talos / Ubuntu / Existing and picks the chain.

medium

The description for the cozystack plugin states that it provides nine skills, but it actually includes ten. This should be updated for accuracy.

Suggested change
Platform skills bundle. One install gives you nine skills, invoked as `/cozystack:<name>`. Start with `/cozystack:wizard` — it asks Talos / Ubuntu / Existing and picks the chain.
Platform skills bundle. One install gives you ten skills, invoked as `/cozystack:<name>`. Start with `/cozystack:wizard` — it asks Talos / Ubuntu / Existing and picks the chain.


| Skill | Description |
| --- | --- |
| **/cozystack:wizard** | Entry point. Opens with a free-form "tell me about your setup and goal" question so context comes through in the operator's own words and pre-fills the structured questions. Then asks Talos / Ubuntu / Existing, builds a chain, dispatches downstream skills via a cluster config directory the operator picks. Artifacts (inventory, kubeconfig, state, platform-package YAML) all live there — operator manages git on their own. Every skill in the chain matches the operator's natural language. |
| **/cozystack:talos-bootstrap** | Bootstrap Talos nodes via talm. Default: probe nodes for maintenance mode (Talos-1.12-aware `get disks --insecure`); if ready, write per-node multidoc machine-config stubs with VIP-link static IPv4, run NAT-provider cert-SAN guardrail (auto-populates `values.yaml.certSANs` with public IPs before first `talm apply`), then `talm apply` + `talosctl bootstrap` + kubeconfig fetch + cozystack-tuned shape verification with auto-upgrade to tuned image when nodes booted from base Talos. Opt-in boot-method picker (OCI Custom Image / boot-to-talos / ISO / PXE) only when nodes aren't yet imaged. |
| **/cozystack:talos-reset** | Cloud-provider recovery helper when Talos nodes are unrecoverable from inside the cluster (cert-SAN trap, broken machine-config, lost talosconfig). Wraps `oci` / `aws` / `gcloud` / `hcloud` to terminate + relaunch from the cozystack-tuned image while preserving block volumes, secondary VNICs, NSG memberships. Sequential per-node to maintain etcd quorum. Hands off to `cozystack:talos-bootstrap` for re-bootstrap. |
| **/cozystack:ubuntu-bootstrap** | Bootstrap Ubuntu / Debian nodes by wrapping `cozystack/ansible-cozystack/examples/ubuntu/` — OS prep, drbd-dkms for Secure Boot, ZFS + KubeVirt modules, k3s install with cozystack-compatible flags, kubeconfig retrieval. Stops before Cozystack itself. |
| **/cozystack:cluster-install** | Cozystack on a ready cluster — node-readiness validation, variant picker, interactive values, per-node ZFS pool provisioning, extractedprism for kube-apiserver HA, cozy-installer chart, Platform Package apply, root Tenant ingress patch, wait until every HelmRelease is Ready, NOTES summary. |
| **/cozystack:debug** | Investigate a stuck or broken Cozystack install. Gathers symptoms, classifies (operator error / config drift / upstream bug / not-yet-supported), applies fixes or workarounds, drafts upstream issues with diagnostic bundle on approval. Never opens PRs or files silently. Auto-dispatched by the wizard when any chain step fails. |
| **/cozystack:cluster-upgrade** | Guided upgrade of a running Cozystack v1.x cluster — release-notes analysis, prechecks, stop gates, helm upgrade, targeted post-upgrade verification, known-failure recovery. |
| **/cozystack:package-deploy** | Deploy a single Cozystack package to a dev cluster via make + cozyhr — handles fresh install and dev-loop iteration with ExternalArtifact support. |
| **/cozystack:package-bump** | Bump a single package inside the cozystack monorepo — reads upstream changelog, adapts to breaking changes, regenerates schema, optionally deploys to a dev cluster. |
| **/cozystack:external-app-create** | Scaffold a new Cozystack external app package with dependency integration (managed CNPG Postgres, external secret references). |

Chains the wizard builds:

| Target | Chain |
| ----------- | ----------- |
| Bare-metal Talos | `talos-bootstrap` → `cluster-install` |
| Bare-metal Ubuntu / Debian | `ubuntu-bootstrap` → `cluster-install` |
| Existing Kubernetes (self-managed or managed) | `cluster-install` |
| Existing Cozystack | refuse → `cozystack:cluster-upgrade` |
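
Read as a lookup, the table above amounts to something like the following sketch. The target keys (`talos`, `ubuntu`, `existing`, `cozystack`) and the function name `chain_for` are hypothetical labels chosen for illustration; the wizard itself is a skill spec, not a shell script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the target-to-chain mapping; keys are illustrative.
# chain_for <target>: print the space-separated skills the chain would run.
chain_for() {
  case "$1" in
    talos)    echo "talos-bootstrap cluster-install" ;;
    ubuntu)   echo "ubuntu-bootstrap cluster-install" ;;
    existing) echo "cluster-install" ;;
    cozystack)
      # An existing Cozystack install is refused and pointed at the upgrade skill.
      echo "refuse: use cozystack:cluster-upgrade" >&2
      return 1 ;;
    *) echo "unknown target: $1" >&2; return 2 ;;
  esac
}
```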

### linstor

LINSTOR / DRBD operations bundle. Useful on any Kubernetes cluster that runs piraeus-operator / LINSTOR, not just on Cozystack.

| Skill | Description |
| --- | --- |
| **cozy-deploy** | Deploy a Cozystack package to a dev cluster via make + cozyhr |
| **cozy-external-app** | Scaffold a new Cozystack external app package with dependency integration |
| **drbd-recovery** | Diagnose and recover DRBD/LINSTOR storage issues in Kubernetes clusters |
| **cozystack-upgrade** | Guided upgrade of a running Cozystack v1.x cluster to a newer v1.x patch or minor version |
| **cozy-bump** | Bump a cozystack monorepo package — reads upstream changelog, adapts to breaking changes, regenerates schema, optionally deploys to a dev cluster |
| **/linstor:recover** | Diagnose and recover broken DRBD resources — handles StandAlone, DELETING, Inconsistent, Diskless, quorum loss, bitmap errors, and other common failure modes. |

## Third-party dependencies

`cozystack:cluster-install` default-installs [extractedprism](https://github.com/lexfrei/extractedprism) on `generic` variant clusters (k3s / kubeadm / RKE2). extractedprism is a per-node TCP load balancer that gives generic Linux Kubernetes the same `localhost:7445` kube-apiserver shape Talos has built-in (KubePrism), so Cilium and KubeOVN can dial a stable local address regardless of which control-plane node is up.

Project metadata:

- Source: `https://github.com/lexfrei/extractedprism` (BSD-3-Clause).
- Helm chart: `oci://ghcr.io/lexfrei/charts/extractedprism`.
- Maintained independently by a Cozystack contributor; reviewed and approved by the Cozystack platform team for use as the generic-variant HA proxy.

Operators can opt out with `--no-extractedprism` and supply their own `--api-host=<ip>` (external LB, VIP, or single CP IP with the SPOF caveat) — see `cozystack:cluster-install` Phase 4. Talos and hosted variants do not need extractedprism.

## Repository layout

```text
plugins/
cozystack/ # platform bundle (9 skills)

medium

The repository layout description for cozystack incorrectly states it has 9 skills. This should be updated to 10 to match the actual number of skills in the bundle.

Suggested change
cozystack/ # platform bundle (9 skills)
cozystack/ # platform bundle (10 skills)

.claude-plugin/plugin.json
skills/
wizard/ # entry point: interview + chain dispatcher
talos-bootstrap/ # Talos node prep
talos-reset/ # cloud-provider terminate+relaunch helper
ubuntu-bootstrap/ # Ubuntu/Debian via ansible-cozystack wrapper
cluster-install/ # Cozystack on a ready cluster
debug/ # investigate + classify + workaround + issue draft
cluster-upgrade/ # v1.x patch/minor upgrade
package-deploy/ # dev-loop deploy of a single package
package-bump/ # bump a monorepo package
external-app-create/ # scaffold a new external-apps package
linstor/ # storage bundle (1 skill)
.claude-plugin/plugin.json
skills/
recover/
```

## License

9 changes: 9 additions & 0 deletions plugins/cozystack/.claude-plugin/plugin.json
@@ -0,0 +1,9 @@
{
"name": "cozystack",
"version": "1.14.1",
"description": "Cozystack platform skills bundle. Start with cozystack:wizard — it begins with a free-form 'tell me about your setup and goal' question, parses hints, then asks Talos / Ubuntu / Existing, builds a chain, and dispatches downstream skills via a cluster config directory the operator picks (every artifact lives there: inventory.yml, kubeconfig, .state.yaml, cozystack-platform-package.yaml — operator manages git on their own; optional sops opt-in encrypts secret files in-tree). Skills, invoked as cozystack:<name>: wizard (orchestrator + 3-route dispatcher + Phase 4.5 active research + auto-dispatches debug on any failed_at), talos-bootstrap (Talos node prep via talm — Talos-1.12-aware maintenance probe, NAT-provider cert-SAN guardrail before first talm apply, multidoc machine-config with per-node VIP-link IPv4 stubs, etcd bootstrap, kubeconfig fetch, cozystack-tuned shape verification with Phase 11.5 auto-upgrade), talos-reset (cloud-provider terminate+relaunch helper for OCI/AWS/GCP/Hetzner when nodes are unrecoverable from inside; preserves block volumes + secondary VNICs + NSG memberships), ubuntu-bootstrap (wraps cozystack/ansible-cozystack — OS prep + k3s install in one go), cluster-install (Cozystack on a ready cluster — node-readiness, ZFS pool provisioning via privileged DaemonSet on Talos with hostNetwork, extractedprism for kube-apiserver HA, OCI-tag-normalized cozy-installer chart, Platform Package, inline tenants/root ingress patch + LINSTOR pool registration during watch loop with combined HRs-Ready + pools-registered gate, Phase 8.6 default StorageClasses for v1.3.x, Phase 9.1 end-to-end reachability probe), debug (investigate a stuck or broken install — gathers symptoms, classifies operator error / config drift / upstream bug / not-yet-supported, applies fixes or workarounds, drafts upstream issues on approval; never opens PRs or files silently), cluster-upgrade (v1.x patch/minor upgrade with release-notes analysis), package-deploy (dev-loop deploy of a single package with ExternalArtifact support), package-bump (single-package version bump with changelog adaptation), external-app-create (scaffold a new external-apps package). All skills match the operator's natural language detected from conversation context — code identifiers, commands, file paths, and GitHub-public text stay canonical. All skills follow the same gate-and-confirm discipline: read-only lookups run freely; any mutation needs explicit per-step approval.",
"author": {
"name": "Cozystack",
"url": "https://github.com/cozystack"
}
}