Skip to content

feat: Handle skills as resources#31

Merged
gonzalezzfelipe merged 4 commits into
mainfrom
feat/skills-as-resources
May 22, 2026
Merged

feat: Handle skills as resources#31
gonzalezzfelipe merged 4 commits into
mainfrom
feat/skills-as-resources

Conversation

@gonzalezzfelipe
Copy link
Copy Markdown
Contributor

@gonzalezzfelipe gonzalezzfelipe commented May 21, 2026

Summary by CodeRabbit

  • New Features

    • Added a skill catalog system (bundled + OCI-backed) with trust controls, size limits, and skill-driven prompts/resources for extension install and workload troubleshooting.
    • New workload tools: install, list, get, metrics, logs, and enhanced workload inspection and secrets runtime operations.
  • Documentation

    • Comprehensive skill catalog docs, validation rules, and publishing guidance.
  • Chores

    • CI workflows, Helm chart/values, and runtime config updated to generate/load/manage skill catalogs; improved OCI fetching and Helm timeout handling.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Warning

Rate limit exceeded

@gonzalezzfelipe has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 41 minutes and 58 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31c57e99-59bf-43a9-ad30-c3f78e76cdd7

📥 Commits

Reviewing files that changed from the base of the PR and between 9222870 and c498f40.

📒 Files selected for processing (2)
  • mcp-server/src/oci_client.rs
  • mcp-server/src/tools/args.rs
📝 Walkthrough

Walkthrough

Adds a skill catalog: manifest + generated JSON, generation/publish scripts, OCI client to fetch/verify catalogs, SkillCatalog domain and loader, server/router/URI integration, Helm/deployment wiring and env vars, tools refactor (args/schema validation/workloads), and CI steps to generate the catalog.

Changes

Skill Catalog Feature Implementation

Layer / File(s) Summary
Catalog content and scripts
catalog/skill-catalog.manifest.json, catalog/skill-catalog.json, catalog/scripts/generate-skill-catalog.mjs, catalog/scripts/publish-catalogs.sh, catalog/README.md
Adds manifest and generated skill-catalog JSON, script to compile manifest+markdown into JSON, publish script for OCI artifacts, and documentation describing the skill catalog contract and validation rules.
CI workflow generation
.github/workflows/*
Expands workflow path filters to include catalog/** and skills/** and adds Node.js + generation steps to build/check/test workflows to ensure the generated catalog is current.
Config and deployment wiring
mcp-server/src/config.rs, mcp-server/src/errors.rs, extensions/control-plane/Chart.yaml, extensions/control-plane/values.yaml, extensions/control-plane/templates/stage4-04-statefulset-supernode-mcp.yaml, extensions/control-plane/README.md
Adds SkillCatalogConfig/SkillCatalogSource parsing from MCP_SKILL_CATALOG_* env vars, error variants, Helm chart version bump and values for OCI-backed skill catalog, and StatefulSet env var wiring for runtime.
Skill catalog domain
mcp-server/src/skills/mod.rs, mcp-server/src/skills/source.rs, mcp-server/src/skills/oci.rs
Introduces SkillCatalog types, validation (schema version, unique/URI-safe ids, non-empty title/description/content), bundled loading, OCI-backed loading via a loader that validates trusted refs and size limits, and mapped load errors.
OCI client and adapters
mcp-server/src/oci_client.rs, mcp-server/src/catalog/oci.rs, mcp-server/src/skills/oci.rs
Adds a shared OCI client that parses oci:// refs, fetches manifests/blobs, enforces size limits and verifies sha256 digests; catalog/skills modules delegate fetching/verification to this client.
Server integration and routing
mcp-server/src/main.rs, mcp-server/src/server.rs, mcp-server/src/mcp.rs, mcp-server/src/resources/router.rs, mcp-server/src/resources/uri.rs, mcp-server/src/prompts/catalog.rs
Wires SkillCatalog into server initialization and ResourceRouter, adds supernode://skills and supernode://skills/{id} URIs, serves skills in list/read endpoints, and updates prompts to reference skill-driven workflows.
Helm command handling
mcp-server/src/helm.rs, mcp-server/Cargo.toml
Refactors Helm invocations to require absolute helm binary, use NamedTempFile for values, add timeouts, extend error variants, and adds tempfile dependency.
Kubernetes/Session changes
mcp-server/src/k8s/helm_releases.rs, mcp-server/src/session/sqlite.rs
Adds MAX_INFLATED_RELEASE_SIZE and redacts config in Debug; caps decompression size. Moves SQLite session DB ops into spawn_blocking closures.
Tools refactor & workloads
mcp-server/src/tools/{args,schema_validation,router,vault}.rs, mcp-server/src/tools/workloads/*
Adds shared argument helpers and schema validation, moves workload/vault implementations into dedicated modules (get/list/metrics/install/upgrade/delete), tightens input schemas, and rewires ToolRouter to delegate to module-level handlers.

Sequence Diagram

sequenceDiagram
  participant Config
  participant Server
  participant Loader as SkillLoader
  participant Oci as OciClient
  participant Catalog as SkillCatalog
  Config->>Server: provide MCP_SKILL_CATALOG_* env
  Server->>Loader: load_skill_catalog(config)
  alt bundled
    Loader->>Catalog: SkillCatalog::bundled()
  else oci
    Loader->>Oci: is_trusted_reference(ref)
    Oci-->>Loader: trusted?
    Loader->>Oci: fetch_artifact_json(ref, max_bytes, media_type)
    Oci-->>Loader: catalog bytes
    Loader->>Catalog: from_json_str(bytes)
  end
  Loader-->>Server: Arc<SkillCatalog>
  Server->>ResourceRouter: new(catalog, skill_catalog)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • txpipe/metis#29: Related prior work on prompt catalog and ResourceRouter/URI handling that this PR extends to support skills.
  • txpipe/metis#30: Related OCI/catalog retrieval refactor that this PR builds on by adding a shared OCI client and delegating catalog fetches.

Suggested reviewers

  • mduthey
  • scarmuega

Poem

🐰 I hopped through manifests, markdown, and CI,
Bundled or OCI — choose which suits thee,
Servers now list skills with URIs neat,
Tempfiles and digests make fetching complete,
A tiny catalog rabbit applauds this spree!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 24.82% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: Handle skills as resources' accurately and concisely summarizes the main change—introducing skills as MCP resources with catalog loading, routing, and metadata integration throughout the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/skills-as-resources

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
.github/workflows/check_mcp_server.yml (1)

31-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add explicit Node.js setup before running the generation script.

This workflow has the same issue as build_mcp_server.yml: it runs a Node.js script without explicitly setting up Node.js, relying on the pre-installed version in the runner image. This creates brittleness and potential compatibility issues.

🔧 Suggested fix to add Node.js setup
       - name: Checkout
         uses: actions/checkout@v3

+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+
       - name: Generate skill catalog
         working-directory: .
         run: node catalog/scripts/generate-skill-catalog.mjs
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/check_mcp_server.yml around lines 31 - 33, The workflow
step "Generate skill catalog" runs a Node.js script without ensuring a specific
Node version; add a preceding step that uses actions/setup-node@v3 (e.g., name
it "Set up Node.js") and set node-version to a pinned value (like 18 or the
repo's supported LTS) so the runner installs and uses that Node; you can also
enable npm/yarn caching via actions/setup-node's cache input if desired. Ensure
the new step appears before the "Generate skill catalog" step so node is
configured when node catalog/scripts/generate-skill-catalog.mjs runs.
.github/workflows/test_mcp_server.yml (1)

31-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add explicit Node.js setup before running the generation script.

This workflow has the same issue as build_mcp_server.yml and check_mcp_server.yml: it runs a Node.js script without explicitly setting up Node.js, relying on the pre-installed version in the runner image.

🔧 Suggested fix to add Node.js setup
       - name: Checkout
         uses: actions/checkout@v3

+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+
       - name: Generate skill catalog
         working-directory: .
         run: node catalog/scripts/generate-skill-catalog.mjs
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test_mcp_server.yml around lines 31 - 33, Add an explicit
Node.js setup step before the "Generate skill catalog" run step: insert an
actions/setup-node step (e.g., name "Setup Node.js") configured with a pinned
node-version (or matrix variable) so the runner installs the expected Node.js
runtime before executing node catalog/scripts/generate-skill-catalog.mjs; ensure
the setup step appears immediately above the step that runs the script so the
"node" binary is provided by setup-node rather than relying on the runner image.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/build_mcp_server.yml:
- Around line 47-48: Add an explicit Node.js setup step using
actions/setup-node@v4 pinned to node-version: '22' immediately before the
"Generate skill catalog" run step that executes node
catalog/scripts/generate-skill-catalog.mjs; ensure the new step is named (e.g.,
"Set up Node.js") and placed directly above the step that runs
generate-skill-catalog.mjs so the workflow uses the same pinned Node version as
check_catalog.yml.

In `@catalog/README.md`:
- Around line 7-16: The README and repository disagree: README states
skill-catalog.json is generated and gitignored, but this PR adds
catalog/skill-catalog.json; restore consistency by removing
catalog/skill-catalog.json from the commit (git rm --cached if needed) and
ensure skill-catalog.json remains listed in .gitignore, then regenerate the file
locally as described (skill-catalog.manifest.json → node
catalog/scripts/generate-skill-catalog.mjs) and update README only if you
intentionally want to change the policy; if you do intend to track the artifact
instead, instead update the README text to explicitly state that
catalog/skill-catalog.json is now tracked and why.

In `@catalog/skill-catalog.json`:
- Around line 60-75: The skill "cardano-block-producer-upgrade" has an
out-of-sync tools allowlist: the "tools" array (currently listing
supernode.status.get, extensions.catalog.get, workloads.get,
workloads.metrics.get, vault.runtime.metadata.get, workloads.upgrade,
workloads.logs.get, cluster.events.list) does not include all MCP calls
referenced in the "content" workflow (e.g., workloads.install,
cluster.storage_classes.list, workloads.list, vault.runtime.write/patch, and any
other workload/cluster/vault calls mentioned). Update the tools array for the
skill id "cardano-block-producer-upgrade" to include every unique MCP call
referenced in the content (at minimum add workloads.install,
cluster.storage_classes.list, workloads.list, vault.runtime.write,
vault.runtime.patch) and repeat this alignment for the other listed skills
(lines referenced in the review) so each manifest's tools allowlist exactly
matches the MCP calls used, then regenerate the catalog JSON.

---

Duplicate comments:
In @.github/workflows/check_mcp_server.yml:
- Around line 31-33: The workflow step "Generate skill catalog" runs a Node.js
script without ensuring a specific Node version; add a preceding step that uses
actions/setup-node@v3 (e.g., name it "Set up Node.js") and set node-version to a
pinned value (like 18 or the repo's supported LTS) so the runner installs and
uses that Node; you can also enable npm/yarn caching via actions/setup-node's
cache input if desired. Ensure the new step appears before the "Generate skill
catalog" step so node is configured when node
catalog/scripts/generate-skill-catalog.mjs runs.

In @.github/workflows/test_mcp_server.yml:
- Around line 31-33: Add an explicit Node.js setup step before the "Generate
skill catalog" run step: insert an actions/setup-node step (e.g., name "Setup
Node.js") configured with a pinned node-version (or matrix variable) so the
runner installs the expected Node.js runtime before executing node
catalog/scripts/generate-skill-catalog.mjs; ensure the setup step appears
immediately above the step that runs the script so the "node" binary is provided
by setup-node rather than relying on the runner image.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19cfd583-a0ea-4249-92f0-79705a5c9b82

📥 Commits

Reviewing files that changed from the base of the PR and between cf3e2e8 and 86e593f.

📒 Files selected for processing (24)
  • .github/workflows/build_mcp_server.yml
  • .github/workflows/check_catalog.yml
  • .github/workflows/check_mcp_server.yml
  • .github/workflows/test_mcp_server.yml
  • catalog/README.md
  • catalog/scripts/generate-skill-catalog.mjs
  • catalog/scripts/publish-catalogs.sh
  • catalog/skill-catalog.json
  • catalog/skill-catalog.manifest.json
  • extensions/control-plane/Chart.yaml
  • extensions/control-plane/README.md
  • extensions/control-plane/templates/stage4-04-statefulset-supernode-mcp.yaml
  • extensions/control-plane/values.yaml
  • mcp-server/src/config.rs
  • mcp-server/src/errors.rs
  • mcp-server/src/main.rs
  • mcp-server/src/mcp.rs
  • mcp-server/src/prompts/catalog.rs
  • mcp-server/src/resources/router.rs
  • mcp-server/src/resources/uri.rs
  • mcp-server/src/server.rs
  • mcp-server/src/skills/mod.rs
  • mcp-server/src/skills/oci.rs
  • mcp-server/src/skills/source.rs

Comment thread .github/workflows/build_mcp_server.yml
Comment thread catalog/README.md Outdated
Comment thread catalog/skill-catalog.json
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
mcp-server/src/tools/workloads/dolos.rs (1)

213-239: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

This refresh path deletes every StatefulSet PVC, not just the Dolos data volume.

managed_pvc_names() expands all volume_claim_templates, and snapshot_refresh() deletes every returned claim. That is broader than the tool contract here: if the chart ever adds another persistent volume, this operation will wipe it too.

Please resolve the specific snapshot/data claim instead of deleting every template-derived PVC.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp-server/src/tools/workloads/dolos.rs` around lines 213 - 239,
managed_pvc_names() currently expands every volume_claim_templates and causes
snapshot_refresh() to delete all PVCs; limit deletion to the specific
data/snapshot claim instead. Update managed_pvc_names (or add a new helper like
managed_data_pvc_names) to accept or hardcode the target claim name (e.g.,
"data" or the chart's snapshot volume name) and only generate PVC names for
templates whose template.metadata.name matches that target; ensure you still use
stateful_set.metadata.name and replicas as before to build names, and update
snapshot_refresh() to call the new/updated function so only the intended
data/snapshot PVCs are deleted.
mcp-server/src/tools/router.rs (1)

72-97: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Dynamic tools can be listed but still cannot execute.

list_with_dynamic / get_with_dynamic accept catalog-supplied definitions, but call only dispatches hardcoded names. Any new skill/resource tool exposed through dynamic_definitions will fall through to not_implemented_result, so the router can advertise tools that are unusable at runtime. Please either thread dynamic dispatch into call or stop surfacing unresolved dynamic tools here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp-server/src/tools/router.rs` around lines 72 - 97, The router's call
method only matches hardcoded tool names so tools returned by
list_with_dynamic/get_with_dynamic (dynamic_definitions from ExtensionCatalog)
cannot be executed; update call to, after the existing match arms and before
falling back to not_implemented_result, check the catalog's dynamic_definitions
(or a catalog method that looks up a dynamic ToolDefinition by name) for
definition.name and, if found, dispatch to the dynamic tool executor (or
delegate to the catalog to run it) so dynamic tools advertised by
list_with_dynamic/get_with_dynamic are actually callable; reference call,
list_with_dynamic/get_with_dynamic, dynamic_definitions (or catalog lookup
method) and not_implemented_result when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/build_mcp_server.yml:
- Around line 47-50: The workflow step using "uses: actions/setup-node@v4" (the
"Set up Node.js" step) should be pinned to a specific commit SHA instead of the
"`@v4`" tag; update the "uses" field to the full commit SHA for actions/setup-node
(obtain the latest stable commit SHA from the actions/setup-node repository) so
the CI uses an immutable reference and not a floating tag.

In @.github/workflows/check_mcp_server.yml:
- Around line 31-34: The workflow step "Set up Node.js" uses the mutable tag
actions/setup-node@v4; replace that with the repository pinned to a full commit
SHA for actions/setup-node (i.e., change the uses value from
"actions/setup-node@v4" to "actions/setup-node@<full-commit-sha>") so the CI
action is pinned immutably and supply-chain risk is reduced.

In @.github/workflows/test_mcp_server.yml:
- Around line 31-34: The workflow step named "Set up Node.js" currently
references actions/setup-node@v4 which is a floating tag; replace that reference
with the action's full commit SHA (pin to a specific immutable revision) so the
step uses actions/setup-node@<full-commit-sha> instead of `@v4`; keep the existing
with: node-version: '22' configuration and update any related documentation or
comments to note the pinned SHA.

In `@catalog/skill-catalog.json`:
- Around line 137-140: The skill's "extensions" array only lists "cardano-relay"
and "cardano-block-producer" but the skill description says it covers "Cardano
or Apex Fusion workload metrics", so Apex Fusion installs will never match;
update the "extensions" array to include the Apex Fusion extension identifiers
used by that platform (e.g., add the Apex Fusion extension names matching your
registry) and also amend the original source manifest that generates
catalog/skill-catalog.json so the change persists when regenerating.

In `@catalog/skill-catalog.manifest.json`:
- Around line 252-257: The skill manifest grants an extra write capability;
remove "hydra.keys.generate" from the "tools" array in the skill manifest for
the hydra-head-operations workflow so it remains inspection-only (keep
"workloads.get", "workloads.metrics.get", "workloads.logs.get" as the allowed
tools, or replace "hydra.keys.generate" with a read-only equivalent like
"hydra.keys.get" if key visibility is required). Ensure the change is made in
the skill-catalog.manifest.json tools array where those tool names are listed.

In `@mcp-server/src/oci_client.rs`:
- Around line 24-29: The manifest fetch can follow cross-host redirects and
bypass the trusted-registry boundary; update the code so the reqwest::Client
used by fetch_manifest does not follow redirects (e.g., build a client with
reqwest::redirect::Policy::none() or otherwise enforce same-host redirects)
instead of the current Client built in fetch_artifact_json()/oci_client.rs;
specifically, when calling fetch_manifest(&client, &reference) ensure that
client is created with no-redirect policy (or create a dedicated no-redirect
client for fetch_manifest) so that is_trusted_reference() cannot be subverted
while keeping blob/digest verification via select_artifact_descriptor unchanged.

In `@mcp-server/src/tools/args.rs`:
- Around line 38-43: The helper function optional_string currently validates
that the string is non-empty after trimming but returns the original untrimmed
slice, causing downstream mismatches; update optional_string (the function
taking arguments: Option<&JsonObject>, name: &str) to return the trimmed value
instead of the original by applying trim() before converting to String (e.g.,
replace the final map(str::to_string) with a map that trims the &str and then
to_string()/to_owned()) so inputs like " hydra " become "hydra".

In `@mcp-server/src/tools/schema_validation.rs`:
- Around line 137-153: The current dereference_schema function only resolves a
single $ref hop; update dereference_schema to follow $ref chains transitively
(resolve A -> B -> ... until a non-$ref schema or missing definition) so
validate_property_value gets the final concrete schema; implement a loop that
checks for "$ref" on each resolved Value, looks up name via the same
strip_prefix logic against root_schema "definitions"/"$defs", and continues
resolving, while guarding against infinite cycles by tracking visited reference
names (or a max hop count) and returning the original schema if resolution fails
or a cycle is detected.

In `@mcp-server/src/tools/vault.rs`:
- Around line 119-149: The error paths in runtime_write currently hardcode
"vault.runtime.write" which can mislabel failures when this helper is reused for
other modes (e.g., patch); change runtime_write signature to accept a tool_name:
&str (or similar) and use that tool_name in the vault_error calls instead of the
literal string; update the three vault_error invocations inside runtime_write
and adjust its callers to pass the correct tool identifier so errors from
write_mode, VaultClient::from_env, and client.write_runtime_secret report the
proper tool name.

---

Outside diff comments:
In `@mcp-server/src/tools/router.rs`:
- Around line 72-97: The router's call method only matches hardcoded tool names
so tools returned by list_with_dynamic/get_with_dynamic (dynamic_definitions
from ExtensionCatalog) cannot be executed; update call to, after the existing
match arms and before falling back to not_implemented_result, check the
catalog's dynamic_definitions (or a catalog method that looks up a dynamic
ToolDefinition by name) for definition.name and, if found, dispatch to the
dynamic tool executor (or delegate to the catalog to run it) so dynamic tools
advertised by list_with_dynamic/get_with_dynamic are actually callable;
reference call, list_with_dynamic/get_with_dynamic, dynamic_definitions (or
catalog lookup method) and not_implemented_result when making the change.

In `@mcp-server/src/tools/workloads/dolos.rs`:
- Around line 213-239: managed_pvc_names() currently expands every
volume_claim_templates and causes snapshot_refresh() to delete all PVCs; limit
deletion to the specific data/snapshot claim instead. Update managed_pvc_names
(or add a new helper like managed_data_pvc_names) to accept or hardcode the
target claim name (e.g., "data" or the chart's snapshot volume name) and only
generate PVC names for templates whose template.metadata.name matches that
target; ensure you still use stateful_set.metadata.name and replicas as before
to build names, and update snapshot_refresh() to call the new/updated function
so only the intended data/snapshot PVCs are deleted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b379bd74-f211-4b59-ba47-adb8323be074

📥 Commits

Reviewing files that changed from the base of the PR and between 86e593f and 9222870.

⛔ Files ignored due to path filters (1)
  • mcp-server/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (29)
  • .github/workflows/build_mcp_server.yml
  • .github/workflows/check_mcp_server.yml
  • .github/workflows/test_mcp_server.yml
  • catalog/README.md
  • catalog/skill-catalog.json
  • catalog/skill-catalog.manifest.json
  • mcp-server/Cargo.toml
  • mcp-server/src/catalog/oci.rs
  • mcp-server/src/helm.rs
  • mcp-server/src/k8s/helm_releases.rs
  • mcp-server/src/main.rs
  • mcp-server/src/oci_client.rs
  • mcp-server/src/session/sqlite.rs
  • mcp-server/src/skills/oci.rs
  • mcp-server/src/tools/args.rs
  • mcp-server/src/tools/hydra.rs
  • mcp-server/src/tools/mod.rs
  • mcp-server/src/tools/router.rs
  • mcp-server/src/tools/schema_validation.rs
  • mcp-server/src/tools/vault.rs
  • mcp-server/src/tools/workloads/delete.rs
  • mcp-server/src/tools/workloads/dolos.rs
  • mcp-server/src/tools/workloads/get.rs
  • mcp-server/src/tools/workloads/install.rs
  • mcp-server/src/tools/workloads/list.rs
  • mcp-server/src/tools/workloads/logs.rs
  • mcp-server/src/tools/workloads/metrics.rs
  • mcp-server/src/tools/workloads/mod.rs
  • mcp-server/src/tools/workloads/upgrade.rs
✅ Files skipped from review due to trivial changes (3)
  • mcp-server/Cargo.toml
  • mcp-server/src/tools/mod.rs
  • catalog/README.md

Comment on lines +47 to +50
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '22'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify workflows still have tag-only action refs
rg -n '^\s*uses:\s*[^@]+@v?[0-9]+(\.[0-9]+)*\s*$' .github/workflows

Repository: txpipe/metis

Length of output: 2413


Pin actions/setup-node to a commit SHA (avoid @v4 tags).

.github/workflows/build_mcp_server.yml uses actions/setup-node@v4 at line 48, which weakens supply-chain guarantees under strict CI policies.

🔧 Suggested hardening
-      - name: Set up Node.js
-        uses: actions/setup-node@v4
+      - name: Set up Node.js
+        uses: actions/setup-node@<full_commit_sha> # v4
         with:
           node-version: '22'
🧰 Tools
🪛 zizmor (1.25.2)

[error] 48-48: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/build_mcp_server.yml around lines 47 - 50, The workflow
step using "uses: actions/setup-node@v4" (the "Set up Node.js" step) should be
pinned to a specific commit SHA instead of the "`@v4`" tag; update the "uses"
field to the full commit SHA for actions/setup-node (obtain the latest stable
commit SHA from the actions/setup-node repository) so the CI uses an immutable
reference and not a floating tag.

Comment on lines +31 to +34
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '22'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate and display the relevant workflow section
ls -la .github/workflows/check_mcp_server.yml
nl -ba .github/workflows/check_mcp_server.yml | sed -n '1,120p'

# Search for other uses of actions/setup-node in the repo (pinning patterns)
rg -n "uses:\s*actions/setup-node@" -S .github/workflows || true
rg -n "zizmor|unpinned|pin(ed)? action" -S .github/workflows .github || true

Repository: txpipe/metis

Length of output: 204


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file=".github/workflows/check_mcp_server.yml"
echo "== Workflow file =="

# Show first ~120 lines with line numbers (no nl dependency)
awk '{printf "%5d  %s\n", NR, $0}' "$file" | sed -n '1,140p'

echo "== Search for actions/setup-node usage =="

rg -n "uses:\s*actions/setup-node@" .github/workflows || true
rg -n "uses:\s*actions/" .github/workflows | head -n 200 || true

echo "== Search for unpinned/mutable action references patterns =="

rg -n "uses:\s*[^@]+@[vV]([0-9]+|\w+)" .github/workflows .github || true
rg -n "zizmor|unpinned|pin(ed)? action" .github/workflows .github || true

Repository: txpipe/metis

Length of output: 8394


Pin actions/setup-node to a full commit SHA (not @v4).

.github/workflows/check_mcp_server.yml uses actions/setup-node@v4 (mutable tag), weakening CI supply-chain guarantees.

Suggested fix
-      - name: Set up Node.js
-        uses: actions/setup-node@v4
+      - name: Set up Node.js
+        uses: actions/setup-node@<full_commit_sha>
         with:
           node-version: '22'
🧰 Tools
🪛 zizmor (1.25.2)

[error] 32-32: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/check_mcp_server.yml around lines 31 - 34, The workflow
step "Set up Node.js" uses the mutable tag actions/setup-node@v4; replace that
with the repository pinned to a full commit SHA for actions/setup-node (i.e.,
change the uses value from "actions/setup-node@v4" to
"actions/setup-node@<full-commit-sha>") so the CI action is pinned immutably and
supply-chain risk is reduced.

Comment on lines +31 to +34
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '22'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP 'uses:\s*[^@\s]+/[^@\s]+@(?![0-9a-f]{40}\b)' .github/workflows

Repository: txpipe/metis

Length of output: 2792


Pin actions/setup-node to a full commit SHA.

.github/workflows/test_mcp_server.yml line 32 uses a floating tag (actions/setup-node@v4), which weakens supply-chain guarantees versus SHA-pinning.

🧰 Tools
🪛 zizmor (1.25.2)

[error] 32-32: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test_mcp_server.yml around lines 31 - 34, The workflow
step named "Set up Node.js" currently references actions/setup-node@v4 which is
a floating tag; replace that reference with the action's full commit SHA (pin to
a specific immutable revision) so the step uses
actions/setup-node@<full-commit-sha> instead of `@v4`; keep the existing with:
node-version: '22' configuration and update any related documentation or
comments to note the pinned SHA.

Comment on lines +137 to +140
"extensions": [
"cardano-relay",
"cardano-block-producer"
],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apex Fusion workloads won't match this skill.

The content for this skill says it covers "Cardano or Apex Fusion workload metrics", but the extension filter only includes the two Cardano extensions. If skill discovery keys off extensions, Apex Fusion relay/block-producer installs will never see it.

Because this file is generated, the source manifest needs the same fix before regenerating.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@catalog/skill-catalog.json` around lines 137 - 140, The skill's "extensions"
array only lists "cardano-relay" and "cardano-block-producer" but the skill
description says it covers "Cardano or Apex Fusion workload metrics", so Apex
Fusion installs will never match; update the "extensions" array to include the
Apex Fusion extension identifiers used by that platform (e.g., add the Apex
Fusion extension names matching your registry) and also amend the original
source manifest that generates catalog/skill-catalog.json so the change persists
when regenerating.

Comment on lines +252 to +257
"tools": [
"workloads.get",
"workloads.metrics.get",
"workloads.logs.get",
"hydra.keys.generate"
],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep hydra-head-operations inspection-only.

This workflow only reads workload state, but the allowlist now also grants hydra.keys.generate. If skill gating is enforced from tools, this skill can create new key material even though its workflow is documented as read-only boundary inspection.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@catalog/skill-catalog.manifest.json` around lines 252 - 257, The skill
manifest grants an extra write capability; remove "hydra.keys.generate" from the
"tools" array in the skill manifest for the hydra-head-operations workflow so it
remains inspection-only (keep "workloads.get", "workloads.metrics.get",
"workloads.logs.get" as the allowed tools, or replace "hydra.keys.generate" with
a read-only equivalent like "hydra.keys.get" if key visibility is required).
Ensure the change is made in the skill-catalog.manifest.json tools array where
those tool names are listed.

Comment thread mcp-server/src/oci_client.rs
Comment thread mcp-server/src/tools/args.rs
Comment on lines +137 to +153
fn dereference_schema<'a>(schema: &'a Value, root_schema: &'a Value) -> &'a Value {
let Some(reference) = schema.get("$ref").and_then(Value::as_str) else {
return schema;
};
let Some(name) = reference
.strip_prefix("#/definitions/")
.or_else(|| reference.strip_prefix("#/$defs/"))
else {
return schema;
};

root_schema
.get("definitions")
.or_else(|| root_schema.get("$defs"))
.and_then(|definitions| definitions.get(name))
.unwrap_or(schema)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve $ref transitively, not just one hop.

This helper stops after a single lookup. If a property schema points to #/definitions/A and A is itself another $ref, scalar and array values can skip all of the real constraints because validate_property_value never dereferences again after this call.

Proposed fix
 fn dereference_schema<'a>(schema: &'a Value, root_schema: &'a Value) -> &'a Value {
-    let Some(reference) = schema.get("$ref").and_then(Value::as_str) else {
-        return schema;
-    };
-    let Some(name) = reference
-        .strip_prefix("`#/definitions/`")
-        .or_else(|| reference.strip_prefix("`#/`$defs/"))
-    else {
-        return schema;
-    };
-
-    root_schema
-        .get("definitions")
-        .or_else(|| root_schema.get("$defs"))
-        .and_then(|definitions| definitions.get(name))
-        .unwrap_or(schema)
+    let mut current = schema;
+    let mut seen = std::collections::BTreeSet::new();
+
+    while let Some(reference) = current.get("$ref").and_then(Value::as_str) {
+        if !seen.insert(reference) {
+            break;
+        }
+
+        let Some(name) = reference
+            .strip_prefix("`#/definitions/`")
+            .or_else(|| reference.strip_prefix("`#/`$defs/"))
+        else {
+            break;
+        };
+
+        let Some(next) = root_schema
+            .get("definitions")
+            .or_else(|| root_schema.get("$defs"))
+            .and_then(|definitions| definitions.get(name))
+        else {
+            break;
+        };
+
+        current = next;
+    }
+
+    current
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp-server/src/tools/schema_validation.rs` around lines 137 - 153, The
current dereference_schema function only resolves a single $ref hop; update
dereference_schema to follow $ref chains transitively (resolve A -> B -> ...
until a non-$ref schema or missing definition) so validate_property_value gets
the final concrete schema; implement a loop that checks for "$ref" on each
resolved Value, looks up name via the same strip_prefix logic against
root_schema "definitions"/"$defs", and continues resolving, while guarding
against infinite cycles by tracking visited reference names (or a max hop count)
and returning the original schema if resolution fails or a cycle is detected.

Comment on lines +119 to +149
pub(crate) async fn runtime_write(
arguments: Option<&JsonObject>,
default_mode: WriteMode,
) -> CallToolResult {
let path = match runtime_vault_path(arguments) {
Ok(path) => path,
Err(error) => return error,
};
let secret = match secret_argument(arguments) {
Ok(secret) => secret,
Err(error) => return error,
};
let mode = match write_mode(arguments, default_mode) {
Ok(mode) => mode,
Err(error) => return vault_error("vault.runtime.write", error),
};
let client = match VaultClient::from_env() {
Ok(client) => client,
Err(error) => return vault_error("vault.runtime.write", error),
};

match client.write_runtime_secret(&path, &secret, mode).await {
Ok(receipt) => success(json!({
"path": receipt.path,
"writtenKeys": receipt.written_keys,
"version": receipt.version,
"secretValuesReturned": false,
})),
Err(error) => vault_error("vault.runtime.write", error),
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Parameterize tool name in runtime_write error paths.

Because this helper is mode-driven (default_mode) and likely reused by both write/patch flows, hardcoding "vault.runtime.write" (Lines 133, 137, 147) can mislabel patch failures in tool/audit responses.

🔧 Suggested refactor
 pub(crate) async fn runtime_write(
     arguments: Option<&JsonObject>,
     default_mode: WriteMode,
+    tool_name: &'static str,
 ) -> CallToolResult {
@@
-        Err(error) => return vault_error("vault.runtime.write", error),
+        Err(error) => return vault_error(tool_name, error),
@@
-        Err(error) => return vault_error("vault.runtime.write", error),
+        Err(error) => return vault_error(tool_name, error),
@@
-        Err(error) => vault_error("vault.runtime.write", error),
+        Err(error) => vault_error(tool_name, error),
     }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub(crate) async fn runtime_write(
arguments: Option<&JsonObject>,
default_mode: WriteMode,
) -> CallToolResult {
let path = match runtime_vault_path(arguments) {
Ok(path) => path,
Err(error) => return error,
};
let secret = match secret_argument(arguments) {
Ok(secret) => secret,
Err(error) => return error,
};
let mode = match write_mode(arguments, default_mode) {
Ok(mode) => mode,
Err(error) => return vault_error("vault.runtime.write", error),
};
let client = match VaultClient::from_env() {
Ok(client) => client,
Err(error) => return vault_error("vault.runtime.write", error),
};
match client.write_runtime_secret(&path, &secret, mode).await {
Ok(receipt) => success(json!({
"path": receipt.path,
"writtenKeys": receipt.written_keys,
"version": receipt.version,
"secretValuesReturned": false,
})),
Err(error) => vault_error("vault.runtime.write", error),
}
}
pub(crate) async fn runtime_write(
arguments: Option<&JsonObject>,
default_mode: WriteMode,
tool_name: &'static str,
) -> CallToolResult {
let path = match runtime_vault_path(arguments) {
Ok(path) => path,
Err(error) => return error,
};
let secret = match secret_argument(arguments) {
Ok(secret) => secret,
Err(error) => return error,
};
let mode = match write_mode(arguments, default_mode) {
Ok(mode) => mode,
Err(error) => return vault_error(tool_name, error),
};
let client = match VaultClient::from_env() {
Ok(client) => client,
Err(error) => return vault_error(tool_name, error),
};
match client.write_runtime_secret(&path, &secret, mode).await {
Ok(receipt) => success(json!({
"path": receipt.path,
"writtenKeys": receipt.written_keys,
"version": receipt.version,
"secretValuesReturned": false,
})),
Err(error) => vault_error(tool_name, error),
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp-server/src/tools/vault.rs` around lines 119 - 149, The error paths in
runtime_write currently hardcode "vault.runtime.write" which can mislabel
failures when this helper is reused for other modes (e.g., patch); change
runtime_write signature to accept a tool_name: &str (or similar) and use that
tool_name in the vault_error calls instead of the literal string; update the
three vault_error invocations inside runtime_write and adjust its callers to
pass the correct tool identifier so errors from write_mode,
VaultClient::from_env, and client.write_runtime_secret report the proper tool
name.

@gonzalezzfelipe gonzalezzfelipe merged commit 8a7585a into main May 22, 2026
21 checks passed
@gonzalezzfelipe gonzalezzfelipe deleted the feat/skills-as-resources branch May 22, 2026 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant