diff --git a/.machine_readable/REGISTRY.a2ml b/.machine_readable/REGISTRY.a2ml
index b9f84045..916c0c53 100644
--- a/.machine_readable/REGISTRY.a2ml
+++ b/.machine_readable/REGISTRY.a2ml
@@ -216,7 +216,7 @@ name = "RSR — Rhodium Standard Repositories"
 stream = "governance"
 home = "rhodium-standard-repositories/"
 canonical_doc = "rhodium-standard-repositories/README.adoc"
-source_hash = "sha256:2d4e465bee215808306f28053a84d2f146a7fb7f6e6e3780e5d6f4c1d18c7404"
+source_hash = "sha256:01e6373ae01939b5ed24c72e1c4ace7ea55559b3fc765a956bc2e7ad722b244b"
 route = "the repository-compliance standard every repo is graded against"
 
 [[spec]]
diff --git a/docs/migrations/pmpl-to-mpl-sweep-runbook.adoc b/docs/migrations/pmpl-to-mpl-sweep-runbook.adoc
new file mode 100644
index 00000000..dae0abf7
--- /dev/null
+++ b/docs/migrations/pmpl-to-mpl-sweep-runbook.adoc
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: CC-BY-SA-4.0
+// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell (hyperpolymath)
+= Estate-wide PMPL → MPL-2.0 licence sweep — delegation runbook
+:toc: left
+:revdate: 2026-06-27
+
+Organising brief + capabilities/access spec for executing the stray-`PMPL-1.0`
+SPDX-header cleanup across `org:hyperpolymath`. Produced from a complete
+GitHub code-search inventory (3 pages, 271 distinct matches, deduplicated).
+
+IMPORTANT: This is FLAG-AND-PLAN, not a licence change in itself. Per estate
+policy every licence edit is per-file, owner-approval-gated, and NEVER a bulk
+SPDX sweep. This runbook exists so a delegated agent can execute *correctly*.
+
+== 1. The headline: the sweep is NARROW
+
+Of all `SPDX-License-Identifier: PMPL-1.0[-or-later]` headers in the org, the
+overwhelming majority are *legitimate* or *must-not-touch*. The genuinely
+actionable SPDX-header surface is **22 files across 4 repos** (5 atomic +
+17 in `nextgen-databases`, resolved in §4). A much larger *body-text* drift
+surface (236 matches) is template-propagated — see §7.
+
+[cols="3,1,1,4",options="header"]
+|===
+| Repo | Files | Class | Disposition
+
+| `palimpsest-license` | 126 | 🟢 CARVE-OUT | PMPL is CORRECT here. Leave.
+| `palimpsest-plasma` | 74 | 🟢 CARVE-OUT | PMPL is CORRECT here. Leave.
+| `007` | 5 | 🔴 ARR | DO NOT TOUCH. All-Rights-Reserved. Surface to owner only.
+| `idaptik` | 3 | 🟠 SON-SHARED | → **AGPL-3.0-or-later** (NOT MPL-2.0). Actionable.
+| `developer-ecosystem` | 43 | 🔵 DRIFT | 1 actionable; 42 are vendored `rescript/` fork — leave.
+| `email-octad-experiment`| 1 | 🔵 DRIFT | 1 actionable.
+| `nextgen-databases` | 17 | 🔵 DRIFT | IN SCOPE — estate-authored ReScript clients → MPL-2.0/CC-BY-SA-4.0 (§4).
+| `standards` | 1 | 🔵 DRIFT | `LICENSES/PMPL-1.0-or-later.txt` = licence-exhibit text. Leave.
+| `zotero-tools` | 1 | 🔵 DRIFT | `zotpress/` upstream WordPress fork. Leave.
+|===
+
+`consent-aware-http` (the 3rd legitimate carve-out) shows **zero** PMPL headers —
+its carve-out is prospective-only per policy, so that is expected and correct.
+
+== 2. The actionable set (22 files, owner-approval-gated)
+
+=== 2a. → MPL-2.0 (sole-owner code)
+* `developer-ecosystem` : `affinescript-ecosystem/rattlescript/affinescript/lib/wasm_encode.ml`
+  (OCaml source header `(* SPDX-License-Identifier: PMPL-1.0-or-later *)`).
+* `email-octad-experiment` : `.github/workflows/workflow-linter.yml`.
+* `nextgen-databases` : the 17 `lithoglyph/clients/rescript/…` +
+  `verisimdb/connectors/clients/rescript/…` files — `.res` → MPL-2.0, the one
+  client `README.md` (prose) → CC-BY-SA-4.0 (estate-authored, see §4).
+
+=== 2b. → AGPL-3.0-or-later (son-shared — NOT MPL-2.0)
+* `idaptik` : `.github/workflows/governance.yml`
+* `idaptik` : `.github/workflows/hypatia-scan.yml`
+* `idaptik` : `.github/workflows/scorecard.yml`
+
+idaptik is co-maintained with Joshua; estate policy fixes son-shared repos at
+AGPL-3.0-or-later. A delegated agent MUST special-case this — flipping idaptik
+to MPL-2.0 would be a policy violation.
+
+== 3. The leave-alone set (DO NOT EDIT — reasons)
+
+* *Carve-outs* (`palimpsest-license`, `palimpsest-plasma`): PMPL is the correct
+  licence; 200 files, no action ever.
+* *`007`*: ARR. Out of scope for any normalisation/scan/label. Owner-only.
+* *Vendored `rescript/` forks* (`developer-ecosystem` 42 files): the `rescript/`
+  path segment is the estate's upstream-*compiler*-fork carve-out. Never sweep
+  upstream fork headers. NOTE: this does NOT cover `nextgen-databases`, whose
+  `rescript/` dirs are estate-authored clients, not the vendored compiler (§4).
+* *`zotero-tools/zotpress/`*: upstream WordPress-plugin fork.
+* *Licence-exhibit text* (`standards/LICENSES/PMPL-1.0-or-later.txt`, and within
+  the carve-outs: `legal/`, `/exhibits/`, `EXHIBIT-*`, `PMPL-SPEC.adoc`): the
+  verbatim Palimpsest licence body. A repo legitimately keeps the PMPL text on
+  file regardless of its own licence.
+
+== 4. `nextgen-databases` — RESOLVED (estate-authored → in scope)
+
+The 17 matches sit under `lithoglyph/clients/rescript/...` and
+`verisimdb/connectors/clients/rescript/...`. Evidence settles it: the repo's own
+`verisimdb/llm-warmup-dev.md` reads *"VeriSimDB … Part of the nextgen-databases
+monorepo. License: PMPL-1.0-or-later. **Author: Jonathan D.A. Jewell.**"* These
+are **owner-authored** estate databases and their ReScript *clients*, NOT the
+vendored ReScript-*compiler* fork the `rescript/` carve-out protects.
+
+⇒ **IN SCOPE.** The `.res` client sources → `MPL-2.0`; the client `README.md`
+(prose) → `CC-BY-SA-4.0`. (The estate's ReScript→AffineScript ban is orthogonal:
+these grandfathered `.res` files still get the correct licence now.) This adds
+the 17 `nextgen-databases` files to the actionable set, taking it to **22 files
+across 4 repos**.
+
+== 5. Code vs prose split (which licence per file)
+
+* *Code / config / workflow / source headers* → `MPL-2.0` (or `AGPL-3.0-or-later`
+  for idaptik).
+* *Prose docs* (`.adoc`/`.md` narrative: README, guides, EXPLAINME, ROADMAP,
+  MAINTAINERS, CONTRIBUTING bodies) → `CC-BY-SA-4.0`.
+* *Machine-readable* (`.a2ml`, `.ncl`, `.scm`, `Justfile`, `Mustfile`) → treat as
+  code → `MPL-2.0`.
+* GitHub-required `.md` (SECURITY/CONTRIBUTING/CoC/CHANGELOG) keep `.md` but the
+  SPDX line still flips per the code/prose rule above.
+
+== 6. Per-file discipline (the part every prior LLM sweep got wrong)
+
+For each candidate file, OPEN it and flip ONLY a genuine *self-declaration* SPDX
+header. NEVER touch, even when the bytes say `PMPL`:
+
+. Policy/classification tables (a CLAUDE.md that *lists* PMPL as a category).
+. History (CHANGELOG / AUDIT / commit-trail entries describing the old licence).
+. Glossaries / docs that say "PMPL is the FORMER licence".
+. Archived material, incl. any doc embedding an old `sed`-sweep command.
+. Third-party / vendored / forked text and headers.
+. Licence-exhibit / `LICENSES/` / `legal/` verbatim text.
+
+Precedent: in `absolute-zero` #92, 3 stale self-declarations were flipped and
+**10** legitimate PMPL references were deliberately left. Expect a similar
+keep:flip ratio.
+
+== 7. Secondary drift forms (Phase 2 — INVENTORIED)
+
+The §1 inventory covers the canonical `SPDX-License-Identifier:` header. Two
+further org-wide searches quantify what that pass misses:
+
+=== 7a. Body-text `License: PMPL-1.0-or-later` — **236 matches, estate-wide**
+This is the LARGEST drift surface by count, and it is **template-propagated**,
+not hand-written. The bulk come from a shared scaffold replicated across dozens
+of repos: `llm-warmup-dev.md` / `llm-warmup-user.md`, `.well-known/trust.txt` /
+`humans.txt`, REUSE `dep5`, and `CITATION.cff` (many still carrying `{{AUTHOR}}`
+placeholders). Caught absolute-zero's `ai.txt`/`humans.txt`/`.ipkg`; this header
+search does NOT.
+
+**Strategy — fix at the source, not 236 times.** Locate the scaffold generator
+(rsr-template-repo / scaffoldia / the `llm-warmup-*` template) and flip the
+template's licence line to the correct per-category value, THEN re-propagate.
+Doing 236 manual edits invites immediate re-drift on the next scaffold run.
+Exclude carve-outs, `007`, son-shared, and forks from any propagation, exactly
+as in §1/§3.
+
+=== 7b. Banned `MPL-2.0-or-later` → `MPL-2.0` — **25 matches**
+Estate policy is `MPL-2.0` (never `-or-later`). The 25 hits concentrate in
+delicate areas: the `standards` RSR *satellites* (`…/satellites/palimpsest-license/…`,
+`…/consent-aware-http/…`), the carve-out repos themselves, and the `zotpress`
+fork. Few are plain sole-owner code; each needs the §6 per-file eyeball before
+dropping the `-or-later`. Lower priority than 7a.
+
+=== 7c. REUSE files (`.reuse/dep5`, `REUSE.toml`)
+A subset of 7a — mostly template `dep5` files with `{{AUTHOR}}` placeholders.
+Fix with the scaffold, per 7a.
+
+== 8. Per-repo execution procedure
+
+For each actionable repo (developer-ecosystem, email-octad-experiment, idaptik,
+nextgen-databases — all four confirmed in §1/§3/§4):
+
+. Branch `claude/` off latest `main`; never push to main.
+. Open each candidate file; apply §6 discipline; flip only genuine headers to
+  the §5 licence.
+. Re-run the repo's own CI locally where it has content-addressed gates
+  (e.g. a registry/topology regen — `standards` needed `just registry` after a
+  tracked-file edit; expect similar in repos with `.machine_readable/REGISTRY`).
+. Commit per-repo with a clear `chore(licence):` message enumerating files +
+  the keep-list.
+. Push `-u origin claude/`; open a **draft** PR; let CI run.
+. Treat pre-existing repo-wide governance/Hypatia-baseline reds as out-of-scope
+  (they fail on main too); only fix reds your diff introduced.
+. **Get explicit owner approval on the per-repo change-list BEFORE marking
+  ready / merging.** This is the hard gate — owner-only, per repo.
+
+== 9. Capabilities / access a delegated agent needs
+
+[cols="2,5",options="header"]
+|===
+| Need | Why / specifics
+
+| GitHub MCP scope | Session repo-scope must include the *drift* repos:
+  `developer-ecosystem`, `email-octad-experiment`, `idaptik`,
+  `nextgen-databases`. (The proof-repo session that produced this runbook is
+  scoped to the 10 formal-methods repos — a different set.) Use the session's
+  `add_repo` / scope config to add them. Do NOT add `007` (ARR).
+| Write + PR | push to `claude/` branches + open draft PRs (the
+  `mcp__github__*` create_pull_request / push path).
+| Read-only verify | `search_code` for the secondary forms in §7;
+  `get_file_contents` to apply §6 per-file.
+| Policy doc | This runbook + the estate 5-way licence policy
+  (`standards/.claude/CLAUDE.md` §"License Policy — Manual Only").
+| Toolchain | NONE for the header edits (no proof/build assistant required).
+  Only the repo's own CI helpers (e.g. a registry regenerator script) where a
+  content-addressed gate exists.
+| NOT needed | Any access to `007`; any bulk-rewrite tooling; any
+  cross-repo automated SPDX sweeper (explicitly forbidden).
+|===
+
+== 10. What this session already did (context)
+
+* `absolute-zero` #92 (merged) — 3 stale PMPL body-declarations → MPL-2.0.
+* `epistemic-types` #13 (merged) — stale "No LICENSE" note corrected.
+* `standards` #433 (MERGED) — `rsr-audit.sh` LICENSE checks MIT/Palimpsest
+  → MPL-2.0 (owner-approved discharge of #390's flag).
+
+None of the 10 in-scope proof repos carry stray PMPL *SPDX headers* — only
+`standards` appeared in §1, and its single hit is licence-exhibit text (leave).
+
+== 11. FLAG — `panll` governance conflict (owner decision, do NOT auto-flip)
+
+`panll` carries body-text PMPL (`docs/guides/llm-warmup-user.md`: "License:
+PMPL-1.0-or-later. Author: Jonathan D.A. Jewell") AND its own
+`.claude/CLAUDE.md` still mandates *"PMPL-1.0-or-later on all original source
+files."* By estate policy `panll` is sole-owner ⇒ should be `MPL-2.0`, and is
+NOT one of the three Palimpsest carve-outs — so the local CLAUDE.md is itself
+stale drift. But a repo whose own checked-in governance explicitly mandates PMPL
+must NOT be flipped autonomously: that is a deliberate-looking contradiction
+only the owner can resolve.
+
+**Decision needed:** either (a) `panll` flips to `MPL-2.0` (estate policy) —
+in which case its `CLAUDE.md` licence line flips too, in the same change; or
+(b) `panll` is an intentional PMPL exception — in which case the estate policy's
+carve-out list must be widened to name it. Until the owner rules, `panll` is
+flag-only. (This is the one in-scope proof repo with a body-text PMPL conflict;
+the SPDX-header sweep in §1 is unaffected.)
diff --git a/rhodium-standard-repositories/rsr-audit.sh b/rhodium-standard-repositories/rsr-audit.sh
index ac231041..f3af5ada 100755
--- a/rhodium-standard-repositories/rsr-audit.sh
+++ b/rhodium-standard-repositories/rsr-audit.sh
@@ -246,7 +246,9 @@ audit_category_2_documentation() {
     # SECURITY.md validation
     if [[ -f "$REPO_PATH/SECURITY.md" ]]; then
         check_file_contains "SECURITY.md" "Reporting" "SECURITY.md has vulnerability reporting"
-        check_file_contains "SECURITY.md" "24 hours" "SECURITY.md has response timeline"
+        # Estate-tolerant: credit any documented response SLA phrasing, not just
+        # the literal "24 hours" (repos use "Response Timeline", "business day", etc.)
+        check_file_contains "SECURITY.md" "24 hours\\|48 hours\\|72 hours\\|business day\\|[Rr]esponse [Tt]ime\\|SLA" "SECURITY.md has response timeline"
     fi
 
     # CONTRIBUTING.md validation (TPCF)