Merged
2 changes: 1 addition & 1 deletion .machine_readable/REGISTRY.a2ml
@@ -216,7 +216,7 @@ name = "RSR — Rhodium Standard Repositories"
stream = "governance"
home = "rhodium-standard-repositories/"
canonical_doc = "rhodium-standard-repositories/README.adoc"
-source_hash = "sha256:2d4e465bee215808306f28053a84d2f146a7fb7f6e6e3780e5d6f4c1d18c7404"
+source_hash = "sha256:01e6373ae01939b5ed24c72e1c4ace7ea55559b3fc765a956bc2e7ad722b244b"
route = "the repository-compliance standard every repo is graded against"

[[spec]]
217 changes: 217 additions & 0 deletions docs/migrations/pmpl-to-mpl-sweep-runbook.adoc
@@ -0,0 +1,217 @@
// SPDX-License-Identifier: CC-BY-SA-4.0
// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell (hyperpolymath) <j.d.a.jewell@open.ac.uk>
= Estate-wide PMPL → MPL-2.0 licence sweep — delegation runbook
:toc: left
:revdate: 2026-06-27

Organising brief + capabilities/access spec for executing the stray-`PMPL-1.0`
SPDX-header cleanup across `org:hyperpolymath`. Produced from a complete
GitHub code-search inventory (3 pages, 271 distinct matches, deduplicated).

IMPORTANT: This is FLAG-AND-PLAN, not a licence change in itself. Per estate
policy every licence edit is per-file, owner-approval-gated, and NEVER a bulk
SPDX sweep. This runbook exists so a delegated agent can execute *correctly*.

== 1. The headline: the sweep is NARROW

Of all `SPDX-License-Identifier: PMPL-1.0[-or-later]` headers in the org, the
overwhelming majority are *legitimate* or *must-not-touch*. The genuinely
actionable SPDX-header surface is **22 files across 4 repos** (5 atomic +
17 in `nextgen-databases`, resolved in §4). A much larger *body-text* drift
surface (236 matches) is template-propagated — see §7.

[cols="3,1,1,4",options="header"]
|===
| Repo | Files | Class | Disposition

| `palimpsest-license` | 126 | 🟢 CARVE-OUT | PMPL is CORRECT here. Leave.
| `palimpsest-plasma` | 74 | 🟢 CARVE-OUT | PMPL is CORRECT here. Leave.
| `007` | 5 | 🔴 ARR | DO NOT TOUCH. All-Rights-Reserved. Surface to owner only.
| `idaptik` | 3 | 🟠 SON-SHARED | → **AGPL-3.0-or-later** (NOT MPL-2.0). Actionable.
| `developer-ecosystem` | 43 | 🔵 DRIFT | 1 actionable; 42 are vendored `rescript/` fork — leave.
| `email-octad-experiment` | 1 | 🔵 DRIFT | 1 actionable.
| `nextgen-databases` | 17 | 🔵 DRIFT | IN SCOPE — estate-authored ReScript clients → MPL-2.0/CC-BY-SA-4.0 (§4).
| `standards` | 1 | 🔵 DRIFT | `LICENSES/PMPL-1.0-or-later.txt` = licence-exhibit text. Leave.
| `zotero-tools` | 1 | 🔵 DRIFT | `zotpress/` upstream WordPress fork. Leave.
|===

`consent-aware-http` (the 3rd legitimate carve-out) shows **zero** PMPL headers —
its carve-out is prospective-only per policy, so that is expected and correct.
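
The §1 counts can be cross-checked with plain shell arithmetic. A minimal sketch: the figures are copied from the table above, and the variable names are illustrative only.

```shell
# Per-repo match counts, copied from the §1 table (variable names are illustrative).
total=$((126 + 74 + 5 + 3 + 43 + 1 + 17 + 1 + 1))

# Actionable headers: developer-ecosystem 1, email-octad-experiment 1,
# idaptik 3, nextgen-databases 17.
actionable=$((1 + 1 + 3 + 17))

# Everything else is carve-out, ARR, vendored fork, or licence-exhibit text.
leave=$((total - actionable))

echo "inventory=$total actionable=$actionable leave-alone=$leave"
```

This prints `inventory=271 actionable=22 leave-alone=249`, which agrees with the 271 deduplicated matches and the 22-file actionable set.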

== 2. The actionable set (22 files, owner-approval-gated)

=== 2a. → MPL-2.0 (sole-owner code)
* `developer-ecosystem` : `affinescript-ecosystem/rattlescript/affinescript/lib/wasm_encode.ml`
(OCaml source header `(* SPDX-License-Identifier: PMPL-1.0-or-later *)`).
* `email-octad-experiment` : `.github/workflows/workflow-linter.yml`.
* `nextgen-databases` : the 17 `lithoglyph/clients/rescript/…` +
`verisimdb/connectors/clients/rescript/…` files — `.res` → MPL-2.0, the one
client `README.md` (prose) → CC-BY-SA-4.0 (estate-authored, see §4).

=== 2b. → AGPL-3.0-or-later (son-shared — NOT MPL-2.0)
* `idaptik` : `.github/workflows/governance.yml`
* `idaptik` : `.github/workflows/hypatia-scan.yml`
* `idaptik` : `.github/workflows/scorecard.yml`

idaptik is co-maintained with Joshua; estate policy fixes son-shared repos at
AGPL-3.0-or-later. A delegated agent MUST special-case this — flipping idaptik
to MPL-2.0 would be a policy violation.

== 3. The leave-alone set (DO NOT EDIT — reasons)

* *Carve-outs* (`palimpsest-license`, `palimpsest-plasma`): PMPL is the correct
licence; 200 files, no action ever.
* *`007`*: ARR. Out of scope for any normalisation/scan/label. Owner-only.
* *Vendored `rescript/` forks* (`developer-ecosystem` 42 files): the `rescript/`
path segment is the estate's upstream-*compiler*-fork carve-out. Never sweep
upstream fork headers. NOTE: this does NOT cover `nextgen-databases`, whose
`rescript/` dirs are estate-authored clients, not the vendored compiler (§4).
* *`zotero-tools/zotpress/`*: upstream WordPress-plugin fork.
* *Licence-exhibit text* (`standards/LICENSES/PMPL-1.0-or-later.txt`, and within
the carve-outs: `legal/`, `/exhibits/`, `EXHIBIT-*`, `PMPL-SPEC.adoc`): the
verbatim Palimpsest licence body. A repo legitimately keeps the PMPL text on
file regardless of its own licence.

== 4. `nextgen-databases` — RESOLVED (estate-authored → in scope)

The 17 matches sit under `lithoglyph/clients/rescript/...` and
`verisimdb/connectors/clients/rescript/...`. Evidence settles it: the repo's own
`verisimdb/llm-warmup-dev.md` reads *"VeriSimDB … Part of the nextgen-databases
monorepo. License: PMPL-1.0-or-later. **Author: Jonathan D.A. Jewell.**"* These
are **owner-authored** estate databases and their ReScript *clients*, NOT the
vendored ReScript-*compiler* fork the `rescript/` carve-out protects.

⇒ **IN SCOPE.** The `.res` client sources → `MPL-2.0`; the client `README.md`
(prose) → `CC-BY-SA-4.0`. (The estate's ReScript→AffineScript ban is orthogonal:
these grandfathered `.res` files still get the correct licence now.) This adds
the 17 `nextgen-databases` files to the actionable set, taking it to **22 files
across 4 repos**.

== 5. Code vs prose split (which licence per file)

* *Code / config / workflow / source headers* → `MPL-2.0` (or `AGPL-3.0-or-later`
for idaptik).
* *Prose docs* (`.adoc`/`.md` narrative: README, guides, EXPLAINME, ROADMAP,
MAINTAINERS, CONTRIBUTING bodies) → `CC-BY-SA-4.0`.
* *Machine-readable* (`.a2ml`, `.ncl`, `.scm`, `Justfile`, `Mustfile`) → treat as
code → `MPL-2.0`.
* GitHub-required `.md` files (SECURITY/CONTRIBUTING/CoC/CHANGELOG) keep their `.md`
extension, but the SPDX line still flips per the code/prose rule above.
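
The split above can be sketched as a small lookup. This is an illustration only, not estate tooling: the function name `target_licence` and its argument order are hypothetical, it must be applied only to files that already passed the §3 leave-alone checks, and treating idaptik prose like any other prose is an assumption the runbook does not state.

```shell
# Hypothetical helper: print the §5 target licence for one file.
# Usage: target_licence <repo> <path>
target_licence() {
    repo=$1
    path=$2
    file=${path##*/}    # strip directories, keep the file name

    case "$file" in
        *.adoc|*.md)
            # Prose docs. ASSUMPTION: the same rule is applied in idaptik.
            echo "CC-BY-SA-4.0"
            return 0
            ;;
    esac

    # Code, config, workflows and machine-readable files
    # (.a2ml, .ncl, .scm, Justfile, Mustfile) all count as code.
    if [ "$repo" = "idaptik" ]; then
        echo "AGPL-3.0-or-later"    # son-shared repo, never MPL-2.0
    else
        echo "MPL-2.0"
    fi
}
```

For example, `target_licence idaptik .github/workflows/scorecard.yml` prints `AGPL-3.0-or-later`, while `target_licence email-octad-experiment .github/workflows/workflow-linter.yml` prints `MPL-2.0`. The result is a proposal for the owner's change-list, not an instruction to edit.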

== 6. Per-file discipline (the part every prior LLM sweep got wrong)

For each candidate file, OPEN it and flip ONLY a genuine *self-declaration* SPDX
header. NEVER touch, even when the bytes say `PMPL`:

. Policy/classification tables (a CLAUDE.md that *lists* PMPL as a category).
. History (CHANGELOG / AUDIT / commit-trail entries describing the old licence).
. Glossaries / docs that say "PMPL is the FORMER licence".
. Archived material, incl. any doc embedding an old `sed`-sweep command.
. Third-party / vendored / forked text and headers.
. Licence-exhibit / `LICENSES/` / `legal/` verbatim text.

Precedent: in `absolute-zero` #92, 3 stale self-declarations were flipped and
**10** legitimate PMPL references were deliberately left. Expect a similar
keep:flip ratio.
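
A read-only pre-filter can narrow the candidates before the manual eyeball. A minimal sketch under stated assumptions: the function names are hypothetical, the 5-line header window is a guess, and a hit only means "open this file and apply the list above". It edits nothing and does not replace the per-file review.

```shell
# Hypothetical helper: succeed if the top of the file carries an SPDX
# self-declaration for PMPL-1.0 (with or without -or-later).
has_pmpl_self_declaration() {
    # Only the first 5 lines are inspected (assumed header window).
    # The line may start with comment leaders such as "//", "#" or "(*",
    # but not with prose, so sentences that merely mention the tag are skipped.
    head -n 5 "$1" |
        grep -Eq '^[^[:alnum:]]*SPDX-License-Identifier:[[:space:]]*PMPL-1\.0'
}

# Report each file given as an argument; never modifies anything.
flag_candidates() {
    for f in "$@"; do
        if has_pmpl_self_declaration "$f"; then
            echo "CANDIDATE (open and review): $f"
        else
            echo "no header self-declaration: $f"
        fi
    done
}
```

A CHANGELOG sentence describing the old header is not flagged, because the line starts with prose rather than a comment leader. Vendored and licence-exhibit files still need the §3 exclusions, since their headers look identical to genuine self-declarations.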

== 7. Secondary drift forms (Phase 2 — INVENTORIED)

The §1 inventory covers the canonical `SPDX-License-Identifier:` header. Two
further org-wide searches quantify what that pass misses:

=== 7a. Body-text `License: PMPL-1.0-or-later` — **236 matches, estate-wide**
This is the LARGEST drift surface by count, and it is **template-propagated**,
not hand-written. The bulk come from a shared scaffold replicated across dozens
of repos: `llm-warmup-dev.md` / `llm-warmup-user.md`, `.well-known/trust.txt` /
`humans.txt`, REUSE `dep5`, and `CITATION.cff` (many still carrying `{{AUTHOR}}`
placeholders). The body-text search caught absolute-zero's `ai.txt`/`humans.txt`/`.ipkg`;
the §1 header search does NOT.

**Strategy — fix at the source, not 236 times.** Locate the scaffold generator
(rsr-template-repo / scaffoldia / the `llm-warmup-*` template) and flip the
template's licence line to the correct per-category value, THEN re-propagate.
Doing 236 manual edits invites immediate re-drift on the next scaffold run.
Exclude carve-outs, `007`, son-shared, and forks from any propagation, exactly
as in §1/§3.
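
Grouping the matched paths by file name shows whether a hit is template-propagated. A minimal sketch, assuming the search results were saved as one path per line; the function name `count_by_basename` is hypothetical and no search API is called.

```shell
# Hypothetical helper: read paths on stdin, print "<count> <file name>",
# most frequent first. A file name with a high count points at a shared scaffold.
count_by_basename() {
    awk -F/ '{ n[$NF]++ } END { for (f in n) print n[f], f }' |
        sort -k1,1nr -k2,2
}

# Example with made-up repo names:
printf '%s\n' \
    repo-a/llm-warmup-dev.md \
    repo-b/llm-warmup-dev.md \
    repo-c/CITATION.cff |
    count_by_basename
```

The example prints `2 llm-warmup-dev.md` followed by `1 CITATION.cff`. File names that recur across many repos belong to the scaffold and are fixed once in the template; names that occur once are hand-written and go through §6.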

=== 7b. Banned `MPL-2.0-or-later` → `MPL-2.0` — **25 matches**
Estate policy is `MPL-2.0` (never `-or-later`). The 25 hits concentrate in
delicate areas: the `standards` RSR *satellites* (`…/satellites/palimpsest-license/…`,
`…/consent-aware-http/…`), the carve-out repos themselves, and the `zotpress`
fork. Few are plain sole-owner code; each needs the §6 per-file eyeball before
dropping the `-or-later`. Lower priority than 7a.

=== 7c. REUSE files (`.reuse/dep5`, `REUSE.toml`)
A subset of 7a — mostly template `dep5` files with `{{AUTHOR}}` placeholders.
Fix with the scaffold, per 7a.

== 8. Per-repo execution procedure

For each actionable repo (developer-ecosystem, email-octad-experiment, idaptik,
nextgen-databases; all four confirmed in §1/§2/§4):

. Branch `claude/<name>` off latest `main`; never push to main.
. Open each candidate file; apply §6 discipline; flip only genuine headers to
the §5 licence.
. Re-run the repo's own CI locally where it has content-addressed gates
(e.g. a registry/topology regen — `standards` needed `just registry` after a
tracked-file edit; expect similar in repos with `.machine_readable/REGISTRY`).
. Commit per-repo with a clear `chore(licence):` message enumerating files +
the keep-list.
. Push `-u origin claude/<name>`; open a **draft** PR; let CI run.
. Treat pre-existing repo-wide governance/Hypatia-baseline reds as out-of-scope
(they fail on main too); only fix reds your diff introduced.
. **Get explicit owner approval on the per-repo change-list BEFORE marking
ready / merging.** This is the hard gate — owner-only, per repo.
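
The branch and PR steps above can be previewed as a dry run. A sketch only: the function name is hypothetical, `name` stands for whatever the `<name>` placeholder resolves to, and the `gh` CLI is an assumption (the session described in §9 uses the GitHub MCP tools instead). Nothing is executed; the commands are printed for review.

```shell
# Hypothetical helper: print (do not run) the git/gh commands for one repo.
# Usage: print_sweep_plan <name>
print_sweep_plan() {
    name=$1
    branch="claude/$name"
    cat <<EOF
git fetch origin main
git switch -c $branch origin/main
# ... open each candidate, apply the per-file discipline, re-run local gates ...
git commit -m "chore(licence): <files flipped + keep-list>"
git push -u origin $branch
gh pr create --draft --fill
# STOP: owner approval on the change-list before marking ready or merging
EOF
}

print_sweep_plan example
```

Printing rather than executing keeps the owner-approval gate visible: the plan ends at a draft PR and never reaches a merge step.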

== 9. Capabilities / access a delegated agent needs

[cols="2,5",options="header"]
|===
| Need | Why / specifics

| GitHub MCP scope | Session repo-scope must include the *drift* repos:
`developer-ecosystem`, `email-octad-experiment`, `idaptik`,
`nextgen-databases`. (The proof-repo session that produced this runbook is
scoped to the 10 formal-methods repos — a different set.) Use the session's
`add_repo` / scope config to add them. Do NOT add `007` (ARR).
| Write + PR | push to `claude/<name>` branches + open draft PRs (the
`mcp__github__*` create_pull_request / push path).
| Read-only verify | `search_code` for the secondary forms in §7;
`get_file_contents` to apply §6 per-file.
| Policy doc | This runbook + the estate 5-way licence policy
(`standards/.claude/CLAUDE.md` §"License Policy — Manual Only").
| Toolchain | NONE for the header edits (no proof/build assistant required).
Only the repo's own CI helpers (e.g. a registry regenerator script) where a
content-addressed gate exists.
| NOT needed | Any access to `007`; any bulk-rewrite tooling; any
cross-repo automated SPDX sweeper (explicitly forbidden).
|===

== 10. What this session already did (context)

* `absolute-zero` #92 (merged) — 3 stale PMPL body-declarations → MPL-2.0.
* `epistemic-types` #13 (merged) — stale "No LICENSE" note corrected.
* `standards` #433 (MERGED) — `rsr-audit.sh` LICENSE checks MIT/Palimpsest
→ MPL-2.0 (owner-approved discharge of #390's flag).

None of the 10 in-scope proof repos carry stray PMPL *SPDX headers* — only
`standards` appeared in §1, and its single hit is licence-exhibit text (leave).

== 11. FLAG — `panll` governance conflict (owner decision, do NOT auto-flip)

`panll` carries body-text PMPL (`docs/guides/llm-warmup-user.md`: "License:
PMPL-1.0-or-later. Author: Jonathan D.A. Jewell") AND its own
`.claude/CLAUDE.md` still mandates *"PMPL-1.0-or-later on all original source
files."* By estate policy `panll` is sole-owner ⇒ should be `MPL-2.0`, and is
NOT one of the three Palimpsest carve-outs — so the local CLAUDE.md is itself
stale drift. But a repo whose own checked-in governance explicitly mandates PMPL
must NOT be flipped autonomously: that is a deliberate-looking contradiction
only the owner can resolve.

**Decision needed:** either (a) `panll` flips to `MPL-2.0` (estate policy) —
in which case its `CLAUDE.md` licence line flips too, in the same change; or
(b) `panll` is an intentional PMPL exception — in which case the estate policy's
carve-out list must be widened to name it. Until the owner rules, `panll` is
flag-only. (This is the one in-scope proof repo with a body-text PMPL conflict;
the SPDX-header sweep in §1 is unaffected.)
4 changes: 3 additions & 1 deletion rhodium-standard-repositories/rsr-audit.sh
@@ -246,7 +246,9 @@ audit_category_2_documentation() {
# SECURITY.md validation
if [[ -f "$REPO_PATH/SECURITY.md" ]]; then
check_file_contains "SECURITY.md" "Reporting" "SECURITY.md has vulnerability reporting"
-check_file_contains "SECURITY.md" "24 hours" "SECURITY.md has response timeline"
+# Estate-tolerant: credit any documented response SLA phrasing, not just
+# the literal "24 hours" (repos use "Response Timeline", "business day", etc.)
+check_file_contains "SECURITY.md" "24 hours\\|48 hours\\|72 hours\\|business day\\|[Rr]esponse [Tt]ime\\|SLA" "SECURITY.md has response timeline"
fi

# CONTRIBUTING.md validation (TPCF)
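
The widened response-timeline pattern can be tried in isolation. A minimal sketch: `check_file_contains` is not shown in this diff, so the example assumes it greps the file, and it uses `grep -E` with plain `|` in place of the escaped `\|` alternation. The function name `has_response_sla` is hypothetical.

```shell
# Hypothetical stand-in for the SECURITY.md response-timeline check.
# Succeeds if the file documents any of the accepted SLA phrasings.
has_response_sla() {
    grep -Eq '24 hours|48 hours|72 hours|business day|[Rr]esponse [Tt]ime|SLA' "$1"
}

# Example: a file that uses "Response Timeline" instead of "24 hours".
sample=$(mktemp)
printf '## Reporting\n\n### Response Timeline\nWe reply within 2 business days.\n' > "$sample"
if has_response_sla "$sample"; then
    echo "response timeline documented"
else
    echo "response timeline missing"
fi
rm -f "$sample"
```

The sample passes because `[Rr]esponse [Tt]ime` matches the start of "Response Timeline". The pattern is deliberately loose: any occurrence of the three letters `SLA`, in any context, also counts.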