|
| 1 | +# mcpp.toml: Build Environment, Platform-Conditional Config, and `build.mcpp` (Design) |
| 2 | + |
| 3 | +Date: 2026-06-29 |
| 4 | +Status: **Design — for discussion.** Synthesizes a multi-tool survey (Cargo, Zig, |
| 5 | +vcpkg, Bazel, xmake, Conan) and the build-systems literature (*Build Systems à la |
| 6 | +Carte*, PubGrub, hermeticity/SLSA) against mcpp's current internals. Scope: |
| 7 | +`src/manifest.cppm`, `src/config.cppm`, `src/xlings.cppm`, `src/toolchain/*`, |
| 8 | +`src/build/{prepare,plan,flags}.cppm`, `src/libs/toml.cppm`. Companion (consumer |
| 9 | +side): mcpp-index `.agents/docs/2026-06-29-mcpp-native-workspace-ci-design.md`. |
| 10 | + |
| 11 | +## 0. Organizing principle |
| 12 | + |
| 13 | +From *Build Systems à la Carte* (Mokhov/Mitchell/Peyton Jones, ICFP 2018): a build |
| 14 | +system's task graph is either **applicative** (dependencies known *before* running |
| 15 | +a task → statically analyzable, lockable, cacheable, parallel-schedulable) or |
| 16 | +**monadic** (dependencies discovered only by *executing* → maximally flexible, not |
| 17 | +analyzable up front). The whole design follows one rule: |
| 18 | + |
| 19 | +> **Keep the dependency graph applicative. Confine imperative logic to bounded |
| 20 | +> leaves.** Anything that decides *which dependencies/inputs exist* is declarative |
| 21 | +> (TOML); per-package flag/source/codegen choices may be imperative, but they must |
| 22 | +> never gate the top-level graph. |
| 23 | + |
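The applicative/monadic distinction can be made concrete with a minimal C++ sketch. All names here (`StaticTask`, `closure`, `DynamicTask`, `choose_blas`) are hypothetical illustrations, not mcpp internals. With applicative tasks the full input set is computable from data alone; with a monadic task it is only known once the body has run.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Applicative: a task's dependencies are plain data, known before it runs.
struct StaticTask {
    std::vector<std::string> deps;
};
using StaticGraph = std::map<std::string, StaticTask>;

// Because deps are data, everything `target` needs can be computed (and then
// locked, cached, or scheduled) without executing a single task.
inline std::set<std::string> closure(const StaticGraph& graph, const std::string& target) {
    std::set<std::string> seen;
    std::vector<std::string> stack{target};
    while (!stack.empty()) {
        std::string current = stack.back();
        stack.pop_back();
        if (!seen.insert(current).second) continue;  // already visited
        auto it = graph.find(current);
        if (it == graph.end()) continue;             // leaf: no further deps
        for (const auto& dep : it->second.deps) stack.push_back(dep);
    }
    return seen;
}

// Monadic: the task body requests inputs while running, so which inputs it
// needs is only discovered by executing it.
using Fetch = std::function<std::string(const std::string& key)>;
using DynamicTask = std::function<std::string(const Fetch& fetch)>;

// Hypothetical monadic task: the second request depends on the first result.
inline std::string choose_blas(const Fetch& fetch) {
    return fetch("os") == "windows" ? fetch("compat.openblas") : fetch("system-blas");
}
```

`closure` never calls a task body, which is what makes locking and lazy fetching possible. `choose_blas` has to be executed before anyone can tell which package it wants, which is the behavior the rule above keeps out of the top-level graph.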
| 24 | +Five layers result, from the host environment up to the produced artifact: |
| 25 | + |
| 26 | +| Layer | Decides | Evaluated on | Today | Industry anchor | |
| 27 | +|---|---|---|---|---| |
| 28 | +| **L-1 Environment** | toolchains + host build-tools + env vars + per-project sandbox | host | only `index_repos`/`lang`/`mirror` written; xlings `deps`/`workspace`/`envs`/`subos` unused | xlings (already); venv / nix-shell | |
| 29 | +| **L0 Dependency categories** | normal / dev / build | normal·dev = target; build = host | all three parsed; `[build-dependencies]` inert | Cargo, Conan | |
| 30 | +| **L1 Conditional graph** | per-OS/arch deps & features (+ lazy fetch) | resolved **target** triple | none in `mcpp.toml`; recipe-only host-keyed `mcpp.<os>` | Cargo `cfg()`, Zig `.lazy` + hash | |
| 31 | +| **L2 Static codegen** | synthesized headers/sources | — | `generated_files` (exists) | — | |
| 32 | +| **L3 `build.mcpp`** | host probing / dynamic codegen (**leaf only**) | host build, cfg gated on target | none (recipe `install()` is the *third-party* analog) | Zig `build.zig`, Cargo `build.rs` | |
| 33 | + |
| 34 | +--- |
| 35 | + |
| 36 | +## L-1 — Build environment: align `mcpp.toml` to xlings' `.xlings.json` |
| 37 | + |
| 38 | +### Current state |
| 39 | +mcpp's only writer of `.xlings.json` is `seed_xlings_json` (`src/xlings.cppm:1070-1088`) |
| 40 | +and the project variant `ensure_project_index_dir` (`src/config.cppm:661-705`); both |
| 41 | +write **only** `index_repos` / `lang` / `mirror`. Toolchains come from `[toolchain]` |
| 42 | +(`manifest.cppm:1072-1079`) → `xim:gcc@16.1.0` (`toolchain/registry.cppm:146-183`) → |
| 43 | +`xlings install` into the `~/.mcpp/registry` sandbox. Host tools (ninja, patchelf) |
| 44 | +are hard-pinned in `xlings.cppm:33-37`. The `workspace.<pkg>` pin you see in |
| 45 | +`.xlings.json` is written by **xlings/xvm**, not mcpp. |
| 46 | + |
| 47 | +Meanwhile xlings *already* models a full per-project environment (`xvm/README.md:84-92`, |
| 48 | +`core/config.cppm`): `deps` (a project package-set, installed by bare `xlings install`), |
| 49 | +`workspace` (target→version, **per-OS keyed**, resolved project > subos > global), |
| 50 | +per-version `envs`, and named/anonymous `subos` (a project sandbox at |
| 51 | +`<proj>/.xlings/subos/...` with its own `bin/lib/usr`). **mcpp surfaces none of it.** |
| 52 | + |
| 53 | +### Design — `[environment]` materializes 1:1 onto xlings keys |
| 54 | +Do **not** invent new vocabulary. A `mcpp.toml` `[environment]` block maps directly |
| 55 | +to the xlings `.xlings.json` schema and is written into mcpp's existing project file |
| 56 | +`<proj>/.mcpp/.xlings.json` (extend the `index_repos`-only writer at |
| 57 | +`config.cppm:699-705` to also emit `deps`/`workspace`/`envs`/`subos`): |
| 58 | + |
| 59 | +```toml |
| 60 | +[environment] |
| 61 | +subos = "dev" # → .xlings.json "subos" (named project sandbox) |
| 62 | +tools = ["make@4", "python@3.13.1"] # → "deps" (host build-tools; bare `xlings install` provisions) |
| 63 | +[environment.workspace] # → "workspace" (pin tool versions; per-OS values allowed) |
| 64 | +gcc = "16.1.0" |
| 65 | +[environment.env] # → per-tool "envs" applied by xvm shims |
| 66 | +OPENBLAS_NUM_THREADS = "1" |
| 67 | +``` |
| 68 | + |
| 69 | +- **`[toolchain]` folds in**: today it installs globally; route it into the project |
| 70 | + `workspace` so two checkouts on one machine pin different compilers via xvm shims |
| 71 | + (xlings supports this; mcpp just never seeds it). `[toolchain]` stays as the |
| 72 | + ergonomic shorthand; `[environment.workspace]` is the general form. |
| 73 | +- **Provisioning** = the already-wired project-mode `build_command_prefix` |
| 74 | + (`xlings.cppm:716-724`) + bare `xlings install` over the emitted `deps`. |
| 75 | +- **This is purely "surface + writer"** — no new resolution machinery; the |
| 76 | + materialization target is wired but unused. |
| 77 | + |
| 78 | +### Gap closed |
| 79 | +Per-project toolchain pinning, host build-tools (`make`/`python`/`cmake`) as |
| 80 | +declared deps, per-tool env vars, and a reproducible per-project sandbox — all |
| 81 | +expressible, all backed by existing xlings behavior. |
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +## L0 — Dependency categories: keep three axes, wire `build` |
| 86 | + |
| 87 | +Keep Cargo/Conan's **normal / dev / build** trichotomy (mcpp already declares all |
| 88 | +three, `manifest.cppm:251-253`; `[build-dependencies]` is parsed but **inert** — no |
| 89 | +consumer in `prepare`/`plan`). They answer two orthogonal questions: |
| 90 | + |
| 91 | +- **normal vs dev** = *when* linked (always / test-only). Both evaluated against the |
| 92 | + **target**; both link into a runnable artifact (`mcpp test` resolves dev-deps — |
| 93 | + `manifest.cppm:44-46`). |
| 94 | +- **normal vs build** = *which machine* it runs on (target / **host**). build-deps |
| 95 | + do not enter the product; they feed the build itself. |
| 96 | + |
| 97 | +**Three distinct "build-tool" homes — do not conflate:** |
| 98 | +1. **Recipe source-build tools** — `xpm.<os>.deps` in an xpkg recipe (`xim:python`, |
| 99 | + `xim:make`); host, install-time, resolved by xim/`pkginfo.build_dep`. *This is |
| 100 | + where `compat.xcb`'s Python lives.* |
| 101 | +2. **Project environment tools** — L-1 `[environment].tools` (host tools the project |
| 102 | + needs available: cmake, protoc). |
| 103 | +3. **`build.mcpp` libraries** — `[build-dependencies]` (host libraries the native |
| 104 | + build program links). **Wire this here** — it gains a real consumer at L3. |
| 105 | + |
| 106 | +--- |
| 107 | + |
| 108 | +## L1 — Conditional dependency graph (Cargo-style `cfg`, target-evaluated) |
| 109 | + |
| 110 | +### Decision: Cargo-style `[target.'cfg(...)']` tables, trimmed token set |
| 111 | +A declarative table keeps the graph applicative (the §0 principle). vcpkg's inline |
| 112 | +`platform=` has no home for flags; a flat `[<os>]` block can't express arch / |
| 113 | +combinations / negation. Cargo's table shape wins and — unlike Cargo — mcpp lets |
| 114 | +the **same predicate namespace carry both deps and flags** (Cargo can't put flags |
| 115 | +in the manifest at all): |
| 116 | + |
| 117 | +```toml |
| 118 | +# exact-triple escape hatch (already exists, highest precedence) |
| 119 | +[target.x86_64-pc-windows-msvc.dependencies] |
| 120 | +detours = "4.0" |
| 121 | + |
| 122 | +# predicate tables — primary mechanism; deps AND flags share the namespace |
| 123 | +[target.'cfg(windows)'.dependencies.compat] |
| 124 | +openblas = { version = "0.3.33", lazy = true } |
| 125 | + |
| 126 | +[target.'cfg(windows)'.build] |
| 127 | +ldflags = ["-Llib", "-llibopenblas"] |
| 128 | + |
| 129 | +[target.'cfg(all(linux, not(arch = "aarch64")))'.build] |
| 130 | +cxxflags = ["-march=x86-64-v2"] |
| 131 | +``` |
| 132 | + |
| 133 | +- **Grammar**: Cargo's `all()/any()/not()` over `key = "value"` plus bare aliases |
| 134 | + `windows`/`unix`/`linux`/`macos`. **Token set trimmed to mcpp's existing |
| 135 | + dimensions**: `os`, `arch`, `family`, `env` — these *are* `AbiDim` |
| 136 | + (`toolchain/abi.cppm:34-38`); `parse_abi_capability` already parses |
| 137 | + `abi:arch=aarch64`. Defer `pointer_width`/`endian`/`vendor` until asked. |
| 138 | +- **Evaluate against the resolved TARGET triple, not host.** `abi_profile(triple)` |
| 139 | + (`abi.cppm:67-91`) derives `os`/`arch` from any triple. Timing is already correct: |
| 140 | + `--target` + `[target.<triple>]` resolve at `prepare.cppm:497-560`, *before* dep |
| 141 | + resolution at `~731` — merge the matching `cfg` tables into `m->dependencies` / |
| 142 | + `buildConfig.*` in that window. (The recipe-side `mcpp.<os>` merge keys on **host** |
| 143 | + `xpkg_platform` — wrong for cross-compiles; the `mcpp.toml` feature must not copy |
| 144 | + that.) |
| 145 | +- **Parser hooks**: extend the `[target]` loop (`manifest.cppm:1154-1171`, today only |
| 146 | + `toolchain`+`linkage`) to also read `dependencies`/`dev-dependencies`/ |
| 147 | + `build-dependencies`/`build`. The TOML reader already lexes quoted keys |
| 148 | + (`toml.cppm:191-203`), so `[target.'cfg(windows)'.dependencies]` parses — but |
| 149 | + iterate `get_table("target")` entries manually (dotted `get_*` mis-splits quoted |
| 150 | + segments). |
| 151 | +- **Precedence**: exact triple > cfg; multiple matching cfg tables → flags |
| 152 | + concatenate, conflicting scalar (e.g. linker) → error (Cargo's rule). |
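The grammar bullet above can be sketched as a small recursive-descent evaluator. This is a minimal illustration under stated assumptions, not mcpp's parser: `CfgEvaluator`, `TargetFacts` and `cfg_matches` are hypothetical names, and mapping the bare aliases `windows`/`unix` to `family` and `linux`/`macos` to `os` is an assumption. The facts map stands in for whatever `abi_profile(triple)` yields for the resolved target.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Facts derived from the resolved TARGET triple: os, arch, family, env.
using TargetFacts = std::map<std::string, std::string>;

class CfgEvaluator {
public:
    CfgEvaluator(std::string text, TargetFacts facts)
        : text_(std::move(text)), facts_(std::move(facts)) {}

    // Evaluates e.g. `cfg(all(linux, not(arch = "aarch64")))`.
    bool evaluate() {
        const bool result = expr();
        skip_ws();
        if (pos_ != text_.size()) throw std::invalid_argument("trailing input");
        return result;
    }

private:
    void skip_ws() {
        while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }
    bool eat(char c) {
        skip_ws();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    void expect(char c) {
        if (!eat(c)) throw std::invalid_argument(std::string("expected '") + c + "'");
    }
    std::string ident() {
        skip_ws();
        const std::size_t start = pos_;
        while (pos_ < text_.size() &&
               (std::isalnum(static_cast<unsigned char>(text_[pos_])) || text_[pos_] == '_')) ++pos_;
        if (start == pos_) throw std::invalid_argument("expected identifier");
        return text_.substr(start, pos_ - start);
    }
    std::string quoted() {
        expect('"');
        const std::size_t start = pos_;
        while (pos_ < text_.size() && text_[pos_] != '"') ++pos_;
        if (pos_ == text_.size()) throw std::invalid_argument("unterminated string");
        std::string value = text_.substr(start, pos_ - start);
        ++pos_;  // consume the closing quote
        return value;
    }
    std::string fact(const std::string& key) const {
        auto it = facts_.find(key);
        return it == facts_.end() ? std::string() : it->second;
    }
    bool alias(const std::string& name) const {
        if (name == "windows" || name == "unix") return fact("family") == name;  // assumed mapping
        if (name == "linux" || name == "macos") return fact("os") == name;       // assumed mapping
        throw std::invalid_argument("unknown cfg alias: " + name);
    }
    bool expr() {
        const std::string name = ident();
        if (name == "cfg" || name == "not") {  // `cfg(...)` wrapper accepted anywhere for brevity
            expect('(');
            const bool inner = expr();
            expect(')');
            return name == "not" ? !inner : inner;
        }
        if (name == "all" || name == "any") {
            const bool is_all = (name == "all");
            bool acc = is_all;  // all() is true, any() is false
            expect('(');
            if (!eat(')')) {
                do {  // every operand is parsed, so syntax errors are never skipped
                    const bool operand = expr();
                    acc = is_all ? (acc && operand) : (acc || operand);
                } while (eat(','));
                expect(')');
            }
            return acc;
        }
        if (eat('=')) {
            const std::string wanted = quoted();
            return fact(name) == wanted;
        }
        return alias(name);
    }

    std::string text_;
    TargetFacts facts_;
    std::size_t pos_ = 0;
};

inline bool cfg_matches(const std::string& expression, const TargetFacts& facts) {
    return CfgEvaluator(expression, facts).evaluate();
}
```

A caller would run `cfg_matches` once per `[target.'cfg(...)']` key against the target's facts and merge only the matching tables. Precedence and conflict detection are left out of this sketch.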
| 153 | + |
| 154 | +### Lazy + content-hash identity (from Zig) |
| 155 | +- **`lazy = true`** on a dependency → fetched only when a build path (after cfg/feature |
| 156 | + gating) actually requests it (Zig `b.lazyDependency`). A Linux build then **never |
| 157 | + downloads** a `cfg(windows)` dependency. Generalizes mcpp's platform-specific compat |
| 158 | + packages directly. |
| 159 | +- **Hash is identity; url is a mirror** (Zig `build.zig.zon`: *"packages come from a |
| 160 | +  hash, not a url"*). Adopting this **addresses the root cause of two items in mcpp's history**: mirror |
| 161 | + auto-detection (GitHub/GitCode/CN are interchangeable *locations* of one hash) and |
| 162 | + the stale-index-shard "fetch false-failure" (content identity can't be impersonated |
| 163 | + by a moved shard path). Pair fetch with a content-addressed store. |
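A minimal sketch of "hash is identity, url is a mirror" paired with a content-addressed store follows. The names (`ContentStore`, `Fetcher`, `digest_of`) are hypothetical, the network is injected as a callback, and the digest is a 64-bit FNV-1a stand-in chosen only to keep the example self-contained. A real store would use a cryptographic hash such as SHA-256.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Stand-in digest (64-bit FNV-1a). NOT collision-resistant; illustration only.
inline std::string digest_of(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ull;                  // FNV prime
    }
    return std::to_string(h);
}

// Injected downloader: returns the body at `url`, or nullopt if unreachable.
using Fetcher = std::function<std::optional<std::string>(const std::string& url)>;

class ContentStore {
public:
    // Returns the bytes whose digest equals `hash`. The cache is consulted
    // first; otherwise mirrors are tried in order and any body whose digest
    // does not match is rejected, so a wrong or moved location cannot pass.
    std::optional<std::string> get(const std::string& hash,
                                   const std::vector<std::string>& mirrors,
                                   const Fetcher& fetch) {
        auto hit = blobs_.find(hash);
        if (hit != blobs_.end()) return hit->second;
        for (const auto& url : mirrors) {
            std::optional<std::string> body = fetch(url);
            if (!body || digest_of(*body) != hash) continue;  // try next mirror
            blobs_[hash] = *body;
            return body;
        }
        return std::nullopt;
    }

private:
    std::map<std::string, std::string> blobs_;  // digest -> content
};
```

Laziness falls out of the call site: nothing is downloaded until some build path calls `get` for that hash, so a dependency gated out by cfg is never requested.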
| 164 | + |
| 165 | +--- |
| 166 | + |
| 167 | +## L2 — Static codegen |
| 168 | +`generated_files` (`manifest.cppm:1975`) already covers codegen-as-data (synthesized |
| 169 | +headers/sources, e.g. compat.zlib's config header, the openblas Windows anchor). No |
| 170 | +change; it is the preferred mechanism whenever the generated content is static. |
| 171 | + |
| 172 | +--- |
| 173 | + |
| 174 | +## L3 — `build.mcpp`: a native imperative build program (do it directly) |
| 175 | + |
| 176 | +### Two distinct mechanisms, two audiences |
| 177 | +- **`install()`** (Lua, in an xpkg recipe) — the **third-party package** build hook |
| 178 | + (builds a dependency from source; `compat.openblas`/`compat.xcb`). Stays. |
| 179 | +- **`build.mcpp`** (C++, in the project) — the **mcpp-native project** build program. |
| 180 | + **New, built directly** (not deferred). Both get the same two disciplines below. |
| 181 | + |
| 182 | +### Form — Zig's in-language model, in C++ |
| 183 | +A `build.mcpp` (e.g. `build/main.cpp`) is itself a tiny mcpp build compiled with the |
| 184 | +**host** toolchain and run before the main build — Zig's `build.zig` model, but in the |
| 185 | +project's own language (C++), so there is no second language and mcpp dogfoods itself. Its |
| 186 | +dependencies are exactly `[build-dependencies]` (L0 item 3 — now with a consumer). |
| 187 | + |
| 188 | +### Discipline 1 — structured output protocol (not global mutation) |
| 189 | +The program communicates only via stdout directives the engine consumes (Cargo's |
| 190 | +`cargo::` model): |
| 191 | +``` |
| 192 | +mcpp:link-lib=openblas |
| 193 | +mcpp:link-search=<abs-dir> |
| 194 | +mcpp:cxxflag=-DHAVE_FOO |
| 195 | +mcpp:cfg=has_avx2 |
| 196 | +mcpp:generated=<path> # a source/header to add to the graph |
| 197 | +mcpp:rerun-if-changed=<path> |
| 198 | +mcpp:rerun-if-env-changed=VAR |
| 199 | +``` |
| 200 | +It *requests* graph edges; it never silently mutates build state. |
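Both ends of the directive protocol can be sketched in a few lines of C++. The helper names (`emit_directives`, `parse_directive`, `collect_directives`) are hypothetical. Splitting on the first `=` and ignoring lines without the `mcpp:` prefix are assumptions; the design above fixes only the directive shapes.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Producer side: what a build.mcpp body might print (hypothetical helper).
inline void emit_directives(std::ostream& out, bool has_avx2) {
    out << "probing host...\n";  // ordinary log line, not a directive
    out << "mcpp:link-lib=openblas\n";
    out << "mcpp:cxxflag=-DHAVE_FOO\n";
    if (has_avx2) out << "mcpp:cfg=has_avx2\n";
    out << "mcpp:rerun-if-env-changed=CXX\n";
}

// One parsed `mcpp:<key>=<value>` line.
struct Directive {
    std::string key;
    std::string value;
};

// Consumer side: returns nullopt for anything that is not a directive.
// Splits on the FIRST '=' so values such as `-DFOO=1` survive intact.
inline std::optional<Directive> parse_directive(std::string_view line) {
    constexpr std::string_view prefix = "mcpp:";
    if (line.substr(0, prefix.size()) != prefix) return std::nullopt;
    line.remove_prefix(prefix.size());
    const std::size_t eq = line.find('=');
    if (eq == std::string_view::npos || eq == 0) return std::nullopt;
    return Directive{std::string(line.substr(0, eq)), std::string(line.substr(eq + 1))};
}

// Collects every directive from the captured stdout of the build program.
inline std::vector<Directive> collect_directives(const std::string& stdout_text) {
    std::vector<Directive> result;
    std::istringstream lines(stdout_text);
    std::string line;
    while (std::getline(lines, line)) {
        if (auto directive = parse_directive(line)) result.push_back(*directive);
    }
    return result;
}
```

An engine following the "never silently mutates" rule would probably go one step further and reject unknown keys rather than ignore them. That check is omitted here.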
| 201 | + |
| 202 | +### Discipline 2 — explicit declared inputs/outputs (fixes the `.mcpp_ok` blind spot) |
| 203 | +`rerun-if-changed`/`rerun-if-env-changed` give the opaque program declared inputs, so |
| 204 | +incremental builds stay correct. This is the documented fix for mcpp's `.mcpp_ok` |
| 205 | +gap ("process exited 0 ≠ outputs correct"): replace the bare success marker with a |
| 206 | +**declared-input/declared-output contract** — the program (or recipe) records what it |
| 207 | +read and what it produced; the build re-runs iff a declared input changed and treats |
| 208 | +missing declared outputs as failure. |
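The contract reduces to two checks, sketched here as pure functions. `ActionRecord`, `needs_rerun` and `run_succeeded` are hypothetical names. Keying environment variables as `env:VAR` and injecting the existence check as a callback are assumptions that keep the example free of filesystem access.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// What the last successful run declared (hypothetical record layout).
struct ActionRecord {
    std::map<std::string, std::string> inputs;  // declared input -> digest/value at that run
    std::vector<std::string> outputs;           // declared output paths
};

// Re-run iff some declared input is missing or differs from the recorded state.
inline bool needs_rerun(const ActionRecord& record,
                        const std::map<std::string, std::string>& current) {
    for (const auto& entry : record.inputs) {
        auto it = current.find(entry.first);
        if (it == current.end() || it->second != entry.second) return true;
    }
    return false;
}

// Exit code 0 alone is not success: every declared output must also exist.
inline bool run_succeeded(int exit_code, const ActionRecord& record,
                          const std::function<bool(const std::string&)>& exists) {
    if (exit_code != 0) return false;
    for (const auto& path : record.outputs) {
        if (!exists(path)) return false;
    }
    return true;
}
```

`run_succeeded` is the piece that stands in for a bare success marker: a program that exits 0 without producing `gen/config.h` is reported as a failure instead of being cached as good.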
| 209 | + |
| 210 | +### Constraints (à la carte + supply-chain) |
| 211 | +- **Leaf only**: `build.mcpp` chooses flags/sources/codegen and emits link |
| 212 | + requirements — it must **not** gate the top-level dependency graph (that stays in |
| 213 | + L1 cfg tables, applicative). cfg gating of `build.mcpp` itself evaluates on the |
| 214 | + **target**, but it compiles/runs on the **host**. |
| 215 | +- **Isolation**: treat its execution as a build *action* with declared inputs/outputs; |
| 216 | +  run sandboxed; prefer platform-level isolation (SLSA) over trusting program code, |
| 217 | +  since the supply-chain consensus is that script-level sandboxing alone is insufficient. The existing xim |
| 218 | + Lua sandbox (no `os.curdir`/`files`/`trymkdir`) is the right instinct; extend the |
| 219 | + same declared-I/O contract to `install()`. |
| 220 | + |
| 221 | +--- |
| 222 | + |
| 223 | +## Phasing (recommended order) |
| 224 | + |
| 225 | +1. **L-1 environment** — highest ROI, self-contained, no upstream dep beyond mcpp; the |
| 226 | + xlings target model already exists. Surface `[environment]` + extend the project |
| 227 | +   `.xlings.json` writer. Fold `[toolchain]` into `workspace`. Wire `[build-dependencies]` (L0; its consumer arrives with L3 in step 4). |
| 228 | +2. **mcpp-index workspace** (companion doc) — the first real consumer; exercises L1's |
| 229 | + need (Windows-only `compat.openblas`). |
| 230 | +3. **L1 conditional graph** — `[target.'cfg()']` deps+flags, target-evaluated; then |
| 231 | + `lazy` + content-hash identity. |
| 232 | +4. **L3 `build.mcpp`** — native build program with the two disciplines; backport the |
| 233 | + declared-I/O contract to recipe `install()`. |
| 234 | + |
| 235 | +## Appendix — cross-tool summary |
| 236 | +- **Declarative-table vs imperative-script**: TOML is static data → declarative tables |
| 237 | + for the graph (Cargo/vcpkg/Bazel); reserve imperative for a separate file |
| 238 | + (Cargo `build.rs`, Zig `build.zig`). |
| 239 | +- **Conditional grammar**: de-facto token set across all six tools = os / arch / |
| 240 | + family / env; Cargo's `all/any/not` is the composable standard. |
| 241 | +- **Host vs target**: Cargo/Zig/Bazel evaluate manifest conditionals on the **target**; |
| 242 | + build/tool deps run on the **host**. Conflating them breaks cross-compile (cf. mcpp's |
| 243 | + aarch64/musl bootstrap hazards — bootstrap tools must be host-static). |
| 244 | +- **Resolution**: the field is converging on PubGrub (complete + explainable); consider |
| 245 | + it (or MVS / content-addressed identity) over ad-hoc backtracking long-term. |
| 246 | + |
| 247 | +### Sources |
| 248 | +Build Systems à la Carte (ICFP 2018); Cargo reference (specifying-dependencies, |
| 249 | +build-scripts, config); Rust Reference (cfg); Zig build system + build.zig.zon + |
| 250 | +cross-compilation docs; vcpkg manifest reference; Bazel configurable attributes; |
| 251 | +Meson reference tables; Conan 2 requirements; PubGrub; Bazel hermeticity; SLSA. |