Commit 95335f0

docs: design — mcpp.toml build environment, platform-conditional config, build.mcpp
Layered manifest-evolution design (L-1 environment provisioning aligned to xlings .xlings.json subos/workspace/deps/envs; L0 normal/dev/build host-vs-target; L1 Cargo-style [target.'cfg()'] conditional deps+flags evaluated on the resolved target triple, + lazy fetch + hash-as-identity; L2 generated_files; L3 build.mcpp native C++ build program with structured-output + declared-I/O disciplines). Grounded in mcpp internals + a Cargo/Zig/vcpkg/Bazel/xmake survey + the build-systems literature (a la carte applicative-graph principle, hermeticity/SLSA).
1 parent 16b14cf commit 95335f0

1 file changed

Lines changed: 251 additions & 0 deletions

@@ -0,0 +1,251 @@
# mcpp.toml: Build Environment, Platform-Conditional Config, and `build.mcpp` (Design)

Date: 2026-06-29

Status: **Design, for discussion.** Synthesizes a multi-tool survey (Cargo, Zig,
vcpkg, Bazel, xmake, Conan) and the build-systems literature (*Build Systems à la
Carte*, PubGrub, hermeticity/SLSA) against mcpp's current internals. Scope:
`src/manifest.cppm`, `src/config.cppm`, `src/xlings.cppm`, `src/toolchain/*`,
`src/build/{prepare,plan,flags}.cppm`, `src/libs/toml.cppm`. Companion (consumer
side): mcpp-index `.agents/docs/2026-06-29-mcpp-native-workspace-ci-design.md`.

## 0. Organizing principle

From *Build Systems à la Carte* (Mokhov/Mitchell/Peyton Jones, ICFP 2018): a build
system's task graph is either **applicative** (dependencies known *before* running
a task → statically analyzable, lockable, cacheable, parallel-schedulable) or
**monadic** (dependencies discovered only by *executing* → maximally flexible, not
analyzable up front). The whole design follows one rule:

> **Keep the dependency graph applicative. Confine imperative logic to bounded
> leaves.** Anything that decides *which dependencies/inputs exist* is declarative
> (TOML); per-package flag/source/codegen choices may be imperative, but they must
> never gate the top-level graph.

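What "applicative" buys in practice can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the `Task` struct and the `plan` function are hypothetical names, not mcpp internals. Because each task's dependencies are plain data, a complete build order (and cycle detection) falls out of the declarations alone, without executing any task.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical applicative task: its dependencies are static data.
struct Task {
    std::vector<std::string> deps;
};

// Depth-first visit that appends a task after all of its dependencies.
void visit(const std::string& name, const std::map<std::string, Task>& graph,
           std::set<std::string>& done, std::set<std::string>& active,
           std::vector<std::string>& order) {
    if (done.count(name)) return;
    if (!active.insert(name).second)
        throw std::runtime_error("dependency cycle at " + name);
    const auto it = graph.find(name);
    if (it == graph.end())
        throw std::runtime_error("undeclared task " + name);
    for (const auto& dep : it->second.deps) visit(dep, graph, done, active, order);
    active.erase(name);
    done.insert(name);
    order.push_back(name);
}

// Plans the whole build from declarations only; nothing is executed.
std::vector<std::string> plan(const std::map<std::string, Task>& graph) {
    std::vector<std::string> order;
    std::set<std::string> done, active;
    for (const auto& entry : graph) visit(entry.first, graph, done, active, order);
    assert(order.size() == graph.size());  // every declared task is scheduled once
    return order;
}
```

A monadic task would instead return its dependencies from its own execution, so a function like `plan` could not exist for it; that is the property the rule above protects.
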
Five layers result, from the host environment up to the produced artifact:

| Layer | Decides | Evaluated on | Today | Industry anchor |
|---|---|---|---|---|
| **L-1 Environment** | toolchains + host build-tools + env vars + per-project sandbox | host | only `index_repos` written; xlings `deps`/`workspace`/`envs`/`subos` unused | xlings (already); venv / nix-shell |
| **L0 Dependency categories** | normal / dev / build | normal·dev = target; build = host | all three parsed; `[build-dependencies]` inert | Cargo, Conan |
| **L1 Conditional graph** | per-OS/arch deps & features (+ lazy fetch) | resolved **target** triple | none in `mcpp.toml`; recipe-only host-keyed `mcpp.<os>` | Cargo `cfg()`, Zig `.lazy` + hash |
| **L2 Static codegen** | synthesized headers/sources | | `generated_files` (exists) | |
| **L3 `build.mcpp`** | host probing / dynamic codegen (**leaf only**) | host build, cfg gated on target | none (recipe `install()` is the *third-party* analog) | Zig `build.zig`, Cargo `build.rs` |

---

## L-1 — Build environment: align `mcpp.toml` to xlings' `.xlings.json`

### Current state
mcpp's only writers of `.xlings.json` are `seed_xlings_json` (`src/xlings.cppm:1070-1088`)
and the project variant `ensure_project_index_dir` (`src/config.cppm:661-705`); both
write **only** `index_repos` / `lang` / `mirror`. Toolchains come from `[toolchain]`
(`manifest.cppm:1072-1079`) → `xim:gcc@16.1.0` (`toolchain/registry.cppm:146-183`) →
`xlings install` into the `~/.mcpp/registry` sandbox. Host tools (ninja, patchelf)
are hard-pinned in `xlings.cppm:33-37`. The `workspace.<pkg>` pin you see in
`.xlings.json` is written by **xlings/xvm**, not mcpp.

Meanwhile xlings *already* models a full per-project environment (`xvm/README.md:84-92`,
`core/config.cppm`): `deps` (a project package-set, installed by bare `xlings install`),
`workspace` (target→version, **per-OS keyed**, resolved project > subos > global),
per-version `envs`, and named/anonymous `subos` (a project sandbox at
`<proj>/.xlings/subos/...` with its own `bin/lib/usr`). **mcpp surfaces none of it.**

### Design — `[environment]` materializes 1:1 onto xlings keys
Do **not** invent new vocabulary. A `mcpp.toml` `[environment]` block maps directly
to the xlings `.xlings.json` schema and is written into mcpp's existing project file
`<proj>/.mcpp/.xlings.json` (extend the `index_repos`-only writer at
`config.cppm:699-705` to also emit `deps`/`workspace`/`envs`/`subos`):

```toml
[environment]
subos = "dev" # → .xlings.json "subos" (named project sandbox)
tools = ["make@4", "python@3.13.1"] # → "deps" (host build-tools; bare `xlings install` provisions)
[environment.workspace] # → "workspace" (pin tool versions; per-OS values allowed)
gcc = "16.1.0"
[environment.env] # → per-tool "envs" applied by xvm shims
OPENBLAS_NUM_THREADS = "1"
```

- **`[toolchain]` folds in**: today it installs globally; route it into the project
  `workspace` so two checkouts on one machine pin different compilers via xvm shims
  (xlings supports this; mcpp just never seeds it). `[toolchain]` stays as the
  ergonomic shorthand; `[environment.workspace]` is the general form.
- **Provisioning** = the already-wired project-mode `build_command_prefix`
  (`xlings.cppm:716-724`) + bare `xlings install` over the emitted `deps`.
- **This is purely "surface + writer"**: no new resolution machinery; the
  materialization target is wired but unused.

### Gap closed
Per-project toolchain pinning, host build-tools (`make`/`python`/`cmake`) as
declared deps, per-tool env vars, and a reproducible per-project sandbox: all
expressible, all backed by existing xlings behavior.

---

## L0 — Dependency categories: keep three axes, wire `build`

Keep Cargo/Conan's **normal / dev / build** trichotomy (mcpp already declares all
three, `manifest.cppm:251-253`; `[build-dependencies]` is parsed but **inert**: no
consumer in `prepare`/`plan`). They answer two orthogonal questions:

- **normal vs dev** = *when* linked (always / test-only). Both evaluated against the
  **target**; both link into a runnable artifact (`mcpp test` resolves dev-deps,
  `manifest.cppm:44-46`).
- **normal vs build** = *which machine* it runs on (target / **host**). build-deps
  do not enter the product; they feed the build itself.

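The two orthogonal questions can be written down as two tiny functions. This is a sketch only: `DepKind`, `Machine`, `evaluated_on`, and `linked_into_artifact` are hypothetical names chosen for illustration, not identifiers from `manifest.cppm`. The point is that "which machine" and "is it linked" are independent answers derived from the same category.

```cpp
#include <cassert>

// Hypothetical names; mcpp's real types may differ.
enum class DepKind { Normal, Dev, Build };
enum class Machine { Host, Target };

// Question 1: which machine the dependency is resolved for and runs on.
constexpr Machine evaluated_on(DepKind kind) {
    return kind == DepKind::Build ? Machine::Host : Machine::Target;
}

// Question 2: whether the dependency is linked into the produced artifact.
constexpr bool linked_into_artifact(DepKind kind, bool building_tests) {
    switch (kind) {
        case DepKind::Normal: return true;            // always linked
        case DepKind::Dev:    return building_tests;  // test-only
        case DepKind::Build:  return false;           // feeds the build itself
    }
    return false;
}

// Compile-time checks of the table above.
static_assert(evaluated_on(DepKind::Build) == Machine::Host);
static_assert(evaluated_on(DepKind::Dev) == Machine::Target);
static_assert(!linked_into_artifact(DepKind::Dev, false));
```
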
**Three distinct "build-tool" homes (do not conflate):**
1. **Recipe source-build tools**: `xpm.<os>.deps` in an xpkg recipe (`xim:python`,
   `xim:make`); host, install-time, resolved by xim/`pkginfo.build_dep`. *This is
   where `compat.xcb`'s Python lives.*
2. **Project environment tools**: L-1 `[environment].tools` (host tools the project
   needs available: cmake, protoc).
3. **`build.mcpp` libraries**: `[build-dependencies]` (host libraries the native
   build program links). **Wire this here**; it gains a real consumer at L3.

---

## L1 — Conditional dependency graph (Cargo-style `cfg`, target-evaluated)

### Decision: Cargo-style `[target.'cfg(...)']` tables, trimmed token set
A declarative table keeps the graph applicative (the §0 organizing principle). vcpkg's
inline `platform=` has no home for flags; a flat `[<os>]` block can't express arch /
combinations / negation. Cargo's table shape wins and, unlike Cargo, mcpp lets
the **same predicate namespace carry both deps and flags** (Cargo can't put flags
in the manifest at all):

```toml
# exact-triple escape hatch (already exists, highest precedence)
[target.x86_64-pc-windows-msvc.dependencies]
detours = "4.0"

# predicate tables: primary mechanism; deps AND flags share the namespace
[target.'cfg(windows)'.dependencies.compat]
openblas = { version = "0.3.33", lazy = true }

[target.'cfg(windows)'.build]
ldflags = ["-Llib", "-llibopenblas"]

[target.'cfg(all(linux, not(arch = "aarch64")))'.build]
cxxflags = ["-march=x86-64-v2"]
```

- **Grammar**: Cargo's `all()/any()/not()` over `key = "value"` plus bare aliases
  `windows`/`unix`/`linux`/`macos`. **Token set trimmed to mcpp's existing
  dimensions**: `os`, `arch`, `family`, `env`; these *are* `AbiDim`
  (`toolchain/abi.cppm:34-38`), and `parse_abi_capability` already parses
  `abi:arch=aarch64`. Defer `pointer_width`/`endian`/`vendor` until asked.
- **Evaluate against the resolved TARGET triple, not host.** `abi_profile(triple)`
  (`abi.cppm:67-91`) derives `os`/`arch` from any triple. Timing is already correct:
  `--target` + `[target.<triple>]` resolve at `prepare.cppm:497-560`, *before* dep
  resolution at `~731`; merge the matching `cfg` tables into `m->dependencies` /
  `buildConfig.*` in that window. (The recipe-side `mcpp.<os>` merge keys on **host**
  `xpkg_platform`, which is wrong for cross-compiles; the `mcpp.toml` feature must
  not copy that.)
- **Parser hooks**: extend the `[target]` loop (`manifest.cppm:1154-1171`, today only
  `toolchain`+`linkage`) to also read `dependencies`/`dev-dependencies`/
  `build-dependencies`/`build`. The TOML reader already lexes quoted keys
  (`toml.cppm:191-203`), so `[target.'cfg(windows)'.dependencies]` parses, but
  iterate `get_table("target")` entries manually (dotted `get_*` mis-splits quoted
  segments).
- **Precedence**: exact triple > cfg; multiple matching cfg tables → flags
  concatenate, conflicting scalar (e.g. linker) → error (Cargo's rule).

### Lazy + content-hash identity (from Zig)
- **`lazy = true`** on a dependency → fetched only when a build path (after cfg/feature
  gating) actually requests it (Zig `b.lazyDependency`). A Linux build then **never
  downloads** a `cfg(windows)` dependency. Generalizes mcpp's platform-specific compat
  packages directly.
- **Hash is identity; url is a mirror** (Zig `build.zig.zon`: *"packages come from a
  hash, not a url"*). Adopting this **root-causes two items in mcpp's history**: mirror
  auto-detection (GitHub/GitCode/CN are interchangeable *locations* of one hash) and
  the stale-index-shard "fetch false-failure" (content identity can't be impersonated
  by a moved shard path). Pair fetch with a content-addressed store.

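The "hash is identity, url is a mirror" idea can be sketched as a fetch routine that is keyed by digest and treats every URL as an interchangeable location. Everything here is an assumption for illustration: `digest` is a stand-in (64-bit FNV-1a) where a real store would use a cryptographic hash such as SHA-256, and `Downloader`, `ContentStore`, and `fetch_by_hash` are hypothetical names. The download step is injected so the sketch stays self-contained; a mirror that returns the wrong bytes is simply skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Stand-in digest (FNV-1a, 64-bit). NOT cryptographic; a placeholder only.
std::uint64_t digest(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Injected transport: returns the body, or nothing if the mirror is unreachable.
using Downloader = std::function<std::optional<std::string>(const std::string& url)>;

// Hypothetical content-addressed store: the digest is the only key.
struct ContentStore {
    std::map<std::uint64_t, std::string> blobs;
};

std::optional<std::string> fetch_by_hash(std::uint64_t want,
                                         const std::vector<std::string>& mirrors,
                                         const Downloader& download,
                                         ContentStore& store) {
    // Already present: no network access, whatever the mirrors look like today.
    if (const auto it = store.blobs.find(want); it != store.blobs.end()) return it->second;

    for (const auto& url : mirrors) {
        const auto body = download(url);
        if (!body) continue;                   // mirror down: try the next location
        if (digest(*body) != want) continue;   // wrong content: this mirror cannot impersonate it
        store.blobs.emplace(want, *body);
        assert(digest(store.blobs.at(want)) == want);
        return body;
    }
    return std::nullopt;  // no location served the requested content
}
```
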
---
## L2 — Static codegen
`generated_files` (`manifest.cppm:1975`) already covers codegen-as-data (synthesized
headers/sources, e.g. compat.zlib's config header, the openblas Windows anchor). No
change; it is the preferred mechanism whenever the generated content is static.

---

## L3 — `build.mcpp`: a native imperative build program (do it directly)

### Two distinct mechanisms, two audiences
- **`install()`** (Lua, in an xpkg recipe): the **third-party package** build hook
  (builds a dependency from source; `compat.openblas`/`compat.xcb`). Stays.
- **`build.mcpp`** (C++, in the project): the **mcpp-native project** build program.
  **New, built directly** (not deferred). Both get the same two disciplines below.

### Form — Zig's in-language model, in C++
A `build.mcpp` (e.g. `build/main.cpp`) is itself a tiny mcpp build compiled with the
**host** toolchain and run before the main build. This is Zig's `build.zig` model, but
in the project's own language (C++), so there is no second language and the feature
dogfoods mcpp. Its dependencies are exactly `[build-dependencies]` (L0 item 3, now
with a consumer).

### Discipline 1 — structured output protocol (not global mutation)
The program communicates only via stdout directives the engine consumes (Cargo's
`cargo::` model):
```
mcpp:link-lib=openblas
mcpp:link-search=<abs-dir>
mcpp:cxxflag=-DHAVE_FOO
mcpp:cfg=has_avx2
mcpp:generated=<path> # a source/header to add to the graph
mcpp:rerun-if-changed=<path>
mcpp:rerun-if-env-changed=VAR
```
It *requests* graph edges; it never silently mutates build state.

### Discipline 2 — explicit declared inputs/outputs (fixes the `.mcpp_ok` blind spot)
`rerun-if-changed`/`rerun-if-env-changed` give the opaque program declared inputs, so
incremental builds stay correct. This is the documented fix for mcpp's `.mcpp_ok`
gap ("process exited 0 ≠ outputs correct"): replace the bare success marker with a
**declared-input/declared-output contract**: the program (or recipe) records what it
read and what it produced; the build re-runs iff a declared input changed and treats
missing declared outputs as failure.

### Constraints (à la carte + supply-chain)
- **Leaf only**: `build.mcpp` chooses flags/sources/codegen and emits link
  requirements; it must **not** gate the top-level dependency graph (that stays in
  L1 cfg tables, applicative). cfg gating of `build.mcpp` itself evaluates on the
  **target**, but it compiles/runs on the **host**.
- **Isolation**: treat its execution as a build *action* with declared inputs/outputs;
  run sandboxed; prefer platform-level isolation (SLSA) over trusting program code,
  since the consensus is that script-level sandboxing alone is insufficient. The
  existing xim Lua sandbox (no `os.curdir`/`files`/`trymkdir`) is the right instinct;
  extend the same declared-I/O contract to `install()`.

---

## Phasing (recommended order)

1. **L-1 environment**: highest ROI, self-contained, no upstream dep beyond mcpp; the
   xlings target model already exists. Surface `[environment]` + extend the project
   `.xlings.json` writer. Fold `[toolchain]` into `workspace`. Wire `[build-dependencies]`.
2. **mcpp-index workspace** (companion doc): the first real consumer; exercises L1's
   need (Windows-only `compat.openblas`).
3. **L1 conditional graph**: `[target.'cfg()']` deps+flags, target-evaluated; then
   `lazy` + content-hash identity.
4. **L3 `build.mcpp`**: native build program with the two disciplines; backport the
   declared-I/O contract to recipe `install()`.

## Appendix — cross-tool summary
- **Declarative-table vs imperative-script**: TOML is static data → declarative tables
  for the graph (Cargo/vcpkg/Bazel); reserve imperative for a separate file
  (Cargo `build.rs`, Zig `build.zig`).
- **Conditional grammar**: de-facto token set across all six tools = os / arch /
  family / env; Cargo's `all/any/not` is the composable standard.
- **Host vs target**: Cargo/Zig/Bazel evaluate manifest conditionals on the **target**;
  build/tool deps run on the **host**. Conflating them breaks cross-compile (cf. mcpp's
  aarch64/musl bootstrap hazards: bootstrap tools must be host-static).
- **Resolution**: the field is converging on PubGrub (complete + explainable); consider
  it (or MVS / content-addressed identity) over ad-hoc backtracking long-term.

### Sources
Build Systems à la Carte (ICFP 2018); Cargo reference (specifying-dependencies,
build-scripts, config); Rust Reference (cfg); Zig build system + build.zig.zon +
cross-compilation docs; vcpkg manifest reference; Bazel configurable attributes;
Meson reference tables; Conan 2 requirements; PubGrub; Bazel hermeticity; SLSA.
