From 489173b23f25e9e35ea51213f43be134cee3a967 Mon Sep 17 00:00:00 2001
From: sunrisepeak
Date: Tue, 30 Jun 2026 12:55:52 +0800
Subject: [PATCH 1/3] feat(build.mcpp): typed import mcpp; build module bundled in the binary (v0.0.81)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

build.mcpp can be written modules-first — import mcpp; (no #include, no
import std;) — calling a typed API (mcpp::cxxflag/define/link_lib/generated/…)
that emits the same mcpp: wire protocol the engine already parses.

Architecture (see .agents/docs/2026-06-30-build-mcpp-module-library-design.md):
the helper IS part of the engine's ABI (it speaks this mcpp's protocol), so it
ships WITH the engine, not as a versioned package — the Zig std.Build model, not
Cargo's build-dep model. Embed the module SOURCE (constexpr string), not a BMI
(BMIs are compiler-version-locked); compile on demand against the resolved host
toolchain into target/.build-mcpp/ (GCC: -fmodules gcm.cache; Clang: --precompile
.pcm). I/O is C-level so the module needs no import std.

Gated on actual use: mcpp only builds/links the module when build.mcpp contains
'import mcpp' — a #include-based program compiles byte-identically to before
(zero blast radius). Uses the 0.0.79 capture_exec cwd to let GCC find gcm.cache/.

- src/build/build_program.cppm: kMcppModuleSource + build_mcpp_module + use-gating
- tests/e2e/92_build_mcpp_import.sh (GCC path); docs/07-build-mcpp.md (+zh)
- design doc; version -> 0.0.81.

Clang path covered by the mcpp-index build-mcpp member's workspace job on
macOS/Windows.
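
For reviewers, here is a minimal sketch of the "typed layer over the wire protocol" idea described above. It is not the bundled module: it is written as a plain header-style snippet so it compiles without module support, and the namespace `sketch` and the helper `directive` are hypothetical names used only for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the bundled module (assumption: illustration only,
// `sketch` and `directive` are not part of mcpp).
namespace sketch {

// Build one wire-protocol line of the form "mcpp:<key>=<value>".
inline std::string directive(const char* key, const char* value) {
    return std::string("mcpp:") + key + "=" + value;
}

// Typed wrappers: each prints exactly one directive line to stdout.
inline void cxxflag(const char* flag)  { std::puts(directive("cxxflag", flag).c_str()); }
inline void link_lib(const char* name) { std::puts(directive("link-lib", name).c_str()); }
inline void define(const char* name)   { std::puts(directive("cfg", name).c_str()); }

}  // namespace sketch
```

Calling `sketch::link_lib("m")` prints the line `mcpp:link-lib=m`, which is the same kind of line a hand-written `build.mcpp` would print directly; the typed functions only remove the need to spell the directive keys by hand.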
---
 ...-06-30-build-mcpp-module-library-design.md | 114 ++++++++++++++++++
 docs/07-build-mcpp.md                         |  37 +++++-
 docs/zh/07-build-mcpp.md                      |  35 +++++-
 mcpp.toml                                     |   2 +-
 src/build/build_program.cppm                  | 105 +++++++++++++++-
 src/toolchain/fingerprint.cppm                |   2 +-
 tests/e2e/92_build_mcpp_import.sh             |  55 +++++++++
 7 files changed, 342 insertions(+), 8 deletions(-)
 create mode 100644 .agents/docs/2026-06-30-build-mcpp-module-library-design.md
 create mode 100755 tests/e2e/92_build_mcpp_import.sh

diff --git a/.agents/docs/2026-06-30-build-mcpp-module-library-design.md b/.agents/docs/2026-06-30-build-mcpp-module-library-design.md
new file mode 100644
index 0000000..75bd934
--- /dev/null
+++ b/.agents/docs/2026-06-30-build-mcpp-module-library-design.md
@@ -0,0 +1,114 @@
+# The `mcpp` build-module library for `build.mcpp` (Architecture & Design)
+
+How mcpp provides a **typed module API** to `build.mcpp` so it can be written
+modules-first (`import mcpp;`, no `#include`, no `import std;`) instead of printing
+raw `mcpp:` protocol strings. Evaluated on five axes: **简洁 (simplicity) / 覆盖
+(coverage) / 优化 (optimization) / 稳定 (stability) / 适配 (adaptability)**.
+
+## The constraint that drives the whole design
+
+The helper's job is to emit the *exact* `mcpp:` wire protocol **this** mcpp parses.
+So it is **not a third-party library — it is part of the engine's ABI.** Any design
+that lets the helper drift from the engine's protocol version (a separately
+released package, a pinned dependency) introduces skew. This single fact rules out
+most of the "obvious" options and points straight at "ship it with the engine."
+
+## Options considered
+
+| Option | What | Verdict |
+|---|---|---|
+| **A. Ship a prebuilt BMI** (`mcpp.gcm`/`.pcm` in the release) | precompiled module interface | ✗ BMIs are **not portable** across compiler vendor/version/flags (GCC gcm is locked to the exact GCC build). Would need a combinatorial matrix of BMIs. Fragile. |
+| **B. Header-only** (`#include `) | a shipped header | ✗ contradicts modules-first ("no headers"). (Most portable, but off-brand; kept as a mental fallback only.) |
+| **C. Cargo model** — helper is a normal `[build-dependencies]` package in the index | `build.mcpp` depends on a published `mcpp` package, resolved + compiled like any dep | △ composable, but adds a **resolution step** for a leaf script and reintroduces **version skew** (the package version vs the engine's protocol). Cargo's `build-rs` crate works this way — but Cargo's protocol is far more stable than a young tool's. |
+| **D. Zig model** — helper is part of the tool, always present, version-matched | embed the module **source** in the binary; compile on demand against the host toolchain | ✓ **chosen.** Zig's `std.Build` ships with the compiler; the build API and the engine are one artifact, so they can never disagree. |
+
+### Why "embed the **source**, not the BMI"
+
+Source is the only **toolchain-portable** form. A BMI is compiler-version-locked;
+source compiles against *whatever* host toolchain resolved for this build (gcc on
+Linux, clang on macOS/Windows), at whatever version, with the same sysroot flags
+the build already computes. So one embedded `constexpr std::string_view` adapts to
+every toolchain — no matrix, no skew. This is the crux of **适配 + 稳定**
+(adaptability + stability).
+
+## Chosen design
+
+```
+mcpp binary
+└── constexpr std::string_view kMcppModuleSource   // the `mcpp` module, embedded
+      │   (module; #include <cstdio> export module mcpp; … inline emitters)
+      ▼   only when build.mcpp contains `import mcpp`
+   /target/.build-mcpp/
+      ├── mcpp.cppm          written from the embedded source
+      ├── mcpp.gcm / .pcm    compiled BMI (GCC gcm.cache/ | Clang pcm)
+      ├── mcpp.o             module object (linked into build.mcpp.bin)
+      └── build.mcpp.bin
+```
+
+1. **Embedded, version-matched** (`build_program.cppm` `kMcppModuleSource`). The
+   functions mirror the directive set 1:1 and `std::printf` the `mcpp:` lines. I/O
+   is C-level (global module fragment `#include <cstdio>`), so **the module needs
+   no `import std;`** — neither does a `build.mcpp` that only `import mcpp;`.
+2. **Compiled on demand, into `target/`** — not in the project tree. GCC:
+   `-fmodules` → `gcm.cache/mcpp.gcm` + `mcpp.o`; Clang: `--precompile` → `mcpp.pcm`
+   then `-c` → `mcpp.o`. Reuses the build's own `host_base_flags` (sysroot etc.).
+3. **Gated on actual use** — mcpp scans `build.mcpp` for `import mcpp`; only then
+   is the module built + linked and the compile run from `target/.build-mcpp/`
+   (so GCC finds `gcm.cache/` relative to cwd, via the 0.0.79 `capture_exec` cwd).
+   A `#include`-based `build.mcpp` compiles **byte-identically to before** — zero
+   blast radius.
+
+## Five-axis evaluation
+
+- **简洁 (simplicity)** — one embedded string + one compile helper; no packaging, no install, no
+  registry entry, no version field. The user writes `import mcpp;` and it's there.
+- **覆盖 (coverage)** — GCC (gcm) on Linux + Clang (pcm) on macOS/Windows = mcpp's whole
+  toolchain matrix (mcpp uses clang, not MSVC, on Windows). The directive API
+  covers every wire directive 1:1.
+- **优化 (optimization)** — built only when `build.mcpp` *uses* it AND is being (re)compiled
+  (already gated by the declared-input cache), so a stable build.mcpp pays nothing.
+  Cost when it does run: one ~0.3 s module compile. *Future*: a **global
+  per-toolchain BMI cache** (`~/.mcpp/bmi/build-module//`,
+  symlinked into each project's `gcm.cache/`) would compile once per machine
+  instead of once per project — deferred; the per-project compile is cheap and
+  keeps the code simple.
+- **稳定 (stability)** — embedded source ⇒ **no version skew** (the headline win); use-gating
+  ⇒ existing `#include` programs are untouched; failures surface as a clear "mcpp
+  module compile failed" with the compiler output.
+- **适配 (adaptability)** — source-on-demand adapts to any host toolchain/version automatically;
+  adding a directive = adding one `inline` function to the embedded string;
+  per-compiler module ABI handled by the GCC/Clang branch.
+
+## Naming
+
+`import mcpp;` (top-level) for brevity — `build.mcpp` context makes the scope
+unambiguous. Future non-build helpers can live under `mcpp.` modules without
+colliding. (`import mcpp.build;` was considered for namespace precision; rejected
+for the common case's verbosity — revisit only if a second `mcpp` module appears.)
+
+## API (mirrors the wire protocol 1:1)
+
+```cpp
+import mcpp;
+int main() {
+    mcpp::cxxflag("-DHAVE_X=1");
+    mcpp::cflag("-DFOR_C");
+    mcpp::link_lib("m");                    // -lm
+    mcpp::link_search("vendor/lib");        // -L…
+    mcpp::define("HAVE_FEATURE");           // cfg= → -DHAVE_FEATURE
+    mcpp::generated("src/gen.cpp");
+    mcpp::rerun_if_changed("config.h");
+    mcpp::rerun_if_env_changed("USE_FAST");
+}
+```
+
+The raw stdout protocol stays the documented low-level substrate; `import mcpp;` is
+the typed layer over it (the Cargo `build-rs`-over-`cargo::` shape, but
+engine-bundled à la Zig).
+
+## Coverage / stability boundaries (recorded)
+
+- **Windows/macOS Clang path** is exercised by the mcpp-index `build-mcpp`
+  workspace member (its `mcpp test --workspace` runs on macOS/Windows with clang);
+  the e2e `92_build_mcpp_import.sh` covers the GCC path (it `requires: gcc`).
+- Cross `--target` builds still skip `build.mcpp` entirely (host-only), so the
+  module is host-only too.
diff --git a/docs/07-build-mcpp.md b/docs/07-build-mcpp.md
index 9f26522..7fdd25f 100644
--- a/docs/07-build-mcpp.md
+++ b/docs/07-build-mcpp.md
@@ -54,9 +54,44 @@ is ignored, so you can freely log diagnostics.
 
 The program **requests** build edges (flags, libraries, sources). It cannot add
 a registry dependency — keep your dependency graph declarative in `mcpp.toml`
-(including platform-conditional `[target.'cfg(...)'.dependencies]`). `build.mcpp`
+(including platform-conditional `[target.windows.dependencies]`). `build.mcpp`
 is for *leaf* decisions: flags, codegen, link requirements.
 
+## Typed API: `import mcpp;` (recommended)
+
+Instead of printing raw strings you can write `build.mcpp` **modules-first** —
+`import mcpp;`, no `#include`, no `import std;`. The `mcpp` module is bundled in the
+mcpp binary (so it always matches your mcpp's protocol) and is compiled on demand;
+its functions just emit the directives above:
+
+```cpp
+// build.mcpp
+import mcpp;
+
+int main() {
+    mcpp::cxxflag("-DHAVE_BANNER=1");
+    mcpp::link_lib("m");                 // -lm
+    mcpp::link_search("vendor/lib");     // -L…
+    mcpp::define("HAVE_FEATURE");        // == mcpp:cfg= → -DHAVE_FEATURE
+    mcpp::generated("src/gen.cpp");
+    mcpp::rerun_if_changed("config.h");
+    mcpp::rerun_if_env_changed("USE_FAST");
+}
+```
+
+| Function | Emits |
+|---|---|
+| `mcpp::cxxflag(s)` / `mcpp::cflag(s)` | `mcpp:cxxflag=` / `mcpp:cflag=` |
+| `mcpp::link_lib(s)` / `mcpp::link_search(s)` | `mcpp:link-lib=` / `mcpp:link-search=` |
+| `mcpp::define(s)` | `mcpp:cfg=` (i.e. `-D`) |
+| `mcpp::generated(p)` | `mcpp:generated=` |
+| `mcpp::rerun_if_changed(p)` / `mcpp::rerun_if_env_changed(v)` | the matching `rerun-*` directives |
+
+If your `build.mcpp` also needs to *write* a generated file, mix in a textual
+`#include ` — that's fine; only `import std;` is unnecessary. The raw
+stdout protocol above remains the low-level substrate; `import mcpp;` is the typed
+layer over it.
+
 ## Incremental: declared inputs (no needless re-runs)
 
 mcpp does **not** re-run `build.mcpp` on every build. It caches the program's
diff --git a/docs/zh/07-build-mcpp.md b/docs/zh/07-build-mcpp.md
index 6e9fbfa..4ef7614 100644
--- a/docs/zh/07-build-mcpp.md
+++ b/docs/zh/07-build-mcpp.md
@@ -50,9 +50,42 @@ mcpp build # 编译 + 运行 build.mcpp,然后构建工程
 | `mcpp:rerun-if-env-changed=` | 该环境变量变化时重跑 `build.mcpp` |
 
 程序**请求**构建边(开关、库、源码),它**不能**新增注册表依赖——请把依赖图保持在
-`mcpp.toml` 里声明式管理(包括平台条件依赖 `[target.'cfg(...)'.dependencies]`)。
+`mcpp.toml` 里声明式管理(包括平台条件依赖 `[target.windows.dependencies]`)。
 `build.mcpp` 用于*叶子*决策:开关、代码生成、链接需求。
 
+## 类型化 API:`import mcpp;`(推荐)
+
+除了打印裸字符串,你还可以把 `build.mcpp` 写成**模块优先**——`import mcpp;`,无
+`#include`、无 `import std;`。`mcpp` 模块**内置在 mcpp 二进制里**(因此永远和你这版 mcpp
+的协议匹配),按需编译;它的函数只是 emit 上面那些指令:
+
+```cpp
+// build.mcpp
+import mcpp;
+
+int main() {
+    mcpp::cxxflag("-DHAVE_BANNER=1");
+    mcpp::link_lib("m");                 // -lm
+    mcpp::link_search("vendor/lib");     // -L…
+    mcpp::define("HAVE_FEATURE");        // == mcpp:cfg= → -DHAVE_FEATURE
+    mcpp::generated("src/gen.cpp");
+    mcpp::rerun_if_changed("config.h");
+    mcpp::rerun_if_env_changed("USE_FAST");
+}
+```
+
+| 函数 | emit |
+|---|---|
+| `mcpp::cxxflag(s)` / `mcpp::cflag(s)` | `mcpp:cxxflag=` / `mcpp:cflag=` |
+| `mcpp::link_lib(s)` / `mcpp::link_search(s)` | `mcpp:link-lib=` / `mcpp:link-search=` |
+| `mcpp::define(s)` | `mcpp:cfg=`(即 `-D`) |
+| `mcpp::generated(p)` | `mcpp:generated=` |
+| `mcpp::rerun_if_changed(p)` / `mcpp::rerun_if_env_changed(v)` | 对应的 `rerun-*` 指令 |
+
+如果 `build.mcpp` 还需要*写*生成文件,混入一个文本 `#include ` 即可——这没问题,
+只有 `import std;` 是不必要的。上面的裸 stdout 协议仍是底层基底;`import mcpp;` 是其上的
+类型化层。
+
 ## 增量:声明输入(避免无谓重跑)
 
 mcpp **不会**每次构建都重跑 `build.mcpp`。它会缓存程序产出的指令,只有当它依赖的东西
diff --git a/mcpp.toml b/mcpp.toml
index 10ff521..c455e2f 100644
--- a/mcpp.toml
+++ b/mcpp.toml
@@ -1,6 +1,6 @@
 [package]
 name = "mcpp"
-version = "0.0.80"
+version = "0.0.81"
 description = "Modern C++ build & package management tool"
 license = "Apache-2.0"
 authors = ["mcpp-community"]
diff --git a/src/build/build_program.cppm b/src/build/build_program.cppm
index 644a84d..572ef9a 100644
--- a/src/build/build_program.cppm
+++ b/src/build/build_program.cppm
@@ -145,6 +145,73 @@ std::vector<std::string> host_base_flags(const mcpp::toolchain::Toolchain& tc) {
     return f;
 }
 
+// The bundled `mcpp` build module — a typed API over the stdout wire protocol so
+// build.mcpp can `import mcpp;` (no `#include`, no `import std;`). I/O uses
+// C-level primitives in the global module fragment, so the module needs no std
+// module BMI. The functions mirror the directive set 1:1; they just print the
+// `mcpp:` lines the engine already parses. Embedded in the binary (not shipped as
+// a file) so it always matches this mcpp's protocol.
+constexpr std::string_view kMcppModuleSource = R"CPP(module;
+#include <cstdio>
+export module mcpp;
+export namespace mcpp {
+inline void cxxflag(const char* flag) { std::printf("mcpp:cxxflag=%s\n", flag); }
+inline void cflag(const char* flag) { std::printf("mcpp:cflag=%s\n", flag); }
+inline void link_lib(const char* name) { std::printf("mcpp:link-lib=%s\n", name); }
+inline void link_search(const char* dir) { std::printf("mcpp:link-search=%s\n", dir); }
+inline void define(const char* name) { std::printf("mcpp:cfg=%s\n", name); }
+inline void generated(const char* path) { std::printf("mcpp:generated=%s\n", path); }
+inline void rerun_if_changed(const char* path) { std::printf("mcpp:rerun-if-changed=%s\n", path); }
+inline void rerun_if_env_changed(const char* var) { std::printf("mcpp:rerun-if-env-changed=%s\n", var); }
+}
+)CPP";
+
+// Compile the bundled `mcpp` module into `bdir` and return the extra flags the
+// build.mcpp compile needs to import it (the object `mcpp.o` is linked alongside).
+//   GCC   : -fmodules → gcm.cache/mcpp.gcm + mcpp.o; build.mcpp compiles from
+//           `bdir` (cwd) so GCC finds gcm.cache/mcpp.gcm.
+//   Clang : --precompile → mcpp.pcm, then -c → mcpp.o; pass -fmodule-file=mcpp=.
+std::expected<std::vector<std::string>, std::string>
+build_mcpp_module(const fs::path& bdir, const fs::path& compiler,
+                  const std::vector<std::string>& base, const std::string& stdFlag,
+                  bool isClang) {
+    std::error_code ec;
+    fs::path cppm = bdir / "mcpp.cppm";
+    { std::ofstream os(cppm, std::ios::trunc);
+      os << kMcppModuleSource;
+      if (!os) return std::unexpected(std::string("could not write mcpp module source")); }
+
+    auto run = [&](std::vector<std::string> argv, const char* what)
+        -> std::expected<void, std::string> {
+        auto r = mcpp::platform::process::capture_exec(argv, {}, bdir.string());
+        if (r.exit_code != 0)
+            return std::unexpected(std::format("mcpp module {} failed (exit {}):\n{}",
+                                               what, r.exit_code, r.output));
+        return {};
+    };
+    auto with_base = [&](std::vector<std::string> head) {
+        for (auto& b : base) head.push_back(b);
+        return head;
+    };
+
+    std::vector<std::string> extra;
+    if (isClang) {
+        if (auto r = run(with_base({compiler.string(), stdFlag, "--precompile",
+                                    "mcpp.cppm", "-o", "mcpp.pcm"}), "precompile"); !r)
+            return std::unexpected(r.error());
+        if (auto r = run(with_base({compiler.string(), stdFlag, "-c",
+                                    "mcpp.pcm", "-o", "mcpp.o"}), "object"); !r)
+            return std::unexpected(r.error());
+        extra.push_back("-fmodule-file=mcpp=" + (bdir / "mcpp.pcm").string());
+    } else {
+        if (auto r = run(with_base({compiler.string(), stdFlag, "-fmodules", "-c",
+                                    "mcpp.cppm", "-o", "mcpp.o"}), "compile"); !r)
+            return std::unexpected(r.error());
+        extra.push_back("-fmodules");
+    }
+    return extra;
+}
+
 // ── Cache (line-based; one record per line, internal format) ───────────────
 // program
 // compiler
@@ -286,20 +353,50 @@ std::expected run_build_program(
         return {};
     }
 
-    fs::create_directories(build_dir(root), ec);
-    fs::path bin = build_dir(root) / "build.mcpp.bin";
+    fs::path bdir = build_dir(root);
+    fs::create_directories(bdir, ec);
+    fs::path bin = bdir / "build.mcpp.bin";
 
     // ── Compile build.mcpp with the host toolchain ──────────────────────────
     std::string std_flag = "-std=" + std::string(cppStandard.empty() ? "c++23" : cppStandard);
+    auto base = host_base_flags(tc);
+
+    // Only wire the bundled `mcpp` module when build.mcpp actually imports it —
+    // so the common `#include`-based program compiles exactly as before (no
+    // -fmodules, cwd = project root). When it does `import mcpp;`, compile the
+    // module, link its object, and run the build.mcpp compile from `bdir` so GCC
+    // finds gcm.cache/mcpp.gcm.
+    std::string srcText;
+    { std::ifstream is(src); std::ostringstream ss; ss << is.rdbuf(); srcText = ss.str(); }
+    bool usesModule = srcText.find("import mcpp") != std::string::npos;
+
+    std::vector<std::string> moduleFlags;
+    if (usesModule) {
+        auto mf = build_mcpp_module(bdir, hostCompiler, base, std_flag,
+                                    mcpp::toolchain::is_clang(tc));
+        if (!mf) return std::unexpected(mf.error());
+        moduleFlags = std::move(*mf);
+    }
+
     // `-x c++` is required: the `.mcpp` extension is unknown to the compiler, so
     // without it the driver hands build.mcpp to the linker as a linker script.
     std::vector<std::string> compileArgv = { hostCompiler.string(), std_flag, "-O0" };
-    for (auto& bf : host_base_flags(tc)) compileArgv.push_back(bf);
+    for (auto& bf : base) compileArgv.push_back(bf);
+    for (auto& mf : moduleFlags) compileArgv.push_back(mf);
     compileArgv.push_back("-x"); compileArgv.push_back("c++");
     compileArgv.push_back(src.string());
+    if (usesModule) {
+        // Link the module object (reset the input language first so the .o isn't
+        // treated as C++ source).
+        compileArgv.push_back("-x"); compileArgv.push_back("none");
+        compileArgv.push_back((bdir / "mcpp.o").string());
+    }
     compileArgv.push_back("-o"); compileArgv.push_back(bin.string());
     mcpp::ui::info("build.mcpp", "compiling");
-    auto cres = mcpp::platform::process::capture_exec(compileArgv, {}, root.string());
+    // GCC resolves `import mcpp;` via gcm.cache/ relative to the compile cwd, so
+    // run the module-using compile from bdir; otherwise the project root is fine.
+    std::string compileCwd = usesModule ? bdir.string() : root.string();
+    auto cres = mcpp::platform::process::capture_exec(compileArgv, {}, compileCwd);
     if (cres.exit_code != 0) {
         return std::unexpected(std::format(
             "build.mcpp failed to compile (exit {}):\n{}", cres.exit_code, cres.output));
diff --git a/src/toolchain/fingerprint.cppm b/src/toolchain/fingerprint.cppm
index 59e3ff5..2f397b7 100644
--- a/src/toolchain/fingerprint.cppm
+++ b/src/toolchain/fingerprint.cppm
@@ -18,7 +18,7 @@ import mcpp.toolchain.detect;
 
 export namespace mcpp::toolchain {
 
-inline constexpr std::string_view MCPP_VERSION = "0.0.80";
+inline constexpr std::string_view MCPP_VERSION = "0.0.81";
 
 struct FingerprintInputs {
     Toolchain toolchain;
diff --git a/tests/e2e/92_build_mcpp_import.sh b/tests/e2e/92_build_mcpp_import.sh
new file mode 100755
index 0000000..0346a37
--- /dev/null
+++ b/tests/e2e/92_build_mcpp_import.sh
@@ -0,0 +1,55 @@
+#!/usr/bin/env bash
+# 92_build_mcpp_import.sh — the bundled `import mcpp;` typed build library. A
+# build.mcpp written modules-first (no #include, no `import std;`) calls the typed
+# API (mcpp::cxxflag / define / generated / link_lib …), which emits the same
+# `mcpp:` wire protocol the engine parses. mcpp compiles the bundled module on the
+# fly and links it. See .agents/docs/2026-06-30-l3-build-mcpp-implementation-design.md.
+#
+# requires: gcc
+set -e
+
+TMP=$(mktemp -d)
+trap "rm -rf $TMP" EXIT
+cd "$TMP"
+
+mkdir -p app/src
+cd app
+cat > mcpp.toml <<'EOF'
+[package]
+name = "app"
+version = "0.1.0"
+EOF
+
+# Modules-first build program: import mcpp;, no headers, no import std.
+cat > build.mcpp <<'EOF'
+import mcpp;
+int main() {
+    mcpp::generated("src/gen.cpp");     // a generated source to link
+    mcpp::cxxflag("-DVIA_MODULE=1");    // a compile flag
+    mcpp::define("MODULE_DEFINE");      // cfg= → -DMODULE_DEFINE
+    mcpp::rerun_if_changed("build.mcpp");
+    return 0;
+}
+EOF
+# build.mcpp uses generated() to declare src/gen.cpp; write it (the program could
+# also generate it — here we just commit it so the test stays focused on import).
+cat > src/gen.cpp <<'EOF'
+int gen_value() { return 99; }
+EOF
+cat > src/main.cpp <<'EOF'
+#if !defined(VIA_MODULE) || !defined(MODULE_DEFINE)
+#error "import mcpp typed directives did not reach the build"
+#endif
+int gen_value();
+int main() { return gen_value() == 99 ? 0 : 1; }
+EOF
+
+"$MCPP" build > b.log 2>&1 || { cat b.log; echo "FAIL: import mcpp build errored"; exit 1; }
+grep -q "build.mcpp" b.log || { cat b.log; echo "FAIL: build.mcpp not invoked"; exit 1; }
+# The bundled module is compiled into target/.build-mcpp/.
+[ -f target/.build-mcpp/mcpp.o ] || { echo "FAIL: bundled mcpp module not compiled"; exit 1; }
+# The binary returns 0 only if both the module-emitted define AND the generated
+# source took effect.
+"$MCPP" run > r.log 2>&1 || { cat r.log; echo "FAIL: run non-zero (module directives/generated source missing)"; exit 1; }
+
+echo "OK"

From 13f6cd4f42f96665db9e785628c555b8f8b6c670 Mon Sep 17 00:00:00 2001
From: sunrisepeak
Date: Tue, 30 Jun 2026 13:16:56 +0800
Subject: [PATCH 2/3] fix(build.mcpp): placeholder the embedded module decl so the regex scanner doesn't misread it

The Windows self-host build uses the default regex module scanner, which read
the 'export module mcpp;' line inside kMcppModuleSource (a raw string literal)
as build_program.cppm exporting a second module -> 'file already exports
module ... cannot export mcpp'. Use a @MODULE@ placeholder in the embedded
source, substituted with 'export module' when written. No behavior change; the
generated mcpp.cppm is identical.
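
The substitution described above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code in the diff below: `expand_module_placeholder` is a hypothetical helper name, and the sketch takes the embedded text as a parameter instead of reading `kMcppModuleSource`.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (assumption: the name and signature are for illustration).
// Replaces the first "@MODULE@" marker with "export module", so the literal
// declaration text only exists in the file written at build time and never in
// the embedding source that the line-based scanner reads.
inline std::string expand_module_placeholder(std::string_view embedded) {
    constexpr std::string_view placeholder = "@MODULE@";
    std::string out(embedded);
    if (auto pos = out.find(placeholder); pos != std::string::npos)
        out.replace(pos, placeholder.size(), "export module");
    return out;
}
```

Text without the marker passes through unchanged, so the helper is safe to apply unconditionally before the module source is written to disk.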
---
 src/build/build_program.cppm | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/build/build_program.cppm b/src/build/build_program.cppm
index 572ef9a..04db6c7 100644
--- a/src/build/build_program.cppm
+++ b/src/build/build_program.cppm
@@ -151,9 +151,12 @@ std::vector<std::string> host_base_flags(const mcpp::toolchain::Toolchain& tc) {
 // module BMI. The functions mirror the directive set 1:1; they just print the
 // `mcpp:` lines the engine already parses. Embedded in the binary (not shipped as
 // a file) so it always matches this mcpp's protocol.
+// NOTE: the module declaration line uses a `@MODULE@` placeholder (substituted
+// with `export module` when written) so mcpp's own line-based module scanner does
+// not mistake this embedded string for build_program.cppm exporting a 2nd module.
 constexpr std::string_view kMcppModuleSource = R"CPP(module;
 #include <cstdio>
-export module mcpp;
+@MODULE@ mcpp;
 export namespace mcpp {
 inline void cxxflag(const char* flag) { std::printf("mcpp:cxxflag=%s\n", flag); }
 inline void cflag(const char* flag) { std::printf("mcpp:cflag=%s\n", flag); }
@@ -177,8 +180,11 @@ build_mcpp_module(const fs::path& bdir, const fs::path& compiler,
                   bool isClang) {
     std::error_code ec;
     fs::path cppm = bdir / "mcpp.cppm";
+    std::string moduleSrc(kMcppModuleSource);
+    if (auto p = moduleSrc.find("@MODULE@"); p != std::string::npos)
+        moduleSrc.replace(p, std::string_view("@MODULE@").size(), "export module");
     { std::ofstream os(cppm, std::ios::trunc);
-      os << kMcppModuleSource;
+      os << moduleSrc;
       if (!os) return std::unexpected(std::string("could not write mcpp module source")); }
 
     auto run = [&](std::vector<std::string> argv, const char* what)

From ed6fb1d09460267580f1c89302ed55c8ccced186 Mon Sep 17 00:00:00 2001
From: sunrisepeak
Date: Tue, 30 Jun 2026 13:17:32 +0800
Subject: [PATCH 3/3] docs: record the regex-scanner gotcha in the build-module design

---
 .../2026-06-30-build-mcpp-module-library-design.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/.agents/docs/2026-06-30-build-mcpp-module-library-design.md b/.agents/docs/2026-06-30-build-mcpp-module-library-design.md
index 75bd934..d4be627 100644
--- a/.agents/docs/2026-06-30-build-mcpp-module-library-design.md
+++ b/.agents/docs/2026-06-30-build-mcpp-module-library-design.md
@@ -105,6 +105,17 @@ The raw stdout protocol stays the documented low-level substrate; `import mcpp;`
 the typed layer over it (the Cargo `build-rs`-over-`cargo::` shape, but
 engine-bundled à la Zig).
 
+## Implementation gotcha (recorded)
+
+The embedded source contains the line `export module mcpp;`. mcpp's **default
+line-based regex module scanner** (used on the Windows self-host build; the P1689
+compiler-driven scanner ignores string literals) read that line *inside the raw
+string literal* as `build_program.cppm` declaring a second module → "file already
+exports module … cannot export 'mcpp'". Fix: write the declaration with a
+`@MODULE@` placeholder substituted to `export module` at file-write time, so no
+literal `export module ` text appears in mcpp's own source. (A broader fix
+would be to teach the regex scanner to skip string/raw-string literals.)
+
 ## Coverage / stability boundaries (recorded)
 
 - **Windows/macOS Clang path** is exercised by the mcpp-index `build-mcpp`