Skip to content

Commit b3627f9

Browse files
authored
fix: emit absolute -fprebuilt-module-path so clangd resolves modules (#77)
* fix: emit absolute `-fprebuilt-module-path` so clangd resolves modules src/build/flags.cppm was emitting a bare `-fprebuilt-module-path=pcm.cache` (or `=gcm.cache`). This works for `mcpp build` because ninja runs commands with cwd = `<projectRoot>/target/<triple>/<fp>/` so the relative path resolves correctly. But the same flag is captured verbatim into `compile_commands.json`, whose `directory` field is `plan.projectRoot`. Per CDB spec, clangd does `cd <directory>` before resolving the args, so it looks for `<projectRoot>/pcm.cache` — which doesn't exist. Effect: `import` resolution silently fails in clangd ("module 'X' not found"), even when `mcpp build` succeeds. The other `-fmodule-file=` flags in the same block were already absolute (escape_path-wrapped); this one was an oversight. Fix: format the flag with `escape_path(plan.outputDir / traits.bmiDir)`. Single line. ninja still works (absolute paths are cwd-independent), and clangd now finds the BMI cache. Regression test: tests/e2e/47_cdb_prebuilt_module_path_abs.sh creates a fresh project, runs `mcpp build`, parses `compile_commands.json`, asserts every `-fprebuilt-module-path=X`: - is absolute (starts with `/` on POSIX or `<drive>:` on Windows) - ends in `/pcm.cache` or `/gcm.cache` - points to an existing directory Pass-through for GCC-only flows that don't emit the flag at all. * ci/test: un-escape ninja sequences in CDB args; harden e2e for Windows First CI round revealed two layered issues exposed by the new e2e: 1. **The shipped CDB carries ninja-escape artefacts.** `flags.cppm`'s `escape_path` ninja-escapes paths (`$ ` for space, `$:` for colon, `$$` for literal `$`) so they survive embedding in ninja rule strings. But the same escaped strings flow verbatim through `f.cxx` into `compile_commands.json`, whose `arguments` array is exec'd by clangd without any ninja involved. On Windows that makes `C:` appear as `C$:` in CDB → clangd can't resolve the path. POSIX usually escaped nothing in practice, hiding the bug; Windows drive letters always trigger `$:`. Fix in `compile_commands.cppm::split_flags`: same single pass now splits tokens *and* unescapes the three ninja sequences. Splitting was also broken for paths with a literal space (would've broken `-isystem "/path with space/include"` mid-arg) — that's fixed simultaneously since `$ ` is no longer a token separator. 2. **e2e test 47 needed Windows-friendly checks.** The original `[^"]+` grep returned the raw JSON-escaped form (`\\Users\\...`) and the `^[A-Za-z]:` regex didn't match `C$:`. Rewrote using `jq -r` (which does JSON un-escape natively, preinstalled on all GHA runners) and: - explicit ninja-escape sniff (`$:`, `$ `, `$$`) - both POSIX and Windows-drive absolute-path checks - basename check normalised against `\` / `/` Net: CDB now contains plain paths in both runtimes, and the test fails loudly if either layer regresses. * test(e2e/47): strip trailing CR from jq output on git-bash/Windows The Windows CI run reported: FAIL: basename is not pcm.cache/gcm.cache: 'pcm.cache ' The closing quote on the next line is the giveaway — the value carries a trailing `\r`. `jq.exe` on git-bash emits CRLF; `mapfile -t` strips LF but leaves CR, so every downstream comparison saw `pcm.cache$'\r'` and failed. Trim with `${v%$'\r'}` per element. The source-code fix (un-escape ninja sequences) is working — the value itself is now a clean `C:\Users\...\pcm.cache`, no `$:` artefact. * test(e2e/47): drop bash-4 `mapfile` for macOS bash 3.2 compat Apple ships GPLv2-era bash 3.2 on macOS, which has no `mapfile` / `readarray`. The previous round failed on ci-macos with: line 31: mapfile: command not found FAIL: 47_cdb_prebuilt_module_path_abs.sh (exit 127) Replace `mapfile -t vals < <(jq …)` with `jq … > tmp.txt` followed by a `while … read … done < tmp.txt`. Using input redirection (not a `|` pipeline) keeps the loop in the current shell so `fail=1` propagates; the temp file lives under $TMP and is cleaned up by the script's trap. Empty-list signal now uses `[[ ! -s "$TMP/vals.txt" ]]` instead of array length — same semantics, no bashism.
1 parent 7f8d3e4 commit b3627f9

3 files changed

Lines changed: 137 additions & 12 deletions

File tree

src/build/compile_commands.cppm

Lines changed: 38 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -37,20 +37,47 @@ bool is_c_source(const std::filesystem::path& src) {
3737
return src.extension() == ".c";
3838
}
3939

40-
// Split a flag string into individual tokens.
40+
// Split a flag string into individual tokens AND un-escape ninja-style
41+
// path escapes (`$ ` → space, `$:` → `:`, `$$` → `$`).
42+
//
43+
// `flags.cppm::escape_path` ninja-escapes path arguments so they survive
44+
// embedding in ninja rule strings. Those escaped strings are then captured
45+
// into `f.cxx` / `f.cc` which is what we receive here. CDB consumers like
46+
// clangd exec the `arguments` array literally — no ninja involved — so
47+
// escaped chars must be undone or paths like `C:\Users\...` come through
48+
// as `C$:\Users\...` and break clangd's path resolution on Windows. (The
49+
// same issue would silently affect any POSIX path containing a space or
50+
// `$` — those just happen to be rare.)
51+
//
52+
// Splitting and un-escaping in one pass is correct: a literal space inside
53+
// a path appears as `$ ` in the input, which we must NOT treat as a token
54+
// separator.
4155
std::vector<std::string> split_flags(std::string_view s) {
4256
std::vector<std::string> out;
43-
std::size_t i = 0;
44-
while (i < s.size()) {
45-
while (i < s.size() && s[i] == ' ')
46-
++i;
47-
if (i >= s.size())
48-
break;
49-
std::size_t start = i;
50-
while (i < s.size() && s[i] != ' ')
51-
++i;
52-
out.emplace_back(s.substr(start, i - start));
57+
std::string token;
58+
auto flush = [&] {
59+
if (!token.empty()) {
60+
out.push_back(std::move(token));
61+
token.clear();
62+
}
63+
};
64+
for (std::size_t i = 0; i < s.size(); ++i) {
65+
char c = s[i];
66+
if (c == '$' && i + 1 < s.size()) {
67+
char nc = s[i + 1];
68+
if (nc == ' ' || nc == ':' || nc == '$') {
69+
token.push_back(nc);
70+
++i;
71+
continue;
72+
}
73+
}
74+
if (c == ' ') {
75+
flush();
76+
} else {
77+
token.push_back(c);
78+
}
5379
}
80+
flush();
5481
return out;
5582
}
5683

src/build/flags.cppm

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -212,7 +212,17 @@ CompileFlags compute_flags(const BuildPlan& plan) {
212212
auto traits = mcpp::toolchain::bmi_traits(plan.toolchain);
213213
std::string prebuilt_module_flag;
214214
if (traits.needsPrebuiltModulePath) {
215-
prebuilt_module_flag = std::format(" -fprebuilt-module-path={}", traits.bmiDir);
215+
// Absolute path: a bare `pcm.cache` / `gcm.cache` works at ninja
216+
// time because ninja runs commands with cwd = outputDir, but the
217+
// same flag ends up verbatim in `compile_commands.json` whose
218+
// `directory` field is the project root. clangd does `cd directory`
219+
// before resolving the flag, so a bare relative path points at
220+
// `<projectRoot>/pcm.cache` (which doesn't exist) and `import`
221+
// resolution fails with `module 'X' not found`. The other
222+
// `-fmodule-file=` flags in this block are already escape_path'd
223+
// (absolute) for the same reason — this one was a leftover.
224+
prebuilt_module_flag = std::format(" -fprebuilt-module-path={}",
225+
escape_path(plan.outputDir / traits.bmiDir));
216226
}
217227
f.cxx = std::format("-std=c++23{}{}{}{}{}{}{}{}{}{}", module_flag, std_module_flag,
218228
std_compat_module_flag, prebuilt_module_flag,
Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,88 @@
1+
#!/usr/bin/env bash
2+
# requires:
3+
# 47_cdb_prebuilt_module_path_abs.sh — `-fprebuilt-module-path` in
4+
# compile_commands.json must be an ABSOLUTE path, NOT a bare `pcm.cache`,
5+
# AND must not carry ninja-escape artefacts like `C$:` on Windows.
6+
# Reason: CDB `directory` is the project root and clangd does `cd
7+
# directory` before running the args, so a bare relative path points at
8+
# `<projectRoot>/pcm.cache` (missing) and a `C$:` prefix is treated as a
9+
# literal string, not a Windows drive letter. Both modes silently break
10+
# clangd's module resolution while `mcpp build` itself keeps working
11+
# (ninja runs from outputDir AND unescapes its own escape sequences).
12+
set -e
13+
14+
TMP=$(mktemp -d)
15+
trap "rm -rf $TMP" EXIT
16+
17+
cd "$TMP"
18+
"$MCPP" new app > /dev/null
19+
cd app
20+
"$MCPP" build > /dev/null
21+
22+
cdb=compile_commands.json
23+
[[ -f "$cdb" ]] || { echo "FAIL: no $cdb generated"; exit 1; }
24+
25+
command -v jq >/dev/null 2>&1 || {
26+
echo "SKIP: jq not on PATH (preinstalled on GitHub-hosted runners)"
27+
exit 0
28+
}
29+
30+
# jq returns each value JSON-unescaped (\\ → \, etc.). Stage to a temp
31+
# file then read with a redirected while loop — bash 3.2 on macOS lacks
32+
# `mapfile`/`readarray`, and a `| while` pipeline puts the loop in a
33+
# subshell so `fail=1` would not propagate. Input redirection runs the
34+
# loop in the current shell, preserving the flag.
35+
jq -r '
36+
.[] | .arguments[]?
37+
| select(type == "string" and startswith("-fprebuilt-module-path="))
38+
| sub("^-fprebuilt-module-path="; "")
39+
' "$cdb" > "$TMP/vals.txt"
40+
41+
if [[ ! -s "$TMP/vals.txt" ]]; then
42+
# GCC's libstdc++ flow uses -fmodules / gcm.cache without the explicit
43+
# -fprebuilt-module-path flag (see bmi_traits.needsPrebuiltModulePath).
44+
# Nothing to assert in that mode.
45+
echo "OK (no prebuilt-module-path flag — GCC toolchain)"
46+
exit 0
47+
fi
48+
49+
fail=0
50+
while IFS= read -r v; do
51+
# `jq` on git-bash/Windows emits CRLF; strip the trailing CR so basename
52+
# / regex comparisons don't trip over an invisible `\r`.
53+
v="${v%$'\r'}"
54+
[[ -z "$v" ]] && continue
55+
echo " checking: $v"
56+
57+
# Must NOT carry ninja-escape artefacts. The key signal is `$:` (drive
58+
# letter) or `$ ` / `$$` (path with space / dollar). If any of these
59+
# survives into CDB the JSON-args runtime treats them as literal text
60+
# → clangd fails to find the BMI.
61+
if [[ "$v" == *'$:'* || "$v" == *'$ '* || "$v" == *'$$'* ]]; then
62+
echo "FAIL: value retains ninja escape sequence ('\$:' / '\$ ' / '\$\$') — must be plain path in CDB"
63+
fail=1
64+
fi
65+
66+
# Absolute: POSIX (starts with '/') or Windows drive (e.g. 'C:').
67+
if [[ "$v" =~ ^/ || "$v" =~ ^[A-Za-z]: ]]; then
68+
:
69+
else
70+
echo "FAIL: value is relative: '$v'"
71+
echo " CDB 'directory' is the project root, but the BMI cache"
72+
echo " lives under target/<triple>/<fp>/ — clangd resolves to"
73+
echo " the wrong location and module imports fail."
74+
fail=1
75+
fi
76+
77+
# Basename must be pcm.cache or gcm.cache (cross-platform: normalise
78+
# backslashes first so Windows paths like 'C:\foo\pcm.cache' work).
79+
normalised="${v//\\//}"
80+
case "${normalised##*/}" in
81+
pcm.cache|gcm.cache) ;;
82+
*) echo "FAIL: basename is not pcm.cache/gcm.cache: '${normalised##*/}'"
83+
fail=1 ;;
84+
esac
85+
done < "$TMP/vals.txt"
86+
87+
[[ $fail -eq 0 ]] || exit 1
88+
echo "OK"

0 commit comments

Comments
 (0)