Commit b8df13e

fix: resolve LLVM toolchain install failure on Linux
Three fixes for `mcpp toolchain install llvm` failing with "xpkg payload missing":

1. install_with_progress(): use direct `xlings install -y` command on ALL platforms (not just Windows). The direct command avoids stdin closure (</dev/null) that breaks xlings subprocess coordination for large packages like LLVM (~800MB). Falls back to NDJSON interface path if direct install fails.
2. package_fetcher.cppm: extend the global xlings directory fallback from Windows-only to all platforms. If xlings installs a package to ~/.xlings/ instead of the mcpp sandbox, detect and copy it.
3. ci.yml: add "Toolchain install smoke test" step that exercises `mcpp toolchain install llvm` + build with it. This core user flow was previously untested in CI.
1 parent 20bf41d commit b8df13e

4 files changed

Lines changed: 176 additions & 21 deletions

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# Analysis: LLVM toolchain install failure

## Symptom

When running `mcpp toolchain install llvm`, the dependency packages (libxml2, zlib, glibc, etc.) install successfully, but the LLVM package itself (800MB) is missing:

```
~/.mcpp/registry/data/xpkgs/
├── xim-x-libxml2/ ✓ installed
├── xim-x-zlib/ ✓ installed
├── xim-x-glibc/ ✓ installed
├── xim-x-llvm/ ✗ missing
```

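## Root cause analysis

### Problem 1: closing stdin with `</dev/null` may break xlings subprocess communication

`seal_stdin()` in `platform/process.cppm:79-84` appends `</dev/null` to every POSIX command.

That change fixed a hang on first run on macOS, but it has a side effect: subprocesses inside xlings (such as the tar process that extracts the 800MB LLVM archive) may rely on stdin for inter-process communication or signaling. Small packages (libxml2, etc.) are unaffected. For a large package like LLVM, extraction takes longer and the subprocess chain is more complex, so the sealed stdin may cause a silent failure.
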
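To make the effect of `</dev/null` concrete, here is a minimal POSIX-only sketch. `seal_stdin_sketch` is a hypothetical stand-in for `seal_stdin()` (whose body is not shown in this commit), and the shell's `read` builtin plays the role of a child that needs stdin. It only demonstrates that a sealed child sees end-of-file immediately; whether xlings or tar actually depends on stdin remains the unconfirmed hypothesis described above.

```cpp
#include <cstdlib>
#include <string>
#include <sys/wait.h>

// Hypothetical stand-in for mcpp's seal_stdin(): append a stdin redirect.
std::string seal_stdin_sketch(const std::string& cmd) {
    return cmd + " </dev/null";
}

// Run a shell command and return its exit code,
// or -1 if it could not be run or did not exit normally.
int run_exit_code(const std::string& cmd) {
    int status = std::system(cmd.c_str());
    if (status == -1 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}

// True when a child that reads one line from stdin hits EOF right away.
bool sealed_child_sees_eof() {
    // `read` returns non-zero when it reaches end-of-file before a newline.
    return run_exit_code(seal_stdin_sketch("read line")) != 0;
}
```

A child that treats EOF on stdin as "the parent went away" would exit early under this redirect, which is consistent with the silent failure described above.
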
### Problem 2: `2>/dev/null` swallows all error output

The command built at `xlings.cppm:432-434`:

```bash
cd ~/.mcpp && ... xlings interface install_packages --args '...' 2>/dev/null </dev/null
```

stderr is discarded entirely. If xlings writes an error message to stderr while installing LLVM, we never see it.

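One way to keep the diagnostics is to redirect stderr to a file instead of `/dev/null` and read it back when the command fails. The sketch below is POSIX-only, and every name in it (`RunResult`, `run_capturing_stderr`, the log path) is hypothetical; it is not how `xlings.cppm` builds its command.

```cpp
#include <cstdlib>
#include <filesystem>
#include <fstream>
#include <sstream>
#include <string>
#include <sys/wait.h>

struct RunResult {
    int exit_code;            // -1 if the child did not exit normally
    std::string stderr_text;  // everything the command wrote to stderr
};

// Run `cmd` through the shell with stderr redirected to `log` rather than
// /dev/null, then read the captured text back. Assumes `log` contains no
// single quotes, because it is wrapped in single quotes for the shell.
RunResult run_capturing_stderr(const std::string& cmd,
                               const std::filesystem::path& log) {
    // The subshell parentheses make the redirect cover the whole command.
    std::string full = "( " + cmd + " ) 2>'" + log.string() + "'";
    int status = std::system(full.c_str());

    std::ifstream in(log);
    std::ostringstream captured;
    captured << in.rdbuf();

    int code = (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
    return {code, captured.str()};
}
```

With something like this, a failed install could surface `stderr_text` in the error shown to the user instead of only reporting the missing payload later.
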
### Problem 3: the NDJSON handler only processes download_progress events

The `handle_line` callback at `xlings.cppm:645-692`:

```cpp
if (kind != "data") return; // ignore non-data events
if (ls.find_str("dataKind") != "download_progress") return; // only download progress matters
```

If xlings emits an error event or a log event to report the install failure, it is silently dropped.

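A handler that surfaces failures would need to classify each line before deciding to drop it. The following is a sketch under loud assumptions: the `"error"` and `"log"` event kinds are guesses at what xlings might emit (only `"data"` and `"download_progress"` appear in the code above), and `find_str_sketch` is a naive string scan, not the real `find_str` and not a JSON parser.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Naive lookup of a string field such as "kind":"data" in one NDJSON line.
// Ignores escapes and nesting; tolerates only spaces around the colon.
std::optional<std::string> find_str_sketch(std::string_view line,
                                           std::string_view key) {
    const std::string needle = "\"" + std::string(key) + "\"";
    auto pos = line.find(needle);
    if (pos == std::string_view::npos) return std::nullopt;
    pos += needle.size();
    while (pos < line.size() && line[pos] == ' ') ++pos;
    if (pos >= line.size() || line[pos] != ':') return std::nullopt;
    ++pos;
    while (pos < line.size() && line[pos] == ' ') ++pos;
    if (pos >= line.size() || line[pos] != '"') return std::nullopt;
    ++pos;
    const auto end = line.find('"', pos);
    if (end == std::string_view::npos) return std::nullopt;
    return std::string(line.substr(pos, end - pos));
}

enum class EventAction { Progress, ReportError, Log, Ignore };

// Decide what to do with one event line instead of dropping everything
// that is not a download_progress data event.
EventAction classify_event(std::string_view line) {
    const auto kind = find_str_sketch(line, "kind");
    if (!kind) return EventAction::Ignore;
    if (*kind == "error") return EventAction::ReportError;  // assumed event kind
    if (*kind == "log") return EventAction::Log;            // assumed event kind
    if (*kind == "data" &&
        find_str_sketch(line, "dataKind").value_or("") == "download_progress") {
        return EventAction::Progress;
    }
    return EventAction::Ignore;
}
```

The point of the enum is that "ignore" becomes an explicit decision per event kind rather than the default for everything unrecognized.
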
### Problem 4: Windows has a fallback, Linux does not

`package_fetcher.cppm:608-638` contains a Windows-only workaround:

```cpp
#if defined(_WIN32)
// If verdir does not exist, check the global xlings directory ~/.xlings/data/xpkgs/ and copy it over
if (!std::filesystem::exists(verdir)) {
    // ... copy from ~/.xlings/ to ~/.mcpp/
}
#endif
```

This workaround handles the case where xlings installs a package into its global directory rather than the directory specified by XLINGS_HOME. **Linux has no such fallback.**

### Why CI does not hit this problem

CI sets `MCPP_VENDORED_XLINGS="$XLINGS_BIN"`:

```yaml
export MCPP_VENDORED_XLINGS="$XLINGS_BIN"
"$MCPP" build --target x86_64-linux-musl
```

`MCPP_VENDORED_XLINGS` triggers a special path in `make_xlings_env()` that uses the global xlings binary. In addition, toolchain installation in CI goes through the global xlings sandbox (because MCPP_HOME is set explicitly), which is completely different from the nested-sandbox setup on a user's machine.

In fact, **CI never tested the `mcpp toolchain install llvm` user flow at all**: it only tested `mcpp build` (with a preinstalled toolchain).

## Proposed fixes

### Fix 1: switch the Linux path of `install_with_progress()` to the direct command (matching Windows)

Windows already uses the direct `xlings install ... -y` command rather than interface mode. Linux should do the same:

```cpp
int install_with_progress(const Env& env, std::string_view target,
                          const BootstrapProgressCallback& cb)
{
    // All platforms: try the direct install command first
    auto directCmd = build_command_prefix(env) + std::format(" install {} -y", target);
    int directRc = mcpp::platform::process::run_silent(directCmd);
    if (directRc == 0) return 0;

    // If the direct command fails, fall back to interface mode (keeps progress callbacks)
    // ...
}
```

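The try-direct-then-fall-back control flow can be sketched independently of xlings. In the sketch below, `InstallStrategy` and `install_with_fallback` are hypothetical names, and callables stand in for the shell commands the real function runs.

```cpp
#include <functional>
#include <string>
#include <vector>

// One install strategy: a label for diagnostics plus a callable that
// returns a process-style exit code (0 means success).
struct InstallStrategy {
    std::string name;
    std::function<int()> run;
};

// Try each strategy in order and stop at the first success. Returns 0 on
// success, otherwise the exit code of the last strategy attempted.
// If `attempted` is non-null, the names of the strategies tried are appended.
int install_with_fallback(const std::vector<InstallStrategy>& strategies,
                          std::vector<std::string>* attempted = nullptr) {
    int rc = 1;  // stays non-zero when the list is empty
    for (const auto& strategy : strategies) {
        if (attempted) attempted->push_back(strategy.name);
        rc = strategy.run();
        if (rc == 0) return 0;
    }
    return rc;
}
```

Recording which strategies were attempted makes it possible to report "direct install failed, interface install failed" rather than a single opaque exit code.
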
### Fix 2: add the same fallback check on Linux that Windows has

In `resolve_xpkg_path()`, extend the Windows global-directory fallback to all platforms:

```cpp
// Remove #if defined(_WIN32) so the check applies on every platform
if (!std::filesystem::exists(verdir)) {
    // Check the global xlings directory
    auto homeDir = std::getenv("HOME");
    if (homeDir) {
        std::filesystem::path globalXpkgs =
            std::filesystem::path(homeDir) / ".xlings" / "data" / "xpkgs";
        // <package dir>/<version dir> under the global xpkgs root
        auto globalVerdir = globalXpkgs / verdir.parent_path().filename() / verdir.filename();
        if (std::filesystem::exists(globalVerdir)) {
            // copy or symlink into the sandbox
        }
    }
}
```

### Fix 3: do not close stdin for the xlings install command

Add an option to `install_with_progress()` that leaves stdin open, or run the direct install command through `std::system()` instead of `platform::process`:

```cpp
// Run the direct command without platform::process (no </dev/null appended)
int directRc = std::system(directCmd.c_str());
```

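One detail to keep in mind when calling `std::system()` directly: its return value is implementation-defined, and on POSIX it is a wait status rather than the child's exit code. The sketch below shows one way to decode it; `decode_system_status` and `run_shell` are hypothetical helper names, and the `128 + signal` mapping follows the common shell convention rather than anything mcpp specifies.

```cpp
#include <cstdlib>
#if !defined(_WIN32)
#include <sys/wait.h>
#endif

// Convert the value returned by std::system() into a plain exit code.
// Returns -1 when the command could not be run or its fate is unknown.
int decode_system_status(int status) {
#if defined(_WIN32)
    return status;  // on Windows the value is already the exit code
#else
    if (status == -1) return -1;
    if (WIFEXITED(status)) return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
    return -1;
#endif
}

// Run a command through the shell with stdin left untouched.
int run_shell(const char* cmd) {
    return decode_system_status(std::system(cmd));
}
```

A preprocessor guard is used here because `if constexpr` in a non-template function still requires both branches to compile, so macros such as `WIFEXITED` would need to exist on every platform.
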
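### Fix 4: add a toolchain install test to CI

Add a step to `ci.yml` dedicated to testing `mcpp toolchain install llvm`, so that this core user flow is covered.

## Recommended implementation order

1. **Fix 1 + Fix 3**: use the direct command on Linux and leave stdin open (most likely to resolve the issue)
2. **Fix 2**: add the global-directory fallback (safety net)
3. **Fix 4**: add the CI test (prevents regressions)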

.github/workflows/ci.yml

Lines changed: 17 additions & 0 deletions
@@ -134,6 +134,23 @@ jobs:
           "$MCPP" build
           "$MCPP" test
 
+      - name: Toolchain install smoke test (mcpp toolchain install llvm)
+        run: |
+          # Test the core user flow: install a toolchain, create a project,
+          # build with it. Uses the freshly-built mcpp (not bootstrap).
+          MCPP=$(realpath "$(find target -type f -name mcpp -printf '%T@ %p\n' | sort -rn | head -1 | cut -d' ' -f2)")
+          # Install LLVM toolchain into mcpp's sandbox
+          "$MCPP" toolchain install llvm 20.1.7
+          # Verify install succeeded
+          "$MCPP" toolchain list | grep -q llvm
+          # Build a hello-world project with the installed toolchain
+          TMP=$(mktemp -d)
+          cd "$TMP"
+          "$MCPP" new hello
+          cd hello
+          "$MCPP" build --toolchain llvm@20.1.7
+          "$MCPP" run
+
       - name: Fresh user experience (xlings install mcpp → new → run)
         continue-on-error: true
         run: |

src/pm/package_fetcher.cppm

Lines changed: 10 additions & 8 deletions
@@ -605,14 +605,17 @@ Fetcher::resolve_xpkg_path(std::string_view target,
     };
 
     auto resolve = [&]() -> std::expected<XpkgPayload, CallError> {
-#if defined(_WIN32)
-        // Workaround: xlings on Windows may extract large packages (e.g. LLVM)
-        // into its global data dir instead of the mcpp sandbox, because the
-        // extraction subprocess doesn't inherit XLINGS_HOME. Detect this and
-        // copy the payload into the sandbox so mcpp remains self-contained.
+        // Workaround: xlings may extract large packages (e.g. LLVM) into its
+        // global data dir instead of the mcpp sandbox, because the extraction
+        // subprocess doesn't always inherit XLINGS_HOME. Detect this and copy
+        // the payload into the sandbox so mcpp remains self-contained.
+        // Originally Windows-only; extended to all platforms for the same
+        // reason (xlings subprocess XLINGS_HOME propagation is unreliable).
         if (!std::filesystem::exists(verdir)) {
-            // Try xlings' own data dir (where `xlings self install` placed it)
-            auto xhome = std::getenv("USERPROFILE");
+            const char* xhome = nullptr;
+            if constexpr (mcpp::platform::is_windows) {
+                xhome = std::getenv("USERPROFILE");
+            }
             if (!xhome) xhome = std::getenv("HOME");
             if (xhome) {
                 // xlings stores xpkgs at <home>/.xlings/data/xpkgs/ or
@@ -635,7 +638,6 @@ Fetcher::resolve_xpkg_path(std::string_view target,
                 }
             }
         }
-#endif
         if (!std::filesystem::exists(verdir)) {
             return std::unexpected(CallError{
                 std::format("xpkg payload missing: {}", verdir.string())});

src/xlings.cppm

Lines changed: 20 additions & 13 deletions
@@ -609,24 +609,31 @@ int install_with_progress(const Env& env, std::string_view target,
     auto argsJson = std::format(
         R"({{"targets":["{}"],"yes":true}})", target);
 
-    if constexpr (mcpp::platform::is_windows) {
-        mcpp::platform::env::set("XLINGS_HOME", env.home.string());
-        mcpp::platform::env::set("XLINGS_PROJECT_DIR", "");
-        std::error_code ec_mkdir;
-        std::filesystem::create_directories(env.home, ec_mkdir);
-        // Use direct `install` command instead of `interface install_packages`
-        // on Windows. The NDJSON interface may have issues with large packages
-        // where the extraction subprocess doesn't respect XLINGS_HOME.
-        auto directCmd = std::format("{} install {} -y",
-                                     env.binary.string(), target);
-        int directRc = mcpp::platform::process::run_silent(directCmd);
+    // All platforms: try direct `xlings install ... -y` first.
+    // The direct command is more reliable for large packages (e.g. LLVM
+    // ~800MB) because:
+    // - it doesn't pipe through NDJSON interface (simpler subprocess chain)
+    // - xlings manages its own stdin/stdout/stderr
+    // - extraction subprocess coordination works normally
+    // The NDJSON interface path is kept as a fallback for progress reporting.
+    {
+        auto directCmd = build_command_prefix(env) +
+            std::format(" install {} -y {}", target, mcpp::platform::shell::silent_redirect);
+        // Use std::system() directly — do NOT redirect stdin via </dev/null
+        // because xlings may need stdin for subprocess coordination during
+        // large package extraction.
+        int directRc = std::system(directCmd.c_str());
+        if constexpr (!mcpp::platform::is_windows) {
+            directRc = WIFEXITED(directRc) ? WEXITSTATUS(directRc) : directRc;
+        }
         if (directRc == 0) return 0;
     }
+
+    // Fallback: NDJSON interface path (provides progress callbacks).
     auto cmd = [&]() -> std::string {
         if constexpr (mcpp::platform::is_windows) {
-            // Fallback to interface path if direct install fails
             return std::format("{} interface install_packages --args {} {}",
-                               env.binary.string(),
+                               build_command_prefix(env),
                                shq(argsJson),
                                mcpp::platform::null_redirect);
         } else {
