Centralize sysroot path translation and case-fold #33

Merged: jserv merged 1 commit into main from path-handling on May 15, 2026

@jserv jserv commented May 15, 2026

Move guest-to-host path resolution into a single entry point in src/syscall/path.{c,h}. path_translate_at honors three modes (no-follow, follow, create-with-optional-parents), preserves the resolver errno on failure so callers translate it via linux_errno() instead of flattening to ENAMETOOLONG, and rejects ".." in the basename when follow_final is false so an lstat cannot escape above the sysroot.
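The basename check described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and EINVAL is a placeholder because the PR does not say which errno the resolver reports for a rejected "..".

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* True when the last path component is exactly "..".
 * Trailing slashes are ignored, so "a/../" is treated like "a/..". */
static bool final_is_dotdot(const char *path)
{
    size_t end = strlen(path);
    while (end > 0 && path[end - 1] == '/')
        end--;
    size_t start = end;
    while (start > 0 && path[start - 1] != '/')
        start--;
    return (end - start) == 2 && path[start] == '.' && path[start + 1] == '.';
}

/* Hypothetical helper: returns 0 when the final component is acceptable,
 * or -1 with errno set when a no-follow lookup ends in "..".
 * EINVAL is an assumption, not the value the real resolver uses. */
int check_final_component(const char *guest_path, bool follow_final)
{
    if (!follow_final && final_is_dotdot(guest_path)) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

A caller in no-follow mode (for example an lstat path) would run this check before joining the basename onto the resolved parent, and would pass the preserved errno through linux_errno() rather than substituting a fixed value.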

Factor --sysroot/--create-sysroot provisioning out of main.c into src/core/sysroot.{c,h}. Validate caller-supplied sysroot length before any heap allocation, treat collision-sentinel truncation as a hard validation failure rather than failing open, and set errno on every parse path so the cleanup logger reports a real reason.
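A minimal sketch of "validate the length before allocating, and always set errno" might look like this. The limit SYSROOT_MAX_LEN and the function name are hypothetical; the real limit and parse paths in src/core/sysroot.c are not shown in this PR description.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound, chosen only for illustration. */
#define SYSROOT_MAX_LEN 1024

/* Returns a heap copy of arg, or NULL with errno set on every failure
 * path so a cleanup logger can report a concrete reason. */
char *sysroot_dup_checked(const char *arg)
{
    if (arg == NULL || arg[0] == '\0') {
        errno = EINVAL;
        return NULL;
    }

    /* Bounded scan: the length is checked before any allocation. */
    size_t len = 0;
    while (len < SYSROOT_MAX_LEN && arg[len] != '\0')
        len++;
    if (len == SYSROOT_MAX_LEN) {
        errno = ENAMETOOLONG;
        return NULL;
    }

    char *copy = malloc(len + 1);
    if (copy == NULL) {
        errno = ENOMEM;
        return NULL;
    }
    memcpy(copy, arg, len + 1);
    return copy;
}
```

Rejecting an over-long argument outright, instead of truncating it and continuing, is the same fail-closed stance the description applies to collision-sentinel truncation.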

Add a case-fold sidecar at src/syscall/sidecar.{c,h} for case-insensitive macOS volumes. The sidecar keeps colliding Linux guest names distinct by mapping each to a hidden token file plus a per-directory index, so that guest workloads relying on case-sensitive Linux path semantics still work on the host's case-insensitive APFS or HFS+. Procemu-virtual paths (/proc, /sys, /dev) short-circuit the sidecar walk after normalization so they reach the procemu intercept intact instead of failing with ENOENT against a directory that does not exist in the sysroot.
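The short-circuit for procemu-virtual paths can be illustrated with a component-aware prefix test. The function names below are hypothetical, and the sketch assumes the input is already normalized (absolute, with no "." or ".." components), as the description says the check runs after normalization.

```c
#include <stdbool.h>
#include <string.h>

/* True when path equals prefix or continues with '/' right after it,
 * so "/process" does not match "/proc". */
static bool has_dir_prefix(const char *path, const char *prefix)
{
    size_t n = strlen(prefix);
    return strncmp(path, prefix, n) == 0 &&
           (path[n] == '\0' || path[n] == '/');
}

/* Hypothetical helper: should the sidecar walk be skipped for this
 * normalized guest path? */
bool path_is_virtual(const char *normalized)
{
    static const char *const roots[] = { "/proc", "/sys", "/dev" };
    for (size_t i = 0; i < sizeof roots / sizeof roots[0]; i++) {
        if (has_dir_prefix(normalized, roots[i]))
            return true;
    }
    return false;
}
```

Checking the byte after the prefix matters here: a plain strncmp would also divert guest directories such as /devel or /system away from the sidecar.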

Fix the /proc/self/exe sysroot prefix strip. proc_set_sysroot stores the sysroot in realpath canonical form, so the readlink handler now canonicalizes the stored elf_path before the prefix check. Without that step, macOS symlinks such as /var -> /private/var make the strncmp diverge and leak the host path back to the guest.
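To make the failure mode concrete, here is a sketch of a prefix strip that canonicalizes first. The function names are hypothetical and this is not the handler from the PR; it only assumes POSIX realpath with a NULL output buffer (POSIX.1-2008) and strdup.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the part of path below root, "/" when they are equal, or NULL
 * when path is not inside root. Both inputs are expected to be
 * canonical already. Trailing slashes on root are ignored, so an empty
 * root or "/" matches every absolute path. */
const char *strip_root_prefix(const char *root, const char *path)
{
    size_t n = strlen(root);
    while (n > 0 && root[n - 1] == '/')
        n--;
    if (strncmp(path, root, n) != 0)
        return NULL;
    if (path[n] == '\0')
        return "/";
    if (path[n] != '/')
        return NULL; /* "/srv/root2" is not under "/srv/root" */
    return path + n;
}

/* Canonicalize the host path before comparing against the canonical
 * sysroot. Returns a malloc'd guest path, or NULL on failure. */
char *guest_exe_path(const char *sysroot_canon, const char *host_elf_path)
{
    char *canon = realpath(host_elf_path, NULL);
    if (canon == NULL)
        return NULL;
    const char *rel = strip_root_prefix(sysroot_canon, canon);
    char *out = rel ? strdup(rel) : NULL;
    free(canon);
    return out;
}
```

With a canonical sysroot of /private/var/root, the un-canonicalized alias /var/root/bin/sh fails the prefix test, which is the divergence the fix addresses; after realpath both sides use the /private/var spelling and the strip yields /bin/sh.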

Serialize sysroot_casefold across fork IPC so child processes keep the sidecar feature after clone/fork. Lock elf_path against torn reads from sibling vCPUs during execve and expose proc_elf_path_snapshot for content-consuming callers; proc_get_elf_path keeps the legacy boolean-test contract.
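The split between a boolean test and a content snapshot could be sketched with a mutex-protected buffer as below. All names and the buffer size are hypothetical; the PR does not show the signatures of proc_get_elf_path or proc_elf_path_snapshot, so this only illustrates the locking pattern.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ELF_PATH_CAP 1024 /* hypothetical capacity */

static pthread_mutex_t elf_lock = PTHREAD_MUTEX_INITIALIZER;
static char elf_path_buf[ELF_PATH_CAP];

/* Writer side (for example during execve): replace the path atomically
 * with respect to readers. Returns 0 on success, -1 if path is too long. */
int elf_path_store(const char *path)
{
    size_t len = strlen(path);
    if (len >= ELF_PATH_CAP)
        return -1;
    pthread_mutex_lock(&elf_lock);
    memcpy(elf_path_buf, path, len + 1);
    pthread_mutex_unlock(&elf_lock);
    return 0;
}

/* Boolean-test reader: only answers "is a path set?". */
bool elf_path_is_set(void)
{
    pthread_mutex_lock(&elf_lock);
    bool set = elf_path_buf[0] != '\0';
    pthread_mutex_unlock(&elf_lock);
    return set;
}

/* Content reader: copies the whole string under the lock so the caller
 * never observes a half-updated path. Returns 0 on success, -1 if the
 * caller's buffer is too small. */
int elf_path_copy(char *out, size_t outsz)
{
    int rc = -1;
    pthread_mutex_lock(&elf_lock);
    size_t len = strlen(elf_path_buf);
    if (len < outsz) {
        memcpy(out, elf_path_buf, len + 1);
        rc = 0;
    }
    pthread_mutex_unlock(&elf_lock);
    return rc;
}
```

Callers that read the bytes get a private copy that stays stable after the lock is released, while callers that only test for presence never hold a pointer into the shared buffer.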


Summary by cubic

Centralizes guest-to-host path translation and adds a case-fold sidecar so Linux paths behave correctly on macOS case-insensitive volumes. Also moves sysroot setup into core/sysroot with capability probing, and fixes /proc/self/exe canonicalization plus exec path races.

  • New Features

    • Introduced path_translate_at as the single resolver with no-follow, follow, and create(+parents) modes; adopted across exec, stat, xattr, open, readdir, and rename paths.
    • Added a case-fold sidecar (src/syscall/sidecar.{c,h}) to keep colliding names distinct; skips /proc, /sys, /dev, remaps readdir names back to guest form, and is serialized across fork IPC.
    • Moved --sysroot/--create-sysroot into src/core/sysroot.{c,h} with case-sensitivity/case-preserving probing, stricter dir validation, and mount/create helpers.
  • Bug Fixes

    • Canonicalized /proc/self/exe handling to avoid leaking host prefixes via macOS symlink aliases.
    • Locked elf_path and added proc_elf_path_snapshot to prevent torn reads (also used for proc comm names).
    • Preserved resolver errno and rejected .. in the final basename when not following, preventing escapes above the sysroot.
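The case-sensitivity probing mentioned in the list above is not spelled out in this PR description. One common way to probe a directory, shown here purely as an assumption-laden sketch with hypothetical names, is to create a file under one spelling and check whether a differently cased spelling resolves to the same inode.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if dir looks case-sensitive, 0 if case-insensitive,
 * and -1 if the probe file could not be created or examined. */
int probe_case_sensitive(const char *dir)
{
    char lower[1024], upper[1024];
    long pid = (long) getpid();
    int n1 = snprintf(lower, sizeof lower, "%s/.caseprobe-%ld", dir, pid);
    int n2 = snprintf(upper, sizeof upper, "%s/.CASEPROBE-%ld", dir, pid);
    if (n1 < 0 || n2 < 0 || (size_t) n1 >= sizeof lower ||
        (size_t) n2 >= sizeof upper)
        return -1;

    int fd = open(lower, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    struct stat a, b;
    int result = -1;
    if (stat(lower, &a) == 0) {
        if (stat(upper, &b) != 0)
            result = 1; /* upper-case spelling does not resolve */
        else
            result = (a.st_dev == b.st_dev && a.st_ino == b.st_ino) ? 0 : 1;
    }
    unlink(lower);
    return result;
}
```

Comparing device and inode numbers, rather than just checking that the second name exists, avoids a false "insensitive" answer when an unrelated file with the upper-case name already exists.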

Written for commit a80aa17. Summary will update on new commits.

cubic-dev-ai[bot]

This comment was marked as resolved.

@jserv jserv merged commit 3f19ece into main May 15, 2026
4 checks passed
@jserv jserv deleted the path-handling branch May 15, 2026 09:28