fix(security): drop symlink entries before visitor sees them #12
Open
aaronjmars wants to merge 1 commit into
A file-typed symlink planted inside a scan root makes the walker surface the entry to its visitor; the per-ecosystem scanners then open through os.Open which follows the link, parse the unrelated target as if it were the expected metadata file, and emit a package record whose package_name/version come from the target. With one symlink at e.g. node_modules/<pkg>/package.json -> ~/.config/<other-app>/state.json an attacker can have arbitrary readable JSON fields ride out through the configured records sink.

Extend the existing directory-symlink defense (walk.go comment 'the walker never crosses into an unrelated subtree by indirection') to file entries: drop every fs.DirEntry whose Type carries ModeSymlink before the visitor is called. Mirrors exposure.go:161 which already takes this stance for --exposure-catalog directory loading.

Detected by Aeon manual review.
Severity: medium
CWE-59 (Improper Link Resolution Before File Access)
Summary
The walker surfaces file-typed symlinks to its visitor, and the per-ecosystem scanners open files through `os.Open`, which follows symbolic links. A symlink planted inside a scan root therefore makes the scanner read and parse an arbitrary file outside the configured scope, then emit a `package` record whose `package_name`/`version` come from the link target. SECURITY.md states the threat model as "a read-only filesystem walker" that "does not parse source code"; in practice it parses whatever JSON/RFC-822 file the symlink points at, which can be a config file the operator never asked to scan.

This PR extends the existing directory-symlink defense (`walk.go:212-218` comment: "the walker never crosses into an unrelated subtree by indirection") to file-typed entries: any `fs.DirEntry` whose `Type()` carries `os.ModeSymlink` is dropped before the visitor sees it.

Impact
Concrete chain:
1. A `postinstall` hook drops a symlink at `node_modules/<pkg>/package.json` pointing at, say, `~/.config/<other-app>/state.json` (or any JSON file readable by the invoking user). `bumblebee` never executes that hook, but the symlink is already on disk by the time the next scheduled scan runs.
2. `npm.ScanNodeModulesPackageJSON` calls `os.Open(path)`, which follows the link; the `IsRegular()` check on the result of `f.Stat()` passes because the target is a regular file; `json.Unmarshal` succeeds whenever the target has parseable JSON.
3. A `package` record is emitted with `ecosystem=npm`, `source_file=<symlink path inside the scan root>`, and `package_name`/`version` lifted verbatim from the target file's `name`/`version` fields. Anything the JSON target exposes through those two keys is now part of the NDJSON batch the operator's receiver sees.

Verified end-to-end against `cmd/bumblebee` built from `main` (`v0.1.2-0.20260523091658-611dc7920847`): a `package.json` symlink at `proj/node_modules/evil/package.json` → `proj/../elsewhere/oracle-target.json` produces a clean exit-0 NDJSON record carrying the target's field values; nothing in the parser path filters or even notices the indirection.

Adjacency notes (these still work as designed; they're not regressed by this PR):

- `env` values are never carried onto the record schema, so MCP env-block secrets stay out even if the symlink retargets at a Claude / Gemini host config.
- Only the `Name` and `Version` headers are extracted, so `*.dist-info/METADATA` symlinks leak only those two fields, not `Author-email` etc.
- `readBounded` already rejects non-regular files via `info.Mode().IsRegular()`, so directory symlinks fail closed at the open. The hole is specifically file-typed symlinks that resolve to regular files.

Severity: medium. The blast radius is bounded by what the invoking user can already read (no privilege escalation), but the exfiltration path is silent (the records sink is the operator's receiver, not the user's terminal), and the trigger is one symlink at any of the parsed file names. CWE-59 (Improper Link Resolution Before File Access, "Link Following").
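The link-following behavior behind the chain above can be reproduced in isolation. The following is a minimal sketch, not code from this repository: `manifest`, `readThroughLink`, `isSymlink`, and `plantLink` are hypothetical names, and the file layout is an assumption chosen to resemble the attack shape. It shows that `os.Open` followed by `Stat` on the handle describes the link target, while `os.Lstat` describes the link itself.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// manifest holds the two fields a package.json reader would keep (hypothetical type).
type manifest struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// readThroughLink opens path with os.Open, which resolves symlinks, so the
// Stat call on the handle reports the mode of the target, not of the link.
func readThroughLink(path string) (manifest, bool, error) {
	var m manifest
	f, err := os.Open(path)
	if err != nil {
		return m, false, err
	}
	defer f.Close()
	info, err := f.Stat()
	if err != nil {
		return m, false, err
	}
	data, err := io.ReadAll(f)
	if err != nil {
		return m, false, err
	}
	return m, info.Mode().IsRegular(), json.Unmarshal(data, &m)
}

// isSymlink inspects the directory entry itself without following it.
func isSymlink(path string) (bool, error) {
	info, err := os.Lstat(path)
	if err != nil {
		return false, err
	}
	return info.Mode()&os.ModeSymlink != 0, nil
}

// plantLink creates a JSON file outside the project directory and a
// package.json symlink inside it that points at that file.
func plantLink(base string) (string, error) {
	target := filepath.Join(base, "elsewhere", "state.json")
	link := filepath.Join(base, "proj", "node_modules", "evil", "package.json")
	for _, dir := range []string{filepath.Dir(target), filepath.Dir(link)} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return "", err
		}
	}
	body := []byte(`{"name":"outside-value","version":"9.9.9"}`)
	if err := os.WriteFile(target, body, 0o600); err != nil {
		return "", err
	}
	return link, os.Symlink(target, link)
}

func main() {
	base, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(base)
	link, err := plantLink(base)
	if err != nil {
		panic(err)
	}
	m, regular, err := readThroughLink(link)
	if err != nil {
		panic(err)
	}
	sym, err := isSymlink(link)
	if err != nil {
		panic(err)
	}
	fmt.Printf("regular=%v symlink=%v name=%s version=%s\n", regular, sym, m.Name, m.Version)
}
```

On a platform where the process may create symlinks, the handle-based check should report a regular file and the parsed fields should come from the outside file, while the `Lstat`-based check should flag the entry as a symlink. That gap is why a regular-file check after `os.Open` does not stop this class of link.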
Location
- `internal/walk/walk.go:208-217`: walker visitor branch that surfaces every entry, including symlinks.
- `internal/ecosystem/*/*.go:readBounded()`: `os.Open` follows the link; the subsequent `IsRegular()` check is true when the target is a regular file.

Fix
`internal/walk/walk.go`: skip any `fs.DirEntry` whose `Type() & os.ModeSymlink` is non-zero before calling the visitor. The existing directory-symlink `Lstat` inside the `d.IsDir()` branch was unreachable in practice: `filepath.WalkDir` reports directory symlinks as symlink-typed (`d.IsDir() == false`), so the directory case never entered that block. The single early return now covers both directory- and file-typed symlinks at the right layer, mirroring the symmetric handling already present in `exposure.go:161` for `--exposure-catalog` directory loading.

The package doc comment is updated to call out the new invariant ("symlink entries are never surfaced to the visitor (neither directory-typed nor file-typed)") so the next reader of `walk.go` doesn't reintroduce the gap.

Detected by
Aeon manual review. The scan was driven by the perpendicular-axis read on SECURITY.md (the explicit "Env values are never captured" and "does not parse source code" claims invited closer reading of the read-only-walker invariant; the directory-symlink defense was the existing positive control). Scanner triage was supporting context: `semgrep` (p/security-audit + p/owasp-top-ten + p/secrets across Go), `trufflehog filesystem + git`, and `osv-scanner --recursive` all came back clean (0 ERROR/WARNING, 0 verified secrets, 0 advisories) on the v0.1.1 tag, so the finding is purely from reading the walker + per-scanner `readBounded` together.

Verification
`go test ./...`: 19/19 packages pass, including:

- `TestWalkSkipsFileSymlinks` in `internal/walk/walk_test.go` (asserts the walker does not surface a file-typed symlink while still surfacing the sibling real file).
- `TestFileSymlinkDoesNotExfiltrateOutOfScope` in `internal/scanner/scanner_test.go` (the end-to-end shape: plant `node_modules/evil/package.json -> elsewhere/stolen.json`, run `Run(Config{...Profile: project, Roots: ...})`, assert no record carries the target's `name`/`version` and the symlink path never appears as `source_file`).
- `TestSymlinkLoopSafety` (directory-symlink-loop termination guarantee) still passes; the loop guard via `seen[dev+inode]` is preserved.

Reproducer (out-of-tree, attached in the agent's scratch dir) builds `cmd/bumblebee` from this branch and re-runs the attack shape from the Impact section above:

- Before the fix: `files_considered=1`, `records=1`. One `package` record with `package_name=SECRET-FIELD-FROM-OUTSIDE-SCAN-ROOT`, `version=v-leaked-1.2.3`, `ecosystem=npm`, `source_file=.../node_modules/evil-pkg/package.json`.
- After the fix: `files_considered=0`, `records=0`. The symlink is dropped at the walker layer, the npm dispatcher is never called, and the scan_summary closes clean.

No new runtime dependencies. No behavior change for non-symlink entries. Pnpm/yarn workspaces, which use OS symlinks at the directory level (`node_modules/<top-level-pkg>` → `node_modules/.pnpm/<pkg>@<v>/...`), are unaffected because `filepath.WalkDir` already declined to descend into directory symlinks; the real package metadata lives under the hard-linked / copy-on-write `.pnpm` store path, which the walker reaches via the normal traversal, not via the symlink.

Filed by Aeon.
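For readers who want the walker-level invariant in runnable form, the sketch below illustrates the kind of early return described in the Fix section and the property `TestWalkSkipsFileSymlinks` is described as asserting. It is an illustration under assumptions, not the diff in this PR: `walkSkippingSymlinks`, `surfaced`, and `buildTree` are hypothetical names, and the visitor signature is simplified.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// walkSkippingSymlinks walks root and calls visit for every entry except
// those whose type carries fs.ModeSymlink (hypothetical helper).
func walkSkippingSymlinks(root string, visit func(path string, d fs.DirEntry) error) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type()&fs.ModeSymlink != 0 {
			return nil // dropped before the visitor sees it
		}
		return visit(path, d)
	})
}

// surfaced returns the base names of all non-directory entries the visitor receives.
func surfaced(root string) ([]string, error) {
	var names []string
	err := walkSkippingSymlinks(root, func(path string, d fs.DirEntry) error {
		if !d.IsDir() {
			names = append(names, d.Name())
		}
		return nil
	})
	return names, err
}

// buildTree creates a scan root containing one real file and one symlink
// named package.json that points at a file outside the root.
func buildTree(base string) (string, error) {
	root := filepath.Join(base, "proj")
	pkg := filepath.Join(root, "node_modules", "pkg")
	if err := os.MkdirAll(pkg, 0o755); err != nil {
		return "", err
	}
	outside := filepath.Join(base, "outside.json")
	if err := os.WriteFile(outside, []byte(`{"name":"outside"}`), 0o600); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(pkg, "real.json"), []byte(`{}`), 0o600); err != nil {
		return "", err
	}
	return root, os.Symlink(outside, filepath.Join(pkg, "package.json"))
}

func main() {
	base, err := os.MkdirTemp("", "walk-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(base)
	root, err := buildTree(base)
	if err != nil {
		panic(err)
	}
	names, err := surfaced(root)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Filtering on `d.Type()` inside the `WalkDir` callback keeps the decision at the traversal layer, so in this sketch no downstream reader ever receives a path it could follow out of the scan root.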