fix(julia): port parameterized-type / qualified-def / qualified-import fixes to WASM #1128

Open
carlos-alm wants to merge 3 commits into main from fix/1111-julia-wasm-extractor-bugs

@carlos-alm
Contributor

Summary

  • Ports the three Julia extractor fixes from #1098 (feat(native): port Julia extractor to Rust) to the WASM extractor at src/extractors/julia.ts, per the dual-engine policy.
  • Adds a findBaseName helper that recurses through binary_expression / parametrized_type_expression / parameterized-identifier / type-parameter / type-argument wrappers, mirroring the native find_base_name.
  • Guards module prefixing with !base.includes('.') in handleFunctionDef and handleAssignment so qualified names like Base.show inside module Foo no longer become Foo.Base.show.
  • Updates handleImport to walk into selected_import children, treating scoped_identifier modules as the source and the trailing segment as the imported display name.
  • Adds a signatureCall helper to unwrap function_definition → signature → call_expression. Without this, findChild(node, 'call_expression') was matching the function body's first call (e.g. println(...)) and recording it as the function name. This latent bug was not surfaced by the existing fixtures, but fixing it is required for the qualified-def test to pass.
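
The double-prefix guard from the summary can be sketched in isolation. This is a minimal illustration and not the extractor's code: the `qualifyName` signature and the `currentModule` parameter are assumptions made for the example; only the `!base.includes('.')` check comes from the PR description.

```typescript
// Hypothetical sketch of the module-prefixing guard. The real helper in
// src/extractors/julia.ts may take different arguments.
function qualifyName(base: string, currentModule: string | null): string {
  // At top level there is no enclosing module to prefix with.
  if (currentModule === null) return base;
  // Names that already contain a dot (e.g. `Base.show`) are treated as
  // fully qualified and returned unchanged.
  if (base.includes('.')) return base;
  return `${currentModule}.${base}`;
}

// Example inputs, as if encountered inside `module Foo`:
const plain = qualifyName('area', 'Foo'); // gets the module prefix
const qualified = qualifyName('Base.show', 'Foo'); // left as written
const topLevel = qualifyName('area', null); // no module, no prefix
```

With this guard, a definition such as `function Base.show ... end` inside `module Foo` keeps the name `Base.show`, while an unqualified `area` is still recorded under `Foo`.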

Test plan

  • npx vitest run tests/parsers/julia.test.ts — 10/10 pass, including the four new parity tests (extracts parameterized struct base name, qualified short-form method does not double-prefix, qualified function def does not double-prefix, selected_import handles qualified module).
  • cargo test --lib julia — native side still green (16/16).
  • Biome clean on touched files.

Out of scope

While auditing, I found three additional WASM-only divergences not in scope of #1111 — broken handleAbstractDef, broken handleMacroDef, and missing signature-call skip in handleCall. Filed as #1126.

Closes #1111

…t fixes to WASM

The native Julia extractor was fixed in #1098 for three issues that were
already latent in the WASM extractor but not surfaced by the existing
fixtures. Per the dual-engine policy, port the fixes so both engines
produce identical results.

1. Parameterized struct names (`struct Vec{T} <: AbstractArray{T,1}`) no
   longer silently emit the raw type-head text as the definition name —
   `findBaseName` recurses through `binary_expression`,
   `parametrized_type_expression`, and related wrappers to locate the
   base identifier.

2. Qualified function defs / short-form methods inside a module no
   longer get double-prefixed: `function Base.show ... end` inside
   `module Foo` now records `Base.show` (not `Foo.Base.show`); same for
   short-form `Foo.bar(x, y) = x + y` inside `module Outer`.

3. `selected_import` with a qualified module (`import LinearAlgebra.BLAS: gemm`)
   now correctly records `LinearAlgebra.BLAS` as the import source and
   `gemm` as the imported name.
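
As a rough sketch of item 3, the split between import source and imported names could look like the following. The `SketchNode` shape, the `splitSelectedImport` name, and the assumed child layout (module first, imported names after it) are all hypothetical stand-ins for illustration; the real `handleImport` works on tree-sitter nodes and may differ.

```typescript
// Minimal stand-in for a tree-sitter node; only what the sketch needs.
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
}

interface ImportRecord {
  source: string;
  names: string[];
}

// Assumed layout: the first name-like child of `selected_import` is the
// module (plain or scoped); the remaining name-like children are imports.
function splitSelectedImport(node: SketchNode): ImportRecord | null {
  if (node.type !== 'selected_import') return null;
  const nameLike = node.children.filter(
    (c) => c.type === 'identifier' || c.type === 'scoped_identifier',
  );
  const [moduleNode, ...imported] = nameLike;
  if (moduleNode === undefined) return null;
  return {
    // The whole (possibly scoped) module text is the import source.
    source: moduleNode.text,
    // Only the trailing segment is kept as the imported display name.
    names: imported.map((c) => c.text.slice(c.text.lastIndexOf('.') + 1)),
  };
}

const leaf = (type: string, text: string): SketchNode => ({ type, text, children: [] });

// Hand-built mock of `import LinearAlgebra.BLAS: gemm`.
const importNode: SketchNode = {
  type: 'selected_import',
  text: 'LinearAlgebra.BLAS: gemm',
  children: [leaf('scoped_identifier', 'LinearAlgebra.BLAS'), leaf('identifier', 'gemm')],
};

const record = splitSelectedImport(importNode);
```

For the mock above, `record` carries `LinearAlgebra.BLAS` as the source and `gemm` as the only imported name, which is the shape the commit message describes.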

Also fixes a related latent bug: `findChild(node, 'call_expression')` on
a `function_definition` was matching the body's first call (e.g.
`println(...)`) instead of the signature, because the signature is
wrapped in a `signature` node. Added a `signatureCall` helper mirroring
the native code.
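
The difference between the naive lookup and the signature-aware one can be shown on a hand-built tree. This sketch assumes `findChild` inspects direct children only, which is what the bug description implies; `SketchNode` and the mock tree are hypothetical and not taken from the repository.

```typescript
// Minimal stand-in for a tree-sitter node.
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
}

const mk = (type: string, text: string, children: SketchNode[] = []): SketchNode => ({
  type,
  text,
  children,
});

// Assumed to search direct children only, like the extractor's findChild.
function findChild(node: SketchNode, type: string): SketchNode | null {
  return node.children.find((c) => c.type === type) ?? null;
}

// Prefer the call_expression nested under `signature`. The direct lookup
// is kept only as a fallback against grammar drift.
function signatureCall(def: SketchNode): SketchNode | null {
  const signature = findChild(def, 'signature');
  const call = signature ? findChild(signature, 'call_expression') : null;
  return call ?? findChild(def, 'call_expression');
}

// Mock of: function Base.show(io, x) println(io, x) end
const def = mk('function_definition', 'function Base.show(io, x) println(io, x) end', [
  mk('signature', 'Base.show(io, x)', [mk('call_expression', 'Base.show(io, x)')]),
  mk('call_expression', 'println(io, x)'),
]);

const naive = findChild(def, 'call_expression'); // finds the body call
const fixed = signatureCall(def); // finds the signature call
```

On this mock, the naive lookup lands on `println(io, x)` because the signature's call is one level deeper, while `signatureCall` returns the `Base.show(io, x)` node.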

Closes #1111
@claude

claude Bot commented May 15, 2026

Claude finished @carlos-alm's task

PR Review: Julia WASM Extractor Bug Fixes

Todo List:

  • Phase 1: Root cause analysis - understand the problem being solved
  • Phase 2: Check for configuration changes that might mask issues
  • Phase 3: Verify test integrity and quality
  • Phase 4: Code quality review of Julia extractor implementation
  • Phase 5: Backlog compliance assessment
  • Provide final review recommendation

Root Cause Analysis

Problem Being Solved: This PR addresses real dual-engine parity bugs in the Julia WASM extractor. Three specific issues from #1098 (native engine fixes) needed to be ported to WASM:

  1. Parameterized type base name extraction - struct Vec{T} was not extracting the base name "Vec" correctly
  2. Qualified definition double-prefixing - function Base.show() inside module Foo was becoming Foo.Base.show instead of Base.show
  3. Import resolution for qualified modules - import Foo.Bar: baz was not properly handling scoped identifiers as module sources

Why This Approach: The fix correctly implements dual-engine parity by mirroring the native Rust implementation. The approach adds targeted helper functions (findBaseName, signatureCall) and guards (!base.includes('.')) rather than broad refactoring, which is appropriate for a bug fix.

Risk Assessment: Low risk. This is correcting extraction accuracy to match the native engine - no behavioral changes to the query layer or API contracts.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies, builds on existing tree-sitter infrastructure
  • Foundation-aligned: ✓ — Maintains dual-engine parity (P1: always-current) with no API changes (P5: embeddable-first)
  • Problem-fit: 5 — Directly prevents hallucinated dependencies by fixing extraction bugs that cause missing/incorrect symbols in the graph
  • Breaking: No — Pure internal bug fixes with no API or output changes
  • Tier: 1

Critical Concerns

None identified. This is a well-executed bug fix that addresses real dual-engine parity issues.

Strengths:

  • Targeted fixes: Each helper function addresses a specific extraction bug without over-engineering
  • Comprehensive testing: New tests directly validate the bug fixes with both positive and negative assertions
  • Clear documentation: Comments explain Julia grammar specifics and why each fix is needed
  • Maintains existing patterns: Follows established code style and structure

Code quality highlights:

  • findBaseName() properly handles Julia's complex parameterized type structures (lines 189-202)
  • qualifyName() prevents double-prefixing with simple !base.includes('.') guard (lines 86-93)
  • signatureCall() fixes latent bug where body calls were matched instead of signature calls (lines 105-109)
  • Enhanced import handling correctly distinguishes module sources from imported names (lines 316-335)

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: High-quality bug fix that addresses real dual-engine parity issues. Well-tested, follows established patterns, and maintains code quality. The approach is appropriately targeted rather than over-engineered. The only minor consideration is that this is routine maintenance rather than exceptional work.

@github-actions
Contributor

github-actions Bot commented May 15, 2026

Codegraph Impact Analysis

8 functions changed, 9 callers affected across 2 files

  • find_base_name in crates/codegraph-core/src/extractors/julia.rs:319 (4 transitive callers)
  • qualifyName in src/extractors/julia.ts:86 (4 transitive callers)
  • signatureCall in src/extractors/julia.ts:113 (3 transitive callers)
  • handleFunctionDef in src/extractors/julia.ts:119 (2 transitive callers)
  • handleAssignment in src/extractors/julia.ts:153 (2 transitive callers)
  • findBaseName in src/extractors/julia.ts:202 (3 transitive callers)
  • handleStructDef in src/extractors/julia.ts:217 (2 transitive callers)
  • handleImport in src/extractors/julia.ts:313 (2 transitive callers)

@greptile-apps
Contributor

greptile-apps Bot commented May 15, 2026

Greptile Summary

This PR ports the three Julia symbol-extraction fixes from #1098 (native Rust engine) to the WASM TypeScript extractor, closing the dual-engine parity gap tracked by #1111. All new helpers (signatureCall, findBaseName/TYPE_HEAD_WRAPPERS, and qualifyName) are direct translations of the Rust equivalents, and the reworked handleImport logic for selected_import mirrors the native path case by case.

  • signatureCall: unwraps function_definition → signature → call_expression so handleFunctionDef reads the actual signature rather than the first body call; identical logic to Rust signature_call.
  • findBaseName / handleStructDef: replaces the brittle findChild(typeHead, 'identifier') ?? typeHead with a proper recursive walk through binary_expression / parametrized_type_expression / parameterized_identifier wrappers; correctly extracts name and supertype for both plain and parameterized structs.
  • qualifyName guard + handleImport rework: prevents double-prefixing for qualified definitions (Base.show inside module Foo) and correctly splits selected_import into source module vs. imported names.

Confidence Score: 5/5

Safe to merge — all new helpers are faithful ports of the already-reviewed Rust implementations, and all four new test cases plus the existing suite pass.

Each of the three WASM helpers (signatureCall, findBaseName, qualifyName) is a direct translation of its Rust counterpart with identical branching logic. The handleImport rewrite mirrors the native handle_import case-by-case. The two acknowledged divergences (handleMacroDef not using signatureCall, handleCall signature-skip guard) are pre-existing, explicitly documented, and filed as #1126. No new defects are introduced by these changes.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/extractors/julia.ts | Ports three Julia extractor fixes to WASM: adds signatureCall helper (mirrors native), findBaseName with TYPE_HEAD_WRAPPERS, qualifyName guard against double-prefixing, and reworked handleImport for selected_import. Logic is in full dual-engine parity with the Rust extractor. |
| crates/codegraph-core/src/extractors/julia.rs | Mostly documentation changes: improves doc-comments on signature_call and find_base_name to explain grammar assumptions and intentional exclusions. Also removes the dead type_parameter_list / type_argument_list entries from the wrapper match arms. No behavioural changes to Rust logic. |
| tests/parsers/julia.test.ts | Adds five new parity tests covering parameterized struct extraction, non-parameterized struct inheritance, double-prefix prevention for qualified short-form and long-form definitions, and selected_import with a scoped module. Strengthens the existing import test with tighter assertions. |

Reviews (5): Last reviewed commit: "test(julia): assert selected_import shap..."

Comment thread src/extractors/julia.ts
Comment on lines +215 to +228
const binary = findChild(typeHead, 'binary_expression');
if (binary) {
  // Walk into each side of the binary expression to find the base-name
  // identifier; this handles parameterized forms like `Vec{T} <: AbstractArray{T,1}`.
  const sides: TreeSitterNode[] = [];
  for (let i = 0; i < binary.childCount; i++) {
    const c = binary.child(i);
    if (c && c.type !== 'operator') sides.push(c);
  }
  nameNode = sides[0] ? findBaseName(sides[0]) : null;
  supertypeNode = sides[1] ? findBaseName(sides[1]) : null;
} else {
  nameNode = findBaseName(typeHead);
}
Contributor

P1 Missing test for non-parameterized struct inheritance

The old code explicitly looked for a subtype_expression node (findChild(typeHead, 'subtype_expression')) to detect Point <: AbstractPoint. That path has been entirely removed and replaced with a binary_expression lookup. If the tree-sitter-julia grammar represents simple non-parameterized inheritance with a subtype_expression node (rather than binary_expression), findChild(typeHead, 'binary_expression') returns null, findBaseName(typeHead) recurses without entering subtype_expression (not in TYPE_HEAD_WRAPPERS), and returns null — so the entire struct is silently dropped from ctx.definitions. The new parameterized test (Vec{T} <: AbstractArray{T,1}) confirms the grammar uses binary_expression for that form, but there is no test for the simple case struct Point <: AbstractPoint to verify the same grammar node is used and the extends relationship is still recorded.

Contributor Author

Fixed in 8c2e148 — added a test for non-parameterized struct inheritance (struct Point <: AbstractPoint) in tests/parsers/julia.test.ts. Confirmed via AST inspection that the Julia grammar wraps both the simple and parameterized cases in a binary_expression node, so the new code path handles both correctly. The native engine already had this test (crates/codegraph-core/src/extractors/julia.rs:592), so this brings WASM to parity.
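
The name/supertype split discussed in this thread can be sketched on mock trees for both the simple and the parameterized form. This is an illustration under stated assumptions: `SketchNode`, `splitTypeHead`, the `type_head` node kind, and the simplified `baseName` lookup (node itself or a direct identifier child) are hypothetical; only the `binary_expression` wrapping and the operator filtering follow the code under review.

```typescript
// Minimal stand-in for a tree-sitter node.
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
}

const mk = (type: string, text: string, children: SketchNode[] = []): SketchNode => ({
  type,
  text,
  children,
});

// Simplified lookup: the node itself, or a direct identifier child.
function baseName(node: SketchNode): string | null {
  if (node.type === 'identifier') return node.text;
  const direct = node.children.find((c) => c.type === 'identifier');
  return direct ? direct.text : null;
}

// Split a struct's type head into name and optional supertype.
function splitTypeHead(typeHead: SketchNode): { name: string | null; supertype: string | null } {
  const binary = typeHead.children.find((c) => c.type === 'binary_expression');
  if (!binary) return { name: baseName(typeHead), supertype: null };
  // Drop the `<:` operator; what remains is [left side, right side].
  const [left, right] = binary.children.filter((c) => c.type !== 'operator');
  return {
    name: left ? baseName(left) : null,
    supertype: right ? baseName(right) : null,
  };
}

// Mock of: struct Point <: AbstractPoint
const simple = mk('type_head', 'Point <: AbstractPoint', [
  mk('binary_expression', 'Point <: AbstractPoint', [
    mk('identifier', 'Point'),
    mk('operator', '<:'),
    mk('identifier', 'AbstractPoint'),
  ]),
]);

// Mock of: struct Vec{T} <: AbstractArray{T,1}
const parameterized = mk('type_head', 'Vec{T} <: AbstractArray{T,1}', [
  mk('binary_expression', 'Vec{T} <: AbstractArray{T,1}', [
    mk('parametrized_type_expression', 'Vec{T}', [
      mk('identifier', 'Vec'),
      mk('curly_expression', '{T}'),
    ]),
    mk('operator', '<:'),
    mk('parametrized_type_expression', 'AbstractArray{T,1}', [
      mk('identifier', 'AbstractArray'),
      mk('curly_expression', '{T,1}'),
    ]),
  ]),
]);

const simpleResult = splitTypeHead(simple);
const parameterizedResult = splitTypeHead(parameterized);
```

Because both forms go through the same `binary_expression` branch, the simple case yields `Point` / `AbstractPoint` and the parameterized case yields `Vec` / `AbstractArray` without a separate `subtype_expression` path.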

Comment thread src/extractors/julia.ts
Comment on lines +181 to +202
const TYPE_HEAD_WRAPPERS: ReadonlySet<string> = new Set([
  'binary_expression',
  'parametrized_type_expression',
  'parameterized_identifier',
  'type_parameter_list',
  'type_argument_list',
]);

function findBaseName(node: TreeSitterNode): TreeSitterNode | null {
  if (node.type === 'identifier') return node;
  const direct = findChild(node, 'identifier');
  if (direct) return direct;
  for (let i = 0; i < node.childCount; i++) {
    const child = node.child(i);
    if (!child) continue;
    if (TYPE_HEAD_WRAPPERS.has(child.type)) {
      const found = findBaseName(child);
      if (found) return found;
    }
  }
  return null;
}
Contributor

P2 type_parameter_list / type_argument_list in TYPE_HEAD_WRAPPERS can yield the wrong identifier

findBaseName checks findChild(node, 'identifier') before recursing, so in practice the struct name is found before the loop reaches a type_parameter_list or type_argument_list. However, if findBaseName is ever called with a node that lacks a direct identifier child and does have one of those wrapper types as a child — for example, a future call site or an unusual parameterized form — the function will recurse into type_parameter_list and return the first type-parameter identifier (e.g. T) instead of the struct name. Removing those two entries from TYPE_HEAD_WRAPPERS would eliminate the risk without affecting correctness.

Contributor Author

Fixed in 8c2e148 — removed type_parameter_list and type_argument_list from TYPE_HEAD_WRAPPERS in both the WASM and native engines (preserving dual-engine parity per CLAUDE.md). AST inspection confirmed Julia's grammar uses curly_expression for {T} constructs, not those node kinds, so the entries were dead code. Removing them eliminates the risk of recursing into a type-parameter list and returning a type variable as the struct name, as you noted.
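
The hazard raised in this thread can be reproduced on a contrived node. In this sketch the wrapper set is passed as a parameter so the two variants can be compared side by side; that parameter, `SketchNode`, and the mock node are assumptions for illustration. The mock deliberately uses `type_parameter_list`, which the author reports the Julia grammar does not actually emit, so it is a thought experiment rather than real parser output.

```typescript
// Minimal stand-in for a tree-sitter node.
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
}

const mk = (type: string, text: string, children: SketchNode[] = []): SketchNode => ({
  type,
  text,
  children,
});

// Same control flow as the reviewed findBaseName, with the wrapper set
// passed in so both variants can be exercised.
function findBaseName(node: SketchNode, wrappers: ReadonlySet<string>): SketchNode | null {
  if (node.type === 'identifier') return node;
  const direct = node.children.find((c) => c.type === 'identifier');
  if (direct) return direct;
  for (const child of node.children) {
    if (wrappers.has(child.type)) {
      const found = findBaseName(child, wrappers);
      if (found) return found;
    }
  }
  return null;
}

const WITH_PARAM_LISTS: ReadonlySet<string> = new Set<string>([
  'binary_expression',
  'parametrized_type_expression',
  'parameterized_identifier',
  'type_parameter_list',
  'type_argument_list',
]);

const WITHOUT_PARAM_LISTS: ReadonlySet<string> = new Set<string>([
  'binary_expression',
  'parametrized_type_expression',
  'parameterized_identifier',
]);

// Contrived node: no direct identifier child, only a type-parameter list.
const odd = mk('parametrized_type_expression', '{T}', [
  mk('type_parameter_list', '{T}', [mk('identifier', 'T')]),
]);

const risky = findBaseName(odd, WITH_PARAM_LISTS); // descends into the list
const trimmed = findBaseName(odd, WITHOUT_PARAM_LISTS); // refuses to descend
```

With the original five-entry set the lookup returns the type variable `T` as if it were the struct name; with the trimmed set it returns `null`, so a malformed head is dropped rather than misnamed.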

…ype test (#1128)

- Remove type_parameter_list / type_argument_list from TYPE_HEAD_WRAPPERS
  in both WASM and native engines. Julia grammar uses curly_expression
  for {T} constructs, so these were dead code. Removing them prevents
  findBaseName from ever recursing into a type-parameter list and
  returning a type variable (e.g. T) instead of the struct name.
- Add WASM test for non-parameterized struct inheritance
  (struct Point <: AbstractPoint). The native engine already covers
  this case; the WASM side now has parity.
@carlos-alm
Contributor Author

@greptileai

…lback (#1128)

- Strengthen the `import Base: show` test to assert the corrected
  source/names shape (was only checking that imports were emitted at
  all, so a regression to the broken pre-fix shape would have slipped
  through).
- Document the grammar assumption behind `signatureCall` /
  `signature_call` in both engines: the call_expression fallback exists
  only for defensive grammar-drift protection, not as a routine path —
  if it ever fires on a real definition, the function name will silently
  match the first body call_expression instead.
@carlos-alm
Contributor Author

Addressed Greptile's round-2 feedback in 92ce81b:

  • Issue 1 (signatureCall fallback may match body call): Documented the grammar assumption in both engines (WASM + native). The fallback to findChild(node, 'call_expression') exists only as defensive protection against grammar drift — if it ever fires on a real definition, callers must treat it as a parser/grammar mismatch worth investigating.
  • Issue 2 (existing selected_import test doesn't assert corrected source/names): Strengthened the import Base: show test to assert source: 'Base' and names: ['show'], with negative assertions that the names array does not contain 'Base'. The pre-fix broken shape would now fail this test.

The Rust and TypeScript extractors are still in parity; all Julia tests pass (11 WASM, 16 native).

@carlos-alm
Contributor Author

@greptileai

Development

Successfully merging this pull request may close these issues.

Julia WASM extractor has the same parameterized-type / qualified-def / qualified-selected-import bugs that #1098 fixed in native