Skip to content

parser: emit REFERENCES edges for Python callback arguments (#363)#424

Merged
tirth8205 merged 1 commit intotirth8205:mainfrom
Pr1ncePS2002:fix/dead-code-callback-references
May 7, 2026
Merged

parser: emit REFERENCES edges for Python callback arguments (#363)#424
tirth8205 merged 1 commit intotirth8205:mainfrom
Pr1ncePS2002:fix/dead-code-callback-references

Conversation

@Pr1ncePS2002
Copy link
Copy Markdown
Contributor

Closes #363 (Python portion).

Summary

find_dead_code already considers REFERENCES edges incoming to a node when deciding whether to flag it as dead. But for Python, those edges were never being produced for the most common callback shapes — executor.submit(handler), filter(predicate, xs), map(transform, xs), df.apply(fn), returned inner functions, etc. — so callback functions were systematically reported as unused.

Root cause

In code_review_graph/parser.py, _extract_value_references dispatches on the AST node type wrapping a call's arguments:

if node_type == \"arguments\":          # JS/TS only — Python is \"argument_list\"
    self._ref_from_arguments(...)

Python's tree-sitter grammar uses argument_list. The dispatcher never fired for Python, so _ref_from_arguments (which already knows how to walk identifier children) was never invoked.

Fix

One-line change in intent: introduce a frozenset of accepted argument-container node types and dispatch on membership.

_ARGUMENTS_TYPES = frozenset({\"arguments\", \"argument_list\"})

if node_type in self._ARGUMENTS_TYPES:
    self._ref_from_arguments(...)

Mirrors the existing _PAIR_TYPES / _ARRAY_TYPES style in the same class.

Why this doesn't add noise

_ref_from_arguments only emits when the argument is a bare identifier, and _emit_reference_if_known further gates emission on the name being locally defined or imported. So print(x) where x is a local variable does not produce a spurious REFERENCES edge to a function named x in some other file.

What this does not fix

  • Keyword-argument callbacks (executor.submit(callable=h)) — keyword_argument is a different child shape; out of scope for this PR
  • Function references in module-level assignments (HANDLERS = my_func) — handled by _ref_from_assignment already
  • Other languages with first-class functions (Ruby Symbol#to_proc, Go function values, etc.) — separate work

Test plan

New fixture tests/fixtures/sample_callback_refs.py covers the three patterns called out in the issue body (executor.submit, filter, map). Two regression tests in tests/test_parser.py:

  • test_python_callback_references_emitted — asserts REFERENCES edges land on executor_callback, filter_callback, map_callback
  • test_python_callback_references_not_treated_as_dead — end-to-end check: parses the fixture, stores in a real GraphStore, runs find_dead_code, verifies the three callback functions are NOT in the dead list

Verified locally:

  • uv run pytest tests/test_parser.py -k python_callback -v — 2/2 passing
  • uv run pytest tests/test_parser.py -k \"value_ref or callback or reference\" -q — 13/13 passing (existing JS/TS REFERENCES tests unchanged)
  • uv run ruff check code_review_graph/parser.py tests/fixtures/sample_callback_refs.py — clean
  • No new failures vs main baseline (Windows-only PermissionError cluster pre-exists on both)

…5#363)

The dispatcher in `_extract_value_references` only matched the
`arguments` AST node type, which is the JS/TS shape. Python's
tree-sitter grammar uses `argument_list` for the same construct, so
function references in callback positions — `executor.submit(handler)`,
`filter(predicate, xs)`, `map(transform, xs)`, `df.apply(fn)`, etc. —
silently produced no REFERENCES edges.

Downstream effect: `find_dead_code` already checks `has_references`
(refactor.py), but for Python the edge was never created, so callback
functions were systematically flagged as dead — exactly the high false
positive rate users report in tirth8205#363 (executor pools, pandas .apply,
higher-order functions, returned closures).

Fix is one line of intent: introduce `_ARGUMENTS_TYPES = {"arguments",
"argument_list"}` and dispatch on set membership instead of string
equality. The existing `_ref_from_arguments` walker already handles
identifier children correctly and `_emit_reference_if_known` still
gates emission on the name being locally defined or imported, so this
does not introduce noise from arbitrary identifiers.

Scoped to the dispatcher only: JS/TS shorthand-property, pair-value,
and array-element handling are unchanged.
Copilot AI review requested due to automatic review settings May 3, 2026 11:02
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Tree-sitter parser’s “function-as-value” reference extraction so Python call argument containers (argument_list) are handled the same way as JS/TS (arguments), enabling REFERENCES edges for common callback patterns and preventing false-positive dead-code reports.

Changes:

  • Expand _extract_value_references dispatch to treat both arguments and argument_list as callback-argument containers.
  • Add a Python fixture demonstrating callbacks passed as bare identifiers (executor.submit / filter / map).
  • Add regression tests asserting REFERENCES edges are emitted and that find_dead_code does not flag those callbacks as dead.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
code_review_graph/parser.py Adds _ARGUMENTS_TYPES and updates value-reference dispatch so Python argument_list triggers _ref_from_arguments.
tests/fixtures/sample_callback_refs.py New fixture covering typical Python callback-as-argument usage patterns.
tests/test_parser.py New tests verifying REFERENCES edge emission and end-to-end dead-code behavior via GraphStore + find_dead_code.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@tirth8205 tirth8205 merged commit 9866ea2 into tirth8205:main May 7, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dead code detection reports false positives for callback functions and higher-order usage patterns

3 participants