follow-up: backfill misses newly-added dropped-language files on incrementals #1083

@carlos-alm

Deferred from PR #1082 review.

Original reviewer comment: #1082 (comment)

Context:

PR #1082 gates backfillNativeDroppedFiles on result.isFullBuild || result.removedCount > 0. The gate skips roughly 45ms of fs walk and DB query overhead on clean incrementals against current binaries (which carry #1070's detect_removed_files filter).

There is a narrow correctness gap: when a user adds a brand-new file with a dropped-language extension (.clj, .jl, .r, .erl, .fs, .gleam) during an incremental pass on a current binary, the orchestrator's narrower file_collector never enumerates the new file. As a result:

  • removedCount = 0 (nothing was deleted)
  • isFullBuild = false (this is an incremental)
  • changedCount = 0 for the dropped-language file (orchestrator's collector skips that extension entirely)

So the gate evaluates to false, backfillNativeDroppedFiles is skipped, and the new file is absent from nodes/file_hashes until the next forced full rebuild.
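
A minimal sketch of the failing case, to make the gap concrete. The BuildResult shape and the shouldBackfill helper are hypothetical stand-ins that model only the fields named above; they are not the actual codegraph types.

```typescript
// Hypothetical result shape: only the fields mentioned in this issue.
interface BuildResult {
  isFullBuild: boolean;
  removedCount: number;
  changedCount: number;
}

// The gate as described for PR #1082: backfill only on full builds
// or when at least one file was removed.
function shouldBackfill(result: BuildResult): boolean {
  return result.isFullBuild || result.removedCount > 0;
}

// Incremental pass where the only change on disk is a brand-new
// dropped-language file (for example a .clj file). The collector
// skips that extension, so nothing is reported as changed.
const addedDroppedFileOnly: BuildResult = {
  isFullBuild: false,
  removedCount: 0,
  changedCount: 0,
};

console.log(shouldBackfill(addedDroppedFileOnly)); // false: backfill is skipped
```

Nothing in the result distinguishes this pass from a genuinely clean incremental, which is why the gate cannot catch it without an extra signal.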

Workaround (current): Users can run codegraph build --force, or modify any file with a Rust-supported extension alongside the new file.

Possible fixes:

  1. Extend the gate to detect dropped-language additions cheaply — e.g. compare on-disk count of dropped-language files vs nodes count for those extensions (one query, one fs walk filter).
  2. Have the orchestrator's narrower file_collector also surface a droppedExtensionCount for added files in skipped extensions, so JS can gate on it.
  3. Track dropped-language file additions through the journal (works only in codegraph watch mode).
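
Option 1 could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: shouldBackfillExtended, countDroppedLanguageFiles, and extensionOf are hypothetical names, the list of on-disk paths stands in for the fs walk, and indexedDroppedCount stands in for the single nodes query. Extension matching is exact-case here, which is also an assumption.

```typescript
// Dropped-language extensions listed in this issue.
const DROPPED_EXTENSIONS: ReadonlySet<string> = new Set([
  ".clj", ".jl", ".r", ".erl", ".fs", ".gleam",
]);

// Returns the extension including the dot, or "" if the file name has none.
function extensionOf(path: string): string {
  const sep = Math.max(path.lastIndexOf("/"), path.lastIndexOf("\\"));
  const base = path.slice(sep + 1);
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot) : "";
}

// Counts paths whose extension is in the dropped-language set.
function countDroppedLanguageFiles(paths: readonly string[]): number {
  return paths.filter((p) => DROPPED_EXTENSIONS.has(extensionOf(p))).length;
}

// Hypothetical result shape: only the fields the existing gate reads.
interface BuildResult {
  isFullBuild: boolean;
  removedCount: number;
}

// Existing gate first; otherwise backfill when disk holds more
// dropped-language files than the index currently knows about.
function shouldBackfillExtended(
  result: BuildResult,
  onDiskPaths: readonly string[],
  indexedDroppedCount: number,
): boolean {
  if (result.isFullBuild || result.removedCount > 0) return true;
  return countDroppedLanguageFiles(onDiskPaths) > indexedDroppedCount;
}

const incremental: BuildResult = { isFullBuild: false, removedCount: 0 };
const onDisk = ["src/main.rs", "src/core.clj", "scripts/plot.jl"];

console.log(shouldBackfillExtended(incremental, onDisk, 1)); // true: one file is not indexed yet
console.log(shouldBackfillExtended(incremental, onDisk, 2)); // false: counts match
```

A count comparison is only a heuristic: it detects net additions, so whether it is cheap enough depends on how much of the ~45ms overhead the filtered walk and the count query still incur.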

Severity is low (rare workflow, recoverable via --force), but the silent gap should be closed for full correctness.

Labels: follow-up (Deferred work from PR reviews that needs tracking)
