[fix](iceberg) Fix NPE in COUNT(*) pushdown when snapshot summary omits total-* counters#64648

Open
raghav-reglobe wants to merge 1 commit into apache:master from raghav-reglobe:fix-iceberg-count-pushdown-npe

@raghav-reglobe

Proposed changes

IcebergScanNode.getCountFromSnapshot() reads total-equality-deletes,
total-position-deletes and total-records from the Iceberg snapshot summary
and calls .equals() / Long.parseLong() directly on the Map.get() results.

An Iceberg snapshot summary is not guaranteed to carry these total-*
counters — snapshots written by compaction/replace (and some writers) may omit
them. When a counter is absent, SELECT COUNT(*) throws:

java.lang.NullPointerException: Cannot invoke "String.equals(Object)" because
the return value of "java.util.Map.get(Object)" is null
    at org.apache.doris.datasource.iceberg.source.IcebergScanNode.getCountFromSnapshot(IcebergScanNode.java:1154)
    at org.apache.doris.datasource.iceberg.source.IcebergScanNode.isBatchMode(...)
    at org.apache.doris.datasource.FileQueryScanNode.createScanRangeLocations(...)

SELECT * on the same table works (it scans); only COUNT(*) fails (it takes
the metadata-count shortcut). Reproducible on current master.

Fix

Extract the summary parsing into a pure static method,
getCountFromSummary(Map<String, String> summary, boolean ignoreDanglingDelete),
that null-checks the counters and falls back to a normal scan (return -1,
the method's existing "cannot push down count" signal) when any required
counter is absent. Behaviour is otherwise unchanged.
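
For reviewers, the intended behaviour can be sketched roughly as below. This is a minimal standalone illustration, not the code in the diff: the class name CountFromSummarySketch and the constant CANNOT_PUSH_DOWN are hypothetical, checking all three counters up front is a simplification, and the handling of ignoreDanglingDelete (subtracting position deletes from total records when the flag is set) is an assumption about the existing logic.

```java
import java.util.Map;

// Hypothetical standalone sketch; names other than getCountFromSummary and
// the three summary keys are illustrative assumptions.
final class CountFromSummarySketch {
    static final String TOTAL_RECORDS = "total-records";
    static final String TOTAL_POSITION_DELETES = "total-position-deletes";
    static final String TOTAL_EQUALITY_DELETES = "total-equality-deletes";

    // Existing "cannot push down count" signal: the caller falls back to a scan.
    static final long CANNOT_PUSH_DOWN = -1L;

    static long getCountFromSummary(Map<String, String> summary, boolean ignoreDanglingDelete) {
        String equalityDeletes = summary.get(TOTAL_EQUALITY_DELETES);
        String positionDeletes = summary.get(TOTAL_POSITION_DELETES);
        String totalRecords = summary.get(TOTAL_RECORDS);

        // Any absent counter means the summary cannot be trusted for a count.
        if (equalityDeletes == null || positionDeletes == null || totalRecords == null) {
            return CANNOT_PUSH_DOWN;
        }
        // Equality deletes cannot be resolved from metadata alone.
        if (!"0".equals(equalityDeletes)) {
            return CANNOT_PUSH_DOWN;
        }
        long deleteCount = Long.parseLong(positionDeletes);
        long records = Long.parseLong(totalRecords);
        if (deleteCount == 0) {
            return records;
        }
        // Assumed behaviour: subtract position deletes only when the caller
        // has opted to ignore dangling deletes; otherwise scan.
        return ignoreDanglingDelete ? records - deleteCount : CANNOT_PUSH_DOWN;
    }
}
```

With an empty summary the sketch returns -1 instead of throwing, so the caller takes the same scan path that SELECT * already uses.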

A similar unguarded access exists in IcebergUtils
(Long.parseLong(summary.get(TOTAL_RECORDS)) - Long.parseLong(summary.get(TOTAL_POSITION_DELETES))).
Happy to guard it in this PR or a follow-up — let me know your preference.
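
If it helps the discussion, one possible shape for that guard is sketched below. Everything here is hypothetical: the class SummaryCounterSketch, the helper parseCounter and the method recordsMinusPositionDeletes do not exist in Doris, and returning -1 for an incomplete summary is an assumption about what the IcebergUtils caller would want.

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch of a null-safe counter lookup; not part of this PR.
final class SummaryCounterSketch {
    // Returns the counter as a long, or empty when it is absent or not numeric.
    static OptionalLong parseCounter(Map<String, String> summary, String key) {
        String value = summary.get(key);
        if (value == null) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(value.trim()));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // Guarded form of total-records minus total-position-deletes.
    // Assumption: -1 tells the caller that the count is unavailable.
    static long recordsMinusPositionDeletes(Map<String, String> summary) {
        OptionalLong records = parseCounter(summary, "total-records");
        OptionalLong deletes = parseCounter(summary, "total-position-deletes");
        if (!records.isPresent() || !deletes.isPresent()) {
            return -1L;
        }
        return records.getAsLong() - deletes.getAsLong();
    }
}
```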

Release note

Fix a NullPointerException on SELECT COUNT(*) over an Iceberg table whose
latest snapshot summary omits the total-* counters (e.g. snapshots produced
by compaction/replace).

Check List

  • New feature / bug fix has unit test
    (IcebergCountPushDownTest: missing-counter, no-delete, equality-delete and
    position-delete cases)
  • Behaviour-preserving for complete summaries; only adds a null-safe
    fall-back to scan

…ts total-* counters

IcebergScanNode.getCountFromSnapshot() read total-equality-deletes /
total-position-deletes / total-records from the snapshot summary and called
.equals() / Long.parseLong() directly on the Map.get() results. An Iceberg
snapshot summary is not guaranteed to carry these counters (snapshots written
by compaction/replace, or some writers, may omit them), so `SELECT COUNT(*)`
threw a NullPointerException ("Cannot invoke String.equals because
Map.get(...) is null") on such tables while `SELECT *` worked fine.

Extract the summary parsing into a pure static getCountFromSummary() that
null-checks the counters and falls back to a normal scan (returns -1) when any
is absent. Add a unit test covering the missing-counter, no-delete,
equality-delete and position-delete cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?
