feat(update): add OverwriteFiles for overwrite snapshot commits #741

Merged: wgtmac merged 5 commits into apache:main from lishuxu:feature/overwrite-files on Jun 24, 2026

lishuxu (Contributor) commented Jun 14, 2026

Summary:
Add a production OverwriteFiles builder that brings iceberg-cpp to semantic parity with Java's BaseOverwriteFiles. It supports explicit file replacement (DeleteFile + AddFile) and range-based replacement (OverwriteByRowFilter + AddFile) with the same family of pre-commit concurrency validations. The builder is a thin subclass of MergingSnapshotUpdate and reuses the existing commit kernel (Apply/summary/retry/cleanup) unchanged.

Changes:

  • New OverwriteFiles class (src/iceberg/update/overwrite_files.{h,cc}) and Table::NewOverwrite() / Transaction::NewOverwrite() entry points.
  • Builder surface: AddFile, DeleteFile, bulk DeleteFiles, OverwriteByRowFilter, ValidateFromSnapshot, ConflictDetectionFilter, ValidateNoConflictingData, ValidateNoConflictingDeletes, ValidateAddedFilesMatchOverwriteFilter, WithCaseSensitivity.
  • Validate(): conflict-filter resolution, concurrent add/delete conflict checks, and strict added-file range validation (projection + StrictMetricsEvaluator).
  • Tests (overwrite_files_test.cc, 45 cases) and CMake/meson wiring.
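
The "strict" added-file range validation in the Validate() bullet can be illustrated with a toy sketch. It assumes a single integer column and a half-open range filter; `ColumnBounds`, `RangeFilter`, `MightMatch`, and `AllRowsMatch` are hypothetical names for illustration and are not iceberg-cpp types or the StrictMetricsEvaluator API. The point is only the difference between "some row might match" and "every row is guaranteed to match" when all that is known about a file is its min/max metrics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; NOT iceberg-cpp types.
struct ColumnBounds {
  int64_t lower;  // smallest value recorded for the column in this file
  int64_t upper;  // largest value recorded for the column in this file
};

// Half-open overwrite range: lo <= value < hi.
struct RangeFilter {
  int64_t lo;
  int64_t hi;
};

// Inclusive check: true if at least one row in the file MIGHT match.
bool MightMatch(const ColumnBounds& b, const RangeFilter& f) {
  return b.upper >= f.lo && b.lower < f.hi;
}

// Strict check: true only if EVERY row in the file is guaranteed to match,
// which is what an added file must satisfy to stay inside the overwrite range.
bool AllRowsMatch(const ColumnBounds& b, const RangeFilter& f) {
  return b.lower >= f.lo && b.upper < f.hi;
}

// Small demonstration that can be called from any main().
void DemoStrictValidation() {
  const RangeFilter overwrite{10, 20};
  // Fully inside the range: passes both checks.
  assert(AllRowsMatch(ColumnBounds{10, 19}, overwrite));
  // Straddles the upper edge: might match, but not strictly.
  assert(MightMatch(ColumnBounds{15, 25}, overwrite));
  assert(!AllRowsMatch(ColumnBounds{15, 25}, overwrite));
}
```

Under these assumptions, a file whose bounds straddle the range edge would pass an inclusive check but fail the strict one, which is why strict evaluation is the appropriate test for added files.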

Behavior alignment with Java:

  • operation() returns append, delete, or overwrite depending on what the builder contains.
  • Conflict-filter resolution mirrors BaseOverwriteFiles (explicit -> row filter -> AlwaysTrue); replaced-file delete checks honor ConflictDetectionFilter.
  • Strict added-file validation uses a single DataSpec(), rejecting multi-spec and empty added-file sets.
  • Deviations: the public setter is named WithCaseSensitivity (vs. Java's caseSensitive) to avoid a clash with a protected name; ValidateFromSnapshot rejects negative snapshot ids early.
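
Two of the alignment rules above (how the operation is derived, and the fallback order for the conflict filter) can be sketched as follows. This is a minimal illustration, not the PR's code: `OverwriteState`, `DeriveOperation`, and `ResolveConflictFilter` are hypothetical names, filters are modelled as plain strings rather than expressions, and any extra conditions the real implementation applies before falling back to the row filter are omitted.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the builder's accumulated state; NOT the
// actual OverwriteFiles class.
struct OverwriteState {
  bool adds_data_files = false;
  bool deletes_data_files = false;  // explicit deletes or a row-filter delete
  std::optional<std::string> conflict_detection_filter;  // set explicitly
  std::optional<std::string> row_filter;  // set via an overwrite-by-filter call
};

enum class Operation { kAppend, kDelete, kOverwrite };

// Derive the snapshot operation from what the builder contains.
// Assumption: only-adds maps to append, only-deletes maps to delete,
// and everything else (including an empty builder) maps to overwrite.
Operation DeriveOperation(const OverwriteState& s) {
  if (s.adds_data_files && !s.deletes_data_files) return Operation::kAppend;
  if (s.deletes_data_files && !s.adds_data_files) return Operation::kDelete;
  return Operation::kOverwrite;
}

// Fallback chain: explicit conflict filter, then row filter, then
// "always true" (represented here by the string "true").
std::string ResolveConflictFilter(const OverwriteState& s) {
  if (s.conflict_detection_filter) return *s.conflict_detection_filter;
  if (s.row_filter) return *s.row_filter;
  return "true";
}

// Small demonstration that can be called from any main().
void DemoOverwriteRules() {
  OverwriteState s;
  s.adds_data_files = true;
  assert(DeriveOperation(s) == Operation::kAppend);
  s.row_filter = "day = 5";
  assert(ResolveConflictFilter(s) == "day = 5");
}
```

Falling back to "always true" is the conservative choice in this sketch: with no filter to narrow the check, any concurrent change is treated as a potential conflict.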

Comment thread src/iceberg/table.cc
Comment thread src/iceberg/update/overwrite_files.h Outdated
Comment thread src/iceberg/update/overwrite_files.h
shuxu.li and others added 5 commits June 23, 2026 23:30
…rect AddFile/DeleteFile paths and trim overly verbose test comments.
wgtmac force-pushed the feature/overwrite-files branch from 14f6538 to 23699bb on June 24, 2026 02:26
wgtmac merged commit 3c9d13d into apache:main on Jun 24, 2026
21 checks passed
wgtmac (Member) commented Jun 24, 2026

Thanks @lishuxu
