[python] Roll manifest files by target size during commit and compaction by XiaoHongbo-Hope · Pull Request #8373 · apache/paimon

XiaoHongbo-Hope · 2026-06-28T04:39:45Z

Purposes

The commit path writes all manifest entries into a single file without rolling, which produces oversized manifest files. These oversized files slow every subsequent commit that scans them. This PR fixes the issue by rolling manifest files by target size.

Tests

ManifestFileManagerTest.test_commit_manifest_exceeds_target_size

…tion

Manifest files written by pypaimon commit and minor compaction were not split by manifest-target-file-size. A single OVERWRITE commit writing 400 partitions produced a 138 MiB manifest (17x the 8 MiB default target), slowing every subsequent commit that reads it.

Add ManifestFileManager.rolling_write() that estimates the total serialized size, then splits entries across multiple files so each stays near the target. Apply it in FileStoreCommit delta/changelog manifest writing and ManifestFileMerger minor compaction.

…e serialization

- Add try/except around chunk loop to delete already-written files on failure
- Estimate entries_per_file from avg entry size instead of serializing twice
- Extract _flush() and simplify write() to reuse it

…plitting

When entry sizes are skewed, avg-based splitting can produce manifest files that exceed 2x the target. Switch to adaptive rolling: serialize each chunk, check actual size, shrink and re-serialize if overshooting, then adjust entries_per_chunk for the next iteration based on actual size. Add a skewed-entry test to cover this.
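The adaptive rolling described in this commit message could be sketched roughly as below. This is an illustration only, not the PR's code: `adaptive_chunks`, the `serialize` callback, the halving strategy, and the JSON stand-in for Avro are all assumptions made for the example.

```python
import json
from typing import Callable, List, Sequence


def adaptive_chunks(entries: Sequence[dict], target_size: int,
                    serialize: Callable[[Sequence[dict]], bytes],
                    initial_per_chunk: int = 4) -> List[bytes]:
    """Hypothetical helper: split entries into serialized chunks near target_size."""
    chunks: List[bytes] = []
    pos = 0
    per_chunk = max(1, initial_per_chunk)
    while pos < len(entries):
        count = min(per_chunk, len(entries) - pos)
        data = serialize(entries[pos:pos + count])
        # Overshoot: halve the chunk and re-serialize until it fits
        # or only a single entry is left.
        while len(data) > target_size and count > 1:
            count = max(1, count // 2)
            data = serialize(entries[pos:pos + count])
        chunks.append(data)
        pos += count
        # Size the next chunk from the bytes-per-entry actually observed.
        avg_entry_size = max(1, len(data) // count)
        per_chunk = max(1, target_size // avg_entry_size)
    return chunks


def json_serialize(batch: Sequence[dict]) -> bytes:
    """Stand-in serializer (assumption); the PR serializes Avro instead."""
    return json.dumps(list(batch)).encode("utf-8")
```

With skewed entries, a chunk only exceeds the target when a single entry is itself larger than the target, which an average-based split cannot guarantee.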

…eter

rolling_write now takes name_prefix directly and generates file names as {prefix}-0, {prefix}-1, etc. Callers no longer append -0 suffix.

Replace full Avro serialization for size estimation with a 64-entry sample. Only serialize all entries when the sample suggests they fit in a single file. For multi-file rolling, the per-chunk adaptive logic handles estimation drift.
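A sample-based size estimate of this kind might look like the sketch below. The function name `estimate_total_size`, the choice to sample the first 64 entries, and the linear scaling are assumptions for illustration; the commit message only states that a 64-entry sample replaces full serialization.

```python
import json
from typing import Callable, Sequence

SAMPLE_SIZE = 64  # sample size named in the commit message


def estimate_total_size(entries: Sequence[dict],
                        serialize: Callable[[Sequence[dict]], bytes],
                        sample_size: int = SAMPLE_SIZE) -> int:
    """Hypothetical helper: extrapolate total bytes from a small sample."""
    if not entries:
        return 0
    sample = entries[:sample_size]  # assumption: take the leading entries
    sample_bytes = len(serialize(sample))
    # Scale the sample size linearly to the full entry count.
    return sample_bytes * len(entries) // len(sample)


def json_serialize(batch: Sequence[dict]) -> bytes:
    """Stand-in serializer (assumption); the PR serializes Avro instead."""
    return json.dumps(list(batch)).encode("utf-8")
```

The estimate is only as good as the sample is representative, which is why the commit message leaves drift correction to the per-chunk adaptive logic.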

Record delta, changelog, and merge manifest file metas into new_manifest_files_for_abort so _cleanup_preparation_failure can delete them if a later step (e.g. manifest list write) fails. Previously the variable was declared but only populated by the merger, missing the delta/changelog files from rolling_write.

Replace sample-based estimation with fastavro.Writer streaming API: write entries one by one, flush every 100 records to check buf size, roll to a new file when size >= target. This matches Java's RollingFileWriterImpl behavior (check every N records, roll on threshold).

Replace fixed 100-record flush interval with fastavro sync_interval so buf.tell() updates after each record. Check size after every write instead of batched flushes, giving tighter control near the target. Use flush() instead of dump() to avoid writing empty trailing blocks.
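The roll-on-threshold loop described in the two commit messages above can be sketched with the standard library alone. This is not the PR's implementation: newline-delimited JSON stands in for the fastavro streaming Writer, and the in-memory dict stands in for file storage. Only the `{prefix}-0`, `{prefix}-1` naming and the "check size after every write, roll when size >= target" rule come from the commit messages.

```python
import io
import json
from typing import Dict, Iterable


def rolling_write(entries: Iterable[dict], target_size: int,
                  name_prefix: str) -> Dict[str, bytes]:
    """Sketch: write entries one by one and roll to a new file at target_size."""
    files: Dict[str, bytes] = {}
    buf = io.BytesIO()
    for entry in entries:
        # Stand-in encoding (assumption): one JSON document per line.
        buf.write(json.dumps(entry).encode("utf-8") + b"\n")
        # Check the buffer size after every record, not in batches.
        if buf.tell() >= target_size:
            files[f"{name_prefix}-{len(files)}"] = buf.getvalue()
            buf = io.BytesIO()
    # Emit the trailing partial file only if it holds data.
    if buf.tell() > 0:
        files[f"{name_prefix}-{len(files)}"] = buf.getvalue()
    return files
```

Because the check runs after each record, a file overshoots the target by at most one entry's serialized size.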

Previously _merge_candidates skipped single-file groups unconditionally. Now it only skips when the file is within the target size. An oversized single manifest is re-read and re-written through rolling_write, splitting it into multiple target-sized files.

Java's mergeCandidates unconditionally skips single-file groups. Revert the Python-only divergence to keep consistency.

…preparation failure

When manifest list write fails after rolling_write succeeds, the except block now deletes the already-written manifest files before calling _cleanup_preparation_failure, preventing orphan files.

…uests

rolling_write already has the byte buffer, so pass len(avro_bytes) directly instead of calling file_io.get_file_size(), which issues a remote HEAD request per file.

Use written_files list instead of result to track files that need cleanup on failure. Covers the gap where _flush succeeds but _build_meta fails, leaving the current chunk's file as an orphan.
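The gap this commit closes can be shown with a small sketch. The names `write_all_or_nothing`, `store`, and `build_meta` are hypothetical stand-ins for the real file IO and metadata code; the point is only that a file is recorded for cleanup as soon as it is written, before its metadata is built.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def write_all_or_nothing(chunks: Sequence[bytes], store: Dict[str, bytes],
                         build_meta: Callable[[str, int], Tuple[str, int]],
                         name_prefix: str) -> List[Tuple[str, int]]:
    """Sketch: delete every written file if any later step fails."""
    written_files: List[str] = []  # everything that reached storage
    result: List[Tuple[str, int]] = []  # metadata built so far
    try:
        for i, data in enumerate(chunks):
            name = f"{name_prefix}-{i}"
            store[name] = data
            # Track the file before building metadata, so a build_meta
            # failure cannot leave this file behind as an orphan.
            written_files.append(name)
            # Pass the known size instead of asking storage for it.
            result.append(build_meta(name, len(data)))
        return result
    except Exception:
        for name in written_files:
            store.pop(name, None)
        raise
```

Had cleanup iterated over `result` instead, the file whose metadata failed to build would never be deleted.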

Verify each rolling manifest file is readable and entry counts match the metadata, ensuring the streaming Writer produces valid Avro.

Replace _cleanup_preparation_failure with two methods mirroring Java:

- _clean_up_reuse_tmp_manifests: read delta/changelog manifest lists to find and delete their manifest files, then delete the lists
- _clean_up_no_reuse_tmp_manifests: delete base manifest list, then only delete manifests in merge_after not present in merge_before
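The "in merge_after but not in merge_before" rule in the second method amounts to an order-preserving set difference. A minimal sketch, where `manifests_to_delete` and the use of plain file names are assumptions for illustration:

```python
from typing import Iterable, List


def manifests_to_delete(merge_before: Iterable[str],
                        merge_after: Iterable[str]) -> List[str]:
    """Sketch: pick only manifests that the merge newly created."""
    before = set(merge_before)
    # Manifests that already existed before the merge are still
    # referenced by earlier snapshots and must be kept.
    return [name for name in merge_after if name not in before]
```

For example, with `merge_before = ["m-a", "m-b"]` and `merge_after = ["m-a", "m-new-0", "m-new-1"]`, only the two new files are selected for deletion.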

When manifest list write fails after rolling_write succeeds, reading the manifest list back for cleanup also fails. Fall back to the known metas already in hand to delete orphan manifest files.

…ove dead code

- Wrap cleanup calls in try/except so cleanup errors don't mask the original preparation failure
- Pass merge new_files directly instead of recomputing set difference
- Remove unused write_with_meta method
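The first item in the list above follows a common Python pattern, sketched here under assumptions: `prepare_with_cleanup` and its two callbacks are hypothetical names, and logging the cleanup error is one possible choice, not necessarily what the PR does.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def prepare_with_cleanup(prepare: Callable[[], T],
                         cleanup: Callable[[], None]) -> T:
    """Sketch: run cleanup on failure without hiding the original error."""
    try:
        return prepare()
    except Exception:
        try:
            cleanup()
        except Exception:
            # Swallow and log the cleanup error so it cannot replace
            # the preparation failure that the caller needs to see.
            logger.warning("cleanup failed", exc_info=True)
        raise  # re-raises the original preparation failure
```

Without the inner try/except, an exception from `cleanup()` would propagate instead and the caller would see the cleanup error as the primary failure.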

Java's cleanUpReuseTmpManifests only reads manifest list to find files to delete. If manifest list write failed, it skips cleanup. Align Python to the same behavior.

XiaoHongbo-Hope changed the title ~~[python] Split manifest files by target size during commit and compaction~~ [python] Roll manifest files by target size during commit and compaction Jun 28, 2026

XiaoHongbo-Hope added 13 commits June 28, 2026 12:51

[python] Replace fragile name parsing with explicit name_prefix parameter

a650faa

[python] Fix flake8 E128 continuation line indent in manifest tests

d2060bd

[python] Update file_store_commit_test for renamed _write_manifest_files

bd1ec85

[python] Extract sync_interval magic number to _AVRO_SYNC_INTERVAL constant

43f1f86

[python] Revert merger single-file split to match Java behavior

e2b0b55

XiaoHongbo-Hope marked this pull request as ready for review June 28, 2026 07:54

XiaoHongbo-Hope added 12 commits June 28, 2026 16:05

[python] Pass known file size to _build_meta to avoid remote HEAD requests

1e608f6

[python] Track written files separately for rollback in rolling_write

10cd1f1

[python] Add read-back assertion for rolling manifest files

2d5cce9

[python] Remove extra blank line between test classes

8f36b3b

[python] Remove redundant manifest file cleanup in except block

0eb8094

[python] Add known metas fallback to _clean_up_reuse_tmp_manifests

247413f

[python] Fix unbound merge_new_files and remove unused merge_after_manifests

d7f34ee

[python] Remove trailing blank line at EOF

542aecc

[python] Remove known metas fallback to match Java CommitCleaner

235f93e

[python] Fix flake8 E128 indent in cleanup method signatures

a6790c9

XiaoHongbo-Hope wants to merge 26 commits into apache:master from XiaoHongbo-Hope:manifest_write_fix
