out_s3: preserve $INDEX sequence state across restarts#11988
Conversation
The S3 output stores the $INDEX sequence counter in a plain file under the metadata stream directory (sequence/index_metadata/seq_index_<id>). This file is not a Chunk I/O chunk, so the metadata stream has zero registered fstore files. On (graceful) shutdown, flb_fstore_destroy() recursively deletes any stream that has no files, which removes the metadata stream and the persisted $INDEX value. On the next start the index is missing and resets to 0, so subsequent uploads reuse object keys (events_0, events_1, ...) and overwrite previously uploaded objects. This contradicts the documented behavior that $INDEX is saved in store_dir and continues from its last value after a restart on the same disk. Detach the metadata stream reference in s3_store_exit() before the store is destroyed (without deleting the directory), so the persisted index survives a restart and can be recovered by init_seq_index(). Signed-off-by: Damon P. Cortesi <d.lifehacker@gmail.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughIn Changes$INDEX metadata stream preservation on shutdown
Estimated code review effort🎯 2 (Simple) | ⏱️ ~5 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
PR follow-up: the exit-only version of this fix is not sufficient. End-to-end testing showed that the I am updating the proposed approach to make the This should cover both classes of deletion: graceful-exit fstore destruction and any startup/load cleanup path that prunes empty streams. |
Summary
Fixes #11987.
The S3 output persists the
$INDEXsequence counter as a plain file under the metadata stream directory (<store_dir>/<bucket>/sequence/index_metadata/seq_index_<id>). Because this file is not a Chunk I/O chunk, thesequencefstore stream has zero registered files, andflb_fstore_destroy()deletes any stream with zero files on shutdown — recursively removing the directory and the persisted$INDEX. After a graceful restart the index resets to0and uploads overwrite previously-written objects, contradicting the documented behavior that$INDEXcontinues from its last value whenstore_diris on a persistent disk.This detaches the metadata stream reference in
s3_store_exit()(without deleting the directory) before the store is destroyed, so the index file survives on disk andinit_seq_index()recovers it on the next start.Root cause
$INDEXis written as a raw file viainit_seq_index()/write_seq_index()inplugins/out_s3/s3.c, inside thesequencefstore stream, under a nestedindex_metadata/subdirectory.cio_scan_stream_files()(lib/chunkio/src/cio_scan.c) only registersDT_REGentries directly under a stream dir, so the nestedindex_metadata/is ignored and thesequencestream has 0 files.flb_fstore_destroy()(src/flb_fstore.c) deletes any stream withfiles == 0(flb_fstore_stream_destroy(stream, FLB_TRUE)→cio_stream_delete()→ recursive directory delete).s3_store_exit()(plugins/out_s3/s3_store.c) callsflb_fstore_destroy()on shutdown, so the metadata stream and the persisted$INDEXare removed on every graceful exit.Fix
flb_fstore_stream_destroy(stream, FLB_FALSE)removes the fstore reference fromfs->streamswithout deleting the on-disk directory; the underlyingcio_streamis still freed by the subsequentcio_destroy(). Only thesequence(metadata) stream is affected —multipart_upload_metadatauses real Chunk I/O chunks, so it is unaffected by the empty-stream cleanup.Testing
store_diron a persistent volume, a marker file placed insidesequence/and theseq_index_<id>file are both gone after a SIGTERM restart (thesequence/dir returns with a fresh inode), while a marker placed one level above on the same volume survives — confirming the volume is healthy and Fluent Bit is deleting its own metadata stream. NoSuccessfully recovered indexlog appears, and$INDEXresets to 0.init_seq_index()takes theread_seq_index()recovery path.ctx->stream_metadata(plugins/out_s3/s3.h),flb_fstore_stream_destroy(struct flb_fstore_stream *, int delete)(include/fluent-bit/flb_fstore.h).Notes
4.2.x,5.0.x, andmaster(the$INDEX+ metadata-stream design is unchanged across these).init_seq_index()state file.Summary by CodeRabbit