Batch multiple transactions in a single commit in the commitlog #5018

Open
joshua-spacetime wants to merge 1 commit into master from joshua/msync-offset-index



@joshua-spacetime
Contributor

Description of Changes

Batch multiple transactions in a single commit instead of one commit per transaction.

This reduces per-commit overhead, such as updating the segment offset index file.

API and ABI breaking changes

None

Expected complexity level and risk

1

Testing

Refactor

Also pass a range to msync when writing entries to the segment offset index file,
to be explicit and avoid flushing/examining unnecessary pages.
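
As a rough illustration of what passing a range to msync involves: POSIX allows msync to fail with EINVAL when the start address is not a multiple of the page size, so a ranged flush usually rounds the start down to a page boundary and extends the length to still cover the written bytes. The sketch below shows only that arithmetic. `msync_range` is a hypothetical helper, not a function from this PR, and the 16-byte entry size and 4096-byte page size in the usage note are assumptions.

```rust
/// Compute the (start, length) to hand to a ranged msync for `len` bytes
/// written at byte `offset` inside a memory mapping.
///
/// Hypothetical helper for illustration only. `page_size` must be a power
/// of two. The start is rounded down to a page boundary; the length is
/// extended so the range still ends where the written bytes end.
fn msync_range(offset: usize, len: usize, page_size: usize) -> (usize, usize) {
    assert!(page_size.is_power_of_two());
    // Clear the low bits to round down to the containing page.
    let start = offset & !(page_size - 1);
    // Exclusive end of the bytes that were actually written.
    let end = offset + len;
    (start, end - start)
}
```

For example, with an assumed 16-byte index entry written at offset 4100 and 4096-byte pages, `msync_range(4100, 16, 4096)` yields `(4096, 20)`, so only the second page is flushed rather than the whole mapping. Whether the mmap wrapper used by the commitlog already performs this rounding is not shown in this PR.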
@joshua-spacetime joshua-spacetime requested a review from kim May 14, 2026 08:18
Comment on lines 372 to -377
let _ = self.append_internal().map_err(|e| {
    warn!("failed to append to offset index: {e:?}");
});
let _ = self
    .head
    .async_flush()
    .map_err(|e| warn!("failed to flush offset index: {e:?}"));
Contributor Author


This flush isn't needed since append_internal already flushes.

- for tx in tx_buf.drain(..) {
-     clog.commit([tx.into_transaction()])?;
- }
+ clog.commit(tx_buf.drain(..).map(|tx| tx.into_transaction()))?;
Contributor Author


This could include quite a lot of transactions, up to the size of the durability queue. Not sure if this should be bounded further or if it matters at all.
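
If a bound were wanted, one option is to split the drained buffer into batches of at most a fixed size and issue one commit per batch. The following is a minimal sketch under that assumption. `commit_bounded`, `max_batch`, and the `commit` closure are hypothetical stand-ins, not part of the commitlog API.

```rust
/// Commit `txs` in batches of at most `max_batch` transactions each.
///
/// `commit` stands in for the real commit call (hypothetical signature).
/// Returns the number of commits issued, or the first error encountered.
fn commit_bounded<T, E>(
    txs: Vec<T>,
    max_batch: usize,
    mut commit: impl FnMut(Vec<T>) -> Result<(), E>,
) -> Result<usize, E> {
    assert!(max_batch > 0, "a batch must hold at least one transaction");
    let mut commits = 0;
    let mut iter = txs.into_iter().peekable();
    // Keep taking up to `max_batch` items until the input is exhausted.
    while iter.peek().is_some() {
        let batch: Vec<T> = iter.by_ref().take(max_batch).collect();
        commit(batch)?;
        commits += 1;
    }
    Ok(commits)
}
```

With `max_batch` set to 1 this degenerates to the existing one-transaction-per-commit behaviour, so the bound acts as a dial between the two approaches.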

Contributor

@kim kim left a comment


I've thought about it many times, but at this point I'm not on board with moving away from the 1-tx-per-commit restriction.

A torn write in the middle of a commit is guaranteed to destroy commit.n transactions. A smaller number of transactions per commit results in a higher number of commits per write, and increases the chance that at least some transactions are recoverable.
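
To make the trade-off concrete, here is a toy model of that argument. It assumes the tear leaves a clean prefix of the written bytes on disk and that a commit is recoverable only if all of its bytes precede the tear. Both are simplifications, and `recoverable_txs` is a hypothetical function, not commitlog code.

```rust
/// Count the transactions that survive a torn write in a simplified model.
///
/// `commits` lists (transactions_in_commit, commit_size_in_bytes) in write
/// order. `torn_at` is the number of bytes that reached disk intact.
/// A commit counts as recoverable only if it lies entirely before the tear;
/// everything from the first damaged commit onward is treated as lost.
fn recoverable_txs(commits: &[(u64, u64)], torn_at: u64) -> u64 {
    let mut end = 0u64; // exclusive end offset of the current commit
    let mut txs = 0u64;
    for &(n, size) in commits {
        end += size;
        if end > torn_at {
            break;
        }
        txs += n;
    }
    txs
}
```

Taking eight transactions of 100 bytes each and a tear after 750 bytes: written as eight single-transaction commits, `recoverable_txs(&[(1, 100); 8], 750)` gives 7, while written as one commit of eight, `recoverable_txs(&[(8, 800)], 750)` gives 0. The model ignores commit headers and checksums, so the byte counts are illustrative only.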

The trouble is that we are rather prone to torn writes, not least because they (the writes) are unaligned. Just advising to use confirmed reads is not enough, I'd argue, because users have no way of even knowing how many transactions could potentially be lost -- outside of benchmarking scenarios, I at least would want to design my application such that it doesn't write too much ahead of the "uncertainty window" of the durability layer.

We can certainly do better by improving our I/O model and recovery mechanisms, but at this point I think we'd basically weaken durability guarantees, and I don't think this is a good idea.

The offset index changes look fine to me. I would suggest increasing Options::offset_index_interval_bytes if there is data that suggests that we're updating the index too often.


2 participants