Bug repro: RI.SET AOF ordering divergence under concurrent writes #1819
Draft
tiagonapoli wants to merge 1 commit into
RI.SET uses a shared lock (BfTree is internally thread-safe), so concurrent writes to the same field execute in parallel. The AOF enqueue happens after the native insert as a separate unserialized step, so the AOF log order may not match BfTree's internal last-writer-wins order. On AOF replay (recovery), the replayed last write may differ from the primary's actual winner, causing primary/replica divergence.

The test uses 8 workers writing the same field per round across 200 rounds, then recovers from AOF and compares. Empirically ~2-5% of rounds diverge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem
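`RI.SET` uses a shared lock (not exclusive) because BfTree is internally thread-safe for point operations. However, AOF logging happens after the native insert as a separate unserialized call. When multiple threads concurrently write the same field, the order of their AOF entries may not match BfTree's internal last-writer-wins order, so the write replayed last during AOF recovery can differ from the primary's actual winner.
This causes primary/replica divergence.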
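The mismatch can be illustrated with a minimal, single-threaded Python model of one interleaving. This is a sketch under stated assumptions, not Garnet code: the dict `primary` stands in for BfTree, the list `aof` stands in for the AOF, and `replay` is a hypothetical helper that applies log entries in append order.

```python
def replay(aof):
    """Rebuild state by applying logged writes in log order (last entry wins)."""
    state = {}
    for field, value in aof:
        state[field] = value
    return state


primary = {}  # hypothetical stand-in for BfTree: the last insert wins
aof = []      # hypothetical stand-in for the AOF: replayed in append order

# One possible interleaving of two writers, A and B, on the same field.
primary["f"] = "A"      # A: native insert
primary["f"] = "B"      # B: native insert (B is the primary's winner)
aof.append(("f", "B"))  # B: AOF enqueue
aof.append(("f", "A"))  # A: AOF enqueue (A ends up last in the log)

recovered = replay(aof)
print(primary["f"], recovered["f"])  # prints "B A"
```

The primary keeps B's value while recovery yields A's, which is the same shape of mismatch the test reports.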
Test
The test (`RISetAofOrderingDivergenceTest`) uses 8 workers writing the same field per round across 200 rounds, with a `Barrier` for synchronization. After all rounds, it commits the AOF, recovers, and compares each field's value.
Sample output (first run):
```
DIVERGENCE round=61 field=field-0061 primary=v-0061-w02 recovered=v-0061-w04
DIVERGENCE round=72 field=field-0072 primary=v-0072-w01 recovered=v-0072-w02
DIVERGENCE round=143 field=field-0143 primary=v-0143-w02 recovered=v-0143-w00
DIVERGENCE round=194 field=field-0194 primary=v-0194-w01 recovered=v-0194-w04
DIVERGENCE round=199 field=field-0199 primary=v-0199-w03 recovered=v-0199-w02
Total rounds=200 workers=8 mismatches=5
```
Roughly 2-5% of rounds show divergence (5 of 200, or 2.5%, in the run above).
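The shape of the test described above can be modeled in Python as follows. This is only a sketch of the harness structure, not the actual test: `ModelStore`, `run`, and the dict/list stand-ins are hypothetical, and the value and field naming simply mirrors the sample output. Because of how CPython schedules threads, this model may report zero mismatches on many runs; the divergence rate quoted above comes from the real test, not from this sketch.

```python
import threading


class ModelStore:
    """Hypothetical stand-in: a dict as the primary, a list as the AOF."""

    def __init__(self):
        self.primary = {}
        self.aof = []

    def set(self, field, value):
        self.primary[field] = value      # native insert (last writer wins)
        self.aof.append((field, value))  # separate, unserialized log append

    def recover(self):
        recovered = {}
        for field, value in self.aof:    # replay in log order
            recovered[field] = value
        return recovered


def run(rounds=200, workers=8):
    store = ModelStore()
    barrier = threading.Barrier(workers)

    def worker(w):
        for r in range(rounds):
            barrier.wait()  # release all workers into the same round together
            store.set(f"field-{r:04d}", f"v-{r:04d}-w{w:02d}")

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    recovered = store.recover()
    mismatches = 0
    for r in range(rounds):
        field = f"field-{r:04d}"
        if store.primary[field] != recovered[field]:
            mismatches += 1
            print(f"DIVERGENCE round={r} field={field} "
                  f"primary={store.primary[field]} recovered={recovered[field]}")
    print(f"Total rounds={rounds} workers={workers} mismatches={mismatches}")
    return store, recovered, mismatches


if __name__ == "__main__":
    run()
```

Each round targets a fresh field, and the barrier keeps all workers in the same round, so any mismatch can only come from the ordering of writes within that round.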
Root cause
In `StorageSession.RangeIndexSet()`:
Steps 2 and 3 (the native insert and the AOF enqueue) are not atomic relative to other threads holding shared locks on the same key, so two threads can interleave their insert+log in different orders.
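To make the interleaving argument concrete, the sketch below enumerates every schedule of two writers that each perform an insert followed by a log append, and counts the schedules in which the last insert and the last log entry come from different writers. It is an abstract model, not Garnet code: the writer names `A` and `B`, the step labels, and the assumption that the primary's winner is whichever insert lands last are all illustrative.

```python
from itertools import permutations

# Each writer performs two steps: a native insert, then an AOF log append.
STEPS = [("A", "insert"), ("A", "log"), ("B", "insert"), ("B", "log")]


def valid(schedule):
    """A writer always inserts before it logs."""
    return all(
        schedule.index((w, "insert")) < schedule.index((w, "log")) for w in "AB"
    )


def diverges(schedule):
    """True when the last insert and the last log entry come from different writers."""
    insert_order = [w for w, op in schedule if op == "insert"]
    log_order = [w for w, op in schedule if op == "log"]
    return insert_order[-1] != log_order[-1]


schedules = [s for s in permutations(STEPS) if valid(s)]
bad = [s for s in schedules if diverges(s)]
print(len(schedules), len(bad))  # prints "6 2"
```

In this model, two of the six possible schedules diverge. It says nothing about how likely each schedule is in practice, which depends on timing between the insert and the enqueue.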
Possible fixes