Skip to content

AOF: Use log addresses for single-physical-log mode and add transaction acquire-barrier#1803

Open
vazois wants to merge 20 commits into
mainfrom
vazois/mlog-txn
Open

AOF: Use log addresses for single-physical-log mode and add transaction acquire-barrier#1803
vazois wants to merge 20 commits into
mainfrom
vazois/mlog-txn

Conversation

@vazois
Copy link
Copy Markdown
Contributor

@vazois vazois commented May 14, 2026

Summary

Optimizes AOF entry format for single-physical-log mode and fixes transaction replay coordination.

Changes

Use log addresses instead of sequence numbers (single-physical-log mode)

  • Eliminates 8 bytes per AOF entry by using BasicHeader (16B) instead of AofShardedHeader (24B) for non-txn entries
  • Adds AofSingleLogTransactionHeader (50B) replacing AofTransactionHeader (58B) for txn entries
  • Threads per-entry log addresses through RecoverLogDriver and cluster replay paths
  • Multi-physical-log path remains unchanged

Add acquire-barrier for transaction lock coordination

  • When multiple replay tasks process the same transaction on different sublogs, lock acquisition was previously unsynchronized
  • Adds a rendezvous barrier before xnManager.Run() keyed on TxnStart sequence number, preventing deadlocks from conflicting lock ordering

Fix batch read consistency propagation

  • After batch validation,
    eplicaReadContext.maximumSessionSequenceNumber was not updated with the batch result, breaking session-level prefix consistency for subsequent single-key reads

Note: This is a breaking change for AOF format in single-physical-log mode.

Copilot AI review requested due to automatic review settings May 14, 2026 19:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Reduces per-entry AOF size in single-physical-log + multi-replay mode by switching from sequence-number-bearing headers to log-address-based ordering, introduces an acquire-barrier so concurrent replay tasks lock transaction keys in a consistent order, and propagates batch read context back to the session context for prefix consistency.

Changes:

  • New AofSingleLogTransactionHeader (50B) and split of TransactionHeader into SingleLogTransactionHeader/MultiLogTransactionHeader; entry log addresses threaded through replay paths in place of embedded sequence numbers for single-physical-log mode.
  • Added an acquire-barrier (keyed on TxnStart address/sequence) before txnManager.Run() in addition to the existing release-barrier on TxnCommit, plus a new TransactionGroup.StartSequenceNumber.
  • Fixed AfterConsistentReadKeyBatch to copy batch read context (maximumSessionSequenceNumber, lastVirtualSublogIdx, lastHash) back into replicaReadContext.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
libs/server/AOF/AofHeader.cs Splits transaction header enum into single/multi log variants and adds AofSingleLogTransactionHeader.
libs/server/AOF/GarnetLog.cs Refactors enqueue paths to emit BasicHeader/AofSingleLogTransactionHeader for single-physical-log mode.
libs/server/AOF/GarnetAppendOnlyFile.cs Skips creating/resetting the sequence number generator when only one physical sublog is used.
libs/server/AOF/AofProcessor.cs Adds SinglePhysicalLogPreprocessKey, header-type-aware sequence/participant extraction, threads entryAddress through replay/skip paths.
libs/server/AOF/Recover/RecoverLogDriver.cs Computes and passes entryLogAddress to processor APIs.
libs/server/AOF/ReplayCoordinator/AofReplayCoordinator.cs Adds acquire-barrier; refactors ProcessSynchronizedOperation to accept seq/participant directly; threads entry addresses through.
libs/server/AOF/ReplayCoordinator/AofReplayContext.cs Adds startSequenceNumber to transaction group creation.
libs/server/AOF/ReplayCoordinator/TransactionGroup.cs Adds StartSequenceNumber field for acquire-barrier keying.
libs/server/AOF/ReadConsistency/ReplicaReadSessionContext.cs Propagates batch read context into session context after batch validation.
libs/cluster/.../ReplicaReplayDriver.cs Computes entryLogAddress and forwards it to replay APIs / ReplayRecord.
libs/cluster/.../ReplicaReplayTask.cs Adds entryAddress to ReplayRecord and forwards it to processor calls.

Comment thread libs/server/AOF/GarnetLog.cs Outdated
Comment thread libs/server/AOF/AofProcessor.cs Outdated
Comment thread libs/server/AOF/GarnetAppendOnlyFile.cs Outdated
Comment thread libs/server/AOF/AofProcessor.cs Outdated
Comment thread libs/server/AOF/ReplayCoordinator/AofReplayCoordinator.cs
@vazois vazois marked this pull request as draft May 19, 2026 18:08
@vazois vazois force-pushed the vazois/mlog-txn branch 2 times, most recently from bcb84df to 3b40350 Compare May 19, 2026 21:41
@vazois vazois marked this pull request as ready for review May 19, 2026 21:52
@vazois vazois force-pushed the vazois/mlog-txn branch from dc55ae7 to 069fb3e Compare May 20, 2026 21:16
vazois and others added 20 commits May 21, 2026 10:15
…aReadContext

After a successful batch validation, replicaReadContext.maximumSessionSequenceNumber
was never updated with the batch result. This could break session-level prefix
consistency for subsequent single-key reads on different sublogs, as the freshness
wait in VerifyKeyFreshness would use a stale maximumSessionSequenceNumber.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When AofPhysicalSublogCount == 1 and AofReplayTaskCount > 1, use log
entry addresses instead of embedded sequence numbers for read consistency
ordering. This eliminates 8 bytes per AOF entry by using BasicHeader
(16B) instead of AofShardedHeader (24B) for non-txn entries, and
AofSingleLogTransactionHeader (50B) instead of AofTransactionHeader
(58B) for txn entries.

Key changes:
- Add AofSingleLogTransactionHeader struct and enum value
- Update all 8 GarnetLog.Enqueue methods to use lighter headers
- Add SinglePhysicalLogPreprocessKey that uses entry addresses
- Update CanReplay/SkipReplay/GetReplayTaskIdx for new header types
- Thread per-entry log addresses through RecoverLogDriver and cluster
  replay paths (ReplicaReplayDriver, ReplicaReplayTask)
- Update AofReplayCoordinator to extract ordering info from header type
- Only create seqNumGen for multi-physical-log (AofPhysicalSublogCount > 1)
- Add debug assertions for positive entry addresses

Multi-physical-log path (AofPhysicalSublogCount > 1) is unchanged.
This is a breaking change for AOF format in single-physical-log mode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rdinator

When multiple replay tasks process the same transaction on different sublogs,
lock acquisition was previously unsynchronized — only release was coordinated.
This could lead to deadlocks when overlapping transactions hold conflicting locks.

Add an acquire-barrier in ProcessTransactionGroup (replica path, multi-log mode)
so all participants rendezvous before any task calls txnManager.Run() to acquire
locks. The barrier is keyed on the TxnStart sequence number, naturally distinct
from the existing release-barrier keyed on the TxnCommit sequence number.

Changes:
- TransactionGroup: add StartSequenceNumber field from TxnStart entry
- AofReplayContext.AddTransactionGroup: accept and forward startSequenceNumber
- AddOrReplayTransactionOperation: compute start seqNum from TxnStart header
- ProcessTransactionGroup: add acquire-barrier before locking, restructure
  multi-log vs single-log paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-log path

Extract shared EnqueueBroadcastEntry helper that handles single-log,
single-physical-log, and multi-physical-log branches. Both
EnqueueDatabaseCommit and EnqueueSafeFlushAOF now delegate to it.

This fixes the multi-physical-log branch of EnqueueSafeFlushAOF which
computed physicalSublogAccessVector but never enqueued the header to
any sublog — the safe-flush marker was silently dropped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… updates

Add comment explaining why updating the sequence number before the
operation executes is correct for prefix consistency, and how atomicity
is preserved through coordinated locking (acquire/release barriers).
Also notes the alternative lock-free approach where only the commit
marker would advance the sequence number.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r types

Replace default fallthrough with explicit BasicHeader and ShardedHeader
cases, and throw GarnetException for unrecognized header types to fail
fast instead of silently misinterpreting data.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three issues prevented replaying single-log (BasicHeader) AOF entries
with multi-replay task configuration:

1. RecoverLogDriver.Consume never synchronized between batches or
   cancelled the CTS after the last batch, causing BulkConsumeAllAsync
   to spin forever and replay tasks to deadlock on resetReady.Wait().

2. CanReplay() routed keyless BasicHeader entries (TxnStart, TxnCommit,
   StoredProcedure, Checkpoint) to task 0 only, but barrier-based
   coordination requires all tasks to participate.

3. AofReplayCoordinator read ShardedLogTransactionHeader fields from
   16-byte BasicHeader entries (garbage reads) for TxnStart,
   ProcessTransactionGroup, and ReplayStoredProc.

Fixes:
- Add WaitCompleted/Release cycle in Consume between batches and after
  the final batch, then cancel CTS to exit the polling loop.
- Skip SignalCompleted() in ContinuousBackgroundReplayAsync when CTS is
  cancelled to avoid post-shutdown deadlock.
- Return true for all replay tasks on keyless BasicHeader entries in
  CanReplay() so barriers can complete.
- Add HasKey() extension on AofEntryType to classify keyed vs keyless
  entry types.
- Add explicit BasicHeader cases with all-participate fallback in
  AddOrReplayTransactionOperation, ProcessTransactionGroup, and
  ReplayStoredProc.
- Add AofUpgradeTests (4 tests) validating SL-to-SLMR upgrade for
  standalone ops, transactions, custom procs, and mixed workloads.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Entry addresses are monotonically increasing within a physical sublog,
so once an entry exceeds untilSequenceNumber, all subsequent entries
will too. Break out of the batch loop immediately instead of iterating
through the remaining entries.

- Single-task path: cancel CTS and break on first SkipReplay hit
- Multi-task path: break inner loop when sequenceNumber > threshold

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pass clusterProvider to AofProcessor during AOF recovery in both
SingleDatabaseManager and MultiDatabaseManager. Without this, replaying
stored procedures in cluster-enabled standalone nodes hit a
NullReferenceException in TransactionManager.ResetCacheSlotVerificationResult()
because the replay session had clusterEnabled=true but clusterSession=null.

Move SL-to-SLMR AOF upgrade tests from standalone project into
ClusterReplicationShardedLog.cs to test recovery under cluster conditions
with both MULTI/EXEC and stored procedure (BulkIncrementBy) paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vazois vazois force-pushed the vazois/mlog-txn branch from 6341f0c to f484a0f Compare May 21, 2026 17:16
@vazois vazois requested a review from badrishc May 21, 2026 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants