Skip to content

fix: determine claim syncer starting block from earliest settled reference when DB is empty#1574

Merged
joanestebanr merged 15 commits intodevelopfrom
fix/starting_claim_empty_db
Apr 15, 2026
Merged

fix: determine claim syncer starting block from earliest settled reference when DB is empty#1574
joanestebanr merged 15 commits intodevelopfrom
fix/starting_claim_empty_db

Conversation

@joanestebanr
Copy link
Copy Markdown
Collaborator

@joanestebanr joanestebanr commented Apr 10, 2026

🔄 Changes Summary

  • When the claim syncer DB is empty on startup, the starting block is now determined using the earliest block among the three settled references (bridge exit block, imported bridge exit block, FEP L2 block) instead of the latest.
  • Added EarliestBlock() method to SettledBlocks that excludes LastSettledL2BlockNum when it is 0 (not found).
  • Added GetBlockNumbersFromCertHeader() to certificateQuerier to query all three block sources independently, recording errors per source without aborting the others.
  • When the imported bridge exit block cannot be resolved from local storage, falls back to RPC log scanning (GetLatestBlockNumByGlobalIndexFromRPC) using chunked FilterLogs backwards from the finalized block.
  • Early-exit in SetClaimSyncerNextRequiredBlock when the claim syncer DB already has processed blocks (idempotent startup).
  • Removed ClaimQuerier interface and GetBoundaryBlockForClaimType method from storage, processor, and all related interfaces/mocks. Deduplication between ClaimEvent and DetailedClaimEvent is now handled entirely by the DetailedClaimEvent handler in-block, removing the need for DB lookups during event processing.

⚠️ Breaking Changes

None

📋 Config Updates

None

✅ Testing

  • 🤖 Automatic:
    • make test-unit
  • 🖱️ Manual:
    • Deploy with empty claim syncer DB against a network that has settled certificates with imported bridge exits and verify the syncer starts from the correct earliest block.

🐞 Issues

  • Closes #N/A

🔗 Related PRs

None

📝 Notes

GetLatestBlockNumByGlobalIndexFromRPC auto-detects the RPC max-range limit from the first error message and retries in backward chunks until a matching claim log is found or the full range is exhausted.

…rence when DB is empty

When the claim syncer DB is empty on startup, use the earliest block among
the settled bridge exit sources (bridge exits, imported bridge exits, FEP L2 block)
instead of the latest, falling back to RPC log scanning when the imported bridge
exit block cannot be resolved from local storage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Apr 10, 2026

Claude finished @joanestebanr's task in 2m 11s —— View job


Code Review Complete ✅


🚀 What's New

This PR improves the claim syncer startup logic when the database is empty. The key change is using the earliest block (instead of latest) among the three settled references (bridge exit, imported bridge exit, FEP L2 block) as the starting point. This ensures all historical claims are properly synced from the earliest point forward. Additionally, when the imported bridge exit block cannot be resolved from local storage, the system now falls back to RPC log scanning with automatic chunking to handle RPC max-range limits.

🐛 Bug Fixes

  • Fixed incorrect starting block selection for claim syncer when DB is empty - now uses earliest block instead of latest
  • Fixed potential startup failure when imported bridge exit block is not in local DB by adding RPC fallback

📋 Config Updates

None

⚠️ Breaking Changes

None


🔴 Required Issues

1. Potential Logic Error in EarliestBlock() when LastSettledL2BlockNum is 0

Location: aggsender/types/types.go:425-441

The EarliestBlock() method excludes LastSettledL2BlockNum when it's 0, treating it as "not found". However, this creates an inconsistency with LatestBlock() which includes the 0 value.

Problem: If LastSettledL2BlockNum is legitimately 0 (genesis block), it will be incorrectly excluded from the minimum calculation.

// Current implementation
result := min(s.LastBridgeExitBlock, s.LastImportedBridgeExitBlock)
if s.LastSettledL2BlockNum != 0 {
    result = min(result, s.LastSettledL2BlockNum)
}

Recommendation: Use the error field to determine if a value is valid, not the zero check:

func (s SettledBlocks) EarliestBlock() (uint64, error) {
	if s.LastBridgeExitBlockErr != nil {
		return 0, s.LastBridgeExitBlockErr
	}
	if s.LastImportedBridgeExitBlockErr != nil {
		return 0, s.LastImportedBridgeExitBlockErr
	}
	if s.LastSettledL2BlockNumErr != nil {
		return 0, s.LastSettledL2BlockNumErr
	}

	// All valid - use min of all three
	return min(s.LastBridgeExitBlock, s.LastImportedBridgeExitBlock, s.LastSettledL2BlockNum), nil
}

Or add a comment explaining why block 0 should be excluded if this is intentional behavior.


2. Missing Test for EarliestBlock() Method

Location: aggsender/types/types_test.go

The new EarliestBlock() method lacks dedicated unit tests. The existing tests only cover String() and LatestBlock().

Recommendation: Add comprehensive tests covering:

  • All three sources with valid blocks (verify minimum is returned)
  • Mixed zero/non-zero values
  • Error cases for each source
  • Edge case: what happens when LastSettledL2BlockNum is legitimately 0

3. Incomplete Error Handling in RPC Fallback

Location: aggsender/query/initial_block_to_claimsync_setter.go:112-114

When RPC fallback is triggered but no claim is found, the error message doesn't provide context about why the RPC fallback was needed (the original error).

if !found {
    return 0, fmt.Errorf("no claim found for global index %s", blocks.SettledImportedBridgeExit.GlobalIndex.String())
}

Recommendation: Include the original error for better debugging:

if !found {
    return 0, fmt.Errorf("no claim found for global index %s via RPC fallback (original error: %w)", 
        blocks.SettledImportedBridgeExit.GlobalIndex.String(), blocks.LastImportedBridgeExitBlockErr)
}

🟡 Suggested Improvements

4. Redundant Nil Check in GetBlockNumbersFromCertHeader

Location: aggsender/query/certificate_query.go:52-76

The method checks if cert != nil but then unconditionally accesses cert.NetworkID at line 62, which would panic if cert is nil. The logic seems incomplete.

Current:

if cert != nil {
    // 1. Latest settled bridge exit block
    result.LastBridgeExitBlock, result.LastBridgeExitBlockErr = c.getBlockNumFromLER(ctx, cert.NewLocalExitRoot)
    // ...
    
    // 2. Latest settled imported bridge exit block
    networkState, err := c.agglayerClient.GetNetworkInfo(ctx, cert.NetworkID)  // ← Would panic if cert is nil

Recommendation: Either:

  1. Make all three queries conditional on cert != nil, OR
  2. Remove the nil check entirely if cert is always expected to be non-nil

5. Inconsistent Logging Level

Location: aggsender/query/initial_block_to_claimsync_setter.go:105

When no settled certificate exists in agglayer, the code logs at Debugf level but this is actually a significant state that affects system behavior.

n.logger.Debugf("no settled certificate in agglayer, skipping setClaimSyncerFromEmptyDB. Nothing to do")

Recommendation: Use Infof or Warnf since this is important operational information:

n.logger.Infof("no settled certificate in agglayer, cannot initialize claim syncer starting block")

6. Magic Number in RPC Scan Loop

Location: claimsync/claimsync.go:341

The condition if chunkFrom == 0 is checking for the beginning of the blockchain, but this could be clearer.

Recommendation: Add a comment or constant:

const GenesisBlock = uint64(0)
// ...
if chunkFrom == GenesisBlock {
    // Reached genesis - no more blocks to scan
    break
}

7. Early Exit Optimization Not Logged

Location: aggsender/query/initial_block_to_claimsync_setter.go:50-52

The early exit when claim syncer already has processed blocks is good, but lacks visibility.

Recommendation: Log at Info level instead of Infof since this prevents unnecessary work:

if found {
    n.logger.Infof("claim syncer already has processed blocks (latest=%d), skipping initialization", claimSyncerLatestProcessedBlock)
    return nil
}

✅ Already correct in the code!


🟢 Optional Suggestions

8. Consider Caching RPC-Discovered Block Numbers

Location: aggsender/query/initial_block_to_claimsync_setter.go:113

When RPC fallback successfully finds a block number, this information is only used temporarily and not persisted. If the service restarts, it will need to scan RPC again.

Recommendation: Consider persisting this mapping (global index → block number) to a cache or database to avoid repeated expensive RPC scans.


9. Add Metrics for RPC Fallback Usage

Location: claimsync/claimsync.go:230-348

The RPC fallback mechanism is a critical path that could be slow. Adding metrics would help monitor its usage and performance.

Recommendation: Add metrics for:

  • Number of times RPC fallback is triggered
  • Time spent in RPC scanning
  • Number of chunks scanned

10. Test Coverage for RPC Chunking Logic

Location: claimsync/claimsync.go:309-347

The chunking logic that auto-detects RPC limits is complex and critical, but appears to lack dedicated unit tests.

Recommendation: Add tests covering:

  • Successful full-range scan (no chunking needed)
  • Error triggering chunking
  • Finding match in first chunk
  • Finding match in last chunk (at genesis)
  • No match found after full scan

Summary of Findings

Strengths:
✅ Well-structured error handling with per-source error tracking
✅ Comprehensive test coverage for existing functionality
✅ Good logging for debugging
✅ Smart RPC fallback with automatic chunk size detection
✅ Idempotent startup (early exit if DB already has data)

Key Concerns:
❌ Logic inconsistency in EarliestBlock() zero-value handling
❌ Missing unit tests for new EarliestBlock() method
❌ Incomplete error context in RPC fallback failure

Overall Assessment:
This is a solid improvement to the claim syncer initialization logic. The use of earliest block makes sense for ensuring complete data synchronization. The RPC fallback is well-implemented with good error handling. The main concerns are around edge cases in the EarliestBlock() method and missing test coverage. I recommend addressing the Required issues before merging.


@joanestebanr joanestebanr self-assigned this Apr 10, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts claim syncer bootstrapping when its DB is empty by deriving the starting block from the earliest available settled reference (instead of the latest), and adds supporting plumbing to query/derive those references more robustly (including an RPC log-scan fallback).

Changes:

  • Introduces SettledBlocks (+ EarliestBlock/LatestBlock) and GetBlockNumbersFromCertHeader to query three settlement sources independently with per-source errors.
  • Updates claim syncer startup logic to be idempotent and to use earliest-settled block; adds RPC fallback to resolve imported-bridge-exit block when local storage can’t.
  • Adds an optional log hook to the EVM downloader and uses it to prefer DetailedClaimEvent over ClaimEvent, plus warns/drops an outdated detailed event variant.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
sync/evmdownloader.go Adds optional LogsHookFunc and plumbing to apply it after fetching logs.
sync/evmdownloader_test.go Unit tests for the new logs hook behavior and delegation.
claimsync/types/claim_syncer.go Extends ClaimSyncer interface with RPC fallback method.
claimsync/types/mocks/mock_claim_syncer.go Updates generated mock for new ClaimSyncer method.
claimsync/mocks/mock_claim_syncer.go Updates generated mock for new ClaimSyncer method.
claimsync/downloader.go Adds outdated detailed-claim signature handling + prefer-detailed hook implementation.
claimsync/claimsync.go Installs logs hook; caps SetNextRequiredBlock; adds RPC log-scan helper to resolve block by global index.
claimsync/claimsync_get_block_rpc_test.go New unit tests covering chunked RPC scan and parsing across event variants.
aggsender/types/types.go Adds SettledBlocks struct and helper methods for earliest/latest settled block selection.
aggsender/types/types_test.go Unit tests for SettledBlocks string/earliest/latest behavior.
aggsender/types/interfaces.go Extends CertificateQuerier interface with GetBlockNumbersFromCertHeader.
aggsender/query/initial_block_to_claimsync_setter.go Makes claim syncer startup idempotent; derives starting block via earliest settled source + RPC fallback.
aggsender/query/initial_block_to_claimsync_setter_test.go Updates/adds tests for new startup/idempotency and RPC fallback behaviors.
aggsender/query/certificate_query.go Refactors settled-block derivation into GetBlockNumbersFromCertHeader; reuses it for GetLastSettledCertificateToBlock.
aggsender/query/certificate_query_test.go Adds coverage for GetBlockNumbersFromCertHeader and updates existing expectations.
aggsender/mocks/mock_certificate_querier.go Updates generated mock for new CertificateQuerier method.
aggsender/aggsender.go Reorders startup stages so claim syncer next-required block is set before initial status checks.
AGENTS.md Adds a GitHub PR template and guidance on BlockNumberFinality usage.

Comment thread aggsender/query/initial_block_to_claimsync_setter.go Outdated
Comment thread aggsender/query/initial_block_to_claimsync_setter_test.go Outdated
Comment thread claimsync/downloader.go Outdated
Comment thread aggsender/types/types.go
joanestebanr and others added 6 commits April 10, 2026 15:46
…aimEvent

Drop the ClaimQuerier interface and GetBoundaryBlockForClaimType method
from storage, processor, and all related interfaces/mocks. The DB lookup
performed during event handling to skip ClaimEvents already covered by
a DetailedClaimEvent is no longer needed; deduplication is handled by the
DetailedClaimEvent handler which removes the redundant ClaimEvent in-block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread AGENTS.md Outdated
Comment thread aggsender/query/certificate_query.go Outdated
@joanestebanr joanestebanr enabled auto-merge (squash) April 15, 2026 07:13
@joanestebanr joanestebanr disabled auto-merge April 15, 2026 07:14
@joanestebanr joanestebanr enabled auto-merge (squash) April 15, 2026 07:40
@joanestebanr joanestebanr disabled auto-merge April 15, 2026 08:04
@joanestebanr joanestebanr enabled auto-merge (squash) April 15, 2026 08:04
@sonarqubecloud
Copy link
Copy Markdown

@joanestebanr joanestebanr merged commit 7c0707d into develop Apr 15, 2026
24 of 25 checks passed
@joanestebanr joanestebanr deleted the fix/starting_claim_empty_db branch April 15, 2026 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants