Don't spam config poller cache miss retries#2071

Open
ogtownsend wants to merge 2 commits into main from ogt/dont-spam-retry-source-chain-config-fetch-retries

@ogtownsend
Contributor

NOPs reported excessive error log spam from processSourceChainConfigResults() when a source chain's GetSourceChainConfig RPC call fails persistently (e.g., due to an RPC misconfiguration like gas limit below intrinsic gas).

Root cause:

  • When GetAllConfigsLegacy fetches source chain configs and a chain's result is a failure, processSourceChainConfigResults logs the error and skips that chain, so the chain is never added to the config poller cache.
  • In configPollerV2.GetOfframpSourceChainConfigs(), any chain not in the cache is treated as a "cache miss," which triggers an inline call to batchRefreshChainAndSourceConfigs.
  • Since the failing chain never gets cached, every subsequent call triggers another full batch fetch, producing the same error logs every time.
  • Two concurrent callers in the merkle root asyncObserver.sync() (ObserveOffRampNextSeqNums and ObserveLatestOnRampSeqNums) both hit this path on every tick, multiplying the log volume.

Fix:
pkg/reader/config_poller_v2.go:

  • Added attemptedSourceChains map to chainCache to track which source chains have been included in at least one fetch attempt.
  • GetOfframpSourceChainConfigs: A chain missing from the cache is only treated as a "cache miss" (triggering an inline fetch) if it has never been attempted. Chains that were previously attempted but failed are not re-fetched inline; the background poller (refreshAllKnownChains) still retries them on its regular schedule.
  • batchRefreshChainAndSourceConfigs: After each fetch, all requested source chain selectors are recorded in attemptedSourceChains and sourceChainRefresh is always updated, regardless of whether individual chains succeeded.

pkg/chainaccessor/config_processors.go:

  • Downgraded the "Failed to get source chain config from result" log from Errorw to Warnw. This is a per-chain partial failure that the background poller retries, so logging it at Error level creates excessive noise.

@ogtownsend ogtownsend force-pushed the ogt/dont-spam-retry-source-chain-config-fetch-retries branch from f5b6d07 to 0ca190d Compare May 15, 2026 19:41
@ogtownsend ogtownsend force-pushed the ogt/dont-spam-retry-source-chain-config-fetch-retries branch from 0ca190d to 4d05090 Compare May 15, 2026 20:21
@ogtownsend ogtownsend requested review from makramkd and winder May 15, 2026 20:23
@github-actions

Metric     ogt/dont-spam-retry-source-chain-config-fetch-retries   main
Coverage   70.1%                                                   69.7%

makramkd
makramkd previously approved these changes May 18, 2026
@ogtownsend ogtownsend marked this pull request as ready for review May 18, 2026 14:53
Copilot AI review requested due to automatic review settings May 18, 2026 14:53
@ogtownsend ogtownsend requested review from a team as code owners May 18, 2026 14:53
Comment thread pkg/reader/config_poller_v2.go Outdated
Comment on lines 305 to 310
} else if _, attempted := destChainCache.attemptedSourceChains[chain]; !attempted {
	// Only treat as missing if we haven't attempted to fetch this chain before.
	// Chains that were attempted but not returned are either unconfigured on-chain
	// or had an RPC error. Re-fetching inline on every call creates a doom-loop
	// of redundant RPC calls and error log spam. The background poller will retry.
	missingChains = append(missingChains, chain)
Contributor

question: Would it be possible for a bad config to cause the config to disappear? In that case this cache would need to have some sort of expiry to make sure we detect it's now missing.

winder
winder previously approved these changes May 18, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR reduces repeated source-chain config fetches/logs when a source-chain config read fails and is omitted from the config poller cache.

Changes:

  • Tracks attempted source-chain config fetches to avoid repeated inline cache-miss refreshes.
  • Updates config-poller tests for failed, recovered, and newly discovered source chains.
  • Downgrades per-source-chain config result failures from error to warning.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File: Description

  • pkg/reader/config_poller_v2.go: Adds attempted-source-chain tracking and updates cache refresh behavior.
  • pkg/reader/config_poller_v2_test.go: Adds tests around failed source-chain inline retry suppression and background recovery.
  • pkg/chainaccessor/config_processors.go: Downgrades source-chain config result failures from error to warning.

Comment thread pkg/reader/config_poller_v2.go Outdated
Comment on lines +467 to +468
for _, chain := range sourceChainSelectors {
	cache.attemptedSourceChains[chain] = struct{}{}

"GetAllConfigsLegacy",
mock.Anything,
destChain,
mock.MatchedBy(chainSelectorSliceMatcher(sourceChains))).
Comment on lines +81 to 83
lggr.Warnw("Failed to get source chain config from result",
	"chain", chain,
	"error", err)
@ogtownsend ogtownsend dismissed stale reviews from winder and makramkd via b3f9d94 May 19, 2026 00:55

4 participants