feat(chains): opt-in no-finality flag for chains without finalized block tag #191

Open
robbeverhelst wants to merge 1 commit into drpcorg:main from settlemint:feat/no-finality-chains
@robbeverhelst

Summary

Adds an opt-in no-finality: true flag on a chain's settings block. When enabled, NodeCore skips finalized-block polling and lets the chain supervisor promote upstreams to Available using head data alone. Default is false, so every existing chain in pkg/chains/public/chains.yaml keeps current behavior.

Why

Some EVM-compatible networks do not expose a finalized block: Hyperledger Besu QBFT, classic Clique / PoA, consortium chains. Calling eth_getBlockByNumber("finalized", ...) against them returns -39001: Unknown block every poll cycle.
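To make the failure concrete, here is a minimal sketch of how a client could recognize that response. The `rpcResponse` type and the `finalizedUnsupported` helper are hypothetical names for illustration and are not part of NodeCore; the only detail taken from this PR is the `-39001` error code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcResponse models only the JSON-RPC 2.0 response fields needed here.
type rpcResponse struct {
	Result json.RawMessage `json:"result"`
	Error  *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// finalizedUnsupported is a hypothetical helper: it reports whether a reply to
// eth_getBlockByNumber("finalized", ...) carries the -39001 error quoted above.
func finalizedUnsupported(body []byte) (bool, error) {
	var resp rpcResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	return resp.Error != nil && resp.Error.Code == -39001, nil
}

func main() {
	body := []byte(`{"jsonrpc":"2.0","id":1,"error":{"code":-39001,"message":"Unknown block"}}`)
	unsupported, err := finalizedUnsupported(body)
	fmt.Println(unsupported, err)
}
```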

On those chains today, NodeCore's upstream lifecycle never promotes upstreams to Available. The chain supervisor only stores an entry in upstreamStates when an upstream publishes a StateUpstreamEvent, and for non-finality chains that publication never happens:

  • block_processor.poll(FinalizedBlock) logs "couldn't detect finalized block of upstream X" and disableDetection.Add(FinalizedBlock) stops further polling, but no BlockUpstreamStateEvent is ever published for FinalizedBlock.
  • HealthValidator emits StatusUpstreamStateEvent{Status: Available}, which dedups against the upstream's initial Available state in processStateEvents (the default branch hits stateEvent.Same(state) and continues), so no StateUpstreamEvent is published either.

As a result, the chain supervisor's upstreamStates map stays empty (statuses=[] in chain_supervisor.go:317), and every request past eth_chainId returns {"error":"no available upstreams to process a request","code":1}.
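The dedup path described above can be reproduced in a small simulation. This is a simplified model, not NodeCore's code: the `supervisor` and `upstream` types and their methods are hypothetical stand-ins, and the only behavior carried over is "an event equal to the current state is dropped".

```go
package main

import "fmt"

type status int

const (
	available status = iota
	unavailable
)

// supervisor stands in for the chain supervisor: it learns about an upstream
// only when a state event is published to it.
type supervisor struct {
	upstreamStates map[string]status
}

func (s *supervisor) onStateEvent(id string, st status) {
	s.upstreamStates[id] = st
}

// upstream starts in the Available state and drops status events that match
// its current state, mirroring the Same(...) check described above.
type upstream struct {
	id    string
	state status
}

func (u *upstream) processStatus(next status, sup *supervisor) {
	if next == u.state {
		return // deduplicated: nothing is published
	}
	u.state = next
	sup.onStateEvent(u.id, next)
}

func main() {
	sup := &supervisor{upstreamStates: map[string]status{}}
	up := &upstream{id: "besu-1", state: available}

	// The health validator reports Available, which equals the initial state.
	up.processStatus(available, sup)
	fmt.Println(len(sup.upstreamStates)) // the supervisor still knows no upstreams
}
```

In this model the supervisor only hears about the upstream if its status actually changes, which matches the observed statuses=[] on a healthy no-finality chain.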

Three existing knobs were tried in a live cluster, and none of them populated the supervisor's upstreamStates map: upstream-config.integrity.enabled: false, chain-defaults.<chain>.options.disable-chain-validation: true, and chain-defaults.<chain>.options.disable-validation: true (the nuclear option). All three left statuses=[]. That is consistent with the source: they gate validation paths, but state publication for non-finality chains never happens regardless of validation.

How

pkg/chains/chains.go

Adds NoFinality bool yaml:"no-finality" to the per-chain Settings struct. The existing deepMerge + yaml.Unmarshal flow in configureChains picks it up automatically from any chain's settings: block, so the embedded chains.yaml and any future extra-chains loader behave identically.

A small helper exposes the flag:

// IsNoFinalityChain reports whether the given chain is marked as having no
// finalized block tag in its chain-settings (e.g. Besu QBFT, classic PoA).
func IsNoFinalityChain(chain Chain) bool { ... }
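For readers without the diff open, the following self-contained sketch shows how such a helper could sit on top of a chain lookup. Everything around the helper is assumed for illustration: the `Chain` constants, the `configured` map, and this `GetChain` are simplified stand-ins, and `PrivateBesu` is a hypothetical chain, not an entry in chains.yaml.

```go
package main

import "fmt"

// Chain is a simplified stand-in for the real chain identifier type.
type Chain int

const (
	UnknownChain Chain = iota
	Ethereum
	PrivateBesu // hypothetical no-finality chain, for illustration only
)

func (c Chain) String() string {
	switch c {
	case Ethereum:
		return "ethereum"
	case PrivateBesu:
		return "private-besu"
	default:
		return "unknown"
	}
}

// Settings carries the new flag; the zero value keeps the default of false.
type Settings struct {
	NoFinality bool `yaml:"no-finality"`
}

type ConfiguredChain struct {
	Settings Settings
}

// configured is an assumed in-memory registry standing in for the parsed chains.yaml.
var configured = map[string]*ConfiguredChain{
	"ethereum":     {},
	"private-besu": {Settings: Settings{NoFinality: true}},
}

// GetChain returns an empty configuration for names it does not know.
func GetChain(name string) *ConfiguredChain {
	if c, ok := configured[name]; ok {
		return c
	}
	return &ConfiguredChain{}
}

// IsNoFinalityChain is a thin wrapper over the chain's Settings.NoFinality.
func IsNoFinalityChain(chain Chain) bool {
	return GetChain(chain.String()).Settings.NoFinality
}

func main() {
	fmt.Println(IsNoFinalityChain(Ethereum), IsNoFinalityChain(UnknownChain), IsNoFinalityChain(PrivateBesu))
}
```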

internal/upstreams/blocks/block_processor.go

Takes a noFinality bool argument in NewEthLikeBlockProcessor. In Start(), skips both the initial b.poll(FinalizedBlock) and the ticker-driven follow-ups when noFinality is true, which stops the error-log spam and the useless RPC calls on chains where finalized never resolves.
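A rough sketch of that gating, under simplifying assumptions: the `processor` type below is hypothetical, it records polls instead of sending RPC requests, and a plain channel replaces the real ticker so the example stays deterministic.

```go
package main

import "fmt"

type blockType string

const finalizedBlock blockType = "finalized"

// processor is a simplified stand-in for the block processor.
type processor struct {
	noFinality bool
	polled     []blockType
}

// poll records the request instead of calling an upstream.
func (p *processor) poll(bt blockType) {
	p.polled = append(p.polled, bt)
}

// start performs the initial finalized poll and one more per tick, unless the
// chain is marked no-finality, in which case it never polls at all.
func (p *processor) start(ticks <-chan struct{}) {
	if p.noFinality {
		return
	}
	p.poll(finalizedBlock)
	for range ticks {
		p.poll(finalizedBlock)
	}
}

// threeTicks returns a closed channel holding three ticks.
func threeTicks() <-chan struct{} {
	ch := make(chan struct{}, 3)
	for i := 0; i < 3; i++ {
		ch <- struct{}{}
	}
	close(ch)
	return ch
}

func main() {
	regular := &processor{}
	regular.start(threeTicks())

	optedIn := &processor{noFinality: true}
	optedIn.start(threeTicks())

	fmt.Println(len(regular.polled), len(optedIn.polled)) // 4 0
}
```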

internal/upstreams/upstream_events.go

In processStateEvents, on the first HeadUpstreamStateEvent for a no-finality chain (when validUpstream is already true), publishes a piggyback StateUpstreamEvent so the chain supervisor's upstreamStates map gets populated. Subsequent head events continue to publish only HeadUpstreamEvent as today, and a local stateBroadcast bool ensures the piggyback fires exactly once per upstream lifetime.
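The once-only piggyback can be sketched as follows. This is an illustrative model rather than the code in the diff: events are recorded as strings, `onHead` is a hypothetical stand-in for the head branch of processStateEvents, and the relative order of the two publications is an assumption.

```go
package main

import "fmt"

// upstream is a simplified stand-in that records what it would publish.
type upstream struct {
	noFinality     bool
	validUpstream  bool
	stateBroadcast bool // set after the one-time piggyback has fired
	published      []string
}

// onHead handles a head update. On a no-finality chain with a valid upstream,
// the first head also publishes a state event so the supervisor learns about
// the upstream. Publishing the state event first is an assumption of this sketch.
func (u *upstream) onHead(height uint64) {
	if u.noFinality && u.validUpstream && !u.stateBroadcast {
		u.stateBroadcast = true
		u.published = append(u.published, "StateUpstreamEvent")
	}
	u.published = append(u.published, fmt.Sprintf("HeadUpstreamEvent(%d)", height))
}

func main() {
	up := &upstream{noFinality: true, validUpstream: true}
	for h := uint64(100); h < 103; h++ {
		up.onHead(h)
	}
	fmt.Println(up.published)
}
```

Three head updates produce one state event followed by three head events; with noFinality left false, only the head events appear.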

internal/upstreams/upstream_factory.go + upstream_processors.go

createBlockProcessor now receives the full *chains.ConfiguredChain instead of just chains.BlockchainType so it can read configuredChain.Settings.NoFinality. Small refactor with one caller; the original BlockchainType parameter became configuredChain.Type at the switch site.
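The shape of that refactor, sketched with hypothetical simplified types (the real `ConfiguredChain`, `BlockchainType` values, and processor constructor differ): the factory switches on the chain's Type and forwards Settings.NoFinality to the processor it builds.

```go
package main

import "fmt"

type BlockchainType int

const (
	Ethereum BlockchainType = iota
	Other // placeholder for non-EVM chain types in this sketch
)

type Settings struct {
	NoFinality bool
}

// ConfiguredChain is a simplified stand-in carrying only the fields used here.
type ConfiguredChain struct {
	Type     BlockchainType
	Settings Settings
}

type blockProcessor struct {
	noFinality bool
}

// createBlockProcessor takes the whole configured chain so it can read both
// the type (for the switch) and the no-finality setting.
func createBlockProcessor(chain *ConfiguredChain) *blockProcessor {
	switch chain.Type {
	case Ethereum:
		return &blockProcessor{noFinality: chain.Settings.NoFinality}
	default:
		return nil // other chain types are out of scope for this sketch
	}
}

func main() {
	besu := &ConfiguredChain{Type: Ethereum, Settings: Settings{NoFinality: true}}
	fmt.Println(createBlockProcessor(besu).noFinality)
}
```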

Cache policies

Cache policies that key on finalization-type: finalized simply do not cache on no-finality chains because no finalized event ever fires. The docs section calls this out and recommends a TTL policy for no-finality chains. No code change in cache_processor is needed for the opt-in to work safely.
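To illustrate why a finalized-keyed policy stays idle while a TTL policy still works, here is a toy decision function. It is not the logic in cache_processor; the `policy` and `chainView` types and `shouldCache` are hypothetical names for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// policy is a toy cache policy: either "finalized blocks only" or TTL-based.
type policy struct {
	finalizedOnly bool
	ttl           time.Duration
}

// chainView is what the cache knows about the chain's finalized height.
// On a no-finality chain no finalized event fires, so finalizedKnown stays false.
type chainView struct {
	finalizedKnown bool
	finalized      uint64
}

// shouldCache decides whether a response for the given block may be cached.
func shouldCache(p policy, v chainView, block uint64) bool {
	if p.finalizedOnly {
		return v.finalizedKnown && block <= v.finalized
	}
	return p.ttl > 0
}

func main() {
	noFinalityChain := chainView{} // finalized height never becomes known

	fmt.Println(shouldCache(policy{finalizedOnly: true}, noFinalityChain, 100))      // false
	fmt.Println(shouldCache(policy{ttl: 10 * time.Second}, noFinalityChain, 100)) // true
}
```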

Production validation

This was developed in response to a real deployment of Hyperledger Besu QBFT (DALP staging, chain id 0xbb1a, two upstream nodes). Before the patch the chain supervisor stayed at statuses=[] indefinitely and every consumer call past eth_chainId returned no available upstreams to process a request. After deploying a build with this patch:

  • Extra chain definitions loaded at startup, with no "couldn't detect finalized block" errors thereafter.
  • State of SETTLEMINT-BESU: height=N, statuses=[AVAILABLE/2] within ~30s of pod start.
  • eth_blockNumber, eth_getBlockByNumber, eth_getLogs, trace_block, trace_replayBlockTransactions, debug_traceBlockByNumber all route correctly.
  • Full consumer stack (Blockscout, dapp, dapi, indexer, workflows) serves production traffic normally.

Tests

  • internal/upstreams/blocks/block_processor_test.go::TestEthLikeBlockProcessorSkipsFinalizedPollWhenNoFinality asserts the processor never calls SendRequest and never adds FinalizedBlock to disableDetection when noFinality: true.
  • pkg/chains/chains_test.go::TestIsNoFinalityChain_* covers three cases: defaults false for embedded chains, returns false for UnknownChain, returns true when Settings.NoFinality is set.
  • All existing tests pass and go vet ./... is clean.

Docs

New section "Chains without a finalized block tag" in docs/nodecore/05-upstream-config.md with the rationale, the failure mode, a minimal chain entry example, and a note on cache policy implications.

The constructor signature for blocks.NewEthLikeBlockProcessor gained a noFinality bool parameter, which is an internal API with only one caller (createBlockProcessor), and the existing tests have been updated to pass false. No new dependencies have been introduced, and there are no changes to public RPC behavior on finality-having chains.

Notes for reviewers

  • The IsNoFinalityChain helper is a thin convenience on top of GetChain(chain.String()).Settings.NoFinality, and I am happy to inline it if maintainers prefer.
  • The piggyback StateUpstreamEvent fires once per upstream lifetime (gated by stateBroadcast), so the cost is one extra publish per upstream startup on opt-in chains.
  • I went with a Settings-level flag (per-chain, declared in chains.yaml) rather than a per-upstream option, because finality is a property of the consensus protocol, not the operator's deployment. I am open to feedback if you'd rather have it as a chain-defaults option for operator override.

…ock tag

Some EVM-compatible networks do not expose a finalized block: Hyperledger
Besu QBFT, classic Clique / PoA, consortium chains. Calling
eth_getBlockByNumber("finalized") against them returns -39001 "Unknown
block" forever. NodeCore's upstream lifecycle keeps such an upstream out
of the chain supervisor's upstreamStates map because no
BlockUpstreamStateEvent for FinalizedBlock ever fires and
StatusUpstreamStateEvent{Status: Available} from the health validator
dedups against the upstream's initial Available state. Result: the
router answers "no available upstreams to process a request" to every
call past eth_chainId.

This commit adds an opt-in no-finality flag on the chain Settings
struct. When set:

- block_processor skips the finalized poll loop entirely (no log spam,
  no wasted RPC calls);
- upstream piggybacks a StateUpstreamEvent on the first head
  publication so the chain supervisor learns the upstream from head
  data alone.

Default is false; existing chains are unaffected. Configured per-chain
in chain-settings.protocols[].chains[].settings.no-finality.

Validated in production against a real Besu QBFT chain (DALP staging,
chain-id 0xbb1a, two upstream nodes). Before the patch: supervisor
log "State of ...: height=N, statuses=[]" forever, every consumer
RPC fails with "no available upstreams". After: "statuses=[AVAILABLE/2]"
within ~30s of pod start, the full Blockscout / dapp / indexer stack
serves traffic normally.

Tests cover:
- block processor skips SendRequest entirely when noFinality
- IsNoFinalityChain reads Settings.NoFinality correctly
- defaults to false for embedded chains and UnknownChain

Docs section added to docs/nodecore/05-upstream-config.md.