Skip to content

fix(svm): preserve SolanaError through QuorumFallback for shouldFailImmediate codes#1443

Merged
pxrl merged 4 commits into
masterfrom
droplet/T90K0AL22-C03GHT4RV42-1779616844-595189
May 24, 2026
Merged

fix(svm): preserve SolanaError through QuorumFallback for shouldFailImmediate codes#1443
pxrl merged 4 commits into
masterfrom
droplet/T90K0AL22-C03GHT4RV42-1779616844-595189

Conversation

@droplet-rl
Copy link
Copy Markdown
Contributor

Summary

  • QuorumFallbackSolanaRpcFactory wrapped the underlying SolanaError in a plain Error at two points (final aggregation via createSendErrorWithMessage, and the quorumThreshold === 1 && shouldFailImmediate short-circuit). Both wrappers erased the SolanaError type, so callers' isSolanaError(err) checks fell through to default error handling.
  • The dataworker hit this on Solana mainnet: provider.getBlockTime(421829272) returned SVM_SLOT_SKIPPED (network-wide skipped slot, reproducible against the public RPC), but _callGetTimestampForSlotWithRetry saw a generic Error and re-threw instead of returning undefined. That broke getNearestSlotTime's slot walk-back and crashed zion-across-l2-executor-solana.
  • Both sites now rethrow the original error when shouldFailImmediate(method, error) is true. The final aggregation step still wraps with the descriptive Not enough providers succeeded... context whenever at least one rejection is a non-deterministic provider failure — that's the case where surfacing the provider URLs and successful peers is genuinely useful.

Test plan

  • New unit tests in test/SolanaQuorumFallbackProvider.ts cover: single-rejection skipped slot rethrows the SolanaError; multi-provider all-skipped rethrows the first rejection; non-shouldFailImmediate failures keep the wrapping path with cause; shouldFailImmediate codes on unrelated methods still wrap; mixed failures fall through to the wrapper; successful responses still pass through.
  • npx hardhat test test/SolanaQuorumFallbackProvider.ts test/providers/utils.test.ts test/providers/solana/utils.test.ts — 22 passing.
  • npx hardhat test test/SolanaCachedProvider.ts test/SolanaRateLimitedProvider.ts test/SolanaRetryProvider.ts — 15 passing (no regressions in the surrounding factory chain).
  • npx tsc --project tsconfig.build.json --noEmit clean. prettier --check and eslint clean.

Background

Surfaced in Slack thread; full diagnosis traced the error from the dataworker stack through SVMBlockFinder.findBlockgetNearestSlotTimegetTimestampForSlot → the QuorumFallback wrapping. Slot 421829272 was a real network-wide skipped slot — the dataworker recovers automatically on restart, but the same crash will recur whenever a skipped slot lands near the head until this lands.

🤖 Generated with Claude Code

…mmediate codes

QuorumFallbackSolanaRpcFactory wrapped the underlying SolanaError in a
plain Error via createSendErrorWithMessage on aggregation, and also wrapped
on the quorumThreshold===1 short-circuit. Both wrappers erased the
SolanaError type, so callers' isSolanaError(...) checks fell through to
default error handling.

This broke the slot walk-back in getNearestSlotTime: when a Solana RPC
returns SVM_SLOT_SKIPPED for the requested slot, _callGetTimestampForSlotWithRetry
is supposed to return undefined and let the do-while loop try the prior
slot. Instead the wrapped Error fell through `if (!isSolanaError(err))
throw err` and crashed the dataworker.

Both sites now rethrow the original error when shouldFailImmediate(method,
error) is true. The fallback short-circuit drops its own wrapper, and the
final aggregation step only wraps when at least one rejection is a non-
shouldFailImmediate failure (i.e. a real provider issue worth surfacing
with the "Not enough providers succeeded" context).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread src/providers/solana/quorumFallbackRpcFactory.ts
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@droplet-rl
Copy link
Copy Markdown
Contributor Author

Addressed pxrl's review feedback on #1443.

The comment said the comments in this PR were too verbose. I trimmed them in 403d5f7:

  • src/providers/solana/quorumFallbackRpcFactory.ts: collapsed both new block comments (the quorumThreshold === 1 && shouldFailImmediate short-circuit and the aggregation rethrow) to two-line summaries that match the existing comment style in this file.
  • test/SolanaQuorumFallbackProvider.ts: dropped four explanatory one-liners that the code already conveys (silent-logger note, helper docstring, readonly cast note, Promise.allSettled ordering note, and the .cause reachability note).

Checks: npx hardhat test test/SolanaQuorumFallbackProvider.ts → 7 passing, npx tsc --project tsconfig.build.json --noEmit clean, yarn lint-check clean. Replied in-thread on the review: #1443 (comment)

droplet-rl and others added 2 commits May 24, 2026 19:28
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirrors the src tree (src/providers/solana/quorumFallbackRpcFactory.ts) and
the convention set by test/providers/solana/utils.test.ts. The flat
test/Solana*.ts siblings predate that structure.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@pxrl pxrl merged commit 8220b7a into master May 24, 2026
4 checks passed
@pxrl pxrl deleted the droplet/T90K0AL22-C03GHT4RV42-1779616844-595189 branch May 24, 2026 19:44
@droplet-rl
Copy link
Copy Markdown
Contributor Author

pxrl approved #1443 with no inline comments, and the PR is already merged (head 0e25afa). Nothing further to address.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants