Open
78 commits
662b214
build routers
Apr 28, 2026
da2a545
remove dual write test
Apr 28, 2026
865ef84
remove split write tests
Apr 28, 2026
879dac7
incremental progress
Apr 29, 2026
6c4927f
expand route API
Apr 29, 2026
1b7c9ab
add helper method to build a route to a migration manager
Apr 29, 2026
af35c76
Merge branch 'cjl/expanded-route-api' into cjl/migration-integration
Apr 29, 2026
fa89b45
cleanup
Apr 29, 2026
75a3b7c
made suggested change
Apr 29, 2026
3710f59
Merge branch 'cjl/expanded-route-api' into cjl/migration-integration
Apr 29, 2026
e0744b9
revert lots of changes
Apr 29, 2026
94c5e57
test utilities
Apr 29, 2026
78eb3bb
incremental progress on tests
Apr 29, 2026
2a6c03e
test evm migration
Apr 29, 2026
8a303e9
determinism
Apr 29, 2026
3900541
cleanup
Apr 29, 2026
d9fefc8
test post evm migration
Apr 29, 2026
eee79af
added next test
Apr 30, 2026
0b3cc6e
all migrated but bank
Apr 30, 2026
4f140de
final test
Apr 30, 2026
175b0f5
add thread safe wrapper
Apr 30, 2026
65c31dd
various fixes
Apr 30, 2026
2fc730a
Merge branch 'main' into cjl/migration-integration
May 1, 2026
97d0acc
implement thread safe router
May 1, 2026
ecd5a92
Add router based implementation of kvstore
May 1, 2026
8c9961c
made suggested changes
May 4, 2026
d0aead6
Merge branch 'cjl/thread-safe-router' into cjl/migration-integration
May 4, 2026
ee043b8
remove unused methods
May 4, 2026
ddafa97
Merge branch 'main' into cjl/migration-integration
May 4, 2026
9449881
test framework for migration tests
May 4, 2026
34d0dbe
made suggested changes
May 5, 2026
2f50f90
Merge branch 'cjl/migration-test-framework' into cjl/migration-integr…
May 5, 2026
cc367cd
Merge branch 'main' into cjl/migration-integration
May 5, 2026
247308f
Add replacement APIs for methods we intend to deprecate.
May 6, 2026
976c934
Merge branch 'main' into cjl/replacement-apis
May 6, 2026
0939d8d
fix tests
May 6, 2026
68a9dc1
Merge branch 'main' into cjl/migration-integration
May 6, 2026
4265c58
remove ctx
May 6, 2026
17915ce
Constants for migration code
May 6, 2026
d548214
made suggested change
May 6, 2026
5c2b89d
made suggested change
May 6, 2026
5e8524a
Merge branch 'cjl/migration-constants' into cjl/migration-integration
May 6, 2026
a864d0f
fix merge issue
May 6, 2026
01e41e2
Merge branch 'main' into cjl/migration-integration
May 7, 2026
76bfdf4
Merge branch 'cjl/replacement-apis' into cjl/migration-integration
May 7, 2026
95e7e03
add routers for steady state operation
May 7, 2026
bc1842c
add duplicating router
May 7, 2026
fd86181
add router for dual write mode
May 7, 2026
e218d1f
use routers in composite.Store
May 8, 2026
4a6c07c
fix compile issues
May 11, 2026
196524d
set initial version for flatKV
May 11, 2026
15d4415
nil checks
May 11, 2026
3a3db40
properly handle context lifecycle
May 11, 2026
91f1005
get version fixes
May 11, 2026
5f9c70f
init migration store in memiavl
May 11, 2026
818aee4
fix test store names
May 11, 2026
db7efd3
fixed unit tests
May 11, 2026
82f21df
Merge branch 'main' into cjl/composite-uses-routers-2
May 13, 2026
f432243
fix merge problems
May 13, 2026
cf22acc
fix build problems
May 13, 2026
782cb75
fix failing tests
May 13, 2026
99d44c3
add passthrough router
May 13, 2026
0bc546b
fix bug
May 13, 2026
70ac6ac
fix buggy test
May 14, 2026
dfabe19
fix migration manager construction after migration complete
May 14, 2026
1758c2b
Merge branch 'main' into cjl/composite-uses-routers-2
May 14, 2026
6d56a2d
Merge branch 'main' into cjl/composite-uses-routers-2
May 18, 2026
f32f60e
minor fixes
May 18, 2026
7b0acc3
fix integration test
May 18, 2026
2ec63e0
flatkv migration testings
blindchaser May 19, 2026
3ff2452
fix upgrade module
blindchaser May 19, 2026
a26b9a0
fix migration manager iterator
blindchaser May 19, 2026
38de22c
fix SetInitialVersion
blindchaser May 19, 2026
abbf2ec
fix flatkv evm migration progress in live blocks
blindchaser May 19, 2026
ef5a623
Merge origin/main into fkv migration branch
blindchaser May 19, 2026
912149e
more logs
blindchaser May 19, 2026
c5df306
add more metrics for testing
blindchaser May 20, 2026
b3b6cc2
Merge branch 'main' into yiren/fkv-mig-int
blindchaser May 20, 2026
69 changes: 52 additions & 17 deletions .github/workflows/integration-test.yml
Original file line number Diff line number Diff line change
@@ -114,35 +114,70 @@ jobs:
  # rotation, etc.) should be appended here rather than added as
  # new matrix rows, so the CI surface stays one row per concern.
  #
- # Step ordering matters:
- # 1-2 Deploy EVM fixture and capture baseline RPC reads while
- # the cluster is still in memiavl-only mode.
- # 3 Run the offline FlatKV import on every validator and
- # restart the cluster with flatkv enabled (dual_write,
- # lattice-hash off to preserve AppHash trajectory across
- # the import boundary).
- # 4-5 Re-run the same fixture probe and physical-layout check
- # against the post-import cluster; must match step 2.
- # 6 SIGKILL one validator, restart, and assert 4-node
- # FlatKV digest equality. This exercises WAL recovery on
- # the post-import flatkv state, which mirrors the real
- # production scenario (chains reach flatkv via migration,
- # not via genesis flag).
- # 7-9 Destructive disaster-recovery scenarios. They run last
+ # The cluster boots with GIGA_STORAGE=true, so FlatKV is enabled
+ # from genesis (sc-write-mode = test_only_dual_write) and the
+ # FlatKV LtHash is folded into every block's AppHash from block 1.
+ # No mid-life migration is exercised here; an offline
+ # memiavl -> FlatKV import would change the AppHash trajectory
+ # at the import boundary and break tendermint replay.
+ #
+ # Step ordering:
+ # 1-3 Deploy an EVM fixture, smoke-check EVM RPC reads, and
+ # verify the fixture round-tripped through dual_write into
+ # physical FlatKV.
+ # 4 SIGKILL one validator, restart, and assert 4-node FlatKV
+ # digest equality. Exercises WAL recovery on a live FlatKV
+ # cluster.
+ # 5-7 Destructive disaster-recovery scenarios. They run last
+ # because they wipe or damage one validator's local state.
  name: "FlatKV Integration",
  env: "GIGA_STORAGE=true",
  scripts: [
  "docker exec sei-node-0 integration_test/contracts/deploy_flatkv_evm_fixture.sh",
  "python3 integration_test/scripts/runner.py integration_test/seidb/flatkv_evm_test.yaml",
- "./integration_test/contracts/import_flatkv_evm_cluster.sh",
- "python3 integration_test/scripts/runner.py integration_test/seidb/flatkv_evm_test.yaml",
  "docker exec sei-node-0 integration_test/contracts/verify_flatkv_evm_store.sh",
  "./integration_test/contracts/verify_flatkv_crash_recovery.sh",
  "./integration_test/contracts/verify_flatkv_statesync_crash_recovery.sh",
  "./integration_test/contracts/verify_flatkv_total_loss_recovery.sh",
  "./integration_test/contracts/verify_flatkv_partial_loss_fails_loudly.sh",
  ],
  },
+ {
+ # 0 -> 1 (MigrateEVM) cluster migration coverage.
+ #
+ # The cluster boots with GIGA_MIGRATE_FROM_MEMIAVL=true so
+ # every validator starts in sc-write-mode = memiavl_only,
+ # i.e. v0 (FlatKV not yet allocated). This is the inverse of
+ # the FlatKV Integration row above (which boots in
+ # test_only_dual_write) and is what makes the cutover
+ # script's pre-flip check meaningful: if the matrix env ever
+ # silently lands the cluster in any mode other than
+ # memiavl_only, the script will fail loudly at the pre-flip
+ # grep instead of "succeeding" with a no-op migration.
+ #
+ # Steps:
+ # 1 Deposit an EVM fixture while in v0 so the migration
+ # has real account+code+storage to drain (otherwise
+ # "migration" trivially completes against an empty
+ # tree and the test exercises nothing).
+ # 2 Coordinated stop -> sed sc-write-mode -> restart on
+ # all 4 validators, then poll seidb migrate-evm-status
+ # until every validator reports completion. Cross-
+ # validator FlatKV digest agreement is asserted at a
+ # shared post-migration height; any non-determinism in
+ # the batch copier would surface here as a digest
+ # mismatch.
+ # 3 Re-run the fixture round-trip check against the
+ # now-FlatKV-backed EVM state, confirming pre-migration
+ # data survives the cutover intact (read transparency).
+ name: "FlatKV EVM Migration 0->1",
+ env: "GIGA_MIGRATE_FROM_MEMIAVL=true",
+ scripts: [
+ "docker exec sei-node-0 integration_test/contracts/deploy_flatkv_evm_fixture.sh",
+ "./integration_test/contracts/verify_flatkv_evm_migration_0_to_1.sh",
+ "docker exec sei-node-0 integration_test/contracts/verify_flatkv_evm_store.sh",
+ ],
+ },
  {
  name: "EVM Module",
  env: "GIGA_STORAGE=true",
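The pre-flip check described in the "FlatKV EVM Migration 0->1" comments can be sketched roughly as follows. This is only an illustration: `verify_flatkv_evm_migration_0_to_1.sh` is not part of the visible diff, the function names `current_write_mode` and `flip_write_mode` are hypothetical, and the target mode `migrate_evm` is taken from the step4 script comments rather than from the cutover script itself.

```shell
# Hypothetical helpers; the real cutover script is not shown in this diff.

# current_write_mode FILE
# Print the value of sc-write-mode in FILE, or nothing if the key is absent.
current_write_mode() {
    sed -n 's/^sc-write-mode[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# flip_write_mode FILE FROM TO
# Rewrite sc-write-mode from FROM to TO. Returns non-zero (with a message
# on stderr) when the current mode is not FROM, so a cluster that booted in
# the wrong mode cannot "succeed" with a no-op migration.
flip_write_mode() {
    file=$1 from=$2 to=$3
    mode=$(current_write_mode "$file")
    if [ "$mode" != "$from" ]; then
        echo "pre-flip check failed: expected $from, found '$mode'" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Write through a temp file so the original keeps its permissions.
    sed "s/^sc-write-mode[[:space:]]*=.*/sc-write-mode = \"$to\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

A caller would run something like `flip_write_mode ~/.sei/config/app.toml memiavl_only migrate_evm` on each validator while it is stopped, and abort the test if any call fails.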
5 changes: 3 additions & 2 deletions Makefile
@@ -295,7 +295,8 @@ CLUSTER_ENV_VARS = DOCKER_PLATFORM=$(DOCKER_PLATFORM) USERID=$(shell id -u) GROU
  GIGA_OCC=$(GIGA_OCC) \
  RECEIPT_BACKEND=$(RECEIPT_BACKEND) \
  AUTOBAHN=$(AUTOBAHN) \
- GIGA_STORAGE=$(GIGA_STORAGE)
+ GIGA_STORAGE=$(GIGA_STORAGE) \
+ GIGA_MIGRATE_FROM_MEMIAVL=$(GIGA_MIGRATE_FROM_MEMIAVL)

  # Run a 4-node docker containers
  docker-cluster-start: docker-cluster-stop build-docker-node
@@ -321,7 +322,7 @@ docker-cluster-start-skipbuild: docker-cluster-stop build-docker-node
  else \
  DETACH_FLAG=""; \
  fi; \
- DOCKER_PLATFORM=$(DOCKER_PLATFORM) USERID=$(shell id -u) GROUPID=$(shell id -g) GOCACHE=$(shell go env GOCACHE) NUM_ACCOUNTS=10 SKIP_BUILD=true docker compose up $$DETACH_FLAG
+ $(CLUSTER_ENV_VARS) SKIP_BUILD=true docker compose up $$DETACH_FLAG
  .PHONY: localnet-start

  # Stop 4-node docker containers
29 changes: 18 additions & 11 deletions app/seidb.go
@@ -28,8 +28,14 @@ const (
  FlagSCHistoricalProofRateLimit = "state-commit.sc-historical-proof-rate-limit"
  FlagSCHistoricalProofBurst = "state-commit.sc-historical-proof-burst"
  FlagSCWriteMode = "state-commit.sc-write-mode"
- FlagSCReadMode = "state-commit.sc-read-mode"
- FlagSCEnableLatticeHash = "state-commit.sc-enable-lattice-hash"
+ // Per-block batch size used by the MigrationManager when sc-write-mode
+ // is one of the in-flight modes (migrate_evm, migrate_bank,
+ // migrate_all_but_bank). Optional: when unset in app.toml the field
+ // stays at DefaultStateCommitConfig().KeysToMigratePerBlock (= 1024),
+ // which is appropriate for production drains. Lowering it spreads the
+ // migration across more blocks, which is useful for tests that need to
+ // exercise the resume / hybrid-read path mid-flight.
+ FlagSCKeysToMigratePerBlock = "state-commit.sc-keys-to-migrate-per-block"

  // SS Store configs
  FlagSSEnable = "state-store.ss-enable"
@@ -111,15 +117,6 @@ func parseSCConfigs(appOpts servertypes.AppOptions) config.StateCommitConfig {
  }
  scConfig.WriteMode = parsedWM
  }
- if rm := cast.ToString(appOpts.Get(FlagSCReadMode)); rm != "" {
- parsedRM, err := config.ParseReadMode(rm)
- if err != nil {
- panic(fmt.Sprintf("invalid EVM SS read mode %q: %s", rm, err))
- }
- scConfig.ReadMode = parsedRM
- }
-
- scConfig.EnableLatticeHash = cast.ToBool(appOpts.Get(FlagSCEnableLatticeHash))

  if v := appOpts.Get(FlagSCHistoricalProofMaxInFlight); v != nil {
  scConfig.HistoricalProofMaxInFlight = cast.ToInt(v)
@@ -130,6 +127,16 @@ func parseSCConfigs(appOpts servertypes.AppOptions) config.StateCommitConfig {
  if v := appOpts.Get(FlagSCHistoricalProofBurst); v != nil {
  scConfig.HistoricalProofBurst = cast.ToInt(v)
  }
+ // Guard with v != nil so that an absent app.toml entry preserves the
+ // default of 1024 instead of clobbering it to 0, which would fail
+ // StateCommitConfig.Validate ("keys-to-migrate-per-block must be > 0")
+ // and bring the node down at startup the first time write-mode is
+ // flipped to a migration mode.
+ if v := appOpts.Get(FlagSCKeysToMigratePerBlock); v != nil {
+ if n := cast.ToInt(v); n > 0 {
+ scConfig.KeysToMigratePerBlock = n
+ }
+ }

  return scConfig
  }
2 changes: 0 additions & 2 deletions app/seidb_test.go
@@ -37,8 +37,6 @@ func (t TestSeiDBAppOpts) Get(s string) interface{} {
  return defaultSCConfig.MemIAVLConfig.SnapshotPrefetchThreshold
  case FlagSCSnapshotWriteRateMBps:
  return defaultSCConfig.MemIAVLConfig.SnapshotWriteRateMBps
- case FlagSCEnableLatticeHash:
- return defaultSCConfig.EnableLatticeHash
  case FlagSSEnable:
  return defaultSSConfig.Enable
  case FlagSSBackend:
4 changes: 4 additions & 0 deletions docker/docker-compose.yml
@@ -21,6 +21,7 @@ services:
  - RECEIPT_BACKEND
  - AUTOBAHN
  - GIGA_STORAGE
+ - GIGA_MIGRATE_FROM_MEMIAVL
  volumes:
  - "${PROJECT_HOME}:/sei-protocol/sei-chain:Z"
  - "${PROJECT_HOME}/../sei-tendermint:/sei-protocol/sei-tendermint:Z"
@@ -54,6 +55,7 @@ services:
  - RECEIPT_BACKEND
  - AUTOBAHN
  - GIGA_STORAGE
+ - GIGA_MIGRATE_FROM_MEMIAVL
  volumes:
  - "${PROJECT_HOME}:/sei-protocol/sei-chain:Z"
  - "${PROJECT_HOME}/../sei-tendermint:/sei-protocol/sei-tendermint:Z"
@@ -83,6 +85,7 @@ services:
  - RECEIPT_BACKEND
  - AUTOBAHN
  - GIGA_STORAGE
+ - GIGA_MIGRATE_FROM_MEMIAVL
  ports:
  - "26662-26664:26656-26658"
  - "9094-9095:9090-9091"
@@ -116,6 +119,7 @@ services:
  - RECEIPT_BACKEND
  - AUTOBAHN
  - GIGA_STORAGE
+ - GIGA_MIGRATE_FROM_MEMIAVL
  ports:
  - "26665-26667:26656-26658"
  - "9096-9097:9090-9091"
7 changes: 7 additions & 0 deletions docker/localnode/config/app.toml
@@ -236,6 +236,13 @@ sc-snapshot-writer-limit = 2
  # CacheSize defines the size of the LRU cache for each store on top of the tree, default to 100000.
  sc-cache-size = 1000

+ # KeysToMigratePerBlock controls how many EVM keys the in-flight migration
+ # (sc-write-mode = migrate_evm / migrate_bank / migrate_all_but_bank) drains
+ # from memiavl into flatkv per block. Default 1024 is appropriate for
+ # production drains; tests lower it to spread the migration across more
+ # blocks and exercise the resume / hybrid-read path.
+ sc-keys-to-migrate-per-block = 1024
+
  [state-store]

  # Enable defines if the state-store should be enabled for historical queries.
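To make the effect of `sc-keys-to-migrate-per-block` concrete, the sketch below estimates how many blocks a drain would span for a given batch size. It assumes the migration copies exactly the configured number of keys per block until the source is empty, which is a simplification not stated in the diff; the function name `blocks_to_drain` and the key counts are hypothetical.

```shell
# blocks_to_drain TOTAL_KEYS KEYS_PER_BLOCK
# Prints ceil(TOTAL_KEYS / KEYS_PER_BLOCK) using integer arithmetic.
# Assumption: exactly KEYS_PER_BLOCK keys are drained per block.
blocks_to_drain() {
    total=$1 per_block=$2
    if [ "$per_block" -le 0 ]; then
        # Mirrors the validation message quoted in app/seidb.go.
        echo "keys-to-migrate-per-block must be > 0" >&2
        return 1
    fi
    echo $(( (total + per_block - 1) / per_block ))
}

blocks_to_drain 10000 1024   # prints 10
blocks_to_drain 10000 64     # prints 157
```

Under that assumption, a fixture of 10000 keys finishes in about 10 blocks at the default of 1024, while lowering the batch to 64 stretches the same drain over 157 blocks, which leaves more room to exercise a mid-flight restart.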
57 changes: 37 additions & 20 deletions docker/localnode/scripts/step4_config_override.sh
@@ -5,6 +5,13 @@ GIGA_EXECUTOR=${GIGA_EXECUTOR:-false}
  GIGA_OCC=${GIGA_OCC:-false}
  AUTOBAHN=${AUTOBAHN:-false}
  GIGA_STORAGE=${GIGA_STORAGE:-false}
+ # GIGA_MIGRATE_FROM_MEMIAVL=true boots the cluster in v0 (memiavl_only):
+ # memiavl is the sole SC backend, FlatKV is not allocated. This is the
+ # starting point for the 0->1 (MigrateEVM) cluster test, which drives a
+ # real workload in this mode and then performs a coordinated stop/flip/
+ # restart into migrate_evm. Mutually exclusive with GIGA_STORAGE=true;
+ # the script picks the more specific override if both are set.
+ GIGA_MIGRATE_FROM_MEMIAVL=${GIGA_MIGRATE_FROM_MEMIAVL:-false}

  APP_CONFIG_FILE="build/generated/node_$NODE_ID/app.toml"
  TENDERMINT_CONFIG_FILE="build/generated/node_$NODE_ID/config.toml"
@@ -23,37 +30,47 @@ sed -i.bak -e "s|^snapshot-directory *=.*|snapshot-directory = \"./build/generat
  # Enable slow mode
  sed -i.bak -e 's/slow = .*/slow = true/' ~/.sei/config/app.toml

+ # Boot the cluster in v0 (memiavl_only) for the 0->1 migration test.
+ # Doing this here keeps the override surface narrow: the test runner
+ # only has to set one env var to ship a v0-shaped config, and the
+ # follow-up flip script just rewrites sc-write-mode in place during the
+ # coordinated stop.
+ if [ "$GIGA_MIGRATE_FROM_MEMIAVL" = "true" ]; then
+ echo "Booting node $NODE_ID in memiavl_only mode (0->1 migration starting point)..."
+ if grep -q '^sc-write-mode[[:space:]]*=' ~/.sei/config/app.toml; then
+ sed -i 's/^sc-write-mode[[:space:]]*=.*/sc-write-mode = "memiavl_only"/' ~/.sei/config/app.toml
+ else
+ sed -i '/^\[state-store\]/i sc-write-mode = "memiavl_only"' ~/.sei/config/app.toml
+ fi
+ # The EVM SS split is irrelevant in this mode (flatkv is not allocated),
+ # but explicitly disabling it keeps app.toml self-describing in case an
+ # operator inspects it post-flip.
+ sed -i 's/^evm-ss-split[[:space:]]*=.*/evm-ss-split = false/' ~/.sei/config/app.toml
+ fi
+
  # Enable Giga Storage: FlatKV SC dual-write + EVM SS split.
  # When GIGA_STORAGE=true we also default the receipt backend to parquet; callers
  # can still override this by setting RECEIPT_BACKEND explicitly.
  # Set GIGA_STORAGE=false to disable.
- if [ "$GIGA_STORAGE" = "true" ]; then
+ # GIGA_MIGRATE_FROM_MEMIAVL takes precedence: if both are set, the memiavl-only
+ # block above ran first and the test runner is responsible for the cutover.
+ if [ "$GIGA_STORAGE" = "true" ] && [ "$GIGA_MIGRATE_FROM_MEMIAVL" != "true" ]; then
  RECEIPT_BACKEND=${RECEIPT_BACKEND:-parquet}
  echo "Enabling Giga Storage for node $NODE_ID..."

- # --- SC layer: dual_write + split_read + lattice hash ---
- # SC must use dual_write (not split_write) because block execution reads
- # EVM data from the memiavl tree via GetChildStoreByName. With split_write,
- # EVM data only goes to FlatKV and the memiavl tree becomes stale.
- # dual_write keeps memiavl up-to-date for reads while also populating FlatKV.
- if grep -q "sc-write-mode" ~/.sei/config/app.toml; then
- sed -i 's/sc-write-mode = .*/sc-write-mode = "dual_write"/' ~/.sei/config/app.toml
- else
- sed -i '/^\[state-store\]/i sc-write-mode = "dual_write"' ~/.sei/config/app.toml
- fi
- if grep -q "sc-read-mode" ~/.sei/config/app.toml; then
- sed -i 's/sc-read-mode = .*/sc-read-mode = "split_read"/' ~/.sei/config/app.toml
- else
- sed -i '/^\[state-store\]/i sc-read-mode = "split_read"' ~/.sei/config/app.toml
- fi
- if grep -q "sc-enable-lattice-hash" ~/.sei/config/app.toml; then
- sed -i 's/sc-enable-lattice-hash = .*/sc-enable-lattice-hash = true/' ~/.sei/config/app.toml
+ # --- SC layer: test_only_dual_write ---
+ # SC must use test_only_dual_write because block execution reads EVM data
+ # from the memiavl tree via GetChildStoreByName. dual-write keeps memiavl
+ # up-to-date for reads while also populating FlatKV. This mode is for test
+ # clusters only; never deploy to testnet/mainnet.
+ if grep -q '^sc-write-mode[[:space:]]*=' ~/.sei/config/app.toml; then
+ sed -i 's/^sc-write-mode[[:space:]]*=.*/sc-write-mode = "test_only_dual_write"/' ~/.sei/config/app.toml
  else
- sed -i '/^\[state-store\]/i sc-enable-lattice-hash = true' ~/.sei/config/app.toml
+ sed -i '/^\[state-store\]/i sc-write-mode = "test_only_dual_write"' ~/.sei/config/app.toml
  fi

  # --- SS layer: enable EVM split ---
- sed -i 's/evm-ss-split = .*/evm-ss-split = true/' ~/.sei/config/app.toml
+ sed -i 's/^evm-ss-split[[:space:]]*=.*/evm-ss-split = true/' ~/.sei/config/app.toml
  fi

  # Enable Giga Executor (evmone-based) if requested
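The precedence rule in step4_config_override.sh (GIGA_MIGRATE_FROM_MEMIAVL wins over GIGA_STORAGE when both are set) can be condensed into a single decision function. This is an illustrative sketch, not code from the PR: the function name `resolve_sc_write_mode` is hypothetical, and the script itself applies the overrides inline with `sed` rather than through a helper.

```shell
# resolve_sc_write_mode MIGRATE_FROM_MEMIAVL GIGA_STORAGE
# Prints the sc-write-mode override implied by the two flags, or nothing
# when neither is "true" (meaning app.toml is left untouched).
resolve_sc_write_mode() {
    migrate=${1:-false}
    storage=${2:-false}
    if [ "$migrate" = "true" ]; then
        # The more specific override takes precedence over GIGA_STORAGE.
        echo "memiavl_only"
    elif [ "$storage" = "true" ]; then
        echo "test_only_dual_write"
    fi
}

# Typical call site, using the same defaults as the script.
resolve_sc_write_mode "${GIGA_MIGRATE_FROM_MEMIAVL:-false}" "${GIGA_STORAGE:-false}"
```

Writing the decision as one function makes the mutually exclusive cases easy to enumerate: (true, true) and (true, false) both yield `memiavl_only`, (false, true) yields `test_only_dual_write`, and (false, false) prints nothing.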
16 changes: 3 additions & 13 deletions docker/rpcnode/scripts/step1_configure_init.sh
@@ -24,21 +24,11 @@ if [ "$GIGA_STORAGE" = "true" ]; then
  RECEIPT_BACKEND=${RECEIPT_BACKEND:-parquet}
  echo "Enabling Giga Storage for RPC node..."

- # SC layer: must match validators (dual_write + split_read + lattice hash)
+ # SC layer: must match validators (test_only_dual_write)
  if grep -q "sc-write-mode" ~/.sei/config/app.toml; then
- sed -i 's/sc-write-mode = .*/sc-write-mode = "dual_write"/' ~/.sei/config/app.toml
+ sed -i 's/sc-write-mode = .*/sc-write-mode = "test_only_dual_write"/' ~/.sei/config/app.toml
  else
- sed -i '/^\[state-store\]/i sc-write-mode = "dual_write"' ~/.sei/config/app.toml
- fi
- if grep -q "sc-read-mode" ~/.sei/config/app.toml; then
- sed -i 's/sc-read-mode = .*/sc-read-mode = "split_read"/' ~/.sei/config/app.toml
- else
- sed -i '/^\[state-store\]/i sc-read-mode = "split_read"' ~/.sei/config/app.toml
- fi
- if grep -q "sc-enable-lattice-hash" ~/.sei/config/app.toml; then
- sed -i 's/sc-enable-lattice-hash = .*/sc-enable-lattice-hash = true/' ~/.sei/config/app.toml
- else
- sed -i '/^\[state-store\]/i sc-enable-lattice-hash = true' ~/.sei/config/app.toml
+ sed -i '/^\[state-store\]/i sc-write-mode = "test_only_dual_write"' ~/.sei/config/app.toml
  fi

  # SS layer: enable EVM split
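Both node scripts follow the same "replace the key if present, otherwise insert it before `[state-store]`" pattern. A minimal sketch of that pattern as a reusable function is shown below. It is an assumption-laden illustration: the name `upsert_sc_write_mode` is hypothetical, it uses the line-anchored match from step4_config_override.sh (so a comment that merely mentions the key is not treated as the setting), and it swaps the scripts' GNU `sed -i` calls for `awk` plus a temp file to stay portable.

```shell
# upsert_sc_write_mode FILE MODE
# Replace an existing sc-write-mode line, or insert one just before the
# [state-store] header when the key is absent.
upsert_sc_write_mode() {
    file=$1 mode=$2
    tmp=$(mktemp) || return 1
    if grep -q '^sc-write-mode[[:space:]]*=' "$file"; then
        # Key present: rewrite its value in place.
        sed "s/^sc-write-mode[[:space:]]*=.*/sc-write-mode = \"$mode\"/" "$file" > "$tmp"
    else
        # Key absent: emit it once, immediately before [state-store].
        awk -v line="sc-write-mode = \"$mode\"" '
            /^\[state-store\]/ && !done { print line; done = 1 }
            { print }
        ' "$file" > "$tmp"
    fi
    # Copy back through the temp file so FILE keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Calling `upsert_sc_write_mode ~/.sei/config/app.toml test_only_dual_write` twice leaves exactly one `sc-write-mode` line, so the operation is idempotent.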