Update keynote benchmark to run for 60s by default #4978
Open
joshua-spacetime wants to merge 8 commits into
1351a3a to 79fb71a
cloutiertyler pushed a commit that referenced this pull request May 14, 2026
### This PR builds on #4978

# Description of Changes

Methodology refresh of `templates/keynote-2/` with retry-policy cleanup and updated benchmark reporting.

- **Retry policy normalized for benchmark fairness.**
  - Removed outer benchmark retry loop in `src/scenario_recipes/rpc_single_call.ts` (single-attempt call path).
  - Removed Cockroach connector-side retry wrapper in `src/connectors/rpc/cockroach_rpc.ts`.
  - Kept transaction-boundary retry in RPC servers (`withTxnRetry` in pg/crdb/supabase RPC servers).
- **Per-second time-series sampling** added to `core/runner.ts`. Each run emits a `timeSeries` array for warmup/decay/collapse analysis.
- **Bench CLI** supports `--alpha 0,1.5` (CSV), `--runs N`, and `--prep-between-alphas`, writing one JSON per `(connector, alpha, run)` tuple.
- **Helper scripts added:** `start-bench.sh`, `stop-bench.sh`, `check-bench.sh`, `bench-stats.py`, `plot-bench.py`.
- **README** updated with refreshed measurements and methodology notes.
- Removed unused Docker files.

# Methodology Notes

- Benchmark client/scenario path is now single-attempt (no stacked outer retries).
- Retry handling is kept at the server transaction layer for retryable SQL transaction errors.
- Comparisons should be interpreted per topology profile (default local vs 5-node+HAProxy), and with identical runtime knobs per run (`clients`, `pipelining`, `max_pool`).
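The retry placement described in the methodology notes can be illustrated with a minimal sketch. This is not the `withTxnRetry` implementation from the RPC servers; `withTxnRetrySketch`, `isRetryableTxnError`, `RetryOptions`, and the default attempt/backoff values are hypothetical names and numbers chosen for illustration. It assumes the database driver reports the SQLSTATE as a string `code` property on thrown errors, and that the callback runs one complete transaction (begin through commit or rollback) per call.

```typescript
// Hypothetical sketch of a transaction-boundary retry; not the keynote-2 code.
// SQLSTATE 40001 is serialization_failure and 40P01 is deadlock_detected in
// PostgreSQL; CockroachDB also signals retryable transactions with 40001.
function isRetryableTxnError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  // Assumption: the driver exposes the SQLSTATE as a string `code` property.
  const code = (err as { code?: unknown }).code;
  return code === "40001" || code === "40P01";
}

// Hypothetical knobs; the real servers may use different limits or no backoff.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
}

// Runs `runTxn` (assumed to execute one full transaction) and re-runs it
// only when the failure is a retryable transaction error.
async function withTxnRetrySketch<T>(
  runTxn: () => Promise<T>,
  opts: RetryOptions = { maxAttempts: 5, baseDelayMs: 5 },
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await runTxn();
    } catch (err) {
      // Non-retryable errors and exhausted attempts propagate to the caller.
      if (!isRetryableTxnError(err) || attempt >= opts.maxAttempts) throw err;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delayMs = opts.baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With the retry living only at this layer, a benchmark client that calls the RPC once sees one logical attempt per operation, so retries do not stack and inflate or hide contention differently per connector.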
# Updated Results

## Alpha = 0

| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 lat ms | p99 lat ms |
|---|---:|---:|---:|---:|---:|---:|---:|
| SpacetimeDB | 64 | 40 | N/A | 279,024 | 4,763 | 8 | 12 |
| Node.js + SQLite | 64 | off | N/A | 3,121 | 80 | 19 | 40 |
| Node.js + Supabase | 64 | off | 64 | 7,362 | 1,179 | 6 | 18 |
| Bun + Postgres | 64 | off | 64 | 10,729 | 146 | 5 | 11 |
| Node.js + Postgres | 64 | off | 64 | 9,904 | 223 | 6 | 11 |
| Node.js + PlanetScale (SN) | 64 | off | 64 | 4,535 | 117 | 14 | 20 |
| Node.js + PlanetScale (HA) | 384 | off | 384 | 4,275 | 135 | 89 | 110 |
| Convex | 64 | off | N/A | 1,140 | 118 | 53 | 62 |
| Node.js + CockroachDB (5 node) | 320 | off | 320 | 4,253 | 561 | 71 | 120 |
| HAProxy - Node.js + CockroachDB (5 node) | 320 | off | 320 | 5,481 | 566 | 57 | 95 |

## Alpha = 1.5

| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 lat ms | p99 lat ms |
|---|---:|---:|---:|---:|---:|---:|---:|
| SpacetimeDB | 64 | 40 | N/A | 303,919 | 4,712 | 7 | 11 |
| Node.js + SQLite | 64 | off | N/A | 3,188 | 73 | 18 | 39 |
| Node.js + Supabase | 64 | off | 64 | 2,534 | 57 | 2 | 197 |
| Bun + Postgres | 64 | off | 64 | 2,772 | 61 | 7 | 13 |
| Node.js + Postgres | 64 | off | 64 | 961 | 25 | 10 | 16 |
| Node.js + PlanetScale (SN) | 64 | off | 64 | 235 | 12 | 20 | 2,504 |
| Node.js + PlanetScale (HA) | 384 | off | 384 | 248 | 13 | 416 | 10,121 |
| Convex | 64 | off | N/A | 126 | 52 | 20 | 1,081 |
| Node.js + CockroachDB (5 node) | 320 | off | 320 | 0.03 | 0.18 | 698 | 9,695 |
| HAProxy - Node.js + CockroachDB (5 node) | 64 | off | 64 | 6.87 | 9.12 | 5,943 | 9,880 |

## Alpha = 0 PIPELINED TEST

| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 lat ms | p99 lat ms |
|---|---:|---:|---:|---:|---:|---:|---:|
| Node.js + SQLite | 64 | 40 | N/A | 2,977 | 84 | 722 | 747 |
| Node.js + Supabase | 64 | 40 | 64 | 8,874 | 308 | 284 | 303 |
| Bun + Postgres | 64 | 40 | 64 | 10,184 | 120 | 250.1 | 260.5 |
| Node.js + Postgres | 64 | 40 | 64 | 9,165 | 145 | 276 | 290 |
| Node.js + PlanetScale (SN) | 64 | 40 | 64 | 4,325 | 85 | 590 | 604 |
| Node.js + PlanetScale (HA) | 384 | 40 | 384 | 3,355 | 327 | 4,354 | 4,438 |
| Convex | 64 | 40 | N/A | 1,154 | 134 | 2,119 | 2,150 |
| Node.js + CockroachDB (5 node) | 320 | 40 | 320 | 4,250 | 766 | 3,030 | 3,161 |
| HAProxy - Node.js + CockroachDB (5 node) | 320 | 40 | 320 | 5,992 | 1,765 | 2,431 | 2,562 |

# API and ABI Breaking Changes

No engine API/ABI changes. Internal keynote-template shape changes only.

- `BenchOptions.alpha: number` → `alphas: number[]`.
- New fields: `runs`, `prepBetweenAlphas`.
- `RunResult` now includes required `timeSeries`.
- Output JSON filename format is `test-1-<connector>-a<alpha>-<timestamp>.json`.

# Expected Complexity Level and Risk

**3.** Contained to `templates/keynote-2/`, but cross-cutting across runner/CLI/connectors/RPC servers/scripts/docs.

# Testing

Reviewer smoke test:

```bash
pnpm run prep
pnpm run bench --alpha 0,1.5 --connectors postgres_rpc --seconds 30 --runs 1
ls runs/test-1-postgres_rpc-a*.json   # expect 2 files
```

---------

Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
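To make the option-shape change concrete, here is a rough sketch of how a CSV `--alpha` value might expand into one planned run per `(connector, alpha, run)` tuple. Only the field names `alphas`, `runs`, and `prepBetweenAlphas` come from the change notes above; `BenchOptionsSketch`, `PlannedRun`, `parseAlphas`, `planRuns`, the `connectors` field, and the loop order are assumptions made for illustration, not the template's actual code.

```typescript
// Hypothetical option shape; only alphas, runs, and prepBetweenAlphas are
// named in the change notes. The connectors field is an assumption.
interface BenchOptionsSketch {
  connectors: string[];
  alphas: number[];
  runs: number;
  prepBetweenAlphas: boolean;
}

// Parses a CSV flag value such as "0,1.5" into numbers, rejecting empty
// or non-numeric entries instead of silently coercing them.
function parseAlphas(csv: string): number[] {
  return csv.split(",").map((raw) => {
    const trimmed = raw.trim();
    const value = Number(trimmed);
    if (trimmed === "" || !isFinite(value)) {
      throw new Error(`invalid alpha value: "${raw}"`);
    }
    return value;
  });
}

interface PlannedRun {
  connector: string;
  alpha: number;
  run: number; // 1-based run index
}

// Expands the options into one entry per (connector, alpha, run) tuple.
// The nesting order here is an assumption; the real runner may differ.
function planRuns(opts: BenchOptionsSketch): PlannedRun[] {
  const plan: PlannedRun[] = [];
  for (const connector of opts.connectors) {
    for (const alpha of opts.alphas) {
      for (let run = 1; run <= opts.runs; run++) {
        plan.push({ connector, alpha, run });
      }
    }
  }
  return plan;
}
```

Under these assumptions, the smoke-test arguments (`--alpha 0,1.5`, one connector, `--runs 1`) expand to two planned runs, which lines up with the two output files the reviewer is told to expect.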
Description of Changes
Updates `demo` and `bench` to use `--seconds 60` by default.
API and ABI breaking changes
N/A
Expected complexity level and risk
1
Testing
N/A