
Commit 16a2a8f

Keynote 2 benchmark updates & refinements (#4997)
### This PR builds on #4978

# Description of Changes

Methodology refresh of `templates/keynote-2/` with retry-policy cleanup and updated benchmark reporting.

- **Retry policy normalized for benchmark fairness.**
  - Removed outer benchmark retry loop in `src/scenario_recipes/rpc_single_call.ts` (single-attempt call path).
  - Removed Cockroach connector-side retry wrapper in `src/connectors/rpc/cockroach_rpc.ts`.
  - Kept transaction-boundary retry in RPC servers (`withTxnRetry` in pg/crdb/supabase RPC servers).
- **Per-second time-series sampling** added to `core/runner.ts`. Each run emits a `timeSeries` array for warmup/decay/collapse analysis.
- **Bench CLI** supports `--alpha 0,1.5` (CSV), `--runs N`, and `--prep-between-alphas`, writing one JSON per `(connector, alpha, run)` tuple.
- **Helper scripts added:** `start-bench.sh`, `stop-bench.sh`, `check-bench.sh`, `bench-stats.py`, `plot-bench.py`.
- **README** updated with refreshed measurements and methodology notes.
- Removed unused Docker files.

# Methodology Notes

- Benchmark client/scenario path is now single-attempt (no stacked outer retries).
- Retry handling is kept at the server transaction layer for retryable SQL transaction errors.
- Comparisons should be interpreted per topology profile (default local vs 5-node+HAProxy), and with identical runtime knobs per run (`clients`, `pipelining`, `max_pool`).

# Updated Results

## Alpha = 0

| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 lat ms | p99 lat ms |
|---|---:|---:|---:|---:|---:|---:|---:|
| SpacetimeDB | 64 | 40 | N/A | 279,024 | 4,763 | 8 | 12 |
| Node.js + SQLite | 64 | off | N/A | 3,121 | 80 | 19 | 40 |
| Node.js + Supabase | 64 | off | 64 | 7,362 | 1,179 | 6 | 18 |
| Bun + Postgres | 64 | off | 64 | 10,729 | 146 | 5 | 11 |
| Node.js + Postgres | 64 | off | 64 | 9,904 | 223 | 6 | 11 |
| Node.js + PlanetScale (SN) | 64 | off | 64 | 4,535 | 117 | 14 | 20 |
| Node.js + PlanetScale (HA) | 384 | off | 384 | 4,275 | 135 | 89 | 110 |
| Convex | 64 | off | N/A | 1,140 | 118 | 53 | 62 |
| Node.js + CockroachDB (5 node) | 320 | off | 320 | 4,253 | 561 | 71 | 120 |
| HAProxy - Node.js + CockroachDB (5 node) | 320 | off | 320 | 5,481 | 566 | 57 | 95 |

## Alpha = 1.5

| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 lat ms | p99 lat ms |
|---|---:|---:|---:|---:|---:|---:|---:|
| SpacetimeDB | 64 | 40 | N/A | 303,919 | 4,712 | 7 | 11 |
| Node.js + SQLite | 64 | off | N/A | 3,188 | 73 | 18 | 39 |
| Node.js + Supabase | 64 | off | 64 | 2,534 | 57 | 2 | 197 |
| Bun + Postgres | 64 | off | 64 | 2,772 | 61 | 7 | 13 |
| Node.js + Postgres | 64 | off | 64 | 961 | 25 | 10 | 16 |
| Node.js + PlanetScale (SN) | 64 | off | 64 | 235 | 12 | 20 | 2,504 |
| Node.js + PlanetScale (HA) | 384 | off | 384 | 248 | 13 | 416 | 10,121 |
| Convex | 64 | off | N/A | 126 | 52 | 20 | 1,081 |
| Node.js + CockroachDB (5 node) | 320 | off | 320 | 0.03 | 0.18 | 698 | 9,695 |
| HAProxy - Node.js + CockroachDB (5 node) | 64 | off | 64 | 6.87 | 9.12 | 5,943 | 9,880 |

## Alpha = 0 PIPELINED TEST

| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 lat ms | p99 lat ms |
|---|---:|---:|---:|---:|---:|---:|---:|
| Node.js + SQLite | 64 | 40 | N/A | 2,977 | 84 | 722 | 747 |
| Node.js + Supabase | 64 | 40 | 64 | 8,874 | 308 | 284 | 303 |
| Bun + Postgres | 64 | 40 | 64 | 10,184 | 120 | 250.1 | 260.5 |
| Node.js + Postgres | 64 | 40 | 64 | 9,165 | 145 | 276 | 290 |
| Node.js + PlanetScale (SN) | 64 | 40 | 64 | 4,325 | 85 | 590 | 604 |
| Node.js + PlanetScale (HA) | 384 | 40 | 384 | 3,355 | 327 | 4,354 | 4,438 |
| Convex | 64 | 40 | N/A | 1,154 | 134 | 2,119 | 2,150 |
| Node.js + CockroachDB (5 node) | 320 | 40 | 320 | 4,250 | 766 | 3,030 | 3,161 |
| HAProxy - Node.js + CockroachDB (5 node) | 320 | 40 | 320 | 5,992 | 1,765 | 2,431 | 2,562 |

# API and ABI Breaking Changes

No engine API/ABI changes. Internal keynote-template shape changes only.

- `BenchOptions.alpha: number` → `alphas: number[]`.
- New fields: `runs`, `prepBetweenAlphas`.
- `RunResult` now includes required `timeSeries`.
- Output JSON filename format is `test-1-<connector>-a<alpha>-<timestamp>.json`.

# Expected Complexity Level and Risk

**3.** Contained to `templates/keynote-2/`, but cross-cutting across runner/CLI/connectors/RPC servers/scripts/docs.

# Testing

Reviewer smoke test:

```bash
pnpm run prep
pnpm run bench --alpha 0,1.5 --connectors postgres_rpc --seconds 30 --runs 1
ls runs/test-1-postgres_rpc-a*.json   # expect 2 files
```

---------

Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
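To make the "one JSON per `(connector, alpha, run)` tuple" behavior and the smoke test's "expect 2 files" concrete, here is a minimal Python sketch. It is not the code from the bench CLI. The helper names (`parse_alphas`, `plan_runs`, `output_filename`) are hypothetical, and the alpha formatting and timestamp layout are assumptions inferred from the example filename in `DEVELOP.md`.

```python
from datetime import datetime, timezone
from itertools import product
from typing import List, Tuple


def parse_alphas(csv: str) -> List[float]:
    """Parse a CSV alpha argument such as "0,1.5" into floats."""
    return [float(part) for part in csv.split(",") if part.strip()]


def plan_runs(connectors: List[str], alphas: List[float], runs: int) -> List[Tuple[str, float, int]]:
    """Enumerate every (connector, alpha, run) tuple; each one maps to one JSON file."""
    return list(product(connectors, alphas, range(1, runs + 1)))


def output_filename(test_name: str, connector: str, alpha: float, when: datetime) -> str:
    """Build <test-name>-<connector>-a<alpha>-<timestamp>.json.

    Assumption: the timestamp is an ISO-8601 UTC time with ':' and '.'
    replaced by '-', matching the example filename in DEVELOP.md.
    """
    millis = when.microsecond // 1000
    stamp = when.strftime("%Y-%m-%dT%H-%M-%S") + f"-{millis:03d}Z"
    # ':g' drops a trailing '.0', so alpha 0.0 is written as 'a0'.
    return f"{test_name}-{connector}-a{alpha:g}-{stamp}.json"


plan = plan_runs(["postgres_rpc"], parse_alphas("0,1.5"), runs=1)
print(len(plan))  # number of planned runs, one output file each

when = datetime(2025, 11, 17, 16, 45, 12, 345000, tzinfo=timezone.utc)
print(output_filename("test-1", "postgres_rpc", 1.5, when))
```

With one connector, two alphas, and `--runs 1`, the plan contains two tuples, which lines up with the two files the smoke test expects.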
1 parent c416055 commit 16a2a8f

33 files changed

Lines changed: 1214 additions & 976 deletions

pnpm-lock.yaml

Lines changed: 26 additions & 24 deletions

templates/keynote-2/.env.example

Lines changed: 13 additions & 1 deletion
@@ -57,7 +57,19 @@ SUPABASE_RPC_URL=http://127.0.0.1:4106

 # ===== Seeding knobs =====
 SEED_ACCOUNTS=100000
-SEED_INITIAL_BALANCE=10000000
+SEED_INITIAL_BALANCE=1000000000
+
+# ===== Bench knobs =====
+# Pool size for pg-based RPC servers (postgres, cockroach, supabase, planetscale). Default: 64.
+# Read at RPC-server startup — restart the RPC if you change this.
+MAX_POOL=64
+
+# Pipelining for the bench client. Bench pipelining is global across connectors.
+# Some connectors may still have their own internal transport details.
+# Setting MAX_INFLIGHT_PER_WORKER alone does NOT enable pipelining for them.
+# If BENCH_PIPELINED=1, you must set MAX_INFLIGHT_PER_WORKER explicitly.
+#BENCH_PIPELINED=1
+#MAX_INFLIGHT_PER_WORKER=40

 VERIFY=0
 ENABLE_RPC_SERVERS=0
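
The pipelining comments above describe two rules: `MAX_INFLIGHT_PER_WORKER` on its own does not enable pipelining, and `BENCH_PIPELINED=1` requires `MAX_INFLIGHT_PER_WORKER` to be set. A minimal Python sketch of that logic follows. The function name `resolve_pipelining` is hypothetical, and the actual parsing in the bench client may differ (for example in which values count as "enabled").

```python
from typing import Mapping, Optional


def resolve_pipelining(env: Mapping[str, str]) -> Optional[int]:
    """Return the in-flight limit per worker, or None when pipelining is off.

    Assumption: only the literal value "1" enables pipelining.
    """
    if env.get("BENCH_PIPELINED", "0") != "1":
        # MAX_INFLIGHT_PER_WORKER alone is ignored: pipelining stays off.
        return None

    raw = env.get("MAX_INFLIGHT_PER_WORKER", "").strip()
    if not raw:
        raise ValueError("BENCH_PIPELINED=1 requires MAX_INFLIGHT_PER_WORKER to be set")

    limit = int(raw)
    if limit < 1:
        raise ValueError("MAX_INFLIGHT_PER_WORKER must be a positive integer")
    return limit


print(resolve_pipelining({"BENCH_PIPELINED": "1", "MAX_INFLIGHT_PER_WORKER": "40"}))
print(resolve_pipelining({"MAX_INFLIGHT_PER_WORKER": "40"}))
```

Failing fast on the missing limit avoids a silently unpipelined run being reported under the pipelined configuration.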

templates/keynote-2/DEVELOP.md

Lines changed: 48 additions & 19 deletions
@@ -27,8 +27,8 @@ The script will:

 **Options:**

-- `--seconds N` - Benchmark duration (default: 10)
-- `--concurrency N` - Concurrent connections (default: 50)
+- `--seconds N` - Benchmark duration (default: 300)
+- `--concurrency N` - Concurrent connections (default: 64)
 - `--alpha N` - Contention level (default: 1.5)
 - `--systems a,b,c` - Systems to compare (default: convex,spacetimedb)
 - `--stdb-compression none|gzip` - SpacetimeDB client compression mode (default: none)
@@ -193,26 +193,33 @@ pnpm run bench test-1 --connectors spacetimedb --stdb-compression gzip

 # Only run selected connectors
 pnpm run bench test-1 --connectors spacetimedb,sqlite_rpc
+
+# Sweep alpha values for a connector set
+pnpm run bench test-1 --alpha 0,1.5 --connectors postgres_rpc,bun --seconds 300
+
+# Sweep contention (alpha) for a single connector: start,end,step,concurrency
+pnpm run bench test-1 --connectors cockroach_rpc --contention-tests 0,1.5,0.5,64
+
+# Sweep concurrency for a single connector: start,end,factor,alpha
+pnpm run bench test-1 --connectors cockroach_rpc --concurrency-tests 16,512,2,1.5
 ```

 ## CLI Arguments

-From `src/cli.ts`:
-
 - **`test-name`** (positional)
   - Name of the test folder under `src/tests/`
   - Default: `test-1`

 - **`--seconds N`**
   - Duration of the benchmark in seconds
-  - Default: `10`
+  - Default: `300`

 - **`--concurrency N`**
   - Number of workers / in-flight operations
-  - Default: `50`
+  - Default: `64`

 - **`--alpha A`**
-  - Zipf α parameter for account selection (hot vs cold distribution)
+  - Zipf alpha parameter for account selection (hot vs cold distribution)
   - Default: `1.5`

 - **`--connectors list`**
@@ -227,15 +234,34 @@ From `src/cli.ts`:
   - The valid names come from `tc.system` in the test modules and the keys in `CONNECTORS`
   - Valid names: `convex`, `spacetimedb`, `bun`, `postgres_rpc`, `cockroach_rpc`, `sqlite_rpc`, `supabase_rpc`, `planetscale_pg_rpc`

-- **`--contention-tests startAlpha endAlpha step concurrency`**
-  - Runs a sweep over Zipf α values for a single connector
-  - Uses `startAlpha`, `endAlpha`, and `step` to choose the α values
-  - Uses the provided `concurrency` for all runs
+- **`--systems list`**
+  - Alias for `--connectors` in bench mode
+
+- **`--runs N`**
+  - Repeat each `(connector, alpha)` combination `N` times
+  - Default: `1`
+
+- **`--prep-between-alphas`**
+  - Run `pnpm run prep` before each `(connector, alpha)` combination
+
+- **`--contention-tests start,end,step,concurrency`**
+  - Sweep Zipf alpha values for one connector

-- **`--concurrency-tests startConc endConc step alpha`**
-  - Runs a sweep over concurrency levels for a single connector
-  - Uses `startConc`, `endConc`, and `step` to choose the concurrency values
-  - Uses the provided `alpha` for all runs
+- **`--concurrency-tests start,end,factor,alpha`**
+  - Sweep concurrency values for one connector
+
+- **`--bench-pipelined` / `--no-bench-pipelined`**
+  - Force pipelining on or off across connectors
+
+- **`--max-inflight-per-worker N`**
+  - Max in-flight requests per worker when pipelining is enabled
+  - Required when `--bench-pipelined` is enabled
+
+- **`--log-errors`**
+  - Log per-operation errors during runs
+
+- **`--verify-transactions`**
+  - Run connector verification at end of run

 ---

@@ -244,7 +270,7 @@ From `src/cli.ts`:
 You can also run the benchmark via Docker instead of Node directly:

 ```bash
-docker compose run --rm bench -- --seconds 5 --concurrency 50 --alpha 1 --connectors convex
+docker compose run --rm bench -- --seconds 5 --concurrency 64 --alpha 1 --connectors convex
 ```

 If using Docker, make sure to set `USE_DOCKER=1` in `.env`, verify docker-compose env variables, verify you've run supabase init, and run `pnpm run prep` before running bench.
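
Returning to the `--contention-tests start,end,step,concurrency` and `--concurrency-tests start,end,factor,alpha` flags documented earlier in this file: the Python sketch below shows one plausible way those tuples expand into sweep values. It assumes the end value is inclusive and that `factor` is a multiplier. Neither is confirmed by the diff, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def alpha_sweep(start: float, end: float, step: float) -> List[float]:
    """Additive sweep from start to end (assumed inclusive) in increments of step."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Count steps up front to avoid accumulating floating-point drift.
    count = math.floor((end - start) / step + 1e-9) + 1
    return [round(start + i * step, 10) for i in range(max(count, 0))]


def concurrency_sweep(start: int, end: int, factor: int) -> List[int]:
    """Multiplicative sweep: start, start*factor, ... up to end (assumed inclusive)."""
    if start < 1 or factor < 2:
        raise ValueError("start must be >= 1 and factor must be >= 2")
    values = []
    current = start
    while current <= end:
        values.append(current)
        current *= factor
    return values


def parse_contention_tests(arg: str) -> Tuple[List[float], int]:
    """Split "start,end,step,concurrency" into alpha values plus a fixed concurrency."""
    start, end, step, concurrency = arg.split(",")
    return alpha_sweep(float(start), float(end), float(step)), int(concurrency)


print(parse_contention_tests("0,1.5,0.5,64"))
print(concurrency_sweep(16, 512, 2))
```

Under these assumptions, `--contention-tests 0,1.5,0.5,64` covers four alpha values at 64 clients, and `--concurrency-tests 16,512,2,1.5` covers six concurrency levels from 16 to 512.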
@@ -254,7 +280,10 @@ If using Docker, make sure to set `USE_DOCKER=1` in `.env`, verify docker-compos
 Every run writes a JSON file into `./runs/`:

 - Directory: `./runs/`
-- Filename: `<test-name>-<timestamp>.json`
-- Example: `test-1-2025-11-17T16-45-12-345Z.json`
+- Filename: `<test-name>-<connector>-a<alpha>-<timestamp>.json`
+- Example: `test-1-postgres_rpc-a1.5-2025-11-17T16-45-12-345Z.json`
+
+For rollup tables, compute steady-state stats after a 30-second warmup window (`tSec >= 30`). The `scripts/bench-stats.py` default matches this (`--warmup-sec 30`).
+
+Point your visualizations / CSV exports at `./runs/` and you're good.

-Point your visualizations / CSV exports at `./runs/` and you’re good.
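
The steady-state rule above (`tSec >= 30`) can be illustrated with a short Python sketch over a run's `timeSeries` samples. This is not `scripts/bench-stats.py`. The per-sample throughput field name `tps` is an assumption, as is the use of the sample standard deviation. Only `timeSeries` and `tSec` come from the change itself.

```python
from statistics import mean, stdev
from typing import Dict, List


def steady_state_stats(time_series: List[Dict[str, float]], warmup_sec: int = 30) -> Dict[str, float]:
    """Summarize throughput over samples at or after the warmup cutoff.

    Assumption: each sample looks like {"tSec": <second>, "tps": <throughput>};
    the "tps" key is hypothetical.
    """
    samples = [s["tps"] for s in time_series if s["tSec"] >= warmup_sec]
    if not samples:
        raise ValueError("no samples remain after the warmup window")
    return {
        "samples": len(samples),
        "mean_tps": mean(samples),
        # stdev needs at least two points; report 0.0 for a single sample.
        "stddev_tps": stdev(samples) if len(samples) > 1 else 0.0,
    }


# Synthetic 60-second run: slow warmup, then throughput alternating 1000 / 1010.
series = [{"tSec": t, "tps": 100 if t < 30 else 1000 + (t % 2) * 10} for t in range(60)]
stats = steady_state_stats(series)
print(stats["samples"], stats["mean_tps"])
```

Dropping the first 30 seconds keeps the warmup ramp from dragging down the mean and inflating the standard deviation in rollup tables.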
