v5: Radically improve memory footprint #191

Open: mourner wants to merge 26 commits into main from v5-new
mourner commented May 18, 2026

TL;DR: 45–75% less retained memory and up to 79% lower peak memory across the bench corpus, with init build time within ~15% of v4 on coord-heavy datasets (and 50–70% faster on point datasets). Public API unchanged.

A focused rework of geojson-vt's internal data shape to slash memory footprint on large datasets — the workloads that have caused browser tabs to OOM. The headline case: building the county index (205 MB GeoJSON of US counties) now holds 103 MB instead of 241 MB on default settings, and the transient working set during a heavy drilldown build peaks at 404 MB instead of 1.9 GB — a workload that previously crashed browser tabs now fits comfortably.

What changed

Int32 source coords with a Float64 fallback. Source coordinates flow through convert → wrap → clip → tile.source as Int32Array, encoded as maxZoom-pixel quanta centered on zero. Halves the per-coord byte cost on the retained source slab (8 B → 4 B). Centering keeps every value inside V8's SMI range so reads stay on the fast integer path. A runtime gate falls back to Float64Array when (extent + 2*buffer) * 2^maxZoom > 2^32, covering extreme-extent configs.
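A minimal sketch of the gate and the zero-centered encoding described above. The helper names are hypothetical, and the assumption that input coordinates are already projected into [0, 1] is mine; only the gate condition itself is taken from the description.

```javascript
// Hypothetical helper: applies the runtime gate stated above.
function pickSourceArrayType(extent, buffer, maxZoom) {
    const span = (extent + 2 * buffer) * 2 ** maxZoom;
    return span > 2 ** 32 ? Float64Array : Int32Array;
}

// Assumption: x is a projected coordinate in [0, 1].
// Subtracting 0.5 centers the quantized value on zero.
function quantize(x, extent, maxZoom) {
    return Math.round((x - 0.5) * extent * 2 ** maxZoom);
}

const ArrayType = pickSourceArrayType(4096, 64, 14);
const coords = new ArrayType(2);
coords[0] = quantize(0.25, 4096, 14);
coords[1] = quantize(0.75, 4096, 14);
```

With extent 4096, buffer 64 and maxZoom 14 the span is about 6.9e7, well under 2^32, so the Int32Array path is taken; raising maxZoom to 24 in the same formula trips the Float64Array fallback.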

Flat coords with inline ring headers. One buffer per feature in shape [ringLen, ringSize, x, y, z, …] instead of nested per-ring arrays. ringSize is signed area for polygons (encodes outer/hole) and stored sqrt-linear to fit Int32. Buffers are pre-sized exact-fit in convert and wrap.
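The layout can be pictured roughly as below. This is a sketch, not the PR's code: it assumes ringLen counts points and that each point occupies three slots (x, y, z); packRings and iterateRings are hypothetical names, and ringSize is passed through as an opaque precomputed value.

```javascript
const STRIDE = 3; // assumption: x, y, z per point

// rings: [{size, points: [x, y, z, ...]}] -> one exact-fit buffer
function packRings(rings) {
    let total = 0;
    for (const ring of rings) total += 2 + ring.points.length;
    const buf = new Int32Array(total);
    let pos = 0;
    for (const ring of rings) {
        buf[pos++] = ring.points.length / STRIDE; // ringLen header
        buf[pos++] = ring.size;                   // ringSize header
        buf.set(ring.points, pos);
        pos += ring.points.length;
    }
    return buf;
}

// Walks the buffer header by header without allocating nested arrays.
function* iterateRings(buf) {
    let pos = 0;
    while (pos < buf.length) {
        const ringLen = buf[pos];
        const ringSize = buf[pos + 1];
        const start = pos + 2;
        const end = start + ringLen * STRIDE;
        yield {ringLen, ringSize, coords: buf.subarray(start, end)};
        pos = end;
    }
}
```

Because subarray returns a view, iterating rings this way creates small wrapper objects but copies no coordinate data.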

Typed committed-tile coords. Every committed ring is retained as Int16Array (or Int32Array when extent+buffer > 32767), exact-sized via a pre-count pass over simplification z-values.
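A sketch of the two-pass idea: count the surviving points first, then allocate once. It assumes the input is a flat [x, y, z, …] ring where z is the simplification importance and that a point survives when z exceeds the squared tolerance; commitRing is a hypothetical name.

```javascript
function commitRing(coords, sqTolerance, extent, buffer) {
    // Pass 1: count points that survive simplification.
    let kept = 0;
    for (let i = 0; i < coords.length; i += 3) {
        if (coords[i + 2] > sqTolerance) kept++;
    }

    // Pick the narrowest type that can hold tile coordinates.
    const ArrayType = extent + buffer > 32767 ? Int32Array : Int16Array;
    const out = new ArrayType(kept * 2); // exact size, no growth or trimming

    // Pass 2: copy x, y of the surviving points.
    let j = 0;
    for (let i = 0; i < coords.length; i += 3) {
        if (coords[i + 2] > sqTolerance) {
            out[j++] = coords[i];
            out[j++] = coords[i + 1];
        }
    }
    return out;
}
```

The extra pass trades a little CPU for never over-allocating, which matters when the result is retained for the lifetime of the index.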

Specialized single-Point feature shape. Single-Point features use a 5-slot {id, type, x, y, tags} wrapper — no geometry array, no bbox slots. MultiPoint unchanged. Drives the point-dataset wins.
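For illustration, the wrapper might look like the object below. The factory name is hypothetical, and using 1 as the type value assumes the wrapper reuses the unified POINT constant.

```javascript
const POINT = 1; // assumption: same value as the unified internal type

// Hypothetical factory: five slots, always created in the same order.
function createPointFeature(id, x, y, tags) {
    return {id, type: POINT, x, y, tags};
}

const feature = createPointFeature(7, 10, 20, {name: 'example'});
```

Creating every single-Point feature with the same property order generally lets the engine share one object shape across all of them.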

Foundation cleanup. Canonical winding enforced once at convert (GeoJSON structural nesting → signed-area outer/hole encoding), with the separate rewind pass removed. Eager extent projection moved inline into createTile; retained tile is immutable. Three clip implementations collapsed onto unified internal types (POINT=1, LINE=2, POLYGON=3). Public types now hand-authored via .d.ts + JSDoc with tsc --noEmit --checkJs in CI; @types/geojson-vt dependency dropped.
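One way to picture the winding step is a shoelace-formula pass that flips any ring whose sign disagrees with its structural role. This is a sketch under assumptions: rings are flat [x, y, …] arrays, the first ring is the outer one, and positive area means outer; the actual sign convention in the PR is not stated.

```javascript
// Shoelace formula over a flat [x0, y0, x1, y1, ...] ring.
function signedArea(ring) {
    let sum = 0;
    const n = ring.length / 2;
    for (let i = 0, j = n - 1; i < n; j = i++) {
        sum += (ring[2 * j] - ring[2 * i]) * (ring[2 * i + 1] + ring[2 * j + 1]);
    }
    return sum / 2;
}

// Returns a copy of the ring with point order reversed.
function reversed(ring) {
    const out = new Array(ring.length);
    for (let i = 0; i < ring.length; i += 2) {
        out[ring.length - 2 - i] = ring[i];
        out[ring.length - 1 - i] = ring[i + 1];
    }
    return out;
}

// Assumption: rings[0] is the outer ring, the rest are holes.
function canonicalize(rings) {
    return rings.map((ring, i) => {
        const area = signedArea(ring);
        const wantPositive = i === 0;
        const flip = area !== 0 && (area > 0) !== wantPositive;
        return flip ? {area: -area, coords: reversed(ring)} : {area, coords: ring};
    });
}
```

Once the sign carries the outer/hole distinction, later stages no longer need the nested GeoJSON structure to tell rings apart.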

Benchmarks (v5-new vs v4 main)

Each dataset is built in two configurations, both measured:

  • init: default options (indexMaxZoom: 5, indexMaxPoints: 100k). The normal usage pattern: build a small pre-tiled index, generate the rest on demand.
  • deep: aggressive indexing (indexMaxZoom: 10, indexMaxPoints: 1000). Proxy for heavy drilldown across the whole dataset. Not a typical user config, but a useful stress test for the transient working set during recursive clip+split.
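Written out as option objects, the two configurations would look roughly like this. The variable names are illustrative; the objects are meant to be passed as the second argument of geojsonvt(data, options).

```javascript
// init: the library defaults, spelled out explicitly
const initOptions = {indexMaxZoom: 5, indexMaxPoints: 100000};

// deep: index much further down and split tiles much sooner
const deepOptions = {indexMaxZoom: 10, indexMaxPoints: 1000};

// Usage, assuming `data` holds a parsed GeoJSON object:
// const index = geojsonvt(data, deepOptions);
// const tile = index.getTile(z, x, y);
```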

Median of 3 iterations on the same machine, node --expose-gc, v8.GCProfiler for schedule-immune allocation tracking. held is post-build forced-GC retention (heap + external); peak is in-build high-water; alloc is total bytes allocated during build.
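The held figure can be approximated with a harness along these lines. This is not the benchmark script from the PR, only a sketch of the same idea with hypothetical function names; it covers held only, not the peak or alloc figures that come from v8.GCProfiler.

```javascript
// Median of a list of numbers (sorts a copy).
function median(values) {
    const sorted = [...values].sort((a, b) => a - b);
    const mid = sorted.length >> 1;
    return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Heap plus external memory; in Node, `external` already includes ArrayBuffers.
function usedBytes() {
    // globalThis.gc only exists when Node runs with --expose-gc
    if (typeof globalThis.gc === 'function') globalThis.gc();
    const {heapUsed, external} = process.memoryUsage();
    return heapUsed + external;
}

// Hypothetical harness: `build` returns the structure whose retention is measured.
function measureHeld(build, iterations = 3) {
    const samples = [];
    for (let i = 0; i < iterations; i++) {
        const before = usedBytes();
        const retained = build();
        const after = usedBytes();
        samples.push(after - before);
        // Using `retained` here keeps it alive across the second measurement.
        if (retained === undefined) throw new Error('build() returned nothing to retain');
    }
    return median(samples);
}
```

Without --expose-gc the numbers still come out, but they include garbage that a forced collection would have removed, so they are only meaningful with the flag set.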

Held memory — what stays in RAM after the index is built

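| dataset | init v4 | init v5 | init Δ | deep v4 | deep v5 | deep Δ |
| --- | --- | --- | --- | --- | --- | --- |
| earthquakes | 3.8 MB | 1.1 MB | −71% | 10.6 MB | 3.8 MB | −64% |
| places | 4.5 MB | 1.2 MB | −73% | 9.3 MB | 3.3 MB | −65% |
| route | 1.4 MB | 664 KB | −54% | 2.4 MB | 1.1 MB | −54% |
| hrr | 15.2 MB | 6.4 MB | −58% | 42.0 MB | 16.1 MB | −62% |
| us-2010 | 38.0 MB | 21.0 MB | −45% | 87.0 MB | 45.5 MB | −48% |
| county | 241 MB | 103 MB | −57% | 323 MB | 147 MB | −54% |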

Peak memory — does this OOM on a constrained device?

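| dataset | init v4 | init v5 | init Δ | deep v4 | deep v5 | deep Δ |
| --- | --- | --- | --- | --- | --- | --- |
| earthquakes | 12.0 MB | 3.2 MB | −73% | 25.3 MB | 11.8 MB | −53% |
| places | 14.0 MB | 3.6 MB | −74% | 23.5 MB | 9.8 MB | −58% |
| route | 13.3 MB | 5.2 MB | −61% | 51.8 MB | 12.5 MB | −76% |
| hrr | 79.3 MB | 43.8 MB | −45% | 191 MB | 81.2 MB | −57% |
| us-2010 | 111 MB | 97.5 MB | −12% | 305 MB | 167 MB | −45% |
| county | 550 MB | 278 MB | −49% | 1886 MB | 404 MB | −79% |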

Alloc — total bytes churned during build (GC pressure)

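| dataset | init v4 | init v5 | init Δ | deep v4 | deep v5 | deep Δ |
| --- | --- | --- | --- | --- | --- | --- |
| earthquakes | 8.2 MB | 1.2 MB | −85% | 19.6 MB | 8.6 MB | −56% |
| places | 8.9 MB | 1.2 MB | −87% | 18.0 MB | 8.0 MB | −56% |
| route | 11.9 MB | 12.6 MB | +6% | 93.6 MB | 5.1 MB | −95% |
| hrr | 64.0 MB | 6.4 MB | −90% | 269 MB | 45.6 MB | −83% |
| us-2010 | 169 MB | 83.9 MB | −50% | 426 MB | 108 MB | −75% |
| county | 1440 MB | 653 MB | −55% | 3886 MB | 735 MB | −81% |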

Wall-clock: build time ranges from 67% faster to 34% slower than v4

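| dataset | init v4 | init v5 | init Δ | deep v4 | deep v5 | deep Δ |
| --- | --- | --- | --- | --- | --- | --- |
| earthquakes | 2.3 ms | 1.0 ms | −57% | 4.0 ms | 4.0 ms | 0% |
| places | 4.3 ms | 1.4 ms | −67% | 5.0 ms | 2.8 ms | −44% |
| route | 11.0 ms | 9.6 ms | −13% | 32.6 ms | 22.5 ms | −31% |
| hrr | 54.4 ms | 58.9 ms | +8% | 130 ms | 156 ms | +20% |
| us-2010 | 105 ms | 119 ms | +14% | 225 ms | 264 ms | +18% |
| county | 938 ms | 1040 ms | +11% | 2118 ms | 2845 ms | +34% |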

On the build users actually run (init), point-dominated workloads are 50–70% faster and coord-heavy workloads stay within ~15% of v4 (hrr +8%, us-2010 +14%, county +11%): a small CPU cost in exchange for 45–75% less retained memory. The deep-drilldown stress test pays more CPU (up to +34% on county) for the same memory wins, but typical usage doesn't hit that path.

mourner added the performance and ai (AI coding agents co-authored the code) labels May 18, 2026