TL;DR: 45–75% less retained memory and up to 79% lower peak across the bench corpus, with init build time within ~10% of v4 (and 50–70% faster on point datasets). Public API unchanged.
A focused rework of geojson-vt's internal data shape to slash memory footprint on large datasets, the workloads that have caused browser tabs to OOM. The headline case: building the `county` index (205 MB GeoJSON of US counties) now holds 103 MB instead of 241 MB on default settings, and the transient working set during a heavy drilldown build peaks at 404 MB instead of 1.9 GB, so a workload that previously crashed browser tabs now fits comfortably.

What changed
Int32 source coords with a Float64 fallback. Source coordinates flow through convert → wrap → clip → `tile.source` as `Int32Array`, encoded as maxZoom-pixel quanta centered on zero. This halves the per-coord byte cost on the retained source slab (8 B → 4 B). Centering keeps every value inside V8's SMI range so reads stay on the fast integer path. A runtime gate falls back to `Float64Array` when `(extent + 2*buffer) * 2^maxZoom > 2^32`, covering extreme-extent configs.

Flat coords with inline ring headers. One buffer per feature in shape `[ringLen, ringSize, x, y, z, …]` instead of nested per-ring arrays. `ringSize` is signed area for polygons (encodes outer/hole) and stored sqrt-linear to fit Int32. Buffers are pre-sized exact-fit in convert and wrap.

Typed committed-tile coords. Every committed ring is retained as `Int16Array` (or `Int32Array` when `extent+buffer > 32767`), exact-sized via a pre-count pass over simplification z-values.

Specialized single-Point feature shape. Single-Point features use a 5-slot `{id, type, x, y, tags}` wrapper: no geometry array, no bbox slots. MultiPoint unchanged. Drives the point-dataset wins.

Foundation cleanup. Canonical winding enforced once at convert (GeoJSON structural nesting → signed-area outer/hole encoding), with the separate rewind pass removed. Eager extent projection moved inline into `createTile`; retained tile is immutable. Three clip implementations collapsed onto unified internal types (POINT=1, LINE=2, POLYGON=3). Public types now hand-authored via `.d.ts` + JSDoc with `tsc --noEmit --checkJs` in CI; `@types/geojson-vt` dependency dropped.

Benchmarks (v5-new vs v4 main)
Each dataset is built in two configurations, both measured:
- `init` (`indexMaxZoom: 5, indexMaxPoints: 100k`). The normal usage pattern: build a small pre-tiled index, generate the rest on demand.
- Deep drilldown (`indexMaxZoom: 10, indexMaxPoints: 1000`). Proxy for heavy drilldown across the whole dataset. Not a typical user config, but a useful stress test for the transient working set during recursive clip+split.

Median of 3 iterations on the same machine, `node --expose-gc`, `v8.GCProfiler` for schedule-immune allocation tracking. `held` is post-build forced-GC retention (heap + external); `peak` is in-build high-water; `alloc` is total bytes allocated during build.

Held memory — what stays in RAM after the index is built
Peak memory — does this OOM on a constrained device?
Alloc — total bytes churned during build (GC pressure)
Wall-clock — CPU is within ±35% of v4
On the build users actually run (`init`), point-dominated workloads get 50–70% faster and coord-heavy workloads run 8–14% slower than v4 (hrr +8%, us-2010 +14%, county +11%), a small CPU cost in exchange for 45–75% less memory. The deep-drilldown stress test pays more CPU (up to +34% on county) for the same memory wins, but typical usage doesn't hit that path.
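
For reviewers who want the two storage gates from "What changed" in executable form, here is a minimal sketch. Only the two threshold expressions come from the description above. The function names (`sourceArrayType`, `tileArrayType`, `quantize`) are hypothetical, and the quantization formula is an assumed reading of "maxZoom-pixel quanta centered on zero", not the code in this PR.

```javascript
// Hypothetical helpers; only the two threshold expressions are taken
// from the PR description.

// Source slab: Int32Array unless the full coordinate span at maxZoom
// would exceed 2^32 quanta, in which case fall back to Float64Array.
function sourceArrayType({extent, buffer, maxZoom}) {
  const span = (extent + 2 * buffer) * 2 ** maxZoom;
  return span > 2 ** 32 ? Float64Array : Int32Array;
}

// Committed tile rings: Int16Array unless extent + buffer overflows int16.
function tileArrayType({extent, buffer}) {
  return extent + buffer > 32767 ? Int32Array : Int16Array;
}

// ASSUMPTION: x is a projected coordinate in [0, 1]; subtracting 0.5
// centers the quantized value on zero.
function quantize(x, {extent, maxZoom}) {
  const scale = extent * 2 ** maxZoom;
  return Math.round((x - 0.5) * scale);
}

const opts = {extent: 4096, buffer: 64, maxZoom: 14};
console.log(sourceArrayType(opts).name); // Int32Array
console.log(tileArrayType(opts).name);   // Int16Array
console.log(quantize(0.5, opts));        // 0
```

With `extent: 4096`, `buffer: 64`, and `maxZoom: 14`, the span is 4224 × 16384 = 69,206,016, well under 2^32, so the Int32 path applies. Raising `maxZoom` to 24 in this sketch pushes the span past 2^32 and selects `Float64Array`.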
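
The flat `[ringLen, ringSize, x, y, z, …]` layout from "What changed" can be illustrated with a small pack/iterate pair. This is a sketch under stated assumptions, not the PR's implementation: it assumes `ringLen` counts points (three slots each), treats a positive `ringSize` as an outer ring and a negative one as a hole, and stores `ringSize` as given without the sqrt-linear encoding. `packRings` and `eachRing` are hypothetical names.

```javascript
const HEADER = 2; // ringLen, ringSize
const STRIDE = 3; // x, y, z

// Pack rings into one exact-fit Int32Array.
// ASSUMPTION: ringLen is the point count, not the slot count.
function packRings(rings) {
  let total = 0;
  for (const r of rings) total += HEADER + r.points.length * STRIDE;
  const buf = new Int32Array(total); // pre-sized, no growth or copying
  let o = 0;
  for (const r of rings) {
    buf[o++] = r.points.length;
    buf[o++] = r.size; // ASSUMPTION: sign > 0 means outer, < 0 means hole
    for (const [x, y, z] of r.points) {
      buf[o++] = x;
      buf[o++] = y;
      buf[o++] = z;
    }
  }
  return buf;
}

// Walk the buffer ring by ring; coords is a zero-copy view.
function* eachRing(buf) {
  let o = 0;
  while (o < buf.length) {
    const ringLen = buf[o];
    const ringSize = buf[o + 1];
    const start = o + HEADER;
    const end = start + ringLen * STRIDE;
    yield {ringLen, ringSize, coords: buf.subarray(start, end)};
    o = end;
  }
}

const buf = packRings([
  {size: 100, points: [[0, 0, 1], [10, 0, 0], [10, 10, 0], [0, 10, 1]]},
  {size: -4, points: [[2, 2, 1], [4, 2, 0], [4, 4, 1]]},
]);
for (const ring of eachRing(buf)) {
  console.log(ring.ringSize > 0 ? "outer" : "hole", ring.ringLen);
}
```

Compared with nested per-ring arrays, a layout like this keeps one typed-array allocation per feature, and the header makes it possible to skip a ring without touching its coordinates.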
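
To make the `held` metric concrete (post-build forced-GC retention, heap + external), the measurement could look roughly like the sketch below. It is not the bench harness from this PR: `heldBytes` and `measureHeld` are hypothetical names, and the sketch covers only `held`, not the `v8.GCProfiler`-based `peak` or `alloc` tracking.

```javascript
// Retained bytes right now: JS heap in use plus external memory
// (typed-array backing stores are counted under `external`).
function heldBytes() {
  // gc() is only defined when Node is started with --expose-gc.
  if (typeof globalThis.gc === "function") globalThis.gc();
  const {heapUsed, external} = process.memoryUsage();
  return heapUsed + external;
}

// Run a build function and report how much memory its result retains.
function measureHeld(build) {
  const before = heldBytes();
  const result = build(); // keep a reference so the result stays alive
  const after = heldBytes();
  return {result, held: after - before};
}

// Stand-in workload: a 1M-element Int32Array (about 4 MB of backing store).
const m = measureHeld(() => new Int32Array(1e6));
console.log(`held: ${(m.held / 1e6).toFixed(1)} MB`);
```

Run it with `node --expose-gc`. Without the flag the sketch still executes, but no collection is forced, so the reported number can include garbage that has not yet been reclaimed.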