perf: Extend BufferPool recycling to all scratch-buffer encodings #804

Closed
xiaoxmeng wants to merge 1 commit into facebookincubator:main from xiaoxmeng:export-D106908319

@xiaoxmeng
Contributor

Summary:
Previously, only MainlyConstantEncoding used BufferPool to recycle scratch buffers across encoding lifetimes. This diff extends the pattern to NullableEncoding, DeltaEncoding, DictionaryEncoding, RLEEncoding, and FixedBitWidthEncoding, reducing MemoryPool allocation overhead during deserialization.

Changes:

  • Added getBuffer(bytes), getVectorBuffer<V>(), and releaseVectorBuffer() helpers to Encoding base class, replacing the local detail::getPooledBuffer in MainlyConstantEncoding
  • getBuffer(bytes) uses BufferPool::get(bytes) so undersized cached buffers remain available instead of being popped and released back
  • NullableEncoding: nullBuffer_ is now acquired and released via BufferPool; removed dead indicesBuffer_ member
  • DeltaEncoding: deltasBuffer_, restatementsBuffer_, and isRestatementsBitmap_ now use BufferPool
  • DictionaryEncoding and RLEEncoding: dictionary-index scratch buffers now use BufferPool
  • FixedBitWidthEncoding: buffer_ now uses BufferPool
  • MainlyConstantEncoding: refactored to use shared helpers (same behavior)
  • Removed dead Vector<> members: SentinelEncoding::buffer_, VarintEncoding::buf_
  • Added EncodingBufferPoolTest.getBufferUsesMinimumCapacity
  • Added serializer benchmark at dwio/nimble/serializer/benchmarks/ measuring deserialization throughput and allocation counts with wide flat maps (BufferPool vs no BufferPool)
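
For readers unfamiliar with the pattern, here is a minimal sketch of minimum-capacity lookup plus acquire-on-construct / release-on-destruct recycling. It is not the Nimble implementation: `SimpleBufferPool`, `ScratchBuffer`, and `ScratchUser` are hypothetical names invented for illustration, and the linear scan is only an assumption about how a "get at least N bytes" lookup could leave undersized cached buffers in place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scratch buffer; not Nimble's actual type.
struct ScratchBuffer {
  explicit ScratchBuffer(std::size_t bytes) : data(bytes) {}
  std::size_t capacity() const { return data.size(); }
  std::vector<std::uint8_t> data;
};

// Hypothetical pool illustrating minimum-capacity lookup.
class SimpleBufferPool {
 public:
  // Returns a cached buffer whose capacity is >= minBytes if one exists;
  // otherwise allocates a fresh one. Undersized cached buffers are skipped
  // and stay in the pool for later, smaller requests.
  std::unique_ptr<ScratchBuffer> get(std::size_t minBytes) {
    for (auto it = cache_.begin(); it != cache_.end(); ++it) {
      if ((*it)->capacity() >= minBytes) {
        auto buffer = std::move(*it);
        cache_.erase(it);
        return buffer;
      }
    }
    ++allocations_;
    return std::make_unique<ScratchBuffer>(minBytes);
  }

  // Hands a buffer back to the pool so a later get() can reuse it.
  void release(std::unique_ptr<ScratchBuffer> buffer) {
    if (buffer) {
      cache_.push_back(std::move(buffer));
    }
  }

  std::size_t cachedCount() const { return cache_.size(); }
  std::size_t allocations() const { return allocations_; }

 private:
  std::vector<std::unique_ptr<ScratchBuffer>> cache_;
  std::size_t allocations_{0};
};

// Hypothetical encoding-like owner: acquires scratch space on construction
// and returns it on destruction, so the next instance can reuse it.
class ScratchUser {
 public:
  ScratchUser(SimpleBufferPool& pool, std::size_t bytes)
      : pool_(pool), buffer_(pool.get(bytes)) {}
  ~ScratchUser() { pool_.release(std::move(buffer_)); }
  ScratchUser(const ScratchUser&) = delete;
  ScratchUser& operator=(const ScratchUser&) = delete;

  std::size_t capacity() const { return buffer_->capacity(); }

 private:
  SimpleBufferPool& pool_;
  std::unique_ptr<ScratchBuffer> buffer_;
};
```

In this sketch, two consecutive `ScratchUser` instances on the same pool trigger a single allocation as long as the second request fits in the buffer released by the first.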

Benchmark result highlight (10K rows x 100 keys, MainlyConstant data pattern):

  • Allocations: 602 (pool) vs 2302 (no pool), i.e. 73.8% fewer allocations
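
The percentage follows from the two counts as (2302 - 602) / 2302 ≈ 0.738. A small helper for re-deriving it from other benchmark runs might look like the following; the function name is hypothetical and not part of the benchmark code.

```cpp
#include <cassert>
#include <cmath>

// Percentage reduction in allocation count going from `baseline`
// (no pool) to `pooled` (with pool). Hypothetical helper for illustration.
double allocationReductionPercent(double baseline, double pooled) {
  return (baseline - pooled) / baseline * 100.0;
}
```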

Differential Revision: D106908319

@meta-cla meta-cla Bot added the CLA Signed label May 31, 2026
@meta-codesync
meta-codesync Bot commented May 31, 2026

@xiaoxmeng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106908319.

@meta-codesync
meta-codesync Bot commented May 31, 2026

This pull request has been merged in 566afa8.

Labels: CLA Signed, fb-exported, Merged, meta-exported