
perf: use autoresearch to reduce allocations#3225

Draft
tac0turtle wants to merge 14 commits into main from autoresearch/block-perf-2026-04-04

Conversation

@tac0turtle
Contributor

@tac0turtle tac0turtle commented Apr 4, 2026

Overview

This PR used autoresearch to run an iterative optimization loop to reduce allocations. The steps it kept or discarded:

| Step | Commit | allocs/op | B/op | ns/op | Decision | Change |
|------|---------|-----------|--------|--------|---------|--------|
| 3 | a8b6352 | 78 | 22,996 | 33,091 | keep | Unsafe reinterpret cast of Txs to [][]byte in ApplyBlock — eliminates make([][]byte, n) allocation |
| 4 | 823aa62 | 77 | 20,276 | 32,480 | keep | Direct pb.Data serialization in DACommitment — avoids pruned Data wrapper and txsToByteSlices allocations |
| 5 | 0720b44 | 74 | 12,192 | 31,624 | keep | unsafe.Slice in Data.ToProto() — eliminates txsToByteSlices [][]byte allocation |
| 6 | — | 74 | 12,187 | 31,217 | discard | Reverted hand-written HashSlim wire encoder — produced different hashes than MarshalBinary |
| 7 | ccbc2e4 | 64 | 11,130 | 31,570 | keep | sync.Pool for protobuf message structs in MarshalBinary — eliminates 10 allocs per block |
| 8 | 805672e | 56 | 10,217 | 31,037 | keep | Pooled SignedHeader.MarshalBinary — reuse pb.SignedHeader/pb.Header/pb.Signer/pb.Version structs |

Result: {"status":"keep","allocs_per_op":81,"bytes_per_op":25934,"ns_per_op":34001}
…allocations per block

Result: {"status":"keep","allocs_per_op":79,"bytes_per_op":25697,"ns_per_op":34147}
… make([][]byte, n) allocation

Result: {"status":"keep","allocs_per_op":78,"bytes_per_op":22996,"ns_per_op":33091}
…pper and txsToByteSlices allocations

Result: {"status":"keep","allocs_per_op":77,"bytes_per_op":20276,"ns_per_op":32480}
…allocation

Result: {"status":"keep","allocs_per_op":74,"bytes_per_op":12192,"ns_per_op":31624}
…10 allocs per block

Replace per-call allocation of pb.Header/pb.Version/pb.Data/pb.Metadata with
sync.Pool reuse in the hot MarshalBinary path. ToProto() API is unchanged —
only MarshalBinary is affected since it consumes the result immediately.

Metrics (100_txs benchmark):
- 74 → 64 allocs/op (-13.5%)
- ~12.1 → ~11.1 KB (-8.3%)
- ns/op roughly flat (~31.6k)
…Signer/pb.Version

Eliminates 4 struct allocations per signed header marshal: pb.SignedHeader,
pb.Header, pb.Version, pb.Signer. These are now borrowed from sync.Pool and
returned after proto.Marshal completes.

Metrics (100_txs benchmark):
- 64 → 56 allocs/op
- ~11KB → ~10.2KB
…er block

State.ToProto allocated pb.State + pb.Version + timestamppb.Timestamp per block.
MarshalBinary now pools those structs and returns the marshaled bytes directly.
pkg/store/batch.UpdateState switched from ToProto+proto.Marshal to MarshalBinary.
@coderabbitai
Contributor

coderabbitai bot commented Apr 4, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9dc826f3-3e21-4e32-a93d-1955b346a0ba

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.



@github-actions
Contributor

github-actions bot commented Apr 4, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|-------|--------|------|----------|----------------|
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Apr 7, 2026, 6:39 PM |

@claude
Contributor

claude bot commented Apr 4, 2026

Claude finished @tac0turtle's task in 4m 39s.


Code Review: perf: use autoresearch to reduce allocations

  • Gather context
  • Read changed files
  • Review unsafe operations
  • Review sync.Pool usage
  • Review serialization changes
  • Post comprehensive review

Good optimization work overall — the allocation reductions are real and the compile-time layout guards in tx.go are a nice touch. A few correctness concerns worth addressing before merge.

Critical / Correctness

1. ApplyBlock: aliasing between rawTxs and data.Txs (executor.go:799–801)

rawTxs = unsafe.Slice((*[]byte)(unsafe.SliceData(data.Txs)), n)

rawTxs and data.Txs now share the same backing array. If any Executor.ExecuteTxs implementation mutates elements of the passed [][]byte (e.g. appends to a transaction, trims it, or reassigns elements), it will silently corrupt data.Txs. The coreexecutor.Executor interface carries no documentation prohibiting mutation.

This is the riskiest change in the PR. Either:

  • document the no-mutation contract on coreexecutor.Executor.ExecuteTxs, and add a //nolint:govet or godoc note on the call site, or
  • add a //go:noescape wrapper / restrict to a clearly owned temporary scope.



2. Pool escape on panic in SignedHeader.MarshalBinary (serialization.go:223–291)

Pool cleanup is done inline (not via defer) in three separate code paths. If proto.Marshal panics (rare, but possible with corrupt internal state), psh, ph, pv, and psi are never returned to their pools. The pool will create new instances on demand so correctness is maintained, but it creates a subtle invariant. The same concern applies to Header.MarshalBinary.

Consider extracting the cleanup into a single defer:

defer func() {
    ph.Reset(); pbHeaderPool.Put(ph)
    pv.Reset(); pbVersionPool.Put(pv)
    psi.Reset(); pbSignerPool.Put(psi)
    psh.Reset(); pbSignedHeaderPool.Put(psh)
}()

The overhead of one defer vs. three inlined cleanup sites is negligible relative to proto.Marshal.



Medium

3. Inconsistent nil guard for Txs unsafe cast (hashing.go:130 vs serialization.go:143)

Data.MarshalBinary guards:

if d.Txs != nil {
    pd.Txs = unsafe.Slice(...)
}

DACommitment does not guard:

pbData := pb.Data{Txs: unsafe.Slice((*[]byte)(unsafe.SliceData(d.Txs)), len(d.Txs))}

Both are actually safe per the Go spec (unsafe.SliceData(nil) == nil, unsafe.Slice(nil, 0) == nil), but the inconsistency is confusing to readers. Either add the nil guard to DACommitment or remove it from MarshalBinary with a comment explaining why it's unnecessary.


4. timestamppb.Timestamp still allocates in State.MarshalBinary (serialization.go:539)

pts := &timestamppb.Timestamp{
    Seconds: s.LastBlockTime.Unix(),
    Nanos:   int32(s.LastBlockTime.Nanosecond()),
}

A *timestamppb.Timestamp is allocated on every State.MarshalBinary call. Since you've pooled pbStatePool, pooling the timestamp too would be consistent (though Timestamp is a small fixed struct so the escape analysis may already stack-allocate it — worth checking with -gcflags="-m").


5. sync.Pool returns interface{} — missed type assertion on sha256Pool (hashing.go:123)

s := sha256Pool.Get().(hash.Hash)

sha256Pool.New returns sha256.New(), which implements hash.Hash. The type assertion would panic if the pool ever held a non-hash.Hash value; that cannot happen with the current code, but it is fragile. The concrete *sha256.digest type is unexported and cannot be stored directly, so the interface approach is acceptable; just make sure no //go:linkname tricks pull the wrong type in.


Low / Informational

6. Compile-time guards check size, not layout (tx.go:13–16)

The zero-array trick proves sizeof(Tx) == sizeof([]byte), which is necessary but not strictly sufficient for a pointer reinterpret cast. In practice this is fine because type Tx []byte shares the same underlying type, guaranteeing identical slice-header layout. A comment noting this distinction would help future readers.

7. defer pbMetadataPool.Put(pm) called after pbDataPool.Put(pd) on same path (serialization.go:136–140)

pd is returned to pbDataPool at line 148, then the deferred pm cleanup runs. Between those two moments, pd (with pd.Metadata = nil after Reset()) is back in the pool and could be grabbed by another goroutine before pm is returned. This is technically safe since pd.Metadata is already nil after Reset(), but the interleaved pool lifecycle is subtle. Adding an explicit pm.Reset(); pbMetadataPool.Put(pm) before the pd.Reset() (and removing the defer) would make the ordering unambiguous.


Testing

  • There is no pooled/concurrent test for Data.MarshalBinary. A TestDataMarshalBinaryRoundTrip with t.Parallel() plus a concurrent stress path exercising the pools would strengthen confidence (especially under -race).
  • Codecov flags 32 missing lines in types/serialization.go — the legacy-field encoding paths and the State.MarshalBinary pool path appear uncovered. Worth adding tests for marshalLegacyHeader round-trip and the pooled State.MarshalBinary directly.
  • The new benchmark file hash_memo_bench_test.go is clean and useful.

Summary

| Category | Count |
|----------|-------|
| Critical (correctness) | 2 |
| Medium | 3 |
| Low/Informational | 2 |

The ApplyBlock aliasing issue (#1) and pool escape on panic (#2) are the most important to resolve before merge. The rest are polish.

@github-actions
Contributor

github-actions bot commented Apr 4, 2026

PR Preview Action v1.8.1


🚀 View preview at
https://evstack.github.io/docs-preview/pr-3225/

Built to branch main at 2026-04-04 10:13 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@tac0turtle tac0turtle force-pushed the autoresearch/block-perf-2026-04-04 branch from 9148ccc to 2438dc0 (April 4, 2026 10:10)
@codecov

codecov bot commented Apr 4, 2026

Codecov Report

❌ Patch coverage is 93.16770% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.06%. Comparing base (ff88b95) to head (636ce24).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| types/serialization.go | 93.19% | 9 Missing and 1 partial ⚠️ |
| pkg/store/batch.go | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3225      +/-   ##
==========================================
+ Coverage   61.45%   62.06%   +0.61%     
==========================================
  Files         120      120              
  Lines       12504    12778     +274     
==========================================
+ Hits         7684     7931     +247     
- Misses       3959     3975      +16     
- Partials      861      872      +11     
| Flag | Coverage | Δ |
|------|----------|---|
| combined | 62.06% <93.16%> | +0.61% ⬆️ |

