Skip to content

Base58: replace OpenSSL BIGNUM with pure-bytes encode/decode#337

Open
heifner wants to merge 3 commits into
masterfrom
feature/base58-no-bignum
Open

Base58: replace OpenSSL BIGNUM with pure-bytes encode/decode#337
heifner wants to merge 3 commits into
masterfrom
feature/base58-no-bignum

Conversation

@heifner
Copy link
Copy Markdown
Contributor

@heifner heifner commented May 14, 2026

Summary

Heaptrack on a node serving /v1/trace_api/get_block (5k TPS, 30s) flags base58-encoding of signatures as the dominant allocator -- ~5.7M OPENSSL_malloc calls through BN_rshift / BN_div inside EncodeBase58. The libfc port of base58 is a 2012-era Bitcoin Core fork that still walks the input through an OpenSSL BIGNUM, allocating per BN_new, per BN_CTX_new, and per internal BN_div step.

This PR replaces the BIGNUM path with Bitcoin Core's pure-bytes algorithm (repeated divmod 58 on a working byte buffer). The algorithm is mathematically identical -- both compute the base-58 representation of the input treated as a big-endian integer -- so encoded outputs are bit-for-bit unchanged.

Changes

  • libraries/libfc/src/crypto/base58.cpp rewritten against the pure-bytes algorithm; CBigNum wrapper class and the <openssl/bn.h> include removed.
  • Working buffers use boost::container::small_vector<unsigned char, 128>, so signatures / public keys / private keys all encode without a heap allocation for the working state.
  • Public API (fc::to_base58, fc::from_base58) is unchanged; existing callers compile and behave identically.
  • New libraries/libfc/test/crypto/test_base58.cpp covering Bitcoin Core's 12 canonical encode_decode.json vectors, empty input, leading-zero preservation, invalid-character rejection, whitespace tolerance, and randomized roundtrips at 32 / 33 / 65 byte input sizes.

Allocations per call

Input Before (BIGNUM) After
32-byte private key 12+ heap allocs 1 (output string)
33-byte public key 12+ heap allocs 1 (output string)
65-byte signature 12+ heap allocs 1 (output string)

heifner added 3 commits May 14, 2026 10:34
EncodeBase58 and DecodeBase58 walked the input through an OpenSSL BIGNUM, costing 12+ heap allocations per call (BN_new, BN_CTX_new, OPENSSL_malloc inside BN_div). On a node serving /v1/trace_api/get_block this dominated the allocation profile -- base58-encoding signatures in served blocks accounted for ~5.7M allocations in a 30s 5k-TPS heaptrack capture.

Adopt Bitcoin Core's pure-bytes algorithm (repeated divmod-58 on a working byte buffer). The algorithm is mathematically identical to the BIGNUM path, so encoded output is bit-for-bit unchanged. The working buffer uses boost::container::small_vector with 128-byte inline capacity, which covers every input the codebase encodes -- 32-byte private keys, 33-byte public keys, 65-byte signatures -- without touching the heap. The only remaining allocation is the std::string output.

Drop the CBigNum wrapper class and the <openssl/bn.h> include; neither is used elsewhere in this file. Public API in fc/crypto/base58.hpp is unchanged.

Add test/crypto/test_base58.cpp covering Bitcoin Core's 12 canonical encode_decode.json vectors, empty input, leading-zero preservation, invalid-character rejection, whitespace tolerance, and randomized roundtrips at 32/33/65-byte input sizes.
Two follow-ups from review:

- Change zeroes/size/length/i from int to size_t in both encode_base58
  and decode_base58. The encode size formula `(pend - pbegin) * 138 /
  100 + 1` was previously narrowed via static_cast<int>, which would
  overflow for inputs above ~15.5MB. No realistic call site reaches
  that, but size_t closes the theoretical gap and removes two redundant
  static_cast<size_t>(zeroes) calls at the str.reserve / vch.reserve
  sites. The inner-loop `carry` and `digit` stay int -- digit can be -1,
  and carry stays in a small bounded range.

- Drop the k_ prefix on file-local constants (b58_alphabet,
  b58_inline_capacity, b58_reverse). Other libfc file-locals use bare
  snake_case (hex_digits in utf8.cpp, minimize_max_size in variant.cpp).
CI abi_large_signature regressed: the test feeds a 256KB webauthn
signature through abi_serializer::to_variant with a 1ms deadline, and
asserts total elapsed <= 51ms. The previous base58.cpp only yielded
at the entry/exit of to_base58, so the encode ran to completion (~158s
in CI) and the deadline never fired.

Push the yield down into encode_base58 and call it every 64 input
bytes via a cheap mask check. Small inputs (32-65 byte keys/sigs)
incur zero yield overhead -- the mask only fires at byte 64 and beyond.
Large inputs trip the deadline within 64 outer iters of crossing it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant