Base58: replace OpenSSL BIGNUM with pure-bytes encode/decode#337
Open
heifner wants to merge 3 commits into
Open
Base58: replace OpenSSL BIGNUM with pure-bytes encode/decode#337heifner wants to merge 3 commits into
heifner wants to merge 3 commits into
Conversation
EncodeBase58 and DecodeBase58 walked the input through an OpenSSL BIGNUM, costing 12+ heap allocations per call (BN_new, BN_CTX_new, OPENSSL_malloc inside BN_div). On a node serving /v1/trace_api/get_block this dominated the allocation profile -- base58-encoding signatures in served blocks accounted for ~5.7M allocations in a 30s 5k-TPS heaptrack capture. Adopt Bitcoin Core's pure-bytes algorithm (repeated divmod-58 on a working byte buffer). The algorithm is mathematically identical to the BIGNUM path, so encoded output is bit-for-bit unchanged. The working buffer uses boost::container::small_vector with 128-byte inline capacity, which covers every input the codebase encodes -- 32-byte private keys, 33-byte public keys, 65-byte signatures -- without touching the heap. The only remaining allocation is the std::string output. Drop the CBigNum wrapper class and the <openssl/bn.h> include; neither is used elsewhere in this file. Public API in fc/crypto/base58.hpp is unchanged. Add test/crypto/test_base58.cpp covering Bitcoin Core's 12 canonical encode_decode.json vectors, empty input, leading-zero preservation, invalid-character rejection, whitespace tolerance, and randomized roundtrips at 32/33/65-byte input sizes.
Two follow-ups from review: - Change zeroes/size/length/i from int to size_t in both encode_base58 and decode_base58. The encode size formula `(pend - pbegin) * 138 / 100 + 1` was previously narrowed via static_cast<int>, which would overflow for inputs above ~15.5MB. No realistic call site reaches that, but size_t closes the theoretical gap and removes two redundant static_cast<size_t>(zeroes) calls at the str.reserve / vch.reserve sites. The inner-loop `carry` and `digit` stay int -- digit can be -1, and carry stays in a small bounded range. - Drop the k_ prefix on file-local constants (b58_alphabet, b58_inline_capacity, b58_reverse). Other libfc file-locals use bare snake_case (hex_digits in utf8.cpp, minimize_max_size in variant.cpp).
CI abi_large_signature regressed: the test feeds a 256KB webauthn signature through abi_serializer::to_variant with a 1ms deadline, and asserts total elapsed <= 51ms. The previous base58.cpp only yielded at the entry/exit of to_base58, so the encode ran to completion (~158s in CI) and the deadline never fired. Push the yield down into encode_base58 and call it every 64 input bytes via a cheap mask check. Small inputs (32-65 byte keys/sigs) incur zero yield overhead -- the mask only fires at byte 64 and beyond. Large inputs trip the deadline within 64 outer iters of crossing it.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Heaptrack on a node serving
/v1/trace_api/get_block(5k TPS, 30s) flags base58-encoding of signatures as the dominant allocator -- ~5.7MOPENSSL_malloccalls throughBN_rshift/BN_divinsideEncodeBase58. The libfc port of base58 is a 2012-era Bitcoin Core fork that still walks the input through an OpenSSL BIGNUM, allocating perBN_new, perBN_CTX_new, and per internalBN_divstep.This PR replaces the BIGNUM path with Bitcoin Core's pure-bytes algorithm (repeated
divmod 58on a working byte buffer). The algorithm is mathematically identical -- both compute the base-58 representation of the input treated as a big-endian integer -- so encoded outputs are bit-for-bit unchanged.Changes
libraries/libfc/src/crypto/base58.cpprewritten against the pure-bytes algorithm;CBigNumwrapper class and the<openssl/bn.h>include removed.boost::container::small_vector<unsigned char, 128>, so signatures / public keys / private keys all encode without a heap allocation for the working state.fc::to_base58,fc::from_base58) is unchanged; existing callers compile and behave identically.libraries/libfc/test/crypto/test_base58.cppcovering Bitcoin Core's 12 canonicalencode_decode.jsonvectors, empty input, leading-zero preservation, invalid-character rejection, whitespace tolerance, and randomized roundtrips at 32 / 33 / 65 byte input sizes.Allocations per call