Skip to content

[ISSUE #10442] Optimize message properties encode/decode to reduce allocation#10444

Open
wang-jiahua wants to merge 1 commit into
apache:developfrom
wang-jiahua:perf/flat-properties-map
Open

[ISSUE #10442] Optimize message properties encode/decode to reduce allocation#10444
wang-jiahua wants to merge 1 commit into
apache:developfrom
wang-jiahua:perf/flat-properties-map

Conversation

@wang-jiahua

@wang-jiahua wang-jiahua commented Jun 8, 2026

Copy link
Copy Markdown

Which Issue(s) This PR Fixes

Fixes #10442

Brief Description

Optimizes the message properties encode/decode hot path on the broker send and dispatch paths. All changes are within the common module's MessageDecoder, MessageConst, and MessageExtBrokerInner.

Changes

  • MessageDecoder.string2messageProperties: right-size HashMap from separator count instead of fixed capacity 128; intern well-known property keys via length-bucketed regionMatches (skip substring + hash lookup); intern frequent values ("0", "1", "true", "false", "DefaultRegion"). Added (String, int) overload for callers that need extra capacity for broker-injected entries.
  • MessageDecoder.messageProperties2Bytes: direct UTF-8 byte serialization of Map entries, skipping the StringBuilder → String → getBytes round-trip on the broker write path.
  • MessageDecoder.bytes2messageProperties: parse directly from byte[] with ASCII key matching, skipping the intermediate new String(bytes, ...) allocation on the dispatch path.
  • MessageDecoder.messageProperties2String: reuse ThreadLocal<StringBuilder> with capacity cap (64KB) to bound per-thread memory after oversized messages.
  • MessageConst: add STRING_INTERN_MAP and STRING_INTERN_BY_LEN for canonical key interning.
  • MessageExtBrokerInner: add propertiesData byte[] cache field to short-circuit propertiesString → byte[] re-encoding on the write path; getPropertiesData() returns defensive copy; getEffectivePropertiesData() (package-private) returns internal buffer for the encode hot path.

Note: Message.getProperty / getPriority fixes and MessageClientIDSetter / MessageVersion optimizations have been split into separate PRs (#10468, #10469) for focused review.

Compatibility

  • string2messageProperties(String) remains the primary public API; the new (String, int) overload is additive.
  • bytes2messageProperties and messageProperties2Bytes are new public methods, additive only.
  • messageProperties2String output is unchanged; the ThreadLocal is an internal optimization.
  • MessageExtBrokerInner.propertiesData is a new transient field; existing serialization is unaffected.

How Did You Test This Change

New unit tests added in MessageDecoderTest:

  • testMessageProperties2BytesMatchesString — verifies byte-for-byte equivalence with the String path for ASCII, multi-byte UTF-8, and paired surrogates.
  • testMessageProperties2StringAndBytesSkipNullKeyAndValue — verifies both paths skip null key/value entries consistently.
  • testMessageProperties2BytesUnpairedSurrogateMatchesJdk — verifies unpaired surrogate handling matches JDK behavior.
  • testBytes2messagePropertiesReturnsIndependentMap — verifies consecutive decodes on the same thread produce independent maps.
  • testMessageProperties2BytesNullAndEmpty — verifies null/empty input handling.
mvn -pl common -Dcheckstyle.skip=false -Dspotbugs.skip=true validate
# 0 Checkstyle violations

mvn -pl common -Dcheckstyle.skip -Dspotbugs.skip=true -Djacoco.skip=true \
  -Dtest='MessageDecoderTest' test
# Tests run: 16, Failures: 0, Errors: 0, Skipped: 0

Copilot AI review requested due to automatic review settings June 8, 2026 09:26

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR focuses on reducing allocations and improving performance on the broker hot path by optimizing message property serialization/parsing and reusing buffers.

Changes:

  • Added UTF-8 byte-level encode/decode paths for message properties to avoid intermediate String allocations.
  • Introduced reusable ThreadLocal buffers (StringBuilder, char[], and a properties map) to reduce per-message allocations.
  • Added key/value interning helpers to reduce duplicated String allocations when parsing properties.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
common/src/main/java/org/apache/rocketmq/common/message/MessageVersion.java Simplifies magic-code to version mapping with direct comparisons.
common/src/main/java/org/apache/rocketmq/common/message/MessageExtBrokerInner.java Adds cached UTF-8 propertiesData and lazy encoding path.
common/src/main/java/org/apache/rocketmq/common/message/MessageDecoder.java Adds byte-based properties encode/decode, interning, and ThreadLocal reuse.
common/src/main/java/org/apache/rocketmq/common/message/MessageConst.java Adds intern tables for frequent property keys to avoid substring allocations.
common/src/main/java/org/apache/rocketmq/common/message/MessageClientIDSetter.java Reuses a ThreadLocal char[] buffer for uniq ID creation.
common/src/main/java/org/apache/rocketmq/common/message/Message.java Removes side-effect allocation in getProperty and optimizes priority parsing.
common/src/main/java/org/apache/rocketmq/common/message/FlatPropertiesMap.java Introduces a compact map for decoded properties with fast encoding support.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread common/src/main/java/org/apache/rocketmq/common/message/MessageDecoder.java Outdated
@qianye1001

Copy link
Copy Markdown
Contributor

有个问题要注意下, 这个map的put 也需要有后覆盖前的语义(包括entry的正确性),以及多次decode会不会有问题

@wang-jiahua wang-jiahua force-pushed the perf/flat-properties-map branch 2 times, most recently from 1be93d8 to a6359a4 Compare June 8, 2026 13:38

@oss-sentinel-ai oss-sentinel-ai left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot

Summary

This PR introduces performance optimizations for message properties encoding/decoding by replacing HashMap allocation with a compact FlatPropertiesMap backed by a flat Object[] array, adding ThreadLocal reuse for StringBuilder and char[] buffers, and interning high-frequency property keys.

Findings

  • [Good] MessageDecoder.java:605-635 — ThreadLocal StringBuilder reuse with proper capacity cap (REUSABLE_SB_CAP_LIMIT = 64KB) prevents unbounded memory growth.
  • [Good] MessageDecoder.java:644-673 — New messageProperties2Bytes() method directly encodes to UTF-8 bytes, avoiding String allocation on the hot path.
  • [Good] MessageDecoder.java:675-720 — Custom UTF-8 encoder correctly handles edge cases including unpaired surrogates (matches JDK behavior).
  • [Good] MessageDecoder.java:722-770 — bytes2messageProperties() returns independent HashMap (not ThreadLocal-shared), preventing cross-decode corruption.
  • [Good] MessageConst.java:1-35 — Interning of high-frequency property keys reduces String allocation.
  • [Good] MessageDecoderTest.java:431-556 — Comprehensive test coverage including ASCII, multi-byte UTF-8, surrogate pairs, and null handling.

Suggestions

  1. Consider adding a benchmark (JMH) to quantify the performance improvement on the encode/decode hot path.
  2. The REUSABLE_SB_CAP_LIMIT of 64KB is reasonable; document the rationale for this specific value.

Verdict

Approved — Well-designed performance optimization with proper memory safety guards and comprehensive test coverage.


Automated review by github-manager-bot

@wang-jiahua wang-jiahua force-pushed the perf/flat-properties-map branch from a6359a4 to e6fd9a2 Compare June 9, 2026 03:38
@fuyou001

fuyou001 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

test3

Comment thread common/src/main/java/org/apache/rocketmq/common/message/Message.java Outdated
Comment thread common/src/main/java/org/apache/rocketmq/common/message/MessageConst.java Outdated
@RongtongJin

Copy link
Copy Markdown
Contributor

Overall suggestion: this PR currently bundles several independent optimizations with different risk profiles: public Message accessor behavior, client ID buffer reuse, property key interning, direct byte encoding/decoding, and broker/store-side property byte caching. It would be much easier to review and validate if this were split into smaller PRs, each with its own benchmark or targeted regression tests. For example, the parser/interning changes, the direct messageProperties2Bytes encoder plus store hot-path integration, and the Message API behavior change could be reviewed independently. That would also make it clearer which optimization actually improves the intended per-message allocation path and reduce the chance of unrelated compatibility regressions being merged together.

@wang-jiahua wang-jiahua force-pushed the perf/flat-properties-map branch from e6fd9a2 to b7427cb Compare June 10, 2026 14:06
@wang-jiahua wang-jiahua changed the title [ISSUE #10442] Introduce FlatPropertiesMap and optimize message properties encode/decode [ISSUE #10442] Optimize message properties encode/decode to reduce allocation Jun 10, 2026
…uce allocation

Focuses on the properties encode/decode hot path (broker send + dispatch):

- MessageDecoder.string2messageProperties: right-size HashMap from
  separator count instead of fixed capacity 128; intern well-known
  property keys via length-bucketed regionMatches (skip substring+hash);
  intern frequent values ("0","1","true","false","DefaultRegion").
- MessageDecoder.messageProperties2Bytes: direct UTF-8 byte serialization
  of Map entries, skipping the StringBuilder→String→getBytes round-trip.
- MessageDecoder.bytes2messageProperties: parse directly from byte[]
  with ASCII key matching, skipping intermediate String allocation.
- MessageDecoder.messageProperties2String: reuse ThreadLocal StringBuilder
  with capacity cap to bound per-thread memory.
- MessageConst: add STRING_INTERN_MAP and STRING_INTERN_BY_LEN for
  canonical key interning.
- MessageExtBrokerInner: add propertiesData byte[] cache field to
  short-circuit propertiesString→byte[] re-encoding on the write path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wang-jiahua wang-jiahua force-pushed the perf/flat-properties-map branch from b7427cb to e9e34a2 Compare June 10, 2026 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] Introduce FlatPropertiesMap and optimize message properties encode/decode to reduce allocation

6 participants