
PerformanceHarness: use trace_api/get_actions for post-test trx extraction#302

Open
heifner wants to merge 7 commits into feature/trace-api-history from pr295-get_actions-harness

Contributor

@heifner heifner commented Apr 19, 2026

Summary

Migrates the PerformanceHarness post-test block/trx extraction from trace_api/get_block to trace_api/get_actions + chain/get_block_info, exercising the new endpoint introduced in this PR and removing the base58-signature serialization that dominated harness CPU.

Why

queryBlockTrxData iterates over every block in the test range. The old call path, trace_api/get_block, serialized the entire block including every signature. Base58-encoding each signature performs BigNum division in OpenSSL; heaptrack showed ~5.7M OPENSSL_malloc calls on a 5k TPS 30s run. The harness only reads trx_id, block_num, block_time, cpu_usage_us, net_usage_words, and the first action's name (to skip onblock). Signatures are never consulted.

get_actions returns exactly the fields needed, filtered server-side by action name so onblock trxs are dropped without any base58 work. chain/get_block_info covers the block-header fields (id, producer, timestamp).
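To illustrate why base58 is expensive, here is a minimal sketch of a generic base58 encoder in Python. It is not the node's OpenSSL-based implementation, and the 65-byte input is only a stand-in for a signature-sized payload (an assumption for illustration). The point is that every output character costs one division of the whole big integer by 58.

```python
from typing import Tuple

# Bitcoin-style base58 alphabet (no 0, O, I, l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> Tuple[str, int]:
    """Return (encoded string, number of big-integer divisions performed)."""
    n = int.from_bytes(data, "big")
    digits = []
    divisions = 0
    while n > 0:
        # One full-width big-integer division per output character.
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
        divisions += 1
    # Each leading zero byte is encoded as a literal '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + "".join(reversed(digits)), divisions


# Hypothetical signature-sized payload: 65 arbitrary non-zero-led bytes.
encoded, divisions = base58_encode(bytes(range(1, 66)))
print(len(encoded), divisions)
```

Multiplied across every signature in every block of a high-TPS run, that per-character division is the kind of work the old path paid for and the harness never used.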

Changes

  • tests/PerformanceHarness/performance_test_basic.py: queryBlockTrxData now issues two lightweight calls per block (chain/get_block_info + trace_api/get_actions) instead of the heavy trace_api/get_block. The action-name filter is derived from userTrxDataDict.actions[0].actionName, so all test variants (transfer, cpu, ram, net, newaccount, doit, doitslow) work automatically. When no user data is configured, the harness falls back to an explicit client-side onblock skip. Block finality (the status column in blockData.txt) is read from the per-action block_status field added in PR #295 (trace_api: history-solution upgrade - auto ABIs, query APIs, indexes). When the action filter drops every entry in a block (typically the leading/trailing ramp window), the harness falls back to trace_api/get_block for that block's status, so finality still comes from a single source of truth, trace_api's data log, rather than mixing in chain/get_info LIB.
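The two-call flow described above can be sketched roughly as follows. This is not the code in performance_test_basic.py: `query_block`, the `post` callable, the get_actions request keys (`block_num`, `action`), the `actions` response key, and the per-action `action` name field are all assumptions made for illustration. Only the endpoint names, the `payload` wrapper, and the header fields (id, producer, timestamp) come from this PR.

```python
def query_block(post, block_num, action_name=None):
    """Fetch one block's header and user actions via two lightweight calls.

    `post(endpoint, params)` is a hypothetical helper that returns the
    decoded JSON response wrapped in a "payload" key.
    """
    info = post("chain/get_block_info", {"block_num": block_num})["payload"]

    request = {"block_num": block_num}       # assumed request shape
    if action_name is not None:
        request["action"] = action_name      # assumed server-side filter key
    actions = post("trace_api/get_actions", request)["payload"]["actions"]

    if action_name is None:
        # No user data configured: drop onblock on the client side.
        actions = [a for a in actions if a.get("action") != "onblock"]

    header = {
        "id": info["id"],
        "producer": info["producer"],
        "timestamp": info["timestamp"],
    }
    return header, actions
```

Passing the transport in as a callable keeps the sketch testable without a running node; the real harness would use its own node client instead.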

Measurement

On a 50k-TPS 90s test (identical config, same combined-perf branch):

| Harness | Total runtime | Post-test overhead |
|---|---|---|
| old trace_api/get_block | 552 s | 462 s |
| new trace_api/get_actions | ~366 s (90s extrapolated) | ~276 s |

~40% less time spent in post-test extraction+analysis.
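As a quick check of the arithmetic, the snippet below recomputes the percentages from the table values only; no new measurements are introduced. Both rows imply the same 90 s of actual test time, so the saving is entirely in post-test work.

```python
old_total, old_overhead = 552, 462   # seconds, trace_api/get_block path
new_total, new_overhead = 366, 276   # seconds, trace_api/get_actions path (approximate)

# Both runs spend the same 90 s in the test itself.
assert old_total - old_overhead == new_total - new_overhead == 90

overhead_reduction = (old_overhead - new_overhead) / old_overhead
total_reduction = (old_total - new_total) / old_total

print(f"post-test overhead reduction: {overhead_reduction:.1%}")  # 40.3%
print(f"total runtime reduction:      {total_reduction:.1%}")     # 33.7%
```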

Targets feature/trace-api-history so it rolls into PR #295.

…ction

queryBlockTrxData previously called trace_api/get_block per block, which
serialized the entire block including base58-encoded signatures
(OpenSSL BN_div alloc storm - heaptrack showed ~5.7M OPENSSL_malloc
calls on a 5k TPS 30s test). The harness only needs trx id, block
num/time, cpu/net usage, and the block header - signatures are never
consulted.

Switch to two lightweight endpoints: chain/get_block_info for block
header and trace_api/get_actions (filtered to the userTrxData's
configured action name) for per-action data. Action-name filter is
derived from userTrxDataDict so all test variants (transfer, cpu, ram,
net, newaccount, doit, etc.) work out of the box, and onblock trxs
are dropped server-side.

Measured: ~40% reduction in post-test harness overhead for a 50k TPS
90s run (462s -> ~276s of non-test time).
if trxId in seen:
continue
seen.add(trxId)
cpu = action.get('cpu_usage_us', 0) or 0
Contributor

This now records per-action cpu_usage_us / net_usage, but the old code recorded per-transaction cpu_usage_us / net_usage_words from trace_api/get_block. That changes both scope and units: net_usage is bytes on the action trace, while the previous net_usage_words was ceil(transaction.net_usage / 8). The harness’ block size and transaction net stats will no longer be comparable, and multi-action transactions will undercount CPU/net by taking only the first matching action. I’d either add transaction-level resource fields to get_actions, or keep a lightweight transaction summary endpoint for this harness path.

Contributor Author

Good catch. Went with option 1 - added trx_cpu_usage_us and trx_net_usage_words to each get_actions action variant in PR #295 (feature/trace-api-history) at 61906f8, then gated them to the full shape only at 22880d2 so get_token_transfers (slim) keeps its "no resource fields" intent. This branch consumes the new fields at c12471d, so multi-action trxs are correctly attributed and net is back in words to match the previous trace_api/get_block shape.

heifner added 3 commits May 8, 2026 09:30
get_actions emits per-action cpu_usage_us / net_usage and per-trx
trx_cpu_usage_us / trx_net_usage_words on each action variant.  The
harness wants per-trx totals, so read the trx_* fields:

* multi-action trxs no longer undercount when the filter only catches
  one of their actions
* net is back in words (ceil(bytes / 8)) matching the trace_api/get_block
  shape the harness used before d7aa4fc

The dedup-by-trx_id loop still prunes duplicate entries when a trx has
multiple actions matching the filter.
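A minimal sketch of the dedup-and-attribute step, assuming each get_actions entry is a dict carrying trx_id, trx_cpu_usage_us and trx_net_usage_words. The function names are hypothetical and this is not the harness code itself. `words_from_bytes` only illustrates the ceil(bytes / 8) relationship; the harness reads the already-converted field.

```python
def words_from_bytes(net_bytes):
    """ceil(net_bytes / 8) using integer arithmetic."""
    return (net_bytes + 7) // 8


def per_trx_usage(actions):
    """Collapse per-action entries into one (trx_id, cpu_us, net_words) row per trx."""
    seen = set()
    rows = []
    for action in actions:
        trx_id = action["trx_id"]
        if trx_id in seen:
            # A trx with several matching actions appears several times;
            # its trx_* totals are identical on each entry, so keep the first.
            continue
        seen.add(trx_id)
        cpu = action.get("trx_cpu_usage_us", 0) or 0
        net = action.get("trx_net_usage_words", 0) or 0
        rows.append((trx_id, cpu, net))
    return rows


# Made-up sample: trx "a" has two matching actions, trx "b" has one.
sample = [
    {"trx_id": "a", "cpu_usage_us": 40, "trx_cpu_usage_us": 100, "trx_net_usage_words": 16},
    {"trx_id": "a", "cpu_usage_us": 60, "trx_cpu_usage_us": 100, "trx_net_usage_words": 16},
    {"trx_id": "b", "cpu_usage_us": 30, "trx_cpu_usage_us": 30, "trx_net_usage_words": None},
]
print(per_trx_usage(sample))
```

Reading the trx_* totals rather than the per-action values is what keeps trx "a" at 100 us instead of the 40 us its first matching action would report.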
@heifner heifner requested a review from huangminghuang May 8, 2026 15:20
producer=block["payload"]["producer"], status=block["payload"]["status"],
_timestamp=block["payload"]["timestamp"])
producer=blockInfo["payload"]["producer"],
status="executed",
Contributor

This line now hardcodes every block status as "executed". The previous trace_api/get_block path wrote the block status from trace API ("irreversible" or "pending"), and blockData.txt still has a status column. This silently changes the artifact semantics, and "executed" is a transaction status, not a block finality status. If downstream harness/report tooling reads that field, it loses finality information. I’d either preserve the old meaning by deriving pending/irreversible from chain/get_info/LIB, or make the artifact schema change explicit.

Contributor Author

Good catch -- went with option 1's intent but kept the source consistent with trace_api rather than crossing the correlation boundary to chain/get_info.

Server side (feature/trace-api-history @ 200ec2d): get_actions and get_token_transfers now emit block_status ("irreversible" / "pending") on every action, sourced from the same data-log tuple that get_block uses for its status field. Slim shape carries it too. Doc notes that operators wanting only-irreversible responses can run nodeop with read-mode = irreversible.

Harness (this branch @ 09b31ce): reads block_status from the first action of each block (all actions in a block share it). When the action filter drops every entry in a block (typically the leading/trailing ramp window), falls back to trace_api/get_block for that block's status -- still a single trace_api source of truth. blockData.txt schema unchanged, every row has a real finality value.

heifner added 2 commits May 8, 2026 14:06
Brings in get_actions / get_token_transfers block_status field (200ec2d) so the harness can read finality from the same data-log tuple instead of mixing in chain/get_info LIB.
…et_block on empty

Replace the hardcoded status="executed" (which was a transaction-status concept misapplied to a block row) with the real block finality from trace_api. All actions in a block share the same block_status, so capture it once from the first action.

Blocks where the configured action filter dropped every entry (typically the leading/trailing ramp window) have no action to read from. Fall back to trace_api/get_block on those, so finality continues to come from trace_api's data log rather than mixing in a chain/get_info LIB read. Cost is bounded -- empty/onblock-only blocks have no trx signatures to base58-encode, so the heavy serialization the surrounding refactor avoided does not show up here either.
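The status selection described in this commit can be sketched as below. `block_status_for` and the `get_block` callable are hypothetical stand-ins; the `block_status` field and the `["payload"]["status"]` access follow the shapes quoted elsewhere in this PR.

```python
def block_status_for(actions, block_num, get_block):
    """Return block finality from trace_api data only.

    `get_block(block_num)` is a hypothetical wrapper around
    trace_api/get_block; it is called only when the filter left no actions.
    """
    if actions:
        # All actions in a block share the same block_status.
        return actions[0]["block_status"]
    # Empty or onblock-only block: fall back to the heavier endpoint.
    return get_block(block_num)["payload"]["status"]
```

Because the fallback fires only for blocks without user transactions, it stays off the hot path the refactor was meant to lighten.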
@huangminghuang huangminghuang self-requested a review May 8, 2026 19:09
Contributor

@huangminghuang huangminghuang left a comment

performance_test_basic.py (line 367) now passes chain/get_block_info’s timestamp into blockData, but log_reader.py (line 146) always strips the last character because the old trace_api/get_block timestamp ended in Z. get_block_info serializes block_timestamp_type via fc::time_point::to_iso_string() without Z, so blockData.txt will now write timestamps like .50 instead of .500. The harness’ latency math likely survives for 0/500ms blocks, but the artifact format loses precision/shape and can surprise downstream parsers. Normalize by appending Z, or make blockData.timestamp strip only a trailing Z.

trace_api/get_block emits ISO8601 with a 'Z' suffix; chain/get_block_info
serializes block_timestamp_type via fc::to_iso_string() with no zone marker.
The setter blindly stripped the last character, which truncated '.500' to
'.50' for get_block_info. Strip 'Z' only when present so sub-second
precision survives either source and parsing tolerates either shape.
Contributor Author

heifner commented May 8, 2026

Fixed in 16ad816 (re: timestamp shape feedback). blockData.timestamp setter now strips 'Z' only when present, so it tolerates either trace_api/get_block ('Z' suffix) or chain/get_block_info (no suffix) and keeps full sub-second precision in both cases. blockData.txt is write-only in the harness so the on-disk shape stays as-was (no trailing Z, full '.500' precision).
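For reference, the difference between the old and new setter behavior comes down to the following. This is a sketch with hypothetical function names and made-up timestamps, not the code in log_reader.py.

```python
def blind_strip(ts):
    """Old behavior: always drop the last character."""
    return ts[:-1]


def strip_trailing_z(ts):
    """New behavior: drop a trailing 'Z' only when one is present."""
    return ts[:-1] if ts.endswith("Z") else ts


with_z = "2026-05-08T15:20:00.500Z"      # trace_api/get_block shape
without_z = "2026-05-08T15:20:00.500"    # chain/get_block_info shape

print(blind_strip(without_z))            # 2026-05-08T15:20:00.50  (precision lost)
print(strip_trailing_z(with_z))          # 2026-05-08T15:20:00.500
print(strip_trailing_z(without_z))       # 2026-05-08T15:20:00.500
```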
