feat(metrics): migrate sei-cosmos to OpenTelemetry (PLT-353) #3467
amir-deris wants to merge 1 commit into
PR Summary
Medium Risk
Overview: In [...] In storage and modules, OTel metrics are added for bounded cache evictions and gas exceeded errors ([...]
Reviewed by Cursor Bugbot for commit bf58610. Bugbot is set up for automated code reviews on this repo.
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
Cursor Bugbot has reviewed your changes and found 3 potential issues.
defer telemetry.MeasureSinceWithLabels([]string{"abci", "query"}, time.Now(), []metrics.Label{{Name: "path", Value: req.Path}})
queryStart := time.Now()
defer func() {
	baseappMetrics.abciQueryDuration.Record(ctx, time.Since(queryStart).Seconds(), otelmetric.WithAttributes(attribute.String("path", req.Path)))
ABCI query path cardinality
High Severity
baseapp_abci_query_duration labels each series with the raw path from RequestQuery, which clients choose freely. That creates an unbounded set of metric attribute combinations and can grow memory use in the OTel metrics backend.
Triggered by learned rule: OTel metrics: guard attribute cardinality and use native types
This one is a blocker, I'm afraid.
Cc @amir-deris
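One way to bound the `path` attribute is to collapse the raw request path into a small fixed set of values before recording. The sketch below is illustrative only: `normalizeQueryPath` is a hypothetical helper, not something in this PR, and the bucket names assume the usual ABCI route prefixes (`app`, `store`, `p2p`, `custom`) plus gRPC-style `/package.Service/Method` paths.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeQueryPath maps a client-supplied ABCI query path to one of a
// small, fixed set of label values. Hypothetical helper: the bucket names
// are assumptions, not taken from the PR.
func normalizeQueryPath(path string) string {
	trimmed := strings.TrimPrefix(path, "/")
	first, _, _ := strings.Cut(trimmed, "/")

	// Known top-level route prefixes are kept as-is.
	switch first {
	case "app", "store", "p2p", "custom":
		return first
	}

	// gRPC-style paths look like "/package.Service/Method": a dotted
	// first segment followed by exactly one more segment.
	if strings.Contains(first, ".") && strings.Count(trimmed, "/") == 1 {
		return "grpc"
	}

	// Everything else, including arbitrary client input, shares one bucket.
	return "other"
}

func main() {
	fmt.Println(normalizeQueryPath("/store/bank/key"))                    // store
	fmt.Println(normalizeQueryPath("/cosmos.bank.v1beta1.Query/Balance")) // grpc
	fmt.Println(normalizeQueryPath("/made/up/by/client/12345"))           // other
}
```

With this shape the attribute can take at most six values regardless of what clients send. Collapsing every gRPC route into `grpc` loses per-method detail; an allowlist built from the registered routes would keep it while staying bounded.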
},
)
defer func() {
	govMetrics.voteTotal.Add(goCtx, 1, otelmetric.WithAttributes(attribute.String("proposal_id", strconv.FormatUint(msg.ProposalId, 10))))
Gov proposal_id metric labels
Medium Severity
Vote and deposit counters attach proposal_id from the message as an OTel string attribute. Proposal IDs increase monotonically, so label cardinality grows without bound over the life of a chain.
Additional Locations (2)
Triggered by learned rule: OTel metrics: guard attribute cardinality and use native types
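To make the growth concrete, the following sketch counts how many distinct attribute sets a counter would see under two labelling choices. It uses only the standard library; the `vote` struct and the `option` label are hypothetical stand-ins, and the review does not prescribe which bounded label (if any) should replace `proposal_id`.

```go
package main

import (
	"fmt"
	"strconv"
)

// vote is a hypothetical stand-in for the relevant fields of a gov vote message.
type vote struct {
	ProposalID uint64
	Option     string // e.g. "yes", "no", "abstain", "no_with_veto"
}

// distinctSeries returns how many distinct attribute sets (and therefore
// time series) a counter would accumulate for the given labelling function.
func distinctSeries(votes []vote, key func(vote) string) int {
	seen := make(map[string]struct{})
	for _, v := range votes {
		seen[key(v)] = struct{}{}
	}
	return len(seen)
}

// byProposalID mirrors the flagged pattern: one series per proposal.
func byProposalID(v vote) string {
	return "proposal_id=" + strconv.FormatUint(v.ProposalID, 10)
}

// byOption is one possible bounded alternative (an assumption, not from the PR).
func byOption(v vote) string {
	return "option=" + v.Option
}

func main() {
	options := []string{"yes", "no", "abstain", "no_with_veto"}
	var votes []vote
	for id := uint64(1); id <= 1000; id++ {
		votes = append(votes, vote{ProposalID: id, Option: options[id%4]})
	}
	fmt.Println(distinctSeries(votes, byProposalID)) // 1000
	fmt.Println(distinctSeries(votes, byOption))     // 4
}
```

The first count tracks the number of proposals ever voted on, while the second is capped by the size of the option enum. Dropping the attribute entirely is the simplest bounded choice of all.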
upgradeMetrics.planHeight.Record(ctx.Context(), plan.Height, otelmetric.WithAttributes(
	attribute.String("name", plan.Name),
	attribute.String("info", plan.Info),
))
Upgrade plan info attribute
Medium Severity
plan_height gauges include info from the upgrade plan as an OTel attribute. plan.Info is free-form text set via governance, so each distinct plan can add a unique label combination.
Additional Locations (1)
Triggered by learned rule: OTel metrics: guard attribute cardinality and use native types
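Where a free-form value cannot simply be dropped, a generic guard is to cap the number of distinct values an attribute may take and fold the overflow into a single bucket. The sketch below is a minimal standard-library version of that idea; `labelLimiter` and the `"other"` bucket name are hypothetical and not part of this PR. For `plan.Info` specifically, omitting the attribute is likely the simpler fix.

```go
package main

import (
	"fmt"
	"sync"
)

// labelLimiter admits at most max distinct label values; any further new
// value is reported as "other". Hypothetical helper for illustration.
type labelLimiter struct {
	mu   sync.Mutex
	max  int
	seen map[string]struct{}
}

func newLabelLimiter(max int) *labelLimiter {
	return &labelLimiter{max: max, seen: make(map[string]struct{})}
}

// value returns raw if it was already admitted or if there is still room,
// and "other" once the cap has been reached.
func (l *labelLimiter) value(raw string) string {
	l.mu.Lock()
	defer l.mu.Unlock()

	if _, ok := l.seen[raw]; ok {
		return raw
	}
	if len(l.seen) < l.max {
		l.seen[raw] = struct{}{}
		return raw
	}
	return "other"
}

func main() {
	limiter := newLabelLimiter(2)
	for _, name := range []string{"v1", "v2", "v3", "v1"} {
		fmt.Println(limiter.value(name))
	}
	// Prints: v1, v2, other, v1 (one per line).
}
```

The limiter keeps the first values it sees, so which values survive depends on arrival order. That is acceptable as a memory guard but not as a substitute for choosing a bounded attribute in the first place.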
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3467 +/- ##
==========================================
+ Coverage 59.05% 59.06% +0.01%
==========================================
Files 2188 2199 +11
Lines 182088 182234 +146
==========================================
+ Hits 107530 107639 +109
- Misses 64925 64951 +26
- Partials 9633 9644 +11
Adds OTel instrumentation to sei-cosmos following the same pattern as PLT-329, PLT-330, PLT-336, PLT-339, and PLT-343.

New instruments

baseapp (meter seicosmos_baseapp)
- mid_block_duration: histogram, seconds
- end_block_duration: histogram, seconds
- deliver_tx_duration: histogram, seconds
- tx: counter, total delivered transactions
- tx_result: counter, result label (successful/failed)
- tx_gas_used: gauge
- tx_gas_wanted: gauge
- commit_duration: histogram, seconds
- abci_query_duration: histogram, seconds, path label
- process_proposal_duration: histogram, seconds
- finalize_block_duration: histogram, seconds
- get_tx_priority_hint_duration: histogram, seconds
- run_tx_duration: histogram, seconds, mode label (replaces MeasureThroughputSinceWithLabels for TxCount)
- run_msgs_duration: histogram, seconds (replaces MeasureThroughputSinceWithLabels for MessageCount)
- run_msg_latency: histogram, seconds, type label (replaces both sei.cosmos.run.msg.latency and cosmos.run.msg.latency)

storev2/rootmulti (meter seicosmos_storev2_rootmulti)
- sc_commit_latency: histogram, seconds
- ss_version: gauge
- historical_abci_query: counter, success + proof labels
- iavl_total_key_bytes: gauge, store_name label
- iavl_total_value_bytes: gauge, store_name label
- iavl_total_num_keys: gauge, store_name label
- state_sync_keys_exported: counter

tasks (meter seicosmos_tasks)
- scheduler_retries: counter
- scheduler_incarnations: counter

store/types (meter seicosmos_store_types)
- gas_exceeded: counter, error + descriptor labels
- bounded_cache: gauge, type label

x/upgrade (meter seicosmos_x_upgrade)
- begin_blocker_duration: histogram, seconds
- plan_height: gauge, name + info labels

x/upgrade/keeper (meter seicosmos_x_upgrade_keeper)
- plan_height: gauge, name + info labels

x/auth/vesting (meter seicosmos_x_auth_vesting)
- new_account: counter
- account_amount: gauge, denom label

x/bank/keeper (meter seicosmos_x_bank_keeper)
- send_amount: gauge, denom label

x/distribution/keeper (meter seicosmos_x_distribution_keeper)
- withdraw_reward_amount: gauge, denom label
- withdraw_commission_amount: gauge, denom label

x/staking/keeper (meter seicosmos_x_staking_keeper)
- delegate: counter
- delegate_amount: gauge, denom label
- redelegate: counter
- redelegate_amount: gauge, denom label
- undelegate: counter
- undelegate_amount: gauge, denom label

x/gov/keeper (meter seicosmos_x_gov_keeper)
- proposal: counter
- vote: counter, proposal_id label
- deposit: counter, proposal_id label

Notes
- TODO(PLT-353) comments pending dashboard verification.