metrics: deduplicate TiDB Server Status panel by instance (#66680) #67037

Open
ti-chi-bot wants to merge 4 commits into pingcap:release-8.1 from ti-chi-bot:cherry-pick-66680-to-release-8.1

@ti-chi-bot ti-chi-bot commented Mar 16, 2026

This is an automated cherry-pick of #66680

What problem does this PR solve?

Issue Number: close #66193

Problem Summary:
On the TiDB Cluster panel in Clinic, TiDB Server Status counts the up metric directly. When several scrape jobs collect up for the same instance at the same time, that instance is counted once per job, so the panel shows more nodes than actually exist.

What changed and how does it work?

Updated TiDB Server Status panel PromQL to deduplicate by instance before counting:

  • Before: count(up{...} == 1) for Up and count(up{...} == 0) for Down
  • After: count(avg(up{...}) by (instance) == 1) for Up and count(avg(up{...}) by (instance) == 0) for Down

This removes duplicate counting caused by multiple jobs exposing the same up series for one instance.
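
The effect of grouping by instance before counting can be sketched outside Prometheus. The following is a minimal Python simulation, not the dashboard code: the job names, instance addresses, and helper functions are hypothetical and exist only for illustration. It uses avg as in the description above and also shows max for comparison, since the two aggregations differ when jobs disagree about the same instance.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical `up` samples keyed by (job, instance). Job names and
# addresses are invented for this sketch; they are not from the PR.
SAMPLES = {
    ("tidb", "10.0.0.1:10080"): 1,
    ("tidb-extra", "10.0.0.1:10080"): 1,
    ("tidb", "10.0.0.2:10080"): 1,
    ("tidb-extra", "10.0.0.2:10080"): 0,
    ("tidb", "10.0.0.3:10080"): 0,
}


def count_raw(samples, target):
    """Mimic count(up{...} == target): every series is counted."""
    return sum(1 for value in samples.values() if value == target)


def count_by_instance(samples, target, agg):
    """Mimic count(agg(up{...}) by (instance) == target)."""
    grouped = defaultdict(list)
    for (_job, instance), value in samples.items():
        grouped[instance].append(value)
    # One aggregated value per instance, so each instance counts at most once.
    return sum(1 for values in grouped.values() if agg(values) == target)


raw_up = count_raw(SAMPLES, 1)                # series-level count
avg_up = count_by_instance(SAMPLES, 1, mean)  # instance-level, avg
max_up = count_by_instance(SAMPLES, 1, max)   # instance-level, max

print("raw:", raw_up, "avg by instance:", avg_up, "max by instance:", max_up)
```

In this made-up data set there are three instances but five series, so the raw count reports more "up" entries than there are healthy nodes. With avg, an instance that one job sees as up and another as down averages to 0.5 and matches neither == 1 nor == 0; with max, it is counted as up.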

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.
    • Change is limited to Grafana dashboard PromQL expressions.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

Bug Fixes

  • Improved accuracy of instance availability metrics in monitoring dashboards through optimized calculation methods for more reliable service health reporting.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot added the release-note-none (denotes a PR that doesn't merit a release note), size/S (denotes a PR that changes 10-29 lines, ignoring generated files), and type/cherry-pick-for-release-8.1 (this PR is cherry-picked to release-8.1 from a source PR) labels on Mar 16, 2026

ti-chi-bot Bot commented Mar 16, 2026

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

  1. It must first be LGTMed and approved by the reviewers.
  2. For pull requests to TiDB-x branches, it must have no failed tests.
  3. AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot Bot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) on Mar 16, 2026

coderabbitai Bot commented Mar 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7898773f-d3bc-4e9a-8e4e-4b89e6ff1bec

📥 Commits

Reviewing files that changed from the base of the PR and between 0987792 and cf12724.

📒 Files selected for processing (3)
  • pkg/metrics/grafana/tidb.json
  • pkg/metrics/nextgengrafana/tidb_with_keyspace_name.json
  • pkg/metrics/nextgengrafana/tidb_worker.json

📝 Walkthrough

Modified Prometheus metric expressions in the TiDB Grafana dashboard to deduplicate instances. The up metric is now wrapped with a max aggregation grouped by instance before equality comparison, fixing overcounting when multiple scrape jobs collect metrics for the same instance.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Grafana Dashboard Metrics: pkg/metrics/grafana/tidb.json | Modified "Up" and "Down" metric expressions to include max(up{...}) by (instance) aggregation before equality comparison, preventing instance duplication in instance counts when multiple scrape jobs exist. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested reviewers

  • yibin87
  • zimulala

Poem

🐰 A rabbit hops through metrics bright,
Dedup the instances, make them right!
Where six once appeared, now two will show,
Max by instance makes the dashboard glow!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: deduplicating the TiDB Server Status panel by instance in the metrics configuration. |
| Description check | ✅ Passed | The description includes the required Issue Number reference (#66193), clearly explains the problem and solution, and completes the checklist appropriately for a Grafana dashboard configuration change. |
| Linked Issues check | ✅ Passed | The PR successfully addresses issue #66193 by updating the PromQL expression to deduplicate instances using aggregation before comparison, directly resolving the duplicate counting problem described. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to addressing the linked issue: only the Grafana dashboard PromQL expressions were modified to fix the deduplication problem, with no extraneous changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


codecov Bot commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (release-8.1@0987792). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff                @@
##             release-8.1     #67037   +/-   ##
================================================
  Coverage               ?   71.2789%           
================================================
  Files                  ?       1472           
  Lines                  ?     424908           
  Branches               ?          0           
================================================
  Hits                   ?     302870           
  Misses                 ?     101504           
  Partials               ?      20534           
| Flag | Coverage Δ |
|---|---|
| unit | 71.2789% <ø> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ |
|---|---|
| dumpling | 52.9656% <0.0000%> (?) |
| parser | ∅ <0.0000%> (?) |
| br | 41.5836% <0.0000%> (?) |

@jiong-nba
Contributor

/cherry-pick-invite

@ti-chi-bot
Member Author

@jiong-nba you're already a collaborator in repo ti-chi-bot/tidb

ti-chi-bot Bot added the size/XS label (denotes a PR that changes 0-9 lines, ignoring generated files) and removed the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) on Mar 16, 2026
@jiong-nba
Contributor

/retest

Contributor

@yibin87 yibin87 left a comment

LGTM


ti-chi-bot Bot commented Mar 17, 2026

@yibin87: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to: LGTM

ti-chi-bot Bot added the needs-1-more-lgtm (indicates a PR needs 1 more LGTM) and approved labels on Mar 17, 2026
@jiong-nba
Contributor

/retest

@jiong-nba
Contributor

/test check_dev_2


ti-chi-bot Bot commented Mar 17, 2026

@jiong-nba: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test build
/test check-dev
/test check-dev2
/test mysql-test
/test unit-test

The following commands are available to trigger optional jobs:

/test pull-br-integration-test
/test pull-check-deps
/test pull-common-test
/test pull-e2e-test
/test pull-integration-binlog-test
/test pull-integration-common-test
/test pull-integration-copr-test
/test pull-integration-ddl-test
/test pull-integration-jdbc-test
/test pull-integration-mysql-test
/test pull-integration-nodejs-test
/test pull-integration-python-orm-test
/test pull-integration-tidb-tools-test
/test pull-lightning-integration-test
/test pull-mysql-client-test
/test pull-sqllogic-test
/test pull-tiflash-test

Use /test all to run the following jobs that were automatically triggered:

pingcap/tidb/release-8.1/ghpr_build
pingcap/tidb/release-8.1/ghpr_check
pingcap/tidb/release-8.1/ghpr_check2
pingcap/tidb/release-8.1/ghpr_mysql_test
pingcap/tidb/release-8.1/ghpr_unit_test
pull-check-deps

In response to: /test check_dev_2


tiprow Bot commented Mar 17, 2026

@jiong-nba: No presubmit jobs available for pingcap/tidb@release-8.1

In response to: /test check_dev_2

@jiong-nba
Contributor

/test check-dev2


tiprow Bot commented Mar 17, 2026

@jiong-nba: No presubmit jobs available for pingcap/tidb@release-8.1

In response to: /test check-dev2

@jiong-nba
Contributor

/retest-required

@jiong-nba
Contributor

/test check-dev2


tiprow Bot commented Mar 17, 2026

@jiong-nba: No presubmit jobs available for pingcap/tidb@release-8.1

In response to: /test check-dev2

@jiong-nba
Contributor

/retest

@jiong-nba
Contributor

/test all


tiprow Bot commented Mar 17, 2026

@jiong-nba: No presubmit jobs available for pingcap/tidb@release-8.1

In response to: /test all

@jiong-nba
Contributor

/retest

@jiong-nba
Contributor

/retest

@jiong-nba
Contributor

/retest

ti-chi-bot Bot added the lgtm label and removed the needs-1-more-lgtm label (indicates a PR needs 1 more LGTM) on Mar 30, 2026

ti-chi-bot Bot commented Mar 30, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: XuHuaiyu, yibin87, zimulala

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


ti-chi-bot Bot commented Mar 30, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-17 01:39:09.848702953 +0000 UTC m=+234676.936360520: ☑️ agreed by zimulala.
  • 2026-03-30 01:54:35.594692122 +0000 UTC m=+143680.800052169: ☑️ agreed by XuHuaiyu.

@jiong-nba
Contributor

LGTM

Labels

approved, do-not-merge/cherry-pick-not-approved, lgtm, release-note-none, size/XS, type/cherry-pick-for-release-8.1


5 participants