feat: add query/segment/count metric to data nodes#19624
Conversation
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 2 |
| P3 | 0 |
| Total | 2 |
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 2 |
| P3 | 0 |
| Total | 2 |
Reviewed 14 of 14 changed files.
This is an automated review by Codex GPT-5.5
78d50df to
c273754
Compare
kfaraz
left a comment
There was a problem hiding this comment.
Looks good, left some minor comments.
| |Metric|Description|Dimensions|Normal value| | ||
| |------|-----------|----------|------------| | ||
| |`query/time`|Milliseconds taken to complete a query.|<p>Common: `dataSource`, `type`, `interval`, `hasFilters`, `duration`, `context`, `remoteAddress`, `id`, `statusCode`.</p><p> Aggregation Queries: `numMetrics`, `numComplexMetrics`.</p><p> GroupBy: `numDimensions`.</p><p> TopN: `threshold`, `dimension`.</p>|< 1s| | ||
| |`query/segment/count`|Number of segments this Historical scans for a query. This is local to this data node; the Broker metric `query/segments/count` reports the distributed query plan.|<p>Common: `dataSource`, `type`, `interval`, `hasFilters`, `duration`, `context`, `remoteAddress`, `id`.</p><p> Aggregation Queries: `numMetrics`, `numComplexMetrics`.</p><p> GroupBy: `numDimensions`.</p><p> TopN: `threshold`, `dimension`.</p>|Varies| |
There was a problem hiding this comment.
Ah, the similar metric names query/segment/count and query/segments/count (with an 'S') can be confusing. I wish the Broker metric name reflected that it was a plan-only metric.
Nit: For the use case in this PR though, I feel that query/segments/count probably makes more sense since query/segment/time reflects time taken to scan a single segment and not the total time.
There was a problem hiding this comment.
Btw, do we even need to mention the Broker metric here? It seems unrelated despite the similar names. The disambiguation, if needed, may be done in Broker metrics section itself.
| @Nullable | ||
| private final String segmentId; | ||
| @Nullable | ||
| private final Integer localSegmentCount; |
There was a problem hiding this comment.
What does the prefix local signify?
There was a problem hiding this comment.
local to this data node
There was a problem hiding this comment.
I feel like that is implicit, since the data node is not aware of segments from other nodes anyway.
| /** | ||
| * This class is responsible for emitting query/segment/time, query/wait/time and query/segmentAndCache/Time metrics for a Sink. | ||
| * This class is responsible for emitting query/segment/time, query/wait/time, query/segmentAndCache/Time and | ||
| * query/segment/count metrics for a Sink. |
There was a problem hiding this comment.
We should also call out that unlike other metrics emitted here, query/segment/count is emitted only when segmentId is non-null, since it is a total count.
| Assert.assertEquals(1, stubServiceEmitter.getMetricEventCount(DefaultQueryMetrics.QUERY_SEGMENT_COUNT)); | ||
| Assert.assertEquals(2L, stubServiceEmitter.getLatestMetricEventValue(DefaultQueryMetrics.QUERY_SEGMENT_COUNT)); |
There was a problem hiding this comment.
Nit: You may also use stubServiceEmitter.verifyEmitted and verifyValue.
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
Reviewed 16 of 16 changed files.
This is an automated review by Codex GPT-5.5
| ); | ||
| } | ||
|
|
||
| final int localSegmentCount = allSegmentReferences.size(); |
There was a problem hiding this comment.
[P2] Count realtime sinks instead of hydrants
This counts allSegmentReferences, but those references are hydrant-level: Sink.acquireSegmentReferences adds one SinkSegmentReference per FireHydrant. After a realtime segment has multiple hydrants, for example after persist/swap or with both persisted and current hydrants, a query over one Druid segment emits query/segment/count as the hydrant count instead of the segment/Sink count. That disagrees with the metric's documented segment-level meaning and with the existing per-segment metrics that aggregate by sinkSegmentId; please count distinct queried sinks/descriptors and add a multi-hydrant test.
Description
Allows for quick queries to understand how many segments a given query issued scans for against a given data node. Better than counting the unique query/segment/time or similar, especially if you are sampling higher volume metrics like query/segment/time, but still want to have quick way to know stats about # of segments queried per node.
Release note
Add query/segment/count metric to data nodes.
This PR has: