Nats v2 web#295

Open
sonus21 wants to merge 20 commits into master from nats-v2-web
Conversation

Owner

@sonus21 sonus21 commented May 6, 2026

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

  • Unit Test
  • Integration Test
  • Reactive Integration Test

Test Configuration:

  • Spring Version
  • Spring Boot Version
  • Redis Driver Version

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

sonus21 added 20 commits May 3, 2026 01:08
The coverage_report job was producing an effectively empty
jacocoTestReport.xml (3.4KB vs ~1.1MB locally) because no .class files
existed when coverageReportOnly ran — the job checked out source code
and downloaded .exec artifacts, but never compiled. JaCoCo's report
generator skips packages/classes it cannot resolve, so the merged XML
ended up with only <sessioninfo> entries and no <package> elements.

That made coverallsJacoco silently no-op via the
"source file set empty, skipping" branch in CoverallsReporter, so
"Push coverage to Coveralls" reported success without uploading.

Verified by downloading the coverage-report artifact from a recent run
and comparing its XML structure against a local build's report.

Assisted-By: Claude Code
…e Q-detail

Replace the all-stub `NatsRqueueUtilityService` with real impls for the operations
JetStream can model: `pauseUnpauseQueue` persists the `paused` flag on `QueueConfig`
in the queue-config KV bucket and notifies the local listener container so the poller
stops dispatching; `deleteMessage` is a soft delete via `MessageMetadataService`
(stream message persists, dashboard hides via the metadata flag); `getDataType`
reports `STREAM`. `moveMessage`, `enqueueMessage`, and `makeEmpty` deliberately
remain "not supported" — there is no JetStream primitive for those.
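
A minimal sketch of the pause flow described above, with `QueueConfig`, the KV bucket, and the listener container replaced by plain-Java stand-ins (the real implementation goes through the NATS key-value API and Rqueue's container interfaces; all names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Sketch of pauseUnpauseQueue: persist the paused flag in a key-value
// store, then notify the local listener container so the poller stops
// dispatching. The Map stands in for the queue-config KV bucket.
public class PauseSketch {
  static final Map<String, Boolean> kvBucket = new HashMap<>();

  static void pauseUnpauseQueue(String queue, boolean pause,
                                Consumer<Boolean> listenerContainer) {
    kvBucket.put(queue, pause);       // persist the paused flag on QueueConfig
    listenerContainer.accept(pause);  // poller stops dispatching when paused
  }

  static boolean isPaused(String queue) {
    return kvBucket.getOrDefault(queue, false);
  }
}
```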

Update `RqueueQDetailServiceImpl.getRunningTasks` / `getScheduledTasks` to return
header-only tables when the broker capabilities suppress those sections, instead of
emitting zero rows or 501s on NATS.

20 new unit tests cover the pause/delete paths and lock in the still-unsupported
operations. Updates `nats-task.md` / `nats-task-v2.md` to reflect what landed.

Assisted-By: Claude Code
End-to-end browser-tested the NATS dashboard and shipped the templates +
broker fixes uncovered by it:

- `RqueueViewControllerServiceImpl.addBasicDetails` now propagates the active
  broker's `Capabilities` to every template via `hideRunningPanel`,
  `hideScheduledPanel`, `hideCronJobs`, and `hideCharts`. Templates default
  to "show" when these are absent so the legacy Redis path is unchanged.

- `base.html` hides the Running tab when `hideRunningPanel` is set; Scheduled
  was already gated.
- `index.html` and `queue_detail.html` skip the stats / latency chart panels
  (and their JS bootstrap) when `hideCharts` is set, replacing the home
  charts with a friendly backend-aware blurb.
- `queues.html` swaps the hard-coded "backing Redis structures" copy for the
  broker-supplied `storageKicker`.

- `JetStreamMessageBroker.peek` rewritten to read messages directly from the
  stream via `JetStreamManagement.getMessage(streamName, seq)` instead of
  creating an ephemeral pull consumer. NATS 2.12+ rejects `AckPolicy.None`
  on WorkQueue streams (10084) and rejects mixing filtered + non-filtered
  consumers (10100), so the consumer-based approach can't coexist with the
  durable poller. Sequence-based reads sidestep both.

- `NatsRqueueMessageMetadataService.deleteMessage` now creates a tombstone
  metadata entry when no record exists (NATS skips the storeMessageMetadata
  path at enqueue time), so dashboard-driven deletes always succeed and the
  next peek renders the row as deleted.

- `rqueue.js`'s `deleteMessage` / `enqueueMessage` button handlers now use
  `closest('tr')` instead of two `.parent()` hops. The recent
  `explorer-action-group` div wrapper added an extra level of nesting; the
  old walk landed on the action cell and read "Delete" as the message id.
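
The sequence-based peek rewrite can be sketched as follows. The stream is modeled as a `TreeMap` keyed by sequence, with absent keys standing in for purged or deleted sequences that a real `JetStreamManagement.getMessage(stream, seq)` call would miss — a simplification, not the actual broker code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Sketch of sequence-based peek: read messages directly by stream sequence
// instead of creating an ephemeral pull consumer (which NATS 2.12+ rejects
// on WorkQueue streams). Gaps from deleted messages are skipped.
public class PeekSketch {
  static List<String> peek(NavigableMap<Long, String> stream, long startSeq, int count) {
    List<String> out = new ArrayList<>();
    Long seq = stream.ceilingKey(startSeq);  // first existing sequence >= start
    while (seq != null && out.size() < count) {
      out.add(stream.get(seq));
      seq = stream.higherKey(seq);           // next existing sequence, skipping gaps
    }
    return out;
  }
}
```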

Assisted-By: Claude Code
Replace the hard-coded "LIST" / "ZSET" tokens on the queue-detail page with a
broker-supplied human label so NATS shows "Queue (Stream)", "Completed (KV)",
and "Dead Letter (Stream)" instead of Redis-shaped data structure names.

- New `MessageBroker.dataTypeLabel(NavTab, DataType)` SPI hook, default
  returns null (legacy Redis path keeps `DataType.name()`).
- `JetStreamMessageBroker` overrides for the NATS-mapped tabs.
- `RedisDataDetail` carries an optional `typeLabel` field; templates render
  via `{{ typeLabel | default(type) }}` so older callers stay correct.
- `queue_detail.html` plus the `rqueue.js` modal title use the label and
  surface the broker-friendly token in the explorer header.
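
A hedged sketch of the `dataTypeLabel` hook and the template-side fallback; the enum members, label strings, and `render` helper below are illustrative stand-ins, not the exact Rqueue types:

```java
// Sketch of the dataTypeLabel SPI hook: the default returns null so the
// legacy Redis path keeps DataType.name(); a broker override supplies a
// human-readable label instead.
public class LabelSketch {
  enum NavTab { PENDING, COMPLETED, DEAD }
  enum DataType { LIST, ZSET, STREAM }

  interface MessageBroker {
    // Default: no label; the caller falls back to DataType.name().
    default String dataTypeLabel(NavTab tab, DataType type) { return null; }
  }

  static class JetStreamBroker implements MessageBroker {
    @Override public String dataTypeLabel(NavTab tab, DataType type) {
      switch (tab) {
        case PENDING:   return "Queue (Stream)";
        case COMPLETED: return "Completed (KV)";
        case DEAD:      return "Dead Letter (Stream)";
        default:        return null;
      }
    }
  }

  // Mirrors the template's {{ typeLabel | default(type) }} fallback.
  static String render(MessageBroker broker, NavTab tab, DataType type) {
    String label = broker.dataTypeLabel(tab, type);
    return label != null ? label : type.name();
  }
}
```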

Also fixes `JetStreamMessageBroker.size(QueueDetail)` for streams created
with `RetentionPolicy.Limits`. WorkQueue retention drops messages on ack so
`streamState.msgCount` already equals the pending count, but Limits keeps
all messages and `msgCount` over-reports. The new path detects retention
from `streamInfo.config` and walks the stream's durable consumers to surface
the maximum `numPending` for Limits-mode queues, falling back to msgCount
on consumer-enumeration errors.

Assisted-By: Claude Code
…~ prefix

Replace the consumer-walking max(numPending) computation with stream position
math so the dashboard surfaces the worst-case backlog using a single
pass over consumers:

  pending ≈ lastSeq - min(consumer.delivered.streamSeq)

Mathematically equivalent to the previous max(numPending), but expressed
in terms of stream offsets (which is what the dashboard now signals as
approximate to the operator).
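
Under the fan-out assumption (every consumer sees every message on a Limits stream), the position math reduces to a few lines:

```java
// Sketch of the stream-position backlog math: the worst-case pending count
// on a Limits-retention stream is the distance from the slowest consumer's
// delivered offset to the stream head.
public class BacklogSketch {
  static long approximatePending(long lastSeq, long[] deliveredStreamSeqs) {
    long min = Long.MAX_VALUE;
    for (long seq : deliveredStreamSeqs) min = Math.min(min, seq);
    // pending ≈ lastSeq - min(consumer.delivered.streamSeq)
    return lastSeq - min;
  }
}
```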

Also adds the user-facing approximation indicator:

- New `MessageBroker.isSizeApproximate(QueueDetail)` SPI hook, default
  false (Redis returns exact list / sorted-set sizes).
- `JetStreamMessageBroker.isSizeApproximate` returns true for streams
  with `RetentionPolicy.Limits` and false for WorkQueue (the standard
  rqueue queue mode where msgCount is exact after acks remove messages).
- `RedisDataDetail.approximate` carries the flag through to the view.
- `queue_detail.html` renders `~ N` when approximate, `N` otherwise; the
  `Queue-backed` short-circuit for size<0 stays.
- `RqueueQDetailServiceImpl` sets the flag on the pending row when a
  broker is wired.

Assisted-By: Claude Code
A single aggregated "~ N" pending number hides the per-consumer lag that
matters on a fan-out stream — Consumer A might be at seq 100 while Consumer
B is at seq 1100, and the dashboard previously surfaced only the worst case.

This commit replaces that aggregate with a per-row breakdown when the broker
exposes one, leaving the WorkQueue / Redis path unchanged:

- New `MessageBroker.consumerPendingSizes(QueueDetail)` SPI hook returning
  an ordered map of `consumerName -> pending`. Default returns null (no
  breakdown available).
- `JetStreamMessageBroker` overrides for Limits-retention streams: walks
  the durable consumers, prefers `numPending` (server-computed), falls back
  to position math (`lastSeq - delivered.streamSeq`) when numPending is 0.
  Returns null on WorkQueue retention (single shared pool).
- `RedisDataDetail` carries an optional `consumerName`.
- `RqueueQDetailServiceImpl` emits one PENDING row per consumer when the
  broker provides a breakdown, with exact (non-approximate) counts.
- `queue_detail.html` renders the consumer name as muted text next to the
  data-structure name.

Result on a NATS Limits stream:
  PENDING | Stream | rqueue-js-feed / consumer-a | 100
  PENDING | Stream | rqueue-js-feed / consumer-b | 500
  PENDING | Stream | rqueue-js-feed / consumer-c |   0
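
A sketch of the breakdown logic, with `ConsumerState` as an illustrative stand-in for the fields the real broker reads from NATS `ConsumerInfo`:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of consumerPendingSizes on a Limits-retention stream: prefer the
// server-computed numPending, fall back to position math when it reads 0.
public class PendingSketch {
  record ConsumerState(String name, long numPending, long deliveredStreamSeq) {}

  static Map<String, Long> consumerPendingSizes(long lastSeq, ConsumerState[] consumers) {
    Map<String, Long> out = new LinkedHashMap<>();  // ordered, as the SPI promises
    for (ConsumerState c : consumers) {
      long pending = c.numPending() > 0
          ? c.numPending()
          : lastSeq - c.deliveredStreamSeq();       // fallback: position math
      out.put(c.name(), pending);
    }
    return out;
  }
}
```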

Assisted-By: Claude Code
Replace the queue-detail page's data-structure-centric "Job Type / Data Type /
Name / Size" table with a consumer-level table that works across all backends.
The standalone "Queue Pollers" section is folded in via a worker-registry join
on consumer name, so the queue-detail page now has a single integrated view.

UI shape:

  Subscribers
    Consumer | Type | Storage | Pending | In-Flight | Status | Host/PID | Last Poll

  Terminal Storage (only when present)
    Bucket | Type | Storage | Size

Backends:

- Redis — every @RqueueListener registered for the queue is a row, with shared
  Pending (LIST size, marked "(shared)") and shared In-Flight (processing-ZSET
  size). EndpointRegistry.getAllForQueue() enumerates the handlers.
- NATS WorkQueue — every durable consumer is a row; Pending = shared msgCount
  marked "(shared)", In-Flight = consumer's exclusive numAckPending.
- NATS Limits — every durable consumer is a row; Pending = exact per-consumer
  numPending, In-Flight = numAckPending.

Architecture:

- New `MessageBroker.subscribers(QueueDetail)` SPI hook returning a list of
  `SubscriberView` records (consumerName, pending, inFlight, pendingShared).
  Default returns a single anonymous row backed by `size()` so brokers that
  don't track named consumers still render a working table.
- `RedisMessageBroker` overrides via `EndpointRegistry.getAllForQueue()` +
  shared list/ZSET sizes.
- `JetStreamMessageBroker` overrides via `jsm.getConsumerNames` +
  `getConsumerInfo` per consumer, branching on retention policy.
- `RqueueQDetailService` exposes `getSubscriberRows` / `getTerminalRows`,
  joining broker SPI data with the worker registry for status / last-poll.
- `RedisDataDetail` is unchanged (kept for existing API consumers).
- `queue_detail.html` rewritten: "Subscribers" + "Terminal Storage" sections
  replace the Job Type / Data Type / Name / Size and Queue Pollers blocks.
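
The default-row behaviour of the new SPI hook can be sketched like this (queue identifiers simplified to strings; using `-1` to mark in-flight as "unknown" is an assumption of the sketch, not the documented contract):

```java
import java.util.List;

// Sketch of the subscribers() SPI default: brokers that don't track named
// consumers return a single anonymous row backed by size(), so the
// Subscribers table always renders something.
public class SubscribersSketch {
  record SubscriberView(String consumerName, long pending, long inFlight, boolean pendingShared) {}

  interface MessageBroker {
    long size(String queue);

    default List<SubscriberView> subscribers(String queue) {
      // Anonymous single row; in-flight unknown (-1 here, an assumption).
      return List.of(new SubscriberView(null, size(queue), -1, false));
    }
  }
}
```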

Side fix per user: charts (stats / latency) are no longer hidden when a
broker reports `!supportsScheduledIntrospection`. NATS has the chart endpoints
working; the panels just render empty until counters accumulate, which is
the natural "no data yet" state. `index.html` and `queue_detail.html` always
include the chart partials now; the friendly "not available" placeholder
is removed.

Assisted-By: Claude Code
…inal cards

Bring the queue-detail page in line with the modern card-based design system
already used by /queues and /workers. The previous layout was Bootstrap
table-bordered tables that felt like a different app.

New shape:

  Hero
    Kicker "QUEUE" + queue name + paused/live badge inline + subtitle
    Stat chips: Subscribers / Pending / In-Flight (right-aligned panel)

  Configuration chip strip
    Concurrency · Retries · Visibility · Dead Letter (icons + values)
    Created / Updated meta below a dashed divider

  Subscribers section
    "LIVE" kicker + heading + supporting copy
    One subscriber-card per @RqueueListener consumer:
      - Consumer name (clickable into explorer modal)
      - Type pill (Queue (Stream) / Stream consumer / List)
      - Status badge (ACTIVE/STALE/PAUSED) using existing worker-status-* classes
      - Pending + In-Flight stat panels with "shared" hint where applicable
      - Storage / Host / Last Poll meta rows
    Empty state uses the queue-empty-state pattern from /queues.

  Terminal Storage section
    "SHARED" kicker + heading
    One terminal-card per shared bucket (COMPLETED / DEAD), color-coded
    via left border (green for completed, orange for dead letter).
    Big size value with "messages" label or "Queue-backed" placeholder.

  Stats & Latency
    Collapsed by default in a <details> disclosure ("TELEMETRY" kicker).
    Charts lazy-render on first open so the initial paint stays focused
    on the actionable data. The full chart partials are unchanged.

CSS additions are scoped to new component classes (queue-detail-*,
subscriber-*, terminal-*) and reuse the existing tokens — same colors,
same border-radius scale, same shadow recipe — so the page sits flush
with /queues and /workers in look and feel.

Pebble fix: switched `if config` to `if config != null` because Pebble
rejects domain objects in boolean context (PebbleException 10084 caught
in browser test).

Assisted-By: Claude Code
For workqueue streams, NATS rejects multiple non-filtered consumers
(error 10099). When multiple listeners were registered on the same
workqueue queue without custom consumer names, each listener tried to
create its own consumer, causing the provisioning to fail.

Fix: For workqueue streams with no custom consumer name, use a
consistent `queueName-consumer` name so all listeners share a single
consumer. This matches the workqueue semantics where only one consumer
can receive each message.

- NatsStreamValidator: Resolve consumer name based on queue type,
  using `queueName-consumer` for workqueues without custom names
- JetStreamMessageBroker: Use the same resolution logic in pop() to
  ensure validator and poller use the same consumer name

Assisted-By: Claude Code
User feedback was blunt: too much wasted whitespace and no clear pause/play
control. This rewrite collapses the queue-detail page into a single header
bar plus dense table sections so the operator can see and act on everything
without scrolling.

Layout changes:

- Header: queue name + state pill (LIVE / PAUSED) + actionable pause/play
  toggle button + summary stats ("N subscribers · N pending · N in-flight")
  on the right, all on one row. Replaces the big hero block.
- Configuration chip strip: 6 inline cells (Concurrency · Retries · Visibility
  · DLQ · Created · Updated) with dashed dividers — fits everything in ~60px
  of vertical space.
- Subscribers and Terminal Storage as compact tables with light row dividers
  and pill-styled type labels. No more card-grid whitespace.
- Stats & Latency stays behind a <details> disclosure so charts don't pin
  the actionable rows off-screen.

Pause/play action:

- The state pill is paired with a button that reuses the existing
  `pause-queue-btn` JS handler (POSTs to /pause-unpause-queue and reloads).
  Click toggles the queue state and the pill / button icon swap accordingly.
  Browser-tested: pausing the queue switches to "PAUSED" + play icon;
  clicking play unpauses, queue resumes processing pending messages.

Bug caught while testing:

- `QueueDetail.resolvedConsumerName()` previously returned different names
  for system-generated vs. primary queues (`-consumer-primary` vs
  `-consumer`). NATS WorkQueue streams reject multiple non-filtered
  consumers (10099) so the poller's runtime consumer name had to match
  whatever the bootstrap validator created. Unified to a single
  `{name}-consumer` suffix.

Files touched:

- queue_detail.html — full template rewrite (compact tables, header bar,
  inline config strip, charts disclosure)
- rqueue.css — replaced the previous card-heavy queue-detail CSS block
  with a tighter `qd-*` namespaced ruleset (~250 lines, was ~430)
- QueueDetail.java — consumer-name suffix fix
- (assorted formatter cleanups across files touched in earlier commits)

Assisted-By: Claude Code
resolvedConsumerName() returned different suffixes (-consumer vs
-consumer-primary) based on systemGenerated. The bootstrap validator
and runtime poller therefore disagreed on the consumer name when
systemGenerated was false, and the second creation attempt failed
on NATS workqueue streams with error 10099 (multiple non-filtered
consumers not allowed).

Use {name}-consumer in both cases. The custom consumerName from
@RqueueListener is still honoured when set; only the generated default
loses the -primary distinction.
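
The unified resolution described above can be sketched as a pure function; the method and parameter names are illustrative:

```java
// Sketch of the unified consumer-name resolution: a custom consumerName
// from @RqueueListener wins; otherwise every queue gets the same
// "{name}-consumer" default, so the bootstrap validator and the runtime
// poller always agree (avoiding NATS error 10099 on WorkQueue streams).
public class ConsumerNameSketch {
  static String resolvedConsumerName(String queueName, String customConsumerName) {
    if (customConsumerName != null && !customConsumerName.isEmpty()) {
      return customConsumerName;
    }
    return queueName + "-consumer";
  }
}
```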

Assisted-By: Claude Code
The queue-detail explorer was paginating from the stream's first
sequence regardless of which subscriber the operator clicked on. For
Limits-retention streams with competing consumers each consumer has
its own delivered offset, so the explorer was showing messages the
selected consumer had already acked instead of what is still pending
for that subscriber.

Wire consumerName end-to-end:

- QueueExploreRequest: new optional consumerName field.
- MessageBroker.peek: new consumer-aware overload, default delegates
  to the existing peek so non-NATS backends keep working.
- JetStreamMessageBroker.peek: when consumerName is set on a Limits
  stream, base the start sequence on
  getConsumerInfo(stream, consumer).getDelivered().getStreamSequence()+1.
  WorkQueue and the no-consumer call site are unchanged.
- RqueueQDetailService(.Impl): propagate consumerName into the broker
  peek call.
- Rest controllers (blocking + reactive): forward consumerName from
  the request into the service.
- queue_detail.html: subscribers table emits data-consumer on each
  consumer link.
- rqueue.js: read data-consumer in exploreData() and include it in
  the queue-data POST body.

Assisted-By: Claude Code
The consumer-aware peek was starting from delivered.streamSeq + 1,
which skipped over the in-flight window (sequences delivered but not
yet acked). Operators looking at a row with pending=0, in-flight=15
clicked through and got an empty explorer because all 15 in-flight
sequences were <= delivered.streamSeq.

Use ackFloor.streamSeq + 1 instead, so the explorer includes both
in-flight and not-yet-delivered messages — i.e. everything this
consumer still has work to do on. Matches the operator's mental
model of "what is this consumer still chewing on".
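
The difference between the two start sequences is easiest to see numerically. For a consumer with delivered = 115, ackFloor = 100, and lastSeq = 115 (i.e. 15 messages in flight), a sketch:

```java
// Sketch of the consumer-aware peek start sequence: basing on the ack floor
// includes the in-flight window (delivered but unacked), whereas basing on
// the delivered position skips it entirely.
public class PeekStartSketch {
  static long startSeqFromDelivered(long deliveredStreamSeq) { return deliveredStreamSeq + 1; }

  static long startSeqFromAckFloor(long ackFloorStreamSeq) { return ackFloorStreamSeq + 1; }

  // How many sequences the explorer can render from startSeq to lastSeq.
  static long visibleCount(long lastSeq, long startSeq) {
    return Math.max(0, lastSeq - startSeq + 1);
  }
}
```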

Assisted-By: Claude Code
The inFlight map was keyed only on RqueueMessage.id. For Limits-
retention streams with two or more durable consumers (e.g. one
@RqueueListener per handler with consumerName="linkedin-search" and
consumerName="google-search" on the same queue), each consumer
receives its own copy of every message and the second pop's
inFlight.put silently overwrote the first's NATS Message handle.

When the first worker later called broker.ack/nack, it picked up the
wrong consumer's NATS Message and acknowledged that delivery instead
of its own. The original delivery stayed in numAckPending forever,
and Outstanding Acks grew without bound (e.g. 92 -> 101 -> 112 over
30s with no new produces).

Key the inFlight map on "<consumerName>::<messageId>" so each
consumer's delivery is tracked independently. Pop uses its
consumerName arg; ack/nack/moveToDlq use q.resolvedConsumerName(),
which matches what the poller passed at fetch time.
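
The overwrite-vs-composite-key distinction can be sketched with a plain map; the stored `String` stands in for the real NATS `Message` handle:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the inFlight-map fix: keying on messageId alone lets a second
// consumer's pop overwrite the first consumer's delivery handle; the
// composite "<consumerName>::<messageId>" key keeps each delivery distinct.
public class InFlightSketch {
  final Map<String, String> inFlight = new HashMap<>();

  static String key(String consumerName, String messageId) {
    return consumerName + "::" + messageId;
  }

  void pop(String consumerName, String messageId, String natsMessageHandle) {
    inFlight.put(key(consumerName, messageId), natsMessageHandle);
  }

  String ack(String consumerName, String messageId) {
    return inFlight.remove(key(consumerName, messageId));
  }
}
```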

Verified: with two @RqueueListener on a Limits stream, Outstanding
Acks now drains to 0 and Ack Floor advances to lastSeq instead of
sticking near the start.

Assisted-By: Claude Code
Adds JetStreamMessageBrokerMultiConsumerAckIT, which catches the bug
fixed in 524fc5c: under multi-consumer fan-out (Limits-retention
stream + two durable consumers) the broker's inFlight map was keyed
on RqueueMessage.id alone, so consumer-b's pop silently overwrote
consumer-a's NATS Message handle and consumer-a's ack later targeted
the wrong delivery.

The trigger is timing-sensitive: pops on both consumers must occur
before either acks. A sequential drain-then-drain pattern hides the
bug because the inFlight key is removed before the second pop
repopulates it. Verified that the test fails ("ack(consumer-b, m-0)
must succeed ==> expected: <true> but was: <false>") against the
reverted broker and passes once the fix is restored.

Also updates four existing ITs to set q.resolvedConsumerName() to
match the consumer name passed to pop, since the fix makes ack/nack
key on (consumerName, messageId) and the contract is that callers
keep the two in sync. mockQueue gains an overload that takes a
consumerName for tests that need this explicitly.

Assisted-By: Claude Code
Three Subscribers-table refinements after the consumer-aware peek
landed:

1. Pending column = outstanding work, not just yet-to-deliver.
   Previously the column showed numPending (yet-to-deliver), but
   clicking the consumer link opens the explorer with peek starting
   at ackFloor + 1 — which includes in-flight messages. Operators saw
   "Pending: 0 / In-flight: 15" and got 15 rows in the explorer,
   confusing the column meaning. Switch the per-row pending count to
   numPending + numAckPending so the column matches what the
   explorer renders. WorkQueue retention is unaffected — its msgCount
   already represents outstanding work.

   Same shift applied to the queue-level approximateLimitsPending:
   base on min(ackFloor.streamSeq) instead of min(delivered.streamSeq)
   so the queue-wide "size" agrees with per-consumer pending semantics.

2. Workers column on Subscribers table. The registry stores one
   heartbeat per (JVM, consumer) pair, so multi-instance deployments
   show >1; thread-level fanout from concurrency = "10-20" lives
   inside a single instance and is not reflected. Javadoc spells this
   out so operators don't expect a thread count.

3. Example listener restored to mode = QueueType.STREAM on both
   job-queue listeners. Without it, two distinct consumerNames on
   the same queue would land on a WorkQueue stream and trip NATS
   error 10099. The mode change makes the multi-listener fan-out
   demonstrable end-to-end (and matches the scenario the new ITs
   exercise).

Assisted-By: Claude Code
Two test changes:

1. New JetStreamMessageBrokerConsumerAwarePeekIT covers the SPI
   overload peek(QueueDetail, consumerName, offset, count) on a
   Limits-retention stream with two durables at different ack
   positions. Asserts:
     - per-consumer peek for the fast consumer skips the acked range
       and returns only the un-acked tail
     - per-consumer peek for the untouched consumer returns the full
       stream (its ackFloor is 0)
     - the no-consumer overload bases on the stream's first sequence
       and is unaffected by per-consumer state

2. JetStreamQueueModeIT updated for the new (consumerName, id) inFlight
   keying contract. Three tests touched, all swap mockQueue(name, type)
   for mockQueue(name, type, consumerName) so q.resolvedConsumerName()
   matches what the test passes to broker.pop(...). Without the
   matching name, ack/nack lookups miss, the consumer's
   delivery-position assertions break, and consumerReuse falsely
   reports "5 messages remaining instead of 2."

The fan-out test was also restructured to use one QueueDetail per
listener — mirrors how production builds a separate QueueDetail per
@RqueueListener with its own resolvedConsumerName.

Assisted-By: Claude Code
Earlier ec99465 changed JetStreamMessageBroker.subscribers() to report
pending = numPending + numAckPending so the Pending column would equal
the explorer's row count. That collapsed the per-row pending vs
in-flight split and made NATS disagree with Redis on what "Pending"
means.

The Subscribers table already has a separate In-Flight column, so the
operator can see both numbers at a glance. Restoring pending to
numPending keeps the columns disjoint (yet-to-deliver vs being-
processed), aligns NATS Limits behaviour with the Redis backend's
LIST-vs-processing-ZSET split, and makes the Pending column match
NATS CLI's "Unprocessed" report.

The columns and the explorer answer different questions on purpose:
columns split outstanding work into pending + in-flight; clicking the
consumer link browses everything outstanding (peek bases on
ackFloor + 1, which spans both buckets). Operators see a 0-pending /
15-in-flight row and click through to see the 15 in-flight messages,
which was the original UX motivation for consumer-aware peek.

The queue-level approximateLimitsPending stays based on
min(ackFloor.streamSeq) — that's the queue's total outstanding work,
which is the right size for the queue listing.
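
The restored column semantics can be summarised in a few lines (illustrative names, not the Rqueue API):

```java
// Sketch of the final column semantics: Pending and In-Flight are disjoint
// (yet-to-deliver vs being-processed), while the explorer browses everything
// outstanding from ackFloor + 1 — the sum of the two columns.
public class ColumnsSketch {
  record Row(long pending, long inFlight) {}

  static Row row(long numPending, long numAckPending) {
    return new Row(numPending, numAckPending);  // columns stay disjoint
  }

  static long explorerRowCount(long numPending, long numAckPending) {
    return numPending + numAckPending;          // peek from ackFloor + 1 spans both
  }
}
```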

Assisted-By: Claude Code