Skip to content

Add usage reporting and continuation-aware logs windows to the dashboard canvas#42226

Merged
pelikhan merged 9 commits into
mainfrom
copilot/add-usage-tab-to-dashboard
Jun 29, 2026
Merged

Add usage reporting and continuation-aware logs windows to the dashboard canvas#42226
pelikhan merged 9 commits into
mainfrom
copilot/add-usage-tab-to-dashboard

Conversation

Copilot AI commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

This updates the .github/extensions dashboard with a dedicated usage view and a shared logs-window model. The dashboard now ranks workflows by usage intensity from gh aw logs, projects monthly burn from the selected window, and progressively follows logs continuations instead of stopping at the first timed batch.

  • Usage tab

    • Adds a new Usage tab to the canvas.
    • Aggregates per-workflow AIC from gh aw logs --json results.
    • Sorts workflows by daily AIC burn, with AIC/run as a secondary sort.
    • Surfaces:
      • run count
      • total AIC
      • AIC per run
      • daily AIC
      • monthly forecast AIC
      • last run time
  • Shared report window controls

    • Adds a common report-window selector for 3 days, 7 days, and 1 month.
    • Applies the selected window consistently across:
      • Runs tab
      • Usage tab
      • Commands tab quick-fill / execution
    • Reuses the same timeout model for logs-backed views so the UI stays aligned on scope.
  • Progressive logs loading

    • Refactors the extension’s logs fetch path to treat gh aw logs --json as a paged/continuable source.
    • Follows returned continuation payloads across multiple requests, merging runs until completion or the configured continuation cap.
    • Preserves progressive timeout behavior by passing explicit --timeout values on each logs request.
  • Command panel behavior

    • Updates logs quick actions to emit commands scoped to the selected report window.
    • When a user runs gh aw logs --json ..., the extension now injects default window/timeout settings if they are not already specified.
    • Returns merged JSON output after continuation handling so the command panel reflects the same dataset as the tabs.
  • Extension API / runtime changes

    • Adds /api/usage for usage summaries derived from logs data.
    • Extends /api/runs and command execution endpoints to accept window + timeout parameters.
    • Adds canvas actions for usage reporting and aligns run lookup with the same logs-backed window model.

Example of the new logs flow:

gh aw logs --json -c 50 --start-date -1w --timeout 1
# if continuation is returned:
gh aw logs --json -c 50 --start-date -1w --before-run-id <next> --timeout 1

Copilot AI and others added 5 commits June 29, 2026 10:53
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title Add usage reporting to the agentic workflows dashboard Add usage reporting and continuation-aware logs windows to the dashboard canvas Jun 29, 2026
Copilot AI requested a review from pelikhan June 29, 2026 11:04
@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot use forecast command to compute forecast usage costs

const servers = new Map();
const cache = new Map(); // key → { data, expiresAt }
const CACHE_TTL_MS = 60_000;
const DEFAULT_LOG_TIMEOUT_MINUTES = 1;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot review timeout units, I think it is seconds. Check with command --help

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed with gh aw logs --help: --timeout is in minutes. The dashboard keeps the minute-based timeout model and now has coverage for it in 5224f6f.

@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot do not allow extension.mjs to be bloated, refactor into small focused helper files with tests.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

@copilot use forecast command to compute forecast usage costs

Addressed in 12d2cd2. The usage view now gets monthly forecast cost from gh aw forecast --json instead of extrapolating it directly from the logs window.

Copilot AI and others added 2 commits June 29, 2026 11:36
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

@copilot do not allow extension.mjs to be bloated, refactor into small focused helper files with tests.

Split the dashboard extension into focused CLI/config/logs/data helper modules and added tests for timeout and continuation handling in 5224f6f.

@pelikhan pelikhan marked this pull request as ready for review June 29, 2026 11:39
Copilot AI review requested due to automatic review settings June 29, 2026 11:39
@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR does not have the 'implementation' label and has ≤100 new lines of code in business logic directories.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enhances the Agentic Workflows Dashboard canvas extension by adding a usage/forecast view and refactoring logs retrieval to be continuation-aware, while also unifying the “report window” and timeout controls across dashboard tabs and the command panel.

Changes:

  • Adds a new Usage tab that aggregates per-workflow AIC burn and projects monthly usage via gh aw forecast.
  • Introduces shared report window (3d/7d/1mo) + timeout controls and updates the command panel to reuse them by default.
  • Refactors server-side data access to progressively fetch gh aw logs --json across continuations and return merged results.
Show a summary per file
File Description
.github/extensions/agentic-workflows-dashboard/web/styles.css Adjusts command output height and adds grid styling for the Usage tab table.
.github/extensions/agentic-workflows-dashboard/web/index.html Adds report window controls, Usage tab UI, and updates command quick actions.
.github/extensions/agentic-workflows-dashboard/web/app.js Adds report window state, usage fetching/paging, AIC formatting, and logs command building.
.github/extensions/agentic-workflows-dashboard/usage-forecast.mjs Implements usage aggregation and forecast-merging helpers.
.github/extensions/agentic-workflows-dashboard/test/usage-forecast.test.ts Adds unit coverage for usage/forecast helper logic.
.github/extensions/agentic-workflows-dashboard/test/dashboard-logs.test.ts Adds unit coverage for logs option normalization and continuation behavior.
.github/extensions/agentic-workflows-dashboard/test/dashboard-data.test.ts Adds unit coverage for continuation batching + forecast timeout plumbing.
.github/extensions/agentic-workflows-dashboard/package.json Expands JS formatting script to cover all *.mjs files.
.github/extensions/agentic-workflows-dashboard/extension.mjs Refactors extension server endpoints/actions to use shared data-access + new /api/usage.
.github/extensions/agentic-workflows-dashboard/dashboard-logs.mjs Adds logs parsing/normalization/continuation helpers for consistent CLI calls.
.github/extensions/agentic-workflows-dashboard/dashboard-data.mjs Adds cached data access layer with continuation-aware logs merging and usage+forecast aggregation.
.github/extensions/agentic-workflows-dashboard/dashboard-config.mjs Centralizes dashboard constants (windows, defaults, cache TTL, continuation cap).
.github/extensions/agentic-workflows-dashboard/dashboard-cli.mjs Extracts the gh-aw runner logic for reuse by the new data access layer.

Review details

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 13/13 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread .github/extensions/agentic-workflows-dashboard/dashboard-logs.mjs
Comment on lines +75 to +79
function buildReportMessage(meta, emptyLabel) {
if (!meta?.window) {
return "";
}

@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 97/100 — Excellent

Analyzed 11 test(s) across 3 new TypeScript/vitest files: 11 design tests, 0 implementation tests, 0 coding-guideline violations.

📊 Metrics & Test Classification (11 tests analyzed)
Metric Value
New/modified tests analyzed 11
✅ Design tests (behavioral contracts) 11 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 10 (91%)
Duplicate test clusters 0
Test inflation detected No
🚨 Coding-guideline violations 0
Test File Classification Issues Detected
"keeps logs command filters across continuation batches" test/dashboard-data.test.ts:6 ✅ Design
"passes timeout minutes through to forecast calls" test/dashboard-data.test.ts:36 ✅ Design Happy-path only
"defaults logs timeouts in minutes" test/dashboard-logs.test.ts:7 ✅ Design
"preserves fallback filters when continuing logs pages" test/dashboard-logs.test.ts:14 ✅ Design
"parses logs command args into continuation-ready options" test/dashboard-logs.test.ts:28 ✅ Design
"injects the selected report window and minute timeout when missing" test/dashboard-logs.test.ts:41 ✅ Design
"normalizes workflow ids from workflow paths" test/usage-forecast.test.ts:6 ✅ Design
"maps dashboard windows onto supported forecast history windows" test/usage-forecast.test.ts:11 ✅ Design
"prefers monte carlo p50 monthly forecast when available" test/usage-forecast.test.ts:17 ✅ Design
"merges logs usage with forecast command output by workflow id" test/usage-forecast.test.ts:22 ✅ Design
"can build summary with forecast data in a single call" test/usage-forecast.test.ts:62 ✅ Design Near-duplicate assertions with previous

Go: 0 (*_test.go); JavaScript: 0 (*.test.cjs, *.test.js); TypeScript: 11 (*.test.ts) — analyzed as JavaScript-family (vitest). Note: .test.ts is not the primary test format documented for this repo (*.test.cjs), but vitest conventions and scoring rules apply equally.

Score breakdown:

  • Behavioral coverage: 40/40 (11/11 design tests)
  • Error/edge case coverage: 27/30 (10/11 tests; "passes timeout minutes through to forecast calls" is happy-path only)
  • Low duplication: 20/20 (no cluster of 3+ with identical assertion patterns)
  • Proportional growth: 10/10 (test/prod ratios: data 0.28:1, logs 0.21:1, forecast 0.93:1 — all well under 2:1)

Dependency injection instead of mocks: All tests in dashboard-data.test.ts inject a runGhAw stub directly — no vi.mock() or vi.spyOn() is used anywhere. This is the preferred pattern.

Verdict

Check passed. 0% implementation tests (threshold: 30%). All tests verify observable behavioral contracts — continuation filter preservation, parameter propagation, default values, arg parsing, ID normalization, window mapping, forecast preference logic, and usage aggregation — with strong edge-case and multi-scenario coverage.

🧪 Test quality analysis by Test Quality Sentinel · 54 AIC · ⌖ 15.3 AIC · ⊞ 8.1K ·
Comment /review to run again

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 97/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%).

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /tdd, /zoom-out, /improve-codebase-architecture, and /grill-with-docs — requesting changes on a correctness bug and several coverage/architecture gaps.

📋 Key Themes & Highlights

Key Findings

  • Correctness buggetUsage cache key omits filter fields (workflowName, engine, branch, etc.), causing stale data to be served silently when the same window/count/timeout is reused with different filters (line 135–139, dashboard-data.mjs).
  • Precision riskmergeRuns sorts by Number(run_id) subtraction; GitHub run IDs can exceed Number.MAX_SAFE_INTEGER, making the sort unreliable on large IDs.
  • Shared singleton cachedataAccess is a process-level singleton, so clearCache() from one user's canvas session evicts all other users' cached data.
  • Test gapsmergeRuns (the core dedup function), the MAX_LOG_CONTINUATIONS cap, and the mid-continuation parse-error path have no tests.
  • Hardcoded contextsgetRun hardcodes count: 200, window: "1mo", and the runCommand canvas action hardcodes window: "7d", making them inconsistent with the parameterised HTTP endpoints.
  • Naming clarityforecastDaysForWindow returns 7 for the "3d" window (mapping to the forecast tool's minimum), which is non-obvious without a doc comment.

Positive Highlights

  • ✅ Excellent module extraction — extension.mjs slims from 119 to 19 lines, separation of concerns is clean.
  • continuationToLogsOptions correctly falls back to fallback values for missing continuation fields.
  • ✅ Test structure is clear: injected runGhAw mock makes unit tests fast and hermetic.
  • normalizeLogsOptions validates and normalizes every input field — defensive and predictable.
  • ✅ PR description is detailed and the example logs flow is helpful.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 102.8 AIC · ⌖ 11 AIC · ⊞ 6.6K
Comment /matt to run again

window: normalized.window.id,
count: normalized.count,
timeout: normalized.timeout,
})}`;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] The getUsage cache key omits several normalized fields (workflowName, engine, branch, startDate, endDate, artifacts), so two calls differing only in those parameters will incorrectly share a cached result — silently returning wrong data.

💡 Fix: mirror the complete key from `getLogsData`
const key = `usage:${JSON.stringify({
  window: normalized.window.id,
  count: normalized.count,
  timeout: normalized.timeout,
  startDate: normalized.startDate,
  endDate: normalized.endDate,
  workflowName: normalized.workflowName,
  engine: normalized.engine,
  branch: normalized.branch,
  artifacts: normalized.artifacts,
})}`;

Alternatively, since getUsage already delegates to getLogsData (which is cached with the full key), you can drop the usage: cache entry entirely and only cache the forecast-augmented result.

@copilot please address this.

merged.set(run.run_id, run);
}
}
return Array.from(merged.values()).sort((a, b) => Number(b.run_id ?? 0) - Number(a.run_id ?? 0));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] mergeRuns sorts by Number(b.run_id) - Number(a.run_id). GitHub run IDs can exceed Number.MAX_SAFE_INTEGER (253−1), so Number() conversion loses precision and the sort order becomes unreliable.

💡 Use BigInt comparison or string lexicographic sort instead
return Array.from(merged.values()).sort((a, b) => {
  const aId = BigInt(a.run_id ?? 0);
  const bId = BigInt(b.run_id ?? 0);
  return bId > aId ? 1 : bId < aId ? -1 : 0;
});

A test that exercises large run IDs (e.g. 9007199254740993) would lock in this fix and prevent regression.

@copilot please address this.

}
}
const runGhAw = createGhAwRunner({ getWorkspacePath: () => workspacePath });
const dataAccess = createDashboardDataAccess({ runGhAw });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/improve-codebase-architecture] dataAccess is a module-level singleton, so its 60-second in-memory cache is shared across all canvas instances (all users). A refresh action from one user's session clears every other session's cache.

💡 Instantiate per-session or pass instanceId into the cache key

Move construction inside startServer() (or the canvas open handler) so each canvas instance owns an isolated cache:

async function startServer() {
  const dataAccess = createDashboardDataAccess({ runGhAw });
  const server = createServer(async (req, res) => { ... });
  ...
}

Alternatively, prefix every cache key with ctx.instanceId so cache entries are scoped per session without changing the singleton boundary.

@copilot please address this.

handler: async ctx => {
const runs = await getRuns(200);
return { run: runs.find(r => r.run_id === Number(ctx.input?.run_id)) ?? null };
const logsData = await dataAccess.getRuns({ count: 200, window: "1mo", timeout: DEFAULT_LOG_TIMEOUT_MINUTES });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] getRun always fetches count: 200, window: "1mo" regardless of the user's selected window, so it silently misses any run older than one month, fetches an arbitrarily large result set on every call, and bypasses the shared cache for the user's current window.

💡 Accept the user's selected window from the action input
handler: async ctx => {
  const window = String(ctx.input?.window ?? "7d");
  const timeout = Number(ctx.input?.timeout ?? DEFAULT_LOG_TIMEOUT_MINUTES);
  const logsData = await dataAccess.getRuns({ count: DEFAULT_RUN_COUNT, window, timeout });
  return { run: logsData.runs.find(r => r.run_id === Number(ctx.input?.run_id)) ?? null };
},

Add window and timeout to the action's inputSchema so callers can control the scope.

@copilot please address this.


export function parseGhAwArgs(raw) {
const match = raw.trim().match(/^(?:gh\s+aw\s+)(.+)$/);
return match ? match[1].trim().split(/\s+/) : null;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/improve-codebase-architecture] parseGhAwArgs tokenizes by splitting on whitespace. Any argument containing a space (e.g. a workflow name like "my workflow", or a --start-date 2026-06-01 value that gets mixed with adjacent text) will be split incorrectly, producing malformed args passed to the CLI.

💡 Use a simple quoted-string-aware tokenizer or document the constraint

If only simple word-based args are expected, document this explicitly and guard the input. For full correctness, consider a minimal shell tokenizer:

export function parseGhAwArgs(raw) {
  const match = raw.trim().match(/^(?:gh\s+aw\s+)(.+)$/s);
  if (!match) return null;
  // Naïve tokenizer: supports '' and "" quoted tokens
  const tokens = [];
  let current = "";
  let inQuote = null;
  for (const ch of match[1]) {
    if (inQuote) {
      if (ch === inQuote) inQuote = null;
      else current += ch;
    } else if (ch === "'" || ch === '"') {
      inQuote = ch;
    } else if (/\s/.test(ch)) {
      if (current) tokens.push(current), current = "";
    } else {
      current += ch;
    }
  }
  if (current) tokens.push(current);
  return tokens.length ? tokens : null;
}

@copilot please address this.

it("injects the selected report window and minute timeout when missing", () => {
expect(normalizeLogsCommandArgs(["logs", "--json"], "3d", 4)).toEqual(["logs", "--json", "--start-date", "-3d", "--timeout", "4", "--artifacts", "usage"]);
});
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] mergeRuns — the core deduplication and sort function for paginated continuation results — has no test in this file. Without coverage, a regression in run deduplication would go undetected.

💡 Add a test for mergeRuns
import { mergeRuns } from "../dashboard-logs.mjs";

it("deduplicates runs across continuation batches by run_id", () => {
  const first = [{ run_id: 100, name: "a" }, { run_id: 99 }];
  const second = [{ run_id: 100, name: "b" }, { run_id: 98 }]; // run 100 updated
  const result = mergeRuns(first, second);
  expect(result.map(r => r.run_id)).toEqual([100, 99, 98]);
  expect(result[0].name).toBe("b"); // second batch wins
});

Also add a case with very large run IDs once the BigInt sort is in place (see comment on line 75 of dashboard-logs.mjs).

@copilot please address this.

expect(calls[1]).toEqual(["forecast", "--json", "--period", "month", "--days", "7", "--timeout", "3", "ci-doctor"]);
expect(usage.items[0]?.monthly_forecast_aic).toBe(44);
});
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] There's no test for the MAX_LOG_CONTINUATIONS cap. The loop in fetchLogsBatches exits with partial: true after 6 batches, but this behaviour is untested — a future refactor could silently regress it.

💡 Add a capped-continuation test
it("stops following continuations after MAX_LOG_CONTINUATIONS and marks result partial", async () => {
  let callCount = 0;
  const dataAccess = createDashboardDataAccess({
    runGhAw: async () => {
      callCount++;
      return JSON.stringify({
        runs: [{ run_id: callCount }],
        continuation: { before_run_id: callCount - 1 },
      });
    },
  });

  const result = await dataAccess.getRuns({ window: "7d" });
  expect(result.partial).toBe(true);
  expect(result.logsFetches).toBe(6); // MAX_LOG_CONTINUATIONS
  expect(callCount).toBe(6);
});\n```

</details>

@copilot please address this.

additionalProperties: false,
},
handler: async ctx => execCommand(String(ctx.input?.command ?? "")),
handler: async ctx => dataAccess.execCommand(String(ctx.input?.command ?? ""), { window: "7d", timeout: DEFAULT_LOG_TIMEOUT_MINUTES }),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] The runCommand canvas action hardcodes window: "7d" and timeout: DEFAULT_LOG_TIMEOUT_MINUTES regardless of the caller's context. This is inconsistent with the HTTP /api/run-command endpoint (which accepts window and timeout query params) and means an agent using the canvas action always gets 7-day scope, ignoring whatever window the user has selected.

💡 Accept window and timeout in the action's inputSchema
inputSchema: {
  type: "object",
  required: ["command"],
  properties: {
    command: { type: "string" },
    window: { type: "string", enum: ["3d", "7d", "1mo"] },
    timeout: { type: "number", minimum: 1, maximum: 10 },
  },
  additionalProperties: false,
},
handler: async ctx =>
  dataAccess.execCommand(String(ctx.input?.command ?? ""), {
    window: String(ctx.input?.window ?? "7d"),
    timeout: Number(ctx.input?.timeout ?? DEFAULT_LOG_TIMEOUT_MINUTES),
  }),

@copilot please address this.

return name.trim();
}

export function forecastDaysForWindow(window) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/grill-with-docs] forecastDaysForWindow returns 7 for both the "3d" and "7d" windows. The function name suggests it returns the number of days in the window, but it actually returns the minimum history span the forecast command supports. A caller expecting forecastDaysForWindow({ id: "3d" }) to return 3 will be surprised, and the test comment (// maps dashboard windows onto supported forecast history windows) only partially explains why.

💡 Rename or add a JSDoc comment
/**
 * Returns the forecast history window (in days) to pass to `gh aw forecast --days`.
 * The forecast command does not support windows smaller than 7 days, so 3-day
 * dashboard windows are mapped to the 7-day minimum.
 */
export function forecastHistoryDaysForWindow(window) {
  return window?.id === "1mo" ? 30 : 7;
}

Rename the callers accordingly, or at minimum add the JSDoc above the existing function.

@copilot please address this.

try {
data = JSON.parse(raw);
} catch (error) {
throw new Error(`Failed to parse logs batch ${logsFetches + 1}: ${error.message}`);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The error path in fetchLogsBatches — when a mid-continuation batch returns unparseable JSON — has no test. If the CLI emits a warning line before the JSON, or returns a rate-limit error message, the thrown Error will surface as a 500 with a raw message rather than something actionable.

💡 Add a parse-failure test and consider a better error surface

Test:

it("throws a descriptive error when a continuation batch returns invalid JSON", async () => {
  let call = 0;
  const dataAccess = createDashboardDataAccess({
    runGhAw: async () => {
      call++;
      return call === 1
        ? JSON.stringify({ runs: [{ run_id: 1 }], continuation: { before_run_id: 0 } })
        : "not json";
    },
  });
  await expect(dataAccess.getRuns({ window: "7d" })).rejects.toThrow("Failed to parse logs batch 2");
});\n```

Consider propagating the partial result (the runs collected so far) alongside the error so callers can choose to surface what they have rather than failing completely.

</details>

@copilot please address this.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Add usage reporting and continuation-aware logs windows

Good overall structure. The refactor into focused modules (dashboard-cli.mjs, dashboard-config.mjs, dashboard-data.mjs, dashboard-logs.mjs, usage-forecast.mjs) is clean, and the test coverage for the new logic is solid. The continuation-aware logs fetch and usage aggregation are well-designed.

Five issues worth addressing before merge:

# File Issue Severity
1 dashboard-data.mjs:135 getUsage cache key is missing filter fields (workflowName, startDate, engine, branch) that influence the underlying logs fetch Medium
2 dashboard-logs.mjs:80 parseGhAwArgs splits by whitespace only — workflow names with spaces will be silently mis-parsed on continuation pages Medium
3 dashboard-data.mjs:115 All workflow IDs are spread as unbounded CLI args to gh aw forecast; repos with many workflows could hit OS arg-length limits Medium
4 usage-forecast.mjs:24 forecastDaysForWindow("3d") returns 7 — needs a comment explaining the minimum-history constraint Low
5 extension.mjs:257 runCommand canvas action hardcodes window: "7d", ignoring the caller's selected report window Low

Inline comments with suggested fixes are attached.

🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 89.6 AIC · ⌖ 7.27 AIC · ⊞ 4.9K

window: normalized.window.id,
count: normalized.count,
timeout: normalized.timeout,
})}`;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getUsage cache key only captures window, count, and timeout, but getUsage calls getLogsData(normalized) which respects the full normalized object (including workflowName, startDate, endDate, engine, branch). If a caller passes extra filters, getUsage will silently return a cached result from a different query.

Consider extending the key to include all fields that influence the underlying logs fetch, mirroring the getLogsData key:

const key = `usage:${JSON.stringify({
  window: normalized.window.id,
  count: normalized.count,
  timeout: normalized.timeout,
  startDate: normalized.startDate,
  workflowName: normalized.workflowName,
  engine: normalized.engine,
  branch: normalized.branch,
})}`;

@copilot please address this.


export function parseGhAwArgs(raw) {
const match = raw.trim().match(/^(?:gh\s+aw\s+)(.+)$/);
return match ? match[1].trim().split(/\s+/) : null;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command is split naively by whitespace (split(/\s+/)), so workflow names that contain a space will be broken into multiple tokens. For example, gh aw logs "my workflow" --json would parse as ["logs", "\"my", "workflow\"", "--json"], misidentifying both the workflow name and the next arg.

Since continuation calls re-use the parsed workflowName, this silently produces wrong requests on each page.

Consider a shell-aware split (e.g., a small quoted-string tokenizer) or document the constraint that workflow names with spaces are not supported.

@copilot please address this.

return [];
}

const args = ["forecast", "--json", "--period", "month", "--days", String(forecastDaysForWindow(window)), "--timeout", String(timeout), ...workflowIDs];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All unique workflow IDs are spread directly into the CLI argument list with no upper bound (...workflowIDs). With 6 continuation batches × 100 runs each, a repo could easily have 50+ distinct workflows. Most operating systems cap argument lists around 128–256 KB; if the cap is hit, execFile will throw E2BIG and the entire usage panel will fail.

Consider either:

  • capping the list (e.g., workflowIDs.slice(0, MAX_FORECAST_WORKFLOWS) with a defined constant), or
  • passing the IDs as a comma-separated string via a single flag if the CLI supports it.

@copilot please address this.

}

export function forecastDaysForWindow(window) {
return window?.id === "1mo" ? 30 : 7;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both "3d" and "7d" map to 7 forecast-history days, so a user viewing the 3-day window still sees a 7-day historical forecast. This is likely intentional (the forecast API may require a minimum of 7 days of history), but without a comment it reads as a bug.

Please add a short comment explaining the constraint, e.g.:

// The forecast API requires at least 7 days of history; treat the 3-day window the same as 7-day.
export function forecastDaysForWindow(window) {
  return window?.id === "1mo" ? 30 : 7;
}

@copilot please address this.

additionalProperties: false,
},
handler: async ctx => execCommand(String(ctx.input?.command ?? "")),
handler: async ctx => dataAccess.execCommand(String(ctx.input?.command ?? ""), { window: "7d", timeout: DEFAULT_LOG_TIMEOUT_MINUTES }),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runCommand canvas action ignores the caller's window context and always defaults to window: "7d". If an agent calls runCommand after selecting the 3-day or 1-month window via selectReportWindow, the executed logs command will silently use the wrong date range.

Consider accepting window and timeout in the action's inputSchema (like listRuns does) so callers can pass the currently selected window through:

handler: async ctx => dataAccess.execCommand(
  String(ctx.input?.command ?? ""),
  { window: String(ctx.input?.window ?? "7d"), timeout: Number(ctx.input?.timeout ?? DEFAULT_LOG_TIMEOUT_MINUTES) }
),

@copilot please address this.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST_CHANGES — 2 high-severity correctness issues, 4 medium

The architecture refactor is clean and the new module boundaries are sensible. The continuation loop, cache infrastructure, and UI wiring are all reasonable. The issues below are all correctness bugs or latent time-bombs — not style.

Blocking issues

getUsage cache key is incomplete (dashboard-data.mjs:135-139): the key only covers window, count, and timeout. Any future caller that passes workflowName, engine, branch, or date/ID filters will get the wrong cached result. getLogsData does this right — getUsage should match it.

parseGhAwArgs splits on whitespace without honoring quotes (dashboard-logs.mjs:80): gh aw logs "My Workflow" --json tokenizes into ["logs", '"My', 'Workflow"', "--json"]. The workflowName is mangled, and the rest of the args are wrong. This is an exported function driving execCommand — fixing it requires a simple quoted-token loop.

Medium issues
  • mergeRuns Map key is not type-normalized (dashboard-logs.mjs:72): if run_id is a number in one batch and a string in another, dedup silently fails and duplicate runs appear.
  • continuationToLogsOptions uses || for numeric IDs (dashboard-logs.mjs:61): continuation.after_run_id || fallback.afterRunID ignores 0 as a valid ID; use ?? for numeric fields.
  • getRun silently misses valid run IDs (extension.mjs:244): 200-run cap on a 1-month window is insufficient for busy repos; the action description implies reliable lookup.
  • last_run_at uses lexicographic timestamp comparison (usage-forecast.mjs:82): relies on identical ISO format from all sources — use Date.parse() comparison to be robust.

🔎 Code quality review by PR Code Quality Reviewer · 139.8 AIC · ⌖ 7.62 AIC · ⊞ 5.2K
Comment /review to run again

window: normalized.window.id,
count: normalized.count,
timeout: normalized.timeout,
})}`;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete cache key lets getUsage return stale results for filtered queries: the key excludes workflowName, engine, branch, startDate, endDate, beforeRunID, afterRunID, and artifacts — so two calls that differ only in those fields collide on the same cache entry.

💡 Suggested fix

Mirror the key construction from getLogsData:

const key = `usage:${JSON.stringify({
  window: normalized.window.id,
  count: normalized.count,
  timeout: normalized.timeout,
  startDate: normalized.startDate,
  endDate: normalized.endDate,
  beforeRunID: normalized.beforeRunID,
  afterRunID: normalized.afterRunID,
  workflowName: normalized.workflowName,
  engine: normalized.engine,
  branch: normalized.branch,
  artifacts: normalized.artifacts,
})}`;

Current callers only pass window/count/timeout, so this is latent — but getUsage accepts the full options object and will return wrong cached data the moment any caller adds a filter.


export function parseGhAwArgs(raw) {
const match = raw.trim().match(/^(?:gh\s+aw\s+)(.+)$/);
return match ? match[1].trim().split(/\s+/) : null;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseGhAwArgs splits on whitespace without respecting shell quoting, so any command with a quoted workflow name produces broken arguments: gh aw logs "My Workflow" --json becomes ["logs", '"My', 'Workflow"', "--json"]workflowName is set to the fragment '"My' and the rest is misparsed.

💡 Suggested fix

Replace the naive split(/\s+/) with a simple shell-aware tokenizer that respects single/double-quoted spans:

export function parseGhAwArgs(raw) {
  const match = raw.trim().match(/^(?:gh\s+aw\s+)(.+)$/);
  if (!match) return null;
  const tokens = [];
  const re = /"([^"]*)"|'([^']*)'|(\S+)/g;
  let m;
  while ((m = re.exec(match[1])) !== null) {
    tokens.push(m[1] ?? m[2] ?? m[3]);
  }
  return tokens.length > 0 ? tokens : null;
}

This handles the common cases (quoted strings with spaces) without pulling in a full shell-parse library.

handler: async ctx => {
const runs = await getRuns(200);
return { run: runs.find(r => r.run_id === Number(ctx.input?.run_id)) ?? null };
const logsData = await dataAccess.getRuns({ count: 200, window: "1mo", timeout: DEFAULT_LOG_TIMEOUT_MINUTES });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getRun silently returns null for any run outside the first 200 in the last month: it fetches a bounded slice then does a client-side find — any valid run_id older than 30 days or beyond position 200 is reported as not found even though it exists.

💡 Why this matters

The action's description says "looks up a single run by run_id", implying it should reliably find any run ID the caller provides. A busy repository can easily produce 200+ runs in a month, so the lookup is incorrect by design for any older or deeper run.

Consider either:

  1. Raising the bound significantly (e.g. count: 1000) and accepting the latency cost, or
  2. Exposing the limitation in the action description so callers know this is a best-effort recent-run scan, not a reliable lookup.

Also: hardcoding window: "1mo" means this endpoint ignores the user's selected report window. If the user is browsing a 3-day window, the lookup will still fetch a full month of data.

const merged = new Map(existingRuns.map(run => [run.run_id, run]));
for (const run of nextRuns) {
if (run?.run_id != null) {
merged.set(run.run_id, run);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mergeRuns uses run_id directly as a Map key — if types differ across batches, deduplication silently produces duplicate runs: JavaScript Map uses strict equality, so numeric 100 and string "100" are distinct keys; both entries survive the merge and the same run appears twice in the sorted output.

💡 Suggested fix

Normalize to string (or number) before using as the key:

export function mergeRuns(existingRuns, nextRuns) {
  const merged = new Map(existingRuns.map(run => [String(run.run_id), run]));
  for (const run of nextRuns) {
    if (run?.run_id != null) {
      merged.set(String(run.run_id), run);
    }
  }
  return Array.from(merged.values()).sort((a, b) => Number(b.run_id ?? 0) - Number(a.run_id ?? 0));
}

The .sort() already calls Number() so the result is correct regardless; only the Map keying needs fixing.

endDate: continuation.end_date || fallback.endDate,
engine: continuation.engine || fallback.engine,
branch: continuation.branch || fallback.branch,
afterRunID: continuation.after_run_id || fallback.afterRunID,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continuationToLogsOptions uses || for numeric ID fields — a continuation that explicitly sends after_run_id: 0 falls back to the previous batch's value instead: 0 is falsy in JavaScript, so continuation.after_run_id || fallback.afterRunID silently ignores the continuation's intent when the value is zero.

💡 Suggested fix

Use nullish coalescing (??) for all fields where 0 or "" could be a deliberate override:

afterRunID: continuation.after_run_id ?? fallback.afterRunID,
beforeRunID: continuation.before_run_id ?? fallback.beforeRunID,
count: continuation.count ?? fallback.count,
timeout: continuation.timeout ?? fallback.timeout,

String fields like workflow_name, start_date, etc. are fine with || since an empty string there means "not set".

}

const createdAt = typeof run?.created_at === "string" ? run.created_at : "";
if (createdAt && (!entry.last_run_at || createdAt > entry.last_run_at)) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_run_at is tracked via lexicographic string comparison — semantically valid ISO timestamps with differing timezone offsets or fractional-second precision compare incorrectly: "2026-06-29T12:00:00+00:00" and "2026-06-29T12:00:00Z" represent the same instant but "2026-06-29T12:00:00+00:00" > "2026-06-29T12:00:00Z" is true (lexicographically + > Z in ASCII is false, but offset variants produce unexpected orderings).

💡 Suggested fix

Compare by parsed epoch milliseconds to be format-agnostic:

const createdMs = createdAt ? Date.parse(createdAt) : NaN;
const lastMs = entry.last_run_at ? Date.parse(entry.last_run_at) : NaN;
if (!Number.isNaN(createdMs) && (Number.isNaN(lastMs) || createdMs > lastMs)) {
  entry.last_run_at = createdAt;
}

GitHub Actions timestamps are consistently UTC (Z-suffixed), so this is low-probability today, but correctness shouldn't rely on source consistency.

@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot run pr-finisher skill

…e fallback

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

@copilot run pr-finisher skill

Done. Ran pr-finisher and addressed the two open automated-review bugs:

  • dashboard-logs.mjs: normalizeLogsOptions now extracts window.id when passed a window object, so callers that hand in a normalized options object don't silently fall back to the default window.
  • web/app.js: buildReportMessage now returns emptyLabel instead of "" when meta.window is missing, so the caller-provided fallback is actually reachable.

A regression test for the window-as-object case was added. All 14 tests pass locally. Commit: 9fd7aa0.

@github-actions

Copy link
Copy Markdown
Contributor

🤖 PR Triage — §28376613466

Field Value
Category feature
Risk 🔴 High
Priority 🟡 Medium
Score 40 / 100
Action 🔁 batch_review
Age 2.7h

Score breakdown: Impact 25 + Urgency 10 + Quality 5

Rationale: Adds usage reporting and continuation-aware log windows to the dashboard canvas. Large change (1240+/288−, 13 files) with CHANGES_REQUESTED from github-actions bot (blocking). High risk due to size and blocking review status. Needs reviewer/author attention before any merge path.

Batch: Groups with #41824 and #42235 (all need CHANGES_REQUESTED resolved).

Generated by 🔧 PR Triage Agent · 93.2 AIC · ⌖ 14.6 AIC · ⊞ 5.4K ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants