Populate audit job steps in structured audit output by Copilot · Pull Request #42222 · github/gh-aw

Copilot · 2026-06-29T10:57:13Z

gh aw audit was always emitting jobs without step details, even though the GitHub Actions jobs API already returned per-step status and conclusions. This left failed runs without the step-level context needed to identify where a job actually broke.

Fetch step data from the jobs API
- Extend the gh api /actions/runs/{id}/jobs projection to include each job’s steps with name, status, and conclusion.
- Preserve existing job duration/failure counting behavior while carrying the additional step payload.

Thread steps through audit models
- Add step metadata to the internal job model used by run processing and cached summaries.
- Map that metadata into audit JSON so each jobs[] entry can expose steps[] instead of dropping it during report assembly.

Cover the regression at both layers
- Add a focused fetch-path test proving step data is parsed from the GitHub CLI/API response.
- Update audit report tests to verify step arrays survive JSON rendering.

Example output after this change:

{
  "name": "agent",
  "status": "completed",
  "conclusion": "failure",
  "steps": [
    { "name": "Set up job", "status": "completed", "conclusion": "success" },
    { "name": "Run agent", "status": "completed", "conclusion": "failure" }
  ]
}

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions · 2026-06-29T11:39:29Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-06-29T11:39:30Z

✅ Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-29T11:39:47Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #42222 does not have the 'implementation' label and has ≤100 new lines of code in business logic directories (exactly 100 additions, threshold is >100).

github-actions · 2026-06-29T11:39:56Z

✅ PR Code Quality Reviewer completed the code quality review.

Copilot

Pull request overview

This PR enhances gh aw audit structured output by carrying GitHub Actions job step status/conclusion details end-to-end (API fetch → internal models → rendered audit JSON), so failed runs can be diagnosed at step granularity.

Changes:

Extend the jobs API fetch projection to include each job’s steps[] with name, status, and conclusion.
Thread step metadata through the internal job model and into the audit report JSON model.
Add/extend tests to cover step parsing from gh api output and preservation through JSON rendering.

Summary per file

File	Description
pkg/cli/logs_models.go	Adds `JobStep` and `JobInfo.Steps` to carry step metadata in the internal model.
pkg/cli/logs_github_api.go	Extends the `gh api .../jobs` jq projection to include per-job `steps[]` payload.
pkg/cli/logs_github_api_test.go	Adds a focused test that validates steps are parsed from the fetch path and failure counting/duration still work.
pkg/cli/audit_report.go	Adds `JobStepData` and maps internal step data into structured audit output.
pkg/cli/audit_report_test.go	Updates audit and JSON rendering tests to assert job steps survive report assembly and JSON encoding/decoding.

Review details

Files reviewed: 5/5 changed files
Comments generated: 1
Review effort level: Low

 	output, err := workflow.RunGHCombined("Fetching job details...", "api",
 		fmt.Sprintf("repos/{owner}/{repo}/actions/runs/%d/jobs", runID),
-		"--jq", ".jobs[] | {name: .name, status: .status, conclusion: .conclusion, started_at: .started_at, completed_at: .completed_at}")
+		"--jq", ".jobs[] | {name: .name, status: .status, conclusion: .conclusion, started_at: .started_at, completed_at: .completed_at, steps: ((.steps // []) | map({name: .name, status: .status, conclusion: .conclusion}))}")


github-actions · 2026-06-29T11:48:07Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 80/100 — Excellent

Analyzed 3 test(s) across 2 Go test files: 3 design tests, 0 implementation tests, 0 hard guideline violations (4 missing assertion messages noted).

📊 Metrics & Test Classification (3 tests analyzed)

Metric	Value
New/modified tests analyzed	3
✅ Design tests (behavioral contracts)	3 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	2 (67%)
Duplicate test clusters	0
Test inflation detected	YES (both test files exceed 2:1 ratio vs production changes)
🚨 Coding-guideline violations	0 hard violations; 4 missing assertion messages (soft flag)

Test	File	Classification	Issues Detected
`TestBuildAuditDataComplete` (modified)	`pkg/cli/audit_report_test.go:686`	✅ Design	—
`TestRenderJSONComplete` (modified)	`pkg/cli/audit_report_test.go:923`	✅ Design	Happy-path steps only
`TestFetchJobDetailsWithCountsIncludesSteps`	`pkg/cli/logs_github_api_test.go:228`	✅ Design	4 bare `require.*` calls (no message arg)

Go: 2 (*_test.go); JavaScript: 0. No other languages detected.

Inflation detail (binary −10 pt deduction applied):

audit_report_test.go: 44 lines added vs audit_report.go 15 lines added → 2.93:1
logs_github_api_test.go: 32 lines added vs logs_github_api.go 1 line added → 32:1

Both exceed the 2:1 threshold. Note: the high ratio for logs_github_api_test.go is expected — the production change is a single-line jq query expansion, while the test requires a full fake-gh harness to exercise it. The penalty is applied per rubric but does not indicate poor test quality here.

Score breakdown:

Behavioral coverage (40 pts): 3/3 design → 40
Error/edge coverage (30 pts): 2/3 with edge cases → 20
Low duplication (20 pts): 0 clusters → 20
Proportional growth (10 pts): inflation detected → 0
Total: 80/100

⚠️ Flagged Tests — Soft Concerns (1 item)

TestFetchJobDetailsWithCountsIncludesSteps (pkg/cli/logs_github_api_test.go:228) — ⚠️ Missing assertion messages: four require.NoError / require.Len calls lack descriptive message arguments, making failure output harder to diagnose. The guideline asks for a context string on every assertion call. Suggested fix: add messages, e.g. require.NoError(t, err, "fetchJobDetailsWithCounts should not error") and require.Len(t, jobs, 1, "should parse exactly one job from fake gh output"). This is a soft concern and does not trigger a block.

Verdict

✅ Check passed. 0% implementation tests (threshold: 30%). All three tests verify observable behavioral contracts — step data is parsed, propagated through the audit pipeline, and serialized correctly into JSON output.

🧪 Test quality analysis by Test Quality Sentinel · 57.6 AIC · ⌖ 17.5 AIC · ⊞ 8.1K · ◷

github-actions

Test Quality Sentinel: 80/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 3 tests verify observable behavioral contracts covering step data parsing, audit pipeline propagation, and JSON serialization. One soft concern: 4 bare require.* calls in TestFetchJobDetailsWithCountsIncludesSteps lack descriptive message arguments (not a block).

github-actions

Skills-Based Review 🧠

Applied /tdd and /zoom-out — approving with minor suggestions.

The fix is correct and well-targeted: the jq projection was the single missing link, and threading JobStep through the model + report layers is done cleanly. Regression tests at both the fetch and report layers are a strong signal of healthy TDD practice.

📋 Key Themes & Highlights

Minor Suggestions (non-blocking)

Missing edge-case test: the (.steps // []) null-guard in the jq projection has no test proving it handles a job with no steps field. The happy-path test is solid, but the fallback path is untested.
Dual struct types: JobStep and JobStepData are structurally identical; a type alias (type JobStepData = JobStep) would eliminate the per-field mapping and reduce future drift risk.
Missing console:"-" tag: Steps on JobData is the only field without a console struct tag; an explicit opt-out makes the intent clear to future maintainers.

Positive Highlights

✅ Root cause fixed at the right layer — the jq projection, not a downstream workaround
✅ (.steps // []) guard handles null steps defensively
✅ Fake-gh-binary test pattern is consistent with the rest of the file
✅ Cache invalidation (CLI version mismatch) ensures existing run_summary.json files without steps are automatically reprocessed
✅ Step data flows end-to-end: API → model → cache → audit report → JSON

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 63.6 AIC · ⌖ 11.3 AIC · ⊞ 6.6K

github-actions · 2026-06-29T11:48:19Z

+	require.NoError(t, err)
+	assert.Contains(t, string(argsLog), "repos/{owner}/{repo}/actions/runs/28307653871/jobs", "should query the run jobs API")
+	assert.Contains(t, string(argsLog), "steps:", "gh jq projection should request step data")
+}


[/tdd] Missing edge-case test: job with null or absent steps — the (.steps // []) jq guard is untested.

The guard ensures API responses without a steps field still parse cleanly, but no test exercises this path. A job that returns JSON without a steps key could silently produce a nil slice or a parse error without anyone noticing.

💡 Suggested addition

Add a second sub-test (or table-driven variant) feeding output without a steps field and assert jobs[0].Steps is nil/empty:

// Verify a job with no steps field round-trips without error
assert.Empty(t, jobs[0].Steps, "job with no steps field should produce empty Steps slice")

This documents the contract of (.steps // []) and prevents a regression if the jq expression is later simplified.
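As a rough illustration of the cases such a test would need to cover, the sketch below decodes three payload shapes with the standard library only. It exercises Go-side parsing, not the jq guard itself; `parseJob`, the trimmed `JobInfo`, and the sample payloads are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobStep mirrors the step fields carried by the PR's model.
type JobStep struct {
	Name       string `json:"name"`
	Status     string `json:"status,omitempty"`
	Conclusion string `json:"conclusion,omitempty"`
}

// JobInfo is a trimmed, hypothetical job model with only the fields needed here.
type JobInfo struct {
	Name  string    `json:"name"`
	Steps []JobStep `json:"steps"`
}

// parseJob is a hypothetical helper that decodes a single job object.
func parseJob(raw string) (JobInfo, error) {
	var job JobInfo
	if err := json.Unmarshal([]byte(raw), &job); err != nil {
		return JobInfo{}, fmt.Errorf("parsing job: %w", err)
	}
	return job, nil
}

func main() {
	cases := []struct{ name, raw string }{
		{"empty array (what the jq guard emits)", `{"name":"agent","steps":[]}`},
		{"explicit null", `{"name":"agent","steps":null}`},
		{"key absent", `{"name":"agent"}`},
	}
	for _, tc := range cases {
		job, err := parseJob(tc.raw)
		if err != nil {
			panic(err)
		}
		// len() is the reliable check: some cases yield nil, others an empty slice.
		fmt.Printf("%-40s len(Steps)=%d\n", tc.name, len(job.Steps))
	}
}
```

In a real test these cases would most likely become rows of a table-driven test fed through the fake gh harness.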

@copilot please address this.

github-actions · 2026-06-29T11:48:19Z

+	Name       string `json:"name"`
+	Status     string `json:"status,omitempty"`
+	Conclusion string `json:"conclusion,omitempty"`
 }


[/zoom-out] JobStep and JobStepData are structurally identical — the type-cast conversion adds maintenance debt.

The two structs have the same three fields. return JobStepData(step) works because Go allows a direct type conversion between struct types whose underlying types are identical (struct tags are ignored for this purpose), but any future field added to one will require a matching change in the other (and an updated mapping). Since JobStepData carries no extra behaviour or tags beyond what JobStep already has, consider using a type alias instead:

💡 Simpler alternative

// In audit_report.go: eliminate the duplicate struct entirely
type JobStepData = JobStep // type alias, no conversion needed

Then the mapping in buildAuditData reduces to:

Steps: slices.Clone(jobDetail.Steps),

If the two types do need to diverge in the future (e.g., console tags, extra audit-only fields), switching back to a concrete struct is a one-line change.
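A small self-contained sketch of the alias approach, assuming Go 1.21 or later for the `slices` package. The `cloneSteps` helper and the sample data are hypothetical; only the type names come from the PR.

```go
package main

import (
	"fmt"
	"slices"
)

// JobStep is the internal step model.
type JobStep struct {
	Name       string `json:"name"`
	Status     string `json:"status,omitempty"`
	Conclusion string `json:"conclusion,omitempty"`
}

// JobStepData is an alias: it denotes the same type as JobStep, not a new one,
// so a []JobStep can be used wherever a []JobStepData is expected.
type JobStepData = JobStep

// cloneSteps copies the slice so the report does not share a backing array
// with the internal model.
func cloneSteps(steps []JobStep) []JobStepData {
	return slices.Clone(steps)
}

func main() {
	src := []JobStep{{Name: "Run agent", Status: "completed", Conclusion: "failure"}}
	out := cloneSteps(src)
	out[0].Conclusion = "success" // mutating the copy leaves src untouched
	fmt.Println(src[0].Conclusion, out[0].Conclusion)
}
```

The trade-off is that an alias cannot carry different struct tags or methods, so it fits only while the two models are meant to stay identical.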

@copilot please address this.

github-actions · 2026-06-29T11:48:19Z

+	Status     string        `json:"status" console:"header:Status"`
+	Conclusion string        `json:"conclusion,omitempty" console:"header:Conclusion,omitempty"`
+	Duration   string        `json:"duration,omitempty" console:"header:Duration,omitempty"`
+	Steps      []JobStepData `json:"steps,omitempty"`


[/zoom-out] Steps field has no console struct tag — steps will be silently invisible in console (tabular) rendering.

Every other field in JobData carries a console:"header:..." tag. Omitting it from Steps is probably intentional (nested slices are hard to render in a table), but leaving it undocumented is a subtle inconsistency.

💡 Suggestion

Add an explicit opt-out tag so the intent is clear to future maintainers:

Steps []JobStepData `json:"steps,omitempty" console:"-"`

The console:"-" pattern is already used in this file (see AwContext) for fields that should be excluded from table output.
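To make the opt-out concrete, here is a sketch of how a tag-driven table renderer might honour `console:"-"`. This is not gh-aw's renderer: `consoleHeaders` is a hypothetical helper, the `Name` field and its tags are assumed, and the treatment of untagged fields (skipped here) is a guess about behaviour this thread does not show.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// JobStepData is the audit-output step model.
type JobStepData struct {
	Name       string `json:"name"`
	Status     string `json:"status,omitempty"`
	Conclusion string `json:"conclusion,omitempty"`
}

// JobData follows the fields shown in the diff; Name and its tags are assumed.
type JobData struct {
	Name       string        `json:"name" console:"header:Name"`
	Status     string        `json:"status" console:"header:Status"`
	Conclusion string        `json:"conclusion,omitempty" console:"header:Conclusion,omitempty"`
	Duration   string        `json:"duration,omitempty" console:"header:Duration,omitempty"`
	Steps      []JobStepData `json:"steps,omitempty" console:"-"`
}

// consoleHeaders collects column headers from `console` tags on a struct value.
// Fields tagged "-" are skipped explicitly; untagged fields are skipped as well.
func consoleHeaders(v any) []string {
	t := reflect.TypeOf(v) // v must be a struct value, not a pointer
	var headers []string
	for i := 0; i < t.NumField(); i++ {
		tag, ok := t.Field(i).Tag.Lookup("console")
		if !ok || tag == "-" {
			continue
		}
		for _, part := range strings.Split(tag, ",") {
			if name, found := strings.CutPrefix(part, "header:"); found {
				headers = append(headers, name)
			}
		}
	}
	return headers
}

func main() {
	fmt.Println(consoleHeaders(JobData{}))
}
```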

@copilot please address this.

github-actions

Review: Populate audit job steps

Clean, focused change. The plumbing — jq projection → JobStep model → JobStepData in the audit output — is correct end-to-end, and the two new test paths (fetch layer + report layer) give good confidence.

Non-blocking notes (see inline comments):

Missing "no steps" test case — the fetch test only covers the happy path where steps are present; a case with "steps":null or "steps":[] in the fake API response would prove the // [] jq fallback and guard backward-compat for pre-existing data.
sliceutil.Map + nil input — when jobDetail.Steps is nil, sliceutil.Map returns []JobStepData{} (non-nil empty). omitempty handles the JSON output correctly; just prefer len(job.Steps) == 0 over == nil at any future call-site.

Neither issue is blocking. The implementation is correct and the test coverage is meaningfully improved.

🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 48.9 AIC · ⌖ 6.51 AIC · ⊞ 4.9K

github-actions · 2026-06-29T11:49:21Z

+	argsLog, err := os.ReadFile(argsLogPath)
+	require.NoError(t, err)
+	assert.Contains(t, string(argsLog), "repos/{owner}/{repo}/actions/runs/28307653871/jobs", "should query the run jobs API")
+	assert.Contains(t, string(argsLog), "steps:", "gh jq projection should request step data")


The assertion confirms the jq projection mentions steps:, which is useful. However, the test only exercises the case where the API returns populated steps. A second sub-test or table-driven case where the fake gh script emits "steps":null or "steps":[] would verify that the // [] fallback in the jq expression works and that fetchJobDetailsWithCounts returns an empty/nil Steps slice without error — providing a backward-compat regression guard for runs fetched before step data was available.

@copilot please address this.

github-actions · 2026-06-29T11:49:21Z

 			Status:     jobDetail.Status,
 			Conclusion: jobDetail.Conclusion,
+			Steps: sliceutil.Map(jobDetail.Steps, func(step JobStep) JobStepData {
+				return JobStepData(step)


The direct struct conversion JobStepData(step) is valid today because JobStep and JobStepData are structurally identical (same field names, types, and tags). Go's type conversion rules mean any field divergence will be caught at compile time, which is safe.

One subtle side-effect: sliceutil.Map calls make([]U, len(slice)), so when jobDetail.Steps is nil (no steps from the API), this returns []JobStepData{} — an empty non-nil slice — rather than nil. The json:"steps,omitempty" tag correctly omits both nil and empty slices from JSON output, so serialized audit reports are unaffected. Just be aware that in-memory checks of job.Steps == nil won't reliably distinguish "no step data" from "zero steps"; prefer len(job.Steps) == 0.

github-actions

Non-blocking observations

Correct fix for a real gap — step data flows cleanly from the jobs API through the model and into audit JSON, with tests at both layers. Two non-blocking observations worth considering before this pattern scales:

Findings summary

1. Struct cast coupling (line 363): JobStepData(step) is a named-struct type conversion that compiles only when JobStep and JobStepData are structurally identical. They are today, but the two types serve distinct purposes (internal API model vs. audit output model) and will likely diverge over time. Explicit field mapping is cleaner and safer.

2. jq expression not exercised by tests (logs_github_api_test.go line 254) — the fake gh binary always cats pre-baked JSON, so the actual jq projection in logs_github_api.go is never executed during CI. The assert.Contains(t, ..., "steps:") assertion is too coarse to catch jq field-name typos. A more specific substring check (e.g., ".steps // []") would catch obvious regressions.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

proxy.golang.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "proxy.golang.org"

See Network Configuration for more information.

🔎 Code quality review by PR Code Quality Reviewer · 84.8 AIC · ⌖ 7.19 AIC · ⊞ 5.2K

github-actions · 2026-06-29T11:51:00Z

 			Status:     jobDetail.Status,
 			Conclusion: jobDetail.Conclusion,
+			Steps: sliceutil.Map(jobDetail.Steps, func(step JobStep) JobStepData {
+				return JobStepData(step)


Brittle named-struct cast will surprise future maintainers: JobStepData(step) silently depends on JobStep and JobStepData having identical field names, types, and order — any divergence causes a compile error whose origin is non-obvious.

💡 Suggested fix

Use explicit field mapping instead:

Steps: sliceutil.Map(jobDetail.Steps, func(step JobStep) JobStepData {
	return JobStepData{Name: step.Name, Status: step.Status, Conclusion: step.Conclusion}
}),

This makes the intent clear — only these three fields flow from the internal API model to the audit output model — and compiles cleanly even if either struct gains fields independently (e.g., Number int on JobStep for step ordering, without leaking it to audit consumers).
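The divergence scenario described above can be sketched as follows. The `Number` field is the reviewer's hypothetical, and `toStepData` is an illustrative helper rather than code from the PR.

```go
package main

import "fmt"

// JobStep is the internal model; Number is a hypothetical extra field that
// the audit output should not expose.
type JobStep struct {
	Number     int    `json:"number"`
	Name       string `json:"name"`
	Status     string `json:"status,omitempty"`
	Conclusion string `json:"conclusion,omitempty"`
}

// JobStepData is the audit-output model and keeps only three fields.
type JobStepData struct {
	Name       string `json:"name"`
	Status     string `json:"status,omitempty"`
	Conclusion string `json:"conclusion,omitempty"`
}

// toStepData maps field by field. A direct conversion, JobStepData(s), would
// no longer compile here because the two struct types now differ.
func toStepData(steps []JobStep) []JobStepData {
	out := make([]JobStepData, 0, len(steps))
	for _, s := range steps {
		out = append(out, JobStepData{
			Name:       s.Name,
			Status:     s.Status,
			Conclusion: s.Conclusion,
		})
	}
	return out
}

func main() {
	steps := []JobStep{
		{Number: 1, Name: "Set up job", Status: "completed", Conclusion: "success"},
		{Number: 2, Name: "Run agent", Status: "completed", Conclusion: "failure"},
	}
	fmt.Printf("%+v\n", toStepData(steps))
}
```

Explicit mapping costs a few more lines but keeps the output model decoupled from whatever the internal model later carries.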

github-actions · 2026-06-29T11:51:00Z

+	argsLog, err := os.ReadFile(argsLogPath)
+	require.NoError(t, err)
+	assert.Contains(t, string(argsLog), "repos/{owner}/{repo}/actions/runs/28307653871/jobs", "should query the run jobs API")
+	assert.Contains(t, string(argsLog), "steps:", "gh jq projection should request step data")


The assert.Contains(t, ..., "steps:") check does not validate the jq expression: the fake gh binary cats pre-baked JSON regardless of the jq argument passed, so a typo in the jq projection (e.g., .step // [] instead of .steps // []) would still produce correct steps in the test output and the assertion would still pass.

💡 What this means

The test verifies Go-side parsing of step data — which is correct and valuable. But the actual jq expression in logs_github_api.go line 87 is never executed during tests. A regression in that expression (wrong field name, broken fallback, etc.) would only be caught by running against a real GitHub API.

Consider documenting this known gap in a comment, or strengthening the assertion to check for a more specific substring of the jq expression (e.g., .steps // []), at minimum catching obvious field-name typos:

assert.Contains(t, string(argsLog), ".steps // []", "jq must use .steps with empty-array fallback")
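One way to make that kind of check a little more structured is to pull out the `--jq` value and test it for several required fragments at once. The sketch below assumes the recorded arguments are available as a slice; `jqFilter` and `missingFragments` are hypothetical helpers, and substring checks of this kind still do not execute the jq expression.

```go
package main

import (
	"fmt"
	"strings"
)

// jqFilter returns the value following "--jq" in a recorded argument list.
func jqFilter(args []string) (string, bool) {
	for i, arg := range args {
		if arg == "--jq" && i+1 < len(args) {
			return args[i+1], true
		}
	}
	return "", false
}

// missingFragments reports which required fragments do not appear in the filter.
func missingFragments(filter string, required []string) []string {
	var missing []string
	for _, frag := range required {
		if !strings.Contains(filter, frag) {
			missing = append(missing, frag)
		}
	}
	return missing
}

func main() {
	// A shortened, made-up argument list in the shape the fake gh would record.
	args := []string{
		"api", "repos/{owner}/{repo}/actions/runs/1/jobs",
		"--jq", ".jobs[] | {name: .name, steps: ((.steps // []) | map({name: .name}))}",
	}
	filter, ok := jqFilter(args)
	if !ok {
		panic("no --jq argument recorded")
	}
	fmt.Println(missingFragments(filter, []string{".steps // []", "map({name: .name"}))
}
```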

pelikhan · 2026-06-29T11:54:22Z

@copilot run pr-finisher skill

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-29T12:04:58Z

@copilot run pr-finisher skill

Done — addressed the null-conclusion review thread and ran the pr-finisher checks:

jq projection: changed .conclusion → (.conclusion // "") for both job and step conclusions, so null values (in-progress/queued jobs) are coerced to empty string before JSON decoding rather than forwarded as JSON null.
New test TestFetchJobDetailsWithCountsNullConclusion: verifies in-progress jobs with null conclusions are parsed and not silently dropped.
make fmt, make lint, make test-unit all pass locally.

github-actions · 2026-06-29T14:00:19Z

🤖 PR Triage — §28376613466

Field	Value
Category	`bug` (feature gap)
Risk	🟢 Low
Priority	🔴 High
Score	69 / 100
Action	🚀 `fast_track`
Age	2.8h

Score breakdown: Impact 30 + Urgency 22 + Quality 17

Rationale: gh aw audit was silently omitting step-level details from job output, leaving users without actionable failure context. Small focused fix (126+/8−, 5 files), CI fully passing. No reviewer objections. Ready for expedited human review and merge.

Generated by 🔧 PR Triage Agent · 93.2 AIC · ⌖ 14.6 AIC · ⊞ 5.4K · ◷

Initial plan

6ffd6f5

Copilot AI assigned Copilot and pelikhan Jun 29, 2026

Copilot started work on behalf of pelikhan June 29, 2026 11:18 View session

Populate audit job steps

7ff8be5

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix audit tool to populate job step details~~ Populate audit job steps in structured audit output Jun 29, 2026

Copilot finished work on behalf of pelikhan June 29, 2026 11:32

Copilot AI requested a review from pelikhan June 29, 2026 11:32

pelikhan marked this pull request as ready for review June 29, 2026 11:38

Copilot AI review requested due to automatic review settings June 29, 2026 11:38

Copilot started reviewing on behalf of pelikhan June 29, 2026 11:38 View session

Copilot AI reviewed Jun 29, 2026

github-actions Bot mentioned this pull request Jun 29, 2026

[aw] No-Op Runs #39849

Open

github-actions Bot approved these changes Jun 29, 2026

github-actions Bot reviewed Jun 29, 2026

Copilot started work on behalf of pelikhan June 29, 2026 11:54 View session

Coerce null conclusion to empty string in jq projection

8dd1d02

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 29, 2026 12:05

pelikhan merged commit e78a973 into main Jun 29, 2026
29 checks passed

pelikhan deleted the copilot/cli-tools-test-audit-job-steps branch June 29, 2026 13:51

github-actions Bot added the pr-action:fast_track label Jun 29, 2026

github-actions Bot added pr-agent:copilot-swe-agent pr-priority:high pr-risk:low pr-type:bug labels Jun 29, 2026

github-actions Bot mentioned this pull request Jun 29, 2026

[PR Triage Report] Agent PR Triage Report — 2026-06-29 Run §28376613466 #42251

Open
