OTel Instrumentation Improvement: propagate gh-aw.engine.id to all setup spans
Analysis Date: 2026-05-15
Priority: Medium
Effort: Small (< 2h)
Problem
The gh-aw.<job>.setup span emitted by actions/setup/js/send_otlp_span.cjs (sendJobSetupSpan) is missing the gh-aw.engine.id attribute on every job. The conclusion span carries gh-aw.engine.id because by then /tmp/gh-aw/aw_info.json has been written, but at setup time that file does not exist yet and no GH_AW_INFO_ENGINE_ID env var is set on the setup step.
A DevOps engineer cannot answer: “What is the p95 setup latency for Claude vs Codex vs Copilot workflows?” or “Which engine is most likely to fail before reaching the agent step?” — because every setup span is unlabelled with respect to the engine.
Why This Matters (DevOps Perspective)
- Setup latency is workload-shaped (MCP install, firewall config, sandbox boot). Per-engine breakdowns are the first thing on-call wants when activation gets slow.
- Setup failures (MCP/firewall/auth) often happen before any agent step writes aw_info.json, so the conclusion span’s engine attribute is the only carrier today, but conclusion spans are not always emitted on early aborts (e.g. timeouts in MCP install), leaving a class of failures with no engine.id anywhere in the trace.
- Activation and safe-outputs jobs do not produce aw_info.json themselves; their setup AND conclusion spans both lack engine grouping today. Adding the attribute to the setup step env fixes both at once.
- Unblocks Grafana / Honeycomb / Datadog panels grouped by gh-aw.engine.id for the setup phase, lowering MTTR on “my Claude workflow is slow today” incidents.
Current Behavior
send_otlp_span.cjs resolves the engine ID with this priority:
// actions/setup/js/send_otlp_span.cjs:176-178
function resolveEngineId(awInfo) {
  return readContextString(awInfo.engine_id)
    || readContextString(awInfo.context?.engine_id)
    || process.env.GH_AW_INFO_ENGINE_ID
    || "";
}
At the time the setup span is sent:
- /tmp/gh-aw/aw_info.json does not exist yet (it is written by generate_aw_info.cjs later in the agent step).
- process.env.GH_AW_INFO_ENGINE_ID is not set on the setup step, because the compiler only emits it on the Generate agentic run info step:
// pkg/workflow/compiler_yaml.go:797-801
yaml.WriteString(" - name: Generate agentic run info\n")
yaml.WriteString(" id: generate_aw_info\n")
yaml.WriteString(" env:\n")
fmt.Fprintf(yaml, " GH_AW_INFO_ENGINE_ID: \"%s\"\n", engineID)
The setup step in compiler_yaml_step_generation.go:185-198 sets GH_AW_SETUP_WORKFLOW_NAME, GH_AW_CURRENT_WORKFLOW_REF, GH_AW_INFO_VERSION, and GH_AW_INFO_BODY_MODIFIED — but not GH_AW_INFO_ENGINE_ID.
Result: every gh-aw.<job>.setup span ships without gh-aw.engine.id.
Proposed Change
Propagate the resolved engine ID to the setup step env in pkg/workflow/compiler_yaml_step_generation.go, both in dev/release mode and in script mode. Since generateSetupStep does not currently receive the engine ID, pass it through or read it from data the same way generateCreateAwInfo does.
// pkg/workflow/compiler_yaml_step_generation.go (dev/release mode, ~line 185)
lines = append(lines,
	" env:\n",
	fmt.Sprintf(" GH_AW_SETUP_WORKFLOW_NAME: %q\n", data.Name),
	fmt.Sprintf(" GH_AW_CURRENT_WORKFLOW_REF: %s\n", buildSetupWorkflowRefExpr(data)),
)
if engineID := resolveEngineIDForSetup(data); engineID != "" {
	lines = append(lines, fmt.Sprintf(" GH_AW_INFO_ENGINE_ID: %q\n", engineID))
}
Add the same line in the script-mode branch (~line 143). The helper mirrors what generateCreateAwInfo already does:
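// resolveEngineIDForSetup returns the engine ID for the setup step env:
// the explicit engine config first, then the AI field, else "".
func resolveEngineIDForSetup(data *WorkflowData) string {
	if data == nil {
		return ""
	}
	if data.EngineConfig != nil && data.EngineConfig.ID != "" {
		return data.EngineConfig.ID
	}
	if data.AI != "" {
		return data.AI
	}
	return ""
}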
No change is required in send_otlp_span.cjs: resolveEngineId already reads process.env.GH_AW_INFO_ENGINE_ID as a fallback. That fallback never yields a value at setup time today, because the env var is never set on the setup step.
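To make the helper's precedence concrete, here is a minimal runnable sketch. The WorkflowData and EngineConfig types are pared-down stand-ins that model only the fields the helper reads, and setupEnvLines is a hypothetical wrapper, not a function in the compiler. The indentation inside the YAML strings is illustrative.

```go
package main

import "fmt"

// EngineConfig and WorkflowData are simplified stand-ins (assumptions) for
// the real compiler types; only the fields used by the helper are included.
type EngineConfig struct {
	ID string
}

type WorkflowData struct {
	Name         string
	AI           string
	EngineConfig *EngineConfig
}

// resolveEngineIDForSetup prefers the explicit engine config, then the AI
// field, and returns "" when neither is set.
func resolveEngineIDForSetup(data *WorkflowData) string {
	if data == nil {
		return ""
	}
	if data.EngineConfig != nil && data.EngineConfig.ID != "" {
		return data.EngineConfig.ID
	}
	if data.AI != "" {
		return data.AI
	}
	return ""
}

// setupEnvLines is a hypothetical helper that builds the setup step env
// entries and omits GH_AW_INFO_ENGINE_ID when no engine resolves.
func setupEnvLines(data *WorkflowData) []string {
	lines := []string{"env:\n"}
	if data != nil {
		lines = append(lines, fmt.Sprintf("  GH_AW_SETUP_WORKFLOW_NAME: %q\n", data.Name))
	}
	if engineID := resolveEngineIDForSetup(data); engineID != "" {
		lines = append(lines, fmt.Sprintf("  GH_AW_INFO_ENGINE_ID: %q\n", engineID))
	}
	return lines
}

func main() {
	cases := []*WorkflowData{
		{Name: "explicit engine", EngineConfig: &EngineConfig{ID: "claude"}},
		{Name: "AI field only", AI: "codex"},
		{Name: "no engine"},
	}
	for _, c := range cases {
		for _, line := range setupEnvLines(c) {
			fmt.Print(line)
		}
	}
}
```

Skipping the entry when the ID is empty keeps the generated YAML free of an empty GH_AW_INFO_ENGINE_ID value, which matches the guarded append in the proposed change.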
Expected Outcome
After this change:
- In Grafana / Honeycomb / Datadog: gh-aw.engine.id = "claude" matches setup spans, enabling p95 by gh-aw.engine.id on the setup phase and per-engine error rate panels that include pre-agent failures.
- In the JSONL mirror: the very first span of every job (the setup span in /tmp/gh-aw/otel.jsonl) carries gh-aw.engine.id, matching what the conclusion span already provides.
- For on-call engineers: when an MCP install or firewall configuration fails during setup (no aw_info.json ever written), the trace still identifies which engine the workflow targets, instead of an unattributable setup failure.
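Implementation Steps
- Add a resolveEngineIDForSetup(data) helper in pkg/workflow/compiler_yaml_step_generation.go (mirroring the engine-ID resolution in generateCreateAwInfo).
- Emit GH_AW_INFO_ENGINE_ID: %q in both branches of generateSetupStep (script mode around line 143, dev/release mode around line 185).
- Add a test in pkg/workflow/ (e.g. alongside compiler_jobs_test.go) asserting the lock file contains GH_AW_INFO_ENGINE_ID: inside the setup step block for each engine.
- Add a test in actions/setup/js/send_otlp_span.test.cjs asserting that gh-aw.engine.id is present on the setup span when only the GH_AW_INFO_ENGINE_ID env var is set (no aw_info.json). The fallback path already exists at send_otlp_span.cjs:177; the test makes the wiring durable.
- Run make test-unit (or cd actions/setup/js && npx vitest run) and go test ./pkg/workflow/... to confirm.
- Run make fmt.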
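The lock-file assertion this change calls for (GH_AW_INFO_ENGINE_ID: present inside the setup step block) could be sketched roughly as below. This is an illustration, not the repository's test code: setupStepHasEngineID is a hypothetical helper, the step name "Setup" is a placeholder for whatever the compiler actually emits, and the scan assumes every step begins with a "- name:" line.

```go
package main

import (
	"fmt"
	"strings"
)

// setupStepHasEngineID reports whether the step whose "- name:" line contains
// stepName has a GH_AW_INFO_ENGINE_ID entry before the next "- name:" line.
// Simplification (assumption): every step starts with "- name:".
func setupStepHasEngineID(lockYAML, stepName string) bool {
	inStep := false
	for _, line := range strings.Split(lockYAML, "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "- name:") {
			// Entering a new step; track whether it is the one we want.
			inStep = strings.Contains(trimmed, stepName)
			continue
		}
		if inStep && strings.HasPrefix(trimmed, "GH_AW_INFO_ENGINE_ID:") {
			return true
		}
	}
	return false
}

// Hand-written sample lock-file fragments; "Setup" is a placeholder name.
const lockBefore = `steps:
  - name: Setup
    env:
      GH_AW_SETUP_WORKFLOW_NAME: "demo"
  - name: Generate agentic run info
    id: generate_aw_info
    env:
      GH_AW_INFO_ENGINE_ID: "claude"
`

const lockAfter = `steps:
  - name: Setup
    env:
      GH_AW_SETUP_WORKFLOW_NAME: "demo"
      GH_AW_INFO_ENGINE_ID: "claude"
  - name: Generate agentic run info
    id: generate_aw_info
    env:
      GH_AW_INFO_ENGINE_ID: "claude"
`

func main() {
	fmt.Println("before change:", setupStepHasEngineID(lockBefore, "Setup"))
	fmt.Println("after change: ", setupStepHasEngineID(lockAfter, "Setup"))
}
```

Scoping the check to the step block matters because the variable already appears on the Generate agentic run info step, so a plain substring search over the whole lock file would pass even without the fix.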
Evidence from Live Grafana Data
Grafana Cloud Tempo (grafanacloud-mnkiefer-traces, datasource UID grafanacloud-traces) returned no traces over the past 7 days for any TraceQL query ({}, {resource.service.name="gh-aw"}). That suggests an export-side issue worth a separate investigation. The finding in this report is instead grounded in the live JSONL mirror that the same instrumentation writes locally during this very run.
Sample setup span from /tmp/gh-aw/otel.jsonl written by this advisor run (workflow run §25902326163):
{
  "traceId": "c2c8ebecda224fa682e57bc5c3b5e0a7",
  "spanId": "410e8ff541f646af",
  "parentSpanId": "9a3cd09c6892fe77",
  "name": "gh-aw.agent.setup",
  "attributes": [
    { "key": "gh-aw.job.name", "value": { "stringValue": "agent" } },
    { "key": "gh-aw.workflow.name", "value": { "stringValue": "Daily Grafana OTel Instrumentation Advisor" } },
    { "key": "gh-aw.run.id", "value": { "stringValue": "25902326163" } },
    { "key": "gh-aw.event_name", "value": { "stringValue": "schedule" } },
    { "key": "gh-aw.staged", "value": { "boolValue": false } },
    { "key": "gh-aw.episode.id", "value": { "stringValue": "25902326163-1:..." } }
    /* NOTE: gh-aw.engine.id is MISSING */
  ]
}
Meanwhile /tmp/gh-aw/aw_info.json (written later in the same job) clearly resolves the engine:
{ "engine_id": "claude", "engine_name": "Claude Code", ... }
Resource attributes on the same span are fine (service.version, github.repository, github.run_id, github.event_name, deployment.environment all present) — the missing field is specifically gh-aw.engine.id at the span-attribute layer of the setup span.
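The same check can be repeated on any run's JSONL mirror. The sketch below assumes each line of /tmp/gh-aw/otel.jsonl is one span object shaped like the sample above (name plus an attributes array of key/value pairs); that one-span-per-line layout is an assumption, and setupSpansMissingEngineID is a hypothetical helper, not part of the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// attribute and span model only the fields this check needs; other
// fields in the span JSON are ignored by encoding/json.
type attribute struct {
	Key   string `json:"key"`
	Value struct {
		StringValue string `json:"stringValue"`
	} `json:"value"`
}

type span struct {
	Name       string      `json:"name"`
	Attributes []attribute `json:"attributes"`
}

// setupSpansMissingEngineID returns the names of spans ending in ".setup"
// that carry no non-empty gh-aw.engine.id attribute.
func setupSpansMissingEngineID(jsonl string) ([]string, error) {
	var missing []string
	for _, line := range strings.Split(jsonl, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		var s span
		if err := json.Unmarshal([]byte(line), &s); err != nil {
			return nil, fmt.Errorf("parse span: %w", err)
		}
		if !strings.HasSuffix(s.Name, ".setup") {
			continue
		}
		found := false
		for _, a := range s.Attributes {
			if a.Key == "gh-aw.engine.id" && a.Value.StringValue != "" {
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, s.Name)
		}
	}
	return missing, nil
}

func main() {
	// In-memory sample; a real run would pass the contents of the JSONL file.
	sample := `{"name":"gh-aw.agent.setup","attributes":[{"key":"gh-aw.job.name","value":{"stringValue":"agent"}}]}
{"name":"gh-aw.agent.conclusion","attributes":[{"key":"gh-aw.engine.id","value":{"stringValue":"claude"}}]}`
	missing, err := setupSpansMissingEngineID(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("setup spans without gh-aw.engine.id:", missing)
}
```

After the proposed change, running this against a fresh mirror should report no setup spans, which gives a quick local confirmation that does not depend on the Tempo export working.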
Related Files
- pkg/workflow/compiler_yaml_step_generation.go: emit GH_AW_INFO_ENGINE_ID env var on the setup step (both script and dev/release branches)
- pkg/workflow/compiler_yaml.go:718-725: engine-ID resolution to mirror
- actions/setup/js/send_otlp_span.cjs:176-178: existing fallback that consumes the env var (no change needed)
- actions/setup/js/send_otlp_span.test.cjs: add coverage for the env-only setup path
- pkg/workflow/compiler_jobs_test.go: add lock-file assertion that the setup step contains GH_AW_INFO_ENGINE_ID:
Generated by the Daily Grafana OTel Instrumentation Advisor workflow