Skip to content

feat: include git commit SHA in PET info response for telemetry regression tracking#473

Merged
eleanorjboyd merged 2 commits into
microsoft:mainfrom
StellaHuang95:pet-commit-sha
Jun 9, 2026
Merged

feat: include git commit SHA in PET info response for telemetry regression tracking#473
eleanorjboyd merged 2 commits into
microsoft:mainfrom
StellaHuang95:pet-commit-sha

Conversation

@StellaHuang95

Copy link
Copy Markdown
Contributor

Background

PET (Python Environment Tools) is a Rust-based JSONRPC server used by the VS Code Python extension to discover Python environments. PET does not send its own telemetry — instead, it returns metadata to the calling extension (vscode-python-environments), which forwards it via telemetry.

To help catch regressions, we need a way to know exactly which PET build a user is running by querying telemetry. PR #470 added a JSONRPC info request that returns PET's version and an optional CI build ID. However, the build number alone isn't enough to pinpoint the exact source code — we also need the git commit hash.

This PR extends the info response to include the git commit SHA baked into the binary at build time. On the extension side, the companion PR (vscode-python-environments#1548) passes this information along to telemetry, so we can correlate user-reported issues or regressions back to the specific PET commit.

What Changed

1. Build script (crates/pet/build.rs)

  • Added support for three new environment variables that CI systems set automatically:
    • PET_COMMIT_SHA — explicit override (highest priority)
    • BUILD_SOURCEVERSION — set by Azure Pipelines
    • GITHUB_SHA — set by GitHub Actions
  • When any of these is present and non-empty, the value is embedded into the binary as a compile-time constant via cargo:rustc-env=PET_COMMIT_SHA=....

2. JSONRPC response (crates/pet/src/jsonrpc.rs)

  • Added a new optional commit_sha field to the InfoResponse struct.
  • InfoResponse::current() populates it from the compile-time PET_COMMIT_SHA env var (same pattern used for build_id).
  • For local dev builds where no CI env var is set, the field is None and omitted from the JSON response.

3. JSONRPC documentation (docs/JSONRPC.md)

  • Updated the InfoResponse TypeScript interface to document the new commitSha?: string field, including where the value is sourced from.

4. Tests (crates/pet/src/jsonrpc.rs, crates/pet/tests/jsonrpc_server_test.rs, crates/pet/tests/jsonrpc_client.rs)

  • Added commit_sha field to the test client's PetInfoResponse struct.
  • Added assertions that commit_sha, when present, is non-empty (mirrors existing build_id assertion pattern).
  • Added clarifying comments explaining that both build_id and commit_sha are None in local dev builds and only populated in CI.

How It Works

CI build (Azure Pipelines / GitHub Actions)
  ↓ sets BUILD_SOURCEVERSION or GITHUB_SHA
build.rs captures it → bakes into binary as PET_COMMIT_SHA
  ↓
JSONRPC `info` request → returns { petVersion, buildId, commitSha }
  ↓
vscode-python-environments extension → includes commitSha in telemetry
  ↓
Team queries telemetry → correlates regressions to exact PET commit

Related PRs

Comment thread crates/pet/build.rs
println!("cargo:rustc-env=PET_BUILD_ID={build_id}");
}

// BUILD_SOURCEVERSION is set by Azure Pipelines; GITHUB_SHA by GitHub Actions.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot generated:
The Skeptic identified (structurally confirmed) that an explicitly-set-but-empty PET_COMMIT_SHA silently suppresses the BUILD_SOURCEVERSION/GITHUB_SHA fallbacks: or_else only fires on None, so Some("") short-circuits the chain and the trailing .filter() then drops it to None, losing the SHA entirely. This is concretely triggerable (a GitHub Actions YAML referencing an unset context yields ""). While the Advocate notes this mirrors the existing build_id logic and may be unreachable in practice, the fix is cheap — filter each candidate before falling back: .ok().filter(|v| !v.is_empty()).or_else(|| std::env::var("BUILD_SOURCEVERSION").ok().filter(|v| !v.is_empty())).... Apply the same hardening to the build_id chain above.

[verified]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 356f4af. Moved the empty-string .filter() inside each candidate so a Some("") falls through to the next env var instead of being swallowed by the trailing filter. Applied the same hardening to the �uild_id chain.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 356f4af. Moved the empty-string .filter() inside each candidate so a Some(\"\") falls through to the next env var instead of being swallowed by the trailing filter. Applied the same hardening to the build_id chain.

Comment thread crates/pet/build.rs
println!("cargo:rustc-env=PET_BUILD_ID={build_id}");
}

// BUILD_SOURCEVERSION is set by Azure Pipelines; GITHUB_SHA by GitHub Actions.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot generated:
Operational caveat from the Skeptic (Low): Azure Pipelines' BUILD_SOURCEVERSION is the ephemeral merge commit for PR-triggered builds, not the PR head commit. Since the stated goal is correlating regressions to a specific PET commit, verify the CI trigger config resolves to the intended (head) commit — this is a config check, not a code defect.

[verified]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — treating this as a config check rather than a code change. Shipped PET binaries are built from main / release pipeline runs (not PR-triggered validation), where BUILD_SOURCEVERSION resolves to the real source commit. If we ever change the release pipeline to build from a PR context, I'll set PET_COMMIT_SHA explicitly to the head SHA so this stays correct.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — treating this as a config check rather than a code change. Shipped PET binaries are built from main / release pipeline runs (not PR-triggered validation), where BUILD_SOURCEVERSION resolves to the real source commit. If we ever change the release pipeline to build from a PR context, I'll set PET_COMMIT_SHA explicitly to the head SHA so this stays correct.

Comment thread crates/pet/src/jsonrpc.rs
let info = InfoResponse::current();

assert_eq!(info.pet_version, env!("CARGO_PKG_VERSION"));
// build_id / commit_sha are populated from env vars set at compile time by CI.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot generated:
The Skeptic, Advocate, and Architect all noted these assertions are vacuously true: commit_sha.as_deref().is_none_or(|s| !s.is_empty()) passes whenever commit_sha is None, which is every local dev run — and build.rs's own .filter() already guarantees it can never be Some(""). As a result no test positively exercises the build.rs → binary → JSON path the PR claims, so a regression (the empty-string short-circuit, a serde-rename typo, a dropped option_env!) would ship green. Consider a test that, when option_env!("PET_COMMIT_SHA") is Some, asserts the serialized JSON contains a non-empty commitSha key.

[verified]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added test_info_response_serializes_camel_case_and_omits_none in 356f4af that constructs InfoResponse directly and asserts the JSON wire format — both the camelCase rename and the skip_serializing_if behavior. That catches serde-rename typos, dropped option_env!, or accidentally serializing None without requiring CI env vars at compile time. Kept the existing assertion (now renamed to ..._optional_build_metadata) as a light sanity check on the option_env! path; the new test provides the real wire-format protection.

@rchiodo

rchiodo commented Jun 9, 2026

Copy link
Copy Markdown

GitHub cannot anchor PR review comments to unchanged lines in the diff. Falling back to a general PR comment for crates/pet/src/jsonrpc.rs:L1545.

Copilot generated:
The Architect noted the test names (test_info_response_uses_package_version_and_optional_build_id here, and info_reports_pet_version_and_optional_build_id in jsonrpc_server_test.rs) now also assert on commit_sha but only mention build_id. A reader scanning test names won't know commit_sha is covered — consider renaming to ..._build_id_and_commit_sha or ..._optional_build_metadata.

[verified]

let info = client.info().expect("info request failed");

assert_eq!(info.pet_version, env!("CARGO_PKG_VERSION"));
// build_id / commit_sha are populated from env vars set at compile time by CI.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot generated:
Same vacuous-assertion concern as the unit test: commit_sha/build_id are None in local and most CI runs, so this assertion can essentially never fail and provides little real coverage of the end-to-end wire format. A round-trip test that sets the env var and asserts the commitSha JSON key would give genuine protection.

[verified]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same fix path as the unit-test thread: the new test_info_response_serializes_camel_case_and_omits_none in 356f4af covers the wire format at the serialization layer, which is where a serde-rename / skip_serializing_if regression would actually break. Setting compile-time env vars from an integration test would require rebuilding the binary, so I kept this assertion (renamed to info_reports_pet_version_and_optional_build_metadata) as a smoke check that deserialization still works when a SHA is present in a CI build.

@brettcannon brettcannon left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Copilot comments look reasonable.

…re-format test, rename tests

- build.rs: push empty-string filter into each candidate of build_id/commit_sha chains so an explicitly-set-but-empty primary var (e.g. PET_COMMIT_SHA from an unset GitHub Actions context) doesn't short-circuit fallbacks.
- jsonrpc.rs: add test_info_response_serializes_camel_case_and_omits_none covering camelCase rename + skip_serializing_if without requiring CI env vars at compile time.
- rename existing tests to ..._optional_build_metadata for accurate scope.
@StellaHuang95

Copy link
Copy Markdown
Contributor Author

Addressed Copilot feedback in 356f4af:

  • build.rs empty-string short-circuit (thread 1): pushed the .filter(|v| !v.is_empty()) inside each candidate of both the build_id and commit_sha chains so a Some("") from an unset GitHub Actions context falls through to the next env var instead of being silently swallowed.
  • BUILD_SOURCEVERSION on PR-triggered Azure builds (thread 2): acknowledged as a config concern, not a code change — shipped binaries come from release pipeline runs where BUILD_SOURCEVERSION is the real source commit.
  • Vacuous unit-test assertion (thread 3): added test_info_response_serializes_camel_case_and_omits_none that round-trips InfoResponse through serde_json and asserts on petVersion / buildId / commitSha keys plus omission of None fields. Gives real wire-format coverage without needing CI env vars at compile time.
  • Vacuous integration-test assertion (thread 4): the new unit test above covers the wire format end-to-end at serialization; kept the existing E2E assertion as a smoke check.
  • Test naming (unanchored comment): renamed both tests to ..._optional_build_metadata so the scope is obvious to a reader scanning test names.

@StellaHuang95 StellaHuang95 requested review from brettcannon, karthiknadig and rchiodo and removed request for brettcannon June 9, 2026 20:32

@rchiodo rchiodo left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved via Review Center.

@eleanorjboyd eleanorjboyd merged commit 990a8ff into microsoft:main Jun 9, 2026
20 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants