azl4: enable e2e and rollback pipeline testing#701

Draft
bfjelds wants to merge 19 commits into user/bfjelds/azl4-3-pipelines from user/bfjelds/azl4-4-e2e

@bfjelds bfjelds commented Jun 26, 2026

Summary

Brings up AZL4 (Azure Linux 4.0) end-to-end and rollback pipeline testing, including the test images, e2e configurations, and pipeline templates, plus several AZL4-specific test-harness fixes uncovered while bringing the suites green.

Changes

AZL4 e2e / rollback enablement

  • AZL4 e2e test configurations, target configurations, and test selection (tests/e2e_tests/*azl4*, target-configurations-azl4.yaml, base-azl4/).
  • AZL4 host-test and rollback test images (baseimg-azl4.yaml, baseimg-grub-azl4.yaml, updateimg-grub-azl4.yaml) and supporting scripts (initrd rebuild, SELinux xattr strip, hostname shim).
  • Pipeline template updates to run AZL4 distro tests (e2e-template.yml, netlaunch-testing.yml, testing_common/*, build_image/*).

AZL4-specific harness fixes

  • sshd host keys on read-only /etc — host-test image (baseimg-azl4.yaml) now generates sshd host keys on the writable /var/srv partition at runtime via regen-sshd-keys.service + enable-regen-sshd-keys.sh (and ssh-move-host-keys-azl4.sh redirects sshd's HostKey there). Without this, sshd failed to start on a clean install (no host keys on the freshly recreated srv partition), so the trace-file/trident checks could not connect.
  • networkd-wait-online stall in rollback — the rollback netplan runtime test injects a dummy<N> netplan device that never becomes routable. On AZL4 (netplan generate/configure split) systemd-networkd-wait-online could block on it for its full 120s timeout, delaying network-online.target -> trident.service -> the post-update commit and inflating each rollback subtest. The generated dummy device is now marked optional: true (emits RequiredForOnline=no) so wait-online ignores it.
  • SudoSFTP path on AZL4: tools/storm/utils/ssh/sftp/sftp.go hard-coded the AZL3 sftp-server path (/usr/libexec/sftp-server). On AZL4 (Fedora-based openssh) the binary is at /usr/libexec/openssh/sftp-server, so SudoSFTP failed at the SFTP handshake (unexpected EOF). The path is now auto-detected by exec-ing the first existing sftp-server, covering AZL4, AZL3, and Debian/Ubuntu.
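
To illustrate the networkd-wait-online fix above, here is a minimal Python sketch of a netplan fragment for a dummy device marked optional. It is not the test's actual generator: the function name `dummy_device_config` and the use of netplan's `dummy-devices` key are assumptions, since the PR text only says the generated dummy<N> device now carries `optional: true`.

```python
import json


def dummy_device_config(index: int) -> dict:
    """Build a netplan fragment for a dummy device that wait-online may skip.

    Hypothetical helper: the PR does not show the generated netplan, so the
    "dummy-devices" key and the overall layout are assumptions.
    """
    name = f"dummy{index}"
    return {
        "network": {
            "version": 2,
            "dummy-devices": {
                name: {
                    # Per the PR description, optional: true results in
                    # RequiredForOnline=no, so systemd-networkd-wait-online
                    # does not wait for this link to become routable.
                    "optional": True,
                },
            },
        }
    }


if __name__ == "__main__":
    # Print the fragment as JSON (valid YAML) for inspection.
    print(json.dumps(dummy_device_config(0), indent=2))
```

Without the `optional` flag, wait-online would treat the dummy link like any other required interface, which is what produced the 120s stall described above.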

Testing

  • Validated against AZL4 e2e, host-test, and rollback pipeline runs.

Notes

  • Draft for review while AZL4 pipeline coverage stabilizes.

@bfjelds bfjelds changed the title from "AZL4: enable e2e and rollback pipeline testing" to "azl4: enable e2e and rollback pipeline testing" on Jun 26, 2026
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 5cd9e74 to 547d478 on June 26, 2026 21:33
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch 2 times, most recently from 2980db9 to aa54ee9 on June 26, 2026 21:38
dependsOnStage: ${{ parameters.baseImageArtifactStage }}

# Build Trident test image (regular) for AZL4
- template: stages/build_image/build-image.yml

add to more than pr-e2e

Copilot AI left a comment

Pull request overview

Enables Azure Linux 4 (AZL4) end-to-end and rollback pipeline testing by adding AZL4 test images/configurations and updating Azure Pipelines templates to select and run AZL4 test matrices. It also includes harness fixes uncovered during bring-up (notably SFTP server path detection and AZL4 image build-time adjustments).

Changes:

  • Add AZL4 e2e configuration set + target configuration mapping for VM test runs.
  • Introduce an AZL4 MIC-based test image (plus helper scripts/systemd unit) to address initrd, SELinux xattrs, and sshd host key placement.
  • Extend pipeline templates to parameterize “distro” and run AZL4 VM host tests in PR pipelines.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tools/storm/utils/ssh/sftp/sftp.go | Makes SFTP server path detection distro-aware for AZL4 compatibility. |
| tests/images/trident-testimage/base/scripts/strip-selinux-xattrs.sh | New image build script to strip inherited SELinux xattrs from AZL4 rootfs. |
| tests/images/trident-testimage/base/scripts/ssh-move-host-keys-azl4.sh | Adds AZL4 sshd HostKey drop-in pointing to keys on writable /var/srv. |
| tests/images/trident-testimage/base/scripts/rebuild-initrd-azl4.sh | Rebuilds initramfs with non-hostonly storage drivers for rollback/QEMU SATA boots. |
| tests/images/trident-testimage/base/scripts/enable-regen-sshd-keys.sh | Enables first-boot SSH host key regeneration service via wants symlink. |
| tests/images/trident-testimage/base/files/regen-sshd-keys.service | Systemd oneshot unit to generate ssh host keys under /var/srv before sshd. |
| tests/images/trident-testimage/base/files/hostname-shim.sh | Provides hostname shim for AZL4 images where the binary is missing. |
| tests/images/trident-testimage/base/baseimg-azl4.yaml | New MIC base image config for AZL4 testimage build + postCustomization scripts. |
| tests/images/testimages.py | Registers the new AZL4 test image in the test image build set. |
| tests/e2e_tests/trident_configurations/base-azl4/trident-config.yaml | Adds AZL4 base e2e host configuration (storage, netplan, users, sudoers). |
| tests/e2e_tests/trident_configurations/base-azl4/test-selection.yaml | Tags the AZL4 configuration for selection (compatible: [azl4, base]). |
| tests/e2e_tests/target-configurations-azl4.yaml | Adds AZL4 target configuration mapping for VM host daily/PR/weekly runs. |
| tests/e2e_tests/helpers/edit_host_config.py | Makes SSH key injection helper tolerant of configs without os.users. |
| tests/e2e_tests/base_test.py | Skips test_users when os.users is not present (MIC-baked users). |
| tests/e2e_tests/azl4_test.py | Adds a minimal AZL4-marked pytest test module for marker wiring. |
| .pipelines/templates/stages/testing_vm/netlaunch-testing.yml | Adds distro/stageSuffix parameters and AZL4-specific stage wiring/variables. |
| .pipelines/templates/stages/testing_common/trident-prep.yml | Allows trident-azl4-testimage as a selectable image name. |
| .pipelines/templates/stages/testing_common/get-tests.yml | Selects AZL4 target-config file based on a new distro parameter. |
| .pipelines/templates/stages/testing_common/download-test-images.yml | Allows downloading trident-azl4-testimage. |
| .pipelines/templates/stages/common_tasks/remove-from-acr.yml | Allows removing trident-azl4-testimage from ACR. |
| .pipelines/templates/stages/common_tasks/push-to-acr.yml | Allows pushing trident-azl4-testimage to ACR. |
| .pipelines/templates/e2e-template.yml | Builds AZL4 test image and runs VM host pullrequest testing for AZL4. |

Comment thread tests/images/trident-testimage/base/scripts/strip-selinux-xattrs.sh Outdated
Comment thread tests/images/trident-testimage/base/scripts/rebuild-initrd-azl4.sh
Comment thread tests/images/trident-testimage/base/baseimg-azl4.yaml Outdated
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 48fc4cd to 85add72 on June 26, 2026 21:57
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from aa54ee9 to 2b543af on June 26, 2026 21:57
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 85add72 to 951a287 on June 26, 2026 22:12
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from 2b543af to 56e1507 on June 26, 2026 22:12
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 951a287 to 8783943 on June 26, 2026 22:18
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from 56e1507 to 472c8fd on June 26, 2026 22:18
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 8783943 to f6db2d2 on June 26, 2026 22:49
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from 472c8fd to b6d027f on June 26, 2026 22:49
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from f6db2d2 to 17eeb8c on June 27, 2026 00:06
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from b6d027f to c5b49f0 on June 27, 2026 00:07
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 17eeb8c to 90d040a on June 27, 2026 00:18
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from c5b49f0 to 63d6659 on June 27, 2026 00:18
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 90d040a to 2c59c3d on June 27, 2026 00:33
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from 63d6659 to dff5bc8 on June 27, 2026 00:33
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 2c59c3d to c6d2683 on June 27, 2026 00:43
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from dff5bc8 to a64a8a2 on June 27, 2026 00:43
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from c6d2683 to f8b2dd7 on June 27, 2026 00:49
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from a64a8a2 to 2494add on June 27, 2026 00:49
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from f8b2dd7 to 3bc3bbe on June 27, 2026 01:06
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from 2494add to 13a2019 on June 27, 2026 01:06
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-3-pipelines branch from 3bc3bbe to 5b34e5d on June 27, 2026 01:15
@bfjelds bfjelds force-pushed the user/bfjelds/azl4-4-e2e branch from 13a2019 to 6197f55 on June 27, 2026 01:15
bfjelds and others added 7 commits June 27, 2026 01:38
The clean-install host-test image (baseimg-azl4.yaml) relocates the sshd
HostKey to /var/srv/etc/ssh via ssh-move-host-keys-azl4.sh, but the srv
partition is created fresh on every clean install, so no host keys exist
there at first boot. sshd then exits with "no hostkeys available --
exiting" and nothing listens on TCP :22, so the "Check for trace file"
step fails with "connect: connection refused" even though deployment
(verified over the serial console) succeeds.

Add regen-sshd-keys.service and its enable script, mirroring the grubazl4
rollback test image, to generate the host keys under /var/srv on first
boot before sshd starts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
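
The first-boot behaviour described in this commit message can be sketched in Python as "generate only the host keys that are missing". This is an illustration, not the contents of regen-sshd-keys.service: the function name `missing_host_key_commands`, the `ssh_host_<type>_key` file naming, and the exact ssh-keygen invocation are assumptions; only the /var/srv/etc/ssh location and the rsa/ecdsa/ed25519 key types come from this PR.

```python
from pathlib import Path

# Key types named elsewhere in this PR (rsa/ecdsa/ed25519).
KEY_TYPES = ("rsa", "ecdsa", "ed25519")


def missing_host_key_commands(key_dir: Path) -> list[list[str]]:
    """Return one ssh-keygen argv for each host key absent from key_dir.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    commands = []
    for key_type in KEY_TYPES:
        key_path = key_dir / f"ssh_host_{key_type}_key"
        if not key_path.exists():
            # -q: quiet, -t: key type, -N "": empty passphrase, -f: output file
            commands.append(
                ["ssh-keygen", "-q", "-t", key_type, "-N", "", "-f", str(key_path)]
            )
    return commands


if __name__ == "__main__":
    # On a freshly created srv partition all three keys would be missing.
    for argv in missing_host_key_commands(Path("/var/srv/etc/ssh")):
        print(" ".join(repr(a) if a == "" else a for a in argv))
```

On a clean install the directory is empty, so all three commands would be produced; on later boots, existing keys are left untouched.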
SudoSFTP hard-coded the AZL3 sftp-server path (/usr/libexec/sftp-server).
On AZL4 (Fedora-based openssh) the binary lives at
/usr/libexec/openssh/sftp-server, so `sudo -n /usr/libexec/sftp-server`
failed (binary not found), the exec channel closed, and the SFTP client
errored at the protocol handshake:
  failed to create SFTP client: error receiving version packet from
  server: server unexpectedly closed connection: unexpected EOF
This broke the host-test ab-update helper update-hc step (Stage and
finalize A/B update into target OS B) on azl4, while plain SSH exec
(get-config) still worked.

Exec the first existing sftp-server path (openssh/, plain libexec, or the
Debian/Ubuntu location) so the SFTP protocol flows regardless of distro.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
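
As a rough illustration of the path probing described in this commit message, the following Python sketch builds a remote command that execs the first sftp-server binary it finds. The real change lives in the Go file sftp.go and is not shown in this conversation, so everything here is an assumption: the function name `sftp_server_command`, the Debian/Ubuntu path /usr/lib/openssh/sftp-server, the exit code, and the `exec sudo -n -- "$p"` form (which follows the later refinement in this PR).

```python
import shlex

# Candidate locations, in the order the commit message lists them.
SFTP_SERVER_CANDIDATES = (
    "/usr/libexec/openssh/sftp-server",  # AZL4 (from the PR description)
    "/usr/libexec/sftp-server",          # AZL3 (from the PR description)
    "/usr/lib/openssh/sftp-server",      # Debian/Ubuntu (assumed path)
)


def sftp_server_command(candidates=SFTP_SERVER_CANDIDATES) -> str:
    """Build a shell command that execs the first executable sftp-server.

    The probing loop runs unprivileged; only the chosen binary is started
    under sudo. If no candidate exists, the command exits non-zero.
    """
    paths = " ".join(shlex.quote(p) for p in candidates)
    probe = (
        f"for p in {paths}; do "
        '[ -x "$p" ] && exec sudo -n -- "$p"; '
        "done; "
        'echo "no sftp-server found" >&2; exit 127'
    )
    return "/bin/sh -c " + shlex.quote(probe)


if __name__ == "__main__":
    print(sftp_server_command())
```

Because `exec` replaces the shell with sftp-server, the SSH channel's stdin/stdout are connected directly to the server process, which is what the SFTP handshake needs.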

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

Comment thread tests/images/trident-testimage/base/files/regen-sshd-keys.service Outdated
Comment thread tests/images/trident-testimage/base/scripts/strip-selinux-xattrs.sh Outdated
Comment thread tests/images/trident-testimage/base/baseimg-azl4.yaml Outdated
Comment thread tests/e2e_tests/helpers/edit_host_config.py Outdated
- strip-selinux-xattrs.sh: test setfattr exit status directly and match
  benign errors with a shell case (no per-entry grep, no stale rc).
- enable-regen-sshd-keys.sh: set -euo pipefail and mkdir -p the wants
  dir so enabling fails fast and works in minimal chroots.
- regen-sshd-keys.service: OR-style ConditionPathExists for rsa/ecdsa/
  ed25519 so any missing host key triggers regeneration.
- rebuild-initrd-azl4.sh: shopt -s nullglob so a missing/empty modules
  dir yields an empty array and hits the 0) error branch.
- baseimg-azl4.yaml: fix hostname typo azll4 -> azl4.
- edit_host_config.py: fail loudly when os.users or testing-user is
  missing instead of silently skipping the SSH key.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
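
The "fail loudly" behaviour described for edit_host_config.py in the list above could look roughly like the sketch below. It is not the helper's actual code: the function name `add_ssh_key`, the `name` and `sshPublicKeys` field names, and the use of ValueError are assumptions made for illustration.

```python
def add_ssh_key(host_config: dict, username: str, public_key: str) -> None:
    """Append public_key to the named user, raising if it cannot be placed.

    Hypothetical sketch: field names are assumed, not taken from the PR.
    """
    users = (host_config.get("os") or {}).get("users")
    if not users:
        # Missing, null, or empty os.users: refuse to continue silently.
        raise ValueError("host configuration has no os.users; cannot inject SSH key")
    for user in users:
        if user.get("name") == username:
            user.setdefault("sshPublicKeys", []).append(public_key)
            return
    raise ValueError(f"user {username!r} not found in os.users")


if __name__ == "__main__":
    config = {"os": {"users": [{"name": "testing-user"}]}}
    add_ssh_key(config, "testing-user", "ssh-ed25519 AAAA example")
    print(config)
```

Raising here surfaces a misconfigured host configuration at edit time rather than later as an SSH connection failure.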

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Comment thread tests/e2e_tests/azl4_test.py
Register the azl4 marker in tests/e2e_tests/pytest.ini so pytestmark in
azl4_test.py no longer emits PytestUnknownMarkWarning and stays
compatible with a future --strict-markers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Comment thread tests/images/trident-testimage/base/scripts/strip-selinux-xattrs.sh

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.

Comment thread tools/storm/utils/ssh/sftp/sftp.go Outdated
Comment thread tests/images/trident-testimage/base/scripts/strip-selinux-xattrs.sh
Use `sudo -n -- /bin/sh -c ...` instead of `sudo -n sh -c ...` so the
SudoSFTP handshake does not depend on `sh` being discoverable via
sudo's secure_path, which can be restricted on some sudoers configs and
reintroduce the "unexpected EOF" handshake failure. `--` also ends sudo
option parsing defensively.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.

Comment thread tools/storm/utils/ssh/sftp/sftp.go Outdated
Comment thread tests/e2e_tests/base_test.py Outdated
- base_test.py: skip test_users when os.users is missing OR empty/null,
  not just when the key is absent. Avoids silently passing with no
  assertions on an empty list and avoids iterating over None.
- storm/sftp: run the path-probing shell unprivileged and only exec the
  chosen sftp-server binary under sudo (`exec sudo -n -- "$p"`), so the
  handshake no longer requires a root-shell sudoers entry.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
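
The base_test.py condition described above (skip when os.users is missing, null, or empty) can be expressed as a single predicate. This is a minimal sketch with a hypothetical function name, `should_skip_users_test`; the actual test code is not shown in this conversation.

```python
def should_skip_users_test(host_config: dict) -> bool:
    """Return True when there are no os.users entries to assert against.

    Covers a missing "os" section, a missing "users" key, users: null,
    and an empty users list.
    """
    os_section = host_config.get("os") or {}
    return not os_section.get("users")


if __name__ == "__main__":
    print(should_skip_users_test({"os": {"users": []}}))
    print(should_skip_users_test({"os": {"users": [{"name": "testing-user"}]}}))
```

In a pytest test, a True result would typically lead to a `pytest.skip(...)` call, so the test is reported as skipped instead of passing with no assertions.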

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.

Comment thread tools/storm/utils/ssh/sftp/sftp.go
Comment thread .pipelines/templates/stages/testing_vm/netlaunch-testing.yml

2 participants