azl4: enable e2e and rollback pipeline testing #701
Draft
bfjelds wants to merge 19 commits into
bfjelds commented Jun 26, 2026
    dependsOnStage: ${{ parameters.baseImageArtifactStage }}

    # Build Trident test image (regular) for AZL4
    - template: stages/build_image/build-image.yml
Pull request overview
Enables Azure Linux 4 (AZL4) end-to-end and rollback pipeline testing by adding AZL4 test images/configurations and updating Azure Pipelines templates to select and run AZL4 test matrices. It also includes harness fixes uncovered during bring-up (notably SFTP server path detection and AZL4 image build-time adjustments).
Changes:
- Add AZL4 e2e configuration set + target configuration mapping for VM test runs.
- Introduce an AZL4 MIC-based test image (plus helper scripts/systemd unit) to address initrd, SELinux xattrs, and sshd host key placement.
- Extend pipeline templates to parameterize “distro” and run AZL4 VM host tests in PR pipelines.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tools/storm/utils/ssh/sftp/sftp.go | Makes SFTP server path detection distro-aware for AZL4 compatibility. |
| tests/images/trident-testimage/base/scripts/strip-selinux-xattrs.sh | New image build script to strip inherited SELinux xattrs from AZL4 rootfs. |
| tests/images/trident-testimage/base/scripts/ssh-move-host-keys-azl4.sh | Adds AZL4 sshd HostKey drop-in pointing to keys on writable /var/srv. |
| tests/images/trident-testimage/base/scripts/rebuild-initrd-azl4.sh | Rebuilds initramfs with non-hostonly storage drivers for rollback/QEMU SATA boots. |
| tests/images/trident-testimage/base/scripts/enable-regen-sshd-keys.sh | Enables first-boot SSH host key regeneration service via wants symlink. |
| tests/images/trident-testimage/base/files/regen-sshd-keys.service | Systemd oneshot unit to generate ssh host keys under /var/srv before sshd. |
| tests/images/trident-testimage/base/files/hostname-shim.sh | Provides hostname shim for AZL4 images where the binary is missing. |
| tests/images/trident-testimage/base/baseimg-azl4.yaml | New MIC base image config for AZL4 testimage build + postCustomization scripts. |
| tests/images/testimages.py | Registers the new AZL4 test image in the test image build set. |
| tests/e2e_tests/trident_configurations/base-azl4/trident-config.yaml | Adds AZL4 base e2e host configuration (storage, netplan, users, sudoers). |
| tests/e2e_tests/trident_configurations/base-azl4/test-selection.yaml | Tags the AZL4 configuration for selection (compatible: [azl4, base]). |
| tests/e2e_tests/target-configurations-azl4.yaml | Adds AZL4 target configuration mapping for VM host daily/PR/weekly runs. |
| tests/e2e_tests/helpers/edit_host_config.py | Makes SSH key injection helper tolerant of configs without os.users. |
| tests/e2e_tests/base_test.py | Skips test_users when os.users is not present (MIC-baked users). |
| tests/e2e_tests/azl4_test.py | Adds a minimal AZL4-marked pytest test module for marker wiring. |
| .pipelines/templates/stages/testing_vm/netlaunch-testing.yml | Adds distro/stageSuffix parameters and AZL4-specific stage wiring/variables. |
| .pipelines/templates/stages/testing_common/trident-prep.yml | Allows trident-azl4-testimage as a selectable image name. |
| .pipelines/templates/stages/testing_common/get-tests.yml | Selects AZL4 target-config file based on a new distro parameter. |
| .pipelines/templates/stages/testing_common/download-test-images.yml | Allows downloading trident-azl4-testimage. |
| .pipelines/templates/stages/common_tasks/remove-from-acr.yml | Allows removing trident-azl4-testimage from ACR. |
| .pipelines/templates/stages/common_tasks/push-to-acr.yml | Allows pushing trident-azl4-testimage to ACR. |
| .pipelines/templates/e2e-template.yml | Builds AZL4 test image and runs VM host pullrequest testing for AZL4. |
The clean-install host-test image (baseimg-azl4.yaml) relocates the sshd HostKey to /var/srv/etc/ssh via ssh-move-host-keys-azl4.sh, but the srv partition is created fresh on every clean install, so no host keys exist there at first boot. sshd then exits with "no hostkeys available -- exiting" and nothing listens on TCP :22, so the "Check for trace file" step fails with "connect: connection refused" even though deployment (verified over the serial console) succeeds.

Add regen-sshd-keys.service and its enable script, mirroring the grubazl4 rollback test image, to generate the host keys under /var/srv on first boot before sshd starts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SudoSFTP hard-coded the AZL3 sftp-server path (/usr/libexec/sftp-server). On AZL4 (Fedora-based openssh) the binary lives at /usr/libexec/openssh/sftp-server, so `sudo -n /usr/libexec/sftp-server` failed (binary not found), the exec channel closed, and the SFTP client errored at the protocol handshake:

    failed to create SFTP client: error receiving version packet from server: server unexpectedly closed connection: unexpected EOF

This broke the host-test ab-update helper update-hc step (Stage and finalize A/B update into target OS B) on azl4, while plain SSH exec (get-config) still worked.

Exec the first existing sftp-server path (openssh/, plain libexec, or the Debian/Ubuntu location) so the SFTP protocol flows regardless of distro.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
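The "first existing path" probe described above can be sketched in Python as a helper that builds the remote shell snippet. This is an illustration only, not the code in sftp.go: `build_sftp_probe` and `SFTP_SERVER_CANDIDATES` are hypothetical names, the third path is an assumption about what "the Debian/Ubuntu location" refers to, and sudo handling is left out.

```python
import shlex

# Candidate locations in probe order. The first two come from the commit
# message; the third is an assumed Debian/Ubuntu location.
SFTP_SERVER_CANDIDATES = (
    "/usr/libexec/openssh/sftp-server",  # AZL4 (Fedora-based openssh)
    "/usr/libexec/sftp-server",          # AZL3
    "/usr/lib/openssh/sftp-server",      # Debian/Ubuntu (assumption)
)


def build_sftp_probe(candidates=SFTP_SERVER_CANDIDATES):
    """Return a POSIX shell snippet that execs the first executable candidate.

    If no candidate exists, the snippet prints an error and exits non-zero,
    so the caller sees an explicit failure instead of a bare EOF.
    """
    paths = " ".join(shlex.quote(p) for p in candidates)
    return (
        f"for p in {paths}; do "
        '[ -x "$p" ] && exec "$p"; '
        "done; "
        "echo 'no sftp-server found' >&2; exit 127"
    )


if __name__ == "__main__":
    print(build_sftp_probe())
```

Because `exec` replaces the shell with the sftp-server process, the SFTP client talks to the server directly over the same channel once a candidate is found.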
- strip-selinux-xattrs.sh: test setfattr exit status directly and match benign errors with a shell case (no per-entry grep, no stale rc).
- enable-regen-sshd-keys.sh: set -euo pipefail and mkdir -p the wants dir so enabling fails fast and works in minimal chroots.
- regen-sshd-keys.service: OR-style ConditionPathExists for rsa/ecdsa/ed25519 so any missing host key triggers regeneration.
- rebuild-initrd-azl4.sh: shopt -s nullglob so a missing/empty modules dir yields an empty array and hits the 0) error branch.
- baseimg-azl4.yaml: fix hostname typo azll4 -> azl4.
- edit_host_config.py: fail loudly when os.users or testing-user is missing instead of silently skipping the SSH key.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
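As an illustration of the "fail loudly" behaviour described for edit_host_config.py, here is a minimal sketch. It is not the helper's actual code: `add_ssh_key` is a hypothetical name, and the `name` / `sshPublicKeys` field names are assumptions about the host configuration layout.

```python
def add_ssh_key(host_config, user_name, public_key):
    """Append public_key to the named user under os.users.

    Raises ValueError instead of silently skipping when os.users is
    missing/empty or when the user is not present.
    """
    users = (host_config.get("os") or {}).get("users")
    if not users:
        raise ValueError("host configuration has no os.users; cannot inject SSH key")

    for user in users:
        if user.get("name") == user_name:
            # "sshPublicKeys" is an assumed field name; tolerate a null value.
            keys = user.get("sshPublicKeys") or []
            keys.append(public_key)
            user["sshPublicKeys"] = keys
            return host_config

    raise ValueError(f"user {user_name!r} not found in os.users")


if __name__ == "__main__":
    cfg = {"os": {"users": [{"name": "testing-user"}]}}
    add_ssh_key(cfg, "testing-user", "ssh-ed25519 AAAA example")
    print(cfg)
```

Raising rather than returning early means a misconfigured image shows up as a clear error at configuration time, not as a later SSH connection failure.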
Register the azl4 marker in tests/e2e_tests/pytest.ini so pytestmark in azl4_test.py no longer emits PytestUnknownMarkWarning and stays compatible with a future --strict-markers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use `sudo -n -- /bin/sh -c ...` instead of `sudo -n sh -c ...` so the SudoSFTP handshake does not depend on `sh` being discoverable via sudo's secure_path, which can be restricted on some sudoers configs and reintroduce the "unexpected EOF" handshake failure. `--` also ends sudo option parsing defensively.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- base_test.py: skip test_users when os.users is missing OR empty/null, not just when the key is absent. Avoids silently passing with no assertions on an empty list and avoids iterating over None.
- storm/sftp: run the path-probing shell unprivileged and only exec the chosen sftp-server binary under sudo (`exec sudo -n -- "$p"`), so the handshake no longer requires a root-shell sudoers entry.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
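The base_test.py change treats a missing key, a null value, and an empty list the same way. A minimal sketch of that check follows; `users_to_check` is a hypothetical helper name, and the real test may be structured differently.

```python
def users_to_check(host_config):
    """Return os.users as a list, or [] when it is missing, null, or empty.

    Normalizing all three cases to [] lets the caller make one decision:
    skip when there is nothing to assert on, iterate otherwise.
    """
    os_section = host_config.get("os") or {}
    return os_section.get("users") or []


if __name__ == "__main__":
    for cfg in ({}, {"os": None}, {"os": {"users": None}}, {"os": {"users": []}}):
        # In a pytest test this is where pytest.skip(...) would be called.
        print(cfg, "-> skip" if not users_to_check(cfg) else "-> run")
```

Skipping explicitly is the point of the change: an empty list would otherwise let the test pass with zero assertions, and a null value would fail with a TypeError while iterating.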
Summary
Brings up AZL4 (Azure Linux 4.0) end-to-end and rollback pipeline testing, including the test images, e2e configurations, and pipeline templates, plus several AZL4-specific test-harness fixes uncovered while bringing the suites green.
Changes
AZL4 e2e / rollback enablement
- AZL4 e2e configurations (`tests/e2e_tests/*azl4*`, `target-configurations-azl4.yaml`, `base-azl4/`).
- AZL4 test images (`baseimg-azl4.yaml`, `baseimg-grub-azl4.yaml`, `updateimg-grub-azl4.yaml`) and supporting scripts (initrd rebuild, SELinux xattr strip, hostname shim).
- Pipeline templates (`e2e-template.yml`, `netlaunch-testing.yml`, `testing_common/*`, `build_image/*`).
AZL4-specific harness fixes
- sshd host keys outside `/etc`: the host-test image (`baseimg-azl4.yaml`) now generates sshd host keys on the writable `/var/srv` partition at runtime via `regen-sshd-keys.service` + `enable-regen-sshd-keys.sh` (and `ssh-move-host-keys-azl4.sh` redirects sshd's HostKey there). Without this, sshd failed to start on a clean install (no host keys on the freshly recreated srv partition), so the trace-file/trident checks could not connect.
- The generated `dummy<N>` netplan device never becomes routable. On AZL4 (netplan generate/configure split) `systemd-networkd-wait-online` could block on it for its full 120s timeout, delaying `network-online.target` -> `trident.service` -> the post-update commit and inflating each rollback subtest. The generated dummy device is now marked `optional: true` (emits `RequiredForOnline=no`) so wait-online ignores it.
- `tools/storm/utils/ssh/sftp/sftp.go` hard-coded the AZL3 sftp-server path (`/usr/libexec/sftp-server`). On AZL4 (Fedora-based openssh) the binary is at `/usr/libexec/openssh/sftp-server`, so SudoSFTP failed at the SFTP handshake (unexpected EOF). The path is now auto-detected by exec-ing the first existing sftp-server, covering AZL4, AZL3, and Debian/Ubuntu.
Testing
Notes
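For reference, the `optional: true` change to the generated dummy netplan device can be sketched as a small transformation on a netplan document loaded as a dict. This is a hedged illustration, not the harness code: `mark_optional` is a hypothetical helper, and the `dummy-devices` section and `dummy0` name in the usage example are assumptions about how the device is declared.

```python
def mark_optional(netplan_config, device_name):
    """Set optional: true on device_name in every network.<section> that defines it.

    Returns True if at least one device entry was updated, False otherwise.
    """
    network = netplan_config.get("network") or {}
    found = False
    for devices in network.values():
        # Skip scalar keys such as "version" or "renderer".
        if isinstance(devices, dict) and device_name in devices:
            entry = dict(devices[device_name] or {})
            entry["optional"] = True  # per the PR text, emits RequiredForOnline=no
            devices[device_name] = entry
            found = True
    return found


if __name__ == "__main__":
    cfg = {
        "network": {
            "version": 2,
            # Section and device name are illustrative assumptions.
            "dummy-devices": {"dummy0": {"addresses": ["192.0.2.1/24"]}},
        }
    }
    print(mark_optional(cfg, "dummy0"), cfg)
```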