engineering: Azure Linux 4.0 (AZL4) full enablement stack #667
/azp run [GITHUB]-trident-pr-e2e
Azure Pipelines successfully started running 1 pipeline(s).
23e5322 to 659da62
Implements AzureLinuxRelease::AzL4 variant, VERSION_ID 4.x parsing, ID_LIKE=fedora matching, updated GRUB match arms for AzL3|AzL4, and image_distro() fallback to host os-release. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
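A minimal sketch of the os-release handling this commit describes, not the PR's actual code. Only `AzureLinuxRelease::AzL4`, `VERSION_ID`, and `ID_LIKE=fedora` come from the commit message; the `AzL3` and `Other` variants, the helper names `os_release_value`, `parse_release`, and `is_fedora_like`, and the assumption that the image reports `ID=azurelinux` are hypothetical.

```rust
// Hypothetical sketch: names are illustrative, not Trident's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AzureLinuxRelease {
    AzL3,
    AzL4,
    Other,
}

/// Returns the value for `key` from os-release text, with quotes stripped.
fn os_release_value<'a>(contents: &'a str, key: &str) -> Option<&'a str> {
    contents.lines().find_map(|line| {
        let (k, v) = line.trim().split_once('=')?;
        if k == key {
            Some(v.trim().trim_matches(|c| c == '"' || c == '\''))
        } else {
            None
        }
    })
}

/// Maps the major component of VERSION_ID (for example "4" in "4.0") to a release.
/// Assumption: Azure Linux images report ID=azurelinux.
pub fn parse_release(contents: &str) -> AzureLinuxRelease {
    if os_release_value(contents, "ID") != Some("azurelinux") {
        return AzureLinuxRelease::Other;
    }
    let version = os_release_value(contents, "VERSION_ID").unwrap_or("");
    match version.split('.').next().unwrap_or("") {
        "3" => AzureLinuxRelease::AzL3,
        "4" => AzureLinuxRelease::AzL4,
        _ => AzureLinuxRelease::Other,
    }
}

/// True when ID_LIKE lists `fedora` among its space-separated tokens.
pub fn is_fedora_like(contents: &str) -> bool {
    os_release_value(contents, "ID_LIKE")
        .map(|v| v.split_whitespace().any(|t| t == "fedora"))
        .unwrap_or(false)
}
```

Matching on the major version only means any 4.x value maps to the same variant, which is one way to read "VERSION_ID 4.x parsing".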
1ca856a to 317b898
image_distro() was falling back to the host os-release whenever the image's distro was Distro::Other. This silently masked unrecognized distros as the host distro, causing GRUB config to be written for the wrong OS. Now: if an image is mounted (self.image.is_some()), always use the image's distro. Fallback to host only fires when no image is present at all (functional tests, runtime operations). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
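The rule described in this commit can be sketched as follows. This is an illustration under assumptions, not the PR's code: `Distro::Other`, `image_distro()`, and the `image` field appear in the commit message, while the `Context` and `Image` structs, the `host_distro` field, and the other `Distro` variants are hypothetical stand-ins.

```rust
// Hypothetical sketch: struct and field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Distro {
    AzureLinux3,
    AzureLinux4,
    Other,
}

pub struct Image {
    pub distro: Distro,
}

pub struct Context {
    /// Present when an OS image is mounted.
    pub image: Option<Image>,
    /// Distro read from the host's os-release.
    pub host_distro: Distro,
}

impl Context {
    /// Uses the image's distro whenever an image is present, even if it is
    /// `Distro::Other`; falls back to the host only when there is no image.
    pub fn image_distro(&self) -> Distro {
        match &self.image {
            Some(image) => image.distro,
            None => self.host_distro,
        }
    }
}
```

With this shape, an unrecognized image distro surfaces as `Distro::Other` to callers instead of being reported as the host's distro.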
Adds is_azl4_or_later() helper, generic EFI vendor-dir discovery via grub-probe, and AZL4 ESP partition layout support. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove redundant ensure!(grub_noprefix) check from ESP setup. generate_boot_filepaths() already finds a working GRUB binary (noprefix, standard, or vendor-dir). The separate policy check was redundant.
- Simplify copy_boot_files to return () instead of bool
- Attribute grub search format variants to distro conventions (AZL3/Mariner vs AZL4/Fedora), not MIC internals
- Update mixed-forms test comment to reference cross-version A/B update scenario

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
No callers remain after the noprefix check removal. Can be re-added if a future change needs version-range gating. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AZL3 ships two GRUB variants: grub2-efi-binary (prefix-relative config lookup) and grub2-efi-binary-noprefix (root-device-relative lookup). Trident's A/B update path requires the noprefix variant on AZL3. Restore the noprefix check, but scope it to AZL3 only using image_distro().is_azl3(). AZL4+ uses standard grubx64.efi in vendor directories and does not need noprefix. This replaces the previous generic ensure! + DISABLE_GRUB_NOPREFIX_CHECK flag with a targeted distro check. No escape hatch needed since the check only fires for AZL3. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep the original variable name and preserve the operator escape hatch. Minimize diff from upstream. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep the same macro as upstream to minimize diff. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep the original if/else if chain with replace (first match). No real-world grub config has multiple search lines. Minimizes diff from upstream. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1de76ba to 5d0d1e8
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
941764d to c4cecd1
AZL4 (Fedora-based) uses Boot Loader Spec entries instead of inline linux commands in grub.cfg. When grub.cfg contains blscfg and no inline linux lines, fall back to reading boot args from /boot/loader/entries/*.conf. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
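A rough sketch of the fallback described in this commit, operating on file contents that the caller has already read. The function name `boot_args` and its signature are hypothetical; the real implementation's parsing and file discovery under /boot/loader/entries may differ. The `options` key is the Boot Loader Specification field that carries kernel arguments.

```rust
/// Hypothetical sketch: extracts kernel arguments from grub.cfg text, falling
/// back to Boot Loader Spec entry text when grub.cfg only invokes `blscfg`.
pub fn boot_args(grub_cfg: &str, bls_entries: &[&str]) -> Option<String> {
    // Inline form: `linux <kernel> <args...>` inside grub.cfg.
    for line in grub_cfg.lines() {
        let mut tokens = line.split_whitespace();
        if tokens.next() == Some("linux") {
            let _kernel = tokens.next()?; // the kernel path is not part of the args
            return Some(tokens.collect::<Vec<_>>().join(" "));
        }
    }

    // No inline linux line: only consult BLS entries if grub.cfg calls blscfg.
    if !grub_cfg.lines().any(|l| l.trim() == "blscfg") {
        return None;
    }

    // Take the first `options` line found in any entry.
    bls_entries.iter().find_map(|entry| {
        entry.lines().find_map(|l| {
            let rest = l.trim().strip_prefix("options")?;
            if rest.starts_with(char::is_whitespace) {
                Some(rest.trim().to_string())
            } else {
                None
            }
        })
    })
}
```

Passing the entry contents in as strings keeps the sketch independent of the filesystem; a caller would read each `*.conf` file and pass the contents as slices.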
c4cecd1 to afb3c77
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds AZL4 build pipeline stages with MCR-hosted MIC container, BlobImageManifest class for ACG blob source downloads, and service connection runbook. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
testimages.py runs docker with the short tag (imagecustomizer:1.4.0-1) but docker pull uses the full MCR path. Without a local tag, docker run fails with 'pull access denied'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
afb3c77 to 3767fd8
AZL4 base VHDXes may continue to come from blob storage rather than the ADO feed. The trident-service RPM will come from an AZL4 package repo, not ADO. Update comments to reflect this. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
e53ffb0 to fa9f4a0
os.users alone passed. Now testing swap + /home partitions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fa9f4a0 to 0fc76f3
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
0fc76f3 to 4eab827
The COSI image user (MIC) must differ from the trident config user (os.users) to avoid /home mount conflict. AZL3 uses testuser in the COSI and testing-user in the trident config. AZL4 was using testing-user in both, causing 'Mount path /mnt/newroot/home is not empty' during install. Also restore full test config (swap, /home, os.users, os.selinux, os.netplan) and fix netplan match from enp* to eth* (AZL4 uses net.ifnames=0). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4eab827 to 745568e
COSI ESP only stores one set of boot files (~7MB). 64M was unnecessarily large. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
745568e to f5a3b53
The COSI bakes /home/testuser onto root via MIC os.users. Trident's newroot mount rejects non-empty mount points, so a separate /home partition conflicts. AZL3 avoids this by only testing /home in container mode. Container mode for AZL4 is a follow-up. Keep swap, os.users, os.selinux, os.netplan, postConfigure. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds AZL4 bare-metal simulated netlaunch pipeline stage and SELinux xattr stripping script for test image prep. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds sfdisk partition-table helper, extended offline-init for AZL4 qcow2 images, base image COSI config, and test helper scripts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
osmodifier is now a Rust crate built into the trident binary (PR #638). No separate osmodifier binary needs to be baked into test images. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f5a3b53 to 22e88da
Matches AZL3's 16M. Remove stale comment about needing 64M. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds AZL4 VM rollback test pipeline stage using storm-trident for automated rollback validation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…k config Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
22e88da to 532b19f
- dev
- preview
- release
default: release
for your testing, you can change micBuildType=dev and micVersion=
@@ -0,0 +1,81 @@
# AZL4 variant of build-image.yml.
#
# Forked from build-image.yml on 2026-05-13. Calls build-image-template-azl4.yml
i suspect we could merge these azl4 templates now, maybe using something like: baseimgBuildType=azl4-preview
or better yet, azureLinuxVersion=4.0-preview
this is untested, but it is the idea: user/bfjelds/azl4-pipelines
displayName: "Stage Trident binary into testimage tree"
workingDirectory: ${{ parameters.tridentSourceDirectory }}

# Pull the released MIC container from MCR. AZL4 support is included
the pipeline should use mcr imagecustomizer:latest i think ... not sure we need this.
testimages.py: DEFAULT_IMAGE_CUSTOMIZER_VERSION = "latest"
docker tag "mcr.microsoft.com/azurelinux/${{ parameters.micContainerTag }}" "${{ parameters.micContainerTag }}"
displayName: "Pull MIC container from MCR"

# Stage the pipeline-wide SSH key into the testimage tree before
this ssh handling code seems out of place. we shouldn't need to do this here if we don't for azl3 images. it feels like we are confusing images built for servicing/rollback testing with e2e images.
@@ -0,0 +1,222 @@
# AZL4 VM offline-init rollback test stage.
why was this needed but no testing_servicing/vm-testing-azl4.yml?
i think it'd be easier to have the existing pipeline templates augmented (via parameter or whatever) to do azl4 rather than forking separate files. it is hard to tell what is divergent vs copied.
install:
# AZL4 equivalents of the AZL3 set. See updateimg-grub-azl4.yaml
# for the rationale on each substitution.
- curl
can you add trident here as a package rather than using additionalFiles?
/azp run [GITHUB]-trident-pr-e2e
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
Summary
Full AZL4 enablement stack — all changes from PR-1 through PR-7b in a single cumulative PR against main. This PR includes:
Rust engine changes (PR-1 + PR-2 + PR-3)
Build infrastructure (PR-5a)
Image configs + pipeline (PR-5b + PR-6 + PR-7a + PR-7b)
Validation
CI build 1127408 — all 4 AZL4 stages passed (image builds, BM-sim install, storm-trident rollback). All AZL3 stages also passed (no regressions).
Stacked PR breakdown
For reviewers who prefer smaller chunks, this stack is also available as individual branches: