Add Umbraco load-testing pipeline (Locust + ALT) by andr317c · Pull Request #6 · umbraco/Umbraco.Testing.Cms.Load

andr317c · 2026-05-13T07:39:48Z

Summary

- Ephemeral Terraform infrastructure provisioning per-case App Services + SQL DBs across (Umbraco version × tier × scenario).
- Locust workload with inventory-driven tasks, replacing the previous JMeter test plan.
- Two shipped scenarios: Default (vanilla) and DeliveryApi (headless mode, with code overlay).
- Long-lived history storage in Azure Blob (NDJSON per run); per-run pipeline artifacts on the build.
- Local analysis scripts: show-trends.ps1, compare-runs.ps1, check-regression.ps1 (pipeline gate, permissive by default).
- Pipeline split into six stages: validateTestCases → ensureHistoryInfra → provision → loadTest → regression → cleanup.

Test plan

- PR-validation pipeline passes (terraform fmt + validate, PSScriptAnalyzer, py_compile).
- Smoke run (loadProfile=smoke, runStarter=true) completes end-to-end through all six stages.
- Tier SKUs in loadtests/tiers.json confirmed (currently placeholders pending input).

Pipeline gains a skipLoadTests parameter so we can validate provisioning + deploy + seed without burning ALT time. When set, runLoadTests is skipped and a verifyDeployments smoke job hits each App Service homepage instead, exiting non-zero on any non-200. Other hardening: pin terraformVersion to 1.13.3, broaden deleteResourceGroup condition to also fire on Canceled/Skipped (so mid-run cancellation doesn't leak the ephemeral RG), bump the manual keep/delete window from 1h to 2h, and gitignore .terraform/ + .idea/. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

testSummary's "Results available in: Pipeline artifacts" line was misleading when skipLoadTests=true (no artifacts produced). Gate the wording on the parameter at compile time. Also adds the missing engineInstances row to the README parameters table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Under Workload Identity Federation, the AzureCLI task gets idToken instead of a client secret. Plumb both through terraform; install script picks WIF if the oidc token is present, falls back to client-secret otherwise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
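The WIF-first fallback described in this commit can be sketched as follows. This is an illustration in Python, not the install script itself (which is PowerShell), and `select_terraform_auth` is a hypothetical helper. It assumes the AzureCLI task exposes `servicePrincipalId`, `tenantId`, `idToken` and `servicePrincipalKey` to the script (as it does with `addSpnToEnvironment: true`) and that terraform is authenticated through the azurerm provider's `ARM_*` environment variables.

```python
from typing import Mapping


def select_terraform_auth(env: Mapping[str, str]) -> dict[str, str]:
    """Return the ARM_* variables for terraform, preferring WIF (OIDC)."""
    auth = {
        "ARM_CLIENT_ID": env.get("servicePrincipalId", ""),
        "ARM_TENANT_ID": env.get("tenantId", ""),
    }
    id_token = env.get("idToken", "")
    if id_token:
        # Workload Identity Federation: an OIDC token is present, no secret exists.
        auth["ARM_USE_OIDC"] = "true"
        auth["ARM_OIDC_TOKEN"] = id_token
    else:
        # Fall back to the classic client-secret service connection.
        secret = env.get("servicePrincipalKey", "")
        if not secret:
            raise RuntimeError("neither idToken nor servicePrincipalKey is set")
        auth["ARM_CLIENT_SECRET"] = secret
    return auth
```

Passing `os.environ` as `env` would give the real behaviour; taking a mapping keeps the selection logic testable without touching the process environment.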

testSummary now gates on apply succeeding specifically, so a failed apply doesn't print a misleading "summary" with no real data. Adding apply to the cleanup chain's dependsOn ensures RG cleanup still triggers on apply-failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move SP credentials from the local-exec environment block to the parent task's env (where the install script inherits them). With sensitive vars in the environment block, terraform suppresses local-exec output - which hides the seeder polling status. They're still masked in pipeline logs via issecret=true. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… errors `dotnet add package --version "17.*" --prerelease` wasn't resolving 17.0.0-beta.1 reliably; the build silently proceeded without the package. Use explicit prerelease floating syntax `17.*-*` and check $LASTEXITCODE. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Make the ALT testId scenario-scoped (one test per scenario, with every version/tier as a run inside) so the portal's "Compare runs" view can overlay them natively. History storage path mirrors the same axes (scenario/major/version/tier/date_build) for prefix-listed sweeps. Run name trimmed to fit ALT's 50-char cap.
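As a rough illustration of the storage layout, the path could be assembled like this. The function name, the `date_build` format (`<date>_<buildId>`) and the `.ndjson` extension are assumptions; the commit only names the axes.

```python
def history_blob_path(scenario: str, version: str, tier: str,
                      run_date: str, build_id: str) -> str:
    """Build scenario/major/version/tier/date_build for prefix-listed sweeps."""
    # The major version is everything before the first dot, e.g. "17".
    major = version.split(".", 1)[0]
    return f"{scenario}/{major}/{version}/{tier}/{run_date}_{build_id}.ndjson"
```

Because the axes run from coarse to fine, listing the prefix `Default/17/` would return every run of that scenario on that major version, regardless of tier.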

Cloud runs all plans on one P1V3 ASP and differentiates via per-site quotas inside a shared pool. We can't replicate the per-site quotas on dedicated plans, but we can match the SKU and the SQL eDTU split (S1/S2/S3 = 20/50/100 eDTU). Also sidesteps the Pv4 worker-pool stamp error on RG re-creation.

CpuPercentage/MemoryPercentage live on the App Service Plan (Microsoft.Web/serverfarms), not the site, and SQL DTU/CPU/IO metrics scope to the database, not the server. Surface the plan and database resource IDs as terraform outputs and split the appComponents block into Site / Plan / Database with the right namespaces. Drops two now-dead AzDO variables (sqlServerName, sqlServerResourceId). Also: rename "ALT" -> "Azure Load Testing" / "load test" throughout, and rename the Hydrate step to "Read terraform outputs" for clarity.

Replace the homepage-only smoke locustfile with a workload that fetches the seeder's /umbraco/api/seederstatus/inventory at test start, buckets the seeded URLs by content type, and spreads requests across sections, categories, pages, details, and media. Detail pages are weighted highest since they're the deepest read path - that's where SQL pressure surfaces. Falls back to homepage-only if the inventory endpoint is unreachable. Switch to FastHttpUser for higher requests/sec per engine, store shared state on the locust environment (canonical pattern), use logging instead of print, and pre-bucket URLs so tasks don't linear-scan on every call. Also: rename the prepare step to "Validate + resolve scenario overrides" and add justifying comments to the publish steps' continueOnError flags.
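The pre-bucketing and weighted selection can be sketched in plain Python, independent of Locust. The inventory shape (a list of `{"type", "url"}` items), the bucket names and the weights are all assumptions for illustration; the commit only states that detail pages are weighted highest and that the fallback is homepage-only.

```python
import random
from collections import defaultdict

# Hypothetical weights: detail pages highest, as the commit describes.
BUCKET_WEIGHTS = {"section": 1, "category": 2, "page": 2, "detail": 5, "media": 1}


def bucket_urls(inventory):
    """Group seeded URLs by content type once, at test start."""
    buckets = defaultdict(list)
    for item in inventory:
        buckets[item["type"]].append(item["url"])
    return dict(buckets)


def pick_url(buckets, rng):
    """Pick a bucket by weight, then a URL inside it; '/' if nothing is seeded."""
    names = [name for name in BUCKET_WEIGHTS if buckets.get(name)]
    if not names:
        return "/"  # homepage-only fallback
    weights = [BUCKET_WEIGHTS[name] for name in names]
    bucket = rng.choices(names, weights=weights, k=1)[0]
    return rng.choice(buckets[bucket])
```

Bucketing once up front means each task call is a dictionary lookup plus a random choice, with no scan of the full inventory.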

@task

- New @task(3) submit_contact_form posts JSON to the seeder's anonymous contact-form endpoint. It exercises the SQL write path (each submission becomes an Umbraco content node), which has very different perf characteristics from reads, especially on lower SQL tiers.
- Added a docstring caveat about the 1-3s wait_time being aggressive (~30 req/min per VU vs real human pacing of 5-30s) so readers don't misinterpret VU counts as concurrent humans.

Skipped (explicit non-goals for now):

- catch_response content validation: FastHttpUser already marks 4xx/5xx as failures; 200-with-error-template is too rare to justify the noise in every task.
- Member auth flow: needs anti-forgery token handling + cookie state + per-VU on_start. Worth adding when we specifically want to measure authenticated browsing perf, but not blocking a first real run.
- Helpers extraction: there's only one locustfile and _hit() already covers the obvious DRY win. Premature for a single-file workload.

The original weight 3 produced ~1.4 form submissions/sec at 100 VUs, roughly 2.8% write share - too light to differentiate Starter S1 vs Pro S3 on write characteristics. Bumping to 8 lands at ~3.7 RPS / ~7% write share, which is within the 5-15% realistic CMS production range and enough to actually exercise SQL Log IO on the lower tiers.
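The share arithmetic can be checked with a few lines of Python. Two inputs here are assumptions back-solved from the figures above, not stated in the commit: the read-task weights summing to 104 (so that weight 3 gives about 2.8%) and a total throughput of about 50 req/s at 100 VUs.

```python
def write_share(write_weight: int, read_weight_total: int) -> float:
    """Fraction of requests that are writes under Locust's weighted task choice."""
    return write_weight / (write_weight + read_weight_total)


READ_WEIGHT_TOTAL = 104  # assumption: back-solved from the 2.8% figure
TOTAL_RPS = 50.0         # assumption: roughly 1.4 RPS / 2.8%

for weight in (3, 8):
    share = write_share(weight, READ_WEIGHT_TOTAL)
    print(f"weight={weight}: share={share:.1%}, ~{share * TOTAL_RPS:.1f} writes/s")
```

Under these assumptions the sketch lands close to the quoted numbers rather than exactly on them, since the real total RPS shifts slightly as the task mix changes.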

Adds the stage that runs ensure-monitoring-infra.ps1 + deploy-workbook.ps1 and propagates DCE/DCR/Stream as cross-stage outputs. Template re-declares the 3 LA params; pipeline passes them at every template call site. Without this the Workbook gets no data and the YAML fails to compile (undeclared parameter refs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add LoadTestSeries_CL table for per-minute resource pressure, workbook Top issues panel + Stability/Bottleneck columns, history-RG build cache, deterministic locust PRNG seed, seeder-status non-JSON guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

KQL parse error on Top issues (top→sort|take), n<5 stability floor, case/whitespace-insensitive regression joins, Trends sampler filter, chart legend separator (·→| for chart series), Compare delta Note column. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Phase 2 warmup now primes every URL in every bucket instead of one per type — the dominant run-to-run noise source on p95/p99 was first-touch latency on URLs the load test reached via random.choice() but warmup never visited. Cost: ~30-60s extra provisioning. Also treats seeder duration_seconds <= 0 as null so a misreported ElapsedMs can't drag the dashboard median to 0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
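The null-handling for seeder durations amounts to filtering before taking the median. A minimal Python sketch of that idea follows; the dashboard itself does this in its own query language, and `seeder_duration_median` is a hypothetical name.

```python
from statistics import median
from typing import Iterable, Optional


def seeder_duration_median(durations: Iterable[Optional[float]]) -> Optional[float]:
    """Median of seeder durations, treating missing and <= 0 readings as null."""
    valid = [d for d in durations if d is not None and d > 0]
    # With no usable readings, report "no data" rather than 0.
    return median(valid) if valid else None
```

For example, `seeder_duration_median([0, 120, -5, 180, None])` considers only 120 and 180, so a single misreported run cannot pull the result toward zero.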

Get-MetricSummary now emits a logissue warning on query failure in addition to Write-Warning, so partial-metrics gaps (e.g. plan_* missing while sql_*/app_* populate) surface in the AzDO summary panel instead of blending into the step log. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Get-MetricSummary's silent-success branch (API responded OK but returned empty timeseries — e.g. VM hadn't emitted yet, wrong window) now logs a logissue warning naming the metric and resource. Closes the diagnostic gap where plan_* columns came back null with no indication why. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
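For readers unfamiliar with it, a logissue warning is an Azure DevOps logging command written to standard output. The sketch below shows the shape of such a line in Python; the real script is PowerShell, and the helper name and message wording are illustrative assumptions.

```python
def logissue_warning(metric: str, resource_id: str, reason: str) -> str:
    """Emit an Azure DevOps warning that names the metric and the resource."""
    message = f"Get-MetricSummary: {metric} on {resource_id}: {reason}"
    # Logging commands are parsed per line, so collapse any embedded newlines.
    message = " ".join(message.split())
    line = f"##vso[task.logissue type=warning]{message}"
    print(line)
    return line
```

A line emitted this way shows up as a warning on the run summary, not just in the step log.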

provision and loadTest run on different agents, so the .seeder-results/<testCaseId>.json files written by install-umbraco-cms-on-appservice.ps1's local-exec couldn't be read by the loadTest stage — the dashboard never showed real seeder durations as a result. The provision.apply.outputVars step now aggregates all per-case JSONs into a single seederResults output variable; load-test-job reads it via $env:SEEDER_RESULTS instead of touching the (absent) filesystem. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
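The hand-off can be pictured as two small functions: one that folds the per-case files into a single string on the provisioning agent, and one that parses it on the load-test agent. This is a Python sketch of the idea, not the pipeline's PowerShell; the function names and the JSON shape of each file are assumptions.

```python
import json
from pathlib import Path


def aggregate_seeder_results(results_dir: Path) -> str:
    """Fold every <testCaseId>.json into one compact, single-line JSON string."""
    merged = {}
    for path in sorted(results_dir.glob("*.json")):
        # The file stem is the test case ID.
        merged[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return json.dumps(merged, separators=(",", ":"))


def read_seeder_results(env) -> dict:
    """Parse the aggregated value on the consuming agent; empty if unset."""
    raw = env.get("SEEDER_RESULTS", "")
    return json.loads(raw) if raw else {}
```

Keeping the value compact and on one line matters because it travels as a pipeline output variable and not as a file.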

Two queue-UI changes that move in opposite directions:

* Add seederPresetOverride (Auto/Small/Medium/Large/Massive). Auto keeps the existing preset-coupled-to-profile default; explicit values let off-diagonal cells run (e.g. Massive content + smoke load) and unblock the otherwise-unreachable Massive preset.
* Remove skipLoadTests. Infra-only smoke runs aren't a workflow we keep using; the per-stage condition checks and the verifyDeployments job go away with it, and the README no longer references the workflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

All three override pills now follow the same '(Auto = match X)' pattern: 'match tier' for the SKU/DTU pair (tier-coupled) and 'match load profile' for the seeder preset (profile-coupled). Drops 'use the tier's default' jargon. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Azure Load Testing rejects run descriptions over 100 chars. Worst-case combo (long prerelease + Enterprise + DeliveryApi + Massive + stress numbers) was hitting ~110 chars; the compact separator-driven template now lands around 70. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
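A compact separator-driven template might look like the sketch below. The field order, the `|` separator and the `u`/`s` suffixes are assumptions for illustration; only the 100-character limit comes from the commit message.

```python
MAX_DESCRIPTION = 100  # limit reported in the commit message


def run_description(version, tier, scenario, preset, users, duration_s):
    """Join the run's axes with a short separator and enforce the length cap."""
    parts = [version, tier, scenario, preset, f"{users}u", f"{duration_s}s"]
    text = "|".join(str(part) for part in parts)
    # Hard cap as a last resort so an unexpectedly long value cannot fail the run.
    return text[:MAX_DESCRIPTION]
```

Truncation is only a safety net here; the point of the compact template is that realistic combinations stay well under the limit without it.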

Moves the tolower/trim normalization ahead of arg_max so the summarize groups by the normalized keys directly, and adds run_id to the normalization set. Sidesteps KQL arg_max column-naming quirks that prevented the regression row from joining to the load-test row even when the underlying values matched. Applied at all three sites (Top issues, Latest-runs card, Runs tab). The drill-down panel (which doesn't use a join) was already showing the correct verdict; this should now make the Runs table column agree with it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
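The effect of normalizing before joining can be shown outside KQL. In this Python sketch the column names other than run_id, scenario_name and regression_status are assumptions, and the function names are hypothetical; it mirrors a left outer join on lowercased, trimmed keys.

```python
def norm_key(row, fields=("run_id", "scenario_name", "version", "tier")):
    """Lowercase and trim every key column so cosmetic differences cannot break the join."""
    return tuple(str(row.get(field, "")).strip().lower() for field in fields)


def left_join_regression(runs, regression_rows):
    """Attach regression_status to each run; None when no regression row matches."""
    status_by_key = {}
    for row in regression_rows:
        # Keep the first row per key; one row per key is expected.
        status_by_key.setdefault(norm_key(row), row.get("regression_status"))
    return [{**run, "regression_status": status_by_key.get(norm_key(run))} for run in runs]
```

Both sides go through the same `norm_key`, which is the property the KQL change aims for: the keys are normalized before any grouping or joining happens.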

…harts

* Glossary moved to its own tab so the wall-of-vocabulary doesn't greet every viewer. Top banner now points to it.
* Filter-scope note under the global pills makes the Trends/Compare/Runs vs Tiers/Versions split visible without reading the README.
* Tier rows sort by capacity rank (Starter → Enterprise) instead of alphabetical, so the upgrade story reads top-to-bottom.
* Seeder duration median drops 0/negative readings so a bogus ElapsedMs from a single run can't drag the column to 0.
* Compare baseline/candidate dropdowns hide failed runs (no_metrics / no_results_dir) so picking a known-broken option isn't easy.
* Trends latency chart and resource-pressure chart now render side-by-side at 50% width each, sharing the same run-indexed x-axis. Direct visual correlation of code-bound vs infra-bound symptoms.
* Runs drill-into-run is a dropdown instead of a typo-prone text field, sourced from the same time/filter scope as the Runs table.
* Per-minute resource chart split in two so HTTP error counts get their own auto-scaled Y-axis instead of being crushed against the 0-100 percentage floor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Parameter table + paragraph for seederPresetOverride
* Dashboard description names six tabs (Glossary added), reflects side-by-side Trends charts, multi-select sampler picker, drill dropdown, split per-minute charts on Runs, tier-rank ordering
* loadtest.workbook.json one-liner mentions all six tabs
* .gitignore picks up __pycache__ folders that Locust runs leave behind

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When the user deselects all sampler options, the multi-select picker substituted empty into 'scenario_name in ({Sampler})', producing 'in ()' which is a KQL syntax error. The Trends line chart and matrix both failed to render in that state. isRequired=true keeps at least the 'All' sentinel selected, so the in-clause is always well-formed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
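The failure mode is easy to reproduce with a string-building sketch. This Python example is only an analogy for the workbook's parameter substitution: `sampler_in_clause` is hypothetical, and how the real query interprets the All sentinel is not shown in the source.

```python
def sampler_in_clause(selected, sentinel="All"):
    """Build the in-clause, falling back to the sentinel so it is never 'in ()'."""
    values = [s for s in selected if s] or [sentinel]
    # Quote each value as a KQL string literal, escaping embedded single quotes.
    quoted = ", ".join("'" + v.replace("'", "\\'") + "'" for v in values)
    return f"scenario_name in ({quoted})"
```

With an empty selection the helper yields `scenario_name in ('All')`, which parses, where naive substitution would have produced the invalid `in ()`.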

It always rendered '✓ Clear' in practice: the regression-table join that drove it has been silently broken, and even when working the panel duplicated information the Runs tab + per-tab verdict columns already surface. Cleared the markdown banner above the tabs that referenced it, and dropped Top issues from the Glossary + global-filter scope note + README. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The leftouter join on the Runs table column kept missing the regression_status even when the drill-down panel (no join) found the row. Replacing arg_max(TimeGenerated, regression_status) with take_any + max(TimeGenerated) sidesteps arg_max's column-binding quirks that may have been the cause. take_any is safe here because check-regression.ps1 writes exactly one regression_check row per (run × scenario × version × tier). If this still doesn't fix the join, the diagnostic KQL in the earlier conversation pinpoints which column actually mismatches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three coupled fixes landing together because they touch the same post-build flow:

* Build cache (both local .build-cache/ and shared blob mirror) removed: observed time savings didn't materialise in practice. Each run now does a fresh dotnet publish. Build-dir cleanup happens in the finally{} block as before.
* Deploy step moved INSIDE the try{} block (was after the finally cleanup). The previous structure relied on the cache zip living outside the build dir so it survived cleanup; without a cache, the publish.zip sits inside the build dir and must be deployed before that tree is removed.
* az webapp stop wrapped in Stop-AppServiceBestEffort: 3 retries with 5/10/20s backoff, then warn-and-continue if all fail. Transient 503s from Azure's management API right after a deploy were failing provisioning for what's functionally a polish step (the load-test stage's az webapp start is idempotent). Applies to both call sites: the failure-cleanup path and the normal-cleanup path.

Pipeline side: FetchBuildCacheKey task and BUILD_CACHE_* env vars removed from azure-pipeline.yml's provision stage. README's build-cache and storage-cost sections removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
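The best-effort wrapper follows a common retry-then-continue pattern, sketched here in Python rather than the script's PowerShell. The sketch reads "3 retries with 5/10/20s backoff" as one initial attempt followed by up to three retries; that reading, and the names used, are assumptions.

```python
import time


def stop_best_effort(stop, delays=(5, 10, 20), sleep=time.sleep, warn=print):
    """Call stop(); on failure wait and retry per delays, then warn and continue."""
    last_error = None
    for attempt in range(len(delays) + 1):
        try:
            stop()
            return True
        except Exception as exc:  # best-effort: any failure is retried
            last_error = exc
            if attempt < len(delays):
                sleep(delays[attempt])
    # All attempts failed: report it, but do not fail the caller.
    warn(f"stop failed after {len(delays)} retries, continuing: {last_error}")
    return False
```

Returning `False` without raising is what keeps provisioning green when only the polish step fails; `sleep` and `warn` are injectable so the behaviour can be checked without real waits.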

andr317c and others added 30 commits January 23, 2026 08:22

Updated to use package

620c7e4

Updated setup and added read me

90c592d

early state of locust infra

3fc9833

Restructure pipeline: scenarios, tier matrix, history NDJSON, validator

93495d6

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Remove quality-gates pipeline; private repo doesn't need PR-triggered…

1f61ee6

… lint Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Trim defensive features and docs for private-repo simplicity

4aa2a35

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move history-* defaults into the variable group

4f1bd7c

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Rename managed_by=bootstrap-script tag to ensure-script

c2301ec

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fix testSummary: replace compile-time directives with runtime PowerShell

28e44cf

The 'each' and 'if' directives are not allowed inside script: | string blocks. Use $env vars and runtime foreach/if instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Point pipeline at the new WIF service connection

1558c8c

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Update README service connection name

3a47b17

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drop Clean starter kit; the seeder already provides templates + content

18e15b9

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Hardcode admin password so the team can log into the backoffice

04028e7

Login: loadtest@example.invalid / LoadTest123! Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drop dead SP-credential terraform variables

5bd5a0f

After moving auth to ARM_* env-var inheritance, the client_id/secret/oidc_token/tenant_id terraform variables are no longer read. Cleaning up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pin seeder package to exact 17.0.0-beta.1 for reliability

05f077d

Bump this when a new prerelease/stable ships. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Added comma

fcb74b5

Bump pinned seeder package to 17.0.0-beta.2

2571334

Picks up the URL rebuild fix + inventory enhancements (mediaTypes / includeMemberPassword / cultures / paginated /inventory/urls).

Drop final ALT residue in publish-to-history comment

9486524

One reference missed during the rename pass.

andr317c and others added 30 commits May 18, 2026 09:49

README: pipeline workflow now reflects ensureMonitoringInfra stage

ce9787b

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Enable v13-v18: assume seeders shipped (v18 falls back to v17)

e27d697

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

README: drop PR-triggered CI reference

5e63c15

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Cleaned up

3690d2d

Updated to support 13

84d90c9

seeder: capture per-case duration into LoadTestSummary_CL

d106b5f

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

load-test-job: warm every endpoint + surface seeder duration

c69f4ac

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

workbook: aggregate Tiers across runs, filter cold-start, fix drill-d…

61bfb6d

…own + glossary Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

workbook: Sampler picker multi-select with All

707a1d8

Trends Sampler filter switches from single-select to multi-select with an All sentinel. KQL filter uses scenario_name in ({Sampler}). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

workbook: Glossary Layout note includes Top issues in scope

ad978a9

The in-app filter-scope note above the tabs lists Top issues; the Glossary's matching line was missing it. One-line consistency fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge pull request #7 from umbraco/add/dashboard-improvements

4c5095e

Add/dashboard improvements

Merge pull request #5 from umbraco/add/dashboard

265e4d0

Add Umbraco load-testing pipeline + Azure Workbook dashboard

Merge pull request #4 from umbraco/add/jmeter-load-test-scripts

bad38cd

Added Jmeter load tests for v13 and v17

andr317c wants to merge 163 commits into main from add/seed-package-usage

2 participants
