feat(visual-regression): prototype visual tests with Playwright #2789

Open
alinelariguet wants to merge 9 commits into master from visual-regression/prototype-playwright-v2

Conversation

@alinelariguet
Member

@alinelariguet alinelariguet commented Mar 31, 2026

feat(visual-regression): prototype visual tests with Playwright

Summary

Adds a Playwright-based visual regression testing prototype to po-angular. The infrastructure captures screenshots of PO UI components and compares them pixel by pixel against committed baselines, failing on any visual difference.
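To make "failing on any visual difference" concrete, here is a minimal sketch of a zero-tolerance comparison between a baseline and a fresh screenshot. It is a simplified illustration, not Playwright's actual comparator; the `Bitmap` type and the `compareBitmaps` and `solid` helpers are hypothetical names introduced only for this example.

```typescript
// Simplified illustration only; Playwright's real comparison is more elaborate.
// All names here are hypothetical.
interface Bitmap {
  width: number;
  height: number;
  data: Uint8Array; // RGBA, 4 bytes per pixel, row-major
}

type DiffResult =
  | { kind: "match" }
  | { kind: "size-mismatch"; expected: string; actual: string }
  | { kind: "pixel-mismatch"; differingPixels: number };

function compareBitmaps(baseline: Bitmap, actual: Bitmap): DiffResult {
  // Different dimensions fail immediately, before any pixel is inspected.
  if (baseline.width !== actual.width || baseline.height !== actual.height) {
    return {
      kind: "size-mismatch",
      expected: `${baseline.width}x${baseline.height}`,
      actual: `${actual.width}x${actual.height}`,
    };
  }
  let differingPixels = 0;
  for (let i = 0; i < baseline.data.length; i += 4) {
    const same =
      baseline.data[i] === actual.data[i] &&
      baseline.data[i + 1] === actual.data[i + 1] &&
      baseline.data[i + 2] === actual.data[i + 2] &&
      baseline.data[i + 3] === actual.data[i + 3];
    if (!same) differingPixels++;
  }
  return differingPixels === 0
    ? { kind: "match" }
    : { kind: "pixel-mismatch", differingPixels };
}

// Builds a single-colour bitmap, handy for trying the comparison out.
function solid(
  width: number,
  height: number,
  rgba: [number, number, number, number],
): Bitmap {
  const data = new Uint8Array(width * height * 4);
  for (let i = 0; i < data.length; i += 4) data.set(rgba, i);
  return { width, height, data };
}
```

Under this model a single changed pixel or a one-pixel change in height is enough to fail, which is why the environment that generated the baselines matters.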

Key architectural decisions:

  • Separate visual-app registered in angular.json — completely isolated from projects/app/ (used for behavior validation). Lives under e2e/visual/app/.
  • Co-localized folder structure — each component gets its own folder under e2e/visual/ containing spec, component template, and nearby __snapshots__/ baselines.
  • CI integration with po-style branch detection — the new test-visual job in ci.yml checks if a branch with the same name exists in po-ui/po-style. If found, it clones, builds, and packs that branch into a .tgz, installing it over the npm version before running visual tests. This prevents false positives when CSS and components change together across repos.
  • CI artifact uploads — after test execution, the Playwright HTML report and test results (diffs, screenshots) are uploaded as GitHub Actions artifacts with 30-day retention. Available even when tests fail, so developers can inspect visual diffs directly from the Actions UI.
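The po-style branch detection can be sketched roughly as below. The CI job in this PR uses curl, so this is only an illustration of the decision logic: the "get a branch" REST endpoint, the optional token, and the names `buildBranchLookup` and `decideStyleSource` are assumptions, not code from the workflow.

```typescript
// Hypothetical sketch of the decision logic; the real CI step is a shell script.
interface LookupRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds the request for GitHub's "get a branch" endpoint (assumed endpoint).
// Each path segment is encoded separately so slashes in branch names survive.
function buildBranchLookup(branch: string, token?: string): LookupRequest {
  const path = branch.split("/").map(encodeURIComponent).join("/");
  const headers: Record<string, string> = { Accept: "application/vnd.github+json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;
  return {
    url: `https://api.github.com/repos/po-ui/po-style/branches/${path}`,
    headers,
  };
}

type StyleSource =
  | { kind: "branch-build"; branch: string } // clone, build and pack into a .tgz
  | { kind: "npm"; reason: string };         // keep the version installed from npm

// Maps the HTTP status of the lookup to the package source used by the tests.
function decideStyleSource(branch: string, status: number): StyleSource {
  if (status === 200) return { kind: "branch-build", branch };
  if (status === 404) return { kind: "npm", reason: "no matching po-style branch" };
  // Anything else (for example a rate-limit response) also falls back to npm,
  // but with a reason that can be logged instead of failing silently.
  return { kind: "npm", reason: `lookup failed with HTTP ${status}` };
}
```

Returning a `reason` alongside the npm fallback is a design choice of this sketch: it keeps a failed lookup distinguishable from a genuinely missing branch in the job log.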

Coverage: 24 tests total — 11 basic component samples (po-button, po-table, po-accordion, etc.) + 13 po-input state combinations (disabled, readonly, required, loading, label/helper permutations, etc.).

Commands:

  • npm run test:visual — run tests against baselines
  • npm run test:visual:update — regenerate baselines (needed on first run or after intentional changes)
  • npm run test:visual:report — open HTML report with visual diffs

Updates since last revision

  • Added two actions/upload-artifact@v4 steps to the test-visual CI job:
    • visual-regression-report — Playwright HTML report (e2e/visual/playwright-report/)
    • visual-regression-test-results — test result diffs and screenshots (e2e/visual/test-results/)
    • Both use if: ${{ !cancelled() }} so artifacts are available even when tests fail
    • retention-days: 30

Review & Testing Checklist for Human

  • Verify artifact uploads work in CI — go to the Actions tab for this PR, open the Test visual regression job run, and confirm both visual-regression-report and visual-regression-test-results artifacts are listed and downloadable. Download the report artifact and open index.html to confirm it renders a valid Playwright test report.
  • Visually inspect the baseline PNGs in e2e/visual/__snapshots__/po-input/ — particularly po-input-state-required-*.png and po-input-state-required-error-*.png. Verify the required asterisk and error message are actually visible in the screenshots (these use [p-show-required]="true" and [p-required-field-error-message]="true" property bindings).
  • CI po-style branch detection uses unauthenticated GitHub API (curl to api.github.com without token). Rate limit is 60 req/hour per IP on shared runners — if hit, the step silently falls back to the npm version. Consider whether ${{ secrets.GITHUB_TOKEN }} should be added for authentication.
  • Baseline snapshot platform — the committed .png files were generated on Linux (chromium-linux). Local dev on Windows/macOS will get mismatches until they run test:visual:update. Confirm this is acceptable UX for contributors.
  • test-results/ artifact may be empty on success — Playwright only writes diffs to test-results/ when tests fail. On a green run the artifact will contain no meaningful files. This is expected behavior but worth confirming.
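The platform item above can be illustrated with a small sketch. It assumes Playwright's default snapshot naming, in which the project name and the OS platform are appended to the snapshot name; this PR may configure a different path template. `snapshotFileName` and `findBaseline` are hypothetical helpers.

```typescript
// Assumption: default naming of the form <name>-<project>-<platform>.png.
// Function names are hypothetical.
type Platform = "linux" | "darwin" | "win32";

function snapshotFileName(name: string, project: string, platform: Platform): string {
  return `${name}-${project}-${platform}.png`;
}

// Looks up the baseline a given machine would expect among the committed files.
function findBaseline(
  committed: ReadonlySet<string>,
  name: string,
  project: string,
  platform: Platform,
): string | undefined {
  const file = snapshotFileName(name, project, platform);
  return committed.has(file) ? file : undefined;
}
```

With only `*-chromium-linux.png` files committed, a lookup for `darwin` or `win32` finds nothing, so contributors on those platforms have no baseline to compare against until they run test:visual:update.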

Suggested test plan:

  1. Clone this PR branch locally
  2. Run npx playwright install chromium
  3. Run npm i && npm run build:ui:lite
  4. Run npm run test:visual — all 24 tests should pass (may need test:visual:update on non-Linux platforms)
  5. Intentionally break a test (e.g., edit e2e/visual/po-button/po-button.visual.spec.ts to change the button label), re-run test:visual, and verify it fails with a clear diff in the HTML report

Notes


- Configures Playwright with toHaveScreenshot for visual regression
- Creates a visual-app separate from projects/app (complete isolation)
- Co-located structure under e2e/visual/, one folder per component
- Tests 11 basic components + 13 po-input states
- Adds a CI job with po-style branch detection
- If a branch with the same name exists in po-style, builds it and uses the .tgz
- Fixes the p-show-required and p-required-field-error-message bindings
- 60s timeout for Windows compatibility
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

devin-ai-integration[bot]

This comment was marked as resolved.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 14 additional findings in Devin Review.

Comment thread .github/workflows/ci.yml
Comment on lines +253 to +255
- name: Generate baselines for CI environment
  if: steps.baselines-cache.outputs.cache-hit != 'true'
  run: npx playwright test --update-snapshots --reporter=list

🔴 CI baseline regeneration overwrites committed baselines, making visual regression tests always pass on cache miss

When cache-hit != 'true' (which happens whenever spec/HTML files change, when the cache is evicted, or on first runs for new branches), the CI runs npx playwright test --update-snapshots which regenerates all 186 committed baseline screenshots from the current code. The subsequent npx playwright test then compares against those just-generated baselines, which trivially always passes.

This completely defeats the purpose of visual regression testing in the exact scenario where it matters most: when a developer modifies UI components AND updates test files in the same PR. The --update-snapshots flag updates ALL snapshots (not just missing ones), so even existing tests that should catch regressions in unchanged components get their baselines silently overwritten.

Additionally, the restore-keys fallback (visual-baselines-${{ runner.os }}-) provides no benefit—actions/cache@v4 sets cache-hit to false for restore-key matches, so the restored baselines are immediately overwritten by the regeneration step.
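The effect of the step condition can be modelled in a few lines. This is a sketch of the `cache-hit != 'true'` check as described in this comment, not code from the workflow; the `CacheOutcome` type and both function names are hypothetical, and the empty-string value for a full miss is an assumption (either way it is not 'true').

```typescript
// Hypothetical model of `steps.baselines-cache.outputs.cache-hit != 'true'`.
type CacheOutcome = "exact-key-hit" | "restore-key-hit" | "miss";

// Value of the cache-hit output for each outcome.
function cacheHitOutput(outcome: CacheOutcome): string {
  switch (outcome) {
    case "exact-key-hit":
      return "true";
    case "restore-key-hit":
      return "false"; // baselines were restored, but the key was not an exact match
    case "miss":
      return ""; // assumed empty; what matters is that it is not 'true'
  }
}

// True when the "Generate baselines" step runs and overwrites every snapshot.
function regeneratesBaselines(outcome: CacheOutcome): boolean {
  return cacheHitOutput(outcome) !== "true";
}
```

Only an exact key hit skips regeneration; a restore-key match behaves like a miss, so whatever it restored is replaced before the comparison runs.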

The baselines are committed to the repo but ignored by CI

There are 186 baseline PNGs committed in e2e/visual/__snapshots__/ which represent the approved visual state. The checkout step (actions/checkout@v3) makes them available, but the regeneration step at line 255 overwrites them. The fix should check if committed baselines already exist before regenerating, e.g.:

- name: Generate baselines for CI environment
  if: steps.baselines-cache.outputs.cache-hit != 'true'
  run: |
    if find e2e/visual/__snapshots__ -name '*.png' | grep -q .; then
      echo "Baselines found from repo/cache, skipping regeneration"
    else
      npx playwright test --update-snapshots --reporter=list
    fi
Suggested change (before):
- name: Generate baselines for CI environment
  if: steps.baselines-cache.outputs.cache-hit != 'true'
  run: npx playwright test --update-snapshots --reporter=list
Suggested change (after):
- name: Generate baselines for CI environment
  if: steps.baselines-cache.outputs.cache-hit != 'true'
  run: |
    if find e2e/visual/__snapshots__ -name '*.png' | grep -q .; then
      echo "Baselines found from repo or restore-keys, skipping regeneration"
    else
      echo "No baselines found, generating new ones"
      npx playwright test --update-snapshots --reporter=list
    fi

Contributor

The point is valid, but the behavior is intentional in this case. The baselines committed to the repo are generated in a local environment (for development) and have slightly different dimensions from those produced in the CI environment (GitHub Actions, Ubuntu 24.04). For example, po-textarea renders with a 7px height difference between the two environments. If we used the committed baselines directly in CI, the tests would always fail with a dimension mismatch.

The designed flow is:

  1. First run (cache miss): generates baselines in the CI environment → passes
  2. Subsequent runs (cache hit): compares against the baselines in the CI cache → detects regressions between CI runs

When spec/HTML files change, the cache key changes and the baselines are regenerated. This is expected, because the tests changed.

More robust cross-version regression detection (comparing against baselines from a base branch) would require a more sophisticated approach, such as Chromatic or a cache keyed on the target branch. That can be developed in future iterations.
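The flow in this reply can be modelled as a small simulation. This is a hypothetical sketch, with a string standing in for a screenshot and a map standing in for the CI cache; `CiBaselineCache` and its `run` method are illustrative names, not part of the PR.

```typescript
// Hypothetical model of the cache-based flow; none of these names come from the PR.
type Render = string; // stands in for a screenshot

class CiBaselineCache {
  private store = new Map<string, Render>();

  // "generated" on a cache miss, otherwise the verdict of the comparison.
  run(cacheKey: string, currentRender: Render): "generated" | "pass" | "fail" {
    const baseline = this.store.get(cacheKey);
    if (baseline === undefined) {
      // Cache miss: whatever renders now becomes the baseline.
      this.store.set(cacheKey, currentRender);
      return "generated";
    }
    // Cache hit: compare against the baseline stored by an earlier CI run.
    return baseline === currentRender ? "pass" : "fail";
  }
}
```

Running the same key twice with an unchanged render yields "generated" and then "pass", and a changed render under the same key yields "fail". A changed render that arrives together with a new key yields "generated" again, so that change is never compared against anything, which is the gap the review comment points at.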
