Skip to content

Latest commit

 

History

History
349 lines (226 loc) · 23.9 KB

File metadata and controls

349 lines (226 loc) · 23.9 KB

vc-testing-module — Code Review

Comprehensive review of the project as of 2026-04-30 (refreshed; original review 2026-04-28).

Scope

  • ~17.7k LOC of Python across 9 packages: core, dataset, gql, restapi, page_objects, tests, utils
  • 105+ test files (tests/e2e/, tests/graphql/, tests/restapi/)
  • Build/CI: pyproject.toml + 7 GitHub workflows (was 6)
  • _BACKUP/ excluded from review (legacy reference material)

Status legend (relative to the 2026-04-28 review)

  • FIXED — issue resolved.
  • PARTIAL — meaningful progress; remaining work scoped below.
  • OPEN — unchanged since the previous review.
  • NEW — surfaced in this pass.

Severity legend

  • [CRITICAL] — correctness, masked failures, or test integrity at risk
  • [SIGNIFICANT] — architectural smell or real maintenance burden
  • [MINOR] — naming, style, cosmetic

What changed since the last review

ID Title Status Notes
C1 Global --retries=1 ✅ Fixed addopts in pyproject.toml:21 is now --alluredir=allure-results.
C2 pyright excludes restapi/ ✅ Fixed include in pyproject.toml:44 now lists restapi.
C3 Untyped REST responses ◐ Partial restapi/types/ exists with 14 hand-written RestModel classes; ~50% of operations return typed models. ~70 methods still return dict. RestModel uses extra="allow" (documented as follow-up).
C4 Context.from_dataset dead branch ❌ Open tests/context.py:34 still uses next() without default — same StopIteration trap.
C5 Race in AuthProvider._refresh ❌ Open core/auth/provider.py:42-68 still releases the lock before the HTTP call.
C6 Swallowed teardown errors ✅ Fixed All make_* factory teardowns log via logger.warning — verified across tests/restapi/platform/conftest.py, catalog/conftest.py, contacts/conftest.py, marketing/conftest.py, orders/conftest.py.
C7 time.sleep polling ❌ Open Still in tests/restapi/search/test_search.py:115 and tests/restapi/platform/test_misc.py:37.
C8 Component.wait_for_results networkidle ✅ Fixed page_objects/components/component.py is now an 11-line locator wrapper; no networkidle anywhere in page_objects/.
C9 client._session private access ❌ Open tests/conftest.py:109 still mutates client._session.hooks["response"].
C10 serial marker informational ❌ Open Marker still defined in pyproject.toml:26; no xdist grouping configured.
S11 cart_operations.py duplication ❌ Open Still 442 lines, still 12+ near-identical mutations.
S12 BaseOperations vs RestBaseOperations ❌ Open Naming and file-name split unchanged.
S13 Implicit with_user chain ❌ Open graphql_client/with_cart/ctx still depend on with_user.
S14 Multiple autouse fixtures ❌ Open screenshot_on_failure, har_recorder, with_cart, delete_cart_after all autouse.
S15 No tests/e2e/conftest.py ❌ Open E2E branches still gated by if marker == "e2e" in root conftest.
S16 Page-object children per-access ❌ Open top_header.py still rebuilds children on every property.
S17 Positional marker payloads ❌ Open marker.args[0] access at tests/conftest.py:30,181,206,270.
S18 Mutable restapi/constants.py ❌ Open restapi/constants.py:10 lists/dicts unchanged.
S19 ORDER_LINE_ITEM_TEMPLATE coupling ❌ Open Hardcoded product IDs still in restapi/constants.py.
S20 gql() no-op ❌ Open Still in gql/operations/base_operations.py:35.
S21 GqlModel not strict ❌ Open gql/types/base.py still missing extra="forbid".
S22 # type: ignore[call-arg] ❌ Open core/global_settings.py:45.
S23 Allure decorator discipline ❌ Open No documented convention or enforcement hook.
S24 Inconsistent __init__.py ❌ Open tests/restapi/ has them; tests/e2e/, tests/graphql/, tests/ don't.
S25 dataset_manager.log location ❌ Open Still inside the package dir.
S26 GraphQL partial response handling ❌ Open core/clients/graphql.py:36-38 still raises on any errors.
S27 RestClient._parse_response strictness ❌ Open Still raises NotImplementedError on non-JSON.
S28 make_user mutates server response ❌ Open tests/restapi/platform/conftest.py:77-79.
M29–M40, M42–M45 Various minor ❌ Open See per-item entries below.
M30 _legacy/ vs _BACKUP/ ✅ Fixed dataset/_legacy/ no longer exists; only _BACKUP/ remains.
M37 HARRecorder.hook swallow ❌ Open utils/har_recorder.py:60-64 still has except Exception: pass.
M41 Thin Component base class ◐ Partial After C8, Component is a 4-line wrapper holding _root and exposing .root. The "delete or give real behaviour" decision still pending — the class adds little over a raw Locator today.
N1 (➕) REST typing coverage uneven New See N1 below.
N2 (➕) RestModel not strict New See N2 below.
N3 (➕) New auto-tests.yml workflow New See N3 below.
N4 (➕) OrderOperations.update accepts dict | CustomerOrder New See N4 below.
N5 (➕) tests/restapi/store/test_store.py:46 except: pass New See N5 below.
N6 (➕) tests/restapi/search/test_search.py:113 except: pass inside polling loop New See N6 below.
N7 (➕) RestClient reaches into auth.headers New See N7 below.

CRITICAL issues (still open)

C3 (PARTIAL). Typed REST responses — coverage and strictness still incomplete

Where: restapi/types/, restapi/operations/

What landed (PR #153, commit 26c629c):

  • restapi/types/ now contains 14 hand-written models: Catalog, CatalogLanguage, Category, Contact, CustomerOrder, Employee, Member, OrderLineItem, Organization, Pricelist, PricelistAssignment, Product, Promotion, Role, Store, User, Vendor.
  • Operations that consume these models: CatalogOperations.create/update, CategoryOperations.create, ProductOperations.create, MemberOperations.create, ContactOperations.create, OrganizationOperations.create, EmployeeOperations.create, VendorOperations.create, RoleOperations.create, OrderOperations.create/get_by_id/get_by_number, UserOperations.get_by_name, etc.
  • Each model is conservative — only fields touched by the test suite are typed.

What's left:

  1. ~70 method signatures still return dict. Search/list endpoints, dynamic_content, oauth, settings, notifications, api_key, price, cms_content, several order reads (search, recalculate, dashboard_statistics, etc.). Plan calls these out individually in Phase 2.3.
  2. RestModel uses extra="allow" deliberately to round-trip unknown fields on update calls (see restapi/types/base.py:5-13). This means schema drift still passes silently — extra="forbid" is parked behind "first get coverage, then tighten." Tracked as N2 below.
  3. OrderOperations.update accepts dict | CustomerOrder. This is an escape hatch when round-tripping unknown fields, but it lets tests bypass the typed contract. Tracked as N4.

Severity: Still CRITICAL because the gap between "the model says it accepts X" and "we actually validate X" remains. Type-safety asymmetry has narrowed but not closed.


C4. Context.from_dataset has dead error path

Where: tests/context.py:34-36

user = next(u for u in dataset["users"] if u["userName"] == username)
if user is None:
    raise ValueError(f"User '{username}' not found in dataset")

Problem: Unchanged. next() without a default raises StopIteration; the if user is None branch is unreachable.

Fix:

user = next((u for u in dataset["users"] if u["userName"] == username), None)
if user is None:
    raise ValueError(f"User '{username}' not found in dataset")

C5. Race condition in AuthProvider._refresh

Where: core/auth/provider.py:42-68

Problem: Still releases the lock between the is_expired check and requests.post. Two concurrent threads can double-refresh and the second response overwrites the first.

Fix: Hold the lock through the HTTP call (refreshes are rare; contention isn't the concern).


C7. time.sleep polling instead of using existing helper

Where:

Problem: Same as before. utils/polling_utils.py poll_until is unused at these sites.

Fix: Replace the loops with poll_until(...) (see original review for example).


C9. tests/conftest.py reaches into private client._session

Where: tests/conftest.py:109

Problem: Conftest still mutates client._session.hooks["response"] directly.

Fix: Add add_response_hook / remove_response_hook to RestClient and GraphQLClient; have the har_recorder fixture use the public API.


C10. serial marker is informational only

Where: pyproject.toml:26

Problem: Without pytest-xdist worker grouping or pytest-ordering, the marker still does nothing. CI does not currently run with -n auto, so the marker is purely aspirational.

Fix:

  • If parallel runs aren't planned, remove the marker (false sense of safety).
  • If parallel runs are planned: add pytest-xdist, use @pytest.mark.xdist_group("serial") so all serial-marked tests run on the same worker; document in README and CI workflows.

SIGNIFICANT issues (still open)

S11–S28 are unchanged from the previous review. Cross-references and fix sketches in the original document remain accurate. Highlights:

  • S11. cart_operations.py duplicationgql/operations/cart_operations.py is still 442 lines, still 12 near-identical mutations. Extract _execute_command(operation_name, query, command) on BaseOperations.
  • S12 / S15 / S24. Naming and structure inconsistenciesBaseOperations vs RestBaseOperations, base.py vs base_operations.py, missing tests/e2e/conftest.py, mixed __init__.py placement. None individually critical; together they fight new contributors.
  • S13 / S14. Implicit autouse fixture chain — every test transitively pulls with_user/with_cart/delete_cart_after/screenshot_on_failure/har_recorder. Test signatures hide what's running.
  • S17. Positional marker payloads — six call sites in tests/conftest.py still read marker.args[0].
  • S18 / S19. restapi/constants.py mutability and dataset coupling — lists and dicts are mutable; ORDER_LINE_ITEM_TEMPLATE hardcodes product-acme-laptop-lenovo-ideapad-5i.
  • S21. GqlModel not strict — still no extra="forbid". Schema drift passes silently.
  • S26 / S27. Client error handlingGraphQLClient.execute raises whenever errors is non-empty, dropping partial data; RestClient._parse_response raises NotImplementedError on non-JSON content types.
  • S28. make_user mutates the response dict — credentials get mixed into the server response object.

Refer to the original review for full detail and fix sketches; they remain valid.


MINOR issues (selected)

Most M-series issues unchanged. Notable:

  • M37. HARRecorder.hook still swallows everythingutils/har_recorder.py:60-64. Keeps the silent-failure pattern alive in a place where it'd be cheap to log.
  • M41. Component base class is even thinner now — after C8, page_objects/components/component.py is just __init__(self, root) + root property. It doesn't yet justify its existence; either delete and have components hold a Locator directly, or give the base real shared behaviour (visibility helpers, common attribute accessors).
  • M43. Workflow duplication has grownauto-tests.yml was added in addition to the existing six. Consolidating remains worthwhile.
  • M30. _legacy/ vs _BACKUP/ — Resolved. dataset/_legacy/ is gone.

Other M-items (M29, M31–M36, M38–M40, M42, M44–M45) are unchanged.


NEW issues surfaced in this pass

N1. REST typing coverage is uneven by domain

Where: restapi/operations/

Observation: Among the 22 operation modules:

  • Fully or mostly typed: catalog, category, product, member, contact, organization, employee, vendor, promotion, pricelist, pricelist_assignment, role, store, user (reads).
  • Untyped or mixed: order (write paths return dict), dynamic_content (18 dict returns), cms_content, oauth, settings, notifications, api_key, price, user (writes).

Problem: When a typed operation calls into an untyped helper or shares a schema with one (e.g. OrderOperations.create returns CustomerOrder but OrderOperations.recalculate returns dict), tests get inconsistent ergonomics in the same file.

Fix: Continue Phase 2.3 — pick one untyped module per PR, type its responses, drop the dict annotations. Aim for a module-by-module rollout with pyright catching the call-site mismatches.


N2. RestModel permissive extra="allow" parks schema drift

Where: restapi/types/base.py:5-13

Observation: RestModel deliberately uses extra="allow" so that update flows (update(catalog: Catalog)model_dump() → POST) round-trip server fields the model doesn't know about. The docstring acknowledges this and points to S21 as the follow-up.

Problem: Schema drift cannot be detected by the test suite while extra="allow" is in effect — a renamed or added server field is silently accepted on read and re-sent unchanged on write. This is the same hole that S21 calls out for GqlModel.

Fix (when REST coverage is complete):

  • Once a model has been validated against current server responses, flip to extra="forbid".
  • For update flows that genuinely need round-tripping unknown fields, add an explicit unknown_fields: dict[str, Any] = Field(default_factory=dict) to capture them, instead of relying on extra="allow".
  • Stage the flip module-by-module after Phase 2.3 finishes so it doesn't block coverage progress.

N3. Workflow duplication has grown to seven files

Where: .github/workflows/auto-tests.yml, e2e-tests.yml, e2e-tests-docker.yml, graphql-tests.yml, graphql-tests-docker.yml, restapi-tests-docker.yml, refactored-tests.yml.

Problem: Original M43 noted six; one more was added. The same setup steps (Python install, dependency install, Playwright install, Allure setup) are duplicated. Each rename or env-var change touches all of them.

Fix: Consolidate via a reusable workflow + matrix strategy (see M43 fix sketch). Worth bundling with the PR that addresses M43.


N4. OrderOperations.update accepts dict | CustomerOrder

Where: restapi/operations/order_operations.py:47-49

def update(self, order: dict | CustomerOrder) -> None:
    body = order.model_dump(by_alias=True) if isinstance(order, CustomerOrder) else order
    self._client.put(self._url(self.PATH), json=body)

Problem: The escape hatch (accepting either) lets tests bypass the typed contract — e.g. mutate order["status"] = "Cancelled" on a raw dict obtained elsewhere. The class docstring justifies this for "deep field-level edits without paying dump/reconstruct" but the cost is that pyright can't catch a typo'd field name in the dict path.

Fix options:

  • Keep update(order: CustomerOrder) only; add a low-level update_raw(payload: dict) for the round-trip case so test code is explicit about which mode it's in.
  • Or extend CustomerOrder to expose enough mutators (a model_copy(update={...}) helper) that the dict path is unnecessary.

Either way, surface the choice at the call site rather than letting it default into dict.


N5. try: ... except Exception: pass regression in tests

Where: tests/restapi/store/test_store.py:46-47 — comment says "Expected: validation error or 400" but no assertion verifies that.

Problem: This is the same pattern C6 fixed in conftest factories, but now living inside a test body. Bare except Exception: pass cannot distinguish "expected 400" from "connection refused" or "the entire backend is down."

Fix: Use pytest.raises(requests.HTTPError) (or the project's wrapping exception) and assert on response.status_code.


N6. time.sleep loop also includes except: pass

Where: tests/restapi/search/test_search.py:108-115

Problem: Combines two anti-patterns from this review (C7 + silent-failure). The try/except Exception: pass inside the polling loop hides every failure mode — including the case where the backend has stopped serving job status entirely.

Fix: Replace with poll_until(...); if the test wants to tolerate transient failures, encode that in predicate= rather than via blanket except.


N7. RestClient._request reads auth.headers on every call

Where: core/clients/rest.py:33

Observation: Every request reads self._auth.headers, which acquires the auth lock and may trigger a refresh. That's correct, but combined with C5 (the race in _refresh), parallelism amplifies the bug. Worth keeping in mind when fixing C5: the contention pattern under pytest-xdist would be "every worker hits headers on every call."

Fix: No standalone change; just verify that the C5 fix accounts for high-frequency callers.


Cross-cutting themes (refresh)

T1. Type safety asymmetry — narrowing, not closed

REST is no longer entirely dict-typed (down from 100% to ~50%), but the remaining dict returns are concentrated in search/dynamic-content/oauth modules. The intent is clear (continue rolling out); the risk is that "good enough" stalls progress before strictness (extra="forbid") lands.

T2. Silent failure tolerance — partly addressed

Factory teardowns now log (C6 ✅). HAR recorder hook (M37), GraphQL partial response handling (S26), in-test polling loops (N6), and one isolated test-body except: pass (N5) remain. The C1 retry-masking is gone.

T3. Pattern documentation drift — improving

Component.wait_for_results is gone (✅), so the docs and code now agree on no-networkidle. Cached children (S16), no-time.sleep (C7), and Allure decorator policy (S23) are still drifted. A ruff config with project-specific rules (banning time.sleep, networkidle, except Exception: pass outside specific paths) would close this gap mechanically.

T4. Naming inconsistency — unchanged

BaseOperations vs RestBaseOperations, base.py vs base_operations.py, marker positional vs kwargs — all unchanged.

T5. Implicit autouse chain — unchanged

Test signature still hides what activates. Splitting tests/e2e/conftest.py and making with_user opt-in are both still pending.


Recommended fix priority (refreshed)

The original top-5 had C1, C2, C6, C8, C3. Three are done; one is partial. Updated top 5:

  1. C5 (race in AuthProvider) — single-file fix; correctness bug that gets worse the moment parallel runs land.
  2. C9 (public hook API on clients) — small but unblocks both T2 and removes private-attribute access. Touches two files.
  3. Continue C3 — Phase 2.3 module rollout. Pick one untyped module per PR; aim for full coverage before flipping extra="forbid".
  4. S11 (cart_operations dedup) — biggest readability win; ~200 lines deleted; pattern then propagates to other large mutation files.
  5. S21 + N2 (extra="forbid" on both base models) — gate the strictness flip behind C3 finish, but hold the line that this is the next visible-quality milestone.

After these, work S13–S15 (conftest split + opt-in with_user) and S17 (kwargs migration) opportunistically; pre-commit + ruff (Phase 4.4 in the plan) lands somewhere in the middle to start enforcing decisions automatically.


Notes

  • The dataset/ module remains reference-quality; not part of this review's findings.
  • _BACKUP/ is still excluded per project owner direction.
  • Counts and line numbers are accurate as of 2026-04-30 commit d3ad033.
  • Three PRs since the previous review (#151, #152, #153) closed four CRITICAL items (C1, C2, C6, C8) and meaningfully advanced C3.

Overall Assessment

Headline

Same shape, better posture. The visibility-first sweep (C1, C2, C6, C8) and the start of typed REST responses (C3) have moved the project from "B/B+" to a tentative B+/A−. The remaining work is concentrated in two areas: (1) finishing what's started (REST typing coverage; C9 public hooks; C5 race), and (2) the autouse/conftest hygiene cluster (S13–S15) which has not been touched.

What's genuinely better

  • CI signal is honest now. No --retries=1. Type checking covers restapi/. Cleanup failures log instead of vanishing. The cost of seeing "green" is now closer to the value of seeing green.
  • Type asymmetry has shrunk. ~50% of REST operations return Pydantic models instead of dict. The pattern is established and reproducible — Phase 2.3 is a rollout, not a design problem.
  • The discouraged Playwright pattern is gone. No networkidle in page_objects/. Project memory and the actual base class now agree.

What still needs attention

  • The auth provider race (C5) is a latent bug. It's invisible in single-threaded runs but every push toward parallelisation makes it more dangerous. Worth fixing before adding pytest-xdist.
  • Two new silent-failure patterns surfaced. tests/restapi/store/test_store.py:46 and tests/restapi/search/test_search.py:113 show the C6 lesson didn't fully propagate to test bodies. A ruff rule banning except Exception: pass outside whitelisted files would catch the next one.
  • The autouse chain and conftest split (S13–S15) is the largest remaining structural cleanup. It's been deferred twice now. Recommend bundling 5.1 + 5.2 + 5.4 from the improvement plan into one feature branch.
  • Pattern enforcement is still manual. No ruff, no pre-commit beyond black. Every convention from .claude/skills/ survives by author discipline only.

Trajectory

Net-improving, faster than the previous cadence. Three meaningful PRs in two days closed one full visibility cluster and started the architectural one. The risk going forward is that the visible wins (typing coverage, extra="forbid" flip) crowd out the less glamorous but high-value cleanups (C5, autouse split, ruff). Pacing those in alongside C3 rollout would prevent that.

Bottom line

Grade: B+ / A−. The project is no longer hiding what's broken; what remains is the work of finishing started initiatives and closing the autouse hygiene gap. The single highest-leverage change now is C5 (single-file race fix; gates safe parallelism) followed by completing C3 + flipping N2/S21 to extra="forbid" (the architectural win the project has been climbing toward).