diff --git a/ideas/DE_field_provenance.md b/ideas/DE_field_provenance.md
new file mode 100644
index 0000000000..a98c1aa441
--- /dev/null
+++ b/ideas/DE_field_provenance.md
@@ -0,0 +1,130 @@
+# RFC: Per-field provenance for OBP Dynamic Entities
+
+**Target:** OBP-API (upstream)
+**Status:** Draft for discussion
+**Author:** (OGCR team)
+**Date:** 2026-06-05
+
+> Companion: field-level write/read role permissions are specified separately in
+> `DE_field_write_role_read_role.md`. The two are **orthogonal** and compose, but each ships
+> independently. This doc covers **provenance only**.
+
+---
+
+## 1. Summary
+
+A small, optional addition to Dynamic Entities (DEs): on write, the server stamps **who** and **when**
+for a field, and the writer may attach an **opaque** metadata object. No blockchain vocabulary — a chain
+projection is just one worked example (§7). Opt-in per field, fully backward compatible.
+
+## 2. Motivation
+
+DEs today have only a fixed set of server-injected audit fields (`userId`, `consentId`) at the **record**
+level. There is no way to record *who set a particular field, and when* — useful for any sensitive or
+externally-sourced field:
+
+- "Who last changed this field, and when" auditing on any field.
+- Externally-sourced fields (registry imports, indexer projections) that should carry their source/stamp.
+
+Kept blockchain-agnostic so it's reusable across OBP deployments.
+
+## 3. Current behaviour (baseline)
+
+- Server-injected **record-level** audit fields (`userId`, `consentId`) only.
+- No per-field "who/when" or per-field metadata.
+- DE data is a JSON blob (`dataJson`) in a generic `DynamicData` table.
+
+---
+
+## 4. Schema addition (per property)
+
+| keyword | type | meaning |
+|---|---|---|
+| `trackProvenance` | boolean (default `false`) | record who/when on each write of this field, plus optional writer-supplied metadata |
+
+Entity-level convenience (optional): `trackProvenanceAllFields: true` enables provenance for every field.
+
+`trackProvenance` is **independent of access control** — any field can opt in, restricted or not.
+
+## 5. Provenance structure
+
+When a `trackProvenance` field is written, the server records an entry in a reserved `_provenance` object
+on the record:
+
+```jsonc
+"_provenance": {
+  "": {
+    // server-stamped (authoritative, cannot be forged):
+    "written_by_user_id": "…",
+    "written_by_role": "…", // role that authorised the write (if any)
+    "updated_at": "2026-06-05T12:00:00Z",
+    // writer-supplied (optional, opaque to OBP):
+    "metadata": { /* any JSON the writer attaches */ }
+  }
+}
+```
+
+- `written_by_user_id`, `written_by_role`, `updated_at` are **always** set by the server.
+- `metadata` is an **opaque** object OBP stores and returns verbatim — OBP attaches no meaning to it.
+- Stores the **latest** write per field (full history out of scope — §9).
+- `_provenance` is a reserved property name (definitions may not declare a field called that).
+
+## 6. Interaction with field-level permissions
+
+Provenance is a **side-effect of an accepted write**, so it inherits the field's authorisation from the
+companion permissions RFC — no separate permission is needed:
+
+- **Provenance inherits write authorisation.** A field's provenance can only be created/changed by a write
+  that already passed its `writeRole` check. So a write-restricted field's provenance can only come from the
+  keyholder; ordinary consumers can only generate provenance on fields they're allowed to write.
+- **Restamp only on a real authorised write.** When PUT/CREATE preserves a restricted field (value not
+  actually written by the caller), its provenance is **preserved**, not restamped to that caller/now. No-op
+  echoes don't update `updated_at`.
+- **Visibility follows `readRole`.** If a field is read-restricted, its `_provenance[field]` entry is
+  **omitted** for callers lacking the read role (who/when/metadata can leak information).
+- **Writer-supplied `metadata`** is accepted only for fields the caller is authorised to write;
+  server-stamped keys always override client-supplied ones.
+
+### 6.1 Writer supplies metadata via a parallel block
+
+```http
+PATCH /obp/dynamic-entity//{id}
+{
+  "": "",
+  "_provenance": { "": { "metadata": { /* arbitrary */ } } }
+}
+```
+
+## 7. Worked example — blockchain projection (one example, not a special case)
+
+A `chain_owner` field with `writeRole = CanWriteChainProjection` (the indexer's shared key, per the
+permissions RFC) and `trackProvenance: true`, where the writer's opaque `metadata` happens to be
+`{ "source": "chain:ogcr:2025", "tx_hash": "0x…", "block_number": 1234567, "confirmations": 20, "finalized": true }`.
+OBP neither knows nor cares that this is chain data — it's just a restricted, provenance-tracked field.
+
+## 8. Backward compatibility
+
+- Definitions without `trackProvenance` behave exactly as today.
+- `_provenance` only appears when at least one field opts in.
+- Only new validation on definitions: reserving the `_provenance` property name.
+
+## 9. Out of scope / future
+
+- **Provenance history** (append-only) — this RFC keeps latest-only.
+- **Field-level permissions** — companion RFC `DE_field_write_role_read_role.md`.
+- **Queryable DE GET** — see `dynamic_entity_indexing.md`.
+
+## 10. Implementation touch points (OBP-API)
+
+- `code/dynamicEntity/DynamicEntityProvider.scala` — recognise `trackProvenance` /
+  `trackProvenanceAllFields`; reserve `_provenance`.
+- `code/api/dynamic/entity/APIMethodsDynamicEntity.scala` & `Http4sDynamicEntity.scala` — stamp provenance
+  on accepted writes; preserve provenance on no-op/preserved fields; omit provenance per `readRole`.
+- `MapppedDynamicDataProvider.scala` — persist & return `_provenance`.
+- `JSONFactory6.0.0` / `ExampleValue.scala` / resource docs — document keyword + examples.
+
+## 11. Open questions
+
+- Latest-only (proposed) vs append-only history.
+- Should `trackProvenanceAllFields` be allowed, given storage/write cost, or per-field only?
+- Sequencing: ship after the permissions RFC (recommended), since provenance leans on its auth model.
diff --git a/ideas/DE_field_write_role_read_role.md b/ideas/DE_field_write_role_read_role.md
new file mode 100644
index 0000000000..cf110b5ec9
--- /dev/null
+++ b/ideas/DE_field_write_role_read_role.md
@@ -0,0 +1,303 @@
+# RFC: Field-level write/read role permissions for OBP Dynamic Entities
+
+**Target:** OBP-API (upstream)
+**Status:** Draft for discussion
+**Author:** (OGCR team)
+**Date:** 2026-06-05
+
+> Companion: per-field provenance is specified separately in
+> `DE_field_provenance.md` (provenance). The two are **orthogonal** and
+> compose, but each ships independently. This doc covers **access control only**.
+
+---
+
+## 1. Summary
+
+A small, generic addition to Dynamic Entities (DEs): **per-field `writeRole` / `readRole`** so individual
+fields can be restricted independently of the entity-level roles. No blockchain vocabulary; useful across
+OBP deployments. Opt-in per field, fully backward compatible.
+
+## 2. Motivation
+
+DEs today have only **entity-level** roles (`CanGet/CanCreate/CanUpdate/CanDelete`), **row-level**
+scoping (`hasPersonalEntity` + `userId`), and a fixed set of server-injected audit fields. There is no way
+to make *one field* writable or readable only by a specific role.
+
+Generic use-cases (no blockchain required):
+
+- A `verification_status` field only a **certifier/verifier** role may set; everyone reads it.
+- **Admin-only** fields (`internal_notes`) — restricted read *and* write.
+- A moderation/`featured` flag only platform staff may toggle.
+- Registry-mirrored fields a consumer app must display but never edit.
+- A projection field written only by a privileged service (e.g. an indexer), read by everyone.
+
+## 3. Current behaviour (baseline)
+
+- Entity-level roles gate whole operations.
+- Row-level scoping via `hasPersonalEntity` / `personalRequiresRole` + `userId`.
+- Per-property schema in the definition (`metadataJson`): `type`, `required`, `minLength`, `maxLength`,
+  `reference:`.
+- DE data is a JSON blob (`dataJson`) in a generic `DynamicData` table; reads are get-by-id or get-all
+  (field filtering/sorting/pagination is a separate concern — see §11).
+
+---
+
+## 4. Schema additions (per property)
+
+All keywords are camelCase, consistent with existing schema keywords (`minLength`, `maxLength`). They are
+always read in the context of a field, so the names need no "field-level" prefix.
+
+| keyword | type | meaning |
+|---|---|---|
+| `writeRoleRequired` | boolean (default `false`) | field is **write-restricted**: not writable via PUT/CREATE; only via the role-gated PATCH path |
+| `writeRole` | string (optional) | the role permitted to write; if omitted, auto-generate `CanWriteDynamicEntityField___` |
+| `readRoleRequired` | boolean (default `false`) | field is **read-restricted**: omitted from GET unless the caller holds the read role |
+| `readRole` | string (optional) | the role permitted to read; if omitted, auto-generate `CanGetDynamicEntityField___` |
+
+**Restriction-on rule:** write restriction is on if *either* `writeRoleRequired: true` *or* an explicit
+`writeRole` is named (and symmetrically for read). So you never specify both — `writeRoleRequired: true`
+gives an auto role, `writeRole: "…"` gives an explicit (shareable) role and implies the restriction.
+
+Read and write are independent — a field can be (a) write-restricted but world-readable (the common case:
+verifier/indexer writes, everyone reads), (b) read-restricted but writable, or (c) both.
+
+**Example definition:**
+```json
+{
+  "activity_listing": {
+    "required": ["title"],
+    "properties": {
+      "title": { "type": "string" },
+      "price_per_credit": { "type": "string" },
+      "chain_owner": { "type": "string", "writeRoleRequired": true },
+      "verification_status": { "type": "string", "writeRole": "CanSetVerificationStatus_activity_listing" },
+      "internal_notes": { "type": "string", "writeRole": "CanEditInternal_activity_listing",
+                          "readRole": "CanReadInternal_activity_listing" }
+    }
+  }
+}
+```
+
+## 5. Write semantics — PUT/CREATE never write restricted fields
+
+- **PUT / CREATE** operate on **unrestricted fields only**. Any field with `writeRoleRequired`/`writeRole`
+  in the body is **ignored and preserved** (existing value kept) — no value comparison, no error. This
+  removes the stale-echo problem (a consumer echoing an out-of-date restricted value can never block or
+  clobber).
+- **PATCH** (role-gated) is the **only path that writes restricted fields**. The caller must hold the
+  field's `writeRole`, else `403`. (PATCH also writes unrestricted fields a caller is allowed to write.)
+- **`required` is validated against the merged object** (request body + preserved restricted fields), so
+  a consumer create/update never fails because a restricted field is "missing". Restricted fields are
+  **optional-at-create** (the keyholder fills them afterward via PATCH).
+
+## 6. Read semantics
+
+- A field with `readRoleRequired`/`readRole` is **omitted** from GET responses for callers without the
+  read role (applied consistently on GET_ONE and GET_ALL).
+- Fields without read restriction behave exactly as today.
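The write rules in §5 and the read rule in §6 can be sketched as three small functions. This is a minimal, language-neutral illustration in Python, not OBP's Scala implementation. The function names, the in-memory `PROPS` shape, and the auto-generated role layout (entity and field joined by `__`) are assumptions made for the example; `required` validation is left out.

```python
from typing import Any, Dict, Optional, Set

# Hypothetical in-memory form of a definition's "properties" block.
PROPS: Dict[str, Dict[str, Any]] = {
    "title": {"type": "string"},
    "chain_owner": {"type": "string", "writeRoleRequired": True},
    "internal_notes": {
        "type": "string",
        "writeRole": "CanEditInternal_activity_listing",
        "readRole": "CanReadInternal_activity_listing",
    },
}


def required_role(entity: str, field: str, prop: Dict[str, Any], kind: str) -> Optional[str]:
    """Role needed to "write" or "read" a field, or None when unrestricted."""
    explicit = prop.get(f"{kind}Role")
    if explicit:  # an explicit role implies the restriction
        return explicit
    if prop.get(f"{kind}RoleRequired", False):
        verb = "Write" if kind == "write" else "Get"
        # Assumed auto-name layout, with "__" between entity and field.
        return f"Can{verb}DynamicEntityField_{entity}__{field}"
    return None


def put_record(entity: str, props: Dict[str, Dict[str, Any]],
               existing: Dict[str, Any], body: Dict[str, Any]) -> Dict[str, Any]:
    """PUT: unrestricted fields come from the body, restricted ones are preserved."""
    merged: Dict[str, Any] = {}
    for field, prop in props.items():
        restricted = required_role(entity, field, prop, "write") is not None
        source = existing if restricted else body
        if field in source:
            merged[field] = source[field]
    return merged


def patch_record(entity: str, props: Dict[str, Dict[str, Any]],
                 existing: Dict[str, Any], body: Dict[str, Any],
                 roles: Set[str]) -> Dict[str, Any]:
    """PATCH: role-gated partial update, bounded to schema fields."""
    missing = []
    for field in body:
        if field in props:
            role = required_role(entity, field, props[field], "write")
            if role is not None and role not in roles:
                missing.append(role)
    if missing:
        raise PermissionError(f"403: missing roles {sorted(missing)}")
    merged = dict(existing)
    merged.update({f: v for f, v in body.items() if f in props})
    return merged


def read_record(entity: str, props: Dict[str, Dict[str, Any]],
                record: Dict[str, Any], roles: Set[str]) -> Dict[str, Any]:
    """GET: omit read-restricted fields the caller holds no role for."""
    visible: Dict[str, Any] = {}
    for field, value in record.items():
        role = required_role(entity, field, props.get(field, {}), "read")
        if role is None or role in roles:
            visible[field] = value
    return visible
```

With these definitions, a consumer PUT that echoes a stale `chain_owner` leaves the stored value untouched, while a PATCH of the same field succeeds only for a caller holding the field's write role.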
+
+## 7. Roles & "the key"
+
+- Restricted fields **generate** the implied roles by default:
+  `CanWriteDynamicEntityField___` and `CanGetDynamicEntityField___`
+  (double-underscore delimiter to disambiguate snake_case entity/field names; bank-scoped variants where
+  the entity is bank-scoped). They appear in the dynamic role registry and are grantable via the existing
+  entitlement endpoints.
+- **Explicit shared role override:** naming an explicit `writeRole`/`readRole` lets many fields (across many
+  entities) point at **one** role — essential to avoid role-explosion for a service like an indexer that
+  writes lots of fields (grant it one `CanWriteChainProjection`-style role once).
+- One write role per field covers both create-time and PATCH writes; the read role is **additive** on top
+  of the entity-level `CanGet…` (you need both to see a read-restricted field).
+- "The key" = an entitlement granted to a user. A service (verifier, importer, indexer) authenticates as a
+  user holding the role; ordinary consumers don't and are read-only on those fields.
+
+## 8. Operation-aware resource docs
+
+OBP auto-generates DE resource docs (request/response schemas + examples) from the entity definition. Today
+it derives one body shape from the full definition; this enhancement makes generation **operation-aware** so
+the docs stop advertising fields a caller can't actually set on that endpoint.
+
+| Endpoint | Write-restricted fields in **request** body | In **response** body |
+|---|---|---|
+| **POST (CREATE)** | **omitted** from `typed_request_body` + `example_request_body` (they'd be ignored) | included |
+| **PUT (UPDATE)** | **omitted** from the request body | included |
+| **PATCH** (role-gated write path) | **included** — annotated with the required `writeRole` | included |
+| **GET** | n/a | included; read-restricted fields annotated "only returned if you hold `readRole`" |
+
+Consequences:
+- A developer reading **POST/PUT** docs sees only settable fields — no misleading restricted fields in the
+  example body. This also reflects that restricted fields are **optional-at-create**.
+- **PATCH** docs are where restricted fields appear, each annotated with the role needed to write it.
+- **Responses** include restricted fields (they're readable); read-restricted ones are annotated.
+
+Because OBP resource docs are **static, not per-caller**, role requirements are *documented* (the endpoint
+already lists its required roles) rather than dynamically hidden per viewer. A read-restricted field
+therefore still appears in the response *schema* with a "requires `readRole`" annotation, but is omitted from
+the actual JSON at runtime for callers without the role.
+
+Implementation: the generator (`DynamicEntityHelper.operationToResourceDoc` / `APIMethodsDynamicEntity`)
+filters the request-body schema by operation — strip `writeRoleRequired` fields from CREATE/PUT, keep them in
+PATCH, keep readable fields in responses with annotations.
+
+## 9. Backward compatibility
+
+- Definitions without the new keywords behave exactly as today.
+- Restricted fields are simply omitted (read) or ignored (PUT/CREATE write) for callers without roles.
+- No new reserved property names; no migration.
+
+## 10. Security considerations
+
+- Authorisation uses the existing entitlement check (`hasEntitlement`).
+- `readRole` omission must be applied consistently on GET_ONE and GET_ALL.
+- Orthogonal to personal-entity (`userId`) row scoping.
+- Entitlements are user-level; a service "key" is a service user's credential.
+
+## 11. Out of scope / related
+
+- **Per-field provenance** (who/when stamping) — companion RFC `…-permissions.md`. Composes with this.
+- **Queryable DE GET** (field filtering/sorting/pagination) — separate enhancement; see
+  `dynamic_entity_indexing.md`.
+- A new **PATCH** verb for DEs (the write path for restricted fields) — assumed by this RFC; if OBP has no
+  DE PATCH today, adding it is part of this work.
+
+## 12. Implementation touch points (OBP-API)
+
+- `code/dynamicEntity/DynamicEntityProvider.scala` — recognise `writeRole` / `readRole` /
+  `writeRoleRequired` / `readRoleRequired`; helper to list restricted fields; skip `required` for
+  role-restricted fields at create.
+- `code/api/dynamic/entity/APIMethodsDynamicEntity.scala` & `Http4sDynamicEntity.scala` —
+  PUT/CREATE ignore-and-preserve of restricted fields; role-gated PATCH write path; `readRole` omission on
+  GET; operation-aware resource-doc generation.
+- `DynamicEntityInfo` + `code/entitlement/*` — generate/register the implied roles.
+- `JSONFactory6.0.0` / `ExampleValue.scala` / resource docs — document keywords + examples.
+
+## 13. Tests
+
+**Conventions (match existing OBP suites):** ScalaTest feature scenarios
+(`scenario("x.y: …", VersionOfApi)` with Given/When/Then); grant roles via
+`Entitlement.entitlement.vend.addEntitlement("", userId, role)`; drive endpoints with
+`makePostRequest` / `makeGetRequest` / `makePutRequest` (add a `makePatchRequest` helper);
+multiple `resourceUser`s for privileged vs ordinary callers.
+**Home:** a new `obp-api/src/test/scala/code/api/v6_0_0/DynamicEntityFieldRolesTest.scala`, plus
+regression additions to the existing `DynamicEntityTest`, `DynamicEntityAccessFlagsTest`,
+`DynamicEntityFilterAndBankAccessTest`.
+
+**A. Definition & role generation**
+- A.1 A definition with `writeRole`/`readRole`/`writeRoleRequired`/`readRoleRequired` parses and persists.
+- A.2 `writeRoleRequired: true` (no explicit role) auto-generates `CanWriteDynamicEntityField___`, and it is grantable via the entitlement endpoints.
+- A.3 Explicit `writeRole` is used verbatim; several fields (and entities) sharing one role all enforce it.
+- A.4 Bank-scoped entity → bank-scoped field-role variant is generated and enforced.
+- A.5 Restriction-on rule: boolean `true` **or** an explicit role each enables restriction; neither = unrestricted (today's behaviour).
+
+**B. Write — POST/CREATE (never writes restricted fields)**
+- B.1 Ordinary consumer POSTs with a restricted field in the body → field ignored; record created (201); restricted field unset.
+- B.2 Unrestricted fields in the same POST are written normally.
+- B.3 A restricted field listed in `required` → consumer create still succeeds (`required` validated post-merge; restricted = optional-at-create), not rejected as "missing".
+- B.4 Even a keyholder's POST does not set restricted fields (confirms PATCH is the only write path).
+
+**C. Write — PUT (ignore + preserve)**
+- C.1 Consumer PUT omitting a restricted field → existing value preserved, not blanked.
+- C.2 Consumer PUT echoing a **stale** restricted value → ignored; current value preserved; no error, no clobber (the stale-echo case).
+- C.3 Consumer PUT changing only unrestricted fields → those update; restricted fields untouched.
+
+**D. Write — PATCH (the role-gated write path)**
+- D.1 Caller **with** the field's `writeRole` PATCHes a restricted field → updated (200).
+- D.2 Caller **without** the role PATCHes a restricted field → 403; value unchanged.
+- D.3 Allowed caller PATCHes an unrestricted field → updated.
+- D.4 Shared-role: one service user holding a single shared `writeRole` PATCHes restricted fields across multiple entities.
+
+**E. Read — GET one/all**
+- E.1 Caller **without** `readRole` → read-restricted field omitted from GET_ONE **and** GET_ALL.
+- E.2 Caller **with** `readRole` (and entity `CanGet`) → field present.
+- E.3 Additive rule: holding the field `readRole` but **not** the entity `CanGet` → still cannot read the entity.
+
+**F. Operation-aware resource docs**
+- F.1 POST/PUT resource docs: restricted fields absent from `typed_request_body` + `example_request_body`.
+- F.2 PATCH resource docs: restricted fields present, annotated with the required role.
+- F.3 GET resource docs: restricted fields present in the response schema; read-restricted ones annotated.
+
+**G. Backward compatibility**
+- G.1 A definition with none of the new keywords behaves exactly as today (existing DE suites pass unchanged).
+- G.2 Existing endpoints/behaviour unaffected when no field is restricted.
+
+**H. Security / negative**
+- H.1 Tampering via PUT (changed restricted value) is silently ignored — no privilege escalation.
+- H.2 Tampering via PATCH without the role → 403, value unchanged.
+- H.3 Revoking the entitlement: previously-allowed PATCH now 403; read-restricted field now omitted on GET.
+
+**I. Personal-entity interaction**
+- I.1 Field-level roles compose with personal (`userId`) row scoping — both enforced, orthogonally.
+
+**Harness note:** OBP DE tests today exercise GET/POST/PUT/DELETE; a **`makePatchRequest`** helper (and PATCH routing for DEs) is a prerequisite, since PATCH is new for Dynamic Entities.
+
+## 14. Open questions
+
+- Multiple write roles per field (OR semantics) — needed, or one role per field enough?
+- PATCH semantics confirmation (partial update verb) vs reusing PUT with a flag.
+- Naming of the auto-generated role delimiter (`__`) — final convention.
+
+## 15. Implementation plan (v7.0.0)
+
+Branch: `feature/de-field-level-permissions`. Locked decisions: target **v7.0.0**; introduce **PATCH** as the
+restricted-field write path; enforce in the **handler layer** (`Http4sDynamicEntity`, which has `callContext`/user);
+use the recommended `handleEntitlementsAndScopes` for new checks (boolean `APIUtil.hasEntitlement` for per-field loops).
+
+Grounding (from code recon):
+- CRUD: `Http4sDynamicEntity.scala` — `genericPost/genericGet/genericPut/genericDelete`; role checks via
+  `NewStyle.function.hasEntitlement(... DynamicEntityInfo.canXRole(entityName, bankId) ...)`; data + body-validation via
+  `NewStyle.function.invokeDynamicConnector(op, ...)`. **No PATCH route** (dispatch match on `(req.method, rest)`).
+- Definition validation: `DynamicEntityProvider.scala` — `validateEntityJson` (runtime body) and
+  `DynamicEntityCommons.apply` (definition-time schema). Per-field foreach validates type/example/minLength.
+- Roles: `DynamicEntityHelper.scala` → `DynamicEntityInfo.canCreateRole/...` (`CanCreateDynamicEntity_System`
+  or bank variant) via `ApiRole.getOrCreateDynamicApiRole`; registered through `dynamicEntityRoles`/`roleNames`.
+- Data provider: `MapppedDynamicDataProvider.scala` — `save/update/get/getAll/getAllDataJson` (stores `dataJson`).
+- Docs: `DynamicEntityHelper.createDocs` + `DynamicEntityInfo.getSingleExampleWithoutId/getSingleExample`.
+
+**Phase 1 — schema keywords (DynamicEntityProvider.scala).** Recognise per-property `writeRole`/`readRole` (string)
+and `writeRoleRequired`/`readRoleRequired` (boolean) in `DynamicEntityCommons.apply`; add `writeRestrictedFields`/
+`readRestrictedFields` + `explicitWriteRole`/`explicitReadRole` helpers on `DynamicEntityT`; make `validateEntityJson`
+skip `required` for write-restricted fields. Backward compatible (absence ⇒ unrestricted).
+**Phase 2 — roles (DynamicEntityHelper `DynamicEntityInfo` + ApiRole).** `fieldWriteRole/fieldReadRole` auto-names
+(`CanWriteDynamicEntityField___`); register explicit role strings; extend `roleNames`/`dynamicEntityRoles`.
+**Phase 3 — POST/PUT enforcement (Http4sDynamicEntity).** POST strips restricted fields; PUT strips + merges existing
+restricted values back before `invokeDynamicConnector(UPDATE)`.
+**Phase 4 — PATCH (Http4sDynamicEntity).** Add `Method.PATCH => genericPatch`; per-field `writeRole` check (403 else),
+merge authorised fields into existing record; register route + resource doc.
+**Phase 5 — GET read omission (Http4sDynamicEntity `genericGet`/`publicGet`/`communityGet`).** Omit fields the caller
+lacks `readRole` for (GET_ONE + GET_ALL); public/community omit all read-restricted.
+**Phase 6 — operation-aware resource docs (DynamicEntityHelper).** CREATE/UPDATE request examples exclude restricted
+fields; add PATCH docs; annotate read-restricted in responses.
+**Phase 7 — tests.** New `obp-api/src/test/scala/code/api/v7_0_0/DynamicEntityFieldRolesTest.scala` + `makePatchRequest`
+helper, covering §13 A–I.
+**Phase 8 — docs/changelog.**
+
+Status: Phase 1 DONE (compiled, runs). Phase 2 DONE — pending compile:
+- `DynamicEntityHelper.scala`: `DynamicEntityInfo.fieldWriteRole`/`fieldReadRole` (explicit shared role, or auto
+  `CanWriteDynamicEntityField_[System]__` / `CanGetDynamicEntityField_...`); `dynamicEntityRoles`
+  now also emits per-field roles (`.distinct`), so they're grantable via the existing `ApiRole.valueOf` path.
+Phase 2b DONE — v6.0.0 create-DE docs (`Http4s600.scala`) now advertise the keywords (markdown + structured request/response examples + Note bullet).
+Phase 3 DONE — pending compile: `Http4sDynamicEntity.scala` — `genericPost` strips write-restricted fields; `genericPut` strips + re-injects existing restricted values (preserve); helpers `writeRestrictedFieldsOf`/`stripFields`/`preserveRestrictedOnPut`; restricted-field helpers added to `DynamicEntityInfo`.
+Phase 4 DONE — pending compile: `Http4sDynamicEntity.scala` — added `Method.PATCH` route + `genericPatch`
+(baseline `canUpdateRole`; per-field `writeRole` check → 403 via `missingFieldWriteRoleNames`; partial-update
+`mergePatch` of incoming over existing, bounded to schema fields); `propertyNames` added to `DynamicEntityInfo`.
+Note: PATCH route works at runtime but has no resource doc yet (Phase 6) — test via curl/Postman, not API Explorer.
+Phase 5 DONE — pending compile: `Http4sDynamicEntity.scala` — `genericGet`/`publicGet`/`communityGet` omit
+read-restricted fields via `applyReadRestrictions`/`omitFields` (authenticated → per-user role check; public →
+omit all read-restricted). Internal merge fetches in PUT/PATCH stay unfiltered.
+Phase 6 PARTIAL DONE — pending compile: `DynamicEntityHelper.scala` — added `getSingleExampleWithoutIdWritable`;
+CREATE/UPDATE request-body examples now exclude write-restricted fields (responses keep all).
+Phase 6 REMAINING: PATCH resource doc (needs a new `DynamicEntityOperation.PATCH` enum value in obp-commons +
+a createDocs branch + doc-pipeline wiring; PATCH already works at runtime), and read-restricted annotations in
+response-body docs.
+Phase 6 COMPLETE — pending compile: PATCH resource doc (generic + my/) via new `DynamicEntityOperation.PATCH`
+enum value (obp-commons) + `buildPatchFunctionName` + createDocs branches; restriction notes appended to
+`fieldsDescription`. (The connector match in LocalMappedConnector casts to `Any`, so the new enum value is safe.)
+Phase 7 DONE — pending compile: added `makePatchRequest` to `SendServerRequests.scala`; new test
+`obp-api/src/test/scala/code/api/v6_0_0/DynamicEntityFieldRolesTest.scala` (placed in v6_0_0 because DE creation is
+v6.0.0 + harness lives there) covering: definition-create, POST-drops-write-restricted, PUT-can't-set, PATCH 403→grant→200,
+GET read-omission→grant→visible. NOT yet run — may need iteration (role-string exactness, dispatch `.PATCH`).
+Phase 8 DONE — brief `release_notes.md` entry (2026-06-05) + field-level permissions section added to the
+"Dynamic-Entities" glossary item in `Glossary.scala`.
+
+ALL PHASES 1–8 IMPLEMENTED on branch `feature/de-field-level-permissions` (Phases 1–4 compiled+spot-tested by user;
+5–8 pending compile/test). Provenance (companion RFC `DE_field_provenance.md`) is a separate later PR.
diff --git a/ideas/DE_indexing_plan.md b/ideas/DE_indexing_plan.md
new file mode 100644
index 0000000000..9af78f0f31
--- /dev/null
+++ b/ideas/DE_indexing_plan.md
@@ -0,0 +1,142 @@
+# DE Indexing — Implementation Plan
+
+**Branch:** `DE_indexing`
+**Design doc:** [`dynamic_entity_indexing.md`](dynamic_entity_indexing.md) (read first — this plan implements the "Approach A" decision there)
+**Status:** Draft plan
+
+## Progress
+
+- **Phase 0 — DONE** (compiles). `indexed`/`index` declaration parsed+validated in `DynamicEntityProvider.scala`; query package `code/api/dynamic/entity/query/` (`QueryModel`, `OperatorMatrix`, `DynamicEntityQueryBackend` seam, `InMemoryQueryExecutor`).
+- **Phase 1 — core DONE** (compiles; 15/15 unit tests green in `QuerySpec`). `QueryParamParser` (`obp_filter[FIELD]=OP:VALUE` + `obp_sort_by`/`obp_sort_direction`/`obp_offset`/`obp_limit`), `QueryPlanner` (4-check validation → 400), `DynamicEntityInfo.indexedFields`, wired into `genericGet`/`publicGet`/`communityGet` (in-memory backend).
+- **Back-compat DECIDED — Option 1 (additive, no version):** `/obp/dynamic-entity/` is **unversioned and the bare-param filter is documented**, so the legacy contract is preserved **byte-for-byte**:
+  - **Legacy** bare-param equality (`?name=A&number=1&number=2` => `name==A && (number==1 || number==2)`, deep-equality for json fields, dotted paths) is **kept** via the restored `filterDynamicObjects` (uses `JsonUtils.isFieldEquals`). Runs in-memory on any field. `locale` and `obp_*` excluded.
+  - **New** capabilities are **additive**: `obp_filter[field]=op:value` (operators/range/spatial — require an `indexed` field, else 400), plus `obp_sort_*`/`obp_offset`/`obp_limit`. Composition per request: legacy filter → new operators → sort → paginate.
+  - **Transparent acceleration (Phase 3):** both syntaxes compile to one `QueryPlan`; when **all** queried fields are `indexed`, the query routes to the SQL projection (fast, scales to millions) — the *same legacy URL* gets faster with no client change. If any field is unindexed → in-memory (status quo).
+  - **No client breaks; G1 passes unchanged.**
+- **Future direction (documented now to avoid surprise):** we reserve the right to **enforce the SQL-projection path even for small datasets** — i.e. require hot fields be `indexed` and cap/deprecate the unbounded in-memory legacy scan (returning a clear "declare this field indexed" error instead of risking RAM/OOM). In-memory is fine to ~thousands; large/GeoJSON entities degrade in the low tens-of-thousands and OOM under concurrency, so this enforcement will land before that bites. **Must be added to the public DE docs.**
+- **Phase 1 — remaining:** legacy parity confirmed (`DynamicEntityFilterAndBankAccessTest` G1 passes **unchanged**, 5/5). Still to do: document the param contract + future-enforcement in public DE docs.
+- **Phase 2 — skeleton DONE** (compiles; `ProjectionNamingSpec` 5/5 green). New package `code/api/dynamic/entity/projection/`:
+  - `ProjectionNaming` — deterministic, hashed, length/charset-safe `de_` / `c_` identifiers (the only identifier-safety surface; `Fragment.const` injects them).
+  - `IndexingCapabilities` — vendor detection (`DoobieUtil.isSqlServer` / `DBUtil.dbUrl`), PostGIS-extension probe, `dynamic_entity.indexing.backend` kill-switch (`inmemory` default | `auto`).
+  - `DynamicEntityIndex` Mapper (registry) + `ProjectionState` constants; registered in `Boot.ToSchemify`.
+  - `ProjectionDDL` — idempotent Doobie DDL (create table / add column / `CREATE INDEX CONCURRENTLY` / drop) + DE-type→SQL-type mapping. **Skeleton: not yet invoked.**
+- **Phase 3 — IN PROGRESS.** Test suite now runs on Postgres (`obp_test_only`, confirmed via boot log `DatabaseInfoJson(PostgreSQL,14.23)`; `db.url` set locally in `test.default.props`, uncommitted). All Phase 3 work is **guarded by `projectionEnabled` (prop default `inmemory` → off)**, so default behaviour is unchanged.
+  - **3.4 (core) DONE** — `ProjectionSql` compiles a `QueryPlan` → `SELECT data_id FROM WHERE … ORDER BY … LIMIT ? OFFSET ?`. Operands bind as text + `CAST` to the column type in-SQL (no per-type `Put`, no value-injection); identifiers via `Fragment.const` on hashed names; scalar ops only (spatial = Phase 4). `ProjectionSqlSpec` 3/3 green.
+  - **Data-plane + provisioner + backend DONE & proven on Postgres** (`ProjectionDataPlaneIntegrationTest` 1/1 green on PG 14):
+    - `ProjectionDb` — committing Doobie transactor over the shared pool, for independent provisioner/backend ops (vs `DoobieUtil`'s `Strategy.void` which shares Lift's txn — that's for the dual-write).
+    - `ProjectionCoerce` — JValue→column-text coerce-or-null (mirrors planner rules).
+    - `ProjectionStore` — upsert / delete / `readBlobRows` + access `scope` (mirrors `MappedDynamicDataProvider` get-all scoping), identifiers from the live `DynamicData` mapper.
+    - `ProjectionProvisioner` — `ensureProvisioned` (create table → add columns → app-side backfill coerce-or-null → `CREATE INDEX` → mark `ready` in registry) + `readyFields`. (Scalar only; spatial = Phase 4. Plain `CREATE INDEX`; `CONCURRENTLY` = prod refinement.)
+    - `PostgresProjectionBackend.query` — one JOIN query (projection ⋈ canonical) doing scope + filter + sort + paginate, returning blobs in order.
+  - **Live-path wiring DONE & no-regression-verified** (compiles; DynamicEntityFilterAndBankAccessTest + ProjectionDataPlaneIntegrationTest 6/6 green with projection off):
+    - `ProjectionDualWrite.onSave/onDelete` hooked into `MappedDynamicDataProvider.saveOrUpdate`/`delete` — txn-unified via `DoobieUtil.runQuery`; guarded + no-op unless `projectionEnabled` and the entity has a `ready` projection.
+    - **Read-path backend selection** in `genericGet`: `decideProjection` → projection when `projectionEnabled` + no legacy bare params + every plan field indexed & ready (skips the fetch-all connector call, serves from SQL); indexed-but-not-ready field → **409** (`ProjectionPendingMsg`); else in-memory. (`cats.effect` IORuntime aliased to avoid clashing with the file's EC `global`.)
+    - **Provisioning trigger** in `MappedDynamicEntityProvider.createOrUpdate` (`ensureProvisionedFields`, fields passed explicitly since the new definition isn't committed yet) — guarded, best-effort (failure logs, leaves definition saved + queries pending).
+    - Default off (`dynamic_entity.indexing.backend=inmemory`) → entire wiring is inert; existing DE behavior unchanged.
+  - **Remaining:** projection-**ON** end-to-end test through the HTTP handlers (needs `backend=auto`; create indexed entity → records → query routes to SQL); extend selection to `publicGet`/`communityGet` (different scoping — deferred); legacy bare-param → SQL unification (deferred); `CREATE INDEX CONCURRENTLY` + batched/resumable backfill (prod refinements); spatial = Phase 4.
+ - **Test-infra caveat:** the suite now points at Postgres (`test.default.props`, local). Persistent Postgres + OBP's re-schemify/migrations (a view/matview) means **repeated full-suite runs need a clean schema** — `psql "" -c "DROP OWNED BY obp_test_only CASCADE;"` (PostGIS objects are owned by `postgres`, untouched), or `DROP_EXISTING=true ./scripts/create_test_db.sh`. Pure unit tests (`QuerySpec`, `ProjectionSqlSpec`, `ProjectionNamingSpec`) are unaffected and also run on H2. + +## Guiding principle + +**Dependable, predictable, simple.** This is a banking API also used by EU academic projects — it must not surprise operators. Concretely: **a query is either served by a ready, bounded SQL path or it returns a clear error — never a silent in-memory fallback that could spike RAM on a large entity.** Prefer the simplest design that meets the contract; reject anything we can't serve predictably. + +Add declarative **filter / sort / paginate** and **spatial** querying to Dynamic Entity list reads, behind a stable, vendor-neutral API contract (Shape B), with **Approach A** (per-entity typed projection tables, automatic DDL) as the accelerated backend and an **in-memory backend** as the portable floor. 
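The guiding principle and the read-path selection described in the status notes can be sketched as one small routing decision. This is a language-neutral illustration in Python, not the OBP Scala code: `decide_route`, `Route`, and the shape of `field_states` are hypothetical names, and the handling of an empty plan (routed in-memory here) is an assumption the source does not state.

```python
from enum import Enum


class Route(Enum):
    PROJECTION = "projection"  # serve from the ready, bounded SQL projection
    PENDING = "pending"        # indexed but not ready: clear error (HTTP 409)
    IN_MEMORY = "in_memory"    # portable floor when projection does not apply


def decide_route(plan_fields, field_states, projection_enabled,
                 has_legacy_params=False):
    """Pick a backend for one list read.

    plan_fields:  field names the query filters or sorts on.
    field_states: registry state per *indexed* field, e.g. {"price": "ready"}.
    """
    if not projection_enabled or has_legacy_params:
        return Route.IN_MEMORY
    indexed = [f for f in plan_fields if f in field_states]
    # Any indexed field that is still provisioning/backfilling blocks the
    # query with a clear error instead of a silent fallback.
    if any(field_states[f] != "ready" for f in indexed):
        return Route.PENDING
    # Projection only when every plan field is indexed and ready.
    # Assumption: an empty plan is not routed to the projection.
    if plan_fields and len(indexed) == len(plan_fields):
        return Route.PROJECTION
    return Route.IN_MEMORY
```

For example, `decide_route(["price"], {"price": "backfilling"}, True)` yields `Route.PENDING`, which the handler would map to the 409 response rather than loading the whole entity into memory.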
+ +## Grounding — what exists today (from code audit) + +| Concern | Current state | File | +|---|---|---| +| Canonical store | `DynamicData` Mapper, `DataJson` text col, keyed by `DynamicEntityName` | `code/dynamicEntity/MapppedDynamicDataProvider.scala` | +| Write path | `saveOrUpdate(...)` → `saveMe()`; `delete_!` | same, ~L202 / L127 | +| Read path | `find()` / `findAll()` with `By()` — **no LIMIT/OFFSET/ORDER BY** | same, L46–172 | +| Filtering | **in-memory only** via `filterDynamicObjects()` | `code/api/dynamic/entity/Http4sDynamicEntity.scala` ~L108 | +| Definition | `DynamicEntity.MetadataJson` (nested JSON); per-field flags already parsed | `code/dynamicEntity/MapppedDynamicEntityProvider.scala`, `DynamicEntityProvider.scala` L76–105, L597+ | +| Field types | `number, integer, boolean, string, DATE_WITH_DAY, json` (+ reference types) | `obp-commons/.../enums/Enumerations.scala` L199–275 | +| Endpoints | http4s only; `genericGet/Post/Put/Patch/Delete`, `publicGet`, `communityGet` | `Http4sDynamicEntity.scala` L231–403 | +| DDL/migration | Lift Schemifier (static models); legacy `migration.Migration` uses Lift `DB.use`+JDBC | `bootstrap/liftweb/Boot.scala`, `code/api/util/migration/Migration.scala` | +| **Raw SQL / DDL library** | **Doobie** (`doobie-core`/`doobie-hikari` 1.0.0-RC4) via `DoobieUtil` — already in use | `obp-api/pom.xml` L450–459, `code/api/util/DoobieTransactor.scala` | +| Spatial | none — greenfield | — | + +**Two architectural consequences:** +1. Projection tables are dynamic, so they live **outside Lift Schemifier** — provisioned and queried via **Doobie** (`DoobieUtil`), not Lift Mapper (which can't model runtime tables). `DoobieUtil.runQuery` reuses Lift's request Connection (transaction unification — perfect for dual-write); `runQueryIO` returns `IO` (perfect for the query backend); both share the Hikari pool. 
**Caveat:** Doobie parameterizes *values* but not *identifiers* — `de_`/`c_` go through `Fragment.const`, so identifier safety rests on our hashing, not Doobie. +2. The in-memory backend can reproduce today's exact behaviour, so Phases 0–1 ship the contract + validation **with zero DDL and zero regression**. + +## Naming constraints (per project feedback) + +- New Mapper subclasses must **not** start with `Mapped`; column objects must **not** be `m`+Uppercase; no `X` suffix on new vendor objects. +- Registry mapper: `DynamicEntityIndex` / provider `DynamicEntityIndexProvider`. Backend objects: `PostgresProjectionBackend`, `InMemoryQueryBackend`, etc. + +--- + +## Phases (each independently shippable) + +### Phase 0 — Contract scaffolding (no storage, no DDL) +**Outcome:** the query grammar, the abstract plan, the backend seam, and the `indexed` field declaration exist; behaviour unchanged. + +- **0.1** Extend definition parsing (`DynamicEntityProvider.scala`) to read optional per-field `indexed` metadata: `indexed: true`, `index: "scalar"|"spatial"` (default `scalar`), optional `path: "a.b"` (nested scalar). Validate at definition time: + - `indexed:true` on a `json` field is **only** allowed with `index:"spatial"` (GeoJSON) — else reject. + - `index:"spatial"` is **only** allowed on a `json` field — else reject. + - `path` only with a scalar type. +- **0.2** Define the abstract query model (new package `code/api/dynamic/entity/query/`): `QueryPlan(filters: List[Filter], sort: List[SortKey], page: Page)`, `Filter(field, op, value)`, operator enum (`eq, ne, in, lt, gt, le, ge, between, like, within, contains, intersects, dwithin`). +- **0.3** Define `DynamicEntityQueryBackend` trait (Shape B seam): `def query(entity, definition, plan): Future[(List[JObject], Long /*total*/)]` + `def provision(definition): Future[Unit]` (no-op default). 
+- **0.4** Implement `InMemoryQueryBackend` that fetches via the existing provider and applies filter/sort/paginate in memory (supersedes `filterDynamicObjects`). This is the portable floor and the test oracle. + +### Phase 1 — Planner + validation + wire-in (still in-memory) +**Outcome:** real filter/sort/pagination on every endpoint, validated, served by the in-memory backend. + +- **1.1** Query-param parser. **Pagination + sort reuse OBP's standard params** (`obp_offset`, `obp_limit`, `obp_sort_by`, `obp_sort_direction`) — reuse the existing `createQueriesByHttpParamsFuture` / `OBPQueryParam` machinery rather than inventing new param names. **Field filtering** uses our own grammar `filter[field][op]=value` (not covered by the standard params). (DE is http4s-native, so params are in the query string.) +- **1.2** The 4-check planner: (a) field declared `indexed`? (b) operator legal for the field's **type** (operator matrix); (c) value coerces to type; (d) sortable type? → clear `400`s naming field+type+operator. Closed allow-list. +- **1.3** **No response envelope, no total count.** List reads keep returning a **bare `JArray`** (no breaking change). Pagination is **offset/limit only** — deliberately *no* `COUNT(*)` / total-pages (offset/limit doesn't need it, and we often can't know the total cheaply). Clients page by advancing `obp_offset`. +- **1.4** Wire planner+backend into `genericGet` / `communityGet` / `publicGet`; delete `filterDynamicObjects`. +- **1.5** Tests: validation 400s (bad field / bad op-for-type / uncoercible value / sort on json), filter/sort/paginate correctness, no-regression on existing DE tests. + +### Phase 2 — Registry + provisioner skeleton + capability detection +**Outcome:** the machinery to manage projection tables exists; not yet wired to writes/reads. 
+ +- **2.1** `DynamicEntityIndex` Mapper (registry): `entityName, fieldName, fieldType, indexKind, safeTableName, safeColumnName, state, backfillCheckpoint, rowCountExpected, coercionErrors, lastError, provisionerVersion`. Add to `ToSchemify`. +- **2.2** Identifier safety: deterministic `de_` / `c_` generation + collision/length handling. +- **2.3** Capability detection at startup: vendor (`DoobieUtil.isSqlServer` / `DBUtil.dbUrl` already exist), **PostGIS extension present?** (`SELECT 1 FROM pg_extension WHERE extname='postgis'`), plus an operator **kill-switch prop** (`dynamic_entity.indexing.backend = auto|inmemory`). Selects the backend once. +- **2.4** Idempotent, resumable DDL runner via **Doobie** (`DoobieUtil.runQueryIO`): `CREATE TABLE IF NOT EXISTS`, `ALTER … ADD COLUMN`, `CREATE INDEX CONCURRENTLY`, reaping `INVALID` indexes. Identifiers via `Fragment.const` on hashed names only. Note: `CONCURRENTLY` must run **outside** a transaction (autocommit connection on the fallback pool). + +### Phase 3 — Postgres projection backend, scalar fields +**Outcome:** Approach A live for scalar indexed fields on Postgres. + +- **3.1** Field-type → column mapping (string→text, number→numeric, integer→bigint, boolean→boolean, DATE_WITH_DAY→date; reference→text). Coerce-or-null on backfill. +- **3.2** Provisioning lifecycle (state machine): provision column → **dual-write on** → batched resumable backfill (`INSERT … ON CONFLICT`) → `CREATE INDEX CONCURRENTLY` → verify count → flip `ready`. +- **3.3** Dual-write hook in `saveOrUpdate` / `delete` via `DoobieUtil.runQuery` (reuses Lift's request Connection → **same transaction** as the blob write, same commit/rollback; FK `ON DELETE CASCADE`). Uses the already-parsed JObject — no DB trigger. 
+- **3.4** `PostgresProjectionBackend.query(plan)` → Doobie `ConnectionIO` (`WHERE`/`ORDER BY`/`LIMIT/OFFSET`; keyset option) against `de_` via `runQueryIO`, returning `data_id`s → hydrate JObjects from canonical blob (blob stays source of truth). Bound params for values; `Fragment.const` for the (hashed) table/column identifiers. +- **3.5** Readiness gating: planner routes to projection only when every touched field is `ready`; otherwise it returns a clear error (409, per Decision 3) rather than falling back to in-memory. Never partial results. +- **3.6** Tests (gated on Postgres; in-memory equivalence as oracle). + +### Phase 4 — Spatial (PostGIS, `geography(4326)`) +**Outcome:** "parcels within 100 km" and within-area predicates. + +- **4.1** `spatial` index kind → `geography(…, 4326)` column via `ST_GeomFromGeoJSON(dataJson->'geom')`; **GiST** index; `ST_MakeValid` on invalid geometries. +- **4.2** Spatial predicates in planner: `dwithin` (metres, → `ST_DWithin`), `within`, `contains`, `intersects`. URL: `filter[geom][dwithin]=,;100000`. Range/sort N/A for spatial. +- **4.3** Capability gate: spatial only when PostGIS detected; else reject (or in-memory JTS floor — decide). +- **4.4** Tests (need PostGIS in CI — see Risks). + +### Phase 5 — Schema evolution, recovery, (optional) SQL Server +- **5.1** Add/remove indexed field; **type change** (add-new-column→swap→retire); field rename (remap+rebackfill); entity delete (drop table). +- **5.2** Rebuild/recovery admin action: drop projection, rebuild from blob. +- **5.3** *(Deferred / optional)* `SqlServerProjectionBackend` (JSON_VALUE/computed columns + SQL Server spatial). Only if a SQL Server deployment needs it. + +--- + +## Decisions + +**Resolved:** +1. **Pagination / response shape — DECIDED.** Keep returning a **bare `JArray`** (no envelope, no breaking change). **Offset/limit only**, reusing OBP's `obp_offset` / `obp_limit` (+ `obp_sort_by` / `obp_sort_direction`). 
**No total count / no page-number** scheme — offset/limit doesn't need it and avoids a `COUNT(*)`. +2. **Non-indexed filter policy — DECIDED.** **Reject with 400** (closed allow-list). No per-query in-memory fallback. +3. **Pending-field query policy — DECIDED.** A query touching a field whose projection is not `ready` (`provisioning`/`backfilling`) **returns a clear error** (e.g. 409 "field not yet queryable; retry shortly"). **Never** an in-memory fallback — predictability over availability, no RAM surprises. +4. **SQL Server — DECIDED: defer.** Build Postgres + in-memory now; add `SqlServerProjectionBackend` (Phase 5.3) only when a real MSSQL deployment needs it. + +**Role of the in-memory backend (clarified):** it is the **portable floor for deployments with no projection backend** (non-Postgres, or DDL/indexing disabled), serving the same contract best-effort and honouring `obp_limit`/`obp_offset`. On a projection-capable deployment it is **not** used as a per-query fallback — unservable queries (non-indexed or pending field) return an error instead. + +## Risks / notes + +- **CI PostGIS**: Phases 3–4 tests need Postgres (and PostGIS for 4). The in-memory backend keeps the bulk of tests vendor-free; gate the projection/spatial tests on capability so they skip cleanly where the extension is absent. +- **Doobie identifier safety**: Doobie binds *values* safely but not *identifiers*. All projection SQL goes through one small helper; table/column names are hashed (`de_`/`c_`) and injected via `Fragment.const`; all user-supplied values are bound params. Never interpolate a raw user string into an identifier. +- **Transaction scope**: dual-write uses `DoobieUtil.runQuery` (reuses Lift's request Connection via transaction unification — see `DoobieTransactor.scala` / `RequestScopeConnection`), so it commits/rolls back with the blob write. 
+- **`CONCURRENTLY` outside a transaction**: `CREATE INDEX CONCURRENTLY` (and reindex) cannot run inside a txn block — provisioner must use an autocommit connection on the fallback pool, not the request connection. +- **ResourceDocs**: the new query params (filter/sort/page) should be documented on the DE endpoints' ResourceDocs. diff --git a/ideas/dynamic_entity_indexing.md b/ideas/dynamic_entity_indexing.md new file mode 100644 index 0000000000..56154e3167 --- /dev/null +++ b/ideas/dynamic_entity_indexing.md @@ -0,0 +1,246 @@ +# Dynamic Entity indexing & querying + +**Status:** Draft / exploration +**Scope:** OBP-API — making Dynamic Entity (DE) list reads filterable, sortable and paginatable, portably across Postgres and Microsoft SQL Server. + +--- + +## Problem + +DE data is stored as a JSON blob (`MappedDynamicData.dataJson`) in a single generic table keyed by `DynamicEntityName`. Reads are **get-by-id** or **get-all** only — there is no field filtering, sorting, pagination, or cross-field querying at the storage layer. Lists are therefore fetched whole and filtered in application memory, which does not scale. + +We want generic, declarative querying — `GET /obp/dynamic-entity/?filter=…&sort=…&page=…` — **without** depending on a specific SQL dialect (OBP runs mostly on Postgres, sometimes on SQL Server), and without per-request RPC to external systems. + +## Principle: separate contract from implementation + +The query **contract** (filter/sort/paginate declared fields) must be identical on every database. Only the **implementation** of how the DB satisfies it varies. DB-native JSON (Postgres `jsonb`, SQL Server `JSON_VALUE`/`OPENJSON`) is therefore an **optional accelerator behind capability detection**, never a requirement. The portable baseline must work with plain relational SQL through the existing Lift Mapper DSL. 
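To make the contract/implementation split concrete, here is a minimal sketch of the vendor-neutral half: turning `filter[field][op]=value` query params into an abstract filter list, validated against a declared allow-list, with no SQL involved. It is illustrative Python rather than OBP's Scala; `parse_filters`, `Filter`, and the per-type operator table `ALLOWED_OPS` are hypothetical, and the table is only one plausible reading of an operator matrix for scalar types.

```python
import re
from typing import NamedTuple
from urllib.parse import parse_qsl


class Filter(NamedTuple):
    field: str
    op: str
    value: str  # still raw text; type coercion is a later planner step


# Matches keys of the form filter[<field>][<op>]
FILTER_KEY = re.compile(r"^filter\[([^\[\]]+)\]\[([^\[\]]+)\]$")

# Hypothetical operator matrix: which operators each scalar type accepts.
_RANGE = {"eq", "ne", "in", "lt", "gt", "le", "ge", "between"}
ALLOWED_OPS = {
    "string": {"eq", "ne", "in", "like"},
    "number": _RANGE,
    "integer": _RANGE,
    "DATE_WITH_DAY": _RANGE,
    "boolean": {"eq", "ne"},
}


def parse_filters(query_string, indexed_fields):
    """indexed_fields maps each declared-indexed field name to its type."""
    filters = []
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        match = FILTER_KEY.match(key)
        if match is None:
            continue  # not a filter param (e.g. pagination); handled elsewhere
        field, op = match.groups()
        if field not in indexed_fields:
            raise ValueError(f"field '{field}' is not declared indexed")
        field_type = indexed_fields[field]
        if op not in ALLOWED_OPS.get(field_type, set()):
            raise ValueError(
                f"operator '{op}' is not valid for field '{field}' "
                f"of type '{field_type}'"
            )
        filters.append(Filter(field, op, value))
    return filters
```

Because the result is a plain list of `(field, op, value)` triples, any backend (projection SQL or in-memory) can consume the same plan, which is what keeps the contract identical across databases.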
+ +## How a generic endpoint knows what to query: a definition-driven planner + +Queryability is **declared in the entity definition**, not guessed. A field is filterable/sortable only if its property schema marks it (e.g. `indexed: true`) with a declared type. The generic DE GET then: + +1. Loads the definition for `entityName`. +2. Reads the set of `indexed` fields, their types, and their storage mapping. +3. Validates incoming `filter`/`sort` params against that allow-list, rejecting (or falling back to in-memory for) anything not declared queryable. +4. Emits portable `WHERE` / `ORDER BY` / `LIMIT/OFFSET` against the mapped storage. + +So the planner reads the schema to translate query params → SQL. Queryability is declared, discoverable and validated. + +## Storage options (where the value physically lands) + +### Option 1 — generic "slot" columns on `MappedDynamicData` +Add a fixed set of typed, indexed slots once: `idx_str_1..k`, `idx_num_1..k`, `idx_ts_1..k`, `idx_bool_1..k`. The definition stores a per-entity **field→slot map** (e.g. `price→idx_num_1`). +- **Write:** the DE write path writes `dataJson` **and** the mapped slot columns in one transaction. +- **Query:** planner rewrites `price < 10` → `idx_num_1 < 10`, using a `(DynamicEntityName, idx_num_1)` composite index. Plain portable SQL. +- **Cost:** cap on #indexed fields per entity; type→slot coercion; slot indexes shared across entities (mitigated by the entity-name-leading composite index). + +### Option 2 — EAV side-index table (current lean; needs careful future consideration) +A side table, e.g. `DynamicDataIndex(entityName, dataId, fieldName, numVal, strVal, tsVal, boolVal)`, one row per indexed field per record, with indexes on `(entityName, fieldName, numVal)` etc. +- **Write:** the DE write path upserts index rows alongside the blob (delete+reinsert or upsert per indexed field). 
+- **Query:** each predicate becomes a join / `EXISTS` against the index table; multiple predicates = multiple joins; sort = join + `ORDER BY`; pagination over the joined result. +- **Pros:** unlimited indexed fields, fully generic, **no DDL** per entity, portable. +- **Cons / open concerns to evaluate:** + - Multi-predicate queries become **join-heavy**; query-planner complexity and performance need careful design (one self-join per filtered field). + - **Pagination + sorting** across multiple joined attributes is the tricky part to get correct and fast. + - Type handling: separate typed value columns vs a single stringified value (affects range/numeric/date comparisons and index usage). + - Index strategy on the EAV table (composite `(entityName, fieldName, )`). + - Write amplification: N index-row upserts per record write. + - Consistency between blob and index rows (same transaction). + +### Option 3 — per-entity generated tables (dynamic DDL) — **CHOSEN (as "Approach A")** +At definition time, `CREATE TABLE de_(...)` with real typed columns, provisioned by **automatic DDL** (see the lifecycle section below). +- **Pros:** cleanest, fastest SQL; best indexes; range/sort/keyset pagination is just ordinary typed SQL with no cast roulette and no joins. +- **Cons (now acceptable):** the original objections were runtime-DDL locking, cross-DB DDL portability, and schema evolution. In our deployment profile these are manageable: **few entities (≈30), some with many rows.** Few entities → no catalog-sprawl cost. The lock blast radius of any DDL is **one entity's table**, not the global blob table, and every heavy operation (backfill, index build, type change) is done **online + batched + resumable** (see lifecycle). Cross-DB DDL is emitted per-backend behind Shape B. **This is the chosen direction.** + +## Optional DB-native JSON accelerator + +Independently of the above, a native-JSON path can be enabled by **capability detection**: +- Detect vendor at startup. 
+- Postgres + jsonb enabled → `jsonb` operators (`->>`, `@>`) + GIN/expression indexes. +- SQL Server → `JSON_VALUE` / `OPENJSON`, ideally persisted computed columns that can be indexed. +- Unknown / disabled → portable path (Option 1/2) or in-memory fallback. + +Same API contract; conservative portable default; accelerated where available. `jsonb` stays opt-in, never on the critical path. (Storing as `jsonb` vs `nvarchar(max)` is itself vendor-specific DDL, so the canonical store should remain portable text and any jsonb use be additive.) + +## Approach A — per-entity projection tables: lifecycle (CHOSEN) + +This is the concrete plan for the chosen direction. It fits under Shape B as a `PerEntityTableBackend` with a `provision(definition)` hook. + +### Canonical store: JSON stays primary, the table is a derived projection + +**Decision: the JSON blob (`MappedDynamicData.dataJson`) remains canonical; `de_` is a rebuildable typed projection of only the *indexed* fields.** We do **not** (yet) make the per-entity tables the primary data source. Reasons: + +- **DEs are schemaless with an *indexed subset*.** The projection only carries fields marked `indexed` — it is not a complete representation of a record, so it cannot be the sole store without either materialising *every* field as a column or adding a JSON overflow column. Keeping the blob canonical sidesteps that. +- **Automatic DDL stays safe because canonical data never moves.** Every operation (create, add field, type change, rename, rebuild) is online and reversible precisely because the source of truth is untouched. The worst-case recovery is always "drop the projection and rebuild from the blob" — never data loss. +- **Smaller blast radius.** `get-by-id` / `get-all` and the generic DE write path keep working against the single blob table unchanged; only the *query* path consults the projection. A DE with **no** indexed fields needs no table and no DDL at all. 
+- **No big-bang migration.** Existing data already lives in the blob; projections are built incrementally per indexed field. + +**When it *would* make sense to flip to tables-as-primary:** only if DEs evolve toward **fully-declared schemas** (every field typed and declared, not a schemaless bag). Then `de_` could hold typed columns for all fields **plus a JSON overflow column** for any undeclared extras, and the blob could be retired. The trade is real: tables-as-primary means **DDL-before-any-write** (you can't store a record until its table exists), schema evolution now mutates canonical data (losing the "reversible because canonical never moves" safety net), and the get-by-id/get-all machinery must become per-entity-table-aware. Until the schema model changes, **canonical JSON + derived projection wins.** Recorded as an open revisit, not a near-term change. + +### Metadata registry + +A small registry table (managed by the provisioner, **outside** Lift's schemifier) drives everything, per entity / per indexed field: `entity_name, field_name (JSON key), field_type, safe_table_name, safe_column_name, state, backfill_checkpoint, row_count_expected, coercion_errors, last_error, provisioner_version`. `safe_table_name` / `safe_column_name` are **generated** (`de_`, `c_`) — raw user-supplied entity/field names never reach a DDL string (injection + identifier-length/collision safety). The registry maps the safe name back to the JSON key. + +### Field-type → column mapping (and the `json` exclusion) + +A DE field's declared `type` is one of `DynamicEntityFieldType`: `number`, `integer`, `boolean`, `string`, `DATE_WITH_DAY`, `json` — plus the reference types (string IDs underneath). 
They fall into two categories: + +**Scalar types → clean typed B-tree columns (indexable):** + +| DE field type | Projection column | Coercion note | +|---|---|---| +| `string` (+ reference types) | `text` / `varchar` | direct | +| `number` | `numeric` / `double precision` | value may arrive as `JDouble` **or** `JInt` — accept both | +| `integer` | `bigint` | from `JInt` | +| `boolean` | `boolean` / `bit` | value may be `JBool` **or** the strings `"true"`/`"false"` — handle both | +| `DATE_WITH_DAY` | `date` | source is a `yyyy-MM-dd` string → cast to `date` | + +These support filter/range/sort/keyset pagination natively. The lifecycle's coerce-or-null policy applies (a non-coercible value → `NULL` + `coercion_errors++`, never fails the backfill). + +**`json` type (JObject / JArray) → NOT eligible for a typed column:** +- A whole object/array is not a scalar, so it cannot become a B-tree-orderable column. **Reject `indexed: true` on a `json`-typed field at definition time** with a clear error — an extension of the declared-indexed allow-list the planner already enforces. +- `json` fields stay in the canonical blob and remain fully readable via `get-by-id` / `get-all`; they are simply not filterable/sortable through the projection. +- Querying *into* a `json` field (array-contains-X, nested-key=Y) is a **containment** query — the GIN/`jsonb` equality case, explicitly **outside** Approach A. Route it to the jsonb accelerator or in-memory fallback under Shape B, never to a `de_` column. + +**Middle case — a scalar at a known nested path (optional extension):** a scalar buried inside a `json` field can be indexed by declaring a **dotted index path with a scalar type** (e.g. `path: "address.city", type: string`), projected to a `text` column from `dataJson #>> '{address,city}'`. Nested *scalars* are reachable this way; nested *objects/arrays as a whole* are not. + +> Only fields marked `indexed` get a column at all. 
Non-indexed scalars (like `json` fields) stay canonical-only in the blob. +> +> **Spatial carve-out:** the "`json` is not indexable" rule has one exception — a `json` field holding **GeoJSON** *is* indexable, via a **`spatial`** index kind (not a B-tree). See the next subsection. + +### Spatial / GIS fields (PostGIS & SQL Server spatial) — primary GIS driver + +**Context: the most important `json` data is geospatial** — e.g. land-parcel geometries — and the headline query is *"find parcels within 100 km of a point"* (and similar within-an-area predicates). Spatial is a **fourth query family**, distinct from equality / range-sort / containment: it needs a **spatial index (R-tree / GiST)**, which neither a B-tree nor a GIN index can serve. This is the **single strongest justification for Approach A**: performant spatial querying *requires* a real geometry-typed column with a spatial index — unobtainable from EAV, slots, jsonb-as-text, or in-memory-at-scale. The canonical GeoJSON stays in the blob; a materialized geometry column is the derived, rebuildable spatial projection. The standard lifecycle (provision column → dual-write → backfill → online index build) applies unchanged, with a geometry type and spatial index instead of a B-tree. + +**A new index *kind*: `spatial`.** A field declared `type: json, index: spatial` (i.e. 
"this json is a geometry") provisions a geometry column: + +| Backend | Column + conversion | Index | Predicates | +|---|---|---|---| +| **Postgres + PostGIS** | `geography(…, 4326)` via `ST_GeomFromGeoJSON(dataJson->'geom')` (native GeoJSON, WGS84) | **GiST** (`CONCURRENTLY` ok) | `ST_DWithin`, `ST_Within`, `ST_Contains`, `ST_Intersects` | +| **SQL Server** | `geography` — **no native GeoJSON parser**; convert GeoJSON→WKT/WKB in app code, then `geography::STGeomFromText(…, 4326)` | spatial index | `.STDistance()`, `.STWithin()`, `.STContains()`, `.STIntersects()` | +| **In-memory floor** | parse GeoJSON with JTS | none (full scan) | JTS predicates — fine for small N, **slow for large parcel sets** | + +**`geography(4326)` is the chosen default** (not `geometry`). Rationale: `geography` measures distance on the Earth's curved surface and returns **real metres**, so "within 100 km" is literally `100000` — no conversion. `geometry` treats lon/lat as planar degrees (a degree of longitude is ~111 km at the equator, ~0 at the poles), which makes radius queries wrong and latitude-distorted. SRID **4326 = WGS84** matches GeoJSON's default coordinate system, so source data needs no reprojection. + +**The headline query:** +```sql +-- parcels (polygons) within 100 km of a home point +SELECT * FROM de_parcel +WHERE ST_DWithin( + geom, -- geography(Polygon, 4326) + ST_MakePoint(home_lon, home_lat)::geography, -- home point + 100000 -- metres + ); +``` +- `ST_DWithin` works polygon-to-point (nearest edge within 100 km) and is **index-accelerated** via GiST. Always use `ST_DWithin` for radius queries — a naïve `ST_Distance(...) < 100000` does **not** use the index. + +**A new operator class: spatial predicates** (`within`, `contains`, `intersects`, `dwithin`), declared in the contract and validated by the planner. `range`/`sort` stay N/A for spatial fields; spatial fields accept only spatial predicates. 
Same Shape-B parity rule: each backend satisfies the predicate its own way, in-memory as the universal floor. +``` +GET /obp/dynamic-entity/parcel?filter[geom][dwithin]=,;100000 +GET /obp/dynamic-entity/parcel?filter[geom][within]= +``` + +**Decisions this surfaces:** +- **PostGIS is an *extension*, not core Postgres.** Capability detection must check "postgres **and** PostGIS installed" (`CREATE EXTENSION postgis`) — finer-grained than the jsonb switch. +- **GeoJSON→geometry conversion is vendor-specific** and lives inside the backend (Postgres native; SQL Server needs an app-side GeoJSON→WKT step). Canonical store stays GeoJSON. +- **Geometry validity** is the coerce-or-null analog: invalid/self-intersecting polygons → `ST_MakeValid`, or null-with-error-count at backfill time. +- **Escape hatch for planar analytics:** because Approach A materializes columns, a *second* projected `geometry` column (e.g. local UTM SRID) can be added alongside the `geography` one when precise **area/overlap** math is needed — distance uses geography, area uses geometry, both indexed, same GeoJSON source. +- **Honest floor caveat:** on a non-PostGIS, non-SQL-Server deployment, "parcels within an area" degrades to a slow in-memory full scan. Acceptable for a rarely-hit fallback; worth stating. + +### State machine (per indexed field) + +``` + (field marked indexed) +none ─────────────────────────▶ provisioning ──▶ backfilling ──▶ verifying ──▶ ready + │ │ │ │ + └────────────────┴──────────────┴────▶ failed (retryable) +ready ──(field unmarked / removed)──▶ retiring ──▶ none +ready ──(type / rename change)──────▶ rebuilding ──▶ backfilling ──▶ … ──▶ ready +``` + +**Query gating:** the planner routes a `filter`/`sort` to the projection only if **every** field it touches is `ready`. If any is `provisioning`/`backfilling`, config policy decides: reject with "field not yet queryable, retry shortly", or in-memory fallback with a documented perf caveat. 
Never serve partial/wrong results from a half-built column. + +### Lifecycle events + +**1. Entity created / first field marked indexed** — ordered, every step idempotent and resumable: +1. Register field as `provisioning`. +2. `CREATE TABLE IF NOT EXISTS de_` with `data_id` PK + FK to the blob row (`ON DELETE CASCADE`) and one typed column per indexed field. New table = empty = instant DDL, no lock concern. +3. **Turn on dual-write now, before backfill** — the app write path begins upserting into the projection on every create/update. This closes the race: rows written *during* backfill land via dual-write, and backfill upserts, so the two converge. +4. **Backfill historical rows in batches** — walk existing blob rows by PK range, decompose JSON, `INSERT … ON CONFLICT DO UPDATE` (PG) / `MERGE` (SQL Server) in bounded, throttled chunks, persisting `backfill_checkpoint` after each batch (resumable). This is the only step that scales with row count. +5. **Build indexes after backfill, online**: `CREATE INDEX CONCURRENTLY` (PG) / `WITH (ONLINE = ON)` (SQL Server Enterprise). Indexing post-load is far faster and non-blocking. Reap a failed PG `CONCURRENTLY` (`INVALID` index) and retry. +6. **Verify**: projection row count == blob row count for the entity. Mismatch → re-scan or `failed` with reason. +7. **Flip to `ready`** in a single metadata update. Only now does the planner use the projection. + +A non-coercible value (e.g. `"price":"free"`) stores `NULL` for that cell and increments `coercion_errors` — one junk row never fails a large backfill. + +**2. Steady-state writes** — write path is unchanged in contract (still writes the canonical blob); in the **same transaction**, using the already-parsed object, it upserts the typed columns into `de_` by `data_id` (so the projection can't diverge from a committed blob write). Do this in app code, **not** a JSON-parsing DB trigger. Deletes cascade via the FK. Cost: one extra single-row upsert per write — negligible. 
+ +**3. Add an indexed field to an existing entity** — `ALTER TABLE … ADD COLUMN c_ NULL` (nullable, no default → metadata-only / instant on PG 11+ and SQL Server) → dual-write the new column → batched backfill of just that column → `CREATE INDEX CONCURRENTLY/ONLINE` → verify → `ready`. Lock blast radius is one entity's table. + +**4. Unmark / remove an indexed field** — `retiring`: drop the index, then drop the column (lazily — an unused nullable column is harmless and the `DROP COLUMN` can wait for a maintenance window). The JSON key in the blob is untouched. + +**5. Field type change** — never `ALTER TYPE` in place. Add a new column of the new type → backfill + index → atomically flip `safe_column_name` old→new in the registry → retire old column lazily. Online and reversible because canonical data never moved. + +**6. Field rename (JSON key change)** — the column is keyed by `safe_column_name` mapped to the JSON key in the registry; update the mapping and re-backfill the column from the new key. (Migrating old rows still carrying the old key is a separate DE-data concern.) + +**7. Entity deleted** — `DROP TABLE de_` (optionally after a soft-retire grace period); remove registry rows. + +**8. Rebuild / recovery (safety net)** — one admin action: drop the projection table and re-run the create+backfill lifecycle from the canonical blob. Nothing is lost because the blob is the source of truth — this is what makes automatic DDL operationally safe. + +### What "automatic DDL" means here + +- The provisioner **emits and runs** `CREATE TABLE` / `ALTER` / `CREATE INDEX` itself, triggered by DE-definition changes — no human-authored migration. +- All DDL is **idempotent** (`IF NOT EXISTS`, registry-guarded) and **resumable** (checkpointed) — a process restart mid-provision is safe. +- All DDL uses **generated, sanitised identifiers** — user strings never reach raw DDL. 
+- Gated by **capability detection + an operator kill-switch prop**; the backend is selected at startup (Shape B). Online/concurrent builds keep it non-blocking on large tables — the only place the "many rows" reality bites. + +## Query backend abstraction (Shape B) + +The storage options above and the native-JSON accelerator are **implementations**, not interfaces. They sit behind a single swappable seam so the public API never leaks which database (or which storage option) is in use. + +**One contract, many backends.** There is exactly **one** endpoint and **one** filter/sort/paginate grammar, identical on every database: + +``` +GET /obp/dynamic-entity/?filter[price][lt]=10&sort=-created&page=2&per_page=20 +``` + +Clients never see vendor syntax — no `->>`, `@>`, `JSON_VALUE`, joins, or slot column names appear in any URL, ResourceDoc, Swagger entry, or response. The request is parsed into an abstract query **plan** (validated against the entity's declared `indexed` fields), and only the final compile step is vendor-specific: + +``` +parse request → planner builds abstract query plan ← vendor-independent + (validated against declared `indexed` fields) + │ + ▼ + QueryBackend.compile(plan) ← the ONLY vendor-specific part + ├── PostgresJsonbBackend → ->> / @> / GIN + B-tree expr indexes + ├── SqlServerJsonBackend → JSON_VALUE / OPENJSON / persisted computed cols + ├── PortableRelationalBackend → slot columns (Opt 1) or EAV (Opt 2), Lift Mapper DSL + └── InMemoryFallbackBackend → fetch + filter in app (last resort) +``` + +**Selection is server config, not per-request.** Capability detection picks the backend once at startup from the detected vendor, with an operator override prop (so the portable path can be forced even on Postgres). A client can never choose the backend. + +**This is the load-bearing decision — independent of jsonb.** Adopting the backend seam does not commit us to building `PostgresJsonbBackend`. 
jsonb is just one backend that plugs in later, behind capability detection, *if* measured equality-filter performance justifies it — with no change to the endpoint contract, docs, or tests. Sequence: (1) commit to the backend seam + declared-`indexed` contract now; (2) build the portable backend first (Opt 1 or 2); (3) add the jsonb accelerator later only if needed. + +**Contract parity, not just code parity.** The same `filter`/`sort` must return the **same results** on every backend, or this becomes vendor-coupling by the back door. Guard rails: +- The declared `indexed` allow-list **is** the contract — only fields the definition marks queryable are filterable/sortable, identical on every backend (no backend "accidentally" supports a field another can't). +- **One conformance test suite**, run against every backend (Postgres, SQL Server, portable, in-memory), asserting identical responses for identical requests. +- Watch the known drift points: type coercion + null/missing-key handling, text-sort collation/case-sensitivity, and numeric/date cast edge cases (see the range/sort notes above). + +## Cross-cutting costs (any option) + +- **Backfill:** marking a field `indexed` on an entity that already has rows requires a one-time backfill from existing `dataJson`. +- **Definition changes:** adding/removing an indexed field means provisioning/freeing storage (slot/EAV rows) + backfill or tombstone. +- **Non-indexed filters:** reject with a clear error, or fall back to in-memory with a documented perf caveat — never silently allow arbitrary-field filtering. +- **Write-path consistency:** index storage must be updated in the same transaction as the blob. + +## Recommendation / next steps + +- **Chosen: Approach A — per-entity projection tables with automatic DDL** (Option 3 revisited). Justified by the deployment profile (≈30 entities, some with many rows): no catalog sprawl, per-entity lock isolation, and online/batched/resumable provisioning. 
See the lifecycle section. +- **Canonical JSON stays primary; the per-entity table is a derived, rebuildable projection of indexed fields.** Tables-as-primary is an open revisit, only attractive if DEs move to fully-declared schemas. +- Everything sits behind the **Shape B** backend seam, so the portable (slots/EAV) and in-memory paths remain available as fallbacks where DDL is disabled or the vendor is unknown; `jsonb`/SQL-Server-JSON are still possible accelerators but are no longer the primary plan. +- Drive everything from declared `indexed` fields in the entity definition via the query planner above. + +## Relationship to field-level permissions + +This is **independent** of the field-level read/write permissions + per-field provenance work — that governs *who can read/write* fields; this governs *querying lists*. They compose: a field can be both permission-restricted and indexed. diff --git a/obp-api/src/main/scala/bootstrap/liftweb/Boot.scala b/obp-api/src/main/scala/bootstrap/liftweb/Boot.scala index f8c8be873c..30b508c139 100644 --- a/obp-api/src/main/scala/bootstrap/liftweb/Boot.scala +++ b/obp-api/src/main/scala/bootstrap/liftweb/Boot.scala @@ -946,6 +946,7 @@ object ToSchemify extends MdcLoggable { WebUiProps, DynamicEntity, DynamicData, + code.api.dynamic.entity.projection.DynamicEntityIndex, DynamicEndpoint, AccountIdMapping, DirectDebit, diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/Http4sDynamicEntity.scala b/obp-api/src/main/scala/code/api/dynamic/entity/Http4sDynamicEntity.scala index a67c2a8850..ed610dfa82 100644 --- a/obp-api/src/main/scala/code/api/dynamic/entity/Http4sDynamicEntity.scala +++ b/obp-api/src/main/scala/code/api/dynamic/entity/Http4sDynamicEntity.scala @@ -30,6 +30,9 @@ import cats.effect.IO import code.DynamicData.{DynamicData, DynamicDataProvider} import code.api.Constant.PARAM_LOCALE import code.api.dynamic.entity.helper.{CommunityEntityName, DynamicEntityHelper, DynamicEntityInfo, EntityName, PublicEntityName} 
+import code.api.dynamic.entity.query.{FieldSpec, InMemoryQueryExecutor, QueryParamParser, QueryPlan, QueryPlanner} +import code.api.dynamic.entity.projection.{IndexingCapabilities, PostgresProjectionBackend, ProjectionProvisioner} +import cats.effect.unsafe.implicits.{global => ioRuntime} // aliased: avoids clashing with the EC `global` imported below import code.api.util.APIUtil._ import code.api.util.ErrorMessages._ import code.api.util.http4s.Http4sRequestAttributes.{EndpointHelpers, RequestOps} @@ -99,22 +102,46 @@ object Http4sDynamicEntity extends MdcLoggable { box.openOrThrowException("impossible error") } + // ----- DE_indexing: declarative filter / sort / paginate for GET-all list reads ----- + // Replaces the old bare-param equality filter. The query contract is `obp_filter[FIELD]=OP:VALUE` + // (+ obp_sort_by / obp_sort_direction / obp_offset / obp_limit), validated against the entity's + // declared `indexed` fields. Phase 1 backend = in-memory portable floor (projection backend later). + + private def deIndexedFields(bankId: Option[String], entityName: String): Map[String, FieldSpec] = + DynamicEntityHelper.definitionsMap.get((bankId, entityName)).map(_.indexedFields).getOrElse(Map.empty) + + /** Parse + validate list-read query params into a QueryPlan; fail 400 (clear message) on any error. */ + private def buildQueryPlan(req: Request[IO], bankId: Option[String], entityName: String, cc: Option[CallContext]): Future[QueryPlan] = { + val planned = for { + parsed <- QueryParamParser.parse(queryParams(req)) + plan <- QueryPlanner.plan(parsed._1, parsed._2, parsed._3, deIndexedFields(bankId, entityName)) + } yield plan + planned match { + case Right(plan) => Future.successful(plan) + case Left(err) => Helper.booleanToFuture(err.message, 400, cc = cc) { false }.map(_ => QueryPlan.empty) + } + } + + /** Apply a validated plan to the fetched records (Phase 1: in-memory; the SQL backend comes later). 
*/ + private def applyQueryPlan(resultList: JArray, plan: QueryPlan, indexed: Map[String, FieldSpec]): JArray = { + val records = resultList.arr.collect { case o: JObject => o } + val fieldTypes = indexed.mapValues(_.fieldType).toMap + JArray(InMemoryQueryExecutor.execute(records, plan, fieldTypes)) + } + /** - * http4s equivalent of the Lift `filterDynamicObjects(resultList, req)`: filter GET-all - * results by query parameters (AND across keys, OR across a key's values), excluding the - * `locale` (PARAM_LOCALE) param. Lift read `req.params`; here we read the http4s query - * multiParams (same `Map[String, List[String]]` shape). + * Legacy GET-all filter (documented, unversioned contract — preserved byte-for-byte): bare query + * params filter by field value, AND across distinct fields, OR across a field's repeated values, + * e.g. `?name=A&number=1&number=2` => name==A && (number==1 || number==2). Equality only; runs + * in-memory on any field. `locale` and the new `obp_*` params (which are handled by the QueryPlan + * path) are excluded. This composes with the new path: legacy filter first, then operators/sort/page. 
*/ private def filterDynamicObjects(resultList: JArray, params: Map[String, List[String]]): JArray = { - if (params.isEmpty) resultList - else { - val filtered = resultList.arr.filter { jValue => - params.filter(_._1 != PARAM_LOCALE).forall { case (path, values) => - values.exists(JsonUtils.isFieldEquals(jValue, path, _)) - } - } - JArray(filtered) - } + val legacyParams = params.filter { case (k, _) => k != PARAM_LOCALE && !k.startsWith("obp_") } + if (legacyParams.isEmpty) resultList + else JArray(resultList.arr.filter { jValue => + legacyParams.forall { case (path, values) => values.exists(JsonUtils.isFieldEquals(jValue, path, _)) } + }) } private def queryParams(req: Request[IO]): Map[String, List[String]] = @@ -226,6 +253,31 @@ object Http4sDynamicEntity extends MdcLoggable { if (omit.isEmpty) value else omitFields(value, omit) } + // ----- DE_indexing: read-path backend selection (projection vs in-memory) ----- + // Phase 3: applied to the authenticated genericGet only. Public/community stay in-memory for now + // (different scoping). Projection is used only for pure obp_filter/sort queries (no legacy bare + // params) whose every field is indexed AND ready; an indexed-but-not-ready field returns 409. 
+ private sealed trait ProjDecision + private case object UseProjection extends ProjDecision + private case object PendingProjection extends ProjDecision + private case object UseInMemory extends ProjDecision + + private def legacyParamsPresent(req: Request[IO]): Boolean = + queryParams(req).keys.exists(k => k != PARAM_LOCALE && !k.startsWith("obp_")) + + private def planFields(plan: QueryPlan): List[String] = + (plan.filters.map(_.field) ++ plan.sort.map(_.field)).distinct + + private def decideProjection(req: Request[IO], bankId: Option[String], entityName: String, plan: QueryPlan): ProjDecision = + if (!IndexingCapabilities.projectionEnabled || legacyParamsPresent(req) || planFields(plan).isEmpty) UseInMemory + else { + val ready = ProjectionProvisioner.readyFields(bankId, entityName) + if (planFields(plan).forall(ready.contains)) UseProjection else PendingProjection + } + + private def projectionList(entityName: String, bankId: Option[String], userId: Option[String], isPersonalEntity: Boolean, plan: QueryPlan): Future[JArray] = + PostgresProjectionBackend.query(entityName, bankId, userId, isPersonalEntity, plan).map(JArray(_)).unsafeToFuture()(ioRuntime) + // ----- generic endpoint (authenticated, system / bank / personal) ----- private def genericGet(req: Request[IO], bankId: Option[String], entityName: String, id: String, isPersonalEntity: Boolean): IO[Response[IO]] = @@ -242,12 +294,24 @@ object Http4sDynamicEntity extends MdcLoggable { _ <- if (isPersonalEntity && !personalRequiresRole) Future.successful(true) else NewStyle.function.hasEntitlement(bankId.getOrElse(""), u.userId, DynamicEntityInfo.canGetRole(entityName, bankId), callContext) _ <- failIf(afterIntercept(callContext, operationId), callContext) - (box, _) <- NewStyle.function.invokeDynamicConnector(operation, entityName, None, Option(id).filter(StringUtils.isNotBlank), bankId, None, Some(u.userId), isPersonalEntity, Some(cc)) - _ <- Helper.booleanToFuture(notFoundMsg(entityName, id, bankId), 
404, cc = callContext) { box.isDefined } + queryPlan <- if (isGetAll) buildQueryPlan(req, bankId, entityName, callContext) else Future.successful(QueryPlan.empty) + decision = if (isGetAll) decideProjection(req, bankId, entityName, queryPlan) else UseInMemory + _ <- if (decision == PendingProjection) Helper.booleanToFuture(DynamicEntityFieldNotYetQueryable, 409, cc = callContext) { false } + else Future.successful(true) + // Projection path: serve the list from SQL, skipping the fetch-all connector call. + projList <- if (decision == UseProjection) projectionList(entityName, bankId, Some(u.userId), isPersonalEntity, queryPlan).map(Option(_)) + else Future.successful(Option.empty[JArray]) + (box, _) <- if (decision == UseProjection) Future.successful((net.liftweb.common.Empty: Box[JValue], callContext)) + else NewStyle.function.invokeDynamicConnector(operation, entityName, None, Option(id).filter(StringUtils.isNotBlank), bankId, None, Some(u.userId), isPersonalEntity, Some(cc)) + _ <- if (decision == UseProjection) Future.successful(true) + else Helper.booleanToFuture(notFoundMsg(entityName, id, bankId), 404, cc = callContext) { box.isDefined } } yield { if (isGetAll) { - val resultList: JArray = unboxResult(box.asInstanceOf[Box[JArray]], entityName) - val filtered = filterDynamicObjects(resultList, queryParams(req)) + val filtered: JArray = projList.getOrElse { + val resultList: JArray = unboxResult(box.asInstanceOf[Box[JArray]], entityName) + val legacyFiltered = filterDynamicObjects(resultList, queryParams(req)) + applyQueryPlan(legacyFiltered, queryPlan, deIndexedFields(bankId, entityName)) + } wrapBankId(bankId, (listName(entityName) -> applyReadRestrictions(filtered, bankId, entityName, Some(u.userId)))) } else { val singleObject: JValue = unboxResult(box.asInstanceOf[Box[JValue]], entityName) @@ -357,12 +421,14 @@ object Http4sDynamicEntity extends MdcLoggable { _ <- failIf(beforeIntercept(callContext0, operationId), Some(callContext0)) (_, callContext) <- 
anonymousAccess(callContext0) (_, callContext) <- bankCheck(bankId, callContext) + queryPlan <- if (isGetAll) buildQueryPlan(req, bankId, entityName, callContext) else Future.successful(QueryPlan.empty) (box, _) <- NewStyle.function.invokeDynamicConnector(operation, entityName, None, Option(id).filter(StringUtils.isNotBlank), bankId, None, None, false, Some(cc)) _ <- Helper.booleanToFuture(notFoundMsg(entityName, id, bankId), 404, cc = callContext) { box.isDefined } } yield { if (isGetAll) { val resultList: JArray = unboxResult(box.asInstanceOf[Box[JArray]], entityName) - val filtered = filterDynamicObjects(resultList, queryParams(req)) + val legacyFiltered = filterDynamicObjects(resultList, queryParams(req)) + val filtered = applyQueryPlan(legacyFiltered, queryPlan, deIndexedFields(bankId, entityName)) wrapBankId(bankId, (listName(entityName) -> applyReadRestrictions(filtered, bankId, entityName, None))) } else { val singleObject: JValue = unboxResult(box.asInstanceOf[Box[JValue]], entityName) @@ -385,11 +451,13 @@ object Http4sDynamicEntity extends MdcLoggable { (_, callContext) <- bankCheck(bankId, callContext) _ <- NewStyle.function.hasEntitlement(bankId.getOrElse(""), u.userId, DynamicEntityInfo.canGetRole(entityName, bankId), callContext) _ <- failIf(afterIntercept(callContext, operationId), callContext) + queryPlan <- if (isGetAll) buildQueryPlan(req, bankId, entityName, callContext) else Future.successful(QueryPlan.empty) } yield { if (isGetAll) { val resultList: List[JObject] = DynamicDataProvider.connectorMethodProvider.vend.getAllDataJsonCommunity(bankId, entityName) val resultArray = JArray(resultList) - val filtered = filterDynamicObjects(resultArray, queryParams(req)) + val legacyFiltered = filterDynamicObjects(resultArray, queryParams(req)) + val filtered = applyQueryPlan(legacyFiltered, queryPlan, deIndexedFields(bankId, entityName)) wrapBankId(bankId, (listName(entityName) -> applyReadRestrictions(filtered, bankId, entityName, Some(u.userId)))) } 
else { val singleResult = DynamicDataProvider.connectorMethodProvider.vend.getCommunity(bankId, entityName, id) diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/helper/DynamicEntityHelper.scala b/obp-api/src/main/scala/code/api/dynamic/entity/helper/DynamicEntityHelper.scala index ce8910b007..d30c31ab19 100644 --- a/obp-api/src/main/scala/code/api/dynamic/entity/helper/DynamicEntityHelper.scala +++ b/obp-api/src/main/scala/code/api/dynamic/entity/helper/DynamicEntityHelper.scala @@ -724,6 +724,7 @@ object DynamicEntityHelper { case class DynamicEntityInfo(definition: String, entityName: String, bankId: Option[String], hasPersonalEntity: Boolean, hasPublicAccess: Boolean = false, hasCommunityAccess: Boolean = false, personalRequiresRole: Boolean = false) { import net.liftweb.json + import code.api.dynamic.entity.query.FieldSpec val subEntities: List[DynamicEntityInfo] = Nil @@ -858,6 +859,21 @@ case class DynamicEntityInfo(definition: String, entityName: String, bankId: Opt case props: JObject => props.obj.map(_.name) case _ => Nil } + + /** + * Fields declared `indexed` (DE_indexing): name -> (declared type, index kind "scalar"|"spatial"). + * Only recognised DynamicEntityFieldType fields are surfaced; this is the queryable allow-list the + * planner validates against. (Reference-typed indexed fields are not yet supported.) 
+ */ + lazy val indexedFields: Map[String, FieldSpec] = (entity \ "properties") match { + case props: JObject => props.obj.collect { + case JField(name, propDef: JObject) if (propDef \ "indexed") == JBool(true) => + val typeName = (propDef \ "type") match { case JString(s) => s; case _ => "" } + val kind = (propDef \ "index") match { case JString(s) => s; case _ => "scalar" } + DynamicEntityFieldType.withNameOption(typeName).map(ft => name -> FieldSpec(ft, kind)) + }.flatten.toMap + case _ => Map.empty + } } object DynamicEntityInfo { diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/DynamicEntityIndex.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/DynamicEntityIndex.scala new file mode 100644 index 0000000000..226996a74a --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/DynamicEntityIndex.scala @@ -0,0 +1,44 @@ +package code.api.dynamic.entity.projection + +import net.liftweb.mapper._ + +/** + * Registry of per-entity projection state (DE_indexing, Approach A). One row per declared `indexed` + * field, recording the provisioning state machine, the safe (hashed) table/column identifiers, and + * backfill bookkeeping. Managed by the provisioner via Doobie DDL — the projection *tables* live + * outside Lift Schemifier, but this registry itself is a normal Schemifier-managed table. + * + * Naming follows project convention: no `Mapped` prefix, columns are plain Capitalised objects. 
+ */ +class DynamicEntityIndex extends LongKeyedMapper[DynamicEntityIndex] with IdPK { + def getSingleton = DynamicEntityIndex + + object EntityName extends MappedString(this, 255) + object BankId extends MappedString(this, 255) // "" for system-level entities + object FieldName extends MappedString(this, 255) + object FieldType extends MappedString(this, 64) // DynamicEntityFieldType name + object IndexKind extends MappedString(this, 32) // "scalar" | "spatial" + object SafeTableName extends MappedString(this, 128) + object SafeColumnName extends MappedString(this, 128) + object State extends MappedString(this, 32) // provisioning|backfilling|verifying|ready|failed|retiring|rebuilding + object BackfillCheckpoint extends MappedString(this, 255) // resumable cursor (last PK processed) + object RowCountExpected extends MappedLong(this) + object CoercionErrors extends MappedLong(this) + object LastError extends MappedText(this) + object ProvisionerVersion extends MappedInt(this) +} + +object DynamicEntityIndex extends DynamicEntityIndex with LongKeyedMetaMapper[DynamicEntityIndex] { + override def dbIndexes = Index(EntityName, BankId, FieldName) :: super.dbIndexes +} + +/** Provisioning state machine states (see DE_indexing_plan.md). 
*/ +object ProjectionState { + val Provisioning = "provisioning" + val Backfilling = "backfilling" + val Verifying = "verifying" + val Ready = "ready" + val Failed = "failed" + val Retiring = "retiring" + val Rebuilding = "rebuilding" +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/IndexingCapabilities.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/IndexingCapabilities.scala new file mode 100644 index 0000000000..932df66d27 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/IndexingCapabilities.scala @@ -0,0 +1,49 @@ +package code.api.dynamic.entity.projection + +import code.api.util.{APIUtil, DBUtil, DoobieUtil} +import code.util.Helper.MdcLoggable +import doobie._ +import doobie.implicits._ + +import scala.util.Try + +/** + * Startup capability detection for the DE projection backend (Shape B selection). + * + * Decides, once, which backend can run: the SQL projection (Approach A) is only used when the + * vendor supports it AND an operator opts in via prop; otherwise the in-memory portable floor is + * used. Spatial requires the PostGIS *extension*, not merely Postgres — a finer check than the + * generic JSON accelerator. + */ +object IndexingCapabilities extends MdcLoggable { + + sealed trait Vendor + case object Postgres extends Vendor + case object SqlServer extends Vendor + case object OtherVendor extends Vendor + + lazy val vendor: Vendor = + if (Try(DoobieUtil.isSqlServer).getOrElse(false)) SqlServer + else if (Try(DBUtil.dbUrl).getOrElse("").toLowerCase.contains("postgresql")) Postgres + else OtherVendor + + /** True only on Postgres with the PostGIS extension installed. Probed once; failures => false. 
*/ + lazy val postgisAvailable: Boolean = vendor match { + case Postgres => + Try { + DoobieUtil.runQuery(sql"SELECT 1 FROM pg_extension WHERE extname = 'postgis'".query[Int].option).isDefined + }.getOrElse(false) + case _ => false + } + + /** + * Operator kill-switch `dynamic_entity.indexing.backend`: + * - "inmemory" (default) -> always the in-memory portable floor (no projection / DDL) + * - "auto" -> use the SQL projection backend where the vendor supports it + */ + def backendMode: String = APIUtil.getPropsValue("dynamic_entity.indexing.backend", "inmemory") + + /** Whether the SQL projection backend (and its automatic DDL) is enabled for this deployment. */ + def projectionEnabled: Boolean = + backendMode.equalsIgnoreCase("auto") && (vendor == Postgres || vendor == SqlServer) +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/PostgresProjectionBackend.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/PostgresProjectionBackend.scala new file mode 100644 index 0000000000..2c983e5956 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/PostgresProjectionBackend.scala @@ -0,0 +1,50 @@ +package code.api.dynamic.entity.projection + +import cats.effect.IO +import code.api.dynamic.entity.helper.DynamicEntityHelper +import code.api.dynamic.entity.query.{DynamicEntityQueryBackend, QueryPlan} +import doobie._ +import doobie.implicits._ +import net.liftweb.json.JsonAST.JObject + +/** + * Approach A query backend (DE_indexing, Phase 3): serves a validated [[QueryPlan]] from the entity's + * projection table, joined back to the canonical blob table for access-scoping and to return the + * canonical record. One SQL statement does filter + sort + paginate (index-accelerated) and yields + * the blobs already in order — no second hydration round-trip. + * + * SELECT d.dataJson + * FROM p JOIN d ON d.id = p.data_id + * WHERE [AND ] + * [ORDER BY

] [LIMIT ?] [OFFSET ?] + * + * Scalar operators only (spatial = Phase 4). The caller (backend selection) guarantees every queried + * field is `indexed` and `ready` before routing here. + */ +object PostgresProjectionBackend extends DynamicEntityQueryBackend { + + def name: String = "postgres-projection" + + def query(entityName: String, bankId: Option[String], userId: Option[String], isPersonalEntity: Boolean, plan: QueryPlan): IO[List[JObject]] = { + val indexed = DynamicEntityHelper.definitionsMap.get((bankId, entityName)).map(_.indexedFields).getOrElse(Map.empty) + val safeTable = ProjectionNaming.tableName(bankId, entityName) + val P = "p"; val D = "d" + def columnOf(f: String): Option[String] = indexed.get(f).map(_ => s"$P." + ProjectionNaming.columnName(f)) + def sqlTypeOf(f: String): Option[String] = indexed.get(f).map(s => ProjectionDDL.sqlColumnType(s.fieldType.toString)) + + (ProjectionSql.predicates(plan, columnOf, sqlTypeOf), ProjectionSql.orderBy(plan, columnOf)) match { + case (Some(preds), Some(ords)) => + val scope = ProjectionStore.scope(bankId, entityName, isPersonalEntity, userId, D) + val whereAll = if (preds == Fragment.empty) fr"WHERE" ++ scope else fr"WHERE" ++ scope ++ fr"AND" ++ preds + val q = + fr"SELECT" ++ Fragment.const(s"$D.${ProjectionStore.jsonColumn}") ++ + fr"FROM" ++ Fragment.const(s"$safeTable $P") ++ + fr"JOIN" ++ Fragment.const(s"${ProjectionStore.blobTable} $D") ++ + fr"ON" ++ Fragment.const(s"$D.${ProjectionStore.idColumn} = $P.data_id") ++ + whereAll ++ ords ++ ProjectionSql.limitOffset(plan) + ProjectionDb.run(q.query[String].to[List]).map(_.map(s => net.liftweb.json.parse(s).asInstanceOf[JObject])) + case _ => + IO.raiseError(new RuntimeException(s"PostgresProjectionBackend: unresolved field in query plan for $entityName")) + } + } +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionCoerce.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionCoerce.scala new file 
mode 100644 index 0000000000..2b51062f10 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionCoerce.scala @@ -0,0 +1,37 @@ +package code.api.dynamic.entity.projection + +import com.openbankproject.commons.model.enums.DynamicEntityFieldType +import net.liftweb.json.JsonAST._ + +import java.time.LocalDate +import scala.util.Try + +/** + * Coerce a record's JSON field value to the text form bound into its typed projection column + * (DE_indexing). Returns `None` for missing / not-coercible values → the column is stored as NULL + * (coerce-or-null: one bad value never aborts a backfill or a write). Mirrors the planner's + * coercion rules so the SQL projection and the in-memory executor agree on what's queryable. + */ +object ProjectionCoerce { + + def toColumnValue(jv: JValue, ft: DynamicEntityFieldType): Option[String] = { + import DynamicEntityFieldType._ + def keepIf(cond: Boolean, out: String): Option[String] = if (cond) Some(out) else None + asText(jv).flatMap { v => + val t = v.trim + if (ft == number) keepIf(Try(BigDecimal(t)).isSuccess, t) + else if (ft == integer) keepIf(Try(BigInt(t)).isSuccess, t) + else if (ft == boolean) keepIf(t.equalsIgnoreCase("true") || t.equalsIgnoreCase("false"), t.toLowerCase) + else if (ft == DATE_WITH_DAY) keepIf(Try(LocalDate.parse(t)).isSuccess, t) + else Some(v) // string + reference types: store as-is + } + } + + private def asText(jv: JValue): Option[String] = jv match { + case JString(s) => Some(s) + case JInt(i) => Some(i.toString) + case JDouble(d) => Some(d.toString) + case JBool(b) => Some(b.toString) + case _ => None // JNothing / JNull / JObject / JArray → NULL + } +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDDL.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDDL.scala new file mode 100644 index 0000000000..24bd16b2ab --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDDL.scala 
@@ -0,0 +1,55 @@ +package code.api.dynamic.entity.projection + +import cats.effect.IO +import code.util.Helper.MdcLoggable +import doobie._ +import doobie.implicits._ + +/** + * Idempotent DDL for per-entity projection tables (Approach A), executed via Doobie. + * + * Phase 2 skeleton: builds and runs the statements; not yet invoked by definition changes (that + * wiring + backfill + dual-write is Phase 3). Identifiers come from [[ProjectionNaming]] (hashed / + * sanitised) and are injected via `Fragment.const` — Doobie binds *values* but not *identifiers*, + * so identifier safety rests on ProjectionNaming, never on raw user strings. + * + * Postgres-first (SQL Server backend deferred — Phase 5.3). `CREATE INDEX CONCURRENTLY` must run + * outside a transaction; `DoobieUtil.runQueryIO` uses the autocommit fallback pool, not the request + * connection. + */ +object ProjectionDDL extends MdcLoggable { + + /** Map a DE scalar field type to a portable SQL column type. (Spatial handled separately in Phase 4.) */ + def sqlColumnType(fieldType: String): String = fieldType match { + case "number" => "numeric" + case "integer" => "bigint" + case "boolean" => "boolean" + case "DATE_WITH_DAY" => "date" + case _ => "text" // string + reference types + } + + /** CREATE TABLE IF NOT EXISTS de_(data_id varchar primary key). */ + def createTableIO(safeTable: String): IO[Int] = + run(s"CREATE TABLE IF NOT EXISTS $safeTable (data_id varchar(255) PRIMARY KEY)") + + /** ALTER TABLE de_ ADD COLUMN IF NOT EXISTS c_ (nullable). */ + def addColumnIO(safeTable: String, safeColumn: String, sqlType: String): IO[Int] = + run(s"ALTER TABLE $safeTable ADD COLUMN IF NOT EXISTS $safeColumn $sqlType") + + /** CREATE INDEX IF NOT EXISTS idx_... ON de_ (c_). Runs in any context (used in Phase 3). 
*/ + def createIndexIO(safeTable: String, safeColumn: String): IO[Int] = + run(s"CREATE INDEX IF NOT EXISTS ${ProjectionNaming.indexName(safeTable, safeColumn)} ON $safeTable ($safeColumn)") + + /** CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_... — non-blocking on large tables, but must run + * OUTSIDE a transaction (autocommit connection). Production refinement over [[createIndexIO]]. */ + def createIndexConcurrentlyIO(safeTable: String, safeColumn: String): IO[Int] = + run(s"CREATE INDEX CONCURRENTLY IF NOT EXISTS ${ProjectionNaming.indexName(safeTable, safeColumn)} ON $safeTable ($safeColumn)") + + /** DROP TABLE IF EXISTS de_ (entity retired / rebuild). */ + def dropTableIO(safeTable: String): IO[Int] = + run(s"DROP TABLE IF EXISTS $safeTable") + + // All DDL identifiers originate from ProjectionNaming (hashed) — safe to inline via Fragment.const. + private def run(ddl: String): IO[Int] = + ProjectionDb.run(Fragment.const(ddl).update.run) +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDb.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDb.scala new file mode 100644 index 0000000000..3d02ddd14a --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDb.scala @@ -0,0 +1,25 @@ +package code.api.dynamic.entity.projection + +import cats.effect.IO +import code.api.util.APIUtil +import doobie._ +import doobie.implicits._ + +import scala.concurrent.ExecutionContext + +/** + * A **committing** Doobie transactor over the shared HikariCP pool, for projection operations that + * run independently of any Lift request transaction — provisioner DDL/backfill and the read-path + * projection backend. Uses Doobie's default Strategy (autoCommit off → run → commit), so each + * statement persists. 
+ * + * Contrast with `DoobieUtil.runQuery`/`runQueryIO`, which use `Strategy.void` to *share* Lift's + * request connection/transaction — that is the right tool for the dual-write hook (so the projection + * upsert commits/rolls back with the canonical blob write), but it never commits on its own. + */ +object ProjectionDb { + private lazy val xa: Transactor[IO] = + Transactor.fromDataSource[IO].apply(APIUtil.vendor.HikariDatasource.ds, ExecutionContext.global) + + def run[A](program: ConnectionIO[A]): IO[A] = program.transact(xa) +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDualWrite.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDualWrite.scala new file mode 100644 index 0000000000..76cb2707c4 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionDualWrite.scala @@ -0,0 +1,45 @@ +package code.api.dynamic.entity.projection + +import code.api.dynamic.entity.helper.DynamicEntityHelper +import code.api.dynamic.entity.query.OperatorMatrix +import code.api.util.DoobieUtil +import code.util.Helper.MdcLoggable +import net.liftweb.json.JsonAST.JObject + +/** + * Keeps a record's projection row in sync on the write path (DE_indexing, Phase 3). Guarded by + * `projectionEnabled` and a no-op unless the entity has a `ready` projection — so it changes nothing + * by default. Uses `DoobieUtil.runQuery`, which reuses Lift's request connection, so the projection + * upsert/delete participates in the SAME transaction as the canonical blob write (commit/rollback + * together). Scalar fields only (spatial dual-write is Phase 4). + */ +object ProjectionDualWrite extends MdcLoggable { + + /** Upsert the record's ready indexed scalar columns into the projection (called after the blob save). 
*/ + def onSave(bankId: Option[String], entityName: String, dataId: String, body: JObject): Unit = + withReadyScalarFields(bankId, entityName) { (safeTable, fields) => + val cols = fields.map { case (f, spec) => + ProjectionStore.ColumnValue( + ProjectionNaming.columnName(f), + ProjectionDDL.sqlColumnType(spec.fieldType.toString), + ProjectionCoerce.toColumnValue(body \ f, spec.fieldType)) + } + DoobieUtil.runQuery(ProjectionStore.upsert(safeTable, dataId, cols)) + } + + /** Delete the record's projection row (called after the blob delete; FK cascade is a backstop). */ + def onDelete(bankId: Option[String], entityName: String, dataId: String): Unit = + withReadyScalarFields(bankId, entityName) { (safeTable, _) => + DoobieUtil.runQuery(ProjectionStore.delete(safeTable, dataId)) + } + + private def withReadyScalarFields(bankId: Option[String], entityName: String) + (f: (String, List[(String, code.api.dynamic.entity.query.FieldSpec)]) => Any): Unit = { + if (!IndexingCapabilities.projectionEnabled) return + val ready = ProjectionProvisioner.readyFields(bankId, entityName) + if (ready.isEmpty) return + val indexed = DynamicEntityHelper.definitionsMap.get((bankId, entityName)).map(_.indexedFields).getOrElse(Map.empty) + val scalarReady = indexed.toList.filter { case (name, spec) => spec.indexKind != OperatorMatrix.SPATIAL && ready.contains(name) } + if (scalarReady.nonEmpty) f(ProjectionNaming.tableName(bankId, entityName), scalarReady) + } +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionNaming.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionNaming.scala new file mode 100644 index 0000000000..5ebd8d7668 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionNaming.scala @@ -0,0 +1,40 @@ +package code.api.dynamic.entity.projection + +import java.nio.charset.StandardCharsets +import java.security.MessageDigest + +/** + * Deterministic, collision-resistant, length-safe SQL 
identifiers for per-entity projection tables + * (DE_indexing, Approach A). User-supplied entity/field names are NEVER used raw in DDL — they are + * sanitised + hashed here, then injected via `Fragment.const`. Doobie binds values but not + * identifiers, so identifier safety rests entirely on this object. + * + * Output stays well under both Postgres (63) and SQL Server (128) identifier limits. + */ +object ProjectionNaming { + + private def hash(s: String): String = + MessageDigest.getInstance("SHA-256") + .digest(s.getBytes(StandardCharsets.UTF_8)) + .take(6).map(b => "%02x".format(b & 0xff)).mkString // 12 hex chars + + private def sanitize(s: String, max: Int): String = { + val cleaned = s.toLowerCase.replaceAll("[^a-z0-9]+", "_").replaceAll("^_+|_+$", "") + if (cleaned.length > max) cleaned.substring(0, max) else cleaned + } + + private def entityKey(bankId: Option[String], entityName: String): String = + bankId.getOrElse("") + ":" + entityName + + /** Stable projection table name for an entity (system- or bank-level). e.g. `de_parcel_a1b2c3d4e5f6`. */ + def tableName(bankId: Option[String], entityName: String): String = + s"de_${sanitize(entityName, 24)}_${hash(entityKey(bankId, entityName))}" + + /** Stable column name for an indexed field within its entity's table. e.g. `c_price_9f8e7d6c5b4a`. */ + def columnName(fieldName: String): String = + s"c_${sanitize(fieldName, 24)}_${hash(fieldName)}" + + /** Stable index name for a column. 
*/ + def indexName(safeTable: String, safeColumn: String): String = + s"idx_${safeTable}_$safeColumn" +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionProvisioner.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionProvisioner.scala new file mode 100644 index 0000000000..bad34b49c7 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionProvisioner.scala @@ -0,0 +1,94 @@ +package code.api.dynamic.entity.projection + +import cats.effect.IO +import cats.implicits._ +import code.api.dynamic.entity.helper.DynamicEntityHelper +import code.api.dynamic.entity.query.{FieldSpec, OperatorMatrix} +import code.util.Helper.MdcLoggable +import net.liftweb.mapper.By + +/** + * Provisions per-entity projection tables for an entity's declared `indexed` scalar fields + * (DE_indexing, Approach A, Phase 3): create table → add columns → backfill from the canonical blob + * (coerce-or-null) → build indexes → mark fields `ready` in the [[DynamicEntityIndex]] registry. + * + * Phase 3 covers scalar fields; spatial (`index: spatial`) is provisioned in Phase 4. Backfill is + * app-side (parse each blob, coerce per type) — simple and predictable; batched/resumable backfill + * and `CREATE INDEX CONCURRENTLY` are production refinements. + */ +object ProjectionProvisioner extends MdcLoggable { + + /** Ensure the projection for (bankId, entityName) exists, is backfilled, indexed and marked ready. + * Reads the entity's indexed fields from the (committed) definition map. 
*/ + def ensureProvisioned(bankId: Option[String], entityName: String, isPersonalEntity: Boolean = false, userId: Option[String] = None): IO[Unit] = + ensureProvisionedFields(bankId, entityName, indexedScalarFields(bankId, entityName), isPersonalEntity, userId) + + /** As [[ensureProvisioned]] but with the indexed scalar fields passed in explicitly — used at + * definition-create time, where the new definition isn't committed/visible in the map yet. */ + def ensureProvisionedFields(bankId: Option[String], entityName: String, indexed: List[(String, FieldSpec)], + isPersonalEntity: Boolean = false, userId: Option[String] = None): IO[Unit] = { + if (indexed.isEmpty) IO.unit + else { + val safeTable = ProjectionNaming.tableName(bankId, entityName) + for { + _ <- ProjectionDDL.createTableIO(safeTable) + _ <- indexed.traverse_ { case (f, spec) => + ProjectionDDL.addColumnIO(safeTable, ProjectionNaming.columnName(f), ProjectionDDL.sqlColumnType(spec.fieldType.toString)) + } + _ <- backfill(bankId, entityName, isPersonalEntity, userId, safeTable, indexed) + _ <- indexed.traverse_ { case (f, _) => ProjectionDDL.createIndexIO(safeTable, ProjectionNaming.columnName(f)) } + _ <- IO(markReady(bankId, entityName, indexed)) + } yield () + } + } + + /** Filter a definition's indexed fields to the scalar ones this phase provisions (spatial = Phase 4). */ + def scalarFieldsOf(indexed: Map[String, FieldSpec]): List[(String, FieldSpec)] = + indexed.toList.filter(_._2.indexKind != OperatorMatrix.SPATIAL) + + /** Field names whose projection column is `ready` (used by backend selection). 
*/ + def readyFields(bankId: Option[String], entityName: String): Set[String] = + DynamicEntityIndex.findAll( + By(DynamicEntityIndex.EntityName, entityName), + By(DynamicEntityIndex.BankId, bankId.getOrElse("")), + By(DynamicEntityIndex.State, ProjectionState.Ready) + ).map(_.FieldName.get).toSet + + // ----- internals ----- + + private def indexedScalarFields(bankId: Option[String], entityName: String): List[(String, FieldSpec)] = + DynamicEntityHelper.definitionsMap.get((bankId, entityName)) + .map(_.indexedFields).getOrElse(Map.empty) + .toList.filter(_._2.indexKind != OperatorMatrix.SPATIAL) + + private def backfill(bankId: Option[String], entityName: String, isPersonalEntity: Boolean, userId: Option[String], + safeTable: String, fields: List[(String, FieldSpec)]): IO[Unit] = + for { + rows <- ProjectionDb.run(ProjectionStore.readBlobRows(bankId, entityName, isPersonalEntity, userId)) + _ <- rows.traverse_ { case (id, jsonStr) => + val obj = net.liftweb.json.parse(jsonStr) + val cols = fields.map { case (f, spec) => + ProjectionStore.ColumnValue( + ProjectionNaming.columnName(f), + ProjectionDDL.sqlColumnType(spec.fieldType.toString), + ProjectionCoerce.toColumnValue(obj \ f, spec.fieldType)) + } + ProjectionDb.run(ProjectionStore.upsert(safeTable, id, cols)) + } + } yield () + + private def markReady(bankId: Option[String], entityName: String, fields: List[(String, FieldSpec)]): Unit = + fields.foreach { case (f, spec) => + val row = DynamicEntityIndex.find( + By(DynamicEntityIndex.EntityName, entityName), + By(DynamicEntityIndex.BankId, bankId.getOrElse("")), + By(DynamicEntityIndex.FieldName, f) + ).openOr(DynamicEntityIndex.create) + row.EntityName(entityName).BankId(bankId.getOrElse("")) + .FieldName(f).FieldType(spec.fieldType.toString).IndexKind(spec.indexKind) + .SafeTableName(ProjectionNaming.tableName(bankId, entityName)) + .SafeColumnName(ProjectionNaming.columnName(f)) + .State(ProjectionState.Ready) + .save + } +} diff --git 
a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionSql.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionSql.scala new file mode 100644 index 0000000000..6e246460c4 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionSql.scala @@ -0,0 +1,84 @@ +package code.api.dynamic.entity.projection + +import code.api.dynamic.entity.query._ +import doobie._ +import doobie.implicits._ + +/** + * Compiles a validated [[QueryPlan]] into `SELECT data_id FROM
<table>
WHERE ... ORDER BY ... LIMIT ? OFFSET ?` + * for the Postgres projection backend (DE_indexing, Phase 3). + * + * Safety: table/column identifiers come hashed from [[ProjectionNaming]] and are injected via + * `Fragment.const`; every operand is a **bound param**. Operands bind as text and are `CAST` to the + * column's SQL type in-query, so no per-type Doobie `Put` is needed and there is no injection surface + * on values. Scalar operators only — spatial (`ST_DWithin`, …) is added in Phase 4. + * + * Returns `None` if any referenced field can't be resolved to a column/type or a spatial operator is + * present (caller then errors / falls back rather than emitting bad SQL). + */ +object ProjectionSql { + + def selectDataIds( + safeTable: String, + plan: QueryPlan, + columnOf: String => Option[String], + sqlTypeOf: String => Option[String] + ): Option[Fragment] = + for { + preds <- predicates(plan, columnOf, sqlTypeOf) + ords <- orderBy(plan, columnOf) + } yield { + val whereF = if (preds == Fragment.empty) Fragment.empty else fr"WHERE" ++ preds + fr"SELECT data_id FROM" ++ Fragment.const(safeTable) ++ whereF ++ ords ++ limitOffset(plan) + } + + /** AND-joined predicate fragment (no `WHERE` keyword). `Some(Fragment.empty)` if no filters; `None` if unresolvable. */ + def predicates(plan: QueryPlan, columnOf: String => Option[String], sqlTypeOf: String => Option[String]): Option[Fragment] = { + val ps = plan.filters.map(predicate(_, columnOf, sqlTypeOf)) + if (ps.exists(_.isEmpty)) None + else Some(ps.flatten match { case Nil => Fragment.empty; case xs => intercalate(xs, fr"AND") }) + } + + /** `ORDER BY ...` fragment (empty if no sort). `None` if a sort field is unresolvable. 
*/ + def orderBy(plan: QueryPlan, columnOf: String => Option[String]): Option[Fragment] = { + val os = plan.sort.map(s => columnOf(s.field).map(c => Fragment.const(c) ++ (if (s.direction == SortDirection.Desc) fr"DESC" else fr"ASC"))) + if (os.exists(_.isEmpty)) None + else Some(os.flatten match { case Nil => Fragment.empty; case xs => fr"ORDER BY" ++ intercalate(xs, fr",") }) + } + + /** `LIMIT ? OFFSET ?` fragment (only the parts present). */ + def limitOffset(plan: QueryPlan): Fragment = { + val limitF = plan.page.limit.map(l => fr"LIMIT $l").getOrElse(Fragment.empty) + val offsetF = plan.page.offset.map(o => fr"OFFSET $o").getOrElse(Fragment.empty) + limitF ++ offsetF + } + + private def predicate(f: Filter, columnOf: String => Option[String], sqlTypeOf: String => Option[String]): Option[Fragment] = + if (FilterOp.spatial.contains(f.op)) None + else for { + col <- columnOf(f.field) + sqlType <- sqlTypeOf(f.field) + } yield { + val c = Fragment.const(col) + def cast(v: String): Fragment = fr"CAST(" ++ fr0"$v" ++ fr" AS" ++ Fragment.const(sqlType) ++ fr")" + import FilterOp._ + f.op match { + case Eq => c ++ fr"=" ++ cast(f.values.head) + case Ne => c ++ fr"<>" ++ cast(f.values.head) + case Lt => c ++ fr"<" ++ cast(f.values.head) + case Gt => c ++ fr">" ++ cast(f.values.head) + case Le => c ++ fr"<=" ++ cast(f.values.head) + case Ge => c ++ fr">=" ++ cast(f.values.head) + case Between => c ++ fr"BETWEEN" ++ cast(f.values(0)) ++ fr"AND" ++ cast(f.values(1)) + case In => c ++ fr"IN (" ++ intercalate(f.values.map(cast), fr",") ++ fr")" + case Like => c ++ fr"ILIKE ('%' ||" ++ fr0"${f.values.head}" ++ fr"|| '%')" // case-insensitive contains, matches in-memory + case _ => Fragment.empty // unreachable: spatial filtered above + } + } + + def intercalate(frags: List[Fragment], sep: Fragment): Fragment = + frags match { + case Nil => Fragment.empty + case head :: tail => tail.foldLeft(head)((acc, f) => acc ++ sep ++ f) + } +} diff --git 
a/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionStore.scala b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionStore.scala new file mode 100644 index 0000000000..af7a89dc30 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/projection/ProjectionStore.scala @@ -0,0 +1,66 @@ +package code.api.dynamic.entity.projection + +import code.DynamicData.DynamicData +import doobie._ +import doobie.implicits._ + +/** + * Data-plane operations for per-entity projection tables (DE_indexing, Approach A): upsert/delete one + * record's indexed columns, and read canonical blob rows for backfill. Used by the provisioner + * (backfill) and the dual-write hook. All via Doobie; identifiers are hashed (`Fragment.const`), + * values bound + `CAST` to the column type (no per-type `Put`, no value-injection). + * + * Canonical blob table/column identifiers are taken from the live Lift `DynamicData` mapper so they + * stay correct regardless of Lift's naming. + */ +object ProjectionStore { + + // Real DB identifiers of the canonical blob table (Lift-mapped). + val blobTable: String = DynamicData.dbTableName + val idColumn: String = DynamicData.DynamicDataId.dbColumnName + val jsonColumn: String = DynamicData.DataJson.dbColumnName + val entityNameColumn: String = DynamicData.DynamicEntityName.dbColumnName + val bankIdColumn: String = DynamicData.BankId.dbColumnName + val userIdColumn: String = DynamicData.UserId.dbColumnName + val personalColumn: String = DynamicData.IsPersonalEntity.dbColumnName + + /** One indexed field's value for a record: safe column, SQL type, coerced text (None => NULL). */ + case class ColumnValue(safeColumn: String, sqlType: String, value: Option[String]) + + /** INSERT (data_id, cols…) VALUES (?, CAST(? 
AS t)…) ON CONFLICT (data_id) DO UPDATE … */ + def upsert(safeTable: String, dataId: String, cols: List[ColumnValue]): ConnectionIO[Int] = { + val colList = Fragment.const("(data_id" + cols.map(", " + _.safeColumn).mkString + ")") + val valFrags = fr0"$dataId" :: cols.map(c => fr"CAST(" ++ fr0"${c.value}" ++ fr" AS" ++ Fragment.const(c.sqlType) ++ fr")") + val values = ProjectionSql.intercalate(valFrags, fr",") + val conflict = + if (cols.isEmpty) fr"ON CONFLICT (data_id) DO NOTHING" + else fr"ON CONFLICT (data_id) DO UPDATE SET" ++ + ProjectionSql.intercalate(cols.map(c => Fragment.const(s"${c.safeColumn} = EXCLUDED.${c.safeColumn}")), fr",") + (fr"INSERT INTO" ++ Fragment.const(safeTable) ++ colList ++ fr"VALUES (" ++ values ++ fr")" ++ conflict).update.run + } + + /** DELETE the projection row for a data_id (used by the delete dual-write hook; FK cascade also covers it). */ + def delete(safeTable: String, dataId: String): ConnectionIO[Int] = + (fr"DELETE FROM" ++ Fragment.const(safeTable) ++ fr"WHERE data_id =" ++ fr0"$dataId").update.run + + /** Read (data_id, dataJson) for every canonical row of an entity, scoped like the connector get-all. */ + def readBlobRows(bankId: Option[String], entityName: String, isPersonalEntity: Boolean, userId: Option[String]): ConnectionIO[List[(String, String)]] = + (fr"SELECT" ++ Fragment.const(idColumn) ++ fr"," ++ Fragment.const(jsonColumn) ++ + fr"FROM" ++ Fragment.const(blobTable) ++ fr"WHERE" ++ scope(bankId, entityName, isPersonalEntity, userId)) + .query[(String, String)].to[List] + + /** + * Scope predicate mirroring `MappedDynamicDataProvider`'s get-all: entity name always; bankId via + * IS NOT DISTINCT FROM (handles system-level NULL); personal flag; userId only when personal. + * Returned without the `WHERE` keyword so callers can AND it with index predicates. 
+ */ + def scope(bankId: Option[String], entityName: String, isPersonalEntity: Boolean, userId: Option[String], alias: String = ""): Fragment = { + val p = if (alias.isEmpty) "" else alias + "." + val byEntity = Fragment.const(p + entityNameColumn) ++ fr"=" ++ fr0"$entityName" + val byBank = Fragment.const(p + bankIdColumn) ++ fr"IS NOT DISTINCT FROM" ++ fr0"${bankId.orNull: String}" + val byPersonal = Fragment.const(p + personalColumn) ++ fr"=" ++ fr0"$isPersonalEntity" + val base = byEntity ++ fr"AND" ++ byBank ++ fr"AND" ++ byPersonal + if (isPersonalEntity) base ++ fr"AND" ++ Fragment.const(p + userIdColumn) ++ fr"IS NOT DISTINCT FROM" ++ fr0"${userId.orNull: String}" + else base + } +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/query/DynamicEntityQueryBackend.scala b/obp-api/src/main/scala/code/api/dynamic/entity/query/DynamicEntityQueryBackend.scala new file mode 100644 index 0000000000..6c6bcff6b4 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/query/DynamicEntityQueryBackend.scala @@ -0,0 +1,39 @@ +package code.api.dynamic.entity.query + +import cats.effect.IO +import net.liftweb.json.JsonAST.JObject + +/** + * The Shape B seam: one query contract, swappable implementations. + * + * Implementations (selected once at startup by capability detection): + * - InMemoryQueryBackend — portable floor for deployments with no projection backend + * - PostgresProjectionBackend — Approach A: per-entity typed projection tables + indexes (Phase 3+) + * - SqlServerProjectionBackend — deferred (Phase 5.3) + * + * The public DE endpoint and the [[QueryPlan]] it parses are identical on every backend; only + * `query`/`provision` differ. Unservable queries return an error — never a silent fallback. + */ +trait DynamicEntityQueryBackend { + + /** Short name for logging / capability reporting (e.g. "in-memory", "postgres"). 
*/ + def name: String + + /** + * Execute a validated plan and return the matching records (already filtered, sorted, paged). + * Records are canonical JObjects hydrated from the blob store. + */ + def query( + entityName: String, + bankId: Option[String], + userId: Option[String], + isPersonalEntity: Boolean, + plan: QueryPlan + ): IO[List[JObject]] + + /** + * Provision (or reconcile) whatever storage this backend needs for the entity's declared + * `indexed` fields — DDL, backfill, index build. Default no-op (in-memory needs nothing). + */ + def provision(entityName: String, bankId: Option[String]): IO[Unit] = IO.unit +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/query/InMemoryQueryExecutor.scala b/obp-api/src/main/scala/code/api/dynamic/entity/query/InMemoryQueryExecutor.scala new file mode 100644 index 0000000000..60e1f21211 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/query/InMemoryQueryExecutor.scala @@ -0,0 +1,119 @@ +package code.api.dynamic.entity.query + +import com.openbankproject.commons.model.enums.DynamicEntityFieldType +import net.liftweb.json.JsonAST._ + +import scala.util.Try + +/** + * Pure, side-effect-free evaluation of a [[QueryPlan]] over an in-memory list of records. + * + * This is (a) the implementation behind the in-memory portable-floor backend and (b) the test + * oracle the projection (SQL) backends are checked against — same plan, same result. It handles + * only scalar operators; spatial operators cannot be evaluated in memory and are excluded here + * (the planner rejects spatial against a non-spatial backend before this point). + * + * Predictability over cleverness: a record whose field is missing or not coercible to the + * declared type simply does not match a filter, and sorts last. + */ +object InMemoryQueryExecutor { + + /** Apply filter -> sort -> offset/limit. `fieldTypes` maps fieldName -> declared type. 
*/ + def execute(records: List[JObject], plan: QueryPlan, fieldTypes: Map[String, DynamicEntityFieldType]): List[JObject] = { + val filtered = records.filter(rec => plan.filters.forall(f => matches(f, rec, typeOf(f.field, fieldTypes)))) + val sorted = if (plan.sort.isEmpty) filtered else filtered.sortWith((a, b) => compareRecords(a, b, plan.sort, fieldTypes) < 0) + paginate(sorted, plan.page) + } + + private def typeOf(field: String, fieldTypes: Map[String, DynamicEntityFieldType]): DynamicEntityFieldType = + fieldTypes.getOrElse(field, DynamicEntityFieldType.string) + + private def paginate(records: List[JObject], page: Page): List[JObject] = { + val afterOffset = page.offset.filter(_ > 0).fold(records)(records.drop) + page.limit.filter(_ >= 0).fold(afterOffset)(afterOffset.take) + } + + // ----- filtering ----- + + private def matches(filter: Filter, record: JObject, ft: DynamicEntityFieldType): Boolean = { + val jv = record \ filter.field + import FilterOp._ + filter.op match { + case Eq | In => filter.values.exists(v => cmp(ft, jv, v).contains(0)) + case Ne => filter.values.nonEmpty && filter.values.forall(v => cmp(ft, jv, v).exists(_ != 0)) + case Lt => single(filter).exists(v => cmp(ft, jv, v).exists(_ < 0)) + case Gt => single(filter).exists(v => cmp(ft, jv, v).exists(_ > 0)) + case Le => single(filter).exists(v => cmp(ft, jv, v).exists(_ <= 0)) + case Ge => single(filter).exists(v => cmp(ft, jv, v).exists(_ >= 0)) + case Between => filter.values match { + case lo :: hi :: _ => cmp(ft, jv, lo).exists(_ >= 0) && cmp(ft, jv, hi).exists(_ <= 0) + case _ => false + } + case Like => toStringValue(jv).exists(s => filter.values.exists(v => s.toLowerCase.contains(v.toLowerCase))) + case _ => false // spatial operators are not evaluable in memory + } + } + + private def single(filter: Filter): Option[String] = filter.values.headOption + + /** Sign of record value compared to the operand string, or None if missing / not coercible. 
*/ + private def cmp(ft: DynamicEntityFieldType, jv: JValue, operand: String): Option[Int] = + if (isNumeric(ft)) for { a <- toBigDecimal(jv); b <- parseBigDecimal(operand) } yield a.compare(b) + else if (isBoolean(ft)) for { a <- toBoolean(jv); b <- parseBoolean(operand) } yield Ordering.Boolean.compare(a, b) + else toStringValue(jv).map(_.compareTo(operand)) // string, DATE_WITH_DAY (ISO sorts lexically), reference + + // ----- sorting ----- + + private def compareRecords(a: JObject, b: JObject, keys: List[SortKey], fieldTypes: Map[String, DynamicEntityFieldType]): Int = + keys.iterator.map { k => + val c = cmp2(typeOf(k.field, fieldTypes), a \ k.field, b \ k.field) + k.direction match { case SortDirection.Asc => c; case SortDirection.Desc => -c } + }.find(_ != 0).getOrElse(0) + + /** Compare two record values; present-and-coercible sorts before missing/uncoercible. */ + private def cmp2(ft: DynamicEntityFieldType, a: JValue, b: JValue): Int = + if (isNumeric(ft)) compareOpt(toBigDecimal(a), toBigDecimal(b)) + else if (isBoolean(ft)) compareOpt(toBoolean(a), toBoolean(b)) + else compareOpt(toStringValue(a), toStringValue(b)) + + private def compareOpt[T](a: Option[T], b: Option[T])(implicit ord: Ordering[T]): Int = (a, b) match { + case (Some(x), Some(y)) => ord.compare(x, y) + case (Some(_), None) => -1 // present before missing + case (None, Some(_)) => 1 + case (None, None) => 0 + } + + // ----- type helpers ----- + + private def isNumeric(ft: DynamicEntityFieldType): Boolean = + ft == DynamicEntityFieldType.number || ft == DynamicEntityFieldType.integer + private def isBoolean(ft: DynamicEntityFieldType): Boolean = + ft == DynamicEntityFieldType.boolean + + private def toBigDecimal(jv: JValue): Option[BigDecimal] = jv match { + case JInt(i) => Some(BigDecimal(i)) + case JDouble(d) => Some(BigDecimal(d)) + case JString(s) => parseBigDecimal(s) + case _ => None + } + private def parseBigDecimal(s: String): Option[BigDecimal] = Try(BigDecimal(s.trim)).toOption + 
+ private def toBoolean(jv: JValue): Option[Boolean] = jv match { + case JBool(b) => Some(b) + case JString(s) => parseBoolean(s) + case _ => None + } + private def parseBoolean(s: String): Option[Boolean] = s.trim.toLowerCase match { + case "true" => Some(true) + case "false" => Some(false) + case _ => None + } + + private def toStringValue(jv: JValue): Option[String] = jv match { + case JString(s) => Some(s) + case JInt(i) => Some(i.toString) + case JDouble(d) => Some(d.toString) + case JBool(b) => Some(b.toString) + case _ => None + } +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/query/OperatorMatrix.scala b/obp-api/src/main/scala/code/api/dynamic/entity/query/OperatorMatrix.scala new file mode 100644 index 0000000000..0d56b8586b --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/query/OperatorMatrix.scala @@ -0,0 +1,45 @@ +package code.api.dynamic.entity.query + +import com.openbankproject.commons.model.enums.DynamicEntityFieldType + +/** + * The closed allow-list of which [[FilterOp]]s are legal for each Dynamic Entity field type + * (and whether the type is sortable), plus the index kind ("scalar" | "spatial"). + * + * This is the contract-layer rule the planner enforces — identical on every backend (Shape B), + * so a query is accepted/rejected the same way regardless of the underlying database. Anything + * not explicitly permitted here is rejected; new types/operators stay rejected until added. + */ +object OperatorMatrix { + import DynamicEntityFieldType._ + import FilterOp._ + + val SCALAR = "scalar" + val SPATIAL = "spatial" + + private val numericOps: Set[FilterOp] = Set(Eq, Ne, In, Lt, Gt, Le, Ge, Between) + private val dateOps: Set[FilterOp] = Set(Eq, Ne, In, Lt, Gt, Le, Ge, Between) + private val stringOps: Set[FilterOp] = Set(Eq, Ne, In, Like) + private val boolOps: Set[FilterOp] = Set(Eq, Ne) + + /** Operators permitted for a field of this type + index kind. Empty = field is not filterable. 
*/ + def allowedOps(fieldType: DynamicEntityFieldType, indexKind: String): Set[FilterOp] = + (fieldType, indexKind) match { + case (`json`, SPATIAL) => spatial + case (`json`, _) => Set.empty // non-spatial json is never filterable + case (`number`, _) => numericOps + case (`integer`, _) => numericOps + case (`DATE_WITH_DAY`, _) => dateOps + case (`boolean`, _) => boolOps + case (`string`, _) => stringOps + case _ => stringOps // reference types behave like string ids + } + + /** Whether a field of this type + index kind may appear in `obp_sort_by`. */ + def sortable(fieldType: DynamicEntityFieldType, indexKind: String): Boolean = + (fieldType, indexKind) match { + case (`json`, _) => false // neither whole-json nor geometry is orderable + case (`boolean`, _) => false // ordering booleans is meaningless — keep it simple + case _ => true + } +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryModel.scala b/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryModel.scala new file mode 100644 index 0000000000..84724b8f22 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryModel.scala @@ -0,0 +1,63 @@ +package code.api.dynamic.entity.query + +/** + * Abstract, backend-neutral query model for Dynamic Entity list reads (DE_indexing, Phase 0). + * + * A [[QueryPlan]] is produced by the planner from request query params (filter/sort/paginate), + * validated against the entity definition's declared `indexed` fields, then handed to a + * [[DynamicEntityQueryBackend]] which compiles it to its own dialect (in-memory, Postgres SQL, ...). + * + * Filter operand values are kept as raw strings here (as received from query params); typed + * coercion against the field's declared type happens inside each backend, so the model itself + * carries no vendor- or type-specific representation. 
+ */ + +sealed trait FilterOp { def name: String } +object FilterOp { + // scalar + case object Eq extends FilterOp { val name = "eq" } + case object Ne extends FilterOp { val name = "ne" } + case object In extends FilterOp { val name = "in" } + case object Lt extends FilterOp { val name = "lt" } + case object Gt extends FilterOp { val name = "gt" } + case object Le extends FilterOp { val name = "le" } + case object Ge extends FilterOp { val name = "ge" } + case object Between extends FilterOp { val name = "between" } + case object Like extends FilterOp { val name = "like" } + // spatial (served only by a spatial-capable backend; never in-memory) + case object Within extends FilterOp { val name = "within" } + case object Contains extends FilterOp { val name = "contains" } + case object Intersects extends FilterOp { val name = "intersects" } + case object DWithin extends FilterOp { val name = "dwithin" } + + val all: List[FilterOp] = + List(Eq, Ne, In, Lt, Gt, Le, Ge, Between, Like, Within, Contains, Intersects, DWithin) + + val byName: Map[String, FilterOp] = all.map(op => op.name -> op).toMap + + val spatial: Set[FilterOp] = Set(Within, Contains, Intersects, DWithin) +} + +sealed trait SortDirection +object SortDirection { + case object Asc extends SortDirection + case object Desc extends SortDirection +} + +/** A single filter predicate: `field op values`. `between` carries two values, `in` carries N, others one. */ +case class Filter(field: String, op: FilterOp, values: List[String]) + +/** A single sort key. */ +case class SortKey(field: String, direction: SortDirection) + +/** Offset/limit pagination. No page-number / total-count scheme by design (see DE_indexing_plan.md). */ +case class Page(offset: Option[Int], limit: Option[Int]) +object Page { + val empty: Page = Page(None, None) +} + +/** The fully-parsed, validated query. 
*/ +case class QueryPlan(filters: List[Filter], sort: List[SortKey], page: Page) +object QueryPlan { + val empty: QueryPlan = QueryPlan(Nil, Nil, Page.empty) +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryParamParser.scala b/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryParamParser.scala new file mode 100644 index 0000000000..3478af33d3 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryParamParser.scala @@ -0,0 +1,112 @@ +package code.api.dynamic.entity.query + +import scala.util.Try + +/** + * Parses Dynamic Entity list-read query params into the raw, unvalidated pieces of a query + * (DE_indexing, Phase 1). Validation against the entity definition happens in [[QueryPlanner]]. + * + * Param contract (OBP-prefixed for consistency with the rest of the API): + * - pagination : `obp_offset`, `obp_limit` (offset/limit; no page-number, no total count) + * - sort : `obp_sort_by` (comma-separated for multi-key), `obp_sort_direction` (ASC|DESC) + * - filter : `obp_filter[FIELD]=OP:VALUE` (repeat per field; repeat the same key for + * multiple constraints on one field). VALUE after the first ':' is opaque; `in` + * and `between` split it on commas. OP is required (e.g. `eq:`), which also lets a + * value legitimately contain ':' (only the first ':' is the separator). + * + * This is the one syntax-specific function; everything downstream is syntax-neutral. 
+ */ +object QueryParamParser { + + val Limit = "obp_limit" + val Offset = "obp_offset" + val SortBy = "obp_sort_by" + val SortDirectionKey = "obp_sort_direction" + + private val FilterKey = """^obp_filter\[(.+)\]$""".r + + def parse(params: Map[String, List[String]]): Either[QueryError, (List[Filter], List[SortKey], Page)] = + for { + page <- parsePage(params) + sort <- parseSort(params) + filters <- parseFilters(params) + } yield (filters, sort, page) + + // ----- pagination ----- + + private def parsePage(params: Map[String, List[String]]): Either[QueryError, Page] = + for { + limit <- parseNonNegativeInt(params, Limit) + offset <- parseNonNegativeInt(params, Offset) + } yield Page(offset, limit) + + private def parseNonNegativeInt(params: Map[String, List[String]], key: String): Either[QueryError, Option[Int]] = + firstValue(params, key) match { + case None => Right(None) + case Some(raw) => Try(raw.trim.toInt).toOption match { + case Some(n) if n >= 0 => Right(Some(n)) + case Some(_) => Left(QueryError(s"$key must be a non-negative integer.")) + case None => Left(QueryError(s"$key must be an integer.")) + } + } + + // ----- sort ----- + + private def parseSort(params: Map[String, List[String]]): Either[QueryError, List[SortKey]] = + firstValue(params, SortBy).map(_.trim).filter(_.nonEmpty) match { + case None => Right(Nil) + case Some(sortByRaw) => + parseDirection(params).map { dir => + sortByRaw.split(",").toList.map(_.trim).filter(_.nonEmpty).map(SortKey(_, dir)) + } + } + + private def parseDirection(params: Map[String, List[String]]): Either[QueryError, SortDirection] = + firstValue(params, SortDirectionKey).map(_.trim) match { + case None => Right(SortDirection.Asc) + case Some(d) if d.equalsIgnoreCase("ASC") => Right(SortDirection.Asc) + case Some(d) if d.equalsIgnoreCase("DESC") => Right(SortDirection.Desc) + case Some(d) => Left(QueryError(s"$SortDirectionKey must be ASC or DESC, not '$d'.")) + } + + // ----- filter ----- + + private def
parseFilters(params: Map[String, List[String]]): Either[QueryError, List[Filter]] = { + val perKey: List[Either[QueryError, List[Filter]]] = params.toList.collect { + case (FilterKey(field), values) => traverse(values)(parseOneFilter(field, _)) + } + sequence(perKey).map(_.flatten) + } + + private def parseOneFilter(field: String, raw: String): Either[QueryError, Filter] = { + val idx = raw.indexOf(':') + if (idx < 0) + Left(QueryError(s"Filter obp_filter[$field] must be of the form OP:VALUE, e.g. obp_filter[$field]=eq:...")) + else { + val opToken = raw.substring(0, idx) + val operand = raw.substring(idx + 1) + FilterOp.byName.get(opToken) match { + case None => + Left(QueryError(s"Unknown filter operator '$opToken' in obp_filter[$field]. Valid operators: ${FilterOp.byName.keys.toList.sorted.mkString(", ")}.")) + case Some(op) => + val vals = + if (op == FilterOp.In || op == FilterOp.Between) operand.split(",", -1).toList.map(_.trim) + else List(operand) + Right(Filter(field, op, vals)) + } + } + } + + // ----- helpers ----- + + private def firstValue(params: Map[String, List[String]], key: String): Option[String] = + params.get(key).flatMap(_.headOption) + + private def traverse[A, B](xs: List[A])(f: A => Either[QueryError, B]): Either[QueryError, List[B]] = + xs.foldRight(Right(Nil): Either[QueryError, List[B]]) { (a, acc) => + for { b <- f(a); rest <- acc } yield b :: rest + } + + private def sequence[B](xs: List[Either[QueryError, B]]): Either[QueryError, List[B]] = + traverse(xs)(identity) +} diff --git a/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryPlanner.scala b/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryPlanner.scala new file mode 100644 index 0000000000..92c9a50696 --- /dev/null +++ b/obp-api/src/main/scala/code/api/dynamic/entity/query/QueryPlanner.scala @@ -0,0 +1,90 @@ +package code.api.dynamic.entity.query + +import com.openbankproject.commons.model.enums.DynamicEntityFieldType + +import java.time.LocalDate +import scala.util.Try 
+ +/** What the planner knows about one declared-`indexed` field. */ +case class FieldSpec(fieldType: DynamicEntityFieldType, indexKind: String) + +/** A contract-layer validation failure (maps to HTTP 400 at the endpoint). */ +case class QueryError(message: String) + +/** + * The definition-driven planner (DE_indexing, Phase 1). + * + * Validates parsed filter/sort terms against the entity's declared `indexed` fields and the + * [[OperatorMatrix]], producing a [[QueryPlan]] only if every term is legal. This is the closed + * allow-list enforced identically on every backend (Shape B): a field must be declared queryable, + * the operator must be legal for the field's type, and scalar values must coerce to that type. + * + * Four checks per the design doc: (1) field indexed? (2) operator legal for type? (3) value + * coerces? (4) sort field sortable? + */ +object QueryPlanner { + + def plan( + filters: List[Filter], + sort: List[SortKey], + page: Page, + indexedFields: Map[String, FieldSpec] + ): Either[QueryError, QueryPlan] = + for { + _ <- firstError(filters.map(validateFilter(_, indexedFields))) + _ <- firstError(sort.map(validateSort(_, indexedFields))) + } yield QueryPlan(filters, sort, page) + + // ----- per-term validation ----- + + private def validateFilter(f: Filter, indexedFields: Map[String, FieldSpec]): Option[QueryError] = + indexedFields.get(f.field) match { + case None => Some(QueryError(s"Field '${f.field}' is not queryable (it is not declared indexed).")) + case Some(spec) => + val allowed = OperatorMatrix.allowedOps(spec.fieldType, spec.indexKind) + if (!allowed.contains(f.op)) + Some(QueryError(s"Operator '${f.op.name}' is not valid for field '${f.field}' of type '${spec.fieldType}'.")) + else + arityError(f).orElse(coercionError(f, spec)) + } + + private def validateSort(s: SortKey, indexedFields: Map[String, FieldSpec]): Option[QueryError] = + indexedFields.get(s.field) match { + case None => Some(QueryError(s"Cannot sort by '${s.field}': it is 
not declared indexed.")) + case Some(spec) => + if (OperatorMatrix.sortable(spec.fieldType, spec.indexKind)) None + else Some(QueryError(s"Field '${s.field}' of type '${spec.fieldType}' is not sortable.")) + } + + /** Operand count must match the operator. */ + private def arityError(f: Filter): Option[QueryError] = { + import FilterOp._ + f.op match { + case Between if f.values.size != 2 => Some(QueryError(s"Operator 'between' on '${f.field}' requires exactly two values.")) + case In if f.values.isEmpty => Some(QueryError(s"Operator 'in' on '${f.field}' requires at least one value.")) + case _ if FilterOp.spatial.contains(f.op) => None // spatial operand shape validated by the spatial backend + case _ if f.values.size != 1 => Some(QueryError(s"Operator '${f.op.name}' on '${f.field}' requires exactly one value.")) + case _ => None + } + } + + /** Scalar values must coerce to the declared type (spatial / like operands are not coerced here). */ + private def coercionError(f: Filter, spec: FieldSpec): Option[QueryError] = { + if (FilterOp.spatial.contains(f.op) || f.op == FilterOp.Like) None + else f.values.find(v => !coerces(spec.fieldType, v)) + .map(bad => QueryError(s"Value '$bad' is not a valid '${spec.fieldType}' for field '${f.field}'.")) + } + + private def coerces(ft: DynamicEntityFieldType, v: String): Boolean = { + import DynamicEntityFieldType._ + val s = v.trim + if (ft == number) Try(BigDecimal(s)).isSuccess + else if (ft == integer) Try(BigInt(s)).isSuccess + else if (ft == boolean) s.equalsIgnoreCase("true") || s.equalsIgnoreCase("false") + else if (ft == DATE_WITH_DAY) Try(LocalDate.parse(s)).isSuccess // ISO yyyy-MM-dd + else true // string and reference types accept any value + } + + private def firstError(results: List[Option[QueryError]]): Either[QueryError, Unit] = + results.flatten.headOption.toLeft(()) +} diff --git a/obp-api/src/main/scala/code/api/util/ErrorMessages.scala b/obp-api/src/main/scala/code/api/util/ErrorMessages.scala index 
f5e8e0f416..13f3b52fe0 100644 --- a/obp-api/src/main/scala/code/api/util/ErrorMessages.scala +++ b/obp-api/src/main/scala/code/api/util/ErrorMessages.scala @@ -76,6 +76,7 @@ object ErrorMessages { val DuplicateQueryParameters = "OBP-09016: Duplicate Query Parameters are not allowed." val DuplicateHeaderKeys = "OBP-09017: Duplicate Header Keys are not allowed." val InvalidDynamicEntityName = "OBP-09018: Invalid entity_name format. Entity names must be lowercase with underscores (snake_case), e.g. 'customer_preferences'. No uppercase letters or spaces allowed." + val DynamicEntityFieldNotYetQueryable = "OBP-09019: Requested field(s) are not yet queryable - the index is still being built. Please retry shortly." // General messages (OBP-10XXX) diff --git a/obp-api/src/main/scala/code/dynamicEntity/DynamicEntityProvider.scala b/obp-api/src/main/scala/code/dynamicEntity/DynamicEntityProvider.scala index 9395e2b602..2fc0601608 100644 --- a/obp-api/src/main/scala/code/dynamicEntity/DynamicEntityProvider.scala +++ b/obp-api/src/main/scala/code/dynamicEntity/DynamicEntityProvider.scala @@ -610,6 +610,33 @@ object DynamicEntityCommons extends Converter[DynamicEntityT, DynamicEntityCommo if(readRole != JNothing) { checkFormat(readRole.isInstanceOf[JString] && readRole.asInstanceOf[JString].s.nonEmpty, s"$DynamicEntityInstanceValidateFail The property of $fieldName's 'readRole' field must be a non-empty string.") } + + // validate optional indexing keywords (DE_indexing). All optional; absence => field is not queryable. + // 'indexed': boolean — marks the field filterable/sortable. + // 'index' : "scalar" (default, B-tree) | "spatial" (GeoJSON geometry, only on a json field). 
+ val indexed = value \ "indexed" + if(indexed != JNothing) { + checkFormat(indexed.isInstanceOf[JBool], s"$DynamicEntityInstanceValidateFail The property of $fieldName's 'indexed' field must be boolean.") + } + val isIndexed = indexed.isInstanceOf[JBool] && indexed.asInstanceOf[JBool].value + val indexKind = value \ "index" + val indexKindName = + if(indexKind != JNothing) { + checkFormat(indexKind.isInstanceOf[JString], s"$DynamicEntityInstanceValidateFail The property of $fieldName's 'index' field must be a string.") + val k = indexKind.asInstanceOf[JString].s + checkFormat(Set("scalar", "spatial").contains(k), s"$DynamicEntityInstanceValidateFail The property of $fieldName's 'index' field must be one of: scalar, spatial.") + checkFormat(isIndexed, s"$DynamicEntityInstanceValidateFail The property of $fieldName's 'index' field is only valid when 'indexed' is true.") + k + } else "scalar" + if(isIndexed) { + val isJsonType = fieldTypeOp.exists(_ == DynamicEntityFieldType.json) + if(indexKindName == "spatial") { + checkFormat(isJsonType, s"$DynamicEntityInstanceValidateFail The property of $fieldName's 'index':'spatial' is only allowed on a 'json' (GeoJSON) field; field type is $fieldTypeName.") + } + if(isJsonType) { + checkFormat(indexKindName == "spatial", s"$DynamicEntityInstanceValidateFail The property of $fieldName is type 'json' and can only be indexed with 'index':'spatial' (GeoJSON geometry).") + } + } }) DynamicEntityCommons(entityName, compactRender(jsonObject), dynamicEntityId, userId, bankId, hasPersonalEntityValue, hasPublicAccessValue, hasCommunityAccessValue, personalRequiresRoleValue) diff --git a/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicDataProvider.scala b/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicDataProvider.scala index d6615dbd65..9c7078ade5 100644 --- a/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicDataProvider.scala +++ b/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicDataProvider.scala @@ 
-125,7 +125,12 @@ object MappedDynamicDataProvider extends DynamicDataProvider with CustomJsonForm } override def delete(bankId: Option[String], entityName: String, id: String, userId: Option[String], isPersonalEntity: Boolean) = { - get(bankId, entityName, id, userId, isPersonalEntity).map(_.asInstanceOf[DynamicData].delete_!) + get(bankId, entityName, id, userId, isPersonalEntity).map { d => + val result = d.asInstanceOf[DynamicData].delete_! + // DE_indexing: remove the projection row in the same transaction (no-op unless projection enabled+ready). + code.api.dynamic.entity.projection.ProjectionDualWrite.onDelete(bankId, entityName, id) + result + } } // Community access: return ALL records regardless of userId/IsPersonalEntity @@ -203,12 +208,15 @@ object MappedDynamicDataProvider extends DynamicDataProvider with CustomJsonForm val data: DynamicData = dynamicData tryo { val dataStr = json.compactRender(requestBody) - data.DataJson(dataStr) + val saved = data.DataJson(dataStr) .DynamicEntityName(entityName) .BankId(bankId.getOrElse(null)) .UserId(userId.getOrElse(null)) .IsPersonalEntity(isPersonalEntity) .saveMe() + // DE_indexing: keep the projection in sync in the same transaction (no-op unless projection enabled+ready). 
+ code.api.dynamic.entity.projection.ProjectionDualWrite.onSave(bankId, entityName, saved.DynamicDataId.get, requestBody) + saved } } diff --git a/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicEntityProvider.scala b/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicEntityProvider.scala index 6590950a2f..d9910c18df 100644 --- a/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicEntityProvider.scala +++ b/obp-api/src/main/scala/code/dynamicEntity/MapppedDynamicEntityProvider.scala @@ -61,7 +61,7 @@ object MappedDynamicEntityProvider extends DynamicEntityProvider with CustomJson tryo{ try { - entityToPersist + val saved = entityToPersist .EntityName(dynamicEntity.entityName) .MetadataJson(dynamicEntity.metadataJson) .UserId(dynamicEntity.userId) @@ -71,6 +71,25 @@ object MappedDynamicEntityProvider extends DynamicEntityProvider with CustomJson .HasCommunityAccess(dynamicEntity.hasCommunityAccess) .PersonalRequiresRole(dynamicEntity.personalRequiresRole) .saveMe() + // DE_indexing: provision/refresh the projection for this definition's indexed scalar fields. + // Guarded by projectionEnabled (default off); best-effort (a failure leaves the definition saved + // and queries reporting pending, not a broken create). Fields passed explicitly because the new + // definition isn't committed/visible in the definition map yet. 
+ if (code.api.dynamic.entity.projection.IndexingCapabilities.projectionEnabled) { + try { + val info = code.api.dynamic.entity.helper.DynamicEntityInfo( + dynamicEntity.metadataJson, dynamicEntity.entityName, dynamicEntity.bankId, + dynamicEntity.hasPersonalEntity, dynamicEntity.hasPublicAccess, dynamicEntity.hasCommunityAccess, dynamicEntity.personalRequiresRole) + val scalar = code.api.dynamic.entity.projection.ProjectionProvisioner.scalarFieldsOf(info.indexedFields) + if (scalar.nonEmpty) + code.api.dynamic.entity.projection.ProjectionProvisioner + .ensureProvisionedFields(dynamicEntity.bankId, dynamicEntity.entityName, scalar) + .unsafeRunSync()(cats.effect.unsafe.implicits.global) + } catch { + case e: Throwable => logger.error(s"DE projection provisioning failed for ${dynamicEntity.entityName} (definition saved; queries will report pending)", e) + } + } + saved } catch { case e : Throwable => logger.error("Create or Update DynamicEntity fail.", e) diff --git a/obp-api/src/test/scala/code/api/dynamic/entity/projection/ProjectionNamingSpec.scala b/obp-api/src/test/scala/code/api/dynamic/entity/projection/ProjectionNamingSpec.scala new file mode 100644 index 0000000000..f3108b1d61 --- /dev/null +++ b/obp-api/src/test/scala/code/api/dynamic/entity/projection/ProjectionNamingSpec.scala @@ -0,0 +1,39 @@ +package code.api.dynamic.entity.projection + +import org.scalatest.{FlatSpec, Matchers} + +class ProjectionNamingSpec extends FlatSpec with Matchers { + + "ProjectionNaming.tableName" should "be deterministic and length/charset safe" in { + val a = ProjectionNaming.tableName(None, "ParcelOwnerVerification") + a shouldBe ProjectionNaming.tableName(None, "ParcelOwnerVerification") // deterministic + a should fullyMatch regex "[a-z0-9_]+".r + a.length should be <= 63 + a should startWith("de_") + } + + it should "distinguish system-level from bank-level entities of the same name" in { + ProjectionNaming.tableName(None, "Parcel") should not be 
ProjectionNaming.tableName(Some("bankX"), "Parcel") + } + + it should "distinguish different entity names" in { + ProjectionNaming.tableName(None, "Parcel") should not be ProjectionNaming.tableName(None, "Owner") + } + + "ProjectionNaming.columnName" should "be deterministic, safe and start with c_" in { + val c = ProjectionNaming.columnName("price.amount") + c shouldBe ProjectionNaming.columnName("price.amount") + c should fullyMatch regex "[a-z0-9_]+".r + c.length should be <= 63 + c should startWith("c_") + } + + "ProjectionDDL.sqlColumnType" should "map DE scalar types to portable SQL types" in { + ProjectionDDL.sqlColumnType("number") shouldBe "numeric" + ProjectionDDL.sqlColumnType("integer") shouldBe "bigint" + ProjectionDDL.sqlColumnType("boolean") shouldBe "boolean" + ProjectionDDL.sqlColumnType("DATE_WITH_DAY") shouldBe "date" + ProjectionDDL.sqlColumnType("string") shouldBe "text" + ProjectionDDL.sqlColumnType("reference:Bank") shouldBe "text" + } +} diff --git a/obp-api/src/test/scala/code/api/dynamic/entity/projection/ProjectionSqlSpec.scala b/obp-api/src/test/scala/code/api/dynamic/entity/projection/ProjectionSqlSpec.scala new file mode 100644 index 0000000000..b19f696797 --- /dev/null +++ b/obp-api/src/test/scala/code/api/dynamic/entity/projection/ProjectionSqlSpec.scala @@ -0,0 +1,47 @@ +package code.api.dynamic.entity.projection + +import code.api.dynamic.entity.query._ +import doobie.implicits._ +import org.scalatest.{FlatSpec, Matchers} + +class ProjectionSqlSpec extends FlatSpec with Matchers { + + private val cols = Map("price" -> "c_price_x", "status" -> "c_status_y") + private val types = Map("price" -> "numeric", "status" -> "text") + private def columnOf(f: String): Option[String] = cols.get(f) + private def sqlTypeOf(f: String): Option[String] = types.get(f) + + private def sql(plan: QueryPlan): String = + ProjectionSql.selectDataIds("de_t", plan, columnOf, sqlTypeOf).get.query[String].sql + + "ProjectionSql" should "build select data_id 
with where / order / limit / offset and cast operands" in { + val s = sql(QueryPlan( + List(Filter("price", FilterOp.Lt, List("10"))), + List(SortKey("price", SortDirection.Desc)), + Page(Some(40), Some(20)))) + s should include ("SELECT data_id FROM de_t") + s should include ("c_price_x") + s.toUpperCase should include ("CAST(") + s should include ("numeric") + s.toUpperCase should include ("ORDER BY") + s.toUpperCase should include ("DESC") + s.toUpperCase should include ("LIMIT") + s.toUpperCase should include ("OFFSET") + } + + it should "AND multiple predicates and support in / between" in { + val s = sql(QueryPlan(List( + Filter("status", FilterOp.In, List("a", "b")), + Filter("price", FilterOp.Between, List("5", "10"))), Nil, Page.empty)) + s.toUpperCase should include ("AND") + s.toUpperCase should include ("IN (") + s.toUpperCase should include ("BETWEEN") + } + + it should "return None when a field is unresolved or a spatial operator is present" in { + ProjectionSql.selectDataIds("de_t", + QueryPlan(List(Filter("nope", FilterOp.Eq, List("1"))), Nil, Page.empty), columnOf, sqlTypeOf) shouldBe None + ProjectionSql.selectDataIds("de_t", + QueryPlan(List(Filter("price", FilterOp.DWithin, List("x"))), Nil, Page.empty), columnOf, sqlTypeOf) shouldBe None + } +} diff --git a/obp-api/src/test/scala/code/api/dynamic/entity/query/QuerySpec.scala b/obp-api/src/test/scala/code/api/dynamic/entity/query/QuerySpec.scala new file mode 100644 index 0000000000..6ac8f96567 --- /dev/null +++ b/obp-api/src/test/scala/code/api/dynamic/entity/query/QuerySpec.scala @@ -0,0 +1,122 @@ +package code.api.dynamic.entity.query + +import com.openbankproject.commons.model.enums.DynamicEntityFieldType +import net.liftweb.json.JsonAST.JObject +import org.scalatest.{FlatSpec, Matchers} + +/** + * Pure unit tests for the DE_indexing query core: param parser, definition-driven planner, + * and the in-memory executor (the portable floor + oracle). No server / DB. 
+ */ +class QuerySpec extends FlatSpec with Matchers { + + private def params(kvs: (String, String)*): Map[String, List[String]] = + kvs.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).toList } + + private def rec(s: String): JObject = net.liftweb.json.parse(s).asInstanceOf[JObject] + + private val indexed: Map[String, FieldSpec] = Map( + "price" -> FieldSpec(DynamicEntityFieldType.number, "scalar"), + "qty" -> FieldSpec(DynamicEntityFieldType.integer, "scalar"), + "status" -> FieldSpec(DynamicEntityFieldType.string, "scalar"), + "active" -> FieldSpec(DynamicEntityFieldType.boolean, "scalar"), + "geom" -> FieldSpec(DynamicEntityFieldType.json, "spatial") + ) + private val fieldTypes = indexed.mapValues(_.fieldType).toMap + + // ----- QueryParamParser ----- + + "QueryParamParser" should "parse a scalar filter, sort and pagination" in { + val Right((filters, sort, page)) = QueryParamParser.parse(params( + "obp_filter[price]" -> "lt:10", "obp_sort_by" -> "price", "obp_sort_direction" -> "DESC", + "obp_limit" -> "20", "obp_offset" -> "40")) + filters shouldBe List(Filter("price", FilterOp.Lt, List("10"))) + sort shouldBe List(SortKey("price", SortDirection.Desc)) + page shouldBe Page(Some(40), Some(20)) + } + + it should "split in/between values on commas but keep other operands opaque" in { + val Right((filters, _, _)) = QueryParamParser.parse(params( + "obp_filter[status]" -> "in:a,b,c", "obp_filter[price]" -> "between:5,10")) + filters.toSet shouldBe Set( + Filter("status", FilterOp.In, List("a", "b", "c")), + Filter("price", FilterOp.Between, List("5", "10"))) + } + + it should "reject a missing operator, an unknown operator and a bad direction/limit" in { + QueryParamParser.parse(params("obp_filter[price]" -> "10")).isLeft shouldBe true + QueryParamParser.parse(params("obp_filter[price]" -> "foo:10")).isLeft shouldBe true + QueryParamParser.parse(params("obp_sort_by" -> "price", "obp_sort_direction" -> "sideways")).isLeft shouldBe true + 
QueryParamParser.parse(params("obp_limit" -> "-1")).isLeft shouldBe true + QueryParamParser.parse(params("obp_limit" -> "x")).isLeft shouldBe true + } + + // ----- QueryPlanner ----- + + "QueryPlanner" should "accept a valid plan" in { + QueryPlanner.plan(List(Filter("price", FilterOp.Lt, List("10"))), List(SortKey("price", SortDirection.Asc)), Page.empty, indexed).isRight shouldBe true + } + + it should "reject a non-indexed field" in { + QueryPlanner.plan(List(Filter("colour", FilterOp.Eq, List("red"))), Nil, Page.empty, indexed).isLeft shouldBe true + } + + it should "reject an operator illegal for the field's type" in { + QueryPlanner.plan(List(Filter("active", FilterOp.Gt, List("true"))), Nil, Page.empty, indexed).isLeft shouldBe true // gt on boolean + QueryPlanner.plan(List(Filter("status", FilterOp.Gt, List("x"))), Nil, Page.empty, indexed).isLeft shouldBe true // gt on string + } + + it should "reject a value that does not coerce to the field type" in { + QueryPlanner.plan(List(Filter("price", FilterOp.Lt, List("abc"))), Nil, Page.empty, indexed).isLeft shouldBe true + QueryPlanner.plan(List(Filter("qty", FilterOp.Eq, List("1.5"))), Nil, Page.empty, indexed).isLeft shouldBe true + } + + it should "reject between with the wrong arity" in { + QueryPlanner.plan(List(Filter("price", FilterOp.Between, List("5"))), Nil, Page.empty, indexed).isLeft shouldBe true + } + + it should "reject sorting by a json/spatial field" in { + QueryPlanner.plan(Nil, List(SortKey("geom", SortDirection.Asc)), Page.empty, indexed).isLeft shouldBe true + } + + it should "allow spatial operators only on a spatial field" in { + QueryPlanner.plan(List(Filter("geom", FilterOp.DWithin, List("13.4,52.5;100000"))), Nil, Page.empty, indexed).isRight shouldBe true + QueryPlanner.plan(List(Filter("price", FilterOp.DWithin, List("x"))), Nil, Page.empty, indexed).isLeft shouldBe true + } + + // ----- InMemoryQueryExecutor ----- + + private val data = List( + 
rec("""{"price":10,"qty":1,"status":"active"}"""), + rec("""{"price":5,"qty":3,"status":"pending"}"""), + rec("""{"price":20,"qty":2,"status":"active"}""") + ) + + "InMemoryQueryExecutor" should "filter numerically (not lexically)" in { + val plan = QueryPlan(List(Filter("price", FilterOp.Lt, List("10"))), Nil, Page.empty) + InMemoryQueryExecutor.execute(data, plan, fieldTypes).map(d => (d \ "price").values) shouldBe List(BigInt(5)) + } + + it should "sort ascending and descending by a numeric field" in { + val asc = InMemoryQueryExecutor.execute(data, QueryPlan(Nil, List(SortKey("price", SortDirection.Asc)), Page.empty), fieldTypes) + asc.map(d => (d \ "price").values) shouldBe List(BigInt(5), BigInt(10), BigInt(20)) + val desc = InMemoryQueryExecutor.execute(data, QueryPlan(Nil, List(SortKey("price", SortDirection.Desc)), Page.empty), fieldTypes) + desc.map(d => (d \ "price").values) shouldBe List(BigInt(20), BigInt(10), BigInt(5)) + } + + it should "apply offset and limit after sorting" in { + val plan = QueryPlan(Nil, List(SortKey("price", SortDirection.Asc)), Page(Some(1), Some(1))) + InMemoryQueryExecutor.execute(data, plan, fieldTypes).map(d => (d \ "price").values) shouldBe List(BigInt(10)) + } + + it should "support eq, in and between" in { + InMemoryQueryExecutor.execute(data, QueryPlan(List(Filter("status", FilterOp.Eq, List("active"))), Nil, Page.empty), fieldTypes).size shouldBe 2 + InMemoryQueryExecutor.execute(data, QueryPlan(List(Filter("status", FilterOp.In, List("pending", "x"))), Nil, Page.empty), fieldTypes).size shouldBe 1 + InMemoryQueryExecutor.execute(data, QueryPlan(List(Filter("price", FilterOp.Between, List("6", "20"))), Nil, Page.empty), fieldTypes).size shouldBe 2 + } + + it should "exclude records whose field is missing or not coercible" in { + val withMissing = data :+ rec("""{"qty":9,"status":"active"}""") // no price + InMemoryQueryExecutor.execute(withMissing, QueryPlan(List(Filter("price", FilterOp.Ge, List("0"))), Nil, 
Page.empty), fieldTypes).size shouldBe 3 + } +} diff --git a/obp-api/src/test/scala/code/api/v6_0_0/DynamicEntityFieldRolesTest.scala b/obp-api/src/test/scala/code/api/v6_0_0/DynamicEntityFieldRolesTest.scala new file mode 100644 index 0000000000..c91123c794 --- /dev/null +++ b/obp-api/src/test/scala/code/api/v6_0_0/DynamicEntityFieldRolesTest.scala @@ -0,0 +1,202 @@ +/** +Open Bank Project - API +Copyright (C) 2011-2025, TESOBE GmbH + +This program is free software: you can redistribute it and/or modify +it under the terms of the GNU Affero General Public License as published by +the Free Software Foundation, either version 3 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU Affero General Public License for more details. + +You should have received a copy of the GNU Affero General Public License +along with this program. If not, see <http://www.gnu.org/licenses/>. + +Email: contact@tesobe.com +TESOBE GmbH +Osloerstrasse 16/17 +Berlin 13359, Germany + +This product includes software developed at +TESOBE (http://www.tesobe.com/) + */ +package code.api.v6_0_0 + +import code.api.util.APIUtil.OAuth._ +import code.api.util.ApiRole._ +import code.entitlement.Entitlement +import com.openbankproject.commons.util.ApiVersion +import net.liftweb.json.JsonDSL._ +import net.liftweb.json.Serialization.write +import net.liftweb.json._ +import org.scalatest.Tag + +/** + * Field-level write/read role permissions on Dynamic Entities. + * + * Lives in v6_0_0 because Dynamic Entity *definitions* are created via the v6.0.0 + * `management/system-dynamic-entities` endpoint and the DE test harness lives here; the runtime + * instance CRUD (incl. the new PATCH) is served on the version-agnostic `/obp/dynamic-entity` path. 
+ */ +class DynamicEntityFieldRolesTest extends V600ServerSetup { + + object VersionOfApi extends Tag(ApiVersion.v6_0_0.toString) + + // ==================== Helpers ==================== + + private def grant(role: String): Unit = + Entitlement.entitlement.vend.addEntitlement("", resourceUser1.userId, role) + + private def createSystemEntity(entityJson: JValue): (Int, JValue) = { + grant(CanCreateSystemLevelDynamicEntity.toString) + val request = (v6_0_0_Request / "management" / "system-dynamic-entities").POST <@(user1) + val response = makePostRequest(request, write(entityJson)) + (response.code, response.body) + } + + private def deleteSystemEntity(dynamicEntityId: String): Unit = { + grant(CanDeleteSystemLevelDynamicEntity.toString) + val deleteRequest = (v4_0_0_Request / "management" / "system-dynamic-entities" / dynamicEntityId).DELETE <@(user1) + makeDeleteRequest(deleteRequest) + } + + // ==================== Fixture ==================== + + private val entityName = "field_roles_test" + private val single = "field_roles_test" // singleName wrapper key + private val idName = "field_roles_test_id" + + // Auto-generated (system-level) field roles + private val writeInternalRole = "CanWriteDynamicEntityField_Systemfield_roles_test__internal_note" + private val readSecretRole = "CanGetDynamicEntityField_Systemfield_roles_test__secret_note" + // Entity-level (system-level) roles + private val createRole = "CanCreateDynamicEntity_Systemfield_roles_test" + private val getRole = "CanGetDynamicEntity_Systemfield_roles_test" + private val updateRole = "CanUpdateDynamicEntity_Systemfield_roles_test" + + private val schema: JValue = parse( + """ + |{ + | "description": "Field-level role test entity.", + | "required": ["name"], + | "properties": { + | "name": {"type": "string", "minLength": 1, "maxLength": 40, "example": "Acme"}, + | "internal_note": {"type": "string", "example": "set via patch", "writeRoleRequired": true}, + | "secret_note": {"type": "string", "example": 
"hush", "readRoleRequired": true} + | } + |} + """.stripMargin) + + private val entity: JValue = + ("entity_name" -> entityName) ~ + ("has_personal_entity" -> true) ~ + ("schema" -> schema) + + private def recordId(createBody: JValue): String = (createBody \ single \ idName).extract[String] + + // ==================== Scenarios ==================== + + feature("Field-level write/read role permissions on Dynamic Entities") { + + scenario("A definition with field-level keywords can be created", VersionOfApi) { + val (code, body) = createSystemEntity(entity) + try code should equal(201) + finally deleteSystemEntity((body \ "dynamic_entity_id").extract[String]) + } + + scenario("POST drops a write-restricted field", VersionOfApi) { + val (code, body) = createSystemEntity(entity) + code should equal(201) + val dynamicEntityId = (body \ "dynamic_entity_id").extract[String] + try { + grant(createRole); grant(getRole) + When("We POST a record that includes the write-restricted internal_note") + val createResp = makePostRequest((dynamicEntity_Request / entityName).POST <@(user1), + write(parse("""{"name":"Acme","internal_note":"should be dropped"}"""))) + createResp.code should equal(201) + val id = recordId(createResp.body) + + Then("GET should not contain internal_note (it was stripped at create)") + val getResp = makeGetRequest((dynamicEntity_Request / entityName / id).GET <@(user1)) + getResp.code should equal(200) + (getResp.body \ single \ "name").extract[String] should equal("Acme") + (getResp.body \ single \ "internal_note") should equal(JNothing) + } finally deleteSystemEntity(dynamicEntityId) + } + + scenario("PUT cannot set a write-restricted field", VersionOfApi) { + val (code, body) = createSystemEntity(entity) + code should equal(201) + val dynamicEntityId = (body \ "dynamic_entity_id").extract[String] + try { + grant(createRole); grant(getRole); grant(updateRole) + val createResp = makePostRequest((dynamicEntity_Request / entityName).POST <@(user1), 
write(parse("""{"name":"Acme"}"""))) + val id = recordId(createResp.body) + + When("We PUT trying to set internal_note") + val putResp = makePutRequest((dynamicEntity_Request / entityName / id).PUT <@(user1), + write(parse("""{"name":"Acme2","internal_note":"hacked"}"""))) + putResp.code should equal(200) + + Then("internal_note remains unset; the unrestricted field updated") + val getResp = makeGetRequest((dynamicEntity_Request / entityName / id).GET <@(user1)) + (getResp.body \ single \ "name").extract[String] should equal("Acme2") + (getResp.body \ single \ "internal_note") should equal(JNothing) + } finally deleteSystemEntity(dynamicEntityId) + } + + scenario("PATCH a write-restricted field requires the field write role", VersionOfApi) { + val (code, body) = createSystemEntity(entity) + code should equal(201) + val dynamicEntityId = (body \ "dynamic_entity_id").extract[String] + try { + grant(createRole); grant(getRole); grant(updateRole) + val createResp = makePostRequest((dynamicEntity_Request / entityName).POST <@(user1), write(parse("""{"name":"Acme"}"""))) + val id = recordId(createResp.body) + + When("We PATCH internal_note WITHOUT the field write role") + val patch1 = makePatchRequest((dynamicEntity_Request / entityName / id).PATCH <@(user1), + write(parse("""{"internal_note":"viaPatch"}"""))) + Then("We get 403") + patch1.code should equal(403) + + When("We grant the field write role and PATCH again") + grant(writeInternalRole) + val patch2 = makePatchRequest((dynamicEntity_Request / entityName / id).PATCH <@(user1), + write(parse("""{"internal_note":"viaPatch"}"""))) + Then("We get 200 and the value is set; the other field is preserved") + patch2.code should equal(200) + val getResp = makeGetRequest((dynamicEntity_Request / entityName / id).GET <@(user1)) + (getResp.body \ single \ "internal_note").extract[String] should equal("viaPatch") + (getResp.body \ single \ "name").extract[String] should equal("Acme") + } finally 
deleteSystemEntity(dynamicEntityId) + } + + scenario("GET omits a read-restricted field unless the caller holds the read role", VersionOfApi) { + val (code, body) = createSystemEntity(entity) + code should equal(201) + val dynamicEntityId = (body \ "dynamic_entity_id").extract[String] + try { + grant(createRole); grant(getRole) + When("We POST a record with secret_note (read-restricted but writable)") + val createResp = makePostRequest((dynamicEntity_Request / entityName).POST <@(user1), + write(parse("""{"name":"Acme","secret_note":"hush"}"""))) + createResp.code should equal(201) + val id = recordId(createResp.body) + + Then("GET without the field read role omits secret_note") + val getResp1 = makeGetRequest((dynamicEntity_Request / entityName / id).GET <@(user1)) + (getResp1.body \ single \ "secret_note") should equal(JNothing) + + When("We grant the field read role") + grant(readSecretRole) + Then("GET now includes secret_note") + val getResp2 = makeGetRequest((dynamicEntity_Request / entityName / id).GET <@(user1)) + (getResp2.body \ single \ "secret_note").extract[String] should equal("hush") + } finally deleteSystemEntity(dynamicEntityId) + } + } +} diff --git a/obp-api/src/test/scala/code/api/v6_0_0/ProjectionDataPlaneIntegrationTest.scala b/obp-api/src/test/scala/code/api/v6_0_0/ProjectionDataPlaneIntegrationTest.scala new file mode 100644 index 0000000000..0d6be0965a --- /dev/null +++ b/obp-api/src/test/scala/code/api/v6_0_0/ProjectionDataPlaneIntegrationTest.scala @@ -0,0 +1,63 @@ +package code.api.v6_0_0 + +import code.api.dynamic.entity.projection.{IndexingCapabilities, ProjectionDb, ProjectionDDL, ProjectionSql, ProjectionStore} +import code.api.dynamic.entity.query._ +import code.api.util.APIUtil +import cats.effect.unsafe.implicits.global +import doobie.implicits._ + +/** + * Phase 3 integration proof: exercises the projection data-plane (DDL + upsert + compiled SQL) against + * the real Postgres test DB. 
Validates that numeric filtering, sorting and offset/limit run as actual + * SQL on a per-entity projection table (not in-memory). No DE-definition machinery — pure data-plane. + */ +class ProjectionDataPlaneIntegrationTest extends V600ServerSetup { + + private val table = "de_itest_projection" + private val priceCol = "c_price_itest" + + private def run[A](io: cats.effect.IO[A]): A = io.unsafeRunSync() + private def upsert(id: String, price: Option[String]): Unit = + run(ProjectionDb.run(ProjectionStore.upsert(table, id, List(ProjectionStore.ColumnValue(priceCol, "numeric", price))))) + + private def columnOf(f: String): Option[String] = if (f == "price") Some(priceCol) else None + private def sqlTypeOf(f: String): Option[String] = if (f == "price") Some("numeric") else None + private def ids(plan: QueryPlan): List[String] = + run(ProjectionDb.run(ProjectionSql.selectDataIds(table, plan, columnOf, sqlTypeOf).get.query[String].to[List])) + + feature("DE projection data-plane on Postgres") { + scenario("create table, upsert rows, then filter / sort / paginate via compiled SQL") { + // Postgres-only: this test runs Postgres-specific SQL (ON CONFLICT) that H2 cannot execute. + // Gated OFF by default so CI / H2 / developer workstations skip it (canceled, not failed). + // Enable locally with `test.projection.postgres=true` in test.default.props AND a Postgres db.url. 
+ if (!APIUtil.getPropsAsBoolValue("test.projection.postgres", false) || IndexingCapabilities.vendor != IndexingCapabilities.Postgres) + cancel("Postgres projection integration tests disabled (set test.projection.postgres=true with a Postgres db.url; cannot run on H2).") + + run(ProjectionDDL.dropTableIO(table)) + run(ProjectionDDL.createTableIO(table)) + run(ProjectionDDL.addColumnIO(table, priceCol, "numeric")) + + upsert("id1", Some("10")) + upsert("id2", Some("5")) + upsert("id3", Some("20")) + upsert("id4", None) // coerce-or-null: no price + + Then("numeric filter price < 10 returns only id2 (numeric, not lexical)") + ids(QueryPlan(List(Filter("price", FilterOp.Lt, List("10"))), Nil, Page.empty)) should equal(List("id2")) + + Then("between 6 and 20 returns id1 and id3") + ids(QueryPlan(List(Filter("price", FilterOp.Between, List("6", "20"))), + List(SortKey("price", SortDirection.Asc)), Page.empty)) should equal(List("id1", "id3")) + + Then("sort DESC orders by numeric value; the NULL-price row sorts last and is excluded by a filter") + ids(QueryPlan(List(Filter("price", FilterOp.Ge, List("0"))), + List(SortKey("price", SortDirection.Desc)), Page.empty)) should equal(List("id3", "id1", "id2")) + + Then("offset/limit applies after the ORDER BY") + ids(QueryPlan(List(Filter("price", FilterOp.Ge, List("0"))), + List(SortKey("price", SortDirection.Asc)), Page(Some(1), Some(1)))) should equal(List("id1")) + + run(ProjectionDDL.dropTableIO(table)) + } + } +} diff --git a/run_all_tests.sh b/run_all_tests.sh index 36f283155d..07848f4706 100755 --- a/run_all_tests.sh +++ b/run_all_tests.sh @@ -805,6 +805,29 @@ if [ "$FOUND_FILES" = false ]; then log_message "No old test database files found" fi +# --- Postgres test-DB clean (only when the suite is pointed at Postgres) --- +# Persistent Postgres + OBP's re-schemify needs a clean schema each full run, else boot aborts with +# "cannot alter type of a column used by a view". 
Tolerant: skipped when psql is missing or the test DB is unreachable (the usual case on an H2-only setup). +PG_TEST_DB_NAME="${PG_TEST_DB_NAME:-obp_test_only}" +PG_TEST_DB_USER="${PG_TEST_DB_USER:-obp_test_only}" +PG_TEST_DB_PASS="${PG_TEST_DB_PASS:-changeme}" +PG_TEST_DB_HOST="${PG_TEST_DB_HOST:-localhost}" +PG_TEST_DB_PORT="${PG_TEST_DB_PORT:-5432}" +if command -v psql >/dev/null 2>&1; then + PG_TEST_URL="postgresql://${PG_TEST_DB_USER}:${PG_TEST_DB_PASS}@${PG_TEST_DB_HOST}:${PG_TEST_DB_PORT}/${PG_TEST_DB_NAME}" + if psql "$PG_TEST_URL" -tAc "SELECT 1" >/dev/null 2>&1; then + if psql "$PG_TEST_URL" -c "DROP OWNED BY ${PG_TEST_DB_USER} CASCADE;" >/dev/null 2>&1; then + log_message " [OK] Cleaned Postgres test schema: ${PG_TEST_DB_NAME}" + else + log_message " [WARN] Could not clean Postgres test schema (continuing)" + fi + else + log_message " Postgres test DB not reachable (H2 run?) - skipping Postgres clean" + fi +else + log_message " psql not found - skipping Postgres test-DB clean" +fi + log_message "" ################################################################################ diff --git a/scripts/create_test_db.sh b/scripts/create_test_db.sh new file mode 100755 index 0000000000..7c16859276 --- /dev/null +++ b/scripts/create_test_db.sh @@ -0,0 +1,102 @@ +#!/usr/bin/env bash +# +# create_test_db.sh — create a dedicated, wipe-safe Postgres role + database for the OBP-API test suite. +# +# WHY: the test suite defaults to in-memory H2, which can't run the Postgres-specific DDL used by +# the DE indexing projection (CREATE INDEX CONCURRENTLY, pg_extension) or PostGIS (spatial). To test +# those, point the suite at a real Postgres DB — using a DEDICATED role + database so it can never +# interfere with your dev/shared 'obp' role or 'sandbox' database. +# +# WARNING: the OBP test suite RESETS this database on every test class. Use a throwaway DB ONLY — +# never your dev/prod data. +# +# Idempotent: safe to re-run. 
Configure via env vars (defaults shown): +# DB_NAME=obp_test_only DB_USER=obp_test_only DB_PASS=changeme DB_HOST=localhost DB_PORT=5432 +# WITH_POSTGIS=true # set false to skip the PostGIS extension (Phase 4 only) +# RESET_PASSWORD=false # if the role already exists, set true to reset its password to DB_PASS +# # (safe ONLY because DB_USER is a dedicated test role — never point this +# # at a shared role like 'obp', or you'll break other apps) +# DROP_EXISTING=false # set true to DROP and recreate the database from scratch (throwaway!) +# PSQL_SUPER='sudo -u postgres psql' # how to run psql as a superuser; override for non-peer auth, +# # e.g. PSQL_SUPER='psql -h localhost -U postgres' +# +# Usage: ./scripts/create_test_db.sh +# DB_PASS=secret ./scripts/create_test_db.sh +# DROP_EXISTING=true ./scripts/create_test_db.sh # clean slate +# +set -euo pipefail + +DB_NAME="${DB_NAME:-obp_test_only}" +DB_USER="${DB_USER:-obp_test_only}" # DEDICATED test role — separate from the shared dev 'obp' role +DB_PASS="${DB_PASS:-changeme}" +DB_HOST="${DB_HOST:-localhost}" +DB_PORT="${DB_PORT:-5432}" +WITH_POSTGIS="${WITH_POSTGIS:-true}" +RESET_PASSWORD="${RESET_PASSWORD:-false}" +DROP_EXISTING="${DROP_EXISTING:-false}" +PSQL_SUPER="${PSQL_SUPER:-sudo -u postgres psql}" + +# Run from a world-readable dir so `sudo -u postgres psql` doesn't warn +# "could not change directory to ..." when the cwd isn't readable by the postgres OS user. +cd /tmp + +echo "==> Ensuring dedicated test role '$DB_USER'" +# Create the role if missing. If it already exists, only reset its password when RESET_PASSWORD=true. +# (The shared-role-clobber safety: never silently change an existing role's password.) 
+# Assumption: the SQL body below is reconstructed from the behaviour described above. +$PSQL_SUPER -v ON_ERROR_STOP=1 <<SQL +DO \$\$ +BEGIN + IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '$DB_USER') THEN + CREATE ROLE "$DB_USER" LOGIN PASSWORD '$DB_PASS'; + ELSIF '$RESET_PASSWORD' = 'true' THEN + ALTER ROLE "$DB_USER" WITH LOGIN PASSWORD '$DB_PASS'; + END IF; +END +\$\$; +SQL + +if [ "$DROP_EXISTING" = "true" ]; then + echo "==> DROP_EXISTING=true — dropping database '$DB_NAME' if it exists" + $PSQL_SUPER -v ON_ERROR_STOP=1 -c "DROP DATABASE IF EXISTS \"$DB_NAME\";" +fi + +echo "==> Ensuring database '$DB_NAME' (owner '$DB_USER')" +if $PSQL_SUPER -tAc "SELECT 1 FROM pg_database WHERE datname = '$DB_NAME'" | grep -q 1; then + echo " exists — reassigning owner to '$DB_USER'" + $PSQL_SUPER -v ON_ERROR_STOP=1 -c "ALTER DATABASE \"$DB_NAME\" OWNER TO \"$DB_USER\";" +else + $PSQL_SUPER -v ON_ERROR_STOP=1 -c "CREATE DATABASE \"$DB_NAME\" OWNER \"$DB_USER\";" +fi + +echo "==> Granting privileges (DB, public schema, and any pre-existing objects)" +$PSQL_SUPER -v ON_ERROR_STOP=1 -c "GRANT ALL PRIVILEGES ON DATABASE \"$DB_NAME\" TO \"$DB_USER\";" +$PSQL_SUPER -v ON_ERROR_STOP=1 -d "$DB_NAME" -c "GRANT ALL ON SCHEMA public TO \"$DB_USER\";" +$PSQL_SUPER -v ON_ERROR_STOP=1 -d "$DB_NAME" -c "GRANT ALL ON ALL TABLES IN SCHEMA public TO \"$DB_USER\";" +$PSQL_SUPER -v ON_ERROR_STOP=1 -d "$DB_NAME" -c "GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO \"$DB_USER\";" + +if [ "$WITH_POSTGIS" = "true" ]; then + echo "==> Enabling PostGIS extension (for spatial / Phase 4)" + if ! $PSQL_SUPER -v ON_ERROR_STOP=1 -d "$DB_NAME" -c "CREATE EXTENSION IF NOT EXISTS postgis;"; then + echo " WARNING: could not enable PostGIS — the OS package is probably missing." + echo " Install it, e.g.: sudo apt-get install postgresql-<version>-postgis-3" + echo " then re-run, or set WITH_POSTGIS=false to skip (Phase 3 doesn't need it)." + fi +fi + +cat <<MSG +==> Done. Dedicated test role '$DB_USER' owns database '$DB_NAME' (your shared 'obp'/'sandbox' setup is untouched). 
+ +Point the TEST suite at it — set in obp-api/src/main/resources/props/test.default.props (do NOT commit): + + db.driver=org.postgresql.Driver + db.url=jdbc:postgresql://$DB_HOST:$DB_PORT/$DB_NAME?user=$DB_USER&password=$DB_PASS + +Verify the connection: + + psql "postgresql://$DB_USER:$DB_PASS@$DB_HOST:$DB_PORT/$DB_NAME" -c "SELECT version();" +MSG diff --git a/scripts/run_projection_tests.sh b/scripts/run_projection_tests.sh new file mode 100755 index 0000000000..b52fa65aff --- /dev/null +++ b/scripts/run_projection_tests.sh @@ -0,0 +1,57 @@ +#!/usr/bin/env bash +# +# run_projection_tests.sh — fast iteration on just the DE-indexing / projection test suites. +# +# Runs only: +# - code.api.dynamic.entity.query.QuerySpec (pure: parser / planner / executor) +# - code.api.dynamic.entity.projection.ProjectionNamingSpec (pure) +# - code.api.dynamic.entity.projection.ProjectionSqlSpec (pure: SQL generation) +# - code.api.v6_0_0.ProjectionDataPlaneIntegrationTest (Postgres-only; cancels on H2) +# +# The integration test needs: test.default.props pointing at Postgres (db.driver/db.url) AND +# test.projection.postgres=true, plus a clean schema (this script does the DROP OWNED clean first). +# On H2 / no Postgres it simply cancels — the pure specs still run. +# +# Config via env (defaults match scripts/create_test_db.sh): +# DB_NAME=obp_test_only DB_USER=obp_test_only DB_PASS=changeme DB_HOST=localhost DB_PORT=5432 +# SKIP_COMMONS_INSTALL=false # set true to skip the obp-commons install (faster, only safe if +# # obp-commons is unchanged in ~/.m2) +# +set -uo pipefail # not -e: we want to proceed even if the clean step warns + +DB_NAME="${DB_NAME:-obp_test_only}" +DB_USER="${DB_USER:-obp_test_only}" +DB_PASS="${DB_PASS:-changeme}" +DB_HOST="${DB_HOST:-localhost}" +DB_PORT="${DB_PORT:-5432}" +SKIP_COMMONS_INSTALL="${SKIP_COMMONS_INSTALL:-false}" + +cd "$(dirname "$0")/.." 
+ +SUITES="code.api.dynamic.entity.query.QuerySpec,code.api.dynamic.entity.projection.ProjectionNamingSpec,code.api.dynamic.entity.projection.ProjectionSqlSpec,code.api.v6_0_0.ProjectionDataPlaneIntegrationTest" + +# 1) Clean the Postgres test schema so the integration test's boot schemify doesn't abort. Tolerant. +if command -v psql >/dev/null 2>&1; then + PG_URL="postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:${DB_PORT}/${DB_NAME}" + if psql "$PG_URL" -tAc "SELECT 1" >/dev/null 2>&1; then + if psql "$PG_URL" -c "DROP OWNED BY ${DB_USER} CASCADE;" >/dev/null 2>&1; then + echo "[OK] Cleaned Postgres test schema: ${DB_NAME}" + else + echo "[WARN] Could not clean Postgres test schema (continuing)" + fi + else + echo "[info] Postgres test DB not reachable - the integration test will cancel; pure specs still run." + fi +else + echo "[info] psql not found - skipping Postgres clean; integration test cancels unless db.url is Postgres." +fi + +# 2) Keep obp-commons fresh in ~/.m2 (tests resolve it from there, not target/classes). +if [ "$SKIP_COMMONS_INSTALL" != "true" ]; then + echo "==> Installing obp-commons (skipTests)" + mvn install -pl obp-commons -DskipTests -q || { echo "obp-commons install failed"; exit 1; } +fi + +# 3) Run just the projection suites. +echo "==> Running projection test suites" +mvn test -pl obp-api -DwildcardSuites="$SUITES" -DfailIfNoTests=false