Add pluggable metadata store as opt-in source of truth #1503
Open
ricardo-devis-agullo wants to merge 10 commits into
Introduce a pluggable metadata adapter (sibling to storage.adapter) that makes a database the authoritative index for which components/versions exist, while statics stay in object storage and the in-memory hot read path is unchanged. Storage-only mode remains the default and is fully non-breaking (no metadata block = today's behavior, byte-for-byte).

Core
- Add ComponentRow/MetadataStore/MetadataConfig types and optional metadata config on Config; presence-based enablement.
- Add shared metadata-index: one getAllComponents() hydrates both ComponentsList and ComponentsDetails; MetadataIndex.add() updates the snapshot immediately after publish.
- Route components-cache and components-details through the metadata index when present; storage path untouched when absent. Prevent a second DB polling loop in details during metadata mode.
- Repository: initialise the store before caches; optional startup reconcileFromStorage and exportLegacyFiles; publish commits the metadata row after statics upload (insert is the commit point); map VERSION_ALREADY_EXISTS to the existing already_exists publish error.
- Add optional MetadataStore.close() and wire it into registry.close() and the oc registry migrate-metadata CLI facade (finally block).
- Validate metadata adapter config in registry-configuration.

Adapters
- oc-metadata-adapters-utils: shared ComponentRow/MetadataStore contract and VERSION_ALREADY_EXISTS code.
- oc-azure-sql-metadata-adapter: first official adapter (mssql). manageSchema DDL, getAllComponents, addVersion, close() pool lifecycle, SQL Server unique-violation (2627/2601) mapping. Env-var-gated integration tests (OC_METADATA_SQL_CONNECTION_STRING), skipped otherwise.

Migration
- oc registry migrate-metadata <configPath> backfills from components-details.json with a storage directory scan fallback; idempotent (existing rows skipped).

Tests
- Metadata-mode cache/details hydration, shared snapshot reuse, repository init/publish/duplicate/concurrency/failure injection, close() wiring (repository + registry + migrate facade), migration backfill, config validation, Azure SQL adapter mocked unit tests.
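
The publish ordering in the commit above (statics upload first, metadata insert as the commit point, duplicates mapped to `already_exists`) can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `ComponentRow` fields, the `InMemoryStore` class, the `publish` signature, the `PublishResult` type, and the `code` property on the error are all hypothetical, since the commit names the types and the `VERSION_ALREADY_EXISTS` code without spelling out their shapes.

```typescript
// Hypothetical row shape; the real ComponentRow fields are not listed in this PR text.
interface ComponentRow {
  name: string;
  version: string;
  publishDate: number; // epoch milliseconds
}

const VERSION_ALREADY_EXISTS = "VERSION_ALREADY_EXISTS";

// Only the method this sketch needs; the real contract has more (getAllComponents, close, ...).
interface MetadataStore {
  addVersion(row: ComponentRow): Promise<void>;
}

// Hypothetical result type for illustration.
type PublishResult = { ok: true } | { ok: false; error: "already_exists" };

// In-memory stand-in: the Set plays the role of the database unique constraint.
class InMemoryStore implements MetadataStore {
  private readonly keys = new Set<string>();

  async addVersion(row: ComponentRow): Promise<void> {
    const key = `${row.name}@${row.version}`;
    if (this.keys.has(key)) {
      throw Object.assign(new Error(`${key} already exists`), {
        code: VERSION_ALREADY_EXISTS,
      });
    }
    this.keys.add(key);
  }
}

async function publish(
  uploadStatics: () => Promise<void>,
  store: MetadataStore,
  row: ComponentRow
): Promise<PublishResult> {
  // Statics go to object storage first; the version is not visible to readers yet.
  await uploadStatics();
  try {
    // The insert is the commit point: it either succeeds or hits the unique constraint.
    await store.addVersion(row);
  } catch (err) {
    const code = (err as { code?: string } | undefined)?.code;
    if (code === VERSION_ALREADY_EXISTS) {
      return { ok: false, error: "already_exists" };
    }
    throw err; // any other failure propagates unchanged
  }
  return { ok: true };
}
```

With this ordering, a failed or duplicate insert can leave uploaded statics behind, but it never exposes a version whose files are missing.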
Add an Azure Table Storage metadata adapter that uses PartitionKey=component_name, RowKey=version as the unique constraint, which is the exact concurrency model Option B needs. It is schemaless (no migrations framework), HTTP-based (no connection pool), and works with the same Azure storage account already used for blob statics.

Implements the shared MetadataStore interface:
- isValid: connection string or endpoint + credentials + table name rules
- initialise: createTable (idempotent) or verify table accessibility
- getAllComponents: listEntities via SDK paged async iterator (auto-paginates)
- addVersion: createEntity, 409 Conflict -> VERSION_ALREADY_EXISTS
- close: clears client reference (no pool to close)

Supports connectionString, endpoint+accountName/accountKey, endpoint+sasToken, allowInsecureConnection (Azurite), and custom tableName.

18 mocked unit tests passing; 7 integration tests env-var-gated on OC_METADATA_TABLE_CONNECTION_STRING (skipped without it).
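
To make the key mapping and the 409 handling concrete, here is a small sketch that runs without the Azure SDK. It is not the adapter's code: `TableLike` and `FakeTable` are hypothetical stand-ins for the table client, the `ComponentRow` fields are assumed, and the camelCase `partitionKey`/`rowKey` fields and the `statusCode` property are assumptions modeled on how the Azure JavaScript SDK represents entities and HTTP errors.

```typescript
// Hypothetical row shape; the real ComponentRow fields are not listed in this PR text.
interface ComponentRow {
  name: string;
  version: string;
  publishDate: number; // epoch milliseconds
}

const VERSION_ALREADY_EXISTS = "VERSION_ALREADY_EXISTS";

// Assumed entity layout: component name as partition key, version as row key.
interface VersionEntity {
  partitionKey: string;
  rowKey: string;
  publishDate: number;
}

// Stand-in for the single table-client call used here.
interface TableLike {
  createEntity(entity: VersionEntity): Promise<void>;
}

function toEntity(row: ComponentRow): VersionEntity {
  return { partitionKey: row.name, rowKey: row.version, publishDate: row.publishDate };
}

async function addVersion(table: TableLike, row: ComponentRow): Promise<void> {
  try {
    await table.createEntity(toEntity(row));
  } catch (err) {
    // Assumption: a key conflict surfaces as an error carrying statusCode 409.
    const status = (err as { statusCode?: number } | undefined)?.statusCode;
    if (status === 409) {
      throw Object.assign(new Error(`${row.name}@${row.version} already exists`), {
        code: VERSION_ALREADY_EXISTS,
      });
    }
    throw err; // anything else is not a duplicate and propagates unchanged
  }
}

// Fake table that rejects a repeated (partitionKey, rowKey) pair, as the service would.
class FakeTable implements TableLike {
  private readonly keys = new Set<string>();

  async createEntity(entity: VersionEntity): Promise<void> {
    const key = JSON.stringify([entity.partitionKey, entity.rowKey]);
    if (this.keys.has(key)) {
      throw Object.assign(new Error("EntityAlreadyExists"), { statusCode: 409 });
    }
    this.keys.add(key);
  }
}
```

Because the uniqueness check happens inside the table service, two nodes publishing the same version concurrently cannot both succeed, and no read-then-write step is needed in the adapter.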
What
Makes a database the authoritative metadata index for which components/versions exist, while statics stay in object storage and the in-memory hot read path is unchanged. Storage-only mode remains the default and is fully non-breaking: no `metadata` block = today's behavior, byte-for-byte.

Design and status live in `metadata-scaling-option-b.md` and `metadata-scaling-option-b-status.md` at the repo root.

Why
Today the metadata index is a derived blob (`components.json`) rebuilt by a full O(registry) directory scan on startup and after every publish, with last-writer-wins concurrency across nodes. Under unbounded, AI-accelerated publishing this is CPU/GC pressure and a correctness ceiling. A queryable store turns publish into an O(1) atomic append and cross-node correctness into a `UNIQUE` constraint, and stays flat under growth.

Scope
Core (`packages/oc`)
- `ComponentRow`/`MetadataStore`/`MetadataConfig` types; optional `metadata` on `Config` (presence-based enablement).
- Shared `metadata-index`: one `getAllComponents()` hydrates both `ComponentsList` and `ComponentsDetails`; `MetadataIndex.add()` updates the snapshot right after publish.
- `components-cache` and `components-details` route through the metadata index when present; storage path untouched when absent; no second DB polling loop in details.
- Repository: initialise the store before caches; optional startup `reconcileFromStorage` and `exportLegacyFiles`; publish commits the metadata row after statics upload (insert = commit point); `VERSION_ALREADY_EXISTS` → existing `already_exists` publish error.
- Optional `MetadataStore.close()` wired into `registry.close()` (server first, then pool) and the `oc registry migrate-metadata` CLI facade (`finally` block).
- Validate metadata adapter config in `registry-configuration`.

Adapters
- `oc-metadata-adapters-utils`: shared `ComponentRow`/`MetadataStore` contract + `VERSION_ALREADY_EXISTS` code. Core gains zero runtime DB deps.
- `oc-azure-sql-metadata-adapter`: first official adapter (mssql). `manageSchema` DDL, `getAllComponents`, `addVersion`, `close()` pool lifecycle, SQL Server unique-violation (2627/2601) mapping, `schemaName`/`tableName` customisation.

Migration
- `oc registry migrate-metadata <configPath>` backfills from `components-details.json` with a storage directory scan fallback; idempotent (existing rows skipped).

Tests
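- Metadata-mode cache/details hydration, shared snapshot reuse, repository init/publish/duplicate/concurrency/failure injection, `close()` wiring (repository + registry + migrate facade), migration backfill, config validation, Azure SQL adapter mocked unit tests (19 passing).
- Integration tests are gated on `OC_METADATA_SQL_CONNECTION_STRING` and skip otherwise; they still need a live SQL Server run (Docker CI or external) to execute.

Non-breaking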
Full OC suite green with `metadata` absent (storage mode unchanged): `894 passing`. Root build: `4 successful, 4 total`.

Out of scope (deferred by decision)
Scheduled/background `reconcileFromStorage` and `exportLegacyFiles`; degraded DB-down cold start from legacy files; explicit readiness signaling; richer migration reporting; docs-site publishing. The S3/GS `listSubDirectories` pagination prerequisite is tracked separately in the `opencomponents/storage-adapters` repo (already merged there).
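
For reviewers, the shared snapshot listed under Core (one `getAllComponents()` result feeding both `ComponentsList` and `ComponentsDetails`, with `MetadataIndex.add()` applied right after publish) might look roughly like this. It is a sketch under assumptions rather than the implementation: the `ComponentRow` fields, the shapes of the two views, and the `hydrate`, `toList`, and `toDetails` method names are hypothetical; only `MetadataIndex.add()` is named in this PR.

```typescript
// Hypothetical row shape; the real ComponentRow fields are not listed in this PR text.
interface ComponentRow {
  name: string;
  version: string;
  publishDate: number; // epoch milliseconds
}

// Assumed shapes for the two in-memory views.
interface ComponentsList {
  lastEdit: number;
  components: Record<string, string[]>;
}

interface ComponentsDetails {
  lastEdit: number;
  components: Record<string, Record<string, { publishDate: number }>>;
}

class MetadataIndex {
  private rows: ComponentRow[] = [];
  private lastEdit = 0;

  // Replace the snapshot with the rows from a single getAllComponents() call.
  hydrate(rows: ComponentRow[], now: number): void {
    this.rows = [...rows];
    this.lastEdit = now;
  }

  // Applied right after a successful publish, so no extra database read is needed.
  add(row: ComponentRow, now: number): void {
    const exists = this.rows.some(
      (r) => r.name === row.name && r.version === row.version
    );
    if (!exists) this.rows.push(row);
    this.lastEdit = now;
  }

  // Versions are listed in insertion order; no semver sorting is attempted here.
  toList(): ComponentsList {
    const components: Record<string, string[]> = {};
    for (const { name, version } of this.rows) {
      const versions = components[name] ?? [];
      versions.push(version);
      components[name] = versions;
    }
    return { lastEdit: this.lastEdit, components };
  }

  toDetails(): ComponentsDetails {
    const components: ComponentsDetails["components"] = {};
    for (const { name, version, publishDate } of this.rows) {
      const versions = components[name] ?? {};
      versions[version] = { publishDate };
      components[name] = versions;
    }
    return { lastEdit: this.lastEdit, components };
  }
}
```

Both views are derived from the same rows, which is why details mode does not need its own polling loop against the database.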