feat(warehouse): Phase 2 — BigQuery connection management#395
feat(warehouse): Phase 2 — BigQuery connection management#395Shrotriya-lalit wants to merge 11 commits into
Conversation
Phase 1 of the BigQuery warehouse connector. Adds Prisma models for BigQueryConnection, BigQuerySync and BigQuerySyncRun with four supporting enums, two Prisma migrations, typed JSON column via PrismaJson namespace, Zod schemas (zBigQuerySyncConfig + column mapping variants) in @openpanel/validation, and the @google-cloud/bigquery dependency.
- Add zBqColRef validator for column references (supports dot-notation for STRUCT nested fields like user.profile.email) - Add mappingType discriminator to both mapping schemas so the union is discriminated and TypeScript can narrow the type cleanly - Add superRefine cross-validation: append mode events syncs must declare an insertTime column (the TIMESTAMP cursor) - Update plan with verified BigQuery Node.js client type mappings: INT64 needs wrapIntegers:true, TIMESTAMP/.value not .toISOString(), DATETIME has no timezone, BYTES→Buffer, Big for NUMERIC/BIGNUMERIC
Real-world orgs connect multiple data sources to one project (e.g. jm-ebg and jm-ebg-cdp on the same ROAS project). The original @unique on projectId wrongly enforced a single connection per project. - Remove @unique from BigQueryConnection.projectId - Add name String field (user label, e.g. "CDP Source") - Replace with @@unique([projectId, name]) — names unique within project - Change Project.bigQueryConnection? → bigQueryConnections[]
Schema additions: - BigQueryConnection: gcpRegion (GDPR region compliance), lastTestedAt/lastTestStatus (connection health) - BigQuerySync: lastSyncStatus typed as enum (was String?), errorRetryCount+isErrorPaused circuit breaker, partitionFilter for cost-safe full-refresh on partitioned tables - BigQuerySyncRun: rowCount BigInt (INT max ~2.1B insufficient), bytesProcessed for cost tracking Zod additions: - zGcpProjectId: GCP project ID format regex (rejects project numbers and display names) - zBqIdentifier: dataset/table name validator (no hyphens, per BQ naming rules) - zServiceAccountJson: SA JSON structure check (rejects authorized_user creds before encryption) - zBigQueryConnectionCreate: connection creation schema with name/region/SA JSON - zBigQuerySyncConfig: dataset/tableName now use zBqIdentifier, partitionFilter field added
…d dependency upgrade
- Upgrade @google-cloud/bigquery from ^7.9.1 to ^8.3.1 (latest stable)
- Add FK bigquery_sync_runs.projectId -> projects(id) (was missing, allowing orphan runs)
- Add composite FK bigquery_syncs(projectId, connectionId) -> bigquery_connections(projectId, id)
to prevent cross-tenant data: a sync can no longer reference a connection from a different project
- Add UNIQUE INDEX bigquery_connections(projectId, id) to back the composite FK
- Add CHECK(char_length(name) > 0) on bigquery_connections to enforce non-empty names at DB level
- Backfill any dev rows with empty name using concat('connection_', id)
…Zod validators Adds the foundational schema for a multi-provider Data Warehouse Connector (BigQuery first, extensible to Snowflake/Redshift/Databricks/Postgres). Three shared Prisma tables: - warehouse_connections: one row per named connection, any provider - warehouse_syncs: one sync job per connection - warehouse_sync_runs: one run record per execution Key design decisions: - configEncrypted (AES-256-GCM) + displayIdentifier/displayEmail for UI display without decryption - Composite FK warehouse_syncs(projectId,connectionId) → warehouse_connections(projectId,id) blocks cross-tenant exploit at DB level - Three sync modes: append (cursor), full (reload + stale cleanup), onetime (backfill) - cursorOverlapMinutes (default 10) rewinds append cursor to catch late-arriving rows - syncDelayMinutes (default 0) delays cron execution to allow BQ pipelines to land - Performance indexes on all FK columns (PostgreSQL does not create these automatically) - jsonProperties column mapping flattens a JSON column into event/profile properties Validation (packages/validation/src/index.ts): - zBigQueryWarehouseConfig + zWarehouseConfig discriminated union - zBigQuerySyncConfig with superRefine cross-field rules: schedule required for append/full, insertTime required for append, eventName/eventNameStatic mutually exclusive Migrations applied: 20260610115042–20260610115046 21/21 Zod validator probes pass. Zero schema drift (prisma migrate diff).
…v#391 - Rename PrismaJson type alias IWarehouseColumnMapping → IPrismaWarehouseColumnMapping to match the established IPrisma* prefix convention used by all other types in the namespace - Update schema.prisma annotation to match (/// [IPrismaWarehouseColumnMapping]) - Add @@unique([id, projectId]) on WarehouseSync to back a composite FK - Replace single-column syncId FK on WarehouseSyncRun with composite (syncId, projectId) → warehouse_syncs(id, projectId) to enforce that a run's projectId always matches its parent sync's projectId — closes the same cross-tenant gap that the existing composite FK on WarehouseSync already closes one level up
Before adding the (syncId, projectId) composite FK, normalize any runs whose projectId doesn't match their parent sync's projectId. Without this the ADD CONSTRAINT fails if mismatched rows exist. Tables are dev-only now so this is a no-op, but makes the migration safe to apply to any environment.
Validation fixes (packages/validation/src/index.ts): - Gap 1: reject whitespace-only connection names (Zod .refine + btrim DB check) - Gap 2: require createdAt cursor for profiles in append mode (superRefine) - Gap 4: block SQL injection tokens in partitionFilter (-- /* */ ;) - Gap B: export IBigQueryWarehouseConfig named type Schema + migrations: - migration 20260611000001: tighten name_nonempty_check to char_length(btrim(name)) > 0 - migration 20260611000002: add lastTestError String? to warehouse_connections so testConnection and connect can surface a human-readable failure reason (permission denied, project not found, key revoked) alongside the boolean status Regenerated Prisma client to include lastTestError field.
Adds the full connection lifecycle for the warehouse connector: connect, test, rotate credentials, disconnect. Ships a settings tab so users can manage connections from the dashboard. Key implementation details: - tRPC router (8 procedures): listConnections, connect, testConnection, updateConnection, disconnect, listDatasets, listTables, getTableSchema - warehouse.ts data service: withTimeout, buildBigQueryClient, getWarehouseClient, list* helpers — all with 10 s BQ timeouts - Credentials tested against live GCP before any DB write (fail fast) - gcpRegion preserved on key rotation (merged from existing encrypted config) - Type-change and project-ID-change blocked on updateConnection - Idempotent disconnect (catches P2002/P2025 races) - Running-sync guard in disconnect — blocks mid-run cascade delete - Rate limiting: connect/updateConnection max 5/min, testConnection max 10/min - listConnections bounded at take: 100 - serviceAccountJson added to logger redact patterns (private key never logged) - cancelled added to WarehouseSyncRunStatus enum (migration 20260615000001) - Frontend: AddConnectionDialog with zodResolver client-side validation, RotateKeyDialog, TestStatusBadge, provider type badge via PROVIDER_LABELS, error state, skeleton loading, disconnect confirmation with sync count - Vitest suites for mapBigQueryError, zWarehouseConfig, and validation schemas
📝 WalkthroughWalkthroughAdds a complete BigQuery warehouse connector to the platform. Introduces Zod validation schemas for service account credentials, warehouse configs, and sync settings; a Prisma schema and 14 incremental migrations building and hardening ChangesWarehouse Connector Feature (BigQuery Phase 1)
Sequence Diagram(s)sequenceDiagram
participant User
participant WarehouseSettings as WarehouseSettings UI
participant warehouseRouter as tRPC warehouseRouter
participant warehouse_ts as warehouse.ts (DB layer)
participant BigQuery as BigQuery API
participant Prisma as Prisma (PostgreSQL)
rect rgba(100, 150, 255, 0.5)
Note over User,Prisma: Connect new warehouse
User->>WarehouseSettings: Fill form & click "Connect & Verify"
WarehouseSettings->>warehouseRouter: warehouse.connect(name, config, projectId)
warehouseRouter->>warehouse_ts: buildBigQueryClient(config)
warehouseRouter->>warehouse_ts: withTimeout(listWarehouseDatasets, 10s)
warehouse_ts->>BigQuery: client.getDatasets()
BigQuery-->>warehouse_ts: datasets
warehouseRouter->>Prisma: create warehouse_connections (encrypted config)
Prisma-->>warehouseRouter: connection record
warehouseRouter-->>WarehouseSettings: connection record
WarehouseSettings->>WarehouseSettings: invalidate listConnections query, close dialog
end
rect rgba(100, 200, 100, 0.5)
Note over User,Prisma: Test existing connection
User->>WarehouseSettings: Click "Test" on connection card
WarehouseSettings->>warehouseRouter: warehouse.testConnection(connectionId)
warehouseRouter->>Prisma: findUnique connection (ownership check)
warehouseRouter->>warehouse_ts: listWarehouseDatasets(connectionId, projectId)
warehouse_ts->>BigQuery: client.getDatasets()
BigQuery-->>warehouse_ts: datasets or error
warehouseRouter->>Prisma: update lastTestedAt / lastTestStatus / lastTestError
warehouseRouter-->>WarehouseSettings: success or failure result
WarehouseSettings->>WarehouseSettings: invalidate listConnections query, show toast
end
rect rgba(200, 100, 100, 0.5)
Note over User,Prisma: Disconnect warehouse
User->>WarehouseSettings: Click "Disconnect" on connection card
WarehouseSettings->>WarehouseSettings: alert confirmation
User->>WarehouseSettings: Confirm disconnect
WarehouseSettings->>warehouseRouter: warehouse.disconnect(connectionId)
warehouseRouter->>Prisma: check for running warehouse_sync_runs
warehouseRouter->>Prisma: delete warehouse_connections
warehouseRouter-->>WarehouseSettings: {ok: true}
WarehouseSettings->>WarehouseSettings: invalidate listConnections query, show toast
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
packages/trpc/src/routers/warehouse.ts (1)
343-343: ⚡ Quick winUse shared
zBqIdentifierinstead of duplicating regexes for dataset/table inputs.These two route inputs currently re-implement identifier validation and can drift from the canonical warehouse validator (including length constraints). Reuse
zBqIdentifierfrom@openpanel/validationto keep router and schema contracts aligned.Proposed refactor
import { + zBqIdentifier, zWarehouseConfig, zWarehouseConnectionCreate, } from '`@openpanel/validation`'; @@ - .input(zConnectionOwnership.extend({ dataset: z.string().regex(/^[a-zA-Z0-9_]+$/, 'Invalid dataset name') })) + .input(zConnectionOwnership.extend({ dataset: zBqIdentifier })) @@ zConnectionOwnership.extend({ - dataset: z.string().regex(/^[a-zA-Z0-9_]+$/, 'Invalid dataset name'), - tableName: z.string().regex(/^[a-zA-Z0-9_]+$/, 'Invalid table name'), + dataset: zBqIdentifier, + tableName: zBqIdentifier, }),Also applies to: 367-370
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/trpc/src/routers/warehouse.ts` at line 343, Remove the inline regex validation for dataset identifiers in the warehouse router inputs. Replace the z.string().regex() pattern with the shared zBqIdentifier validator imported from `@openpanel/validation`. This same refactoring needs to be applied wherever dataset or table identifier validation is duplicated (including the additional occurrences mentioned in the comment). This ensures all identifier validation stays consistent with the canonical warehouse validator including length constraints, and eliminates schema drift between the router definitions and the shared validation contract.packages/db/src/warehouse.test.ts (1)
127-139: 💤 Low valueStatic analysis false positive: test fixtures contain intentionally fake credentials.
The Betterleaks tool flagged the
private_keyvalues as potential secrets, but these are clearly test fixtures with placeholder values like"NEWKEY"and truncated headers. The keys are intentionally fake and pose no security risk.If desired, you could make the test nature even more explicit:
🧹 Optional: Make test credentials more obviously fake
const NEW_SA = JSON.stringify({ type: 'service_account', project_id: 'my-project', - private_key: '-----BEGIN RSA PRIVATE KEY-----\nNEWKEY', + private_key: '-----BEGIN RSA PRIVATE KEY-----\nFAKE_TEST_KEY_DO_NOT_USE', client_email: 'sa-new@my-project.iam.gserviceaccount.com', });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/db/src/warehouse.test.ts` around lines 127 - 139, The static analysis tool is flagging the private_key values in the VALID_SA and NEW_SA test fixture constants as potential secrets, even though these are clearly fake test credentials. To prevent false positives, make the test credentials more obviously fake by replacing the placeholder key values with even more explicit fake values that clearly indicate they are test fixtures (for example, using values like "FAKE_TEST_KEY" or "TEST_PRIVATE_KEY_NOT_REAL" instead of truncated RSA headers), ensuring the static analysis tools recognize these as intentional test data rather than actual credentials.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@packages/db/prisma/migrations/20260608091000_bigquery_multi_connection/migration.sql`:
- Around line 5-9: The migration adds a name column with an empty string
default, but the follow-up unique constraint (referenced on line 12) will fail
if multiple connections already exist for the same project since they would all
have the same empty name value. Add clear comments in the migration file
explaining that the follow-up migration
(20260608140000_bigquery_referential_integrity) must be applied immediately
after to backfill unique names before the constraint is enforced, and document
in the PR description whether this rollout is intended for dev-only or
single-connection deployments, or if multi-connection projects exist in any
environment, confirm the immediate application requirement to avoid constraint
violation errors.
In `@packages/db/prisma/migrations/20260608120000_bigquery_harden/migration.sql`:
- Line 9: The ALTER TABLE statement for `lastSyncStatus` attempts a direct cast
to the `BigQuerySyncRunStatus` enum type, but this will fail if there are
existing text values in the database that don't match valid enum values. Replace
the simple USING clause that casts to the enum with a CASE expression that
evaluates the current `lastSyncStatus` value and either maps invalid values to a
valid default enum value or handles them appropriately. This ensures legacy or
unexpected text values are normalized before the cast occurs, allowing the
migration to complete successfully.
In `@packages/validation/src/index.ts`:
- Around line 733-741: The code casts the parsed JSON to Record<string, unknown>
without first checking if the value is actually an object, causing property
access on null or non-object values (like sa.type) to throw a TypeError. Add a
type guard after the cast to the variable sa to verify it is a non-null object
(using typeof sa === 'object' && sa !== null) before accessing any properties.
If the guard fails, add a validation issue via ctx.addIssue with an appropriate
error message, ensuring validation errors are reported instead of uncaught
TypeErrors.
---
Nitpick comments:
In `@packages/db/src/warehouse.test.ts`:
- Around line 127-139: The static analysis tool is flagging the private_key
values in the VALID_SA and NEW_SA test fixture constants as potential secrets,
even though these are clearly fake test credentials. To prevent false positives,
make the test credentials more obviously fake by replacing the placeholder key
values with even more explicit fake values that clearly indicate they are test
fixtures (for example, using values like "FAKE_TEST_KEY" or
"TEST_PRIVATE_KEY_NOT_REAL" instead of truncated RSA headers), ensuring the
static analysis tools recognize these as intentional test data rather than
actual credentials.
In `@packages/trpc/src/routers/warehouse.ts`:
- Line 343: Remove the inline regex validation for dataset identifiers in the
warehouse router inputs. Replace the z.string().regex() pattern with the shared
zBqIdentifier validator imported from `@openpanel/validation`. This same
refactoring needs to be applied wherever dataset or table identifier validation
is duplicated (including the additional occurrences mentioned in the comment).
This ensures all identifier validation stays consistent with the canonical
warehouse validator including length constraints, and eliminates schema drift
between the router definitions and the shared validation contract.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a5616f5b-645f-473e-92c8-264641e3fee6
⛔ Files ignored due to path filters (2)
packages/db/src/generated/emptyis excluded by!**/generated/**pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (27)
apps/start/src/routes/_app.$organizationId.$projectId.settings._tabs.tsxapps/start/src/routes/_app.$organizationId.$projectId.settings._tabs.warehouse.tsxpackages/db/index.tspackages/db/package.jsonpackages/db/prisma/migrations/20260607120000_add_event_meta_description/migration.sqlpackages/db/prisma/migrations/20260608090217_add_bigquery_connector/migration.sqlpackages/db/prisma/migrations/20260608091000_bigquery_multi_connection/migration.sqlpackages/db/prisma/migrations/20260608120000_bigquery_harden/migration.sqlpackages/db/prisma/migrations/20260608140000_bigquery_referential_integrity/migration.sqlpackages/db/prisma/migrations/20260610115042_warehouse_restructure/migration.sqlpackages/db/prisma/migrations/20260610115043_warehouse_security_fks/migration.sqlpackages/db/prisma/migrations/20260610115044_warehouse_phase1_finalize/migration.sqlpackages/db/prisma/migrations/20260610115045_warehouse_onetime_mode/migration.sqlpackages/db/prisma/migrations/20260610115046_warehouse_sync_overlap_delay/migration.sqlpackages/db/prisma/migrations/20260611000000_warehouse_run_composite_fk/migration.sqlpackages/db/prisma/migrations/20260611000001_warehouse_name_btrim_check/migration.sqlpackages/db/prisma/migrations/20260611000002_warehouse_connection_last_test_error/migration.sqlpackages/db/prisma/migrations/20260615000001_warehouse_cancelled_status/migration.sqlpackages/db/prisma/schema.prismapackages/db/src/types.tspackages/db/src/warehouse.test.tspackages/db/src/warehouse.tspackages/logger/index.tspackages/trpc/src/root.tspackages/trpc/src/routers/warehouse.tspackages/validation/src/index.tspackages/validation/src/warehouse.test.ts
| -- Add name column (default '' for any pre-existing rows during dev) | ||
| ALTER TABLE "public"."bigquery_connections" ADD COLUMN "name" TEXT NOT NULL DEFAULT ''; | ||
|
|
||
| -- Remove the default now that column exists | ||
| ALTER TABLE "public"."bigquery_connections" ALTER COLUMN "name" DROP DEFAULT; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for migration rollout documentation or pre-migration data checks
rg -n -C3 -i 'rollout|backfill|existing.*connection|pre.*migration.*check' \
--type=md --type=sqlRepository: Openpanel-dev/openpanel
Length of output: 3441
🏁 Script executed:
cat -n packages/db/prisma/migrations/20260608091000_bigquery_multi_connection/migration.sqlRepository: Openpanel-dev/openpanel
Length of output: 813
Constraint creation would fail with multiple pre-existing connections per project.
The CREATE UNIQUE INDEX on line 12 would violate uniqueness if multiple connections already exist for a single project, since they would all have name=''. The follow-up migration (20260608140000_bigquery_referential_integrity) handles this by backfilling empty names with unique values before the constraint is enforced database-wide.
Ensure the rollout strategy is documented: if this is intended for dev-only or single-connection deployments, clarify it in PR description or migration notes. If multi-connection projects exist in any environment where this runs, confirm the follow-up migration is applied immediately after to avoid constraint violation errors.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/db/prisma/migrations/20260608091000_bigquery_multi_connection/migration.sql`
around lines 5 - 9, The migration adds a name column with an empty string
default, but the follow-up unique constraint (referenced on line 12) will fail
if multiple connections already exist for the same project since they would all
have the same empty name value. Add clear comments in the migration file
explaining that the follow-up migration
(20260608140000_bigquery_referential_integrity) must be applied immediately
after to backfill unique names before the constraint is enforced, and document
in the PR description whether this rollout is intended for dev-only or
single-connection deployments, or if multi-connection projects exist in any
environment, confirm the immediate application requirement to avoid constraint
violation errors.
| ALTER TABLE "public"."bigquery_connections" ADD COLUMN "lastTestStatus" BOOLEAN; | ||
|
|
||
| -- BigQuerySync: typed status enum, circuit-breaker fields, partition filter | ||
| ALTER TABLE "public"."bigquery_syncs" ALTER COLUMN "lastSyncStatus" TYPE "public"."BigQuerySyncRunStatus" USING "lastSyncStatus"::"public"."BigQuerySyncRunStatus"; |
There was a problem hiding this comment.
Guard the enum cast against legacy text values.
lastSyncStatus was previously unconstrained TEXT, so any non-null value outside the enum aborts this migration. Normalize invalid values before the cast or use a guarded USING CASE.
Suggested migration hardening
+UPDATE "public"."bigquery_syncs"
+SET "lastSyncStatus" = NULL
+WHERE "lastSyncStatus" IS NOT NULL
+ AND "lastSyncStatus" NOT IN ('pending', 'running', 'completed', 'failed');
+
ALTER TABLE "public"."bigquery_syncs" ALTER COLUMN "lastSyncStatus" TYPE "public"."BigQuerySyncRunStatus" USING "lastSyncStatus"::"public"."BigQuerySyncRunStatus";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ALTER TABLE "public"."bigquery_syncs" ALTER COLUMN "lastSyncStatus" TYPE "public"."BigQuerySyncRunStatus" USING "lastSyncStatus"::"public"."BigQuerySyncRunStatus"; | |
| UPDATE "public"."bigquery_syncs" | |
| SET "lastSyncStatus" = NULL | |
| WHERE "lastSyncStatus" IS NOT NULL | |
| AND "lastSyncStatus" NOT IN ('pending', 'running', 'completed', 'failed'); | |
| ALTER TABLE "public"."bigquery_syncs" ALTER COLUMN "lastSyncStatus" TYPE "public"."BigQuerySyncRunStatus" USING "lastSyncStatus"::"public"."BigQuerySyncRunStatus"; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/db/prisma/migrations/20260608120000_bigquery_harden/migration.sql`
at line 9, The ALTER TABLE statement for `lastSyncStatus` attempts a direct cast
to the `BigQuerySyncRunStatus` enum type, but this will fail if there are
existing text values in the database that don't match valid enum values. Replace
the simple USING clause that casts to the enum with a CASE expression that
evaluates the current `lastSyncStatus` value and either maps invalid values to a
valid default enum value or handles them appropriately. This ensures legacy or
unexpected text values are normalized before the cast occurs, allowing the
migration to complete successfully.
| const sa = parsed as Record<string, unknown>; | ||
| if (sa.type !== 'service_account') { | ||
| ctx.addIssue({ | ||
| code: z.ZodIssueCode.custom, | ||
| message: | ||
| 'Must be a service account key (type: "service_account"). Application Default Credentials are not supported.', | ||
| }); | ||
| } | ||
| if (!sa.private_key || typeof sa.private_key !== 'string') { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd packages/validation/src && wc -l index.tsRepository: Openpanel-dev/openpanel
Length of output: 80
🏁 Script executed:
sed -n '690,750p' packages/validation/src/index.tsRepository: Openpanel-dev/openpanel
Length of output: 2150
🏁 Script executed:
sed -n '1,750p' packages/validation/src/index.ts | grep -n "zServiceAccountJson" | head -20Repository: Openpanel-dev/openpanel
Length of output: 108
🏁 Script executed:
sed -n '718,780p' packages/validation/src/index.tsRepository: Openpanel-dev/openpanel
Length of output: 2070
Guard against null/non-object values before accessing properties in zServiceAccountJson.
After JSON.parse, the code casts directly to Record<string, unknown> without checking if the parsed value is actually an object. When the JSON input is the string "null", JSON.parse returns the literal null value, causing property access like sa.type to throw a TypeError instead of reporting a validation error.
Add a type guard before accessing object properties:
Proposed fix
const sa = parsed as Record<string, unknown>;
+ if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
+ ctx.addIssue({
+ code: z.ZodIssueCode.custom,
+ message: 'Service account JSON must be an object',
+ });
+ return;
+ }
+ const sa = parsed as Record<string, unknown>;
if (sa.type !== 'service_account') {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const sa = parsed as Record<string, unknown>; | |
| if (sa.type !== 'service_account') { | |
| ctx.addIssue({ | |
| code: z.ZodIssueCode.custom, | |
| message: | |
| 'Must be a service account key (type: "service_account"). Application Default Credentials are not supported.', | |
| }); | |
| } | |
| if (!sa.private_key || typeof sa.private_key !== 'string') { | |
| if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) { | |
| ctx.addIssue({ | |
| code: z.ZodIssueCode.custom, | |
| message: 'Service account JSON must be an object', | |
| }); | |
| return; | |
| } | |
| const sa = parsed as Record<string, unknown>; | |
| if (sa.type !== 'service_account') { | |
| ctx.addIssue({ | |
| code: z.ZodIssueCode.custom, | |
| message: | |
| 'Must be a service account key (type: "service_account"). Application Default Credentials are not supported.', | |
| }); | |
| } | |
| if (!sa.private_key || typeof sa.private_key !== 'string') { |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/validation/src/index.ts` around lines 733 - 741, The code casts the
parsed JSON to Record<string, unknown> without first checking if the value is
actually an object, causing property access on null or non-object values (like
sa.type) to throw a TypeError. Add a type guard after the cast to the variable
sa to verify it is a non-null object (using typeof sa === 'object' && sa !==
null) before accessing any properties. If the guard fails, add a validation
issue via ctx.addIssue with an appropriate error message, ensuring validation
errors are reported instead of uncaught TypeErrors.
Replace duplicate inline regexes in listTables/getTableSchema with the shared zBqIdentifier validator (adds min/max length bounds the regex lacked). Replace real-looking private_key fragments in test fixtures with an obviously fake string to silence static secret scanners.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/db/src/warehouse.test.ts (1)
130-137: ⚡ Quick winUse non-PEM fixture values for
private_keyto avoid secret-scanner noise.Line 130 and Line 137 currently use PEM-shaped values, which are flagged as potential leaked keys by scanners even in tests. Replace them with clearly synthetic non-PEM placeholders so CI/security tooling stays signal-rich.
Suggested diff
- private_key: '-----BEGIN RSA PRIVATE KEY-----\nFAKE_TEST_KEY_DO_NOT_USE', + private_key: 'fake_test_private_key_not_a_real_pem', ... - private_key: '-----BEGIN RSA PRIVATE KEY-----\nFAKE_TEST_KEY_DO_NOT_USE', + private_key: 'fake_test_private_key_not_a_real_pem',🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/db/src/warehouse.test.ts` around lines 130 - 137, Replace the PEM-formatted private_key values in both test fixtures to use non-PEM placeholder strings instead. In the first fixture definition (around line 130) and in the NEW_SA constant definition (around line 137), change the private_key from '-----BEGIN RSA PRIVATE KEY-----\nFAKE_TEST_KEY_DO_NOT_USE' to a clearly synthetic non-PEM value like 'fake-test-key-do-not-use' or similar placeholder that won't match PEM patterns and trigger secret-scanner alerts during CI.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/db/src/warehouse.test.ts`:
- Around line 130-137: Replace the PEM-formatted private_key values in both test
fixtures to use non-PEM placeholder strings instead. In the first fixture
definition (around line 130) and in the NEW_SA constant definition (around line
137), change the private_key from '-----BEGIN RSA PRIVATE
KEY-----\nFAKE_TEST_KEY_DO_NOT_USE' to a clearly synthetic non-PEM value like
'fake-test-key-do-not-use' or similar placeholder that won't match PEM patterns
and trigger secret-scanner alerts during CI.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c2f0564-d734-46ef-a434-69ad62993a1e
📒 Files selected for processing (2)
packages/db/src/warehouse.test.tspackages/trpc/src/routers/warehouse.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/trpc/src/routers/warehouse.ts
Summary
Delivers the full connection lifecycle for the BigQuery warehouse connector. Users can connect a GCP Service Account, test connectivity, rotate credentials, and disconnect — all from a new Warehouse Sources settings tab.
listConnections,connect,testConnection,updateConnection,disconnect,listDatasets,listTables,getTableSchemapackages/db/src/warehouse.ts:withTimeout,buildBigQueryClient,getWarehouseClient, and dataset/table/schema helpers (all with 10 s GCP timeouts, timer-leak-free)AddConnectionDialogwithzodResolverclient-side validation,RotateKeyDialog,TestStatusBadge, provider type badge viaPROVIDER_LABELS, skeleton loading, error state, disconnect confirmation with live sync countserviceAccountJsonadded to logger redact patterns (private key never written to logs); rate limits onconnect/updateConnection(5/min) andtestConnection(10/min);listConnectionsbounded attake: 100; running-sync guard indisconnectblocks mid-run cascade deletescancelledadded toWarehouseSyncRunStatusenum (migration20260615000001, own transaction)mapBigQueryErrorand all Zod warehouse validatorsDesign rules enforced
getProjectAccess→findUnique→projectIdmatch (3-step ownership)configEncryptednever returned in any tRPC response — only safe display fields selectedgcpRegionpreserved on key rotation (merged from existing encrypted config if not supplied)updateConnectionP2025race)Test plan
displayEmailshown on cardtypefield → Zod error before API callmapBigQueryErrormessage shown, no DB rowtestConnectionon revoked SA → card shows red ✗ + error messagedisplayEmailshownconnect6× in 60 s → 6th returns 429 rate limit errordisconnectwhile sync running → blocked with "A sync is currently running" messageSummary by CodeRabbit