Skip to content

Replace GORM + AutoMigrate with golang-migrate + sqlc#1561

Merged
EItanya merged 31 commits intomainfrom
iplay88keys/replace-gorm-with-sqlc-migrations
Apr 8, 2026
Merged

Replace GORM + AutoMigrate with golang-migrate + sqlc#1561
EItanya merged 31 commits intomainfrom
iplay88keys/replace-gorm-with-sqlc-migrations

Conversation

@iplay88keys
Copy link
Copy Markdown
Contributor

@iplay88keys iplay88keys commented Mar 27, 2026

Description

Warning

Upgrading from a version earlier than v0.8.0 is not supported. You must upgrade to v0.8.0+ before upgrading to this version.

Important

Recommended upgrade path: Take a snapshot/backup of your database before upgrading. A fresh install is the cleanest upgrade path — back up your data, install this version on a fresh database, and restore if needed. In-place upgrades from GORM are supported and tested, but a backup ensures you can recover from any unexpected issues.

GORM's AutoMigrate is additive-only, leaves no migration history, and has no rollback path. This PR replaces it with versioned SQL migrations (golang-migrate) and compile-time type-safe query generation (sqlc).

Two new CI checks are added:

  • The first fails if any existing migration file is modified. Migrations are immutable once merged.
  • The second fails if sqlc generate returns changed files to make sure everything is up-to-date on the branch.

What this changes for operators

  • Golang-migrate takes over existing tables: existing tables use CREATE TABLE IF NOT EXISTS, so all data is preserved
  • Rollback on failure: if a migration fails mid-run, changes are rolled back automatically before the controller exits non-zero. On the initial migration run (upgrading from GORM), rollback to version 0 is skipped to protect pre-existing data.
  • pgvector pre-check: when --database-vector-enabled=true, the runner verifies that the pgvector extension is available before running any migrations. This prevents a failed vector migration from triggering a rollback that could affect core tables.
  • Migration history: every schema change is now a numbered, named SQL file checked into the repo

Architecture

Migration SQL files live in go/core/pkg/migrations/ under core/ and vector/ subdirectories. These SQL files are the stable contract for downstream consumers.

app.Start accepts an optional MigrationRunner callback. Passing nil uses the default (migrations.RunUp with the embedded migrations.FS). Downstream consumers can pass a custom runner to take full ownership of the migration process:

type MigrationRunner func(ctx context.Context, url string, vectorEnabled bool) error

Multi-instance safety

golang-migrate uses a session-level PostgreSQL advisory lock — only one instance runs migrations at a time. Others block, then find ErrNoChange. The lock releases automatically if the process crashes. If a crash leaves a dirty migration state, the next startup detects it and rolls back before retrying.

Static analysis enforcement

CI tests in cross_track_test.go enforce migration safety policies without needing a database:

Test What it enforces
TestNoCrossTrackDDL No track may ALTER TABLE or CREATE INDEX ON a table owned by another track
TestMigrationGuards Up migrations must use IF NOT EXISTS; down migrations must use IF EXISTS

Testing

E2E tests

E2E tests were run against a Kind cluster in three scenarios — all passed:

  1. Fresh install from main — baseline GORM schema, e2e tests pass
  2. Upgrade from main to this branch — migrations 000001 + 000002 applied, e2e tests pass
  3. Fresh install from this branch — both migrations applied from scratch, e2e tests pass

Additionally, main's e2e tests were run against the branch's schema (after migration 000002 applied SET NOT NULL on feedback.is_positive and lg_checkpoint.version) to confirm backward compatibility. All 15 tests passed.

Schema validation

pg_dump --schema-only was captured after each scenario and compared:

Fresh main vs upgrade (expected differences only):

Difference Source
feedback.is_positiveNOT NULL Migration 000002
lg_checkpoint.versionNOT NULL Migration 000002
memory.idDEFAULT gen_random_uuid() Vector migration 000003
schema_migrations table added golang-migrate tracking
vector_schema_migrations table added golang-migrate tracking

Upgrade vs fresh branch: Functionally identical. The only differences are column ordering within tables (GORM's struct-field ordering vs the migration file ordering) which has no runtime impact.

Data safety testing

Tested upgrade failure scenarios against an external PostgreSQL (non-pgvector) database to verify data protection:

  1. vectorEnabled=true on non-pgvector db: pgvector pre-check fails before any migrations run. No tables created, no data modified. Controller crash-loops with a clear error message.
  2. Migration failure on upgrade from GORM: when a migration fails during the initial run (prevVersion == 0), rollback to version 0 is skipped. Pre-existing GORM tables and data are preserved. On restart, the controller clears dirty state and retries without data loss.

Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Comment thread go/core/internal/database/upgrade_test.go Outdated
…ranch

Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
@chromatic-com
Copy link
Copy Markdown

chromatic-com bot commented Apr 1, 2026

Important

UI Tests need review – Review now

🟡 UI Tests: 5 visual and accessibility changes must be accepted as baselines
🟢 UI Review: Approved by Jeremy Alvis
Storybook icon Storybook Publish: 108 stories published

Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
… install

Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
@iplay88keys iplay88keys marked this pull request as ready for review April 3, 2026 21:49
Copilot AI review requested due to automatic review settings April 3, 2026 21:49
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces GORM's AutoMigrate with golang-migrate (versioned SQL migrations) and sqlc (compile-time type-safe query generation). The migration system manages two independent tracks (core and vector) with atomic cross-track rollback, multi-instance safety via PostgreSQL advisory locks, and automatic dirty-state recovery. Two new CI checks enforce migration immutability and sqlc generation freshness. The changes include 12 database tables migrated to versioned SQL schema, new migration runner with advisory lock support, comprehensive test coverage for cross-track safety and rollback scenarios, and type updates (int→int64, gorm.DeletedAt→*time.Time) throughout the models and handlers.

Changes:

  • Migration system: Added golang-migrate runner with advisory locks, dirty-state recovery, and cross-track rollback logic
  • Database layer: Replaced GORM with sqlc for type-safe queries; added pgx connection pooling with pgvector type registration
  • Schema migrations: Created versioned SQL files for both core (2 migrations) and vector (3 migrations) tracks with proper idempotency guards
  • Type updates: Changed model fields from int to int64 and gorm.DeletedAt to *time.Time to match sqlc generation patterns
  • CI enforcement: Added workflows to prevent migration file modification and ensure sqlc-generated code stays in sync with source queries

Reviewed changes

Copilot reviewed 69 out of 70 changed files in this pull request and generated no comments.

Show a summary per file
File Description
go/core/pkg/migrations/runner.go Migration execution logic with advisory locks, rollback, and dirty-state recovery
go/core/pkg/migrations/cross_track_test.go Static analysis tests enforcing migration guard patterns and cross-track table ownership
go/core/pkg/migrations/core/ & vector/ Versioned SQL migration files with IF NOT EXISTS/IF EXISTS guards
go/core/internal/database/connect.go Database connection with pgvector type registration and retry logic
go/core/internal/database/client_postgres.go New sqlc-based client replacing GORM-based manager
go/api/database/models.go Clean model types without GORM tags; updated field types for sqlc compatibility
go/core/pkg/app/app.go Integration point calling migration runner before pool connection
.github/workflows/ CI checks for migration immutability and sqlc generation sync

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@iplay88keys iplay88keys marked this pull request as draft April 4, 2026 13:28
…grade failure

Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
@iplay88keys iplay88keys marked this pull request as ready for review April 6, 2026 18:03
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Copy link
Copy Markdown
Contributor

@EItanya EItanya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment left by Claude on behalf of @EItanya

A few questions/observations below — overall this is a really well-done PR.

Comment thread go/core/internal/database/connect.go Outdated
Comment thread go/core/internal/database/queries/memory.sql
Comment thread go/api/database/models.go
result = append(result, &parsed)
}
return result, nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback.MessageID changed from uint to *int64 and Feedback.ID changed from gorm.Model (embedded uint ID) to explicit int64. These are breaking type changes for any code referencing these fields. Have all callers been updated? (The generated sqlc code and client_postgres.go look correct, just want to confirm nothing outside this PR relies on the old types.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these changes were done to align the types with the database. GORM handled the mismatch in the past, but the new types are generated from sqlc, so I updated the API model to match.

All of the internal callers are up-to-date and I have not found evidence that anything outside of this PR relies on those old types.

@EItanya
Copy link
Copy Markdown
Contributor

EItanya commented Apr 6, 2026

Comment left by Claude on behalf of @EItanya

Re: Makefile build-migrate / MIGRATE_IMG

The Makefile adds build-migrate and MIGRATE_IMG but the PR doesn't include a Dockerfile for the migrate binary or helm chart changes to deploy it (e.g. as an init container). The PR description says migrations run "in-app at startup" via app.Start, so the standalone migrate command seems to be for manual/ad-hoc operations only.

Is the migrate image actually used anywhere in deployment, or is it purely an operator tool for manual down/version commands? If the latter, might be worth a brief comment or doc note so people don't wonder why the image exists but isn't referenced in the helm chart.

@iplay88keys
Copy link
Copy Markdown
Contributor Author

Re: Makefile build-migrate / MIGRATE_IMG

The Makefile adds build-migrate and MIGRATE_IMG but the PR doesn't include a Dockerfile for the migrate binary or helm chart changes to deploy it (e.g. as an init container). The PR description says migrations run "in-app at startup" via app.Start, so the standalone migrate command seems to be for manual/ad-hoc operations only.

Is the migrate image actually used anywhere in deployment, or is it purely an operator tool for manual down/version commands? If the latter, might be worth a brief comment or doc note so people don't wonder why the image exists but isn't referenced in the helm chart.

All of that was removed when I changed to in-app migrations in 7034b1c.

I believe there could be benefit to having a standalone migrator binary/image, but we can re-evaluate that when the need arises.

iplay88keys and others added 3 commits April 7, 2026 07:30
@EItanya EItanya merged commit 256d11a into main Apr 8, 2026
26 of 27 checks passed
@EItanya EItanya deleted the iplay88keys/replace-gorm-with-sqlc-migrations branch April 8, 2026 21:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants