Commit 697bcaf

Merge table, update caveats and notes
1 parent 7887f46 commit 697bcaf

2 files changed

Lines changed: 50 additions & 74 deletions

File tree

guides/safe_migrations.md

Lines changed: 48 additions & 72 deletions
Original file line numberDiff line numberDiff line change
@@ -6,18 +6,20 @@ A guide on common migration recipes and how to avoid trouble.
66

77
| Operation | Risk | Safe Approach |
88
|-----------|------|---------------|
9-
| Add index | Blocks writes | Use `concurrently: true` and disable transactions |
10-
| Drop index | Blocks writes | Use `concurrently: true` and disable transactions |
11-
| Add foreign key | Blocks writes on both tables | Use `validate: false`, then validate separately |
12-
| Add column with default | Table rewrite (volatile defaults) | Add column first, then set default |
13-
| Add NOT NULL | Full table scan | Use check constraint, validate, then add NOT NULL |
14-
| Add check constraint | Full table scan | Create with `validate: false`, then validate separately |
15-
| Change column type | Table rewrite | Create new column, migrate data, swap reads, drop old column |
16-
| Remove column | Query failures | Remove from schema first, then drop column |
17-
| Rename column | Query failures | Use `source:` option in schema instead |
18-
| Rename table | Query failures | Rename schema module instead |
19-
| Add enum value | Transaction error | disable transactions |
20-
| Add extension | Transaction error | disable transactions |
9+
| [Adding an index](#adding-an-index) | Blocks writes | Use `concurrently: true` and disable transactions |
10+
| [Dropping an index](#dropping-an-index) | Postgres blocks reads and writes | Use `concurrently: true` in Postgres; recent MySQL keeps the table available |
11+
| [Adding a reference or foreign key](#adding-a-reference-or-foreign-key) | Blocks writes on both tables | Use `validate: false`, then validate separately |
12+
| [Adding a column with a default value](#adding-a-column-with-a-default-value) | Volatile or expression defaults may rewrite the table | Constant defaults are fast on recent Postgres/MySQL; otherwise add the column first, then set the default |
13+
| [Changing a column's default value](#changing-a-columns-default-value) | Using `modify/3` can force an unnecessary type change | Use raw SQL to change only the default |
14+
| [Changing the type of a column](#changing-the-type-of-a-column) | Table rewrite | Create a new column, migrate data, swap reads, drop the old column |
15+
| [Removing a column](#removing-a-column) | Query failures | Remove it from the schema first, then drop it |
16+
| [Renaming a column](#renaming-a-column) | Query failures | Prefer renaming the schema field and using `source:` |
17+
| [Renaming a table](#renaming-a-table) | Query failures | Prefer renaming the schema module instead |
18+
| [Adding a check constraint](#adding-a-check-constraint) | Full table scan | Create with `validate: false`, then validate separately |
19+
| [Setting NOT NULL on an existing column](#setting-not-null-on-an-existing-column) | Postgres requires a full table scan | Use a check constraint, validate it, then add `NOT NULL` |
20+
| [Adding a JSON column](#adding-a-json-column) | `SELECT DISTINCT` errors in Postgres | Use `:jsonb` instead of `:json` |
21+
| [Removing or replacing a PostgreSQL enum value](#removing-or-replacing-a-postgresql-enum-value) | Removing a value requires replacing the type | Rename directly with `RENAME VALUE` when renaming; otherwise phase app changes, backfill, then replace the type |
22+
| [Adding a PostgreSQL extension](#adding-a-postgresql-extension) | Privilege or extension-specific install requirements | Use `IF NOT EXISTS`; disable transactions only if the extension requires it |
2123

2224
## All Scenarios
2325

@@ -29,30 +31,11 @@ felt and cause timeouts. Therefore, err on the side of safety, but
2931
**always benchmark for your own database**. Also consider the hardware the
3032
database is running on: for example, a Raspberry Pi 2B on a microSD card will run much slower.
3133

32-
## Table of Contents
33-
34-
- [Adding an index](#adding-an-index)
35-
- [Dropping an index](#dropping-an-index)
36-
- [Adding a reference or foreign key](#adding-a-reference-or-foreign-key)
37-
- [Adding a column with a default value](#adding-a-column-with-a-default-value)
38-
- [Changing a column's default value](#changing-a-columns-default-value)
39-
- [Changing the type of a column](#changing-the-type-of-a-column)
40-
- [Removing a column](#removing-a-column)
41-
- [Renaming a column](#renaming-a-column)
42-
- [Renaming a table](#renaming-a-table)
43-
- [Adding a check constraint](#adding-a-check-constraint)
44-
- [Setting NOT NULL on an existing column](#setting-not-null-on-an-existing-column)
45-
- [Adding a JSON column](#adding-a-json-column)
46-
- [Adding a value to a PostgreSQL enum](#adding-a-value-to-a-postgresql-enum)
47-
- [Removing or replacing a PostgreSQL enum value](#removing-or-replacing-a-postgresql-enum-value)
48-
- [Adding a PostgreSQL extension](#adding-a-postgresql-extension)
49-
- [Squashing migrations](#squashing-migrations)
50-
5134
## Adding an index
5235

53-
Creating an index will [block writes](https://www.postgresql.org/docs/8.2/sql-createindex.html) to the table in Postgres.
36+
Creating an index will [block writes](https://www.postgresql.org/docs/current/sql-createindex.html) to the table in Postgres unless you use `CONCURRENTLY`.
5437

55-
MySQL is concurrent by default since [5.6](https://downloads.mysql.com/docs/mysql-5.6-relnotes-en.pdf) unless using `SPATIAL` or `FULLTEXT` indexes, which then it [blocks reads and writes](https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html#online-ddl-index-syntax-notes).
38+
In recent MySQL/InnoDB versions, adding a secondary index is an [online DDL operation](https://dev.mysql.com/doc/refman/8.4/en/innodb-online-ddl-operations.html) that permits concurrent DML. Adding a `FULLTEXT` or `SPATIAL` index does not permit concurrent DML, so writes to the table are blocked while the index is built.
5639

5740
### Bad
5841

@@ -112,7 +95,9 @@ For either option chosen, the migration may still take a while to run, but reads
11295

11396
## Dropping an index
11497

115-
Dropping an index blocks reads and writes while acquiring an `ACCESS EXCLUSIVE` lock.
98+
In Postgres, dropping an index blocks reads and writes while acquiring an `ACCESS EXCLUSIVE` lock.
99+
100+
In recent MySQL/InnoDB versions, dropping a secondary index is an online operation that keeps the table available for reads and writes.
116101

117102
### Bad
118103

@@ -211,24 +196,24 @@ of safety and separate constraint validation from referenced column creation whe
211196

212197
## Adding a column with a default value
213198

214-
Adding a column with a default value to an existing table may cause the table to be rewritten. During this time, reads and writes are blocked in Postgres, and writes are blocked in MySQL and MariaDB. If the default column is an expression (volatile value) it will remain unsafe.
199+
On PostgreSQL 11+ and recent MySQL/InnoDB versions, adding a column with a constant or literal default is usually a fast metadata change. The main remaining hazards are volatile or expression defaults in Postgres, and MySQL cases where the table cannot use its online DDL fast path.
215200

216-
### Bad
201+
### Caveats
217202

218-
Note: This becomes safe for non-volatile (static) defaults in:
203+
Note: A constant default is generally safe in:
219204

220205
- [Postgres 11+](https://www.postgresql.org/docs/release/11.0/). Default applies to INSERT since 7.x, and UPDATE since 9.3.
221206
- MySQL 8.0.12+
222207
- MariaDB 10.3.2+
223208

209+
The volatile-expression example below remains unsafe in Postgres.
210+
224211
```elixir
225212
def change do
226213
alter table("comments") do
227214
add :approved, :boolean, default: false
228-
# This took 10 minutes for 100 million rows with no fkeys,
229-
230-
# Obtained an AccessExclusiveLock on the table, which blocks reads and
231-
# writes.
215+
# Safe on recent PostgreSQL/MySQL when the database can use its fast path,
216+
# but older PostgreSQL versions and some table layouts may still rewrite.
232217
end
233218
end
234219
```
@@ -244,7 +229,7 @@ end
244229

245230
### Good
246231

247-
Add the column first, then alter it to include the default.
232+
If you need a conservative approach that also works for older PostgreSQL versions, or you are using a volatile default, add the column first and then alter it to include the default.
248233

249234
First migration:
250235

@@ -267,7 +252,7 @@ def change do
267252
end
268253
```
269254

270-
Note: we cannot use `Ecto.Migration.modify/3` as it will include updating the column type as
255+
Note: do not use `Ecto.Migration.modify/3` here, because it also restates the column type,
271256
which can cause Postgres to rewrite the table unnecessarily.
272257

273258
Schema change to read the new column:
@@ -280,11 +265,11 @@ end
280265

281266
> #### Note {: .info}
282267
>
283-
> The safe method will not materialize the default value on the column for existing rows because the default was not set when adding the column (avoiding a potential table lock so it can re-write it to _write_ the default). This may affect your queries where you'd expect the value to now be set to your default but is actually `null`. However, the next `UPDATE` operation on the row will materialize the default, additionally Ecto will apply the default on the application side when reading the record. If you want to materialize the value, then you will need to consider [backfilling](backfilling_data.html).
268+
> The safe method will not materialize the default value for existing rows, because the default was not set when the column was added (this avoids a potential table lock while the table is rewritten to _write_ the default). Queries that expect existing rows to hold the default may therefore see `null`. However, the next `UPDATE` operation on the row will materialize the default. Additionally, Ecto will apply the default on the application side when reading the record. If you want to materialize the value, consider [backfilling](backfilling_data.md).
284269
285270
## Changing a column's default value
286271

287-
Changing an existing column's default may risk rewriting the table.
272+
Changing only a column's default is typically a metadata change in PostgreSQL and MySQL. The real risk in Ecto is using `Ecto.Migration.modify/3`, which also restates the type.
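
As a minimal sketch of changing only the default with raw SQL, the migration below reuses the `comments.approved` column from this section's example. The module name `MyApp.Repo.Migrations.ChangeCommentsApprovedDefault` is hypothetical, and the previous default of `true` is taken from the comment in the "Bad" example.

```elixir
defmodule MyApp.Repo.Migrations.ChangeCommentsApprovedDefault do
  use Ecto.Migration

  # `execute/2` takes the "up" statement first and the "down" statement second,
  # so the migration stays reversible without touching the column type.
  def change do
    execute(
      "ALTER TABLE comments ALTER COLUMN approved SET DEFAULT false",
      "ALTER TABLE comments ALTER COLUMN approved SET DEFAULT true"
    )
  end
end
```

Because the statement names only the default, the database has no reason to re-check or rewrite the column's type.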
288273

289274
### Bad
290275

@@ -293,10 +278,7 @@ def change do
293278
alter table("comments") do
294279
# Previously, the default was `true`
295280
modify :approved, :boolean, default: false
296-
# This took 10 minutes for 100 million rows with no fkeys,
297-
298-
# Obtained an AccessExclusiveLock on the table, which blocks reads and
299-
# writes.
281+
# This also restates the type, which can trigger unnecessary work.
300282
end
301283
end
302284
```
@@ -320,7 +302,7 @@ end
320302
>
321303
> This will not update the values of rows previously-set by the old default. This value has been materialized at the time of insert/update and therefore has no distinction between whether it was set by the column `DEFAULT` or set by the original operation.
322304
>
323-
> If you want to update the default of already-written rows, you must distinguish them somehow and modify them with a [backfill](backfilling_data.html)
305+
> If you want to update the default of already-written rows, you must distinguish them somehow and modify them with a [backfill](backfilling_data.md).
324306
325307
## Changing the type of a column
326308

@@ -588,7 +570,9 @@ These can be in the same deployment, but ensure there are 2 separate migrations.
588570

589571
## Setting NOT NULL on an existing column
590572

591-
Setting NOT NULL on an existing column blocks reads and writes while every row is checked. Just like the Adding a check constraint scenario, there are two operations occurring:
573+
In Postgres, setting NOT NULL on an existing column requires scanning the table and can block concurrent updates while every row is checked. Recent MySQL/InnoDB versions permit concurrent DML for many NOT NULL changes, though the operation may still rebuild the table.
574+
575+
Just like the "Adding a check constraint" scenario, two operations occur:
592576

593577
1. Creating a new constraint for new or updated records
594578
1. Validating the new constraint for existing records
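
The two operations above, followed by the final `NOT NULL` change, might be split across migrations roughly as follows on Postgres. This is a sketch under stated assumptions: the `products` table, the `active` column, the constraint name `active_not_null`, and the module names are all hypothetical. It relies on Postgres 12+ being able to skip the full scan for `SET NOT NULL` when a validated check constraint already proves that the column has no nulls.

```elixir
# Migration 1: create the constraint without checking existing rows.
# New and updated rows are checked from this point on.
defmodule MyApp.Repo.Migrations.AddActiveNotNullConstraint do
  use Ecto.Migration

  def change do
    create constraint("products", :active_not_null,
             check: "active IS NOT NULL",
             validate: false
           )
  end
end

# Migration 2: validate existing rows. This scans the table, but it does
# not take a lock that blocks ordinary reads and writes.
defmodule MyApp.Repo.Migrations.ValidateActiveNotNullConstraint do
  use Ecto.Migration

  def up do
    execute("ALTER TABLE products VALIDATE CONSTRAINT active_not_null")
  end

  # Nothing to undo: validation does not change the schema definition.
  def down, do: :ok
end

# Migration 3: set NOT NULL using raw SQL (so the type is not restated),
# then drop the now-redundant check constraint.
defmodule MyApp.Repo.Migrations.SetActiveNotNull do
  use Ecto.Migration

  def up do
    execute("ALTER TABLE products ALTER COLUMN active SET NOT NULL")
    drop constraint("products", :active_not_null)
  end

  def down do
    create constraint("products", :active_not_null,
             check: "active IS NOT NULL",
             validate: false
           )

    execute("ALTER TABLE products ALTER COLUMN active DROP NOT NULL")
  end
end
```

Keeping validation in its own migration means the long-running scan is separate from the short DDL statements that need stronger locks.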
@@ -684,27 +668,21 @@ def change do
684668
end
685669
```
686670

687-
## Adding a value to a PostgreSQL enum
671+
## Removing or replacing a PostgreSQL enum value
688672

689-
Adding enum values inside a transaction can be done since PostgreSQL 12. However, if you need to support older versions or want to be safe, disable the DDL transaction.
673+
PostgreSQL does not support removing enum values or changing their sort order directly. However, it does support renaming an enum value with `ALTER TYPE ... RENAME VALUE`.
690674

691-
```elixir
692-
@disable_ddl_transaction true
693-
@disable_migration_lock true
675+
If you only need to rename a value, you can do that directly:
694676

677+
```elixir
695678
def up do
696-
execute "ALTER TYPE status ADD VALUE IF NOT EXISTS 'archived'"
697-
end
698-
699-
def down do
700-
# PostgreSQL does not support removing enum values
701-
:ok
679+
execute "ALTER TYPE status RENAME VALUE 'obsolete' TO 'draft'"
702680
end
703681
```
704682

705-
## Removing or replacing a PostgreSQL enum value
683+
For multi-node deployments, still coordinate that rename with application code changes just like any other application-visible rename.
706684

707-
PostgreSQL does not support removing or modifying enum values directly. Like renaming columns or tables, this requires coordinating application code changes with database changes.
685+
If you need to remove a value, or otherwise replace the enum definition, coordinate application code changes with database changes.
708686

709687
### Bad
710688

@@ -720,7 +698,7 @@ end
720698
Take a phased approach:
721699

722700
1. **Deploy application code** that handles both old and new enum values (stops writing the value to be removed, reads both old and new values)
723-
2. **Backfill data** to migrate rows from old value to new value (see [backfilling guide](backfilling_data.html))
701+
2. **Backfill data** to migrate rows from old value to new value (see [backfilling guide](backfilling_data.md))
724702
3. **Deploy migration** to replace the enum type
725703
4. **Deploy application code** to remove handling of old value
726704

@@ -750,7 +728,7 @@ def up do
750728
end
751729
```
752730

753-
For large tables, batch this operation. See [backfilling data](backfilling_data.html) for safe approaches.
731+
For large tables, batch this operation. See [backfilling data](backfilling_data.md) for safe approaches.
754732

755733
Third deployment (replace the enum type):
756734

@@ -769,23 +747,19 @@ end
769747
770748
## Adding a PostgreSQL extension
771749

772-
Extensions cannot be created inside a transaction.
750+
`CREATE EXTENSION` can usually run inside a transaction. The main concerns are privileges, extension availability, and whether the extension's installation script depends on commands that cannot run inside a transaction block.
773751

774-
### Bad
752+
### Example
775753

776754
```elixir
777755
def change do
778-
# Fails: CREATE EXTENSION cannot run inside a transaction block
779756
execute "CREATE EXTENSION \"uuid-ossp\""
780757
end
781758
```
782759

783760
### Good
784761

785762
```elixir
786-
@disable_ddl_transaction true
787-
@disable_migration_lock true
788-
789763
def change do
790764
execute "CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\"",
791765
"DROP EXTENSION IF EXISTS \"uuid-ossp\""
@@ -795,6 +769,8 @@ end
795769
> #### Note {: .info}
796770
>
797771
> Creating extensions typically requires superuser privileges. In managed database services (AWS RDS, Heroku), some extensions may not be available.
772+
>
773+
> If an extension complains that it cannot run inside a transaction block, then disable the DDL transaction for that specific migration.
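
If an extension does need to be installed outside a transaction, the migration might look like the sketch below. The module name is hypothetical, and `uuid-ossp` is reused from the example above purely as a stand-in; it does not itself require this setting.

```elixir
defmodule MyApp.Repo.Migrations.CreateExtensionOutsideTransaction do
  use Ecto.Migration

  # Run only this migration outside of a DDL transaction. If a statement
  # fails partway through, earlier statements are not rolled back, so keep
  # the migration to a single idempotent statement.
  @disable_ddl_transaction true

  def change do
    execute(
      "CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\"",
      "DROP EXTENSION IF EXISTS \"uuid-ossp\""
    )
  end
end
```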
798774
799775
## Credits
800776

lib/ecto/migration.ex

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -89,7 +89,7 @@ defmodule Ecto.Migration do
8989
For the rest of this document, we will cover the migration APIs
9090
provided by Ecto. For an in-depth discussion of migrations and how
9191
to use them safely within your application and data, see the
92-
[Safe Ecto Migrations guide](safe_migrations.html).
92+
[Safe Ecto Migrations guide](safe_migrations.md).
9393
9494
## Mix tasks
9595
@@ -407,7 +407,7 @@ defmodule Ecto.Migration do
407407
408408
## Additional resources
409409
410-
* The [Safe Ecto Migrations guide](safe_migrations.html)
410+
* The [Safe Ecto Migrations guide](safe_migrations.md)
411411
412412
"""
413413
