docs: document phone number normalization migration #2510
Open: hperl wants to merge 11 commits into master from hperl/phone-normalization
+131
−9
Commits
b681f9b
docs(kratos): add phone number normalization migration guide
hperl fc9869f
chore: format
hperl ca76591
fix build
hperl 1db2440
fix tests
hperl e4eb1f1
chore: format
hperl 11a798b
fix
hperl bcd209a
chore: format
hperl 7d5df3d
code review
hperl 52f7095
add to sidebar
hperl 0d7af17
Merge remote-tracking branch 'origin/master' into hperl/phone-normali…
hperl 13559b6
Apply suggestions from code review
hperl
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,126 @@ | ||
| --- | ||
| id: normalize-phone-numbers | ||
| title: Normalize phone numbers to E.164 | ||
| sidebar_label: Normalize phone numbers | ||
| --- | ||
|
|
||
| Ory Kratos normalizes phone numbers to [E.164 format](https://en.wikipedia.org/wiki/E.164) when they're used as identifiers, | ||
| verifiable addresses, or recovery addresses. New data is normalized on write. Existing data continues to work through a | ||
| backward-compatible lookup, but you should run the `normalize-phone-numbers` migration command after upgrading to converge all | ||
| rows to E.164. | ||
|
|
||
| This guide is for self-hosted Ory Kratos administrators (OSS and OEL). Ory Network customers don't need to take any action. | ||
|
|
||
| :::important | ||
|
|
||
| Back up your database before running the migration. The migration doesn't store the original values, so there's no automatic | ||
| rollback. To revert, you must restore the database from that backup. | ||
|
|
||
| ::: | ||
|
|
||
| ## Why normalize | ||
|
|
||
| Before this change, Ory Kratos stored phone numbers exactly as users entered them. A user who registered with `+49 176 671 11 638` and | ||
| another who registered with `+4917667111638` would create two separate identities for the same phone number. Lookups, recovery, | ||
| and verification could behave inconsistently depending on the input format. | ||
|
|
||
| After normalization, all phone numbers are stored in E.164 format (for example, `+4917667111638`). Lookups match regardless of how | ||
| the user formatted the input. | ||
|
|
||
| ## Rollout sequence | ||
|
|
||
| :::caution | ||
|
|
||
| Don't run the migration before deploying the new Kratos version. The previous version does exact-string matching on identifiers. | ||
| If you normalize the database first, users who type their phone number in the original (non-E.164) format won't be able to log in | ||
| until the new code is deployed. | ||
|
|
||
| ::: | ||
|
|
||
| Run the steps in this exact order: | ||
|
|
||
| 1. Deploy the new Ory Kratos version. | ||
| The new code normalizes phone numbers on write and uses a backward-compatible lookup that matches both E.164 and legacy | ||
| formats. Existing users can still log in with whatever format they originally registered with. | ||
|
|
||
| 2. Run the migration command. | ||
| After the deploy completes and traffic is stable, run: | ||
|
|
||
| ``` | ||
| kratos migrate normalize-phone-numbers <database-url> | ||
| ``` | ||
|
|
||
| Or with the DSN from the environment: | ||
|
|
||
| ``` | ||
| export DSN=... | ||
| kratos migrate normalize-phone-numbers -e | ||
| ``` | ||
|
|
||
| The command iterates over `identity_credential_identifiers`, `identity_verifiable_addresses`, and `identity_recovery_addresses` | ||
| and rewrites any non-E.164 phone numbers in place. | ||
|
|
||
| ## What the command does | ||
|
|
||
| The command uses keyset pagination to scan three tables in batches: | ||
|
|
||
| | Table | Column | Filter | | ||
| | --------------------------------- | ------------ | ---------------------- | | ||
| | `identity_credential_identifiers` | `identifier` | `identifier LIKE '+%'` | | ||
| | `identity_verifiable_addresses` | `value` | `via = 'sms'` | | ||
| | `identity_recovery_addresses` | `value` | `via = 'sms'` | | ||
|
|
||
| For each row, the command parses the value with the [`nyaruka/phonenumbers`](https://github.com/nyaruka/phonenumbers) library and | ||
| rewrites it to E.164 if parsing succeeds. Rows that fail to parse - for example, an OIDC subject that happens to start with `+` - | ||
| are left untouched and counted as skipped. | ||
|
|
||
| The command is idempotent: running it twice is safe. A second run rewrites nothing and counts the already-normalized rows as skipped. | ||
|
|
||
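The scan described under "What the command does" can be sketched roughly as follows. This is a minimal illustration in Python, not the Kratos implementation. It rests on several assumptions: the `identifiers` table and its columns are hypothetical, SQLite stands in for the real database, and `to_e164` is a deliberately simplified stand-in for `nyaruka/phonenumbers` that only strips common separators and checks the E.164 shape (the real library validates against numbering plans).

```python
import re
import sqlite3

# E.164 shape: "+" followed by at most 15 digits, not starting with 0.
# The lower bound of 8 digits is an arbitrary choice for this sketch.
E164 = re.compile(r"\+[1-9]\d{7,14}")


def to_e164(raw):
    """Simplified stand-in for a real phone parser (assumption, see lead-in).

    Strips spaces, dots, hyphens, and parentheses, then returns the result
    only if it has the E.164 shape. Returns None otherwise.
    """
    candidate = re.sub(r"[\s().-]", "", raw)
    return candidate if E164.fullmatch(candidate) else None


def normalize_table(conn, batch_size=1000, dry_run=False):
    """Scan the hypothetical `identifiers` table with keyset pagination."""
    stats = {"scanned": 0, "updated": 0, "skipped": 0, "errors": 0}
    last_id = ""  # keyset cursor: resume after the last ID seen
    while True:
        rows = conn.execute(
            "SELECT id, identifier FROM identifiers "
            "WHERE identifier LIKE '+%' AND id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return stats
        for row_id, value in rows:
            stats["scanned"] += 1
            normalized = to_e164(value)
            if normalized is None or normalized == value:
                # Not a phone number, or already normalized: leave untouched.
                stats["skipped"] += 1
                continue
            if not dry_run:
                try:
                    conn.execute(
                        "UPDATE identifiers SET identifier = ? WHERE id = ?",
                        (normalized, row_id),
                    )
                except sqlite3.IntegrityError:
                    # Unique constraint violation: count it and move on.
                    stats["errors"] += 1
                    continue
            stats["updated"] += 1
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE identifiers (id TEXT PRIMARY KEY, identifier TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO identifiers VALUES (?, ?)",
    [
        ("a", "+49 176 671 11 638"),  # needs rewriting
        ("b", "+4915112345678"),      # already E.164
        ("c", "+not-a-phone"),        # starts with "+" but does not parse
        ("d", "user@example.com"),    # excluded by the LIKE filter
    ],
)
first = normalize_table(conn, batch_size=2)
second = normalize_table(conn, batch_size=2)
print(first)
print(second)
```

In this sketch the second call updates nothing, which is the idempotence property the guide describes. Paging on `id > last_id` rather than on an offset keeps the scan stable while rows are being rewritten, because the update never changes the column used as the cursor.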
| ## Flags | ||
|
|
||
| | Flag | Default | Description | | ||
| | ----------------------- | ------- | ------------------------------------------------------------------------ | | ||
| | `-e`, `--read-from-env` | `false` | Read the database connection string from the `DSN` environment variable. | | ||
| | `-b`, `--batch-size` | `1000` | Number of rows to process per batch. | | ||
| | `--dry-run` | `false` | Report what would change without writing. | | ||
|
|
||
| Use `--dry-run` first to preview the changes: | ||
|
|
||
| ``` | ||
| kratos migrate normalize-phone-numbers --dry-run -e | ||
| ``` | ||
|
|
||
| Each row that would be updated is printed in the form: | ||
|
|
||
| ``` | ||
| [dry-run] identity_credential_identifiers <id>: "+49 176 671 11 638" -> "+4917667111638" | ||
| ``` | ||
|
|
||
| ## Output | ||
|
|
||
| After processing all three tables, the command prints a summary: | ||
|
|
||
| ``` | ||
| === Summary === | ||
| identity_credential_identifiers: scanned=1234 updated=42 skipped=1192 errors=0 | ||
| identity_verifiable_addresses: scanned=987 updated=15 skipped=972 errors=0 | ||
| identity_recovery_addresses: scanned=987 updated=15 skipped=972 errors=0 | ||
| ``` | ||
|
|
||
| - `scanned`: rows examined. | ||
| - `updated`: rows rewritten to E.164 (or rows that _would_ be rewritten in dry-run mode). | ||
| - `skipped`: rows already in E.164 format, or values that aren't valid phone numbers. | ||
| - `errors`: rows that failed to update. Errors are logged to stderr with the row ID and source value. | ||
|
|
||
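If the migration runs from a deployment script, the summary can be checked mechanically. The following is a small sketch that assumes the output keeps exactly the `table: key=value ...` shape shown above. The function name `parse_summary` is hypothetical, and the format is not a documented stable interface, so treat the parser as illustrative.

```python
import re

# One summary line: "<table>: key=value key=value ..."
LINE = re.compile(r"^\s*(?P<table>\w+):\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_summary(text):
    """Return {table: {counter: int}} for every summary line found in text."""
    result = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            pairs = re.findall(r"(\w+)=(\d+)", match["counters"])
            result[match["table"]] = {key: int(value) for key, value in pairs}
    return result


sample = """\
=== Summary ===
  identity_credential_identifiers: scanned=1234 updated=42 skipped=1192 errors=0
  identity_verifiable_addresses: scanned=987 updated=15 skipped=972 errors=0
  identity_recovery_addresses: scanned=987 updated=15 skipped=972 errors=0
"""

summary = parse_summary(sample)
tables_with_errors = [table for table, counters in summary.items() if counters["errors"] > 0]
print(summary["identity_credential_identifiers"])
print(tables_with_errors)
```

A script could fail the rollout step whenever `tables_with_errors` is non-empty and point the operator at the stderr log for the affected row IDs.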
| ## Duplicate handling | ||
|
|
||
| If the migration finds two rows that normalize to the same E.164 value (for example, `+49 176 671 11 638` and `+4917667111638` for | ||
| the same user), the update fails on the second row with a unique constraint violation, which the command logs as an error and | ||
| skips. You can resolve the duplicate manually and re-run the command. | ||
|
|
||
| In practice, duplicates are rare. Most identities have only one phone identifier per credential type. | ||
|
|
||
| ## Rolling back | ||
|
|
||
| The migration only converts non-E.164 values to E.164. It doesn't store the original value, so there's no automatic rollback. If | ||
| you need to revert, restore from the backup you took before running the command. |
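The collision case under "Duplicate handling" can be checked ahead of time by grouping stored values by their normalized form. The sketch below is not part of the command. It assumes the values have already been exported as `(row_id, raw_value)` pairs, and it uses a simplified, hypothetical `to_e164` helper instead of `nyaruka/phonenumbers`, so its results may differ from the real parser for unusual inputs. It also ignores how the actual unique constraint is scoped.

```python
import re
from collections import defaultdict

# E.164 shape: "+" followed by at most 15 digits, not starting with 0.
E164 = re.compile(r"\+[1-9]\d{7,14}")


def to_e164(raw):
    """Simplified stand-in for a real phone parser (assumption, see lead-in)."""
    candidate = re.sub(r"[\s().-]", "", raw)
    return candidate if E164.fullmatch(candidate) else None


def find_collisions(rows):
    """Group row IDs by normalized value and keep groups with more than one row."""
    groups = defaultdict(list)
    for row_id, raw in rows:
        normalized = to_e164(raw)
        if normalized is not None:
            groups[normalized].append(row_id)
    return {value: ids for value, ids in groups.items() if len(ids) > 1}


rows = [
    ("id-1", "+49 176 671 11 638"),
    ("id-2", "+4917667111638"),
    ("id-3", "+1 (415) 555-0132"),
]
collisions = find_collisions(rows)
print(collisions)
```

Each reported group is a set of rows where one update would hit the unique constraint. Resolving those rows before the run keeps the `errors` counter at zero.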