[Cosmos][WIP]: Normalize region names passed as preferred or exclude regions. #49090
jeet1995 wants to merge 14 commits into Azure:main
Customers passing region names in non-canonical forms (e.g., 'west us 3' instead of 'West US 3') hit routing issues because the Java SDK stores region names in different forms and some comparisons use case-sensitive String.equals()/List.contains().
Changes:
- Add RegionNameMapper: strips spaces + case-insensitive lookup against 90+ known Azure regions to produce canonical names (e.g., 'westus3' or 'west us 3' -> 'West US 3'). Unknown regions pass through as-is.
- ConnectionPolicy.setPreferredRegions(): normalize + order-preserving dedupe at entry point.
- LocationCache constructor: apply RegionNameMapper before toLowerCase for defense-in-depth.
- Fix case-sensitive List.contains() bug in reevaluate() (line 502): use containsRegionIgnoreCase() instead.
- Normalize user-configured exclude regions at point of use in getApplicableRegionRoutingContexts() to prevent mismatches with PPCB-derived lowercased region names.
- Add RegionNameMapperTest with 43 unit tests covering case variants, space removal, passthrough, null/empty handling.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
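The normalization this commit describes (strip spaces, lowercase, look up a canonical name, pass unknown input through) might look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `RegionNameNormalizerSketch`, the `normalize` method, and the three-entry region list are hypothetical and are not the SDK's actual RegionNameMapper.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; not the SDK's RegionNameMapper.
final class RegionNameNormalizerSketch {
    // Tiny illustrative subset; the commit mentions 90+ known regions.
    private static final List<String> CANONICAL_NAMES =
        Arrays.asList("East US", "West US 3", "North Europe");

    // Lookup key (lowercase, no spaces) -> canonical name.
    private static final Map<String, String> KEY_TO_CANONICAL = new HashMap<>();

    static {
        for (String name : CANONICAL_NAMES) {
            KEY_TO_CANONICAL.put(toKey(name), name);
        }
    }

    private static String toKey(String regionName) {
        return regionName.toLowerCase(Locale.ROOT).replace(" ", "");
    }

    // Returns the canonical name for known regions; unknown, null, or empty input is returned unchanged.
    static String normalize(String regionName) {
        if (regionName == null || regionName.isEmpty()) {
            return regionName;
        }
        String canonical = KEY_TO_CANONICAL.get(toKey(regionName));
        return canonical != null ? canonical : regionName;
    }

    public static void main(String[] args) {
        System.out.println(normalize("west us 3"));     // known region, lowercase with spaces
        System.out.println(normalize("WESTUS3"));       // known region, uppercase without spaces
        System.out.println(normalize("Future Region")); // unknown region, passed through
    }
}
```

Using `Locale.ROOT` keeps the lowercase step independent of the JVM's default locale, which matters for inputs typed on machines with locale-specific casing rules.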
The static region list in RegionNameMapper goes stale when new Azure regions are added. Fix: add a ConcurrentHashMap-backed dynamic tier that learns canonical region names from server responses.
- RegionNameMapper.registerRegionName(): registers canonical names from DatabaseAccountLocation (called from LocationCache.addRoutingContexts). After the first account read, even new regions like 'West US 4' can normalize 'westus4' → 'West US 4'.
- getCosmosDBRegionName(): checks static map first, then dynamic map.
- Add 2 new tests for dynamic registration behavior.
- 45/45 RegionNameMapperTest pass, 32/32 LocationCacheTest pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
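A two-tier lookup of the kind this commit describes could be sketched as follows. This is a hedged illustration, not the code from the PR: the class `DynamicRegionRegistrySketch` and the single-entry static map are assumptions, and only the method names `registerRegionName` and `getCosmosDBRegionName` are taken from the commit message.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a static tier plus a dynamic tier learned at runtime.
final class DynamicRegionRegistrySketch {
    // Static tier: compiled-in names (one entry here for brevity).
    private static final Map<String, String> STATIC_KEY_TO_CANONICAL =
        Collections.singletonMap("westus3", "West US 3");

    // Dynamic tier: canonical names learned from server responses; safe for concurrent writers.
    private static final Map<String, String> DYNAMIC_KEY_TO_CANONICAL = new ConcurrentHashMap<>();

    private static String toKey(String regionName) {
        return regionName.toLowerCase(Locale.ROOT).replace(" ", "");
    }

    // Records a canonical name as returned by the service; the first registration wins.
    static void registerRegionName(String canonicalName) {
        if (canonicalName == null || canonicalName.isEmpty()) {
            return;
        }
        DYNAMIC_KEY_TO_CANONICAL.putIfAbsent(toKey(canonicalName), canonicalName);
    }

    // Checks the static tier first, then the dynamic tier, then passes the input through.
    static String getCosmosDBRegionName(String regionName) {
        if (regionName == null || regionName.isEmpty()) {
            return regionName;
        }
        String key = toKey(regionName);
        String canonical = STATIC_KEY_TO_CANONICAL.get(key);
        if (canonical == null) {
            canonical = DYNAMIC_KEY_TO_CANONICAL.get(key);
        }
        return canonical != null ? canonical : regionName;
    }
}
```

In this sketch, `getCosmosDBRegionName("westus4")` returns the input unchanged until `registerRegionName("West US 4")` has been called, after which it returns `West US 4`.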
The previous commit had stash conflict markers (<<<<<<< Updated upstream / >>>>>>> Stashed changes) left in RegionNameMapper.java and RegionNameMapperTest.java. Rewrote both files clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Merge the separate RegionNameMapper into RegionNameToRegionIdMap as the single source of truth for region names. This eliminates maintaining two parallel region lists that can drift out of sync.
Changes:
- Delete RegionNameMapper.java; normalization logic moved into RegionNameToRegionIdMap.
- RegionNameToRegionIdMap now provides region ID mapping (existing) AND region name normalization (new) from one canonical list.
- Sync REGION_NAME_TO_REGION_ID_MAPPINGS with backend RegionToIdMap.cs: add Bleu France Central/South (107/108), Delos Cloud Germany Central/North (109/110), Singapore Central/North (111/112), fix 'easteurope' → 'East Europe' (54).
- Build NORMALIZED_REGION_NAME_TO_REGION_ID_MAPPINGS programmatically from REGION_NAME_TO_REGION_ID_MAPPINGS instead of manual duplication.
- Normalization static map seeded from ID map keys + additional regions without IDs yet (from .NET SDK Regions.cs).
- Rename test: RegionNameMapperTest → RegionNameToRegionIdMapNormalizationTest.
- Update ConnectionPolicy and LocationCache references.
- All 78 tests pass (45 normalization + 32 LocationCache + 1 consistency).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add 7 tests to LocationCacheTest using real Azure region names to verify that preferred regions and exclude regions work correctly with non-canonical input:
- preferredRegions_lowercaseShouldMatchCanonical: 'west us 3' → West US 3
- preferredRegions_noSpacesShouldMatchCanonical: 'westus3' → West US 3
- preferredRegions_uppercaseShouldMatchCanonical: 'WEST US 3' → West US 3
- preferredRegions_duplicateAfterNormalizationShouldDedupe: 'westus3' + 'West US 3' deduped to single entry
- excludeRegions_lowercaseNoSpacesShouldExclude: 'westus3' excludes West US 3
- excludeRegions_mixedCasingShouldExclude: 'EAST us' excludes East US
- excludeRegions_requestLevelNoSpacesShouldExclude: request-level 'eastus' excludes East US
All 39 LocationCacheTest unit tests pass (32 existing + 7 new).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create a second CosmosClient with space-stripped preferred regions (e.g., 'westus3' instead of 'west us 3') and verify that routing and region exclusion work identically to canonical names.
New tests:
- nonCanonicalPreferredRegions_shouldRouteCorrectly: client with space-stripped preferred regions routes to correct first region (7 operation types via DataProvider)
- nonCanonicalExcludeRegion_shouldSkipExcludedRegion: excluding with space-stripped name (e.g., 'westus3') correctly skips that region (7 operation types via DataProvider)
- uppercaseExcludeRegion_shouldSkipExcludedRegion: excluding with UPPERCASE name correctly skips that region
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add tests that create CosmosClients with space-stripped preferred regions (e.g., 'westus3' instead of 'West US 3') and verify correct routing.
FaultInjectionWithAvailabilityStrategyTestsBase:
- Add nonCanonicalWriteableRegions field (space-stripped from server names)
- readAfterCreation_nonCanonicalPreferredRegions_shouldRouteCorrectly: creates client with space-stripped regions, reads with eager availability strategy, verifies first contacted region matches expected canonical name
PerPartitionCircuitBreakerE2ETests:
- nonCanonicalPreferredRegions_ppcbShouldStillRouteCorrectly: creates client with space-stripped regions, performs create+read, verifies diagnostics show routing to correct first preferred region
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Simplify RegionNameToRegionIdMap by removing the ConcurrentHashMap-backed dynamic registration tier. Unknown regions are returned as-is, which is sufficient because LocationCache's CaseInsensitiveMap + toLowerCase handles the matching for any region the server returns.
- Remove DYNAMIC_NORMALIZED_TO_CANONICAL and registerRegionName()
- Remove registerRegionName() call from LocationCache.addRoutingContexts()
- Replace dynamic registration tests with passthrough assertion tests
- 84/84 tests pass (44 normalization + 39 LocationCache + 1 consistency)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sible Duplicate preferred regions after normalization (e.g., ['westus3', 'West US 3'] both becoming 'West US 3') are an obvious customer misconfiguration. The SDK should not silently mask this — let the duplicates pass through so the customer can see and fix their config. Also clarify code comments for the escape hatch behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
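The difference between the earlier order-preserving dedupe and the pass-through behavior chosen in this commit can be shown with a small sketch. Everything here is hypothetical (the class `PreferredRegionDuplicatesSketch`, its method names, and a stand-in normalizer that knows a single region); it only illustrates the two list-handling strategies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch contrasting "keep duplicates" with "order-preserving dedupe".
final class PreferredRegionDuplicatesSketch {
    // Stand-in normalizer that only knows 'West US 3'; other input is passed through.
    static String normalize(String regionName) {
        String key = regionName.toLowerCase(Locale.ROOT).replace(" ", "");
        return key.equals("westus3") ? "West US 3" : regionName;
    }

    // Behavior after this commit: normalize each entry, keep duplicates visible.
    static List<String> normalizeKeepingDuplicates(List<String> regions) {
        List<String> result = new ArrayList<>(regions.size());
        for (String region : regions) {
            result.add(normalize(region));
        }
        return result;
    }

    // Earlier behavior: normalize, then drop later duplicates while preserving order.
    static List<String> normalizeWithDedupe(List<String> regions) {
        return new ArrayList<>(new LinkedHashSet<>(normalizeKeepingDuplicates(regions)));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("westus3", "East US", "West US 3");
        System.out.println(normalizeKeepingDuplicates(input)); // [West US 3, East US, West US 3]
        System.out.println(normalizeWithDedupe(input));        // [West US 3, East US]
    }
}
```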
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add 10 regions from the authoritative LocationNames.cs that were missing from the normalization map: East US SLV, Southeast US, Southwest US, South Central US 2, Southeast US 3, Southeast US 5, Northeast US 5, India South Central, Southeast Asia 3, West Central US FRE. Region ID mappings remain a subset (only regions with assigned IDs from RegionToIdMap.cs). The normalization map is the superset sourced from LocationNames.cs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the previous RegionToIdMap.cs-based ID map with the complete regionToIdMapping from Settings.xml (IDs 1-124). This is the authoritative source for region name ↔ ID mappings used for session token region-level progress tracking.
- Add 44 new region IDs (74-124): Brazil Southeast, West US 3, Qatar Central, Italy North, East US 3, Saudi Arabia East, etc.
- Remove separate 'additional canonical names' block: all canonical names now derive from the ID map since Settings.xml is the superset.
- Remove 'Greece Central' which was not in any authoritative source.
- Update javadoc and code comments to reference Settings.xml as the authoritative source instead of RegionToIdMap.cs.
- 83/83 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ationCache
- Rename class to RegionUtils, which better reflects its dual role (ID mapping + region name normalization).
- Move normalizeRegionNames() and containsRegionIgnoreCase() from LocationCache private helpers into RegionUtils as public static methods.
- Rename all test files to match: RegionUtilsNormalizationTest, RegionUtilsTests.
- Update all references across ConnectionPolicy, LocationCache, PartitionScopedRegionLevelProgress, RxDocumentClientImpl.
- 83/83 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/azp run java - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
@sdkReviewAgent
Pull request overview
This PR addresses Cosmos DB routing mismatches caused by non-canonical Azure region name inputs by introducing centralized region normalization and applying it to preferred/excluded region handling across the Cosmos Java SDK routing stack.
Changes:
- Added `RegionUtils` as the single source of truth for region ID mappings and canonical region name normalization, and updated call sites to use it.
- Normalized preferred/excluded regions in `ConnectionPolicy` and `LocationCache`, including a fix for a case-sensitive exclude-region check in PPCB reevaluation logic.
- Added/updated unit and E2E tests to validate routing behavior with non-canonical region inputs and updated the Cosmos changelog entry.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java | Switches region ID lookup to RegionUtils. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/routing/RegionUtils.java | Introduces region normalization + region ID mapping utilities. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/routing/RegionNameToRegionIdMap.java | Removes the old region mapping class in favor of RegionUtils. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/routing/LocationCache.java | Normalizes excluded regions and fixes PPCB exclude-region comparison behavior. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/PartitionScopedRegionLevelProgress.java | Updates region ID/name lookups to RegionUtils. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ConnectionPolicy.java | Normalizes preferred regions at configuration time. |
| sdk/cosmos/azure-cosmos/CHANGELOG.md | Documents the normalization + PPCB exclude-region fix. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/PerPartitionCircuitBreakerE2ETests.java | Adds E2E coverage for PPCB routing with non-canonical preferred regions. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/routing/RegionUtilsNormalizationTest.java | Adds unit coverage for normalization behavior. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/routing/LocationCacheTest.java | Adds integration-style unit tests for preferred/exclude region normalization with real region names. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/RegionUtilsTests.java | Updates the existing mapping-consistency test to the new RegionUtils. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/FaultInjectionWithAvailabilityStrategyTestsBase.java | Adds E2E validation for availability strategy routing with non-canonical preferred regions. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/ExcludeRegionTests.java | Adds E2E coverage for non-canonical preferred/exclude region inputs. |
List<RegionalRoutingContext> applicableEndpoints = new ArrayList<>();

// Normalize user-configured exclude regions to canonical form for consistent comparison.
// Unknown regions not in the static map are passed through as-is.
public static final Map<String, Integer> REGION_NAME_TO_REGION_ID_MAPPINGS = new HashMap<String, Integer>() {
    {
        put("East US", 1);
        put("East US 2", 2);
        put("Central US", 3);
        put("North Central US", 4);
        put("South Central US", 5);
assertThat(diagnosticsContext.getContactedRegionNames().iterator().next())
    .isEqualTo(expectedFirstRegion);
@sdkReviewAgent |
    return canonical;
}

return regionName;
I am wondering whether we should fall back to the normalized version when the region is not found in the map. Even in GlobalEndpointManager, we just use the normalized version to find the regional endpoint.
 * returned as-is.</li>
 * </ol>
 */
public class RegionUtils {
Should we also switch GlobalEndpointManager to use RegionUtils?
String normalized = regionName.toLowerCase(Locale.ROOT).replace(" ", "");

String canonical = NORMALIZED_TO_CANONICAL.get(normalized);
🟢 Suggestion (Forward Compatibility): Unknown regions with space variants won't match
For unknown regions (not in the static map), getCosmosDBRegionName returns the input as-is. This means two variant spellings of the same unknown region won't match if they differ by spaces:
- Customer passes `preferredRegions = ["futureregion"]`
- Server returns `"Future Region"` → stored as `"future region"` in `addRoutingContexts`
- Preferred location `"futureregion"` ≠ `"future region"` → region not matched
The PR description acknowledges this: "spaces are optional for known regions only." The defense-in-depth via CaseInsensitiveMap + toLowerCase() handles case differences for unknown regions, but not space differences.
If forward compatibility for space-stripped unknown regions is desired, the fallback could return the normalized form instead of the original:
    // Instead of: return regionName;
    return normalized; // lowercase + no-spaces, matches server after toLowerCase
This would make "futureregion" and "Future Region" both collapse to "futureregion", matching in the lowercased endpoint map. The tradeoff: diagnostic logs would show the normalized form instead of the user's original input.
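The two fallback choices discussed in this comment can be compared side by side in a small sketch. It assumes, as the comment implies, that both the customer's input and the server-returned name pass through the same normalizer before being lowercased for lookup. The class `UnknownRegionFallbackSketch`, its method names, and the one-entry known-region map are hypothetical.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch comparing pass-through and normalized fallbacks for unknown regions.
final class UnknownRegionFallbackSketch {
    private static final Map<String, String> KEY_TO_CANONICAL =
        Collections.singletonMap("westus3", "West US 3");

    private static String toKey(String regionName) {
        return regionName.toLowerCase(Locale.ROOT).replace(" ", "");
    }

    // Fallback A: unknown regions are returned exactly as given.
    static String withPassThroughFallback(String regionName) {
        String canonical = KEY_TO_CANONICAL.get(toKey(regionName));
        return canonical != null ? canonical : regionName;
    }

    // Fallback B: unknown regions are returned in normalized form (lowercase, no spaces).
    static String withNormalizedFallback(String regionName) {
        String key = toKey(regionName);
        String canonical = KEY_TO_CANONICAL.get(key);
        return canonical != null ? canonical : key;
    }

    // Models a lowercased endpoint-map lookup: both sides are normalized, then lowercased.
    static boolean matchesWithPassThrough(String userInput, String serverName) {
        return withPassThroughFallback(userInput).toLowerCase(Locale.ROOT)
            .equals(withPassThroughFallback(serverName).toLowerCase(Locale.ROOT));
    }

    static boolean matchesWithNormalized(String userInput, String serverName) {
        return withNormalizedFallback(userInput).toLowerCase(Locale.ROOT)
            .equals(withNormalizedFallback(serverName).toLowerCase(Locale.ROOT));
    }
}
```

With fallback A, `matchesWithPassThrough("futureregion", "Future Region")` is false because the space survives lowercasing; with fallback B the same pair matches. Known regions such as `westus3` match under both.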
    }
};

public static final Map<Integer, String> REGION_ID_TO_NORMALIZED_REGION_NAME_MAPPINGS = new HashMap<Integer, String>() {
🟡 Recommendation (Maintenance): Derive REGION_ID_TO_NORMALIZED_REGION_NAME_MAPPINGS programmatically
This map is still 124 hand-maintained entries, while NORMALIZED_REGION_NAME_TO_REGION_ID_MAPPINGS (same data, inverse direction) was correctly refactored to be derived programmatically from REGION_NAME_TO_REGION_ID_MAPPINGS. The inconsistency creates a maintenance trap: when a new region is added to REGION_NAME_TO_REGION_ID_MAPPINGS, a contributor must also remember to update this manual map. Two of four maps auto-derive; this one doesn't.
The existing test regionIdToRegionNameConsistency catches inconsistencies, but a pit-of-success design would eliminate the risk entirely:
    static {
        Map<Integer, String> idToNormalized = new HashMap<>();
        Map<String, Integer> normalizedToId = new HashMap<>();
        for (Map.Entry<String, Integer> entry : REGION_NAME_TO_REGION_ID_MAPPINGS.entrySet()) {
            String normalized = entry.getKey().toLowerCase(Locale.ROOT).replace(" ", "");
            normalizedToId.put(normalized, entry.getValue());
            idToNormalized.putIfAbsent(entry.getValue(), normalized);
        }
        NORMALIZED_REGION_NAME_TO_REGION_ID_MAPPINGS = Collections.unmodifiableMap(normalizedToId);
        REGION_ID_TO_NORMALIZED_REGION_NAME_MAPPINGS = Collections.unmodifiableMap(idToNormalized);
    }
This eliminates ~130 lines of duplicated data and makes REGION_NAME_TO_REGION_ID_MAPPINGS the true single source of truth.
/**
 * Tests for {@link RegionUtils}
 */
public class RegionUtilsNormalizationTest {
🟡 Recommendation (Testing): No unit tests for containsRegionIgnoreCase and normalizeRegionNames
These two new public utility methods have zero direct unit tests. containsRegionIgnoreCase is the direct replacement for the buggy List.contains(), so it deserves focused tests documenting its contract. normalizeRegionNames silently drops null elements from the input (reducing list size), which is a behavioral contract that should be explicitly tested.
Suggested additions to RegionUtilsNormalizationTest:
    // containsRegionIgnoreCase
    assertThat(RegionUtils.containsRegionIgnoreCase(
        Arrays.asList("westus3"), "West US 3")).isTrue();
    assertThat(RegionUtils.containsRegionIgnoreCase(
        Arrays.asList("West US 3"), "WEST US 3")).isTrue();
    assertThat(RegionUtils.containsRegionIgnoreCase(
        null, "anything")).isFalse();
    assertThat(RegionUtils.containsRegionIgnoreCase(
        Arrays.asList("East US"), "unknownRegion")).isFalse();
    // normalizeRegionNames
    assertThat(RegionUtils.normalizeRegionNames(
        Arrays.asList("westus3", "east us")))
        .containsExactly("West US 3", "East US");
    assertThat(RegionUtils.normalizeRegionNames(null)).isEmpty();
    assertThat(RegionUtils.normalizeRegionNames(
        Arrays.asList("East US", null, "westus3")))
        .containsExactly("East US", "West US 3"); // null dropped
}
String normalizedTarget = getCosmosDBRegionName(target);
for (String region : regions) {
    if (getCosmosDBRegionName(region).equalsIgnoreCase(normalizedTarget)) {
🟡 Recommendation (Correctness): containsRegionIgnoreCase NPE if list contains null elements
If region is null, getCosmosDBRegionName(null) returns null (via the StringUtils.isEmpty guard), then null.equalsIgnoreCase(normalizedTarget) throws NullPointerException.
Current callers are safe: the sole production caller at LocationCache:506 passes a list already processed by normalizeRegionNames(), which filters nulls. But this is a package-visible utility method whose contract doesn't document the null-element restriction. A future caller could hit this.
    for (String region : regions) {
        if (region != null && getCosmosDBRegionName(region).equalsIgnoreCase(normalizedTarget)) {
            return true;
        }
    }
  if (Utils.tryGetValue(regionalRoutingContextsByRegionName, internalExcludeRegion, regionalRoutingContextValueHolder)) {
-     if (!regionalRoutingContextValueHolder.v.equals(firstApplicableRegionalRoutingContext) && !userConfiguredExcludeRegions.contains(internalExcludeRegion)) {
+     if (!regionalRoutingContextValueHolder.v.equals(firstApplicableRegionalRoutingContext) && !RegionUtils.containsRegionIgnoreCase(userConfiguredExcludeRegions, internalExcludeRegion)) {
🟡 Recommendation (Testing): No regression test for the PPCB List.contains() bug fix
This line is the actual bug fix described in the PR: it replaces the case-sensitive List.contains(internalExcludeRegion) with RegionUtils.containsRegionIgnoreCase(userConfiguredExcludeRegions, internalExcludeRegion). It prevents excluded regions from being silently re-added as retry targets when casing didn't match.
However, no test exercises this specific code path. The reevaluate() method is only entered when applicableRegionalRoutingContexts.size() < 2 AND preferredRoutingContexts.size() > 1 AND PPCB internal exclude regions are present. None of the 6 new LocationCacheTest normalization tests pass internalExcludeRegions. The E2E PPCB test (nonCanonicalPreferredRegions_ppcbShouldStillRouteCorrectly) does a basic create+read without fault injection, so it never triggers the circuit breaker or enters reevaluate.
A targeted regression test would:
1. set up user-configured exclude regions in non-canonical form (e.g., "westus3"),
2. provide internalExcludeRegions in lowercased canonical form (e.g., "west us 3"),
3. trigger the reevaluate path (ensure only 1 applicable endpoint remains),
4. assert the internally excluded region is NOT re-added when it matches the user's exclude list after normalization.
Without this, a future refactor could reintroduce the case-sensitive contains() without breaking any test.
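The mismatch this comment describes can be reproduced without the SDK. The sketch below is a simplified model, not the PR's RegionUtils: the class `ExcludeRegionContainsSketch` and the method `containsRegionLoosely` are hypothetical, and it compares a lowercase, space-stripped key rather than looking up canonical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical model of the exclude-region membership check.
final class ExcludeRegionContainsSketch {
    private static String toKey(String regionName) {
        return regionName.toLowerCase(Locale.ROOT).replace(" ", "");
    }

    // Null-safe membership check that ignores case and spaces.
    static boolean containsRegionLoosely(List<String> regions, String target) {
        if (regions == null || target == null) {
            return false;
        }
        String targetKey = toKey(target);
        for (String region : regions) {
            if (region != null && toKey(region).equals(targetKey)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> userExcluded = Arrays.asList("westus3"); // non-canonical user input
        String internalExcluded = "west us 3";                // lowercased name from the circuit breaker

        // Exact string comparison misses the match, so the region would be treated as not user-excluded.
        System.out.println(userExcluded.contains(internalExcluded));                // false
        // The loose comparison recognizes both spellings as the same region.
        System.out.println(containsRegionLoosely(userExcluded, internalExcluded)); // true
    }
}
```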
✅ Review complete (58:11) Posted 5 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage
- Fix comment indentation in LocationCache (line 349)
- Make REGION_NAME_TO_REGION_ID_MAPPINGS unmodifiable to prevent accidental mutation after initialization
- Derive REGION_ID_TO_NORMALIZED_REGION_NAME_MAPPINGS programmatically from the forward map, which eliminates ~130 lines of manual duplication
- Return normalized form (lowercase, no spaces) for unknown regions instead of as-is, which ensures space-stripped unknown regions match after LocationCache toLowerCase() (e.g., 'futureregion' matches 'future region')
- Add null guard in containsRegionIgnoreCase to prevent NPE on null list elements
- Fix PPCB test to use contains() instead of iterator().next() to avoid flaky assertion on Set iteration order
- Add unit tests for normalizeRegionNames() and containsRegionIgnoreCase() (9 new tests covering normalization, null/empty, null elements, matches)
- Update unknown-region tests to expect normalized form
- 90/90 tests pass (51 normalization + 38 LocationCache + 1 consistency)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/azp run java - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
Problem
Customers passing region names in non-canonical forms (e.g., `west us 3` or `westus3` instead of `West US 3`) hit routing issues. The Java SDK stores region names in different representations (lowercased in maps, original case in lists), and some code paths use case-sensitive `String.equals()`/`List.contains()`, causing mismatches between user-provided and server-returned region names.
The .NET SDK solves this with a `RegionNameMapper` that normalizes all region input to canonical form at client construction. The Java SDK lacked this.
Core approach
Normalization: strip spaces + lowercase → lookup in a static map of known Azure regions → return canonical form.
| Input | Normalized key | Canonical name | Result |
|---|---|---|---|
| `westus3` | `westus3` | `West US 3` | ✓ |
| `west us 3` | `westus3` | `West US 3` | ✓ |
| `WEST US 3` | `westus3` | `West US 3` | ✓ |
| `West US 3` | `westus3` | `West US 3` | ✓ (no-op) |
Escape hatch for unknown regions: If a region is not in the static map (e.g., a brand-new Azure region not yet compiled into the SDK), the input is returned as-is. This works because `LocationCache` applies `toLowerCase()` and uses `CaseInsensitiveMap` for all endpoint lookups, so even unknown regions match correctly as long as the customer's input has the same words as the server response (any casing works, spaces are optional for known regions only).
No deduplication: If a customer passes `["westus3", "West US 3"]`, both normalize to `"West US 3"` and the list will contain two identical entries. This is intentional: duplicate preferred regions are an obvious misconfiguration that the customer should fix, not something the SDK should silently mask.
Changes
`RegionUtils.java` (renamed from `RegionNameToRegionIdMap.java`): single source of truth
- Renamed to `RegionUtils` to better reflect its dual role: region ID mapping + region name normalization.
- Region ID map taken from the complete `regionToIdMapping` in Settings.xml (IDs 1–124). Used only for session token region-level progress tracking (localLsn).
- `getCosmosDBRegionName(String)`: static normalizer. Canonical names derived from the ID map. Unknown regions passed through as-is.
- `normalizeRegionNames(List<String>)` and `containsRegionIgnoreCase(List<String>, String)`: utilities for batch normalization and case-insensitive region membership checks (moved out of `LocationCache`).
- `NORMALIZED_REGION_NAME_TO_REGION_ID_MAPPINGS` is built programmatically from the ID map (was manually duplicated before).
`ConnectionPolicy.setPreferredRegions()`
- Applies `RegionUtils.getCosmosDBRegionName()` at entry.
`LocationCache.java`
- Applies `RegionUtils.getCosmosDBRegionName()` before `toLowerCase()` (defense-in-depth).
- Replaces `List.contains()` with `RegionUtils.containsRegionIgnoreCase()` in PPCB reevaluate logic; the old check was causing excluded regions to be re-added as retry targets when casing didn't match.
- Applies `RegionUtils.normalizeRegionNames()` to exclude regions before comparison in `getApplicableRegionRoutingContexts()`.
Tests
Unit tests (run locally, all pass)
- `RegionUtilsNormalizationTest`: case variants, space removal, passthrough, null/empty, unknown regions
- `LocationCacheTest`: 32 existing + 6 new integration tests with real Azure region names (preferred region and exclude region normalization)
- `RegionUtilsTests`: existing ID map consistency check
E2E tests (run in CI)