Reset Copilot utility model cache and force one retry when CAPI omits chat fallback #316720
Draft
benvillalobos wants to merge 4 commits into
Conversation
Contributor
Pull request overview
This PR fixes a caching edge case in the Copilot extension’s ModelMetadataFetcher where the “Copilot utility” chat model can get stuck unresolved for up to 10 minutes if a /models response omits is_chat_fallback: true.
Changes:
- Clears `_copilotUtilityModel` when refreshing the model family map so stale/invalid state can't persist across fetches.
- Adds a bounded one-time forced refresh in `getCopilotUtilityModel()` when models were returned but no chat fallback was flagged, to avoid being trapped by the 10-minute cache window.
| File | Description |
|---|---|
| extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts | Resets cached utility model on refresh and adds a forced single retry to recover from transient missing is_chat_fallback responses. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 2
Comment on lines 279 to 283

```typescript
this._familyMap.clear();
this._copilotUtilityModel = undefined;

const data: IModelAPIResponse[] = (await response.json()).data;
this._requestLogger.logModelListCall(requestId, requestMetadata, data);
```
Comment on lines 174 to +179

```typescript
await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, this._fetchModels.bind(this));
if (!this._copilotUtilityModel && this._familyMap.size > 0) {
	// Server returned models but did not flag a chat fallback; force one refresh
	// before throwing so we are not stuck on a stale 10-minute cache window.
	await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
}
```
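For context, `TaskSingler.getOrCreate` deduplicates concurrent requests for the same key into one in-flight promise. A minimal sketch of that pattern (illustrative only, not the extension's actual implementation):

```typescript
// Minimal sketch of a task singler: concurrent getOrCreate calls for the
// same key share one in-flight promise instead of each starting a new task.
class TaskSinglerSketch<T> {
  private inflight = new Map<string, Promise<T>>();

  getOrCreate(key: string, factory: () => Promise<T>): Promise<T> {
    let task = this.inflight.get(key);
    if (!task) {
      // Drop the entry once the task settles so a later call starts fresh.
      task = factory().finally(() => this.inflight.delete(key));
      this.inflight.set(key, task);
    }
    return task;
  }
}
```

This is why both the initial fetch and the forced retry can safely flow through the same key: overlapping callers coalesce onto one request.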
Comment on lines +178 to 185

```typescript
if (!this._copilotUtilityModel && this._familyMap.size > 0 && !this._hasForcedUtilityModelRetry) {
	// One-shot retry per auth epoch: avoids storming CAPI on persistent server misconfig.
	this._hasForcedUtilityModelRetry = true;
	this._logService.warn('Utility model unset after initial fetch; forcing one refresh');
	await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
}
const resolvedModel = this._copilotUtilityModel;
if (!resolvedModel || !isChatModelInformation(resolvedModel)) {
```
Contributor
Copilot's findings
Comments suppressed due to low confidence (4)
extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:183
`_hasForcedUtilityModelRetry` is set to `true` before the forced refresh is awaited. If multiple callers reach this block concurrently, only the first will trigger and await the refresh; the others will skip it and can throw even though the refresh is already in-flight (whereas `TaskSingler` would otherwise coalesce them). Consider setting the flag only after the forced refresh completes, or tracking/awaiting a shared in-flight retry promise so all callers wait for the same refresh.
```typescript
if (!this._copilotUtilityModel && this._familyMap.size > 0 && !this._hasForcedUtilityModelRetry) {
	// One-shot retry per auth epoch: avoids storming CAPI on persistent server misconfig.
	this._hasForcedUtilityModelRetry = true;
	this._logService.warn('Utility model unset after initial fetch; forcing one refresh');
	await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
}
```
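The "shared in-flight retry promise" suggestion can be sketched as follows (a sketch under assumed names, not the extension's code): mark the one-shot as consumed only after the refresh settles, and let concurrent callers await the same promise.

```typescript
// Sketch: coalesce a one-shot forced retry across concurrent callers.
// All names here are illustrative, not taken from the extension source.
class RetryCoordinator {
  private inflight: Promise<void> | undefined;
  private done = false;

  // Runs `refresh` at most once; concurrent callers await the same promise.
  async runOnce(refresh: () => Promise<void>): Promise<void> {
    if (this.done) { return; }
    if (!this.inflight) {
      this.inflight = refresh().finally(() => {
        // Mark the one-shot as consumed only after the refresh settles,
        // so callers arriving mid-flight still awaited the same refresh.
        this.done = true;
        this.inflight = undefined;
      });
    }
    return this.inflight;
  }
}
```

With this shape, no caller can observe "flag set but refresh not yet complete" and throw prematurely.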
extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:116
- The retry gating is described as "one-shot per auth epoch", but `_hasForcedUtilityModelRetry` is only reset on logout (`copilotToken === undefined`) and on fetch errors. If the token changes or refreshes (an auth change event) while remaining defined, or after a later successful refresh cycle, the flag may stay `true` indefinitely and prevent future one-shot retries, potentially reintroducing the original failure mode. Consider resetting the flag on any auth change (or when a fetch succeeds) to match the intended "epoch" semantics.
This issue also appears in the following locations of the same file:
- line 176
- line 178
- line 304
```typescript
this._register(this._authService.onDidAuthenticationChange(() => {
	// Auth changed so next fetch should be forced to get a new list.
	// Only clear the family map if the copilot token is undefined, as this means the user
	// has logged out and we should clear the models; otherwise we want to keep the old
	// models around until we get a new list.
	if (this._authService.copilotToken === undefined) {
		this._familyMap.clear();
		this._copilotUtilityModel = undefined;
		this._hasForcedUtilityModelRetry = false;
	}
	this._completionsFamilyMap.clear();
	this._lastFetchTime = 0;
}));
```
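The suggested "epoch" semantics can be sketched with a counter that advances on every auth change, rather than a boolean that only resets on logout (names are illustrative, not from the extension):

```typescript
// Sketch of true per-epoch gating: bump an epoch counter on every auth
// change and record which epoch the one-shot retry ran in.
class RetryEpochGate {
  private epoch = 0;
  private retriedInEpoch = -1;

  // Call from the onDidAuthenticationChange handler, logged in or not.
  onAuthChange(): void {
    this.epoch++;
  }

  // Returns true exactly once per epoch; false until the next auth change.
  tryConsumeRetry(): boolean {
    if (this.retriedInEpoch === this.epoch) {
      return false;
    }
    this.retriedInEpoch = this.epoch;
    return true;
  }
}
```

Any token refresh then re-arms the one-shot retry, instead of it staying consumed for the lifetime of the instance.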
extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:182
- PR description says the forced refresh is “bounded to one call per
getCopilotUtilityModel()invocation”, but the added_hasForcedUtilityModelRetrymakes it at most once for the lifetime of the instance (until logout/error), changing the documented behavior. Either adjust the implementation to match the description (one retry per invocation) or update the PR description/comments to reflect the actual global gating.
```typescript
public async getCopilotUtilityModel(): Promise<IChatModelInformation> {
	await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, this._fetchModels.bind(this));
	if (!this._copilotUtilityModel && this._familyMap.size > 0 && !this._hasForcedUtilityModelRetry) {
		// One-shot retry per auth epoch: avoids storming CAPI on persistent server misconfig.
		this._hasForcedUtilityModelRetry = true;
		this._logService.warn('Utility model unset after initial fetch; forcing one refresh');
		await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
```
extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:309
`_hasForcedUtilityModelRetry` is reset to `false` in `_fetchModels`. If the server is persistently misconfigured (no fallback model) or the refresh is failing, this can cause repeated forced refresh attempts on subsequent `getCopilotUtilityModel()` calls (potentially doubling CAPI traffic). Consider leaving the flag set after a failed forced retry (or adding a backoff/timestamp) so the "one-shot" protection still holds during outages.
```typescript
		this._hasForcedUtilityModelRetry = false;
	}
	this._onDidModelRefresh.fire();
} catch (e) {
	this._logService.error(e, `Failed to fetch models (${requestId})`);
	this._lastFetchError = e;
```
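The suggested "backoff/timestamp" alternative could look like this sketch (the class, names, and the reuse of the 10-minute window are illustrative assumptions, not the PR's code):

```typescript
// Sketch: replace the boolean flag with a timestamp-based backoff so a
// failing forced retry cannot repeat more than once per window.
const FORCED_RETRY_BACKOFF_MS = 10 * 60 * 1000; // assumed window, matches cache TTL

class ForcedRetryBackoff {
  // Negative infinity so the very first attempt is always allowed.
  private lastAttempt = Number.NEGATIVE_INFINITY;

  // `now` is injectable for testing; defaults to wall-clock time.
  shouldAttempt(now: number = Date.now()): boolean {
    if (now - this.lastAttempt < FORCED_RETRY_BACKOFF_MS) {
      return false;
    }
    this.lastAttempt = now;
    return true;
  }
}
```

Unlike a boolean that error paths reset, the timestamp keeps the rate limit intact through failures and outages.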
- Files reviewed: 1/1 changed files
- Comments generated: 1
```diff
@@ -272,6 +281,7 @@ export class ModelMetadataFetcher extends Disposable implements IModelMetadataFe
 	}

 	this._familyMap.clear();
```
Fix a caching bug in `ModelMetadataFetcher` that can lock the Copilot utility chat model into a permanently-unresolvable state for up to 10 minutes after a single CAPI `/models` response that doesn't flag any model with `is_chat_fallback: true`.

Problem

`_copilotUtilityModel` is only ever assigned inside `_fetchModels` when a model has `is_chat_fallback === true`; it is never cleared. Combined with `_shouldRefreshModels()`, which short-circuits while `_familyMap.size > 0` and `_lastFetchTime` is within the 10-minute window, a single bad response (or a 429 short-circuit while the cache is fresh) leaves `getCopilotUtilityModel()` throwing "Unable to resolve Copilot utility chat model (server did not mark a chat fallback model)" until the cache expires. This has been a contributing source of sanity-test flakes in CI.
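The stuck-cache behavior described above can be reduced to a minimal sketch (names mirror the description, but the class and model strings are illustrative, not the real fetcher):

```typescript
// Minimal sketch of the described cache gate, to show how one bad response
// pins the utility model as undefined for the full TTL window.
const CACHE_TTL_MS = 10 * 60 * 1000;

class ModelCacheSketch {
  familyMap = new Map<string, string>();
  copilotUtilityModel: string | undefined;
  lastFetchTime = 0;

  private shouldRefreshModels(now: number): boolean {
    // Short-circuits while models are cached and the TTL has not elapsed.
    return this.familyMap.size === 0 || now - this.lastFetchTime > CACHE_TTL_MS;
  }

  // Simulates _fetchModels: a response with no chat-fallback flag populates
  // the family map (so the gate closes) but never assigns the utility model.
  fetch(now: number, hasChatFallback: boolean): void {
    if (!this.shouldRefreshModels(now)) { return; }
    this.familyMap.set("gpt-4o", "gpt-4o");
    this.lastFetchTime = now;
    if (hasChatFallback) { this.copilotUtilityModel = "fallback-model"; }
  }
}
```

After one bad response, even a subsequent good response is ignored until the TTL expires, which is exactly the 10-minute lockout the PR targets.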
Fix
Two complementary changes:
1. Reset `_copilotUtilityModel = undefined` adjacent to `_familyMap.clear()` in `_fetchModels`.

This makes `_copilotUtilityModel` a derived value of the current response rather than a sticky field that accumulates across fetches. The pattern now matches what `_fetchModels` already does for `_familyMap`: clear, then repopulate from the current response.

Without the reset: if an earlier response set `_copilotUtilityModel = ModelA` and the current response no longer flags any model as the fallback, we'd keep serving the stale `ModelA` for the utility endpoint indefinitely.

The reset on its own does not fix the original flake (a first-ever bad response still leaves `_copilotUtilityModel` undefined), but it is correctness hygiene that the retry below relies on.
_copilotUtilityModelundefined), but it is correctness hygiene that the retry below relies on.2. Force one refresh in
getCopilotUtilityModel()when models are cached but the utility model is missing.The
trueargument is theforceparameter on_fetchModels, which bypasses the_shouldRefreshModels()cache check. Withoutforce, the retry would no-op immediately because the bad response just populated_familyMapand set_lastFetchTime = Date.now(). Withforce, we guarantee a fresh CAPI request, rebuild_familyMap, and (thanks to change #1) repopulate_copilotUtilityModelif the server flags one this time.The forced refresh is bounded to one call per
getCopilotUtilityModelinvocation and still flows throughTaskSingler, so coalescing semantics are preserved.
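Both changes together can be sketched end-to-end (all names, the model strings, and the simplified gate are illustrative, not the extension's actual code):

```typescript
// Sketch of the two-part fix: `force` skips the TTL gate so the one-shot
// retry issues a real request even though the bad response just refreshed
// `lastFetchTime`, and the clear-then-repopulate reset keeps the utility
// model derived from the current response.
const TTL_MS = 10 * 60 * 1000;

class FetcherSketch {
  familyMap = new Map<string, string>();
  utilityModel: string | undefined;
  lastFetchTime = 0;
  requestCount = 0;

  fetchModels(now: number, serverFlagsFallback: boolean, force = false): void {
    const fresh = this.familyMap.size > 0 && now - this.lastFetchTime <= TTL_MS;
    if (fresh && !force) { return; }        // cache gate (bypassed by force)
    this.requestCount++;
    this.familyMap.clear();
    this.utilityModel = undefined;          // change 1: derived, not sticky
    this.familyMap.set("gpt-4o", "gpt-4o");
    this.lastFetchTime = now;
    if (serverFlagsFallback) { this.utilityModel = "fallback-model"; }
  }

  getUtilityModel(now: number, serverFlagsFallback: boolean): string {
    this.fetchModels(now, serverFlagsFallback);
    if (!this.utilityModel && this.familyMap.size > 0) {
      // change 2: one bounded forced refresh before giving up
      this.fetchModels(now, serverFlagsFallback, true);
    }
    if (!this.utilityModel) { throw new Error("no chat fallback flagged"); }
    return this.utilityModel;
  }
}
```

In this sketch the forced second request recovers within a single `getUtilityModel` call instead of waiting out the TTL window.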