Reset Copilot utility model cache and force one retry when CAPI omits chat fallback #316720

Draft
benvillalobos wants to merge 4 commits into microsoft:main from benvillalobos:bv/copilot-utility-model-cache

@benvillalobos benvillalobos commented May 15, 2026

Fix a caching bug in ModelMetadataFetcher that can leave the Copilot utility chat model unresolvable for up to 10 minutes after a single CAPI /models response that doesn't flag any model with is_chat_fallback: true.

Problem

_copilotUtilityModel is only ever assigned inside _fetchModels when a model has is_chat_fallback === true. It is never cleared. Combined with _shouldRefreshModels(), which short-circuits while _familyMap.size > 0 and _lastFetchTime is within the 10-minute window, a single bad response (or a 429 short-circuit while the cache is fresh) leaves getCopilotUtilityModel() throwing "Unable to resolve Copilot utility chat model (server did not mark a chat fallback model)" until the cache expires.

This has been a contributing source of sanity-test flakes in CI.
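
The interaction described in this section can be reproduced with a small model of the cache. This is a minimal sketch, not the real ModelMetadataFetcher: the class name StickyFetcherSketch, the injected serverResponse and now callbacks, and the requestCount counter are hypothetical, and the fetch is modeled synchronously to keep the example short.

```typescript
// Hypothetical, simplified stand-in for the pre-fix caching behavior.
interface ModelInfo {
    id: string;
    family: string;
    is_chat_fallback?: boolean;
}

const CACHE_WINDOW_MS = 10 * 60 * 1000;

class StickyFetcherSketch {
    private readonly familyMap = new Map<string, ModelInfo[]>();
    private copilotUtilityModel: ModelInfo | undefined;
    private lastFetchTime = 0;
    private readonly serverResponse: () => ModelInfo[];
    private readonly now: () => number;
    public requestCount = 0;

    constructor(serverResponse: () => ModelInfo[], now: () => number) {
        this.serverResponse = serverResponse;
        this.now = now;
    }

    // Short-circuits while models are cached and the window has not elapsed.
    private shouldRefreshModels(): boolean {
        if (this.familyMap.size === 0) {
            return true;
        }
        return this.now() - this.lastFetchTime > CACHE_WINDOW_MS;
    }

    private fetchModels(): void {
        if (!this.shouldRefreshModels()) {
            return;
        }
        this.requestCount++;
        this.familyMap.clear();
        for (const model of this.serverResponse()) {
            const family = this.familyMap.get(model.family) ?? [];
            family.push(model);
            this.familyMap.set(model.family, family);
            // The utility model is only ever assigned here, never cleared.
            if (model.is_chat_fallback === true) {
                this.copilotUtilityModel = model;
            }
        }
        this.lastFetchTime = this.now();
    }

    public getCopilotUtilityModel(): ModelInfo {
        this.fetchModels();
        const resolved = this.copilotUtilityModel;
        if (!resolved) {
            throw new Error('Unable to resolve Copilot utility chat model');
        }
        return resolved;
    }
}
```

In this sketch, a first response without a flagged model fills familyMap, so every later call inside the window skips the request and throws again; only a call after the window elapses issues a second request.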

Fix

Two complementary changes:

1. Reset _copilotUtilityModel = undefined adjacent to _familyMap.clear() in _fetchModels.

This makes _copilotUtilityModel a derived value of the current response rather than a sticky field that accumulates across fetches. The pattern now matches what _fetchModels already does for _familyMap — clear, then repopulate from the current response.

Without the reset:

  • If a previous fetch set _copilotUtilityModel = ModelA and the current response no longer flags any model as the fallback, we'd keep serving the stale ModelA for the utility endpoint indefinitely.
  • The cache would lie about what the server actually said.

The reset on its own does not fix the original flake (a first-ever bad response still leaves _copilotUtilityModel undefined), but it is correctness hygiene that the retry below relies on.
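
The difference between a sticky field and a derived one can be shown in isolation. The following is a sketch under assumptions: applyModelsResponse and ModelCacheState are hypothetical names, and the resetUtilityModel flag exists only so both behaviors can be compared side by side.

```typescript
interface ModelInfo {
    id: string;
    family: string;
    is_chat_fallback?: boolean;
}

// Hypothetical container for the two cached fields discussed above.
interface ModelCacheState {
    familyMap: Map<string, ModelInfo[]>;
    copilotUtilityModel: ModelInfo | undefined;
}

// Rebuilds the cache from one response. With resetUtilityModel = true the
// utility model is derived purely from `data`; with false it can carry over
// a value from an earlier response.
function applyModelsResponse(state: ModelCacheState, data: ModelInfo[], resetUtilityModel: boolean): void {
    state.familyMap.clear();
    if (resetUtilityModel) {
        state.copilotUtilityModel = undefined;
    }
    for (const model of data) {
        const family = state.familyMap.get(model.family) ?? [];
        family.push(model);
        state.familyMap.set(model.family, family);
        if (model.is_chat_fallback === true) {
            state.copilotUtilityModel = model;
        }
    }
}
```

Applying a response that flags ModelA followed by one that flags nothing leaves ModelA in place when the reset is skipped, and leaves the field undefined when the reset is applied.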

2. Force one refresh in getCopilotUtilityModel() when models are cached but the utility model is missing.

if (!this._copilotUtilityModel && this._familyMap.size > 0) {
    await this._taskSingler.getOrCreate(ALL_MODEL_KEY, () => this._fetchModels(true));
}

The true argument is the force parameter on _fetchModels, which bypasses the _shouldRefreshModels() cache check. Without force, the retry would no-op immediately because the bad response just populated _familyMap and set _lastFetchTime = Date.now(). With force, we guarantee a fresh CAPI request, rebuild _familyMap, and (thanks to change #1) repopulate _copilotUtilityModel if the server flags one this time.

The forced refresh is bounded to one call per getCopilotUtilityModel invocation and still flows through TaskSingler, so coalescing semantics are preserved.
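
To illustrate how the forced refresh and the coalescing fit together, here is a self-contained sketch. It assumes a simplified task singler that shares one in-flight promise per key; TaskSinglerSketch, ForcedRefreshSketch, the injected listModels and now callbacks, and requestCount are hypothetical and are not the extension's actual types.

```typescript
interface ModelInfo {
    id: string;
    family: string;
    is_chat_fallback?: boolean;
}

const ALL_MODEL_KEY_SKETCH = 'allModels';
const REFRESH_WINDOW_MS = 10 * 60 * 1000;

// Assumed behavior: callers asking for the same key while a task is in
// flight receive the same promise.
class TaskSinglerSketch {
    private readonly inFlight = new Map<string, Promise<void>>();

    public getOrCreate(key: string, factory: () => Promise<void>): Promise<void> {
        const existing = this.inFlight.get(key);
        if (existing) {
            return existing;
        }
        const task = factory();
        this.inFlight.set(key, task);
        const release = (): void => { this.inFlight.delete(key); };
        task.then(release, release);
        return task;
    }
}

class ForcedRefreshSketch {
    private readonly familyMap = new Map<string, ModelInfo[]>();
    private readonly singler = new TaskSinglerSketch();
    private copilotUtilityModel: ModelInfo | undefined;
    private lastFetchTime = 0;
    private readonly listModels: () => Promise<ModelInfo[]>;
    private readonly now: () => number;
    public requestCount = 0;

    constructor(listModels: () => Promise<ModelInfo[]>, now: () => number) {
        this.listModels = listModels;
        this.now = now;
    }

    private async fetchModels(force = false): Promise<void> {
        const cacheIsFresh = this.familyMap.size > 0
            && this.now() - this.lastFetchTime <= REFRESH_WINDOW_MS;
        if (!force && cacheIsFresh) {
            return; // without force, a retry right after a bad response is a no-op
        }
        this.requestCount++;
        const data = await this.listModels();
        this.familyMap.clear();
        this.copilotUtilityModel = undefined; // change 1: derive from this response
        for (const model of data) {
            const family = this.familyMap.get(model.family) ?? [];
            family.push(model);
            this.familyMap.set(model.family, family);
            if (model.is_chat_fallback === true) {
                this.copilotUtilityModel = model;
            }
        }
        this.lastFetchTime = this.now();
    }

    public async getCopilotUtilityModel(): Promise<ModelInfo> {
        await this.singler.getOrCreate(ALL_MODEL_KEY_SKETCH, () => this.fetchModels());
        if (!this.copilotUtilityModel && this.familyMap.size > 0) {
            // change 2: models are cached but no fallback was flagged
            await this.singler.getOrCreate(ALL_MODEL_KEY_SKETCH, () => this.fetchModels(true));
        }
        const resolved = this.copilotUtilityModel;
        if (!resolved) {
            throw new Error('Unable to resolve Copilot utility chat model');
        }
        return resolved;
    }
}
```

In this sketch a server that omits the flag once and then recovers is resolved within a single getCopilotUtilityModel() call at the cost of two requests, while a server that never flags a model costs two requests per call before the error is thrown.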

Copilot AI review requested due to automatic review settings May 15, 2026 23:13

Copilot AI left a comment

Pull request overview

This PR fixes a caching edge case in the Copilot extension’s ModelMetadataFetcher where the “Copilot utility” chat model can get stuck unresolved for up to 10 minutes if a /models response omits is_chat_fallback: true.

Changes:

  • Clears _copilotUtilityModel when refreshing the model family map so stale/invalid state can’t persist across fetches.
  • Adds a bounded one-time forced refresh in getCopilotUtilityModel() when models were returned but no chat fallback was flagged, to avoid being trapped by the 10-minute cache window.

Summary per file:

  • extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts: Resets the cached utility model on refresh and adds a single forced retry to recover from transient responses that omit is_chat_fallback.

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 2

Comment on lines 279 to 283
this._familyMap.clear();
this._copilotUtilityModel = undefined;

const data: IModelAPIResponse[] = (await response.json()).data;
this._requestLogger.logModelListCall(requestId, requestMetadata, data);
Comment on lines 174 to +179
await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, this._fetchModels.bind(this));
if (!this._copilotUtilityModel && this._familyMap.size > 0) {
// Server returned models but did not flag a chat fallback; force one refresh
// before throwing so we are not stuck on a stale 10-minute cache window.
await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
}

Copilot AI left a comment

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 1

Comment on lines +178 to 185
if (!this._copilotUtilityModel && this._familyMap.size > 0 && !this._hasForcedUtilityModelRetry) {
// One-shot retry per auth epoch: avoids storming CAPI on persistent server misconfig.
this._hasForcedUtilityModelRetry = true;
this._logService.warn('Utility model unset after initial fetch; forcing one refresh');
await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
}
const resolvedModel = this._copilotUtilityModel;
if (!resolvedModel || !isChatModelInformation(resolvedModel)) {
@benvillalobos benvillalobos requested a review from Copilot May 16, 2026 00:12

Copilot AI left a comment

Copilot's findings

Comments suppressed due to low confidence (4)

extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:183

  • _hasForcedUtilityModelRetry is set to true before the forced refresh is awaited. If multiple callers reach this block concurrently, only the first will trigger/await the refresh; others will skip it and can throw even though the refresh is already in-flight (since TaskSingler would otherwise coalesce them). Consider setting the flag only after the forced refresh completes, or tracking/awaiting a shared in-flight retry promise so all callers wait for the same refresh.
		if (!this._copilotUtilityModel && this._familyMap.size > 0 && !this._hasForcedUtilityModelRetry) {
			// One-shot retry per auth epoch: avoids storming CAPI on persistent server misconfig.
			this._hasForcedUtilityModelRetry = true;
			this._logService.warn('Utility model unset after initial fetch; forcing one refresh');
			await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));
		}
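
One way to read the "shared in-flight retry promise" suggestion in this comment is sketched below. SharedRetrySketch and its method names are hypothetical and are not part of the PR; the idea is only that the one-shot state is the promise itself, so concurrent callers wait on the same refresh instead of skipping it.

```typescript
// Hypothetical helper: the first caller starts the refresh, and every
// concurrent or later caller awaits that same promise until reset() is called.
class SharedRetrySketch {
    private retry: Promise<void> | undefined;

    public forceOnce(refresh: () => Promise<void>): Promise<void> {
        if (this.retry === undefined) {
            this.retry = refresh();
        }
        // Note: a rejected refresh stays cached here until reset().
        return this.retry;
    }

    public reset(): void {
        this.retry = undefined;
    }
}
```

Because the guard stores the promise rather than a boolean, a caller that arrives while the refresh is still pending cannot observe "already retried" without also waiting for the result.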

extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:116

  • The retry gating is described as “one-shot per auth epoch”, but _hasForcedUtilityModelRetry is only reset on logout (copilotToken === undefined) and on fetch errors. If the token changes/refreshes (auth change event) but remains defined, or after a later successful refresh cycle, this flag may stay true indefinitely and prevent future one-shot retries, potentially reintroducing the original failure mode. Consider resetting the flag on any auth change (or when a fetch succeeds) to match the intended “epoch” semantics.

This issue also appears in the following locations of the same file:

  • line 176
  • line 178
  • line 304
		this._register(this._authService.onDidAuthenticationChange(() => {
			// Auth changed so next fetch should be forced to get a new list

			// Only clear the family map if the copilot token is undefined, as this means the user has logged out and we should clear the models, otherwise we want to keep the old models around until we get a new list
			if (this._authService.copilotToken === undefined) {
				this._familyMap.clear();
				this._copilotUtilityModel = undefined;
				this._hasForcedUtilityModelRetry = false;
			}

			this._completionsFamilyMap.clear();
			this._lastFetchTime = 0;
		}));

extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:182

  • PR description says the forced refresh is “bounded to one call per getCopilotUtilityModel() invocation”, but the added _hasForcedUtilityModelRetry makes it at most once for the lifetime of the instance (until logout/error), changing the documented behavior. Either adjust the implementation to match the description (one retry per invocation) or update the PR description/comments to reflect the actual global gating.
	public async getCopilotUtilityModel(): Promise<IChatModelInformation> {
		await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, this._fetchModels.bind(this));
		if (!this._copilotUtilityModel && this._familyMap.size > 0 && !this._hasForcedUtilityModelRetry) {
			// One-shot retry per auth epoch: avoids storming CAPI on persistent server misconfig.
			this._hasForcedUtilityModelRetry = true;
			this._logService.warn('Utility model unset after initial fetch; forcing one refresh');
			await this._taskSingler.getOrCreate(ModelMetadataFetcher.ALL_MODEL_KEY, () => this._fetchModels(true));

extensions/copilot/src/platform/endpoint/node/modelMetadataFetcher.ts:309

  • _hasForcedUtilityModelRetry is reset to false in the _fetchModels catch block. If the server is persistently misconfigured (no fallback model) or the refresh is failing, this can cause repeated forced refresh attempts on subsequent getCopilotUtilityModel() calls (potentially doubling CAPI traffic). Consider leaving the flag set after a failed forced retry (or adding a backoff/timestamp) so the “one-shot” protection still holds during outages.
				this._hasForcedUtilityModelRetry = false;
			}
			this._onDidModelRefresh.fire();
		} catch (e) {
			this._logService.error(e, `Failed to fetch models (${requestId})`);
			this._lastFetchError = e;
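
The backoff or timestamp idea mentioned in this comment could look roughly like the following. This is a sketch only: ForcedRetryBackoffSketch, tryAcquire, and the minimum interval are hypothetical, and the clock is injected so the behavior can be exercised without waiting.

```typescript
// Hypothetical gate: allows a forced retry only if enough time has passed
// since the previous attempt, whether that attempt succeeded or failed.
class ForcedRetryBackoffSketch {
    private lastAttemptAt: number | undefined;
    private readonly minIntervalMs: number;
    private readonly now: () => number;

    constructor(minIntervalMs: number, now: () => number) {
        this.minIntervalMs = minIntervalMs;
        this.now = now;
    }

    // Returns true and records the attempt when a forced retry is allowed.
    public tryAcquire(): boolean {
        const current = this.now();
        if (this.lastAttemptAt !== undefined && current - this.lastAttemptAt < this.minIntervalMs) {
            return false;
        }
        this.lastAttemptAt = current;
        return true;
    }
}
```

Unlike a boolean that is reset in the catch block, the timestamp is never cleared by a failure, so repeated getCopilotUtilityModel() calls during an outage would trigger at most one forced refresh per interval.
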
  • Files reviewed: 1/1 changed files
  • Comments generated: 1

@@ -272,6 +281,7 @@ export class ModelMetadataFetcher extends Disposable implements IModelMetadataFe
}

this._familyMap.clear();