Enable VS integration tests on CI#19930

Open
abonie wants to merge 54 commits into dotnet:main from abonie:enable-integration-tests

abonie commented Jun 10, 2026

Member

WIP Testing on CI

github-actions Bot commented Jun 10, 2026

Contributor

❗ Release notes required

You can open this PR in browser to add release notes: open in github.dev

@abonie,

Caution

No release notes found for the changed paths (see table below).

Please make sure to add an entry with an informative description of the change, as well as a link to this pull request, issue and language suggestion if applicable. Release notes for this repository are based on the Keep A Changelog format.

The following format is recommended for this repository:

* <Informative description>. ([PR #XXXXX](https://github.com/dotnet/fsharp/pull/XXXXX))

See examples in the files listed in the table below, or in the full documentation at https://fsharp.github.io/fsharp-compiler-docs/release-notes/About.html.

If you believe that release notes are not necessary for this PR, please add NO_RELEASE_NOTES label to the pull request.

| Change path | Release notes path | Description |
| --- | --- | --- |
| `src/Compiler` | `docs/release-notes/.FSharp.Compiler.Service/11.0.100.md` | No release notes found or release notes format is not correct |
| `vsintegration/src` | `docs/release-notes/.VisualStudio/18.vNext.md` | No release notes found or release notes format is not correct |

Can't use `dotnet` due to a conflict with mtp/xunit3
github-actions Bot added labels Jun 10, 2026: ⚠️ Affects-Test-Tooling (Tooling check: PR touches test framework infrastructure), ⚠️ Affects-Build-Infra (Tooling check: PR touches build infrastructure), ⚠️ Affects-Restore (Tooling check: PR touches NuGet packages or feeds)
@github-actions

Contributor

🔍 Tooling Safety Check — Affects-Build-Infra, Affects-Restore, Affects-Test-Tooling
Affects-Build-Infra: modifies azure-pipelines-PR.yml, eng/Build.ps1, eng/Versions.props
Affects-Restore: adds PackageDownload for xunit.runner.console in .csproj
Affects-Test-Tooling: new TestUsingXUnitConsole runner bypassing dotnet test

Generated by PR Tooling Safety Check · opus46 3.6M ·

abonie and others added 20 commits June 10, 2026 18:14
The two GoToDefinitionTests [IdeFact]s were no-ops: after Shell.ExecuteCommandAsync(GotoDefn)
the caret line was sampled immediately, but FSharpNavigation.NavigateToItem schedules the
actual caret move asynchronously (via JoinableTaskFactory). The previous fixed Task.Delay
then re-anchored the caret before navigation landed, making it bounce between the call site
and the definition without ever being detected.

Replace the fixed delay with GoToDefinitionWithRetryAsync: wait for the project system, anchor
the caret on the call site, invoke GotoDefn, then poll for the caret line to change (letting the
async navigation land) before treating the attempt as a no-op and retrying. The caret is only
re-anchored at the start of each attempt, never while a navigation may be in flight.
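
The shape of that loop can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the test's actual C# code: `navigate_with_retry` and its callable parameters are hypothetical stand-ins for anchoring the caret, issuing GotoDefn and sampling the caret line, and the timeout and interval values are assumptions (only the count of 20 attempts appears in the CI notes).

```python
import time
from typing import Callable


def navigate_with_retry(
    anchor: Callable[[], int],        # places the caret on the call site, returns its line
    invoke: Callable[[], None],       # issues the navigation command (hypothetical stand-in)
    current_line: Callable[[], int],  # samples the caret line
    attempts: int = 20,
    poll_timeout: float = 2.0,        # assumed value, not taken from the PR
    poll_interval: float = 0.05,      # assumed value, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Return the caret line once navigation lands, retrying no-op attempts."""
    for _ in range(attempts):
        # Re-anchor only here, never while a navigation may still be in flight.
        start_line = anchor()
        invoke()
        deadline = clock() + poll_timeout
        while clock() < deadline:
            line = current_line()
            if line != start_line:
                return line  # the asynchronous navigation has landed
            sleep(poll_interval)
        # The caret never moved: treat this attempt as a no-op and re-issue.
    raise TimeoutError(f"navigation did not land after {attempts} attempts")
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.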

Add EditorInProcess.ActivateAsync to keep the editor as the active command context after
PlaceCaretAsync's dte.Find shifts VS's active selection (otherwise GotoDefn routes to the wrong
target and returns E_FAIL).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI showed FsiAndFsFilesGoToCorrespondentDefinitions now passes but GoesToDefinition still
no-ops (GotoDefn never resolves the symbol after 20 attempts, no command error). The only
structural difference is that the passing test builds the solution while GoesToDefinition only
edited the buffer via SetTextAsync. Building gives the F# checker the project's full options and
on-disk source, so the symbol resolves. Mirror the sibling test by building before navigating.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous build step left the Build Output pane as the active text view, so PlaceCaretAsync
searched the build log ('Marker add 1 not found in text: Build started...') instead of the
source. Re-open Library.fs after building to make it the active document again, mirroring
FsiAndFsFilesGoToCorrespondentDefinitions which opens the file it navigates after building.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
abonie and others added 30 commits June 13, 2026 23:34
Removing the retry loop regressed GoesToDefinition on CI (build 1463019):
'GoToDefn did not navigate away from add 1 within 10s'. The first GotoDefn invocation
genuinely no-ops -- the Roslyn Document / F# checker is not ready for the freshly
rebuilt+reopened file -- and returns without navigating, so waiting longer on that single
call cannot help; the command must be RE-ISSUED. Restore the retry loop (which already waits
for the async navigation after each invocation) and document why re-issuing is load-bearing so
it is not removed again.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replaces the retry loop (which could mask a real GoToDefinition regression by eventually
succeeding) with: wait for the project system, wait a fixed FSharpCheckerSettleDelay (10s) for
the F# checker to typecheck the just-opened/built document, then issue GotoDefn ONCE and wait
for the single async navigation to land. A genuine no-op now fails the test instead of being
retried away. FSharpCheckerSettleDelay may need tuning based on CI timing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI (build 1463623) confirmed the asymmetry: FsiAndFs (adds files to disk, opens them fresh)
passes a single GotoDefn, while GoesToDefinition (SetTextAsync on the auto-opened buffer) no-ops
on the first invocation even after a 10s settle. Editing an already-open buffer is the
difference, not timing.

So restructure GoesToDefinition to mirror FsiAndFs: write the code to Test.fs, build, open it
fresh, then navigate. With a fresh document a single GotoDefn resolves, so remove both the retry
loop and the fixed settle -- the only remaining wait observes the one async navigation landing,
and a genuine no-op now fails the test instead of being retried/slept away. Also drop the
redundant standalone PlaceCaretAsync calls (GoToDefinitionAsync already places the caret).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…void ImmutableArray binding skew

PopulateWithDataAsync returns Task<ImmutableArray<SuggestedActionSet>>; ImmutableArray
binds System.Collections.Immutable, whose version skews between the NuGet reference and
the in-proc VS runtime, so a direct call throws MissingMethodException. Polling
TryGetSuggestedActionSets alone never drives the async computation and returns empty.

Invoke PopulateWithDataAsync via reflection, await it as the non-generic base Task, and
read the action sets out of the task Result via the non-generic IEnumerable interface so
the skewed type never appears in compiled IL. Own the session via CreateSession and retry
with a fresh session until actions appear (covers the slow unused-opens analyzer).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…s by reflection

The previous CreateSession-owned session was superseded once the real lightbulb had data,
so awaiting its PopulateWithDataAsync threw TaskCanceledException (fast tests) while the
slow test timed out empty. Switch back to the ShowQuickFixes command-created active session
(the historically working primitive): poll GetSession for non-null, subscribe to
SuggestedActionsUpdated, fire PopulateWithDataAsync via reflection (fire-and-forget so a
superseded session does not propagate cancellation), and read e.ActionSets via reflection to
avoid the System.Collections.Immutable version skew. Retry a fresh trigger per attempt until
items appear or the overall timeout elapses (covers the slow unused-opens analyzer).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ttempt timeout

After the previous fix UnusedOpenDeclarations passed but the two compiler-diagnostic tests
(AddNewKeyword, AddMissingFun) timed out empty: their fixes compute lazily when the lightbulb
is queried, and the 10s per-attempt timeout dismissed/cancelled that computation before it
finished on a cold checker. The F# editor doesn't drive Roslyn's async-operation listeners, so
before triggering ShowQuickFixes we now poll the error list until the document has a diagnostic
(any severity), guaranteeing the squiggle exists so the fix resolves quickly. Also raise the
per-attempt lightbulb timeout to 30s so a cold on-demand fix query isn't cut short.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ulb UI session

Driving the VS lightbulb (ShowQuickFixes command -> GetSession -> PopulateWithDataAsync ->
SuggestedActionsUpdated) was hopelessly racy in headless CI: session lifetime, command
routing, event ordering and supersession produced a different timing-dependent 2-of-3 failure
each run, and forced reflection to dodge the ImmutableArray version skew.

Replace it with a direct query of ILightBulbBroker.GetSuggestedActionsSources: for each source
await HasSuggestedActionsAsync (drives the lazy F# code-fix computation) then GetSuggestedActions
over the caret's whole line span, aggregating the returned SuggestedActionSets. These APIs are
version-safe (no ImmutableArray in their signatures), deterministic, and have no UI-session
lifetime. Retry until items appear (covers background analyzers such as unused-opens). On
timeout, throw with a per-source diagnostic dump so the CI log is self-explaining. Drop the
command/activate/error-list-barrier machinery.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…SuggestedActionsAsync

The self-diagnosing dump from the previous run showed the Roslyn-backed source that surfaces the
F# code fixes throws NotImplementedException on HasSuggestedActionsAsync with the message
'We implement GetSuggestedActionCategoriesAsync. This should not be called.' So none of the three
tests ever queried the real source (236 fast no-op attempts). Switch to ISuggestedActionsSource2
.GetSuggestedActionCategoriesAsync (the async driver) to run the lazy fix computation, then read
the sets via GetSuggestedActions, passing the returned categories. Skip sources that throw
NotImplementedException (LSP/spell-check). Keep the per-attempt diagnostic dump.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…flection

The direct ISuggestedActionsSource query could never work: in modern Roslyn the synchronous
GetSuggestedActions returns null by design (verified in dotnet/roslyn SuggestedActionsSource.cs);
action sets are produced only by IAsyncSuggestedActionsSource.GetSuggestedActionsAsync, which the
lightbulb session drives via PopulateWithDataAsync.

Go back to the command-created (active, non-superseded) session: reflection-invoke
PopulateWithDataAsync, await it, and read its aggregated Task<ImmutableArray<SuggestedActionSet>>
result through the non-generic IEnumerable interface (with the SuggestedActionsUpdated event payload
as a fallback) so the ImmutableArray version skew never enters compiled IL. Retry with a fresh
session until items appear (covers the async unused-opens analyzer). On timeout, throw a per-attempt
diagnostic dump (populate status, task status, result/event set counts) since the CI log is the only
available signal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sion

A future-proofing requirement: when F# code actions move to LSP, the fixes will be served by the VS
LSP client's CodeActionSource rather than Roslyn's SuggestedActionsSource, so querying Roslyn's
IAsyncSuggestedActionsSource directly would stop finding them. The lightbulb broker session aggregates
all suggested-action sources, so it stays valid across that migration.

Use broker.CreateSession(Any) (a session always exists, even for a push-model analyzer whose diagnostic
isn't published yet - unlike ShowQuickFixes) and wait for the terminal SuggestedActionsUpdated event
rather than PopulateWithDataAsync's fast/empty initial snapshot. Retry with a fresh session on dismissal
or per-attempt timeout to cover background-analyzer timing. PopulateWithDataAsync and
SuggestedActionsUpdatedArgs.ActionSets are ImmutableArray-typed (System.Collections.Immutable skews
between the NuGet ref and the in-proc VS runtime), so they're invoked/read via reflection through the
non-generic IEnumerable; these are VS platform APIs unaffected by the LSP move. A per-attempt diagnostic
dump is thrown on timeout since the CI log is the only signal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Temporary diagnostic instrumentation to determine, headless, whether the unused-opens semantic
analyzer diagnostic is actually published to the Error List (the producer-agnostic lightbulb session
finds nothing and dismisses every attempt for UnusedOpenDeclarations). InvokeCodeActionListAsync now
catches the no-actions failure and appends a dump of every Error List entry (all sources/severities)
so the CI log shows whether the diagnostic exists. Passing tests are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI evidence (build 1466711): code-action failures coincide with an EMPTY error list - the F# checker
hadn't produced any diagnostic for the document, so no code fix could exist - and the failing set
rotated run-to-run, i.e. flaky. The prior 200+-attempt lightbulb churn appears to starve the checker.

Before invoking the lightbulb, poll the error list lightly (500ms, no lightbulb activity) until the
document has at least one diagnostic of any severity, then invoke once. This targets the actual root
cause (missing diagnostic) rather than lightbulb session mechanics. Best-effort with a 60s budget; the
error-list dump on failure is retained so a genuinely diagnostic-free state is still visible.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Decisive probe of the warm-up/latency hypothesis. Last run proved the diagnostic IS published
(FS0760 in the error list) yet the session still auto-dismissed on its empty initial snapshot.
After confirming the diagnostic is present, wait 2 minutes with NO lightbulb activity so the checker,
analyzers and the editor's lightbulb tagger fully settle, then do a short bounded invoke (30s).

If it passes, the failures are a warm-up race and the approach is salvageable with an adaptive wait;
if it still reports 'session dismissed', settling is irrelevant and the session path is structurally
unfit for headless runs. Temporary - to be reverted once we learn which.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…oking

The Error List barrier was hidden-blind: F# unused-opens is a DiagnosticSeverity.Hidden ('Unnecessary'
fade) diagnostic (Roslyn FSharpUnusedOpensDeclarationsDiagnosticAnalyzer), so it never appears in the
Error List - the '0 entries' readings were a measurement artifact, not proof the analyzer didn't run.

Replace the error-list barrier and the temporary 2-minute settle with a producer-agnostic poll of
ILightBulbBroker.IsLightBulbSessionActive: it reflects the editor's own lightbulb tagger detecting an
available fix, including ones from Hidden diagnostics. The 120s cap doubles as a quiet settle (no
lightbulb churn) for fixes whose computation lags. The barrier outcome is included in the failure dump.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bring back the 2-minute quiet settle before invoking the lightbulb (the best producer-agnostic result
so far: AddNew/AddMissingFun passed under it) so we can re-run a few times and measure how flaky it is.

For UnusedOpenDeclarations, skip the error-list barrier entirely: its diagnostic is Hidden and never
appears in the error list, so that wait can only ever time out. Add an InvokeCodeActionListAsync overload
taking waitForErrorListDiagnostics and pass false from that test; the 2-minute settle still applies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ation listeners

Drain Roslyn's IAsynchronousOperationListener (the mechanism Apex/Roslyn integration tests use) instead of time/heuristic barriers: enable tracking via RoslynWaiterEnabled env + ctor Enable; pre-drain Workspace/SolutionCrawlerLegacy/DiagnosticService so the fix exists before opening the lightbulb session; LightBulb post-drain. Keep the producer-agnostic broker-session reader. Remove the 2-minute sleep and error-list barrier.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build 1468525 proved the broker.CreateSession session is superseded/dismissed by the editor's real lightbulb in ~20ms headless (the persistent 'session dismissed'). TypeScript-VS Apex and Roslyn never create an owned session - they trigger the editor's real lightbulb (ShowQuickFixes) and read broker.GetSession(view). Switch to that: PostExecCommand(VSStd14 ShowQuickFixes) after activating the document (Find leaves the wrong command target), then read the editor-owned IAsyncLightBulbSession. Producer-agnostic; keeps the best-effort diagnostic pre-drain and tracking enablement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On a code-action failure, query ILightBulbBroker.HasSuggestedActionsAsync and ILightBulbBroker2.GetSuggestedActionCategoriesAsync at the caret to record whether any fix is offered there at all. This distinguishes 'fix never offered' (diagnostic not produced) from 'fix offered but no session' for the remaining UnusedOpenDeclarations failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nused-opens

Build 1472364 probe (HasSuggestedActions=False; categories=[]; Error List 0) proved failures are upstream of the lightbulb: the F# diagnostic isn't computed for the freshly-opened document, so the fix is never offered. AddMissingFun is flaky this way; UnusedOpens (Hidden background-analyzer diagnostic) never produces.

- Quarantine UnusedOpenDeclarations (Skip) until reliable.
- AddNew/AddMissingFun: MaxAttempts=3 (harness retries on a fresh VS instance per attempt; a pass on any attempt reports green).
- Force diagnostics with a net-zero buffer edit after the project is loaded, raising per-attempt success.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build 1475156 crashed with a fatal 'Duplicate attribute' error: MaxAttempts reports a failed non-final attempt as skip+retry, and the xunit.console.exe -xml reporter used by the inttests leg throws when the same test is written to the results XML twice. Drop MaxAttempts and rely on the real-session reader plus the re-analysis trigger; the diagnostic probe stays so the next run gives a clean signal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build 1475226: AddMissingFun still failed with 'fix never offered' (probe HasSuggestedActions=False) even with a single pre-loop re-analysis trigger - that one edit can fire before the freshly-created project's F# checker has options, so analysis yields no diagnostic and nothing re-triggers it for 64s. Move the trigger into the loop: odd attempts re-force analysis (net-zero buffer edit), even attempts harvest, so we keep getting fresh chances until the checker is ready.
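
A rough sketch of the alternating loop described above, written in Python for illustration only. The function name and the two callables are hypothetical: `force_analysis` stands in for the net-zero buffer edit and `harvest` for reading the offered code actions.

```python
from typing import Callable, List, Sequence


def harvest_with_alternating_retrigger(
    force_analysis: Callable[[], None],  # stand-in for the net-zero buffer edit
    harvest: Callable[[], Sequence[str]],  # stand-in for reading offered actions
    max_attempts: int,
) -> List[str]:
    """Odd attempts re-force analysis; even attempts try to read the result."""
    for attempt in range(1, max_attempts + 1):
        if attempt % 2 == 1:
            force_analysis()
            continue
        items = harvest()
        if items:
            return list(items)
    raise TimeoutError(f"no code actions offered after {max_attempts} attempts")
```

Every second attempt touches the buffer again, so a trigger that fired before the checker had options gets another chance on the next odd attempt.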

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…le infra)

Code-action diagnostics often aren't produced in the headless CI VS because view-driven producers (error-squiggle tagger) need a realized, visible desktop. Port Roslyn's integration-test machine setup so devenv runs interactively:

- New eng/build-utils-win.ps1 with Capture-Screenshot.
- eng/Build.ps1: Setup-IntegrationTestRun (screenshot probe; on failure tsdiscon/tscon /dest:console reconnect; re-capture fail-fast) + MinimizeAll, wired into the CI testIntegration path; add devenv to the on-exit stop list.
- eng/SetupVSHive.ps1: VS/UI registry hardening (roaming, IntelliCode, background download, spell checker, Report Exceptions dialog, targeted notifications).
- azure-pipelines-PR.yml: pass -prepareMachine to the inttests leg; publish the startup screenshot to prove a live desktop.
- Un-quarantine UnusedOpenDeclarations to measure the full effect.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instrument the full unused-opens diagnostic flow to figure out why
the UnusedOpenDeclarations integration test sees 0 diagnostics after
60s on CI. Tracing is added at every decision point:

- FCS getUnusedOpens: log OpenDeclarations count, symbol uses,
  open statements, and result.
- UnusedOpensDiagnosticAnalyzer.AnalyzeSemanticsAsync: log whether
  the analyzer is called, IsFSharpMiscellaneousOrMetadata check,
  IsFSharpCodeFixesUnusedOpensEnabled check, FCS results, and
  final diagnostic count.
- AsyncOperationWaiter: log feature drain timing and whether
  Roslyn tracking is enabled, workspace type.
- LightBulbHelper: per-attempt detail logging.
- EditorInProcess: log document text, caret position, feature
  drain progress.
- CodeActionTests: pre-invoke error list dump, step-by-step trace.
- AbstractIntegrationTest: add ConsoleTraceListener so all Trace
  output appears in the xunit test log.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tests

# Conflicts:
#	vsintegration/src/FSharp.Editor/Diagnostics/UnusedOpensDiagnosticAnalyzer.fs
Collapse an over-long multi-line Trace.TraceInformation call in ServiceAnalysis.fs that fantomas --check flagged. No behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sis completes

Build 1477103 (interactive desktop) fixed AddNew + AddMissingFun; only UnusedOpens still failed. The log showed UnusedOpensAnalyzer restarting every ~11s - exactly our odd-attempt re-trigger cadence - and the Library project's F# IncrementalBuilder was only created at 64s (the timeout). The periodic net-zero buffer edit was cancelling the slow unused-opens computation (which needs the project builder + a full check) before it could finish. Trigger reanalysis only on the first attempt and leave the buffer quiet thereafter, and raise the overall timeout 60s->120s so the background analysis has room to complete.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ing the lightbulb (B+)

Build 1477635 showed the single-trigger change regressed ALL code-action tests: our retry loop repeatedly posted ShowQuickFixes and dismissed the session every ~5s, which cancels the slow F# unused-opens query (and starved the fast ones of a post-project-ready touch). Redesign GetCodeActionsAsync:

- PRODUCE-GATE: touch the buffer once, then gently poll broker.HasSuggestedActionsAsync (producer-agnostic; no session create/dismiss, no re-touch) until a fix is offered. This gives slow background analyzers an uninterrupted window. Bounded re-touch only if the gate stays false for 45s (covers a touch that fired before the checker was ready).
- READ: only after a fix is confirmed offered do we invoke the real lightbulb and read its session - session churn there can no longer cancel the producer.
- Distinct failure messages (gate vs read) make the next run self-diagnosing. Overall budget 150s; 30s read budget.
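
The two-phase structure above might look roughly like this. It is a Python sketch under stated assumptions rather than the actual GetCodeActionsAsync: the function name and callables are hypothetical (`has_actions` stands in for broker.HasSuggestedActionsAsync, `read_session` for invoking the real lightbulb and reading its session), the poll interval is an assumption, and "bounded re-touch" is interpreted here as one re-touch per 45s of a still-closed gate.

```python
from typing import Callable, List, Sequence


def get_code_actions(
    touch: Callable[[], None],                  # one buffer touch to trigger analysis
    has_actions: Callable[[], bool],            # gate query; creates no session
    read_session: Callable[[], Sequence[str]],  # real lightbulb invoke + session read
    clock: Callable[[], float],
    sleep: Callable[[float], None],
    overall_budget: float = 150.0,
    read_budget: float = 30.0,
    retouch_after: float = 45.0,
    poll_interval: float = 0.5,                 # assumed value
) -> List[str]:
    start = clock()
    touch()
    last_touch = start

    # PRODUCE-GATE: poll quietly so slow producers get an uninterrupted window.
    while not has_actions():
        now = clock()
        if now - start >= overall_budget:
            raise TimeoutError("gate: no fix was offered within the overall budget")
        if now - last_touch >= retouch_after:
            touch()  # covers a touch that fired before the checker was ready
            last_touch = now
        sleep(poll_interval)

    # READ: only now open the lightbulb; churn here can no longer cancel the producer.
    read_deadline = clock() + read_budget
    while clock() < read_deadline:
        items = read_session()
        if items:
            return list(items)
        sleep(poll_interval)
    raise TimeoutError("read: a fix was offered but the session returned no actions")
```

The two distinct exception messages mirror the gate-versus-read distinction, so a log line alone shows which phase failed.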

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build 1477939: B+ produce-gate FIXED UnusedOpens (gate flips in 0.2s; the prior slowness was entirely our churn) and AddNew. Only AddMissingFun failed: broker.HasSuggestedActionsAsync is a false-negative for the F# parse-error ErrorFix - the fix IS available (ShowQuickFixes + session read found it in builds 1469227/1477103) but the broker's query API never reports it. Keep the gate (instant for CodeFix) but ALSO attempt a real lightbulb read every ~15s even when the gate is false, to catch gate-blind fixes. UnusedOpens gates in ~1s and is read before any fallback fires, so the 15s spacing only runs for the gate-blind case and still protects slow producers from churn.
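
One way to picture the gate plus spaced fallback read, as a Python sketch with hypothetical names. `has_actions` again stands in for the broker query and `read_session` for ShowQuickFixes plus the session read; the poll interval is an assumption, and the initial buffer touch is omitted for brevity.

```python
from typing import Callable, List, Sequence


def get_code_actions_with_fallback(
    has_actions: Callable[[], bool],            # gate; may be blind to some fixes
    read_session: Callable[[], Sequence[str]],  # real lightbulb read
    clock: Callable[[], float],
    sleep: Callable[[float], None],
    overall_budget: float = 150.0,
    fallback_interval: float = 15.0,
    poll_interval: float = 0.5,                 # assumed value
) -> List[str]:
    start = clock()
    last_read = start
    while clock() - start < overall_budget:
        now = clock()
        # Read when the gate opens, or as a spaced fallback for gate-blind fixes.
        if has_actions() or now - last_read >= fallback_interval:
            items = read_session()
            last_read = now
            if items:
                return list(items)
        sleep(poll_interval)
    raise TimeoutError("no code actions found within the overall budget")
```

When the gate opens quickly the fallback never fires; the spacing only matters for fixes the gate cannot see.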

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
