Skip to content

Black-box e2e TUI tests over a PTY (layer 2, pure .NET)#166

Draft
BrettKinny wants to merge 4 commits into
mainfrom
test/tui-e2e
Draft

Black-box e2e TUI tests over a PTY (layer 2, pure .NET)#166
BrettKinny wants to merge 4 commits into
mainfrom
test/tui-e2e

Conversation

@BrettKinny

Copy link
Copy Markdown
Collaborator

Draft — do not merge. Second of two PRs adding programmatic TUI testing.

Stacked on #165 (test/tui-component) so it runs against the migrated (Terminal.Gui 2.4.5)
app. Base will retarget to main once #165 merges. Review #165 first.

What this is

Layer 2: black-box end-to-end tests that launch the published binary attached to a real
pseudo-terminal, reconstruct the rendered screen from its terminal output, and assert on it.
This is the rendered-output coverage that the in-process component tests (layer 1) can't
provide — Terminal.Gui 2.4.5 stable ships no public headless driver.

Pure .NET — no Node, no Python, no NuGet PTY/VT library.

Design

Piece Role
Pty Sized PTY via libc openpty + posix_spawn(POSIX_SPAWN_SETSID) (P/Invoke).
VtScreen Minimal in-tree VT100/ANSI → character-grid emulator.
OpcilloscopeSession Pumps output into the grid, answers Terminal.Gui's terminal queries (size CSI 18t, cursor DSR 6n, OSC 10/11) so it paints, exposes Snapshot/WaitForText/input.
PublishedBinaryFixture Publishes once, or reuses $OPCILLOSCOPE_BIN.

Why not script / Pty.Net / VtNetCore?

  • script produces a 0×0 window when stdout is a pipe (no size flag) → Terminal.Gui won't
    draw. We size the PTY ourselves.
  • Pty.Net is unmaintained (2018, unlisted, Windows-only native); VtNetCore/XtermSharp are
    stale — so the emulator is kept in-tree, scoped to what Terminal.Gui emits.

The key insight

Terminal.Gui's net driver detects size/colours by emitting escape sequences and waiting for
the terminal to reply
. A bare PTY has no emulator answering, so the app blocks before its
first paint. The session answers those queries — that's what makes black-box rendering work.

Tests (StartupTests)

  • Startup renders all four panes (Address Space / Monitored Variables / Node Details / Log).
  • Status-bar keybinding hints render.
  • Menu bar renders (File / Connection / View / Help).
  • ? opens the help dialog (opcilloscope - Help).

Stable across repeated local runs (assertions wait for text rather than racing the first paint).

CI

A dedicated e2e job publishes the binary once, sets OPCILLOSCOPE_BIN, and runs the project;
the unit job is scoped to the unit/integration/component project to avoid double-running. PTY
allocation works on ubuntu-latest by default. See Tests/Opcilloscope.E2ETests/README.md.

🤖 Generated with Claude Code

BrettKinny and others added 2 commits June 10, 2026 11:47
The component-test APIs (Application.Create, InjectKey/InjectSequence,
VirtualTimeProvider) first appear in Terminal.Gui 2.1.0, which also landed a
breaking API redesign. This ports the whole UI layer to 2.4.5:

- Namespace reorg: add GlobalUsings for Terminal.Gui.{App,ViewBase,Views,
  Drawing,Input,Drivers,Text,Configuration}; alias Attribute -> Drawing.Attribute.
- ColorScheme -> Scheme; view.ColorScheme = x -> view.SetScheme(x); add a
  WithScheme() fluent helper for initializer-style scheme assignment.
- Toplevel -> Window; RadioGroup -> OptionSelector (Labels/Value/ValueChanged).
- TableView: SelectedRow -> Value?.SelectedCell.Y; SelectedCellChanged ->
  ValueChanged<TableSelection>; MouseClick -> MouseEvent(Mouse).
- TreeView ObjectActivated -> Activated (read SelectedObject).
- ListView OpenSelectedItem -> Accepting; drop TopItem (SelectedItem autoscrolls).
- MenuItem shortcutKey: named arg -> positional Key.
- Custom drawing: Driver!.X(...) -> view-level SetAttribute/AddRune/AddStr/Move
  in OnDrawingContent(DrawContext).
- Application.Top -> TopRunnable/TopRunnableView; SizeChanging -> SubViewLayout;
  Colors.ColorSchemes["Menu"] -> SchemeManager.AddScheme; MessageBox.* now take
  Application.Instance; ShadowStyle.None -> ShadowStyles.None; Subviews -> SubViews;
  TextField.CursorPosition -> MoveEnd().

Known regression (flagged for review): Terminal.Gui 2.4 adornments have no
independent Scheme, so per-border colouring (grey borders, focus-highlight title)
now inherits the view scheme; focus highlight reapplied via FrameView scheme.

Builds clean; all 613 existing tests pass; published --help smoke exits 0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Constructs the real Terminal.Gui views/dialogs and asserts on observable
component behaviour:
- MonitoredVariablesView: AddVariable/RemoveVariable, scope-selection bookkeeping,
  per-client-handle idempotency.
- ConnectDialog: endpoint protocol-prefix handling, publishing interval, and the
  authentication selector (migrated RadioGroup -> OptionSelector).
- Theme/Scheme: DarkTheme/LightTheme schemes and ThemeStyler.ApplyTo guard the
  ColorScheme -> Scheme migration.

Tests run in a non-parallel xUnit collection because Terminal.Gui's Application is
global state. Picked up automatically by `dotnet test` / CI.

Rendered cell-buffer assertions are intentionally deferred to the black-box PTY
suite: Terminal.Gui 2.4.5 stable exposes no public headless driver (Application.Create
leaves Driver null; DriverAssert is develop-only). Documented in docs/TESTING.md.

627/627 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Jun 10, 2026

Copy link
Copy Markdown

Code Review — PR #166: Black-box e2e TUI tests over a PTY

Summary: This is a well-designed, clearly reasoned addition of a Layer 2 black-box testing harness. The architecture is layered cleanly (PtyVtScreenOpcilloscopeSession → tests), the rationale for the custom PTY/VT approach is documented, and test assertions use WaitForText to avoid first-paint races. Good work overall.


Bugs / Correctness

1. Potential deadlock in PublishedBinaryFixture.Publish() (medium)

using var p = Process.Start(psi)!;
p.WaitForExit();
if (p.ExitCode != 0)
    throw new InvalidOperationException("dotnet publish failed:\n" + p.StandardError.ReadToEnd());

Reading from a redirected stream after WaitForExit() is a known deadlock hazard: if the process writes enough to stderr to fill the OS pipe buffer it blocks, and so does WaitForExit. dotnet publish verbose output on a restore failure can exceed the buffer. Fix: drain both streams concurrently before waiting:

var stderrTask = p.StandardError.ReadToEndAsync();
var stdoutTask = p.StandardOutput.ReadToEndAsync();
p.WaitForExit();
if (p.ExitCode != 0)
    throw new InvalidOperationException("dotnet publish failed:\n" + await stderrTask);

InitializeAsync is already async Task so await works here.

2. File descriptor leak in Pty.Spawn() error path (low)

After openpty succeeds, slave is open. If posix_spawn fails, the finally block frees the unmanaged memory but slave is never closed. Add close(slave) on the failure branch, or wrap the spawn attempt in a try/finally.


Potential Flakiness

3. Startup_RendersStatusBarKeybindingHints second assertion has no wait

Assert.True(app.WaitForText("Switch", Timeout), Snapshot(app));  // waits
Assert.Contains("Subscribe", app.Snapshot());                     // no wait

If the status bar items render across separate frames, "Subscribe" may not be on screen yet. Use WaitForText("Subscribe", Timeout) to be consistent with the rest of the test suite.


Design / Robustness

4. VtScreen does not handle CSI 2 K (erase entire line)

EraseLine handles modes 0 and 1 but silently ignores mode 2. Terminal.Gui occasionally emits CSI 2 K for full-line redraws. This won't crash but can leave stale characters in the grid and produce false positives. Adding mode 2 is a one-liner:

private void EraseLine(int mode)
{
    if (!InBounds(_row, 0)) return;
    if (mode == 2) { for (int c = 0; c < _cols; c++) _grid[_row, c] = ' '; return; }
    int from = mode == 0 ? _col : 0;
    int to   = mode == 1 ? _col : _cols - 1;
    for (int c = Math.Max(0, from); c <= Math.Min(_cols - 1, to); c++) _grid[_row, c] = ' ';
}

5. POSIX_SPAWN_SETSID = 0x80 is a glibc constant (informational)

The comment correctly identifies this. Since CI targets ubuntu-latest (glibc) it is fine. If the project ever tests on Alpine (musl), the value differs. Acceptable for now — the comment is sufficient documentation.


Minor / Style

6. CI e2e job uses fetch-depth: 0

Full git history is not needed for a build-and-test job. Omitting this key (default is fetch-depth: 1) would be faster.

7. NuGet restore cache missing from e2e job

The e2e job runs a cold dotnet build without any NuGet cache, so packages are re-downloaded on every run. Adding an actions/cache step matching the main job's cache key would speed it up noticeably.

8. Solution file BOM + x64/x86 platform noise

The UTF-8 BOM at line 1 and the x64/x86 platform config entries are Visual Studio artifacts. Harmless, but they add diff noise and can trip up Linux tooling that is strict about BOM handling. Worth cleaning up before the final merge to main.


Summary

Severity Issue
Bug (medium) Publish() deadlock — reading stderr after WaitForExit
Bug (low) slave fd leaked when posix_spawn fails
Flakiness risk Missing WaitForText for the "Subscribe" assertion
Robustness CSI 2 K (erase line mode 2) silently ignored in VtScreen
Minor fetch-depth: 0 unnecessary; no NuGet cache in e2e job; sln BOM noise

Items 1–3 are worth addressing before merge. Item 4 is a good defensive addition. The rest can be deferred.

🤖 Generated with Claude Code

BrettKinny and others added 2 commits June 10, 2026 12:28
Adversarial review of the Terminal.Gui 2.0->2.4 migration confirmed three
event-routing / API-semantics regressions (TG 2.4 moved selection/activation from
overridden handlers into the command pipeline, which now runs AFTER the raw
MouseEvent/Accepting events):

- MonitoredVariablesView.HandleMouseClick: dropped `e.Handled = true` on Sel-column
  clicks. Marking it handled now pre-empts TableView's LeftButtonClicked ->
  Command.Activate -> SetSelection, so a Sel click toggled scope but no longer
  highlighted the row or raised SelectedVariableChanged. Unhandled restores both.
- SaveRecordingDialog.OnFileListOpenSelected: set `e.Handled = true`. An unhandled
  Accepting bubbles Accept to the default Save button, so navigating a folder or
  picking a file would save+close the dialog mid-browse.
- SaveRecordingDialog.NavigateToSelected: null-safe guard. ListView.SelectedItem is
  now int? (null = none); the OR-form guard didn't catch null and dereferenced
  null.Value (InvalidOperationException at a CSV-less filesystem root).
- LogView: restore log auto-scroll via EnsureSelectedItemVisible() (TG 2.4 removed
  ListView.TopItem); without it the log stopped following new entries.

627/627 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Launches the published binary attached to a real pseudo-terminal, reconstructs the
rendered screen from the VT/ANSI output, and asserts on it — the rendered-output
coverage the in-process component tests can't provide on Terminal.Gui 2.4.5 stable.

Pure .NET, no extra toolchain:
- `Pty`: sized PTY via libc `openpty` + `posix_spawn(POSIX_SPAWN_SETSID)` (P/Invoke).
  Hand-rolled because `script` yields a 0x0 window over a pipe and `Pty.Net` is
  unmaintained/Windows-only.
- `VtScreen`: minimal in-tree VT100/ANSI -> character-grid emulator (the .NET VT
  libraries are stale).
- `OpcilloscopeSession`: pumps output into the screen and, crucially, answers the
  terminal capability queries Terminal.Gui emits (CSI 18t size, DSR 6n cursor,
  OSC 10/11 colours) so the app actually paints; exposes Snapshot/WaitForText/input.
- `PublishedBinaryFixture`: publishes once (or reuses $OPCILLOSCOPE_BIN).

Tests: startup renders all panes; status-bar keybinding hints; menu bar;
'?' opens the help dialog. Stable across repeated runs (wait-for-text, not
race-the-first-paint).

CI: dedicated `e2e` job publishes once and runs the project; the unit job is scoped
to the unit/integration/component project to avoid double-running. Linux-only.

See Tests/Opcilloscope.E2ETests/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Jun 10, 2026

Copy link
Copy Markdown

Code Review — PR #166: Black-box e2e TUI tests over a PTY

This is impressive infrastructure: a pure-.NET PTY harness with an in-tree VT emulator is a well-reasoned solution given the state of the NuGet ecosystem for this problem (stale libraries, no headless driver in Terminal.Gui 2.4.5 stable). The design rationale in the PR description and README is clear and correct. A few findings below, ordered by severity.


Bugs / Correctness

VtScreen.P — the zero-default logic is a no-op

// Current (equivalent to: idx < p.Count ? p[idx] : def)
private static int P(List<int> p, int idx, int def) =>
    idx < p.Count && p[idx] != 0 ? p[idx] : (idx < p.Count ? p[idx] : def);

Both branches of the outer ternary return p[idx] when idx is in range, so the != 0 guard is dead code. Per VT100, a param of 0 should fall back to the default. The intended expression is almost certainly:

private static int P(List<int> p, int idx, int def) =>
    idx < p.Count && p[idx] != 0 ? p[idx] : def;

This doesn't cause visible test failures because cursor-move callers use Math.Max(1, P(…)) and cursor-position callers clamp to 0 by coincidence, but it's a correctness bug that will bite as soon as a VT sequence sends an explicit 0 param for a command that treats it as "use default."


Pty.Write — return value silently dropped

public void Write(byte[] data) => write(_master, data, data.Length);

If the child has exited or the PTY buffer is full, write() returns -1 and nothing surfaces. The test-input helpers (Send, SendByte, Reply) all go through this. Add at minimum a check and a Debug.WriteLine, or let the caller decide:

public void Write(byte[] data)
{
    long n = write(_master, data, data.Length);
    if (n < 0) /* log or throw */ ;
}

PublishedBinaryFixture.Publish — potential stdout deadlock

RedirectStandardOutput = true,
RedirectStandardError  = true,
// ...
p.WaitForExit();
if (p.ExitCode != 0)
    throw new ... p.StandardError.ReadToEnd();

Both streams are captured but neither is drained while the process runs. dotnet publish produces a lot of stdout, and if the OS pipe buffer fills the process will block, causing WaitForExit to deadlock. Read both streams asynchronously:

var stderr = p.StandardError.ReadToEndAsync();
p.StandardOutput.ReadToEnd(); // drain stdout synchronously, or also async
p.WaitForExit();
if (p.ExitCode != 0)
    throw new InvalidOperationException("dotnet publish failed:\n" + stderr.Result);

Minor / Design

Pty — P/Invoke return values from posix_spawn_file_actions_* ignored

posix_spawn_file_actions_adddup2 and friends return int (errno on failure). They're very unlikely to fail here, but ignoring them means a spawn-setup error produces a mysterious posix_spawn failure downstream instead of a clear message. Even a Debug.Assert(rc == 0, ...) would help.


Pty._pid — unsynchronised access

_pid is written in HasExited (from test thread via WaitForText) and in Dispose (from test thread on cleanup). If both happen concurrently the field isn't volatile and there's no lock. In the current test pattern this is benign (sequential), but it's worth a volatile int _pid annotation or a lock.


Solution file — unintentional x86/x64 configurations and BOM

The diff adds Debug|x64, Debug|x86, Release|x64, Release|x86 entries for every project, plus a UTF-8 BOM to the first line. These look like IDE artefacts rather than intentional changes. Worth reverting to keep the diff minimal.


CI e2e job — fetch-depth: 0 is unnecessary

The e2e job uses fetch-depth: 0 (full history) but the unit job does not. Nothing in PublishedBinaryFixture or the tests requires git history. fetch-depth: 1 (the default) is fine and faster.


Test Coverage

The four StartupTests scenarios cover the right surface: pane render, status bar, menu bar, and dialog open. One small nit:

Startup_RendersStatusBarKeybindingHints — TOCTOU on "Subscribe"

Assert.True(app.WaitForText("Switch", Timeout), Snapshot(app));
Assert.Contains("Subscribe", app.Snapshot());   // ← races the same paint

If "Switch" and "Subscribe" appear on the same render frame this is fine, but if "Subscribe" lags a frame, the second assertion could fire before it's present. An extra WaitForText("Subscribe", Timeout) would be more robust.


Summary

Severity Finding
Bug VtScreen.P zero-default is a no-op; fix the ternary
Bug Pty.Write drops libc error silently
Bug Publish() stdout deadlock under heavy output
Minor P/Invoke file-actions calls don't check return codes
Minor _pid not volatile
Nit Solution file BOM + x64/x86 configurations (IDE artefacts)
Nit fetch-depth: 0 not needed in e2e job
Nit Second status-bar assertion should use WaitForText

The P fix is the most important — everything else is hardening for the test harness. The overall approach is solid.

@BrettKinny BrettKinny force-pushed the test/tui-component branch from 7d2799b to 8535266 Compare June 11, 2026 07:59
@BrettKinny BrettKinny changed the base branch from test/tui-component to main June 11, 2026 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant