!DONTOT SQUASH¡ Feat/mdns discovery#12

Merged
FASTSHIFT merged 22 commits into FASTSHIFT:main from W-Mai:feat/mdns-discovery on Jun 8, 2026

@W-Mai W-Mai commented Jun 8, 2026

Contributor

Zero-config server discovery via mDNS

TL;DR

Let fpb_cli.py find a WebServer on the LAN automatically — no more copying --server-url http://192.168.x.x:5500 between shells.

fpb_cli.py info                  # 0 typing — auto-pick local / single visible
fpb_cli.py -s bench:5501 info     # pick by mDNS handle
fpb_cli.py discover              # see what's around

discover in action

Default output is a human-friendly table. The HANDLE column is what -s accepts, so you can copy from one command to the next without ever typing an IP:

$ fpb_cli.py discover
HANDLE         URL                       AUTH    DEVICE   VERSION
bench:5500     http://127.0.0.1:5500    token   sensor   1.6.6
bench:5501     http://127.0.0.1:5501    none    none     1.6.6
lab:5500       http://192.168.1.42:5500 token   sensor   1.6.6

$ fpb_cli.py -s bench:5501 info
{ ...info JSON... }

--json switches to a machine-readable list (handy for scripts and AI agents):

$ fpb_cli.py discover --json
[
  {
    "name": "FPBInject on bench:5500._fpbinject._tcp.local.",
    "host": "127.0.0.1",
    "port": 5500,
    "url":  "http://127.0.0.1:5500",
    "version": "1.6.6",
    "auth": "token",
    "device": "sensor",
    "path": "/api",
    "id": "fpb:eec85835-7049-4d69-bf4a-dd0f5b52c5c7",
    "handle": "bench:5500"
  },
  ...
]

Same-host services advertising both loopback and a LAN IP (multi-homed hosts) are normalized to 127.0.0.1 so you never get prompted for a token to talk to a server you started yourself. Tokens never appear in TXT records and therefore never appear in discover output either.

Why

Previously, talking to a WebServer on another host meant manually pasting an IP:

  1. LAN IPs change; manual transcription is friction
  2. Two servers on one host needed bespoke --server-url per command
  3. There was no "what's reachable?" entry point at all

What changes (user-facing)

| Old | New |
|---|---|
| --server-url http://192.168.1.20:5500 | -s bench:5500 (mDNS handle), -s bench (when unique), URL still accepted |
| (none) | discover subcommand: table of LAN-visible servers; --json for scripts |
| (none) | FPB_SERVER=bench:5500 env to pin a server for the whole shell |
| Manual URL juggling on multi-host setups | Same-host services auto-normalize to 127.0.0.1; multi-match → stderr list + exit 2 |
| Existing --server-url / FPB_SERVER_URL | Still work; emit a deprecation note under -v |

Architecture

flowchart TD
  subgraph Server["WebServer (main.py)"]
    Adv["MdnsAdvertiser<br/>TXT: txtvers · version · auth · device · path · id"]
    GoodbyeNote["atexit + SIGINT/TERM<br/>→ goodbye packet"]
    Adv --- GoodbyeNote
  end

  Adv -- "_fpbinject._tcp.local. (UDP 5353)" --> LAN[("LAN mDNS")]

  subgraph CLI["fpb_cli.py"]
    Args["argparse / env"] --> Resolver["resolve_connection_plan()<br/>10-step ladder"]
    Resolver --> HResolver["handle resolver<br/>URL · host:port · host"]
    HResolver --> Cache["handle_cache<br/>(XDG cache, stale-while-revalidate)"]
    Cache --> Plan["ConnectionPlan<br/>(mode, url, token, port, cache_handle)"]
    Plan --> Connector["FPBCLI._connect_from_plan()"]
    Connector -. "FPBCLIError ⇒ invalidate(cache_handle)" .-> Cache
  end

  HResolver -. "mDNS browse<br/>(early-return on match)" .-> LAN
  Connector -- "HTTP /api/*" --> Server

Four mutually-exclusive modes survive across the refactor: OFFLINE, LOCAL_PROXY, REMOTE_PROXY, DIRECT. Mode is decided once by the resolver; the connector just executes.
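
To make the "decided once, executed once" split concrete, here is a minimal sketch of a frozen plan object. The four mode names and the field list (mode, url, token, port, cache_handle) are taken from the diagram above; the field types, defaults, and enum values are assumptions, not the actual definitions in cli/connection_plan.py.

```python
from __future__ import annotations

import dataclasses
import enum
from typing import Optional


class ConnectionMode(enum.Enum):
    OFFLINE = "offline"
    LOCAL_PROXY = "local_proxy"
    REMOTE_PROXY = "remote_proxy"
    DIRECT = "direct"


@dataclasses.dataclass(frozen=True)
class ConnectionPlan:
    # Field names follow the architecture diagram; types and defaults are assumed.
    mode: ConnectionMode
    url: Optional[str] = None
    token: Optional[str] = None
    port: Optional[str] = None          # device serial port, not the server's TCP port
    cache_handle: Optional[str] = None  # cache entry to invalidate if connecting fails


# The resolver would build exactly one plan; the connector only reads it.
plan = ConnectionPlan(ConnectionMode.LOCAL_PROXY, url="http://127.0.0.1:5500")
```

Freezing the dataclass means the connector cannot quietly change the mode after the resolver has decided it.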

Sequence — -s host:port (hot path)

sequenceDiagram
  autonumber
  participant U as user
  participant C as fpb_cli
  participant H as handle_cache
  participant T as daemon thread
  participant L as LAN mDNS
  participant S as WebServer

  U->>C: -s bench:5500 info
  C->>H: lookup("bench:5500")
  H-->>C: {url, id, fresh}
  C-)T: spawn_refresh()  (fire-and-forget)
  C-->>U: ~100 ms total
  C->>S: HTTP /api/info via cached_url
  S-->>C: result

  Note over T,L: refresh runs after main flow
  T->>L: mDNS browse, early_match=handle
  L-->>T: FPBServer{url, id}
  T->>H: store("bench:5500", url, id)
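
The hot path above can be sketched as a stale-while-revalidate lookup. This is an illustration only: the cache is a plain in-memory dict rather than handles.json, `discover` is a hypothetical stand-in for the blocking mDNS lookup, and the function name is invented for the example.

```python
import threading
import time

TTL_SECONDS = 24 * 60 * 60  # 24 h TTL, as stated in the cache invalidation table


def lookup_with_refresh(cache, handle, discover, now=None):
    """Return (url, refresh_thread) for `handle`.

    On a fresh hit the cached URL is returned immediately and a daemon
    thread re-runs discovery; on a miss discovery runs synchronously.
    """
    now = time.time() if now is None else now
    entry = cache.get(handle)
    if entry is not None and now - entry["cached_at"] <= TTL_SECONDS:
        def refresh():
            url = discover(handle)
            if url is None:
                cache.pop(handle, None)  # background refresh saw no match
            else:
                cache[handle] = {"url": url, "cached_at": time.time()}

        # daemon=True: the refresh never keeps the process alive on exit.
        worker = threading.Thread(target=refresh, daemon=True)
        worker.start()
        return entry["url"], worker

    # Miss or expired entry: block on discovery, then store the result.
    url = discover(handle)
    if url is not None:
        cache[handle] = {"url": url, "cached_at": now}
    return url, None
```

A real implementation would also need to persist the dict and tolerate concurrent writers; the sketch leaves both out.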

If the cached URL is stale:

sequenceDiagram
  autonumber
  participant C as fpb_cli
  participant H as handle_cache
  participant S as WebServer

  C->>H: lookup("bench:5500")
  H-->>C: {url, fresh}
  C->>S: HTTP /api/info
  S--xC: connection refused
  C->>H: invalidate("bench:5500")
  C-->>C: raise FPBCLIError
  Note over C: next invocation re-runs mDNS<br/>and writes a fresh entry

Sequence — cold cache (first call)

sequenceDiagram
  autonumber
  participant C as fpb_cli
  participant H as handle_cache
  participant L as LAN mDNS
  participant S as WebServer

  C->>H: lookup("bench:5500")
  H-->>C: None
  C->>L: discover_sync_by_handle("bench:5500")
  L-->>C: FPBServer{url, id}<br/>(early-return when match resolves)
  C->>H: store("bench:5500", url, id)
  C->>S: HTTP /api/info
  S-->>C: result
  Note over C: ~1.3 s end-to-end

Resolver ladder

flowchart TD
  Start([args]) --> P1{command_policy is OFFLINE or SERVER_ADMIN?}
  P1 -- yes --> Off[ConnectionMode.OFFLINE]
  P1 -- no  --> P2{--direct?}
  P2 -- yes --> Dir["ConnectionMode.DIRECT<br/>requires --port"]
  P2 -- no  --> P3{-s / --server?}
  P3 -- yes --> HR["handle resolver<br/>URL · host:port · host"]
  P3 -- no  --> P4{FPB_SERVER env?}
  P4 -- yes --> HR
  P4 -- no  --> P5{"--server-url (deprecated)?"}
  P5 -- yes --> URL[classify URL]
  P5 -- no  --> P6{"FPB_SERVER_URL (deprecated)?"}
  P6 -- yes --> URL
  P6 -- no  --> P7{single CLI-launched PID?}
  P7 -- yes --> Local["LOCAL_PROXY 127.0.0.1:&lt;pid_port&gt;"]
  P7 -- no  --> P8{"localhost:5500 /api/status reachable?"}
  P8 -- yes --> Local
  P8 -- no  --> P9{--no-discovery?}
  P9 -- yes --> Local
  P9 -- no  --> P10{mDNS browse}
  P10 -- 0 results --> Local
  P10 -- 1 result  --> URL
  P10 -- "≥2 results" --> Exit[stderr list + exit 2]

  HR --> URL
  URL --> ClassifyMode{is_local_url?}
  ClassifyMode -- yes --> Local
  ClassifyMode -- no  --> Remote[REMOTE_PROXY]
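
Read as code, the ladder is a chain of early returns. The sketch below mirrors the ten decision nodes of the flowchart but is not the real resolve_connection_plan(): the `Inputs` dataclass, the injected `resolve_handle` and `browse` callables, the `AmbiguousServers` exception, and the string mode names are all hypothetical, and validation such as "--direct requires --port" is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, List, Optional
from urllib.parse import urlparse

DEFAULT_URL = "http://127.0.0.1:5500"


class AmbiguousServers(Exception):
    """Stand-in for the 'stderr list + exit 2' branch."""


@dataclass
class Inputs:
    policy: str = "DEVICE"                 # OFFLINE / DEVICE / SERVER_ADMIN
    direct: bool = False                   # --direct
    server: Optional[str] = None           # -s / --server
    env_server: Optional[str] = None       # FPB_SERVER
    server_url: Optional[str] = None       # --server-url (deprecated)
    env_server_url: Optional[str] = None   # FPB_SERVER_URL (deprecated)
    pid_port: Optional[int] = None         # set when a single CLI-launched PID exists
    localhost_up: bool = False             # localhost:5500 /api/status reachable
    no_discovery: bool = False             # --no-discovery


def is_local_url(url: str) -> bool:
    return urlparse(url).hostname in ("127.0.0.1", "localhost", "::1")


def classify(url: str):
    return ("LOCAL_PROXY" if is_local_url(url) else "REMOTE_PROXY", url)


def resolve(i: Inputs, resolve_handle: Callable[[str], str],
            browse: Callable[[], List[str]]):
    if i.policy in ("OFFLINE", "SERVER_ADMIN"):       # step 1
        return ("OFFLINE", None)
    if i.direct:                                      # step 2
        return ("DIRECT", None)
    for handle in (i.server, i.env_server):           # steps 3-4
        if handle:
            return classify(resolve_handle(handle))
    for url in (i.server_url, i.env_server_url):      # steps 5-6
        if url:
            return classify(url)
    if i.pid_port is not None:                        # step 7
        return ("LOCAL_PROXY", f"http://127.0.0.1:{i.pid_port}")
    if i.localhost_up or i.no_discovery:              # steps 8-9
        return ("LOCAL_PROXY", DEFAULT_URL)
    found = browse()                                  # step 10
    if len(found) >= 2:
        raise AmbiguousServers(found)
    return classify(found[0]) if found else ("LOCAL_PROXY", DEFAULT_URL)
```

Injecting the handle resolver and the browse function keeps the decision logic testable without touching the network.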

Performance

Measured on a single-server LAN, loopback hot path:

| Path | Before | After (E series) | After (C series) |
|---|---|---|---|
| -s URL | n/a | 0.13 s | 0.13 s |
| -s host:port cold cache | 3.14 s | 1.32 s | 1.35 s |
| -s host:port warm cache | 3.14 s | 1.32 s | 0.14 s (≈22×) |
| no flag (same-host short-circuit) | 0.14 s | 0.13 s | 0.13 s |
| -s no match | n/a | 3.11 s | 3.11 s |

Cache invalidation

| Trigger | How |
|---|---|
| TTL expired (24 h) | lookup() returns None |
| Cached URL refuses connection | _connect_from_plan catches FPBCLIError → invalidate(handle) |
| Background refresh sees no match | refresh thread calls invalidate(handle) |
| User opt-out | FPB_NO_CACHE=1 env or rm ~/.cache/fpbinject/handles.json |

Host-only handles are never cached: they are allowed to match multiple servers, so a cached entry would race with changes in LAN topology.
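
Two of these triggers are easy to show in isolation: invalidating on a failed connection and the FPB_NO_CACHE opt-out. The sketch assumes a dict-backed cache and a local stand-in for FPBCLIError; the helper names `connect_with_invalidation` and `cache_enabled` are hypothetical.

```python
import os


class FPBCLIError(Exception):
    """Local stand-in for the CLI's error type."""


def connect_with_invalidation(cache, cache_handle, connect):
    """Run connect(); on FPBCLIError drop the cached handle and re-raise.

    The next invocation then misses the cache and falls back to mDNS.
    """
    try:
        return connect()
    except FPBCLIError:
        if cache_handle is not None:
            cache.pop(cache_handle, None)
        raise


def cache_enabled(environ=os.environ):
    """FPB_NO_CACHE=1 bypasses the cache entirely."""
    return environ.get("FPB_NO_CACHE") != "1"
```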

Security

mDNS announcements are cleartext UDP and cached for tens of minutes by every host on the segment. Tokens never appear in TXT records. The CLI obtains tokens only from --token, FPB_TOKEN, or the server's startup banner. TXT carries txtvers / version / auth (advertised intent) / device / path / id.

Full contract: Tools/WebServer/Docs/Discovery.md.
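
One way to keep the invariant hard to break is to build the TXT payload in a helper that never receives the token at all. The sketch below is an assumption about shape only: the key names come from this section, the "token" / "none" values from the discover output, and the txtvers value and function signature are invented for illustration.

```python
def build_txt(version, auth_required, device, server_id, path="/api"):
    """Build the advertised TXT key/value pairs.

    The auth token is deliberately not a parameter, so it cannot leak
    into the record; 'auth' only advertises whether a token is expected.
    """
    return {
        "txtvers": "1",  # assumed value; the source only names the key
        "version": version,
        "auth": "token" if auth_required else "none",
        "device": device or "none",
        "path": path,
        "id": server_id,
    }
```

Routing every registration and update through a single builder of this kind also keeps the key set identical across call sites.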

Tests

157 / 157 new and refactored tests pass:

| File | Tests | Covers |
|---|---|---|
| test_mdns_advertiser.py | 14 | TXT contents, no-token-leak, idempotent unregister, signal handlers |
| test_mdns_discovery.py | 12 | discover() async semantics, resource resolution |
| test_discover_localhost_pref.py | 9 | loopback > local-iface > other ordering, same-host normalization |
| test_discover_speed.py | 3 | early-return timing contract |
| test_connection_plan.py | 5 | dataclass frozen / defaults |
| test_resolve_connection_plan.py | 18 | every row of the 10-step decision matrix |
| test_handle_resolution.py | 20 | three handle forms, ambiguity, env vs flag precedence |
| test_handle_cache.py | 11 | TTL, atomic write, FPB_NO_CACHE, daemon-thread guarantee |

Plus the existing test_server_proxy.py (65 / 65) is untouched. test_fpb_cli.py: 180 / 181 — the remaining failure is pre-existing on origin/main and unrelated; this PR fixes 19 other pre-existing failures.

Backwards compatibility

  • --server-url URL and FPB_SERVER_URL env still work; deprecation note shown only with -v. Will be removed in a future release.
  • --port, --token, --direct semantics unchanged.
  • 65 existing test_server_proxy.py cases pass unmodified.

Files

New (5)
  Tools/WebServer/cli/connection_plan.py       ConnectionPlan + Mode/Policy
  Tools/WebServer/cli/discover.py              mDNS browse + handle classifier
  Tools/WebServer/cli/handle_cache.py          stale-while-revalidate cache
  Tools/WebServer/services/mdns_advertiser.py  server-side advertiser
  Tools/WebServer/Docs/Discovery.md            protocol spec

Modified (core)
  Tools/WebServer/cli/fpb_cli.py               resolver + connector rewrite
  Tools/WebServer/cli/server_proxy.py          auth error message
  Tools/WebServer/main.py                      + advertiser, + --no-mdns
  Docs/CLI.md                                  rewritten server-selection section
  Tools/requirements.txt                       + zeroconf>=0.131

22 files, +3367 / −288.

Commits — 20 atomic, by theme

mDNS introduction (6)
  b3c9527 docs(mdns): add discovery protocol spec + zeroconf dependency
  705c26a refactor(tests): extract MockHTTPHandler into shared fixture
  3ac0b58 feat(server): add MdnsAdvertiser for _fpbinject._tcp.local.
  408332d feat(cli): add cli/discover.py for client-side mDNS browsing
  b6bc6a1 feat(server): wire MdnsAdvertiser into WebServer main with --no-mdns
  b433711 feat(cli): auto-discover server via mDNS by default

UX fixes (3)
  e5356e0 feat(mdns): bump default discovery timeout from 1s to 3s
  9f0fe60 fix(cli): print FPBCLIError without traceback during init
  d13176a fix(mdns): include port in service instance name

ConnectionPlan refactor — four modes + same-host normalization (5)
  025d69b refactor(cli): introduce ConnectionPlan / CommandPolicy / ConnectionMode
  1f69509 refactor(cli): replace requires_server with command_policy on every subparser
  35d8c46 feat(mdns): prefer loopback over LAN-IP and normalize same-host services
  cc4de59 refactor(cli): single resolve_connection_plan + plan-driven connector
  52eda38 docs: rewrite CLI operating-modes + Discovery client precedence ladder

Identity + cache (6)
  7c1510b feat(mdns): publish stable per-installation id in TXT
  75d3506 feat(cli): expose handle and id on FPBServer; add classifier helpers
  b09f447 feat(cli): one -s flag for URL / host:port / hostname
  9ffb606 docs: rewrite CLI server selection around -s / FPB_SERVER
  f24f77b perf(cli): -s host:port short-circuits mDNS browse
  02d311e perf(cli): cache -s host:port lookups; refresh asynchronously

W-Mai added 20 commits June 5, 2026 21:32

b3c9527 docs(mdns): add discovery protocol spec + zeroconf dependency

Document the _fpbinject._tcp.local. service contract, TXT records,
client precedence ladder, and the security invariant that the auth
token is never published in mDNS.

Add zeroconf>=0.131 to Tools/requirements.txt.

705c26a refactor(tests): extract MockHTTPHandler into shared fixture

The mock HTTP handler was inlined inside test_server_proxy.py. Move it
to tests/fixtures/mock_http.py so upcoming mDNS tests can reuse the
exact same shape without duplicating HTTP/JSON/SSE plumbing.

Pure refactor: 65 existing tests still pass.

3ac0b58 feat(server): add MdnsAdvertiser for _fpbinject._tcp.local.

Publishes the running WebServer over mDNS / DNS-SD with TXT records
{txtvers, version, auth, device=none, path=/api}. Auth tokens are NEVER
published; auth TXT carries advertised intent only.

Lifecycle: register installs atexit + (optionally) SIGINT/SIGTERM
handlers so an interrupted server emits a goodbye packet rather than
leaving a stale entry in the LAN cache for the full ~75 min mDNS TTL.
Signal install is skipped under PYTEST_CURRENT_TEST by default; tests
override with install_signal_handlers={True,False}.

14 unit tests cover registration, idempotent unregister, no-token-leak,
auth-intent both directions, atexit hook, and signal-install policy.

408332d feat(cli): add cli/discover.py for client-side mDNS browsing

discover() runs an AsyncZeroconf browse for ~1 s, resolves each
matching service via AsyncServiceInfo, and returns a list of FPBServer
records (name, host, port, version, auth, device, path, url).

discover_sync() is the blocking convenience wrapper used by the CLI
dispatcher. The auth token is intentionally NOT in the FPBServer
dataclass — discovery does not carry credentials.

b6bc6a1 feat(server): wire MdnsAdvertiser into WebServer main with --no-mdns

The server constructs an MdnsAdvertiser after the startup banner and
unregisters in a finally block around app.run() so graceful shutdowns
produce a goodbye packet.

Failures during register() log a warning and continue without mDNS so a
machine without zeroconf (or behind a firewall blocking 5353/udp) is
not blocked from running the WebServer.

Add --no-mdns to skip advertisement entirely.

b433711 feat(cli): auto-discover server via mDNS by default

Add resolve_server_url() with the precedence ladder:
  1. --server-url flag
  2. FPB_SERVER_URL env
  3. offline subcommand (analyze/disasm/decompile/signature/search/
     get-symbols/compile) -> skip discovery, no 1 s delay
  4. --no-discovery -> fall back to http://127.0.0.1:5500
  5. mDNS browse for ~1 s:
     0 results -> fallback localhost
     1 result  -> attach silently
     2+ results -> list on stderr, exit 2

Add 'discover' subcommand emitting the visible server list as JSON for
scripts and AI agents.

--server-url default flips from DEFAULT_SERVER_URL to None so the
ladder can detect explicit-vs-implicit. Subparsers carry a
requires_server flag (default True; offline ones override to False) so
the resolver can short-circuit without a 1 s discovery delay.

Docs/CLI.md gets an Auto-Discovery section + the discover command +
exit-code table + security note pointing at Discovery.md.

12 new tests cover discover() async semantics (0/1/many), the full
precedence ladder including S7 zero-delay timing, and discover JSON
output.

e5356e0 feat(mdns): bump default discovery timeout from 1s to 3s

The 1 s budget was tight: zeroconf's initial query + reply roundtrip on
some hosts (multi-interface, Linux loopback + LAN) takes 1-2 s before
the first Added event fires. Users hit empty results often enough to
need to remember --timeout 3.

Make 3 s the default everywhere — `discover` subcommand, the
resolver's implicit browse, and discover_sync's signature. Documented
in Docs/CLI.md and Tools/WebServer/Docs/Discovery.md.

9f0fe60 fix(cli): print FPBCLIError without traceback during init

FPBCLI(...) and try_attach_local_server() ran outside the main try/
except block, so any FPBCLIError they raised (auth failures during
discovery, server-unreachable, etc.) bubbled up as a 3-layer Python
traceback that buried the actual error message.

Wrap both call sites and print the message + exit 1 — matching the
behaviour of FPBCLIError raised from inside a command. The full
traceback is no longer dumped on the user's terminal.

Concretely: `fpb_cli.py info` (with discovery resolving to a remote
URL but no token configured) now prints one line:

  Error: WebServer rejected the request (HTTP 403). A valid auth
  token is required for remote (non-localhost) access. Pass --token
  or set FPB_TOKEN.

instead of ~50 lines of nested traceback.

d13176a fix(mdns): include port in service instance name

Two MdnsAdvertiser instances on the same host (e.g. running two
WebServers on different ports for testing or multi-tenant setups) used
to collide because the service instance name only contained the
hostname. RFC 6763 requires unique instance names per service type, so
the second registration was silently swallowed by the local cache and
'fpb_cli.py discover' only saw one server.

Disambiguate by appending ':<port>' to the instance name. Each server
on host H now advertises 'FPBInject on H:<port>._fpbinject._tcp.local.'
and the two coexist correctly.
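
The naming scheme, and the reverse mapping from an instance name back to a `<host>:<port>` handle, can be sketched with plain string handling. The function names are hypothetical; the name format is the one quoted in this commit message.

```python
SERVICE_TYPE = "_fpbinject._tcp.local."
PREFIX = "FPBInject on "


def instance_name(hostname, port):
    """Full DNS-SD instance name, unique per (host, port) pair."""
    return f"{PREFIX}{hostname}:{port}.{SERVICE_TYPE}"


def handle_from_instance(name):
    """Recover '<host>:<port>' from an instance name, or None if it does not fit."""
    suffix = "." + SERVICE_TYPE
    if not (name.startswith(PREFIX) and name.endswith(suffix)):
        return None
    return name[len(PREFIX):-len(suffix)]
```

Two servers on one host now differ in the port suffix, so their names no longer collide.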

025d69b refactor(cli): introduce ConnectionPlan / CommandPolicy / ConnectionMode

Single source of truth for what mode the CLI runs in and which server /
token / serial port it should use. The resolver builds a ConnectionPlan
once; the connector consumes it once. Replaces the ad-hoc split between
resolve_server_url(), FPBCLI.__init__()'s waterfall, and the lazy
try_attach_local_server() midstate.

5 unit tests pin: 3 CommandPolicy values, 4 ConnectionMode values,
plan is frozen, defaults match what the resolver implies.

1f69509 refactor(cli): replace requires_server with command_policy on every subparser

Two parallel sets of "offline" commands had drifted apart:
- resolve_server_url() consulted the per-subparser requires_server bool;
- main() carried its own hard-coded OFFLINE_COMMANDS set.

Both used to disagree (server-stop / disconnect were in one but not the
other). Replace with one CommandPolicy enum on each subparser:

  OFFLINE       analyze, disasm, decompile, signature, search,
                get-symbols, compile, discover, disconnect
  DEVICE        info, inject, mem-*, serial-*, file-*, connect, ...
                (plus parser-level default)
  SERVER_ADMIN  server-stop

main() routes through args.command_policy now; the OFFLINE_COMMANDS
set is gone. test_mdns_discovery's namespace fixture grows a
back-compat shim mapping requires_server={True,False} to the new
field so the rest of its tests keep working.

35d8c46 feat(mdns): prefer loopback over LAN-IP and normalize same-host services

Two changes to cli/discover.py that together kill the 'why is the CLI
asking for a token to talk to a server I started?' trap:

1. _resolve() now keeps every address mDNS returns and sorts them with
   loopback < local-interface IP < remote, instead of picking
   parsed_scoped_addresses()[0] which is essentially arbitrary order.

2. When the winning address is loopback OR matches a local interface
   IP (detected via ifaddr — already a transitive dep of zeroconf,
   with a socket.getaddrinfo fallback), the host is rewritten to
   127.0.0.1 so the CLI never tries to round-trip via a LAN IP for a
   service it could reach over loopback.

ifaddr is required because socket.getaddrinfo(gethostname()) on
Debian-style /etc/hosts only returns 127.0.1.1; ifaddr enumerates
real NIC bindings.

9 unit tests cover the sort key, the same-host detector, and three
end-to-end normalization scenarios (loopback advertised alongside
LAN, LAN-only matching local interface, truly remote untouched).
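
The two rules can be sketched with the standard library alone. This is not the code in cli/discover.py: the set of local interface IPs is passed in explicitly instead of being enumerated via ifaddr, and both function names are illustrative.

```python
from __future__ import annotations

import ipaddress
from typing import Iterable, List


def address_sort_key(addr: str, local_ips: Iterable[str]) -> int:
    """0 = loopback, 1 = address of a local interface, 2 = remote."""
    if ipaddress.ip_address(addr).is_loopback:
        return 0
    if addr in set(local_ips):
        return 1
    return 2


def normalize_host(addresses: List[str], local_ips: Iterable[str]) -> str:
    """Pick the best address; rewrite same-host services to 127.0.0.1."""
    local = set(local_ips)
    best = min(addresses, key=lambda a: address_sort_key(a, local))
    if address_sort_key(best, local) < 2:
        return "127.0.0.1"
    return best
```

A truly remote address has rank 2 and passes through untouched, which matches the third scenario in the tests described above.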

cc4de59 refactor(cli): single resolve_connection_plan + plan-driven connector

Replaces three split-brain decision points:
  resolve_server_url()           -> URL-only output
  FPBCLI.__init__() waterfall    -> direct/remote/local rediscovery
  try_attach_local_server()      -> lazy mid-state for port-less attach

with one resolver and one connector:

  resolve_connection_plan(args) -> ConnectionPlan
  FPBCLI(plan=plan)             reads the plan once

Resolver precedence (first match wins):
  1. command_policy in {OFFLINE, SERVER_ADMIN} -> Offline plan
  2. --direct                                  -> Direct plan
                                                  (rejects --server-url,
                                                   requires --port)
  3. --server-url                              -> classify URL
  4. FPB_SERVER_URL env                        -> classify URL
  5. single CLI-launched PID                   -> 127.0.0.1:<pid_port>
  6. http://127.0.0.1:5500/api/status reachable -> default localhost
  7. --no-discovery                            -> default localhost
  8. mDNS browse 3 s
       0  -> default localhost fallback
       1  -> use it (already loopback-normalized in R3)
       2+ -> stderr list, sys.exit(2)

Connector dispatches by ConnectionMode and preserves the legacy
'auto-launch failed -> direct serial' fallback only when the plan
explicitly carries allow_direct_fallback=True (which the resolver
sets only for local plans with --port present).

Killed: _pending_local_server_url, _pending_local_token,
_pending_local_baudrate, try_attach_local_server, _is_remote_url,
_init_remote_proxy, OFFLINE_COMMANDS set. The legacy __init__ kwargs
keep working through _legacy_kwargs_to_plan() so 65 existing
test_server_proxy.py tests pass unchanged.

main() loses the post-construction try_attach_local_server() retry
because the plan-driven connector handles port-less local attach as
a normal LOCAL_PROXY plan.

Test deltas:
  + 18 tests in test_resolve_connection_plan.py covering every
    decision-matrix row including the two invalid flag combos
    (--direct with --server-url, --direct without --port).
  - 3 tests in test_fpb_cli.py for the deleted try_attach_local_server.
  ~ test_main_with_port_and_baudrate now inspects plan kwarg instead
    of legacy port/baudrate kwargs.

Net: -19 pre-existing failures in test_fpb_cli.py (mDNS no longer
runs against the network during the TestMainArgumentParsing suite
because the OFFLINE/SERVER_ADMIN policies short-circuit the resolver
before discover_sync()).

52eda38 docs: rewrite CLI operating-modes + Discovery client precedence ladder

Docs/CLI.md:
  - Replace the 6-row operating-modes table with the 4-mode mental
    model (Offline / Local Proxy / Remote Proxy / Direct Serial) +
    the 8-step resolver list.
  - Document the two rejected flag combos (--direct + --server-url,
    --direct without --port) inline.
  - Spell out: --port is always the device serial port, never the
    server's TCP port.
  - Auto-Discovery section explains the same-host-loopback
    normalization rule the user actually sees.

Tools/WebServer/Docs/Discovery.md:
  - Update the precedence ladder to match resolve_connection_plan's
    8 steps (was a 5-step ladder with no PID short-circuit, no
    localhost probe, and no normalization).
  - Add the 'Localhost preference' subsection with the sort key
    used by cli/discover.py::_address_sort_key.

7c1510b feat(mdns): publish stable per-installation id in TXT

Mint a UUID on first start, persist next to WebServer/ in
.fpbinject_server_id, advertise it as TXT 'id'. The id survives port
and hostname changes so future client-side identity matching can
follow a server across moves.

The id is local state, not config — added to .gitignore.
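
A minimal sketch of the mint-once-then-reuse behaviour follows. The `fpb:` prefix matches the id shown in the discover --json example; whether the prefix is stored in the file, and the helper name `load_or_create_server_id`, are assumptions.

```python
import uuid


def load_or_create_server_id(path):
    """Return the persisted id, minting and saving a new one on first start."""
    try:
        with open(path, encoding="utf-8") as f:
            existing = f.read().strip()
        if existing:
            return existing
    except FileNotFoundError:
        pass
    server_id = f"fpb:{uuid.uuid4()}"
    with open(path, "w", encoding="utf-8") as f:
        f.write(server_id + "\n")
    return server_id
```

Because the value lives in a file rather than being derived from hostname or port, it stays the same when either of those changes.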

75d3506 feat(cli): expose handle and id on FPBServer; add classifier helpers

discover() now extracts a 'handle' (the human-friendly '<host>:<port>'
fragment of the mDNS instance name) and the TXT 'id' onto each
FPBServer record, ready to be consumed by the new -s flag.

Add two pure helpers used by the CLI:
  classify_handle(value) -> 'url' | 'host_port' | 'host'
  find_by_handle(servers, value) -> filtered matches
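
The two helpers could look roughly like this. The signatures and return values follow the lines above; the classification rules (a "://" marks a URL, a trailing ":<digits>" marks host:port) and the reduced FPBServer record are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FPBServer:
    # Reduced to the two fields this sketch needs.
    handle: str
    url: str


def classify_handle(value):
    """Return 'url', 'host_port', or 'host'."""
    if "://" in value:
        return "url"
    host, sep, port = value.rpartition(":")
    if sep and host and port.isdigit():
        return "host_port"
    return "host"


def find_by_handle(servers, value):
    """Filter discovered servers by an exact handle or by hostname."""
    kind = classify_handle(value)
    if kind == "host_port":
        return [s for s in servers if s.handle == value]
    if kind == "host":
        return [s for s in servers if s.handle.rpartition(":")[0] == value]
    return []  # URLs are used verbatim and never looked up
```

A host-only value may legitimately return several matches, so the caller has to handle ambiguity.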

b09f447 feat(cli): one -s flag for URL / host:port / hostname

Users no longer have to copy IP:port from 'discover' output into
'--server-url'. A single -s / --server flag (and FPB_SERVER env)
accepts:

  -s http://1.2.3.4:5500   URL, used verbatim
  -s bench:5501             mDNS handle, exact match required
  -s bench                  hostname, must be unique on the LAN
                           (else exit 2 with disambiguation hints)

Resolver gains two new steps before the legacy URL ones:
  3. -s flag             (URL classifier or mDNS handle lookup)
  4. FPB_SERVER env      (same)
  5. --server-url        (deprecated; warns under -v)
  6. FPB_SERVER_URL env  (deprecated; warns under -v)

discover output flips: default is now a human table; --json restores
the previous machine-readable list. Each row carries the handle the
user types into -s.

20 new tests in test_handle_resolution.py pin: classifier shapes,
host-only ambiguity, resolver precedence, deprecation warnings, env
vs flag precedence.

9ffb606 docs: rewrite CLI server selection around -s / FPB_SERVER

CLI.md and Discovery.md updated to:
  - lead with -s / FPB_SERVER as the canonical way to pick a server,
  - document the three handle forms (URL / host:port / host),
  - show 'discover' default table example,
  - list --server-url / FPB_SERVER_URL as deprecated,
  - note the new TXT 'id' record.

f24f77b perf(cli): -s host:port short-circuits mDNS browse

The handle resolver in resolve_connection_plan() called discover_sync(),
which always slept the full 3 s timeout regardless of how quickly the
target service replied. `-s bench:5500` therefore took ~3 s even when
the matching service answered in <100 ms.

discover() now accepts an early_match predicate and exits the moment
it returns True for a freshly-resolved server. discover_sync_by_handle()
wires the predicate up for the host:port form (the only form where
we know the exact match in advance).
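
The early-exit idea can be illustrated without zeroconf. In the sketch below an asyncio.Queue stands in for the stream of freshly resolved servers, and `browse` is a hypothetical name; only the `early_match` predicate concept comes from this commit.

```python
import asyncio


async def browse(resolved, timeout, early_match=None):
    """Collect items from `resolved` until the timeout or an early match.

    `resolved` is an asyncio.Queue fed by whatever resolves services;
    `early_match` is an optional predicate that ends the browse at once.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    found = []
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            server = await asyncio.wait_for(resolved.get(), remaining)
        except asyncio.TimeoutError:
            break
        found.append(server)
        if early_match is not None and early_match(server):
            break  # no reason to wait out the rest of the budget
    return found
```

Without a predicate the loop still waits out the full timeout, consistent with the "-s missing" row staying at roughly 3 s.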

Live measurement on a single-server LAN:
  -s host:port:  3.14 s -> 1.32 s  (-58%)
  -s URL:        0.13 s -> 0.13 s  (unchanged, no mDNS)
  no flag:       0.14 s -> 0.12 s  (unchanged, same-host short-circuit)
  -s missing:    3.14 s -> 3.11 s  (unchanged; no signal to early-exit)

3 new tests pin the speed contract; existing 20 handle-resolution
tests retargeted to the new mock point.

02d311e perf(cli): cache -s host:port lookups; refresh asynchronously

Stale-while-revalidate handle cache so the second and later -s host:port
invocations skip mDNS entirely.

  ~/.cache/fpbinject/handles.json   {handle -> {url, id, cached_at}}

Cache flow:
  Hit (<= 24 h):
    return cached URL immediately
    spawn daemon thread that re-runs discover_sync_by_handle and
    rewrites the entry for next time -- the user never blocks
  Miss / expired / FPB_NO_CACHE=1:
    synchronous mDNS, then store
  Cached URL refuses connection at connect time:
    FPBCLI._connect_from_plan() invalidates the entry and re-raises;
    next invocation gets a fresh mDNS lookup

ConnectionPlan grows a 'cache_handle' field so the connector knows
which entry to invalidate on failure. The host-only form is never
cached (ambiguity is allowed and would race with LAN topology).

Atomic writes via tempfile + os.replace -- last writer wins, no locks
needed for concurrent CLI invocations.
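
The tempfile + os.replace pattern looks roughly like this. It is a sketch under assumptions: the helper name and the temp-file prefix are illustrative.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Write `data` as JSON so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file must live in the target directory so that os.replace
    # stays on one filesystem and is a single atomic rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".handles-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With concurrent writers the last os.replace wins, which is acceptable here because every entry can be rebuilt from mDNS.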

Live timing on a single-server LAN with the loopback hot path:
  -s host:port cold cache:  1.35 s   (full mDNS, then write)
  -s host:port warm cache:  0.14 s   (-90% vs cold, ~22x vs old 3.14 s)
  FPB_NO_CACHE=1 bypass:    1.25 s   (forced fresh mDNS)

11 new tests pin the cache: TTL boundary, atomic write, FPB_NO_CACHE
bypass, daemon-thread guarantee, end-to-end resolver wire-up.

Copilot AI left a comment


Pull request overview

This PR adds zero-config LAN discovery for FPBInject’s WebServer using mDNS, and refactors the CLI’s server selection into a single resolver that produces a ConnectionPlan consumed by the connector. It fits into the WebServer/CLI coexistence architecture by removing manual URL copying and enabling handle-based selection (-s bench:5500) plus a discover subcommand.

Changes:

  • Add server-side mDNS advertisement (_fpbinject._tcp.local.) and a CLI-side mDNS discovery client with localhost preference/normalization.
  • Refactor fpb_cli.py connection logic into a single resolve_connection_plan() + plan-driven connector, including handle-based selection and a handle cache.
  • Add/expand protocol + UX documentation and introduce extensive unit tests for resolver/discovery/cache behavior.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| Tools/WebServer/tests/test_server_proxy.py | Reuses shared mock HTTP handler fixture. |
| Tools/WebServer/tests/test_resolve_connection_plan.py | New tests pinning the connection-plan resolver decision ladder. |
| Tools/WebServer/tests/test_mdns_discovery.py | New tests for discovery semantics and CLI integration paths. |
| Tools/WebServer/tests/test_mdns_advertiser.py | New tests for server-side advertiser TXT/lifecycle behavior. |
| Tools/WebServer/tests/test_handle_resolution.py | New tests for -s/FPB_SERVER handle parsing and resolution. |
| Tools/WebServer/tests/test_handle_cache.py | New tests for the stale-while-revalidate handle cache and integration. |
| Tools/WebServer/tests/test_fpb_cli.py | Updates to align CLI construction with ConnectionPlan. |
| Tools/WebServer/tests/test_discover_speed.py | New timing-contract tests for early-return by handle. |
| Tools/WebServer/tests/test_discover_localhost_pref.py | New tests for same-host normalization and address ordering. |
| Tools/WebServer/tests/test_connection_plan.py | New tests for ConnectionPlan immutability/defaults/enums. |
| Tools/WebServer/tests/fixtures/mock_http.py | New shared mock HTTP handler for integration-style tests. |
| Tools/WebServer/tests/fixtures/__init__.py | Adds fixtures package marker. |
| Tools/WebServer/services/mdns_advertiser.py | Implements server-side mDNS advertiser and lifecycle hooks. |
| Tools/WebServer/main.py | Wires advertiser into server startup with --no-mdns. |
| Tools/WebServer/Docs/Discovery.md | Adds the discovery protocol specification and resolver ladder. |
| Tools/WebServer/cli/handle_cache.py | Implements persistent handle cache (XDG cache, atomic write, refresh thread). |
| Tools/WebServer/cli/fpb_cli.py | Adds -s/--server, discover, plan resolver, and plan-driven connection execution. |
| Tools/WebServer/cli/discover.py | Implements mDNS browse + normalization + handle matching helpers. |
| Tools/WebServer/cli/connection_plan.py | Adds CommandPolicy, ConnectionMode, and ConnectionPlan model. |
| Tools/WebServer/.gitignore | Ignores the per-machine .fpbinject_server_id file. |
| Tools/requirements.txt | Adds zeroconf dependency for discovery/advertising. |
| Docs/CLI.md | Updates user documentation for -s, discovery, modes, and precedence. |

W-Mai added 2 commits June 8, 2026 18:50
Run Tools/WebServer/format.sh on the new mDNS / cache / handle code
so it conforms to the project's black config (line-length 88) and
clears flake8 (drop unused imports).

Also realign upstream test_cli_coexistence with the post-refactor API:

  * _is_remote_url(url) was deleted in the ConnectionPlan refactor;
    its replacement is the module-level cli.fpb_cli._is_local_url().
    The locality test class swaps to that and inverts assertions.
  * Two TestMain* tests inspected mock_cli_class.call_args.kwargs.get
    ("direct" / "server_url" / "token"); main() now passes a single
    plan= kwarg so the tests inspect plan.mode / plan.server_url /
    plan.token instead.
  * test_offline_no_proxy_no_launch grew an explicit ServerProxy stub
    because the connector now probes is_server_running() before
    deciding to stay offline; the test still asserts no auto-launch.

Pure mechanical pass: 0 logic changes, 0 new tests.

Test on rewritten code:
  Tools/WebServer/tests/run_tests.py --coverage --target 85
  -> 2318 passed, 82 skipped, coverage 85.5%
  * MdnsAdvertiser.update_device_state() rebuilt TXT without the 'id'
    field, breaking the Discovery.md contract after the first state
    update. Cache the persisted id on register and route both register
    and update through a shared _build_txt(state) helper. New unit test
    pins all TXT keys (including 'id') across both call sites.

  * Resolver-level ambiguity now raises AmbiguousServerError, a
    FPBCLIError subclass with exit_code=2, and main()'s handler exits
    via e.exit_code. Previously every FPBCLIError exited 1, so scripts
    couldn't distinguish 'needs disambiguation' from runtime failures —
    contradicting the documented ladder.

  * The two 'pass --server-url to choose:' prompts (the new resolver
    and the legacy resolve_server_url) now point users at
    '-s <handle>' instead. --server-url is hidden from --help and
    deprecated; the suggestion has to match.

  * Discovery.md said the mDNS instance name was 'FPBInject on
    <hostname>', but the advertiser actually emits 'FPBInject on
    <hostname>:<port>' (the port suffix is what makes the client-side
    'handle' value possible and lets multiple servers per host
    coexist). Spec updated.

  * main() now passes legacy kwargs (port / baudrate / direct /
    server_url / token) alongside plan= so downstream wrappers that
    monkeypatch FPBCLI construction see compatible call args. The
    plan kwarg is the actual source of truth; legacy kwargs are
    ignored when plan is present.
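
The exit-code change in the second bullet can be sketched as follows. The class names and the exit_code attribute come from the bullet; the `run` wrapper is a hypothetical stand-in for main()'s handler.

```python
import sys


class FPBCLIError(Exception):
    exit_code = 1  # default: generic runtime failure


class AmbiguousServerError(FPBCLIError):
    exit_code = 2  # "needs disambiguation", distinguishable by scripts


def run(command):
    """Run a command callable and translate CLI errors into exit codes."""
    try:
        command()
    except FPBCLIError as e:
        # One line on stderr instead of a traceback.
        print(f"Error: {e}", file=sys.stderr)
        return e.exit_code
    return 0
```

Putting the code on the exception class lets the handler stay generic: any new FPBCLIError subclass only has to set its own exit_code.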

Tests: 92/92 PASS on new test files; test_cli_coexistence 43/43 PASS;
run_tests.py --coverage --target 85 -> 85.5%.
@FASTSHIFT FASTSHIFT merged commit 06fa4c8 into FASTSHIFT:main Jun 8, 2026
3 checks passed