Skip to content
Merged
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# BUG-T6 Validation Report

**Task:** BUG-T6 — Web UI port collisions (`--web-ui-port`) create unstable multi-process behavior
**Date:** 2026-02-14
**Verdict:** ✅ PASS

---

## Quality Gates

| Gate | Command | Result |
|------|---------|--------|
| Unit tests | `pytest tests/unit/` | ✅ 323 passed, 0 failed |
| Linting | `ruff check src/` | ✅ All checks passed |
| Type checking | `mypy src/` | ✅ Success: no issues found in 12 source files |

---

## Acceptance Criteria

| # | Criterion | Status | Evidence |
|---|-----------|--------|----------|
| AC1 | When port occupied, wrapper prints warning and continues as MCP-only — no crash | ✅ | `test_occupied_port_in_bridge_mode_skips_webui` passes; `run_server` OSError caught |
| AC2 | MCP stdout remains valid JSON-RPC only | ✅ | Warning printed to `sys.stderr`; stdout unaffected |
| AC3 | `--web-ui-only` mode with occupied port exits with code 1 + clear message | ✅ | `test_occupied_port_in_webui_only_mode_exits_with_error` passes |
| AC4 | Free port: behavior unchanged from pre-fix | ✅ | `test_free_port_starts_webui_normally` passes; all 36 pre-existing tests pass |
| AC5 | Tests cover (a) occupied/bridge, (b) occupied/webui-only, (c) free port, (d) `is_port_available` unit tests | ✅ | 5 new tests in `TestPortCollisionHandling` |
| AC6 | No regressions in existing test suite | ✅ | 323/323 pass |

---

## Changes

### `src/mcpbridge_wrapper/webui/server.py`
- Added `import socket` and `import sys`
- Added `is_port_available(host, port) -> bool` — attempts `socket.bind()` and returns `False` on `OSError`
- Wrapped `uvicorn.run(...)` in `try/except OSError` to catch race-condition bind failures in the daemon thread

### `src/mcpbridge_wrapper/__main__.py`
- Imports `is_port_available` from `webui.server`
- Before `--web-ui-only` server start: check port; if occupied, print error and `return 1`
- Before `run_server_in_thread`: check port; if occupied, print warning and skip Web UI; MCP bridge starts normally

### `tests/unit/test_main_webui.py`
- Added `patch("mcpbridge_wrapper.webui.server.is_port_available", return_value=True)` to 4 existing tests that previously relied on real port availability
- Added `class TestPortCollisionHandling` with 5 new tests
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# BUG-T6: Web UI Port Collisions Create Unstable Multi-Process Behavior

**Task ID:** BUG-T6
**Type:** Bug / Runtime / Process Lifecycle
**Priority:** P0
**Status:** In Progress
**Implements:** FU-P13-T8
**Date:** 2026-02-14

---

## 1. Problem Statement

When multiple stale/orphan wrapper instances exist (e.g., after a client restarts), they all
attempt to bind to the same Web UI port (default `8080`). The result is a flood of bind errors
logged to stderr that:
1. Clutter the stderr channel used for diagnostic messages
2. Prevent new instances from starting a usable Web UI
3. Create an undefined mix of stale and new listeners on the same host:port

The core issue is that `run_server` / `run_server_in_thread` do not check port availability
before starting, and the caller (`__main__.py`) does not handle the `OSError` that results.

---

## 2. Deliverables

| # | Artifact | Path |
|---|----------|------|
| 1 | Port-check utility function | `src/mcpbridge_wrapper/webui/server.py` |
| 2 | Collision handling in `__main__.py` startup | `src/mcpbridge_wrapper/__main__.py` |
| 3 | Unit tests for collision scenarios | `tests/unit/test_main_webui.py` |
| 4 | Validation report | `SPECS/INPROGRESS/BUG-T6_Validation_Report.md` |

---

## 3. Design

### 3.1 Port availability check

Add a helper `is_port_available(host, port) -> bool` in `server.py` that attempts a
`socket.bind()` and returns `False` if `OSError` is raised (i.e., port occupied).

### 3.2 Startup collision handling in `__main__.py`

Before calling `run_server_in_thread` (or `run_server` in `--web-ui-only` mode):
1. Call `is_port_available(config.host, config.port)`.
2. If the port is **occupied**:
- Print a clear message to stderr:
`Warning: Web UI port {port} is already in use. Skipping Web UI startup.`
- Continue WITHOUT starting the Web UI thread (MCP stdio bridge still starts normally).
- Do NOT exit with an error — the MCP session must not be disrupted.
3. If the port is **free**, proceed as normal.

For `--web-ui-only` mode, when the port is occupied:
- Print the same warning to stderr.
- Exit with code `1` (the user explicitly requested the dashboard; failure is fatal).

### 3.3 Thread-level guard

Wrap `run_server` body in a `try/except OSError` so that a race condition between check and
bind does not produce an unhandled exception in the daemon thread (which would be silently lost):
- Catch `OSError` in `run_server` and log to stderr instead of crashing.

---

## 4. Acceptance Criteria

- [ ] AC1: When the requested Web UI port is occupied, wrapper prints a clear warning to stderr and continues as MCP-only mode — no crash, no unhandled exception.
- [ ] AC2: MCP stdio protocol output (stdout) remains valid JSON-RPC only — no error text leaks to stdout.
- [ ] AC3: In `--web-ui-only` mode, occupied port causes exit code `1` with a clear stderr message.
- [ ] AC4: If the port is free, behavior is unchanged from pre-fix.
- [ ] AC5: Unit tests cover: (a) occupied port in bridge+webui mode, (b) occupied port in webui-only mode, (c) free port normal path.
- [ ] AC6: No regression in existing `test_main_webui.py` or `test_main.py` tests.

---

## 5. Dependencies

- `src/mcpbridge_wrapper/webui/server.py` — add `is_port_available`
- `src/mcpbridge_wrapper/__main__.py` — add collision guard around Web UI startup

---

## 6. Out of Scope

- PID file / single-instance lock (may be a follow-up)
- Auto-incrementing port fallback (YAGNI)
- Killing stale processes automatically (dangerous, out of scope)
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
## REVIEW REPORT — BUG-T6: Web UI Port Collision Handling

**Scope:** origin/main..HEAD (commit 792abf2..2af8d9f)
**Files changed:** 3 (`__main__.py`, `webui/server.py`, `test_main_webui.py`)
**Date:** 2026-02-14

---

### Summary Verdict
- [x] Approve with comments

---

### Critical Issues

None.

---

### Secondary Issues

**[Medium] `is_port_available` uses `SO_REUSEADDR` which can produce false positives on some platforms**

`SO_REUSEADDR` on Linux allows binding to a port that has connections in `TIME_WAIT`. On macOS
(the target platform) the behavior is more strict, so this is unlikely to cause problems in
practice. However, for maximum correctness the `setsockopt` call could be omitted so we get a
pure "is this port currently bound?" answer.

Recommendation: Low-priority follow-up; the current behavior is safe on macOS.

**[Low] Port check has a time-of-check/time-of-use (TOCTOU) window**

Between `is_port_available` returning `True` and `uvicorn.run()` binding, another process could
claim the port. The `OSError` catch in `run_server` handles this defensively for the thread case,
but for `--web-ui-only` mode the caller would need to deal with the error propagation.

Current mitigation: the `OSError` wrapper in `run_server` catches this at runtime for the thread.
For `--web-ui-only`, the `run_server` wrapper also now catches it and prints to stderr, so the
process will exit without crashing. Acceptable for now.

**[Nit] `audit.close()` called before `run_server_in_thread` starts in `--web-ui-only` occupied-port path**

When `--web-ui-only` and port is occupied, we call `audit.close()` then `return 1`. This is
correct — the audit logger was just initialized and nothing was logged — but it could be confusing
to future readers. A comment would clarify intent.

---

### Architectural Notes

- The `is_port_available` function is now a public API of `webui/server.py`. If the server module
is later split, this utility should move to a shared helpers module.
- The pattern "check then proceed" is the right minimal approach here. A PID-file single-instance
guard (FU-P13-T8 optional extension) was deliberately deferred to keep scope minimal.
- The OSError catch inside `run_server` means daemon-thread failures are now logged to stderr
instead of being silently swallowed, which improves observability.

---

### Tests

- 5 new tests added in `TestPortCollisionHandling`
- 4 existing tests updated to mock `is_port_available` for deterministic behavior in environments
where port 8080 is already occupied
- All 323 unit tests pass
- Coverage unaffected (new code is fully exercised by new tests)

---

### Next Steps

1. **Troubleshooting docs** — The BUG-T6 Resolution Path still has one open item:
`[ ] Document stale-process cleanup in troubleshooting`. This is a minor docs task.
Add a new follow-up task if desired.
2. **`SO_REUSEADDR` audit** — Low priority, can be addressed in a future cleanup pass.
3. No blockers. Task is complete and ready for PR merge.

---

### Follow-up Tasks

**FU-BUG-T6-1:** Add stale-process cleanup guidance to troubleshooting docs (Low priority)

No other actionable follow-ups.
6 changes: 5 additions & 1 deletion SPECS/ARCHIVE/INDEX.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# mcpbridge-wrapper Tasks Archive

**Last Updated:** 2026-02-14 (BUG-T5)
**Last Updated:** 2026-02-14 (BUG-T6)

## Archived Tasks

Expand Down Expand Up @@ -86,6 +86,7 @@
| BUG-T2 | [BUG-T2_codex_mcp_add_with_Web_UI_extras_fails_in_zsh/](BUG-T2_codex_mcp_add_with_Web_UI_extras_fails_in_zsh/) | 2026-02-14 | PASS |
| BUG-T3 | [BUG-T3_webui_only_dashboard_mode/](BUG-T3_webui_only_dashboard_mode/) | 2026-02-14 | PASS |
| BUG-T5 | [BUG-T5_Empty-content_tool_results_structuredContent/](BUG-T5_Empty-content_tool_results_structuredContent/) | 2026-02-14 | PASS |
| BUG-T6 | [BUG-T6_WebUI_Port_Collision/](BUG-T6_WebUI_Port_Collision/) | 2026-02-14 | PASS |

## Historical Artifacts

Expand Down Expand Up @@ -137,6 +138,7 @@
| [REVIEW_BUG-T2_zsh_webui_extras.md](BUG-T2_codex_mcp_add_with_Web_UI_extras_fails_in_zsh/REVIEW_BUG-T2_zsh_webui_extras.md) | Review report for BUG-T2 |
| [REVIEW_BUG-T3_webui_only_mode.md](BUG-T3_webui_only_dashboard_mode/REVIEW_BUG-T3_webui_only_mode.md) | Review report for BUG-T3 |
| [REVIEW_BUG-T5_structuredContent_empty_content.md](BUG-T5_Empty-content_tool_results_structuredContent/REVIEW_BUG-T5_structuredContent_empty_content.md) | Review report for BUG-T5 |
| [REVIEW_BUG-T6_port_collision.md](BUG-T6_WebUI_Port_Collision/REVIEW_BUG-T6_port_collision.md) | Review report for BUG-T6 |

## Archive Log

Expand Down Expand Up @@ -228,3 +230,5 @@
| 2026-02-14 | BUG-T3 | Archived REVIEW_BUG-T3_webui_only_mode report |
| 2026-02-14 | BUG-T5 | Archived Empty-content_tool_results_structuredContent (PASS) |
| 2026-02-14 | BUG-T5 | Archived REVIEW_BUG-T5_structuredContent_empty_content report |
| 2026-02-14 | BUG-T6 | Archived WebUI_Port_Collision (PASS) |
| 2026-02-14 | BUG-T6 | Archived REVIEW_BUG-T6_port_collision report |
46 changes: 46 additions & 0 deletions SPECS/INPROGRESS/BUG-T6_Validation_Report.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# BUG-T6 Validation Report

**Task:** BUG-T6 — Web UI port collisions (`--web-ui-port`) create unstable multi-process behavior
**Date:** 2026-02-14
**Verdict:** ✅ PASS

---

## Quality Gates

| Gate | Command | Result |
|------|---------|--------|
| Unit tests | `pytest tests/unit/` | ✅ 323 passed, 0 failed |
| Linting | `ruff check src/` | ✅ All checks passed |
| Type checking | `mypy src/` | ✅ Success: no issues found in 12 source files |

---

## Acceptance Criteria

| # | Criterion | Status | Evidence |
|---|-----------|--------|----------|
| AC1 | When port occupied, wrapper prints warning and continues as MCP-only — no crash | ✅ | `test_occupied_port_in_bridge_mode_skips_webui` passes; `run_server` OSError caught |
| AC2 | MCP stdout remains valid JSON-RPC only | ✅ | Warning printed to `sys.stderr`; stdout unaffected |
| AC3 | `--web-ui-only` mode with occupied port exits with code 1 + clear message | ✅ | `test_occupied_port_in_webui_only_mode_exits_with_error` passes |
| AC4 | Free port: behavior unchanged from pre-fix | ✅ | `test_free_port_starts_webui_normally` passes; all 36 pre-existing tests pass |
| AC5 | Tests cover (a) occupied/bridge, (b) occupied/webui-only, (c) free port, (d) `is_port_available` unit tests | ✅ | 5 new tests in `TestPortCollisionHandling` |
| AC6 | No regressions in existing test suite | ✅ | 323/323 pass |

---

## Changes

### `src/mcpbridge_wrapper/webui/server.py`
- Added `import socket` and `import sys`
- Added `is_port_available(host, port) -> bool` — attempts `socket.bind()` and returns `False` on `OSError`
- Wrapped `uvicorn.run(...)` in `try/except OSError` to catch race-condition bind failures in the daemon thread

### `src/mcpbridge_wrapper/__main__.py`
- Imports `is_port_available` from `webui.server`
- Before `--web-ui-only` server start: check port; if occupied, print error and `return 1`
- Before `run_server_in_thread`: check port; if occupied, print warning and skip Web UI; MCP bridge starts normally

### `tests/unit/test_main_webui.py`
- Added `patch("mcpbridge_wrapper.webui.server.is_port_available", return_value=True)` to 4 existing tests that previously relied on real port availability
- Added `class TestPortCollisionHandling` with 5 new tests
89 changes: 89 additions & 0 deletions SPECS/INPROGRESS/BUG-T6_WebUI_Port_Collision.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# BUG-T6: Web UI Port Collisions Create Unstable Multi-Process Behavior

**Task ID:** BUG-T6
**Type:** Bug / Runtime / Process Lifecycle
**Priority:** P0
**Status:** In Progress
**Implements:** FU-P13-T8
**Date:** 2026-02-14

---

## 1. Problem Statement

When multiple stale/orphan wrapper instances exist (e.g., after a client restarts), they all
attempt to bind to the same Web UI port (default `8080`). The result is a flood of bind errors
logged to stderr that:
1. Clutter the stderr channel used for diagnostic messages
2. Prevent new instances from starting a usable Web UI
3. Create an undefined mix of stale and new listeners on the same host:port

The core issue is that `run_server` / `run_server_in_thread` do not check port availability
before starting, and the caller (`__main__.py`) does not handle the `OSError` that results.

---

## 2. Deliverables

| # | Artifact | Path |
|---|----------|------|
| 1 | Port-check utility function | `src/mcpbridge_wrapper/webui/server.py` |
| 2 | Collision handling in `__main__.py` startup | `src/mcpbridge_wrapper/__main__.py` |
| 3 | Unit tests for collision scenarios | `tests/unit/test_main_webui.py` |
| 4 | Validation report | `SPECS/INPROGRESS/BUG-T6_Validation_Report.md` |

---

## 3. Design

### 3.1 Port availability check

Add a helper `is_port_available(host, port) -> bool` in `server.py` that attempts a
`socket.bind()` and returns `False` if `OSError` is raised (i.e., port occupied).

### 3.2 Startup collision handling in `__main__.py`

Before calling `run_server_in_thread` (or `run_server` in `--web-ui-only` mode):
1. Call `is_port_available(config.host, config.port)`.
2. If the port is **occupied**:
- Print a clear message to stderr:
`Warning: Web UI port {port} is already in use. Skipping Web UI startup.`
- Continue WITHOUT starting the Web UI thread (MCP stdio bridge still starts normally).
- Do NOT exit with an error — the MCP session must not be disrupted.
3. If the port is **free**, proceed as normal.

For `--web-ui-only` mode, when the port is occupied:
- Print the same warning to stderr.
- Exit with code `1` (the user explicitly requested the dashboard; failure is fatal).

### 3.3 Thread-level guard

Wrap `run_server` body in a `try/except OSError` so that a race condition between check and
bind does not produce an unhandled exception in the daemon thread (which would be silently lost):
- Catch `OSError` in `run_server` and log to stderr instead of crashing.

---

## 4. Acceptance Criteria

- [ ] AC1: When the requested Web UI port is occupied, wrapper prints a clear warning to stderr and continues as MCP-only mode — no crash, no unhandled exception.
- [ ] AC2: MCP stdio protocol output (stdout) remains valid JSON-RPC only — no error text leaks to stdout.
- [ ] AC3: In `--web-ui-only` mode, occupied port causes exit code `1` with a clear stderr message.
- [ ] AC4: If the port is free, behavior is unchanged from pre-fix.
- [ ] AC5: Unit tests cover: (a) occupied port in bridge+webui mode, (b) occupied port in webui-only mode, (c) free port normal path.
- [ ] AC6: No regression in existing `test_main_webui.py` or `test_main.py` tests.

---

## 5. Dependencies

- `src/mcpbridge_wrapper/webui/server.py` — add `is_port_available`
- `src/mcpbridge_wrapper/__main__.py` — add collision guard around Web UI startup

---

## 6. Out of Scope

- PID file / single-instance lock (may be a follow-up)
- Auto-incrementing port fallback (YAGNI)
- Killing stale processes automatically (dangerous, out of scope)
Loading