Update testing page — full 4-layer fleet test battery (12 servers, 82 tools)
Added L2 unit test summary (474+ tests), L3 live integration results
(74/74 across 6 public servers), L4 fleet composition results (20/20),
and known findings. Preserved existing L1 security and adif-mcp sections.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
**Every QSO-Graph package ships with automated security tests and must pass an independent security audit before PyPI publication.**
+
**Every QSO-Graph server is tested across four independent layers before release.** Each layer catches different failure modes. All four must pass before a fleet-wide release.
- **Local** — requires local infrastructure (N1MM Logger+, SQLite datasets)
+
- **CI/CD** — tested in GitHub Actions pipeline
---
-
## Security Test Suite (All 11 Packages)
+
## L1: Security Audit
-
Every package includes `test_security.py` with 6 source-code audit tests. These are not runtime tests — they scan all Python source files for forbidden patterns:
+
Every package includes `test_security.py` with 6 source-code audit tests. These scan all Python source files for forbidden patterns — they are not runtime tests.
|S5|`test_error_messages_safe`| Exception messages that could expose credentials |
+
|S6|`test_no_eval_exec`| Any use of `eval()` or `exec()` (code injection) |
-
These tests run in CI on every push and must pass before any PyPI publish.
+
These tests run in CI on every push and must pass before any PyPI publish. If the security gate fails, the publish job is **blocked**. No exceptions.
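As an illustration of what a source-code audit such as `test_no_eval_exec` might do, here is a minimal sketch. It assumes an AST-based scan; the page only says that the tests scan Python source for forbidden patterns, so the approach and the helper names `find_eval_exec_calls` and `scan_package` are hypothetical.

```python
import ast
from pathlib import Path


def find_eval_exec_calls(source: str, filename: str = "<string>") -> list[tuple[str, int, str]]:
    """Return (filename, line, name) for each direct call to eval() or exec()."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in {"eval", "exec"}
        ):
            hits.append((filename, node.lineno, node.func.id))
    return hits


def scan_package(root: Path) -> list[tuple[str, int, str]]:
    """Scan every .py file under root (hypothetical package layout)."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        hits.extend(find_eval_exec_calls(path.read_text(encoding="utf-8"), str(path)))
    return hits
```

A test built on this would assert that `scan_package(...)` returns an empty list for the package's source directory. Parsing the AST rather than grepping avoids false positives from strings and comments that merely mention `eval(`.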
---
-
## CI Security Gate
+
## L2: Unit Tests (Mock Mode)
-
Every package's GitHub Actions publish workflow includes a mandatory security job:
+
Each server supports a mock mode (`{SERVER}_MCP_MOCK=1`) that replaces HTTP calls with embedded test fixtures. L2 tests verify tool logic, parameter handling, return shapes, parser correctness, and helper functions without making any API calls.
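The mock-mode switch could look roughly like the sketch below. Only the `{SERVER}_MCP_MOCK=1` pattern comes from the text above; the variable name `SOLAR_MCP_MOCK`, the `fetch_conditions` helper, and the fixture values are assumptions made for illustration.

```python
import json
import os
import urllib.request

# Hypothetical embedded fixture; the values are placeholders, not real data.
_MOCK_CONDITIONS = {"sfi": 150, "kp": 2}


def mock_enabled(server: str) -> bool:
    """True when {SERVER}_MCP_MOCK=1 is set, e.g. SOLAR_MCP_MOCK=1."""
    return os.environ.get(f"{server.upper()}_MCP_MOCK") == "1"


def fetch_conditions(url: str) -> dict:
    """Return the embedded fixture in mock mode; otherwise make the HTTP call."""
    if mock_enabled("solar"):
        return dict(_MOCK_CONDITIONS)  # copy so callers cannot mutate the fixture
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Because the environment variable is checked at call time rather than at import time, a test can toggle mock mode without reloading the module.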
-
```yaml
-
jobs:
-
  security:
-
    name: Security gate
-
    steps:
-
      - Security tests (pytest test_security.py)
-
      - Static analysis (grep for forbidden patterns)
+
| Category | What's Tested | Example |
+
|----------|---------------|---------|
+
|**Parser/Helper Functions**| ADIF parsing, frequency conversion, date normalization, grid validation |`parse_adif()`, `freq_to_band()`, `to_yyyymmddhhmm()`|
+
|**Tool Return Shapes**| Every tool returns expected fields, types, and structures |`eqsl_inbox()` returns `total`, `records`, `by_band`|
|**Data Models**| Dataclass immutability, field defaults, type conversions |`FetchResult(records=[])` is frozen |
-
  publish:
-
    needs: security  # blocked until security passes
-
    steps:
-
      - Build and publish to PyPI
+
```bash
+
# Run L2 tests for any server (no network needed)
+
cd solar-mcp
+
pytest tests/test_tools.py -v
```
-
If the security gate fails, the publish job is **blocked**. No exceptions.
+
---
+
+
## L3: Live Integration Tests
+
+
L3 tests hit real APIs with known-good reference values. They verify that external services are responding correctly and that our client code handles real-world responses.
+
+
Tests are gated behind a `--live` flag and skipped by default. This keeps CI fast and avoids hammering volunteer-run services.
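One common way to gate tests behind a command-line flag in pytest is a `conftest.py` along these lines. This is a sketch, not the project's actual configuration: the `live` marker name and the skip reason are assumptions, and only the `--live` flag itself comes from the text above.

```python
# conftest.py (sketch)


def pytest_addoption(parser):
    # Register the opt-in flag; without it, live tests are skipped.
    parser.addoption(
        "--live",
        action="store_true",
        default=False,
        help="run tests that call real external APIs",
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "live: test calls a real external API")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--live"):
        return  # flag given: leave live tests enabled
    import pytest  # local import; pytest is always present when these hooks run

    skip_live = pytest.mark.skip(reason="needs --live to run")
    for item in items:
        if "live" in item.keywords:
            item.add_marker(skip_live)
```

With this in place, a test decorated with `@pytest.mark.live` is collected but skipped unless `pytest --live` is used.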
WSPR and HamQTH L3 tests include a 1-second pause between requests to respect volunteer-run services. Tests take longer but avoid API bans.
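The pause between requests can be sketched as a small throttle helper. The `Throttle` class below is hypothetical and not taken from the test suite; it enforces a minimum interval and accepts an injectable clock and sleep function so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous request was released

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

A live test would call `throttle.wait()` before each request. Sleeping only for the remaining part of the interval keeps the suite from being slower than it needs to be when the requests themselves take time.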
+
+
---
+
+
## L4: Fleet Composition Tests
+
+
L4 tests verify that all 12 servers work correctly when loaded together. They import every server's MCP object, enumerate all tools, and check for cross-server conflicts.
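The cross-server conflict check can be sketched as follows. This version assumes the tool names have already been collected into a plain mapping of server name to tool names; how each server's MCP object is imported and enumerated is not shown here. `find_collisions` and the `eqsl-mcp` server name are hypothetical.

```python
from collections import defaultdict


def find_collisions(
    tools_by_server: dict[str, list[str]],
    allowed: frozenset[str] = frozenset(),
) -> dict[str, list[str]]:
    """Map each tool name defined by more than one server to its owners.

    Names listed in `allowed` are documented collisions and are not reported.
    """
    owners = defaultdict(list)
    for server, tools in tools_by_server.items():
        for tool in tools:
            owners[tool].append(server)
    return {
        tool: sorted(servers)
        for tool, servers in owners.items()
        if len(servers) > 1 and tool not in allowed
    }


fleet = {
    "solar-mcp": ["solar_conditions"],
    "ionis-mcp": ["solar_conditions"],
    "eqsl-mcp": ["eqsl_inbox"],  # hypothetical server name
}
unexpected = find_collisions(fleet, allowed=frozenset({"solar_conditions"}))
```

An explicit allow-list lets a uniqueness test distinguish documented collisions from unexpected ones.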
+
+
| Category | Tests | What's Verified |
+
|----------|:-----:|-----------------|
+
|**F1: Tool Name Uniqueness**| 5 | No unexpected name collisions, snake_case convention, server namespacing, tool counts |
|`solar_conditions` name collision | Documented | Exists in both solar-mcp (live NOAA) and ionis-mcp (historical SQLite). MCP clients disambiguate by server prefix. |
+
| Null defaults from `Optional` params | Tracked | FastMCP generates `{"default": null}` from Python `Optional[str] = None`. Valid JSON Schema but may affect some local LLM tool parsers. |
+
| Band parameter type split | By design | qso-graph servers use string band names (`"20M"`), ionis-mcp uses integer ADIF band IDs (`107`). |
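To see which parameters are affected by the null-default finding, a helper can walk a tool's input schema and list the properties whose default is `null`. The schema shape below (an `anyOf` with a `null` branch plus `"default": null`) is an assumption about what the generator emits for `Optional[str] = None`, and `params_with_null_default` is a hypothetical helper.

```python
def params_with_null_default(input_schema: dict) -> list[str]:
    """Return the names of properties whose JSON Schema default is null."""
    props = input_schema.get("properties", {})
    return sorted(
        name
        for name, spec in props.items()
        if "default" in spec and spec["default"] is None
    )


# Assumed schema shape for one required and one optional string parameter.
example_schema = {
    "type": "object",
    "properties": {
        "callsign": {"type": "string"},
        "band": {
            "anyOf": [{"type": "string"}, {"type": "null"}],
            "default": None,
        },
    },
    "required": ["callsign"],
}
affected = params_with_null_default(example_schema)
```

Running a helper like this over every tool would show how many parameters the finding covers.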
+
+
```bash
+
# Run L4 fleet tests (all 12 servers must be installed)
adif-mcp is the foundation package. Beyond the standard 6 security tests, it carries a comprehensive validation test suite against the ADIF 3.1.6 specification:
+
adif-mcp is the foundation package. Beyond the standard security and unit tests, it carries a comprehensive validation suite against the ADIF 3.1.6 specification:
### Test Matrix — 48/48 PASS
@@ -114,66 +181,49 @@ The gold standard for ADIF validation. The [official test file](https://adif.org
| FRN-011 | SUBMODE without MODE field | eQSL — incomplete records | Graceful handling of missing parent field |
| FRN-012 | EQSL_AG=Y (Authenticity Guaranteed) | eQSL — AG status for DXCC | 3-value enum critical for DXCC credit eligibility |
-
### Enumeration Coverage
-
-
adif-mcp v1.0.0 validates all 26 ADIF 3.1.6 enumerations across 43 enum-typed fields:
-
-
| Enumeration | Records | Import-Only | Fields Using It |
- **Compound CreditList**: `CREDIT_SUBMITTED=DXCC:CARD&LOTW` — split on comma, validate credit name against Credit enum, validate each medium against QSL_Medium enum
-
- **Conditional Submode**: `SUBMODE=USB` checks membership in Submode enum, then warns if parent mode (SSB) doesn't match the record's MODE field
-
- **Import-only detection**: Deprecated values produce warnings, not errors — historical QSO data is preserved
-
- **Empty value rejection**: Empty or whitespace-only values for enum fields produce errors