Add proxy integration tests and missing UTS spec coverage#462

Merged
paddybyers merged 59 commits into
mainfrom
uts-integration-proxy
May 14, 2026

Conversation

@paddybyers
Member

Summary

  • Proxy-based integration test specs: A programmable Go test proxy sits between the SDK and the Ably sandbox, enabling fault injection (drop connections, inject errors, delay responses) for testing connection recovery, heartbeat, auth reauth, presence reentry, and REST fault handling. The proxy is controlled via a REST API on a control port — each test creates a session with declarative rules, then inspects an event log after the test completes.
  • Reorganised UTS docs: Moved guides into docs/, rewrote README as a concise entry point, added writing-derived-tests.md for translating UTS specs to language-specific tests.
  • Added missing unit/integration UTS specs covering 14 spec areas that previously had no test coverage.

Proxy approach

The test proxy (uts/proxy/) is a Go binary that:

  1. Accepts sessions via POST /sessions — each session gets its own listener port and targets a specific upstream (sandbox-realtime/sandbox-rest)
  2. Supports declarative rules that match on message type, action, channel, or HTTP method/path, and that fire actions (drop, delay, inject message, return HTTP error)
  3. Supports imperative actions via POST /sessions/:id/actions for ad-hoc fault injection mid-test
  4. Records all events to a queryable log via GET /sessions/:id/log
  5. Cleans up sessions automatically on DELETE or timeout

SDK test harnesses (e.g. ably-js) build and auto-launch the proxy from their test setup — no manual proxy startup needed.

Proxy integration test specs added

| Spec file | Spec points | Description |
| --- | --- | --- |
| connection_resume.md | RTN15a-j, RTN16d/k/l | Connection resume, recovery key, fatal resume errors |
| channel_faults.md | RTL2f, RTL4f, RTL13b | Channel detach/reattach on transport failure |
| heartbeat.md | RTN23a/b | Heartbeat timeout detection |
| rest_faults.md | RSC15a/d, TO3l8 | REST fallback host selection |
| auth_reauth.md | RSA4b/c/d, RTN22 | Auth token renewal and server-initiated reauth |
| presence_reentry.md | RTP17a-f | Presence auto re-entry after resume |
| connection_open_failures.md | (existing, extended) | Connection open failure scenarios |

Missing unit/integration specs added

| Spec file | Spec points | Description |
| --- | --- | --- |
| backoff_jitter_test.md | RTB1a/b | Backoff coefficient and jitter range |
| token_expiry_non_renewable_test.md | RSA4a1/a2 | Non-renewable token expiry handling |
| auth_callback_errors_test.md | RSA4c/d/e/f | Auth callback error scenarios |
| connection_recovery_test.md | RTN16d-l | Recovery key format and usage |
| forwards_compatibility_test.md | RSF1/RTF1 | Unknown protocol fields/actions |
| network_change_test.md | RTN20a-c | Network change events |
| push_channels.md (unit + integration) | RSH7a-e | Push channel subscriptions |
| channel_subscribe.md (extended) | RTL22/MFI | Message filter subscriptions |
| channel_publish.md (extended) | RTN7e | Publish error matches errorReason |
| rest_channel_attributes.md (extended) | CHD2/CHM2 | All ChannelMetrics fields |

Test plan

  • Verify proxy binary builds: cd uts/proxy && go build -o test-proxy .
  • Review that proxy test specs match features.md requirements
  • Verify completion-status.md reflects all new specs
  • Cross-check with ably-js test implementations (separate PR in ably-js)

🤖 Generated with Claude Code

Contributor

@ttypic ttypic left a comment

lgtm

@lawrence-forooghian
Collaborator

@paddybyers I'm just trying to understand what PRs exist for the UTS stuff and the stacking isn't clear here — what is the uts-integration branch that this PR is targeting? There's no PR for that branch. Is this meant to be stacked on top of uts-realtime (#460)?

@paddybyers paddybyers changed the base branch from uts-integration to uts-realtime May 14, 2026 13:07
@paddybyers
Member Author

> There's no PR for that branch. Is this meant to be stacked on top of uts-realtime

Sorry, my mistake, I've updated the base branch of this PR to uts-realtime.

@paddybyers paddybyers changed the base branch from uts-realtime to main May 14, 2026 20:16
paddybyers and others added 12 commits May 14, 2026 21:17
Add ably-common as a git submodule at submodules/ably-common, pinned to
6ff9a1a. This provides shared test fixtures and protocol definitions
used by the UTS (Universal Test Suite) specs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add portable, language-independent test specifications covering the REST
client: authentication (RSA), channels (RSL/RSP), encoding, batch
publish, pagination, stats, time, fallback hosts, push admin, and type
definitions. Includes a mock HTTP helper spec and a README documenting
the UTS framework.

These specs serve as the source of truth for expected SDK behaviour,
independent of any specific programming language implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add test specs covering connection failures (RTN14/RTN15), open
failures, error reason handling, fallback hosts (RSC15), heartbeats,
update events, whenState helper, and a connection lifecycle integration
test.

Ensure mock WebSocket connections are properly closed in connection
failure and open-failure test specs to prevent resource leaks in tests.

…kill

Separate the mock WebSocket specification into its own file for reuse
across test specs, and add a skill document for writing test specs.

Replace TRY/CATCH error characterisation patterns with declarative
EXPECT_THROW assertions for clearer, more portable test specifications.

…ence

Add UNIQUE_CHANNEL_NAME() calls and randomised channel names throughout
the test specs. Also adds new test specs for channel attach (RTL4),
detach (RTL6), channel options, state events, and channels collection.

Substantially rework the heartbeat test specs for better coverage of
RTN23 (heartbeat monitoring) and extend the mock WebSocket helper with
additional transport simulation capabilities. Update the write-test-spec
skill with improved patterns.

Correct the test approach for RTN15a immediate reconnection behaviour
and update the write-test-spec skill with refined patterns.

Correct small errors in channel_attach and channel_state_events specs.

Add test specs for channel connection state handling, channel error
reporting, server-initiated detach, channel properties (RTL15/RTL16),
connection ID/key (RTN8/RTN9), and connection ping (RTN13). Also adds
a completion-status tracker for spec point coverage.

Add comprehensive test specs covering channel message subscription,
filtering, listener management, and unsubscribe behaviour.
paddybyers and others added 26 commits May 14, 2026 21:17
Apply fixes and implementation notes across 22 realtime UTS spec files
based on findings from translating specs to ably-js tests.

Key changes:
- Fix presence assertions (action == ENTER → PRESENT per RTP2d2)
- Add base64 encoding notes for JSON transport in delta decoding specs
- Fix state change filtering in auth reauth tests (RTC8a/RTC8a1)
- Add connectionStateTtl and disconnectedRetryTimeout to specs that
  transition through SUSPENDED or need fake timer precision
- Add implementation notes for echo suppression, NACK alternatives,
  clientId:* restrictions, and REST API numeric action values
- Add missing test sections (RTN14b, RTL5l, RTL5i notes)
- Relax overly specific error code assertions (RSAN1a3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update RSC10b test spec to explicitly assert that 401 errors unrelated
to token expiry must not trigger the token renewal flow.

New realtime integration test specs (7 files):
- auth/token_renewal: RSC10 token renewal on expired JWT
- auth/token_request: RSA8 createTokenRequest round-trip
- channels/channel_attach: RTL1/RTL3/RTL5d attach, detach, reattach
- channels/channel_publish: RTL6/RSL6a2 publish, extras, encoding
- channels/channel_subscribe: RTL7 subscribe and message receipt
- connection/connection_failures: RTN14g/RTN15h connection failures
- presence/presence_sync: RTP1 presence sync on attach

Also adds integration-testing.md guide covering sandbox provisioning,
test structure, and proxy vs non-proxy test categories.

REST integration spec fixes:
- auth.md: RSA4 invalid credentials returns 401/40400 (key not found)
- presence.md: RSP3a2 fixture data is a string (no encoding field)
- revoke_tokens.md: verify revocation via Realtime disconnect (40141)
  instead of REST request, matching server-side timing behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a Go-based test proxy that intercepts and manipulates WebSocket and
HTTP traffic between SDK clients and the Ably service. Includes:

- Programmable proxy server with rule-based request/response manipulation
- WebSocket and HTTP proxy handlers with protocol-level inspection
- Session management and action framework for fault injection
- Integration test specs for connection failures, resume, heartbeats,
  channel faults, and REST faults using the proxy
- Proxy helper spec for UTS integration test authors
- Updated write-test-spec skill with proxy test patterns

The test was using client-side now() to define time boundaries between
"early" and "late" message batches. This is unreliable because client
and server clocks may differ, and publishes can complete within the
same client-clock millisecond. The test now retrieves server-assigned
timestamps from the messages and derives the boundary from those.
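
One way to derive such a boundary is sketched below. The commit does not say exactly how the boundary is computed, so the rule used here (take the smallest server timestamp in the late batch, and require every early timestamp to be strictly smaller) is an assumption, as are the `Message` type and the `boundary` helper.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a minimal stand-in for an SDK message; only the
// server-assigned timestamp (milliseconds since epoch) matters here.
type Message struct {
	Name      string
	Timestamp int64
}

// boundary returns a timestamp b such that every early message has
// Timestamp < b and every late message has Timestamp >= b. It fails if
// the two batches overlap in server time, in which case no such b exists.
func boundary(early, late []Message) (int64, error) {
	if len(early) == 0 || len(late) == 0 {
		return 0, errors.New("both batches must be non-empty")
	}
	maxEarly := early[0].Timestamp
	for _, m := range early[1:] {
		if m.Timestamp > maxEarly {
			maxEarly = m.Timestamp
		}
	}
	minLate := late[0].Timestamp
	for _, m := range late[1:] {
		if m.Timestamp < minLate {
			minLate = m.Timestamp
		}
	}
	if maxEarly >= minLate {
		return 0, fmt.Errorf("batches overlap: max early %d >= min late %d", maxEarly, minLate)
	}
	return minLate, nil
}

func main() {
	early := []Message{{"e1", 1000}, {"e2", 1004}}
	late := []Message{{"l1", 1250}, {"l2", 1251}}
	b, err := boundary(early, late)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("boundary:", b)
}
```

Because both sides of the comparison come from the server clock, client clock skew and same-millisecond publishes on the client no longer affect the result; a test would still need to space the two batches so that their server timestamps cannot collide.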

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- channel_faults.md: Remove incorrect RTL4h reference; the test that
  replaces ATTACHED with ERROR is RTL14, not RTL4h. Update test name,
  channel name prefix, and rule comment accordingly.

- rest_faults.md: Fix spec point references from RSC15a to RSC15m/REC2c2
  (the correct spec points for fallback host behaviour when the fallback
  domain set is empty). Update test title, comments, and channel names.

- Update proxy binary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New proxy test specs:
- Test 21 (RTN15j): Fatal ERROR mid-session → FAILED state
- Test 22 (RTN15g/g2): connectionStateTtl expiry prevents resume
- Test 23 (RTN19a/a2): Unacked messages resent after resume
- Test 24 (RTL12): ATTACHED with resumed=false → channel UPDATE event
- Test 25 (RTL3d): Channels reattach after connection recovery
- Test 26 (RTN22/RTC8a): Server-initiated re-authentication
- Test 27 (RTP17i/RTP17g): Presence re-enter on non-resumed reattach

Also adds "Writing Proxy Tests" guidance to integration-testing.md
covering late fault injection and two approaches for early faults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move completion-status.md, integration-testing.md, and the two skill
files (write-test-spec, write-derived-tests) into a new uts/docs/
directory. Strip skill frontmatter from the writing guides. Fix stale
integration-proxy/ references to match actual integration/proxy/ layout.
Rewrite README to reflect current spec counts and link to docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New specs:
- RTB1: backoff and jitter for connection retries
- RSA4a: token expiry with non-renewable tokens
- RSA4c/d/f: auth callback error handling
- RTN16: connection recovery (recovery key, msgSerial, channelSerials)
- RTN20: network change events (browser-only)
- RSF1/RTF1: forwards compatibility (unknown fields/actions)
- RSH7: push channel subscriptions (unit + integration)

Extended specs:
- RTL22/MFI: message filter subscriptions (channel_subscribe)
- RTN7e: error reason on publish failure (channel_publish)
- CHD2/CHM2: all ChannelMetrics fields (rest_channel_attributes)
- RTN16d/RTN16l: proxy-based recovery tests (connection_resume)

Updates completion-status.md and README spec counts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…uccess

Tests that a request completing successfully against a cached fallback
host after fallbackRetryTimeout has expired does not re-pin that host.
Uses the existing onRequest handler with a held PendingRequest pattern
rather than introducing a new mock primitive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rame

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Go test proxy has moved to https://github.com/ably/uts-proxy.
Updated references in README, writing guide, and proxy infrastructure
spec to point to the external repo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Per specification#466, RSA4c1 (setting errorReason with code 80019) should
only apply to RSA4c2 (CONNECTING case), not RSA4c3 (CONNECTED case).

When auth fails while CONNECTED, the connection is still healthy — the
existing token is valid. Setting errorReason with no state change is
misleading. The failure will naturally surface when the token expires.

Changes:
- RSA4c3 test now asserts errorReason is null and no events are emitted
- RSA4c1 references replaced with RSA4c2 throughout (error definition
  absorbed into RSA4c2)
- Notes updated to explain the rationale

Ref: #466

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
connectionId is a top-level ProtocolMessage field, not inside
connectionDetails. RTN24's "connectionDetails must override stored
details" does not apply to it — connection.id never changes for an
in-progress connection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TM3 is about fromEncoded/fromEncodedArray, not generic fromJson.
TM4 is about Message constructors, not toJson serialization.
TM5 is about MessageAction enum, not message equality.

Removed toJson tests (no spec point requires a toJson method) and
replaced TM4 section with constructor tests per spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
With X-Ably-Version >= 3, the server returns a BatchResult envelope
({successCount, failureCount, results}) with HTTP 200 for all batch
responses including mixed success/failure. The UTS specs were
incorrectly mocking the legacy format (plain arrays + batchResponse
with HTTP 400) which is only used without a version header.
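
For illustration, a mock response in the envelope format could be decoded as in the sketch below. Only the three top-level field names come from the commit message; the contents of each `results` entry in the sample body are invented for the example, which is why they are kept as raw JSON rather than given a concrete type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BatchResult mirrors the envelope named in the commit message:
// {successCount, failureCount, results}. Per-item shapes are left as
// raw JSON because they are not described here.
type BatchResult struct {
	SuccessCount int               `json:"successCount"`
	FailureCount int               `json:"failureCount"`
	Results      []json.RawMessage `json:"results"`
}

// decodeBatchResult parses an HTTP 200 response body into the envelope.
func decodeBatchResult(body []byte) (BatchResult, error) {
	var br BatchResult
	err := json.Unmarshal(body, &br)
	return br, err
}

func main() {
	// Illustrative mixed success/failure body; the item fields are made up.
	body := []byte(`{
		"successCount": 1,
		"failureCount": 1,
		"results": [
			{"channel": "a"},
			{"channel": "b", "error": {"code": 40160}}
		]
	}`)
	br, err := decodeBatchResult(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(br.SuccessCount, br.FailureCount, len(br.Results))
}
```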

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…5l4)

Proxy-based tests that exercise fallback behaviour through the real HTTP
client: request timeout triggers fallback (RSC15l2), and CloudFront
Server header should trigger fallback (RSC15l4).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds tests for unreachable endpoint, connection drop, 5xx with/without
error body, 4xx not retried, and RSL1k4 idempotent publish (pending
proxy enhancement). Extracts token auth helper used across all tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Format: <category>/<spec-point>/<descriptive-name>-<n>
Categories: rest/unit, rest/integration, rest/proxy,
            realtime/unit, realtime/integration, realtime/proxy

Also updates writing-test-specs.md and writing-derived-tests.md
with the Test ID convention and placement rules.
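
A small checker for this format might look like the sketch below. The category list is taken from the commit message; the character sets allowed in the spec point and descriptive name, the example ID, and the `parseTestID` helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// TestID holds the parts of <category>/<spec-point>/<descriptive-name>-<n>.
type TestID struct {
	Category  string // e.g. "realtime/proxy"
	SpecPoint string
	Name      string
	N         int
}

// testIDPattern encodes the six listed categories. The spec-point and
// name character classes are assumptions, not part of the convention.
var testIDPattern = regexp.MustCompile(
	`^((?:rest|realtime)/(?:unit|integration|proxy))/([A-Za-z0-9]+)/([a-z0-9]+(?:-[a-z0-9]+)*)-(\d+)$`)

// parseTestID splits an ID into its parts or reports why it is invalid.
func parseTestID(id string) (TestID, error) {
	m := testIDPattern.FindStringSubmatch(id)
	if m == nil {
		return TestID{}, fmt.Errorf("invalid test ID: %q", id)
	}
	n, err := strconv.Atoi(m[4])
	if err != nil {
		return TestID{}, fmt.Errorf("invalid sequence number in %q: %w", id, err)
	}
	return TestID{Category: m[1], SpecPoint: m[2], Name: m[3], N: n}, nil
}

func main() {
	// Hypothetical ID, constructed only to exercise the pattern.
	id, err := parseTestID("realtime/proxy/RTN15a/immediate-reconnect-0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", id)
}
```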

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The proxy already supports this action — remove the "proxy limitation"
section and update the test pseudocode to use http_replace_response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Integration tests: endpoint: "sandbox" → "nonprod:sandbox"
- Unit tests: endpoint: "sandbox" → "test" (clearly not a real environment)
- Update host assertions in unit tests to match (test.realtime.ably.net etc.)
- Leave environment: "sandbox" tests unchanged (testing deprecated option)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace deprecated sandbox-rest.ably.io with sandbox.realtime.ably-nonprod.net
in all UTS test spec provisioning blocks and documentation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Establish a convention for running data-path integration tests with both
JSON and msgpack protocols. Each annotated spec gets a Protocol Variants
section and uses PROTOCOL == "msgpack" instead of hardcoded
useBinaryProtocol: false.
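
In a derived test suite, the convention could translate to something like the following sketch, where each data-path test runs once per protocol. The `ClientOptions` type, the `protocols` list, and the `optionsFor` helper are hypothetical stand-ins rather than any SDK's actual API.

```go
package main

import "fmt"

// ClientOptions is a hypothetical stand-in for an SDK's options type.
type ClientOptions struct {
	UseBinaryProtocol bool
}

// protocols lists the variants each annotated data-path test runs under.
var protocols = []string{"json", "msgpack"}

// optionsFor derives the client option from the protocol variant,
// mirroring the PROTOCOL == "msgpack" expression used in the specs.
func optionsFor(protocol string) ClientOptions {
	return ClientOptions{UseBinaryProtocol: protocol == "msgpack"}
}

func main() {
	for _, p := range protocols {
		opts := optionsFor(p)
		fmt.Printf("%s: useBinaryProtocol=%t\n", p, opts.UseBinaryProtocol)
	}
}
```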

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Portable test spec for verifying decode and round-trip of ably-common
msgpack_test_fixtures.json across SDK implementations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@paddybyers paddybyers force-pushed the uts-integration-proxy branch from f355624 to 2e12074 Compare May 14, 2026 20:17
@paddybyers paddybyers merged commit 8de26c9 into main May 14, 2026
2 checks passed
@paddybyers paddybyers deleted the uts-integration-proxy branch May 14, 2026 20:18

3 participants