fix(fedex): session warm-up + JSON API interception to fix tracking failures #25
Merged
…ot detection

FedEx tracking was failing with TargetClosedError during wait_for_selector. The tracking page is fronted by a bot protection service that terminates the browser context before rendering when hit cold.

Two-part fix:
- Visit the FedEx homepage (domcontentloaded) before the tracking URL to establish a real-looking session with cookies/TLS fingerprint
- Intercept the internal /trackingCal/track JSON API response fired during page navigation and parse it directly, skipping wait_for_selector entirely
- HTML parsing retained as a fallback if the JSON response is not captured

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
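The two steps in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from `scraper/carriers/fedex.py`: the function name `open_tracking_page`, the `https://www.` prefix on the homepage URL, and the `tracking_url` parameter are assumptions, and `page` is assumed to behave like a Playwright async `Page`.

```python
# Assumed full URL; the PR only names fedex.com/en-us/home.html.
FEDEX_HOME = "https://www.fedex.com/en-us/home.html"


async def open_tracking_page(page, tracking_url: str) -> list:
    """Warm up on the homepage, then load the tracking page.

    `page` is assumed to behave like a Playwright async Page.
    Returns the matching API responses seen during the tracking navigation.
    """
    captured = []

    def on_response(response) -> None:
        # URL fragment taken from this commit message.
        if "/trackingCal/track" in response.url:
            captured.append(response)

    # Step 1: warm-up visit so the session already has cookies
    # when the tracking page is requested.
    await page.goto(FEDEX_HOME, wait_until="domcontentloaded")

    # Step 2: start listening, then navigate to the tracking page.
    page.on("response", on_response)
    await page.goto(tracking_url)
    return captured
```

Registering the listener before the second `goto` means a response that arrives mid-navigation is still recorded. The caller can fall back to HTML parsing when the returned list is empty.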
Three fixes discovered during live testing:

1. Tracking API is at api.fedex.com/track/v2/shipments (not /trackingCal/track): update URL filter to match "api.fedex.com/track/" instead of "trackingCal"
2. API response JSON uses output.packages[0] structure with mainStatus/estDeliveryDt/scanEventList fields: rewrite _parse_tracking_json accordingly
3. Race condition: API response fires DURING wait_for_selector (not during goto), so using wait_for_selector or page.wait_for_response as a blocking call misses it. Fix: use asyncio.Event set by _on_response; asyncio.wait_for races the event against a 45s timeout, then parses JSON if captured.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
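The event-based wait and the revised JSON shape described in this commit might look roughly like the sketch below. It is illustrative only: the helper names (`attach_tracking_listener`, `await_tracking_json`, `parse_tracking_json`) and the keys of the returned dict are assumptions, `page` is assumed to behave like a Playwright async `Page`, and the contents of `scanEventList` entries are not specified in the PR, so they are passed through untouched.

```python
import asyncio
from typing import Optional

API_URL_FRAGMENT = "api.fedex.com/track/"  # URL filter named in the commit message


def parse_tracking_json(payload: dict) -> Optional[dict]:
    """Extract the fields named in the commit message from output.packages[0]."""
    packages = (payload.get("output") or {}).get("packages") or []
    if not packages:
        return None
    pkg = packages[0]
    return {
        # Output key names are hypothetical; source field names come from the PR.
        "status": pkg.get("mainStatus"),
        "estimated_delivery": pkg.get("estDeliveryDt"),
        "events": pkg.get("scanEventList") or [],
    }


def attach_tracking_listener(page) -> tuple:
    """Register a response listener; call this before navigating."""
    captured = {}
    got_response = asyncio.Event()

    def on_response(response) -> None:
        # Fires whenever a response arrives, regardless of what the
        # caller happens to be awaiting at that moment.
        if API_URL_FRAGMENT in response.url and not got_response.is_set():
            captured["response"] = response
            got_response.set()

    page.on("response", on_response)
    return got_response, captured


async def await_tracking_json(got_response, captured, timeout: float = 45.0):
    """Race the event against a timeout; None means 'fall back to HTML parsing'."""
    try:
        await asyncio.wait_for(got_response.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        return None
    return parse_tracking_json(await captured["response"].json())
```

Because the listener only sets an event, it does not matter whether the response arrives during `goto` or some time later. The caller attaches the listener, navigates, and then awaits `await_tracking_json`.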
Summary

FedEx tracking was failing with `TargetClosedError` during `page.wait_for_selector`: the browser context was being terminated before the page could render when hit cold.

Two-part fix in `scraper/carriers/fedex.py`:

- Visit `fedex.com/en-us/home.html` (`domcontentloaded`) before navigating to the tracking URL, establishing a realistic session with cookies and TLS fingerprint.
- The tracking page calls the internal `/trackingCal/track` endpoint to load tracking data. We now intercept that response during `page.goto()` and parse the JSON directly, skipping `wait_for_selector` entirely (the source of `TargetClosedError`). HTML parsing is retained as a fallback if the JSON response isn't captured.

Also adds `_parse_tracking_json()` to handle the intercepted JSON structure (`TrackPackagesResponse.packageList[0]`).

Test plan

- `make test-scraper`: all 181 tests pass
- `TestParseTrackingJson`: 7 new unit tests for JSON parsing (delivered, in-transit, edge cases)
- `TestAsyncTrack` updated: verifies two `goto` calls, JSON path skips `wait_for_selector`, HTML fallback still works
- `POST /api/packages/{number}/refresh`: confirm non-unknown status returned

🤖 Generated with Claude Code