Skip WAF breaker on recoverable CF challenges; add pacer warm-up#390

Merged: simonsmallchua merged 2 commits into main from work/modest-heisenberg-87c48b on May 21, 2026
@simonsmallchua
@simonsmallchua simonsmallchua commented May 20, 2026

Summary

  • WAF circuit breaker no longer trips on recoverable Cloudflare Cf-Mitigated values (challenge, jschallenge, managed_challenge, rate_limited). Only block (and unknown values, conservatively) still trips. The 403/429 status code still drives pacer back-off, so the job stays alive and the host gets slowed down instead of terminated.
  • New GNH_PACER_WARMUP_DELAY_MS (default 2000) seeds adaptive_delay_ms for never-crawled domains (DB column is NULL). Activates DomainPacer.EffectiveCap from the first dispatch so a fresh sitemap can't burst N requests at a CF-fronted host before the pacer has learned anything. Steps down via the existing success path; never re-applies once write-back has stored a learned value.
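The warm-up seeding described in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code in internal/jobs/stream_worker.go: the helper names parseWarmupDelayMs and seedAdaptiveDelayMs are hypothetical, the fall-back to the default on an invalid value is an assumption, and the learned value is modelled as a nullable integer number of seconds.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultWarmupDelayMs mirrors the default stated in the PR description.
const defaultWarmupDelayMs = 2000

// parseWarmupDelayMs is a hypothetical helper. It interprets the raw value of
// GNH_PACER_WARMUP_DELAY_MS and falls back to the default when the value is
// empty, not an integer, or negative (the fall-back is an assumption).
func parseWarmupDelayMs(raw string) int {
	if raw == "" {
		return defaultWarmupDelayMs
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return defaultWarmupDelayMs
	}
	return n
}

// seedAdaptiveDelayMs is a hypothetical helper. learnedSeconds models the
// nullable domains.adaptive_delay_seconds column: nil stands for NULL, i.e. a
// never-crawled domain, which gets the warm-up delay. A learned value is
// converted to milliseconds and used unchanged.
func seedAdaptiveDelayMs(learnedSeconds *int, warmupMs int) int {
	if learnedSeconds == nil {
		return warmupMs
	}
	return *learnedSeconds * 1000
}

func main() {
	warmup := parseWarmupDelayMs(os.Getenv("GNH_PACER_WARMUP_DELAY_MS"))
	learned := 5
	fmt.Println("never-crawled domain:", seedAdaptiveDelayMs(nil, warmup))
	fmt.Println("learned domain:", seedAdaptiveDelayMs(&learned, warmup))
}
```

Because the warm-up value is applied only when the stored value is NULL, a domain whose learned delay has been written back is never reset to the warm-up value.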

Why

A merrypeople.com run (job e1b38f68-…) took 42h to complete 178/1221 tasks at ~14m/task. Worker logs from a restart showed the WAF breaker tripping after ~10 minutes with vendor=cloudflare reason="cf-mitigated header present on 403" threshold=2. The Shopify storefront sits behind CF and intermittently challenges Fly's egress IPs — the existing detector treated any non-empty Cf-Mitigated on non-200 as a hard block, so two consecutive challenges (amongst many successes) terminated the job.

Test plan

  • go test ./... — full suite passes locally.
  • Restart a known-CF-fronted job (e.g. merrypeople) on main post-merge and verify it runs to completion rather than tripping the breaker.
  • Confirm a first-time domain seeds adaptive_delay_ms=2000 in Redis (HGET hover:dom:cfg:<domain> adaptive_delay_ms) and converges down via the success path.
  • Confirm domains with a learned non-NULL value in domains.adaptive_delay_seconds are unchanged on restart.

Summary by CodeRabbit

  • Bug Fixes

    • Improved Cloudflare WAF circuit breaker handling: certain mitigation responses now recover gracefully instead of triggering the breaker; 403/429 errors continue to drive back-off behavior.
  • New Features

    • Added pacer warm-up delay for domains without prior adaptive delay data, configurable via environment variable (default 2000ms).

@coderabbitai
coderabbitai Bot commented May 20, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: eff8f898-5b3a-4677-8f1d-aebda21d4e62

📥 Commits

Reviewing files that changed from the base of the PR and between e17ef94 and 8a3efb1.

📒 Files selected for processing (1)
  • internal/crawler/waf_test.go
Walkthrough

PR adds two crawler improvements: Cloudflare WAF now treats certain Cf-Mitigated values as recoverable (non-blocking), and pacer initialization applies a configurable warmup delay floor for domains with no prior history, ensuring inflight caps remain active from first dispatch.

Changes

Cloudflare WAF Recoverable Mitigations & Pacer Warmup

Layer / File(s) and summary:

  • Cloudflare WAF recoverable mitigations (internal/crawler/waf.go, internal/crawler/waf_test.go): the cfRecoverableMitigations map defines the header values (challenge, jschallenge, managed_challenge, rate_limited) that do not trigger WAF block verdicts even on non-200 responses. DetectWAF normalizes and checks these values, returning no verdict for recoverable actions and constructing detailed reason strings (cf-mitigated=<value> on <status>) for hard blocks. Test cases are updated to verify recoverable 403/429 cases and hard-block behavior.
  • Pacer warmup initialization floor (internal/jobs/stream_worker.go): the GNH_PACER_WARMUP_DELAY_MS environment variable (default 2000ms, validated as a non-negative integer) configures adaptive delay seeding for domains with no prior crawl history. When adaptive_delay_seconds is NULL during pacer initialization, the warmup delay is applied so that per-domain inflight caps are active from the first dispatch.
  • Release notes documentation (CHANGELOG.md): the Unreleased section gains a Fixed entry clarifying WAF circuit breaker behavior for recoverable Cf-Mitigated values and an Added entry describing the pacer warmup floor for never-crawled domains.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Good-Native/hover#382: Both PRs modify Cf-Mitigated handling in DetectWAF around challenge and 429 responses, with this PR refining verdicts to treat certain mitigations as recoverable while #382 expands blocking logic.
  • Good-Native/hover#383: Both PRs modify pacer adaptive-delay seeding in internal/jobs/stream_worker.go, with this PR adding a warmup floor for never-crawled domains and #383 changing persistence and reload behavior.
  • Good-Native/hover#368: Both PRs update DetectWAF verdict logic and test expectations, with this PR addressing Cloudflare Cf-Mitigated recoverable handling and #368 addressing Akamai Bot Manager blocking.

Poem

🐇 Hops with glee through the detector's gate,
Some challenges now pass—no need to block of late!
New domains warm up before they take flight,
Adaptive delays flowing, keeping caps just right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 33.33%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title directly and accurately summarizes both main changes: WAF circuit breaker behavior for recoverable Cloudflare challenges and the new pacer warm-up feature.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

@supabase
supabase Bot commented May 20, 2026

Updates to Preview Branch (work/modest-heisenberg-87c48b) ↗︎

Deployments Status Updated
Database Wed, 20 May 2026 04:01:30 UTC
Services Wed, 20 May 2026 04:01:30 UTC
APIs Wed, 20 May 2026 04:01:30 UTC

Tasks are run on every commit but only new migration files are pushed.
Close and reopen this PR if you want to apply changes from existing seed or migration files.

Tasks Status Updated
Configurations Wed, 20 May 2026 04:01:32 UTC
Migrations Wed, 20 May 2026 04:01:33 UTC
Seeding Wed, 20 May 2026 04:01:35 UTC
Edge Functions Wed, 20 May 2026 04:01:35 UTC

View logs for this Workflow Run ↗︎.
Learn more about Supabase for Git ↗︎.

@github-actions
github-actions Bot commented May 20, 2026

Release Versions

App patch: v0.34.15 → v0.34.16

Changelog

Fixed

  • WAF circuit breaker no longer trips on recoverable Cloudflare Cf-Mitigated
    values (challenge, jschallenge, managed_challenge, rate_limited); the
    403/429 status code still drives pacer back-off. Only block (and unknown
    values) trips the breaker.

Added

  • Pacer warm-up floor: never-crawled domains seed adaptive_delay_ms to
    GNH_PACER_WARMUP_DELAY_MS (default 2000) instead of 0, so the per-domain
    inflight cap is active from the first dispatch. Steps down via the existing
    success path.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
internal/crawler/waf_test.go (1)

114-154: ⚡ Quick win

Add regression cases for remaining recoverable tokens and normalisation.

Please add table rows for Cf-Mitigated=jschallenge, Cf-Mitigated=rate_limited, and a mixed-case/whitespace variant (for example " Managed_Challenge "). This locks in all branches introduced by the new normalisation/membership logic.

Suggested test additions
 		{
+			name:   "cloudflare — cf-mitigated=jschallenge on 403 is recoverable",
+			status: http.StatusForbidden,
+			headers: http.Header{
+				"Cf-Mitigated": []string{"jschallenge"},
+				"Server":       []string{"cloudflare"},
+			},
+			body:        []byte("challenge page"),
+			wantBlocked: false,
+		},
+		{
+			name:   "cloudflare — cf-mitigated=rate_limited on 429 is recoverable",
+			status: http.StatusTooManyRequests,
+			headers: http.Header{
+				"Cf-Mitigated": []string{"rate_limited"},
+				"Server":       []string{"cloudflare"},
+			},
+			body:        []byte("rate limited"),
+			wantBlocked: false,
+		},
+		{
+			name:   "cloudflare — cf-mitigated normalisation (case/space) is recoverable",
+			status: http.StatusForbidden,
+			headers: http.Header{
+				"Cf-Mitigated": []string{" Managed_Challenge "},
+				"Server":       []string{"cloudflare"},
+			},
+			body:        []byte("checking your browser"),
+			wantBlocked: false,
+		},
+		{
 			name:   "cloudflare — cf-mitigated=block on 403 is a hard block",
 			status: http.StatusForbidden,
 			headers: http.Header{
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/crawler/waf_test.go` around lines 114 - 154, Add three additional
table-driven test cases in the same test slice used in waf_test.go (the table
entries shown) to cover the remaining recoverable tokens and normalization: one
with "Cf-Mitigated" = "jschallenge" and status 403 expecting wantBlocked=false,
one with "Cf-Mitigated" = "rate_limited" and status 429 expecting
wantBlocked=false, and one with a mixed-case/whitespace variant like "
Managed_Challenge " (or "Managed_Challenge" with odd casing/spaces) to assert
the normalization logic treats it as recoverable (wantBlocked=false); for the
hard-block case style tests, also include appropriate
wantVendor=WAFVendorCloudflare and reasonPrefix assertions where applicable to
mirror the existing "block" entry pattern so all branches introduced by the
normalization/membership checks are exercised.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 86410889-8e14-4ca1-b59c-bc527a597055

📥 Commits

Reviewing files that changed from the base of the PR and between 598f771 and e17ef94.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • internal/crawler/waf.go
  • internal/crawler/waf_test.go
  • internal/jobs/stream_worker.go

@codecov
codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 28.57143% with 10 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
internal/jobs/stream_worker.go 0.00% 10 Missing ⚠️

@github-actions
🐝 Review App Deployed

Homepage: https://hover-pr-390.fly.dev
Dashboard: https://hover-pr-390.fly.dev/dashboard

@simonsmallchua simonsmallchua merged commit fcd984c into main May 21, 2026
21 checks passed
@simonsmallchua simonsmallchua deleted the work/modest-heisenberg-87c48b branch May 21, 2026 11:01
simonsmallchua added a commit that referenced this pull request May 21, 2026
Skip WAF breaker on recoverable CF challenges; add pacer warm-up