Releases: Good-Native/hover
v0.34.17
Fixed
- WAF circuit breaker no longer trips on recoverable Cloudflare `Cf-Mitigated`
values (`challenge`, `jschallenge`, `managed_challenge`, `rate_limited`); the
403/429 status code still drives pacer back-off. Only `block` (and unknown
values) trips the breaker.
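The classification above can be sketched as a small predicate. This is an illustration only: `tripsBreaker` is a hypothetical name, and the case-insensitive matching and the handling of an absent header are assumptions, not behaviour confirmed by the release note.

```go
package main

import (
	"fmt"
	"strings"
)

// tripsBreaker reports whether a Cf-Mitigated header value should trip the
// WAF circuit breaker. Hypothetical helper; only the list of recoverable
// values is taken from the release note.
func tripsBreaker(cfMitigated string) bool {
	switch strings.ToLower(strings.TrimSpace(cfMitigated)) {
	case "":
		// Assumption: no header means no Cloudflare mitigation verdict.
		return false
	case "challenge", "jschallenge", "managed_challenge", "rate_limited":
		// Recoverable: the 403/429 status still drives pacer back-off.
		return false
	default:
		// "block" and any value not recognised above trip the breaker.
		return true
	}
}

func main() {
	for _, v := range []string{"managed_challenge", "rate_limited", "block", "something_new"} {
		fmt.Printf("%-18s trips=%v\n", v, tripsBreaker(v))
	}
}
```

Treating unknown values as tripping keeps the breaker conservative if Cloudflare introduces a new mitigation value.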
Added
- Pacer warm-up floor: never-crawled domains seed `adaptive_delay_ms` to
`GNH_PACER_WARMUP_DELAY_MS` (default 2000) instead of 0, so the per-domain
inflight cap is active from the first dispatch. Steps down via the existing
success path.
v0.34.16
Changed
- Retired the legacy `task-html` Supabase Storage bucket. Page HTML has been
written directly to Cloudflare R2 since 2026-04-25, so the bucket was no
longer referenced by any code path but had retained the objects written during
the four-week window when it was the hot store. The accumulated bytes pushed
the Supabase project past its 100 GB allowance and triggered connection-slot
restrictions on the pooler, surfacing as `pgconn.ConnectError` events in
Sentry (HOVER-JG). The migration drops only the service-role RLS policy on
`storage.objects`. Removal of the bucket row itself cannot be done via SQL
(Supabase blocks direct deletes from `storage.buckets` with SQLSTATE 42501)
and must be performed via the Supabase Storage dashboard or API as a manual
operational step, after the bucket has been emptied.
- Cleared dangling `task-html` pointers on the `tasks` table. Rows written
between 2026-03-21 and 2026-04-25 had `html_storage_bucket = 'task-html'` and
a `html_storage_path` referencing the now-removed bucket. Both columns are
NULLed for those rows; the remaining HTML metadata columns
(`html_content_type`, `html_content_encoding`, `html_size_bytes`,
`html_compressed_size_bytes`, `html_sha256`, `html_captured_at`) are kept for
historical analysis. The `html_storage_*` columns remain in active use for
newer rows, which point at the Cloudflare R2 bucket.
v0.34.15
Fixed
- `fly-autoscaler` no longer logs
`metrics collection failed: empty prometheus result` once a minute on both
`hover-autoscaler-worker` and `hover-autoscaler-analysis`. The broker gauges
(`bee_broker_stream_length`, `bee_broker_scheduled_zset_depth`) are
synchronous OTel `Int64Gauge`s, which only emit when `Record()` lands inside a
collect interval; during idle the series goes stale in Fly's managed
Prometheus and the autoscaler's PromQL returns no result. The autoscaler
queries now wrap with `or on() vector(0)` so an empty result collapses to zero
rather than erroring. Scaling behaviour is unchanged at idle (the existing
`max(1, …)` floor already kept a single machine running). Trade-off documented
inline: a true Redis outage now reads `0` instead of producing a series gap,
so the autoscaler scales to `MIN=1` rather than holding count. This is
acceptable because idle workers can't crawl during an outage anyway and
restart cleanly once Redis recovers. The full fix (async observable gauges) is
tracked in a follow-up issue.
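As a rough sketch of the query wrapping described above, the helper below appends the `or on() vector(0)` fallback to an arbitrary PromQL expression. `withZeroFallback` and the example query are hypothetical; the autoscaler's real query text is not shown in this release note.

```go
package main

import "fmt"

// withZeroFallback wraps a PromQL expression so that an empty result
// collapses to a single sample with value 0 instead of "no data".
// Hypothetical helper for illustration.
func withZeroFallback(query string) string {
	// Parenthesise the original expression so the fallback applies to the
	// whole query rather than to its last operand.
	return "(" + query + ") or on() vector(0)"
}

func main() {
	// Assumed example query; only the metric name comes from the release note.
	fmt.Println(withZeroFallback("max(bee_broker_stream_length)"))
}
```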
Security
- Bump `github.com/jackc/pgx/v5` from v5.7.6 to v5.9.2 to resolve a
memory-safety vulnerability (Dependabot alert #54).
- Bump `@webflow/webflow-cli` from ^1.12.4 to ^1.21.0 in
`webflow-designer-extension-cli/` to clear transitive dev-dep vulnerabilities
(axios, follow-redirects, fast-uri, babel, postcss). Webflow extension is
dev-only tooling and does not ship to production.
v0.34.14
Security
- Bump `github.com/jackc/pgx/v5` from v5.7.6 to v5.9.2 to resolve a
memory-safety vulnerability (Dependabot alert #54).
- Bump `@webflow/webflow-cli` from ^1.12.4 to ^1.21.0 in
`webflow-designer-extension-cli/` to clear transitive dev-dep vulnerabilities
(axios, follow-redirects, fast-uri, babel, postcss). Webflow extension is
dev-only tooling and does not ship to production.
v0.34.13
Fixed
- App, worker, and analysis binaries no longer `Fatal` on the first Redis `PING`
failure at startup. The ping is now wrapped in a bounded retry loop (30 s
total, 3 s per attempt, capped exponential backoff) so the binary rides out
the Upstash-on-Fly cold-start window that briefly closes connections with EOF
on freshly-provisioned review apps. Production behaviour is unchanged: a
healthy Redis still succeeds on the first attempt and persistent
misconfiguration still fails fast. Resolves the recurring EOF burst on every
PR preview deploy (Sentry: HOVER-JX, HOVER-MD, HOVER-JZ).
v0.34.12
Changed
- `JobManager.GetRobotsRules` now caches results per normalised domain (1h
positive TTL, 60s negative TTL), and collapses concurrent misses onto a single
origin fetch via singleflight. A long crawl previously refetched `/robots.txt`
every five minutes (stream worker's job-info TTL) and a 429 on `/robots.txt`
returned on the next read; both are now bounded.
v0.34.11
Changed
- Crawler user agent is now always exactly `config.UserAgent`. Dropped the dead
`Worker-<id>` suffix branch in `crawler.New` along with the unused variadic ID
parameter and struct field.
v0.34.10
Changed
- Pacer's per-domain adaptive delay is now durable:
`domains.adaptive_delay_seconds` is read on every job-info cache miss and
reseeded into Redis, and the learned value is written back from the pacer's
success/rate-limit path (debounced per domain at five-minute intervals). The
startup `FlushAdaptiveDelays` is now opt-in via
`GNH_PACER_FLUSH_ON_START=true` for incident recovery; default behaviour
preserves the learned rate across worker restarts.
- Dispatcher now caps per-domain inflight tasks at
`ceil(GNH_PACER_EST_RESPONSE_MS / adaptive_delay_ms)` (default response
estimate 1500ms). Above the cap, additional entries skip dispatch without
consuming the gate, preventing the burst-then-collapse pattern that elevates
egress IP reputation on CF-fronted domains.
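The cap formula is easy to check with integer arithmetic. The sketch below is illustrative: `inflightCap` is a hypothetical name, and returning `ok == false` when no positive delay is known is an assumption about how the "no cap" case might be represented.

```go
package main

import "fmt"

// inflightCap computes ceil(estResponseMs / adaptiveDelayMs) without
// floating point. ok is false when either input is non-positive, in which
// case no cap can be derived (assumption for this sketch).
func inflightCap(estResponseMs, adaptiveDelayMs int) (limit int, ok bool) {
	if estResponseMs <= 0 || adaptiveDelayMs <= 0 {
		return 0, false
	}
	return (estResponseMs + adaptiveDelayMs - 1) / adaptiveDelayMs, true
}

func main() {
	const estResponseMs = 1500 // default response estimate from the release note
	for _, delay := range []int{2000, 500, 400, 0} {
		limit, ok := inflightCap(estResponseMs, delay)
		fmt.Printf("delay=%dms cap=%d ok=%v\n", delay, limit, ok)
	}
}
```

With the 1500 ms estimate, a 2000 ms delay yields a cap of 1, 500 ms yields 3, and 400 ms yields 4 (3.75 rounded up).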
v0.34.9
Fixed
- WAF detection now recognises Cloudflare managed challenges served as HTTP 429
with `Cf-Mitigated: challenge`. Previously the verdict was gated behind status
403 or 202 only, so CF challenge responses (observed against CF-fronted
Shopify storefronts with Super Bot Fight Mode enabled) were misclassified as
plain "Too Many Requests" and jobs burnt three retries before failing with a
misleading error. Jobs now fail fast and stamp `domains.waf_blocked = true`
with `waf_vendor = cloudflare`.
v0.34.8
Security
- Enabled RLS (no policies) and revoked `anon`/`authenticated` grants on
`task_outbox`, `task_outbox_dead`, and `lighthouse_runs`; these tables are
only accessed by the Go server via the service role.
- Switched the `organisation_quota_status` view to `security_invoker = true` so
it honours the caller's RLS rather than the creator's.
- Revoked `anon`/`authenticated` `EXECUTE` on 19 server-internal
`SECURITY DEFINER` functions (OAuth token store/get/delete for Google
Analytics, Slack, and Webflow; vault cleanup helpers; Slack user-link helpers;
`increment_daily_usage`). These RPCs are only called by the Go server via the
service role; the three RLS-helper functions used inside policies
(`user_is_member_of`, `user_organisation_id`, `user_organisations`) remain
callable.
Performance
- Rewrote 14 RLS policies on `notifications`, `daily_usage`,
`google_analytics_connections`, `google_analytics_accounts`, and
`organisation_domains` to wrap `auth.uid()` in a `(select …)` so it is
evaluated once per query instead of once per row.
- Scoped the `Service role can manage usage` policy on `daily_usage`
`TO service_role` so it no longer fires during anon/authenticated SELECTs,
removing the multiple-permissive-policies overhead.
- Pinned `search_path` on `update_job_queue_counters` and
`get_daily_quota_remaining`.
- Added covering indexes on nine previously-unindexed foreign keys
(`google_analytics_accounts.installing_user_id`,
`google_analytics_connections.installing_user_id`,
`lighthouse_runs.source_task_id`, `organisation_invites.created_by`,
`page_analytics.ga_connection_id`, `platform_org_mappings.created_by`,
`slack_connections.installing_user_id`, `task_outbox_dead.lighthouse_run_id`,
`webflow_connections.installing_user_id`) so cascade deletes and FK joins no
longer fall back to sequential scans.
Documentation
- Added `docs/security/SUPABASE_ADVISORS.md`
recording the deliberate "won't fix" advisor findings (the three RLS-helper
`SECURITY DEFINER` functions, the empty-policy state of `domain_hosts`) and
deferred items (unused indexes, Auth DB connection strategy).