Skip to content

feat(pinner): add Close with ErrClosed lifecycle#1150

Merged
lidel merged 5 commits into
mainfrom
feat/pinner-close
May 14, 2026
Merged

feat(pinner): add Close with ErrClosed lifecycle#1150
lidel merged 5 commits into
mainfrom
feat/pinner-close

Conversation

@lidel
Copy link
Copy Markdown
Member

@lidel lidel commented Apr 24, 2026

Follow-up to #1146. The Pinner had no lifecycle, so kubo/pebble could close the backing datastore while pinner goroutines were still reading it and hit the use-after-close panic that #1146 papers over with recover().

👉 This PR adds Close() error to the Pinner interface and implements it in dspinner.

Close fires context.AfterFunc on every admitted op's derived ctx, so in-flight Pin, Unpin, Flush, Update, and streaming goroutines that honor ctx bail out promptly with context.Canceled (scalar) or ErrClosed (stream) rather than draining to completion. After Close, every other method fails fast with ErrClosed; streaming methods deliver it as StreamedPin.Err on a single entry then close the channel. Close is idempotent and goroutine-safe.

This means Close returns about as fast as the slowest in-flight op takes to honor its ctx. Hosts with a hard shutdown deadline (e.g. kubo via ipfs/kubo#11329's shutdown.CloseWithCtx) get bounded shutdown for free, and the panic-recovery from #1146 stays as defense in depth.

Breaking change: downstream Pinner implementations must add Close() error. Hosts owning the datastore should call Close on the pinner before closing the datastore.

Kubo PR that uses this:

Pinner gains Close() error. Close waits for every in-flight operation,
including streaming goroutines from RecursiveKeys, DirectKeys, and
InternalPins, to finish. After Close, every other Pinner method fails
fast with a new ErrClosed sentinel; streaming methods surface it as the
Err of a single entry on the returned channel, which is then closed.

dspinner gains the matching implementation: Close is idempotent,
admission is serialised with sync.Mutex + sync.WaitGroup, and stream
sends select on a shutdown channel so a parked consumer cannot stall
Close. The panic-recovery and context guards added in #1146 stay in
place as defence in depth for hosts that do not wire Close correctly.

Hosts that own the backing datastore (e.g. kubo) should call Close on
the pinner before closing the datastore to avoid use-after-close
panics in pebble and similar stores.
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 98.64865% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 63.24%. Comparing base (95ad7a5) to head (121e7ad).

Files with missing lines Patch % Lines
pinning/pinner/dspinner/pin.go 98.64% 0 Missing and 1 partial ⚠️

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1150      +/-   ##
==========================================
+ Coverage   63.11%   63.24%   +0.13%     
==========================================
  Files         267      267              
  Lines       26777    26848      +71     
==========================================
+ Hits        16899    16979      +80     
+ Misses       8158     8152       -6     
+ Partials     1720     1717       -3     
Files with missing lines Coverage Δ
pinning/pinner/pin.go 31.25% <ø> (ø)
pinning/pinner/dspinner/pin.go 68.48% <98.64%> (+4.52%) ⬆️

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lidel lidel marked this pull request as ready for review May 12, 2026 14:41
@lidel lidel requested a review from a team as a code owner May 12, 2026 14:41
fix the errClosedChan rationale (buffering is needed because the
send is synchronous and has no reader; there is no goroutine to
leak) and the InternalPins comment that claimed the call was not
tracked by p.wg despite begin() doing wg.Add.

- changelog: condense the two close bullets, promote the
  downstream-impl break to an action-required callout, add #1150
  refs
- pinner.Pinner.Close / ErrClosed godoc: lead with the load-bearing
  verb, drop restatement
- dspinner: errClosedChan, InternalPins, streamIndex send/recover,
  begin, Close, SetAutosync comments shortened or corrected
lidel added a commit to ipfs/kubo that referenced this pull request May 13, 2026
@lidel lidel marked this pull request as draft May 13, 2026 21:52
Close fires context.AfterFunc on every admitted op's derived ctx, so
in-flight Pin/Unpin/streams that honor ctx bail out promptly instead
of forcing kubo's shutdown to wait or trip the watchdog.

- stopCtx + AfterFunc fan-out replaces the chan-based done signal
- begin returns the derived ctx; admitted methods thread it through
- streamIndex send watches one ctx.Done; Close-driven cancel from
  snapshotIndex surfaces as ErrClosed on the stream
lidel added a commit to ipfs/kubo that referenced this pull request May 13, 2026
ipfs/boxo#1150 was reworked to use context fan-out instead of a done
channel. Pinner.Close now cancels every admitted op and waits for
them to return, broadening the shutdown contract from "drain
streams" to "drain everything". Comments and changelog reworded to
match.
@lidel lidel marked this pull request as ready for review May 13, 2026 22:29
Comment thread pinning/pinner/dspinner/pin_test.go
lidel added 2 commits May 14, 2026 17:26
drop the 20ms/50ms timing in TestCloseWaitsForInFlightOperation and
move it into a synctest bubble (gammazero review). Go 1.26 does not
treat sync.RWMutex.Lock waiting behind a durably-held RLock as
durably blocked, so swap the non-ctx-aware blocker from the pinner's
RWMutex to a tiny ctxIgnoringDAGService whose Add parks on a channel
and ignores ctx. Same behavioral assertion (Close must wait on wg
when the cancellation fan-out cannot unstick the in-flight op),
deterministic and ~50x faster.

also drop the duplicated doc paragraph above the test.
@lidel lidel merged commit b2b5d8a into main May 14, 2026
24 checks passed
lidel added a commit to ipfs/kubo that referenced this pull request May 14, 2026
* feat(pinner): close pinner before repo on shutdown

The pinner's streaming goroutines hold a reference to the backing
datastore, and pebble panics on use after Close. Before this change
the panic was recovered inside the pinner (see ipfs/boxo#1146) and the
symptom was only a transient log trace on daemon exit, but the race
remained.

Register a new fx OnStop hook that calls pinner.Close before the repo
(and therefore the datastore) closes. Close drains all in-flight
stream goroutines, so the datastore is closed only after the pinner
is fully quiesced.

Bumps boxo to pick up Pinner.Close from ipfs/boxo#1150.

Fixes #11292

* chore(deps): bump boxo to ipfs/boxo#1150 (70ffcfa)

* chore(deps): bump boxo to ipfs/boxo#1150 (75481f4)

ipfs/boxo#1150 was reworked to use context fan-out instead of a done
channel. Pinner.Close now cancels every admitted op and waits for
them to return, broadening the shutdown contract from "drain
streams" to "drain everything". Comments and changelog reworded to
match.

* chore(deps): bump boxo to latest main (b2b5d8a)
@lidel lidel deleted the feat/pinner-close branch May 15, 2026 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants