osfs: Add experimental WithMmap for read-only Opens#213
Merged
Conversation
pjbgf
reviewed
May 17, 2026
pjbgf
reviewed
May 17, 2026
Pins the concurrency contract on `billy.File.ReadAt`: parallel calls from multiple goroutines on the same handle must be safe, matching the `io.ReaderAt` contract documented in package `io`. The contract has always been inherited via the embedded interface, but only stdlib convention enforced it — backings could ship a `billy.File` whose `ReadAt` serialised reads and still satisfy the interface. Making the contract explicit in the godoc lets consumers performing concurrent random-access reads (e.g. pack readers, idx scanners) rely on a single shared handle without type-asserting against a separate capability interface. Adds `-race` tests in `memfs` and `embedfs` that fan eight workers out across a shared handle, hitting randomised offsets against a known pattern. Both backings already honoured the contract — `memfs/content.ReadAt` reads under a `sync.RWMutex`, and `embedfs` delegates to `bytes.Reader.ReadAt` which is concurrent-safe by design — but the tests pin the behaviour so future wrappers cannot regress it. `osfs` inherits the same property from `*os.File.ReadAt` (`pread` on POSIX, `ReadFile`-with-offset on Windows), exercised by the existing concurrent osfs tests. The `embedfs` fixture set picks up a deterministic 4 KiB `testdata/concurrent.bin` for the test; the existing `TestReadDir` assertion that enumerates the `testdata` directory is updated accordingly. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Adds an opt-in `osfs.WithMmap()` constructor option. On darwin and linux, an `osfs` filesystem built with the option returns a memory- mapped `billy.File` from `Open` (and any `OpenFile` without write flags); on other platforms the option is accepted but has no effect — the fd-backed `*file` wrapper continues to satisfy `billy.File` with the same `*os.File.ReadAt` semantics as before. The mmap-backed file is a new internal `*mmapFile` type. It implements `billy.File` honestly: `Read` and `Seek` track a real cursor over the mapped bytes, `ReadAt` is concurrent-safe via an `RWMutex` that serialises Close against in-flight reads (otherwise munmap would invalidate the mapping under a racing read and the kernel would raise SIGBUS), and `Write`/`WriteAt`/`Truncate` return `os.ErrPermission`. A `runtime.SetFinalizer` is set on construction and cleared on Close so a forgotten Close cannot leak the mapping — the pattern `golang.org/x/exp/mmap` uses, transposed to our handle. Files where mmap is unavailable for benign reasons (size 0, size larger than `math.MaxInt` on a 32-bit platform, kernel rejection for pipes / devices / FS quirks) fall through to the regular `*file` wrapper without surfacing an error. Real failures (stat failing, etc.) are propagated and the underlying fd is closed. The option is marked **experimental** in its godoc. Today every read-only `Open` on a `WithMmap`-configured filesystem becomes mmap-backed; that is intentional but may not be the right granularity in the long run. If real-world feedback shows the per-FS opt-in is too coarse — for example, callers reading both large hot files (where mmap helps) and small config/refs files (where it is neutral or net-negative) under the same filesystem — the option's effect may narrow to per-call. This supersedes earlier iterations of the feature (an extension `billy.ReaderAtFS` interface returning a narrow `ReaderAtCloser` type, with `OpenReaderAt` methods on every backing and forwarder plumbing through `helper/chroot` and `helper/polyfill`). The narrower design did not earn its API surface for what is, in the end, "let `osfs` opt into mmap" — `billy.File` already names the shape of a readable, closable, seekable file, and the concurrency-safe `ReadAt` contract on it (pinned in the previous commit) is sufficient for the use cases the extension interface was meant to serve. Tests parametrise across both backings (`fd`, `mmap`) where the platform offers them. `TestOpenBackingSelection` asserts that `WithMmap` actually flips the concrete type on darwin/linux and that the write-mode case bypasses mmap regardless. Concurrent ReadAt tests run under `-race` to exercise the RWMutex. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pins the
io.ReaderAtconcurrent-safety contract onbilly.File.ReadAtso consumers performing concurrent random-access reads (pack scanners, idx readers, etc.) can rely on a shared handle. Adds-racetests inmemfsandembedfsasserting compliance;osfsalready inherits the property from*os.File.ReadAt.Adds an opt-in
osfs.WithMmap()constructor option. On darwin and linux, opening a file without write flags on aWithMmap-configured filesystem returns a memory-mappedbilly.File; on other platforms the option is accepted but has no effect. The mmap-backed file:Read/Seek;ReadAtconcurrently via anRWMutexthat serialisesCloseagainst in-flight reads (otherwisemunmapwould invalidate the mapping under a racing read);Write/WriteAt/Truncatewithos.ErrPermission, it is read-only by construction;runtime.SetFinalizeron construction and clears it onClose, so a forgottenClosedoesn't leak the mapping;*filewrapper when mmap is unavailable for a specific file (size 0, larger thanmath.MaxInton a 32-bit platform, kernel rejection for pipes/devices/FS quirks).Motivation
Consumers reading large hot files repeatedly (pack/idx/rev scanners in go-git is the motivating example) benefit from an mmap-backed
ReadAtpath: per-call syscall cost goes away, and parallel reads from multiple goroutines run without serialisation. The existing*os.File.ReadAtalready serves correct concurrent random-access reads on darwin/linux; the new path replaces thepread(2)syscall with a slice copy.go-git/go-git#2132(PackHandle refcounted FD lifecycle) consumes this path viafs.Open(path).ReadAt(...)on aWithMmap-configured osfs.Experimental
WithMmapis marked experimental in its godoc. Today every read-onlyOpenon aWithMmap-configured filesystem becomes mmap-backed — including small files (config, refs, hooks) where mmap can be neutral or net-negative. If real-world feedback shows the per-FS opt-in is too coarse, the option's effect may narrow to per-call later.mmap also changes failure semantics:
SIGBUSinstead of returning an error.billy.Synceror theLockerinterface even though the surrounding filesystem advertises both capabilities.Callers that may see the file mutate underneath them should leave the option off.