Use eventfd for wakeup and enable IORING_SETUP_SINGLE_ISSUER #166
samuel-williams-shopify wants to merge 1 commit into main from
Conversation
Force-pushed from 07f1949 to dc2593b:
Benchmark results (Linux 6.19, Ruby 3.4.8, dedicated machine)

Methodology: two git worktrees built independently.

Wakeup microbenchmark (…):
| Backend | main | PR | Δ |
|---|---|---|---|
| URing | 9.32 µs | 10.21 µs | +9% |
| EPoll | 11.19 µs | 10.84 µs | noise |
| Select | 21.06 µs | 18.36 µs | noise |
Idle wakeup cost (calling `wakeup()` when the selector is not blocking):
| Backend | main | PR |
|---|---|---|
| URing | 1.54 µs | 1.53 µs |
The ~1 µs roundtrip gap vs NOP is the irreducible cost of the eventfd write going through the kernel and completing a pending async read, rather than a NOP arriving at io_uring_wait_cqe_timeout directly. DEFER_TASKRUN closes most of this gap (was ~2.6 µs without it).
HTTP benchmark (benchmark/server/event.rb, fiber-per-connection)
| | main (NOP) | PR | Δ |
|---|---|---|---|
| Run 1 | 14,854 | 14,821 | |
| Run 2 | 14,085 | 14,592 | |
| Run 3 | 14,171 | 14,240 | |
| Run 4 | 13,774 | 14,768 | |
| Run 5 | 14,057 | 14,544 | |
| Run 6 | 14,717 | 14,627 | |
| Run 7 | 14,965 | 14,566 | |
| Average | 14,375 req/s | 14,594 req/s | +1.5% |
The DEFER_TASKRUN flag benefits the entire completion path, not just wakeup — which is why the HTTP throughput improves even though the pure wakeup latency is similar.
Force-pushed from dc2593b to 8e7e932:
Replace the NOP-SQE cross-thread wakeup with an async read on an IO_Event_Interrupt (eventfd on Linux, pipe elsewhere):

- wakeup() calls IO_Event_Interrupt_signal() — a plain write() that never touches the ring's SQ, making IORING_SETUP_SINGLE_ISSUER safe to use.
- Before each blocking wait the owner thread submits an async read on the interrupt descriptor; the read completes when wakeup() fires, consuming the counter atomically with no separate drain step.
- IORING_SETUP_SINGLE_ISSUER (kernel 6.0+): only the owner thread submits SQEs, allowing the kernel to skip internal SQ locking.
- IORING_SETUP_DEFER_TASKRUN (kernel 6.1+, requires SINGLE_ISSUER): defers io_uring task work to the application thread, reducing cross-CPU signalling overhead across the entire completion path (+~1-2% on HTTP benchmarks).

Both flags are guarded by #ifdef and degrade gracefully on older kernels.

Co-authored-by: Cursor <cursoragent@cursor.com>
Force-pushed from 8e7e932 to c9ba733:
Motivation
The current `wakeup()` implementation submits a NOP SQE from the waking thread directly into the ring's submission queue. This cross-thread SQ access prevents enabling `IORING_SETUP_SINGLE_ISSUER`, a kernel 6.0+ flag that tells the kernel only one thread will ever submit SQEs, allowing it to skip internal SQ locking on every submit call.

Changes
eventfd-based wakeup

Instead of posting a NOP from the waking thread, the selector now:
- Creates an `eventfd(EFD_CLOEXEC | EFD_NONBLOCK)` at initialization.
- Before each blocking `io_uring_wait_cqe_timeout`, the owner thread registers a one-shot `poll_add` on the eventfd (`user_data = NULL`, silently discarded on completion).
- `wakeup()` — called from any thread — just calls `write(wakeup_fd, 1)`. No ring access from the waking thread.
- The write makes the eventfd readable, the pending `poll_add` completes, and `io_uring_wait_cqe_timeout` returns.
- The eventfd counter is drained before re-arming, so the re-registered `poll_add` doesn't fire immediately.

IORING_SETUP_SINGLE_ISSUER

Because the SQ is now only ever touched by the owner thread,
`io_uring_queue_init` is called with `IORING_SETUP_SINGLE_ISSUER` when the flag is available (guarded by `#ifdef`, defined in liburing ≥ 2.3 / kernel ≥ 6.0). The `supported_p` probe uses the same flags.

Benchmark workflow fix
`benchmark.yaml` was missing `liburing-dev`, so benchmarks silently fell back to the epoll backend. Added the apt install step so benchmarks now actually exercise URing.
The existing `wakeup` tests in `test/io/event/selector.rb` cover the correctness of cross-thread wakeup. CI will run them on Ubuntu with `liburing-dev` as usual.

The benchmark workflow (`benchmark.yaml`) will now also run against the URing backend, providing throughput numbers to compare against the `main` branch.

Made with Cursor