Skip to content

fix(threadpool): per-task nursery scope for sync tasks — snapshot UAF (no scope-pointer swap)#160

Open
EdmondDantes wants to merge 2 commits into
mainfrom
windows-threadpool-snapshot-uaf-clean
Open

fix(threadpool): per-task nursery scope for sync tasks — snapshot UAF (no scope-pointer swap)#160
EdmondDantes wants to merge 2 commits into
mainfrom
windows-threadpool-snapshot-uaf-clean

Conversation

@EdmondDantes
Copy link
Copy Markdown
Contributor

Problem

A synchronous ThreadPool task (new ThreadPool(N), i.e. coroutine_mode = false) ran its body inline in the worker's main coroutine and freed the per-task snapshot arena the instant the body returned. If the body had Async\spawn()ed a coroutine it never awaited, that coroutine was still pending in the scheduler — and its op_array lives in the just-freed arena. Running it later dereferenced freed memory: hard crash on the Windows debug heap, latent + ASAN-caught on Linux.

Approach (supersedes #159)

#159 fixed this by temporarily swapping the worker coroutine's scope pointer to redirect Async\spawn() into a per-task nursery, plus a hand-rolled cancel/await/pin dance. This PR removes that swap entirely.

The task body now runs as a real coroutine in its own per-task nursery Scope. Because CURRENT SCOPE follows the running coroutine, Async\spawn() inside the body lands in that scope on its own — no coro->scope hijacking. On task exit the worker cancels the scope and drains it — awaiting the physical disposal of every spawned coroutine — before freeing the snapshot.

Invariant: no coroutine spawned by the task outlives the snapshot. (Whether such a coroutine got to run at all is timing-dependent and intentionally not asserted — the only guarantee is no use-after-free.)

The drain reuses the canonical zombie-aware logic of Scope::awaitAfterCancellation instead of re-implementing it: its body is factored into a C function exposed on the engine API.

Changes

  • php-src (ABI v0.20.0, already on true-async-stable): new zend_async_scope_await_after_cancellation_fn + ZEND_ASYNC_SCOPE_AWAIT_AFTER_CANCELLATION macro. Scope::awaitAfterCancellation is now a thin wrapper over the shared C core.
  • ext/async: sync task path in thread_pool.c rewritten (NEW_SCOPE nursery → SPAWN_WITH → await body → cancel → drain → free snapshot); scope.c/.h factor + register the C core; async_API.c registration. pool_task_dispose/coroutine-mode path untouched.

Tests

tests/thread_pool/065-task_scope_nursery_no_uaf.phpt (spawned-not-awaited coroutine still pending at task return). Full tests/thread_pool/ 66/66 (+1 pre-existing XFAIL) and scope/task_group/task_set/coroutine/spawn/await/future/pool/channel (≈480) green under ASAN+ZTS.

Requires php-src true-async-stable at ABI v0.20.0 (pushed first so this PR's CI clones it).

…shot UAF)

A synchronous ThreadPool task ran its body inline in the worker and freed
the per-task snapshot arena the moment the body returned — while a coroutine
the body spawned but never awaited was still pending. That coroutine's
op_array lived in the freed arena (Windows debug-heap crash; ASAN-caught on
Linux).

Run the body as a coroutine in its own per-task nursery Scope instead:
Async\spawn() inside the body lands in that scope on its own (no coro->scope
hijacking). On task exit the worker cancels the scope and drains it via the
new ZEND_ASYNC_SCOPE_AWAIT_AFTER_CANCELLATION — awaiting physical disposal of
every spawned coroutine — before freeing the snapshot. Invariant: no spawned
coroutine outlives the snapshot.

Requires php-src ABI v0.20.0 (zend_async_scope_await_after_cancellation_fn).
Regression test tests/thread_pool/065-task_scope_nursery_no_uaf.phpt.
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Running the body as a coroutine meant a fatal (e.g. OOM) re-raised
zend_bailout() out of the task coroutine and longjmped to the worker's
bailout handler, past the future resolution — and that handler only drains
the channel, not the already-dequeued in-flight task, so the awaiter hung.
Track the in-flight task's future in a volatile and reject it from the
bailout handler (ThreadTransferException), restoring the prior behaviour.

Also trim the over-verbose comments from the previous commit.

Test tests/thread_pool/066-task_fatal_rejects_future.phpt.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant