fix(threadpool): per-task nursery scope for sync tasks — snapshot UAF (no scope-pointer swap)#160
Open
EdmondDantes wants to merge 2 commits into
Open
fix(threadpool): per-task nursery scope for sync tasks — snapshot UAF (no scope-pointer swap)#160EdmondDantes wants to merge 2 commits into
EdmondDantes wants to merge 2 commits into
Conversation
…shot UAF) A synchronous ThreadPool task ran its body inline in the worker and freed the per-task snapshot arena the moment the body returned — while a coroutine the body spawned but never awaited was still pending. That coroutine's op_array lived in the freed arena (Windows debug-heap crash; ASAN-caught on Linux). Run the body as a coroutine in its own per-task nursery Scope instead: Async\spawn() inside the body lands in that scope on its own (no coro->scope hijacking). On task exit the worker cancels the scope and drains it via the new ZEND_ASYNC_SCOPE_AWAIT_AFTER_CANCELLATION — awaiting physical disposal of every spawned coroutine — before freeing the snapshot. Invariant: no spawned coroutine outlives the snapshot. Requires php-src ABI v0.20.0 (zend_async_scope_await_after_cancellation_fn). Regression test tests/thread_pool/065-task_scope_nursery_no_uaf.phpt.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Running the body as a coroutine meant a fatal (e.g. OOM) re-raised zend_bailout() out of the task coroutine and longjmped to the worker's bailout handler, past the future resolution — and that handler only drains the channel, not the already-dequeued in-flight task, so the awaiter hung. Track the in-flight task's future in a volatile and reject it from the bailout handler (ThreadTransferException), restoring the prior behaviour. Also trim the over-verbose comments from the previous commit. Test tests/thread_pool/066-task_fatal_rejects_future.phpt.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
A synchronous
ThreadPooltask (new ThreadPool(N), i.e.coroutine_mode = false) ran its body inline in the worker's main coroutine and freed the per-task snapshot arena the instant the body returned. If the body hadAsync\spawn()ed a coroutine it never awaited, that coroutine was still pending in the scheduler — and itsop_arraylives in the just-freed arena. Running it later dereferenced freed memory: hard crash on the Windows debug heap, latent + ASAN-caught on Linux.Approach (supersedes #159)
#159 fixed this by temporarily swapping the worker coroutine's
scopepointer to redirectAsync\spawn()into a per-task nursery, plus a hand-rolled cancel/await/pin dance. This PR removes that swap entirely.The task body now runs as a real coroutine in its own per-task nursery
Scope. BecauseCURRENT SCOPEfollows the running coroutine,Async\spawn()inside the body lands in that scope on its own — nocoro->scopehijacking. On task exit the worker cancels the scope and drains it — awaiting the physical disposal of every spawned coroutine — before freeing the snapshot.Invariant: no coroutine spawned by the task outlives the snapshot. (Whether such a coroutine got to run at all is timing-dependent and intentionally not asserted — the only guarantee is no use-after-free.)
The drain reuses the canonical zombie-aware logic of
Scope::awaitAfterCancellationinstead of re-implementing it: its body is factored into a C function exposed on the engine API.Changes
true-async-stable): newzend_async_scope_await_after_cancellation_fn+ZEND_ASYNC_SCOPE_AWAIT_AFTER_CANCELLATIONmacro.Scope::awaitAfterCancellationis now a thin wrapper over the shared C core.thread_pool.crewritten (NEW_SCOPE nursery → SPAWN_WITH → await body → cancel → drain → free snapshot);scope.c/.hfactor + register the C core;async_API.cregistration.pool_task_dispose/coroutine-mode path untouched.Tests
tests/thread_pool/065-task_scope_nursery_no_uaf.phpt(spawned-not-awaited coroutine still pending at task return). Fulltests/thread_pool/66/66 (+1 pre-existing XFAIL) andscope/task_group/task_set/coroutine/spawn/await/future/pool/channel(≈480) green under ASAN+ZTS.Requires php-src
true-async-stableat ABI v0.20.0 (pushed first so this PR's CI clones it).