Add support for async workflow activities #1053
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -107,6 +107,26 @@ The entry point for registration and lifecycle: | |
|
|
||
| Internally wraps user functions: workflow functions get a `DaprWorkflowContext`, activity functions get a `WorkflowActivityContext`. Tracks registration state via `_workflow_registered` / `_activity_registered` attributes on functions to prevent double registration. | ||
|
|
||
| #### Sync and async activities | ||
|
|
||
| Activities can be either `def my_activity(ctx, inp)` or `async def my_activity(ctx, inp)`. At registration, `_make_activity_wrapper` calls `_is_async_callable(fn)` to detect async-ness. That helper unwraps `functools.partial`, `@functools.wraps` chains, and callable-class `__call__` so common decorator patterns route correctly. The wrapper is built `async def` or `def` to match, then stored in the registry. | ||
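The detection step described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `is_async_callable_sketch` is a hypothetical stand-in for `_is_async_callable`, and the exact unwrapping order in the real helper is an assumption.

```python
import functools
import inspect


def is_async_callable_sketch(fn) -> bool:
    """Hypothetical stand-in for the SDK's `_is_async_callable` helper."""
    # Peel functools.partial layers and __wrapped__ links (left behind by
    # functools.wraps) until the innermost callable is reached.
    while True:
        if isinstance(fn, functools.partial):
            fn = fn.func
        elif hasattr(fn, '__wrapped__'):
            fn = fn.__wrapped__
        else:
            break
    if inspect.iscoroutinefunction(fn):
        return True
    # Callable-class instances: inspect the class's __call__ instead.
    if not inspect.isfunction(fn) and not inspect.ismethod(fn):
        call = getattr(type(fn), '__call__', None)
        return call is not None and inspect.iscoroutinefunction(call)
    return False


async def async_activity(ctx, inp):
    return inp


def sync_activity(ctx, inp):
    return inp


@functools.wraps(async_activity)
def wrapped_activity(*args, **kwargs):
    # Sync wrapper, but functools.wraps exposes __wrapped__ for detection.
    return async_activity(*args, **kwargs)


class CallableActivity:
    async def __call__(self, ctx, inp):
        return inp
```

With a helper like this, a `functools.partial` over `async_activity`, the `wrapped_activity` wrapper, and a `CallableActivity()` instance would all be classified as async, while `sync_activity` would not.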
|
|
||
| At dispatch time (the gRPC stream loop in `_durabletask/worker.py`), `is_async_callable(activity_fn)` on the wrapper selects between two handlers. | ||
|
|
||
| - **Async activities** go through `_execute_activity_async`, then `_ActivityExecutor.execute_async`, which awaits `fn(...)` directly on the event loop. The gRPC response is delivered via `loop.run_in_executor(self._async_worker_manager.thread_pool, stub.CompleteActivityTask, ...)` — the same pool sync activities use, sized by `maximum_thread_pool_workers`. | ||
| - **Sync activities** go through `_execute_activity`, dispatched to the thread pool by `_AsyncWorkerManager._run_func`. The activity runs on a worker thread, and the response is delivered from the same thread. | ||
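The two dispatch paths above can be imitated with the standard library alone. The sketch below is an assumption-laden model, not the worker's code: `dispatch` and `send_response` are hypothetical names, with `send_response` standing in for the blocking `stub.CompleteActivityTask` call.

```python
import asyncio
import inspect
import threading
from concurrent.futures import ThreadPoolExecutor


def send_response(result):
    # Hypothetical stand-in for the blocking gRPC call that reports a result.
    # Also returns the thread it ran on so the two paths can be compared.
    return result, threading.get_ident()


async def dispatch(fn, inp, pool):
    loop = asyncio.get_running_loop()
    if inspect.iscoroutinefunction(fn):
        # Async path: the body is awaited on the event loop thread, and
        # only the blocking response send is offloaded to the pool.
        result = await fn(inp)
        return await loop.run_in_executor(pool, send_response, result)
    # Sync path: body and response send both run on one worker thread.
    return await loop.run_in_executor(pool, lambda: send_response(fn(inp)))


async def async_activity(inp):
    await asyncio.sleep(0)  # simulated non-blocking I/O
    return inp, threading.get_ident()


def sync_activity(inp):
    return inp, threading.get_ident()


async def main():
    with ThreadPoolExecutor(max_workers=2) as pool:
        return await asyncio.gather(
            dispatch(async_activity, 'a', pool),
            dispatch(sync_activity, 's', pool),
        )
```

Running `asyncio.run(main())` from the main thread shows the async body executing on the loop thread with its response sent from a pool thread, while the sync body and its response share a single pool thread.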
|
|
||
| Workflow (orchestrator) functions must remain generators (`def` with `yield`). They cannot be `async def` because durabletask's deterministic replay depends on synchronous generator semantics. Only activities support async. | ||
|
|
||
| **Decorator ordering gotcha.** Wrapping `@wfr.activity` over `@alternate_name(...)` over `async def` works because `@alternate_name` now emits an `async def innerfn` when the wrapped function is async. A user-written decorator that wraps an async function in a sync `def` (without `@functools.wraps` exposing `__wrapped__`) defeats `_is_async_callable`, routes the activity to the sync path, and produces an un-awaited coroutine. Such decorators should use `@functools.wraps(fn)` so the unwrap walks through them. | ||
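To make the gotcha concrete, here is a small sketch contrasting an opaque decorator with an async-aware one. Both decorator names are hypothetical and unrelated to the SDK; `inspect.iscoroutinefunction` is used as a simplified proxy for the SDK's detection.

```python
import asyncio
import functools
import inspect


def passthrough_opaque(fn):
    # Anti-pattern: sync wrapper without functools.wraps, so there is no
    # __wrapped__ attribute for detection to follow.
    def inner(*args, **kwargs):
        return fn(*args, **kwargs)  # for async fn: an un-awaited coroutine
    return inner


def passthrough_async_aware(fn):
    # Emits an async wrapper for async functions and a sync one otherwise,
    # similar in spirit to what the text says @alternate_name now does.
    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_inner(*args, **kwargs):
            return await fn(*args, **kwargs)
        return async_inner

    @functools.wraps(fn)
    def sync_inner(*args, **kwargs):
        return fn(*args, **kwargs)
    return sync_inner


async def fetch(ctx, inp):
    await asyncio.sleep(0)  # simulated I/O
    return inp * 2


hidden = passthrough_opaque(fetch)        # looks sync from the outside
visible = passthrough_async_aware(fetch)  # still detectable as async
```

Calling `hidden(None, 1)` returns a coroutine object rather than a result, which is the failure mode described above when such a wrapper is routed to the sync path.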
|
|
||
| **`maximum_thread_pool_workers` covers both paths.** This knob sizes the worker thread pool used for sync-activity bodies and for async-activity gRPC response sends. Mixed workloads with long-running sync activities can starve async response delivery (and vice versa) since they share the pool — size to the sum of peak sync activity concurrency and peak in-flight async response sends. | ||
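The starvation risk can be reproduced in miniature with a plain `ThreadPoolExecutor`. This sketch assumes nothing about the SDK internals: `demo`, `long_sync_activity`, and `send_async_response` are hypothetical names, and the 0.5 second wait is an arbitrary observation window.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


async def demo(pool_size: int) -> bool:
    """Return True if the response send completed while the sync body ran."""
    release = threading.Event()

    def long_sync_activity():
        # Occupies one worker thread until released (or a safety timeout).
        release.wait(timeout=5)

    def send_async_response():
        return 'sent'

    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        sync_future = loop.run_in_executor(pool, long_sync_activity)
        send_future = loop.run_in_executor(pool, send_async_response)
        # Observe whether the send finishes while the sync body is blocked.
        done, _ = await asyncio.wait({send_future}, timeout=0.5)
        delivered_while_busy = bool(done)
        release.set()
        await sync_future
        await send_future
    return delivered_while_busy
```

With `pool_size=1` the send waits behind the sync activity; with `pool_size=2` it goes out immediately, which is the reason for sizing the pool to cover both kinds of work.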
|
|
||
| **Concurrency sizing and load characterization.** See `docs/concurrency.md` for sizing recommendations (`maximum_concurrent_activity_work_items`, `maximum_thread_pool_workers`) and an async-vs-sync decision tree. `tests/ext/workflow/durabletask/test_async_dispatch_regression.py` (marked `perf`) guards the core invariant: a batch of async activities overlaps on the event loop instead of serializing through the thread pool. | ||
|
|
||
| **grpc.aio poller log noise.** The async client can emit benign `BlockingIOError: [Errno 11]` ERROR lines from `grpc.aio`'s `PollerCompletionQueue` under load. It is harmless and retried. `get_grpc_aio_channel` installs an internal `asyncio`-logger filter (`_silence_grpc_aio_poller_noise`) that drops only those records, so the SDK suppresses it automatically with no user action. | ||
|
Contributor
do we need this comment?
Contributor
Author
Honestly I like using AGENTS.md as a development log of sorts to not waste time rediscovering old issues and rereading the code base. That's just a personal preference though, not sure how much we should let this file grow
||
|
|
||
|
|
||
| ### DaprWorkflowClient (`dapr_workflow_client.py`) | ||
|
|
||
| Client for workflow lifecycle management: | ||
|
|
@@ -165,7 +185,7 @@ Retry configuration for activities and child workflows: | |
| 1. **Registration**: User decorates functions with `@wfr.workflow` / `@wfr.activity`. The runtime wraps them and stores them in the durabletask worker's registry. | ||
| 2. **Startup**: `wfr.start()` opens a gRPC stream to the Dapr sidecar. The worker polls for work items. | ||
| 3. **Scheduling**: Client calls `schedule_new_workflow(fn, input=...)`. The function's name (or `_dapr_alternate_name`) is sent to the backend. | ||
| 4. **Execution**: The durabletask engine dispatches work items. Workflow functions are Python **generators** that `yield` tasks (activity calls, timers, child workflows). The engine records history; on replay, yielded tasks return cached results without re-executing. | ||
| 4. **Execution**: The durabletask engine dispatches work items. Workflow functions are Python **generators** that `yield` tasks (activity calls, timers, child workflows). Activity functions are either sync (dispatched to the worker's thread pool) or `async def` (awaited directly on the worker's event loop). The engine records history; on replay, yielded tasks return cached results without re-executing. | ||
| 5. **Determinism**: Workflows must be deterministic — no random, no wall-clock time, no I/O. Use `ctx.current_utc_datetime` instead of `datetime.now()`. Use `ctx.is_replaying` to guard side effects like logging. | ||
| 6. **Completion**: Client polls via `wait_for_workflow_completion()` or `get_workflow_state()`. | ||
|
|
||
|
|
@@ -193,6 +213,7 @@ Two example directories exercise workflows: | |
| - `cross-app1.py`, `cross-app2.py`, `cross-app3.py` — cross-app calls | ||
| - `versioning.py` — workflow versioning with `is_patched()` | ||
| - `simple_aio_client.py` — async client variant | ||
| - `async_activities.py` — `async def` activities (fan-out/fan-in with simulated I/O, configurable payload sizes) | ||
|
|
||
| ## Testing | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,6 +9,7 @@ | |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import logging | ||
| from typing import Optional, Sequence, Union | ||
|
|
||
| import grpc | ||
|
|
@@ -28,6 +29,30 @@ | |
| grpc_aio.StreamStreamClientInterceptor, | ||
| ] | ||
|
|
||
| _POLLER_NOISE_MARKER = 'PollerCompletionQueue._handle_events' | ||
|
|
||
|
|
||
| class _GrpcAioPollerNoiseFilter(logging.Filter): | ||
| """Drops the harmless grpc.aio poller BlockingIOError (EAGAIN) records. | ||
|
|
||
| The poller does a non-blocking read on its wake-up fd and can get EAGAIN, which | ||
| asyncio logs at ERROR even though the read is retried and nothing is lost. | ||
| """ | ||
|
|
||
| def filter(self, record: logging.LogRecord) -> bool: | ||
| exc = record.exc_info[1] if record.exc_info else None | ||
| is_poller_noise = isinstance(exc, BlockingIOError) and ( | ||
| _POLLER_NOISE_MARKER in record.getMessage() | ||
| ) | ||
| return not is_poller_noise | ||
|
|
||
|
|
||
| def _silence_grpc_aio_poller_noise() -> None: | ||
|
seherv marked this conversation as resolved.
|
||
| """Install the poller-noise filter on the asyncio logger if not already present.""" | ||
| asyncio_logger = logging.getLogger('asyncio') | ||
| if not any(isinstance(f, _GrpcAioPollerNoiseFilter) for f in asyncio_logger.filters): | ||
| asyncio_logger.addFilter(_GrpcAioPollerNoiseFilter()) | ||
|
|
||
|
|
||
| def get_grpc_aio_channel( | ||
| host_address: Optional[str], | ||
|
|
@@ -43,6 +68,8 @@ def get_grpc_aio_channel( | |
| interceptors: Optional sequence of client interceptors to apply to the channel. | ||
| options: Optional sequence of gRPC channel options as (key, value) tuples. Keys defined in https://grpc.github.io/grpc/core/group__grpc__arg__keys.html | ||
| """ | ||
| _silence_grpc_aio_poller_noise() | ||
|
Contributor
is this only on the asyncio side of things?
Contributor
Author
Yup, grpc.aio spams the error logs when their client is used from multiple event loops, and that was the case for FastAPI applications using this SDK. Nothing was actually an error, but the logs got extremely noisy on Linux. It got fixed in their 1.80.0 release; as soon as we update to that dep (in a separate PR ofc) we can delete this.
||
|
|
||
| if host_address is None: | ||
| host_address = get_default_host_address() | ||
|
|
||
|
|
||