Execution API: Normalize error response formats#63801
Execution API: Normalize error response formats#63801henry3260 wants to merge 5 commits intoapache:mainfrom
Conversation
e079ab4 to
2ae9a0d
Compare
2ae9a0d to
7441553
Compare
jason810496
left a comment
There was a problem hiding this comment.
Would it be better to add a subclass like ExecutionAPIHTTPException (or ExecutionHTTPException for the shorthand) to enforce the error response format in the future?
That's a good idea! |
There was a problem hiding this comment.
Pull request overview
Standardizes Execution API error responses to a consistent structured detail payload (e.g., {reason, message}) by introducing an ExecutionHTTPException wrapper and updating multiple routes/tests to use it.
Changes:
- Added
ExecutionHTTPExceptionto normalizedetailinto a consistent{reason, message}shape. - Replaced multiple
fastapi.HTTPExceptionusages across Execution API routes withExecutionHTTPException. - Updated task instance unit tests to assert the new structured error
detailformat.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_task_instances.py | Updates assertions to match new structured error response shapes. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/xcoms.py | Normalizes error responses by switching to ExecutionHTTPException. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/variables.py | Switches to ExecutionHTTPException and adds additional validation branches. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py | Replaces mixed string/dict error details with consistent structured payloads. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/hitl.py | Switches to ExecutionHTTPException for not-found/conflict cases. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/dags.py | Converts DAG not-found response to ExecutionHTTPException structured detail. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/dag_runs.py | Converts DAG/DAG run errors to ExecutionHTTPException structured detail. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/connections.py | Converts connection not-found response to ExecutionHTTPException. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/assets.py | Converts asset not-found response to ExecutionHTTPException. |
| airflow-core/src/airflow/api_fastapi/execution_api/routes/asset_events.py | Converts parameter-validation error to ExecutionHTTPException. |
| airflow-core/src/airflow/api_fastapi/common/exceptions.py | Introduces ExecutionHTTPException to enforce consistent {reason, message} error detail format. |
Comments suppressed due to low confidence (1)
airflow-core/src/airflow/api_fastapi/execution_api/routes/asset_events.py:1
- The
reasonvalue ("Missing parameter") is not normalized (it contains spaces and capitalization) and is inconsistent with the other routes’ machine-readable tokens (e.g.,not_found,invalid_request). Use a stable token (e.g.,missing_parameter) and keep the human-readable text inmessage.
# Licensed to the Apache Software Foundation (ASF) under one
| if not variable_key: | ||
| raise ExecutionHTTPException( | ||
| status.HTTP_404_NOT_FOUND, | ||
| detail={ | ||
| "reason": "not_found", | ||
| "message": "Not Found", | ||
| }, | ||
| ) |
There was a problem hiding this comment.
variable_key is a path parameter, so an “empty string” value won’t match the route (the segment can’t be empty). These branches are effectively unreachable and also return 404 Not Found, which is not an appropriate status for an invalid/missing path parameter. Consider removing these checks; if you want explicit validation, enforce it at the parameter level (e.g., Path(min_length=1)) so FastAPI returns a 422 validation error with the normalized detail shape.
| if not variable_key: | ||
| raise ExecutionHTTPException( | ||
| status.HTTP_404_NOT_FOUND, | ||
| detail={ | ||
| "reason": "not_found", | ||
| "message": "Not Found", | ||
| }, | ||
| ) |
There was a problem hiding this comment.
variable_key is a path parameter, so an “empty string” value won’t match the route (the segment can’t be empty). These branches are effectively unreachable and also return 404 Not Found, which is not an appropriate status for an invalid/missing path parameter. Consider removing these checks; if you want explicit validation, enforce it at the parameter level (e.g., Path(min_length=1)) so FastAPI returns a 422 validation error with the normalized detail shape.
| raise HTTPException( | ||
| raise ExecutionHTTPException( | ||
| status.HTTP_409_CONFLICT, | ||
| f"Human-in-the-loop detail for Task Instance with id {task_instance_id} already exists.", |
There was a problem hiding this comment.
Passing a string detail here will be normalized by ExecutionHTTPException, but it will default reason to "error". For consistency with the PR goal (“structured detail payloads with reason and message”), provide an explicit dict with a specific reason (e.g., already_exists) and the string as message.
| f"Human-in-the-loop detail for Task Instance with id {task_instance_id} already exists.", | |
| detail={ | |
| "reason": "already_exists", | |
| "message": ( | |
| f"Human-in-the-loop detail for Task Instance with id {task_instance_id} already exists." | |
| ), | |
| }, |
| class ExecutionHTTPException(HTTPException): | ||
| """ | ||
| HTTPException subclass used by Execution API. | ||
|
|
||
| Enforces consistent error response format containing `reason` and `message` keys. | ||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| status_code: int, | ||
| detail: dict[str, Any] | str | None = None, | ||
| headers: dict[str, str] | None = None, | ||
| ) -> None: | ||
| """Initialize with enforced dict format.""" | ||
| if isinstance(detail, str) or detail is None: | ||
| detail = { | ||
| "reason": "error", | ||
| "message": detail or "An error occurred", | ||
| } | ||
| elif isinstance(detail, dict): | ||
| detail = dict(detail) # copy | ||
| if "reason" not in detail: | ||
| detail["reason"] = "error" | ||
| if "message" not in detail: | ||
| detail["message"] = "An error occurred" | ||
| else: | ||
| detail = {"reason": "error", "message": str(detail)} | ||
|
|
||
| super().__init__(status_code=status_code, detail=detail, headers=headers) |
There was a problem hiding this comment.
Introducing ExecutionHTTPException only normalizes errors for call sites that explicitly use it. In this same module, some existing handlers (e.g., DagErrorHandler) still raise HTTPException with string detail, which will continue producing mixed formats. To fully enforce consistency, either (1) update Execution API error handlers to raise ExecutionHTTPException, or (2) register a FastAPI exception handler for HTTPException under the Execution API app/router that wraps/normalizes exc.detail into the {reason, message} shape.
pierrejeambrun
left a comment
There was a problem hiding this comment.
As mentioned by copilot, some other utility function can still raise mixed content exception.
If we want to catch everything a global exception handler is probably better. And we could have this registered at the top app level, because we have the same issue with the public API.
Always returning a 'dict' for 'details' with at least 'reason/message' seems cool.
f0bcc06 to
6c8d045
Compare
6c8d045 to
d3f8967
Compare
jason810496
left a comment
There was a problem hiding this comment.
Thanks for the update, I will defer to @pierrejeambrun regarding the exception response breaking change.
| # Normalize any HTTPException to have dict detail with reason/message | ||
| @app.exception_handler(HTTPException) | ||
| async def _http_exception_handler(request: Request, exc: HTTPException): | ||
| detail = _ensure_detail_dict(getattr(exc, "detail", None)) | ||
| headers = getattr(exc, "headers", None) | ||
| return JSONResponse(status_code=exc.status_code, content={"detail": detail}, headers=headers) |
There was a problem hiding this comment.
Is it acceptable to introduce RestAPI breaking change for 3.3.0?
cc @pierrejeambrun
Since the normalization for the HTTPException is breaking change for users who rely on the response body.
c035392 to
d1fa76d
Compare
d1fa76d to
5efc0b1
Compare
There was a problem hiding this comment.
I'm having second thought on this.
First the global exception handler is obfuscating the code. Someone reading the code won't understand at first why the response he's seeing is different from what the code raises. (a global exception handler is re modeling the response shape afterwards) So actually fixing the response at all different places is probably a more maintainable 'long term goal'.
Also as mentioned this is a breaking change (might be small but still breaking). I would say it's worth considering for a critical reason (security fix, global design shift, or huge maintainability gain), but just for cleaning the code / changing response format for consistency, i don't think it's worth the trouble. I would just accept that error response format isn't normalized...
And I would advocate for closing this.
Curious to see what other think.
edit: After double checking it appears that public/ui/auth managers apis are all consistent returning a 'string'. The problem is only located in the execution_api which uses both a mixture of 'str' and struct 'dict' for the details.
So the problem is only for that execution_api, and I'd say the repartition is half half, so it's not clear what was the original intent. But moving to fully struct seems good. I'd say we need to:
- Check if the client sdk (consuming that execution API) is using those error messages.
- If not then it means that we won't break generated client sdk by updating the error messages. (people most likely do not use the execution api directly without using the sdk client) And we can probably go ahead and do the change directly at the different places not returning the structure dict. (If we accept the risk)
- If the client sdk do use some of the error message, I'd say we close this PR, we can't break it
ashb
left a comment
There was a problem hiding this comment.
The exception handling and munging code is too complex given we control all the raise sites.
If we are unifying this, maybe we should do what the TODO in routes/t_i.py says:
# TODO: Pass a RFC 9457 compliant error message in "detail" field
# https://datatracker.ietf.org/doc/html/rfc9457
# to provide more information about the error(Note: that is not a "we must do this", but we should either do it now, or remove that comment now)
| ) | ||
|
|
||
|
|
||
| class ExecutionHTTPException(HTTPException): |
There was a problem hiding this comment.
Why is this in common if it's specific to the Execution API?
There was a problem hiding this comment.
Why is this in common if it's specific to the Execution API?
yes, if now we focus on execution API we can move this to others.
| # Catch-all for unhandled Exceptions | ||
| @app.exception_handler(Exception) | ||
| async def _unhandled_exception_handler(request: Request, exc: Exception): | ||
| exception_id = get_random_string() |
There was a problem hiding this comment.
What is the point of this? Is it logged anywhere to be able to correlate it against the exception when it's not sent to the client.
| log.exception("Unhandled error id %s while handling request %s", exception_id, request.url.path) | ||
|
|
||
| if conf.get("api", "expose_stacktrace") == "True": | ||
| message = f"Unhandled error id {exception_id}: {exc}" |
There was a problem hiding this comment.
Give this is JSON, we shouldn't put multiple things into a single string should we -- we've got structure, lets use it?
| # Normalize any HTTPException to have dict detail with reason/message | ||
| @app.exception_handler(HTTPException) | ||
| async def _http_exception_handler(request: Request, exc: HTTPException): |
There was a problem hiding this comment.
This probably shouldn't be done in line here, but put in ERROR_HANDLERS instead -- makes register fn single purpose and much smaller.
| return {"reason": "error", "message": str(nested_detail)} | ||
|
|
||
| # Idempotent path: keep already structured details intact. | ||
| normalized = dict(detail) |
| normalized.setdefault("reason", "error") | ||
| normalized.setdefault("message", "An error occurred") | ||
| return normalized | ||
| return {"reason": "error", "message": str(nested_detail)} |
There was a problem hiding this comment.
Why force it to a string.
This entire function feels like backcompat that we shouldn't need given we are in control of all of the callers.
| if extra: | ||
| # Do not allow overriding reason/message through extra accidentally. | ||
| for k, v in extra.items(): | ||
| if k not in ("reason", "message"): | ||
| detail[k] = v |
There was a problem hiding this comment.
You could do this by having extra be **extra, then this block can go.
It's a private API, so the only thing we have to about is compat with python or beta Go Task SDKs. |
We can likely handle this via a Cadwyn migration. It can mutate payloads ig needed. |
|
I'd prefer to close this PR since it contains too many unrelated changes. I’ll open a new, cleaner PR to address the specific comments we discussed. Thanks for the guidance! |
What
Standardized Execution API error responses to use structured detail payloads (with reason and message) instead of mixed
string/dictformats.Updated affected routes in
variables, xcoms, and task_instances to return consistent error shapes.Was generative AI tooling used to co-author this PR?
{pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.