Skip to content

fix: add error handling to OpenAI-compatible serve endpoint#774

Merged
psschwei merged 3 commits intogenerative-computing:mainfrom
markstur:serve_error_handling
Apr 3, 2026
Merged

fix: add error handling to OpenAI-compatible serve endpoint#774
psschwei merged 3 commits intogenerative-computing:mainfrom
markstur:serve_error_handling

Conversation

@markstur
Copy link
Copy Markdown
Contributor

@markstur markstur commented Apr 1, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Add proper exception handling to the chat completion endpoint in cli/serve/app.py to prevent unhandled exceptions from crashing the server. Returns appropriate HTTP status codes and error messages.

Changes:

  • Wrap endpoint logic in try-except block with specific handlers
  • AttributeError → 500 (missing output attributes)
  • ValueError → 400 (validation/input errors)
  • Exception → 500 (catch-all for unexpected errors)
  • Add comprehensive test coverage (5 unit tests, 100% error path coverage)

Fixes ensure the server returns OpenAI-compatible error responses instead of crashing on exceptions.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code as added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

@markstur markstur requested a review from a team as a code owner April 1, 2026 22:29
@github-actions github-actions bot added the bug Something isn't working label Apr 1, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@markstur markstur marked this pull request as draft April 1, 2026 22:36
@markstur markstur marked this pull request as draft April 1, 2026 22:36
@markstur markstur force-pushed the serve_error_handling branch from 7646265 to 907fb78 Compare April 1, 2026 22:48
@markstur markstur marked this pull request as ready for review April 1, 2026 22:50
@ajbozarth
Copy link
Copy Markdown
Contributor

Claude found a few issues to look at:

Unused import (will fail ruff check)cli/serve/app.py

Request is imported but never used:

from fastapi import FastAPI, Request  # Request is unused                                                                                                                                                                                

Test isolation — routes accumulate on the global apptest/cli/test_serve_errors.py

Each test calls app.add_api_route(...) on the shared module-level app object without cleanup. Routes pile up across test runs, which can cause 422 conflicts or ordering-dependent failures. Each test should use a fresh FastAPI()
instance or remove the route in teardown.

Nits

Duplicate error handlerscli/serve/app.py

AttributeError and the catch-all Exception produce identical responses (500 server_error). The AttributeError branch can be dropped — Exception already covers it and the distinction adds nothing for callers.

response_model=None — minor type safety loss; Union[ChatCompletion, OpenAIErrorResponse] would preserve schema generation, though it won't change runtime behavior in this pattern.

Copy link
Copy Markdown
Contributor

@jakelorocco jakelorocco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me. I have a few concerns that maybe aren't valid / might not need to be addressed (I don't know if I'm completely up to date on the purpose of this OpenAI server, so these might have already been discussed):

  1. Are these errors helpful to the end user of the server? For instance, if the ValueError is raised by having some non-conforming field / input, that makes sense to me.
  2. Will these errors leak anything that an end user might not want leaked? This may be out of scope, but I'm not sure if we have any expectations that the names of requirements, internal mellea functionality doesn't get leaked through these endpoints. Maybe thats on the end user to implement if they need that obfuscation?

Copy link
Copy Markdown
Contributor

@ajbozarth ajbozarth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now, but I id discuss @jakelorocco concerns with Claude and this was what we came to, probably worth addressing in a comment before merging if Claude is correct that m serve is not designed for prod, if it it we should address it (could be in a follow up imho):

Helpfulness — The ValueError → 400 path is genuinely useful since it reflects bad input the caller can fix. The 500 path is less useful to an end user but fine for a local dev server like m serve.

Information leakage — This is the more interesting concern. The current code passes str(e) directly into the error response body. For a 500, that could expose internal details like mellea class names, attribute paths, or module
structure. For a production-facing deployment that would be a real issue, but m serve is a local developer tool, so leaking implementation details to localhost is low risk. Worth the author explicitly acknowledging that — "this
is a local dev server, not intended for production exposure" — so the reviewer knows it was considered rather than overlooked.

markstur added 2 commits April 2, 2026 12:34
Add proper exception handling to the chat completion endpoint in
cli/serve/app.py to prevent unhandled exceptions from crashing the
server.

Implements OpenAI API error format for the `m serve` endpoint to ensure
compatibility with OpenAI client libraries and tools.

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
* remove unused import
* fix FastAPI app route accumulation
* remove duplicate error handler
* add types for response_model

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
@markstur markstur force-pushed the serve_error_handling branch from e172da1 to 6c91522 Compare April 2, 2026 20:00
@markstur
Copy link
Copy Markdown
Contributor Author

markstur commented Apr 2, 2026

rebased and resolved conflicts

For the success case, return None for usage not a mock object.

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
@psschwei psschwei enabled auto-merge April 3, 2026 15:21
@psschwei psschwei added this pull request to the merge queue Apr 3, 2026
Merged via the queue into generative-computing:main with commit ecc15a6 Apr 3, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants