Add ChatGPT subscription LLM support#744

Open
neubig wants to merge 14 commits into main from add-llm-subscription-support

Conversation

Member

neubig commented May 23, 2026

Summary

  • Add ChatGPT subscription auth constants, service wrappers, query/mutation hooks, and MSW mocks for planned agent-server subscription endpoints.
  • Extend LLM settings/profile UI with API-key vs ChatGPT subscription mode, subscription model selection, and connect/status/logout controls.
  • Shape subscription conversation/profile payloads to omit API key/base URL and gate conversation start/profile switch on subscription connection status.
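
The payload shaping described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `buildLlmPayload`, `LlmFormState`, and the `api_key` field name are hypothetical, while `auth_type`, `subscription_vendor`, `model`, and `base_url` follow names that appear elsewhere in this thread.

```typescript
// Hypothetical form state and payload shapes, for illustration only.
type LlmFormState = {
  authType: "api_key" | "subscription";
  model: string;
  apiKey?: string;
  baseUrl?: string;
};

type ApiKeyPayload = {
  auth_type: "api_key";
  model: string;
  api_key?: string; // assumed field name
  base_url?: string;
};

type SubscriptionPayload = {
  auth_type: "subscription";
  subscription_vendor: "openai";
  model: string;
};

type LlmPayload = ApiKeyPayload | SubscriptionPayload;

function buildLlmPayload(form: LlmFormState): LlmPayload {
  if (form.authType === "subscription") {
    // Subscription mode: never forward an API key or base URL,
    // even if stale values are still present in the form state.
    return {
      auth_type: "subscription",
      subscription_vendor: "openai",
      model: form.model,
    };
  }
  // API-key mode: include credentials only when they are set.
  const payload: ApiKeyPayload = { auth_type: "api_key", model: form.model };
  if (form.apiKey) payload.api_key = form.apiKey;
  if (form.baseUrl) payload.base_url = form.baseUrl;
  return payload;
}
```

Modelling the two modes as a discriminated union means the compiler rejects any attempt to attach `api_key` or `base_url` to a subscription payload.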

Verification

  • npm run typecheck
  • npx vitest run __tests__/api/llm-subscription-service.test.ts __tests__/api/agent-server-adapter.test.ts __tests__/routes/llm-settings.test.tsx
  • npm run check-translation-completeness
  • npm test
  • npm run lint
  • npm run build

Notes

  • Live ChatGPT subscription verification still requires real browser-login credentials/session and the planned agent-server subscription endpoints.

This PR was created by an AI agent (OpenHands) on behalf of the user.


🐳 Docker images for this PR

GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

| Component | Value |
| --- | --- |
| Image | ghcr.io/openhands/agent-canvas |
| Architectures | amd64, arm64 |
| Agent Server | ghcr.io/openhands/agent-server:1.24.0-python |
| Automation | openhands-automation==1.0.0a5 |
| Commit | 5ca3e5502c2f5f27f37cad5c5cb9d9fb839caf0a |

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-5ca3e55

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-5ca3e55

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-5ca3e55-amd64
ghcr.io/openhands/agent-canvas:add-llm-subscription-support-amd64
ghcr.io/openhands/agent-canvas:pr-744-amd64
ghcr.io/openhands/agent-canvas:sha-5ca3e55-arm64
ghcr.io/openhands/agent-canvas:add-llm-subscription-support-arm64
ghcr.io/openhands/agent-canvas:pr-744-arm64
ghcr.io/openhands/agent-canvas:sha-5ca3e55
ghcr.io/openhands/agent-canvas:add-llm-subscription-support
ghcr.io/openhands/agent-canvas:pr-744

About Multi-Architecture Support

  • Each tag (e.g., sha-5ca3e55) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., sha-5ca3e55-amd64) are also available if needed

Companion PRs / temporary pins

This PR description update was created by an AI agent (OpenHands) on behalf of the user.

Issue

Fixes #1059

This PR description update was created by an AI agent (OpenHands) on behalf of Graham Neubig.

Co-authored-by: openhands <openhands@all-hands.dev>

vercel Bot commented May 23, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agent-canvas | Ready | Preview, Comment | May 30, 2026 4:31am |

Request Review

Contributor

@all-hands-bot all-hands-bot left a comment

Review Summary

🟡 Acceptable with critical issues - The subscription auth feature is well-structured overall, but has fundamental issues that must be addressed before merging.

[CRITICAL ISSUES]

  • [__tests__/api/llm-subscription-service.test.ts] Test Quality: The core test suite violates repository guidelines by mocking the entire LLMMetadataClient HTTP client instead of testing real code paths with MSW. This means tests won't catch actual regressions when the HTTP layer, auth headers, or server contract changes. Rewrite to use the MSW handlers already available in settings-handlers.ts.

  • [src/components/features/settings/llm-profiles/llm-settings-local-view.tsx:170] Complexity: The handleSave function has >4 levels of nesting (try-catch → rename if → auth type if-else → subscription model if → API key if-else → base_url if → wasActive if), violating the repository's '3 levels max' guideline. Extract helpers: buildLlmConfigForAuthType(), handleProfileRename(), handleProfileReactivation().

  • [src/api/llm-subscription-service.ts:78] Type Safety: Unsafe casting (client as unknown as LLMMetadataClientWithTransport).client to access internal transport. If @openhands/typescript-client changes its internal structure, this fails at runtime with cryptic errors. Add runtime validation that the transport exists and has the expected methods before using it.
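
The runtime validation suggested in the last bullet could look roughly like the sketch below. The `HttpTransport` shape, its `request` method, and the `getTransport` helper are assumptions made for illustration; the real internals of @openhands/typescript-client are not shown in this thread.

```typescript
// Hypothetical transport shape; the real client internals may differ.
interface HttpTransport {
  request(path: string, init?: { method?: string; body?: unknown }): Promise<unknown>;
}

// Type guard: checks at runtime that `value.client.request` is a function
// instead of trusting a double cast through `unknown`.
function hasTransport(value: unknown): value is { client: HttpTransport } {
  if (typeof value !== "object" || value === null) return false;
  const inner = (value as { client?: unknown }).client;
  return (
    typeof inner === "object" &&
    inner !== null &&
    typeof (inner as { request?: unknown }).request === "function"
  );
}

function getTransport(client: unknown): HttpTransport {
  if (!hasTransport(client)) {
    // Fail early with a descriptive message rather than a cryptic
    // "undefined is not a function" deep inside a request call.
    throw new Error(
      "LLM metadata client does not expose the expected transport; " +
        "the client library's internal structure may have changed."
    );
  }
  return client.client;
}
```

With a guard like this, a change in the client library's internals surfaces as one clear error at the call site.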

[IMPORTANT ISSUES]

  • [src/components/features/settings/llm-settings/openai-subscription-auth-card.tsx:61] User Experience: When polling returns connected: false, an error toast is shown with a "PENDING" message. This is misleading—users think something failed when they just need to wait. Use an info/warning toast or inline status text instead.

  • [src/components/features/settings/llm-settings/openai-subscription-auth-card.tsx:36] State Management: Users can get stuck in the challenge state if they start device login but close the tab or navigate away. No way to clear the challenge except completing login or logging out (which fails if not connected). Add a "Cancel" button to reset challenge state.

  • [src/routes/llm-settings.tsx:219] Data Flow: Model selection is silently changed when switching auth types. If a user has 'gpt-4o', switches to subscription to explore, then switches back, their original selection is lost forever (replaced with defaultModel). Store previous models in component state and restore them, or at minimum show a toast explaining the change.

  • [src/api/llm-subscription-service.ts:109] Validation: Missing validation for empty strings in normalizeDeviceChallenge. A buggy server could return empty strings that pass the !deviceCode check. Add .trim().length > 0 validation for deviceCode, userCode, and verificationUri.

  • [__tests__/api/llm-subscription-service.test.ts] Test Coverage: Missing critical test cases: error handling (401/403, 500, timeouts, malformed responses), validation (incomplete device challenges), end-to-end integration (full device flow), and edge cases (empty strings, null values, concurrent requests).

  • [src/api/llm-subscription-service.ts:87] Error Handling: Network errors from the HTTP client propagate without context. Wrap errors to provide more specific messages about which subscription operation failed.
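
The empty-string validation requested for normalizeDeviceChallenge might be sketched as follows. The camelCase field names come from the review above; the snake_case input keys follow the OAuth device authorization response format (RFC 8628) and are assumed here, as is the `readNonEmptyString` helper.

```typescript
type DeviceChallenge = {
  deviceCode: string;
  userCode: string;
  verificationUri: string;
};

// Returns the trimmed string, or null when the value is missing,
// not a string, or contains only whitespace.
function readNonEmptyString(value: unknown): string | null {
  if (typeof value !== "string") return null;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : null;
}

// Assumes the server responds with snake_case keys (hypothetical here).
function normalizeDeviceChallenge(raw: Record<string, unknown>): DeviceChallenge {
  const deviceCode = readNonEmptyString(raw["device_code"]);
  const userCode = readNonEmptyString(raw["user_code"]);
  const verificationUri = readNonEmptyString(raw["verification_uri"]);
  if (deviceCode === null || userCode === null || verificationUri === null) {
    throw new Error("Subscription login returned an incomplete device challenge.");
  }
  return { deviceCode, userCode, verificationUri };
}
```

Comparing against `null` rather than relying on truthiness keeps the "missing" and "blank" cases on the same rejection path.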

[SUGGESTIONS]

  • [src/api/agent-server-adapter.ts:820] Documentation: Add docstring to assertSubscriptionAuthReady noting that it's NOT called when resuming conversations or sending additional messages, so subscriptions could expire mid-conversation. The agent-server must handle expired tokens gracefully.

  • [src/api/llm-subscription-service.ts:136] Data Structure: The expiresAt field tries both string and number fallbacks, including expires_in which is typically a relative duration in seconds, not an absolute timestamp. Mixing relative and absolute time values in the same field could cause confusion. Document the expected behavior or convert relative durations to absolute timestamps.

  • [src/hooks/query/use-llm-subscription-status.ts:14] Query Configuration: staleTime of 30 seconds seems short for auth status that rarely changes. Consider 5 minutes to reduce unnecessary network requests, especially since mutations properly invalidate this query.
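
One way to address the expiresAt suggestion is to convert every input into an absolute ISO timestamp. The sketch below is illustrative only: it assumes `expires_at` is absolute (an ISO string or epoch milliseconds) and `expires_in` is a relative duration in seconds, and the `resolveExpiresAt` and `toIso` names are hypothetical.

```typescript
// Converts epoch milliseconds to an ISO string, or null if out of range.
function toIso(ms: number): string | null {
  const date = new Date(ms);
  return isNaN(date.getTime()) ? null : date.toISOString();
}

// Assumed semantics: expires_at is absolute; expires_in is relative seconds.
function resolveExpiresAt(
  raw: { expires_at?: unknown; expires_in?: unknown },
  nowMs: number = Date.now()
): string | null {
  const at = raw.expires_at;
  const inSeconds = raw.expires_in;
  if (typeof at === "string") {
    const iso = toIso(Date.parse(at));
    if (iso !== null) return iso;
  }
  if (typeof at === "number") {
    const iso = toIso(at);
    if (iso !== null) return iso;
  }
  if (typeof inSeconds === "number") {
    // Relative duration: anchor it to the time the response was received.
    return toIso(nowMs + inSeconds * 1000);
  }
  return null;
}
```

Passing `nowMs` explicitly keeps the relative-duration branch deterministic in tests.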

[RISK ASSESSMENT]

🟡 MEDIUM RISK - This PR adds new authentication flows and modifies conversation creation logic. The main risks are:

  • Untested error paths could cause poor UX when auth fails
  • Complex nested logic increases maintenance burden
  • Type safety issues could cause runtime failures with SDK updates

Not blocking auto-merge based on risk alone, but the critical issues above should be addressed.

VERDICT:

Needs rework - Address the critical test quality and complexity issues, then fix the important UX and validation gaps.

KEY INSIGHT:

The subscription auth implementation follows OAuth device flow patterns correctly, and backward compatibility is preserved (API key flows still work). However, the test suite doesn't provide confidence that the integration will work in production, and the complex nested save logic creates maintenance risk. Focus first on rewriting tests to use MSW and refactoring handleSave.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/agent-canvas/actions/runs/26336381445

Co-authored-by: openhands <openhands@all-hands.dev>
github-actions Bot added a commit that referenced this pull request May 23, 2026
neubig added the update-snapshots label on May 23, 2026, with OpenHands AI. Label description: intentional snapshot changes; CI diff check bypassed; new baselines uploaded on merge.
github-actions Bot added a commit that referenced this pull request May 23, 2026
- Merge /api/llm/subscription/openai/models with chatgpt/ provider
  models from /api/llm/models so the subscription dropdown stays in
  sync with the full LiteLLM registry (e.g. includes gpt-5.5)
- Add gpt-5.3-codex to the hardcoded fallback constant
- Replace hardcoded OPENAI_SUBSCRIPTION_MODELS array in the dropdown
  with a dynamic useOpenAISubscriptionModels hook (falls back to
  constant when endpoints are unavailable)
- Device code block: remove 'Code: ' prefix, fix black-on-dark
  contrast by using --oh-surface-primary / --oh-text-primary tokens,
  add CopyToClipboardButton for one-click copy

Co-authored-by: openhands <openhands@all-hands.dev>
LiteLLM's chatgpt/ provider registry does not yet include gpt-5.5,
but it is available via ChatGPT Plus/Pro subscription. Add it to the
hardcoded fallback so it appears in the dropdown regardless of whether
the backend's chatgpt/ entries have been updated.

Co-authored-by: openhands <openhands@all-hands.dev>
- Drop /api/llm/subscription/openai/models as a model source; it was
  added in this PR set and is redundant — LiteLLM's chatgpt/ provider
  is the single authoritative list
- Remove the OPENAI_SUBSCRIPTION_MODELS hardcoded fallback array,
  OpenAISubscriptionModel type, DEFAULT_OPENAI_SUBSCRIPTION_MODEL, and
  isOpenAISubscriptionModel type guard — all replaced by the live
  subscriptionModels from the hook, or a simple pass-through in
  non-hook contexts (adapter, profile builder)
- Update agent-server-adapter test to reflect pass-through behavior

Co-authored-by: openhands <openhands@all-hands.dev>
@rajshah4
Member

I tested the ChatGPT subscription flow locally. The frontend/model-selection parts of this PR helped, but the remaining
conversation-start failure appears to be in software-agent-sdk / Agent Server runtime handling, not the UI.

Observed behavior:

  • /api/llm/subscription/openai/status reports connected.
  • /api/llm/subscription/openai/models returns models.
  • Direct calls to the ChatGPT Codex endpoint with the stored OAuth token work.
  • Direct LiteLLM responses and aresponses calls also work with:
    • api_base=https://chatgpt.com/backend-api/codex
    • model=openai/gpt-5.3-codex
    • streaming enabled
    • Codex headers
  • But starting a normal Agent Canvas conversation fails with:
litellm.AuthenticationError: AuthenticationError: OpenAIException - {"detail":"Could not parse your authentication token. Please try signing in again."}

The likely root cause is that the SDK loses subscription runtime state after serializing/reloading the LLM config.

A saved conversation LLM contains:

{
  "auth_type": "subscription",
  "subscription_vendor": "openai",
  "model": "openai/gpt-5.3-codex"
}

But after validating/reloading it as an SDK LLM, llm.is_subscription becomes False. That means
_get_litellm_api_key_value() does not fetch the ChatGPT OAuth access token, so the ChatGPT Codex endpoint receives invalid/missing auth and returns the token-parse error.

Potential SDK fix:

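@property
def is_subscription(self) -> bool:
    # Fall back to the serialized fields so subscription mode survives a
    # config round-trip even when the private runtime flag was not restored.
    return self._is_subscription or (
        self.auth_type == "subscription"
        and self.subscription_vendor == "openai"
    )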

There is a second issue after auth is fixed: LiteLLM’s streamed Responses completion can expose completed_event.response as a plain dict, while the SDK expects an object with .output. I verified the stream output can be recovered from response.output_item.done events and parsed successfully.

So I think this PR may need a companion SDK fix for:

  1. Preserving/rederiving subscription mode after LLM config serialization.
  2. Handling dict-shaped streamed Responses completion objects.
  3. Optionally filtering subscription models to ones actually usable by the account.

Co-authored-by: openhands <openhands@all-hands.dev>
Member Author

neubig commented May 28, 2026

Updated this stacked PR in 1e94188: merged current main, resolved package conflicts, kept the temporary @openhands/typescript-client branch pin at b025805, rewrote subscription service tests to use MSW/real client paths, tightened device challenge validation, added login cancel/pending UX, preserved model selection when toggling auth type, and refactored profile-save logic. Local verification: npm run lint, npm run build, and full npm test passed.

This comment was generated by an AI agent (OpenHands) on behalf of neubig.


openhands-ai Bot commented May 28, 2026

Uh oh! There was an unexpected error starting the job :(

Contributor

all-hands-bot commented May 28, 2026

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

Contributor

@all-hands-bot all-hands-bot left a comment

Review Summary

Previous critical issues resolved — The three blockers from the prior review have all been addressed: tests now use MSW handlers instead of full-client mocks, handleSave complexity has been properly refactored into helper functions (buildProfileLlmConfig, renameProfileIfNeeded, reactivateProfileIfNeeded), and the unsafe (client as unknown as LLMMetadataClientWithTransport).client cast is gone. The cancel button, model restoration refs, and empty-string validation in readString are all present. This is a significant quality improvement.

The OAuth device flow pattern is correctly implemented, subscription credential omission from conversation payloads is good security practice, and withLlmClient's finally { client.close() } is solid resource management.


[IMPORTANT ISSUES]

  • Unconditional subscription model fetch (src/routes/llm-settings.tsx:140): useOpenAISubscriptionModels() fires on every mount of LlmSettingsScreen, including for users who never use subscription auth. The hook supports { enabled } — pass enabled: isSubscriptionMode where isSubscriptionMode is derived from the current settings' auth type. The silent failure path means no visible bug, but it generates unnecessary traffic for all API-key users and may produce 404/500 noise in agent-server logs.

  • Race condition: model saved as empty string (src/routes/llm-settings.tsx:408): In buildPayload, if subscriptionModels is still undefined when the user saves (e.g. they switch to subscription and immediately click save before the model list loads), subscriptionModels?.[0] ?? "" evaluates to "". The same race exists in handleAuthTypeChange at line 235 — switching before models load fires onChange("llm.model", ""). The subscriptionModelValue display fallback at line 181–183 hides this in the UI, but the underlying form value is "". Guard the save path: if authType === LLM_AUTH_TYPE_SUBSCRIPTION && !subscriptionModels?.length, show a validation message or block the save.
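
The save-path guard proposed for the race condition could be sketched like this. `resolveModelForSave` and `SaveDecision` are hypothetical names, and the sketch assumes the subscription model list is an array of model ID strings that is `undefined` while loading.

```typescript
type SaveDecision =
  | { ok: true; model: string }
  | { ok: false; reason: "models_not_loaded" };

function resolveModelForSave(
  authType: "api_key" | "subscription",
  selectedModel: string,
  subscriptionModels: readonly string[] | undefined
): SaveDecision {
  if (authType !== "subscription") {
    // API-key mode is outside the scope of this guard.
    return { ok: true, model: selectedModel };
  }
  if (!subscriptionModels || subscriptionModels.length === 0) {
    // The list is still loading (or failed): block the save instead of
    // silently persisting "" as the model.
    return { ok: false, reason: "models_not_loaded" };
  }
  // Fall back to the first available model only once the list exists.
  const model = selectedModel.trim() || subscriptionModels[0];
  return { ok: true, model };
}
```

A caller would show a validation message or disable the save button when `ok` is `false`, so an empty model never reaches the stored profile.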

[SUGGESTIONS]

  • assertSubscriptionAuthReady missing lifecycle note (src/api/agent-server-adapter.ts:844): Flagged in the previous review, still unaddressed. The function is called at conversation start and profile switch, but not on subsequent message sends or conversation resume. A one-line JSDoc noting this scope prevents future maintainers from assuming token expiry is covered end-to-end.

  • @openhands/typescript-client git dep tracking (__tests__/package-library.test.ts:57): The comment says "temporarily allowed while this stacked PR waits for the subscription client branch to merge/release" but there is no linked issue or deadline. Link a tracking issue or PR in the comment so this exemption doesn't outlive its purpose.

  • staleTime of 30s for subscription status (src/hooks/query/use-llm-subscription-status.ts:14): Raised in the previous review. Since both poll and logout mutations already invalidate this query, the 30-second background refetch is redundant. Bumping to 5 minutes reduces background requests with no UX regression.


VERDICT: 🟡 Close to merge-ready. The race condition in buildPayload/handleAuthTypeChange can silently produce a profile with an empty model field — worth a targeted fix before merge. The unconditional model fetch is a UX polish issue. Everything else is non-blocking.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread src/routes/llm-settings.tsx Outdated
Comment thread src/routes/llm-settings.tsx Outdated
Comment thread src/routes/llm-settings.tsx Outdated
Comment thread src/api/agent-server-adapter.ts
Comment thread __tests__/package-library.test.ts
Comment thread src/hooks/query/use-llm-subscription-status.ts Outdated
Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai Bot commented May 28, 2026

@openhands-ai[bot] it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.


Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Member Author

neubig commented May 28, 2026

CI is green on 13ede8a, all review threads are resolved, and this PR is ready for another review/merge gate pass.

This comment was generated by an AI agent (OpenHands) on behalf of neubig.

Co-authored-by: openhands <openhands@all-hands.dev>
Member Author

neubig commented May 29, 2026

Pushed a follow-up dark-theme contrast fix and merged latest main. CI is green on 6a5799a, and review threads remain resolved.

This comment was generated by an AI agent (OpenHands) on behalf of neubig.

Member Author

neubig commented May 30, 2026

@OpenHands fix merge conflicts


openhands-ai Bot commented May 30, 2026

I'm on it! neubig can track my progress at all-hands.dev

…support

Resolved conflicts:
- package.json/package-lock.json: updated @openhands/extensions hash to match main;
  kept @openhands/typescript-client branch pin (add-llm-subscription-client) needed
  for subscription feature until typescript-client PR #178 is merged and released
- src/mocks/handlers.ts: use workspaces-handlers (renamed in main) with resetMockWorkspaces
- src/mocks/mcp-handlers.ts: use main's typed MCPTestResponse version
- tests/e2e/snapshots/settings-page.snapshot.spec.ts: use 15000 (no underscore) from main

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai Bot commented May 30, 2026

Summary

Request

Fix merge conflicts in PR #744 (add-llm-subscription-support).

What Was Done

Fetched latest main — identified it had advanced to 75e578a0 (v1.0.0-alpha.9), introducing 5 files with merge conflicts.

Resolved all 5 conflicts and pushed — merge commit 5ca3e550 was pushed to origin/add-llm-subscription-support, updating the PR.

Conflicts Resolved

| File | Resolution |
| --- | --- |
| package.json | Updated @openhands/extensions to main's hash (e14f740c); kept @openhands/typescript-client on the add-llm-subscription-client branch pin (required for subscription feature until typescript-client PR #178 merges) |
| package-lock.json | Same as package.json; preserved git+https:// URL scheme (not git+ssh://) to avoid Vercel build failures |
| src/mocks/handlers.ts | Used main's version with the renamed workspaces-handlers (WORKSPACES_HANDLERS / resetMockWorkspaces) |
| src/mocks/mcp-handlers.ts | Used main's fuller version with the MCPTestResponse TypeScript type and explanatory JSDoc |
| tests/e2e/snapshots/settings-page.snapshot.spec.ts | Adopted main's 15000 style (no _ numeric separator); purely cosmetic |

No Extraneous Changes

Only the 5 conflicted files were modified — all other files in the merge were auto-resolved by git with no manual intervention needed. The subscription feature files (llm-subscription-service.ts, openai-subscription-auth-card.tsx, etc.) were untouched and remain intact in the PR.


openhands-ai Bot commented May 30, 2026

@openhands-ai[bot] it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.

github-actions Bot added a commit that referenced this pull request May 30, 2026
@github-actions
Contributor

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

✅ 6 snapshots changed — acknowledged via the update-snapshots label. New baselines will be uploaded when this PR merges.

| Category | Count |
| --- | --- |
| 🔴 Changed | 6 |
| 🆕 New | 0 |
| ✅ Unchanged | 67 |
| Total | 73 |

🔴 Changed snapshots (6)

changes-tab

  • changes-diff-viewer

onboarding

  • onboarding-step-2-setup-llm

settings-page (2 snapshots)

  • settings-app-page
  • settings-page

settings-verification (2 snapshots)

  • condenser-settings
  • verification-settings-off

[Images: expected (main), actual (PR), and diff for each changed snapshot]

✅ Unchanged snapshots (67)

archived-conversation

  • conversation-panel-with-archived-badges
  • conversation-view-archived
  • conversation-view-sandbox-error

automations

  • automations-delete-modal
  • automations-list-active-inactive
  • automations-no-automations
  • automations-search-no-results

backends-extended

  • backend-add-blank-disabled
  • backend-add-cloud-advanced-open
  • backend-add-cloud-no-key-disabled
  • backend-add-cloud-with-key-enabled
  • backend-add-form-partially-filled
  • backend-add-invalid-url-disabled
  • backend-add-local-ready
  • backend-add-name-only-disabled
  • backend-add-two-column-layout
  • backend-add-whitespace-host-disabled
  • backend-after-switch
  • backend-cancel-nothing-saved
  • backend-dropdown-two-backends
  • backend-edit-prefilled
  • backend-manage-after-removal
  • backend-manage-two-listed
  • backend-remove-cancelled
  • backend-remove-confirmation
  • backend-switch-overlay

backends

  • backend-add-modal
  • backend-manage-modal
  • backend-selector-open

changes-tab

  • changes-deleted-file
  • changes-empty

collapsible-thinking

  • reasoning-content-collapsed
  • reasoning-content-expanded
  • think-action-collapsed
  • think-action-expanded

mcp-page

  • mcp-custom-server-1-editor-open
  • mcp-custom-server-2-url-filled
  • mcp-custom-server-3-all-filled
  • mcp-custom-server-4-installed
  • mcp-custom-server-editor
  • mcp-empty-installed
  • mcp-search-filtered
  • mcp-slack-install-1-marketplace
  • mcp-slack-install-2-modal
  • mcp-slack-install-3-filled
  • mcp-slack-install-4-installed

onboarding

  • onboarding-step-0-choose-agent
  • onboarding-step-1-check-backend
  • onboarding-step-3-say-hello

projects-workspace-browser

  • projects-workspace-browser

settings-page

  • add-backend-modal
  • analytics-consent-modal
  • home-screen

settings-secrets

  • secrets-add-form-filled
  • secrets-add-form
  • secrets-after-save
  • secrets-delete-confirm
  • secrets-list

settings-verification

  • verification-settings-on

sidebar

  • sidebar-collapsed
  • sidebar-conversation-panel
  • sidebar-filter-menu

skills-page

  • skills-empty
  • skills-loaded
  • skills-no-match
  • skills-search-filtered
  • skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

Labels

review-this update-snapshots Intentional snapshot changes — CI diff check bypassed; new baselines uploaded on merge

Development

Successfully merging this pull request may close these issues.

Track PR #744: Add ChatGPT subscription LLM support

4 participants