feat(ai-chat-ui): add token usage indicator to chat view#17028
Conversation
sdirix
left a comment
Seemingly works for me. I tried some requests and the numbers went up.
Major
- I am not sure we need a new service for token counting. We already have usage response parts, which we currently ignore; they could be reused for token counting.
- Connected with that, I think the token count should be stored in the model. We could do it via the usage response parts or store it in additional fields or data (a minimal sketch follows below this comment). Currently the count lives in React state, which is a temporary place and cannot handle persistence, chat branches, chat switching, etc.
Minor
Until the 200k limit has any specific meaning I would personally not show it and just count the tokens.
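To make the suggestion above concrete, here is a minimal sketch of what storing the count on the chat model could look like; the names TokenUsage and tokenUsage are illustrative assumptions, not the API that existed when this comment was written:

```ts
// Hypothetical sketch: persist token usage on the response model instead of
// in transient React state, so it survives serialization, chat branches and
// session switching.
export interface TokenUsage {
    inputTokens: number;
    outputTokens: number;
}

export interface ChatResponseModelWithUsage {
    // ...existing response model members elided...
    /** Usage reported by the language model for this response, if any. */
    readonly tokenUsage?: TokenUsage;
}
```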
There is no new service in place. I just extended the existing one.
You're right, I meant the new client, not the service. Would you prefer to rework the approach or keep the one from the PR for now? The current implementation could serve as a POC, but it has so many downsides that I am not sure it's good to merge in this state.
@ndoschek if you agree, I would rework this, but it won't make the February release then.
Force-pushed from 2234bd0 to 5d6d461
@ndoschek why did you change this to draft? I updated the PR and it is ready imho.
Ah sorry Eugen, I was working through my GH notification mails and only saw your earlier comment; changed it back again.
sdirix
left a comment
Looks good conceptually to me.
I wonder whether we should also record the LLM/provider into each response, as the token count is conceptually dependent on it. What do you think?
Generally speaking the PR works, but it's easy to get into a "broken" state.
I let Coder look at a larger lock file. Initially the context was counted as 6k tokens; then, after I sent the next message, it suddenly jumped to 569k. I was using Opus 4.6. Coder still worked properly, so the counting is definitely wrong somewhere.
The initial UI does not look good:
Because the content bar is not filled, it looks like a broken separator. Personally, I would show neither the bar nor the token count at all if no tokens were counted.
Maybe we should mark this feature as experimental and disable it by default in Theia itself? For "Theia Next" we could enable it by default via the preferences, so it gets some testing before we hand it over to the users.
The PR must also be rebased/merged with master.
```ts
const tokenUsage = isResponseNode(node) ? node.response.tokenUsage : undefined;
const hasTokenInfo = tokenUsage && (tokenUsage.inputTokens > 0 || tokenUsage.outputTokens > 0);
const tokenInfo = hasTokenInfo
    ? `Input: ${formatTokenCount(tokenUsage.inputTokens)} · Output: ${formatTokenCount(tokenUsage.outputTokens)}`
    : undefined;
```
"Input:", "Output:", and the · separator are user-facing strings and must be localized with nls.localize.
Force-pushed from a8dc1b4 to a831c30
ndoschek
left a comment
Thank you @eneufeld, overall this works nicely!
I have a few inline comments and one comment below.
I agree with Stefan that the hardcoded 200k context window can be a bit misleading across different models.
Regarding the setting defaulting to false: I can take care of opening a ticket so that we enable it for the Theia next product after this PR has been completed.
One additional thing I noticed:
When a request is aborted by the user, the token usage indicator is not updated for that request afaics; it keeps showing the data from the previous request. The tokens consumed by the aborted request then seem to be added to the next request (if any), since the next request's input includes the prior conversation context. It would be good to also update the indicator on abort with whatever usage data was received up to that point. This could also be addressed in a follow-up or documented as a known limitation for now.
Add a context window usage indicator bar between the chat messages and input area, showing cumulative token consumption across all requests in a session.

Architecture:
- Language models yield UsageResponsePart inline in their stream/response instead of calling TokenUsageService directly
- AbstractChatAgent captures usage into ChatResponseModel.tokenUsage and records it centrally via TokenUsageService
- Token usage is serialized/restored with chat sessions
- The UI reads directly from the chat model

Changes across providers:
- Remove TokenUsageService injection from all 6 provider managers (Anthropic, Google, Ollama, OpenAI, Copilot, Vercel AI)
- All providers yield usage data as UsageResponsePart in streaming and return usage on text/parsed responses
- Claude Code and Codex agents call setTokenUsage() directly as they don't extend AbstractChatAgent

UI:
- Progress bar with green/yellow/red color coding against a 200k context window
- Per-response token counts shown on agent label hover
- Comprehensive unit tests for indicator logic and rendering
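As a rough illustration of the streaming architecture described above, here is a hedged sketch of a provider yielding a usage part inline; the UsageResponsePart shape and the raw chunk format are assumptions, not the exact types from this PR:

```ts
// Hypothetical raw chunk from an LLM SDK and the parts a provider yields.
interface RawChunk { text?: string; usage?: { input: number; output: number } }
interface TextResponsePart { content: string }
interface UsageResponsePart { usage: { inputTokens: number; outputTokens: number } }
type ResponsePart = TextResponsePart | UsageResponsePart;

async function* streamWithUsage(chunks: AsyncIterable<RawChunk>): AsyncIterable<ResponsePart> {
    for await (const chunk of chunks) {
        if (chunk.text !== undefined) {
            yield { content: chunk.text };
        }
        if (chunk.usage) {
            // Instead of calling TokenUsageService here, yield the usage inline;
            // AbstractChatAgent captures it into ChatResponseModel.tokenUsage and
            // records it centrally via TokenUsageService.
            yield { usage: { inputTokens: chunk.usage.input, outputTokens: chunk.usage.output } };
        }
    }
}
```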
Stefan is on vacation this week; Simon will take over.
Thanks for the updates @eneufeld, works great for me 👍
As discussed yesterday, I had a quick look and did a little redesign of the UI indicator and pushed it on top.
I'll ask Simon to have a final look.
The UI is now a donut ring icon, filling up according to token usage.
When getting closer to the limit, we use warning colors for the icon and the input border and add an info message to the hover content. If the limit is reached, we use error colors and a different info message.
Some screenshots:
Simon will take over, as I did the redesign of the UI.
For more extensive testing, we enable the token UI in the chat view by default for the Theia IDE next product. See also eclipse-theia/theia#17028.
- Remove the standalone ChatTokenUsageIndicatorWidget and its DI binding
- Integrate the token usage display into the chat input widget directly
- Replace the full-width progress bar with a compact circular progress ring next to the send button, using a conic-gradient fill
- Tint the input box border yellow/red when approaching or exceeding the context window limit
- Show a multiline hover tooltip with input/output/cache breakdown and warning hints for warning/error states
- Use MarkdownString for the hover to support proper line breaks
- Reuse localization keys for token info across badge and agent description hovers
- Keep utility functions (formatTokenCount, getUsageColorClass, etc.) and their tests, remove widget-specific tests
- Rename the tsx file to a util ts file as we no longer have a dedicated indicator widget
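For illustration, a hedged sketch of how a conic-gradient fill for such a progress ring could be computed; the function name, sizes, and CSS variables are assumptions, not the styling actually committed:

```ts
import * as React from 'react';

// Hypothetical style for the compact progress ring next to the send button.
// An inner circle or mask would cut out the donut hole; omitted for brevity.
export function usageRingStyle(usedTokens: number, limitTokens: number): React.CSSProperties {
    const percent = Math.min(100, Math.round((usedTokens / limitTokens) * 100));
    return {
        width: '16px',
        height: '16px',
        borderRadius: '50%',
        // Fill clockwise up to the current usage percentage.
        background: `conic-gradient(var(--theia-progressBar-background) ${percent}%, var(--theia-editorWidget-background) ${percent}%)`
    };
}
```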
sgraband
left a comment
Thank you! I really like the new design! LGTM 👍
What it does
Adds a per-session token usage indicator to the chat view, showing current conversation context size vs a 200k token budget.
Key design: extends TokenUsageService with a required sessionId field so all LLM providers (Anthropic, OpenAI, Google, Vercel, Copilot, Ollama) feed the indicator.
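A hedged sketch of what the extended service contract could look like; field and method names are assumptions based on the description above, not the exact Theia API:

```ts
// Hypothetical extension: sessionId is required so recorded usage can be
// attributed to (and summed per) chat session for the indicator.
interface TokenUsageParams {
    inputTokens: number;
    outputTokens: number;
    requestId: string;
    sessionId: string; // the new required field described above
}

interface TokenUsageService {
    recordTokenUsage(modelId: string, params: TokenUsageParams): Promise<void>;
}
```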
The indicator is controlled by the ai-features.chat.tokenUsageIndicator.enabled preference.
Resolves GH-17322
How to test
Enable the ai-features.chat.tokenUsageIndicator.enabled preference, then send a few chat requests and watch the indicator update.
Follow-ups
The context window of 200k tokens is hardcoded; this should come from the models (sketched below).
See also #16703 for further follow ups
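One possible direction for the context window follow-up, sketched under the assumption that model metadata grows an optional contextWindow field (hypothetical, not part of this PR):

```ts
// Hypothetical: prefer a model-reported context window over the hardcoded 200k.
const DEFAULT_CONTEXT_WINDOW = 200_000;

interface LanguageModelMetadata {
    id: string;
    contextWindow?: number; // assumed optional field reported by the model/provider
}

function effectiveContextWindow(model: LanguageModelMetadata): number {
    return model.contextWindow ?? DEFAULT_CONTEXT_WINDOW;
}
```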
Breaking changes
Attribution
Review checklist
All user-facing strings are localized with the nls service (for details, please see the Internationalization/Localization section in the Coding Guidelines)
Reminder for reviewers