Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
164 changes: 164 additions & 0 deletions pkg/llmproxy/auth/kiro/runtime_helpers.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,164 @@
package kiro

import (
"context"
"crypto/sha256"
"encoding/hex"
"fmt"
"net/http"
"net/url"
"strings"
"sync"

"github.com/google/uuid"
)

// Constants used across the kiro auth package for AWS CodeWhisperer
// runtime endpoint construction.
const (
DefaultKiroRegion = "us-east-1"
pathGetUsageLimits = "getUsageLimits"
pathListAvailableModels = "listAvailableModels"
)

// ProfileARN is a parsed representation of an AWS CodeWhisperer profile ARN.
type ProfileARN struct {
Raw string
Partition string
Service string
Region string
AccountID string
ResourceType string
ResourceID string
}

// ParseProfileARN parses a profile ARN of the form:
//
// arn:<partition>:codewhisperer:<region>:<account-id>:<resourceType>/<resourceID>
//
// Returns nil if the ARN is malformed or not for CodeWhisperer.
func ParseProfileARN(arn string) *ProfileARN {
if arn == "" {
return nil
}
parts := strings.Split(arn, ":")
if len(parts) < 6 || parts[0] != "arn" || parts[1] == "" || parts[2] != "codewhisperer" {
return nil
}
region := strings.TrimSpace(parts[3])
if region == "" || !strings.Contains(region, "-") {
return nil
}
resource := strings.Join(parts[5:], ":")
resourceType := resource
resourceID := ""
if idx := strings.Index(resource, "/"); idx > 0 {
resourceType = resource[:idx]
resourceID = resource[idx+1:]
}
return &ProfileARN{
Raw: arn,
Partition: parts[1],
Service: parts[2],
Region: region,
AccountID: parts[4],
ResourceType: resourceType,
ResourceID: resourceID,
}
}

// ExtractRegionFromProfileArn returns the region embedded in a CodeWhisperer
// profile ARN, or "" if the ARN is unparseable.
func ExtractRegionFromProfileArn(profileArn string) string {
parsed := ParseProfileARN(profileArn)
if parsed == nil {
return ""
}
return parsed.Region
}

// GetKiroAPIEndpoint returns the Q runtime endpoint for the given AWS region,
// falling back to DefaultKiroRegion when region is empty.
func GetKiroAPIEndpoint(region string) string {
if region == "" {
region = DefaultKiroRegion
}
return "https://q." + region + ".amazonaws.com"
}

// buildURL composes a runtime URL from endpoint, path, and optional query
// parameters. Empty parameter values are skipped.
func buildURL(endpoint, path string, queryParams map[string]string) string {
fullURL := fmt.Sprintf("%s/%s", endpoint, path)
if len(queryParams) == 0 {
return fullURL
}
values := url.Values{}
for key, value := range queryParams {
if strings.TrimSpace(value) == "" {
continue
}
values.Set(key, value)
}
if encoded := values.Encode(); encoded != "" {
fullURL += "?" + encoded
}
return fullURL
}

// GenerateAccountKey derives a stable short hash from a seed string,
// suitable for use as a per-account cache key.
func GenerateAccountKey(seed string) string {
hash := sha256.Sum256([]byte(seed))
return hex.EncodeToString(hash[:8])
}

// GetAccountKey produces a stable account key derived from clientID
// (preferred) or refreshToken; falls back to a UUID if both are empty.
func GetAccountKey(clientID, refreshToken string) string {
if clientID != "" {
return GenerateAccountKey(clientID)
}
if refreshToken != "" {
return GenerateAccountKey(refreshToken)
}
return GenerateAccountKey(uuid.New().String())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Returning a random UUID when both clientID and refreshToken are empty might lead to cache pollution if this key is used for a per-account cache. Consider using a stable sentinel value or returning an empty string if no stable identifier is available to avoid creating unnecessary cache entries.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The fallback path generates a new UUID every time both identifiers are empty, which makes the "account key" non-stable and creates a new fingerprint entry on each call. Because fingerprints are cached in a process-wide map by key, this can cause unbounded map growth and inconsistent headers/rate-limit identity for the same logical anonymous caller. Use a deterministic fallback key instead of a per-call random UUID. [resource leak]

Severity Level: Major ⚠️
- ❌ Long-running quota checks leak memory via FingerprintManager map.
- ⚠️ Anonymous tokens send inconsistent User-Agent and x-amz-user-agent.
Steps of Reproduction ✅
1. In application code, create a usage checker via `NewUsageChecker(cfg)` defined in
`pkg/llmproxy/auth/kiro/usage_checker.go:56-59`.

2. Call `checker.CheckUsageByAccessToken(ctx, accessToken, profileArn)` from
`usage_checker.go:120-126`, which builds a `KiroTokenData` with only `AccessToken` and
`ProfileArn` set (no `ClientID` or `RefreshToken`).

3. Inside `CheckUsage` (`usage_checker.go:70-96`), it computes `accountKey :=
GetAccountKey(tokenData.ClientID, tokenData.RefreshToken)` where both fields are empty,
causing `GetAccountKey` in `runtime_helpers.go:118-125` to execute the fallback `return
GenerateAccountKey(uuid.New().String())` on every call.

4. `setRuntimeHeaders` in `runtime_helpers.go:145-151` then calls
`GetGlobalFingerprintManager().GetFingerprint(accountKey)`, which stores a new entry in
`FingerprintManager.fingerprints` (`fingerprint.go:30-32, 83-101`) for each invocation;
repeated quota checks with only an access token cause the fingerprints map to grow
unboundedly and send inconsistent fingerprint-derived headers for the same logical caller.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/auth/kiro/runtime_helpers.go
**Line:** 125:125
**Comment:**
	*Resource Leak: The fallback path generates a new UUID every time both identifiers are empty, which makes the "account key" non-stable and creates a new fingerprint entry on each call. Because fingerprints are cached in a process-wide map by key, this can cause unbounded map growth and inconsistent headers/rate-limit identity for the same logical anonymous caller. Use a deterministic fallback key instead of a per-call random UUID.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

}

// Process-wide FingerprintManager singleton. Created on first use.
var (
globalFingerprintManager *FingerprintManager
globalFingerprintManagerOnce sync.Once
)

// GetGlobalFingerprintManager returns the process-wide FingerprintManager,
// initializing it on first call.
func GetGlobalFingerprintManager() *FingerprintManager {
globalFingerprintManagerOnce.Do(func() {
globalFingerprintManager = NewFingerprintManager()
})
return globalFingerprintManager
}

// setRuntimeHeaders applies the Authorization, user-agent, and AWS SDK
// invocation headers to req using the per-account fingerprint.
func setRuntimeHeaders(req *http.Request, accessToken string, accountKey string) {
fp := GetGlobalFingerprintManager().GetFingerprint(accountKey)
req.Header.Set("Authorization", "Bearer "+accessToken)
req.Header.Set("x-amz-user-agent", fp.BuildAmzUserAgent())
req.Header.Set("User-Agent", fp.BuildUserAgent())
req.Header.Set("amz-sdk-invocation-id", uuid.New().String())
req.Header.Set("amz-sdk-request", "attempt=1; max=1")
}

// FetchProfileArn is the exported wrapper around the internal sso_oidc.go
// fetchProfileArn helper. It delegates to the unexported method so callers
// outside this file (such as oauth_web.go) can resolve a profile ARN
// without exposing transport details. clientID and refreshToken are
// accepted for forward compatibility (per-account fingerprinting) but
// the underlying API call only requires the access token.
func (c *SSOOIDCClient) FetchProfileArn(ctx context.Context, accessToken, clientID, refreshToken string) string {
_ = clientID
_ = refreshToken
return c.fetchProfileArn(ctx, accessToken)
}
3 changes: 3 additions & 0 deletions pkg/llmproxy/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -200,6 +200,9 @@ type RemoteManagement struct {
SecretKey string `yaml:"secret-key"`
// DisableControlPanel skips serving and syncing the bundled management UI when true.
DisableControlPanel bool `yaml:"disable-control-panel"`
// DisableAutoUpdatePanel skips the periodic auto-update of the bundled
// management UI asset while still serving the control panel itself.
DisableAutoUpdatePanel bool `yaml:"disable-auto-update-panel"`
Comment on lines +203 to +205
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Architect Review — HIGH

The new RemoteManagement.DisableAutoUpdatePanel toggle is not included in BuildConfigChangeDetails, so changing this setting is not reported in config reload diff output, unlike the other remote-management fields.

Suggestion: Extend watcher/diff/config_diff.go to compare oldCfg.RemoteManagement.DisableAutoUpdatePanel vs newCfg.RemoteManagement.DisableAutoUpdatePanel and append a formatted change entry (e.g., "remote-management.disable-auto-update-panel: false -> true"), matching the existing patterns and tests.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is an **Architect / Logical Review** comment left during a code review. These reviews are first-class, important findings — not optional suggestions. Do NOT dismiss this as a 'big architectural change' just because the title says architect review; most of these can be resolved with a small, localized fix once the intent is understood.

**Path:** pkg/llmproxy/config/config.go
**Line:** 203:205
**Comment:**
	*HIGH: The new RemoteManagement.DisableAutoUpdatePanel toggle is not included in BuildConfigChangeDetails, so changing this setting is not reported in config reload diff output, unlike the other remote-management fields.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
If a suggested approach is provided above, use it as the authoritative instruction. If no explicit code suggestion is given, you MUST still draft and apply your own minimal, localized fix — do not punt back with 'no suggestion provided, review manually'. Keep the change as small as possible: add a guard clause, gate on a loading state, reorder an await, wrap in a conditional, etc. Do not refactor surrounding code or expand scope beyond the finding.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

// PanelGitHubRepository overrides the GitHub repository used to fetch the management panel asset.
// Accepts either a repository URL (https://github.com/org/repo) or an API releases endpoint.
PanelGitHubRepository string `yaml:"panel-github-repository"`
Expand Down
17 changes: 11 additions & 6 deletions pkg/llmproxy/store/gitstore.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,11 +13,12 @@ import (
"time"

"github.com/go-git/go-git/v6"
gitclient "github.com/go-git/go-git/v6/plumbing/client"
"github.com/go-git/go-git/v6/config"
"github.com/go-git/go-git/v6/plumbing"
"github.com/go-git/go-git/v6/plumbing/object"
"github.com/go-git/go-git/v6/plumbing/transport"
"github.com/go-git/go-git/v6/plumbing/transport/http"
githttp "github.com/go-git/go-git/v6/plumbing/transport/http"
cliproxyauth "github.com/kooshapari/CLIProxyAPI/v7/sdk/cliproxy/auth"
)

Expand Down Expand Up @@ -120,7 +121,7 @@ func (s *GitTokenStore) EnsureRepository() error {
s.dirLock.Unlock()
return fmt.Errorf("git token store: create repo dir: %w", errMk)
}
if _, errClone := git.PlainClone(repoDir, &git.CloneOptions{Auth: authMethod, URL: s.remote}); errClone != nil {
if _, errClone := git.PlainClone(repoDir, &git.CloneOptions{ClientOptions: authMethod, URL: s.remote}); errClone != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Refactor the repository bootstrapping and clone/error-handling branch into one or more helper methods so this modified function stays under the 40-line limit. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The updated EnsureRepository function is clearly far longer than the 40-line limit, with the clone/error-handling branch embedded in a large block of setup logic. The suggestion correctly identifies a real custom-rule violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/store/gitstore.go
**Line:** 124:124
**Comment:**
	*Custom Rule: Refactor the repository bootstrapping and clone/error-handling branch into one or more helper methods so this modified function stays under the 40-line limit.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

if errors.Is(errClone, transport.ErrEmptyRemoteRepository) {
_ = os.RemoveAll(gitDir)
repo, errInit := git.PlainInit(repoDir, false)
Expand Down Expand Up @@ -176,7 +177,7 @@ func (s *GitTokenStore) EnsureRepository() error {
s.dirLock.Unlock()
return fmt.Errorf("git token store: worktree: %w", errWorktree)
}
if errPull := worktree.Pull(&git.PullOptions{Auth: authMethod, RemoteName: "origin"}); errPull != nil {
if errPull := worktree.Pull(&git.PullOptions{ClientOptions: authMethod, RemoteName: "origin"}); errPull != nil {
switch {
case errors.Is(errPull, git.NoErrAlreadyUpToDate),
errors.Is(errPull, git.ErrUnstagedChanges),
Expand Down Expand Up @@ -538,15 +539,19 @@ func (s *GitTokenStore) repoDirSnapshot() string {
return s.repoDir
}

func (s *GitTokenStore) gitAuth() transport.AuthMethod {
// gitAuth returns transport client options carrying HTTP basic auth derived
// from the configured username/password, or nil when no credentials are set.
func (s *GitTokenStore) gitAuth() []gitclient.Option {
if s.username == "" && s.password == "" {
return nil
}
user := s.username
if user == "" {
user = "git"
}
return &http.BasicAuth{Username: user, Password: s.password}
return []gitclient.Option{
gitclient.WithHTTPAuth(&githttp.BasicAuth{Username: user, Password: s.password}),
}
}

func (s *GitTokenStore) relativeToRepo(path string) (string, error) {
Expand Down Expand Up @@ -637,7 +642,7 @@ func (s *GitTokenStore) commitAndPushLocked(message string, relPaths ...string)
return errRewrite
}
s.maybeRunGC(repo)
if err = repo.Push(&git.PushOptions{Auth: s.gitAuth(), Force: true}); err != nil {
if err = repo.Push(&git.PushOptions{ClientOptions: s.gitAuth(), Force: true}); err != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Extract the push/rewrite/GC sequence into a dedicated helper to keep this modified commit flow function within the 40-line maximum. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The commitAndPushLocked function is well over 40 lines long, and the push/rewrite/GC sequence is part of that oversized workflow. This is a real violation of the custom line-limit rule, so the suggestion is valid.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/store/gitstore.go
**Line:** 645:645
**Comment:**
	*Custom Rule: Extract the push/rewrite/GC sequence into a dedicated helper to keep this modified commit flow function within the 40-line maximum.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

if errors.Is(err, git.NoErrAlreadyUpToDate) {
return nil
}
Expand Down
2 changes: 0 additions & 2 deletions pkg/llmproxy/thinking/provider/codex/apply.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,6 @@
package codex

import (
"strings"

"github.com/kooshapari/CLIProxyAPI/v7/pkg/llmproxy/registry"
"github.com/kooshapari/CLIProxyAPI/v7/pkg/llmproxy/thinking"
"github.com/tidwall/gjson"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
continue
}
partJSON := `{}`
partJSON, _ = sjson.SetBytesM(partJSON, "text", systemPrompt)
partJSON, _ = sjson.SetBytes(partJSON, "text", systemPrompt)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

sjson.SetBytes expects a []byte as its first argument, but partJSON is a string. To fix the compilation error while keeping partJSON as a string, use sjson.Set instead.

Suggested change
partJSON, _ = sjson.SetBytes(partJSON, "text", systemPrompt)
partJSON, _ = sjson.Set(partJSON, "text", systemPrompt)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: sjson.SetBytes expects a byte-slice JSON buffer, but this call passes a string buffer, which breaks the sjson API contract and will fail type-checking. Keep this value as []byte (or use the string-based sjson API consistently) before setting fields. [type error]

Severity Level: Critical 🚨
- ❌ Antigravity Claude translator package fails to compile.
- ⚠️ Tests for antigravity Claude translator cannot run.
Steps of Reproduction ✅
1. From the repository root `/workspace/cliproxyapi-plusplus`, run `go test
./pkg/llmproxy/translator/antigravity/claude` which builds the `claude` package containing
`ConvertClaudeRequestToAntigravity` (see
`pkg/llmproxy/translator/antigravity/claude/antigravity_claude_request_test.go:11,27`).

2. The Go toolchain compiles
`pkg/llmproxy/translator/antigravity/claude/antigravity_claude_request.go`, where at lines
58–60 the system-instruction loop initializes `partJSON := \`{}\`` (line 58) as a string
and then calls `partJSON, _ = sjson.SetBytes(partJSON, "text", systemPrompt)` (line 59).

3. Because `github.com/tidwall/sjson` is imported at lines 16–17 and its `SetBytes` API
expects a `[]byte` as the first parameter, the compiler reports a type mismatch:
`partJSON` is a `string` at line 58, so it cannot be passed to `sjson.SetBytes` which
requires `[]byte`, causing a compile-time error in this package.

4. The build of `pkg/llmproxy/translator/antigravity/claude` fails, so the entire `go test
./pkg/llmproxy/translator/antigravity/claude` invocation (and any broader `go test ./...`
or `go build ./...` including this package) is blocked before any tests or runtime paths
for `ConvertClaudeRequestToAntigravity` can execute.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/translator/antigravity/claude/antigravity_claude_request.go
**Line:** 59:59
**Comment:**
	*Type Error: `sjson.SetBytes` expects a byte-slice JSON buffer, but this call passes a string buffer, which breaks the sjson API contract and will fail type-checking. Keep this value as `[]byte` (or use the string-based sjson API consistently) before setting fields.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

systemInstructionJSON, _ = sjson.SetRaw(systemInstructionJSON, "parts.-1", partJSON)
hasSystemInstruction = true
}
Expand All @@ -65,7 +65,7 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
systemPrompt := strings.TrimSpace(systemResult.String())
if systemPrompt != "" {
systemInstructionJSON = `{"role":"user","parts":[{"text":""}]}`
systemInstructionJSON, _ = sjson.SetBytesM(systemInstructionJSON, "parts.0.text", systemPrompt)
systemInstructionJSON, _ = sjson.SetBytes(systemInstructionJSON, "parts.0.text", systemPrompt)
hasSystemInstruction = true
}
}
Expand Down Expand Up @@ -380,7 +380,7 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
continue
}
partJSON := `{}`
partJSON, _ = sjson.SetBytesM(partJSON, "text", prompt)
partJSON, _ = sjson.SetBytes(partJSON, "text", prompt)
clientContentJSON, _ = sjson.SetRaw(clientContentJSON, "parts.-1", partJSON)
contentsJSON, _ = sjson.SetRaw(contentsJSON, "-1", clientContentJSON)
hasContents = true
Expand Down Expand Up @@ -526,7 +526,7 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
maxTokens = limit
}
}
out, _ = sjson.SetBytesM(out, "request.generationConfig.maxOutputTokens", maxTokens)
out, _ = sjson.SetBytes(out, "request.generationConfig.maxOutputTokens", maxTokens)
}

out = common.AttachDefaultSafetySettings(out, "request.safetySettings")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ func ConvertAntigravityResponseToOpenAI(_ context.Context, _ string, originalReq
}
promptTokenCount := usageResult.Get("promptTokenCount").Int() - cachedTokenCount
thoughtsTokenCount := usageResult.Get("thoughtsTokenCount").Int()
template, _ = sjson.SetBytesM(template, "usage.prompt_tokens", promptTokenCount+thoughtsTokenCount)
template, _ = sjson.SetBytes(template, "usage.prompt_tokens", promptTokenCount+thoughtsTokenCount)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Since this change modifies a function whose body is far over 40 lines, extract the usage-metadata update block into a dedicated helper so the main conversion function stays within the size limit. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The modified function is well over 40 lines long, and the usage-metadata logic is embedded directly inside it. This matches the suggestion's stated size-limit concern, so the rule violation is present in the existing code.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/translator/antigravity/openai/chat-completions/antigravity_openai_response.go
**Line:** 106:106
**Comment:**
	*Custom Rule: Since this change modifies a function whose body is far over 40 lines, extract the usage-metadata update block into a dedicated helper so the main conversion function stays within the size limit.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: thoughtsTokenCount represents reasoning/output tokens, but this line adds it to usage.prompt_tokens (input tokens). That misclassifies output tokens as prompt tokens and can produce inconsistent usage accounting (prompt_tokens + completion_tokens no longer reflects the true split), which breaks downstream billing/analytics expectations. Keep prompt tokens as prompt-only and account reasoning tokens under completion/output fields. [logic error]

Severity Level: Major ⚠️
- ❌ Antigravity OpenAI usage misattributes reasoning tokens as prompt.
- ⚠️ Usage logger `TokenStats.InputTokens` inflated for antigravity reasoning.
- ⚠️ TUI usage tab shows incorrect input versus reasoning totals.
- ⚠️ Provider analytics per-provider metrics skewed for antigravity sessions.
Steps of Reproduction ✅
1. Trigger the antigravity → OpenAI chat-completions translator by calling
`ConvertAntigravityResponseToOpenAI` in
`pkg/llmproxy/translator/antigravity/openai/chat-completions/antigravity_openai_response.go:50`
with a Gemini CLI streaming chunk whose `response.usageMetadata` includes:

   - `promptTokenCount`

   - `candidatesTokenCount`

   - `totalTokenCount`

   - non-zero `thoughtsTokenCount`

   - `cachedContentTokenCount`.

2. Inside `ConvertAntigravityResponseToOpenAI`, the usage block at
`antigravity_openai_response.go:95-107` executes:

   - `promptTokenCount := usageResult.Get("promptTokenCount").Int() - cachedTokenCount`
   (line 104),

   - `thoughtsTokenCount := usageResult.Get("thoughtsTokenCount").Int()` (line 105),

   - `template, _ = sjson.SetBytes(template, "usage.prompt_tokens",
   promptTokenCount+thoughtsTokenCount)` (line 106),

   - and, if `thoughtsTokenCount > 0`, `template, _ = sjson.SetBytes(template,
   "usage.completion_tokens_details.reasoning_tokens", thoughtsTokenCount)` (line 108).

   This produces an OpenAI-style chunk where `usage.prompt_tokens` already includes
   reasoning/output tokens (`thoughtsTokenCount`), while
   `usage.completion_tokens_details.reasoning_tokens` separately records them as reasoning
   tokens.

3. Feed the resulting OpenAI-style JSON (for example, by wrapping the chunk into a
non-streaming response payload) through the shared usage parser `parseOpenAIUsage` in
`pkg/llmproxy/executor/usage_helpers.go:52-77`:

   - `InputTokens` is read from `usage.prompt_tokens` (line 54),

   - `OutputTokens` from `usage.completion_tokens` (line 55),

   - `ReasoningTokens` from `usage.completion_tokens_details.reasoning_tokens` (lines
   65-72).

   Because `usage.prompt_tokens` already contains `promptTokenCount + thoughtsTokenCount`
   and `ReasoningTokens` is set to `thoughtsTokenCount`, the same reasoning tokens are now
   counted both as input tokens and as reasoning tokens.

4. Observe downstream effects in analytics/logging:

   - `usageReporter.publishWithOutcome` in `pkg/llmproxy/executor/usage_helpers.go:63-73`
   forwards this `usage.Detail` to `usage.PublishRecord`,

   - `pkg/llmproxy/usage/logger_plugin.go:11-18` stores these as `TokenStats` where
   `InputTokens` includes reasoning tokens, and `ReasoningTokens` also includes them,

   - the TUI aggregation in `pkg/llmproxy/tui/usage_tab.go:10-26` then reports inflated
   input token totals and separate reasoning totals, breaking the expected
   prompt/completion split (prompt tokens are no longer pure input, they include
   reasoning/output tokens).

   This behavior contrasts with the canonical mapping in
   `pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_response.go:13-25`,
   where reasoning tokens are only mapped under
   `usage.output_tokens_details.reasoning_tokens` and not added into
   `usage.prompt_tokens`, showing that the antigravity translator's use of
   `promptTokenCount+thoughtsTokenCount` for `usage.prompt_tokens` is inconsistent and
   misclassifies reasoning/output tokens as prompt tokens.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/translator/antigravity/openai/chat-completions/antigravity_openai_response.go
**Line:** 106:106
**Comment:**
	*Logic Error: `thoughtsTokenCount` represents reasoning/output tokens, but this line adds it to `usage.prompt_tokens` (input tokens). That misclassifies output tokens as prompt tokens and can produce inconsistent usage accounting (`prompt_tokens + completion_tokens` no longer reflects the true split), which breaks downstream billing/analytics expectations. Keep prompt tokens as prompt-only and account reasoning tokens under completion/output fields.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

if thoughtsTokenCount > 0 {
template, _ = sjson.SetBytes(template, "usage.completion_tokens_details.reasoning_tokens", thoughtsTokenCount)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ import (
"strings"

"github.com/google/uuid"
"github.com/kooshapari/CLIProxyAPI/v7/pkg/llmproxy/registry"
"github.com/kooshapari/CLIProxyAPI/v7/pkg/llmproxy/thinking"
"github.com/kooshapari/CLIProxyAPI/v7/pkg/llmproxy/util"
"github.com/tidwall/gjson"
Expand Down
Loading
Loading