Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 1 addition & 5 deletions pkg/llmproxy/api/handlers/management/api_call.go
Original file line number Diff line number Diff line change
Expand Up @@ -201,11 +201,7 @@ func (h *Handler) APICall(c *gin.Context) {
req.Host = hostOverride
}

httpClient := &http.Client{
Timeout: defaultAPICallTimeout,
}
httpClient.Transport = h.apiCallTransport(auth)

httpClient := h.apiCallHTTPClient(auth)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Since this PR modifies a function that is far longer than 40 lines, refactor APICall by extracting cohesive sections (such as request parsing, header/token preparation, upstream execution, and response formatting) into small helper functions to keep the handler within the size rule. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The final APICall function is much longer than 40 lines, so it violates the stated function-length rule. The suggestion to extract request parsing, header/token handling, upstream execution, and response formatting into helpers directly addresses that real violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/api_call.go
**Line:** 204:204
**Comment:**
	*Custom Rule: Since this PR modifies a function that is far longer than 40 lines, refactor `APICall` by extracting cohesive sections (such as request parsing, header/token preparation, upstream execution, and response formatting) into small helper functions to keep the handler within the size rule.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

resp, errDo := httpClient.Do(req)
if errDo != nil {
log.WithError(errDo).Debug("management APICall request failed")
Expand Down
50 changes: 50 additions & 0 deletions pkg/llmproxy/api/handlers/management/api_call_guard.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
package management

import (
"context"
"fmt"
"net"
"net/http"
"net/url"
)

func guardedAPICallDialContext(ctx context.Context, network string, addr string) (net.Conn, error) {
host, port, errSplit := net.SplitHostPort(addr)
if errSplit != nil {
return nil, fmt.Errorf("invalid dial address: %w", errSplit)
}
resolved, errResolve := resolveAllowedAPICallHostIPs(host)
if errResolve != nil {
return nil, errResolve
}
if len(resolved) == 0 {
return nil, fmt.Errorf("target host resolution failed")
}
dialer := &net.Dialer{}
return dialer.DialContext(ctx, network, net.JoinHostPort(resolved[0].IP.String(), port))
Comment on lines +20 to +24
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The dial guard always connects to only the first resolved IP address, so requests can fail unnecessarily when that single address is unreachable or not compatible with the requested network family while other resolved addresses are valid. Iterate through the allowed addresses (or use a fallback strategy) instead of hard-pinning to index 0. [logic error]

Severity Level: Major ⚠️
- ❌ /v0/management/api-call intermittently fails for multi-IP targets.
- ⚠️ Management integrations see 502 errors though upstream is reachable.
- ⚠️ Requires retries or manual reruns for affected API calls.
Steps of Reproduction ✅
1. Start the server so the management API handler `APICall` is exposed at POST
`/v0/management/api-call` (see `pkg/llmproxy/api/handlers/management/api_call.go:32-38`,
`api_call.go:86`).

2. From a management client, send a POST to `/v0/management/api-call` with JSON body
targeting an external URL whose hostname resolves to multiple public IPs, where the first
resolved IP is unreachable but a later one is reachable (e.g., host has both IPv6 and IPv4
records but the environment has no IPv6 route). The request is parsed and sanitized in
`APICall` (url handling at `api_call.go:117-123`, `sanitizeAPICallURL` at
`api_call_url.go:37-60`, and `validateResolvedHostIPs` at `api_call_url.go:62-89`).

3. `APICall` builds an outbound request and constructs an HTTP client via `httpClient :=
h.apiCallHTTPClient(auth)` (see `api_call.go:180-189`, `api_call.go:25`).
`apiCallHTTPClient` creates an `http.Client` using `Transport:
apiCallGuardedRoundTripper{base: h.apiCallTransport(auth)}` (see
`http_transport.go:53-56`), and `apiCallTransport` uses a cloned `http.Transport` with
`clone.DialContext = guardedAPICallDialContext` when no proxy is configured (see
`http_transport.go:43-50`).

4. During `httpClient.Do(req)` (see `api_call.go:25-30`), the HTTP transport calls
`guardedAPICallDialContext` (implemented at `api_call_guard.go:11-24`), which resolves
allowed IPs via `resolveAllowedAPICallHostIPs(host)` and then dials only `resolved[0].IP`
(lines `api_call_guard.go:16-24`). If that first IP is unreachable while another allowed
IP in `resolved[1:]` is reachable, the dial fails and `APICall` returns `HTTP 502` with
`"request failed"` (see `api_call.go:25-30`), even though a working address for the same
hostname exists—an avoidable failure caused by pinning to the first resolved IP instead of
iterating or falling back.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/api_call_guard.go
**Line:** 20:24
**Comment:**
	*Logic Error: The dial guard always connects to only the first resolved IP address, so requests can fail unnecessarily when that single address is unreachable or not compatible with the requested network family while other resolved addresses are valid. Iterate through the allowed addresses (or use a fallback strategy) instead of hard-pinning to index 0.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +16 to +24
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The dial guard always connects to resolved[0] and never retries other validated IPs for the same host. On multi-record/dual-stack DNS responses, this can fail requests when the first address is unreachable even though another allowed address would succeed. Iterate through resolved IPs (or use a dial strategy with fallback) instead of pinning to only the first entry. [logic error]

Severity Level: Major ⚠️
- ❌ /v0/management/api-call fails for partially reachable hosts.
- ⚠️ Copilot quota fetch can fail despite usable IPs.
- ⚠️ Reduces robustness against common dual-stack DNS setups.
Steps of Reproduction ✅
1. Start cliproxyapi++ so the management handler `APICall` in
`pkg/llmproxy/api/handlers/management/api_call.go:86` is serving `POST
/v0/management/api-call`.

2. Configure DNS for some allowed external host (e.g. `api.example.com`) so
`net.DefaultResolver.LookupIPAddr` returns multiple IPs where the first is unreachable
(firewalled or blackholed) and a later one is reachable.

3. From a management client, call `POST /v0/management/api-call` with JSON body
`{"method":"GET","url":"https://api.example.com/ping"}` so the handler builds an HTTP
client via `apiCallHTTPClient` (`http_transport.go:53-56`), which uses
`guardedAPICallDialContext` as its `DialContext` (`http_transport.go:47-50`).

4. When the client performs `httpClient.Do(req)` (`api_call.go:25-27`), the dial path
`guardedAPICallDialContext` (`api_call_guard.go:11-24`) resolves all allowed IPs but
unconditionally dials only `resolved[0].IP` (`api_call_guard.go:16-24`); the connection
fails even though a later allowed IP would succeed, causing the management API call to
fail unnecessarily.

5. The same single-IP dial behavior is exercised by the Copilot quota enrichment path,
which constructs an `http.Client` using `h.apiCallTransport(auth)` (with
`clone.DialContext = guardedAPICallDialContext` in `http_transport.go:43-50`) in
`copilot_quota.go:32-35`, so quota fetches also fail when the first IP is bad but others
are good.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/api_call_guard.go
**Line:** 16:24
**Comment:**
	*Logic Error: The dial guard always connects to `resolved[0]` and never retries other validated IPs for the same host. On multi-record/dual-stack DNS responses, this can fail requests when the first address is unreachable even though another allowed address would succeed. Iterate through resolved IPs (or use a dial strategy with fallback) instead of pinning to only the first entry.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +16 to +24
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Architect Review — HIGH

The guarded dialer resolves all allowed IPs for a host but always dials only resolved[0], assuming the first DNS result is reachable. On common dual-stack or multi-address setups (e.g., unreachable IPv6 first, reachable IPv4 second), management API calls will fail even though the target host is valid and reachable, regressing the default transport's multi-address fallback behavior.

Suggestion: Keep the SSRF/IP-allowlist enforcement but iterate (or otherwise perform happy-eyeballs over) the full allowed IP list, attempting connection to each in turn or with parallel race, instead of hard-pinning the dial to only the first resolved address.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is an **Architect / Logical Review** comment left during a code review. These reviews are first-class, important findings — not optional suggestions. Do NOT dismiss this as a 'big architectural change' just because the title says architect review; most of these can be resolved with a small, localized fix once the intent is understood.

**Path:** pkg/llmproxy/api/handlers/management/api_call_guard.go
**Line:** 16:24
**Comment:**
	*HIGH: The guarded dialer resolves all allowed IPs for a host but always dials only resolved[0], assuming the first DNS result is reachable. On common dual-stack or multi-address setups (e.g., unreachable IPv6 first, reachable IPv4 second), management API calls will fail even though the target host is valid and reachable, regressing the default transport's multi-address fallback behavior.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
If a suggested approach is provided above, use it as the authoritative instruction. If no explicit code suggestion is given, you MUST still draft and apply your own minimal, localized fix — do not punt back with 'no suggestion provided, review manually'. Keep the change as small as possible: add a guard clause, gate on a loading state, reorder an await, wrap in a conditional, etc. Do not refactor surrounding code or expand scope beyond the finding.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

}

type apiCallGuardedRoundTripper struct {
base http.RoundTripper
}

func (t apiCallGuardedRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
if req == nil {
return nil, fmt.Errorf("request is nil")
}
if errValidate := validateAPICallRequestURL(req.URL); errValidate != nil {
return nil, errValidate
}
base := t.base
if base == nil {
base = http.DefaultTransport
}
return base.RoundTrip(req)
}

func validateAPICallRequestURL(reqURL *url.URL) error {
if errValidate := validateAPICallURL(reqURL); errValidate != nil {
return errValidate
}
return validateResolvedHostIPs(reqURL.Hostname())
}
Comment on lines +45 to +50
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: URL validation in the round-tripper triggers host resolution through helpers that use background context, so DNS lookup is not tied to request cancellation or client timeout. Under slow/hanging DNS, requests can block longer than intended. Plumb req.Context() into host-resolution validation so cancellation and deadlines are respected. [possible bug]

Severity Level: Major ⚠️
- ❌ /v0/management/api-call may hang on slow DNS.
- ⚠️ Copilot quota enrichment can exceed caller deadlines.
- ⚠️ Request timeout configuration not fully enforced for DNS.
Steps of Reproduction ✅
1. Start cliproxyapi++ so the management handler `APICall` in
`pkg/llmproxy/api/handlers/management/api_call.go:86` is serving `POST
/v0/management/api-call`, and note the HTTP client timeout `defaultAPICallTimeout` used in
`apiCallHTTPClient` (`http_transport.go:53-55`).

2. Misconfigure system DNS (e.g., `/etc/resolv.conf`) or the target hostname so lookups
for some allowed host (e.g. `https://safe.example.com/ping`) hang or are extremely slow
when `net.DefaultResolver.LookupIPAddr` is called.

3. From a management client, issue `POST /v0/management/api-call` with a JSON body whose
`url` points to `https://safe.example.com/ping`; the handler sanitizes the URL and already
calls `validateResolvedHostIPs(parsedURL.Hostname())` (`api_call.go:13-20`), which
internally calls `resolveAllowedAPICallHostIPs` using `context.Background()`
(`api_call_url.go:62-73`), so this first DNS pass is already uncancelable by the request
context.

4. Next, `APICall` builds an HTTP client via `apiCallHTTPClient`
(`http_transport.go:53-56`); when `httpClient.Do(req)` runs, the transport
`apiCallGuardedRoundTripper` executes `RoundTrip` (`api_call_guard.go:31-42`), which again
calls `validateAPICallRequestURL` (`api_call_guard.go:45-49`) and therefore
`validateResolvedHostIPs(reqURL.Hostname())` (`api_call_guard.go:49`,
`api_call_url.go:62-73`), performing DNS resolution with `context.Background()` inside
`resolveAllowedAPICallHostIPs`.

5. Because `resolveAllowedAPICallHostIPs` uses `context.Background()` rather than
`req.Context()` or the client's timeout context, the APICall goroutine blocks inside
`LookupIPAddr` past the configured `defaultAPICallTimeout`, causing the
`/v0/management/api-call` request to hang until OS-level DNS timeouts elapse instead of
respecting the request deadline.

6. The same uncancelable DNS behavior appears in other call sites like Copilot quota
enrichment (`copilot_quota.go:15-19`), which also calls
`validateResolvedHostIPs(parsedQuotaURL.Hostname())` and thus is vulnerable to hanging
beyond its provided `ctx` deadline.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/api_call_guard.go
**Line:** 45:50
**Comment:**
	*Possible Bug: URL validation in the round-tripper triggers host resolution through helpers that use background context, so DNS lookup is not tied to request cancellation or client timeout. Under slow/hanging DNS, requests can block longer than intended. Plumb `req.Context()` into host-resolution validation so cancellation and deadlines are respected.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

50 changes: 50 additions & 0 deletions pkg/llmproxy/api/handlers/management/api_call_transport_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
package management

import (
"errors"
"net/http"
"net/url"
"testing"
)

type apiCallRoundTripFunc func(*http.Request) (*http.Response, error)

func (f apiCallRoundTripFunc) RoundTrip(req *http.Request) (*http.Response, error) {
return f(req)
}

func TestAPICallGuardedRoundTripperRejectsUnsafeRequestURL(t *testing.T) {
t.Parallel()

called := false
transport := apiCallGuardedRoundTripper{
base: apiCallRoundTripFunc(func(*http.Request) (*http.Response, error) {
called = true
return nil, errors.New("base transport should not run")
}),
}

req, err := http.NewRequest(http.MethodGet, "http://127.0.0.1:8317/ping", nil)
if err != nil {
t.Fatalf("new request: %v", err)
}
if _, err := transport.RoundTrip(req); err == nil {
t.Fatalf("RoundTrip error = nil, want unsafe target rejection")
}
if called {
t.Fatal("base transport ran for unsafe URL")
}
}

func TestAPICallRequestURLValidationRejectsUnsafeRedirectURL(t *testing.T) {
t.Parallel()

redirectURL, err := url.Parse("http://localhost:8317/ping")
if err != nil {
t.Fatalf("parse redirect url: %v", err)
}
req := &http.Request{URL: redirectURL}
if err := validateAPICallRequestURL(req.URL); err == nil {
t.Fatalf("validation error = nil, want unsafe redirect rejection")
}
}
25 changes: 18 additions & 7 deletions pkg/llmproxy/api/handlers/management/api_call_url.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package management

import (
"context"
"fmt"
"net"
"net/url"
Expand Down Expand Up @@ -59,23 +60,33 @@ func sanitizeAPICallURL(raw string) (string, *url.URL, error) {
}

func validateResolvedHostIPs(host string) error {
_, err := resolveAllowedAPICallHostIPs(host)
return err
}

func resolveAllowedAPICallHostIPs(host string) ([]net.IPAddr, error) {
trimmed := strings.TrimSpace(host)
if trimmed == "" {
return fmt.Errorf("invalid url host")
return nil, fmt.Errorf("invalid url host")
}
resolved, errLookup := net.LookupIP(trimmed)
resolved, errLookup := net.DefaultResolver.LookupIPAddr(context.Background(), trimmed)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: DNS resolution is performed with context.Background(), which ignores request cancellation and client timeout propagation. A slow or stuck resolver can block the handler path even after the request is canceled, causing request hangs and potential resource exhaustion under load. Thread the request/dial context into this resolver call and use it for lookup. [possible bug]

Severity Level: Major ⚠️
- ❌ Management API /v0/management/api-call can hang on DNS.
- ⚠️ Copilot quota enrichment pauses when resolver is slow.
- ⚠️ Guarded dialer DNS check ignores caller cancellation context.
Steps of Reproduction ✅
1. Start the service so the management API handler `APICall` in
`pkg/llmproxy/api/handlers/management/api_call.go:86-255` is serving `POST
/v0/management/api-call` (documented in the comment block at lines 32-83).

2. From a client, send a `POST /v0/management/api-call` request with a valid JSON body
pointing `url` to an HTTPS endpoint whose hostname resolves slowly or hangs via the system
DNS (e.g., misconfigured or unreachable resolver); this causes `APICall` to call
`sanitizeAPICallURL` in `api_call_url.go:37-59`, which in turn leads to
`validateResolvedHostIPs(parsedURL.Hostname())` at `api_call.go:127`.

3. While the handler is blocked in `validateResolvedHostIPs` (implemented at
`api_call_url.go:62-65`), the function calls `resolveAllowedAPICallHostIPs` at
`api_call_url.go:67-89`, which performs
`net.DefaultResolver.LookupIPAddr(context.Background(), trimmed)` at line 72; now abort
the client request (e.g., `curl` Ctrl-C or closing the HTTP connection) so
`c.Request.Context()` in `APICall` is canceled.

4. Observe that despite the request context being canceled, the goroutine remains blocked
until the DNS lookup completes or times out at OS level because `LookupIPAddr` is invoked
with `context.Background()` (line 72) instead of the request/dial context; the same
uncancelable DNS resolution is also used in the guarded dial path
`guardedAPICallDialContext` in `api_call_guard.go:11-25`, which calls
`resolveAllowedAPICallHostIPs` at line 16, so both direct management calls and guarded
transports can hang under DNS issues.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/api_call_url.go
**Line:** 72:72
**Comment:**
	*Possible Bug: DNS resolution is performed with `context.Background()`, which ignores request cancellation and client timeout propagation. A slow or stuck resolver can block the handler path even after the request is canceled, causing request hangs and potential resource exhaustion under load. Thread the request/dial context into this resolver call and use it for lookup.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

if errLookup != nil {
return fmt.Errorf("target host resolution failed")
return nil, fmt.Errorf("target host resolution failed")
}
allowed := make([]net.IPAddr, 0, len(resolved))
for _, ip := range resolved {
if ip == nil {
if ip.IP == nil {
continue
}
if ip.IsLoopback() || ip.IsPrivate() || ip.IsUnspecified() || ip.IsMulticast() || ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
return fmt.Errorf("target host is not allowed")
if ip.IP.IsLoopback() || ip.IP.IsPrivate() || ip.IP.IsUnspecified() || ip.IP.IsMulticast() || ip.IP.IsLinkLocalUnicast() || ip.IP.IsLinkLocalMulticast() {
return nil, fmt.Errorf("target host is not allowed")
}
allowed = append(allowed, ip)
}
return nil
if len(allowed) == 0 {
return nil, fmt.Errorf("target host resolution failed")
}
return allowed, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dial guard context not propagated to DNS resolution

Low Severity

resolveAllowedAPICallHostIPs hard-codes context.Background() for DNS resolution. guardedAPICallDialContext receives a ctx (carrying the request deadline/cancellation) but never forwards it, so DNS lookups cannot be canceled when the HTTP client timeout fires or the request is aborted. This could cause the dial to block on slow DNS even after the request context is done.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 5d88d7d. Configure here.

Comment on lines +76 to +89
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: This helper preserves resolver order and returns all allowed IPs as-is, but the dial guard consumes only the first returned address; on dual-stack hosts this can select an unreachable family (commonly IPv6-first), causing intermittent connection failures even when another resolved address is healthy. Return a preferred IP for the requested network or provide ordered/fallback candidates that the dial path can iterate. [logic error]

Severity Level: Major ⚠️
- ❌ /v0/management/api-call may fail for dual-stack hosts.
- ⚠️ Upstream services reachable via IPv4-only become intermittently unreachable.
- ⚠️ Admin tooling relying on api-call experiences avoidable connection errors.
Steps of Reproduction ✅
1. Configure DNS so that a target hostname (e.g. `api.example.com`) used with the
management endpoint `POST /v0/management/api-call` (documented and implemented in
`pkg/llmproxy/api/handlers/management/api_call.go:32-85`) resolves to multiple IPs, such
as an unreachable IPv6 address followed by a reachable IPv4 address (dual-stack,
IPv6-first).

2. Send a management request to `/v0/management/api-call` with
`"url":"https://api.example.com/path"` so that `APICall` builds an outbound HTTP request
and HTTP client via `httpClient := h.apiCallHTTPClient(auth)` at
`pkg/llmproxy/api/handlers/management/api_call.go:25-26`.

3. `apiCallHTTPClient` in `pkg/llmproxy/api/handlers/management/http_transport.go:53-63`
constructs an `http.Client` whose `Transport` is an `apiCallGuardedRoundTripper` wrapping
a transport whose `DialContext` has been overridden to `guardedAPICallDialContext` at
`http_transport.go:47-50`.

4. When `httpClient.Do(req)` runs, the transport calls `guardedAPICallDialContext` in
`pkg/llmproxy/api/handlers/management/api_call_guard.go:11-25`, which resolves the
hostname via `resolveAllowedAPICallHostIPs(host)` (in `api_call_url.go:67-90`) and then
always dials `resolved[0].IP` at `api_call_guard.go:23-24`, ignoring any additional
resolved addresses.

5. On a dual-stack host where the first resolved IP (e.g. IPv6) is unreachable but a later
address (e.g. IPv4) is reachable, the dial attempt using only `resolved[0].IP` fails,
causing `httpClient.Do(req)` to return an error and `APICall` to respond with `HTTP 502`
`"request failed"` at `api_call.go:25-30`, even though a healthy alternative IP address
was available in the DNS response.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/api_call_url.go
**Line:** 76:89
**Comment:**
	*Logic Error: This helper preserves resolver order and returns all allowed IPs as-is, but the dial guard consumes only the first returned address; on dual-stack hosts this can select an unreachable family (commonly IPv6-first), causing intermittent connection failures even when another resolved address is healthy. Return a preferred IP for the requested network or provide ordered/fallback candidates that the dial path can iterate.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

}

func isAllowedHostOverride(parsedURL *url.URL, override string) bool {
Expand Down
14 changes: 14 additions & 0 deletions pkg/llmproxy/api/handlers/management/http_transport.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,9 +46,23 @@ func (h *Handler) apiCallTransport(auth *coreauth.Auth) http.RoundTripper {
}
clone := transport.Clone()
clone.Proxy = nil
clone.DialContext = guardedAPICallDialContext
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Routing all direct connections through guardedAPICallDialContext introduces a single-address dial path (it resolves and dials only one IP), which can fail on multi-record hosts when the first returned address is unreachable (common with dual-stack IPv6/IPv4 setups). This causes intermittent request failures even when other resolved addresses are healthy; iterate through allowed resolved IPs or preserve fallback dialing behavior. [logic error]

Severity Level: Major ⚠️
- ❌ Management APICall endpoint may fail on dual-stack hosts.
- ⚠️ Copilot quota enrichment requests can intermittently fail outbound.
- ⚠️ Loss of Go net/http multi-address fallback behavior.
Steps of Reproduction ✅
1. Run cliproxyapi++ so the management endpoint `POST /v0/management/api-call` is served
by `Handler.APICall` in `pkg/llmproxy/api/handlers/management/api_call.go:86-130`.

2. From a client, issue a `POST /v0/management/api-call` request (as in
`api_tools_cbor_test.go:20-39`) with a JSON body whose `"url"` points to an HTTPS API host
that resolves to multiple IPs (A/AAAA) where the first returned IP is unreachable from
this environment (e.g. IPv6-only address on an IPv4-only network).

3. The handler builds an `http.Client` via `h.apiCallHTTPClient(auth)` at
`api_call.go:204`, which uses `apiCallTransport` in `http_transport.go:17-50`; for the
direct-connect path (no proxy), `apiCallTransport` clones `http.DefaultTransport`,
disables `Proxy`, and sets `clone.DialContext = guardedAPICallDialContext` at
`http_transport.go:47-49`.

4. `guardedAPICallDialContext` in `api_call_guard.go:11-25` resolves allowed IPs with
`resolveAllowedAPICallHostIPs` and then always dials only `resolved[0]`
(`net.JoinHostPort(resolved[0].IP.String(), port)`), so if that first IP is unreachable
the dial fails immediately with no attempt to try `resolved[1:]`; `httpClient.Do(req)` at
`api_call.go:204-209` returns an error and the endpoint responds `502` (`"request
failed"`), even though other resolved IPs could have succeeded.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/api/handlers/management/http_transport.go
**Line:** 49:49
**Comment:**
	*Logic Error: Routing all direct connections through `guardedAPICallDialContext` introduces a single-address dial path (it resolves and dials only one IP), which can fail on multi-record hosts when the first returned address is unreachable (common with dual-stack IPv6/IPv4 setups). This causes intermittent request failures even when other resolved addresses are healthy; iterate through allowed resolved IPs or preserve fallback dialing behavior.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

return clone
}

func (h *Handler) apiCallHTTPClient(auth *coreauth.Auth) *http.Client {
return &http.Client{
Timeout: defaultAPICallTimeout,
Transport: apiCallGuardedRoundTripper{base: h.apiCallTransport(auth)},
CheckRedirect: func(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
return validateAPICallRequestURL(req.URL)
},
}
}

func buildProxyTransportWithError(proxyStr string) (*http.Transport, error) {
proxyStr = strings.TrimSpace(proxyStr)
if proxyStr == "" {
Expand Down
61 changes: 30 additions & 31 deletions pkg/llmproxy/executor/cursor_executor.go
Original file line number Diff line number Diff line change
Expand Up @@ -333,11 +333,13 @@ func (e *CursorExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r

id := "chatcmpl-" + uuid.New().String()[:28]
created := time.Now().Unix()
openaiResp := fmt.Sprintf(`{"id":"%s","object":"chat.completion","created":%d,"model":"%s","choices":[{"index":0,"message":{"role":"assistant","content":%s},"finish_reason":"stop"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}`,
id, created, parsed.Model, jsonString(fullText.String()))
openaiResp, errMarshal := cursorCompletionJSON(id, created, parsed.Model, fullText.String())
if errMarshal != nil {
return resp, fmt.Errorf("cursor: failed to encode response: %w", errMarshal)
Comment on lines +336 to +338
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Extract the response-encoding and post-processing block into a small helper so this modified function stays within the 40-line function-length rule. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This added block is inside the already-large Execute function, and the suggested custom rule is about keeping modified functions under 40 lines. The new response-encoding/post-processing logic contributes to that length violation, so the suggestion matches a real issue.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/executor/cursor_executor.go
**Line:** 336:338
**Comment:**
	*Custom Rule: Extract the response-encoding and post-processing block into a small helper so this modified function stays within the 40-line function-length rule.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

}

// Translate response back to source format if needed
result := []byte(openaiResp)
result := openaiResp
if from.String() != "" && from.String() != "openai" {
var param any
result = sdktranslator.TranslateNonStream(ctx, to, from, req.Model, bytes.Clone(opts.OriginalRequest), payload, result, &param)
Expand Down Expand Up @@ -536,21 +538,21 @@ func (e *CursorExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A

// Wrap sendChunk/sendDone to use emitToOut
sendChunkSwitchable := func(delta string, finishReason string) {
fr := "null"
if finishReason != "" {
fr = finishReason
openaiJSON, errMarshal := cursorChunkJSON(chatId, created, parsed.Model, json.RawMessage(delta), finishReason)
if errMarshal != nil {
log.Warnf("cursor: failed to encode stream chunk: %v", errMarshal)
return
}
openaiJSON := fmt.Sprintf(`{"id":"%s","object":"chat.completion.chunk","created":%d,"model":"%s","choices":[{"index":0,"delta":%s,"finish_reason":%s}]}`,
chatId, created, parsed.Model, delta, fr)
sseLine := []byte("data: " + openaiJSON + "\n")
sseLine := append([]byte("data: "), openaiJSON...)
sseLine = append(sseLine, '\n')
Comment on lines +541 to +547
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Move stream chunk encoding/emission logic into a dedicated helper to reduce the size of this modified function body below the 40-line limit. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This is newly added logic inside ExecuteStream, which is already a very large function. The suggestion is aimed at the 40-line function-length rule, and this block is part of the extra code that makes the modified function exceed that limit.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/llmproxy/executor/cursor_executor.go
**Line:** 541:547
**Comment:**
	*Custom Rule: Move stream chunk encoding/emission logic into a dedicated helper to reduce the size of this modified function body below the 40-line limit.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎


if needsTranslate {
translated := sdktranslator.TranslateStream(ctx, to, from, req.Model, originalPayload, payload, sseLine, &streamParam)
for _, t := range translated {
emitToOut(cliproxyexecutor.StreamChunk{Payload: bytes.Clone(t)})
}
} else {
emitToOut(cliproxyexecutor.StreamChunk{Payload: []byte(openaiJSON)})
emitToOut(cliproxyexecutor.StreamChunk{Payload: openaiJSON})
}
}

Expand Down Expand Up @@ -595,25 +597,24 @@ func (e *CursorExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
thinkingActive = true
sendChunkSwitchable(`{"role":"assistant","content":"<think>"}`, "")
}
sendChunkSwitchable(fmt.Sprintf(`{"content":%s}`, jsonString(text)), "")
sendChunkSwitchable(cursorContentDeltaJSON(text), "")
} else {
if thinkingActive {
thinkingActive = false
sendChunkSwitchable(`{"content":"</think>"}`, "")
}
sendChunkSwitchable(fmt.Sprintf(`{"content":%s}`, jsonString(text)), "")
sendChunkSwitchable(cursorContentDeltaJSON(text), "")
}
},
func(exec pendingMcpExec) {
if thinkingActive {
thinkingActive = false
sendChunkSwitchable(`{"content":"</think>"}`, "")
}
toolCallJSON := fmt.Sprintf(`{"tool_calls":[{"index":%d,"id":"%s","type":"function","function":{"name":"%s","arguments":%s}}]}`,
toolCallIndex, exec.ToolCallId, exec.ToolName, jsonString(exec.Args))
toolCallJSON := cursorToolCallDeltaJSON(toolCallIndex, exec.ToolCallId, exec.ToolName, exec.Args)
toolCallIndex++
sendChunkSwitchable(toolCallJSON, "")
sendChunkSwitchable(`{}`, `"tool_calls"`)
sendChunkSwitchable(`{}`, "tool_calls")
sendDoneSwitchable()

// Close current output to end the current HTTP SSE response
Expand Down Expand Up @@ -701,23 +702,22 @@ func (e *CursorExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
}
// Include token usage in the final stop chunk
inputTok, outputTok := usage.get()
stopDelta := fmt.Sprintf(`{},"usage":{"prompt_tokens":%d,"completion_tokens":%d,"total_tokens":%d}`,
inputTok, outputTok, inputTok+outputTok)
// Build the stop chunk with usage embedded in the choices array level
fr := `"stop"`
openaiJSON := fmt.Sprintf(`{"id":"%s","object":"chat.completion.chunk","created":%d,"model":"%s","choices":[{"index":0,"delta":{},"finish_reason":%s}],"usage":{"prompt_tokens":%d,"completion_tokens":%d,"total_tokens":%d}}`,
chatId, created, parsed.Model, fr, inputTok, outputTok, inputTok+outputTok)
sseLine := []byte("data: " + openaiJSON + "\n")
openaiJSON, errMarshal := cursorUsageChunkJSON(chatId, created, parsed.Model, inputTok, outputTok)
if errMarshal != nil {
log.Warnf("cursor: failed to encode usage chunk: %v", errMarshal)
openaiJSON = []byte(`{"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}`)
}
sseLine := append([]byte("data: "), openaiJSON...)
sseLine = append(sseLine, '\n')
if needsTranslate {
translated := sdktranslator.TranslateStream(ctx, to, from, req.Model, originalPayload, payload, sseLine, &streamParam)
for _, t := range translated {
emitToOut(cliproxyexecutor.StreamChunk{Payload: bytes.Clone(t)})
}
} else {
emitToOut(cliproxyexecutor.StreamChunk{Payload: []byte(openaiJSON)})
emitToOut(cliproxyexecutor.StreamChunk{Payload: openaiJSON})
}
sendDoneSwitchable()
_ = stopDelta // unused

// Close whatever output channel is still active
outMu.Lock()
Expand Down Expand Up @@ -1436,16 +1436,15 @@ func deriveSessionKey(clientKey string, model string, messages []gjson.Result) s
}

func sseChunk(id string, created int64, model string, delta string, finishReason string) cliproxyexecutor.StreamChunk {
fr := "null"
if finishReason != "" {
fr = finishReason
}
// Note: the framework's WriteChunk adds "data: " prefix and "\n\n" suffix,
// so we only output the raw JSON here.
data := fmt.Sprintf(`{"id":"%s","object":"chat.completion.chunk","created":%d,"model":"%s","choices":[{"index":0,"delta":%s,"finish_reason":%s}]}`,
id, created, model, delta, fr)
data, err := cursorChunkJSON(id, created, model, json.RawMessage(delta), finishReason)
if err != nil {
log.Warnf("cursor: failed to encode sse chunk: %v", err)
data = []byte(`{"choices":[{"index":0,"delta":{},"finish_reason":null}]}`)
}
return cliproxyexecutor.StreamChunk{
Payload: []byte(data),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified function sseChunk is never called anywhere

Low Severity

The sseChunk function was refactored in this PR to use cursorChunkJSON, but a codebase-wide grep confirms it has zero callers — it is dead code. The new cursorChunkJSON call, the error-handling fallback, and the log.Warnf path inside it are all unreachable.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 5d88d7d. Configure here.

Payload: data,
}
}

Expand Down
79 changes: 79 additions & 0 deletions pkg/llmproxy/executor/cursor_executor_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
package executor

import (
"encoding/json"
"testing"
)

func TestCursorCompletionJSONEscapesModelAndContent(t *testing.T) {
t.Parallel()

payload, err := cursorCompletionJSON("chatcmpl-test", 1700000000, `x","pwned":true,"y":"`, `hi "there"`)
if err != nil {
t.Fatalf("cursorCompletionJSON: %v", err)
}

var got map[string]any
if err := json.Unmarshal(payload, &got); err != nil {
t.Fatalf("unmarshal payload: %v; payload=%s", err, payload)
}
if got["model"] != `x","pwned":true,"y":"` {
t.Fatalf("model = %q", got["model"])
}
if _, ok := got["pwned"]; ok {
t.Fatalf("payload allowed model to inject top-level field: %s", payload)
}
}

func TestCursorChunkJSONEscapesModelAndFinishReason(t *testing.T) {
t.Parallel()

payload, err := cursorChunkJSON(
"chatcmpl-test",
1700000000,
`x","pwned":true,"y":"`,
json.RawMessage(`{"content":"ok"}`),
`stop","pwned":true,"x":"`,
)
if err != nil {
t.Fatalf("cursorChunkJSON: %v", err)
}

var got struct {
Model string `json:"model"`
Pwned bool `json:"pwned"`
Choices []struct {
FinishReason string `json:"finish_reason"`
} `json:"choices"`
}
if err := json.Unmarshal(payload, &got); err != nil {
t.Fatalf("unmarshal payload: %v; payload=%s", err, payload)
}
if got.Model != `x","pwned":true,"y":"` {
t.Fatalf("model = %q", got.Model)
}
if got.Pwned {
t.Fatalf("payload allowed model to inject top-level field: %s", payload)
}
if got.Choices[0].FinishReason != `stop","pwned":true,"x":"` {
t.Fatalf("finish_reason = %q", got.Choices[0].FinishReason)
}
}

func TestCursorToolCallDeltaJSONEscapesToolIdentifiers(t *testing.T) {
t.Parallel()

payload := cursorToolCallDeltaJSON(
0,
`call_1","pwned":true,"x":"`,
`tool","pwned":true,"x":"`,
`{"ok":true}`,
)
var got map[string]any
if err := json.Unmarshal([]byte(payload), &got); err != nil {
t.Fatalf("unmarshal payload: %v; payload=%s", err, payload)
}
if _, ok := got["pwned"]; ok {
t.Fatalf("payload allowed tool metadata to inject top-level field: %s", payload)
}
}
Loading