feat: retrival add raw api metrics calculate #437

Merged
e06084 merged 1 commit into MigoXLab:dev from e06084:dev on Jun 16, 2026

Conversation

e06084 (Collaborator) commented Jun 16, 2026

No description provided.

gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces the computation and tracking of raw API metrics from search traces during retrieval execution. It adds helper functions to build ranked document ID lists preserving raw API ranks and prefixes these metrics accordingly. Feedback on the changes suggests using the loop index instead of the rank in build_raw_api_ranked_doc_ids to guarantee unique placeholders for unmatched or duplicate results, and adopting a more defensive pattern in _compute_raw_api_metrics_from_search_traces to prevent potential TypeError exceptions when the queries key is explicitly set to None.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +142 to +154
    ranked: list[str] = []
    seen_resolved_ids: set[str] = set()
    for index, result in enumerate(top_api_results, start=1):
        rank = result.get("rank") or index
        resolved_id = str(result.get("resolved_corpus_id") or "").strip()
        if not resolved_id:
            ranked.append(f"__raw_api_unmatched__{rank}")
            continue
        if resolved_id in seen_resolved_ids:
            ranked.append(f"__raw_api_duplicate__{rank}")
            continue
        seen_resolved_ids.add(resolved_id)
        ranked.append(resolved_id)

medium

In build_raw_api_ranked_doc_ids, the unique suffixes for unmatched and duplicate placeholders are constructed using rank. However, if the input top_api_results contains duplicate or zero ranks, these placeholders might not be unique, which could cause them to collapse and shift subsequent results incorrectly during metric evaluation. Using the loop index (which is guaranteed to be strictly increasing and unique) instead of rank is much more robust and also allows simplifying the function by removing the unused rank variable entirely.

Suggested change
-    ranked: list[str] = []
-    seen_resolved_ids: set[str] = set()
-    for index, result in enumerate(top_api_results, start=1):
-        rank = result.get("rank") or index
-        resolved_id = str(result.get("resolved_corpus_id") or "").strip()
-        if not resolved_id:
-            ranked.append(f"__raw_api_unmatched__{rank}")
-            continue
-        if resolved_id in seen_resolved_ids:
-            ranked.append(f"__raw_api_duplicate__{rank}")
-            continue
-        seen_resolved_ids.add(resolved_id)
-        ranked.append(resolved_id)
+    ranked: list[str] = []
+    seen_resolved_ids: set[str] = set()
+    for index, result in enumerate(top_api_results, start=1):
+        resolved_id = str(result.get("resolved_corpus_id") or "").strip()
+        if not resolved_id:
+            ranked.append(f"__raw_api_unmatched__{index}")
+            continue
+        if resolved_id in seen_resolved_ids:
+            ranked.append(f"__raw_api_duplicate__{index}")
+            continue
+        seen_resolved_ids.add(resolved_id)
+        ranked.append(resolved_id)
+    return ranked
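
To make the collision concern concrete, here is a minimal sketch that runs both placeholder strategies on the same input. The function name `build_ranked_ids`, its `use_index` flag, and the sample results are hypothetical and exist only for this illustration; the order-preserving deduplication step is likewise an assumption about what a downstream evaluator might do, not something confirmed by this PR.

```python
from typing import Any


def build_ranked_ids(results: list[dict[str, Any]], use_index: bool) -> list[str]:
    """Hypothetical stand-in for build_raw_api_ranked_doc_ids.

    use_index=False mirrors the original loop (suffix taken from `rank`);
    use_index=True mirrors the suggested change (suffix taken from the loop index).
    """
    ranked: list[str] = []
    seen_resolved_ids: set[str] = set()
    for index, result in enumerate(results, start=1):
        suffix = index if use_index else (result.get("rank") or index)
        resolved_id = str(result.get("resolved_corpus_id") or "").strip()
        if not resolved_id:
            ranked.append(f"__raw_api_unmatched__{suffix}")
            continue
        if resolved_id in seen_resolved_ids:
            ranked.append(f"__raw_api_duplicate__{suffix}")
            continue
        seen_resolved_ids.add(resolved_id)
        ranked.append(resolved_id)
    return ranked


# Made-up API results: two unmatched hits that both report rank 1.
sample_results = [
    {"rank": 1, "resolved_corpus_id": None},
    {"rank": 1, "resolved_corpus_id": ""},
    {"rank": 2, "resolved_corpus_id": "doc-a"},
]

by_rank = build_ranked_ids(sample_results, use_index=False)
by_index = build_ranked_ids(sample_results, use_index=True)

# Assumption: a downstream step drops repeated IDs while keeping order.
collapsed = list(dict.fromkeys(by_rank))

print(by_rank)
print(by_index)
# 1-based position of "doc-a" after collapsing vs. with index-based placeholders.
print(collapsed.index("doc-a") + 1, by_index.index("doc-a") + 1)
```

With this input the rank-based variant emits the same unmatched placeholder twice, so any order-preserving deduplication moves `doc-a` from position 3 to position 2. The index-based variant keeps every placeholder distinct and leaves `doc-a` at position 3.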

Comment thread dingo/exec/retrieval.py
    for trace in traces:
        if trace.get("task") != task_name:
            continue
        for query in trace.get("queries", []):

medium

Using trace.get("queries", []) can raise a TypeError: 'NoneType' object is not iterable if the "queries" key is explicitly set to None in the trace dictionary. Using trace.get("queries") or [] is more robust and consistent with the defensive pattern used on line 350 (query.get("raw_api_metrics") or {}).

Suggested change
-        for query in trace.get("queries", []):
+        for query in (trace.get("queries") or []):
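
The difference between the two forms can be checked with a small sketch. The `trace` dictionary below is a made-up example with `"queries"` explicitly set to `None`; it is not taken from the project's trace data.

```python
# Hypothetical trace where the key exists but holds None.
trace = {"task": "demo", "queries": None}

# dict.get only uses its default when the key is missing,
# so the explicit None is returned unchanged.
value_with_default = trace.get("queries", [])

error_message = ""
try:
    for _ in trace.get("queries", []):
        pass
except TypeError as exc:
    # Iterating over None raises TypeError.
    error_message = str(exc)

# The `or []` form replaces None (and any other falsy value) with an empty list.
safe_queries = trace.get("queries") or []
iterations = sum(1 for _ in safe_queries)

print(value_with_default, repr(error_message), safe_queries, iterations)
```

One trade-off of `or []` is that it also replaces other falsy values, such as an empty string or `0`, with an empty list. For a field that is expected to hold a list, that is usually the desired behavior.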

e06084 merged commit 23d9f3b into MigoXLab:dev on Jun 16, 2026
2 checks passed