fix: preserve embedding api version suffixes #8736

Open
he-yufeng wants to merge 1 commit into AstrBotDevs:master from he-yufeng:fix/openai-embedding-version-base

Conversation

@he-yufeng he-yufeng commented Jun 12, 2026

Contributor

Summary

  • preserve any /vN suffix when normalizing OpenAI-compatible embedding API bases
  • keep the existing fallback that appends /v1 when a base URL has no version suffix
  • add regression coverage for /v3, /v4, unversioned bases, and /embeddings suffix trimming
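
The three points above could be realized by a helper along the following lines. This is a minimal sketch, not the code from this PR: the name `normalize_api_base_sketch`, the regular expression, and the order of the trimming steps are assumptions made for illustration, and empty input is not special-cased here.

```python
import re

# Assumed pattern: a trailing version segment such as /v1, /v3, or /v12.
_VERSION_SUFFIX = re.compile(r"/v\d+$")


def normalize_api_base_sketch(api_base: str) -> str:
    """Hypothetical stand-in for the PR's helper; not the actual implementation."""
    # Trim whitespace, one trailing slash, and a trailing /embeddings segment.
    base = api_base.strip().removesuffix("/").removesuffix("/embeddings")
    # Keep any existing /vN suffix; otherwise fall back to /v1.
    if _VERSION_SUFFIX.search(base):
        return base
    return f"{base}/v1"
```

With this sketch, `https://example.test/v3` is returned unchanged, while `https://example.test/openai` gains a `/v1` suffix. `str.removesuffix` requires Python 3.9 or newer.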

Why

Fixes #8732. Providers such as Volcengine Ark can use embedding endpoints ending in /v3. The old check only recognized /v1 and /v4, so https://.../v3 became https://.../v3/v1 and returned 404.
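
The failure mode can be reproduced with a check that only knows about `/v1` and `/v4`. The function below is a hypothetical reconstruction based on the description above, not the removed code itself, and the host is a placeholder.

```python
def old_style_normalize(api_base: str) -> str:
    """Hypothetical reconstruction of the described old check, for illustration only."""
    base = api_base.strip().removesuffix("/").removesuffix("/embeddings")
    # Only /v1 and /v4 count as versioned; every other base gets /v1 appended.
    if not base.endswith(("/v1", "/v4")):
        base = f"{base}/v1"
    return base


# Placeholder host, not a real provider endpoint.
print(old_style_normalize("https://example.test/api/v3"))
# prints https://example.test/api/v3/v1
```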

Verified

  • python -m py_compile astrbot\core\provider\sources\openai_embedding_source.py tests\test_openai_embedding_source.py
  • .\.venv\Scripts\python.exe -m ruff check astrbot\core\provider\sources\openai_embedding_source.py tests\test_openai_embedding_source.py
  • git diff --check
  • direct helper behavior check covering /v3, /v4, unversioned base URLs, and /v1/embeddings

I also tried .\.venv\Scripts\python.exe -m pytest tests\test_openai_embedding_source.py -q in a minimal local venv. Collection reached the shared tests/conftest.py dependency chain and stopped on missing full AstrBot runtime dependencies; I did not install the whole application dependency set just for this small provider normalization check.

Summary by Sourcery

Preserve and centralize normalization of OpenAI-compatible embedding API base URLs while ensuring a sensible default version is applied.

Bug Fixes:

  • Fix incorrect normalization that appended '/v1' to embedding API bases already using other version suffixes like '/v3', which could cause 404 responses.

Enhancements:

  • Extract API base URL normalization into a reusable helper that trims trailing slashes and '/embeddings' while respecting existing version suffixes.

Tests:

  • Add regression tests covering versioned embedding API bases (including '/v3' and '/v4'), unversioned bases, and '/embeddings' suffix trimming behavior.

@dosubot dosubot Bot added the labels size:S (This PR changes 10-29 lines, ignoring generated files.) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) Jun 12, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request refactors the API base normalization logic for the OpenAI embedding source into a dedicated helper function _normalize_api_base and adds corresponding unit tests. The reviewer identified a potential AttributeError if embedding_api_base is explicitly configured as None or an empty string, and provided a code suggestion to safely fall back to the default URL using the or operator.

Comment on lines +36 to 38
api_base = _normalize_api_base(
    provider_config.get("embedding_api_base", "https://api.openai.com/v1")
    .strip()
    .removesuffix("/")
    .removesuffix("/embeddings")
)

medium

If embedding_api_base is explicitly configured as None or an empty string in the provider configuration, provider_config.get("embedding_api_base", "https://api.openai.com/v1") will return None or "". This will cause _normalize_api_base to raise an AttributeError (since None has no strip method) or fail to apply the default base URL.

Using or instead of the get method's default argument ensures that we fall back to the default OpenAI API base URL if the configured value is falsy (such as None or "").

Suggested change
- api_base = _normalize_api_base(
-     provider_config.get("embedding_api_base", "https://api.openai.com/v1")
-     .strip()
-     .removesuffix("/")
-     .removesuffix("/embeddings")
- )
+ api_base = _normalize_api_base(
+     provider_config.get("embedding_api_base") or "https://api.openai.com/v1"
+ )
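
The difference between the two fallbacks can be checked in isolation. The sketch below uses a hypothetical `resolve_api_base` function and a `DEFAULT_API_BASE` constant that are not part of the PR; only the `provider_config.get(...)` expressions come from the suggestion.

```python
DEFAULT_API_BASE = "https://api.openai.com/v1"


def resolve_api_base(provider_config: dict) -> str:
    """Hypothetical helper: fall back to the default when the value is missing or falsy."""
    return provider_config.get("embedding_api_base") or DEFAULT_API_BASE


# dict.get only uses its default when the key is absent, so an explicit None leaks through.
explicit_none = {"embedding_api_base": None}
assert explicit_none.get("embedding_api_base", DEFAULT_API_BASE) is None

# `or` also covers an explicit None and an empty string.
assert resolve_api_base(explicit_none) == DEFAULT_API_BASE
assert resolve_api_base({"embedding_api_base": ""}) == DEFAULT_API_BASE
assert resolve_api_base({}) == DEFAULT_API_BASE
assert resolve_api_base({"embedding_api_base": "https://example.test/v3"}) == "https://example.test/v3"
```

A whitespace-only string such as `"  "` is truthy, so `or` alone does not replace it with the default; that case would still need handling after stripping.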

@sourcery-ai sourcery-ai Bot left a comment

Contributor

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tests/test_openai_embedding_source.py" line_range="12-16" />
<code_context>
+    assert _normalize_api_base("https://example.test/v4") == "https://example.test/v4"
+
+
+def test_openai_embedding_api_base_adds_default_version():
+    assert _normalize_api_base("https://example.test/openai") == (
+        "https://example.test/openai/v1"
+    )
+    assert _normalize_api_base("https://example.test/v1/embeddings") == (
+        "https://example.test/v1"
+    )
</code_context>
<issue_to_address>
**suggestion:** Cover behavior for empty or whitespace-only bases and non-version `/embeddings` bases

The default-version behavior is only partially covered here. Please also add cases for:

- `"https://example.test/embeddings"` → `"https://example.test/v1"`, as described in the PR but not asserted.
- Empty / whitespace-only input (e.g., `""`, `"  "`), which currently returns an empty string.

These will lock in the `/embeddings` trimming and fallback behavior and help prevent regressions.
</issue_to_address>
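
The requested cases might look roughly like the following. This is a sketch under stated assumptions: `normalize_api_base_stub` is a hypothetical stand-in so the checks run on their own, its empty-input branch is assumed from the note above that such input currently returns an empty string, and the test names are illustrative. In the repository the tests would call `_normalize_api_base` instead.

```python
import re


def normalize_api_base_stub(api_base: str) -> str:
    """Hypothetical stand-in for _normalize_api_base so the checks below run on their own."""
    base = api_base.strip().removesuffix("/").removesuffix("/embeddings")
    if not base:
        # Assumed to mirror the review note that empty input yields an empty string.
        return base
    return base if re.search(r"/v\d+$", base) else f"{base}/v1"


def test_embeddings_suffix_without_version_gets_default():
    assert normalize_api_base_stub("https://example.test/embeddings") == (
        "https://example.test/v1"
    )


def test_empty_or_whitespace_base_returns_empty_string():
    for raw in ("", "  "):
        assert normalize_api_base_stub(raw) == ""
```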


Development

Successfully merging this pull request may close these issues.

[Bug] fix(openai-embedding): version detection misses /v3, causing 404s on platforms such as Volcengine Ark