feat(zhipu): include 529 overload errors in exponential backoff retry #261
Merged
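The Zhipu vendor previously applied in-place exponential backoff retries only to
429 (rate limit) responses; 529 (Overloaded, i.e. concurrency overload) went
straight to the upper-layer failover. This change extends the internal retry to
cover 529, which now shares the same backoff policy as 429 (max=5, 1s→2s→4s→8s,
Full Jitter, server retry-after honored first), reducing the request failure
rate and unnecessary failovers.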
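- Add _BACKOFF_RETRY_STATUS={429,529} as the single source of truth for retryable status codes
- Non-streaming and streaming checks now use the set; logs carry the actual status code
- Fix the streaming delay calculation ignoring retry-after on 529, and deduplicate the delay logic
- Add symmetric regression tests for 529 non-streaming/streaming retries and retry-after
- Update the vendor-level retry notes in design-patterns.md §3.12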
🤖 Generated with [Claude Code](https://github.com/claude), [CodeX](https://openai.com), [Gemini](https://github.com/apps/gemini-code-assist)
Co-Authored-By: Aurelius Huang<threefish.ai@gmail.com>
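The shared delay policy named in the commit message (Full Jitter over 1s→2s→4s→8s, with the server retry-after taking priority) can be sketched roughly as follows. This is an illustrative sketch, not the PR's code: compute_retry_delay, BASE_DELAY_S, and MAX_DELAY_S are hypothetical names, and it assumes retry-after arrives as a number of seconds and that a server-provided value is used uncapped.

```python
from __future__ import annotations

import math
import random

# Hypothetical constants, assumed from the "1s→2s→4s→8s" schedule in the PR text.
BASE_DELAY_S = 1.0  # ceiling for the first wait
MAX_DELAY_S = 8.0   # ceiling for the last wait (assumed to act as a cap)


def compute_retry_delay(
    attempt: int,
    headers: dict[str, str],
    rng: random.Random | None = None,
) -> float:
    """Return the seconds to wait before retry number `attempt` (0-based)."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    raw = normalized.get("retry-after")
    if raw is not None:
        try:
            seconds = float(raw)
        except ValueError:
            # The HTTP-date form of Retry-After is not handled in this sketch.
            seconds = math.nan
        if math.isfinite(seconds) and seconds >= 0:
            # Server guidance wins over the locally computed backoff.
            return seconds
    # Full Jitter: draw uniformly from [0, ceiling], where the ceiling doubles
    # on each attempt until it reaches MAX_DELAY_S.
    ceiling = min(MAX_DELAY_S, BASE_DELAY_S * 2 ** attempt)
    return (rng or random).uniform(0.0, ceiling)
```

Because the function does not look at the status code, a 529 with a retry-after header is treated exactly like a 429, which is the symmetry the fix is after.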
Background
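When CC uses the Zhipu (GLM) vendor, it encounters 529 (Overloaded, concurrency overload) in addition to 429 (Rate Limit). The Zhipu vendor currently recovers only from 429 via in-place exponential backoff retries; 529 is not retried internally and goes straight to the upper-layer failover, causing unnecessary vendor switches and failed requests. This PR brings 529 under exactly the same backoff retry handling as 429.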
Main changes
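- _BACKOFF_RETRY_STATUS = frozenset({429, 529}) is now the single source of truth for backoff-retryable status codes; the non-streaming send_message and the streaming send_message_stream both check against this set, so 429 and 529 share one backoff policy (max=5, 1s→2s→4s→8s, Full Jitter, server retry-after honored first).
- The streaming delay calculation relied on parse_rate_limit_headers (which parses only 429/403), so 529 ignored the server retry-after; it now uses _compute_retry_delay_from_headers (which always parses with 429 semantics), and the redundant _compute_retry_delay_from_response is removed, deduplicating the delay logic. Logs now carry the actual status code.
- tests/test_zhipu.py gains 5 symmetric 529 cases (non-streaming/streaming retry success and exhaustion, plus a streaming retry-after regression); docs/arch/design-patterns.md §3.12 (vendor-level retry notes) is updated to match.
Impact and verification
The change is confined to Zhipu and does not touch shared modules such as rate_limit.py, base.py, or the executor. Once retries are exhausted, the status code is returned unchanged and still follows the existing failover/CircuitBreaker path, fully symmetric with 429 and with no impact on existing behavior. Verification: the full regression run uv run pytest -q reports 1601 passed, and ruff lint/format all pass.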
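The set-based check and the pass-through on exhaustion described above might look roughly like the loop below. It is a minimal sketch under stated assumptions, not the vendor's actual send_message: Response, send_with_backoff, and MAX_ATTEMPTS are hypothetical names, the sender, sleeper, and delay function are injected so the loop can run without a network, and max=5 is assumed to count total attempts (which gives the four waits 1s→2s→4s→8s).

```python
from __future__ import annotations

import logging
from dataclasses import dataclass, field
from typing import Callable

logger = logging.getLogger(__name__)

# Single source of truth for status codes retried in place (name from the PR).
_BACKOFF_RETRY_STATUS = frozenset({429, 529})
MAX_ATTEMPTS = 5  # assumption: "max=5" counts total attempts


@dataclass
class Response:
    """Hypothetical stand-in for the vendor's HTTP response object."""
    status_code: int
    headers: dict[str, str] = field(default_factory=dict)


def send_with_backoff(
    send: Callable[[], Response],
    sleep: Callable[[float], None],
    delay_for: Callable[[int, Response], float],
) -> Response:
    """Retry 429/529 in place; return any other response immediately."""
    for attempt in range(MAX_ATTEMPTS):
        response = send()
        if response.status_code not in _BACKOFF_RETRY_STATUS:
            return response
        if attempt == MAX_ATTEMPTS - 1:
            break  # retries exhausted: do not sleep again
        delay = delay_for(attempt, response)
        # Log the actual status code rather than a hard-coded 429.
        logger.warning(
            "status %d, retry %d in %.2fs",
            response.status_code, attempt + 1, delay,
        )
        sleep(delay)
    # The last 429/529 is returned unchanged, so the caller's existing
    # failover / circuit-breaker handling still sees the original status.
    return response
```

In real use, sleep would be time.sleep (or an async equivalent) and delay_for would combine retry-after with jittered backoff; injecting them keeps retry-success and retry-exhaustion tests fast and deterministic.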