
feat(concurrency): support changing per-model concurrency at runtime (#251) #251

Merged
ThreeFish-AI merged 2 commits into feature/1.x.x from ThreeFish-AI/edit-model-concurrency
May 26, 2026

@ThreeFish-AI ThreeFish-AI commented May 26, 2026

Owner

Summary

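Refactors the core of ModelConcurrencyLimiter, replacing asyncio.Semaphore with a custom _ConcurrencySlot (built on asyncio.Event) so that the per-model concurrency limit can be adjusted at runtime. Also adds a PUT /api/concurrency API endpoint and an editable control in the Dashboard frontend, so concurrency can be changed without a restart.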

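Changes

Backend

  • vendors/concurrency.py: adds the _ConcurrencySlot class, which uses asyncio.Event plus a while loop to implement FIFO fair queuing and provides set_limit() for adjusting the limit dynamically; ModelConcurrencyLimiter gains set_limit(model, new_limit), which updates the config and the slot state together
  • vendors/zhipu.py: ZhipuVendor adds an update_concurrency(model, limit) proxy method
  • server/routes.py: adds the PUT /api/concurrency endpoint, which accepts a {tier, model, limit} request body, validates that limit is in the range 1-20, and iterates over tiers to find the target vendor and apply the update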
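
The validation and tier lookup described for PUT /api/concurrency can be sketched roughly as follows. This is a framework-agnostic illustration, not the code in server/routes.py: `FakeVendor`, `apply_concurrency_update`, the shape of `tiers` (a dict of tier name to vendor), the status codes, and the response bodies are all assumptions. Only the `{tier, model, limit}` body, the 1-20 range, and the `update_concurrency(model, limit)` call come from the description above.

```python
from dataclasses import dataclass, field

# Range stated in the PR description.
MIN_LIMIT, MAX_LIMIT = 1, 20


@dataclass
class FakeVendor:
    """Hypothetical stand-in for ZhipuVendor; records the limits it was given."""

    limits: dict = field(default_factory=dict)

    def update_concurrency(self, model: str, limit: int) -> None:
        self.limits[model] = limit


def apply_concurrency_update(tiers: dict, payload: dict) -> tuple:
    """Validate a {tier, model, limit} payload and apply it.

    Returns (status_code, body). Status codes and bodies are assumptions.
    """
    tier_name = payload.get("tier")
    model = payload.get("model")
    limit = payload.get("limit")

    if not isinstance(tier_name, str) or not isinstance(model, str):
        return 400, {"error": "tier and model must be strings"}

    # bool is a subclass of int in Python, so reject it explicitly.
    is_int = isinstance(limit, int) and not isinstance(limit, bool)
    if not is_int or not (MIN_LIMIT <= limit <= MAX_LIMIT):
        return 400, {"error": f"limit must be an integer in {MIN_LIMIT}-{MAX_LIMIT}"}

    # Walk the tiers to find the target vendor, as the PR describes.
    for name, vendor in tiers.items():
        if name == tier_name:
            vendor.update_concurrency(model, limit)
            return 200, {"tier": tier_name, "model": model, "limit": limit}

    return 404, {"error": f"unknown tier: {tier_name}"}
```

Keeping the check as a plain function makes it testable without starting the server, whichever web framework the real route uses.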

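Frontend

  • server/dashboard.py: in the Model Calling module the limit number is rendered as a clickable .mc-limit-editable element; clicking it expands an inline number input that supports Enter to confirm, Escape to cancel, and confirm on blur, with a green flash animation on success and a red one on failure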

Tests

  • tests/test_zhipu_concurrency.py: adapted to the new API (_get_or_create_slot, slot.available); the full suite of 1520 tests passes with no regressions

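Design decisions

  • When the limit is lowered, slots already held are not forcibly reclaimed; the new limit takes effect on subsequent acquire calls (natural convergence)
  • Runtime changes are not persisted to the config file; they are in-memory only, and the YAML-configured value is restored after a restart
  • acquire() returns self, so the existing call pattern (slot.release()) is unchanged and the ZhipuVendor call sites need no changes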
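
To make the natural-convergence behaviour concrete, here is a minimal sketch of a resizable FIFO slot. It is not the PR's _ConcurrencySlot: the class name `ConcurrencySlotSketch`, the per-waiter `asyncio.Event` queue, and the `_grant` helper are assumptions chosen for illustration. What it mirrors from the description is that `acquire()` returns the slot itself, `set_limit()` changes the cap in place, and lowering the cap never revokes slots that are already held.

```python
import asyncio
from collections import deque


class ConcurrencySlotSketch:
    """Hypothetical resizable slot; a sketch, not the PR's implementation."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._active = 0
        self._waiters = deque()  # one asyncio.Event per queued waiter, FIFO

    @property
    def available(self) -> int:
        # Can be 0 while _active exceeds a freshly lowered limit.
        return max(self._limit - self._active, 0)

    async def acquire(self) -> "ConcurrencySlotSketch":
        # Fast path only when nobody is queued, so late arrivals cannot jump ahead.
        if self._active < self._limit and not self._waiters:
            self._active += 1
            return self
        event = asyncio.Event()
        self._waiters.append(event)
        try:
            await event.wait()
        except asyncio.CancelledError:
            if event.is_set():
                self.release()  # slot was granted just before cancellation
            else:
                self._waiters.remove(event)
            raise
        return self  # _grant() already counted this holder

    def release(self) -> None:
        self._active -= 1
        self._grant()

    def set_limit(self, new_limit: int) -> None:
        # Raising the limit wakes waiters; lowering it revokes nothing.
        self._limit = new_limit
        self._grant()

    def _grant(self) -> None:
        # Hand slots to the oldest waiters while capacity remains.
        while self._waiters and self._active < self._limit:
            self._waiters.popleft().set()
            self._active += 1


async def demo() -> list:
    slot = ConcurrencySlotSketch(limit=2)
    a = await slot.acquire()
    b = await slot.acquire()
    slot.set_limit(1)  # shrink while two holders are active

    order = []

    async def worker(name: str) -> None:
        held = await slot.acquire()
        order.append(name)
        held.release()

    t1 = asyncio.create_task(worker("w1"))
    t2 = asyncio.create_task(worker("w2"))
    await asyncio.sleep(0)  # let both workers enqueue
    a.release()             # one holder remains, which already fills limit=1
    await asyncio.sleep(0)
    assert order == []      # nobody was admitted yet
    b.release()             # now the oldest waiter is admitted
    await asyncio.gather(t1, t2)
    return order
```

Running `asyncio.run(demo())` in this sketch admits the queued workers in arrival order, and only after the number of holders has dropped below the lowered limit.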

Test plan

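  • Start the proxy server (with concurrency configured for the zhipu vendor) and open Dashboard → Overview to inspect the Model Calling module
  • Click a model's limit number, change it to a different value, and confirm the UI updates immediately and stays consistent across polling
  • Issue concurrent requests to verify the new limit takes effect (queuing behaves as expected)
  • Confirm that requests already holding a slot are unaffected when the limit is lowered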
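
For scripted checks alongside the manual steps above, the same update can be sent without the Dashboard. The sketch below only builds the request with the standard library. The helper name `build_concurrency_request`, the base URL, and the example tier and model values are placeholders, and any authentication the server may require is not covered.

```python
import json
import urllib.request


def build_concurrency_request(base_url: str, tier: str, model: str, limit: int):
    """Build (but do not send) a PUT /api/concurrency request."""
    body = json.dumps({"tier": tier, "model": model, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/concurrency",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host, port, tier, and model; substitute the values of a real deployment.
request = build_concurrency_request("http://127.0.0.1:8000", "zhipu", "example-model", 4)
```

Passing `request` to `urllib.request.urlopen` would send it to a running server; the response format is not specified in this PR description.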

🤖 Generated with Claude Code, CodeX, Gemini

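Refactor ModelConcurrencyLimiter, replacing asyncio.Semaphore with a custom _ConcurrencySlot
that supports dynamic limit changes via set_limit(). Add the PUT /api/concurrency endpoint; the limit
number in the Dashboard Model Calling module can be clicked and edited in place (1-20) without restarting the process.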

🤖 Generated with [Claude Code](https://github.com/claude), [CodeX](https://openai.com), [Gemini](https://github.com/apps/gemini-code-assist)
Co-Authored-By: Aurelius Huang<threefish.ai@gmail.com>
@ThreeFish-AI ThreeFish-AI changed the title feat(concurrency): support changing per-model concurrency at runtime feat(concurrency): support changing per-model concurrency at runtime (#251) May 26, 2026
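When Escape cancels the edit, restore() removes the input element, which triggers a browser blur event;
the blur handler then calls submit() via setTimeout after 50ms and sends the cancelled
value to the server. Introduce a _cancelled flag that is set on Escape and checked both at the start of
submit and in the blur callback, so that the cancel is not ignored.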

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ThreeFish-AI ThreeFish-AI merged commit b2c3023 into feature/1.x.x May 26, 2026
0 of 6 checks passed
@ThreeFish-AI ThreeFish-AI deleted the ThreeFish-AI/edit-model-concurrency branch May 26, 2026 12:50
@ThreeFish-AI ThreeFish-AI mentioned this pull request May 27, 2026