Pull requests: antirez/ds4
Server: fix agent-loop cache misses, add cancellation, observability, and robustness fixes
#489
opened Jul 2, 2026 by
elkaix
cuda: enable streaming auto cache (implement recommended_working_set_size)
#488
opened Jul 2, 2026 by
riccardo-galbani
cuda: fall back to pinned host memory when the model arena runs out of VRAM
#487
opened Jul 2, 2026 by
riccardo-galbani
DSpark B2 rejection sampling + adaptive block sizing
#482
opened Jun 30, 2026 by
machiabeli
feat: add headless browser support with curl fallback for web tools
#479
opened Jun 29, 2026 by
J3rr1ck
CUDA: make DeepSeek-V4-Pro correct on the indexed-attention path (top_k 512→1024) + enable decode LUT gate for in_dim>4096
#478
opened Jun 29, 2026 by
slackarea
CUDA: scale q8->f16 cache reserve on >=112 GiB cards (fixes session OOM on large models)
#472
opened Jun 28, 2026 by
slackarea
Fix slow decodes "poisoning" sleep times when using power throttling
#464
opened Jun 27, 2026 by
omnomburp
CUDA: batch gate/up/down uploads for selected expert cache misses
#460
opened Jun 26, 2026 by
fmolara
Add served model name option for server discovery
#456
opened Jun 25, 2026 by
RiccardoFiorentini
Metal: keep selected-address SSD prefill opt-in by default
#454
opened Jun 25, 2026 by
andreaborio
Draft
Fix ROCm Q8->F16 cache reserve starving session tensors on large models (q4q2)
#446
opened Jun 23, 2026 by
alantsev
Contributor
AGENTS.md rename (and server performance improvements?)
#443
opened Jun 21, 2026 by
OPS-NeoRetro