Releases: SharpAI/SwiftLM
SwiftLM b648
SwiftLM b648-a04b81e
Merge pull request #104 from roydsouza/fix/moe-memory-and-multimodal-tokens-rebased
Fix: Resolve multimodal BOA/EOA tokens dynamically from config.json
Changelog
- Potential fix for pull request finding (5cfc277)
- test(swiftlm): Add tests for multimodal token extraction (621a931)
- Fix #3: Resolve multimodal BOA/EOA tokens from config.json instead of hardcoding (9d495d9)
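The headline change here reads the multimodal BOA/EOA token ids out of the model's config.json instead of baking them into the binary. A minimal sketch of that idea, assuming a JSON schema with integer token-id keys (the key names `boa_token_id`/`eoa_token_id` and the `MultimodalTokens` type are illustrative, not SwiftLM's actual API):

```swift
import Foundation

// Sketch: resolve BOA/EOA token ids from config.json rather than hardcoding.
// Key names and the type below are assumptions for illustration.
struct MultimodalTokens {
    let boa: Int
    let eoa: Int

    static func load(fromConfigJSON data: Data) -> MultimodalTokens? {
        guard let obj = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
              let boa = obj["boa_token_id"] as? Int,
              let eoa = obj["eoa_token_id"] as? Int else { return nil }
        return MultimodalTokens(boa: boa, eoa: eoa)
    }
}
```

Returning `nil` when either key is absent lets the caller fall back or fail loudly, instead of silently using a stale hardcoded id for a model family that numbers its special tokens differently.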
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b644
SwiftLM b644-f1dddb8
Merge pull request #101 from SharpAI/fix/qwen3-jinja-template-issue-97
fix: address post-merge PR 99 feedback and tests
Changelog
- fix: address all 7 Copilot review comments on PR #101 (7870b2f)
- fix(tests): fix MLXArray init in ContextWindowCalculationTests for Linux CI (a9abb2a)
- fix: resolve KVCacheSimple cast warning and ContextWindowCalculationTests build error (42f4946)
- Potential fix for pull request finding (677dd27)
- fix(swiftbuddy): resolve actor isolation violation in ServerManager (482782e)
- fix(swiftbuddy): update SettingsView streaming UI and link CLI builder (ccf0b41)
- test: address Copilot review for Issue 97 by adding strict role mapping regression guards (d280319)
- test: add missing Context Window, Config Persistence, and Server unit tests (a5bf26a)
- chore: remove sandbox test scripts (bbedccb)
- fix(swiftbuddy): fix SettingsView build error and onChange deprecation warning (81c5b95)
- Fix persisted SSD streaming behavior (321fc21)
- Add model loading progress for reloads (dcc0a3a)
- fix: resolve SwiftUI view update crash in SettingsView Color Scheme picker (4ac0c23)
- fix: address all critical + medium Copilot review comments on PR #99 (cb4c6e4)
- feat: restore turboKV/streamExperts controls, fix context window label (4332e50)
- fix(swiftbuddy): resolve buildCLICommand scope error in SettingsView (2cbb836)
- test: coverage gaps — SwiftBuddy embedded server, CLI builder, removed fields guard (ce2bafd)
- test: address all 4 Copilot review comments on PR #99 (4d2b858)
- feat(swiftbuddy): CLI panel, applied toast, seed wiring, remove dead config fields (c360806)
- feat(swiftbuddy): expose server endpoint URL + regression tests for settings/thinking/API (0304495)
- feat(swiftbuddy): persist settings, fix thinking mode, fix context count, add /v1/chat/completions (c80cf91)
- fix(review): address all 4 Copilot review comments on PR #99 (fbd9117)
- fix(inference): resolve Qwen3 TemplateException on multi-turn chat (Issue #97) (9f9e073)
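Several entries above mention "strict role mapping" guards around the Qwen3 template fix. A hedged sketch of what strict role mapping generally looks like, so an unexpected role fails up front rather than surfacing as a TemplateException deep inside template rendering (the enum, error type, and function below are illustrative, not SwiftLM's actual types):

```swift
// Sketch: validate and normalize chat roles before template rendering.
// Names are illustrative assumptions, not SwiftLM's real API.
enum ChatRole: String {
    case system, user, assistant, tool
}

struct UnknownRoleError: Error {
    let role: String
}

func normalizeRoles(_ messages: [(role: String, content: String)]) throws
    -> [(role: ChatRole, content: String)] {
    try messages.map { msg in
        // Reject anything outside the known role set instead of passing
        // a raw string through to the Jinja template.
        guard let role = ChatRole(rawValue: msg.role.lowercased()) else {
            throw UnknownRoleError(role: msg.role)
        }
        return (role: role, content: msg.content)
    }
}
```

Failing at this boundary gives the caller a precise error and a regression point to test against, which is presumably what the "strict role mapping regression guards" commit covers.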
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b618
SwiftLM b618-0cd94eb
Merge pull request #96 from SharpAI/fix/swiftbuddy-model-loading-recovery-main
Harden SwiftBuddy model loading and align local server settings
Changelog
- Restore MLXLM compatibility (77b258e)
- Address Copilot review feedback (913ae3f)
- Bump mlx-swift for quieter Metal compilation (205bbea)
- Align SwiftBuddy settings with local server config (2cbd2bc)
- Harden SwiftBuddy model loading recovery (08ceed8)
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b612
SwiftLM b612-dc1cff2
test: add ChatRequestParsingTests for tool_calls index mapping (#93)
- test: add ChatRequestParsingTests covering tool_calls index mapping (PR #92)
- test: address copilot review - fix stale line refs and malformed JSON schema
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Changelog
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b611
SwiftLM b611-75b4f66
Refactor tool calls mapping to include index (#92)
Changelog
- Refactor tool calls mapping to include index (#92) (75b4f66)
- build: sync submodules to latest main (#90) (0ceaf20)
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b609
SwiftLM b609-d58ef7f
Merge pull request #91 from SharpAI/fix/dflash-compiler-warnings
fix(dflash): suppress compiler warnings — remove unused var, var→let
Changelog
- fix(dflash): suppress compiler warnings — remove unused var, var→let (407e466)
- test: address Copilot review feedback on PromptCacheTests (c3c1ddb)
- test: add PromptCache regression tests (PR #85 coverage) (a0147d2)
- docs(README): remove degenerate DFlash perf row, add honest disclaimer (fea0e11)
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b602
SwiftLM b602-7df2170
fix(server): prompt-cache bleed fixes + Qwen3-A3B perf table (#85)
fix(server): prompt-cache bleed fixes — MambaCache gate + ndim guard + spec-decode ordering
Changelog
- docs(README): add Qwen3-A3B full-RAM perf table on M1 Ultra 64 GB (5a5b82a)
- Re-apply prompt-cache bleed fixes to synced main (d38fe8e)
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b598
SwiftLM b598-29f3816
Merge pull request #78 from 0xClandestine/feat/add-dflash
feat: add DFlash speculative decoding
Changelog
- fix: remove virtual allocation reference from DeepSeek key takeaways (#83) (05d0b6c)
- fix: README table shows physical RAM, not misleading virtual allocation (#81) (0212b14)
- feat: DeepSeek-V4 support via mlx-swift-lm b463 (9533e45)
- fix: prevent Metal GPU Watchdog timeout on low-RAM CI runners (2707be9)
- fix: cap Metal command buffer size during swap-assisted inference to prevent GPU timeouts (91e32af)
- fix: strip language_model. prefix, remove stale expert keys, raise FD limit (b5037f6)
- fix: correct weight key paths for DeepseekV3 and KimiLinear models (d6bcf66)
- fix: resolve CI GPU timeouts on 7GB runners by fixing Memory limit spin-loops (0e79358)
- feat: add DeepSeek V3 and Kimi Linear DFlash support (Option B) (313fa91)
- Revert "fix(ci): skip omni test gracefully when RAM is insufficient" (b224692)
- fix(ci): skip omni test gracefully when RAM is insufficient (9fc993c)
- feat: add DFlashTargetModel conformance for Qwen3, Qwen3MoE, and Llama (069a75f)
- fix: add required log lines to DFlash draft model load path (4c042a6)
- fix: add 'Using speculative decoding' log line for CI test assertions (5581f38)
- fix: remove stray banner echo outside SUITE_OPT guard (b7dcd53)
- fix: suppress interactive menu in sub-process invocations (0dba57a)
- fix: use SUITE_OPT env var to bypass menu in matrix sub-processes (2d537d6)
- fix: disable prompt cache for MambaCache hybrid models (Qwen3Next) (5553bf5)
- chore: move dflash benchmark scripts to profiling dir (fd84f80)
- fix(benchmark): exit early on DFlash tests to avoid model prompt (7e7ccd1)
- test(dflash): fix submodule pin and add E2E tests (f629f63)
- fix: restore DFlashRollbackCache protocol and clean dead extension (60d88e4)
- chore: bump mlx-swift-lm submodule to b447 (7dcdaf4)
- docs: add DFlash parameters to README CLI options list (6f0c670)
- fix(bench): increase server wait timeout to 3600s to allow large model downloads (602f940)
- fix: address Copilot review on PR #78 (2ea4e96)
- fix: resolve DFlash protocol conformance and build blockers (a52bd07)
- refactor(Qwen3Next): move DFlashTargetModel conformance to SwiftLM extension (7d150f9)
- test: reorganize DFlash test suite into tests/DFlash/ (108f0c2)
- feat(bench): add JSON result export to bench_35b.sh; add bench_coder_next.sh (0d96a5e)
- feat: add DFlashKernelBench micro-benchmark target (a2c8102)
- feat(dflash): add MambaSnapshotCache + dflashUseTapeRollback protocol property (464b959)
- refactor(dflash/kernels): branchless mask via metal::select + 2D kernel cache (f2ab918)
- feat: add Qwen3Next SSD streaming + DFlash support (485a929)
- feat: add bench_35b.sh benchmark script (d6fdef4)
- feat: add timings (tok/s, token count, duration) to all API responses (9b91b4d)
- feat: selective safetensors loader — skip expert weight data with SSD streaming (7820436)
- fix(dflash): load hiddenNorm weight + streaming + prefetch + asyncEval (e1ea48f)
- feat: add initial dflash implementation (1040e68)
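DFlash is an implementation of speculative decoding: a cheap draft model proposes several tokens, the target model verifies them, and the longest accepted prefix is kept. The generic draft-then-verify loop behind that idea can be sketched as follows (the closures stand in for real model calls; nothing here is DFlash's actual code):

```swift
// Sketch of a single speculative-decoding step: draft N tokens cheaply,
// verify with the target model, keep the accepted prefix. Illustrative only.
func speculativeStep(
    context: [Int],
    draftCount: Int,
    draft: ([Int]) -> Int,   // cheap draft model: guesses the next token
    verify: ([Int]) -> Int   // target model: the token it would emit
) -> [Int] {
    // 1. Draft `draftCount` tokens with the cheap model.
    var proposed: [Int] = []
    var ctx = context
    for _ in 0..<draftCount {
        let t = draft(ctx)
        proposed.append(t)
        ctx.append(t)
    }
    // 2. Verify against the target model; on the first mismatch, keep the
    //    target's own token and stop, so output always matches the target.
    var accepted: [Int] = []
    ctx = context
    for t in proposed {
        let expected = verify(ctx)
        if expected == t {
            accepted.append(t)
            ctx.append(t)
        } else {
            accepted.append(expected)
            return accepted
        }
    }
    return accepted
}
```

The speedup comes from verifying all drafted tokens in one target-model forward pass rather than one pass per token; the rollback caches in the changelog (DFlashRollbackCache, MambaSnapshotCache) exist to rewind model state when a draft is rejected.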
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b554
SwiftLM b554-b33801a
Merge pull request #77 from SharpAI/fix/issue-72-draft-model-ssd-ram
fix: memory auto-cap strategy for SSD MoE streaming + speculative decoding (Issue #72)
Changelog
- fix: allow custom model selection in benchmark test 10 (8385350)
- fix: address Copilot review feedback on PR #77 (7b0bfd4)
- fix(ci): use bash variable for PID in ssd-draft-memory-guard (58249c2)
- ci: trigger run after YAML fix (c8b236d)
- fix(ci): repair YAML corruption in ci.yml (retention-days merged with comment) (be8353f)
- docs: document --stream-experts + --draft-model auto-cap strategy (Issue #72) (bb29e36)
- ci: add ssd-draft-memory-guard job + vm_stat readings for Issue #72 (3f6bad5)
- test(benchmark): add Test 10 — Issue #72 SSD + draft model RAM regression (7a14a67)
- fix(ssd-stream): auto-cap draft tokens to 1 when --stream-experts + --draft-model (#72) (dfd0935)
- fix(ssd-stream): prevent inference-time swap explosion with --draft-model (#72 follow-up) (5390216)
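The auto-cap strategy described above clamps the number of speculative draft tokens to 1 whenever SSD expert streaming and a draft model are combined, since each in-flight draft token forces additional expert weights to be paged in. A minimal sketch of that policy (the function and parameter names are illustrative, not SwiftLM's actual API):

```swift
// Sketch of the Issue #72 auto-cap: with --stream-experts + --draft-model,
// cap draft tokens at 1 to bound swap traffic. Names are illustrative.
func effectiveDraftTokens(requested: Int,
                          streamExperts: Bool,
                          hasDraftModel: Bool) -> Int {
    guard requested > 0 else { return 0 }
    // Each speculative token in flight pulls in another set of streamed
    // expert weights, so the combination is capped to a single draft token.
    if streamExperts && hasDraftModel {
        return 1
    }
    return requested
}
```

Capping rather than rejecting the flag combination keeps speculative decoding usable on SSD-streamed MoE models, just without multi-token drafts.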
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.
SwiftLM b543
SwiftLM b543-336c8a8
Merge pull request #76 from SharpAI/fix/issue-72-draft-model-ssd-ram
fix(ssd-stream): prevent RAM explosion when --draft-model + --stream-experts combined (#72)
Changelog
- fix(ssd-stream): address Copilot review on PR #76 (9b0a31c)
- test(ssd-stream): add regression suite for Issue #72 SSD budget with draft model (8a04b2b)
- fix(ssd-stream): prevent RAM explosion when --draft-model + --stream-experts are combined (95303a5)
- chore(agents): document /opt/homebrew/bin/gh path in review-github-pr workflow (975db48)
- chore(agents): add review-github-pr workflow skill (1005d3e)
Download
- CLI Server: macOS Apple Silicon (arm64)
- GUI Desktop App: Download the attached SwiftBuddy-macOS.dmg below!
Quick Start
For GUI Users (SwiftBuddy):
- Download the attached DMG and open it.
- Drag SwiftBuddy.app into your Applications folder, or run it directly.
- When launched, click "Model Options" to select or download a local MLX model to chat with.
For CLI Users (SwiftLM):
Please refer to the Getting Started section in the README.
Note:
mlx.metallib is bundled in the tar archive. Keep it in the same directory as the SwiftLM binary — Metal GPU compute will fail if it is missing.