chore(engine): bump bundled llama.cpp sidecar to b9781#253

Merged
quiet-node merged 2 commits into main from claude/vigilant-johnson-e83929
Jun 24, 2026

@quiet-node

Overview

Bumps the bundled llama.cpp llama-server sidecar from b9590 to the latest release b9781, and consolidates the engine-packaging documentation into a single source of truth. This is the manual interim bump while #238 (automated bump) is still open.

What changed

  • Engine pin (scripts/ensure-llama-server.ts): LLAMA_CPP_TAG b9590 → b9781, and ASSET_SHA256 set to the new macOS arm64 asset's hash (read from the GitHub release digest).
  • CI cache keys (nightly-release, pr-backend-tests, pr-build-validation, release-please): bumped to …-b9781-50e822733750dbc3. Cache keys are immutable, so embedding the pin in the key forces a fresh cache entry for the new asset instead of restoring the stale one saved under the old key.
  • Docs: docs/models-and-providers.md → "How the engine binary is packaged" is expanded into the canonical explanation of the pin, the fetch script and its five steps, when it runs (no-op via the stamp file unless the pin changes), and the dev-vs-.app file layout. docs/release-process.md is trimmed to release-specific facts and cross-links to it, removing the duplicated conceptual prose.
  • Comment (src-tauri/src/openai.rs): reworded the reasoning-kwargs note so it reads as a historical verification datapoint rather than a claim about the current pin.
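
To illustrate the cache-key point above, here is a minimal sketch of why embedding the pin in the key invalidates the cache. The real keys live in the workflow YAML; the `engineCacheKey` helper, the `llama-server` prefix, and the old-pin fingerprint are hypothetical stand-ins (the actual prefix is elided in this PR as "…").

```typescript
// Hypothetical helper, for illustration only. The workflows define their
// cache keys directly in YAML; this just shows the shape of the key.
interface CachePin {
  tag: string;         // llama.cpp release tag, e.g. "b9781"
  fingerprint: string; // trailing hex segment of the cache key
}

function engineCacheKey(prefix: string, pin: CachePin): string {
  // Any change to the tag or fingerprint yields a different key, so the
  // cache entry stored under the old key can never be restored for the new pin.
  return `${prefix}-${pin.tag}-${pin.fingerprint}`;
}

// "llama-server" is an assumed prefix; the old fingerprint is a placeholder.
const oldKey = engineCacheKey("llama-server", { tag: "b9590", fingerprint: "0000000000000000" });
const newKey = engineCacheKey("llama-server", { tag: "b9781", fingerprint: "50e822733750dbc3" });

console.log(oldKey !== newKey); // the bump always misses the old cache entry
```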

How it works

The pin is a release tag plus the asset SHA-256. engine:ensure fetches that exact asset, verifies the hash, re-derives the dylib link closure, and ad-hoc re-signs. The closure is unchanged for b9781 (10 dylibs, still matches bundle.macOS.frameworks), so no tauri.conf.json change was needed.
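
As a rough sketch of the verify-and-skip logic described above (not the actual contents of scripts/ensure-llama-server.ts): the function names, the `EnginePin` shape, and the stamp format `tag:sha256` are all assumptions made for illustration. Only `createHash` from `node:crypto` is a real API here.

```typescript
import { createHash } from "node:crypto";

// Assumed shape of the pin: a release tag plus the asset's SHA-256 (hex).
interface EnginePin {
  tag: string;
  sha256: string;
}

// Hex-encoded SHA-256 of the downloaded asset bytes.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Refuse to install an asset whose hash does not match the pin.
function verifyAsset(data: Uint8Array, pin: EnginePin): void {
  const actual = sha256Hex(data);
  if (actual !== pin.sha256.toLowerCase()) {
    throw new Error(`hash mismatch for ${pin.tag}: expected ${pin.sha256}, got ${actual}`);
  }
}

// Hypothetical stamp format: the script is a no-op while the stamp on disk
// still matches the current pin.
function stampFor(pin: EnginePin): string {
  return `${pin.tag}:${pin.sha256.toLowerCase()}`;
}

function needsFetch(existingStamp: string | null, pin: EnginePin): boolean {
  return existingStamp !== stampFor(pin);
}
```

With this shape, bumping either the tag or the hash changes the stamp, so the next run re-fetches; an unchanged pin short-circuits before any network access.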

Testing

Verified on Apple Silicon against the actual b9781 binary:

  • engine:ensure fetches, hash-verifies, and installs cleanly; dylib closure matches the frameworks list.
  • All spawn-line flags survive in --help (-m --mmproj --ctx-size --host --port --no-webui --parallel).
  • codesign -vv clean on the binary and all 10 dylibs.
  • Real spawn + /health + a non-stream completion + SSE streaming (with reasoning deltas) against gpt-oss-20b.
  • validate-build passes; the bundle re-signs cleanly.
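
The spawn-line flag check above could be automated along these lines. This is a sketch, not what was run for this PR: `missingFlags` is a hypothetical helper, and the sample help text is invented to show the matching behavior rather than copied from the b9781 binary.

```typescript
// Flags the app passes when spawning llama-server (from the list above).
const REQUIRED_FLAGS: readonly string[] = [
  "-m", "--mmproj", "--ctx-size", "--host", "--port", "--no-webui", "--parallel",
];

// Returns the required flags that do not appear as whole tokens in the help text.
// Splitting on whitespace, commas, and "=" avoids false positives such as
// "-m" matching inside "--mmproj".
function missingFlags(helpText: string, required: readonly string[]): string[] {
  const tokens = new Set(helpText.split(/[\s,=]+/));
  return required.filter((flag) => !tokens.has(flag));
}

// Invented sample output, for illustration only.
const sampleHelp = [
  "-m, --model FNAME      model path",
  "--mmproj FILE          multimodal projector file",
  "-c, --ctx-size N       size of the prompt context",
  "--host HOST            address to listen on",
  "--port PORT            port to listen on",
  "--no-webui             disable the web UI",
  "-np, --parallel N      number of parallel sequences",
].join("\n");

console.log(missingFlags(sampleHelp, REQUIRED_FLAGS));
```

In practice the help text would come from running the bundled binary with --help; an empty result means every spawn-line flag is still recognized.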

Note: the vision/--mmproj runtime path is unexercised (no vision model installed locally), though the flag is present in the new binary. Reasoning suppression for template-switch families (e.g. Qwen3.5) was verified on b9590 and not re-confirmed on b9781 for lack of a local Qwen model; the kwargs are accepted with no error on b9781.

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>
@quiet-node quiet-node merged commit 43014d3 into main Jun 24, 2026
3 checks passed
@quiet-node quiet-node deleted the claude/vigilant-johnson-e83929 branch June 24, 2026 17:38