Pull requests: ggml-org/llama.cpp
server: add margin for draft model for fit
#23485
opened May 21, 2026 by
am17an
Contributor
ggml-webgpu: update WebGPU support and add link to blog/demo
#23483
opened May 21, 2026 by
reeselevine
Contributor
Add missing buffer set in allreduce fallback !COMPUTE clear
#23480
opened May 21, 2026 by
TheBlueMatt
Contributor
feat(ui): add lazy-loaded mermaid diagram rendering
#23475
opened May 21, 2026 by
StrikeOner
CUDA: fix PDL CC check for JIT compilation
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23471
opened May 21, 2026 by
JohannesGaessler
Contributor
ui: media attachments before text
examples
server/ui
#23467
opened May 21, 2026 by
sfallah
Contributor
vocab : keep DNA k-mer ids distinct from colliding BPE tokens
#23466
opened May 21, 2026 by
kashif
Contributor
cmake : remove STATIC from impl libraries, enable LLAMA_BUILD_APP by default
build
Compilation issues
examples
server
#23462
opened May 21, 2026 by
ggerganov
Member
[WebGPU] Check batch_compute_passes before sending passes when not doing GPU profiling
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#23457
opened May 21, 2026 by
nikhilJain17
Contributor
hexagon: apply repl optimization in flash attn softmax as #22993
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
#23455
opened May 21, 2026 by
njsyw1997
Contributor
Generalize Adreno MoE kernels on size M
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#23449
opened May 20, 2026 by
shawngu-quic
Contributor
Hip fattn expf approx
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23441
opened May 20, 2026 by
a-huk
MoE disk offloading for Metal
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#23440
opened May 20, 2026 by
kisasexypantera94
Draft
ggml/cpu: skip zero-scale blocks in TQ1_0 and TQ2_0 vec_dot kernels
ggml
changes relating to the ggml tensor library for machine learning
#23439
opened May 20, 2026 by
eriirfos-eng
json-schema-to-grammar: expand PCRE shorthands in pattern strings
testing
Everything test related
#23436
opened May 20, 2026 by
iOptimizeThings
ui: simplify network error handling
examples
server/ui
#23431
opened May 20, 2026 by
socram8888
Contributor
removed unecesary mmproj download when users pass --no-mmproj
#23425
opened May 20, 2026 by
ryan-mangeno
Contributor
ggml : add GGML_OP_COL2IM_1D (CPU + CUDA)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#23424
opened May 20, 2026 by
ServeurpersoCom
Contributor