Pull requests: ggml-org/llama.cpp

Pull requests list

server: add margin for draft model for fit
#23485 opened May 21, 2026 by am17an Contributor
ggml-webgpu: update WebGPU support and add link to blog/demo
#23483 opened May 21, 2026 by reeselevine Contributor
Add missing buffer set in allreduce fallback !COMPUTE clear
#23480 opened May 21, 2026 by TheBlueMatt Contributor
Optimize ggml_vec_dot_q4_K_q8_K_generic
#23474 opened May 21, 2026 by pauser0000001
CUDA: fix PDL CC check for JIT compilation
Labels: ggml, Nvidia GPU
#23471 opened May 21, 2026 by JohannesGaessler Contributor
common : fix state save in common_prompt_batch_decode
Labels: examples, testing
#23468 opened May 21, 2026 by danbev Member Draft
ui: media attachments before text
Labels: examples, server/ui
#23467 opened May 21, 2026 by sfallah Contributor
vocab : keep DNA k-mer ids distinct from colliding BPE tokens
#23466 opened May 21, 2026 by kashif Contributor
[WebGPU] Check batch_compute_passes before sending passes when not doing GPU profiling
Labels: ggml, WebGPU
#23457 opened May 21, 2026 by nikhilJain17 Contributor
hexagon: apply repl optimization in flash attn softmax as #22993
Labels: ggml, Hexagon
#23455 opened May 21, 2026 by njsyw1997 Contributor
Generalize Adreno MoE kernels on size M
Labels: ggml, OpenCL
#23449 opened May 20, 2026 by shawngu-quic Contributor
Hip fattn expf approx
Labels: ggml, Nvidia GPU
#23441 opened May 20, 2026 by a-huk
MoE disk offloading for Metal
Labels: Apple Metal, ggml
#23440 opened May 20, 2026 by kisasexypantera94 Draft
ggml/cpu: skip zero-scale blocks in TQ1_0 and TQ2_0 vec_dot kernels
Labels: ggml
#23439 opened May 20, 2026 by eriirfos-eng
json-schema-to-grammar: expand PCRE shorthands in pattern strings
Labels: testing
#23436 opened May 20, 2026 by iOptimizeThings
ggml: replace fixed 1GB context pool with growable buffer in meta backend (#22404)
Labels: ggml
#23432 opened May 20, 2026 by nonml Draft
ui: simplify network error handling
Labels: examples, server/ui
#23431 opened May 20, 2026 by socram8888 Contributor
removed unecesary mmproj download when users pass --no-mmproj
#23425 opened May 20, 2026 by ryan-mangeno Contributor
ggml : add GGML_OP_COL2IM_1D (CPU + CUDA)
Labels: ggml, Nvidia GPU, testing
#23424 opened May 20, 2026 by ServeurpersoCom Contributor