Pull requests: ggml-org/llama.cpp
Mention ANV_SYS_MEM_LIMIT in Vulkan/Linux section of build.md
documentation
Improvements or additions to documentation
#20670
opened Mar 17, 2026 by
adapt-L
ggml-webgpu: Update the RMS_NORM preprocessor and add L2_NORM
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#20665
opened Mar 17, 2026 by
yomaytk
ggml-webgpu: Add supports for DIAG and TRI
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#20664
opened Mar 17, 2026 by
yomaytk
vulkan: change gated_delta_net to shard a column across a subgroup
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#20662
opened Mar 17, 2026 by
jeffbolznv
Fix chat parser regressions: inference crashes/frozen; output backtracked
testing
Everything test related
#20660
opened Mar 17, 2026 by
jpohhhh
vulkan: dequantize iq4_xs 4 at a time
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#20657
opened Mar 16, 2026 by
netrunnereve
Make debug build possible on Windows with HIP backend.
ggml
changes relating to the ggml tensor library for machine learning
#20655
opened Mar 16, 2026 by
Exile333
Models: Add control vector functions to qwen3.5 and qwen-next models
model
Model specific
#20653
opened Mar 16, 2026 by
GreyWorks
tests: set dist_sampling seq_id to 0
testing
Everything test related
#20645
opened Mar 16, 2026 by
taronaeo
ggml-cuda: Add NVFP4 dp4a kernel
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
#20644
opened Mar 16, 2026 by
michaelw9999
[CUDA] Use a single warp per element instead of a single block per element if the K-dimension is small
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#20635
opened Mar 16, 2026 by
gaugarg-nv
ggml-cpu: add 128-bit RVV implementation for Quantization Vector Dot
ggml
changes relating to the ggml tensor library for machine learning
#20633
opened Mar 16, 2026 by
rehan-10xengineer
ggml-cpu: simd_gemm implementation for riscv vector extension
ggml
changes relating to the ggml tensor library for machine learning
#20627
opened Mar 16, 2026 by
rehan-10xengineer
[OpenVINO backend] add func is_splited_model()
ggml
changes relating to the ggml tensor library for machine learning
#20626
opened Mar 16, 2026 by
zhaixuejun1993
Speed up claude-code by prevent adding system message that starts with 'x-anthropic-'
examples
server
#20623
opened Mar 16, 2026 by
hksdpc255
Replace UINT64_MAX with LONG_TIMEOUT in blocking case to isolate possibility of Dawn/llvm-pipe bug
ggml
changes relating to the ggml tensor library for machine learning
#20621
opened Mar 16, 2026 by
nikhilJain17
•
Draft
kleidiai : fix MUL_MAT support for batched (3D) inputs
ggml
changes relating to the ggml tensor library for machine learning
#20620
opened Mar 16, 2026 by
jabr
ggml webgpu: Move to no timeout for WaitAny in graph submission to avoid deadlocks
ggml
changes relating to the ggml tensor library for machine learning
#20618
opened Mar 16, 2026 by
reeselevine
devops: add an intel mkl container
devops
improvements to build systems and github actions
#20616
opened Mar 16, 2026 by
kannon92
ggml: MXFP flash attention with SoA layout (CPU scalar reference)
examples
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
testing
Everything test related
#20609
opened Mar 15, 2026 by
timothyeburke
•
Draft
ggml blas: set mkl threads from thread context
devops
improvements to build systems and github actions
ggml
changes relating to the ggml tensor library for machine learning
#20602
opened Mar 15, 2026 by
kannon92