Pull requests: ggml-org/llama.cpp
Improving inference speed for the repack buffer type on NUMA architectures
ggml
changes relating to the ggml tensor library for machine learning
#18698
opened Jan 8, 2026 by
zzjianhui
scripts : pr2wt.sh reset to remote head
script
Script related
#18695
opened Jan 8, 2026 by
ggerganov
debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooling check
examples
#18692
opened Jan 8, 2026 by
danbev
ggml-cuda: extend concat support for more types
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18690
opened Jan 8, 2026 by
Lourdle
vulkan: Use VK_EXT_shader_64bit_indexing to handle large mat_mul(_id)
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#18678
opened Jan 7, 2026 by
jeffbolznv
Fix integer overflow in GGUF tensor parsing
ggml
changes relating to the ggml tensor library for machine learning
#18674
opened Jan 7, 2026 by
alexanderkent
cmake: fix cli build when LLAMA_BUILD_SERVER=OFF
examples
server
#18670
opened Jan 7, 2026 by
AsbjornOlling
HIP: adjust RDNA3.5 MMQ kernel selection logic
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18666
opened Jan 7, 2026 by
JohannesGaessler
server: fix n_cmpl not skipping processing prompt
examples
server
#18663
opened Jan 7, 2026 by
ngxson
docs: update ops.md for CANN backend
documentation
Improvements or additions to documentation
#18654
opened Jan 7, 2026 by
hipudding
CANN: support gated linear attn
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#18653
opened Jan 7, 2026 by
hipudding
Added note for compiling on integrated GPUs
documentation
Improvements or additions to documentation
#18633
opened Jan 6, 2026 by
alosslessdev
•
Draft
rpc : implement event and async backend APIs
ggml
changes relating to the ggml tensor library for machine learning