Pull requests: vllm-project/vllm

Pull requests list

[MoE] Introduce Fp8MoEState and class-based dispatch for DeepGemm
#37249 opened Mar 17, 2026 by yzong-rh
5 tasks
[Model] Implement LoRA support for Qwen3ASRForConditionalGeneration
Labels: documentation, qwen
#37247 opened Mar 17, 2026 by petern48
5 tasks
[Bugfix] dtype mismatch in ngram gpu propose
Labels: bug, speculative-decoding, v1
#37246 opened Mar 17, 2026 by PatchouliTIS
3 of 5 tasks
[Bugfix] Fix incorrect int8 dtype cast for kv_c_normed in MLA prefill
Labels: bug
#37245 opened Mar 17, 2026 by jacob-crux
3 of 5 tasks
[Perf] Support Flashinfer trtllm tinygemm_bf16 router gemm for GPT-OSS
Labels: gpt-oss
#37244 opened Mar 17, 2026 by elvischenv
5 tasks
[ROCm][CI] Refine gating tests
Labels: ci/build, rocm
#37243 opened Mar 17, 2026 by AndreasKaratzas Draft
[Refactor] Relocate responses API tests
Labels: ready, v1
#37241 opened Mar 17, 2026 by sfeng33
[Models][GDN] Prevent D2H sync in ChunkGatedDeltaRule
Labels: qwen, v1
#37239 opened Mar 16, 2026 by lgeiger
Fix ambiguous num_blocks for hybrid attn mamba
Labels: v1
#37236 opened Mar 16, 2026 by collinmccarthy
[Bugfix] Fix for builtins (forward fix of pytorch/177558)
Labels: bug
#37234 opened Mar 16, 2026 by Lucaskabela Draft
5 tasks
[UX] Add flashinfer-cubin as CUDA default dep
Labels: ci/build, nvidia
#37233 opened Mar 16, 2026 by mgoin
5 tasks
[Bugfix] Expand quantization method support in perf metrics
Labels: bug, v1
#37231 opened Mar 16, 2026 by thillai-c
3 of 5 tasks
Fix Qwen3.5-Next RMSNormGated Initialization Error on TPU
Labels: qwen
#37229 opened Mar 16, 2026 by jrplatin
5 tasks
[WIP] Bring back nightly build
Labels: ci/build
#37226 opened Mar 16, 2026 by atalman
[Perf] Optimize top-k search in apply_top_k_top_p_triton sampler
Labels: performance, ready, v1
#37225 opened Mar 16, 2026 by mgoin
2 of 7 tasks
[UltraVox] Fix output type
#37224 opened Mar 16, 2026 by vasqu
[Bugfix] Consolidate Gemma2/3 GGUF fixes for correctness on Blackwell
Labels: bug
#37220 opened Mar 16, 2026 by kitaekatt