Pull requests: vllm-project/vllm
[XPU] Slice GDN attention inputs to num_actual_tokens
intel-gpu
Related to Intel GPU
#42383
opened May 12, 2026 by
jasonboukheir
[XPU] Spec-decode-aware GDN attention dispatcher + replay harness
intel-gpu
Related to Intel GPU
#42382
opened May 12, 2026 by
jasonboukheir
Fix AttributeError in Gemma3n forward() when inputs_embeds is None during profiling
#42380
opened May 12, 2026 by
Ray-RP
[Bugfix] Fix RMSNorm kernels to multiply in weight's native dtype
bug
Something isn't working
#42379
opened May 12, 2026 by
liulanze
Contributor
3 of 4 tasks
[Misc] Move FlashInfer MoE helpers into utils package
nvidia
#42378
opened May 12, 2026 by
mehrdadxzaker
[Misc] Log FlashInfer runtime JIT compilation
nvidia
#42377
opened May 12, 2026 by
mehrdadxzaker
[Bugfix] Preserve user --speculative-config overrides for speculators-format models
bug
Something isn't working
#42376
opened May 12, 2026 by
elwhyjay
Contributor
3 of 4 tasks
[Misc] Warn when CUDA PTX arch flags are ignored
ci/build
nvidia
#42375
opened May 12, 2026 by
mehrdadxzaker
[Core][WIP] Standardize kv layout
cpu
Related to CPU backends
documentation
Improvements or additions to documentation
intel-gpu
Related to Intel GPU
kv-connector
needs-rebase
nvidia
performance
Performance-related issues
rocm
Related to AMD ROCm
v1
#42374
opened May 12, 2026 by
LucasWilkinson
Collaborator
Draft
fix: MoE model using shared routed experts crashes on AMD GPUs
deepseek
Related to DeepSeek models
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#42373
opened May 12, 2026 by
weizhoublue
4 tasks
[chore] Refactor pooling metadata token ID accessors
frontend
v1
#42368
opened May 12, 2026 by
taneem-ibrahim
Contributor
Fix false substring match in INCConfig.get_layer_config
#42367
opened May 12, 2026 by
n1ck-guo
Contributor
4 tasks
[Frontend] Add HyperCLOVAX-SEED-Think reasoning and tool parsers
documentation
Improvements or additions to documentation
tool-calling
#42366
opened May 12, 2026 by
ugiugi0823
7 tasks done
[Bugfix] Fix dsv4 deepgemm import
bug
Something isn't working
deepseek
Related to DeepSeek models
#42365
opened May 12, 2026 by
wzhao18
Contributor
4 tasks
[PD] Bump NIXL connector dependency to 1.x
ci/build
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#42364
opened May 12, 2026 by
alec-flowers
Contributor
Fix/full attention ghost block race
v1
#42359
opened May 12, 2026 by
jhaotingc
Contributor
3 of 4 tasks
[CI] Inline build artifact annotations in release pipeline
ci/build
#42357
opened May 12, 2026 by
khluu
Collaborator
3 tasks
[CI] Migrate remaining B200 jobs to b200-k8s queue
ci/build
#42356
opened May 12, 2026 by
khluu
Collaborator
1 of 2 tasks
[CI] Move DockerHub and PyPI publish steps to end of release pipeline
ci/build
#42355
opened May 12, 2026 by
khluu
Collaborator
3 tasks
[Performance] Use np.fromiter for dict-to-array conversion in hot paths
v1
#42352
opened May 11, 2026 by
lokashrinav
Contributor
2 tasks
Create test_kimi_k2_thinking_nvfp4.py for accuracy check
#42351
opened May 11, 2026 by
puririshi98
Contributor
Draft
[Frontend] Add image-capable TranslateGemma chat template
documentation
Improvements or additions to documentation
#42350
opened May 11, 2026 by
anubhav-sachan
4 tasks
[Spec Decode] Fix EAGLE3 MRoPE decode-step position computation to use per-request delta
speculative-decoding
v1
#42349
opened May 11, 2026 by
eppaneamd