Commit fb0331f
[Feature](ARD-2967) ard: add mqa score for spec decode for cape
Root Cause:
Add MQA scoring on GCU (currently only on the xformers backend).
vllm-project/vllm#9298
Some changes follow vllm-project/vllm#9291 and vllm-project/vllm#12093.
Solution:
Add MQA scoring for speculative decoding on GCU.
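A minimal sketch of the idea, assuming "mqa score" means scoring all proposed tokens of a sequence in one attention call with multiple queries, rather than expanding each proposal into its own single-query sequence. It uses plain Python only. Every name here (`attend`, `score_batch_expansion`, `score_mqa`) is hypothetical and is not taken from vllm or vllm_gcu.

```python
import math
import random


def attend(q, keys, values):
    """Single-head softmax attention for one query vector."""
    scale = 1.0 / math.sqrt(len(q))
    logits = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def score_batch_expansion(q, k, v, context_len):
    """Baseline (assumed): one expanded sequence per scored position."""
    outputs = []
    for pos in range(context_len, len(q)):
        # Each expanded sequence re-reads its own prefix of keys/values.
        outputs.append(attend(q[pos], k[:pos + 1], v[:pos + 1]))
    return outputs


def score_mqa(q, k, v, context_len):
    """MQA-style (assumed): all queries of a sequence share one causal mask."""
    num_queries = len(q) - context_len
    # mask[i][j] is True when query i is allowed to see key j.
    mask = [[j <= context_len + i for j in range(len(k))]
            for i in range(num_queries)]
    outputs = []
    for i in range(num_queries):
        visible_k = [k[j] for j in range(len(k)) if mask[i][j]]
        visible_v = [v[j] for j in range(len(v)) if mask[i][j]]
        outputs.append(attend(q[context_len + i], visible_k, visible_v))
    return outputs


random.seed(0)
dim, context_len, num_spec = 4, 5, 3
total_len = context_len + num_spec + 1  # context + bonus token + proposals


def rand_rows(n):
    return [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


q, k, v = rand_rows(total_len), rand_rows(total_len), rand_rows(total_len)
expanded = score_batch_expansion(q, k, v, context_len)
fused = score_mqa(q, k, v, context_len)
same = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
           for row_a, row_b in zip(expanded, fused)
           for a, b in zip(row_a, row_b))
```

Both paths produce the same per-position outputs. The difference in a real backend would be that the multi-query form handles one sequence in one call instead of `num_spec + 1` expanded sequences.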
Test:
Tested on GCU.
Impact area:
vllm
Fix status:
N/A
Change-Id: I977305ba636741090478bb4721b97f297d4aac6c
1 parent ac87174 commit fb0331f
File tree
3 files changed, +1189 -0 lines changed
- vllm_gcu
- attention/backends
- models/qwen