This repository was archived by the owner on Sep 4, 2025. It is now read-only.
Commit ce1670b
bs/seq bucketing for prompt and decode (#33)
* Bucketing/Warmup WIP
* Cleanup
* Revert "Fix model_output_idx on HPU (#27)"
This reverts commit 90dfa92.
* Rework selected_token_indices fix to also work with block_size padding
* Simple prompt attention POC
* Remove cumsum
* MQA/GQA support for simple prompt_attention
* Cleanup
* Fix typo
* Restore profiling runs

1 parent 2664659 commit ce1670b
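
For readers unfamiliar with the idea named in the commit title, here is a minimal sketch of batch-size/sequence-length bucketing: each batch is padded up to the nearest predefined bucket so that only a small, fixed set of tensor shapes ever reaches the device (which is also what makes a warmup pass over all buckets feasible). This is not the code from this commit. The names `find_bucket` and `pad_to_buckets`, the bucket lists, and the `pad_id` value are all hypothetical and chosen for illustration only.

```python
import bisect


def find_bucket(value: int, buckets: list[int]) -> int:
    """Return the smallest bucket >= value (buckets must be sorted ascending)."""
    idx = bisect.bisect_left(buckets, value)
    if idx == len(buckets):
        raise ValueError(f"{value} exceeds the largest bucket {buckets[-1]}")
    return buckets[idx]


def pad_to_buckets(
    seqs: list[list[int]],
    bs_buckets: list[int],
    seq_buckets: list[int],
    pad_id: int = 0,  # hypothetical padding token id
) -> list[list[int]]:
    """Pad a batch of token-id lists to a bucketed (batch_size, seq_len) shape."""
    target_bs = find_bucket(len(seqs), bs_buckets)
    target_len = find_bucket(max(len(s) for s in seqs), seq_buckets)
    # Pad every real sequence up to the bucketed length.
    padded = [s + [pad_id] * (target_len - len(s)) for s in seqs]
    # Append all-padding rows until the bucketed batch size is reached.
    padded += [[pad_id] * target_len for _ in range(target_bs - len(seqs))]
    return padded


# Illustrative bucket lists, not values taken from the commit.
batch = pad_to_buckets(
    [[1, 2, 3], [4], [5, 6, 7, 8, 9]],
    bs_buckets=[1, 2, 4],
    seq_buckets=[4, 8],
)
```

With three sequences whose longest has five tokens, the batch in this sketch is padded to the (4, 8) bucket. The trade-off is some wasted compute on padding in exchange for a bounded number of distinct shapes.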
File tree
5 files changed: +225 -763 lines changed
- vllm
  - attention/backends
  - hpu
  - model_executor
  - worker