Commit c663f31
Fix critical NWOR bugs causing 0% acceptance rate
Bug #1 (CRITICAL): Add missing begin() and stage() methods to KVWriteRouter
- Flash attention backend calls router.begin() and router.stage()
- KVWriteRouter only had write() and commit() methods
- Added begin() to store slot_mapping and initialize shadow buffer
- Added stage() to extract per-timestep slot and stage KV pairs
- Without these, no tokens were being staged → 0% acceptance rate
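
The begin/stage/commit flow described for Bug #1 can be sketched roughly as follows. This is a toy illustration, not the code in write_router.py: the class name `ToyKVWriteRouter`, the method signatures, the dict-backed cache, and the `accepted_len` argument to `commit()` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

KV = Tuple[float, float]  # toy (key, value) pair; real code would hold tensors


class ToyKVWriteRouter:
    """Hypothetical stand-in for KVWriteRouter (not the vLLM implementation)."""

    def __init__(self) -> None:
        self.cache: Dict[int, KV] = {}                  # slot -> KV pair
        self._slot_mapping: Optional[List[int]] = None  # set by begin()
        self._shadow: List[Tuple[int, KV]] = []         # staged, uncommitted writes

    def begin(self, slot_mapping: List[int]) -> None:
        # Store the slot mapping and start with an empty shadow buffer.
        self._slot_mapping = list(slot_mapping)
        self._shadow = []

    def stage(self, timestep: int, key: float, value: float) -> None:
        # Look up the slot for this timestep and stage the KV pair.
        if self._slot_mapping is None:
            raise RuntimeError("begin() must be called before stage()")
        slot = self._slot_mapping[timestep]
        self._shadow.append((slot, (key, value)))

    def commit(self, accepted_len: int) -> int:
        # Assumed semantics: write only the first accepted_len staged pairs,
        # discard the rest, and return how many pairs were written.
        accepted = self._shadow[:accepted_len]
        for slot, kv in accepted:
            self.cache[slot] = kv
        self._shadow = []
        self._slot_mapping = None
        return len(accepted)
```

In this sketch, if the backend never reaches `stage()`, the shadow buffer stays empty and `commit()` writes nothing, which is the kind of failure the commit message ties to the 0% acceptance rate.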
Bug #2 (MODERATE): Fix bonus token counting in accepted_lens
- valid_sampled_token_ids includes [accepted_draft_tokens..., bonus_token]
- Previous: len([bonus]) = 1, incorrectly counted as 1 accepted draft token
- Fixed: Use max(0, len(seq) - 1) to exclude bonus token from count
- Now correctly reports 0 accepted when only bonus token is present
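
A minimal sketch of the counting fix for Bug #2, assuming each entry of `valid_sampled_token_ids` is a per-sequence list ending in the bonus token. The helper name `accepted_draft_lens` is hypothetical; the actual change in gpu_model_runner.py may be structured differently.

```python
from typing import List, Sequence


def accepted_draft_lens(valid_sampled_token_ids: Sequence[Sequence[int]]) -> List[int]:
    # Each sequence is [accepted_draft_tokens..., bonus_token]; drop the bonus
    # token from the count and clamp at 0 so an empty sequence never goes negative.
    return [max(0, len(seq) - 1) for seq in valid_sampled_token_ids]


# A bonus-only sequence reports 0 accepted draft tokens rather than 1.
print(accepted_draft_lens([[11], [1, 2, 3], []]))  # prints [0, 2, 0]
```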
Files modified:
- vllm/v1/kv_cache/write_router.py: Added begin() and stage() methods
- vllm/v1/worker/gpu_model_runner.py: Fixed accepted_lens calculation
1 parent a3c136b commit c663f31
2 files changed: 54 additions, 2 deletions
Diff hunks (line numbers only):
- vllm/v1/kv_cache/write_router.py: 1 line added at new line 103; 47 lines added at new lines 120-166
- vllm/v1/worker/gpu_model_runner.py: old line 2356 replaced by new lines 2356-2359; old line 2359 replaced by new lines 2362-2363