Commit aca9493
Fix speculator model integration by detecting speculators before ModelConfig creation
When 'vllm serve' was invoked with a speculator model path directly
(e.g., RedHatAI/Llama-3.1-8B-Instruct-speculator.eagle3), tokenizer
loading failed because ModelConfig was created with the speculator
path before maybe_override_with_speculators() could swap it to the
target model path.
This fix moves the maybe_override_with_speculators() call to happen
BEFORE create_model_config(), as sketched below, ensuring that:
1. Speculator models are detected early
2. The target model path is extracted from the speculators config
3. ModelConfig is created with the correct target model path
4. Tokenizer loads successfully from the target model
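
The following is a minimal, self-contained sketch of the reordering described above. The helper names mirror those in the commit message (maybe_override_with_speculators, create_model_config, ModelConfig), but their bodies are simplified stand-ins, not vLLM's actual implementation, and the target model path shown is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Stand-in for vLLM's ModelConfig; only the field relevant here."""
    model: str  # path that tokenizer loading will use


def maybe_override_with_speculators(model: str) -> str:
    """Return the target model path when `model` points at a speculator.

    Simplified stand-in: the real helper reads the speculators config
    and extracts the target model path from it.
    """
    if "speculator" in model:
        # Hypothetical target path, for illustration only.
        return "meta-llama/Llama-3.1-8B-Instruct"
    return model


def create_model_config(model: str) -> ModelConfig:
    """Stand-in for ModelConfig creation, where tokenizer loading begins."""
    return ModelConfig(model=model)


# e.g. `vllm serve RedHatAI/Llama-3.1-8B-Instruct-speculator.eagle3`
requested = "RedHatAI/Llama-3.1-8B-Instruct-speculator.eagle3"

# Old order: create_model_config(requested) ran first, so the tokenizer
# was loaded from the speculator path and failed.
# New order: detect the speculator and swap in the target model first.
target = maybe_override_with_speculators(requested)
config = create_model_config(target)
print(config.model)  # target model path, not the speculator path
```

With this ordering, ModelConfig never sees the speculator path, so the tokenizer load resolves against the target model checkpoint.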
Signed-off-by: Rahul Tuli <[email protected]>
1 file changed: 6 additions and 4 deletions