
Commit c4d1c1b

njhill authored and Alvant committed
[SpecDecoding] Update MLPSpeculator CI tests to use smaller model (vllm-project#6714)
Signed-off-by: Alvant <[email protected]>
1 parent 9872744 commit c4d1c1b

File tree

1 file changed: +3 -3 lines changed

tests/spec_decode/e2e/test_mlp_correctness.py

Lines changed: 3 additions & 3 deletions
@@ -24,14 +24,14 @@
 from .conftest import run_greedy_equality_correctness_test
 
 # main model
-MAIN_MODEL = "ibm-granite/granite-3b-code-instruct"
+MAIN_MODEL = "JackFram/llama-160m"
 
 # speculative model
-SPEC_MODEL = "ibm-granite/granite-3b-code-instruct-accelerator"
+SPEC_MODEL = "ibm-fms/llama-160m-accelerator"
 
 # max. number of speculative tokens: this corresponds to
 # n_predict in the config.json of the speculator model.
-MAX_SPEC_TOKENS = 5
+MAX_SPEC_TOKENS = 3
 
 # precision
 PRECISION = "float32"
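For orientation, here is a minimal sketch (not part of the commit) of how these constants could be wired into a vLLM speculative-decoding run. The engine arguments `speculative_model`, `num_speculative_tokens`, and `dtype` are assumptions based on vLLM's API around this release; the actual test exercises them indirectly through `run_greedy_equality_correctness_test`.

# Hypothetical usage sketch, not taken from the test file.
from vllm import LLM, SamplingParams

MAIN_MODEL = "JackFram/llama-160m"
SPEC_MODEL = "ibm-fms/llama-160m-accelerator"
MAX_SPEC_TOKENS = 3       # should not exceed n_predict in the speculator's config.json
PRECISION = "float32"

llm = LLM(
    model=MAIN_MODEL,
    speculative_model=SPEC_MODEL,            # assumed arg: MLPSpeculator draft model
    num_speculative_tokens=MAX_SPEC_TOKENS,  # assumed arg: tokens proposed per step
    dtype=PRECISION,
)
outputs = llm.generate(["The future of AI is"],
                       SamplingParams(temperature=0.0, max_tokens=32))
print(outputs[0].outputs[0].text)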
