Commit bc18f82

Disable prefix cache by default for benchmark (vllm-project#18639)
Signed-off-by: cascade812 <[email protected]>
1 parent 94dacb8 commit bc18f82

1 file changed: +3 -0 lines changed

latency.py

Lines changed: 3 additions & 0 deletions
@@ -80,6 +80,9 @@ def add_cli_args(parser: argparse.ArgumentParser):
     )
 
     parser = EngineArgs.add_cli_args(parser)
+    # V1 enables prefix caching by default, which skews the latency
+    # numbers, so the benchmark disables prefix caching by default.
+    parser.set_defaults(enable_prefix_caching=False)
 
 
 def main(args: argparse.Namespace):
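
The effect of the added `set_defaults` call can be sketched with plain `argparse`. This is a minimal illustration, not the vLLM code: `add_engine_args` is a hypothetical stand-in for `EngineArgs.add_cli_args`, and the way it registers `--enable-prefix-caching` (a `BooleanOptionalAction` with no explicit default) is an assumption made for the example. It shows that a parser-level default set after the argument is registered applies when the flag is omitted, while an explicit flag on the command line still overrides it.

```python
import argparse


def add_engine_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    # Hypothetical stand-in for EngineArgs.add_cli_args (assumption):
    # registers a boolean flag whose default is decided elsewhere.
    parser.add_argument(
        "--enable-prefix-caching",
        action=argparse.BooleanOptionalAction,
        default=None,
    )
    return parser


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="latency benchmark (sketch)")
    parser = add_engine_args(parser)
    # Benchmark-specific default: prefix caching is off unless requested.
    parser.set_defaults(enable_prefix_caching=False)
    return parser


if __name__ == "__main__":
    # No flag given: the benchmark default applies.
    print(build_parser().parse_args([]).enable_prefix_caching)
    # Flag given explicitly: the user's choice overrides the default.
    print(build_parser().parse_args(["--enable-prefix-caching"]).enable_prefix_caching)
```

Because the default is changed on the parser rather than hard-coded later, anyone who wants to measure latency with prefix caching can still pass the flag explicitly.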
