
Conversation

@xiaobochen123 (Contributor) commented Aug 6, 2024

With automatic prefix caching enabled, block_manager_v2 performs worse than v1.

  • Llama-3.1-8B on an H800 GPU
  • Tested on 3510 cases from the MMLU dataset
from vllm import LLM, SamplingParams

llm = LLM(
    model=path,
    tensor_parallel_size=1,
    trust_remote_code=True,
    gpu_memory_utilization=0.8,
    max_num_seqs=512,
    enable_prefix_caching=True,
    use_v2_block_manager=XXXX,  # toggled between False (v1) and True (v2)
)

sampling_params = SamplingParams(temperature=1.0, max_tokens=1)

mmlu_dataset = [...]  # 3510 cases from mmlu

outputs = llm.generate(
    sampling_params=sampling_params,
    prompt_token_ids=mmlu_dataset,
)


The self.free_table in evictor_v2::LRUEvictor is an OrderedDict, which remembers the order in which keys were first inserted, so the entries with the largest timestamps tend to sit at the end.
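To illustrate (a standalone Python sketch, not vLLM code): updating a key's value in an OrderedDict does not change its position, so an explicit move_to_end is needed if the front is supposed to hold the oldest entry.

from collections import OrderedDict

free_table = OrderedDict()
free_table["block_a"] = 1.0   # inserted first -> stays at the front
free_table["block_b"] = 2.0
free_table["block_c"] = 3.0

# Updating block_a with a newer timestamp does NOT move it;
# insertion order is preserved, so the front entry no longer
# has the smallest timestamp.
free_table["block_a"] = 4.0
print(next(iter(free_table)))     # block_a (stale position)

# move_to_end restores the invariant: oldest entry at the front.
free_table.move_to_end("block_a")
print(next(iter(free_table)))     # block_b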

The reason V2 is slower than V1 is that evict() in V2 scans the entire free_table.

V2 also has an update() method, which breaks this ordering. The fix is to move the block to the end of the OrderedDict when it is updated, so the entry with the lowest timestamp stays at the front and evict() no longer needs to scan the whole table.
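As a rough sketch of the idea, with simplified, illustrative names (LRUEvictorSketch and the add/update/evict signatures below are assumptions, not the actual vLLM API):

from collections import OrderedDict

class LRUEvictorSketch:
    """Sketch of the LRU evictor idea: keep free_table ordered so the
    least-recently-used block always sits at the front."""

    def __init__(self):
        # block_id -> last_accessed timestamp
        self.free_table = OrderedDict()

    def add(self, block_id: int, last_accessed: float):
        self.free_table[block_id] = last_accessed

    def update(self, block_id: int, last_accessed: float):
        # Refresh the timestamp AND move the block to the end, so the
        # insertion order stays sorted by last_accessed.
        self.free_table[block_id] = last_accessed
        self.free_table.move_to_end(block_id)

    def evict(self) -> int:
        # With the invariant above, the first entry is the LRU block,
        # so there is no need to scan the whole table.
        block_id, _ = self.free_table.popitem(last=False)
        return block_id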

github-actions bot commented Aug 6, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build in the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add ready label to the PR
  • Enable auto-merge.

🚀

@youkaichao (Member)

thanks for the contribution!

cc @cadedaniel @zhuohan123

@cadedaniel (Collaborator)

Looks good to me, although the NeuralMagic folks have a better understanding of the prefix caching paths. cc @robertgshaw2-neuralmagic

@youkaichao (Member)

Looks pretty reasonable to me, and the tests also passed. I will go ahead and merge this.

thanks again @xiaobochen123 for the contribution!

@youkaichao youkaichao merged commit 660470e into vllm-project:main Aug 6, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025
