Add prefix caching #586
Merged
+1,173
−62
Changes from 41 commits
42 commits
5447dc5 Replace the block_pool list with the vLLM Block Pool (maxdebayser)
a43e072 manage padding blocks outside of block pool (maxdebayser)
6378294 Switch to Single Type KV Cache manager (maxdebayser)
5f188ac Add prefix caching (maxdebayser)
ed7441f Fix small errors (maxdebayser)
6983f2a Fix prefix caching path (maxdebayser)
f8dd1d2 fix linting problem (maxdebayser)
78a8b84 Merge branch 'integrate_block_pool' into sched_agnostic_pc (maxdebayser)
0296747 address review comments (maxdebayser)
8420160 Merge branch 'main' into integrate_block_pool (maxdebayser)
fbaf933 Merge branch 'integrate_block_pool' into sched_agnostic_pc (maxdebayser)
ab2f4d3 fix mispelling (maxdebayser)
a514310 address some review comments (maxdebayser)
ec9b1f6 tmp hack: run tests on this branch (yannicks1)
76059a4 fix: tmp hack to run tests (yannicks1)
41d3ea9 fix bug when no cache is found (maxdebayser)
dc92a9b add first unit test prefix caching (yannicks1)
041b3fa disable prefix caching by default and enable tests (maxdebayser)
a491e7d Merge branch 'main' into integrate_block_pool (maxdebayser)
f71df91 address review comments (maxdebayser)
e14acff Merge branch 'integrate_block_pool' into sched_agnostic_pc (maxdebayser)
4808379 reduce test repetition (maxdebayser)
f201675 revert bad change (maxdebayser)
4d4a228 Merge branch 'integrate_block_pool' into sched_agnostic_pc (maxdebayser)
a11a9b5 add test: prefix hit of a seq not part of the batch. (yannicks1)
fa51002 reset prefixes across tests in cached engine (yannicks1)
78c9141 Merge branch 'main' into sched_agnostic_pc (maxdebayser)
5c45428 adding tests: limit number of blocks (yannicks1)
8c96eec Merge branch 'main' into sched_agnostic_pc (yannicks1)
4823a7f Merge branch 'main' into sched_agnostic_pc (yannicks1)
fe8d383 Fix test (maxdebayser)
a1d140b Merge branch 'sched_agnostic_pc' of github.com:vllm-project/vllm-spyr… (maxdebayser)
882a530 revert tmp hack (yannicks1)
cbc2b35 fix isort (yannicks1)
ddbbeec add more tests and fix a small bug in the model runner (maxdebayser)
9559d46 appease linter (maxdebayser)
9247e92 Merge branch 'sched_agnostic_pc' of github.com:vllm-project/vllm-spyr… (maxdebayser)
fd7cc90 update hf_cache (maxdebayser)
2fa71e5 address review comments (maxdebayser)
5495953 address review comments (maxdebayser)
d9c4d4a improve comment (maxdebayser)
f0de597 Merge branch 'main' into sched_agnostic_pc (maxdebayser)