Exceed TKV Limit Unit Test and Bugfix #913
👋 Hi! Thank you for contributing to vLLM support on Spyre. We also recommend installing prek and configuring it to check your code before every local commit.
Force-pushed from 5ae68c2 to 81de71f.
bot:test
    max_num_batched_tokens=512,
    max_num_seqs=32,
    max_model_len=32768,
    available_blocks=32768,
I don't see this having an impact here, but maybe set this to 8192 as this is the max number of blocks we currently support? 32K blocks (of 64 tokens) seems like a looot.
So we did actually have available_blocks=8192 at first... But there was an error stating that 16385 blocks were needed. Maybe @joerunde can comment further on this.
+1, though we ended up with some startup errors because the max batch tkv limit wasn't being respected, which ended up crashing because we calculated a required 16k blocks 😢
We should probably understand why the env patch wasn't getting picked up, but that also doesn't need to block getting this in.
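For reference, a back-of-the-envelope check on the block counts discussed in this thread. This is only a sketch: it assumes the 64-token block size mentioned above and that the required block count is sized for max_num_seqs sequences, each at max_model_len. The helper name is hypothetical and not part of vllm-spyre.

```python
import math


def blocks_for_full_batch(max_model_len: int, max_num_seqs: int,
                          block_size: int = 64) -> int:
    """Hypothetical helper: blocks needed if every sequence in the batch
    reaches max_model_len (assumed sizing rule, not confirmed by the PR)."""
    # Blocks one sequence needs at full length, rounded up.
    blocks_per_seq = math.ceil(max_model_len / block_size)
    # Assume all sequences in the batch are at full length at once.
    return blocks_per_seq * max_num_seqs


# Values taken from the test config under review.
needed = blocks_for_full_batch(max_model_len=32768, max_num_seqs=32)
print(needed)         # 16384
print(needed > 8192)  # True
```

Under these assumptions a full batch needs 16384 blocks, which exceeds 8192 and is one short of the 16385 reported in the error. The extra block may be a reserved one, but that is a guess.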
@Daniel-Schenker it looks like we have internal GH links in the PR description. We were warned not to do that.
^^ good catch @prashantgupta24!
@joerunde @yannicks1 I have cleaned up the test case a bit and added some (hopefully) helpful comments, as this is a pretty niche test. Let me know if it looks good.
I confirmed that the test still fails without the scheduler changes and passes with them.
Signed-off-by: Daniel Schenker <[email protected]>
Force-pushed from 5342f79 to b8b9141.
Comments have been addressed.
Description
This PR adds a unit test case to reproduce [redacted]
A potential fix for the bug is also included in this PR.
Related Issues
Relates to [redacted]
Test Plan
Run the existing and newly added vllm-spyre unit tests.
Checklist
(bash format.sh)
Signed-off-by: line (DCO compliance)