
Conversation

@tjohnson31415 (Collaborator) commented Nov 19, 2025

Description

Sets the default chunk size for granite 3 8b TP4 to the expected/supported value of 4096.

Note that using --max-num-batched-tokens does not override this setting, but setting VLLM_DT_CHUNK_LEN directly will take precedence.
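
As a rough sketch of the precedence described above (the constant and function names here are illustrative, not the plugin's actual code):

```python
import os

# Assumed default from this PR's description: 4096 for granite 3 8b at TP4.
GRANITE_3_8B_TP4_DEFAULT_CHUNK_LEN = 4096


def resolve_chunk_len() -> int:
    """Illustrative precedence only: an explicit VLLM_DT_CHUNK_LEN wins,
    otherwise the model-specific default applies. --max-num-batched-tokens
    is not consulted for this value."""
    env_value = os.environ.get("VLLM_DT_CHUNK_LEN")
    if env_value is not None:
        return int(env_value)
    return GRANITE_3_8B_TP4_DEFAULT_CHUNK_LEN
```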

@github-actions (bot)

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure that your code passes all the linting checks, otherwise your PR cannot be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@yannicks1 (Collaborator) left a comment


Could we assert that the backend is Spyre? This would allow us to still pass different chunk sizes when testing on CPU.

@maxdebayser (Collaborator)

> Could we assert that the backend is Spyre? This would allow us to still pass different chunk sizes when testing on CPU.

Good idea.

@tjohnson31415 (Collaborator, Author)

> Could we assert that the backend is Spyre?
Updated.

I also changed the code to allow VLLM_DT_CHUNK_LEN to override the new defaulting behavior even on the Spyre backend (this is intended for dev testing/debugging; for user-facing settings, we'd want --max-num-batched-tokens instead).
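
A minimal sketch of that gating, assuming a hypothetical is_spyre_backend flag rather than the plugin's real platform check:

```python
import os


def maybe_set_default_chunk_len(is_spyre_backend: bool) -> None:
    """Illustrative only: apply the 4096 default when running on the Spyre
    backend, but let an explicit VLLM_DT_CHUNK_LEN override it (useful for
    dev testing/debugging, or for passing different chunk sizes on CPU)."""
    if "VLLM_DT_CHUNK_LEN" in os.environ:
        return  # an explicit env setting always takes precedence
    if is_spyre_backend:
        os.environ["VLLM_DT_CHUNK_LEN"] = "4096"
```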

@tjohnson31415 merged commit bf07727 into main on Nov 20, 2025
20 checks passed
@tjohnson31415 deleted the default-chunk-size branch on November 20, 2025 at 22:43