@wallashss
Collaborator

@wallashss wallashss commented Sep 26, 2025

Description

This PR introduces a wrapper around the logits processors that are injected into vLLM, so that they are applied per request instead of per batch. The wrapper is initialized with the logits processor class and the batch_size. From the logits processor's perspective, it "thinks" it is handling only one request per step, while the wrapper receives the batch of logits, slices it, and redistributes the slices to each request.
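For readers skimming the thread, the idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: the class names `BanTokenProcessor` and `PerRequestWrapper`, the `add_request` method, and the use of plain Python lists in place of tensors are all hypothetical and chosen to keep the example self-contained.

```python
from typing import Callable, List, Optional

Row = List[float]


class BanTokenProcessor:
    """Hypothetical processor written as if the batch always has one row."""

    def __init__(self, banned_token: int) -> None:
        self.banned_token = banned_token

    def apply(self, logits: List[Row]) -> List[Row]:
        # The processor only ever expects a single request per step.
        assert len(logits) == 1
        row = list(logits[0])
        row[self.banned_token] = float("-inf")
        return [row]


class PerRequestWrapper:
    """Hypothetical wrapper: keeps one processor instance per batch slot."""

    def __init__(self, processor_cls: Callable[..., BanTokenProcessor],
                 batch_size: int) -> None:
        self.processor_cls = processor_cls
        self.slots: List[Optional[BanTokenProcessor]] = [None] * batch_size

    def add_request(self, index: int, **params: int) -> None:
        # Create a dedicated processor for the request in this slot.
        self.slots[index] = self.processor_cls(**params)

    def apply(self, logits: List[Row]) -> List[Row]:
        out: List[Row] = []
        for index, row in enumerate(logits):
            processor = self.slots[index]
            if processor is None:
                # No processor registered for this slot: pass through.
                out.append(list(row))
            else:
                # Hand the processor a batch of exactly one row.
                out.append(processor.apply([row])[0])
        return out


wrapper = PerRequestWrapper(BanTokenProcessor, batch_size=3)
wrapper.add_request(0, banned_token=1)
wrapper.add_request(2, banned_token=0)
batch = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
result = wrapper.apply(batch)
```

In this sketch the second row passes through unchanged because no processor was registered for that slot, while rows 0 and 2 are each processed by their own instance.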

Signed-off-by: Wallas Santos <[email protected]>
@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
@wallashss
Collaborator Author

bot:test


pytestmark = [pytest.mark.full_model, pytest.mark.other_e2e]

# TODO: REVERT THIS CHANGE!
Collaborator

Can we parametrize the tests so that they get executed for both SB and CB?

Collaborator

or do we already have a similar test for CB?

Collaborator Author

We do not have this test for CB. I think the concern here is that it would increase CI time too much. Moreover, these tests do not reproduce the issue in this PR very well.

Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
@wallashss
Collaborator Author

bot:test

Collaborator

@maxdebayser maxdebayser left a comment

I think this is OK as a stopgap solution, but we should think about a more comprehensive solution later, when more of the sampling params are implemented with logits processors. There could be performance implications in applying the LPs per request vs. per batch.

@wallashss
Collaborator Author

bot:test

@wallashss
Collaborator Author

bot:test

@wallashss
Collaborator Author

bot:test

@wallashss wallashss requested a review from joerunde as a code owner October 1, 2025 14:26
@wallashss
Collaborator Author

bot:test

@wallashss wallashss merged commit a70890c into main Oct 1, 2025
19 of 20 checks passed
@wallashss wallashss deleted the wallas-fix-cb-logits-processors branch October 1, 2025 16:53
@wallashss
Collaborator Author

bot:test

4 participants