@wallashss
Collaborator

Description

This PR fixes FP8 support with static batching, which was previously hardcoded.

Related Issues

Signed-off-by: Wallas Santos <[email protected]>
@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@wallashss
Collaborator Author

bot:test

Collaborator

@joerunde joerunde left a comment

manually validated on spyre, lgtm!

@joerunde joerunde merged commit 7f2c8e7 into main Sep 11, 2025
16 of 19 checks passed
@joerunde joerunde deleted the wallas-fix-sb-fp8 branch September 11, 2025 23:29
yannicks1 added a commit that referenced this pull request Sep 12, 2025
### [fp8] fix tests: increase ISCLOSE_ABS_TOL_QUANTIZATION

After #457, the fp8 tests in test_spyre_basic.py::test_output were failing.
This PR increases `ISCLOSE_ABS_TOL_QUANTIZATION` so that the tests pass
again.
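
As a rough illustration of what raising an absolute tolerance does, the sketch below compares two lists of logprobs with `math.isclose`. Only the name `ISCLOSE_ABS_TOL_QUANTIZATION` comes from this commit message; the numeric values, the `DEFAULT_ABS_TOL` constant, and the `logprobs_match` helper are hypothetical and are not taken from the actual test suite.

```python
import math

# Hypothetical values for illustration only; the real constants and their
# values are defined in the test suite and are not stated in this thread.
DEFAULT_ABS_TOL = 0.08
ISCLOSE_ABS_TOL_QUANTIZATION = 0.125


def logprobs_match(expected, actual, quantized):
    """Return True if both lists have equal length and every pair of values
    agrees within the absolute tolerance chosen for the given mode."""
    abs_tol = ISCLOSE_ABS_TOL_QUANTIZATION if quantized else DEFAULT_ABS_TOL
    return len(expected) == len(actual) and all(
        math.isclose(e, a, abs_tol=abs_tol) for e, a in zip(expected, actual)
    )
```

Under these assumed values, a difference of about 0.1 between two logprobs is accepted for a quantized model but rejected for a non-quantized one, which is the kind of slack a looser quantization tolerance provides.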

---------

Signed-off-by: Yannick Schnider <[email protected]>
prashantgupta24 added a commit that referenced this pull request Oct 1, 2025
# Description

FP8 should now be (almost) fully supported. Some tests still fail due
to output mismatches; those are manually marked `xfail`.

## Related PRs

fix: static batching with FP8 - #457
and
fix: tests for graph comparison with FP8 - #462

---------

Signed-off-by: Prashant Gupta <[email protected]>
3 participants