[Model] Support Mistral3 in the HF Transformers format #15505
Conversation
Signed-off-by: mgoin <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
The Transformers PR has been merged; can you update this one?
Signed-off-by: DarkLight1337 <[email protected]>
Can you merge neuralmagic#56 into this PR? Otherwise the PR LGTM.
Hello, is there also an example that makes use of the chat_template?
Co-authored-by: Cyrus Leung <[email protected]>
What do you mean by not working?
I start the model [command not captured]. Now when I send a query that includes a system prompt, I get this error [traceback not captured]. Without the system prompt it works.
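For reference, the shape of a request that includes a system prompt can be sketched as follows; the model name, prompt text, and parameters are illustrative assumptions, not taken from this thread:

```python
import json

# Hypothetical /v1/chat/completions request body with a system prompt.
# Model name and message text are placeholders, not from the PR.
request = {
    "model": "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Describe the weather in one sentence."},
    ],
    "max_tokens": 64,
}

body = json.dumps(request)
print(body)
```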
Yes, the chat_template.json in the model card is now used by default, matching the behavior of Transformers. I would expect this to be an issue with the chat template itself rather than with vLLM, so I would recommend opening an issue on the upstream model card.
@thies1006 I had the same issue and am now using a modified version of the template. EDIT: fixed the system prompt; I've opened a PR on HF.
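One way to apply such a locally modified template is vLLM's `--chat-template` flag; the model name and template path below are assumptions, not from this thread:

```shell
# Sketch only: serve with a local chat template instead of the model
# card's chat_template.json. Model name and path are placeholders.
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 \
  --chat-template ./fixed_chat_template.jinja
```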
[Model] Support Mistral3 in the HF Transformers format (#15505) Signed-off-by: mgoin <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Works for text input and single-image batches. Requires a fix to the Pixtral processing in Transformers (huggingface/transformers#37019). It still fails a full ChartQA eval in vLLM V1, seemingly due to batched-encoding issues, so I have forced this model to run only with V0 for now.
FIX #15212
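As an illustration of the single-image path described above, an OpenAI-style chat payload with one image could be built like this; the image URL and question are placeholders, not from the PR:

```python
# Sketch of a single-image chat message of the kind this PR exercises.
# The URL and question are illustrative placeholders.
image_url = "https://example.com/chart.png"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": "What trend does this chart show?"},
        ],
    }
]

print(messages[0]["role"], len(messages[0]["content"]))
```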
Testing
Server started with
ChartQA Eval
FP8 checkpoint:
Original checkpoint:
Single-image example script
Client script:
Output: