[Model] Add GraniteMoeHybrid 4.0 model #17497
Merged: DarkLight1337 merged 24 commits into vllm-project:main from s3woz:granitemoehybrid_clean on May 6, 2025
Commits
24bf59b Added test to functionally verify match between HF and vLLM (bohnstingl)
b7e89a0 GraniteMoeHybrid model (s3woz)
e0136b3 Removed MLA and added RoPE (bohnstingl)
4cfef42 Updated basic examples (bohnstingl)
859e473 TensorParallel and cleanup (s3woz)
bcb5e77 Fixing previous commit (s3woz)
18bb63d Cleanup (s3woz)
7cb6a81 Cleanup (s3woz)
b69ca16 Removing sampler. Fixing URLs (s3woz)
6f8b1f4 Fixing pre-hook errors. (s3woz)
c3b6460 ruff reformatting pre-commit fix (s3woz)
171532c Skip tests until HF models become available (s3woz)
586247a Pre-commit hook fix (s3woz)
b298dc1 Integrated code review; Moved tests; Added model to test_hybrid.py (bohnstingl)
0937f79 Added missing files (bohnstingl)
8ff0691 Added to supported_models.md and fixed URLs (s3woz)
495fe33 Update docs/source/models/supported_models.md (s3woz)
e114f1b Commenting out failing HF registry test as code is not yet in HF used… (s3woz)
ca1d999 Temporarily marking model as is_available_online=False until it appea… (s3woz)
dc04497 Temporarily commenting out registration test until the model appears … (s3woz)
a636074 Fixing pre-commit error (s3woz)
8c68720 Marking model with next minor min_transformers_version (s3woz)
6c8e664 Update tests/models/language/generation/test_hybrid.py (s3woz)
95ede00 Clarifying comments (s3woz)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1,40 @@` |
| | | `# SPDX-License-Identifier: Apache-2.0` |
| | | |
| | | `import pytest` |
| | | |
| | | `from ...utils import check_logprobs_close` |
| | | |
| | | `# Path of the checkpoints` |
| | | `MODELS = [` |
| | | `    "ibm-granite/granite-4.0-tiny-preview",` |
| | | `]` |
| | | |
| | | |
| | | `@pytest.mark.skip(reason="HF model is not in the HF main yet")` |
| | | `@pytest.mark.parametrize("model", MODELS)` |
| | | `@pytest.mark.parametrize("dtype", ["float16", "bfloat16"])` |
| | | `@pytest.mark.parametrize("max_tokens", [64])` |
| | | `@pytest.mark.parametrize("num_logprobs", [5])` |
| | | `def test_model_equivalence_to_hf_greedy(` |
| | | `    hf_runner,` |
| | | `    vllm_runner,` |
| | | `    example_prompts,` |
| | | `    model: str,` |
| | | `    dtype: str,` |
| | | `    max_tokens: int,` |
| | | `    num_logprobs: int,` |
| | | `):` |
| | | `    with vllm_runner(model, dtype=dtype) as vllm_model:` |
| | | `        vllm_outputs = vllm_model.generate_greedy_logprobs(` |
| | | `            example_prompts, max_tokens, num_logprobs)` |
| | | |
| | | `    with hf_runner(model, dtype=dtype) as hf_model:` |
| | | `        hf_outputs = hf_model.generate_greedy_logprobs_limit(` |
| | | `            example_prompts, max_tokens, num_logprobs)` |
| | | |
| | | `    check_logprobs_close(` |
| | | `        outputs_0_lst=hf_outputs,` |
| | | `        outputs_1_lst=vllm_outputs,` |
| | | `        name_0="hf",` |
| | | `        name_1="vllm",` |
| | | `    )` |
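
The test above generates greedily with both runners and passes the two result sets to `check_logprobs_close`. As a rough illustration of what such a comparison can look like, here is a minimal sketch. It is not the helper from the vLLM test utilities: the `Output` type and the `logprobs_close` function are hypothetical names invented for this example, and the rule it applies (at the first position where the two token sequences diverge, each side's token must appear among the other side's top-k candidates) is an assumption about the general approach rather than the exact logic in the repository.

```python
from typing import Dict, List, Tuple

# Hypothetical simplified output type (an assumption for this sketch):
# (generated token ids, per-position mapping of top-k token id -> logprob).
Output = Tuple[List[int], List[Dict[int, float]]]


def logprobs_close(out_0: Output, out_1: Output) -> bool:
    """Return True if two greedy generations agree up to near-ties."""
    ids_0, top_0 = out_0
    ids_1, top_1 = out_1
    for pos, (tok_0, tok_1) in enumerate(zip(ids_0, ids_1)):
        if tok_0 == tok_1:
            continue
        # First divergence: accept it only if each implementation's chosen
        # token is among the other's top-k candidates at this position.
        # Later positions are not compared, because the contexts now differ.
        return tok_0 in top_1[pos] and tok_1 in top_0[pos]
    return True


# Two runs that diverge at position 1 on a near-tie between tokens 2 and 5.
run_a: Output = ([1, 2, 3], [{1: -0.1, 9: -2.0}, {2: -0.2, 5: -0.3}, {3: -0.1}])
run_b: Output = ([1, 5, 7], [{1: -0.1, 9: -2.1}, {5: -0.25, 2: -0.28}, {7: -0.1}])
print(logprobs_close(run_a, run_b))
```

Comparing top-k membership instead of exact token equality tolerates small numerical differences between backends (for example under `float16` versus `bfloat16`), which can flip the argmax when two candidates have nearly equal probability.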