Fix KeyError in GPT-OSS weight conversion script#42007
zucchini-nlp merged 3 commits into huggingface:main
Conversation
zucchini-nlp
left a comment
I am quite lost here since I didn't work with GPT-OSS. To make sure I understand: the original_config is an HF config object, and we need to make sure it is YaRN if no params are saved in the config?
# "beta_fast": float(original_config.pop("rope_ntk_beta")),
# "beta_slow": float(original_config.pop("rope_ntk_alpha")),
# "factor": float(original_config.pop("rope_parameters_factor")),
Can we hardcode the values safely? I think it would be better to obtain them from the config.
Thanks for reviewing @zucchini-nlp! Let me clarify:
You're absolutely correct. We should obtain the values from the config, with defaults as a fallback. Updating now.
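A minimal sketch of the pop-with-defaults approach discussed here (the specific default values below are illustrative assumptions, not necessarily the ones in the final script):

```python
# Read RoPE parameters from the original config, falling back to defaults
# when a key is missing. The defaults (32.0, 1.0, 32.0) are assumptions
# for illustration only.
original_config = {"rope_ntk_beta": 16.0}  # example config missing most keys

rope_parameters = {
    "rope_type": "yarn",
    "beta_fast": float(original_config.pop("rope_ntk_beta", 32.0)),
    "beta_slow": float(original_config.pop("rope_ntk_alpha", 1.0)),
    "factor": float(original_config.pop("rope_scaling_factor", 32.0)),
}
print(rope_parameters)
```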
[For maintainers] Suggested jobs to run (before merge): run-slow: gpt_oss
Updated! Changes made as suggested.
Ready for another look!
Thanks @Aznix07 for making the effort to solve the issue raised. I noticed that the issue is mostly because the model's config.json contains "rope_scaling_factor" as the key, instead of the "rope_parameters_factor" key the script is looking for.
zucchini-nlp
left a comment
It's still not clear why the original_config has rope parameters if it is not in HF format yet. I believe there are already some configs with "rope_parameters" to be converted.
Lgtm then!
Thank you @zucchini-nlp for the review and approval! @shubhagr-qc Thanks for identifying the root cause!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
* Fix KeyError in GPT-OSS weight conversion script
* Fix code quality: remove trailing whitespace
* Address review: use pop with defaults, remove commented code
What does this PR do?
Fixes a KeyError: 'rope_parameters_factor' in the GPT-OSS weight conversion script that was preventing users from converting GPT-OSS model weights to Hugging Face format. Fixes #42003
Problem
After PR #39847 standardized RoPE parameter handling across models, the GPT-OSS conversion script was still trying to access individual parameter keys that no longer exist in the original config:
Reproduction
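The original reproduction steps are not preserved in this thread; a minimal, self-contained snippet that triggers the same KeyError might look like this (the config contents below are assumptions modeled on the discussion above):

```python
# A config shaped like the released GPT-OSS checkpoints is assumed here: it
# carries "rope_scaling_factor", not the "rope_parameters_factor" key that the
# conversion script tried to pop without a default.
original_config = {"rope_scaling_factor": 32.0}

try:
    factor = float(original_config.pop("rope_parameters_factor"))  # raises
except KeyError as err:
    error_key = err.args[0]
    print(f"KeyError: {error_key!r}")
```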
Solution
Added proper handling for both config formats:
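A hedged sketch of what handling both formats could look like (the helper name and default values are illustrative assumptions, not the actual script code):

```python
# Hypothetical helper: accept either the newer nested "rope_parameters" dict
# (the standardized format from PR #39847) or the older flat keys found in
# released config.json files. Defaults are assumptions for illustration.
def get_rope_parameters(original_config):
    if "rope_parameters" in original_config:
        # Config already uses the standardized HF format.
        return original_config.pop("rope_parameters")
    # Older flat format: build the dict from individual keys with fallbacks.
    return {
        "rope_type": "yarn",
        "beta_fast": float(original_config.pop("rope_ntk_beta", 32.0)),
        "beta_slow": float(original_config.pop("rope_ntk_alpha", 1.0)),
        "factor": float(original_config.pop("rope_scaling_factor", 32.0)),
    }
```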
Testing
✅ Verified conversion script no longer throws KeyError
✅ Tested fallback logic with both config formats
✅ Default values match configuration_gpt_oss.py
Before submitting
Who can review?
@zucchini-nlp