add Hopper llama golden with mcore calling stack #987
Conversation
--use-legacy-models: why is this option passed?
The latest updates use mcore models by default. For the llama2 benchmark test, there is no need to switch to the mcore model or the new dataset API.
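A minimal sketch of the flag semantics being discussed (this is not Megatron-LM's actual argument module, just an illustration of why the flag is unnecessary when you want the default mcore path):

```python
import argparse

# Simplified stand-in for Megatron-LM's argument parsing: recent versions
# build mcore model classes by default, and --use-legacy-models only opts
# back into the legacy model path.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--use-legacy-models",
    action="store_true",
    help="Opt out of the default mcore model classes.",
)

default_args = parser.parse_args([])                      # no flag passed
legacy_args = parser.parse_args(["--use-legacy-models"])  # explicit opt-out

print("default uses legacy models:", default_args.use_legacy_models)
print("flagged uses legacy models:", legacy_args.use_legacy_models)
```

With no flag passed, `use_legacy_models` is `False`, so the mcore path is taken; passing the flag flips it to `True`.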
When I use the convert shell script in your commit,

I found that using "TOKENIZER_MODEL=meta-llama/Llama-2-7b-hf" in the shell script converts HF to Megatron successfully.
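A sketch of the setup this comment describes. Only the `TOKENIZER_MODEL` variable name and its value come from the thread; the conversion script name below is a hypothetical placeholder, not the actual script in the commit:

```python
import os
import shlex

# Export the Hugging Face repo id the thread reports working, then build
# the command line for the (hypothetical) conversion script.
env = dict(os.environ)
env["TOKENIZER_MODEL"] = "meta-llama/Llama-2-7b-hf"  # value from the thread

cmd = ["bash", "convert_hf_to_megatron.sh"]  # hypothetical script name
invocation = f"TOKENIZER_MODEL={env['TOKENIZER_MODEL']} {shlex.join(cmd)}"
print(invocation)
```

The point is only that the tokenizer is identified by its Hugging Face repo id rather than a local `tokenizer.model` path.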
Hi @carlove, /workspace/models is the standard location where I keep models in the Docker image; you can create soft links from that location to the real models if they are hosted on your distributed file system. For a throughput test you don't need to download the dataset or model parameters; otherwise, you should run the convert script first to create a 3D-parallel (classical llama2) checkpoint, then load the weights and optimizer states depending on your task type. The usage of the Llama2 tokenizer relies on the tokenizer class you choose: for recent Megatron (> 2403) I recommend the meta-llama Llama2Tokenizer, and for old Megatron (< 2310) I recommend HuggingFaceLlama2Tokenizer. Here is the difference: the former uses sentencepiece to load the model, which means Megatron after 2403 is built for meta-llama.
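The version rule above can be sketched as a small helper. The date-style version numbers and the two class names are taken from the comment; the helper function itself is illustrative, not part of Megatron-LM:

```python
def pick_llama2_tokenizer(megatron_version: int) -> str:
    """Return the recommended tokenizer class for a YYMM Megatron version.

    Illustrative helper based on the thread's advice; not a Megatron API.
    """
    if megatron_version >= 2403:
        # Newer Megatron uses a sentencepiece-based Llama2Tokenizer that
        # loads the original meta-llama tokenizer.model file directly.
        return "Llama2Tokenizer"
    # Older releases (< 2310) wrap the Hugging Face tokenizer instead.
    return "HuggingFaceLlama2Tokenizer"


print(pick_llama2_tokenizer(2405))  # newer release
print(pick_llama2_tokenizer(2305))  # older release
```

The practical consequence: with newer Megatron you point the tokenizer at the original meta-llama `tokenizer.model`, while older Megatron expects the Hugging Face tokenizer files.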
Marking as stale. No activity in 60 days. |
@yiakwy-xpu-ml-framework-team thank you for the contribution! We have now added a llama3 example here: https://github.com/NVIDIA/Megatron-LM/tree/main/examples/llama |
Add Hopper llama2 7b mcore golden example