
Conversation

@yiakwy-xpu-ml-framework-team

Add Hopper llama2 7b mcore gold example

@puneeshkhanna

`--use-legacy-models` - why is this option passed?

@yiakwy-xpu-ml-framework-team
Author

`--use-legacy-models` - why is this option passed?

The latest updates use m-core models by default. For the llama2 benchmark test, there is no need to switch to the m-core model and the new dataset API:

check here

@carolove

When I use the convert shell script from your commit,
it shows the error "Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/workspace/models/Llama-2-7b-hf/tokenizer.model'. Use `repo_type` argument if needed."
Do you know how to use a local tokenizer.model file?
Thank you.

@carolove

And I found that using "TOKENIZER_MODEL=meta-llama/Llama-2-7b-hf" in the shell script converts HF to Megatron successfully.

@yiakwy-xpu-ml-framework-team
Author

yiakwy-xpu-ml-framework-team commented Sep 30, 2024

And I found that using "TOKENIZER_MODEL=meta-llama/Llama-2-7b-hf" in the shell script converts HF to Megatron successfully.

Hi @carolove, /workspace/models is the standard location where I keep models in the docker image. You can create soft links at that location that point to the real model if it is hosted on your distributed file system.

HF_MODEL_DIR=/workspace/models/$MODEL
OUTPUT=/workspace/models/$MODEL-to-megatron-tp$TP-pp$PP
TOKENIZER_MODEL=/workspace/models/$MODEL/tokenizer.model
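A minimal sketch of that soft-link setup. The `mktemp` directories here are stand-ins so the snippet runs anywhere; in the real container `DFS_MODEL_DIR` would be your distributed-file-system path and `WORKSPACE_DIR` would be /workspace/models.

```shell
# Stand-in paths for illustration; swap in your real DFS location and
# /workspace/models inside the container.
DFS_MODEL_DIR="$(mktemp -d)"            # stand-in for the DFS model directory
WORKSPACE_DIR="$(mktemp -d)/models"     # stand-in for /workspace/models
touch "$DFS_MODEL_DIR/tokenizer.model"  # pretend the checkpoint lives on DFS

mkdir -p "$WORKSPACE_DIR"
ln -sfn "$DFS_MODEL_DIR" "$WORKSPACE_DIR/Llama-2-7b-hf"

# The convert script's paths now resolve through the link.
TOKENIZER_MODEL="$WORKSPACE_DIR/Llama-2-7b-hf/tokenizer.model"
ls -l "$TOKENIZER_MODEL"
```

Passing this local path (rather than a HuggingFace repo id) as `TOKENIZER_MODEL` avoids the "Repo id must be in the form 'repo_name'" error, provided the tokenizer loader accepts filesystem paths.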

For a throughput test, you don't need to download the dataset or model parameters; otherwise, you should first run the convert script to create a 3D-parallel (classical llama2) checkpoint, then load the weights and optimizer states depending on your task type.

Which Llama2 tokenizer to use depends on the tokenizer class you choose. For recent Megatron (> 2403), I recommend the meta-llama2 tokenizer. For older Megatron (< 2310), I recommend HuggingFaceLlama2Tokenizer.

Here is the difference:

# megatron > 2403 uses the sentencepiece proto API to load Meta's 32K BPE vocabulary
class _Llama2Tokenizer(_SentencePieceTokenizer):
    """SentencePieceTokenizer-Megatron wrapper"""

    def __init__(self, model_file,):
        super().__init__(model_file, vocab_extra_ids=0)

    def _initalize(self, vocab_extra_ids):
        self._populate_vocab()

        # BOS / EOS token IDs
        self.n_words: int = self.tokenizer.vocab_size()
        self.bos_id: int = self.tokenizer.bos_id()
        self.eos_id: int = self.tokenizer.eos_id()
        self.pad_id: int = self.tokenizer.pad_id()
        assert self.tokenizer.vocab_size() == self.tokenizer.get_piece_size()
        ...

which uses sentencepiece to load the model:

        import sentencepiece
        self.tokenizer = sentencepiece.SentencePieceProcessor(model_file=model_file)

That means Megatron after 2403 is built for meta-llama.
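The version-based choice above can be sketched as a tiny helper. The release cut-offs come from the comment (> 2403 and < 2310); the function itself and the behavior for releases between 2310 and 2403 are my own illustration, and the class names are returned as strings rather than imported since they live in different Megatron trees.

```python
def pick_llama2_tokenizer(release: int) -> str:
    """Pick a tokenizer class name for a Megatron release tag (YYMM, e.g. 2403).

    >= 2403 -> the sentencepiece-based _Llama2Tokenizer
    <= 2310 -> the HuggingFace wrapper
    """
    if release >= 2403:
        return "_Llama2Tokenizer"
    if release <= 2310:
        return "HuggingFaceLlama2Tokenizer"
    # Assumption: intermediate releases already ship the sentencepiece path.
    return "_Llama2Tokenizer"
```

For example, `pick_llama2_tokenizer(2405)` selects the sentencepiece path, while `pick_llama2_tokenizer(2309)` falls back to the HuggingFace wrapper.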

@github-actions
Contributor

Marking as stale. No activity in 60 days.

@github-actions github-actions bot added the stale No activity in 60 days on issue or PR label Nov 29, 2024
@sbhavani sbhavani added enhancement New feature or request module: training and removed stale No activity in 60 days on issue or PR labels Jul 25, 2025
@sbhavani
Contributor

@yiakwy-xpu-ml-framework-team thank you for the contribution! We have now added a llama3 example here: https://github.com/NVIDIA/Megatron-LM/tree/main/examples/llama

@sbhavani sbhavani closed this Aug 1, 2025
@yiakwy-xpu-ml-framework-team
Author

@sbhavani Hi, I have added an example of how to train GPT-OSS in #2383; it might be useful for your team. Thank you for your reply.
