
Conversation

@yiakwy-xpu-ml-framework-team

Add Hopper llama2 7b mcore gold example

@puneeshkhanna

`--use-legacy-models` - why is this option passed?

@yiakwy-xpu-ml-framework-team
Author

`--use-legacy-models` - why is this option passed?

The latest updates use m-core models by default. For the llama2 benchmark test, there is no need to switch to the m-core model and the new dataset API:

check here

@carolove

When I use the convert shell script from your commit,
it shows the error "Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/workspace/models/Llama-2-7b-hf/tokenizer.model'. Use `repo_type` argument if needed."
Do you know how to use a local tokenizer.model file?
Thank you.

@carolove

And I found that using "TOKENIZER_MODEL=meta-llama/Llama-2-7b-hf" in the shell script converts HF to Megatron successfully.

@yiakwy-xpu-ml-framework-team
Author

yiakwy-xpu-ml-framework-team commented Sep 30, 2024

And I found that using "TOKENIZER_MODEL=meta-llama/Llama-2-7b-hf" in the shell script converts HF to Megatron successfully.

Hi @carolove, /workspace/models is the standard location where I keep models in the docker image. You can create soft links at that location that point to the real model if it is hosted on your distributed file system.

HF_MODEL_DIR=/workspace/models/$MODEL
OUTPUT=/workspace/models/$MODEL-to-megatron-tp$TP-pp$PP
TOKENIZER_MODEL=/workspace/models/$MODEL/tokenizer.model
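A minimal sketch of that soft-link setup. The `mktemp` directories here are stand-ins so the snippet runs anywhere; in the real container `DFS_MODEL_DIR` would be your distributed-file-system path and `WORKSPACE_DIR` would be /workspace/models.

```shell
# Stand-in paths for illustration; swap in your real DFS location and
# /workspace/models inside the container.
DFS_MODEL_DIR="$(mktemp -d)"            # stand-in for the DFS model directory
WORKSPACE_DIR="$(mktemp -d)/models"     # stand-in for /workspace/models
touch "$DFS_MODEL_DIR/tokenizer.model"  # pretend the checkpoint lives on DFS

mkdir -p "$WORKSPACE_DIR"
ln -sfn "$DFS_MODEL_DIR" "$WORKSPACE_DIR/Llama-2-7b-hf"

# The convert script's paths now resolve through the link.
TOKENIZER_MODEL="$WORKSPACE_DIR/Llama-2-7b-hf/tokenizer.model"
ls -l "$TOKENIZER_MODEL"
```

Passing this local path (rather than a HuggingFace repo id) as `TOKENIZER_MODEL` avoids the "Repo id must be in the form 'repo_name'" error, provided the tokenizer loader accepts filesystem paths.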

For a throughput test, you don't need to download the dataset or model parameters; otherwise, you should first run the convert script to create a 3D-parallel (classical llama2) checkpoint, then load the weights and optimizer states depending on your task type.

Which Llama2 tokenizer to use depends on the tokenizer class you choose. For recent Megatron (> 2403), I recommend the meta-llama2 tokenizer. For older Megatron (< 2310), I recommend HuggingFaceLlama2Tokenizer.

Here is the difference:

# megatron > 2403 uses the sentencepiece proto API to load Meta's 32K BPE vocabulary
class _Llama2Tokenizer(_SentencePieceTokenizer):
    """SentencePieceTokenizer-Megatron wrapper"""

    def __init__(self, model_file,):
        super().__init__(model_file, vocab_extra_ids=0)

    def _initalize(self, vocab_extra_ids):
        self._populate_vocab()

        # BOS / EOS token IDs
        self.n_words: int = self.tokenizer.vocab_size()
        self.bos_id: int = self.tokenizer.bos_id()
        self.eos_id: int = self.tokenizer.eos_id()
        self.pad_id: int = self.tokenizer.pad_id()
        assert self.tokenizer.vocab_size() == self.tokenizer.get_piece_size()
        ...

which uses sentencepiece to load the model:

        import sentencepiece
        self.tokenizer = sentencepiece.SentencePieceProcessor(model_file=model_file)

That means Megatron after 2403 is built for meta-llama.
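The version-based choice above can be sketched as a tiny helper. The release cut-offs come from the comment (> 2403 and < 2310); the function itself and the behavior for releases between 2310 and 2403 are my own illustration, and the class names are returned as strings rather than imported since they live in different Megatron trees.

```python
def pick_llama2_tokenizer(release: int) -> str:
    """Pick a tokenizer class name for a Megatron release tag (YYMM, e.g. 2403).

    >= 2403 -> the sentencepiece-based _Llama2Tokenizer
    <= 2310 -> the HuggingFace wrapper
    """
    if release >= 2403:
        return "_Llama2Tokenizer"
    if release <= 2310:
        return "HuggingFaceLlama2Tokenizer"
    # Assumption: intermediate releases already ship the sentencepiece path.
    return "_Llama2Tokenizer"
```

For example, `pick_llama2_tokenizer(2405)` selects the sentencepiece path, while `pick_llama2_tokenizer(2309)` falls back to the HuggingFace wrapper.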

@github-actions
Contributor

Marking as stale. No activity in 60 days.

@github-actions github-actions bot added the stale No activity in 60 days on issue or PR label Nov 29, 2024
@sbhavani sbhavani added enhancement New feature or request module: training and removed stale No activity in 60 days on issue or PR labels Jul 25, 2025
@sbhavani
Contributor

@yiakwy-xpu-ml-framework-team thank you for the contribution! We have now added a llama3 example here: https://github.com/NVIDIA/Megatron-LM/tree/main/examples/llama

@sbhavani sbhavani closed this Aug 1, 2025
@yiakwy-xpu-ml-framework-team
Author

@sbhavani Hi, I have added an example of how to train GPT-OSS in #2383; it might be useful for your team. Thank you for your reply.
