
WIP for adding support for Tekken tokenizer needed for Mistral NeMo#8578

Closed
HanClinto wants to merge 1 commit into ggml-org:master from HanClinto:feature-mistral-nemo

Conversation

@HanClinto
Contributor

Attempting to add support for Mistral NeMo (#8577). I've never added support for a new model before, so this is very much a WIP. I need to take a break for a while, so I'm uploading my notes here in case they're useful for anyone else.

They claim it can be a drop-in replacement for Mistral 7B, so it hopefully shouldn't be too much work to get it working with ggml, since Mistral 7B already works.

While the model architecture may be a drop-in replacement for Mistral 7B, the tokenizer is not (yet) in our list of supported BPE tokenizers. Attempting to quantize Mistral-NeMo via GGUF-my-repo results in:

Error: Error converting to fp16: b'INFO:hf-to-gguf:Loading model: Mistral-Nemo-Instruct-2407
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 1024000
INFO:hf-to-gguf:gguf: embedding length = 5120
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 1000000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**********************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: https://github.com/ggerganov/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: aa78fe8b04bc622b077520b1fb3d3a5c6f7a53dd375e2361e62599be3cf58de1
WARNING:hf-to-gguf:**********************************************************************************
WARNING:hf-to-gguf:
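For anyone unfamiliar with the warning above: the chkhsh value is a fingerprint the convert script computes by hashing the token IDs produced when the model's tokenizer encodes a fixed test string, and it is matched against a list of known hashes to pick the right pre-tokenizer. Here is a minimal illustrative sketch of that idea; the function name and the dictionary are hypothetical stand-ins, and the real test string and hash table live in convert_hf_to_gguf.py:

```python
import hashlib

def pre_tokenizer_fingerprint(token_ids: list[int]) -> str:
    """Sketch of the chkhsh idea: hash the token IDs produced by encoding a
    fixed test string, so any change in pre-tokenization rules changes the
    fingerprint and trips the "not recognized" warning."""
    # In the real script the IDs come from tokenizer.encode(chktxt);
    # here we just hash their string representation.
    return hashlib.sha256(str(token_ids).encode()).hexdigest()

# A known fingerprint is then mapped to a pre-tokenizer name
# (hypothetical entry; the real hashes are listed in convert_hf_to_gguf.py):
KNOWN_FINGERPRINTS = {
    pre_tokenizer_fingerprint([1, 2, 3]): "example-model",
}

ids = [1, 2, 3]
name = KNOWN_FINGERPRINTS.get(pre_tokenizer_fingerprint(ids), "unrecognized")
```

Adding Tekken support therefore means registering its fingerprint (and a repo entry in convert_hf_to_gguf_update.py) so the lookup succeeds instead of falling through to the warning.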

I have not yet expanded the tests to include the new tokenizer.

I haven't figured out any other settings or options that may need to be set for this tokenizer.

I haven't looked into the regex used by llm_tokenizer_bpe to see if it needs to be changed from the default or not.
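For context on that last point: the pre-tokenizer regex splits text into chunks before BPE merges are applied within each chunk, so if Tekken's pattern differs from the default, tokenization will silently diverge. Below is a simplified, ASCII-only stand-in for a GPT-2-style splitting regex; the real patterns in llama.cpp use Unicode categories (\p{L}, \p{N}, ...) and vary per model, which is exactly what would need checking here:

```python
import re

# Simplified ASCII-only approximation of a GPT-2-style pre-tokenizer
# pattern: contractions, words, numbers, punctuation runs, whitespace.
PATTERN = re.compile(
    r"'s|'t|'re|'ve|'m|'ll|'d| ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+(?!\S)|\s+"
)

def pre_split(text: str) -> list[str]:
    """Split text into the chunks that BPE merges then operate within."""
    return PATTERN.findall(text)

print(pre_split("Hello, world!"))  # ['Hello', ',', ' world', '!']
```

A model whose pre-tokenizer keeps punctuation attached to words, or splits digits differently, would produce different chunks from the same text, and therefore different token IDs even with an identical vocabulary.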

In short, it's largely untested, and I would have liked to get further before uploading a WIP.

@github-actions github-actions bot added the python python script changes label Jul 18, 2024
@HanClinto
Contributor Author

Superseded by #8579

@HanClinto HanClinto closed this Jul 18, 2024