
WIP for adding support for Tekken tokenizer needed for Mistral NeMo#8578

Closed
HanClinto wants to merge 1 commit into ggml-org:master from HanClinto:feature-mistral-nemo

Conversation

@HanClinto
Contributor

Attempting to add support for Mistral NeMo (#8577). I've never added support for a new model before, so this is very much a WIP. I need to take a break for a while, so I'm uploading my notes here in case they're useful for anyone else.

They claim it can be a drop-in replacement for Mistral 7B, so it hopefully shouldn't be too much work to get it working with ggml, since Mistral 7B already works.

While the model architecture may be a drop-in replacement for Mistral 7B, the tokenizer is not (yet) in our list of supported BPE tokenizers. Attempting to quantize Mistral-NeMo via GGUF-my-repo results in:

Error: Error converting to fp16: b'INFO:hf-to-gguf:Loading model: Mistral-Nemo-Instruct-2407
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 1024000
INFO:hf-to-gguf:gguf: embedding length = 5120
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 1000000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**********************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: https://github.com/ggerganov/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: aa78fe8b04bc622b077520b1fb3d3a5c6f7a53dd375e2361e62599be3cf58de1
WARNING:hf-to-gguf:**********************************************************************************
WARNING:hf-to-gguf:
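For anyone unfamiliar with the warning above: the chkhsh value is a fingerprint the convert script computes by hashing the token IDs produced when the model's tokenizer encodes a fixed test string, and it is matched against a list of known hashes to pick the right pre-tokenizer. Here is a minimal illustrative sketch of that idea; the function name and the dictionary are hypothetical stand-ins, and the real test string and hash table live in convert_hf_to_gguf.py:

```python
import hashlib

def pre_tokenizer_fingerprint(token_ids: list[int]) -> str:
    """Sketch of the chkhsh idea: hash the token IDs produced by encoding a
    fixed test string, so any change in pre-tokenization rules changes the
    fingerprint and trips the "not recognized" warning."""
    # In the real script the IDs come from tokenizer.encode(chktxt);
    # here we just hash their string representation.
    return hashlib.sha256(str(token_ids).encode()).hexdigest()

# A known fingerprint is then mapped to a pre-tokenizer name
# (hypothetical entry; the real hashes are listed in convert_hf_to_gguf.py):
KNOWN_FINGERPRINTS = {
    pre_tokenizer_fingerprint([1, 2, 3]): "example-model",
}

ids = [1, 2, 3]
name = KNOWN_FINGERPRINTS.get(pre_tokenizer_fingerprint(ids), "unrecognized")
```

Adding Tekken support therefore means registering its fingerprint (and a repo entry in convert_hf_to_gguf_update.py) so the lookup succeeds instead of falling through to the warning.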

I have not yet expanded the tests to include the new tokenizer.

I haven't figured out any other settings or options that may need to be set for this tokenizer.

I haven't looked into the regex used by llm_tokenizer_bpe to see if it needs to be changed from the default or not.
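For context on that last point: the pre-tokenizer regex splits text into chunks before BPE merges are applied within each chunk, so if Tekken's pattern differs from the default, tokenization will silently diverge. Below is a simplified, ASCII-only stand-in for a GPT-2-style splitting regex; the real patterns in llama.cpp use Unicode categories (\p{L}, \p{N}, ...) and vary per model, which is exactly what would need checking here:

```python
import re

# Simplified ASCII-only approximation of a GPT-2-style pre-tokenizer
# pattern: contractions, words, numbers, punctuation runs, whitespace.
PATTERN = re.compile(
    r"'s|'t|'re|'ve|'m|'ll|'d| ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+(?!\S)|\s+"
)

def pre_split(text: str) -> list[str]:
    """Split text into the chunks that BPE merges then operate within."""
    return PATTERN.findall(text)

print(pre_split("Hello, world!"))  # ['Hello', ',', ' world', '!']
```

A model whose pre-tokenizer keeps punctuation attached to words, or splits digits differently, would produce different chunks from the same text, and therefore different token IDs even with an identical vocabulary.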

In short, it's largely untested, and I would have liked to get further before uploading a WIP.

@github-actions github-actions bot added the python python script changes label Jul 18, 2024
@HanClinto
Contributor Author

Superseded by #8579

@HanClinto HanClinto closed this Jul 18, 2024