add rwkv5 model and fix rwkv4 init prompt bug #1275
Closed
Commits (30, all authored by BBuf; the diff shows changes from 29 of them):
9a151b5  fix llm_chat.cc rwkv bug
28cf751  fix llm_chat.cc rwkv bug
a48762c  refine
dfd6eeb  fix format erroe
44c73a5  fix ci error
47a6101  Merge branch 'mlc-ai:main' into main
107d95d  Merge branch 'mlc-ai:main' into main
5f8a1b3  Merge branch 'mlc-ai:main' into main
9519e75  init
62a8518  fix rwkv5 model error
0c5a140  refine
003056b  refine
8e7b02d  wkv5 tir in fp32
980b22c  refine
58ddb80  fix metal wkv tir bug by fsy
33d51dc  fix groupnorm bug
6f739c6  run success in metal
aac52e1  fix mlc-llm rwkv5 bug
56cc017  Merge branch 'mlc-ai:main' into main
6b655bc  Merge pull request #2 from GiantPandaCV/support_rwkv5_2
542cde9  update submodule
b96f421  refine
1629f3c  delete build.py
1dc0833  revert
6b0855c  Merge branch 'mlc-ai:main' into main
dae253b  fix model bug
1aeb3e2  fix prompt
ddb3908  refine config
71eefb6  fix int4 bug
b32109e  refine
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
Submodule tokenizers-cpp updated (3 files):
  include/rwkv_world_tokenizer.h  +28 −2
  rust/src/lib.rs                 +3 −6
  src/rwkv_world_tokenizer.cc     +57 −17
Review comment: use static_cast<int64_t>(tokens.size())
Reply: OK. I think this is a quite serious bug that has troubled me for more than two weeks. The quantized int4 version of rwkv5 seemed to give very unintelligent responses, and it was only today that I thought to print out the prompt. I then discovered that all the code after line 618 was never taking effect, and finally pinpointed this issue. Now the quantized int4 version of rwkv5 can also generate text normally, and the results have improved for rwkv in the other quantization modes as well.
Review comment: Can you elaborate a bit on why the static cast to int is needed here? Do we involve some negative numbers in computing this?
Reply: This error might occur whenever the input prompt is relatively long. Before using static_cast<int64_t> to convert tokens.size(), the expression (this->total_seq_len_ + tokens.size() + gen_mean_gen_len) might have experienced integer overflow at some stage, causing the result to be incorrectly interpreted as a negative number, which in turn erroneously returned true for the comparison operation.
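The pitfall described above can be reproduced in isolation. Below is a minimal standalone sketch, not the actual llm_chat.cc code: the variable names and the exact shape of the window check are assumptions, chosen only to show how mixing size_t with a signed max_window_size of -1 can flip the comparison, and how the suggested static_cast keeps the arithmetic signed.

```cpp
// Minimal sketch of the signed/unsigned pitfall discussed above.
// NOT the real llm_chat.cc code: variable names and the "< max_window_size"
// shape of the check are assumptions for illustration only.
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
  std::vector<int32_t> tokens(32);  // stand-in for the encoded prompt
  int64_t total_seq_len = 128;
  int64_t mean_gen_len = 128;
  int64_t max_window_size = -1;     // -1 is used to mean "no window limit"

  // tokens.size() is size_t, so the whole sum is promoted to unsigned and
  // max_window_size (-1) is converted to a huge unsigned value; the check
  // then evaluates to true regardless of the actual lengths (compilers warn
  // about this with -Wsign-compare).
  bool mixed =
      (total_seq_len + tokens.size() + mean_gen_len) < max_window_size;

  // With the suggested cast, the arithmetic and the comparison stay signed.
  bool casted =
      (total_seq_len + static_cast<int64_t>(tokens.size()) + mean_gen_len) <
      max_window_size;

  std::cout << "mixed-sign comparison: " << mixed << "\n";   // prints 1
  std::cout << "signed comparison:     " << casted << "\n";  // prints 0
  return 0;
}
```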
Review comment: OK, I feel this is a strange way to think about it. Given max_window_size_ == -1, we should check for that value explicitly; it means there is no out-of-bound condition, so we do not need to re-encode (i.e., run the code after the check). It would be good for @Hzfengsy to take a look as well.
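For illustration, here is a sketch of the alternative this comment suggests: treat max_window_size_ == -1 as an explicit "unlimited window" case instead of letting the arithmetic decide. The function and parameter names are hypothetical stand-ins for the members discussed in this thread, not the real llm_chat.cc API.

```cpp
// Hedged sketch, not the real llm_chat.cc logic: an explicit special case for
// the "no window limit" value (-1), so the re-encode decision never depends on
// mixed signed/unsigned arithmetic. Names are hypothetical.
#include <cstddef>
#include <cstdint>

bool NeedReencode(int64_t total_seq_len, std::size_t num_new_tokens,
                  int64_t mean_gen_len, int64_t max_window_size) {
  if (max_window_size == -1) {
    // Unlimited window: nothing can go out of bounds, so never re-encode.
    return false;
  }
  return total_seq_len + static_cast<int64_t>(num_new_tokens) + mean_gen_len >=
         max_window_size;
}
```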
Reply: Okay, there seems to be a bug in the handling of the rwkv system prompts. I expect each interaction with the rwkv model to include the system prompt along with the current text, because this series of models (rwkv4, 5, 6) has higher requirements on prompts. Currently, only the first round of dialogue includes the system prompt, and it is forgotten in subsequent rounds.
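As a rough illustration of the behavior described in this comment, the sketch below rebuilds the prompt from the system prompt on every round so later turns do not lose it. The function name, the "User:"/"Assistant:" formatting, and the separators are hypothetical and not taken from mlc-llm's conversation templates.

```cpp
// Illustrative sketch only, not mlc-llm's conversation code: re-attach the
// system prompt on every round instead of only the first one. The prompt
// format ("User:"/"Assistant:", blank-line separators) is a hypothetical
// stand-in for whatever template the rwkv models actually expect.
#include <string>
#include <vector>

std::string BuildRwkvPrompt(const std::string& system_prompt,
                            const std::vector<std::string>& history,
                            const std::string& user_input) {
  // Always start from the system prompt so later rounds do not "forget" it.
  std::string prompt = system_prompt + "\n\n";
  for (const std::string& turn : history) {
    prompt += turn + "\n\n";
  }
  prompt += "User: " + user_input + "\n\nAssistant:";
  return prompt;
}
```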