Fix: make RWKV7 infctx forward compatible with inputs_embeds passed by PEFT (avoids TypeError) #81
Open
Deng-Xian-Sheng wants to merge 1 commit into Joluck:main from
Author
Training launch script:
Author
I am using
Owner
Thanks. infctx has not been used for a long time and has almost no users, so it was never fixed. I will take a look later.
Author
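Is infctx meant to provide unlimited context? The texts in my dataset are very long, which is why I considered infctx. If I do not use infctx, will the dataset be truncated during training? I may have misunderstood what infctx does.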
Owner
Yes, without infctx the data is truncated, but infctx makes training very slow. How long is the data you want to train on?
Author
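Novels of tens of thousands of characters. I synthesized a dataset for generating short and long novels, and I plan to rely on RWKV's property that a long context neither increases memory use nor reduces speed.
Any progress?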
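Background / Problem
When RWKV_TRAIN_TYPE=infctx is set and training uses PEFT (PeftModelForCausalLM), training fails at startup with:
TypeError: RWKV7.forward_infctx() got an unexpected keyword argument 'inputs_embeds'
Root cause
The forward() of the PEFT/HF model wrapper passes inputs_embeds (even when it is None), along with other common HF keyword arguments, through to the base model.
On the infctx path, the signature of RWKV7's forward_infctx() does not include inputs_embeds and has no **kwargs fallback, which triggers the unexpected keyword argument error.
Fix
- RWKV7.forward() now has an explicit HF/PEFT-friendly signature that supports input_ids / attention_mask / inputs_embeds / **kwargs.
- forward_infctx() adds inputs_embeds=None and accepts **kwargs. When inputs_embeds is not None it is used directly as the embedding; otherwise input_ids is embedded.
- When last_shift_states / last_wkv_states are missing, an empty state is initialized, which avoids crashes on some call paths.
Impact / Compatibility
- When inputs_embeds is None, the logic is the same as before.
- [...] inputs_embeds)
Error log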
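The failure mode and the shape of the fix can be illustrated with a minimal, dependency-free sketch. Everything here is a hypothetical stand-in: BrokenModel, FixedModel, embed and wrapper_forward only mimic the call pattern described in this PR and are not the real RWKV7 or PEFT code. The states are plain empty lists used as placeholders, whereas the real model would presumably allocate tensors.

```python
# Hypothetical stand-ins that mimic the call pattern described above.
# They are NOT the real RWKV7 or PEFT implementations.

class BrokenModel:
    """Mimics the old infctx signature: no inputs_embeds, no **kwargs."""

    def forward_infctx(self, input_ids, last_shift_states=None, last_wkv_states=None):
        return [float(t) for t in input_ids]


class FixedModel:
    """Mimics the patched signature described in this PR."""

    def embed(self, input_ids):
        # Toy "embedding": one float per token id (assumption for illustration).
        return [float(t) for t in input_ids]

    def forward_infctx(self, input_ids=None, last_shift_states=None,
                       last_wkv_states=None, inputs_embeds=None, **kwargs):
        # Use precomputed embeddings when given; otherwise embed input_ids.
        if inputs_embeds is not None:
            x = inputs_embeds
        else:
            x = self.embed(input_ids)
        # Initialize placeholder states when the caller did not supply them.
        if last_shift_states is None:
            last_shift_states = []
        if last_wkv_states is None:
            last_wkv_states = []
        # Extra HF-style keyword arguments in **kwargs are accepted and ignored.
        return x, last_shift_states, last_wkv_states


def wrapper_forward(model, input_ids=None, inputs_embeds=None, **kwargs):
    """Mimics a wrapper that always forwards inputs_embeds, even when it is None."""
    return model.forward_infctx(input_ids=input_ids, inputs_embeds=inputs_embeds, **kwargs)


if __name__ == "__main__":
    try:
        wrapper_forward(BrokenModel(), input_ids=[1, 2])
    except TypeError as err:
        print("old signature fails:", err)

    print("new signature works:",
          wrapper_forward(FixedModel(), input_ids=[1, 2], attention_mask=[1, 1]))
```

Accepting **kwargs keeps the method tolerant of whatever extra keyword arguments a wrapper forwards, at the cost of silently ignoring misspelled argument names.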