Skip to content

Commit e8fd39a

Browse files
authored
Fixed the bug that caused the training to stop directly when the training data was not alternating between user and assistant. (#221)
* fix bug * Update parse.py
1 parent aca179d commit e8fd39a

File tree

1 file changed

+5
-1
lines changed

1 file changed

+5
-1
lines changed

specforge/data/parse.py

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -68,7 +68,11 @@ def parse(
6868
convroles = ["user", "assistant"]
6969
for j, sentence in enumerate(conversation):
7070
role = sentence["role"]
71-
assert role == convroles[j % 2], f"unexpected role {role}"
71+
if role != convroles[j % 2]:
72+
warnings.warn(
73+
f"Conversation truncated due to unexpected role '{role}'. Expected '{convroles[j % 2]}'."
74+
)
75+
break
7276
messages.append({"role": role, "content": sentence["content"]})
7377

7478
conversation = self.tokenizer.apply_chat_template(

0 commit comments

Comments
 (0)