
Question about the amount of reward model training data #439

@chloeHXY

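Hello, I'm new to NLP and would like to ask a few questions:

  1. When training a medical LLM, what should the reward model's training data be: medical data (the 4k examples in Figure 1), non-medical data like that in Figure 2, or a mix of the two?
  2. If the two should be mixed, roughly what ratio should be used?
  3. How much data is generally needed for reward model training to be effective?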
Image Image

Labels: question (Further information is requested)