Questions about experiment details #11

@hshc123

Thanks to the authors for the excellent work. I have a few questions and would greatly appreciate your help!
1. How much accuracy improvement do SFT and RL each contribute? The paper seems to report only the final combined improvement.
2. The paper states an SFT batch size of 512. Should I set the per_device_train_batch_size parameter to 512/8 (for 8 GPUs)?
3. If I remove the `# ### eval` comment markers below, will evaluation run automatically and be logged to wandb?

```yaml
### train
run_name: dirl_sink_8b_math_glm_openr1math
include_effective_tokens_per_second: true
per_device_train_batch_size: 1
gradient_accumulation_steps: 1 #4
learning_rate: 1.0e-5
num_train_epochs: 10
lr_scheduler_type: constant_with_warmup
warmup_ratio: 0.03
bf16: true
ddp_timeout: 180000000
# ### eval
# val_size: 0.05
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 10
```
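For reference, a minimal sketch of the batch-size arithmetic behind question 2, assuming standard HuggingFace-`Trainer`-style semantics where the effective global batch size is per-device batch × gradient-accumulation steps × number of GPUs. The specific values below are illustrative assumptions, not settings confirmed by the authors:

```yaml
# Hypothetical settings to reach a global batch size of 512 on 8 GPUs:
# 8 GPUs * per_device_train_batch_size * gradient_accumulation_steps = 512
per_device_train_batch_size: 8    # assumption: 8 samples fit per GPU; lower if OOM
gradient_accumulation_steps: 8    # 8 * 8 * 8 = 512
# Equivalent splits: 64*1, 16*4, 4*16, 1*64 (least memory, slowest).
# Uncommenting the eval block enables periodic evaluation; metrics reach wandb
# only if wandb logging is active (e.g. report_to: wandb, or WANDB_PROJECT set).
```

In other words, setting per_device_train_batch_size directly to 512/8 = 64 is only equivalent if gradient_accumulation_steps stays at 1 and each GPU can actually hold 64 samples; otherwise spread the factor across accumulation steps.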
