ReTool-Implementation/src/retool_trainer.py

Commit History

Add custom sampler, training data loader, and GRPO-style training loop for ReTool_trainer
c710786
verified

bird-of-paradise committed on

Replace `model.generate` with a custom generation function to optimize kv_cache usage
a0dec77
verified

bird-of-paradise committed on

Use a weighted list of reward functions
e9196fe
verified

bird-of-paradise committed on