a080fe0
1
2
3
4
5
6
7
8
9
10
# Reward Functions This module contains some useful reward functions, primarily intended for use with the [`GRPOTrainer`]. ## Format rewards ### think_format_reward [[autodoc]] rewards.think_format_reward