dshin/flan-t5-ppo-user-f-batch-size-64-use-violation • Reinforcement Learning • Updated Mar 14, 2023 • 4
dshin/flan-t5-ppo-user-h-batch-size-64-use-violation • Reinforcement Learning • Updated Mar 14, 2023 • 5
dshin/flan-t5-ppo-user-e-batch-size-64-use-violation • Reinforcement Learning • Updated Mar 14, 2023 • 4
vincentmin/opt-125m-eli5-rl-finetune-128-8-8-1.4e-5_ada • Reinforcement Learning • Updated Apr 10, 2023
dshin/flan-t5-ppo-user-a-allenai-prosocial-dialog-testing-upload • Reinforcement Learning • Updated Apr 12, 2023 • 6