TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models
Paper
• 2602.15449 • Published
• 6
Instruction tuning, Diffusion models
torch.compile2 ** search_round) and repeat 1 - 3.diffusers 🧨bistandbytes as the official backend but using others like torchao is already very simple. enable_model_cpu_offload()torch.compile() them.