Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • By NormalUhr • Feb 11 • 56
Fast, High-Fidelity LLM Decoding with Regex Constraints • By vivien • Feb 23, 2024 • 9