LongSafety: Evaluating Long-Context Safety of Large Language Models Paper • 2502.16971 • Published Feb 24 • 1
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization Paper • 2503.20491 • Published Mar 26
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 207
AdaptThink: Reasoning Models Can Learn When to Think Paper • 2505.13417 • Published May 19 • 82
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation Paper • 2412.21059 • Published Dec 30, 2024 • 18
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training Paper • 2311.04155 • Published Nov 7, 2023 • 1
CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation Paper • 2311.18702 • Published Nov 30, 2023
AlignBench: Benchmarking Chinese Alignment of Large Language Models Paper • 2311.18743 • Published Nov 30, 2023 • 1
On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark Paper • 2110.08466 • Published Oct 16, 2021
PAL: Persona-Augmented Emotional Support Conversation Generation Paper • 2212.09235 • Published Dec 19, 2022