Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation Paper • 2212.07981 • Published Dec 15, 2022
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models Paper • 2502.19918 • Published Feb 27, 2025
Learning to Reason via Mixture-of-Thought for Logical Reasoning Paper • 2505.15817 • Published May 21, 2025 • 17
QTSumm: A New Benchmark for Query-Focused Table Summarization Paper • 2305.14303 • Published May 23, 2023
Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization Paper • 2311.09184 • Published Nov 15, 2023 • 1