Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published 30 days ago • 85
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20, 2024 • 43
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 53
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1, 2024 • 28
ReFT: Representation Finetuning for Language Models Paper • 2404.03592 • Published Apr 4, 2024 • 100
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15, 2024 • 60
Running • 996 • Can You Run It? LLM version • Determine GPU requirements for large language models
Runtime error • 563 • Open Ko-LLM Leaderboard • Explore and filter language model benchmark results