M Saad Salman

MSS444

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining

upvoted a paper 3 days ago

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

upvoted a paper 4 days ago

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

Organizations

None yet

upvoted 2 papers 3 days ago

Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining

Paper • 2603.11103 • Published 5 days ago • 7

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

Paper • 2603.12246 • Published 4 days ago • 4

upvoted 4 papers 4 days ago

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

Paper • 2603.05863 • Published 10 days ago • 5

Towards a Neural Debugger for Python

Paper • 2603.09951 • Published 6 days ago • 5

The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness

Paper • 2603.09200 • Published 6 days ago • 5

Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Paper • 2603.09906 • Published 6 days ago • 63

upvoted 9 papers 5 days ago

ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

Paper • 2603.03583 • Published 13 days ago • 2

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

Paper • 2603.07777 • Published 8 days ago • 5

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

Paper • 2603.07779 • Published 8 days ago • 5

Agentic Critical Training

Paper • 2603.08706 • Published 7 days ago • 13

Unlocking Data Value in Finance: A Study on Distillation and Difficulty-Aware Training

Paper • 2603.07223 • Published 9 days ago • 13

NLE: Non-autoregressive LLM-based ASR by Transcript Editing

Paper • 2603.08397 • Published 7 days ago • 19

Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces

Paper • 2603.06713 • Published 11 days ago • 15

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Paper • 2603.08652 • Published 7 days ago • 33

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published 7 days ago • 49

upvoted 3 papers 6 days ago

upvoted 2 papers 10 days ago

Specificity-aware reinforcement learning for fine-grained open-world classification

Paper • 2603.03197 • Published 13 days ago • 13

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Paper • 2603.04257 • Published 12 days ago • 19
