Interested
Large Language Model Unlearning via Embedding-Corrupted Prompts
Paper
• 2406.07933
• Published • 9
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Paper
• 2406.02657
• Published • 41
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper
• 2406.12050
• Published • 19
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper
• 2406.11813
• Published • 31
Breaking the Attention Bottleneck
Paper
• 2406.10906
• Published • 4
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper
• 2406.17557
• Published • 102
Unlocking Continual Learning Abilities in Language Models
Paper
• 2406.17245
• Published • 30
Scaling Laws for Linear Complexity Language Models
Paper
• 2406.16690
• Published • 23
Aligning Teacher with Student Preferences for Tailored Training Data Generation
Paper
• 2406.19227
• Published • 25
Is Programming by Example solved by LLMs?
Paper
• 2406.08316
• Published • 13
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Paper
• 2406.14909
• Published • 16
Can LLMs Learn by Teaching? A Preliminary Study
Paper
• 2406.14629
• Published • 21
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Paper
• 2407.01920
• Published • 17
On Leakage of Code Generation Evaluation Datasets
Paper
• 2407.07565
• Published • 6
Paper
• 2407.10671
• Published • 169
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper
• 2407.10969
• Published • 23
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Paper
• 2407.09121
• Published • 6
Practical Unlearning for Large Language Models
Paper
• 2407.10223
• Published • 4
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Paper
• 2407.13833
• Published • 12
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
• 2403.19887
• Published • 112
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
Paper
• 2408.02545
• Published • 40
CoverBench: A Challenging Benchmark for Complex Claim Verification
Paper
• 2408.03325
• Published • 15
Better Alignment with Instruction Back-and-Forth Translation
Paper
• 2408.04614
• Published • 15
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
• 2408.04619
• Published • 175
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper
• 2408.10914
• Published • 45
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Paper
• 2408.15496
• Published • 12
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Paper
• 2409.04109
• Published • 48
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation
Paper
• 2410.23090
• Published • 55
Can Language Models Replace Programmers? REPOCOD Says 'Not Yet'
Paper
• 2410.21647
• Published • 18
Paper
• 2410.21276
• Published • 87
LongReward: Improving Long-context Large Language Models with AI Feedback
Paper
• 2410.21252
• Published • 19
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
• 2411.13676
• Published • 47