nth-attempt's Collections: to-read
MambaByte: Token-free Selective State Space Model (arXiv:2401.13660)
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads (arXiv:2401.10774)
Self-Rewarding Language Models (arXiv:2401.10020)
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding (arXiv:2401.12954)
ChatQA: Building GPT-4 Level Conversational QA Models (arXiv:2401.10225)
ReFT: Reasoning with Reinforced Fine-Tuning (arXiv:2401.08967)
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference (arXiv:2401.08671)
Secrets of RLHF in Large Language Models Part II: Reward Modeling (arXiv:2401.06080)
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts (arXiv:2401.04081)
Mixtral of Experts (arXiv:2401.04088)
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM (arXiv:2401.02994)
Understanding LLMs: A Comprehensive Overview from Training to Inference (arXiv:2401.02038)
TinyLlama: An Open-Source Small Language Model (arXiv:2401.02385)
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws (arXiv:2401.00448)
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 (arXiv:2312.16171)
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs (arXiv:2407.04051)