view article Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 121
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20 • 78
view article Article Exploring Hard Negative Mining with NV-Retriever in Korean Financial Text By Albertmade and 1 other • Jan 12 • 14
EmoPillars Collection This collection contains models and a dataset for fine-grained context-aware and context-less emotion classification. • 7 items • Updated Apr 25 • 1
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models Paper • 2411.14257 • Published Nov 21, 2024 • 13
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 153
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Apr 30 • 88
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Paper • 2409.06820 • Published Sep 10, 2024 • 69
view article Article How NuminaMath Won the 1st AIMO Progress Prize By yfleureau and 7 others • Jul 11, 2024 • 122
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design Paper • 2408.12503 • Published Aug 22, 2024 • 27
Linear Transformers with Learnable Kernel Functions are Better In-Context Models Paper • 2402.10644 • Published Feb 16, 2024 • 82
view article Article Training and Finetuning Embedding Models with Sentence Transformers v3 By tomaarsen • May 28, 2024 • 240
Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian Paper • 2405.13929 • Published May 22, 2024 • 54