Drowning in Documents: Consequences of Scaling Reranker Inference Paper • 2411.11767 • Published Nov 18, 2024
Text2SQL is Not Enough: Unifying AI and Databases with TAG Paper • 2408.14717 • Published Aug 27, 2024
Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates Paper • 2206.00832 • Published Jun 2, 2022
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models Paper • 2405.20541 • Published May 30, 2024
LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms Paper • 2311.13133 • Published Nov 22, 2023
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining Paper • 2312.17482 • Published Dec 29, 2023
Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models Paper • 2306.11281 • Published Jun 20, 2023
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13, 2024
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments Paper • 2401.04290 • Published Jan 9, 2024
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts Paper • 2211.15841 • Published Nov 29, 2022
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines Paper • 2310.03714 • Published Oct 5, 2023
Feature Shift Detection: Localizing Which Features Have Shifted via Conditional Distribution Tests Paper • 2107.06929 • Published Jul 14, 2021