FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving • Paper 2501.01005 • Published Jan 2, 2025
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving • Paper 2310.19102 • Published Oct 29, 2023
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning • Paper 2210.12873 • Published Oct 23, 2022