Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • 7 days ago • 44
Article Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs By wenhuach and 8 others • Apr 29 • 39
Tiny dummy models Collection Randomly initialized tiny models for debugging/testing purpose • 112 items • Updated 3 days ago • 6
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 125
Article Accelerating LLM Inference with TGI on Intel Gaudi By baptistecolle and 4 others • Mar 28 • 14
Article Universal Assisted Generation: Faster Decoding with Any Assistant Model By danielkorat and 7 others • Oct 29, 2024 • 57
Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques By jmamou and 8 others • Mar 24 • 19
Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 487
Article Timm ❤️ Transformers: Use any timm model with transformers By ariG23498 and 4 others • Jan 16 • 51
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference By mfuntowicz and 1 other • Jan 16 • 75
Article The 5 Most Under-Rated Tools on Hugging Face By derek-thomas • Aug 22, 2024 • 90
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10, 2024 • 72
Article CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG By peterizsak and 5 others • Mar 15, 2024 • 10