Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 66
Article SmolVLM2: Bringing Video Understanding to Every Device By orrzohar and 6 others • Feb 20 • 293
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer Paper • 2501.11319 • Published Jan 20 • 1
Article DeepSearch Using Visual RAG in Agentic Frameworks 🔎 By paultltc and 1 other • Mar 21 • 35
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 114
Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 117
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14, 2024 • 57
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 98
SLIM Models Collection Structured Language Instruction Models (SLIMs) • 31 items • Updated Feb 10 • 32
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 628
Granite 3.0 Language Models Collection A series of language models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated May 2 • 98
SLIM GGUF Collection Quantized GGUF 'tool' implementations of SLIM Models • 30 items • Updated Feb 23 • 12
Article TTS Arena: Benchmarking Text-to-Speech Models in the Wild By mrfakename and 6 others • Feb 27, 2024 • 71
Open-source speech datasets annotated using Data-Speech Collection Open-source annotated speech datasets ranging from 1,000 hours to 45,000 hours. • 11 items • Updated Aug 8, 2024 • 5
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 624
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition Paper • 2402.15504 • Published Feb 23, 2024 • 23
Industry BERT Models Collection Industry and specialized domain finetuned BERT embedding models • 6 items • Updated May 28 • 8
TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation Paper • 2401.14373 • Published Jan 25, 2024 • 11
InstantID: Zero-shot Identity-Preserving Generation in Seconds Paper • 2401.07519 • Published Jan 15, 2024 • 58