DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 441
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces Paper • 2501.09756 • Published Jan 16, 2025 • 20
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models Paper • 2501.02955 • Published Jan 6, 2025 • 44
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published Jan 3, 2025 • 47
SDPO: Segment-Level Direct Preference Optimization for Social Agents Paper • 2501.01821 • Published Jan 3, 2025 • 20
Metric-AI/armenian-text-embeddings-1 Feature Extraction • 0.3B • Updated Feb 20, 2025 • 619 • 20
Running on Zero Featured 2.72k Whisper 📉 2.72k Transcribe audio files and YouTube videos into text
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation Paper • 2410.23090 • Published Oct 30, 2024 • 55
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching Paper • 2410.06885 • Published Oct 9, 2024 • 46