Spatial Audio & Visual
Spatial Audio & Visual LLMs
JAEGER: Joint 3D Audio-Visual Grounding and Reasoning in Simulated Physical Environments (Paper • 2602.18527 • Published Feb 20 • 2)
tsinghua-ee/JAEGER (Updated 1 day ago • 15 • 4)
General Time Series
SciTS: Scientific Time Series Understanding and Generation with LLMs (Paper • 2510.03255 • Published Sep 26, 2025)
OpenTSLab/SciTS (Preview • Updated Mar 19 • 2.19k • 2)
Brain Signals
BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals (Paper • 2505.18185 • Published May 18, 2025 • 1)
OpenTSLab/BrainOmni (Updated Oct 15, 2025 • 2)
Speech & Audio Processing
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations (Paper • 2510.25955 • Published Oct 29, 2025 • 1)
marcoyang/spear-xlarge-speech-audio (Feature Extraction • 0.6B • Updated 2 days ago • 10.7k • 7)
video-SALMONN 2
video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions.
tsinghua-ee/video-SALMONN-2_plus_72B (Updated Sep 28, 2025 • 11 • 2)
tsinghua-ee/video_SALMONN2plus_72B_audioAlign (Updated Jan 28 • 6)
tsinghua-ee/video-SALMONN2_plus_7B_full (9B • Updated Feb 23 • 446)
tsinghua-ee/video-SALMONN-2_plus_7B (Updated Sep 28, 2025 • 74 • 6)