Spaces:

nvidia
/

Speech-IQ-leaderboard

Running

App Files Files Community

Speech-IQ-leaderboard / src /about.py

huckiyang

[release] speechIQ layout imprv-v4.2

0975119 16 days ago

raw

history blame contribute delete

3.8 kB

	from dataclasses import dataclass
	from enum import Enum

	# Your leaderboard name
	TITLE = """<h1 align="center" id="space-title">🎙️ ACL-25 SpeechIQ Leaderboard</h1>"""

	# What does your leaderboard evaluate?
	INTRODUCTION_TEXT = """
	## 🎯 Welcome to the SpeechIQ Leaderboard!

	This leaderboard presents evaluation results for voice understanding large language models (LLM<sub>Voice</sub>) using our novel SpeechIQ evaluation framework.
	The Speech IQ Score provides a unified metric for comparing both cascaded methods (ASR+LLM) and end-to-end models.
	"""

	# Which evaluations are you running? how can people reproduce what you have?
	LLM_BENCHMARKS_TEXT = """
	## 📊 About SpeechIQ Evaluation

	Speech Intelligence Quotient (SpeechIQ) represents a first-of-its-kind intelligence examination that bridges cognitive principles with voice-oriented benchmarks. Our framework moves beyond traditional metrics like Word Error Rate (WER) to provide comprehensive evaluation of voice understanding capabilities.

	### 🎯 Evaluation Framework

	SpeechIQ evaluates models across three cognitive dimensions inspired by Bloom's Taxonomy:

	1. Remember (Verbatim Accuracy): Tests the model's ability to accurately capture spoken content
	2. Understand (Interpretation Similarity): Evaluates how well the model comprehends the meaning of speech
	3. Apply (Downstream Performance): Measures the model's ability to use speech understanding for practical tasks

	### 🏆 Model Categories

	- Agentic (ASR + LLM): Cascaded approaches using separate ASR and LLM components
	- End2End: Direct speech-to-text models that process audio end-to-end

	### 🔬 Key Benefits

	- Unified Comparisons: Compare cascaded and end-to-end approaches on equal footing
	- Error Detection: Identify annotation errors in existing benchmarks
	- Hallucination Detection: Detect and quantify hallucinations in voice LLMs
	- Cognitive Assessment: Map model capabilities to human cognitive principles

	### 📈 Speech IQ Score

	The final Speech IQ Score combines performance across all three dimensions to provide a comprehensive measure of voice understanding intelligence.

	## 🔄 Reproducibility

	For detailed methodology and reproduction instructions, please refer to our paper and codebase.
	"""

	EVALUATION_QUEUE_TEXT = """
	## 🚀 Submit Your Model for SpeechIQ Evaluation

	To submit your voice understanding model for SpeechIQ evaluation:

	### 1) Ensure Model Compatibility
	Make sure your model can process audio inputs and generate text outputs in one of these formats:
	- ASR + LLM: Separate ASR and LLM components
	- End-to-End: Direct audio-to-text processing

	### 2) Model Requirements
	- Model must be publicly accessible
	- Provide clear documentation of audio input format and expected outputs
	- Include information about audio encoder specifications

	### 3) Evaluation Domains
	Your model will be evaluated across:
	- Remember: Transcription accuracy
	- Understand: Semantic understanding
	- Apply: Task-specific performance

	### 4) Documentation
	Please provide:
	- Model architecture details
	- Training data information
	- Audio preprocessing requirements
	- Expected input/output formats

	## 📧 Contact

	For questions about SpeechIQ evaluation or to submit your model, please contact the research team.
	"""

	CITATION_BUTTON_LABEL = "Refer to the following ACL 2025 main conference paper."
	CITATION_BUTTON_TEXT = r"""@article{speechiq2025,
	title={SpeechIQ: Speech Intelligence Quotient Across Cognitive Levels in Voice Understanding Large Language Models},
	author={Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Sadao Kurohashi},
	journal={ACL 2025 main conference},
	year={2025}
	}"""