|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- Multiverse4FM/Multiverse-1K-mixed |
|
- Multiverse4FM/Multiverse-1K |
|
- simplescaling/s1K-1.1 |
|
base_model: |
|
- Qwen/Qwen2.5-32B-Instruct |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
--- |
|
|
|
# Model Summary |
|
|
|
> Multiverse-32B, built on [Multiverse](https://multiverse4fm.github.io/), is the first open-source non-AR model to achieve scores of 54% and 46% on AIME 2024 and AIME 2025, respectively. |
|
|
|
- **Webpage:** [Multiverse](https://multiverse4fm.github.io/) |
|
- **Paper:** [https://arxiv.org/abs/2506.09991](https://arxiv.org/abs/2506.09991) |
|
|
|
# Use |
|
|
|
Model usage is documented in the [Multiverse repository](https://github.com/Multiverse4FM/Multiverse); a minimal loading sketch is shown below. |
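
The following is a minimal sketch using the standard `transformers` text-generation API. The repo id `Multiverse4FM/Multiverse-32B` and the generation settings are assumptions for illustration; the linked repository is the authoritative reference, and the Multiverse parallel-generation features may require the custom serving code described there.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id -- substitute the actual Hugging Face repo id of this model card.
model_id = "Multiverse4FM/Multiverse-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # shard across available GPUs
)

# Standard chat-style prompt; the chat template comes from the tokenizer config.
messages = [{"role": "user", "content": "What is the sum of the first 100 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```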
|
|
|
# Evaluation |
|
|
|
| Model | AIME24 | AIME25 | MATH500 | GPQA-Diamond | |
|
| :--- | :---: | :---: | :---: | :---: | |
|
| s1-32B | 35.4 | 25.8 | 88.6 | 48.0 | |
|
| s1.1-32B | 52.9 | 41.7 | 93.4 | 62.6 | |
|
| Qwen2.5-32B-Instruct | 15.8 | 10.4 | 80.4 | 47.0 | |
|
| Autoregressive-32B | **54.6** | <u>45.0</u> | **92.8** | <u>61.6</u> | |
|
| **Multiverse-32B-zero** | 52.1 | 44.2 | <u>92.4</u> | **63.6** | |
|
| **Multiverse-32B** | 53.8 | **45.8** | 91.8 | 60.7 | |
|
|
|
# Acknowledgements |
|
|
|
Thanks to the amazing s1 team for the s1.1 dataset, which we use as the base data, and to the Qwen team for Qwen2.5-32B-Instruct, which we use as the base model. |
|
|
|
# Citation Information |
|
|
|
```bibtex |
|
@misc{yang2025multiverselanguagemodelssecretly, |
|
title={Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation}, |
|
author={Xinyu Yang and Yuwei An and Hongyi Liu and Tianqi Chen and Beidi Chen}, |
|
year={2025}, |
|
eprint={2506.09991}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.LG}, |
|
url={https://arxiv.org/abs/2506.09991}, |
|
} |
|
``` |
|
|