---
license: mit
language:
- en
base_model:
- Skywork/Matrix-Game
---
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

# Matrix-Game: Interactive World Foundation Model
<div style="display: flex; justify-content: center; gap: 10px;">
  <a href="https://github.com/SkyworkAI/Matrix-Game">
    <img src="https://img.shields.io/badge/GitHub-100000?style=flat&logo=github&logoColor=white" alt="GitHub">
  </a>
  <a href="https://github.com/SkyworkAI/Matrix-Game/blob/main/assets/report.pdf">
    <img src="https://img.shields.io/badge/arXiv-Report-b31b1b?style=flat&logo=arxiv&logoColor=white" alt="arXiv">
  </a>

  
</div>

## 📝 Overview
**Matrix-Game** is a 17B-parameter interactive world foundation model for controllable game world generation.

## ✨ Key Features

- 🎯 **Feature 1**: **Interactive Generation.** A diffusion-based image-to-world model that generates high-quality videos conditioned on keyboard and mouse inputs, enabling fine-grained control and dynamic scene evolution.
- 🚀 **Feature 2**: **GameWorld Score.** A comprehensive benchmark that evaluates Minecraft world models along four dimensions: visual quality, temporal quality, action controllability, and physical rule understanding.
- 💡 **Feature 3**: **Matrix-Game Dataset.** A large-scale Minecraft dataset with fine-grained action annotations, supporting scalable training for interactive and physically grounded world modeling.

## 🔥 Latest Updates

* [2025-05] 🎉 Initial release of the Matrix-Game model

## 🚀 Performance Comparison
### GameWorld Score Benchmark Comparison

| Model     | Image Quality ↑ | Aesthetic Quality ↑ | Temporal Cons. ↑ | Motion Smooth. ↑ | Keyboard Acc. ↑ | Mouse Acc. ↑ | 3D Cons. ↑ |
|-----------|------------------|-------------|-------------------|-------------------|------------------|---------------|-------------|
| Oasis     | 0.65             | 0.48        | 0.94              | **0.98**          | 0.77             | 0.56          | 0.56        |
| MineWorld | 0.69             | 0.47        | 0.95              | **0.98**          | 0.86             | 0.64          | 0.51        |
| **Ours**  | **0.72**         | **0.49**    | **0.97**          | **0.98**          | **0.95**         | **0.95**      | **0.76**    |

**Metric Descriptions**:

- **Image Quality** / **Aesthetic**: Visual fidelity and perceptual appeal of generated frames  
- **Temporal Consistency** / **Motion Smoothness**: Temporal coherence and smoothness between frames  
- **Keyboard Accuracy** / **Mouse Accuracy**: Accuracy in following user control signals  
- **3D Consistency**: Geometric stability and physical plausibility over time

See the [GameWorld Score](https://github.com/SkyworkAI/Matrix-Game/tree/main/GameWorldScore) benchmark for implementation details.
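
To make the size of the gaps in the table easier to read, the following minimal sketch copies the scores above into a dictionary and computes, for each metric, the margin between Matrix-Game ("Ours") and the better of the two baselines. The variable and function names are illustrative and are not part of the GameWorld Score code.

```python
# Scores copied from the GameWorld Score table above (higher is better).
METRICS = ["Image Quality", "Aesthetic Quality", "Temporal Cons.",
           "Motion Smooth.", "Keyboard Acc.", "Mouse Acc.", "3D Cons."]

SCORES = {
    "Oasis":     [0.65, 0.48, 0.94, 0.98, 0.77, 0.56, 0.56],
    "MineWorld": [0.69, 0.47, 0.95, 0.98, 0.86, 0.64, 0.51],
    "Ours":      [0.72, 0.49, 0.97, 0.98, 0.95, 0.95, 0.76],
}


def margins_over_best_baseline(scores, ours="Ours"):
    """Return {metric: ours - best baseline}, rounded to two decimals."""
    baselines = [name for name in scores if name != ours]
    result = {}
    for i, metric in enumerate(METRICS):
        best = max(scores[name][i] for name in baselines)
        result[metric] = round(scores[ours][i] - best, 2)
    return result


margins = margins_over_best_baseline(SCORES)
for metric, margin in margins.items():
    print(f"{metric:<18} {margin:+.2f}")
```

On these numbers the largest margins are in Mouse Acc. and 3D Cons., while Motion Smooth. is tied across all three models.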

### Human Evaluation

![Human Win Rate](assets/human_win_rate.png)

> Double-blind human evaluation by two independent groups across four key dimensions: **Overall Quality**, **Controllability**, **Visual Quality**, and **Temporal Consistency**.  
> Scores represent the percentage of pairwise comparisons in which each method was preferred. Matrix-Game consistently outperforms prior models across all metrics and both groups.
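
As a rough illustration of the score described above (the percentage of pairwise comparisons in which a method was preferred), the sketch below computes win rates from a list of votes. The vote format and the toy votes are assumptions made up for this example; they are not the evaluation data or the evaluation script.

```python
from collections import Counter


def win_rates(comparisons):
    """Compute per-method win rates in percent.

    comparisons: iterable of (method_a, method_b, winner) tuples, where
    winner must be one of the two compared methods.
    """
    appearances = Counter()
    wins = Counter()
    for method_a, method_b, winner in comparisons:
        if winner not in (method_a, method_b):
            raise ValueError(f"winner {winner!r} is not part of the pair")
        appearances[method_a] += 1
        appearances[method_b] += 1
        wins[winner] += 1
    return {m: 100.0 * wins[m] / appearances[m] for m in appearances}


# Toy votes with placeholder method names (not real results).
toy_votes = [
    ("model_a", "model_b", "model_a"),
    ("model_a", "model_c", "model_a"),
    ("model_b", "model_c", "model_c"),
    ("model_a", "model_b", "model_b"),
]
rates = win_rates(toy_votes)
for method, rate in sorted(rates.items()):
    print(f"{method}: {rate:.1f}%")
```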


## 🚀 Quick Start

```bash
# Clone the repository
git clone https://github.com/SkyworkAI/Matrix-Game.git
cd Matrix-Game

# Install Python dependencies
pip install -r requirements.txt

# The project also depends on apex and FlashAttention-3; install them
# by following the instructions in their repositories:
#   https://github.com/NVIDIA/apex
#   https://github.com/Dao-AILab/flash-attention

# Run inference
bash run_inference.sh
```
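
Because apex and FlashAttention-3 are installed separately from `requirements.txt`, it can help to confirm they are importable before running inference. The sketch below is a generic check using only the Python standard library; the import names `apex` and `flash_attn_interface` are assumptions about how these packages are exposed after installation, so adjust them to match your environment.

```python
import importlib.util

# Assumed top-level import names; these are not taken from the repository.
ASSUMED_MODULES = ["apex", "flash_attn_interface"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed modules were found.")
```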

## ⭐ Acknowledgements

We would like to express our gratitude to:

- [Diffusers](https://github.com/huggingface/diffusers) for their excellent diffusion model framework
- [HunyuanVideo](https://github.com/Tencent/HunyuanVideo) for their strong base model
- [MineDojo](https://minedojo.org/knowledge_base) for their Minecraft video dataset
- [MineRL](https://github.com/minerllabs/minerl) for their excellent gym framework
- [Video-Pre-Training](https://github.com/openai/Video-Pre-Training) for their accurate Inverse Dynamics Model
- [GameFactory](https://github.com/KwaiVGI/GameFactory) for the idea behind the action control module

We are grateful to the broader research community for their open exploration and contributions to the field of interactive world generation.

## 📎 Citation
If you find this project useful, please cite our paper:
```bibtex
@article{zhang2025matrixgame,
  title     = {Matrix-Game: Interactive World Foundation Model},
  author    = {Yifan Zhang and Chunli Peng and Boyang Wang and Puyi Wang and Qingcheng Zhu and Zedong Gao and Eric Li and Yang Liu and Yahui Zhou},
  journal   = {arXiv},
  year      = {2025}
}
```