---
base_model:
- OpenGVLab/InternVL2-8B
language:
- en
library_name: transformers
license: apache-2.0
metrics:
- accuracy
pipeline_tag: image-text-to-text
---

# MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Repo: [https://github.com/mathllm/MathCoder](https://github.com/mathllm/MathCoder)

Paper: [https://huggingface.co/papers/2505.10557](https://huggingface.co/papers/2505.10557)


## Introduction
We introduce MathCoder-VL, a series of open-source large multimodal models (LMMs) specifically tailored for general math problem-solving. We also introduce [FigCodifier-8B](https://huggingface.co/MathLLMs/FigCodifier), an image-to-code model.

| Base Model | Ours |
|------------|------|
| [Mini-InternVL-Chat-2B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-2B-V1-5) | [MathCoder-VL-2B](https://huggingface.co/MathLLMs/MathCoder-VL-2B) |
| [InternVL2-8B](https://huggingface.co/OpenGVLab/InternVL2-8B) | [MathCoder-VL-8B](https://huggingface.co/MathLLMs/MathCoder-VL-8B) |



## Usage
For training and inference code, please refer to [InternVL](https://github.com/OpenGVLab/InternVL).
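
For quick single-image inference, the following is a minimal sketch that assumes MathCoder-VL-8B keeps InternVL2's remote-code `model.chat` interface; the fixed 448×448 preprocessing below is a simplification of InternVL's dynamic tiling loader, and the image path is a placeholder.

```python
import torch
import torchvision.transforms as T
from PIL import Image
from transformers import AutoModel, AutoTokenizer

path = "MathLLMs/MathCoder-VL-8B"
# Assumes a CUDA GPU and that the model ships the InternVL2 remote-code chat API.
model = AutoModel.from_pretrained(
    path, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# Simple fixed-resolution preprocessing (InternVL's official loader adds dynamic tiling).
transform = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
pixel_values = transform(Image.open("figure.png").convert("RGB"))  # placeholder image path
pixel_values = pixel_values.unsqueeze(0).to(torch.bfloat16).cuda()

# Abbreviated version of the TikZ prompt shown below; "<image>" marks the image slot.
question = (
    "<image>\n"
    "Please generate the corresponding TikZ code that accurately represents "
    "the visual elements in the image."
)
response = model.chat(tokenizer, pixel_values, question,
                      dict(max_new_tokens=1024, do_sample=False))
print(response)
```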

### Prompt for TikZ Code Generation

```
<image>
Please generate the corresponding TikZ code that accurately represents the visual elements in the image. TikZ is a powerful tool for creating vector graphics within LaTeX documents. Your generated code should be precise, well-structured, and should recreate the image as faithfully as possible.
```

<div align="center">
  <img src="./examples/tikzimage.png" width="100%" title="Result Figure">
</div>

### Prompt for Python Code Generation

```
Please provide the Python code needed to reproduce this image.
<image>
```

<div align="center">
  <img src="./examples/pyimage.png" width="100%" title="Result Figure">
</div>
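
The returned string is plain source code, so it can be rendered directly to check how faithfully it reproduces the input figure. Below is a small, hypothetical helper (the `run_generated_code` name and output path are illustrative, not part of the repo) that writes a generated Python snippet to disk and executes it in a separate interpreter; treat model-generated code as untrusted and sandbox it accordingly.

```python
import subprocess
import sys
from pathlib import Path

def run_generated_code(code: str, out_path: str = "reproduced_figure.py") -> None:
    """Write model-generated Python code to a file and execute it in a subprocess."""
    Path(out_path).write_text(code)
    # Run in a separate interpreter so crashes in generated code do not kill the caller.
    subprocess.run([sys.executable, out_path], check=True, timeout=120)

# `response` is the string returned by model.chat(...) in the sketch above.
# run_generated_code(response)
```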


## Motivation

<div align="center">
  <img src="./examples/fig1.png" width="100%" title="Result Figure">
</div>

## Construction of FigCodifier

<div align="center">
  <img src="./examples/fig2.png" width="100%" title="Result Figure">
</div>

## Construction of MathCoder-VL

<div align="center">
  <img src="./examples/fig4.png" width="100%" title="Result Figure">
</div>

## Performance

<div align="center">
  <img src="./examples/tab1.png" width="100%" title="Result Figure">
</div>

## **Citation**

Please cite the paper if you use our data, model or code.

```
@inproceedings{
wang2025mathcodervl,
title={MathCoder-{VL}: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning},
author={Ke Wang and Junting Pan and Linda Wei and Aojun Zhou and Weikang Shi and Zimu Lu and Han Xiao and Yunqiao Yang and Houxing Ren and Mingjie Zhan and Hongsheng Li},
booktitle={The 63rd Annual Meeting of the Association for Computational Linguistics},
year={2025},
url={https://openreview.net/forum?id=nuvtX1imAb}
}
```
```
@inproceedings{
lu2025mathcoder2,
title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=1Iuw1jcIrf}
}
```
```
@inproceedings{
wang2024mathcoder,
title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=z8TW0ttBPp}
}
```

# **MathCoder**
This repo is for "[MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning](https://openreview.net/pdf?id=z8TW0ttBPp)"

🔥🔥🔥 We release "[MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning](https://openreview.net/pdf?id=lclKPTKM9R)"


## 💥 News 💥

- **[2025.05.16]** 🤗 [MathCoder-VL-2B](https://huggingface.co/MathLLMs/MathCoder-VL-2B), [MathCoder-VL-8B](https://huggingface.co/MathLLMs/MathCoder-VL-8B) and [FigCodifier-8B](https://huggingface.co/MathLLMs/FigCodifier) are available now! 🔥🔥🔥
- **[2025.05.16]** Our MathCoder-VL has been accepted to ACL 2025 Findings. 🔥🔥🔥
- **[2024.05.20]** 🤗 [MathCodeInstruct Dataset-Plus](https://huggingface.co/datasets/MathLLMs/MathCodeInstruct-Plus) is available now! 🔥
- **[2024.04.29]** 🤗 [MathCodeInstruct Dataset](https://huggingface.co/datasets/MathLLMs/MathCodeInstruct) is available now! 🔥
- **[2024.02.27]** 🚀 [MathGenie](https://mathgenie.github.io/) achieves an accuracy of 87.7% on GSM8K and 55.7% on MATH. 🎉 Congratulations!
- **[2024.02.27]** The inference and evaluation code for the MathCoder models is available now.
- **[2024.01.16]** 🌟 Our [**MathCoder**](https://openreview.net/forum?id=z8TW0ttBPp) and [**CSV**](https://openreview.net/forum?id=c8McWs4Av0) have been accepted at **ICLR 2024**! 🎉 Cheers!
- **[2023.10.05]** Our work was featured by [Aran Komatsuzaki](https://twitter.com/arankomatsuzaki). Thanks!
- **[2023.10.05]** Our 7B models are available on Hugging Face now.
- **[2023.10.05]** Our paper is now accessible at https://arxiv.org/abs/2310.03731.

### Datasets and Models
Our models are now available on Hugging Face.

🤗 [MathCodeInstruct Dataset](https://huggingface.co/datasets/MathLLM/MathCodeInstruct)

| Base Model: Llama-2 | Base Model: Code Llama |
|---------------------|------------------------|
| [MathCoder-L-7B](https://huggingface.co/MathLLM/MathCoder-L-7B) | [MathCoder-CL-7B](https://huggingface.co/MathLLM/MathCoder-CL-7B) |
| [MathCoder-L-13B](https://huggingface.co/MathLLM/MathCoder-L-13B) | [MathCoder-CL-34B](https://huggingface.co/MathLLM/MathCoder-CL-34B) |


## Training Data
The models are trained on the [MathCodeInstruct](https://huggingface.co/datasets/MathLLM/MathCodeInstruct) Dataset.

<br>
<div align="center">
  <img src="figures/result.png" width="100%" title="Result Figure">
</div>


## **Introduction**
The recently released GPT-4 Code Interpreter has demonstrated remarkable proficiency in solving challenging math problems, primarily attributed to its ability to seamlessly reason with natural language, generate code, execute code, and continue reasoning based on the execution output. In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities.

We propose a method of generating novel and high-quality datasets with math problems and their code-based solutions, referred to as MathCodeInstruct. Each solution interleaves *natural language*, *code*, and *execution results*. 
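
To make the interleaved format concrete, here is an illustrative example constructed for this card (not taken verbatim from MathCodeInstruct):

```
Problem: What is the sum of the first 50 positive even integers?

[natural language] The sum is 2 + 4 + ... + 100; a short program computes it directly.

[code]
total = sum(range(2, 101, 2))
print(total)

[execution result]
2550

[natural language] Therefore, the sum of the first 50 positive even integers is 2550.
```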

We also introduce a customized supervised fine-tuning and inference approach. This approach yields the MathCoder models, a family of models capable of generating code-based solutions for solving challenging math problems.

Impressively, the MathCoder models achieve state-of-the-art scores among open-source LLMs on the MATH (45.2%) and GSM8K (83.9%) datasets, substantially outperforming other open-source alternatives. Notably, the MathCoder model not only surpasses ChatGPT-3.5 and PaLM-2 on GSM8K and MATH but also outperforms GPT-4 on the competition-level MATH dataset. The proposed datasets and models have been released (see the links above).
<br>
<div align="center">
  <img src="figures/pipeline.png" width="100%" title="Result Figure">
</div>

## Usage

### Model deployment
We use Text Generation Inference (TGI) to deploy the MathCoder models for response generation.
TGI is a toolkit for deploying and serving Large Language Models (LLMs). It enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5. You can follow the guide [here](https://huggingface.co/docs/text-generation-inference/index).
After successfully installing TGI, you can deploy a model using `deploy.sh`:
```sh
model_path="local model path"

max_input_tokens=1536
max_total_tokens=2048

set -x
hostname -I # print the host ip

text-generation-launcher --port 8000 \
--max-batch-prefill-tokens ${max_input_tokens} \
--max-input-length ${max_input_tokens} \
--max-total-tokens ${max_total_tokens} \
--model-id ${model_path}
```
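
Once the launcher is running, the server can be queried through TGI's standard REST interface. Below is a minimal sketch assuming the default `/generate` endpoint; the host, port, and prompt are placeholders.

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/generate",  # replace with the host/port printed by deploy.sh
    json={
        "inputs": "Solve the following problem, writing Python code where helpful:\nWhat is 17 * 23?",
        "parameters": {"max_new_tokens": 512},
    },
    timeout=120,
)
print(resp.json()["generated_text"])
```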

### Inference
We provide a script for inference. Replace `ip` and `port` in the following command with the address and port of the API served by TGI, for example:
```sh
python inference.py --pnum=4 --outdir=outs/debug --ip=10.119.18.159 --port=8001 --type=test --dataset=GSM8K
```
We also open-source all of the model outputs from our MathCoder models under the `outs/` folder.

### Evaluation
To evaluate the predicted answers, run the following command:
```sh
python evaluate.py outs/MathCoder-L-7b/MATH/MATH_test_result-20230917-2026.jsonl 
```
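
At its core, the evaluation compares each predicted answer in the result `.jsonl` against the ground truth. The sketch below is a simplified, hypothetical illustration of that comparison, not the logic of `evaluate.py` itself; the field names `extracted_answer` and `answer` are assumptions.

```python
import json
import sys

def accuracy(jsonl_path: str) -> float:
    """Fraction of records whose predicted answer matches the ground truth exactly."""
    correct = total = 0
    with open(jsonl_path) as f:
        for line in f:
            record = json.loads(line)
            total += 1
            # Hypothetical field names; the real result files may differ.
            if str(record.get("extracted_answer")).strip() == str(record.get("answer")).strip():
                correct += 1
    return correct / max(total, 1)

if __name__ == "__main__":
    print(f"accuracy = {accuracy(sys.argv[1]):.4f}")
```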

## **Citation**

Please cite the paper if you use our data, model or code.

```
@inproceedings{
wang2025mathcodervl,
title={MathCoder-{VL}: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning},
author={Ke Wang and Junting Pan and Linda Wei and Aojun Zhou and Weikang Shi and Zimu Lu and Han Xiao and Yunqiao Yang and Houxing Ren and Mingjie Zhan and Hongsheng Li},
booktitle={The 63rd Annual Meeting of the Association for Computational Linguistics},
year={2025},
url={https://openreview.net/forum?id=nuvtX1imAb}
}
```
```
@inproceedings{
lu2025mathcoder2,
title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=1Iuw1jcIrf}
}
```
```
@inproceedings{
wang2024mathcoder,
title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=z8TW0ttBPp}
}
```
```
@inproceedings{
zhou2024solving,
title={Solving Challenging Math Word Problems Using {GPT}-4 Code Interpreter with Code-based Self-Verification},
author={Aojun Zhou and Ke Wang and Zimu Lu and Weikang Shi and Sichun Luo and Zipeng Qin and Shaoqing Lu and Anya Jia and Linqi Song and Mingjie Zhan and Hongsheng Li},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=c8McWs4Av0}
}
```