Kaguya-19 nielsr (HF Staff) committed on
Commit 08c3956 · verified · 1 Parent(s): f24b538

Add link to paper and mention it in the description (#1)


- Add link to paper and mention it in the description (cc94bbc72311db38d213667116911954c6fd9681)
- Remove file information section (9af7f6ee114cecf8fbbb7f7ff5508827db5e3588)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +200 -196
README.md CHANGED
---
language:
- zh
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---

<div align="center">
<img src="https://github.com/OpenBMB/MiniCPM/blob/main/assets/minicpm_logo.png?raw=true" width="500em">
</div>

<p align="center">
<a href="https://github.com/OpenBMB/MiniCPM/" target="_blank">GitHub Repo</a> |
<a href="https://github.com/OpenBMB/MiniCPM/tree/main/report/MiniCPM_4_Technical_Report.pdf" target="_blank">Technical Report</a> |
<a href="https://huggingface.co/papers/2506.07900" target="_blank">Paper</a>
</p>
<p align="center">
👋 Join us on <a href="https://discord.gg/3cGQn9b3YM" target="_blank">Discord</a> and <a href="https://github.com/OpenBMB/MiniCPM/blob/main/assets/wechat.jpg" target="_blank">WeChat</a>
</p>

This repository contains the model described in the paper [MiniCPM4: Ultra-Efficient LLMs on End Devices](https://huggingface.co/papers/2506.07900).

## What's New

* [2025-06-05] 🚀🚀🚀 We have open-sourced **MiniCPM4-Survey**, a model built upon MiniCPM4-8B that is capable of generating trustworthy, long-form survey papers while maintaining competitive performance relative to significantly larger models.

## MiniCPM4 Series
The MiniCPM4 series comprises highly efficient large language models (LLMs) designed explicitly for end-side devices, achieving this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems.
- [MiniCPM4-8B](https://huggingface.co/openbmb/MiniCPM4-8B): The flagship of MiniCPM4, with 8B parameters, trained on 8T tokens.
- [MiniCPM4-0.5B](https://huggingface.co/openbmb/MiniCPM4-0.5B): The small version of MiniCPM4, with 0.5B parameters, trained on 1T tokens.
- [MiniCPM4-8B-Eagle-FRSpec](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec): Eagle head for FRSpec, accelerating speculative inference for MiniCPM4-8B.
- [MiniCPM4-8B-Eagle-FRSpec-QAT-cpmcu](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec-QAT-cpmcu): Eagle head trained with QAT for FRSpec, efficiently integrating speculative decoding and quantization to achieve ultra acceleration for MiniCPM4-8B.
- [MiniCPM4-8B-Eagle-vLLM](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-vLLM): Eagle head in vLLM format, accelerating speculative inference for MiniCPM4-8B.
- [MiniCPM4-8B-marlin-Eagle-vLLM](https://huggingface.co/openbmb/MiniCPM4-8B-marlin-Eagle-vLLM): Quantized Eagle head in vLLM format, accelerating speculative inference for MiniCPM4-8B.
- [BitCPM4-0.5B](https://huggingface.co/openbmb/BitCPM4-0.5B): Extreme ternary quantization applied to MiniCPM4-0.5B compresses model parameters into ternary values, achieving a 90% reduction in bit width.
- [BitCPM4-1B](https://huggingface.co/openbmb/BitCPM4-1B): Extreme ternary quantization applied to MiniCPM3-1B compresses model parameters into ternary values, achieving a 90% reduction in bit width.
- [MiniCPM4-Survey](https://huggingface.co/openbmb/MiniCPM4-Survey): Based on MiniCPM4-8B, accepts users' queries as input and autonomously generates trustworthy, long-form survey papers. (**<-- you are here**)
- [MiniCPM4-MCP](https://huggingface.co/openbmb/MiniCPM4-MCP): Based on MiniCPM4-8B, accepts users' queries and available MCP tools as input and autonomously calls relevant MCP tools to satisfy users' requirements.

## Overview

**MiniCPM4-Survey** is an open-source LLM agent model jointly developed by [THUNLP](https://nlp.csai.tsinghua.edu.cn), Renmin University of China, and [ModelBest](https://modelbest.cn/en). Built on [MiniCPM4](https://github.com/OpenBMB/MiniCPM4) with 8 billion parameters, it accepts users' queries as input and autonomously generates trustworthy, long-form survey papers.

Key features include:

- **Plan-Retrieve-Write Survey Generation Framework** — We propose a multi-agent generation framework that operates through three core stages: planning (defining the overall structure of the survey), retrieval (generating appropriate retrieval keywords), and writing (synthesizing the retrieved information to generate coherent section-level content). A minimal sketch of this loop appears after this list.

- **High-Quality Dataset Construction** — We gather and process a large number of expert-written survey papers to construct a high-quality training dataset. Meanwhile, we collect a large number of research papers to build a retrieval database.

- **Multi-Aspect Reward Design** — We carefully design a reward system with three aspects (structure, content, and citations) to evaluate the quality of the surveys, which is used as the reward function in the RL training stage.

- **Multi-Step RL Training Strategy** — We propose a *Context Manager* to ensure retention of essential information while facilitating efficient reasoning, and we construct a *Parallel Environment* to maintain efficient RL training cycles.
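
The actual implementation of this framework lives in the GitHub repo; the following is only an illustrative Python sketch of the three-stage loop, in which every name (`generate`, `search`, `write_survey`) is a hypothetical stand-in rather than the repository's API.

```python
# Illustrative sketch of the plan-retrieve-write loop (hypothetical names,
# not the repository's actual code).

def generate(prompt: str) -> str:
    """Stand-in for a chat-style call into MiniCPM4-Survey."""
    raise NotImplementedError

def search(query: str, k: int = 20) -> list[str]:
    """Stand-in for a top-k lookup against the paper retrieval database."""
    raise NotImplementedError

def write_survey(user_query: str) -> str:
    # Stage 1, planning: define the overall structure of the survey.
    outline = generate(f"Draft a section-level outline for a survey on: {user_query}")

    sections = []
    for section in outline.splitlines():
        # Stage 2, retrieval: generate retrieval keywords, then collect evidence.
        keywords = generate(f"List retrieval keywords for the section: {section}")
        evidence = [hit for kw in keywords.splitlines() for hit in search(kw)]

        # Stage 3, writing: synthesize the retrieved information into
        # coherent section-level content.
        sections.append(generate(
            f"Write the section '{section}' grounded in this evidence:\n"
            + "\n".join(evidence)
        ))
    return "\n\n".join(sections)
```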

## Quick Start

### Download the model

Download [MiniCPM4-Survey](https://huggingface.co/openbmb/MiniCPM4-Survey) from Hugging Face and place it in `model/MiniCPM4-Survey`.
We recommend using [MiniCPM-Embedding-Light](https://huggingface.co/openbmb/MiniCPM-Embedding-Light) as the embedding model, which can be downloaded from Hugging Face and placed in `model/MiniCPM-Embedding-Light`.
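
If you prefer to script the download, the `huggingface_hub` client can fetch both checkpoints into the layout assumed above (the local directory names simply mirror the paths mentioned in this section):

```python
# Download both models into the expected local layout.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="openbmb/MiniCPM4-Survey", local_dir="model/MiniCPM4-Survey")
snapshot_download(repo_id="openbmb/MiniCPM-Embedding-Light", local_dir="model/MiniCPM-Embedding-Light")
```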
### Prepare the environment

You can download the [paper data](https://www.kaggle.com/datasets/Cornell-University/arxiv) from Kaggle and extract it. Then run `python data_process.py` to process the data, and `python build_index.py` to build the retrieval database:

```bash
cd ./code
curl -L -o ~/Downloads/arxiv.zip \
  https://www.kaggle.com/api/v1/datasets/download/Cornell-University/arxiv
unzip ~/Downloads/arxiv.zip -d .
mkdir data
python ./src/preprocess/data_process.py
mkdir index
python ./src/preprocess/build_index.py
```
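
For orientation, building the index typically amounts to embedding each paper with the embedding model and storing the vectors in a nearest-neighbor index. The sketch below is illustrative only: it assumes `sentence-transformers` and `faiss`, and the file names (`data/papers.jsonl`, `index/papers.faiss`) are hypothetical, not the repository's actual layout.

```python
# Hypothetical sketch of the embed-and-index step (not the repo's actual code).
import json

import faiss  # pip install faiss-cpu
from sentence_transformers import SentenceTransformer

# Load the recommended embedding model downloaded earlier.
model = SentenceTransformer("model/MiniCPM-Embedding-Light", trust_remote_code=True)

# "data/papers.jsonl" stands in for whatever the preprocessing step emits.
with open("data/papers.jsonl") as f:
    abstracts = [json.loads(line)["abstract"] for line in f]

# Normalized embeddings plus an inner-product index give cosine-similarity search.
embeddings = model.encode(abstracts, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)
faiss.write_index(index, "index/papers.faiss")
```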

### Model Inference

You can run the following commands to build the retrieval environment and start inference:

```bash
cd ./code
python ./src/retriever.py
bash ./scripts/run.sh
```

If you want to run with the frontend, you can run the following commands instead:

```bash
cd ./code
python ./src/retriever.py
bash ./scripts/run_with_frontend.sh
cd frontend/minicpm4-survey
npm install
npm run dev
```

Then you can visit `http://localhost:5173` in your browser to use the model.

## Performance Evaluation

| Method | Relevance | Coverage | Depth | Novelty | Avg. | Fact Score |
|---------------------------------------------|-----------|----------|-------|---------|-------|------------|
| Naive RAG (driven by G2FT) | 3.25 | 2.95 | 3.35 | 2.60 | 3.04 | 43.68 |
| AutoSurvey (driven by G2FT) | 3.10 | 3.25 | 3.15 | **3.15** | 3.16 | 46.56 |
| Webthinker (driven by WTR1-7B) | 3.30 | 3.00 | 2.75 | 2.50 | 2.89 | -- |
| Webthinker (driven by QwQ-32B) | 3.40 | 3.30 | 3.30 | 2.50 | 3.13 | -- |
| OpenAI Deep Research (driven by GPT-4o) | 3.50 | **3.95** | 3.55 | 3.00 | **3.50** | -- |
| MiniCPM4-Survey | 3.45 | 3.70 | **3.85** | 3.00 | **3.50** | **68.73** |
| &nbsp;&nbsp;&nbsp;*w/o* RL | **3.55** | 3.35 | 3.30 | 2.25 | 3.11 | 50.24 |

*Performance comparison of the survey generation systems, judged by GPT-4o. "G2FT" stands for Gemini-2.0-Flash-Thinking, and "WTR1-7B" denotes Webthinker-R1-7B. FactScore evaluation was omitted for Webthinker, as it does not include citation functionality, and for OpenAI Deep Research, which does not provide citations when exporting the results. Our technical report contains the details of this evaluation.*

## Statement
- As a language model, MiniCPM generates content by learning from a vast amount of text.
- However, it does not possess the ability to comprehend or express personal opinions or value judgments.
- Any content generated by MiniCPM does not represent the viewpoints or positions of the model developers.
- Therefore, when using content generated by MiniCPM, users should take full responsibility for evaluating and verifying it on their own.

## LICENSE
- This repository and MiniCPM models are released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) License.

## Citation
- Please cite our [paper](https://github.com/OpenBMB/MiniCPM/tree/main/report/MiniCPM_4_Technical_Report.pdf) if you find our work valuable.

```bibtex
@article{minicpm4,
  title={{MiniCPM4}: Ultra-Efficient LLMs on End Devices},
  author={MiniCPM Team},
  year={2025}
}
```