---
library_name: transformers
license: apache-2.0
base_model:
- Qwen/Qwen2.5-32B-Instruct
tags:
- llama-factory
- full
- generated_from_trainer
datasets:
- open-thoughts/open-thoughts-114k
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
model-index:
- name: OpenThinker-32B
  results: []
---

<p align="center">
<img src="https://huggingface.co/datasets/open-thoughts/open-thoughts-114k/resolve/main/open_thoughts.png" width="50%">
</p>

# OpenThinker-32B

This model is a fine-tuned version of [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) on the
[OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset.

The dataset is derived by distilling DeepSeek-R1 using the [data pipeline available on GitHub](https://github.com/open-thoughts/open-thoughts).
More info about the dataset can be found on the [OpenThoughts-114k dataset card](https://huggingface.co/datasets/open-thoughts/open-thoughts-114k).

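For quick experimentation, the model can be used like any other Transformers chat model. The snippet below is an illustrative sketch rather than part of the original card: the prompt and the generation settings (max_new_tokens, temperature) are assumptions, and a 32B model needs substantial GPU memory.

```python
# Minimal usage sketch for OpenThinker-32B with Hugging Face Transformers.
# Generation settings are illustrative assumptions, not recommended values.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=2048, temperature=0.7, do_sample=True)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
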
The numbers reported in the table below are evaluated with our open-source tool [Evalchemy](https://github.com/mlfoundations/Evalchemy).

|Model Name|Dataset Size|AIME24 I/II|AIME25 I|MATH500|GPQA Diamond|LCBv2|
|---|---|---|---|---|---|---|
|LIMO-32B|0.8k|56.7|49.3|86.6|58.1|60.0|
|s1-32B|1k|36.0|25.3|84.8|50.5|40.9|
|s1.1-32B|1k|64.7|49.3|89.0|60.1|65.5|
|DeepSeek-R1-Distill-Qwen-32B|800k (closed)|**76.7**|**55.9**|89.4|57.6|**71.2**|
|**OpenThinker-32B**|114k|66.0|53.3|**90.6**|**61.6**|68.9|

We are fully open-source. Our [model weights](https://huggingface.co/open-thoughts), [datasets](https://huggingface.co/open-thoughts), [data generation code](https://github.com/open-thoughts/open-thoughts), [evaluation code](https://github.com/mlfoundations/Evalchemy), and [training code](https://github.com/hiyouga/LLaMA-Factory) are all publicly available.

| | Open Weights | Open Data | Open Code |
|--|--------------|-----------|-----------|
|OpenThinker-32B|✅|[✅](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)|[✅](https://github.com/open-thoughts/open-thoughts)|
|DeepSeek-R1-Distill-Qwen-32B|✅|❌|❌|
|OpenAI/Gemini|❌|❌|❌|

## Intended uses & limitations

This model is released under the Apache 2.0 License.

## Training procedure

We finetune [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct)
on [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) for
3 epochs with a 16k context length using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
Our [full training configuration](https://github.com/open-thoughts/open-thoughts/blob/main/train/OpenThinker-32B.yaml)
is provided in [our repository](https://github.com/open-thoughts/open-thoughts/tree/main).
Training the 32B model on [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)
was done on AWS SageMaker with 8xH100 P5 nodes; on 4 nodes, training took around 90 hours.
For training on [OpenThoughts-Unverified-173k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Unverified-173k),
we used 96 nodes of 4xA100 (64 GB per GPU) on the Leonardo Supercomputer; training took 30 hours, for a total of 11,520 A100 hours.

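As a sanity check on these compute figures (an illustrative calculation, not part of the original card), the GPU-hour totals follow directly from nodes × GPUs per node × wall-clock hours:

```python
# Back-of-the-envelope GPU-hour accounting from the figures quoted above.
h100_hours = 4 * 8 * 90    # 4 SageMaker P5 nodes x 8 H100s x ~90 h
a100_hours = 96 * 4 * 30   # 96 Leonardo nodes x 4 A100s x 30 h
print(h100_hours, a100_hours)  # 2880 11520; the latter matches the 11,520 A100 hours above
```
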
### Training hyperparameters

The following hyperparameters were used during training (the effective batch size arithmetic is sketched after the list):
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- gradient_accumulation_steps: 3
- total_train_batch_size: 96
- total_eval_batch_size: 256
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0

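The total_train_batch_size above is the product of the per-device batch size, the number of devices, and the gradient accumulation steps; a minimal check of that arithmetic (illustrative, not from the card):

```python
# total_train_batch_size = per-device batch size x devices x gradient accumulation steps
train_batch_size = 1
num_devices = 32
gradient_accumulation_steps = 3
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
assert total_train_batch_size == 96  # matches the value reported in the list above
```
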
### Framework versions

- Transformers 4.46.1
- PyTorch 2.3.0
- Datasets 3.1.0
- Tokenizers 0.20.3

More info can be found in our repository: [https://github.com/open-thoughts/open-thoughts](https://github.com/open-thoughts/open-thoughts).

# Citation
```
@misc{openthoughts,
  author = {Team, OpenThoughts},
  month = jan,
  title = {{Open Thoughts}},
  howpublished = {https://open-thoughts.ai},
  year = {2025}
}
```

# Links
- 📊 [Open Thoughts Launch Blog Post](https://www.open-thoughts.ai/blog/launch)
- 📊 [Open Thoughts Measuring Reasoning with Evalchemy Blog Post](https://www.open-thoughts.ai/blog/measure)
- 📊 [Open Thoughts OpenThinker-32B Post](https://www.open-thoughts.ai/blog/scale)
- 💻 [Open Thoughts GitHub Repository](https://github.com/open-thoughts/open-thoughts)
- 🧠 [OpenThoughts-114k dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)
- 🧠 [OpenThoughts-Unverified-173k dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Unverified-173k)
- 🤖 [OpenThinker-7B model](https://huggingface.co/open-thoughts/OpenThinker-7B)
- 🤖 [OpenThinker-7B-Unverified model](https://huggingface.co/open-thoughts/OpenThinker-7B-Unverified)
- 🤖 [OpenThinker-32B model](https://huggingface.co/open-thoughts/OpenThinker-32B) - this model
- 🤖 [OpenThinker-32B-Unverified model](https://huggingface.co/open-thoughts/OpenThinker-32B-Unverified)