Organization Card
OpenCSG
OpenCSG stands for Converged resources, Software refined, and Generative LM. The 'C' represents Converged resources, indicating the integration and full utilization of hybrid resources. The 'S' stands for Software refined, signifying software that is refined by large models. The 'G' represents Generative LM, which denotes widespread, inclusive, and democratized generative large models.
The vision of OpenCSG is to empower every industry, every company, and every individual to own their models. We adhere to the principles of openness and open source, and we make the OpenCSG large-model software stack available to the community. Everyone is welcome to use it, give feedback, and contribute collaboratively.
SLM pretrained from scratch
A suite of high-quality Chinese datasets for pretraining, fine-tuning, or preference alignment, along with the models trained on these datasets.
- opencsg/Fineweb-Edu-Chinese-V2.1 • Viewer • Updated • 958M • 22.1k • 40
- OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training • Paper • 2501.08197 • Published • 8
- opencsg/chinese-fineweb-edu-v2 • Viewer • Updated • 188M • 1.22k • 64
- opencsg/chinese-fineweb-edu • Viewer • Updated • 84.6M • 5.61k • 104
models (34)
opencsg/OpenCSG-R1-Qwen2.5-Code-3B-V1 • Text Generation • 3B • Updated • 16
opencsg/OpenCSG-Qwen2.5-3B-GUI • 4B • Updated • 8 • 1
opencsg/OpenCSG-Qwen2.5-7B-GUI • 8B • Updated • 7 • 2
opencsg/OpenCSG-R1-Qwen2.5-Math-7B-V1 • 8B • Updated • 7 • 4
opencsg/OpenCSG-R1-Qwen2.5-Math-3B-V1 • 3B • Updated • 7 • 3
opencsg/csg-wukong-2b-ultrafeedback-chinese-binarized-lowest • 2B • Updated • 4 • 1
opencsg/csg-wukong-2b-ultrafeedback-chinese-binarized • 2B • Updated • 3
opencsg/csg-wukong-2b-smoltalk-chinese • 2B • Updated • 3 • 2
opencsg/opencsg-starcoder2-15b-v0.1 • Text Generation • 16B • Updated • 14 • 2
opencsg/opencsg-CodeLlama-34b-v0.2 • Text Generation • 34B • Updated • 15 • 2
datasets (10)
opencsg/autohub-benchmark • Viewer • Updated • 99 • 78 • 1
opencsg/Fineweb-Edu-Chinese-V2.1 • Viewer • Updated • 958M • 22.1k • 40
opencsg/chinese-fineweb-v2-scorer-train-data • Preview • Updated • 20
opencsg/chinese-fineweb-edu • Viewer • Updated • 84.6M • 5.61k • 104
opencsg/chinese-fineweb-edu-v2 • Viewer • Updated • 188M • 1.22k • 64
opencsg/smoltalk-chinese • Preview • Updated • 202 • 33
opencsg/chinese-cosmopedia • Preview • Updated • 468 • 70
opencsg/UltraFeedback-chinese • Preview • Updated • 340 • 12
opencsg/PR_review_deepseek • Viewer • Updated • 24.8k • 54 • 4
opencsg/csg-robomaster • Updated • 488 • 2