zhiyucheng committed on
Commit
574fdb8
·
1 Parent(s): 761db7e

add accuracy benchmarks

Files changed (1)
  1. README.md +48 -1
README.md CHANGED
@@ -15,7 +15,7 @@ This model is ready for commercial/non-commercial use. <br>
  This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to Non-NVIDIA [(DeepSeek R1) Model Card](https://huggingface.co/deepseek-ai/DeepSeek-R1).
 
  ### License/Terms of Use:
- [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md)
 
 
  ## Model Architecture:
@@ -101,6 +101,53 @@ if __name__ == '__main__':
 
  ```
 
 
  ## Ethical Considerations
 
  This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to Non-NVIDIA [(DeepSeek R1) Model Card](https://huggingface.co/deepseek-ai/DeepSeek-R1).
 
  ### License/Terms of Use:
+ [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md)
 
 
  ## Model Architecture:
 
 
  ```
 
+ ### Evaluation
+ The accuracy benchmark results are presented in the table below:
+ <table>
+ <tr>
+ <td><strong>Precision</strong>
+ </td>
+ <td><strong>MMLU</strong>
+ </td>
+ <td><strong>GSM8K</strong>
+ </td>
+ <td><strong>AIME2024</strong>
+ </td>
+ <td><strong>GPQA Diamond</strong>
+ </td>
+ <td><strong>MATH-500</strong>
+ </td>
+ </tr>
+ <tr>
+ <td>FP8
+ </td>
+ <td>90.8
+ </td>
+ <td>96.3
+ </td>
+ <td>80.0
+ </td>
+ <td>69.7
+ </td>
+ <td>95.4
+ </td>
+ </tr>
+ <tr>
+ <td>FP4
+ </td>
+ <td>90.7
+ </td>
+ <td>96.1
+ </td>
+ <td>80.0
+ </td>
+ <td>69.2
+ </td>
+ <td>94.2
+ </td>
+ </tr>
+ <tr>
+ </table>
 
  ## Ethical Considerations