Commit 574fdb8
1 Parent(s): 761db7e
add accuracy benchmarks
README.md CHANGED
@@ -15,7 +15,7 @@ This model is ready for commercial/non-commercial use. <br>
 This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to Non-NVIDIA [(DeepSeek R1) Model Card](https://huggingface.co/deepseek-ai/DeepSeek-R1).
 
 ### License/Terms of Use:
-[MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md)
+[MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md)
 
 
 ## Model Architecture:
@@ -101,6 +101,53 @@ if __name__ == '__main__':
 
 ```
 
+### Evaluation
+The accuracy benchmark results are presented in the table below:
+<table>
+<tr>
+<td><strong>Precision</strong>
+</td>
+<td><strong>MMLU</strong>
+</td>
+<td><strong>GSM8K</strong>
+</td>
+<td><strong>AIME2024</strong>
+</td>
+<td><strong>GPQA Diamond</strong>
+</td>
+<td><strong>MATH-500</strong>
+</td>
+</tr>
+<tr>
+<td>FP8
+</td>
+<td>90.8
+</td>
+<td>96.3
+</td>
+<td>80.0
+</td>
+<td>69.7
+</td>
+<td>95.4
+</td>
+</tr>
+<tr>
+<td>FP4
+</td>
+<td>90.7
+</td>
+<td>96.1
+</td>
+<td>80.0
+</td>
+<td>69.2
+</td>
+<td>94.2
+</td>
+</tr>
+<tr>
+</table>
 
 ## Ethical Considerations
 
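As a reading aid for the Evaluation table added in this commit, the FP8 and FP4 rows can be compared benchmark by benchmark. The following is a minimal sketch in Python, with the scores copied from the diff. The dictionary layout and the function name `score_delta` are illustrative assumptions and are not part of the README.

```python
# Scores copied from the Evaluation table added in this commit.
FP8 = {
    "MMLU": 90.8,
    "GSM8K": 96.3,
    "AIME2024": 80.0,
    "GPQA Diamond": 69.7,
    "MATH-500": 95.4,
}
FP4 = {
    "MMLU": 90.7,
    "GSM8K": 96.1,
    "AIME2024": 80.0,
    "GPQA Diamond": 69.2,
    "MATH-500": 94.2,
}


def score_delta(reference, other):
    """Return `other` minus `reference` per benchmark, rounded to one decimal.

    Hypothetical helper for illustration only; rounding hides
    floating-point noise in the subtraction.
    """
    return {name: round(other[name] - reference[name], 1) for name in reference}


if __name__ == "__main__":
    # Print the signed FP4 - FP8 difference for each benchmark.
    for name, delta in score_delta(FP8, FP4).items():
        print(f"{name}: {delta:+.1f}")
```

On these numbers the two rows match on AIME2024, and the largest gap is on MATH-500, where FP4 is 1.2 points below FP8.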