Create MinervaOnSTEMMath
MinervaOnSTEMMath
ADDED
@@ -0,0 +1,15 @@
+
+
+# Evaluation on STEM Benchmarks
+
+To test Minerva’s quantitative reasoning abilities, we evaluated the model on STEM benchmarks ranging in difficulty from grade-school-level problems to graduate-level coursework.
+
+1. MATH: High school math competition-level problems.
+2. MMLU-STEM: A subset of the Massive Multitask Language Understanding benchmark focused on STEM, covering topics such as engineering, chemistry, math, and physics at the high school and college level.
+3. GSM8k: Grade-school-level math problems involving basic arithmetic operations that should all be solvable by a talented middle school student.
+
+We also evaluated Minerva on OCWCourses, a collection of college- and graduate-level problems that we collected from MIT OpenCourseWare, covering a variety of STEM topics such as solid state chemistry, astronomy, differential equations, and special relativity.
+
+In all cases, Minerva obtains state-of-the-art results, sometimes by a wide margin.
+
+Reference: https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html