awacke1 committed
Commit 1342280 · 1 Parent(s): 1486bba

Create MinervaOnSTEMMath

Files changed (1)
  1. MinervaOnSTEMMath +15 -0
MinervaOnSTEMMath ADDED
@@ -0,0 +1,15 @@
+
+
+ # Evaluation on STEM Benchmarks
+
+ To test Minerva’s quantitative reasoning abilities, we evaluated the model on STEM benchmarks ranging in difficulty from grade-school-level problems to graduate-level coursework.
+
+ 1. MATH: High school math competition-level problems.
+ 2. MMLU-STEM: A subset of the Massive Multitask Language Understanding benchmark focused on STEM, covering topics such as engineering, chemistry, math, and physics at the high school and college level.
+ 3. GSM8k: Grade-school-level math problems involving basic arithmetic operations that should all be solvable by a talented middle school student.
+
+ We also evaluated Minerva on OCWCourses, a collection of college- and graduate-level problems that we collected from MIT OpenCourseWare, covering a variety of STEM topics such as solid state chemistry, astronomy, differential equations, and special relativity.
+
+ In all cases, Minerva obtains state-of-the-art results, sometimes by a wide margin.
+
+ Reference: https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html