---
datasets:
- cerebras/SlimPajama-627B
language:
- en
tags:
- llama
---

A roughly 200M-parameter model (I think the parameter count in the graphic here is wrong, but the benchmark values are correct) with the token embedding and language modelling head of Llama2-70b attached. Linear transformations map Llama2-70b's 8192-dimensional space down to this model's 1024-dimensional space.
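To make the sizes concrete, here is a rough back-of-the-envelope sketch of the attached pieces. The 8192 and 1024 dimensions come from the description above. The 32,000-token vocabulary is Llama 2's published tokenizer size and is not stated in this card. The sketch also assumes bias-free projections and a separate 1024 -> 8192 up-projection in front of the language modelling head, which is a guess at the wiring rather than the confirmed architecture. The helper names are hypothetical.

```python
# Dimensions stated in the model card.
LLAMA_DIM = 8192  # Llama2-70b hidden size
MODEL_DIM = 1024  # this model's hidden size

# Assumption: Llama 2's published vocabulary size (not stated in this card).
VOCAB_SIZE = 32_000


def linear_params(d_in: int, d_out: int, bias: bool = False) -> int:
    """Parameter count of a dense layer mapping d_in -> d_out."""
    return d_in * d_out + (d_out if bias else 0)


def attached_param_counts(
    vocab: int = VOCAB_SIZE, big: int = LLAMA_DIM, small: int = MODEL_DIM
) -> dict:
    """Sizes of the borrowed Llama2-70b pieces and the assumed projections."""
    return {
        "token_embedding": vocab * big,                # lookup table: vocab x 8192
        "down_projection": linear_params(big, small),  # 8192 -> 1024 after the embedding
        "up_projection": linear_params(small, big),    # 1024 -> 8192 (assumed) before the head
        "lm_head": linear_params(big, vocab),          # 8192 -> vocab logits
    }


counts = attached_param_counts()
for name, n in counts.items():
    print(f"{name:>16}: {n:,}")
print(f"{'total':>16}: {sum(counts.values()):,}")
```

Under these assumptions the two borrowed Llama2-70b matrices dominate the attached parameters, while each projection is comparatively small.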

| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml |none | 25|acc |0.1775|± |0.0112|
| | |none | 25|acc_norm|0.2133|± |0.0120|
|truthfulqa_mc2|Yaml |none | 0|acc |0.4457|± |0.0152|
|winogrande|Yaml |none | 5|acc |0.5154|± | 0.014|
|hellaswag|Yaml |none | 10|acc |0.2832|± |0.0045|
| | |none | 10|acc_norm|0.3024|± |0.0046|

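The Stderr column can be sanity-checked from the accuracy and the number of evaluation examples. The sketch below assumes each accuracy is the mean of 0/1 per-example scores and that the standard error is the sample standard deviation divided by sqrt(n). The example counts (1,172 for ARC-Challenge test, 1,267 for Winogrande validation, 10,042 for HellaSwag validation) are the usual split sizes for these benchmarks and are not stated in this card. The function name is illustrative.

```python
import math

# Assumed evaluation-set sizes (standard splits; not stated in this card).
N_EXAMPLES = {"arc_challenge": 1172, "winogrande": 1267, "hellaswag": 10042}

# Accuracies ("acc" rows) copied from the table above.
ACCURACY = {"arc_challenge": 0.1775, "winogrande": 0.5154, "hellaswag": 0.2832}


def accuracy_stderr(acc: float, n: int) -> float:
    """Standard error of the mean of n binary (0/1) scores with mean acc.

    Uses the sample variance (n - 1 in the denominator), so
    stderr = sqrt(acc * (1 - acc) / (n - 1)).
    """
    if n < 2:
        raise ValueError("need at least two examples")
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


for task, acc in ACCURACY.items():
    stderr = accuracy_stderr(acc, N_EXAMPLES[task])
    print(f"{task}: acc={acc:.4f} stderr~{stderr:.4f}")
```

With these assumed sizes the results round to the values in the Stderr column, which suggests the reported errors are plain binomial standard errors.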
### MMLU

(avg accuracy: 26.17%)

| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
|-----------------------------------|-------|------|-----:|------|-----:|---|-----:|
|abstract_algebra |Yaml |none | 5|acc |0.2200|± |0.0416|
|anatomy |Yaml |none | 5|acc |0.2222|± |0.0359|
|astronomy |Yaml |none | 5|acc |0.1776|± |0.0311|
|business_ethics |Yaml |none | 5|acc |0.2300|± |0.0423|
|clinical_knowledge |Yaml |none | 5|acc |0.2415|± |0.0263|
|college_biology |Yaml |none | 5|acc |0.3194|± |0.0390|
|college_chemistry |Yaml |none | 5|acc |0.2000|± |0.0402|
|college_computer_science |Yaml |none | 5|acc |0.2800|± |0.0451|
|college_mathematics |Yaml |none | 5|acc |0.2800|± |0.0451|
|college_medicine |Yaml |none | 5|acc |0.2254|± |0.0319|
|college_physics |Yaml |none | 5|acc |0.2157|± |0.0409|
|computer_security |Yaml |none | 5|acc |0.2200|± |0.0416|
|conceptual_physics |Yaml |none | 5|acc |0.2553|± |0.0285|
|econometrics |Yaml |none | 5|acc |0.2368|± |0.0400|
|electrical_engineering |Yaml |none | 5|acc |0.2345|± |0.0353|
|elementary_mathematics |Yaml |none | 5|acc |0.2646|± |0.0227|
|formal_logic |Yaml |none | 5|acc |0.2302|± |0.0376|
|global_facts |Yaml |none | 5|acc |0.1700|± |0.0378|
|high_school_biology |Yaml |none | 5|acc |0.2903|± |0.0258|
|high_school_chemistry |Yaml |none | 5|acc |0.2611|± |0.0309|
|high_school_computer_science |Yaml |none | 5|acc |0.2300|± |0.0423|
|high_school_european_history |Yaml |none | 5|acc |0.2788|± |0.0350|
|high_school_geography |Yaml |none | 5|acc |0.3081|± |0.0329|
|high_school_government_and_politics|Yaml |none | 5|acc |0.3731|± |0.0349|
|high_school_macroeconomics |Yaml |none | 5|acc |0.2923|± |0.0231|
|high_school_mathematics |Yaml |none | 5|acc |0.2630|± |0.0268|
|high_school_microeconomics |Yaml |none | 5|acc |0.3403|± |0.0308|
|high_school_physics |Yaml |none | 5|acc |0.2715|± |0.0363|
|high_school_psychology |Yaml |none | 5|acc |0.2881|± |0.0194|
|high_school_statistics |Yaml |none | 5|acc |0.4722|± |0.0340|
|high_school_us_history |Yaml |none | 5|acc |0.3529|± |0.0335|
|high_school_world_history |Yaml |none | 5|acc |0.2532|± |0.0283|
|human_aging |Yaml |none | 5|acc |0.2108|± |0.0274|
|human_sexuality |Yaml |none | 5|acc |0.2672|± |0.0388|
|international_law |Yaml |none | 5|acc |0.2479|± |0.0394|
|jurisprudence |Yaml |none | 5|acc |0.2500|± |0.0419|
|logical_fallacies |Yaml |none | 5|acc |0.2393|± |0.0335|
|machine_learning |Yaml |none | 5|acc |0.2946|± |0.0433|
|management |Yaml |none | 5|acc |0.1650|± |0.0368|
|marketing |Yaml |none | 5|acc |0.1923|± |0.0258|
|medical_genetics |Yaml |none | 5|acc |0.3000|± |0.0461|
|miscellaneous |Yaml |none | 5|acc |0.2720|± |0.0159|
|moral_disputes |Yaml |none | 5|acc |0.1936|± |0.0213|
|moral_scenarios |Yaml |none | 5|acc |0.2380|± |0.0142|
|nutrition |Yaml |none | 5|acc |0.2484|± |0.0247|
|philosophy |Yaml |none | 5|acc |0.2283|± |0.0238|
|prehistory |Yaml |none | 5|acc |0.2346|± |0.0236|
|professional_accounting |Yaml |none | 5|acc |0.2589|± |0.0261|
|professional_law |Yaml |none | 5|acc |0.2445|± |0.0110|
|professional_medicine |Yaml |none | 5|acc |0.4485|± |0.0302|
|professional_psychology |Yaml |none | 5|acc |0.2614|± |0.0178|
|public_relations |Yaml |none | 5|acc |0.2364|± |0.0407|
|security_studies |Yaml |none | 5|acc |0.4000|± |0.0314|
|sociology |Yaml |none | 5|acc |0.3035|± |0.0325|
|us_foreign_policy |Yaml |none | 5|acc |0.2800|± |0.0451|
|virology |Yaml |none | 5|acc |0.2048|± |0.0314|
|world_religions |Yaml |none | 5|acc |0.1988|± |0.0306|
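
The 26.17% figure quoted above the table matches the unweighted mean of the 57 per-subject accuracies, so every subject counts equally regardless of how many questions it has. A minimal check, with the Value column copied by hand in row order (the variable and function names are illustrative):

```python
from statistics import fmean
from typing import Sequence

# "Value" column of the MMLU table, in row order
# (abstract_algebra ... world_religions).
MMLU_ACC = [
    0.2200, 0.2222, 0.1776, 0.2300, 0.2415, 0.3194, 0.2000, 0.2800,
    0.2800, 0.2254, 0.2157, 0.2200, 0.2553, 0.2368, 0.2345, 0.2646,
    0.2302, 0.1700, 0.2903, 0.2611, 0.2300, 0.2788, 0.3081, 0.3731,
    0.2923, 0.2630, 0.3403, 0.2715, 0.2881, 0.4722, 0.3529, 0.2532,
    0.2108, 0.2672, 0.2479, 0.2500, 0.2393, 0.2946, 0.1650, 0.1923,
    0.3000, 0.2720, 0.1936, 0.2380, 0.2484, 0.2283, 0.2346, 0.2589,
    0.2445, 0.4485, 0.2614, 0.2364, 0.4000, 0.3035, 0.2800, 0.2048,
    0.1988,
]


def macro_average(values: Sequence[float]) -> float:
    """Unweighted mean: each subject contributes equally."""
    return fmean(values)


# prints: 57 subjects, macro average = 26.17%
print(f"{len(MMLU_ACC)} subjects, macro average = {macro_average(MMLU_ACC):.2%}")
```

A question-weighted average would need the per-subject question counts, which this card does not list.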