lvkaokao
committed on
Commit
·
4121a2d
1
Parent(s):
0b39ee8
update.
src/display/about.py +19 -9
src/display/about.py
CHANGED
|
@@ -40,22 +40,32 @@ We chose these benchmarks as they test a variety of reasoning and general knowle
|
|
| 40 |
|
| 41 |
## REPRODUCIBILITY
|
| 42 |
To reproduce our results, here are the commands you can run, using [v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness/tree/v0.4.2) of the Eleuther AI Harness:
|
| 43 |
-
|
| 44 |
-
| 45 |
|
| 46 |
```
|
| 47 |
-
python main.py --model=hf-causal-experimental
|
| 48 |
-
--model_args="pretrained=<your_model>,use_accelerate=True,revision=<your_model_revision>"
|
| 49 |
-
--tasks=<task_list>
|
| 50 |
-
--num_fewshot=<n_few_shot>
|
| 51 |
-
--batch_size=1
|
| 52 |
--output_path=<output_path>
|
| 53 |
|
| 54 |
```
|
| 55 |
|
| 56 |
-
**Note:**
| 57 |
|
| 58 |
-
The tasks and few shots parameters are:
|
| 59 |
- ARC-C: 0-shot, *arc_challenge* (`acc`)
|
| 60 |
- ARC-E: 0-shot, *arc_easy* (`acc`)
|
| 61 |
- HellaSwag: 0-shot, *hellaswag* (`acc`)
|
|
|
|
| 40 |
|
| 41 |
## REPRODUCIBILITY
|
| 42 |
To reproduce our results, here are the commands you can run, using [v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness/tree/v0.4.2) of the Eleuther AI Harness:
|
| 43 |
+
```
|
| 44 |
+
python main.py --model=hf-causal-experimental
|
| 45 |
+
--model_args="pretrained=<your_model>,use_accelerate=True,revision=<your_model_revision>"
|
| 46 |
+
--tasks=<task_list>
|
| 47 |
+
--num_fewshot=<n_few_shot>
|
| 48 |
+
--batch_size=1
|
| 49 |
+
--output_path=<output_path>
|
| 50 |
+
```
|
| 51 |
|
| 52 |
```
|
| 53 |
+
python main.py --model=hf-causal-experimental
|
| 54 |
+
--model_args="pretrained=<your_model>,use_accelerate=True,revision=<your_model_revision>"
|
| 55 |
+
--tasks=<task_list>
|
| 56 |
+
--num_fewshot=<n_few_shot>
|
| 57 |
+
--batch_size=1
|
| 58 |
--output_path=<output_path>
|
| 59 |
|
| 60 |
```
|
| 61 |
|
| 62 |
+
**Note:**
|
| 63 |
+
- We run `llama.cpp` series models on Xeon CPUs and all other models on NVIDIA GPUs.
|
| 64 |
+
- If model parameters > 7B, we use `--batch_size 4`. If model parameters < 7B, we use `--batch_size 2`. And we set `--batch_size 1` for llama.cpp. You can expect results to vary slightly for different batch sizes because of padding.
|
| 65 |
+
|
| 66 |
+
|
| 67 |
|
| 68 |
+
### The tasks and few shots parameters are:
|
| 69 |
- ARC-C: 0-shot, *arc_challenge* (`acc`)
|
| 70 |
- ARC-E: 0-shot, *arc_easy* (`acc`)
|
| 71 |
- HellaSwag: 0-shot, *hellaswag* (`acc`)
|
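As a rough illustration of how the note added in this commit combines with the command template, here is a minimal Python sketch that picks a batch size and assembles the argument list. The helper names (`pick_batch_size`, `build_command`), the example model id, and the output path are hypothetical and do not come from the repository. The note does not say which batch size applies at exactly 7B, so the sketch assumes the smaller one. It also only mirrors the flags shown in the diff and does not check them against the harness itself.

```python
import shlex

# Task name -> few-shot count, copied from the three tasks visible in the diff.
TASKS = {"arc_challenge": 0, "arc_easy": 0, "hellaswag": 0}


def pick_batch_size(params_billions: float, is_llama_cpp: bool = False) -> int:
    """Follow the batch-size rule stated in the note."""
    if is_llama_cpp:
        return 1
    # Assumption: the note leaves exactly 7B unspecified; it is grouped with
    # the smaller models here.
    return 4 if params_billions > 7 else 2


def build_command(model: str, revision: str, task: str,
                  batch_size: int, output_path: str) -> list[str]:
    """Assemble the command from the template in the diff as an argv list."""
    model_args = f"pretrained={model},use_accelerate=True,revision={revision}"
    return [
        "python", "main.py",
        "--model=hf-causal-experimental",
        f"--model_args={model_args}",
        f"--tasks={task}",
        f"--num_fewshot={TASKS[task]}",
        f"--batch_size={batch_size}",
        f"--output_path={output_path}",
    ]


if __name__ == "__main__":
    # Hypothetical 13B model; every value below is a placeholder.
    cmd = build_command(
        model="my-org/my-model",
        revision="main",
        task="hellaswag",
        batch_size=pick_batch_size(13),
        output_path="results/hellaswag.json",
    )
    print(shlex.join(cmd))
```

Building an argv list rather than a single string avoids shell-quoting problems with the commas and equals signs inside `--model_args`. The list can be passed to `subprocess.run` unchanged.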