---
library_name: transformers
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B
---

# Model Details

A `meta-llama/Meta-Llama-3-8B` model fine-tuned on 100,000 [CLRS-Text](https://github.com/google-deepmind/clrs/tree/master/clrs/_src/clrs_text) examples.
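For causal-LM fine-tuning, each CLRS-Text example is a question/answer pair concatenated into a single training sequence. A minimal sketch (the exact template and field names are assumptions, not taken from the training code):

```python
def format_example(question: str, answer: str, eos_token: str = "<|end_of_text|>") -> str:
    """Concatenate question and answer into one causal-LM training sequence.

    The separator-free template and the Llama 3 EOS token shown here are
    illustrative assumptions; the actual formatting used in training may differ.
    """
    return question + answer + eos_token

text = format_example("insertion_sort: key: [5 2 4]\nanswer: ", "[2 4 5]")
```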

## Training Details
- Learning rate: 1e-4 peak, with 150 warmup steps then cosine decay to 5e-6, using the AdamW optimiser
- Batch size: 128
- Loss computed over the answer tokens only, not the question.
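The schedule and the answer-only loss above can be sketched as follows. The learning-rate function reproduces the stated warmup/cosine shape; the masking helper uses the `-100` ignore index that cross-entropy implementations such as PyTorch's skip. Function names and the total-step count are illustrative assumptions:

```python
import math

def lr_at(step: int, total_steps: int, peak: float = 1e-4,
          floor: float = 5e-6, warmup: int = 150) -> float:
    """Linear warmup to `peak` over `warmup` steps, then cosine decay to `floor`."""
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * progress))

def answer_only_labels(input_ids: list, question_len: int,
                       ignore_index: int = -100) -> list:
    """Copy input_ids as labels, masking the question tokens with -100
    so the loss is computed on the answer tokens only."""
    return [ignore_index] * question_len + input_ids[question_len:]
```

With `total_steps=1000`, `lr_at` rises to 1e-4 at step 150 and decays to 5e-6 at step 1000; `answer_only_labels([1, 2, 3, 4, 5], 2)` masks the first two (question) positions.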