Updated README
README.md
CHANGED
@@ -24,7 +24,7 @@ It was then converted to the WordPiece format used by BERT.
 
 ## Pretraining
 
-We used the BERT-base configuration with 12 layers, 768 hidden units, 12 heads,
+We used the BERT-base configuration with 12 layers, 768 hidden units, 12 heads, 512 sequence length, 128 mini-batch size and 32k token vocabulary.
 
 ## Citation
 
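For reference, the hyperparameters listed in the updated line map onto a model configuration roughly as follows. This is a minimal sketch assuming the Hugging Face `transformers` library; the `BertConfig` parameter names below are an assumption, not part of the original README, and the 128 mini-batch size is a training-time setting rather than a model field.

```python
# Sketch: BERT-base configuration matching the hyperparameters in the README.
# Assumes the Hugging Face `transformers` package; not from the original repo.
from transformers import BertConfig, BertForMaskedLM

config = BertConfig(
    num_hidden_layers=12,         # 12 layers
    hidden_size=768,              # 768 hidden units
    num_attention_heads=12,       # 12 attention heads
    max_position_embeddings=512,  # 512 sequence length
    vocab_size=32000,             # 32k token WordPiece vocabulary
)

model = BertForMaskedLM(config)

# The 128 mini-batch size belongs to the training loop (e.g. the data loader
# or trainer arguments), not to the model configuration itself.
```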