Commit 5b8b83b by patrickvonplaten · 1 Parent(s): 9df3444

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -38,14 +38,14 @@ A sequence of word embeddings is therefore processed sequentially by each transf
 
 The *conventional* T5 architectures are summarized in the following table.
 
-| Model | NL | dff | dmodel | dkv | NH | #Params|
+| Model | nl | ff | dm | kv | nh | #Params|
 | ----| ---- | ---- | ---- | ---- | ---- | ----|
 | Tiny | 4/4 | 1024 | 256 | 32 | 4 | 16M|
 | Mini | 4/4 | 1536 | 384 | 32 | 8 | 31M|
 | Small | 6/6 | 2048 | 512 | 32 | 8 | 60M|
 | Base | 12/12 | 3072 | 768 | 64 | 12 | 220M|
 | Large | 24/24 | 4096 | 1024 | 64 | 16 | 738M|
-**| XL | 24/24 | 16384 | 1024 | 128 | 32 | 3B|**
+| **XL** | **24/24** | **16384** | **1024** | **128** | **32** | **3B**|
 | XXL | 24/24 | 65536 | 1024 | 128 | 128 | 11B|
 
 This
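
For readers who want to work with the updated table programmatically, the following is a minimal sketch that parses the new (`+`) version of the table into records. It assumes, based on the old header names (`NL`, `dff`, `dmodel`, `dkv`, `NH`), that `nl` is the number of layers written as encoder/decoder, `ff` the feed-forward dimension, `dm` the model dimension, `kv` the per-head key/value dimension, and `nh` the number of attention heads. The `T5Arch` class, its field names, and `parse_table` are hypothetical helpers introduced only for illustration; they are not part of the README or of any library.

```python
from dataclasses import dataclass

# Table rows copied verbatim from the "+" side of the diff.
TABLE = """\
| Model | nl | ff | dm | kv | nh | #Params|
| ----| ---- | ---- | ---- | ---- | ---- | ----|
| Tiny | 4/4 | 1024 | 256 | 32 | 4 | 16M|
| Mini | 4/4 | 1536 | 384 | 32 | 8 | 31M|
| Small | 6/6 | 2048 | 512 | 32 | 8 | 60M|
| Base | 12/12 | 3072 | 768 | 64 | 12 | 220M|
| Large | 24/24 | 4096 | 1024 | 64 | 16 | 738M|
| **XL** | **24/24** | **16384** | **1024** | **128** | **32** | **3B**|
| XXL | 24/24 | 65536 | 1024 | 128 | 128 | 11B|
"""


@dataclass(frozen=True)
class T5Arch:
    """Hypothetical record for one table row (field names are assumptions)."""
    name: str
    encoder_layers: int  # first half of "nl" (assumed encoder/decoder split)
    decoder_layers: int  # second half of "nl"
    d_ff: int            # "ff"
    d_model: int         # "dm"
    d_kv: int            # "kv"
    num_heads: int       # "nh"
    params: str          # "#Params", kept as the README's rounded string

    @property
    def attention_inner_dim(self) -> int:
        # Total width of the attention projections: heads times per-head size.
        return self.num_heads * self.d_kv


def parse_table(text: str) -> dict:
    """Parse the pipe table into {model name: T5Arch}, ignoring bold markers."""
    archs = {}
    # Skip the header row and the separator row.
    for line in text.strip().splitlines()[2:]:
        cells = [c.strip().strip("*") for c in line.strip().strip("|").split("|")]
        name, nl, ff, dm, kv, nh, params = cells
        enc, dec = (int(n) for n in nl.split("/"))
        archs[name] = T5Arch(name, enc, dec, int(ff), int(dm), int(kv), int(nh), params)
    return archs


ARCHS = parse_table(TABLE)

if __name__ == "__main__":
    for arch in ARCHS.values():
        print(arch.name, arch.d_model, arch.attention_inner_dim, arch.params)
```

Under these assumptions, the sketch also makes one property of the table easy to inspect: `nh * kv` need not equal `dm`. For example, the XL row gives an attention inner dimension of 32 × 128 = 4096 against a model dimension of 1024.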