Paper: [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864)
**Note:** this is the original Megatron-DeepSpeed checkpoint, including optimizer states.
A GPT-2 language model for European languages (EU-24 plus Ukrainian). The model follows the same architecture as OpenAI's GPT-2, apart from using rotary instead of learned positional embeddings.
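For reference, below is a minimal sketch of rotary position embedding as described in the RoFormer paper. The function name, the interleaved channel pairing, and the `base` value of 10000 follow the paper's formulation and are illustrative; this is not the exact implementation used in this checkpoint.

```python
import torch

def rotary_embedding(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embedding to x of shape (seq_len, dim).

    Each consecutive pair of channels is rotated by an angle that grows
    with the token position, so position is encoded multiplicatively
    rather than via a learned additive embedding.
    """
    seq_len, dim = x.shape
    # Per-pair rotation frequencies, as in the RoFormer paper.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Angle for every (position, frequency) pair: shape (seq_len, dim/2).
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]      # split channels into pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin   # 2-D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Applied to query and key vectors before attention, so that relative
# position offsets fall out of the dot product automatically.
q = torch.randn(128, 64)                 # (seq_len, head_dim)
q_rot = rotary_embedding(q)
```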
Included languages: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Irish, Croatian, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish, and Ukrainian.
| Language | Ratio |
|---|---|
| bg | 5.92% |
| cs | 4.77% |
| da | 2.19% |
| de | 7.36% |
| el | 8.60% |
| en | 10.11% |
| es | 6.57% |
| et | 1.67% |
| fi | 2.70% |
| fr | 7.18% |
| ga | 0.25% |
| hr | 1.09% |
| hu | 6.38% |
| it | 5.80% |
| lt | 2.01% |
| lv | 1.76% |
| mt | 1.49% |
| nl | 5.20% |
| pl | 4.82% |
| pt | 4.64% |
| ro | 2.93% |
| sk | 2.03% |
| sl | 1.54% |
| sv | 3.00% |
License: MIT