cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly loaded via transformers. At this point, most decoder and encoder-decoder architectures are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.18