cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy
  Optimization (PPO). The library is built on top of the transformers library by
  \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly
  loaded via transformers. At this point, most decoder and encoder-decoder architectures
  are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.18