# Iterative Trainer

[![](https://img.shields.io/badge/All_models-Iterative_SFT-blue)](https://huggingface.co/models?other=iterative-sft,trl)

Iterative fine-tuning is a training method that enables performing custom actions (for example, generation and filtering) between optimization steps. In TRL we provide an easy-to-use API to fine-tune your models iteratively in just a few lines of code.

## Quickstart

To get started quickly, you can either pass a model identifier or a pre-instantiated model to the trainer:

```python
from trl import IterativeSFTConfig, IterativeSFTTrainer

# Using a model identifier
trainer = IterativeSFTTrainer(
    "facebook/opt-350m",
    args=IterativeSFTConfig(
        max_length=512,
        output_dir="./output",
    ),
)

# Or using a pre-instantiated model
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

trainer = IterativeSFTTrainer(
    model,
    args=IterativeSFTConfig(
        max_length=512,
        output_dir="./output",
    ),
    processing_class=tokenizer,
)
```

## Usage

The [`IterativeSFTTrainer`] supports two ways of providing input data to the `step` function:

### Using a list of tensors as input:

```python
inputs = {
    "input_ids": input_ids,
    "attention_mask": attention_mask,
}

trainer.step(**inputs)
```

### Using a list of strings as input:

```python
inputs = {
    "texts": texts,
    "texts_labels": texts_labels,  # Optional, defaults to texts
}

trainer.step(**inputs)
```

For causal language models, labels will automatically be created from `input_ids` or from `texts`. When using sequence-to-sequence models you will have to provide your own `labels` or `texts_labels`.

## Configuration

The [`IterativeSFTConfig`] class provides several parameters to customize the training:

```python
from trl import IterativeSFTConfig

config = IterativeSFTConfig(
    # Model initialization parameters
    model_init_kwargs={"torch_dtype": "bfloat16"},

    # Data preprocessing parameters
    max_length=512,
    truncation_mode="keep_end",

    # Training parameters
    output_dir="./output",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    max_steps=1000,
    logging_steps=10,
    save_steps=100,
    optim="adamw_torch",
    report_to="wandb",
)
```

### Model Initialization

You can control how the model is initialized by passing keyword arguments to `model_init_kwargs`:

```python
config = IterativeSFTConfig(
    model_init_kwargs={
        "torch_dtype": "bfloat16",
        "device_map": "auto",
        "trust_remote_code": True,
    }
)
```

### Data Preprocessing

The trainer supports two truncation modes:

- `keep_end`: Truncates from the start of the sequence, keeping the end.
- `keep_start`: Truncates from the end of the sequence, keeping the start.

```python
config = IterativeSFTConfig(
    max_length=512,
    truncation_mode="keep_end",  # or "keep_start"
)
```

### Training Optimization

You can optimize CUDA cache usage for more memory-efficient training:

```python
config = IterativeSFTConfig(
    optimize_device_cache=True,
)
```

## IterativeSFTTrainer

[[autodoc]] IterativeSFTTrainer

## IterativeSFTConfig

[[autodoc]] IterativeSFTConfig
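
## Example: an iterative generate-filter-train loop

To illustrate how the pieces above fit together, here is a minimal sketch of one possible iterative loop: generate completions, filter them, and pass the survivors to `trainer.step`. It assumes the `model`, `tokenizer`, and `trainer` from the Quickstart; the `prompts` list and the `is_good_completion` filter are hypothetical placeholders for your own data and selection criterion (for example a reward model or heuristic checks).

```python
import torch

prompts = ["The capital of France is", "2 + 2 equals"]  # placeholder prompts

def is_good_completion(text: str) -> bool:
    # Hypothetical filter: keep non-empty completions under 200 characters.
    return 0 < len(text) < 200

tokenizer.padding_side = "left"  # left-pad so batched generation works for causal LMs

for iteration in range(10):
    # 1. Generate completions with the current model.
    batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    with torch.no_grad():
        generations = model.generate(**batch, max_new_tokens=64, do_sample=True)
    texts = tokenizer.batch_decode(generations, skip_special_tokens=True)

    # 2. Filter the generations with your own criterion.
    kept = [text for text in texts if is_good_completion(text)]
    if not kept:
        continue

    # 3. Run an optimization step on the surviving texts.
    trainer.step(texts=kept)
```

This is a sketch rather than a recipe: the filtering step is where iterative methods typically differ (rejection sampling, reward thresholds, heuristic checks, and so on), and you can replace it with whatever selection logic your use case requires.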