# Quickstart with Python

AutoTrain is a library that allows you to train state-of-the-art models on Hugging Face Spaces or locally. 
It provides a simple interface for training models on tasks such as LLM finetuning, text classification, 
image classification, object detection, and more.

In this quickstart guide, we will show you how to train a model using AutoTrain in Python.

## Getting Started

AutoTrain can be installed using pip:

```bash
$ pip install autotrain-advanced
```
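
To confirm the installation worked, a quick sanity check (this assumes the `autotrain` package exposes `__version__`, as most Hugging Face libraries do):

```python
# Minimal install check; assumes the package exposes __version__.
import autotrain

print(autotrain.__version__)
```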

The example below shows how to finetune an LLM using AutoTrain in Python:

```python
import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject


params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",  # dataset on the Hugging Face Hub
    chat_template="tokenizer",  # use the chat template shipped with the tokenizer
    text_column="messages",
    train_split="train",
    trainer="sft",  # supervised finetuning
    epochs=3,
    batch_size=1,
    lr=1e-5,
    peft=True,  # parameter-efficient finetuning (LoRA)
    quantization="int4",
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,  # merge the LoRA adapter into the base model after training
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username=os.environ.get("HF_USERNAME"),
    token=os.environ.get("HF_TOKEN"),
)


backend = "local"
project = AutoTrainProject(params=params, backend=backend, process=True)
project.create()
```

In this example, we finetune the `meta-llama/Llama-3.2-1B-Instruct` model on the `HuggingFaceH4/no_robots` dataset 
for 3 epochs with a batch size of 1 and a learning rate of `1e-5`.
We use the `paged_adamw_8bit` optimizer and the `cosine` scheduler, along with `bf16` mixed-precision training 
and gradient accumulation over 8 steps, giving an effective batch size of 8.
Setting `backend="local"` runs the training on the current machine.
The final model will be pushed to the Hugging Face Hub after training.

To train the model, save the code above to a file (for example, `train.py`) and run:

```bash
$ export HF_USERNAME=<your-hf-username>
$ export HF_TOKEN=<your-hf-write-token>
$ python train.py
```

This will create a new project directory with the name `autotrain-llama32-1b-finetune` and start the training process.
Once the training is complete, the model will be pushed to the Hugging Face Hub.

`HF_TOKEN` and `HF_USERNAME` are only required if you want to push the model to the Hub, or if you are accessing a gated model or dataset.
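
Because `merge_adapter=True` and `push_to_hub=True`, the finished model lives on the Hub as a standalone checkpoint and can be loaded like any other model. A minimal inference sketch, assuming the model was pushed to `<your-hf-username>/autotrain-llama32-1b-finetune` (the `project_name` above, under your account):

```python
from transformers import pipeline

# Hypothetical repo id: the project_name above, pushed under your account.
pipe = pipeline(
    "text-generation",
    model="<your-hf-username>/autotrain-llama32-1b-finetune",
)

messages = [{"role": "user", "content": "Explain LoRA in one sentence."}]
# With chat-style input, generated_text is the full message list;
# the last entry is the newly generated assistant turn.
print(pipe(messages, max_new_tokens=64)[0]["generated_text"][-1]["content"])
```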

## AutoTrainProject Class

[[autodoc]] project.AutoTrainProject

## Parameters

### Text Tasks

[[autodoc]] trainers.clm.params.LLMTrainingParams

[[autodoc]] trainers.sent_transformers.params.SentenceTransformersParams

[[autodoc]] trainers.seq2seq.params.Seq2SeqParams

[[autodoc]] trainers.token_classification.params.TokenClassificationParams

[[autodoc]] trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams

[[autodoc]] trainers.text_classification.params.TextClassificationParams

[[autodoc]] trainers.text_regression.params.TextRegressionParams
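
Each of these parameter classes plugs into the same `AutoTrainProject` flow shown above. For instance, a minimal text-classification sketch (the `stanfordnlp/imdb` dataset and its `text`/`label` column names are assumptions for illustration; see `TextClassificationParams` above for the defaults):

```python
from autotrain.params import TextClassificationParams
from autotrain.project import AutoTrainProject

# A minimal sketch: finetune a BERT classifier locally.
# Dataset and column names here are illustrative assumptions.
params = TextClassificationParams(
    model="google-bert/bert-base-uncased",
    data_path="stanfordnlp/imdb",
    text_column="text",
    target_column="label",
    train_split="train",
    epochs=3,
    batch_size=8,
    lr=2e-5,
    project_name="autotrain-bert-imdb-sketch",
    log="tensorboard",
    push_to_hub=False,  # keep the run local; set True plus username/token to upload
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```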

### Image Tasks

[[autodoc]] trainers.image_classification.params.ImageClassificationParams

[[autodoc]] trainers.image_regression.params.ImageRegressionParams

[[autodoc]] trainers.object_detection.params.ObjectDetectionParams


### Tabular Tasks

[[autodoc]] trainers.tabular.params.TabularParams