# Quickstart with Python
AutoTrain is a library that allows you to train state-of-the-art models on Hugging Face Spaces or locally.
It provides a simple, easy-to-use interface for training models on tasks such as LLM finetuning, text classification,
image classification, object detection, and more.
In this quickstart guide, we will show you how to train a model using AutoTrain in Python.
## Getting Started
AutoTrain can be installed using pip:
```bash
$ pip install autotrain-advanced
```
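To verify the installation, you can invoke the bundled CLI (the exact help output varies between versions):
```bash
$ autotrain --help
```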
The example code below shows how to finetune an LLM using AutoTrain in Python:
```python
import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",  # use the chat template bundled with the model's tokenizer
    text_column="messages",
    train_split="train",
    trainer="sft",  # supervised finetuning
    epochs=3,
    batch_size=1,
    lr=1e-5,
    peft=True,  # parameter-efficient finetuning (LoRA)
    quantization="int4",
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,  # merge the LoRA adapter into the base model after training
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username=os.environ.get("HF_USERNAME"),
    token=os.environ.get("HF_TOKEN"),
)

backend = "local"
project = AutoTrainProject(params=params, backend=backend, process=True)
project.create()
```
In this example, we finetune the `meta-llama/Llama-3.2-1B-Instruct` model on the `HuggingFaceH4/no_robots` dataset
for 3 epochs, with a batch size of 1 and a learning rate of `1e-5`.
The run uses the `paged_adamw_8bit` optimizer, the `cosine` scheduler, `bf16` mixed precision, and gradient accumulation over 8 steps.
After training, the final model is pushed to the Hugging Face Hub.
To train the model, save the example code above as `train.py`, export your credentials, and run it:
```bash
$ export HF_USERNAME=<your-hf-username>
$ export HF_TOKEN=<your-hf-write-token>
$ python train.py
```
This will create a new project directory with the name `autotrain-llama32-1b-finetune` and start the training process.
Once the training is complete, the model will be pushed to the Hugging Face Hub.
Your `HF_TOKEN` and `HF_USERNAME` are only required if you want to push the model to the Hub or if you are accessing a gated model or dataset.
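If you only want a local run without pushing anything to the Hub, a minimal sketch looks like the following (assuming the `LLMTrainingParams` defaults are acceptable for the options omitted here):
```python
from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

# Sketch: local-only training, no Hub credentials needed.
params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    trainer="sft",
    project_name="autotrain-llama32-1b-finetune",
    push_to_hub=False,  # keep the trained model on disk only
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```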
## AutoTrainProject Class
[[autodoc]] project.AutoTrainProject
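Beyond `backend = "local"`, `AutoTrainProject` can also target Hugging Face Spaces hardware. A sketch, reusing the `params` object from the quickstart example above; the backend identifier `spaces-a10g-large` is an assumption, and the set of valid backend strings depends on your installed AutoTrain version:
```python
# Sketch: run the same training job on Spaces hardware instead of locally.
# "spaces-a10g-large" is an assumed backend identifier; verify it against
# the backends supported by your AutoTrain version.
project = AutoTrainProject(params=params, backend="spaces-a10g-large", process=True)
project.create()  # provisions the job; HF credentials are required for this backend
```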
## Parameters
### Text Tasks
[[autodoc]] trainers.clm.params.LLMTrainingParams
[[autodoc]] trainers.sent_transformers.params.SentenceTransformersParams
[[autodoc]] trainers.seq2seq.params.Seq2SeqParams
[[autodoc]] trainers.token_classification.params.TokenClassificationParams
[[autodoc]] trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams
[[autodoc]] trainers.text_classification.params.TextClassificationParams
[[autodoc]] trainers.text_regression.params.TextRegressionParams
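The training flow is identical across tasks: construct the matching params object and pass it to `AutoTrainProject`. A sketch for text classification, where the model, dataset, and column names are illustrative assumptions rather than recommendations:
```python
from autotrain.params import TextClassificationParams
from autotrain.project import AutoTrainProject

# Sketch: text classification with the same project flow.
# Model, dataset, and column names are assumptions for illustration.
params = TextClassificationParams(
    model="google-bert/bert-base-uncased",
    data_path="stanfordnlp/imdb",
    text_column="text",
    target_column="label",
    epochs=3,
    batch_size=8,
    lr=3e-5,
    project_name="autotrain-text-classification-sketch",
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```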
### Image Tasks
[[autodoc]] trainers.image_classification.params.ImageClassificationParams
[[autodoc]] trainers.image_regression.params.ImageRegressionParams
[[autodoc]] trainers.object_detection.params.ObjectDetectionParams
### Tabular Tasks
[[autodoc]] trainers.tabular.params.TabularParams