# Frequently Asked Questions
## Are my data and models secure?
Yes, your data and models are secure. AutoTrain uses the Hugging Face Hub to store your data and models.
All your data and models are uploaded to your Hugging Face account as private repositories and are only accessible by you.
Read more about security [here](https://huggingface.co/docs/hub/en/security).
## Do you upload my data to the Hugging Face Hub?
AutoTrain will not upload your dataset to the Hub if you are using the local backend or training in the same space.
AutoTrain will push your dataset to the Hub if you are using features like DGX Cloud, or when using the local CLI to train on Hugging Face's infrastructure.
You can safely remove the dataset from the Hub after training is complete.
If uploaded, the dataset is stored in your Hugging Face account as a private repository and is only accessible by you
and the training process. It is not used once the training is complete.
## My training space paused for no reason mid-training
AutoTrain training spaces pause themselves after training is done (or has failed). This is done to save resources and costs.
If your training failed, you can still see the space logs to find out what went wrong. Note: you won't be able to retrieve the logs if you restart the space.
Another reason for the space to pause is the space's sleep time kicking in. If you have a long-running training job, set the sleep time to a much higher value.
The space will pause itself after training is done in any case, saving you costs.
## I get the error `Your installed package nvidia-ml-py is corrupted. Skip patch functions`
This error can be safely ignored. It is a warning from the `nvitop` library and does not affect the functionality of AutoTrain.
## I get a 409 conflict error when using the UI
This error occurs when you try to create a project with the same name as an existing project.
To resolve it, either delete the existing project or create a new project with a different name.
This error can also occur when you try to train a model while another model is already training in the same space or locally.
## The model I want to use doesn't show up in the model selection dropdown
If the model you want to use is not available in the model selection dropdown,
you can add it to the `AUTOTRAIN_CUSTOM_MODELS` environment variable in the space settings.
For example, to add the `xxx/yyy` model, go to the space settings, create a variable named `AUTOTRAIN_CUSTOM_MODELS`,
and set its value to `xxx/yyy`.
You can also pass the model name as a query parameter in the URL. For example, to use the `xxx/yyy` model,
open the URL `https://huggingface.co/spaces/your_autotrain_space?custom_models=xxx/yyy`.
## How do I use AutoTrain locally?
AutoTrain can be used locally by installing the AutoTrain Advanced PyPI package.
You can read more in the *Use AutoTrain Locally* section.
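As a minimal sketch (the exact flags and defaults may differ between versions; check `autotrain app --help`):

```bash
# install the package and start the UI locally
pip install autotrain-advanced
autotrain app --host 127.0.0.1 --port 8000
```

The UI is then reachable in your browser at `http://127.0.0.1:8000`.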
## Can I run AutoTrain on Colab?
To start the UI on Colab, you can simply click on the following link:
[Open in Colab](https://colab.research.google.com/github/huggingface/autotrain-advanced/blob/main/colabs/AutoTrain.ipynb)
Please note that to run the app on Colab, you will need an ngrok token. You can get one by signing up for free on [ngrok](https://ngrok.com/).
This is because Colab does not allow exposing ports to the internet directly.
To use the CLI instead on Colab, follow the same instructions as for using AutoTrain locally.
## Does AutoTrain have a Docker image?
Yes, AutoTrain has a Docker image.
You can find it on Docker Hub [here](https://hub.docker.com/r/huggingface/autotrain-advanced).
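A hypothetical invocation is sketched below; the exposed port, the `HF_TOKEN` variable, and the startup command are assumptions here, so check the image documentation before relying on them:

```bash
# pull the image and start the UI inside the container
# (port mapping and env variable name are assumptions, not verified against the image)
docker pull huggingface/autotrain-advanced:latest
docker run -it --gpus all -p 7860:7860 \
  -e HF_TOKEN=your_hf_write_token \
  huggingface/autotrain-advanced:latest \
  autotrain app --host 0.0.0.0 --port 7860
```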
## Is Windows supported?
Unfortunately, AutoTrain does not officially support Windows at the moment.
You can try using WSL (Windows Subsystem for Linux) or the Docker image to run AutoTrain on Windows.
## "--project-name" argument can not be set as a directory | |
`--project-name` argument should not be a path. it will be created where autotrain command is run. | |
This parameter must be alphanumeric and can contain hypens. | |
## I am getting a `config.json` not found error
This means you have trained an adapter model (`peft=true`), which does not generate a `config.json`.
This is not a problem: the model can still be loaded with `AutoModelForCausalLM` or with Inference Endpoints.
If you want to merge the adapter weights with the base model, you can use `autotrain tools`. Please read about it in the *Miscellaneous* section.
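As a rough sketch of the merge step (the subcommand and flag names below are best-effort assumptions, so confirm them with `autotrain tools --help`):

```bash
# merge a trained PEFT adapter back into its base model
# (subcommand and flags are assumptions; verify with --help)
autotrain tools merge-llm-adapter \
  --base-model-path xxx/base-model \
  --adapter-path xxx/yyy \
  --output-folder merged-model
```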
## Does AutoTrain support multi-GPU training?
Yes, AutoTrain supports multi-GPU training.
AutoTrain detects on its own whether the command is running on a multi-GPU setup and uses
DDP (DistributedDataParallel) if the number of GPUs is greater than 1 and less than 4, and DeepSpeed if the number of GPUs is 4 or more.
## How can I use a Hub dataset with multiple configs?
If your Hub dataset has multiple configs, you can use the `train_split` parameter to specify both the config and the split.
For example, the dataset [sentence-transformers/all-nli](https://huggingface.co/datasets/sentence-transformers/all-nli)
has multiple configs: `pair`, `pair-class`, `pair-score` and `triplet`.
To use the `train` split of the `pair-class` config, set `train_split` to `pair-class:train` in the UI or the CLI / config file.
An example config is shown below:
```yaml
data:
  path: sentence-transformers/all-nli
  train_split: pair-class:train
  valid_split: pair-class:test
  column_mapping:
    sentence1_column: premise
    sentence2_column: hypothesis
    target_column: label
```