File size: 6,757 Bytes
d853002
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NQUk3Y0WwYZ4"
      },
      "source": [
        "# 🤗 x 🦾: Training SmolVLA with LeRobot Notebook\n",
        "\n",
        "Welcome to the **LeRobot SmolVLA training notebook**! This notebook provides a ready-to-run setup for training imitation learning policies using the [🤗 LeRobot](https://github.com/huggingface/lerobot) library.\n",
        "\n",
        "In this example, we train an `SmolVLA` policy using a dataset hosted on the [Hugging Face Hub](https://huggingface.co/), and optionally track training metrics with [Weights & Biases (wandb)](https://wandb.ai/).\n",
        "\n",
        "## ⚙️ Requirements\n",
        "- A Hugging Face dataset repo ID containing your training data (`--dataset.repo_id=YOUR_USERNAME/YOUR_DATASET`)\n",
        "- Optional: A [wandb](https://wandb.ai/) account if you want to enable training visualization\n",
        "- Recommended: GPU runtime (e.g., NVIDIA A100) for faster training\n",
        "\n",
        "## ⏱️ Expected Training Time\n",
        "Training with the `SmolVLA` policy for 20,000 steps typically takes **about 5 hours on an NVIDIA A100** GPU. On less powerful GPUs or CPUs, training may take significantly longer!\n",
        "\n",
        "## Example Output\n",
        "Model checkpoints, logs, and training plots will be saved to the specified `--output_dir`. If `wandb` is enabled, progress will also be visualized in your wandb project dashboard.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MOJyX0CnwA5m"
      },
      "source": [
        "## Install conda\n",
        "This cell uses `condacolab` to bootstrap a full Conda environment inside Google Colab.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "QlKjL1X5t_zM"
      },
      "outputs": [],
      "source": [
        "!pip install -q condacolab\n",
        "import condacolab\n",
        "condacolab.install()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DxCc3CARwUjN"
      },
      "source": [
        "## Install LeRobot\n",
        "This cell clones the `lerobot` repository from Hugging Face, installs FFmpeg (version 7.1.1), and installs the package in editable mode.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dgLu7QT5tUik"
      },
      "outputs": [],
      "source": [
        "!git clone https://github.com/huggingface/lerobot.git\n",
        "!conda install ffmpeg=7.1.1 -c conda-forge\n",
        "!cd lerobot && pip install -e ."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Q8Sn2wG4wldo"
      },
      "source": [
        "## Weights & Biases login\n",
        "This cell logs you into Weights & Biases (wandb) to enable experiment tracking and logging."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "PolVM_movEvp"
      },
      "outputs": [],
      "source": [
        "!wandb login"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zTWQAgX9xseE"
      },
      "source": [
        "## Install SmolVLA dependencies"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DiHs0BKwxseE"
      },
      "outputs": [],
      "source": [
        "!cd lerobot && pip install -e \".[smolvla]\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IkzTo4mNwxaC"
      },
      "source": [
        "## Start training SmolVLA with LeRobot\n",
        "\n",
        "This cell runs the `train.py` script from the `lerobot` library to train a robot control policy.  \n",
        "\n",
        "Make sure to adjust the following arguments to your setup:\n",
        "\n",
        "1. `--dataset.repo_id=YOUR_HF_USERNAME/YOUR_DATASET`:  \n",
        "   Replace this with the Hugging Face Hub repo ID where your dataset is stored, e.g., `pepijn223/il_gym0`.\n",
        "\n",
        "2. `--batch_size=64`: means the model processes 64 training samples in parallel before doing one gradient update. Reduce this number if you have a GPU with low memory.\n",
        "\n",
        "3. `--output_dir=outputs/train/...`:  \n",
        "   Directory where training logs and model checkpoints will be saved.\n",
        "\n",
        "4. `--job_name=...`:  \n",
        "   A name for this training job, used for logging and Weights & Biases.\n",
        "\n",
        "5. `--policy.device=cuda`:  \n",
        "   Use `cuda` if training on an NVIDIA GPU. Use `mps` for Apple Silicon, or `cpu` if no GPU is available.\n",
        "\n",
        "6. `--wandb.enable=true`:  \n",
        "   Enables Weights & Biases for visualizing training progress. You must be logged in via `wandb login` before running this."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ZO52lcQtxseE"
      },
      "outputs": [],
      "source": [
        "!cd lerobot && python lerobot/scripts/train.py \\\n",
        "  --policy.path=lerobot/smolvla_base \\\n",
        "  --dataset.repo_id=${HF_USER}/mydataset \\\n",
        "  --batch_size=64 \\\n",
        "  --steps=20000 \\\n",
        "  --output_dir=outputs/train/my_smolvla \\\n",
        "  --job_name=my_smolvla_training \\\n",
        "  --policy.device=cuda \\\n",
        "  --wandb.enable=true"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2PBu7izpxseF"
      },
      "source": [
        "## Login into Hugging Face Hub\n",
        "Now after training is done login into the Hugging Face hub and upload the last checkpoint"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8yu5khQGIHi6"
      },
      "outputs": [],
      "source": [
        "!huggingface-cli login"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zFMLGuVkH7UN"
      },
      "outputs": [],
      "source": [
        "!huggingface-cli upload ${HF_USER}/my_smolvla \\\n",
        "  /content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "gpuType": "A100",
      "machine_shape": "hm",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}