lilkm committed on
Commit 5ad528e · verified · 1 Parent(s): 01239f1

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +87 -0
  2. config.json +13 -0
  3. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,87 @@
+ ---
+ license: mit
+ tags:
+ - robotics
+ - imitation-learning
+ - octo
+ - pytorch
+ ---
+
+ # Octo-Base PyTorch Model
+
+ This is the octo-base model converted to PyTorch format.
+
+ ## Model Description
+
+ Octo is a generalist robot policy trained on diverse robot manipulation tasks.
+
+ - **Paper**: [Octo: An Open-Source Generalist Robot Policy](https://arxiv.org/pdf/2405.12213)
+ - **Original JAX Implementation**: [octo-models/octo](https://github.com/octo-models/octo)
+ - **Original PyTorch Implementation**: [emb-ai/octo-pytorch](https://github.com/emb-ai/octo-pytorch)
+ - **lil'km Implementation**: [s1lent4gnt/octo-pytorch](https://github.com/s1lent4gnt/octo-pytorch)
+ - **Model Size**: octo-base
+
+ ## Usage
+
+ ### Loading the pretrained model
+
+ ```python
+ import torch
+ from safetensors.torch import load_file
+ import json
+ from octo_pytorch.model import OctoModel
+ from octo_pytorch.model.configuration_octo import OctoConfig
+
+ # Load config
+ with open('config.json', 'r') as f:
+     config_dict = json.load(f)
+
+ # Initialize model configuration
+ config = OctoConfig(model_name=config_dict['model_name'])
+
+ # Initialize model
+ model = OctoModel(config)
+
+ # Load weights (T5 encoder weights will be loaded automatically from HuggingFace Hub)
+ state_dict = load_file('model.safetensors')
+ model.load_state_dict(state_dict, strict=False)  # strict=False because T5 weights are not in the file
+ ```
+
+ ### Alternative: Direct loading from HuggingFace Hub
+
+ ```python
+ from octo_pytorch.model import OctoModel
+
+ # Load model directly from HuggingFace Hub
+ model = OctoModel.from_pretrained('lilkm/octo-small-test')
+ ```
+
+ **Note**: The T5-base language encoder weights are not included in this upload to save space. They will be automatically downloaded from HuggingFace Hub when you initialize the model.
+
+ ### Model Architecture
+
+ - **Transformer**: 12 layers, 768 dim, 12 heads
+ - **Vision Encoder**: Custom CNN (SmallStem16)
+ - **Language Encoder**: T5-Base
+ - **Action Head**: Diffusion head with an action horizon of 4 steps (20 diffusion steps)
+ - **Max Horizon**: 10 timesteps
+ - **Action Dimension**: 7
+
+ ## Files
+
+ - `model.safetensors`: Model weights in safetensors format
+ - `config.json`: Model configuration
+ - `dataset_statistics.npy`: Dataset statistics used for normalization (if available)
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @article{octo_2023,
+   title={Octo: An Open-Source Generalist Robot Policy},
+   author={Octo Model Team et al.},
+   journal={arXiv preprint arXiv:2405.12213},
+   year={2024}
+ }
+ ```
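
The README's first loading snippet passes `strict=False` because the T5 weights are absent from `model.safetensors`. A non-strict load also hides any other missing or unexpected keys, so it can be worth checking what was skipped. The sketch below is one way to do that under stated assumptions: the helper `check_partial_load` and the `text_encoder.` prefix are hypothetical and not taken from `octo_pytorch`; only the existence of `missing_keys` and `unexpected_keys` on the value returned by PyTorch's `load_state_dict` is relied on.

```python
from typing import Iterable, List, Tuple


def check_partial_load(
    missing_keys: Iterable[str],
    unexpected_keys: Iterable[str],
    allowed_missing_prefixes: Tuple[str, ...],
) -> List[str]:
    """Return keys that point to a real problem after a non-strict load.

    Missing keys are tolerated only if they start with one of the allowed
    prefixes (for example, an encoder that is loaded from elsewhere).
    Unexpected keys are always reported.
    """
    problems = [
        key for key in missing_keys
        if not key.startswith(allowed_missing_prefixes)
    ]
    problems.extend(unexpected_keys)
    return problems


# Toy key names for illustration only; the real parameter names and the
# prefix used for the T5 encoder are assumptions, not read from the model.
missing = ["text_encoder.block.0.weight", "transformer.layer0.bias"]
unexpected: List[str] = []
problems = check_partial_load(missing, unexpected, ("text_encoder.",))
print(problems)
```

With a real model, the two lists would come from `result = model.load_state_dict(state_dict, strict=False)` as `result.missing_keys` and `result.unexpected_keys`; an empty `problems` list would suggest that only the deliberately omitted encoder weights were skipped.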
config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "model_type": "octo",
+   "model_name": "octo-base",
+   "token_embedding_size": 768,
+   "num_layers": 12,
+   "num_heads": 12,
+   "mlp_dim": 3072,
+   "max_horizon": 10,
+   "repeat_task_tokens": true,
+   "action_horizon": 4,
+   "action_dim": 7,
+   "diffusion_steps": 20
+ }
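
As a quick sanity check on the values above, the following sketch parses the same JSON and derives a few quantities from it. The interpretation is an assumption rather than something the repository states: it treats `token_embedding_size / num_heads` as the per-head width and `action_horizon * action_dim` as the number of values in one predicted action chunk. The variable names are illustrative.

```python
import json

# The configuration added in this commit, copied verbatim.
CONFIG_JSON = """
{
  "model_type": "octo",
  "model_name": "octo-base",
  "token_embedding_size": 768,
  "num_layers": 12,
  "num_heads": 12,
  "mlp_dim": 3072,
  "max_horizon": 10,
  "repeat_task_tokens": true,
  "action_horizon": 4,
  "action_dim": 7,
  "diffusion_steps": 20
}
"""

config = json.loads(CONFIG_JSON)

# The embedding width must split evenly across attention heads.
head_dim, remainder = divmod(config["token_embedding_size"], config["num_heads"])
if remainder != 0:
    raise ValueError("token_embedding_size is not divisible by num_heads")

# Ratio of the feed-forward width to the embedding width.
mlp_ratio = config["mlp_dim"] / config["token_embedding_size"]

# Assumed size of one action chunk: horizon steps times action dimension.
values_per_chunk = config["action_horizon"] * config["action_dim"]

print(head_dim, mlp_ratio, values_per_chunk)
```

These numbers agree with the README's architecture summary (12 layers, 768 dim, 12 heads, action dimension 7).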
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f8c8e2f7c7b97b889d413006801ea351cb2213060a69b830ed7b2f83abbceaeb
+ size 371352224
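
The three lines above are a Git LFS pointer, not the weights themselves: they record the hash algorithm, the digest, and the byte size of the real `model.safetensors`. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this example, not part of any library.

```python
import hashlib
from pathlib import Path

# The pointer contents from this commit, copied verbatim.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f8c8e2f7c7b97b889d413006801ea351cb2213060a69b830ed7b2f83abbceaeb
size 371352224
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dictionary."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and hash against a parsed pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so large weight files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would return `True` only if both the size and the SHA-256 digest agree with the pointer.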