- [Data Preparation](#data-preparation)
- [Download Omni3D json](#download-omni3d-json)
- [Download Individual Datasets](#download-individual-datasets)
- [Data Usage](#data-usage)
- [Coordinate System](#coordinate-system)
- [Annotation Format](#annotation-format)
- [Example Loading Data](#example-loading-data)
# Data Preparation
The Omni3D dataset comprises 6 datasets that have been pre-processed into the same annotation format and camera coordinate system. To use a subset or the full dataset, you must download:
1. The processed Omni3D json files
2. RGB images from each dataset separately
## Download Omni3D json
Run
```
sh datasets/Omni3D/download_omni3d_json.sh
```
to download and extract the Omni3D train, val and test json annotation files.
## Download Individual Datasets
Below are the instructions for setting up each individual dataset. It is recommended to download only the data you plan to use.
### KITTI
Download the left color images from [KITTI's official website](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d). Unzip or symlink the images under the root `./Omni3D/` so that the folder structure matches the layout below. Only the `image_2` folder is required.
```bash
datasets/KITTI_object
└── training
    └── image_2
```
### nuScenes
Download the trainval images from the [official nuScenes website](https://www.nuscenes.org/nuscenes#download). Unzip or symlink the images under the root `./Omni3D/` so that the folder structure matches the layout below. Only the `CAM_FRONT` folder is required.
```bash
datasets/nuScenes/samples
└── samples
    └── CAM_FRONT
```
### Objectron
Run
```
sh datasets/objectron/download_objectron_images.sh
```
to download and extract the Objectron pre-processed images (~24 GB).
### SUN RGB-D
Download the "SUNRGBD V1" images from [SUN RGB-D's official website](https://rgbd.cs.princeton.edu/). Unzip or symlink the images under the root `./Omni3D/` so that the folder structure matches the layout below.
```bash
./Omni3D/datasets/SUNRGBD
├── kv1
├── kv2
└── realsense
```
### ARKitScenes
Run
```
sh datasets/ARKitScenes/download_arkitscenes_images.sh
```
to download and extract the ARKitScenes pre-processed images (~28 GB).
### Hypersim
Follow the [download instructions](https://github.com/apple/ml-hypersim/tree/main/contrib/99991) from [Thomas Germer](https://github.com/99991) to download only the \*tonemap.jpg preview images, which avoids downloading the full Hypersim dataset. For example:
```bash
git clone https://github.com/apple/ml-hypersim
cd ml-hypersim/
python contrib/99991/download.py -c .tonemap.jpg -d /path/to/Omni3D/datasets/hypersim --silent
```
Then arrange or unzip the downloaded images under the root `./Omni3D/` so that the folder structure matches the layout below.
```bash
datasets/hypersim/
├── ai_001_001
├── ai_001_002
├── ai_001_003
├── ai_001_004
├── ai_001_005
├── ai_001_006
...
```
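Once the images are in place, a quick check of the expected folders can catch path mistakes early. The sketch below is not part of the Omni3D code: the helper name `missing_dirs` is hypothetical, and the directory list is an illustrative subset covering only the manually downloaded layouts shown above, taken relative to the root.

```python
from pathlib import Path

# Directories named in the sections above, relative to the Omni3D root.
# Illustrative subset only; extend it for the datasets you actually use.
EXPECTED_DIRS = [
    "datasets/KITTI_object/training/image_2",
    "datasets/SUNRGBD/kv1",
    "datasets/SUNRGBD/kv2",
    "datasets/SUNRGBD/realsense",
    "datasets/hypersim",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the Omni3D root; prints one line per missing folder.
    for rel in missing_dirs("."):
        print(f"missing: {rel}")
```

An empty result only means the folders exist; it does not verify that the images inside them are complete.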
# Data Usage
Below we describe the unified 3D annotation coordinate system, the annotation format, and an example loading script.
## Coordinate System
All 3D annotations are provided in a shared camera coordinate system with
+x right, +y down, +z toward screen.
The vertices of `bbox3D_cam` are ordered as follows:
```
      v4_____________________v5
      /|                    /|
     / |                   / |
    /  |                  /  |
   /___|_________________/   |
v0|    |                 |v1 |
  |    |                 |   |
  |    |                 |   |
  |    |                 |   |
  |    |_________________|___|
  |   / v7               |   /v6
  |  /                   |  /
  | /                    | /
  |/_____________________|/
  v3                     v2
```
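To make the axis convention concrete, the sketch below projects a point from camera coordinates to pixel coordinates. It rests on assumptions not stated in this document: a standard pinhole model with `K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]` and no skew, and "+z toward screen" read as depth increasing away from the camera. The function name `project_point` and the intrinsics values are hypothetical.

```python
def project_point(K, point_cam):
    """Project a 3D camera-frame point (meters) to pixel coordinates.

    Assumes a pinhole camera with K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = K[0][0] * x / z + K[0][2]  # +x (right) increases the pixel column
    v = K[1][1] * y / z + K[1][2]  # +y (down) increases the pixel row
    return u, v


# Hypothetical intrinsics, for illustration only.
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]

# A point 1 m to the right, 0.5 m down, and 5 m in front of the camera.
u, v = project_point(K, (1.0, 0.5, 5.0))
print(u, v)  # 420.0 290.0
```

Under these assumptions the point lands to the right of and below the principal point (320, 240), matching +x right and +y down.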
## Annotation Format
Each dataset is stored as a Python dict in the format below.
```python
dataset {
    "info" : info,
    "images" : [image],
    "categories" : [category],
    "annotations" : [object],
}

info {
    "id" : str,
    "source" : int,
    "name" : str,
    "split" : str,
    "version" : str,
    "url" : str,
}

image {
    "id" : int,
    "dataset_id" : int,
    "width" : int,
    "height" : int,
    "file_path" : str,
    "K" : list (3x3),
    "src_90_rotate" : int,  # im was rotated X times, 90 deg counterclockwise
    "src_flagged" : bool,   # flagged as potentially inconsistent sky direction
}

category {
    "id" : int,
    "name" : str,
    "supercategory" : str
}

object {
    "id" : int,             # unique annotation identifier
    "image_id" : int,       # identifier for image
    "category_id" : int,    # identifier for the category
    "category_name" : str,  # plain name for the category

    # General 2D/3D Box Parameters.
    # Values are set to -1 when unavailable.
    "valid3D" : bool,                               # flag for no reliable 3D box
    "bbox2D_tight" : [x1, y1, x2, y2],              # 2D corners of annotated tight box
    "bbox2D_proj" : [x1, y1, x2, y2],               # 2D corners projected from bbox3D
    "bbox2D_trunc" : [x1, y1, x2, y2],              # 2D corners projected from bbox3D then truncated
    "bbox3D_cam" : [[x1, y1, z1]...[x8, y8, z8]],   # 3D corners in meters and camera coordinates
    "center_cam" : [x, y, z],                       # 3D center in meters and camera coordinates
    "dimensions" : [width, height, length],         # 3D attributes for object dimensions in meters
    "R_cam" : list (3x3),                           # 3D rotation matrix to the camera frame rotation

    # Optional dataset specific properties,
    # used mainly for evaluation and ignore.
    # Values are set to -1 when unavailable.
    "behind_camera" : bool,     # a corner is behind camera
    "visibility" : float,       # annotated visibility 0 to 1
    "truncation" : float,       # computed truncation 0 to 1
    "segmentation_pts" : int,   # visible instance segmentation points
    "lidar_pts" : int,          # visible LiDAR points in the object
    "depth_error" : float,      # L1 of depth map and rendered object
}
```
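As a small illustration of working with this structure directly, the sketch below groups annotations by `image_id` and keeps only those whose `valid3D` flag is true. It assumes that a true `valid3D` marks a usable 3D box. The function name and the toy values are hypothetical, and only the fields used here are filled in.

```python
from collections import defaultdict


def valid_3d_annotations_by_image(dataset):
    """Group annotations whose `valid3D` flag is true by `image_id`."""
    grouped = defaultdict(list)
    for ann in dataset["annotations"]:
        if ann["valid3D"]:
            grouped[ann["image_id"]].append(ann)
    return dict(grouped)


# Toy dataset with hypothetical values, shaped like the format above.
toy = {
    "info": {},
    "images": [{"id": 1}, {"id": 2}],
    "categories": [{"id": 0, "name": "car", "supercategory": "vehicle"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 0, "valid3D": True},
        {"id": 11, "image_id": 1, "category_id": 0, "valid3D": False},
        {"id": 12, "image_id": 2, "category_id": 0, "valid3D": False},
    ],
}

by_image = valid_3d_annotations_by_image(toy)
print({img_id: [a["id"] for a in anns] for img_id, anns in by_image.items()})
# {1: [10]}
```

Images without any valid 3D annotation, such as image 2 in the toy data, are simply absent from the result.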
## Example Loading Data
Each dataset is named "Omni3D_{name}_{split}.json", where split is one of train, val, or test.
The annotations are in a COCO-like format. The Omni3D class inherits from the [COCO class](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/coco.py#L70), so once you load the json files through it, you can use the basic COCO dataset functions, as the code below demonstrates.
```python
from cubercnn import data
dataset_paths_to_json = ['path/to/Omni3D/{name}_{split}.json', ...]
# Example 1. load all images
dataset = data.Omni3D(dataset_paths_to_json)
imgIds = dataset.getImgIds()
imgs = dataset.loadImgs(imgIds)
# Example 2. load annotations for image index 0
annIds = dataset.getAnnIds(imgIds=imgs[0]['id'])
anns = dataset.loadAnns(annIds)
```
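The list of json paths in the example above can be built programmatically. The helper below is a hypothetical sketch, not part of `cubercnn`: it follows the `{name}_{split}.json` pattern from the example code, and the dataset name strings passed to it are assumptions that should be checked against the files you actually downloaded.

```python
def omni3d_json_paths(root, names, split):
    """Build one json path per dataset name using `{name}_{split}.json`."""
    if split not in ("train", "val", "test"):
        raise ValueError(f"unknown split: {split!r}")
    return [f"{root}/{name}_{split}.json" for name in names]


# The name strings here are assumed; match them to your downloaded files.
paths = omni3d_json_paths("path/to/Omni3D", ["KITTI", "nuScenes"], "train")
print(paths)
# ['path/to/Omni3D/KITTI_train.json', 'path/to/Omni3D/nuScenes_train.json']
```

The resulting list can be passed where `dataset_paths_to_json` is used above.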