# Use Builtin Datasets

A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.
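The catalog pattern can be sketched in plain Python. The snippet below is a simplified, hypothetical illustration of the idea behind `DatasetCatalog`, not detectron2's actual implementation: a dataset name maps to a zero-argument loader function returning a list of dicts, so data is loaded lazily on first access.

```python
# Simplified, hypothetical sketch of the catalog idea -- NOT detectron2's
# real DatasetCatalog. A dataset name maps to a zero-argument function that
# returns the data as a list of dicts; the function runs only when accessed.
_REGISTRY = {}

def register(name, func):
    """Associate a dataset name with a loader function."""
    _REGISTRY[name] = func

def get(name):
    """Call the registered loader and return its list of dicts."""
    return _REGISTRY[name]()

# A toy dataset with a single image record.
register("toy_val", lambda: [{"file_name": "0001.jpg", "annotations": []}])
records = get("toy_val")
print(records[0]["file_name"])  # -> 0001.jpg
```

Deferring the loader call this way means registering hundreds of dataset variants costs nothing until one is actually requested.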
Detectron2 has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  coco/
  lvis/
  cityscapes/
  VOC20{07,12}/
```

You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.
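The lookup rule described above can be expressed as a short sketch. This mirrors the behavior documented here (environment variable with a `./datasets` fallback); `dataset_root` is an illustrative name, not a detectron2 API:

```python
import os

def dataset_root() -> str:
    # Fall back to "./datasets" when DETECTRON2_DATASETS is unset,
    # and expand a leading "~" so paths like ~/data also work.
    return os.path.expanduser(os.environ.get("DETECTRON2_DATASETS", "./datasets"))

os.environ.pop("DETECTRON2_DATASETS", None)
print(dataset_root())  # -> ./datasets

os.environ["DETECTRON2_DATASETS"] = "/path/to/datasets"
print(dataset_root())  # -> /path/to/datasets
```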
The [model zoo](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md)
contains configs and models that use these builtin datasets.

## Expected dataset structure for [COCO instance/keypoint detection](https://cocodataset.org/#download):

```
coco/
  annotations/
    instances_{train,val}2017.json
    person_keypoints_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
```

You can use the 2014 version of the dataset as well.

Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
which you can download with `./datasets/prepare_for_tests.sh`.
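To sanity-check a local copy against the layout above, a small script like the following can help. `check_coco_layout` and `EXPECTED` are hypothetical helpers written for this sketch, not part of detectron2:

```python
import os

# Hypothetical helper: report which of the expected COCO 2017 paths are
# missing under a given root directory. Not part of detectron2.
EXPECTED = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "annotations/person_keypoints_train2017.json",
    "annotations/person_keypoints_val2017.json",
    "train2017",
    "val2017",
]

def check_coco_layout(coco_root):
    """Return the list of expected paths that do not exist under coco_root."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(coco_root, p))]

# Under a bogus root, every expected path is reported missing.
missing = check_coco_layout("/nonexistent/coco")
print(len(missing))  # -> 6
```

Running this against `$DETECTRON2_DATASETS/coco` before training can catch a misplaced `annotations/` directory early, instead of failing mid-run.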
## Expected dataset structure for PanopticFPN:

Extract panoptic annotations from the [COCO website](https://cocodataset.org/#download)
into the following structure:
```
coco/
  annotations/
    panoptic_{train,val}2017.json
  panoptic_{train,val}2017/  # png annotations
  panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
```

Install panopticapi by:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then, run `python datasets/prepare_panoptic_fpn.py` to extract semantic annotations from panoptic annotations.
## Expected dataset structure for [LVIS instance segmentation](https://www.lvisdataset.org/dataset):

```
coco/
  {train,val,test}2017/
lvis/
  lvis_v0.5_{train,val}.json
  lvis_v0.5_image_info_test.json
  lvis_v1_{train,val}.json
  lvis_v1_image_info_test{,_challenge}.json
```

Install lvis-api by:
```
pip install git+https://github.com/lvis-dataset/lvis-api.git
```

To evaluate models trained on the COCO dataset using LVIS annotations,
run `python datasets/prepare_cocofied_lvis.py` to prepare "cocofied" LVIS annotations.
## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):

```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotations
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```

Install cityscapes scripts by:
```
pip install git+https://github.com/mcordts/cityscapesScripts.git
```

Note: to create labelTrainIds.png, first prepare the above structure, then run the cityscapes script with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.

Note: to generate the Cityscapes panoptic dataset, run the cityscapes script with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic or instance segmentation.
## Expected dataset structure for [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/index.html):

```
VOC20{07,12}/
  Annotations/
  ImageSets/
    Main/
      trainval.txt
      test.txt
      # train.txt or val.txt, if you use these splits
  JPEGImages/
```
## Expected dataset structure for [ADE20k Scene Parsing](http://sceneparsing.csail.mit.edu/):

```
ADEChallengeData2016/
  annotations/
  annotations_detectron2/
  images/
  objectInfo150.txt
```
The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.