Datasets¶
Torchvision provides many built-in datasets in the torchvision.datasets module, as well as utility classes for building your own datasets.
Built-in datasets¶
All datasets are subclasses of torch.utils.data.Dataset, i.e., they implement the __getitem__ and __len__ methods. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:
    import torch
    import torchvision

    imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
    # args.nThreads: number of worker processes, taken from your own parsed arguments
    data_loader = torch.utils.data.DataLoader(imagenet_data, batch_size=4, shuffle=True, num_workers=args.nThreads)

All the datasets have a nearly identical API. They all share two common arguments, transform and target_transform, which transform the input and the target respectively. You can also create your own datasets using the provided base classes.
Warning
When a dataset object is created with download=True, the files are first downloaded and extracted in the root directory. This download logic is not multi-process safe, so it may lead to conflicts / race conditions if it is run within a distributed setting. In distributed mode, we recommend creating a dummy dataset object to trigger the download logic before setting up distributed mode.
Image classification¶
| Caltech 101 Dataset. |
| Caltech 256 Dataset. |
| CIFAR10 Dataset. |
| CIFAR100 Dataset. |
| The Country211 Data Set from OpenAI. |
| EMNIST Dataset. |
| RGB version of the EuroSAT Dataset. |
| A fake dataset that returns randomly generated images as PIL images. |
| Fashion-MNIST Dataset. |
| FER2013 Dataset. |
| FGVC Aircraft Dataset. |
| Flickr8k Entities Dataset. |
| Flickr30k Entities Dataset. |
| Oxford 102 Flower Dataset. |
| iNaturalist Dataset. |
| ImageNet 2012 Classification Dataset. |
| Imagenette image classification dataset. |
| Kuzushiji-MNIST Dataset. |
| LFW Dataset. |
| LSUN dataset. |
| MNIST Dataset. |
| Omniglot Dataset. |
| Places365 classification dataset. |
| QMNIST Dataset. |
| SEMEION Dataset. |
| SBU Captioned Photo Dataset. |
| Stanford Cars Dataset. |
| STL10 Dataset. |
| SVHN Dataset. |
| USPS Dataset. |
Image detection or segmentation¶
| MS Coco Detection Dataset. |
| Cityscapes Dataset. |
| KITTI Dataset. |
| Pascal VOC Segmentation Dataset. |
| Pascal VOC Detection Dataset. |
| WIDERFace Dataset. |
Optical Flow¶
| FlyingChairs Dataset for optical flow. |
| FlyingThings3D dataset for optical flow. |
| HD1K dataset for optical flow. |
| KITTI dataset for optical flow (2015). |
| Sintel Dataset for optical flow. |
Stereo Matching¶
| Carla simulator data linked in the CREStereo github repo. |
| KITTI dataset from the 2012 stereo evaluation benchmark. |
| KITTI dataset from the 2015 stereo evaluation benchmark. |
| Synthetic dataset used in training the CREStereo architecture. |
| FallingThings dataset. |
| Dataset interface for Scene Flow datasets. |
| Sintel Stereo Dataset. |
| InStereo2k dataset. |
| ETH3D Low-Res Two-View dataset. |
| Publicly available scenes from the 2014 version of the Middlebury dataset (https://vision.middlebury.edu/stereo/data/scenes2014/). |
Image pairs¶
| LFW Dataset. |
| Multi-view Stereo Correspondence Dataset. |
Image captioning¶
| MS Coco Captions Dataset. |
Video classification¶
| HMDB51 dataset. |
| Generic Kinetics dataset. |
| UCF101 dataset. |
Video prediction¶
| MovingMNIST Dataset. |
Base classes for custom datasets¶
| A generic data loader. |
| A generic data loader for images arranged in a default directory layout. |
| Base class for making datasets that are compatible with torchvision. |
Transforms v2¶
| Wrap a |