A repository for storing, versioning, and validating relational datasets.
- A list of all versions is on the GitHub tags page
- A list of all datasets is on the relational-datasets download page
- Descriptions of each dataset are hosted with the relational-datasets documentation, for example: boston_housing overview
relational-datasets is a Python package that assists in loading and downloading data from this repository.
For example, you can load training and test sets for webkb fold-2 with:

    # pip install relational-datasets
    from relational_datasets import load
    train, test = load("webkb", "v0.0.4", fold=2)

RelationalDatasets.jl is a Julia package that helps load/download data from this repository:

    # ] add RelationalDatasets
    using RelationalDatasets
    train, test = load("webkb", "v0.0.4", fold=2)

Specific Version:

Versions of each data archive may be downloaded by sending requests to a URL with the following pattern, where {VERSION} represents a tag and {NAME} is the name for a dataset:

    https://github.com/srlearn/datasets/releases/download/{VERSION}/{NAME}_{VERSION}.zip

Download version v0.0.4 of toy_cancer:

    curl -L https://github.com/srlearn/datasets/releases/download/v0.0.4/toy_cancer_v0.0.4.zip > toy_cancer_v0.0.4.zip

Download version v0.0.4 of webkb:

    curl -L https://github.com/srlearn/datasets/releases/download/v0.0.4/webkb_v0.0.4.zip > webkb_v0.0.4.zip
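
The same URL pattern can be filled in programmatically. The following is a minimal sketch using only the Python standard library; `archive_url` and `download_archive` are hypothetical helper names chosen for illustration and are not part of the relational-datasets package.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Base of the release-download URL pattern shown above.
BASE_URL = "https://github.com/srlearn/datasets/releases/download"


def archive_url(name: str, version: str) -> str:
    """Fill in the {VERSION}/{NAME}_{VERSION}.zip pattern (hypothetical helper)."""
    return f"{BASE_URL}/{version}/{name}_{version}.zip"


def download_archive(name: str, version: str, dest_dir: str = ".") -> Path:
    """Download one archive into dest_dir and return the local path (hypothetical helper)."""
    target = Path(dest_dir) / f"{name}_{version}.zip"
    # urlopen follows HTTP redirects by default, similar to `curl -L`.
    with urlopen(archive_url(name, version)) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Under these assumptions, `download_archive("toy_cancer", "v0.0.4")` would write `toy_cancer_v0.0.4.zip` to the current directory, matching the first curl command above.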