Stage datasets
Once you find a suitable dataset, you can stage it to train your models.
CLI
You can stage a dataset with the following CLI command:
eotdl datasets get "dataset name"
Your datasets will be staged to a default folder, but you can specify a different folder with the --path option or the EOTDL_DOWNLOAD_PATH environment variable. For example, to stage the dataset to the current directory:
eotdl datasets get "dataset name" --path .
To overwrite a dataset that you have already staged, use the --force option:
eotdl datasets get "dataset name" --force
If you know the specific version of the dataset to stage, use the --version option:
eotdl datasets get "dataset name" --version 1
By default, only the dataset metadata is staged. To stage the dataset assets as well, use the --assets option:
eotdl datasets get "dataset name" --assets
However, you may prefer to explore and filter the metadata first, so that you stage only the assets you need. Learn more with our tutorials.
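If you drive the CLI from a script, the options above can be combined into a single command. The following is a minimal sketch, not part of EOTDL: build_stage_command is a hypothetical helper that only assembles the argument list from the flags described in this section (--path, --force, --version, --assets). It does not run anything itself.

```python
import shlex


def build_stage_command(name, path=None, force=False, version=None, assets=False):
    """Assemble the argument list for `eotdl datasets get`.

    Hypothetical helper for illustration; it only uses the flags
    documented above and does not call the CLI.
    """
    cmd = ["eotdl", "datasets", "get", name]
    if path is not None:
        cmd += ["--path", str(path)]
    if version is not None:
        cmd += ["--version", str(version)]
    if force:
        cmd.append("--force")
    if assets:
        cmd.append("--assets")
    return cmd


cmd = build_stage_command("dataset name", path=".", version=1, assets=True)
# shlex.join quotes arguments that contain spaces, so the printed
# command can be pasted into a shell.
print(shlex.join(cmd))
```

To execute the command, the list could be passed to subprocess.run(cmd, check=True), assuming the eotdl CLI is installed and on your PATH.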
Library
You can stage datasets using the following Python code:
from eotdl.datasets import stage_dataset
stage_dataset("dataset-name")
The same options described above are available as keyword arguments:
stage_dataset("dataset-name", force=True, path="data", version=1, assets=True)
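When you need several datasets with the same options, a small loop around stage_dataset may be convenient. The sketch below is an assumption-laden illustration: stage_all is a hypothetical helper, not an EOTDL function, and it takes the staging function as a parameter so that it makes no assumptions about what stage_dataset returns or which exceptions it raises.

```python
def stage_all(names, stage_fn, **options):
    """Stage every dataset in `names` by calling `stage_fn(name, **options)`.

    Hypothetical helper for illustration. Returns two dicts: whatever
    `stage_fn` returned for each dataset that succeeded, and the
    exception raised for each dataset that failed.
    """
    results, errors = {}, {}
    for name in names:
        try:
            results[name] = stage_fn(name, **options)
        except Exception as exc:
            # Record the failure and keep going with the remaining datasets.
            errors[name] = exc
    return results, errors


# Usage (requires the eotdl package; the dataset names are placeholders):
# from eotdl.datasets import stage_dataset
# results, errors = stage_all(["dataset-a", "dataset-b"], stage_dataset,
#                             path="data", assets=True)
```

Collecting errors instead of stopping at the first one means a single missing or misspelled dataset name does not interrupt the rest of the batch.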