renate.data package

class renate.data.CSVDataModule(data_path, train_filename='train.csv', test_filename='test.csv', target_name='y', src_bucket=None, src_object_name=None, val_size=0.0, seed=0)

Bases: RenateDataModule

A data module loading data from CSV files.

Parameters:
  • data_path (Union[Path, str]) – Path to the folder containing the files.

  • train_filename (Union[Path, str]) – Name of the CSV file containing the training data.

  • test_filename (Union[Path, str]) – Name of the CSV file containing the test data.

  • src_bucket (Union[Path, str, None]) – Name of an S3 bucket. If specified, the folder given by src_object_name will be downloaded from S3 to data_path.

  • src_object_name (Union[Path, str, None]) – Folder path in the S3 bucket.

  • target_name (str) – The header of the column containing the target values.

  • val_size (float) – Fraction of the training data to be used for validation.

  • seed (int) – Seed used to fix random number generation.

prepare_data()

Downloads the data folder from S3 to data_path if src_bucket is specified.

Return type:

None

setup()

Sets up the training, test, and validation datasets.

Return type:

None
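The roles of target_name, val_size, and seed can be illustrated with a small standard-library sketch. This is not Renate's implementation: the helper names load_csv and train_val_split are hypothetical, the CSV content is made up, and the rounding rule for the validation set size is an assumption.

```python
import csv
import io
import random


def load_csv(text, target_name="y"):
    """Parse CSV text and split off the target column (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    features, targets = [], []
    for row in reader:
        # The column named by target_name holds the target; the rest are features.
        targets.append(float(row.pop(target_name)))
        features.append([float(value) for value in row.values()])
    return features, targets


def train_val_split(n_rows, val_size=0.0, seed=0):
    """Return (train_indices, val_indices) for a seeded random split.

    Assumption: the validation set holds round(val_size * n_rows) rows.
    """
    indices = list(range(n_rows))
    # A dedicated Random instance makes the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    n_val = int(round(val_size * n_rows))
    return sorted(indices[n_val:]), sorted(indices[:n_val])


# Made-up data standing in for the contents of train.csv.
csv_text = "x1,x2,y\n1.0,2.0,0\n3.0,4.0,1\n5.0,6.0,0\n7.0,8.0,1\n"
features, targets = load_csv(csv_text, target_name="y")
train_idx, val_idx = train_val_split(len(features), val_size=0.25, seed=0)
```

With val_size=0.0 (the default in the signature above) the validation index list in this sketch is empty, and calling train_val_split twice with the same seed yields the same split.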

class renate.data.ImageDataset(data, labels, transform=None, target_transform=None)

Bases: Dataset

Dataset class for image data in which the images are loaded as raw images from the given paths.

Parameters:
  • data (List[str]) – List of data paths to the images.

  • labels (List[int]) – Labels of images.

  • transform (Optional[Callable]) – Transformation or augmentation to perform on the sample.

  • target_transform (Optional[Callable]) – Transformation or augmentation to perform on the target.
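How the four parameters above might interact can be sketched with a plain-Python stand-in. PathDataset, its loader argument, and fake_loader are hypothetical names introduced for illustration only. The sketch assumes that each image is read when an item is accessed and that the transforms are applied afterwards, which the documentation above does not state.

```python
class PathDataset:
    """Map-style dataset over file paths; a stand-in, not Renate's ImageDataset."""

    def __init__(self, data, labels, loader, transform=None, target_transform=None):
        if len(data) != len(labels):
            raise ValueError("data and labels must have the same length")
        self.data = data
        self.labels = labels
        self.loader = loader  # hypothetical: callable that reads one image from a path
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Load the raw sample on access, then apply the optional transforms.
        sample = self.loader(self.data[idx])
        target = self.labels[idx]
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target


def fake_loader(path):
    """Pretend to read an image; returns fixed pixel values regardless of path."""
    return [0, 128, 255]


dataset = PathDataset(
    data=["a.png", "b.png"],  # made-up paths; no files are read
    labels=[0, 1],
    loader=fake_loader,
    transform=lambda pixels: [p / 255 for p in pixels],  # scale to [0, 1]
    target_transform=float,  # convert integer label to float
)
sample, target = dataset[1]
```

Keeping the transforms as optional callables means the same dataset can serve raw samples (both set to None) or preprocessed ones without changing the loading logic.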

Submodules