renate.benchmark.scenarios module
- class renate.benchmark.scenarios.Scenario(data_module, num_tasks, chunk_id, seed=0)
Bases: ABC
Creates a continual learning scenario from a RenateDataModule.
This class can be extended to modify the returned training/validation/test sets to implement different experimentation settings.
Note that many scenarios implemented here perform randomized operations, e.g., to split a base dataset into chunks. The scenario is only reproducible if the _same_ seed is provided in subsequent instantiations. The seed argument is required for these scenarios.
- Parameters:
  - data_module (RenateDataModule) – The source RenateDataModule for the user data.
  - num_tasks (int) – The total number of expected tasks for experimentation.
  - chunk_id (int) – The data chunk to load in for the training or validation data.
  - seed (int) – Seed used to fix random number generation.
- train_data()
  Returns the training dataset with respect to the current chunk_id.
  - Return type: Dataset
- val_data()
  Returns the validation dataset with respect to the current chunk_id.
  - Return type: Dataset
- test_data()
  Returns the test data with respect to all tasks in num_tasks.
  - Return type: List[Dataset]
- train_collate_fn()
  Returns collate_fn for the train DataLoader.
  - Return type: Optional[Callable]
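The relationship between seed, chunk_id and the returned training data can be illustrated with a minimal, library-free sketch. ToyScenario and ToyRandomSplit are hypothetical names used only for illustration. They are not part of Renate and do not reproduce its implementation, and the strided split is an assumption.

```python
import random
from abc import ABC, abstractmethod


class ToyScenario(ABC):
    """Hypothetical stand-in for a scenario; not Renate's implementation."""

    def __init__(self, data, num_tasks, chunk_id, seed=0):
        if not 0 <= chunk_id < num_tasks:
            raise ValueError("chunk_id must lie in [0, num_tasks)")
        self.data = list(data)
        self.num_tasks = num_tasks
        self.chunk_id = chunk_id
        self.seed = seed

    @abstractmethod
    def split(self):
        """Return a list of num_tasks chunks."""

    def train_data(self):
        # Only the chunk selected by chunk_id is returned.
        return self.split()[self.chunk_id]


class ToyRandomSplit(ToyScenario):
    def split(self):
        indices = list(range(len(self.data)))
        # A dedicated generator keeps the split independent of global state.
        random.Random(self.seed).shuffle(indices)
        return [
            [self.data[i] for i in indices[task::self.num_tasks]]
            for task in range(self.num_tasks)
        ]


first = ToyRandomSplit(range(10), num_tasks=3, chunk_id=1, seed=42).train_data()
second = ToyRandomSplit(range(10), num_tasks=3, chunk_id=1, seed=42).train_data()
print(first == second)  # True: same seed, same chunk
```

Because every instantiation re-derives the split from the seed, two objects created with different seeds would generally disagree about which samples belong to a given chunk_id.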
- class renate.benchmark.scenarios.ClassIncrementalScenario(data_module, chunk_id, groupings)
Bases: Scenario
A scenario that creates data chunks from data samples with specific classes from a data module.
Given groupings that describe how the classes are divided among tasks, this class splits the dataset with respect to classification labels.
Note that this scenario assumes that the data points in the data module are organised into tuples of exactly 2 tensors, i.e., (x, y), where x is the input and y is the class id.
- Parameters:
  - data_module (RenateDataModule) – The source RenateDataModule for the user data.
  - chunk_id (int) – The data chunk to load in for the training or validation data.
  - groupings (Tuple[Tuple[int, ...], ...]) – Tuple of tuples, describing the division of the classes for respective tasks.
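As a rough illustration of how groupings selects data, the sketch below filters a toy list of (x, y) tuples by class id. The helper select_classes and the toy dataset are hypothetical. The library itself operates on Dataset objects rather than plain lists.

```python
def select_classes(dataset, groupings, chunk_id):
    """Keep the (x, y) pairs whose label y is listed in groupings[chunk_id]."""
    wanted = set(groupings[chunk_id])
    return [(x, y) for x, y in dataset if y in wanted]


# Toy dataset of (input, class id) tuples.
dataset = [("a", 0), ("b", 1), ("c", 2), ("d", 3), ("e", 0), ("f", 2)]
groupings = ((0, 1), (2, 3))

chunk = select_classes(dataset, groupings, chunk_id=1)
print(chunk)  # [('c', 2), ('d', 3), ('f', 2)]
```

With these groupings, chunk 0 would contain only classes 0 and 1, and chunk 1 only classes 2 and 3.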
- class renate.benchmark.scenarios.TransformScenario(data_module, transforms, chunk_id, seed=0)
Bases: Scenario
A scenario that applies a different transformation to each chunk.
The base data_module is split into len(transforms) random chunks. Then transforms[i] is applied to chunk i.
- Parameters:
  - data_module (RenateDataModule) – The base data module.
  - transforms (List[Callable]) – A list of transformations.
  - chunk_id (int) – The id of the chunk to retrieve.
  - seed (int) – Seed used to fix random number generation.
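The split-then-transform behaviour described above might be sketched as follows, using plain lists and the standard library only. The function transformed_chunk is hypothetical, and the strided split is an assumption about how random chunks could be formed, not a description of the library's code.

```python
import random


def transformed_chunk(data, transforms, chunk_id, seed=0):
    """Split data into len(transforms) random chunks; transform one of them."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    num_chunks = len(transforms)
    # Strided slicing is one simple way to form near-equal random chunks.
    chunk_indices = indices[chunk_id::num_chunks]
    return [transforms[chunk_id](data[i]) for i in chunk_indices]


transforms = [lambda x: x, lambda x: -x, lambda x: 10 * x]
data = list(range(1, 10))
chunks = [transformed_chunk(data, transforms, i, seed=0) for i in range(3)]
```

Each of the three chunks holds three samples, and every sample in chunk i has passed through transforms[i] only.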
- class renate.benchmark.scenarios.ImageRotationScenario(data_module, degrees, chunk_id, seed)
Bases: TransformScenario
A scenario that rotates the images in the dataset by a different angle for each chunk.
- Parameters:
  - data_module (RenateDataModule) – The base data module.
  - degrees (List[int]) – List of degrees corresponding to different tasks.
  - chunk_id (int) – The data chunk to load in for the training or validation data.
  - seed (int) – Seed used to fix random number generation.
- class renate.benchmark.scenarios.PermutationScenario(data_module, num_tasks, input_dim, chunk_id, seed)
Bases: TransformScenario
A scenario that applies a different random permutation of features for each chunk.
- Parameters:
  - data_module (RenateDataModule) – The base data module.
  - num_tasks (int) – The total number of expected tasks for experimentation.
  - input_dim (Union[List[int], Tuple[int], int]) – Dimension of the inputs. Can be a shape tuple or the total number of features.
  - chunk_id (int) – The data chunk to load in for the training or validation data.
  - seed (int) – A random seed to fix the random number generation for permutations.
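A small sketch may help to show how input_dim and seed interact when drawing one feature permutation per task. The helpers num_features, make_permutations and permute are hypothetical names, and the sketch works on flattened Python lists rather than tensors.

```python
import math
import random


def num_features(input_dim):
    """Accept either a shape such as (1, 28, 28) or a total feature count."""
    return input_dim if isinstance(input_dim, int) else math.prod(input_dim)


def make_permutations(input_dim, num_tasks, seed):
    """Draw one fixed permutation of the feature indices per task."""
    rng = random.Random(seed)
    n = num_features(input_dim)
    return [rng.sample(range(n), n) for _ in range(num_tasks)]


def permute(flat_x, perm):
    """Reorder the features of one flattened input."""
    return [flat_x[j] for j in perm]


perms = make_permutations((2, 3), num_tasks=3, seed=0)
x = [10, 11, 12, 13, 14, 15]
x_task_1 = permute(x, perms[1])
```

Since the permutations depend only on the seed, repeating the call with the same seed yields the same permutation for a given task, which keeps a task's inputs consistent across instantiations.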
- class renate.benchmark.scenarios.IIDScenario(data_module, num_tasks, chunk_id, seed=0)
Bases: Scenario
A scenario splitting datasets into random equally-sized chunks.
- class renate.benchmark.scenarios.FeatureSortingScenario(data_module, num_tasks, feature_idx, randomness, chunk_id, seed=0)
Bases: _SortingScenario
A scenario that _softly_ sorts a dataset by the value of a feature, then creates chunks.
This scenario sorts the data according to a feature value (see feature_idx) and randomly swaps data positions based on the degree of randomness (see randomness).
This scenario assumes that dataset[i] returns a tuple (x, y) with a tensor x containing the features.
- Parameters:
  - data_module (RenateDataModule) – The source RenateDataModule for the user data.
  - num_tasks (int) – The total number of expected tasks for experimentation.
  - feature_idx (int) – Index of the feature by which to sort. This index refers to the input features x of a single data point, i.e., no batch dimension. If the tensor x has more than one dimension, this indexes along the 0-dim while additional dimensions will be averaged out. Hence, for images, feature_idx refers to a color channel, and we sort by mean color channel value.
  - randomness (float) – A value between 0 and 1. For a dataset with N data points, 0.5 * N * randomness random pairs are swapped.
  - chunk_id (int) – The data chunk to load in for the training or validation data.
  - seed (int) – Seed used to fix random number generation.
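The soft sorting described above can be approximated in a few lines of plain Python. The functions feature_value and soft_sort are hypothetical, nested lists stand in for tensors, and drawing each swapped pair uniformly at random is an assumption, since the text only states how many pairs are swapped.

```python
import random


def feature_value(x, feature_idx):
    """Index along dimension 0, then average any remaining dimensions."""
    stack, values = [x[feature_idx]], []
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)  # flatten nested lists
        else:
            values.append(float(item))
    return sum(values) / len(values)


def soft_sort(dataset, feature_idx, randomness, seed=0):
    """Return indices sorted by feature value after random pair swaps."""
    n = len(dataset)
    order = sorted(
        range(n), key=lambda i: feature_value(dataset[i][0], feature_idx)
    )
    rng = random.Random(seed)
    # 0.5 * N * randomness swaps; drawing pairs uniformly is an assumption.
    for _ in range(int(0.5 * n * randomness)):
        a, b = rng.randrange(n), rng.randrange(n)
        order[a], order[b] = order[b], order[a]
    return order


dataset = [([3.0, 9.0], 0), ([1.0, 9.0], 1), ([2.0, 9.0], 0)]
print(soft_sort(dataset, feature_idx=0, randomness=0.0))  # [1, 2, 0]

# For an image-like input, feature_idx picks a channel and the rest is averaged.
image = [[[0.0, 1.0], [1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
channel_mean = feature_value(image, 0)
```

With randomness set to 0 the order is a strict sort. Larger values perturb the order more, which blurs the boundaries between consecutive chunks.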
- class renate.benchmark.scenarios.HueShiftScenario(data_module, num_tasks, randomness, chunk_id, seed=0)
Bases: _SortingScenario
A scenario that sorts an image dataset by the hue value, then creates chunks.
All images are sorted by hue value and divided into num_tasks tasks. randomness is a value between 0 and 1 and controls the number of random swaps applied to the sorting.
This scenario assumes that dataset[i] returns a tuple (x, y) with a tensor x containing an RGB image.
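To make the idea of sorting by hue concrete, the sketch below orders tiny toy images by the hue of their mean color and cuts the order into contiguous chunks. How the library derives a single hue value per image is not stated here, so using the mean color is an assumption. mean_hue and hue_chunks are hypothetical helpers, and the random swaps controlled by randomness are left out.

```python
import colorsys


def mean_hue(image):
    """Hue in [0, 1) of the mean color of a list of (r, g, b) pixels."""
    n = len(image)
    r, g, b = (sum(pixel[c] for pixel in image) / n for c in range(3))
    return colorsys.rgb_to_hsv(r, g, b)[0]


def hue_chunks(images, num_tasks):
    """Sort image indices by hue and cut them into contiguous chunks."""
    order = sorted(range(len(images)), key=lambda i: mean_hue(images[i]))
    size = len(order) // num_tasks
    chunks = [order[t * size:(t + 1) * size] for t in range(num_tasks - 1)]
    chunks.append(order[(num_tasks - 1) * size:])  # last chunk takes the rest
    return chunks


# One-pixel images: blue, red, green, yellow.
images = [
    [(0.0, 0.0, 1.0)],
    [(1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0)],
    [(1.0, 1.0, 0.0)],
]
print(hue_chunks(images, num_tasks=2))  # [[1, 3], [2, 0]]
```

Red and yellow end up in the first task and green and blue in the second, so consecutive tasks see gradually shifting colors.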
- class renate.benchmark.scenarios.DataIncrementalScenario(data_module, chunk_id, data_ids=None, groupings=None, seed=0)
Bases: Scenario
A scenario that iterates over pre-defined datasets.
The scenario will iterate over a list of datasets that are provided by the given DataModule. The data is loaded by assigning data_ids[chunk_id] to the attribute of the DataModule with name domain and then calling its setup() function.
- Parameters:
  - data_module (RenateDataModule) – The source RenateDataModule for the user data.
  - chunk_id (int) – The data chunk to load in for the training or validation data.
  - data_ids (Optional[Tuple[Union[int, str], ...]]) – Unique identifier for each pre-defined dataset.
  - groupings (Optional[Tuple[Tuple[int, ...], ...]]) – Tuple of tuples that group different datasets associated to a data_id to one dataset.
  - seed (int) – Seed used to fix random number generation.
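The loading mechanism, assigning data_ids[chunk_id] to the domain attribute and then calling setup(), can be mimicked with a toy data module. ToyDataModule, its loaded attribute, the dataset names and the helper load_chunk are all hypothetical. Only the domain attribute and the setup() call follow the description above, and groupings is not covered.

```python
class ToyDataModule:
    """Hypothetical data module holding several pre-defined datasets."""

    def __init__(self):
        self.domain = None
        self.loaded = None
        self._datasets = {"clipart": [1, 2], "sketch": [3, 4, 5]}

    def setup(self):
        # Load whichever dataset the `domain` attribute currently names.
        self.loaded = self._datasets[self.domain]


def load_chunk(data_module, data_ids, chunk_id):
    """Select a dataset via the `domain` attribute, then trigger setup()."""
    setattr(data_module, "domain", data_ids[chunk_id])
    data_module.setup()
    return data_module.loaded


data_ids = ("clipart", "sketch")
print(load_chunk(ToyDataModule(), data_ids, chunk_id=1))  # [3, 4, 5]
```

Under this pattern, a data module needs a domain attribute and a setup() method that respects it before it can be iterated over in this way.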