Supported Algorithms

Renate implements a range of continual learning methods. The following table gives an overview of each, with a link to its documentation and a short description. When initiating a model update with Renate (e.g., via run_training_job(); see How to Run a Training Job), select a method using the shorthand provided below.







A simple replay-based method, where the model is finetuned using minibatches combining new data and points sampled from a rehearsal memory. The memory is updated after each minibatch. [Paper]
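The replay loop described above can be sketched in plain Python. This is a minimal illustration, not Renate's implementation: the names `RehearsalMemory`, `replay_step`, and `train_fn` are hypothetical, and filling the memory by reservoir sampling is an assumption of this sketch rather than something the description states.

```python
import random


class RehearsalMemory:
    """Fixed-size memory filled by reservoir sampling (an assumption of this sketch)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.num_seen = 0
        self._rng = random.Random(seed)

    def update(self, batch):
        for item in batch:
            self.num_seen += 1
            if len(self.items) < self.capacity:
                self.items.append(item)
            else:
                # Every point seen so far stays with probability capacity / num_seen.
                slot = self._rng.randrange(self.num_seen)
                if slot < self.capacity:
                    self.items[slot] = item

    def sample(self, k):
        return self._rng.sample(self.items, min(k, len(self.items)))


def replay_step(new_batch, memory, memory_batch_size, train_fn):
    """Train on new data plus replayed points, then update the memory."""
    combined = list(new_batch) + memory.sample(memory_batch_size)
    train_fn(combined)        # hypothetical callback doing one optimizer step
    memory.update(new_batch)  # memory is updated after each minibatch
    return combined
```

Calling `replay_step` once per incoming minibatch reproduces the order given in the description: first the combined update, then the memory update.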



A version of experience replay which augments the loss by a distillation term using logits produced by previous model states. The implementation also includes the DER++ variant of the algorithm. [Paper]
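As a rough sketch of the distillation term, the per-example loss below follows the formulation in the DER paper: cross-entropy on the new point, plus a squared-error penalty between the current logits for a memory point and the logits stored when it was added. The function names and the default weights `alpha` and `beta` are illustrative assumptions, not Renate's API or defaults.

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax probability of the given label."""
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_norm - logits[label]


def squared_error(logits, stored_logits):
    return sum((a - b) ** 2 for a, b in zip(logits, stored_logits))


def der_loss(new_logits, new_label, mem_logits, stored_logits,
             mem_label=None, alpha=0.5, beta=0.5):
    """Loss for one new point and one memory point.

    Passing mem_label adds the extra cross-entropy term of the DER++ variant.
    """
    loss = cross_entropy(new_logits, new_label)
    loss += alpha * squared_error(mem_logits, stored_logits)
    if mem_label is not None:
        loss += beta * cross_entropy(mem_logits, mem_label)
    return loss
```

The distillation term vanishes when the current logits match the stored ones, so it only penalizes drift away from previous model states.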



An experimental method combining several of the ER variants listed above.



An offline version of experience replay, where the rehearsal memory is only updated at the end of training.



A distillation-based method inspired by (but not identical to) Deep Model Consolidation. An expert model is trained on the new data and then combined with the previous model state in a distillation phase. [DMC Paper]



A strong baseline that trains the model from scratch on a memory, which is maintained using a greedy class-balancing strategy. [Paper]
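The greedy class-balancing strategy can be illustrated as follows. This is a sketch in the spirit of the sampler described in the GDumb paper, with a hypothetical class name. It accepts a point only while its class is below an equal share of the capacity, and makes room by evicting a random point from the currently largest class.

```python
import random


class GreedyBalancedMemory:
    """Class-balanced memory; a sketch, not Renate's buffer implementation."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.by_class = {}
        self._rng = random.Random(seed)

    def __len__(self):
        return sum(len(points) for points in self.by_class.values())

    def add(self, x, y):
        """Try to store point x with label y; return True if it was stored."""
        num_classes = len(self.by_class) + (y not in self.by_class)
        quota = self.capacity / num_classes
        if len(self.by_class.get(y, [])) >= quota:
            return False  # class y already holds its share
        if len(self) >= self.capacity:
            # Evict a random point from the largest class to make room.
            largest = max(self.by_class, key=lambda c: len(self.by_class[c]))
            victims = self.by_class[largest]
            victims.pop(self._rng.randrange(len(victims)))
        self.by_class.setdefault(y, []).append(x)
        return True
```

After the data stream has been processed, the model would be trained from scratch on the contents of the memory alone.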



This method retrains a randomly initialized model from scratch on all data seen so far, every time new data arrives. It serves as an “upper bound” in experiments but is inefficient for practical use.



A simple method that trains the current model on only the new data, without any mitigation for forgetting. Used as a “lower bound” baseline in experiments.



A class that implements a Learning to Prompt method for ViTs. The method trains only the input prompts, which are sampled from a prompt pool in an input-dependent fashion.



A class that extends the Learning to Prompt method with a memory replay method like “Offline-ER”.



A wrapper which gives access to Experience Replay as implemented in the Avalanche library. This method is equivalent to our Offline-ER.



A wrapper which gives access to Elastic Weight Consolidation as implemented in the Avalanche library. EWC updates the model in such a way that the parameters after the update remain close to the parameters before the update to avoid catastrophic forgetting. [Paper]
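To make "remain close to the parameters before the update" concrete, the sketch below computes the quadratic penalty from the EWC paper, in which each parameter's deviation is weighted by an importance estimate (the diagonal Fisher information). The function name and the flat-list parameter representation are assumptions for illustration; this is not the Avalanche code.

```python
def ewc_penalty(params, old_params, fisher, strength=1.0):
    """Quadratic penalty: 0.5 * strength * sum_i F_i * (theta_i - theta_old_i)^2."""
    return 0.5 * strength * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, fisher)
    )


def regularized_loss(task_loss, params, old_params, fisher, strength=1.0):
    """Loss on the new data plus the penalty that anchors important parameters."""
    return task_loss + ewc_penalty(params, old_params, fisher, strength)
```

Parameters with a large importance weight are held close to their previous values, while unimportant ones remain free to adapt to the new data.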



A wrapper which gives access to Learning without Forgetting as implemented in the Avalanche library. LwF does not require retaining old data. It assumes that each new data chunk is its own task. A common backbone is shared across all tasks, and each task has its own prediction head. [Paper]



A wrapper which gives access to iCaRL as implemented in the Avalanche library. This method is limited to class-incremental learning and combines knowledge distillation with nearest-neighbor classification. [Paper]
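In the iCaRL paper, the classification step is a nearest-mean-of-exemplars rule: an input is assigned to the class whose mean exemplar feature vector is closest. The sketch below shows only that rule, on plain lists of floats. The function names are hypothetical, and the distillation part of the method is not shown.

```python
def class_means(exemplars):
    """Average the exemplar feature vectors stored for each class."""
    means = {}
    for label, vectors in exemplars.items():
        dim = len(vectors[0])
        means[label] = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return means


def nearest_mean_predict(feature, means):
    """Return the label whose class mean is closest in squared Euclidean distance."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(feature, mean))

    return min(means, key=lambda label: sq_dist(means[label]))
```

Because the class means are recomputed from the stored exemplars, new classes can be added without retraining a classification layer.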