renate.evaluation.metrics.classification module
- renate.evaluation.metrics.classification.average_accuracy(results, task_id, num_instances)
Compute the average accuracy of a model.
This measure is defined by:
\[\frac{1}{T} \sum_{i=1}^{T} a_{T,i}\] where \(T\) is the number of tasks and \(a_{T,i}\) is the accuracy of the model on task \(i\) after having learned all tasks up to \(T\).
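The formula above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `average_accuracy_sketch` and the accuracy matrix `acc` (where `acc[n][i]` is the accuracy on task `i` after learning task `n`, zero-indexed) are assumptions made for this example.

```python
from typing import List


def average_accuracy_sketch(acc: List[List[float]]) -> float:
    """Hypothetical helper: mean of the last row of the accuracy matrix.

    acc[n][i] is assumed to hold the accuracy on task i after learning task n.
    """
    final_row = acc[-1]  # accuracies a_{T,i} after all T tasks were learned
    return sum(final_row) / len(final_row)


# Made-up accuracies for T = 3 tasks.
acc = [
    [0.90, 0.10, 0.20],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.95],
]
avg_acc = average_accuracy_sketch(acc)  # approximately 0.8
```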
- renate.evaluation.metrics.classification.micro_average_accuracy(results, task_id, num_instances)
Compute the micro average accuracy of a model.
This measure is defined as the number of correctly classified data points divided by the total number of data points. If the number of data points is the same in each update step, this is the same as average_accuracy.
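As a rough sketch of the definition above, the micro average can be computed by weighting each task's accuracy by its number of test points. The helper name `micro_average_accuracy_sketch` and the flat list inputs are assumptions for illustration and do not reflect how the library structures `results`.

```python
from typing import List


def micro_average_accuracy_sketch(
    final_accuracies: List[float], num_instances: List[int]
) -> float:
    """Hypothetical helper: correct predictions divided by all predictions.

    final_accuracies[i] is the accuracy on task i after the last update,
    num_instances[i] is the number of test points for task i.
    """
    correct = sum(a * n for a, n in zip(final_accuracies, num_instances))
    return correct / sum(num_instances)


final_accuracies = [0.70, 0.75, 0.95]  # made-up values

# Unequal test set sizes: larger tasks dominate the result.
micro_unequal = micro_average_accuracy_sketch(final_accuracies, [100, 200, 700])

# Equal test set sizes: the result matches the plain average.
micro_equal = micro_average_accuracy_sketch(final_accuracies, [100, 100, 100])
plain_average = sum(final_accuracies) / len(final_accuracies)
```

With equal sizes, `micro_equal` and `plain_average` agree up to floating-point error, which is the equivalence stated above.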
- renate.evaluation.metrics.classification.forgetting(results, task_id, num_instances)
Compute the forgetting measure of the model.
This measure is defined by:
\[\frac{1}{T-1} \sum_{i=1}^{T-1} f_{T,i}\] where \(f_{j,i}\) is defined as:
\[f_{j,i} = \max_{k\in\{1, \ldots, j-1\}} a_{k,i} - a_{j,i}\] where \(T\) is the final task index, \(a_{n,i}\) is the test classification accuracy on task \(i\) after sequentially learning the \(n\)-th task, and \(f_{j,i}\) is a measure of forgetting on task \(i\) after training up to task \(j\).
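The forgetting measure can be illustrated with a short sketch. It assumes the maximum runs over all checkpoints before the final one, and that a single-task run has zero forgetting; the helper name `forgetting_sketch` and the matrix layout `acc[n][i]` (accuracy on task `i` after learning task `n`, zero-indexed) are hypothetical.

```python
from typing import List


def forgetting_sketch(acc: List[List[float]]) -> float:
    """Hypothetical helper: average drop from the best earlier accuracy.

    acc[n][i] is assumed to hold the accuracy on task i after learning task n.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # assumption: nothing can be forgotten after one task
    total = 0.0
    for i in range(num_tasks - 1):
        # Best accuracy on task i at any checkpoint before the final one.
        best_earlier = max(acc[k][i] for k in range(num_tasks - 1))
        total += best_earlier - acc[-1][i]  # f_{T,i}
    return total / (num_tasks - 1)


# Made-up accuracies for T = 3 tasks.
acc = [
    [0.90, 0.10, 0.20],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.95],
]
forgetting_value = forgetting_sketch(acc)  # approximately 0.15
```

Here task 0 drops from 0.90 to 0.70 and task 1 from 0.85 to 0.75, so the average forgetting is about 0.15.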
- renate.evaluation.metrics.classification.backward_transfer(results, task_id, num_instances)
Compute the backward transfer measure of the model.
This measure is defined by:
\[\frac{1}{T-1} \sum_{i=1}^{T-1} \left(a_{T,i} - a_{i,i}\right)\] where \(T\) is the final task index and \(a_{n,i}\) is the test classification accuracy on task \(i\) after sequentially learning the \(n\)-th task.
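A minimal sketch of this formula follows. It compares the final accuracy on each earlier task with the accuracy measured right after that task was learned. The helper name `backward_transfer_sketch`, the matrix layout `acc[n][i]`, and the zero return value for a single task are assumptions for this example.

```python
from typing import List


def backward_transfer_sketch(acc: List[List[float]]) -> float:
    """Hypothetical helper: mean of a_{T,i} - a_{i,i} over all but the last task.

    acc[n][i] is assumed to hold the accuracy on task i after learning task n.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # assumption: undefined for one task, reported as zero
    diffs = [acc[-1][i] - acc[i][i] for i in range(num_tasks - 1)]
    return sum(diffs) / (num_tasks - 1)


# Made-up accuracies for T = 3 tasks.
acc = [
    [0.90, 0.10, 0.20],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.95],
]
bwt = backward_transfer_sketch(acc)  # approximately -0.15
```

A negative value indicates that later training reduced accuracy on earlier tasks; a positive value would indicate that it helped.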
- renate.evaluation.metrics.classification.forward_transfer(results, task_id, num_instances)
Compute the forward transfer measure of the model.
This measure is defined by:
\[\frac{1}{T-1} \sum_{i=2}^{T} \left(a_{i-1,i} - b_{i}\right)\] where \(T\) is the final task index and \(a_{n,i}\) is the test classification accuracy on task \(i\) after sequentially learning the \(n\)-th task. \(b_{i}\) is the accuracy on task \(i\) recorded at initialisation, before the model has observed any task.
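The forward transfer formula can be sketched as follows. For each task after the first, the accuracy measured just before that task is learned is compared with the accuracy at initialisation. The helper name `forward_transfer_sketch`, the matrix layout `acc[n][i]`, the separate `init_acc` list, and the zero return value for a single task are assumptions for this example.

```python
from typing import List


def forward_transfer_sketch(acc: List[List[float]], init_acc: List[float]) -> float:
    """Hypothetical helper: mean of a_{i-1,i} - b_i over tasks 2..T.

    acc[n][i] is assumed to hold the accuracy on task i after learning task n,
    init_acc[i] the accuracy on task i at initialisation (b_i).
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # assumption: undefined for one task, reported as zero
    diffs = [acc[i - 1][i] - init_acc[i] for i in range(1, num_tasks)]
    return sum(diffs) / (num_tasks - 1)


# Made-up accuracies for T = 3 tasks.
acc = [
    [0.90, 0.10, 0.20],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.95],
]
init_acc = [0.10, 0.05, 0.10]  # made-up accuracies of the untrained model
fwt = forward_transfer_sketch(acc, init_acc)  # approximately 0.125
```

A positive value suggests that learning earlier tasks improved accuracy on tasks not yet seen, relative to the untrained model.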