IIRC Package Tutorial Using PyTorch

[1]:
import numpy as np
from PIL import Image

from iirc.lifelong_dataset.torch_dataset import Dataset
from iirc.definitions import IIRC_SETUP, CIL_SETUP, NO_LABEL_PLACEHOLDER

Using CIL setup (Class Incremental Learning)

Let’s create a mock dataset of 120 samples, each belonging to one of four classes: A, B, C, or D

[2]:
n = 120
n_per_y = 30
x = np.random.rand(n, 32, 32, 3)
y = ["A"] * n_per_y + ["B"] * n_per_y + ["C"] * n_per_y + ["D"] * n_per_y

Now this dataset should be converted to the format used by the lifelong_datasets: each image should be a Pillow image (or a string representing the image path, in the case of large datasets such as ImageNet), and the dataset should be arranged as a list of tuples of the form (image, (label,))

[3]:
x = np.uint8(x*255)
mock_dataset = [(Image.fromarray(x[i], 'RGB'), (y[i],)) for i in range(n)]

Now let’s create a task schedule, where the first task introduces the labels A and B, and the second task introduces C and D (tasks don’t need to be of equal size)

[4]:
tasks = [["A", "B"], ["C", "D"]]

We also need a transformations function that takes the image and converts it to a tensor, and that can also normalize the image, apply augmentations, etc.

There are two such functions that can be provided: essential_transforms_fn and augmentation_transforms_fn

If augmentation_transforms_fn is provided, it will always be applied, except when the Dataset is told not to apply augmentations in a specific context (we will see later how)

Otherwise, essential_transforms_fn will be applied

So for example in a test set where augmentations are not needed, augmentation_transforms_fn shouldn’t be provided in the Dataset initialization

Hence, essential_transforms_fn should include any essential transformations that need to be applied to the PIL image (such as conversion to a tensor), while augmentation_transforms_fn should include those same essential transformations, plus any augmentations that need to be applied (such as random horizontal flipping)

[5]:
import torchvision.transforms as transforms

essential_transforms_fn = transforms.ToTensor()
augmentation_transforms_fn = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])

Now we are ready to initialize the incremental dataset

[6]:
cil_dataset = Dataset(dataset=mock_dataset, tasks=tasks, setup=CIL_SETUP,
                      essential_transforms_fn=essential_transforms_fn,
                      augmentation_transforms_fn=augmentation_transforms_fn)

Then we can make use of cil_dataset by choosing the task we wish to train on using the choose_task(task_id) function, and creating a dataloader out of it

[7]:
cil_dataset.choose_task(0)

If we print the length of the dataset, we only get the number of samples that belong to the current task. The same goes for fetching a sample, as we only have access to the samples of the current task. This means that indexing cil_dataset is relative to the current task, so the 0th sample, for example, will be different when we choose another task

[8]:
print(len(cil_dataset))
60
[9]:
cil_dataset[0]
[9]:
(tensor([[[0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
          [0.8431, 0.6902, 0.9294,  ..., 0.0000, 0.0000, 0.0000],
          [0.5922, 0.0510, 0.6314,  ..., 0.0000, 0.0000, 0.0000],
          ...,
          [0.3804, 0.2863, 0.5451,  ..., 0.0000, 0.0000, 0.0000],
          [0.2235, 0.8431, 0.8118,  ..., 0.0000, 0.0000, 0.0000],
          [0.8275, 0.8667, 0.6471,  ..., 0.0000, 0.0000, 0.0000]],

         [[0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
          [0.9294, 0.3882, 0.2745,  ..., 0.0000, 0.0000, 0.0000],
          [0.4392, 0.7098, 0.7922,  ..., 0.0000, 0.0000, 0.0000],
          ...,
          [0.7176, 0.5725, 0.0118,  ..., 0.0000, 0.0000, 0.0000],
          [0.9137, 0.6471, 0.9882,  ..., 0.0000, 0.0000, 0.0000],
          [0.3804, 0.4667, 0.3216,  ..., 0.0000, 0.0000, 0.0000]],

         [[0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
          [0.2353, 0.2824, 0.5882,  ..., 0.0000, 0.0000, 0.0000],
          [0.5765, 0.5490, 0.5686,  ..., 0.0000, 0.0000, 0.0000],
          ...,
          [0.5020, 0.0941, 0.0745,  ..., 0.0000, 0.0000, 0.0000],
          [0.3412, 0.8392, 0.2745,  ..., 0.0000, 0.0000, 0.0000],
          [0.9922, 0.0627, 0.2314,  ..., 0.0000, 0.0000, 0.0000]]]),
 'A',
 'None')

As we have seen, the format returned when we fetch a sample from the dataset is a tuple of length 3 of the form (image tensor, image label, NO_LABEL_PLACEHOLDER). NO_LABEL_PLACEHOLDER is set to the string “None” and should be ignored in the class-incremental setup.
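Since NO_LABEL_PLACEHOLDER is importable (as we did at the top), a minimal sketch for collecting only the valid labels of a sample could look like this:

# Collect the valid labels of a sample, skipping the NO_LABEL_PLACEHOLDER entries
image, label1, label2 = cil_dataset[0]
labels = [label for label in (label1, label2) if label != NO_LABEL_PLACEHOLDER]
print(labels)  # e.g. ['A'] here, as the second label is always a placeholder in the CIL setup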

We can also check which classes have been seen so far (they don’t reset if a previous task is re-chosen, and they are unordered):

[10]:
cil_dataset.reset()
cil_dataset.choose_task(0)
print(f"Task {cil_dataset.cur_task_id}, current task: {cil_dataset.cur_task}, dataset length: {len(cil_dataset)}, classes seen: {cil_dataset.seen_classes}")
cil_dataset.choose_task(1)
print(f"Task {cil_dataset.cur_task_id}, current task: {cil_dataset.cur_task}, dataset length: {len(cil_dataset)}, classes seen: {cil_dataset.seen_classes}")
Task 0, current task: ['A', 'B'], dataset length: 60, classes seen: ['A', 'B']
Task 1, current task: ['C', 'D'], dataset length: 60, classes seen: ['C', 'A', 'B', 'D']

If we need to load the data of all the tasks up to a specific task (including it), this can be done using the load_tasks_up_to(task_id) function

[11]:
cil_dataset.load_tasks_up_to(1)
print(f"current task: {cil_dataset.cur_task}, dataset length: {len(cil_dataset)}")
current task: ['A', 'B', 'C', 'D'], dataset length: 120

If a sample needs to be accessed in its original form (PIL form) without any transformations, for example to be added to the replay buffer, the get_item(index) method can be used:

[12]:
cil_dataset.get_item(0)
[12]:
(<PIL.Image.Image image mode=RGB size=32x32 at 0x22D7FF5FF10>, 'A', 'None')

And if we need to know the indices of the samples that belong to a specific class, this can be done using the get_image_indices_by_cla(class_name) method. However, this only works if that class belongs to the current task. Moreover, these indices are relative to the current task, so whenever we change the task, they would point to totally different samples in the new task

[13]:
cil_dataset.get_image_indices_by_cla("A")
[13]:
array([29,  3, 23, 17, 10, 28,  2, 12, 22,  5, 19,  1, 25,  8,  4, 27,  7,
       16, 11, 21, 24, 15, 18,  9, 13, 26, 20, 14,  6,  0])

If cil_dataset uses data augmentations, but we need to disable them in a specific part of the code, this context manager can be used:

[14]:
with cil_dataset.disable_augmentations():
    # any samples loaded here will have the essential transformations function applied to them
    pass

Finally, to load this dataset in minibatches for training, the PyTorch DataLoader can be used with it, but don’t forget to re-instantiate the dataloader whenever the task changes

[15]:
from torch.utils.data import DataLoader
cil_dataset.choose_task(0)
train_loader = DataLoader(cil_dataset, batch_size=2, shuffle=True)
next(iter(train_loader))
[15]:
[tensor([[[[0.0000, 0.0000, 0.0000,  ..., 0.9059, 0.3020, 0.3294],
           [0.0000, 0.0000, 0.0000,  ..., 0.6588, 0.7255, 0.1373],
           [0.0000, 0.0000, 0.0000,  ..., 0.5725, 0.1137, 0.7373],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]],

          [[0.0000, 0.0000, 0.0000,  ..., 0.4510, 0.6980, 0.2000],
           [0.0000, 0.0000, 0.0000,  ..., 0.6941, 0.6784, 0.2353],
           [0.0000, 0.0000, 0.0000,  ..., 0.8392, 0.0980, 0.1843],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]],

          [[0.0000, 0.0000, 0.0000,  ..., 0.9216, 0.9725, 0.4549],
           [0.0000, 0.0000, 0.0000,  ..., 0.1529, 0.8157, 0.2745],
           [0.0000, 0.0000, 0.0000,  ..., 0.8941, 0.1765, 0.1412],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]]],


         [[[0.0000, 0.0000, 0.0000,  ..., 0.3843, 0.1843, 0.8078],
           [0.0000, 0.0000, 0.0000,  ..., 0.0588, 0.1961, 0.5882],
           [0.0000, 0.0000, 0.0000,  ..., 0.5843, 0.0275, 0.1725],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]],

          [[0.0000, 0.0000, 0.0000,  ..., 0.5961, 0.4549, 0.3137],
           [0.0000, 0.0000, 0.0000,  ..., 0.7059, 0.2902, 0.2706],
           [0.0000, 0.0000, 0.0000,  ..., 0.5412, 0.3412, 0.7098],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]],

          [[0.0000, 0.0000, 0.0000,  ..., 0.5098, 0.0902, 0.6667],
           [0.0000, 0.0000, 0.0000,  ..., 0.1216, 0.2706, 0.7922],
           [0.0000, 0.0000, 0.0000,  ..., 0.5294, 0.1098, 0.5294],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]]]]),
 ('B', 'A'),
 ('None', 'None')]
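Putting it all together, a minimal sketch of a per-task training loop (with the actual training step left as a placeholder) could look like this:

# Iterate over the tasks, re-instantiating the dataloader for each newly chosen task
for task_id in range(len(tasks)):
    cil_dataset.choose_task(task_id)
    train_loader = DataLoader(cil_dataset, batch_size=2, shuffle=True)
    for images, labels, _ in train_loader:
        pass  # training step goes here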

Using IIRC setup (Incremental Implicitly Refined Classification)

Now imagine that samples belonging to class A also have another sublabel, either Aa or Ab, and samples belonging to class B have another sublabel, either Ba or Bb

[16]:
y_iirc = []
for i, label in enumerate(y):
    if label == "A" and (i % 2) == 0:
        y_iirc.append(("A", "Aa"))
    elif label == "A":
        y_iirc.append(("A", "Ab"))
    elif label == "B" and (i % 2) == 0:
        y_iirc.append(("B", "Ba"))
    elif label == "B":
        y_iirc.append(("B", "Bb"))
    else:
        y_iirc.append((label,))
print(y_iirc)
[('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('A', 'Aa'), ('A', 'Ab'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('B', 'Ba'), ('B', 'Bb'), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('C',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',), ('D',)]
[17]:
mock_dataset_iirc = [(Image.fromarray(x[i], 'RGB'), y_iirc[i]) for i in range(n)]
[18]:
mock_dataset_iirc[0]
[18]:
(<PIL.Image.Image image mode=RGB size=32x32 at 0x22D7FF79B80>, ('A', 'Aa'))

Let’s redefine the tasks to incorporate the new subclasses

[19]:
tasks_iirc = [["A", "B", "C"], ["Aa", "Ba", "D"], ["Ab", "Bb"]]

All the functionality we mentioned in the CIL setup is still applicable here, but there are some differences.

The first difference is that in the training set, we don’t necessarily need all the samples that belong to both labels A and Aa to be seen across the two tasks. What makes more sense is that some of them appear in the first task and some others appear in the second task, probably with some overlap

Hence we have the two arguments superclass_data_pct and subclass_data_pct:

* superclass_data_pct controls the percentage of the samples that belong to A that will appear when A is introduced
* subclass_data_pct controls the percentage of the samples that belong to Aa that will appear when Aa is introduced (same for other subclasses)

So in this example we have 15 samples with the labels (A, Aa), and 15 samples with the labels (A, Ab):

* superclass_data_pct = 1.0, subclass_data_pct = 1.0: all 30 samples with label A will appear in the first task, then all 15 samples with label Aa will appear in the second task, etc. (100% overlap)
* superclass_data_pct = 0.5, subclass_data_pct = 0.5: 15 samples with label A will appear in the first task, then 8 samples with label Aa will appear in the second task, etc. (no overlap)
* superclass_data_pct = 0.6, subclass_data_pct = 0.8: 18 samples with label A will appear in the first task, then 12 samples with label Aa will appear in the second task, etc. (40% overlap)
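Where do these overlap figures come from? Assuming the superclass split is spread evenly across the subclasses, inclusion-exclusion over the 15 Aa samples gives a guaranteed overlap of max(0, superclass_data_pct + subclass_data_pct - 1). A quick back-of-the-envelope check (plain Python, not part of the package API):

# Sanity-check the overlap percentages quoted above
for sup_pct, sub_pct in [(1.0, 1.0), (0.5, 0.5), (0.6, 0.8)]:
    overlap = max(0.0, sup_pct + sub_pct - 1.0)
    print(f"superclass_data_pct={sup_pct}, subclass_data_pct={sub_pct} -> overlap: {overlap:.0%}")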

This procedure is only applied if the argument test_mode is set to False, since in the test set we don’t care about this kind of data repetition; we care more about evaluating on all the available samples

[20]:
iirc_dataset_train = Dataset(dataset=mock_dataset_iirc, tasks=tasks_iirc, setup=IIRC_SETUP, test_mode=False,
                             essential_transforms_fn=essential_transforms_fn,
                             augmentation_transforms_fn=augmentation_transforms_fn,
                             superclass_data_pct=0.6, subclass_data_pct=0.8)

iirc_dataset_test = Dataset(dataset=mock_dataset_iirc, tasks=tasks_iirc, setup=IIRC_SETUP, test_mode=True,
                            essential_transforms_fn=essential_transforms_fn,
                            augmentation_transforms_fn=augmentation_transforms_fn)

iirc_dataset_train.choose_task(0)
iirc_dataset_test.choose_task(0)

print(f"Length of task 0 training data: {len(iirc_dataset_train)}, number of samples for class A: {len(iirc_dataset_train.get_image_indices_by_cla('A'))}")
print(f"Length of task 0 training data: {len(iirc_dataset_train)}, number of samples for class A: {len(iirc_dataset_test.get_image_indices_by_cla('A'))}")

Length of task 0 training data: 66, number of samples for class A: 18
Length of task 0 test data: 90, number of samples for class A: 30

The second difference is the concept of incomplete information: typically in the IIRC setup, the model is not told all the labels of a sample, but only the labels that correspond to the current task, and the model should figure out the other labels on its own.

This only applies to the training set though, as for the test set you need to know all the labels to evaluate the model properly

[21]:
labels_ = ["A", "Aa", "Ab"]
iirc_dataset_train.reset()
iirc_dataset_test.reset()
for task_id, label in enumerate(labels_):
    iirc_dataset_train.choose_task(task_id)
    iirc_dataset_test.choose_task(task_id)
    train_sample, train_label1, train_label2 = \
        iirc_dataset_train[iirc_dataset_train.get_image_indices_by_cla(label, num_samples=1)[0]]
    test_sample, test_label1, test_label2 = \
        iirc_dataset_test[iirc_dataset_test.get_image_indices_by_cla(label, num_samples=1)[0]]

    print(f"task {task_id}:\ntraining sample label: {(train_label1, train_label2)}")
    print(f"test sample label: {test_label1, test_label2}\n")
task 0:
training sample label: ('A', 'None')
test sample label: ('A', 'None')

task 1:
training sample label: ('Aa', 'None')
test sample label: ('A', 'Aa')

task 2:
training sample label: ('Ab', 'None')
test sample label: ('A', 'Ab')

To make a dataset use the complete information mode irrespective of whether it is a training set or a test set, the argument complete_information_mode can be provided when initializing the dataset, and the functions enable_complete_information_mode() and enable_incomplete_information_mode() can also be used after initialization.
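For example, a minimal sketch using the methods named above (the exact labels returned depend on which classes have been introduced so far):

# Temporarily switch the training set to complete information mode
iirc_dataset_train.enable_complete_information_mode()
sample, label1, label2 = iirc_dataset_train[0]  # both labels visible, if both have been introduced
iirc_dataset_train.enable_incomplete_information_mode()  # restore the default training behavior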

Take note that the load_tasks_up_to(task_id) function doesn’t work in the incomplete information mode