I tried to reproduce the results you describe in Section 5.5 of your paper "On Tiny Episodic Memories in Continual Learning", because I couldn't find an implementation in your codebase and the experiment seems relatively easy to reproduce. I'm mostly interested in the results for the 20-degree rotation, where fine-tuning on the second task does not harm performance on the first one, so in fact I only want to reproduce this figure:
I've skimmed the paper and listed the following hyperparameters:
MLP with 2 hidden layers, 256 units each, followed by ReLU
SGD with lr=0.1
CrossEntropy loss
Minibatch size=10
A single pass through the whole dataset
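For reference, here is a minimal sketch of the setup I used (not your code, and the Rotated MNIST construction is my own assumption): two tasks built from MNIST, the second rotated by a fixed 20 degrees, a 2x256 ReLU MLP trained with SGD(lr=0.1), cross-entropy loss, batch size 10, and a single pass per task.

```python
# Minimal reproduction sketch (my assumptions, not the authors' implementation).
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

def make_loader(degrees, train):
    # Each task rotates every MNIST image by a fixed angle (0 deg for task 1, 20 deg for task 2).
    tfm = T.Compose([
        T.Lambda(lambda img: T.functional.rotate(img, degrees)),
        T.ToTensor(),
    ])
    ds = torchvision.datasets.MNIST("data", train=train, download=True, transform=tfm)
    return torch.utils.data.DataLoader(ds, batch_size=10 if train else 256, shuffle=train)

# MLP with 2 hidden layers of 256 units, each followed by ReLU.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def train_one_pass(loader):
    # A single pass (one epoch) through the task's training set.
    model.train()
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

@torch.no_grad()
def accuracy(loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

# Task 1: plain MNIST; Task 2: the same digits rotated by 20 degrees.
train_one_pass(make_loader(0, train=True))
acc_task1 = accuracy(make_loader(0, train=False))
train_one_pass(make_loader(20, train=True))
acc_task1_after = accuracy(make_loader(0, train=False))
print(f"task 1 accuracy: {acc_task1:.3f}, after fine-tuning on task 2: {acc_task1_after:.3f}")
```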
Unfortunately, when I ran this experiment I found that after finishing the first task my network reaches 96% accuracy on the test set, in contrast to the 85% you report, and fine-tuning only on the second task does lead to catastrophic forgetting (not so catastrophic in this case, but it costs roughly 5% of test-set accuracy on the first task).
Could you please share any details about your experimental setup? Am I missing something?