This is the official code repository for this paper.
We use Python 3.9 and the libraries listed in requirements.txt, install with pip -r requirements.txt.
Set experimental parameters and run the SeqGenSeqClass or SeqGenNonSeqClass settings with mc_train.py (training with model collapse).
Similarly, use nomc_train.py for the SeqClass setting (no model collapse).
Regardless, the brunt of the code is carried out in train.py, which holds a class of all required parameters, architectures, etc, to conduct SeqGenSeqClass, our most complicated setting. Of most relevance will likely be the train_generator, train_classifier, and reparation_batch functions. Models for both generators and classifiers are in models.py file (all together), or separately in the models directory. Many other algorithmic choices are found in the utils.py file.
All in prep_data directory.
- Default parameters in
_train.pyfiles correspond to ColoredMNIST, which is created byci_mnist.py. - ColoredSVHN is refered to as SVHN in this repo. Created by
SVHN.py - CelebA created in
prep_celeba.py. Note that conducting experiments for this setting is expensive. - FairFace is created in
fairface.py, again experimenting with this dataset is expensive. Please see the Appendix D for details on the datasets, models, and compute.
There are several plotting scripts in the plotting_script directory.
- The
plotting.pycan generate figures illustrating MIDS for one experiment. comparisons.pyplots the figures used to comapare MIDS and reparations.plotting_ablation.pycreates figures for our abblation studies.overal_model_stats.pycan recall the average and stdev performances for A_L and A_S
The names of the settings in the code differ from the paper:
- MC. Model collapse, SeqGen (usually SeqGenSeqClass).
- ARCLA. Classifier-side AR for SegGenSeqClass. In ColoredMNIST, this is just 'AR'.
- ARGEN. Generator-side AR for SeqGenSeqClass.
- MCNOSEQ. Model collapse with nonsequential classifiers, SegGenNonSeqClass.
- NOMC. No model collapse, SeqClass.
- ARNOMC. AR + NOMC, classifier-side AR for SeqClass
Additionally, references to sensitive groups usually use red and green to denote the minoritized and majoritized groups, as in ColoredMNIST and ColoredSVHN.
An arxiv version of this work may be found here: https://arxiv.org/abs/2403.07857