Thank you so much for this work. I'm currently trying to reproduce the results in the paper, and I kept all the hyperparameters the same as in the source code of this repository.
I trained the first-stage VQ-GAN model for 2.2M steps, the same as the pre-trained model you provided. I then reconstructed the whole LSUN Churches dataset and calculated the FID using calc_VQGAN_FID.py. Over the course of training, the FID decreased at first but started to increase after 0.8M steps, as shown in Figure 1 below.

Then I trained the second-stage diffusion model for 2.0M steps, sampled 50K images with a temperature of 0.9, and calculated the FID. My local reproduction reaches an FID of 5.53, which is noticeably worse than the 4.07 reported in the paper.
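
To be explicit about what I mean by FID: as far as I understand, it is the Fréchet distance between Gaussians fitted to Inception features of the real and generated images. Below is a minimal sketch of that formula, assuming the features have already been extracted. This is not the code in calc_VQGAN_FID.py; the function name `frechet_distance` and the random stand-in features are my own, for illustration only.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Hypothetical helper, not part of this repository.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b. Small negative or imaginary parts are
    # numerical noise, so keep the real part and clip at zero.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Stand-in features only; real FID uses Inception activations (2048-dim).
rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))
fake = rng.normal(loc=0.5, size=(2000, 8))
print(frechet_distance(real, fake))
```

Since the score depends on the feature extractor, the reference set, and the number of samples, a difference in any of these between my setup and yours could account for part of the gap.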

I ran all my local experiments on a single NVIDIA V100 (Volta) GPU.
So I wonder whether the hyperparameter settings in the hparams/defaults/ folder are the ones you used to get the final results, or whether there are any tricks I might have missed. Do you also have any idea what causes the unexpected increase in FID when training the first-stage VQ-GAN model?
Thank you so much!