What tricks did you use to train the model? I tried to rebuild the model based on your paper, but it crashes when I try to train it. The paper only mentions that you used the RMSprop optimizer with a learning rate of 3e-4, and I don't think that's enough information to rebuild the model.
Also, how many epochs are needed for training? I trained the model on about 19,000 training images for 20 epochs, and the generated images still look like merged faces.
I am now training my model with Gaussian noise, but I don't know whether you did that. I also tried increasing the weight of the LL loss in the encoder loss; as a result, the KL divergence stalls at about 1.
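
To make the weighting question concrete, here is a minimal sketch of what I mean by the encoder loss. It assumes the encoder loss is the KL term plus a weighted LL term, which is my reading of the paper and may not match your setup. The names `kl_divergence`, `encoder_loss` and `ll_weight` are my own, not from the paper, and the LL loss value is just passed in as a number here.

```python
import math


def kl_divergence(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, I) ), summed over the latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def encoder_loss(mu, log_var, ll_loss, ll_weight=1.0):
    """Encoder loss = KL term + ll_weight * LL loss.

    ll_weight is a hypothetical knob of mine: values above 1.0 increase
    the proportion of the LL loss relative to the KL term.
    """
    return kl_divergence(mu, log_var) + ll_weight * ll_loss


# Toy values only, not real training numbers.
mu = [1.0, 0.0]
log_var = [0.0, 0.0]
print(encoder_loss(mu, log_var, ll_loss=2.0, ll_weight=5.0))
```

With a large `ll_weight` the LL term dominates the sum, so my guess is the optimizer has little pressure to reduce the KL term further. Is that consistent with what you saw, or did you weight the two terms differently?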