About your training details - batch_size and training time? #20

@justin4ai

Description

In set_up_hparams.py, parser.add_argument("--train_steps", type=int, default=100000000) shows that the default train_steps is 100000000, and there is no argument overriding it in your train_vqgan command line.

It also seems you apply a cycle function to the train dataloader, which lets it iterate indefinitely regardless of batch size. In other works, train_steps generally decreases as batch_size increases, but here train_steps is fixed at 100000000.
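For reference, by "cycle function" I mean the common pattern below (a minimal sketch of my understanding, not the repository's exact code; `loader` is a placeholder list standing in for a real DataLoader):

```python
def cycle(iterable):
    # Restart the iterator whenever it is exhausted, so the training
    # loop can keep drawing batches forever, independent of epoch length.
    while True:
        for item in iterable:
            yield item

# Hypothetical illustration: a 3-batch "dataloader" drawn from for 7 steps.
loader = [1, 2, 3]
batches = [b for _, b in zip(range(7), cycle(loader))]
print(batches)  # -> [1, 2, 3, 1, 2, 3, 1]
```

Under this pattern, the loop runs for a fixed number of steps rather than a fixed number of epochs, which is what prompted my questions below.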

Given that, I wonder about the following two things:

  1. batch_size has nothing to do with reducing the number of training steps, and therefore the training time. Is this understanding correct?
  2. I saw you said it took less than 10 days to train the VQGAN and the absorbing diffusion model on a single 2080 Ti. Did you train all 100000000 steps to the end and simply pick the 1400000th checkpoint?

Thanks for providing such great work!!

Best,
Junyeong Ahn
