TUTORIAL.md
Lines changed: 19 additions & 4 deletions
@@ -39,19 +39,34 @@ A publicly-available dataset is available [on huggingface hub](https://huggingfa

 Approximately 162GB of images are available in the `split_train` directory, although this format is not required by SimpleTuner.

-### Batch size impacts aspect bucketing
+You can simply create a single folder full of jumbled-up images, or they can be neatly organised into subdirectories.

-Your maximum batch size is a function of your available VRAM and image resolution.
+**Here are some important guidelines:**
+
+### Training batch size
+
+Your maximum batch size is a function of your available VRAM and image resolution:
+
+```
+vram use = batch size * resolution + base_requirements
+```
+
+To reduce VRAM use, you can reduce batch size or resolution, but the base requirements will always bite us in the ass. SDXL is a **huge** model.
+
+To summarise:

 - You want as high of a batch size as you can tolerate.
 - The larger you set `RESOLUTION`, the more VRAM is used, and the lower your batch size can be.
 - A larger batch size requires more training data in each bucket, since each one **must** contain a minimum of that many images.
+- If you can't get a single iteration done with batch size of 1 and resolution of 128x128 on Adafactor or AdamW8Bit, your hardware just won't work.

-Consequently, this means you should use as much high quality training data as you can acquire.
+Which brings up the next point: **you should use as much high quality training data as you can acquire.**

 ### Selecting images

-- JPEG artifacts and blurry images are a no-go. If you're trying to extract frames from a movie to train from, you're going to have a bad time as the compression ruins most of it - only the excessively large releases in the 40+ GB range are really going to be useful for improving image clarity.
+- JPEG artifacts and blurry images are a no-go. The model **will** pick these up.
+- Same goes for watermarks, "badges", and artist signatures. That will all be picked up effortlessly.
+- If you're trying to extract frames from a movie to train from, you're going to have a bad time. Compression ruins most films - only the large 40+ GB releases are really going to be useful for improving image clarity.
 - Image resolutions optimally should be divisible by 64.
   - This isn't **required**, but is beneficial to follow.
 - Square images are not required, though they will work.
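
As a rough illustration of the `vram use = batch size * resolution + base_requirements` relation and the divisible-by-64 guideline in the hunk above, here is a minimal Python sketch. It is not part of SimpleTuner: the function names, the `gb_per_megapixel` coefficient, and every number used with it are hypothetical placeholders, and real VRAM use is not strictly linear, so treat it as a way to reason about the trade-off rather than a measurement.

```python
def estimate_vram_gb(batch_size, width, height, gb_per_megapixel, base_gb):
    """Linear estimate mirroring the tutorial's formula:
    batch size * resolution cost + fixed base requirements.
    gb_per_megapixel and base_gb are hypothetical, user-supplied values."""
    megapixels = (width * height) / 1_000_000
    return batch_size * megapixels * gb_per_megapixel + base_gb


def max_batch_size(vram_gb, width, height, gb_per_megapixel, base_gb):
    """Largest batch size whose linear estimate fits in vram_gb.
    Returns 0 when even a batch size of 1 does not fit."""
    if width <= 0 or height <= 0 or gb_per_megapixel <= 0:
        raise ValueError("width, height and gb_per_megapixel must be positive")
    batch = 0
    # Grow the batch until the next step would exceed the VRAM budget.
    while estimate_vram_gb(batch + 1, width, height, gb_per_megapixel, base_gb) <= vram_gb:
        batch += 1
    return batch


def snap_down_to_multiple(value, multiple=64):
    """Round a pixel dimension down to the nearest multiple (64 by default)."""
    return (value // multiple) * multiple
```

With made-up figures of 24 GB of VRAM, 12 GB of base requirements and 1.5 GB per megapixel, `max_batch_size(24, 1000, 1000, 1.5, 12)` returns 8, and lowering the resolution or the base cost raises that ceiling. `snap_down_to_multiple(1000)` returns 960, the nearest dimension at or below 1000 that is divisible by 64.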