
Conversation

@AmericanPresidentJimmyCarter

Closes #1611

I haven't had time to test it much, but you can make route configs (CLI args) like:

            '--tread_config=' + json.dumps({
                'routes': [
                    {
                        'start_layer_idx': 8,
                        'end_layer_idx': -8,
                        'selection_ratio': 0.5,
                    },
                ],
            }),

It works, it does speed up training, and the speedup is proportional to the selection_ratio (the closer that is to 1.0, the more tokens are dropped).
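To illustrate what a route does between its start and end layers, here is a minimal numpy sketch of ratio-based token dropping. The actual trainer operates on torch tensors; the function name and signature here are illustrative, not SimpleTuner's API.

```python
import numpy as np

def drop_tokens(x: np.ndarray, selection_ratio: float, rng=None):
    """Keep a random subset of tokens; selection_ratio is the fraction dropped.

    x: (batch, seq_len, dim) hidden states.
    Returns the kept tokens and their original indices (needed later for RoPE).
    """
    rng = rng or np.random.default_rng(0)
    b, s, d = x.shape
    n_keep = max(1, int(round(s * (1.0 - selection_ratio))))
    kept = np.empty((b, n_keep, d), dtype=x.dtype)
    keep_idx = np.empty((b, n_keep), dtype=np.int64)
    for i in range(b):
        # Sample without replacement, then sort to preserve token order.
        idx = np.sort(rng.choice(s, size=n_keep, replace=False))
        keep_idx[i] = idx
        kept[i] = x[i, idx]
    return kept, keep_idx
```

With selection_ratio=0.5 on a 16-token sequence, 8 tokens survive per sample, which is where the compute savings come from.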

I haven't messed with the router configurations much to see what works and what doesn't.

It also worked with masked loss, where masking prevents certain tokens from being dropped. That slows training down, since more tokens are retained.
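The interaction with masked loss can be sketched as follows: positions flagged by the mask are always kept, so the kept count can exceed the ratio's target, which is why training slows down. Again a numpy sketch with illustrative names, not the PR's actual code.

```python
import numpy as np

def drop_tokens_masked(x, selection_ratio, protect_mask, rng=None):
    """Drop tokens by ratio, but never drop positions where protect_mask is 1.

    x: (batch, seq_len, dim); protect_mask: (batch, seq_len) of 0/1.
    Returns per-sample lists, since kept counts can differ across the batch.
    """
    rng = rng or np.random.default_rng(0)
    s = x.shape[1]
    target_keep = max(1, int(round(s * (1.0 - selection_ratio))))
    kept, keep_idx = [], []
    for i in range(x.shape[0]):
        protected = np.flatnonzero(protect_mask[i])
        free = np.flatnonzero(protect_mask[i] == 0)
        # Only fill up to the target with unprotected tokens; protected
        # tokens are kept unconditionally, even if that exceeds the target.
        n_extra = max(0, target_keep - protected.size)
        extra = rng.choice(free, size=min(n_extra, free.size), replace=False)
        idx = np.sort(np.concatenate([protected, extra]).astype(np.int64))
        keep_idx.append(idx)
        kept.append(x[i, idx])
    return kept, keep_idx
```

If the masked region covers more tokens than the ratio would normally keep, every masked token survives and the effective drop rate falls below the configured one.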

Token dropping also had to be implemented for RoPE in FLUX, as this is required to make TREAD work. I'm not sure that implementation is 100% correct, but it seems to work.
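The key point with RoPE is that after dropping tokens mid-network, the surviving tokens' positions are no longer contiguous, so the rotary tables must be indexed by each token's original position rather than recomputed for the shorter sequence. A hedged numpy sketch of that idea (standard 1-D RoPE, not FLUX's exact axial variant):

```python
import numpy as np

def rope_tables(seq_len, head_dim, base=10000.0):
    """Standard RoPE angle tables for positions 0..seq_len-1."""
    inv_freq = 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, head_dim // 2)
    return np.cos(angles), np.sin(angles)

def apply_rope(x, cos, sin, keep_idx=None):
    """Rotate query/key features. After token dropping, index the tables
    with the ORIGINAL positions of the surviving tokens, not 0..n_keep-1."""
    if keep_idx is not None:
        cos, sin = cos[keep_idx], sin[keep_idx]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

Gathering the tables with keep_idx keeps each token's rotary phase consistent with where it originally sat in the sequence, which is presumably what the FLUX change has to guarantee.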

The more tokens you drop, the higher the loss appears to be at the start of LoRA/LoKr training. It corrects fairly rapidly and the images look more normal; I'm unsure whether that's just the network adjusting to the smaller number of tokens supplied in the intermediate layers.

There was also a bug in the main branch that seemed to break training with masked loss for FLUX. This was fixed with a `self.config.model_flavour == "kontext"` guard.

@bghira
Owner

bghira commented Jul 28, 2025

replaced by #1675

@bghira bghira closed this Jul 28, 2025


Successfully merging this pull request may close these issues.

Consider adding TREAD training