
Conversation

@vasqu (Contributor) commented Jan 5, 2026

As per the title, we often have something along the lines of `config._attn_implementation == "flash_attention_2"`. This is not ideal and quite error-prone, as other FA flavors (e.g. kernels-based FA and FA3) are silently ignored. This PR simply changes these checks to the generalized version everywhere.
Additionally, it fixes a few incorrect flags that still used `_supports_flash_attn_2` (including within tests); relevant commit: a231a2b.

Note that sam2 video, sam3 tracker video, and edgetam video do not have any FA tests, but they only use a mask for target-guided attention, which is not used by default. We therefore add a fallback (similar to sam3) so that FA can still be used for the other attention blocks.

Note that I changed pixtral slightly more, since it was doing unnecessary movements at each attention layer. I'm not sure why the mask was handled properly but the position ids were not.
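
To illustrate the idea (a rough sketch, not the exact diff; the two function names below are hypothetical), the generalization swaps exact string comparisons for a substring match so that FA3 and kernels-based FA are covered too:

```python
def uses_fa_old(config) -> bool:
    # Before: only the exact "flash_attention_2" string is recognized,
    # so FA3 and kernels-based FA flavors silently fall through.
    return config._attn_implementation == "flash_attention_2"


def uses_fa_generalized(config) -> bool:
    # After: any Flash Attention flavor is recognized, e.g. "flash_attention_2",
    # "flash_attention_3", or kernels-based variants.
    return "flash" in config._attn_implementation
```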

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vasqu changed the title from "[FA] Generalize fa config checks" to "[FA] Generalize fa config checks and fix flags" (Jan 6, 2026)
@vasqu (Contributor, Author) commented Jan 6, 2026

run-slow: pixtral

@vasqu marked this pull request as ready for review on January 6, 2026 at 19:10
@vasqu requested review from Cyrilvallez and removed the request for Rocketknight1 on January 6, 2026 at 19:11
@github-actions bot commented Jan 6, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/pixtral"]
quantizations: []

@github-actions bot commented Jan 6, 2026

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

@huggingface deleted a comment from the github-actions bot on Jan 7, 2026
@Cyrilvallez (Member) left a comment

Very much needed indeed, now that we have many flavors!
Even if it's probably alright to keep a small check like this, what do you think about introducing a small helper such as `is_flash_implementation` (probably in `import_utils.py`, alongside `is_tracing` etc.)? It would probably make this clearer and more maintainable as we keep adding flavors!

@Cyrilvallez (Member) commented
Sounds to me like it's quite important to have a unique entry point for these cases; otherwise we risk very surprising bugs/issues in just a few models simply because they miss some checks.

@vasqu (Contributor, Author) commented Jan 12, 2026

Yes, let me change the check to a separate fn; I'll probably go over more models then (the ones which already do the proper `"flash" in ...` check).

@vasqu requested a review from Cyrilvallez on January 13, 2026 at 16:55
@vasqu (Contributor, Author) commented Jan 13, 2026

Moved it to `utils.generic` (`import_utils` doesn't fit IMO), and added this fn to all models now. Should make this way more maintainable.
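
For reference, a minimal sketch of what such a helper could look like, based on the name and docstring quoted in the review excerpts below; the actual implementation in `utils.generic` may differ:

```python
def is_flash_attention_requested(config, requested_attention_implementation=None):
    """Sketch: return True if any Flash Attention flavor (FA2, FA3, kernels-based) is requested.

    Checks an explicitly passed implementation string first, otherwise falls back
    to the value stored on the config.
    """
    attn_implementation = (
        requested_attention_implementation
        if requested_attention_implementation is not None
        else getattr(config, "_attn_implementation", None)
    )
    return attn_implementation is not None and "flash" in attn_implementation
```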

@Cyrilvallez (Member) left a comment

Very nice, thanks a lot! It will truly make it easier!

Comment on lines +177 to +185
if is_flash_attention_requested(self.config) and attention_similarity is not None:
    # Target guided masks are represented as float masks and are incompatible with Flash Attention
    # Fallback to SDPA for this call only so the rest of the model can still benefit from FA
    attention_interface = ALL_ATTENTION_FUNCTIONS["sdpa"]
    logger.warning_once(
        "Falling back to SDPA for target-guided attention because "
        "Flash Attention does not support additive bias masks."
    )

Member:

Was not there before, but trusting you on this one!

Contributor (Author):

Discussed this with @yonigozlan internally. This feature seems to be rarely used (it should be None in most cases), and these models often have several different attention mechanisms; we want to fall back here to support FA at least partially (similarly done in SAM3).

Comment on lines 215 to 217
Checks whether some flavor of flash attention is requested or not.
Priority order first goes for any explicitly passed value `requested_attention_implementation` and
then checks the config's saved value `config._attn_implementation`.
Member:

Maybe let's just raise if both are provided, no? It does not make sense to give both.

Contributor (Author):

Yeah, makes sense; I changed the logic and description to enforce passing just one, not both.
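
Updating the earlier sketch accordingly, the "exactly one, not both" guard could look like this (hypothetical signature and error message; the actual change in the PR may differ):

```python
def is_flash_attention_requested(config=None, requested_attention_implementation=None):
    """Sketch: accept either a config or an explicit implementation string, never both."""
    if (config is None) == (requested_attention_implementation is None):
        raise ValueError(
            "Pass exactly one of `config` or `requested_attention_implementation`."
        )
    attn_implementation = (
        requested_attention_implementation
        if requested_attention_implementation is not None
        else config._attn_implementation
    )
    return attn_implementation is not None and "flash" in attn_implementation
```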

@github-actions bot commented
[For maintainers] Suggested jobs to run (before merge)

run-slow: afmoe, altclip, auto, autoformer, bamba, bark, bloom, clipseg, codegen, deepseek_v2, deepseek_v3, edgetam, edgetam_video, ernie4_5_vl_moe, falcon, falcon_h1

@github-actions bot commented
View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43121&sha=6c5129

@vasqu merged commit 9eea1b0 into huggingface:main on Jan 15, 2026
25 checks passed
@vasqu deleted the fix-general-fa-checks branch on January 15, 2026 at 19:38