
Conversation

@Abdennacer-Badaoui (Member)

What does this PR do?

Fixes Flash Attention 2 generation with left-padding in GPT-2 by ensuring `is_causal=True` is set even when padding masks are provided.

Flash Attention 2 handles causal masking and padding independently, but the original code incorrectly disabled causal masking when any attention mask was present.

Fixes: `test_flash_attn_2_generate_padding_left`
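For context, a minimal sketch of the scenario the test covers (the model choice and generation settings are illustrative; FA2 requires flash-attn and a CUDA device):

```python
# Illustrative repro of the scenario the failing test exercises: batched
# generation with left-padding under Flash Attention 2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.float16,
).to("cuda")

# Prompts of different lengths force left-padding on the shorter one.
inputs = tok(["Hello", "Hello, my dog is cute"], return_tensors="pt", padding=True).to("cuda")
out = model.generate(**inputs, max_new_tokens=8)
print(tok.batch_decode(out, skip_special_tokens=True))
```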

@Abdennacer-Badaoui force-pushed the fix/test_flash_attn_2_generate_padding_left branch from 2877ab9 to 4f20a1f on October 31, 2025 at 15:13
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Abdennacer-Badaoui (Member, Author)

@vasqu, ready for review :)

@remi-or requested review from Cyrilvallez and vasqu and removed the request for Cyrilvallez on November 3, 2025 at 14:00
@Cyrilvallez (Member) previously approved these changes on Nov 10, 2025 and left a comment:

Makes sense, FA should know that it's indeed causal! cc @vasqu here as well!

However, I think we can even remove the shape check completely: SDPA checks it internally to make sure of the alignment, and FA should already align the mask correctly, if I remember right.

@Cyrilvallez dismissed their stale review on November 10, 2025 at 11:08: "did not mean to approve, oops"

@vasqu (Contributor) left a comment:

I'm with Cyril here: let's simplify the forward, essentially removing the whole is_causal logic from it, and only set is_causal in the init based on whether we are cross-attention or not.

@vasqu (Contributor) commented Nov 10, 2025:

All the is_causal logic is correctly handled by the integrations; we just need to make sure we define it correctly on init
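In code, that suggestion amounts to something like the following hypothetical sketch (ToyAttention and its fields are illustrative, not the actual transformers module):

```python
import torch
import torch.nn as nn


class ToyAttention(nn.Module):
    """Hypothetical sketch (not the actual transformers code) of the pattern
    the reviewers describe: decide causality once at init, with cross-attention
    as the only non-causal case, and let the attention backend combine it with
    any padding mask at runtime."""

    def __init__(self, embed_dim: int, num_heads: int, is_cross_attention: bool = False):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.out = nn.Linear(embed_dim, embed_dim)
        # Set once here; forward() no longer flips it off just because a
        # padding mask happens to be present.
        self.is_causal = not is_cross_attention

    def forward(self, x: torch.Tensor, attention_mask: torch.Tensor | None = None) -> torch.Tensor:
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2) for z in (q, k, v))
        # The real model dispatches to FA2/SDPA/eager; plain eager shown here.
        scores = q @ k.transpose(-2, -1) / self.head_dim**0.5
        if self.is_causal:
            causal = torch.ones(t, t, dtype=torch.bool, device=x.device).tril()
            scores = scores.masked_fill(~causal, float("-inf"))
        if attention_mask is not None:  # padding mask of shape (b, t), 1 = keep
            scores = scores.masked_fill(attention_mask[:, None, None, :] == 0, float("-inf"))
        # Fully masked pad-query rows softmax to NaN; zero them out, since
        # their outputs are discarded anyway.
        attn = scores.softmax(dim=-1).nan_to_num(0.0)
        return self.out((attn @ v).transpose(1, 2).reshape(b, t, -1))
```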

@Abdennacer-Badaoui (Member, Author)

Makes sense. Thanks @vasqu, @Cyrilvallez. Pushing the changes now.

@Abdennacer-Badaoui force-pushed the fix/test_flash_attn_2_generate_padding_left branch from 015bba0 to 2edb434 on November 10, 2025 at 13:59
@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: decision_transformer, gpt2

@vasqu (Contributor) commented Nov 10, 2025:

run-slow: decision_transformer, gpt2

@github-actions (Contributor)

This comment contains run-slow, running the specified jobs:

models: ["models/decision_transformer", "models/gpt2"]
quantizations: []

@vasqu (Contributor) commented Nov 10, 2025:

run-slow: decision_transformer, gpt2

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

⚠️ No test being reported (jobs are skipped or cancelled)!

@github-actions (Contributor)

This comment contains run-slow, running the specified jobs:

models: ["models/decision_transformer", "models/gpt2"]
quantizations: []

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

@vasqu (Contributor) left a comment:

Thx a lot, looking good now!

@vasqu (Contributor) commented Nov 10, 2025:

@Abdennacer-Badaoui can you run the FA tests locally and confirm that those pass? Seems like our CI doesn't have FA installed anymore, so just wanna double-check.

@Abdennacer-Badaoui (Member, Author)

@vasqu, I tested the failing FA tests for which I opened the PR; they're all passing.
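For reference, a typical local invocation of that test looks something like this (the file path is assumed; flash-attn must be installed):

```bash
RUN_SLOW=1 pytest tests/models/gpt2/test_modeling_gpt2.py -k "flash_attn_2_generate_padding_left" -v
```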

@vasqu merged commit 4dd4a8f into huggingface:main on Nov 10, 2025 (24 checks passed)
@vasqu (Contributor) commented Nov 10, 2025:

Thx a lot ❤️

@Abdennacer-Badaoui deleted the fix/test_flash_attn_2_generate_padding_left branch on November 10, 2025 at 15:17