
Conversation

@jiminha (Contributor) commented Sep 25, 2024

What does this PR do?

Transformers 4.45 includes the dynamic cache updates (huggingface/transformers#31421), which removed self.bias from the GPT-J attention and caused the failure below. Until we investigate further and update the code for the new Transformers release, we are putting self.bias back in GaudiGPTJAttention.__init__().

Fixes # (issue)

GAUDI2_CI=1 RUN_SLOW=1 python3.10 -m pytest tests/test_text_generation_example.py -k token0-EleutherAI/gpt-j-6b-1

File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/gptj/modeling_gptj.py", line 124, in _attn
causal_mask = self.bias[:, :, key_length - query_length : key_length, :key_length]
AttributeError: 'GaudiGPTJAttention' object has no attribute 'bias'
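
For reference, a minimal sketch of what "putting self.bias back" could look like, mirroring the causal-mask buffers that upstream GPTJAttention registered before the 4.45 refactor. The class name and default max_positions below are illustrative, not the actual Gaudi implementation; the real change would go inside GaudiGPTJAttention.__init__() in modeling_gptj.py.

import torch
import torch.nn as nn

class GaudiGPTJAttentionSketch(nn.Module):
    """Illustrative only: restores the bias buffers the way pre-4.45 upstream
    GPTJAttention defined them. Not the actual GaudiGPTJAttention class."""

    def __init__(self, max_positions: int = 2048):
        super().__init__()
        # Lower-triangular boolean causal mask; _attn() slices it as
        # self.bias[:, :, key_length - query_length : key_length, :key_length]
        self.register_buffer(
            "bias",
            torch.tril(torch.ones((max_positions, max_positions), dtype=torch.bool)).view(
                1, 1, max_positions, max_positions
            ),
            persistent=False,
        )
        # Large negative value used to fill masked (non-causal) positions.
        self.register_buffer("masked_bias", torch.tensor(-1e9), persistent=False)

Because the buffers are registered with persistent=False, they are not part of the state dict, so restoring them does not affect checkpoint loading.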

@jiminha requested a review from ZhaiFeiyue as a code owner September 25, 2024 23:02
@jiminha requested reviews from libinta and regisss September 25, 2024 23:18
@regisss merged commit 9216159 into transformers_future Sep 26, 2024
1 check passed
@regisss deleted the jha/gptjbias branch September 26, 2024 07:51
@jiminha (Contributor, Author) commented Sep 26, 2024

@atakaha Could you check on this PR? Transformers 4.45 has the dynamic cache updates (huggingface/transformers#31421), which removed self.bias and caused an error on our side. I tried to update the model file based on their new logic, but found that update_sincos_cache() also uses self.bias. For now I have put self.bias back in our model file so the test passes. Could you please review Transformers PR #31421 and see if you can update the GPT-J model file accordingly?

@atakaha (Contributor) commented Sep 26, 2024

Looks good to me.
I'll study how to apply the PR to GPT-J.
