
add long sequence strategies #8076

Merged
wawltor merged 30 commits into PaddlePaddle:develop from WAI-clear:longSeq_strategy
Mar 26, 2024

Conversation

@WAI-clear
Contributor

PR types

PR changes

Models, APIs

Description

Decouple the long-sequence strategies from the model code.
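A minimal sketch of what such decoupling can look like (all class, registry, and function names here are illustrative assumptions, not the PR's actual API): each long-sequence strategy is a small class looked up by name, so model code no longer hard-codes a particular RoPE or ALiBi variant.

```python
# Hypothetical sketch: a name-keyed registry that decouples long-sequence
# strategies from model code. Names are illustrative, not the PR's API.

class RotaryEmbedding:
    """Plain rotary position embedding (configuration only, for the sketch)."""
    def __init__(self, dim: int, base: float = 10000.0):
        self.dim = dim
        self.base = base

class NTKScaledRotaryEmbedding(RotaryEmbedding):
    """NTK-scaled variant: stretches the RoPE base for longer contexts."""
    def __init__(self, dim: int, scaling_factor: float = 1.0, **kwargs):
        super().__init__(dim, **kwargs)
        self.base *= scaling_factor ** (dim / max(dim - 2, 1))
        self.scaling_factor = scaling_factor

STRATEGY_REGISTRY = {
    "RotaryEmbedding": RotaryEmbedding,
    "NTKScaledRotaryEmbedding": NTKScaledRotaryEmbedding,
}

def get_long_sequence_strategy(name: str, **init_args):
    """Look a strategy up by name; the model stays agnostic of the variants."""
    try:
        return STRATEGY_REGISTRY[name](**init_args)
    except KeyError:
        raise ValueError(f"Unknown long-sequence strategy: {name}") from None
```

With this shape, adding a new strategy means registering one class, with no edits to each model's `modeling.py`.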

@paddle-bot

paddle-bot Bot commented Mar 8, 2024

Thanks for your contribution!

@gongel gongel self-requested a review March 8, 2024 04:23
@codecov

codecov Bot commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 43.16940%, with 104 lines in your changes missing coverage. Please review.

Project coverage is 55.41%. Comparing base (db49062) to head (dc8da0a).
Report is 52 commits behind head on develop.

Files Patch % Lines
...s/long_sequence_strategies/embedding_strategies.py 25.39% 47 Missing ⚠️
...s/long_sequence_strategies/attention_strategies.py 37.50% 15 Missing ⚠️
...ng_sequence_strategies/long_sequence_strategies.py 31.25% 11 Missing ⚠️
paddlenlp/transformers/llama/modeling.py 35.71% 9 Missing ⚠️
paddlenlp/transformers/chatglm/modeling.py 41.66% 7 Missing ⚠️
paddlenlp/transformers/bloom/modeling.py 50.00% 6 Missing ⚠️
paddlenlp/transformers/chatglm_v2/modeling.py 45.45% 6 Missing ⚠️
paddlenlp/transformers/qwen/modeling.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8076      +/-   ##
===========================================
- Coverage    56.56%   55.41%   -1.16%     
===========================================
  Files          589      600      +11     
  Lines        89964    91642    +1678     
===========================================
- Hits         50889    50782     -107     
- Misses       39075    40860    +1785     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment thread llm/finetune_generation.py Outdated
if hasattr(model_config, "use_flash_attention"):
model_config.use_flash_attention = model_args.use_flash_attention

Member


This file shouldn't need to be modified, right?

Comment thread llm/llama/sft_argument.json Outdated
"intokens": true,
"zero_padding": false,
"use_flash_attention": false
} No newline at end of file
Member


This JSON file doesn't need to be modified either.


class AttentionWithLinearBias(nn.Layer):
"""
init_args: bool_attention_mask, num_heads, dtype, tensor_parallel_degree

+ self._get_interleave(2 * closest_power_of_2)[0::2][: n - closest_power_of_2]
)

def forward(self, bool_attention_mask: Tensor, num_heads: int, dtype: paddle.dtype, tensor_parallel_degree=1):
Member


What is the purpose of passing in tensor_parallel_degree?
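One plausible answer (an assumption; the thread does not resolve it): under tensor parallelism the attention heads are sharded across ranks, so each rank only needs the slice of per-head ALiBi slopes for the heads it owns. A hypothetical helper sketching that slicing:

```python
def local_alibi_slopes(slopes: list, tensor_parallel_degree: int, rank: int) -> list:
    # Hypothetical helper: shard per-head ALiBi slopes across tensor-parallel
    # ranks, mirroring how the attention heads themselves are sharded.
    num_heads = len(slopes)
    assert num_heads % tensor_parallel_degree == 0, "heads must divide evenly"
    per_rank = num_heads // tensor_parallel_degree
    return slopes[rank * per_rank : (rank + 1) * per_rank]
```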

def _get_interleave(self, n):
def _get_interleave_power_of_2(n):
start = 2 ** (-(2 ** -(math.log2(n) - 3)))
ratio = start
Member


Here ratio and start are equal. Can one of them be reused instead?
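For context, the slopes this code builds (for a power-of-two head count) form a geometric series, and because ratio equals start, the i-th slope is simply start**(i + 1). A standalone sketch of the standard ALiBi slope formula the snippet implements:

```python
import math

def alibi_slopes_power_of_2(n_heads: int) -> list:
    # Standard ALiBi head slopes for a power-of-two number of heads.
    # start == ratio, so slope_i = start * ratio**i = start**(i + 1).
    start = 2 ** (-(2 ** -(math.log2(n_heads) - 3)))
    return [start ** (i + 1) for i in range(n_heads)]
```

For 8 heads this yields 1/2, 1/4, ..., 1/256, which matches the reviewer's observation that keeping both `start` and `ratio` is redundant.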

"""
try:
import_class = importlib.import_module(f"paddlenlp.transformers.LongSequenceStrategies.{strategy_type}")
except ValueError:
Member


Shouldn't this be ModuleNotFoundError?

strategy_class = getattr(import_class, stratety_name)
strategy_instance = strategy_class(**init_args)
return strategy_instance
except AttributeError:
Member


If the failure is in looking up strategy_class, is the error raised really an AttributeError?
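To both reviewers' points: `importlib.import_module` raises ModuleNotFoundError when the module is missing (not ValueError), while `getattr` on a missing name raises AttributeError. A hedged sketch of the catch blocks this code likely wants (the function name and error messages are illustrative, not the PR's actual API):

```python
import importlib

def create_strategy(module_path: str, class_name: str, **init_args):
    # Illustrative loader: catch the exception each step actually raises.
    try:
        module = importlib.import_module(module_path)
    except ModuleNotFoundError as e:  # not ValueError
        raise ValueError(f"Unknown strategy module: {module_path}") from e
    try:
        strategy_class = getattr(module, class_name)
    except AttributeError as e:  # getattr on a missing attribute
        raise ValueError(f"Unknown strategy class: {class_name}") from e
    return strategy_class(**init_args)
```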

@@ -0,0 +1,49 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Member


File names and directory names should be lowercase.

@wawltor wawltor merged commit 6b5099a into PaddlePaddle:develop Mar 26, 2024


5 participants