[Docathon][Add CN Doc No.17] #6369
Merged

Commits (10):

- be4b55f: add doc (zade23)
- 60d9b77: Merge branch 'PaddlePaddle:develop' into en_doc_17 (zade23)
- efb8283: fix doc (zade23)
- 545ed35: Merge branch 'PaddlePaddle:develop' into en_doc_17 (zade23)
- 5a5dc5d: Update docs/api/paddle/incubate/nn/functional/fused_multi_transformer… (zade23)
- 1086aad: Update docs/api/paddle/incubate/nn/functional/fused_multi_transformer… (zade23)
- 028bc89: Merge branch 'PaddlePaddle:develop' into en_doc_17 (zade23)
- 34e0221: Update docs/api/paddle/incubate/nn/functional/fused_multi_transformer… (zade23)
- e37aa7b: rerun pre-commit (zade23)
- ba92a08: Merge branch 'develop' of https://github.com/PaddlePaddle/docs into e… (zade23)
docs/api/paddle/incubate/nn/functional/fused_multi_transformer_cn.rst (97 additions, 0 deletions)

@@ -0,0 +1,97 @@
.. _cn_api_paddle_incubate_nn_functional_fused_multi_transformer:

fused_multi_transformer
-------------------------------

.. py:function:: paddle.incubate.nn.functional.fused_multi_transformer(x, ln_scales, ln_biases, qkv_weights, qkv_biases, linear_weights, linear_biases, ffn_ln_scales, ffn_ln_biases, ffn1_weights, ffn1_biases, ffn2_weights, ffn2_biases, pre_layer_norm=True, epsilon=1e-05, cache_kvs=None, pre_caches=None, seq_lens=None, rotary_embs=None, time_step=None, attn_mask=None, dropout_rate=0.0, rotary_emb_dims=0, activation='gelu', training=False, mode='upscale_in_train', trans_qkvw=True, ring_id=-1, name=None)

This is a fused operator that computes multiple transformer layers of a Transformer model architecture.

The fused_multi_transformer operator only supports running on GPUs.

The functionality of one transformer layer matches the following pseudocode:
.. code-block:: text

    >>> if pre_layer_norm:
    ...     out = layer_norm(x)
    ...     out = qkv_linear(out) + qkv_bias
    ... else:
    ...     out = qkv_linear(x) + qkv_bias
    >>> out = transpose(out, perm=[2, 0, 3, 1, 4])
    >>> # extract q, k and v from out.
    >>> q = out[0:1, ::]
    >>> k = out[1:2, ::]
    >>> v = out[2:3, ::]
    >>> out = q * k^t
    >>> out = attn_mask + out
    >>> out = softmax(out)
    >>> out = dropout(out)
    >>> out = out * v
    >>> out = transpose(out, perm=[0, 2, 1, 3])
    >>> out = linear(out)
    >>> if pre_layer_norm:
    ...     out = x + dropout(out + bias)
    ... else:
    ...     out = layer_norm(x + dropout(out + bias))

    >>> residual = out
    >>> if pre_layer_norm:
    ...     out = ffn_layer_norm(out)
    >>> out = ffn1_linear(out)
    >>> out = dropout(activation(out + ffn1_bias))
    >>> out = ffn2_linear(out)
    >>> out = residual + dropout(out + ffn2_bias)
    >>> if not pre_layer_norm:
    ...     out = ffn_layer_norm(out)
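To make the dataflow concrete, here is a minimal, runnable NumPy sketch of the pre_layer_norm path of a single layer, mirroring the pseudocode above. It is an illustration only (single head, qkv biases and dropout omitted, tanh-approximated GELU); all names and weight shapes in it are hypothetical, not the operator's internals:

.. code-block:: python

    import numpy as np

    def layer_norm(x, eps=1e-5):
        # Normalize over the last axis; eps plays the role of the epsilon parameter.
        return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

    def softmax(x):
        e = np.exp(x - x.max(-1, keepdims=True))
        return e / e.sum(-1, keepdims=True)

    def gelu(x):
        # tanh approximation of GELU, standing in for activation='gelu'.
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

    def transformer_layer_pre_ln(x, attn_mask, wqkv, wo, w1, b1, w2, b2):
        # One pre_layer_norm layer, single attention head, dropout omitted.
        h = layer_norm(x)
        q, k, v = [h @ w for w in wqkv]      # qkv_linear
        scores = q @ k.T + attn_mask         # out = q * k^t; out = attn_mask + out
        out = x + softmax(scores) @ v @ wo   # attention output + residual
        residual = out
        h = gelu(layer_norm(out) @ w1 + b1)  # ffn1_linear + activation
        return residual + h @ w2 + b2        # ffn2_linear + residual

    # Tiny smoke test with random weights.
    rng = np.random.default_rng(0)
    d, ff, T = 8, 32, 4
    x = rng.standard_normal((T, d))
    wqkv = [rng.standard_normal((d, d)) * 0.1 for _ in range(3)]
    wo = rng.standard_normal((d, d)) * 0.1
    w1, b1 = rng.standard_normal((d, ff)) * 0.1, np.zeros(ff)
    w2, b2 = rng.standard_normal((ff, d)) * 0.1, np.zeros(d)
    y = transformer_layer_pre_ln(x, np.zeros((T, T)), wqkv, wo, w1, b1, w2, b2)
    print(y.shape)  # (4, 8): same shape as the input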
Parameters
::::::::::::
- **x** (Tensor) - The input tensor. It may be a 3-D tensor with data type float16 or float32 and shape ``[batch_size, sequence_length, d_model]``.
- **ln_scales** (list(Tensor)|tuple(Tensor)) - The weight tensors of the layer-norm layers in the attention blocks, with shape ``[d_model]``.
- **ln_biases** (list(Tensor)|tuple(Tensor)) - The bias tensors of the layer-norm layers in the attention blocks, with shape ``[d_model]``.
- **qkv_weights** (list(Tensor)|tuple(Tensor)) - The weight tensors of the attention qkv computation, with shape ``[3, num_head, dim_head, d_model]``.
- **qkv_biases** (list(Tensor)|tuple(Tensor)|None) - The bias tensors of the attention qkv computation, with shape ``[3, num_head, dim_head]``.
- **linear_weights** (list(Tensor)|tuple(Tensor)) - The weight tensors of the linear layers in the attention blocks, with shape ``[num_head * dim_head, d_model]``.
- **linear_biases** (list(Tensor)|tuple(Tensor)|None) - The bias tensors of the linear layers in the attention blocks, with shape ``[d_model]``.
- **ffn_ln_scales** (list(Tensor)|tuple(Tensor)) - The weight tensors of the layer-norm layers in the feed-forward blocks, with shape ``[d_model]``.
- **ffn_ln_biases** (list(Tensor)|tuple(Tensor)) - The bias tensors of the layer-norm layers in the feed-forward blocks, with shape ``[d_model]``.
- **ffn1_weights** (list(Tensor)|tuple(Tensor)) - The weight tensors of the first linear transformation in the feed-forward blocks, with shape ``[d_model, dim_feedforward]``.
- **ffn1_biases** (list(Tensor)|tuple(Tensor)|None) - The bias tensors of the first linear transformation in the feed-forward blocks, with shape ``[dim_feedforward]``.
- **ffn2_weights** (list(Tensor)|tuple(Tensor)) - The weight tensors of the second linear transformation in the feed-forward blocks, with shape ``[dim_feedforward, d_model]``.
- **ffn2_biases** (list(Tensor)|tuple(Tensor)|None) - The bias tensors of the second linear transformation in the feed-forward blocks, with shape ``[d_model]``.
- **pre_layer_norm** (bool, optional) - Whether it is pre_layer_norm (True) or post_layer_norm (False). Default: True.
- **epsilon** (float, optional) - A small float value added to the denominator of layer_norm to avoid division by zero. Default: 1e-5.
- **cache_kvs** (list(Tensor)|tuple(Tensor), optional) - The cache structure tensors for generative models, with shape ``[2, bsz, num_head, max_seq_len, head_dim]``. Default: None.
- **pre_caches** (list(Tensor)|tuple(Tensor), optional) - The prefix caches for generative models, with shape ``[2, bsz, num_head, cache_len, head_dim]``. Default: None.
- **seq_lens** (Tensor, optional) - The sequence lengths of this batch, with shape ``[bsz]``. Default: None.
- **rotary_embs** (Tensor, optional) - The RoPE embeddings used for the rotary computation, with shape ``[2, bsz, 1, seq_len, head_dim]``. Default: None.
- **time_step** (Tensor, optional) - The time-step tensor for generative models. Used in the decoding stage, it denotes the time step, i.e. the actual sequence length of the CacheKV. Its shape is ``[1]`` and it must be placed on CPUPlace. Default: None.
- **attn_mask** (Tensor, optional) - Used in the multi-head attention layers to prevent attending to some unwanted positions, usually padding or subsequent positions. Its shape is ``[batch_size, 1, sequence_length, sequence_length]``. Default: None.
- **dropout_rate** (float, optional) - The dropout probability of setting units to zero. Default: 0.0.
- **rotary_emb_dims** (int, optional) - The rotary_emb_dims of the rotary computation: 0 when rotary_embs is None, 1 when rotary_embs is not None and pos_extra_ids is None, and 2 when neither rotary_embs nor pos_extra_ids is None. Default: 0.
- **activation** (str, optional) - The activation function. Default: "gelu".
- **training** (bool, optional) - A flag indicating whether this is the training phase. Default: False.
- **mode** (str, optional) - One of ['upscale_in_train' (default) | 'downscale_in_infer']; the two conventions are illustrated in the sketch after this list.

  1. upscale_in_train (default): scale the output up at training time

     - train: out = input * mask / (1.0 - p)
     - inference: out = input

  2. downscale_in_infer: scale the output down at inference time

     - train: out = input * mask
     - inference: out = input * (1.0 - p)

- **trans_qkvw** (bool, optional) - Whether to transpose the qkv weights. If True, the shape of the qkv weights should be [3, num_head, dim_head, dim_embed]; otherwise, it should be [dim_embed, 3, num_head, dim_head]. Default: True.
- **ring_id** (int, optional) - Used for the distributed forward pass in tensor model parallelism; only NCCL is supported. Default: -1, meaning model parallelism is not used.
- **name** (str, optional) - Name of the operation (optional, default is None). For more information, please refer to :ref:`api_guide_Name`.
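To make the mode semantics concrete, here is a small, runnable sketch of the two scaling conventions (an illustration only, not the operator's kernel; the function name and seed handling are hypothetical):

.. code-block:: python

    import numpy as np

    def dropout(x, p, training, mode="upscale_in_train", seed=0):
        # p plays the role of dropout_rate; mask zeroes each unit with probability p.
        if training:
            mask = (np.random.default_rng(seed).random(x.shape) >= p).astype(x.dtype)
            if mode == "upscale_in_train":
                return x * mask / (1.0 - p)  # train: input * mask / (1.0 - p)
            return x * mask                  # train: input * mask
        if mode == "upscale_in_train":
            return x                         # inference: input
        return x * (1.0 - p)                 # inference: input * (1.0 - p)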
Returns
::::::::::::
- Tensor|tuple: If ``cache_kvs`` is None, returns a tensor with the same shape and data type as ``x``, representing the output of the Transformer. If ``cache_kvs`` is not None, returns the tuple (output, cache_kvs), where output is the Transformer output and cache_kvs is updated in place from the input ``cache_kvs``.
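The cache bookkeeping implied by the shapes above can be pictured as follows (a NumPy sketch under assumed shapes; the real update happens inside the fused kernel, and append_kv is a hypothetical helper):

.. code-block:: python

    import numpy as np

    # Shapes taken from the cache_kvs description in the parameter list.
    bsz, num_head, max_seq_len, head_dim = 2, 4, 16, 8
    cache_kv = np.zeros((2, bsz, num_head, max_seq_len, head_dim), dtype="float32")

    def append_kv(cache_kv, k, v, time_step):
        # time_step is the actual seq_len of the CacheKV: the new key/value of
        # this decode step is written at that position, then the step advances.
        cache_kv[0, :, :, time_step, :] = k  # keys
        cache_kv[1, :, :, time_step, :] = v  # values
        return time_step + 1

    t = 0
    for _ in range(3):  # three decode steps
        k = np.ones((bsz, num_head, head_dim), dtype="float32")
        v = np.ones((bsz, num_head, head_dim), dtype="float32")
        t = append_kv(cache_kv, k, v, t)
    print(t)  # 3: attention now reads cache_kv[:, :, :, :t, :]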
Code Examples
:::::::::

COPY-FROM: paddle.incubate.nn.functional.fused_multi_transformer