
Conversation

Contributor

@Zjq9409 Zjq9409 commented May 18, 2021

PR types

Performance optimization

PR changes

OPs

Describe

Optimizes the backward pass of the index_select op; benchmark data after optimization:

[benchmark image]

Compared with both the original implementation and PyTorch, the optimized version shows improved performance.

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.


paddle-bot-old bot commented Jun 2, 2021

Sorry to inform you that d121f02's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

In the Comment area of the Conversation, please describe the purpose of this PR, the performance change before and after the modification, and other relevant information.

using Tensor = framework::Tensor;
using LoDTensor = framework::LoDTensor;
using DDim = framework::DDim;


Necessary blank lines help code readability; please do not delete them.

auto input_dim_size = input_dim.size();
auto output_dim = x_grad->dims();
std::vector<T> out_vec(x_grad->numel(), 0);
std::memset(out_data, 0.0, x_grad->numel() * sizeof(T));

You could use SetConstant here. Also, for the zero-initialization: would it be more cache-friendly to move it into the for loop after L196 and initialize one portion at a time?

};

template <typename T>
struct IndexSelectAdd<

This functor does not seem to add much; it looks like, except for floating-point types, the generic form below is what gets used:

template <platform::cpu_isa_t isa, typename T, class Enable = void>
struct IndexSelectAdd {
  void operator()(int n, const T* src, T* dst) {
    for (int i = 0; i < n; i++) {
      dst[i] += src[i];
    }
  }
};

auto& out_grad = out_grad_var->Get<LoDTensor>();
auto* x_grad = x_grad_var->GetMutable<framework::LoDTensor>();
int dim = context.Attr<int>("dim");
if (dim < 0) {

Lines 212 - 219 can be changed to:

    auto *x_grad = ctx.Input<framework::LoDTensor>("X");
    auto *index = ctx.Input<framework::LoDTensor>("Index");
    auto *out_grad = ctx.Output<framework::LoDTensor>("Out");

};

template <typename T, typename IndexT = int>
#if ((!defined __NVCC__) && (!defined __HIPCC__))

Is this macro still necessary here?

@Zjq9409 Zjq9409 requested a review from Xreki July 14, 2021 11:16
void operator()(const framework::ExecutionContext& ctx, int slice_size,
                const T* src_pointer, const T* p_pointer, T* dist_pointer) {
  auto blas = math::GetBlas<DeviceContext, T>(ctx);
  blas.VADD(slice_size, src_pointer, p_pointer, dist_pointer);

When using BLAS, it would be worth measuring the speedup under different OMP settings.


@Xreki Xreki left a comment


LGTM

template <typename DeviceContext, typename T, typename IndexT = int>
void IndexSelectGradInner(const framework::ExecutionContext& context,
-                          const LoDTensor& out_grad, const LoDTensor& index,
+                          const LoDTensor* out_grad, const LoDTensor* index,

Do not change the parameter types; for inputs that are not modified, use the const Tensor& type.

@Xreki Xreki merged commit 6883403 into PaddlePaddle:develop Jul 20, 2021