Commit cd89e54

metax666, zhang-chenyi, duqimeng, StareAtYou, and jxwangmetax authored
[METAX] Modify CI logic (#2213)
* [fix] fix fail test when backend is mack * [metax]change_cupti_and_fix_softmax (#7) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. * [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave 
kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * [Metax] fix dgc & mklml compile product path problem (#8) * [Metax] fix accuracy kernel & add test_accuracy_op_metax.py unit test (#9) * [Metax] fix dgc & mklml compile product path problem * [Metax] fix accuracy kernel & add test_accuracy_op_metax.py unit test * [Metax] add mixed_vector fix & update change patch * [Metax] update metax_gpu CMakeLists.txt (#10) * [Metax] fix dgc & mklml compile product path problem * [Metax] fix accuracy kernel & add test_accuracy_op_metax.py unit test * [Metax] add mixed_vector fix & update change patch * [Metax] update metax_gpu CMakeLists.txt * [metax] updata_qr_kernel (#11) * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * [Metax] fix illegal address access error in test_momentum_op (#12) * [Metax] fix illegal address access error in test_momentum_op * 
[Metax] fix cufft and fix some blas kernel apply (#13) * [Metax] fix cufft and fix some blas kernel apply * [metax] add warpctc_warprnn (#14) * [metax] fix bug * [Metax] update metax CI (#15) * [Metax] update metax CI * [Metax] update metax CI CMakeLists (#16) * [Metax] update metax CI * [Metax] update metax CI CMakeLists * [Metax] add github action (#18) * [Metax] add github action --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * [metax] chang build (#19) * [metax]chaneg build --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * change_build (#20) * [metax]chaneg build --------- * change_build (#21) * change_build (#22) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts 
commit 83bc87f686227962b0262e044225c6ed5507b824. * [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: 
zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * 【metax】modify cmake for warpctc and warprnnt (#17) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * [metax]modify library to static library (#24) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * [Metax] organize documents (#25) * [Metax] fix dgc & mklml compile product path problem * [Metax] update metax_gpu CMakeLists.txt * [Metax] organize documents * [metax]fix_code style and index_elementwise_put_kernel (#27) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> 
Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * change_build_917 (#29) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. * [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * 
[Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * chang_build (#30) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> 
Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * [metax]modify kernel (#31) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * change_metax_work (#32) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: 
MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * change_build (#33) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email 
protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * [metax] modify fused_bias_dropout_residual_layer_norm (#34) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * modify fused_bias_dropout_residual_layer_norm * change_build (#35) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work * change_metax_work --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: 
jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * change_build (#36) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_metax_work --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> 
Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * change_warpctc.cmake (#38) * change_warpctc.cmake * change_warpctc.cmake (#39) * change warpctc.cmake * test (#40) * test --------- * test_ut (#41) * change_run_ut --------- * tets (#43) * remove_tets --------- * test (#44) * test --------- * [metax] modify compile (#42) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * modify fused_bias_dropout_residual_layer_norm * modify compile * modify blas * [Metax] add log analysis script (#46) * [Metax] fix dgc & mklml compile product path problem * [Metax] update metax_gpu CMakeLists.txt * [Metax] organize documents * [Metax] add log analysis script * add_generate_pb (#47) * add_generate_pb --------- * modify blas (#51) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * modify fused_bias_dropout_residual_layer_norm * modify compile * modify blas * modify blas * modify blas * modify blas * [metax] modify tf32 (#52) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * modify fused_bias_dropout_residual_layer_norm * modify compile * modify blas * modify blas * modify blas * modify blas * modify context * [Metax] update metax backend CI test (#53) * [Metax] fix dgc & mklml compile product path problem * [Metax] update metax_gpu CMakeLists.txt * [Metax] organize documents * [Metax] add log analysis script * [Metax] update metax backend CI test * [Metax] fix log_analysis.py bug (#54) * [Metax] fix dgc & mklml compile product path problem * [Metax] update metax_gpu CMakeLists.txt * [Metax] organize documents * 
[Metax] add log analysis script * [Metax] update metax backend CI test * [Metax] fix log_analysis.py bug * [Metax] update metax CI CMakeLists & scripts (#56) * [Metax] fix dgc & mklml compile product path problem * [Metax] update metax_gpu CMakeLists.txt * [Metax] organize documents * [Metax] add log analysis script * [Metax] update metax backend CI test * [Metax] fix log_analysis.py bug * [Metax] update metax CI CMakeLists & scripts * [Metax] fix MatmulKernel problem (#57) * [Metax] fix dgc & mklml compile product path problem * [Metax] update metax_gpu CMakeLists.txt * [Metax] organize documents * [Metax] add log analysis script * [Metax] update metax backend CI test * [Metax] fix log_analysis.py bug * [Metax] update metax CI CMakeLists & scripts * [Metax] fix MatmulKernel problem * [Metax] update metax CI program * [metax]fix paddle bug" (#58) * [metax]fix paddle bug * change—ut (#59) * change_ut * change_ut (#60) * change_ut --------- * change_ut (#63) * change_ut * change_ut --------- * [Metax] add keyword filter in CI CMakeLists.txt (#64) * [Metax] add keyword filter in CI CMakeLists.txt * [Metax] add ignore case list * [metax] modify kernels (#67) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * modify fused_bias_dropout_residual_layer_norm * modify compile * modify blas * modify blas * modify blas * modify blas * modify context * modify kernels * Fix part of the missing kernel issues (#66) Co-authored-by: root <[email protected]> * [Metax] fix index_elementwise_get kernel (#68) * [Metax] add keyword filter in CI CMakeLists.txt * [Metax] add ignore case list * [Metax] fix phi::backends::gpu::DnnVersion() symbol not found * Revert "[Metax] fix phi::backends::gpu::DnnVersion() symbol not found" This reverts commit 087a9c1240f024210d536e543a2fc55db1175529. 
* [Metax] fix index_elementwise_get kernel * [metax]fix patch and fix missing kernel (#72) * [metax]fix patch and fix missing kernel * [metax] modify kernels (#73) * modify kernels * [metax] modify kernels (#74) * modify kernels * [metax] link mccl and fix missing kernel (#76) * [metax] link mccl and fix missing kernel * [metax] rename yaml file (#77) * [metax]fix patch and fix missing kernel * [metax] link mccl and fix missing kernel * [metax] rename yaml file --------- * [metax] rm file (#78) * [metax]fix patch and fix missing kernel * [metax] link mccl and fix missing kernel * [metax] rename yaml file * [metax] rm file * [metax] rm file --------- * metax_fix_ci (#79) * [metax] add Rules --------- * [metax] add print tensor (#91) * modify cmake for warpctc and warprnnt * modify conv for tf32 and fp32 * modify conv kernel * modify library to static library * modify kernel * modify fused_bias_dropout_residual_layer_norm * modify compile * modify blas * modify blas * modify blas * modify blas * modify context * modify kernels * modify kernels * modify kernels * add print tensor * [Metax] change_patch (#94) * [metax] change_patch --------- * update paddle (#95) * update paddle --------- * [metax] fix dot error (#96) * [metax] fix dot error --------- * Update metax_work.yaml * [metax]rm opt path and fix activation_kernel bug (#98) * [metax]rm opt path and fix activation_kernel bug --------- * updata_paddle (#99) * updata paddle --------- * [Metax] Fix some tests (#102) * fix some tests * [metax] support wint4 in quantize (#103) * updata_metax (#104) * test * test --------- * updata_metax (#105) * chang_meatx_yaml * chang_meatx_yaml * updata_metax * test * test * test * test --------- * add one test to metax (#107) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use 
gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. * [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * fix some tests * add one test --------- Co-authored-by: sw <[email protected]> Co-authored-by: duqimeng <[email protected]> Co-authored-by: 
Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> * uodata_metax (#106) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_warpctc.cmake * change warpctc.cmake * test * change_run_ut * remove_tets * test * add_generate_pb * 
[metax]fix paddle bug * change_ut * change_ut * change_ut * [metax]fix patch and fix missing kernel * [metax] link mccl and fix missing kernel * [metax] rename yaml file * [metax] rm file * [metax] rm file * [metax] add Rules * [metax] change_patch * update paddle * [metax] fix dot error * [metax]rm opt path and fix activation_kernel bug * updata paddle * chang_meatx_yaml * chang_meatx_yaml * updata_metax * test * test * test * test * test * test * test * test * test * test * test * test --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * updata eigen_and fix_bug (#109) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_warpctc.cmake * change warpctc.cmake * test * change_run_ut * remove_tets * test * add_generate_pb * 
[metax]fix paddle bug * change_ut * change_ut * change_ut * [metax]fix patch and fix missing kernel * [metax] link mccl and fix missing kernel * [metax] rename yaml file * [metax] rm file * [metax] rm file * [metax] add Rules * [metax] change_patch * update paddle * [metax] fix dot error * [metax]rm opt path and fix activation_kernel bug * updata paddle * chang_meatx_yaml * chang_meatx_yaml * updata_metax * test * test * test * test * test * test * test * test * test * test * test * test * updata_enigen --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * updata paddle (#110) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_warpctc.cmake * change warpctc.cmake * test * change_run_ut * remove_tets * test * add_generate_pb * 
[metax]fix paddle bug * change_ut * change_ut * change_ut * [metax]fix patch and fix missing kernel * [metax] link mccl and fix missing kernel * [metax] rename yaml file * [metax] rm file * [metax] rm file * [metax] add Rules * [metax] change_patch * update paddle * [metax] fix dot error * [metax]rm opt path and fix activation_kernel bug * updata paddle * chang_meatx_yaml * chang_meatx_yaml * updata_metax * test * test * test * test * test * test * test * test * test * test * test * test * updata_enigen * updata_paddle --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> * test * [metax] modify kernels (#117) * modify kernels * modify kernels * fix activation_grad kernel (#118) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * fix some tests * add one test * fix one kernel --------- Co-authored-by: sw <[email protected]> Co-authored-by: duqimeng <[email protected]> Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> * updata flag_and_fix_activation * updata flag_and_fix_activation * updataignore --------- * updata_patch (#120) * updata_patch --------- * Update Paddle submodule to latest develop (#121) Co-authored-by: tianshuo78520a <[email protected]> * [metax] modify kernels 
(#122) * modify kernels * [Metax] fix weight_quant & weight_only_linear bug (#125) * [Metax] fix weight_quant & weight_only_linear bug * fix and add some kernels (#126) * fix and add some kernels * [Metax] fix 'WeightQuantizeKernel' wint4 branch (#133) * [Metax] fix 'WeightQuantizeKernel' wint4 branch * [Metax] add quanted weight layout transformation using CPU programming (#135) * [Metax] adjust quanted weight layout transformation * [Metax] add quanted weight layout transformation using GPU programming (#136) * [Metax] add quanted weight layout transformation using GPU programming * [Metax] updata_softmax (#138) * updata_softmax * udata patch (#139) * updata_patch --------- * [Metax] optimize wint4 quantization implementation (#140) * [Metax] optimize wint4 quantization implementation * change_flag (#141) * change_flag * [Metax] register fused_fc_elementwise_layernorm kernel (#143) * [Metax] register fused_fc_elementwise_layernorm kernel * updata paddle * [Metax] add private CI (#144) * [Metax] add private CI * [Metax] add Upload (#145) * [Metax] add Upload * test (#154) * ReRun CI (#150) * [metax]fix collect_fpn_proposals (#157) * [Metax_change_ut] * fix sum&collect_fpn_proposals op register * modify profile * [Metax] fix paddle bug replace 'MoeGradDispatchKernel' to 'MoeGateDispatchKernel' * [Metax] register bce_loss_grad & bce_loss & index_add_grad kernels * [Metax] con2d_grad use gpudnn * blas handle support * [Metax] register some kernels & update CMakeLists * [Metax] fix metax unittest fail * [Metax] add group_norm & label_smooth kernel and update matmul kernel * [Metax] fix rmsprop kernel register and add meshgrid & meshgrid_grad kernel register * add test * add test * [test] chang the logic of workspace_host in cholesky_kernel_register alloc(cpuplace,size), test pass alloc(cpuplace, size, stream), crash * [Metax] fix compile fail * Revert "[Metax] fix compile fail" This reverts commit 83bc87f686227962b0262e044225c6ed5507b824. 
* [Metax] fix compile fail by 'conv_transpose_grad_kernel_impl.h' * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] con2d_grad use gpudnn * [Metax]fix bug and add qr lstsq logsoftmax * [Metax] change_patch * [Metax] update unit test CMakeLists.txt * [Metax] update unit test CMakeLists.txt * [feature] add unique_consecutive kernel * [metax] add some kernel * [metax] add some kernel * [Metax] register baddbmm kernel & update blas api * [Metax] register baddbmm kernel & update blas api * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [feature] add add unique_consecutive kernel.cu * [fix] fix some test case due to missing op register * [fix] fix some fail text * [metax]fix lu eigvalshsqueeze rnn kernel * [metax]fix lu eigvalshsqueeze rnn kernel * add and fix some kernels * [Metax] register deformable_conv kernel & fix 'ModulatedDeformableCol2imCoord' symbol undefined * [Metax] fix conflict * [Metax] adapt to paddle-cpu-20250901 & resolve the issue of 'test_elementwise_mul_op_metax' failure * [Metax] update repeat_interleave kernel & ignore max op test * [metax]fix lu eigvalshsqueeze rnn kernel * [metax] chang patch fix copy * [metax] chang patch fix copy * [Metax] update metax_gpu unit test * [Metax] fix test CMakeList.txt * [metax]change_cupti_and_fix_softmax * [metax]change_patch * [metax]change_patch * [metax] updata_qr_kernel * [metax] updata_qr_kernel * [Metax] fix cufft and fix some blas kernel apply * [metax] fix bug * [Metax] add github action * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]chaneg build * [metax]fix_code style and index_elementwise_put_kernel * [metax]change_build * [metax]change_build * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_metax_work * change_warpctc.cmake * change warpctc.cmake * test * change_run_ut * remove_tets * test * add_generate_pb * 
[metax]fix paddle bug * change_ut * change_ut * change_ut * [metax]fix patch and fix missing kernel * [metax] link mccl and fix missing kernel * [metax] rename yaml file * [metax] rm file * [metax] rm file * [metax] add Rules * [metax] change_patch * update paddle * [metax] fix dot error * [metax]rm opt path and fix activation_kernel bug * updata paddle * chang_meatx_yaml * chang_meatx_yaml * updata_metax * test * test * test * test * test * test * test * test * test * test * test * test * updata_enigen * updata_paddle * test * updata ignore * updata_ignore * updata flag_and_fix_activation * updataignore * updata_patch * feat: add gammaln_grad_kernel.cu * updata_softmax * updata_patch * change_flag * [metax] add private CI * [metax] add private CI * [metax] add private CI * [Metax] add private CI * [Metax] add private CI * [Metax] add private CI * [Metax] add private CI * [Metax] add private CI * [Metax] add private CI * [Metax] add Upload * chang yaml * chang ut * updata_paddle * [metax] add schedule * test * [metax]fix collect_fpn_proposals --------- Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: metax666 <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: chezhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> Co-authored-by: root <[email protected]> * [Metax]Update version information (#158) * [Metax] update env (#163) * [metax] Timed trigger (#164) * 【Metax】update (#165) * [Metax] fix version (#166) * [Metax] fix nterpolate_grad_kernel (#167) * [metax]fix version.txt (#169) * test (#170) * update yaml (#171) * [Metax]add parameterized (#172) * [Metax] Assign data stream to CUDA (#174) * [Metax] fix CUDA Kernel No.50 (#175) * [metax] change yaml (#176) * [metax] Add some tests for CI (#173) * Change test script to use 8 jobs instead of 16 * 【Metax】fix patch (#178) * [METAX] Modify CI logic 
(#179) * [Metax] fix patch (#180) * ignore bilinear_interp_v2_op (#181) * change yaml-yml (#182) * test (#183) * rm metax ci (#184) * updata paddle (#185) * updata_paddle (#186) * tets --------- Co-authored-by: chezhang <[email protected]> Co-authored-by: duqimeng <[email protected]> Co-authored-by: Mingkun.Zhang <[email protected]> Co-authored-by: jiaxinWang-metax <[email protected]> Co-authored-by: MingkunZhang <[email protected]> Co-authored-by: zhang-chenyi <[email protected]> Co-authored-by: ZhouDuan <[email protected]> Co-authored-by: Theendlessofhell <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: ZhouDuan <[email protected]> Co-authored-by: sw <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: tianshuo78520a <[email protected]> Co-authored-by: Yuqiang Ge <[email protected]> Co-authored-by: metax666 <[email protected]>
1 parent 78f6295 commit cd89e54

5 files changed: +134 -121 lines changed

.github/workflows/CI.yml

Lines changed: 4 additions & 0 deletions
@@ -47,6 +47,10 @@ jobs:
     uses: ./.github/workflows/_IXUCA.yml
     needs: [Codestyle-Check]
 
+  metax:
+    name: metax
+    uses: ./.github/workflows/_Metax-X86.yml
+    needs: [Codestyle-Check]
   #sdaa:
     #name: sdaa
     #uses: ./.github/workflows/_SDAA.yml

.github/workflows/_Metax-X86.yaml

Lines changed: 0 additions & 96 deletions
This file was deleted.

.github/workflows/_Metax-X86.yml

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
+name: PR-CI-METAX
+
+
+on:
+  workflow_call:
+    inputs:
+      workflow-name:
+        type: string
+        required: false
+      clone_dir:
+        type: string
+        required: false
+        default: 'PaddlecustomDevice'
+      is_pr:
+        type: string
+        required: false
+        default: 'true'
+
+
+defaults:
+  run:
+    shell: bash
+
+
+jobs:
+  metax-gpu-test:
+    runs-on: paddle-metax-runner-set
+    env:
+      PR_ID: ${{ github.event.pull_request.number }}
+      COMMIT_ID: ${{ github.event.pull_request.head.sha }}
+      BRANCH: develop
+
+
+    steps:
+      - name: Checkout repository
+        run: |
+          set -x
+          wget -q --tries=5 --no-proxy https://paddle-github-action.bj.bcebos.com/PaddleCustomDevice/PR/${PR_ID}/${COMMIT_ID}/PaddleCustomDevice.tar.gz --no-check-certificate
+          echo "Extracting PaddleCustomDevice.tar.gz"
+          tar -xf PaddleCustomDevice.tar.gz
+          cd PaddleCustomDevice
+          git config --global --add safe.directory "*"
+          git remote add upstream https://github.com/PaddlePaddle/PaddleCustomDevice.git
+          git merge ${BRANCH} --no-edit
+          git --no-pager log --pretty=oneline -5
+
+      - name: Check bypass
+        id: check-bypass
+        uses: ./PaddleCustomDevice/.github/actions/check-bypass
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+          workflow-name: metax
+
+
+      - name: RUN METAX-GPU
+        id: run-metax
+        if: steps.check-bypass.outputs.can-skip != 'true'
+        run: |
+          cd PaddleCustomDevice
+          # !!!!! SKIP IF NO METAX CHANGE !!!!
+          echo "=========== Checking PR Changes If METAX FULL CI Needed ==========="
+
+          change_numbers=$(git diff --name-only remotes/origin/${BRANCH} | wc -l)
+
+          change_backend=$(git diff --name-only remotes/origin/${BRANCH} | grep "backends/"| wc -l)
+          change_metax_only=$(git diff --name-only remotes/origin/${BRANCH} | grep "backends/metax_gpu"| wc -l)
+          git --no-pager diff --name-only remotes/origin/${BRANCH}
+          if [ $change_numbers -ne $change_backend ]; then
+            echo "Common file changed, continue to run METAX FULL CI test ..."
+            echo "should_skip=false" >> $GITHUB_OUTPUT
+          elif [ $change_metax_only -eq 0 ] ; then
+            echo "NO METAX backend changes found, skip METAX FULL CI ...."
+            echo "should_skip=true" >> $GITHUB_OUTPUT
+            exit 0
+          else
+            echo "should_skip=false" >> $GITHUB_OUTPUT
+          fi
+
+      - name: compile
+        run: |
+          cd PaddleCustomDevice/backends/metax_gpu
+          bash build.sh
+
+      - name: run test
+        run: |
+          cd PaddleCustomDevice/backends/metax_gpu/tests
+          bash run_test.sh -j 8
+
+      - name: push whl
+        env:
+          PR_ID: ${{ github.event.pull_request.number }}
+          COMMIT_ID: ${{ github.event.pull_request.head.sha }}
+        run: |
+          pip install bce-python-sdk==0.8.74
+          export AK=paddle
+          export SK=paddle
+          if [ ! -f "BosClient.py}" ]; then
+            wget -q --no-proxy https://xly-devops.bj.bcebos.com/home/bos_retry.tar.gz --no-check-certificate
+            tar xf bos_retry.tar.gz
+          fi
+          cp PaddleCustomDevice/backends/metax_gpu/build/dist/paddle_metax_gpu*.whl .
+          python BosClient.py paddle_metax_gpu*.whl paddle-github-action/PaddleCustomDevice/metax_gpu/${PR_ID}/${COMMIT_ID}
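The skip decision in the `RUN METAX-GPU` step can be read as a small pure function over the list of changed paths. The sketch below is only an illustration of that logic, not code from this commit: the function name `decide_should_skip` is hypothetical, and it reads the paths from stdin instead of calling `git diff` so that it can be run anywhere. It mirrors the step's substring matching on `backends/` and `backends/metax_gpu`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): reads changed paths, one per
# line, from stdin and prints the same should_skip=... line the workflow step
# appends to $GITHUB_OUTPUT.
decide_should_skip() {
  local total=0 backend=0 metax=0 path
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    total=$((total + 1))
    # Substring match, mirroring `grep "backends/"` in the workflow step.
    case "$path" in
      *backends/*) backend=$((backend + 1)) ;;
    esac
    # Substring match, mirroring `grep "backends/metax_gpu"`.
    case "$path" in
      *backends/metax_gpu*) metax=$((metax + 1)) ;;
    esac
  done

  if [ "$total" -ne "$backend" ]; then
    # At least one changed file lives outside backends/: run the full CI.
    echo "should_skip=false"
  elif [ "$metax" -eq 0 ]; then
    # Only other backends changed: the METAX run can be skipped.
    echo "should_skip=true"
  else
    echo "should_skip=false"
  fi
}

# Example: a PR that only touches the METAX backend.
printf 'backends/metax_gpu/build.sh\nbackends/metax_gpu/tests/run_test.sh\n' \
  | decide_should_skip   # prints should_skip=false
```

In a workflow, the input would presumably come from `git diff --name-only remotes/origin/${BRANCH}` and the output would be appended to `$GITHUB_OUTPUT`. Note that `exit 0` only ends the current step. For the skip to take effect, later steps would need a condition such as `if: steps.run-metax.outputs.should_skip != 'true'`, which the `compile`, `run test`, and `push whl` steps shown above do not carry.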

backends/metax_gpu/patch/paddle.patch

Lines changed: 27 additions & 25 deletions
@@ -19,10 +19,10 @@ index cfada544d4..a690e97d74 100644
1919

2020
set(EIGEN_INCLUDE_DIR ${SOURCE_DIR})
2121
diff --git a/paddle/fluid/operators/fused/CMakeLists.txt b/paddle/fluid/operators/fused/CMakeLists.txt
22-
index 99a0116d92..2566e7c41a 100755
22+
index 8d445b39ae..504e7b6293 100755
2323
--- a/paddle/fluid/operators/fused/CMakeLists.txt
2424
+++ b/paddle/fluid/operators/fused/CMakeLists.txt
25-
@@ -43,6 +43,11 @@ if(WITH_GPU OR WITH_ROCM)
25+
@@ -39,6 +39,11 @@ if(WITH_GPU OR WITH_ROCM)
2626
op_library(fused_multi_transformer_int8_op)
2727
endif()
2828

@@ -34,19 +34,6 @@ index 99a0116d92..2566e7c41a 100755
3434
if(CUDA_VERSION GREATER_EQUAL 11.6)
3535
op_library(fused_gemm_epilogue_op)
3636
endif()
37-
diff --git a/paddle/fluid/platform/profiler/cupti_data_process.cc b/paddle/fluid/platform/profiler/cupti_data_process.cc
38-
index bff0f2bf70..9376b5781f 100644
39-
--- a/paddle/fluid/platform/profiler/cupti_data_process.cc
40-
+++ b/paddle/fluid/platform/profiler/cupti_data_process.cc
41-
@@ -16,7 +16,7 @@
42-
43-
#include <cstdio>
44-
45-
-#include "paddle/fluid/platform/enforce.h"
46-
+// #include "paddle/fluid/platform/enforce.h"
47-
#include "paddle/phi/core/os_info.h"
48-
#include "paddle/phi/core/platform/device/gpu/gpu_info.h"
49-
#include "paddle/phi/core/platform/profiler/utils.h"
5037
diff --git a/paddle/phi/backends/dynload/cublas.h b/paddle/phi/backends/dynload/cublas.h
5138
index bda9cbe17e..c73eba9c8a 100644
5239
--- a/paddle/phi/backends/dynload/cublas.h
@@ -98,7 +85,7 @@ index 8b2e08c777..ca926df151 100644
9885
#define CUBLASLT_BLAS_ROUTINE_EACH(__macro) \
9986
__macro(cublasLtCreate); \
10087
diff --git a/paddle/phi/backends/dynload/cudnn.h b/paddle/phi/backends/dynload/cudnn.h
101-
index a943bbed9a..af931490e3 100644
88+
index ad2ada9dfa..9e8389e7dc 100644
10289
--- a/paddle/phi/backends/dynload/cudnn.h
10390
+++ b/paddle/phi/backends/dynload/cudnn.h
10491
@@ -38,7 +38,10 @@ extern void EnforceCUDNNLoaded(const char* fn_name);
@@ -134,7 +121,7 @@ index 1547909d92..ef20838434 100644
134121
} \
135122
}; \
136123
diff --git a/paddle/phi/backends/dynload/cupti.h b/paddle/phi/backends/dynload/cupti.h
137-
index 59e92955c9..d2f8c2da15 100644
124+
index 4241a512e8..94e32b743e 100644
138125
--- a/paddle/phi/backends/dynload/cupti.h
139126
+++ b/paddle/phi/backends/dynload/cupti.h
140127
@@ -24,8 +24,8 @@ limitations under the License. */
@@ -148,7 +135,7 @@ index 59e92955c9..d2f8c2da15 100644
148135

149136
extern std::once_flag cupti_dso_flag;
150137
extern void *cupti_dso_handle;
151-
@@ -71,7 +71,7 @@ extern void *cupti_dso_handle;
138+
@@ -105,7 +105,7 @@ inline bool IsXPUTracingEnabled() {
152139
CUPTI_ROUTINE_EACH(DECLARE_DYNAMIC_LOAD_CUPTI_WRAP);
153140

154141
#undef DECLARE_DYNAMIC_LOAD_CUPTI_WRAP
@@ -191,7 +178,7 @@ index e8cb0ac643..e8e7596d44 100644
 } \
 }; \
 diff --git a/paddle/phi/backends/dynload/dynamic_loader.cc b/paddle/phi/backends/dynload/dynamic_loader.cc
-index c74ae9592e..f6dc68917c 100644
+index 39f50bd95d..4d627b99b7 100644
 --- a/paddle/phi/backends/dynload/dynamic_loader.cc
 +++ b/paddle/phi/backends/dynload/dynamic_loader.cc
 @@ -18,7 +18,6 @@ limitations under the License. */
@@ -229,7 +216,7 @@ index c5309e7e11..3328571380 100644
 } \
 }; \
 diff --git a/paddle/phi/backends/gpu/cuda/cuda_device_function.h b/paddle/phi/backends/gpu/cuda/cuda_device_function.h
-index 092365a961..23d3b65dc6 100644
+index 092365a961..8bd3f9fcea 100644
 --- a/paddle/phi/backends/gpu/cuda/cuda_device_function.h
 +++ b/paddle/phi/backends/gpu/cuda/cuda_device_function.h
 @@ -1,3 +1,4 @@
@@ -347,7 +334,22 @@ index 092365a961..23d3b65dc6 100644
 CREATE_SHFL_MASK(mask, tid < len);
 
 for (int offset = warpSize / 2; offset > 0; offset /= 2)
-
+diff --git a/paddle/phi/common/float16.h b/paddle/phi/common/float16.h
+index d970878dc2..fe0382ccad 100644
+--- a/paddle/phi/common/float16.h
++++ b/paddle/phi/common/float16.h
+@@ -105,8 +105,9 @@ struct PADDLE_ALIGN(2) float16 {
+#endif
+
+HOSTDEVICE inline explicit float16(float val) {
+-#if defined(PADDLE_CUDA_FP16) && \
+- (defined(__HIPCC__) || (defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 300))
++// #if defined(PADDLE_CUDA_FP16) && \
++// (defined(__HIPCC__) || (defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 300))
++#if 1
+half tmp = __float2half(val);
+x = *reinterpret_cast<uint16_t*>(&tmp);
+
 diff --git a/paddle/phi/core/enforce.h b/paddle/phi/core/enforce.h
 index 024a7de73e..66b373d698 100644
 --- a/paddle/phi/core/enforce.h
@@ -651,7 +653,7 @@ index 461e6e2474..48a64ae9ce 100644
 dim3 threads(kWarpSize, kBlockDimY);
 dim3 grids(static_cast<int>((D + kWarpSize - 1) / kWarpSize));
 diff --git a/paddle/phi/kernels/funcs/layer_norm_impl.cu.h b/paddle/phi/kernels/funcs/layer_norm_impl.cu.h
-index 4eae698648..5c047723ea 100644
+index 470b0d33ee..d58838d53c 100644
 --- a/paddle/phi/kernels/funcs/layer_norm_impl.cu.h
 +++ b/paddle/phi/kernels/funcs/layer_norm_impl.cu.h
 @@ -43,11 +43,11 @@ template <typename T>
@@ -995,7 +997,7 @@ index 9d4bb18d55..80405c2b78 100644
 }
 }
 diff --git a/paddle/phi/kernels/fusion/gpu/masked_multihead_attention_kernel.cu b/paddle/phi/kernels/fusion/gpu/masked_multihead_attention_kernel.cu
-index acb3b83bc9..264d2a2b3e 100644
+index 6cf08a5ac7..c09018ba78 100644
 --- a/paddle/phi/kernels/fusion/gpu/masked_multihead_attention_kernel.cu
 +++ b/paddle/phi/kernels/fusion/gpu/masked_multihead_attention_kernel.cu
 @@ -15,7 +15,7 @@
@@ -1008,7 +1010,7 @@ index acb3b83bc9..264d2a2b3e 100644
 namespace phi {
 namespace fusion {
 diff --git a/paddle/phi/kernels/fusion/gpu/qkv_unpack_mha_kernel.cu b/paddle/phi/kernels/fusion/gpu/qkv_unpack_mha_kernel.cu
-index b2d15a59f8..f64582e85a 100644
+index 1e7869afec..26ac439fc7 100644
 --- a/paddle/phi/kernels/fusion/gpu/qkv_unpack_mha_kernel.cu
 +++ b/paddle/phi/kernels/fusion/gpu/qkv_unpack_mha_kernel.cu
 @@ -15,7 +15,7 @@
@@ -1021,7 +1023,7 @@ index b2d15a59f8..f64582e85a 100644
 namespace phi {
 namespace fusion {
 diff --git a/paddle/phi/kernels/gpu/depthwise_conv.h b/paddle/phi/kernels/gpu/depthwise_conv.h
-index 2edac5eba5..4f265e3db7 100644
+index 770a3e1296..b0ec1b949b 100644
 --- a/paddle/phi/kernels/gpu/depthwise_conv.h
 +++ b/paddle/phi/kernels/gpu/depthwise_conv.h
 @@ -29,8 +29,8 @@ namespace cub = hipcub;

backends/metax_gpu/tests/ignore.txt

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ test_conv3d_transpose_op
 test_conv3d_layer
 test_conv3d_transpose_part2_op
 test_fused_conv2d_add_act_op
+test_bilinear_interp_v2_op
