Contributor

@YKTian-x2b YKTian-x2b commented May 27, 2024

PR Category

Inference

PR Types

Others

Description

Add a custom target to CMakeLists so that the make stage can run the build scripts for fused_conv2d and gemm_epilogue, each producing its corresponding .so.

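Why these are not compiled directly in CMakeLists: the cutlass version used by the kernels may differ from the version of the paddle submodule, which in turn may require C++17. So the corresponding scripts are run instead.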

P-card-71501


paddle-bot bot commented May 27, 2024

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

cd $build_directory
cmake .. -DPYTHON_EXECUTABLE=$python_exe_path -DCUDA_TOOLKIT_ROOT_DIR=$cuda_root_path -DCOMPUTE_CAPABILITY=$gpu_cc
make -j
make -j10
Contributor Author

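The thread count here is still to be decided; it may end up being something like $(nproc)/4. With too few threads, you get the awkward situation where paddle has finished compiling while the cutlass operator dynamic library is still building.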
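As a rough illustration of the `$(nproc)/4` idea above, the job count could be derived from the core count instead of being hard-coded. This is only a sketch: the helper name `compute_jobs` and the default divisor of 4 are assumptions for illustration, not part of this PR, and the right ratio would still need tuning against the overall paddle build time.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): pick a parallel job count
# as a fraction of the available cores.
# The default divisor 4 mirrors the "$(nproc)/4" suggestion; it is not a tuned value.
compute_jobs() {
  local cores="$1"
  local divisor="${2:-4}"
  local jobs=$(( cores / divisor ))
  # Never go below one job, e.g. on machines with fewer cores than the divisor.
  if (( jobs < 1 )); then
    jobs=1
  fi
  echo "$jobs"
}

# Usage sketch (commented out so the block has no side effects):
# make -j"$(compute_jobs "$(nproc)")"
```

The floor of one job keeps the script usable on small machines, where integer division would otherwise produce `-j0`.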

@zhoutianzi666 zhoutianzi666 changed the title cutlass kernel compile optimization [Paddle Inference]cutlass kernel compile optimization May 29, 2024
gpu_cc="${3:-$default_gpu_cc}"

cd $build_directory
cmake .. -DPYTHON_EXECUTABLE=$python_exe_path -DCUDA_TOOLKIT_ROOT_DIR=$cuda_root_path -DCOMPUTE_CAPABILITY=$gpu_cc
Contributor

Should this use the default cmake from paddle instead?

Contributor Author

Got it! I'll fix it right away.

yuanlehome previously approved these changes May 30, 2024
Contributor

@zhoutianzi666 zhoutianzi666 left a comment

LGTM

@zhoutianzi666 zhoutianzi666 merged commit c2836d0 into PaddlePaddle:develop May 31, 2024
co63oc pushed a commit to co63oc/Paddle that referenced this pull request Jun 3, 2024