Merged
4 changes: 2 additions & 2 deletions paddle/phi/kernels/legacy/gpu/fp8_gemm_blockwise_kernel.cu
@@ -156,7 +156,7 @@ void cublas_gemm_blockwise_impl(const DenseTensor& A,
PADDLE_CUDABLAS_CHECK(phi::dynload::cublasLtMatmulDescCreate(
&operationDesc, CUBLAS_COMPUTE_32F, CUDA_R_32F));

-#if CUBLAS_VERSION >= 120804 && CUDA_VERSION >= 12060
+#if CUBLAS_VERSION >= 120901 && CUDA_VERSION >= 12090
Contributor @lshpku commented on Jun 26, 2025:

Thanks for catching this issue! The feature was introduced in 12.8.5, so the original code was indeed an oversight, but restricting it to 12.9 isn't appropriate either. If you want to get this PR merged, please find supporting evidence in the official cuBLAS documentation and then update the version check here.

Contributor (PR author) @co63oc commented on Jun 26, 2025:

apt-get installs 12.8.4 by default and no newer version is available, so how is 12.8.5 supposed to be installed?
[screenshot]

For now, this PR will change the check to 12.8.5.

// Setup scaling for A and B
cublasLtMatmulMatrixScale_t A_scale_mode, B_scale_mode;
// Note: in cuBLAS term, tensor name A and B are swapped.
@@ -187,7 +187,7 @@ void cublas_gemm_blockwise_impl(const DenseTensor& A,
sizeof(B_scale_mode)));
#else
PADDLE_THROW(phi::errors::InvalidArgument(
-      "Sub-channel FP8 GEMM requires CUDA 12.8 and cuBLAS 12.8.4 or later."));
+      "Sub-channel FP8 GEMM requires CUDA 12.9 and cuBLAS 12.9.1 or later."));
#endif

// setup transa and transb