Qwesh157/conv_op_optimization

Convolution Operator Optimization

Introduction

This project explores optimizing the convolution operator on GPUs, including GEMM-based (Implicit GEMM) convolution.

Content

  • CUDA core Implicit GEMM forward
  • CUDA core Implicit GEMM backward
  • CuTe Implicit GEMM

This blog provides a detailed introduction to the optimization steps.
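
As a rough illustration of the idea behind Implicit GEMM, the sketch below runs a convolution as a matrix multiply on the CPU without ever materializing the im2col matrix: each GEMM index is decoded into tensor coordinates on the fly. It is not the repository's kernel. The `ConvShape` struct and `implicit_gemm_conv2d` function are hypothetical names, and the NCHW/KCRS layouts and the meaning of `u`, `v` (stride) and `p`, `q` (padding) are assumptions based on the argument names used in main.cu.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle; names mirror the call in main.cu,
// but their meanings are assumed, not confirmed by the repository.
struct ConvShape {
    int n, c, h, w;  // input: batch, channels, height, width (NCHW assumed)
    int k, r, s;     // filter: output channels, height, width (KCRS assumed)
    int u, v;        // stride along height and width (assumed)
    int p, q;        // zero padding along height and width (assumed)
    int outh() const { return (h - r + 2 * p) / u + 1; }
    int outw() const { return (w - s + 2 * q) / v + 1; }
};

// Convolution written as C[M][N] = A[M][K] * B[K][N], where
//   M = k, N = n * outh * outw, K = c * r * s,
// A is the weight tensor and B is the im2col view of the input,
// computed element by element instead of being stored.
std::vector<float> implicit_gemm_conv2d(const std::vector<float>& input,
                                        const std::vector<float>& weight,
                                        const ConvShape& cs) {
    const int outh = cs.outh(), outw = cs.outw();
    const int M = cs.k;
    const int N = cs.n * outh * outw;
    const int K = cs.c * cs.r * cs.s;
    assert(input.size() == static_cast<std::size_t>(cs.n) * cs.c * cs.h * cs.w);
    assert(weight.size() == static_cast<std::size_t>(M) * K);

    std::vector<float> output(static_cast<std::size_t>(M) * N, 0.0f);
    for (int m = 0; m < M; ++m) {
        for (int col = 0; col < N; ++col) {
            // Decode the GEMM column into (image, output row, output column).
            const int img = col / (outh * outw);
            const int oh = (col % (outh * outw)) / outw;
            const int ow = col % outw;
            float acc = 0.0f;
            for (int kk = 0; kk < K; ++kk) {
                // Decode the reduction index into (channel, filter row, filter column).
                const int ch = kk / (cs.r * cs.s);
                const int fr = (kk % (cs.r * cs.s)) / cs.s;
                const int fs = kk % cs.s;
                const int ih = oh * cs.u - cs.p + fr;
                const int iw = ow * cs.v - cs.q + fs;
                // Positions that fall in the padding region contribute zero.
                if (ih < 0 || ih >= cs.h || iw < 0 || iw >= cs.w) continue;
                const std::size_t in_idx =
                    ((static_cast<std::size_t>(img) * cs.c + ch) * cs.h + ih) * cs.w + iw;
                acc += weight[static_cast<std::size_t>(m) * K + kk] * input[in_idx];
            }
            // Store in NKHW order (assumed output layout).
            const std::size_t out_idx =
                ((static_cast<std::size_t>(img) * cs.k + m) * outh + oh) * outw + ow;
            output[out_idx] = acc;
        }
    }
    return output;
}
```

On a GPU, the same index decoding would typically happen while tiles are loaded, so the three loops above become a tiled GEMM; the sketch only shows the index mapping.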

/cuda Implementation on GPU
  /implicitgemm Implicit GEMM convolution implementation
  /implicitgemmbwd Implicit GEMM convolution backward implementation
/cudnn cuDNN test on GPU
/cute Convolution implemented with CuTe

Build and run

$ cd cuda/implicitgemm
$ bash implgemm.sh

To build a different version of the program, change TARGET in the Makefile.

Verification

main.cu contains verification code, which is commented out because it runs slowly.

// printf("===================start verfiy===================\n");
// direct_conv2dcpu(input, weight, output, n, c, h, w, k, r, s, u, v, p, q);
// int error = 0;
// for (int i = 0; i < n * k * outh * outw; i++)
// {
//     if (abs(output_host[i] - output[i]) > getPrecision(output[i]))
//     {
//         printf("error, postion:%d, gpuvalue:%f, cpuvalue:%f\n", i, output_host[i], output[i]);
//         error++;
//         break;
//     }
// }
// printf("================finish,error:%d=========================\n", error);

To verify the correctness of the results, uncomment the code above.
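
For readers who want to see what such a check involves, here is a minimal sketch of a CPU reference convolution and a tolerance-based comparison. It is not the repository's `direct_conv2dcpu` or `getPrecision`: `direct_conv2d_reference` and `count_mismatches` are hypothetical stand-ins, the argument order only mirrors the call shown above, and the parameter meanings (`u`, `v` as strides, `p`, `q` as padding, NCHW layout) and the tolerance values are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: a direct (seven-loop) 2D convolution.
// Assumed meanings: u, v = stride; p, q = zero padding; NCHW input, KCRS weight.
void direct_conv2d_reference(const float* input, const float* weight, float* output,
                             int n, int c, int h, int w, int k, int r, int s,
                             int u, int v, int p, int q) {
    const int outh = (h - r + 2 * p) / u + 1;
    const int outw = (w - s + 2 * q) / v + 1;
    for (int img = 0; img < n; ++img)
        for (int oc = 0; oc < k; ++oc)
            for (int oh = 0; oh < outh; ++oh)
                for (int ow = 0; ow < outw; ++ow) {
                    double acc = 0.0;  // accumulate in double to limit rounding error
                    for (int ch = 0; ch < c; ++ch)
                        for (int fr = 0; fr < r; ++fr)
                            for (int fs = 0; fs < s; ++fs) {
                                const int ih = oh * u - p + fr;
                                const int iw = ow * v - q + fs;
                                if (ih < 0 || ih >= h || iw < 0 || iw >= w) continue;
                                const std::size_t in_idx =
                                    ((static_cast<std::size_t>(img) * c + ch) * h + ih) * w + iw;
                                const std::size_t wt_idx =
                                    ((static_cast<std::size_t>(oc) * c + ch) * r + fr) * s + fs;
                                acc += static_cast<double>(input[in_idx]) * weight[wt_idx];
                            }
                    const std::size_t out_idx =
                        ((static_cast<std::size_t>(img) * k + oc) * outh + oh) * outw + ow;
                    output[out_idx] = static_cast<float>(acc);
                }
}

// Hypothetical comparison: counts elements whose difference exceeds a mixed
// absolute/relative tolerance. The default tolerances are illustrative only.
std::size_t count_mismatches(const std::vector<float>& got,
                             const std::vector<float>& expected,
                             float rel_tol = 1e-4f, float abs_tol = 1e-5f) {
    assert(got.size() == expected.size());
    std::size_t errors = 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        const float diff = std::fabs(got[i] - expected[i]);
        if (diff > abs_tol + rel_tol * std::fabs(expected[i])) ++errors;
    }
    return errors;
}
```

Unlike the loop in main.cu, which stops at the first mismatch, this sketch counts every mismatching element, which can help distinguish an isolated rounding issue from a systematic indexing bug. `std::fabs` is used so the comparison cannot silently resolve to an integer `abs` overload.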

TODO

  • Triton Implicit GEMM
  • Tensor core Implicit GEMM
  • Winograd-based convolution
