@pvelesko (Collaborator) commented Jun 18, 2024

I looked into rewriting cucc in C++, but I really don't see any benefit in it. We already have a Python dependency through unit testing anyway.

Fixes #877
Fixes #874

  • Incorporate vector operation overloading fix from Jenny
  • CUDA compiler drop cmake configuration
  • nvcc symbolic link
  • Partially implement __shfl_sync variants (mask all off or all on; otherwise print an error)
  • cuda_bf16.h support
  • cuda_fp16.h support
  • math_constants.h
  • map cudaMallocAsync, cudaFreeAsync to serial versions
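
The partial `__shfl_sync` support listed above accepts only two mask values and reports an error for anything else. A minimal host-side sketch of that check follows, assuming a 32-bit mask as in CUDA. The names `SyncMaskKind`, `classifySyncMask`, and `isSupportedSyncMask` are hypothetical and are not taken from this PR; the real device-side code may look quite different.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical classification of a __shfl_sync-style mask (illustration only).
enum class SyncMaskKind { AllOff, AllOn, Partial };

// Classify a 32-bit mask as all zeros, all ones, or anything in between.
constexpr SyncMaskKind classifySyncMask(std::uint32_t mask) {
  if (mask == 0u) return SyncMaskKind::AllOff;
  if (mask == 0xFFFFFFFFu) return SyncMaskKind::AllOn;
  return SyncMaskKind::Partial;
}

// Return true for the two supported masks; print an error and return false
// for partial masks, which the PR lists as future work.
bool isSupportedSyncMask(std::uint32_t mask) {
  if (classifySyncMask(mask) == SyncMaskKind::Partial) {
    std::fprintf(stderr, "unsupported _sync mask 0x%08x\n",
                 static_cast<unsigned>(mask));
    return false;
  }
  return true;
}
```

In this sketch, `isSupportedSyncMask(0xFFFFFFFFu)` and `isSupportedSyncMask(0u)` return true, while a partial mask such as `0x0000FFFFu` prints an error and returns false.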

Future work:

  • cuda_runtime.h is not C-compatible, so some HeCBench tests fail to compile.
  • Implement _sync support for masks other than all 0s or all 1s

@pvelesko pvelesko marked this pull request as ready for review June 25, 2024 13:30
@pvelesko pvelesko requested review from franz and pjaaskel June 25, 2024 13:47
@Kerilk Kerilk requested a review from jjennychen June 27, 2024 14:49
@pvelesko pvelesko merged commit 4edbcb6 into main Jun 29, 2024
@pvelesko pvelesko deleted the cucc-cpp branch June 29, 2024 08:09

Successfully merging this pull request may close these issues.

  • cucc is a fork bomb if a symlink called nvcc pointing to cucc exists
  • Double definitions in vector operator overloading for CUDA source code