Skip to content

Conversation

@pgorlani
Copy link
Contributor

@pgorlani pgorlani commented Dec 6, 2021

This solves #4255

The fix expresses the number of threads per block in size_t in order to avoid an infinite loop for parallel_for ranges greater than INT_MAX.

@pgorlani pgorlani requested a review from a team as a code owner December 6, 2021 18:32
steffenlarsen
steffenlarsen previously approved these changes Dec 7, 2021
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@bader bader added the cuda CUDA back-end label Dec 7, 2021
@npmiller
Copy link
Contributor

npmiller commented Dec 7, 2021

I believe this issue is also going to affect the HIP plugin:

So we should probably update that as well.

@romanovvlad romanovvlad merged commit 64720d8 into intel:sycl Dec 9, 2021
@pgorlani pgorlani deleted the int_max_range branch December 9, 2021 15:03
npmiller added a commit to npmiller/llvm that referenced this pull request Dec 9, 2021
This is the equivalent for HIP of the changes in intel#5095.

It also fixes intel#4255 for the HIP plugin.
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Dec 11, 2021
* upstream/sycl: (725 commits)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
  [SYCL] Fix the test to not depend on a specific line. (intel#5092)
  [CI] Provide libclc targets to build and test (intel#5091)
  Fix build of `check-llvm-spirv` target after 8f8001a
  Force opt to use new pass manager in pr52289 test after c34d157
  ...
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Dec 12, 2021
* upstream/sycl:
  [CI] Add container users to video group (intel#5101)
  [CI] More typo fixes in Nightly build (intel#5088)
  Revert "[CI] Disable pack and upload steps (intel#5119)" (intel#5125)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
bader pushed a commit that referenced this pull request Dec 17, 2021
…#5115)

This is the equivalent for HIP of the changes in #5095.

It also fixes #4255 for the HIP plugin.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cuda CUDA back-end

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants