Conversation
Codecov Report
@@ Coverage Diff @@
## master #26 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 4 5 +1
Lines 32 65 +33
=========================================
+ Hits 32 65 +33
Continue to review full report at Codecov.
|
yaoyaowd
left a comment
There was a problem hiding this comment.
Could you add some document for the class as well?
Also by eyeball the code, bits_ seems always 64, so it is larger than B and never trigger the code in the if condition.
| } | ||
| } | ||
| int64_t* prefetch_ptr = prefetch_randint_.data_ptr<int64_t>(); | ||
| int64_t res = (prefetch_ptr[size_ - 1] % range) & ((1ULL << B) - 1); |
There was a problem hiding this comment.
why do we want ((1ULL << B) - 1);? to make sure we return positive numbers? I think maybe we can make sure whatever in prefetch_ptr is unsigned.
There was a problem hiding this comment.
Treat random numbers as just unsigned bits by pointer casting
| #include <vector> | ||
|
|
||
| #include "../../../pyg_lib/csrc/random/cpu/randint_engine.h" | ||
|
|
There was a problem hiding this comment.
could you add a simple test to ensure random maybe 1000 times from a seed without duplicates.
There was a problem hiding this comment.
Added one. It sometimes failed for 10000 times so I just have 1000 times. You have made a good guess :)
| reinterpret_cast<uint64_t*>(prefetched_randint_.data_ptr<int64_t>()); | ||
| uint64_t mask = (needed == 64) ? std::numeric_limits<uint64_t>::max() | ||
| : (1ULL << needed) - 1; | ||
| uint64_t res = (prefetch_ptr[size_ - 1] & mask) % range; |
There was a problem hiding this comment.
I feel if needed=16, and bits_=64, the next 4 calls of rand(1<<15) will return the same number because size_ is the same, mask is the same, and range is the same.
There was a problem hiding this comment.
The prefetched element is shifted accordingly after that.
yaoyaowd
left a comment
There was a problem hiding this comment.
A few minor comments. The idea of using 64 bit for buffer and get 4 of 16 bit random number is brilliant. Is this benchmarked?
| namespace pyg { | ||
| namespace random { | ||
|
|
||
| const int RAND_PREFETCH_THRESHOLD = 128; |
There was a problem hiding this comment.
nit: I think this should be called RAND_PREFETCH_SIZE instead of THRESHOLD.
| public: | ||
| PrefetchedRandint() | ||
| : PrefetchedRandint(RAND_PREFETCH_THRESHOLD, RAND_PREFETCH_BITS) {} | ||
| PrefetchedRandint(int size, int bits) : size_(size), bits_(bits) { |
There was a problem hiding this comment.
I think we don't need size_(size), bits_(bits) since they were initialized in prefetch function.
| torch::randint(std::numeric_limits<int64_t>::min(), | ||
| std::numeric_limits<int64_t>::max(), {size}, | ||
| torch::TensorOptions().dtype(torch::kInt64)); | ||
| size_ = size; |
There was a problem hiding this comment.
thoughts, what if we do size_=size - 1 and in the function above you can directly use prefetch_ptr[size] instead of size-1 everywhere.
There was a problem hiding this comment.
Overall, this looks good to me, and we can definitely merge this in. I feel that PrefetchedRandint needs more documentation and comments though.
Any reason why we go with this prefetching approach in the first place? It seems hard to extend if we think about weighted sampling in later stages. What speaks against using CPUGeneratorImpl directly from PyTorch?
CPUGeneratorImpl* generator = getDefaultCPUGenerator();
generator->random();|
Compared with original randint (creating a |
| @@ -0,0 +1,118 @@ | |||
| #pragma once | |||
|
|
|||
| #include <torch/torch.h> | |||
There was a problem hiding this comment.
| #include <torch/torch.h> | |
| #include <ATen/Aten.h> |
and replace all torch:: with at::. Just found out that this heavily improves compilation time.
Added
randintto enable fast random. It can be integrated into random samplers.