Skip to content

fix distribute fpn proposals, test=develop#16152

Merged
jerrywgz merged 8 commits intoPaddlePaddle:developfrom
jerrywgz:fix_distribute_fpn_op
May 6, 2019
Merged

fix distribute fpn proposals, test=develop#16152
jerrywgz merged 8 commits intoPaddlePaddle:developfrom
jerrywgz:fix_distribute_fpn_op

Conversation

@jerrywgz
Copy link
Contributor

@jerrywgz jerrywgz commented Mar 11, 2019

  1. Add GPU kernel for distribute_fpn_proposals op
  2. Fix the distribute formula to avoid zero result
  3. Clean code

Menevagol
Menevagol previously approved these changes Mar 11, 2019
Copy link

@Menevagol Menevagol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the operation collect_fpn_proposals, modify the <= at compare_scores function to <

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change to sizeof(uint64_t) * 8 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will case cudalasterror when kNumCUDAThreads=512

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will cause CudaLastError when kNumCUDAThreads = 512;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not copy code from other file, like RangeInitFunctor, CUDA_1D_KERNEL_LOOP, RoIArea, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, while CUDA_1D_KERNEL_LOOP is widely used in our cuda kernels

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(T) -> static_cast<T>()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add __syncthreads ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jerrywgz jerrywgz force-pushed the fix_distribute_fpn_op branch from 68d75d8 to 2b6c154 Compare April 25, 2019 12:14
@jerrywgz jerrywgz force-pushed the fix_distribute_fpn_op branch from 23b3160 to cf3de9f Compare April 26, 2019 05:57
struct RangeInitFunctor {
int start_;
int delta_;
int* out_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct的成员变量无_结尾,参考google C++ code style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix it in collect_fpn_op

@jerrywgz jerrywgz merged commit cc95a75 into PaddlePaddle:develop May 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants