add deformable psroi pooling #17827

Merged
cjt222 merged 47 commits into PaddlePaddle:develop from cjt222:add_deformable_psroi_pooling
Jun 11, 2019
@cjt222 cjt222 commented Jun 4, 2019

1. Performance test

Test environment: Ubuntu 16.04, V100 GPU.
(1) Deformable PSROIPooling vs. PSROI Pool
Inputs:
1) input: randomly generated data of shape [5, 128, 64, 64]
2) rois: 16 randomly generated boxes, shape [16, 4], with LoD distribution [4, 3, 2, 2, 5]
3) trans: fixed shape [16, 2, 8, 8], all values 0.5
4) Parameter settings:
> no_trans=0, spatial_scale=1.0, output_channels=2,
group_size=[1,1], pooled_height=8, pooled_width=8, part_size=[4, 4],
sample_per_part=4, trans_std=0.1

| Event | Calls | Total | CPU Time (Ratio) | GPU Time (Ratio) | Min | Max | Ave |
|---|---|---|---|---|---|---|---|
| psroi_pool | 100 | 7.36984 | 6.778908 (0.919818) | 0.590927 (0.080182) | 0.067989 | 0.126735 | 0.0736984 |
| psroi_pool_grad | 100 | 9.12887 | 7.187817 (0.787372) | 1.941052 (0.212628) | 0.084644 | 0.104966 | 0.0912887 |
| deformable_psroi_pooling | 100 | 10.3871 | 8.548373 (0.822981) | 1.838717 (0.177019) | 0.09416 | 0.131347 | 0.103871 |
| deformable_psroi_pooling_grad | 100 | 13.5551 | 8.890796 (0.655902) | 4.664282 (0.344098) | 0.126387 | 0.159987 | 0.135551 |
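
The benchmark setup gives the ROIs as a flat [16, 4] array plus an LoD distribution of [4, 3, 2, 2, 5]. A minimal sketch of how such a distribution could be expanded into one batch index per ROI, assuming each entry is the number of ROIs that belong to the corresponding image. The helper name `lod_lengths_to_batch_ids` is hypothetical and is not taken from the PR.

```python
def lod_lengths_to_batch_ids(lengths):
    """Expand per-image ROI counts into one batch index per ROI."""
    batch_ids = []
    for batch_index, count in enumerate(lengths):
        # Every ROI in this group belongs to image `batch_index`.
        batch_ids.extend([batch_index] * count)
    return batch_ids


# LoD distribution quoted in the benchmark setup (assumed to be ROI counts per image).
roi_counts = [4, 3, 2, 2, 5]
roi_batch_ids = lod_lengths_to_batch_ids(roi_counts)

print(len(roi_batch_ids))  # total number of ROIs
print(roi_batch_ids)
```

Under that assumption the five counts add up to the 16 ROIs and cover the 5 images in the input batch.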

(2) Deformable PSROIPooling vs. ROI Pool
Inputs:
1) input: randomly generated data of shape [5, 128, 64, 64]
2) rois: 16 randomly generated boxes, shape [16, 4], with LoD distribution [4, 3, 2, 2, 5]
3) trans: fixed shape [16, 2, 8, 8], all values 0.5
4) Parameter settings:
> no_trans=1, spatial_scale=1.0, output_channels=128,
group_size=[1,1], pooled_height=8, pooled_width=8, part_size=[4, 4],
sample_per_part=4, trans_std=0.1

| Event | Calls | Total | CPU Time (Ratio) | GPU Time (Ratio) | Min | Max | Ave |
|---|---|---|---|---|---|---|---|
| roi_pool | 100 | 9.0469 | 7.791581 (0.861243) | 1.255322 (0.138757) | 0.07903 | 0.151427 | 0.090469 |
| roi_pool_grad | 100 | 10.018 | 7.788079 (0.777406) | 2.229958 (0.222594) | 0.086172 | 0.177273 | 0.10018 |
| deformable_psroi_pooling | 100 | 11.9029 | 8.883906 (0.746367) | 3.018958 (0.253633) | 0.105524 | 0.148849 | 0.119029 |
| deformable_psroi_pooling_grad | 100 | 21.2068 | 9.212682 (0.434421) | 11.994113 (0.565579) | 0.198663 | 0.268479 | 0.212068 |

2. Accuracy test
See http://agroup.baidu.com/share/md/0f274789bcde46118202deb1fc5b6343
3. API preview documentation
[image]

AddInput("Input",
"(Tensor), "
"the input of Deformable PSROIPooling. "
"The format of input tensor is NCHW. Where N is batch size, "
Contributor
@heavengate heavengate Jun 4, 2019

The shape of input tensor is [N, C, H, W]

Contributor Author

Fixed.

i += blockDim.x * gridDim.x)

const int CUDA_NUM_THREADS = 1024;
inline int GET_BLOCKS(const int N) {
Contributor

static inline int
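
The diff context above shows a grid-stride loop (`i += blockDim.x * gridDim.x`) together with a `GET_BLOCKS` helper whose body is not quoted. A small Python simulation of that pattern, assuming the helper performs the usual ceiling division by `CUDA_NUM_THREADS`; the functions `get_blocks` and `grid_stride_indices` are illustrative names, not code from the PR.

```python
CUDA_NUM_THREADS = 1024  # value quoted in the diff context


def get_blocks(n, threads=CUDA_NUM_THREADS):
    # Assumed body: ceiling division, so every element gets a thread.
    return (n + threads - 1) // threads


def grid_stride_indices(thread_id, block_id, block_dim, grid_dim, n):
    """Yield the element indices that one simulated thread would process."""
    i = block_id * block_dim + thread_id
    step = block_dim * grid_dim  # mirrors `i += blockDim.x * gridDim.x`
    while i < n:
        yield i
        i += step


# Simulate a tiny launch: 2 blocks of 4 threads covering 10 elements.
n, block_dim, grid_dim = 10, 4, 2
covered = sorted(
    i
    for block_id in range(grid_dim)
    for thread_id in range(block_dim)
    for i in grid_stride_indices(thread_id, block_id, block_dim, grid_dim, n)
)
print(covered)
```

Because each thread advances by the total number of launched threads, every index is visited exactly once even when the grid is smaller than the problem size.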

int x2 = ceil(x);
int y1 = floor(y);
int y2 = ceil(y);
T dist_x = (T)(x - x1);
Contributor
@heavengate heavengate Jun 4, 2019

Use static_cast here; the same applies below.

Contributor Author

Fixed.
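
The snippet under review computes floor/ceil neighbours and a fractional distance (`x1`, `x2`, `y1`, `y2`, `dist_x`), which is the setup for bilinear interpolation. A minimal Python sketch of that technique follows, assuming the sample point lies inside the feature map and that the map is indexed as `data[row][col]`. The function is an illustration, not the kernel's actual code.

```python
import math


def bilinear_interpolate(data, x, y):
    """Sample a 2-D grid (indexed data[row][col]) at a fractional point (x, y)."""
    x1, x2 = math.floor(x), math.ceil(x)
    y1, y2 = math.floor(y), math.ceil(y)

    # Fractional distances from the top-left neighbour.
    dist_x = float(x - x1)
    dist_y = float(y - y1)

    value11 = data[y1][x1]
    value12 = data[y2][x1]
    value21 = data[y1][x2]
    value22 = data[y2][x2]

    # Weight each neighbour by the area of the opposite sub-rectangle.
    return ((1 - dist_x) * (1 - dist_y) * value11
            + (1 - dist_x) * dist_y * value12
            + dist_x * (1 - dist_y) * value21
            + dist_x * dist_y * value22)


grid = [[0.0, 1.0],
        [2.0, 3.0]]
print(bilinear_interpolate(grid, 0.5, 0.5))
```

When `x` or `y` is an integer, floor and ceil coincide and the corresponding distance is zero, so the result falls back to the exact grid value.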

const int group_width, const int part_height, const int part_width,
const int num_classes, const int channels_each_class, T* top_data,
T* top_count, int* roi_batch_id_data) {
CUDA_KERNEL_LOOP(index, count) {
Contributor

Please separate the following code with blank lines.

Contributor Author

Every time I change this line, pre-commit reformats it back when I commit. It seems to be caused by clang-format.

Contributor

I mean using blank lines to split the code inside the function into sections. With everything run together, the code is hard to read. The same goes for the functions below.

Contributor Author
@cjt222 cjt222 Jun 5, 2019

Blank lines have been added to separate the code into blocks.

Tensor* top_count = ctx.Output<Tensor>("TopCount");
top_count->mutable_data<T>(ctx.GetPlace());
PADDLE_ENFORCE_EQ(top_count->dims(), out->dims(),
"number of rois should be same with number of output");
Contributor

Better to check this in InferShape.

Contributor Author

Fixed.


Examples:

input = fluid.layers.data(name="input",
Contributor

remove?

Contributor Author

Fixed.

from op_test import OpTest


class TestDeformablePSROIPoolOp(OpTest):
Contributor

Please add a few more test cases that use different values for several of the attrs.

Contributor Author

Three more sets of test cases have been added.

AddAttr<int>("no_trans",
"(int), "
"whether add offset to get new value or not while roi "
"pooling, which value is 0 or 1");
Contributor

If there is a default value, it would be best to add it.

Contributor Author

The default value is 0; added.

Contributor
@heavengate heavengate left a comment

Please add a Python API test in python/paddle/fluid/tests/unittests/test_layers.py

Contributor Author
cjt222 commented Jun 5, 2019

Added to test_layers.py.

update API.spec
heavengate previously approved these changes Jun 10, 2019
@cjt222 cjt222 closed this Jun 10, 2019
@cjt222 cjt222 reopened this Jun 10, 2019
xsrobin previously approved these changes Jun 10, 2019
Contributor
@xsrobin xsrobin left a comment

LGTM

trans (Variable): Offset of features on ROIs while pooling.The format is NCHW, where
N is number of ROIs, C is number of channels, which indicate the offset distance
in the x and y directions, H is pooled height, and W is pooled width.
no_trans(integer): Whether add offset to get new value or not while roi pooling, which
Contributor

  1. Why not bool?
  2. For consistency with the lines above, leave a space before "(".

Contributor Author

Changed to a bool variable.

trans,
no_trans=0,
spatial_scale=1.0,
output_channels=64,
Contributor

Why is the default output_channels 64?

Contributor Author

Changed the default value to the number of input channels.

output_channels(integer): The number of output channels, which should be less than input channels.
Deformable roi pooling requires output_channels = input_channels, while
deformable psroi pooling requires output_channels = input_channels *
pooled_height * pooled_width. Default: 64.
Contributor

Why is the default 64?

Deformable roi pooling requires output_channels = input_channels, while
deformable psroi pooling requires output_channels = input_channels *
pooled_height * pooled_width. Default: 64.
group_size(list): The number of groups which input channels are divided.(eg.number of input channels
Contributor

list|tuple

Contributor Author

Fixed.

output_channels=3,
group_size=(1, 1),
pooled_height=8,
pooled_width=8,
Contributor

The description above says:
output_channels = input_channels * pooled_height * pooled_width.

The settings in this example do not satisfy that relationship.

Also, do output_channels, pooled_height, and pooled_width all need to be set together? output_channels could be inferred, couldn't it?

Contributor Author
@cjt222 cjt222 Jun 10, 2019

That depends on what the user needs, so all three values have to be set by the user. If the user wants to use deformable roi pooling, inferring the value would make that mode impossible.

Contributor Author

This example is written as a deformable roi pool example, where the input and output channels are equal.

Contributor Author
@cjt222 cjt222 Jun 10, 2019

The example has been changed to the deformable psroi pooling form, and the relationship has been corrected to input_channels = output_channels * pooled_height * pooled_width.
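
The two channel relationships discussed in this thread can be written as a small check. This is a sketch under the assumption that the corrected relation from the last reply holds for deformable psroi pooling and that plain deformable roi pooling keeps input and output channels equal. The function name and the `position_sensitive` flag are hypothetical and are not part of the Paddle API.

```python
def expected_input_channels(output_channels, pooled_height, pooled_width,
                            position_sensitive):
    """Return the input channel count implied by the pooling mode."""
    if position_sensitive:
        # Deformable psroi pooling: one group of input channels per output bin.
        return output_channels * pooled_height * pooled_width
    # Deformable roi pooling: channels pass through unchanged.
    return output_channels


# Settings from performance test (1): output_channels=2, 8x8 bins.
psroi_channels = expected_input_channels(2, 8, 8, position_sensitive=True)

# Settings from performance test (2): output_channels=128, 8x8 bins.
roi_channels = expected_input_channels(128, 8, 8, position_sensitive=False)

print(psroi_channels, roi_channels)
```

Both benchmark configurations in the PR description use an input with 128 channels, which matches what this check gives for each mode.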

modify comment
@cjt222 cjt222 dismissed stale reviews from xsrobin and heavengate via bea535b June 10, 2019 07:57
cjt222 added 4 commits June 10, 2019 08:00
modift comment
modift comment
update API.spec
modify comment
@cjt222 cjt222 closed this Jun 10, 2019
@cjt222 cjt222 reopened this Jun 10, 2019
cjt222 added 2 commits June 10, 2019 10:57
add inference in nn.py
update API.spec
Contributor
@qingqing01 qingqing01 left a comment

LG for API.

But I see some duplicate code and some irregular code in the C++ code.

cjt222 added 2 commits June 11, 2019 08:06
resolve confict
resolve confict
qingqing01 previously approved these changes Jun 11, 2019
Contributor
@qingqing01 qingqing01 left a comment

LG for API

update API.spec
Contributor
@xsrobin xsrobin left a comment

LGTM

@cjt222 cjt222 merged commit 871af28 into PaddlePaddle:develop Jun 11, 2019
@cjt222 cjt222 deleted the add_deformable_psroi_pooling branch June 11, 2019 20:56
6 participants