add iou similarity operator #7566
wanghaox merged 9 commits into PaddlePaddle:develop from wanghaox:iou_sim
| auto x_dims = ctx->GetInputDim("X"); | ||
| auto y_dims = ctx->GetInputDim("Y"); | ||
|
|
||
| PADDLE_ENFORCE_EQ(x_dims.size(), 2UL, "The shape of X is [N, 4]"); |
The rank of Input(X) must be 2.
| using framework::OperatorWithKernel::OperatorWithKernel; | ||
|
|
||
| protected: | ||
| void InferShape(framework::InferShapeContext *ctx) const override { |
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of IOUSimilarityOp should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Y"),
"Input(Y) of IOUSimilarityOp should not be null.");
|
||
| PADDLE_ENFORCE_EQ(x_dims.size(), 2UL, "The shape of X is [N, 4]"); | ||
| PADDLE_ENFORCE_EQ(x_dims[1], 4UL, "The shape of X is [N, 4]"); | ||
| PADDLE_ENFORCE_EQ(y_dims.size(), 2UL, "The shape of Y is [M, 4]"); |
The rank of Input(Y) must be 2.
| AddInput( | ||
| "X", | ||
| "(Tensor, default Tensor<float>) " | ||
| "BoxList X holding N boxes, each box is " |
| "X", | ||
| "(Tensor, default Tensor<float>) " | ||
| "BoxList X holding N boxes, each box is " | ||
| "represented as [xmin, ymin, xmax, ymax], the shape of X is [N, 4]."); |
It would be better to explain the meaning of xmin, ymin, xmax, ymax.
|
|
||
| AddComment(R"DOC( | ||
| IOU Similarity Operator. | ||
| Computes pairwise intersection-over-union between box collections. |
intersection-over-union (IOU) between two box lists.
|
|
||
| platform::ForRange<DeviceContext> for_range( | ||
| static_cast<const DeviceContext&>(ctx.device_context()), x_n); | ||
| for_range(functor); |
platform::ForRange supports GPU; please register a GPU kernel.
| self.check_output() | ||
|
|
||
| def test_check_grad(self): | ||
| return |
|
|
||
| def setUp(self): | ||
| self.op_type = "iou_similarity" | ||
| self.set_data() |
If there is only one test, the code in set_data() and init_test_data() can be moved here.
| [0.0, 0.0, 20.0, 20.0]]).astype('float32') | ||
| self.output = np.array( | ||
| [[2.0 / 16.0, 0, 6.0 / 400.0], | ||
| [1.0 / 16.0, 0.0, 5.0 / 400.0]]).astype('float32') |
It would be better to use random data and calculate the IOU in Python.
Calculating it in Python is better, but this version uses fixed data to verify that the operator runs first.
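The reviewer's suggestion can be sketched as a NumPy reference implementation that the test could compare against. This is only an illustration: the names `iou_reference` and `random_boxes` are hypothetical and not part of the PR, and the area convention (width times height, no +1 pixel offset) is an assumption inferred from the fixed expected values in the test, where the 20 x 20 box contributes an area of 400.

```python
import numpy as np


def iou_reference(x, y):
    """Pairwise IOU between x [N, 4] and y [M, 4], boxes as [xmin, ymin, xmax, ymax]."""
    x = np.asarray(x, dtype="float32")
    y = np.asarray(y, dtype="float32")
    # Broadcast X against Y so every intermediate has shape [N, M].
    inter_xmin = np.maximum(x[:, None, 0], y[None, :, 0])
    inter_ymin = np.maximum(x[:, None, 1], y[None, :, 1])
    inter_xmax = np.minimum(x[:, None, 2], y[None, :, 2])
    inter_ymax = np.minimum(x[:, None, 3], y[None, :, 3])
    # Non-overlapping boxes give a negative extent; clamp it to zero.
    inter_w = np.clip(inter_xmax - inter_xmin, 0.0, None)
    inter_h = np.clip(inter_ymax - inter_ymin, 0.0, None)
    inter_area = inter_w * inter_h
    area_x = (x[:, 2] - x[:, 0]) * (x[:, 3] - x[:, 1])
    area_y = (y[:, 2] - y[:, 0]) * (y[:, 3] - y[:, 1])
    union_area = area_x[:, None] + area_y[None, :] - inter_area
    return inter_area / union_area


def random_boxes(rng, n):
    """Generate n valid boxes (xmax > xmin, ymax > ymin) as float32."""
    mins = rng.uniform(0.0, 10.0, size=(n, 2))
    sizes = rng.uniform(1.0, 10.0, size=(n, 2))
    return np.hstack([mins, mins + sizes]).astype("float32")


rng = np.random.default_rng(0)
boxes1 = random_boxes(rng, 3)  # plays the role of input X, shape [3, 4]
boxes2 = random_boxes(rng, 5)  # plays the role of input Y, shape [5, 4]
expected = iou_reference(boxes1, boxes2)  # shape [3, 5]
```

In a test, `boxes1` and `boxes2` would feed the operator inputs and `expected` would replace the hand-written output array.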
| See the License for the specific language governing permissions and | ||
| limitations under the License. */ | ||
|
|
||
| #define EIGEN_USE_GPU |
| PADDLE_ENFORCE_EQ(y_dims.size(), 2UL, "The rank of Input(Y) must be 2."); | ||
| PADDLE_ENFORCE_EQ(y_dims[1], 4UL, "The shape of Y is [M, 4]"); | ||
|
|
||
| ctx->SetOutputDim("Out", framework::make_ddim({x_dims[0], y_dims[0]})); |
Please treat 'X' as a LoDTensor. Here, the LoD of 'Out' should be inherited from 'X'.
Add ctx->ShareLoD("X", /*->*/ "Out"); in InferShape, as in: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/mul_op.cc#L68
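To illustrate why 'Out' can reuse the LoD of 'X': 'Out' has one row per box in 'X', so the same row grouping applies to both. The sketch below assumes an offset-style LoD (a list of row offsets per instance). The helper `split_by_lod` and the sample offsets are hypothetical and only model the idea, not the framework's API.

```python
import numpy as np


def split_by_lod(tensor, offsets):
    """Split the rows of `tensor` into per-instance chunks using row offsets."""
    return [tensor[start:end] for start, end in zip(offsets[:-1], offsets[1:])]


# Hypothetical batch: two images holding 2 and 3 boxes, so N = 5 rows in X.
x_lod = [0, 2, 5]
num_shared_boxes = 4  # M, the number of boxes in Y (shared by every image)

# Out has shape [N, M]; its rows line up one-to-one with the rows of X.
out = np.zeros((5, num_shared_boxes), dtype="float32")

# Because the row counts match, the offsets of X also partition Out.
per_image_out = split_by_lod(out, x_lod)
```

Here `per_image_out[0]` has shape [2, 4] and `per_image_out[1]` has shape [3, 4], one block of IOU rows per image.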
| IOUSimilarityOpMaker(OpProto *proto, OpAttrChecker *op_checker) | ||
| : OpProtoAndCheckerMaker(proto, op_checker) { | ||
| AddInput("X", | ||
| "(Tensor, default Tensor<float>) " |
|
|
||
| AddComment(R"DOC( | ||
| IOU Similarity Operator. | ||
| Computes intersection-over-union (IOU) between two box lists. |
The document is too simple. Please explain the function more clearly. 'X' should be a LoDTensor and 'Y' is a common Tensor; boxes in 'Y' are shared by all input images.
Done, added the formula.
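The formula added to the operator comment is not shown in this excerpt. For reference, the standard definition of IOU for two boxes A and B, each given as [xmin, ymin, xmax, ymax], is written out below. The area convention (width times height, no pixel offset) is an assumption, chosen because it is consistent with the expected values in the unit test.

```latex
\mathrm{IOU}(A, B)
  = \frac{\operatorname{area}(A \cap B)}{\operatorname{area}(A \cup B)}
  = \frac{\operatorname{area}(A \cap B)}
         {\operatorname{area}(A) + \operatorname{area}(B) - \operatorname{area}(A \cap B)},
\qquad
\operatorname{area}(A \cap B)
  = \max\bigl(0,\ \min(x^{A}_{\max}, x^{B}_{\max}) - \max(x^{A}_{\min}, x^{B}_{\min})\bigr)
    \cdot
    \max\bigl(0,\ \min(y^{A}_{\max}, y^{B}_{\max}) - \max(y^{A}_{\min}, y^{B}_{\min})\bigr)
```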
| T inter_xmax = xmax1 > xmax2 ? xmax2 : xmax1; | ||
| T inter_ymax = ymax1 > ymax2 ? ymax2 : ymax1; | ||
| T inter_xmin = xmin1 > xmin2 ? xmin1 : xmin2; | ||
| T inter_ymin = ymin1 > ymin2 ? ymin1 : ymin2; |
Please use 'min' and 'max' to make the code more readable.
std::min can't run on the GPU.
| "[xmax, ymax] is the right upper coordinate of the box."); | ||
|
|
||
| AddOutput("Out", | ||
| "(LoDTensor or Tensor, the lod is same as input X) The output of " |
LoDTensor or Tensor -> LoDTensor
| "(LoDTensor, default LoDTensor<float>) " | ||
| "Box list X is a 2-D LoDTensor with shape [N, 4] holds N boxes, " | ||
| "each box is represented as [xmin, ymin, xmax, ymax], " | ||
| "the shape of X is [N, 4]. [xmin, ymin] is the lower left " |
[xmin, ymin] is the top-left coordinate of the box if the input is an image feature map. It is the corner closest to the origin of the coordinate system.
Please modify the other places too.
| IOU Similarity Operator. | ||
| Computes intersection-over-union (IOU) between two box lists. | ||
| Box list 'X' should be a LoDTensor and 'Y' is a common Tensor, | ||
| boxes in 'Y' are shared by all input images. |
by all instances of the batched inputs of X.
| Computes intersection-over-union (IOU) between two box lists. | ||
| Box list 'X' should be a LoDTensor and 'Y' is a common Tensor, | ||
| boxes in 'Y' are shared by all input images. | ||
| Given two box A and B, the calculation of IOU is as follows: |
Given two boxes A and B,
| AddInput("Y", | ||
| "(Tensor, default Tensor<float>) " | ||
| "Box list Y holds M boxes, each box is represented as " | ||
| "[xmin, ymin, xmax, ymax], the shape of X is [N, 4]. " |
the shape of X is [N, 4] -> the shape of Y is [M, 4]
| T y_min1 = x_[row_id * 4 + 1]; | ||
| T x_max1 = x_[row_id * 4 + 2]; | ||
| T y_max1 = x_[row_id * 4 + 3]; | ||
| for (size_t i = 0; i < cols_; ++i) { |
Here, cols_ is the number of prior boxes. In SSD this number is about 8732 or more, so this loop is inefficient on GPU. This will be fixed later.
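The efficiency concern can be sketched in plain Python by modelling each parallel work item as one iteration of an outer loop. In the per-row layout used here, only N work items exist and each runs a serial loop over all cols_ boxes. A per-element layout, shown as one possible alternative and not necessarily the fix the reviewer has in mind, creates N * M work items with no inner loop. All function names below are hypothetical.

```python
def iou_pair(a, b):
    """IOU of two boxes given as [xmin, ymin, xmax, ymax]."""
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter_area = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter_area / (area_a + area_b - inter_area)


def iou_per_row(x, y):
    """One work item per row of X; each item loops serially over all M boxes of Y."""
    rows, cols = len(x), len(y)
    out = [[0.0] * cols for _ in range(rows)]
    for row_id in range(rows):      # parallel dimension: only N items
        for i in range(cols):       # serial loop: M can be 8732 or more
            out[row_id][i] = iou_pair(x[row_id], y[i])
    return out


def iou_per_element(x, y):
    """One work item per (row, col) pair; no serial inner loop."""
    rows, cols = len(x), len(y)
    out = [[0.0] * cols for _ in range(rows)]
    for idx in range(rows * cols):  # parallel dimension: N * M items
        row_id, i = divmod(idx, cols)
        out[row_id][i] = iou_pair(x[row_id], y[i])
    return out
```

Both functions produce the same matrix; they differ only in how the work is divided among parallel items.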
resolve #7565