Add broadcasting support (e.g. matrix-vector) for cos sim operator.#3918
Add broadcasting support (e.g. matrix-vector) for cos sim operator.#3918xinghai-sun merged 3 commits intoPaddlePaddle:developfrom
Conversation
paddle/operators/cos_sim_op.cc
Outdated
| "Shape of Input(Out) must be [X.Dim(0), 1]."); | ||
| auto out_grad_dims = | ||
| ctx.Input<Tensor>(framework::GradVarName("Out"))->dims(); | ||
| PADDLE_ENFORCE_EQ(out_grad_dims, framework::make_ddim({x_dims[0], 1}), |
There was a problem hiding this comment.
You can using auto x_dims = framework::make_ddim({x_dims[0], 1}) to define x_dims first, and in the following you can just use x_dims instead.
paddle/operators/cos_sim_op.h
Outdated
| // compute | ||
| auto place = context.GetEigenDevice<Place>(); | ||
| auto xy = (x * y).sum(Eigen::array<int, 1>({1})); | ||
| x_norm.device(place) = x.square().sum(Eigen::array<int, 1>({1})).sqrt(); |
There was a problem hiding this comment.
You can define row_along = Eigen::array<int, 1>({1}); first.
| auto xy = (x * y).sum(Eigen::array<int, 1>({1})); | ||
| z.device(place) = xy / x_norm / y_norm; | ||
| } else { | ||
| Eigen::DSizes<int, 2> bcast(rows_x, 1); |
There was a problem hiding this comment.
better change bcast to bcast_dims?
paddle/operators/cos_sim_op.h
Outdated
| int rows_y = in_y->dims()[0]; | ||
| int cols = framework::product(in_x->dims()) / rows_x; | ||
| auto x = EigenMatrix<T>::From(*in_x, framework::make_ddim({rows_x, cols})); | ||
| auto y = EigenMatrix<T>::From(*in_y, framework::make_ddim({rows_y, cols})); |
There was a problem hiding this comment.
you can save framework::make_ddim({rows_x, cols}) and framework::make_ddim({rows_y, cols}) to variables and reuse them in the followings.
paddle/operators/cos_sim_op.h
Outdated
| Eigen::DSizes<int, 2> bcast_row(rows_x, 1); | ||
| auto y_bcast = y.broadcast(bcast_row); | ||
| auto y_snorm_bcast = | ||
| y_norm.square().eval().broadcast(bcast_row).eval().broadcast(bcast); |
There was a problem hiding this comment.
Please merge the two broadcast operations to one like broadcast({rows, cols})
JiayiFeng
left a comment
There was a problem hiding this comment.
Very comprehensive unit tests! Thanks!
paddle/operators/cos_sim_op.cc
Outdated
| // shape check | ||
| auto x_dims = ctx.Input<Tensor>("X")->dims(); | ||
| auto y_dims = ctx.Input<Tensor>("Y")->dims(); | ||
| PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), |
There was a problem hiding this comment.
The DDim has a member function size(), with returns the arity of itself.
paddle/operators/cos_sim_op.cc
Outdated
| PADDLE_ENFORCE_EQ( | ||
| framework::slice_ddim(x_dims, 1, framework::arity(x_dims)), | ||
| framework::slice_ddim(y_dims, 1, framework::arity(y_dims)), | ||
| "All dimensions except 1st of Input(X) and Input(Y) must be equal."); |
paddle/operators/cos_sim_op.cc
Outdated
| auto y_dims = ctx.Input<Tensor>("Y")->dims(); | ||
| PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), | ||
| "Ranks of Input(X) and Input(Y) must be equal."); | ||
| PADDLE_ENFORCE_GE(framework::arity(x_dims), 2, |
There was a problem hiding this comment.
Why can't the rank of x be 1?
There was a problem hiding this comment.
The operator assumes that the first dim of the tensor is for batch axis, so we require that the rank to be larger than 1. Otherwise,it is difficult to tell whether a vector of shape 5 is actually 5 x 1 (five samples in the batch) or 1 x 5 (one sample)?
There was a problem hiding this comment.
Very reasonable! Thanks!
paddle/operators/cos_sim_op.cc
Outdated
| "Rank of Input(X) must not be less than 2."); | ||
| PADDLE_ENFORCE_EQ( | ||
| framework::slice_ddim(x_dims, 1, framework::arity(x_dims)), | ||
| framework::slice_ddim(y_dims, 1, framework::arity(y_dims)), |
There was a problem hiding this comment.
If these two sliced dims are the same, tensors must have the same rank. The above PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), ... is unnecessary.
The infershape of backward operator has the same problem.
There was a problem hiding this comment.
If we do not make sure that X and Y are of the same rank (meaning rank of Y is not less than 2), framework::slice_ddim(y_dims, 1, framework::arity(y_dims)) might throw exceptions because framework::arity(y_dims) might be smaller than 1.
paddle/operators/cos_sim_op.cc
Outdated
| framework::slice_ddim(y_dims, 1, framework::arity(y_dims)), | ||
| "All dimensions except 1st of Input(X) and Input(Y) must be equal."); | ||
| PADDLE_ENFORCE(x_dims[0] == y_dims[0] || y_dims[0] == 1, | ||
| "1st dimension of Input(Y) must be equal to Input(X) or " |
| "just 1 (which will be broadcasted to match Input(X))."); | ||
|
|
||
| // resize tensor | ||
| ctx.Output<Tensor>("Out")->Resize({x_dims[0], 1}); |
There was a problem hiding this comment.
If the second dimension is just 1, why do we need it? We can squeeze it to one dimension.
The infershape of backward operator has the same problem.
There was a problem hiding this comment.
Yes. That's a question we've discussed with @QiJune ---- make the output tensor to be of rank 1 or of rank 2? I prefer rank2, since when we use the output for further processing, it is very clear that it is a batch of samples of scalar instead of one single sample of vector. In a word, it reduces ambiguity. We can discuss this more.
paddle/operators/cos_sim_op.cc
Outdated
| AddInput("X", "The 1st input of cos_sim op."); | ||
| AddInput("Y", "The 2nd input of cos_sim op."); | ||
| AddOutput("Out", "The output of cos_sim op."); | ||
| AddOutput("XNorm", "Row norm of the first input.").AsIntermediate(); |
There was a problem hiding this comment.
What is 'row norm'? Please add more instruction for this output.
paddle/operators/cos_sim_op.h
Outdated
| int rows_x = in_x->dims()[0]; | ||
| int rows_y = in_y->dims()[0]; | ||
| int cols = framework::product(in_x->dims()) / rows_x; | ||
| auto x = EigenMatrix<T>::From(*in_x, framework::make_ddim({rows_x, cols})); |
There was a problem hiding this comment.
EigenMatrix has a member function Reshape, which converts a paddle tensor to EigenMatrix.
static typename EigenMatrix::ConstType Reshape(const Tensor& tensor, int num_col_dims);The EigenMatrix's first dimension(column length) will be the product of tensor's first num_col_dims dimensions. In current case, it is 1.
See https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/eigen.h#L67
JiayiFeng
left a comment
There was a problem hiding this comment.
LGTM. Thank you for the excellent job!
Resolve #3917