[SPARK-29121][ML][MLLIB] Support for dot product operation on Vector(s) #25818
phpisciuneri wants to merge 6 commits into apache:master from phpisciuneri:SPARK-29121
    * If `size` does not match an [[IllegalArgumentException]] is thrown.
    */
    @Since("3.0.0")
    def dot(v1: Vector, v2: Vector): Double = BLAS.dot(v1, v2)
Actually, do we need this method? BLAS.dot() already exists. I can see an instance method taking a single arg for parity with PySpark, but this doesn't add much.
It is private to spark, hence the simple wrapping: https://github.com/apache/spark/blob/master/mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala#L26
Ah right, fair point. Still you can just call a.dot(b) after the first method is added, no?
Yes indeed I can. Lemme fix. Thanks!
Oh, I meant, I don't think there is value in adding this method, because a caller can use a.dot(b) directly.
Ah, I got ya. Yep, I can remove those.
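To illustrate the shape of API the reviewers settle on, here is a minimal, dependency-free sketch of a single-argument instance method called as `a.dot(b)`. The `DenseVec` class is a hypothetical stand-in written for this example, not Spark's `Vector`, and the size check via `require` is an assumption modeled on the doc comment quoted above.

```scala
// Hypothetical stand-in for a dense vector; NOT Spark's ml.linalg.Vector.
final case class DenseVec(values: Array[Double]) {
  def size: Int = values.length

  // Single-argument instance method, so callers write a.dot(b)
  // instead of going through a separate two-argument wrapper.
  def dot(other: DenseVec): Double = {
    // require throws IllegalArgumentException when the sizes differ.
    require(size == other.size,
      s"Vector sizes must match: $size vs ${other.size}")
    var sum = 0.0
    var i = 0
    while (i < size) {
      sum += values(i) * other.values(i)
      i += 1
    }
    sum
  }
}

object DotExample {
  def main(args: Array[String]): Unit = {
    val a = DenseVec(Array(1.0, 2.0, 3.0))
    val b = DenseVec(Array(4.0, 5.0, 6.0))
    // 1*4 + 2*5 + 3*6 = 32.0
    println(a.dot(b))
  }
}
```

With the instance method in place, a two-argument `dot(v1, v2)` helper would only forward to `v1.dot(v2)`, which is the reviewer's argument for dropping it.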
@srowen I'll take a look at what is supported in PySpark and see if there are any more gaps. I would enjoy working on this...
I was just putting this out there as volunteering for future work if it is of interest to the community. I don't have anything else to add to this PR unless there is further review.
Test build #4880 has finished for PR 25818 at commit
Merged to master
What changes were proposed in this pull request?
Support for dot product with:
- ml.linalg.Vector
- ml.linalg.Vectors
- mllib.linalg.Vector
- mllib.linalg.Vectors

Why are the changes needed?
Dot product is useful for feature engineering and scoring. The BLAS routines already exist; only a thin wrapper is needed.
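As one illustration of the scoring use case, a linear model's score is the dot product of a weight vector and a feature vector plus an intercept. The sketch below uses plain arrays rather than Spark's vector types; `ScoringSketch`, `dot`, and `score` are hypothetical names introduced for this example only.

```scala
// Hypothetical helper names; plain arrays stand in for Spark vectors.
object ScoringSketch {
  // Element-wise multiply and sum; sizes must agree.
  def dot(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length,
      s"Sizes must match: ${x.length} vs ${y.length}")
    var sum = 0.0
    var i = 0
    while (i < x.length) {
      sum += x(i) * y(i)
      i += 1
    }
    sum
  }

  // Linear score: weights . features + intercept.
  def score(weights: Array[Double],
            features: Array[Double],
            intercept: Double): Double =
    dot(weights, features) + intercept

  def main(args: Array[String]): Unit = {
    val weights = Array(0.5, -1.0, 2.0)
    val features = Array(2.0, 1.0, 0.5)
    // 0.5*2.0 + (-1.0)*1.0 + 2.0*0.5 + 0.25 = 1.25
    println(score(weights, features, 0.25))
  }
}
```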
Does this PR introduce any user-facing change?
No changes to existing user-facing behavior, just some new functionality.
How was this patch tested?
Tests were written and added to the appropriate `VectorSuites` classes. They can be quickly run with: