-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-22883][ML][TEST] Streaming tests for spark.ml.feature, from A to H #20111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22883][ML][TEST] Streaming tests for spark.ml.feature, from A to H #20111
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No existing test to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from object to class b/c this needed testTransformer from the MLTest mix-in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No existing unit test to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scala style
|
Test build #85494 has finished for PR 20111 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rearranged this test so it checks each row independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: rearranged to do validity check per-row
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here do not need to select "keys" column ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have to. The main thing here is to make sure that the transform really does happen. Other tests check validity of the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. so I prefer to use simpler code:
testTransformer[Tuple1[Vector]](dataset.toDF(), brpModel, "values") {
case Row(values: Seq[_]) =>
...
|
LGTM except a tiny issue. :) |
|
Test build #85789 has finished for PR 20111 at commit
|
12b3dcf to
448668d
Compare
|
Updated! Thanks @WeichenXu123 -- I'll merge this once tests pass. |
|
Test build #87857 has finished for PR 20111 at commit
|
|
Merging with master and branch-2.3 |
…to H ## What changes were proposed in this pull request? Adds structured streaming tests using testTransformer for these suites: * BinarizerSuite * BucketedRandomProjectionLSHSuite * BucketizerSuite * ChiSqSelectorSuite * CountVectorizerSuite * DCTSuite.scala * ElementwiseProductSuite * FeatureHasherSuite * HashingTFSuite ## How was this patch tested? It tests itself because it is a bunch of tests! Author: Joseph K. Bradley <[email protected]> Closes #20111 from jkbradley/SPARK-22883-streaming-featureAM. (cherry picked from commit 119f6a0) Signed-off-by: Joseph K. Bradley <[email protected]>
…to H ## What changes were proposed in this pull request? Adds structured streaming tests using testTransformer for these suites: * BinarizerSuite * BucketedRandomProjectionLSHSuite * BucketizerSuite * ChiSqSelectorSuite * CountVectorizerSuite * DCTSuite.scala * ElementwiseProductSuite * FeatureHasherSuite * HashingTFSuite ## How was this patch tested? It tests itself because it is a bunch of tests! Author: Joseph K. Bradley <[email protected]> Closes apache#20111 from jkbradley/SPARK-22883-streaming-featureAM. (cherry picked from commit 119f6a0) Signed-off-by: Joseph K. Bradley <[email protected]>
What changes were proposed in this pull request?
Adds structured streaming tests using testTransformer for these suites:
How was this patch tested?
It tests itself because it is a bunch of tests!