@jkbradley
Member

What changes were proposed in this pull request?

Adds structured streaming tests using testTransformer for these suites:

  • BinarizerSuite
  • BucketedRandomProjectionLSHSuite
  • BucketizerSuite
  • ChiSqSelectorSuite
  • CountVectorizerSuite
  • DCTSuite
  • ElementwiseProductSuite
  • FeatureHasherSuite
  • HashingTFSuite

How was this patch tested?

It tests itself because it is a bunch of tests!

Member Author

No existing test to use

Member Author

Moved from object to class b/c this needed testTransformer from the MLTest mix-in

Member Author

No existing unit test to use

Member Author

scala style

@SparkQA

SparkQA commented Dec 29, 2017

Test build #85494 has finished for PR 20111 at commit 12b3dcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class BinarizerSuite extends MLTest with DefaultReadWriteTest
      • class BucketedRandomProjectionLSHSuite extends MLTest with DefaultReadWriteTest
      • class BucketizerSuite extends MLTest with DefaultReadWriteTest
      • class ChiSqSelectorSuite extends MLTest with DefaultReadWriteTest
      • class CountVectorizerSuite extends MLTest with DefaultReadWriteTest
      • class DCTSuite extends MLTest with DefaultReadWriteTest
      • class ElementwiseProductSuite extends MLTest with DefaultReadWriteTest
      • class FeatureHasherSuite extends MLTest with DefaultReadWriteTest
      • class HashingTFSuite extends MLTest with DefaultReadWriteTest

Member Author

Rearranged this test so it checks each row independently.
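
For readers unfamiliar with the pattern, here is a minimal sketch of what a per-row check with testTransformer can look like. It is an illustration only, not the code from this patch: the suite name, the test name, and the sample values are hypothetical, and it assumes the file is compiled inside Spark's mllib test sources, where the MLTest trait and testImplicits are available. The expected value travels in the same row as the input, so the check function can validate each output row on its own, which is what lets the same assertion run on both the batch and the streaming path.

```scala
package org.apache.spark.ml.feature

import org.apache.spark.ml.util.MLTest
import org.apache.spark.sql.Row

// Hypothetical suite name, used for illustration only.
class BinarizerPerRowSketchSuite extends MLTest {

  import testImplicits._

  test("binarized value matches the expected value on every row") {
    val threshold = 0.0
    // Made-up sample data; each input is paired with its expected output.
    val features = Seq(0.1, -0.5, 0.2, -0.3, 0.8)
    val expected = features.map(x => if (x > threshold) 1.0 else 0.0)
    val df = features.zip(expected).toDF("feature", "expected")

    val binarizer = new Binarizer()
      .setInputCol("feature")
      .setOutputCol("binarized_feature")
      .setThreshold(threshold)

    // The type parameter describes the input rows; the check function
    // receives only the selected result columns, one row at a time.
    testTransformer[(Double, Double)](df, binarizer, "binarized_feature", "expected") {
      case Row(actual: Double, wanted: Double) =>
        assert(actual === wanted)
    }
  }
}
```

A check that first collects the whole output and then compares it against an expected array would not fit this shape, because the check function only ever sees a single row.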

Member Author

ditto: rearranged to do validity check per-row

Contributor

So we do not need to select the "keys" column here?

Member Author

I don't think we have to. The main thing here is to make sure that the transform really does happen. Other tests check validity of the values.

Contributor

OK. In that case, I would prefer simpler code:

testTransformer[Tuple1[Vector]](dataset.toDF(), brpModel, "values") {
    case Row(values: Seq[_]) =>
...

@WeichenXu123
Contributor

LGTM except a tiny issue. :)

@SparkQA

SparkQA commented Jan 8, 2018

Test build #85789 has finished for PR 20111 at commit 12b3dcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class BinarizerSuite extends MLTest with DefaultReadWriteTest
      • class BucketedRandomProjectionLSHSuite extends MLTest with DefaultReadWriteTest
      • class BucketizerSuite extends MLTest with DefaultReadWriteTest
      • class ChiSqSelectorSuite extends MLTest with DefaultReadWriteTest
      • class CountVectorizerSuite extends MLTest with DefaultReadWriteTest
      • class DCTSuite extends MLTest with DefaultReadWriteTest
      • class ElementwiseProductSuite extends MLTest with DefaultReadWriteTest
      • class FeatureHasherSuite extends MLTest with DefaultReadWriteTest
      • class HashingTFSuite extends MLTest with DefaultReadWriteTest

@jkbradley jkbradley force-pushed the SPARK-22883-streaming-featureAM branch from 12b3dcf to 448668d Compare March 1, 2018 22:03
@jkbradley
Member Author

Updated! Thanks @WeichenXu123 -- I'll merge this once tests pass.

@SparkQA

SparkQA commented Mar 1, 2018

Test build #87857 has finished for PR 20111 at commit 448668d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member Author

jkbradley commented Mar 2, 2018

Merging with master and branch-2.3

@asfgit asfgit closed this in 119f6a0 Mar 2, 2018
asfgit pushed a commit that referenced this pull request Mar 2, 2018
…to H

## What changes were proposed in this pull request?

Adds structured streaming tests using testTransformer for these suites:
* BinarizerSuite
* BucketedRandomProjectionLSHSuite
* BucketizerSuite
* ChiSqSelectorSuite
* CountVectorizerSuite
* DCTSuite.scala
* ElementwiseProductSuite
* FeatureHasherSuite
* HashingTFSuite

## How was this patch tested?

It tests itself because it is a bunch of tests!

Author: Joseph K. Bradley

Closes #20111 from jkbradley/SPARK-22883-streaming-featureAM.

(cherry picked from commit 119f6a0)
Signed-off-by: Joseph K. Bradley
@jkbradley jkbradley deleted the SPARK-22883-streaming-featureAM branch March 2, 2018 06:28
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018