Skip to content

Conversation

@zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

GBTs inherit from HasStepSize & LInearSVC/Binarizer from HasThreshold

How was this patch tested?

existing tests

@SparkQA
Copy link

SparkQA commented Jul 12, 2017

Test build #79557 has finished for PR 18612 at commit 40f52cc.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class Binarizer(JavaTransformer, HasInputCol, HasOutputCol, HasThreshold,

@SparkQA
Copy link

SparkQA commented Jul 12, 2017

Test build #79560 has finished for PR 18612 at commit 3e92a7e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

final val threshold: DoubleParam = new DoubleParam(this, "threshold", "threshold in binary classification prediction, in range [0, 1]", ParamValidators.inRange(0, 1))
val threshold: DoubleParam = new DoubleParam(this, "threshold", "threshold in binary classification prediction, in range [0, 1]", ParamValidators.inRange(0, 1))

setDefault(threshold, 0.5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the error can be eliminated after removing this line, and actually we should set default value in the concrete class if it's inherited.

@SparkQA
Copy link

SparkQA commented Jul 17, 2017

Test build #79652 has finished for PR 18612 at commit 94f6ca5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Copy link
Contributor

holdenk commented Jul 20, 2017

So improving the doc certainly makes sense on the Python side (thank you for catching that). But as @MLnick asked on the JIRA - I'm not super sure I see the point of the refactor on the Scala side. If we have a parameter with the same name but a different meaning why mix in a base trait only to override its only meaningful provided value?

What do you think @jkbradley?

@zhengruifeng
Copy link
Contributor Author

@holdenk I think the meaning of StepSize in GBT and Threshould in LInearSVC/Binarizer is almost same as that in other algs, so it maybe better to make them inherit from same trait?

@MLnick
Copy link
Contributor

MLnick commented Jul 21, 2017

It makes sense for LinearSVC since the threshold has the same meaning for other binary classifiers (e.g. LogisticRegression). The meaning of threshold for Binarizer is very different as it operates on feature values in a feature vector, not raw predicted values from a model. I would say back out the Binarizer changes here.

As for GBT, well I am sort of indifferent as the step size is roughly analogous between the two definitions (in both cases it is a learning rate).

@zhengruifeng zhengruifeng changed the title [SPARK-21388][ML][PySpark] GBTs inherit from HasStepSize & LInearSVC/Binarizer from HasThreshold [SPARK-21388][ML][PySpark] GBTs inherit from HasStepSize & LInearSVC from HasThreshold Jul 24, 2017
@SparkQA
Copy link

SparkQA commented Jul 24, 2017

Test build #79897 has finished for PR 18612 at commit aaa9187.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


@inherit_doc
class Binarizer(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadable, JavaMLWritable):
class Binarizer(JavaTransformer, HasInputCol, HasOutputCol, HasThreshold,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to be reverted.

kwargs = self._input_kwargs
return self._set(**kwargs)

@since("1.4.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and similarly here, added back

@SparkQA
Copy link

SparkQA commented Jul 27, 2017

Test build #80001 has finished for PR 18612 at commit fb2ce45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class Binarizer(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadable, JavaMLWritable):

Copy link
Contributor

@yanboliang yanboliang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yanboliang
Copy link
Contributor

Merged into master, thanks for all.

@asfgit asfgit closed this in 253a07e Aug 1, 2017
@zhengruifeng zhengruifeng deleted the override_HasXXX branch August 2, 2017 01:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants