[SPARK-19460][SPARKR] Update dataset used in R documentation, examples to reduce warning noise and confusions
## What changes were proposed in this pull request?
Replace the `iris` dataset with `Titanic` or another dataset in the examples and documentation.
## How was this patch tested?
Manual testing and existing tests.
Author: wm624@hotmail.com <wm624@hotmail.com>
Closes #17032 from wangmiao1981/example.
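The commit message does not spell out where the warning noise comes from. A likely source, stated here as an assumption rather than something the commit confirms, is that `iris` has column names containing `.`, which SparkR's `createDataFrame` renames with `_` and warns about (the removed bisecting k-means example uses `Sepal_Length` and `Sepal_Width`), whereas the columns of `as.data.frame(Titanic)` contain no dots. A minimal base R sketch that checks the column names without a Spark session; `has_dot` is a hypothetical helper introduced only for this illustration:

```r
# Base R only; no Spark session is needed for this check.
# Assumption: column names containing "." are the ones SparkR renames (with a warning).
has_dot <- function(df) grepl(".", names(df), fixed = TRUE)

# as.data.frame() flattens the 4-way Titanic table into one row per cell.
titanic_df <- as.data.frame(Titanic)
titanic_names <- names(titanic_df)

# Column names that would need renaming in each dataset.
print(names(iris)[has_dot(iris)])
print(titanic_names[has_dot(titanic_df)])

# Shape of the flattened Titanic data.
print(dim(titanic_df))
```

If the assumption holds, this also explains why the `warning=FALSE` chunk option is dropped in the hunks of this commit.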
Multinomial logistic regression against three classes
-```{r, warning=FALSE}
-df <- createDataFrame(iris)
+```{r}
+t <- as.data.frame(Titanic)
+training <- createDataFrame(t)
 # Note in this case, Spark infers it is multinomial logistic regression, so family = "multinomial" is optional.
-model <- spark.logit(df, Species ~ ., regParam = 0.056)
+model <- spark.logit(training, Class ~ ., regParam = 0.07815179)
 summary(model)
 ```
@@ -609,11 +609,12 @@ MLPC employs backpropagation for learning the model. We use the logistic loss fu
 `spark.mlp` requires at least two columns in `data`: one named `"label"` and the other one `"features"`. The `"features"` column should be in libSVM-format.
-We use iris data set to show how to use `spark.mlp` in classification.
-```{r, warning=FALSE}
-df <- createDataFrame(iris)
+We use Titanic data set to show how to use `spark.mlp` in classification.
+```{r}
+t <- as.data.frame(Titanic)
+training <- createDataFrame(t)
 # fit a Multilayer Perceptron Classification Model
`spark.bisectingKmeans` is a kind of [hierarchical clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) using a divisive (or "top-down") approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
-```{r, warning=FALSE}
-df <- createDataFrame(iris)
-model <- spark.bisectingKmeans(df, Sepal_Length ~ Sepal_Width, k = 4)
+```{r}
+t <- as.data.frame(Titanic)
+training <- createDataFrame(t)
+model <- spark.bisectingKmeans(training, Class ~ Survived, k = 4)