58 changes: 29 additions & 29 deletions R/pkg/R/DataFrame.R
Expand Up @@ -150,7 +150,7 @@ setMethod("explain",

#' isLocal
#'
#' Returns True if the `collect` and `take` methods can be run locally
#' Returns True if the \code{collect} and \code{take} methods can be run locally
#' (without any Spark executors).
#'
#' @param x A SparkDataFrame
Expand Down Expand Up @@ -182,7 +182,7 @@ setMethod("isLocal",
#' @param numRows the number of rows to print. Defaults to 20.
#' @param truncate whether truncate long strings. If \code{TRUE}, strings more than
#' 20 characters will be truncated. However, if set greater than zero,
#' truncates strings longer than `truncate` characters and all cells
#' truncates strings longer than \code{truncate} characters and all cells
#' will be aligned right.
#' @param ... further arguments to be passed to or from other methods.
#' @family SparkDataFrame functions
Expand Down Expand Up @@ -642,10 +642,10 @@ setMethod("unpersist",
#' The following options for repartition are possible:
#' \itemize{
#' \item{1.} {Return a new SparkDataFrame partitioned by
#' the given columns into `numPartitions`.}
#' \item{2.} {Return a new SparkDataFrame that has exactly `numPartitions`.}
#' the given columns into \code{numPartitions}.}
#' \item{2.} {Return a new SparkDataFrame that has exactly \code{numPartitions}.}
#' \item{3.} {Return a new SparkDataFrame partitioned by the given column(s),
#' using `spark.sql.shuffle.partitions` as number of partitions.}
#' using \code{spark.sql.shuffle.partitions} as number of partitions.}
#'}
#' @param x a SparkDataFrame.
#' @param numPartitions the number of partitions to use.
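
For reference, a minimal sketch of the three repartition forms described above (the data frame and column names are illustrative):

df2 <- repartition(df, numPartitions = 10L)   # option 2: exactly numPartitions partitions
df3 <- repartition(df, col = df$dept)         # option 3: by column, using spark.sql.shuffle.partitions
df4 <- repartition(df, 10L, col = df$dept)    # option 1: by column into numPartitions
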
Expand Down Expand Up @@ -1406,11 +1406,11 @@ setMethod("dapplyCollect",
#'
#' @param cols grouping columns.
#' @param func a function to be applied to each group partition specified by grouping
#' column of the SparkDataFrame. The function `func` takes as argument
#' column of the SparkDataFrame. The function \code{func} takes as argument
#' a key - grouping columns and a data frame - a local R data.frame.
#' The output of `func` is a local R data.frame.
#' The output of \code{func} is a local R data.frame.
#' @param schema the schema of the resulting SparkDataFrame after the function is applied.
#' The schema must match to output of `func`. It has to be defined for each
#' The schema must match to output of \code{func}. It has to be defined for each
#' output column with preferred output column name and corresponding data type.
#' @return A SparkDataFrame.
#' @family SparkDataFrame functions
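
To illustrate the func/schema contract described above, a small hedged sketch (built on mtcars purely for illustration):

df <- createDataFrame(mtcars)
schema <- structType(structField("cyl", "double"), structField("avg_mpg", "double"))
result <- gapply(df, "cyl",
                 function(key, x) {
                   # key holds the grouping value(s); x is the group as a local R data.frame
                   data.frame(cyl = key[[1]], avg_mpg = mean(x$mpg))
                 },
                 schema)
head(collect(result))
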
Expand Down Expand Up @@ -1497,9 +1497,9 @@ setMethod("gapply",
#'
#' @param cols grouping columns.
#' @param func a function to be applied to each group partition specified by grouping
#' column of the SparkDataFrame. The function `func` takes as argument
#' column of the SparkDataFrame. The function \code{func} takes as argument
#' a key - grouping columns and a data frame - a local R data.frame.
#' The output of `func` is a local R data.frame.
#' The output of \code{func} is a local R data.frame.
#' @return A data.frame.
#' @family SparkDataFrame functions
#' @aliases gapplyCollect,SparkDataFrame-method
Expand Down Expand Up @@ -1747,7 +1747,7 @@ setMethod("[", signature(x = "SparkDataFrame"),
#' @family subsetting functions
#' @examples
#' \dontrun{
#' # Columns can be selected using `[[` and `[`
#' # Columns can be selected using [[ and [
#' df[[2]] == df[["age"]]
#' df[,2] == df[,"age"]
#' df[,c("name", "age")]
Expand Down Expand Up @@ -1792,7 +1792,7 @@ setMethod("subset", signature(x = "SparkDataFrame"),
#' select(df, df$name, df$age + 1)
#' select(df, c("col1", "col2"))
#' select(df, list(df$name, df$age + 1))
#' # Similar to R data frames columns can also be selected using `$`
#' # Similar to R data frames columns can also be selected using $
#' df[,df$age]
#' }
#' @note select(SparkDataFrame, character) since 1.4.0
Expand Down Expand Up @@ -2443,7 +2443,7 @@ generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) {
#' Return a new SparkDataFrame containing the union of rows
#'
#' Return a new SparkDataFrame containing the union of rows in this SparkDataFrame
#' and another SparkDataFrame. This is equivalent to `UNION ALL` in SQL.
#' and another SparkDataFrame. This is equivalent to \code{UNION ALL} in SQL.
#' Note that this does not remove duplicate rows across the two SparkDataFrames.
#'
#' @param x A SparkDataFrame
Expand Down Expand Up @@ -2486,7 +2486,7 @@ setMethod("unionAll",

#' Union two or more SparkDataFrames
#'
#' Union two or more SparkDataFrames. This is equivalent to `UNION ALL` in SQL.
#' Union two or more SparkDataFrames. This is equivalent to \code{UNION ALL} in SQL.
#' Note that this does not remove duplicate rows across the two SparkDataFrames.
#'
#' @param x a SparkDataFrame.
Expand Down Expand Up @@ -2519,7 +2519,7 @@ setMethod("rbind",
#' Intersect
#'
#' Return a new SparkDataFrame containing rows only in both this SparkDataFrame
#' and another SparkDataFrame. This is equivalent to `INTERSECT` in SQL.
#' and another SparkDataFrame. This is equivalent to \code{INTERSECT} in SQL.
#'
#' @param x A SparkDataFrame
#' @param y A SparkDataFrame
Expand Down Expand Up @@ -2547,7 +2547,7 @@ setMethod("intersect",
#' except
#'
#' Return a new SparkDataFrame containing rows in this SparkDataFrame
#' but not in another SparkDataFrame. This is equivalent to `EXCEPT` in SQL.
#' but not in another SparkDataFrame. This is equivalent to \code{EXCEPT} in SQL.
#'
#' @param x a SparkDataFrame.
#' @param y a SparkDataFrame.
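
A short sketch of the three set-style operations on SparkDataFrames (the toy data is illustrative):

df1 <- createDataFrame(data.frame(name = c("a", "b"), age = c(1, 2)))
df2 <- createDataFrame(data.frame(name = c("b", "c"), age = c(2, 3)))
unioned <- rbind(df1, df2)      # UNION ALL semantics: duplicate rows are kept
common  <- intersect(df1, df2)  # rows present in both inputs
onlyIn1 <- except(df1, df2)     # rows in df1 that are not in df2
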
Expand Down Expand Up @@ -2576,8 +2576,8 @@ setMethod("except",

#' Save the contents of SparkDataFrame to a data source.
#'
#' The data source is specified by the `source` and a set of options (...).
#' If `source` is not specified, the default data source configured by
#' The data source is specified by the \code{source} and a set of options (...).
#' If \code{source} is not specified, the default data source configured by
#' spark.sql.sources.default will be used.
#'
#' Additionally, mode is used to specify the behavior of the save operation when data already
Expand Down Expand Up @@ -2613,7 +2613,7 @@ setMethod("except",
#' @note write.df since 1.4.0
setMethod("write.df",
signature(df = "SparkDataFrame", path = "character"),
function(df, path, source = NULL, mode = "error", ...){
function(df, path, source = NULL, mode = "error", ...) {
if (is.null(source)) {
source <- getDefaultSqlSource()
}
Expand All @@ -2635,14 +2635,14 @@ setMethod("write.df",
#' @note saveDF since 1.4.0
setMethod("saveDF",
signature(df = "SparkDataFrame", path = "character"),
function(df, path, source = NULL, mode = "error", ...){
function(df, path, source = NULL, mode = "error", ...) {
write.df(df, path, source, mode, ...)
})
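
As a usage sketch (paths and formats are illustrative; "overwrite" replaces any existing output at the path):

write.df(df, path = "/tmp/people.parquet", source = "parquet", mode = "overwrite")
# saveDF simply forwards its arguments to write.df
saveDF(df, path = "/tmp/people.json", source = "json", mode = "error")
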

#' Save the contents of the SparkDataFrame to a data source as a table
#'
#' The data source is specified by the `source` and a set of options (...).
#' If `source` is not specified, the default data source configured by
#' The data source is specified by the \code{source} and a set of options (...).
#' If \code{source} is not specified, the default data source configured by
#' spark.sql.sources.default will be used.
#'
#' Additionally, mode is used to specify the behavior of the save operation when
Expand Down Expand Up @@ -2675,7 +2675,7 @@ setMethod("saveDF",
#' @note saveAsTable since 1.4.0
setMethod("saveAsTable",
signature(df = "SparkDataFrame", tableName = "character"),
function(df, tableName, source = NULL, mode="error", ...){
function(df, tableName, source = NULL, mode="error", ...) {
if (is.null(source)) {
source <- getDefaultSqlSource()
}
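
A hedged sketch of saving to a catalog table and querying it back (the table name and format are illustrative):

saveAsTable(df, tableName = "people", source = "parquet", mode = "overwrite")
adults <- sql("SELECT * FROM people WHERE age > 18")
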
Expand Down Expand Up @@ -2756,7 +2756,7 @@ setMethod("summary",
#' @param minNonNulls if specified, drop rows that have less than
#' minNonNulls non-null values.
Contributor

\code{minNonNulls}?

#' This overwrites the how parameter.
#' @param cols optional list of column names to consider. In `fillna`,
#' @param cols optional list of column names to consider. In \code{fillna},
#' columns specified in cols that do not have matching data
#' type are ignored. For example, if value is a character, and
#' subset contains a non-character column, then the non-character
Expand Down Expand Up @@ -2880,7 +2880,7 @@ setMethod("fillna",
#'
#' @param x a SparkDataFrame.
#' @param row.names NULL or a character vector giving the row names for the data frame.
Contributor @junyangq, Aug 20, 2016

I'm not sure if we want to use \code{NULL} or not, considering we use \code{TRUE} for logical values. What do you think?

Member Author

We could - I don't feel very strongly about \code{TRUE} to begin with.

Member Author

Updated a few places where we were referencing NULL literally. There are more mentions of "null" in the DataFrame and Column function documentation, but they are in somewhat of a gray area - JVM null is mapped to R NA (not to NULL) - and we should look into the best way to name those functions and document them.

#' @param optional If `TRUE`, converting column names is optional.
#' @param optional If \code{TRUE}, converting column names is optional.
#' @param ... additional arguments to pass to base::as.data.frame.
#' @return A data.frame.
#' @family SparkDataFrame functions
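
A minimal sketch of collecting a SparkDataFrame to a local data.frame:

df <- createDataFrame(faithful)
localDf <- as.data.frame(df)   # pulls all rows to the driver as a base R data.frame
str(localDf)
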
Expand Down Expand Up @@ -3058,7 +3058,7 @@ setMethod("str",
#' @note drop since 2.0.0
setMethod("drop",
signature(x = "SparkDataFrame"),
function(x, col, ...) {
function(x, col) {
Contributor

Just to clarify, removing ... is intentional? Just wondering, since we have the @param documentation for it above.

Contributor

This actually follows from the discussion in #14705. A summary may be seen at #14735 (comment).

Contributor

Thanks - that sounds good.

Member Author

Right - in fact, this one was added in #14705; we missed it there and it shouldn't have been added.

stopifnot(class(col) == "character" || class(col) == "Column")

if (class(col) == "Column") {
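
For context on the signature above, a minimal sketch of drop with a name or a Column (mtcars is illustrative):

df <- createDataFrame(mtcars)
df1 <- drop(df, "wt")    # drop a column by name
df2 <- drop(df, df$wt)   # or by Column object
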
Expand Down Expand Up @@ -3218,8 +3218,8 @@ setMethod("histogram",
#' and to not change the existing data.
#' }
#'
#' @param x s SparkDataFrame.
#' @param url JDBC database url of the form `jdbc:subprotocol:subname`.
#' @param x a SparkDataFrame.
#' @param url JDBC database url of the form \code{jdbc:subprotocol:subname}.
#' @param tableName the name of the table in the external database.
#' @param mode one of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default).
#' @param ... additional JDBC database connection properties.
Expand All @@ -3237,7 +3237,7 @@ setMethod("histogram",
#' @note write.jdbc since 2.0.0
setMethod("write.jdbc",
signature(x = "SparkDataFrame", url = "character", tableName = "character"),
function(x, url, tableName, mode = "error", ...){
function(x, url, tableName, mode = "error", ...) {
jmode <- convertToJSaveMode(mode)
jprops <- varargsToJProperties(...)
write <- callJMethod(x@sdf, "write")
Expand Down
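
A hedged sketch of write.jdbc; the connection URL, credentials, and table name below are placeholders:

jdbcUrl <- "jdbc:postgresql://localhost:5432/testdb"
write.jdbc(df, jdbcUrl, tableName = "people", mode = "append",
           user = "username", password = "password")
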
30 changes: 15 additions & 15 deletions R/pkg/R/SQLContext.R
Expand Up @@ -115,7 +115,7 @@ infer_type <- function(x) {
#' Get Runtime Config from the current active SparkSession
#'
#' Get Runtime Config from the current active SparkSession.
#' To change SparkSession Runtime Config, please see `sparkR.session()`.
#' To change SparkSession Runtime Config, please see \code{sparkR.session()}.
#'
#' @param key (optional) The key of the config to get, if omitted, all config is returned
#' @param defaultValue (optional) The default value of the config to return if the config is not
Expand Down Expand Up @@ -720,11 +720,11 @@ dropTempView <- function(viewName) {
#'
#' Returns the dataset in a data source as a SparkDataFrame
#'
#' The data source is specified by the `source` and a set of options(...).
#' If `source` is not specified, the default data source configured by
#' The data source is specified by the \code{source} and a set of options(...).
#' If \code{source} is not specified, the default data source configured by
#' "spark.sql.sources.default" will be used. \cr
#' Similar to R read.csv, when `source` is "csv", by default, a value of "NA" will be interpreted
#' as NA.
#' Similar to R read.csv, when \code{source} is "csv", by default, a value of "NA" will be
#' interpreted as NA.
#'
#' @param path The path of files to load
#' @param source The name of external data source
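
A short usage sketch (paths are illustrative; when source is omitted, spark.sql.sources.default is used):

people <- read.df("examples/src/main/resources/people.json", source = "json")
# with the csv source, the string "NA" is read as NA by default, as in read.csv
csvDf <- read.df("/tmp/people.csv", source = "csv", header = "true")
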
Expand Down Expand Up @@ -791,8 +791,8 @@ loadDF <- function(x, ...) {
#' Creates an external table based on the dataset in a data source,
#' Returns a SparkDataFrame associated with the external table.
#'
#' The data source is specified by the `source` and a set of options(...).
#' If `source` is not specified, the default data source configured by
#' The data source is specified by the \code{source} and a set of options(...).
#' If \code{source} is not specified, the default data source configured by
#' "spark.sql.sources.default" will be used.
#'
#' @param tableName a name of the table.
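
A minimal sketch, assuming data already exists at the given (illustrative) path:

extDf <- createExternalTable("people_ext", path = "/tmp/people.parquet", source = "parquet")
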
Expand Down Expand Up @@ -830,22 +830,22 @@ createExternalTable <- function(x, ...) {
#' Additional JDBC database connection properties can be set (...)
#'
#' Only one of partitionColumn or predicates should be set. Partitions of the table will be
#' retrieved in parallel based on the `numPartitions` or by the predicates.
#' retrieved in parallel based on the \code{numPartitions} or by the predicates.
#'
#' Don't create too many partitions in parallel on a large cluster; otherwise Spark might crash
#' your external database systems.
#'
#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
#' @param url JDBC database url of the form \code{jdbc:subprotocol:subname}
#' @param tableName the name of the table in the external database
#' @param partitionColumn the name of a column of integral type that will be used for partitioning
#' @param lowerBound the minimum value of `partitionColumn` used to decide partition stride
#' @param upperBound the maximum value of `partitionColumn` used to decide partition stride
#' @param numPartitions the number of partitions, This, along with `lowerBound` (inclusive),
#' `upperBound` (exclusive), form partition strides for generated WHERE
#' clause expressions used to split the column `partitionColumn` evenly.
#' @param lowerBound the minimum value of \code{partitionColumn} used to decide partition stride
#' @param upperBound the maximum value of \code{partitionColumn} used to decide partition stride
#' @param numPartitions the number of partitions, This, along with \code{lowerBound} (inclusive),
#' \code{upperBound} (exclusive), form partition strides for generated WHERE
#' clause expressions used to split the column \code{partitionColumn} evenly.
#' This defaults to SparkContext.defaultParallelism when unset.
#' @param predicates a list of conditions in the where clause; each one defines one partition
#' @param ... additional JDBC database connection named propertie(s).
#' @param ... additional JDBC database connection named properties.
#' @return SparkDataFrame
#' @rdname read.jdbc
#' @name read.jdbc
Expand Down
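
A hedged sketch of a partitioned JDBC read; the URL, credentials, table name, and column bounds are placeholders:

df <- read.jdbc("jdbc:postgresql://localhost:5432/testdb", "people",
                partitionColumn = "id", lowerBound = 0, upperBound = 10000,
                numPartitions = 4, user = "username", password = "password")
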
22 changes: 11 additions & 11 deletions R/pkg/R/WindowSpec.R
Expand Up @@ -125,11 +125,11 @@ setMethod("orderBy",

#' rowsBetween
#'
#' Defines the frame boundaries, from `start` (inclusive) to `end` (inclusive).
#' Defines the frame boundaries, from \code{start} (inclusive) to \code{end} (inclusive).
#'
#' Both `start` and `end` are relative positions from the current row. For example, "0" means
#' "current row", while "-1" means the row before the current row, and "5" means the fifth row
#' after the current row.
#' Both \code{start} and \code{end} are relative positions from the current row. For example,
#' "0" means "current row", while "-1" means the row before the current row, and "5" means the
#' fifth row after the current row.
#'
#' @param x a WindowSpec
#' @param start boundary start, inclusive.
Expand Down Expand Up @@ -157,12 +157,12 @@ setMethod("rowsBetween",

#' rangeBetween
#'
#' Defines the frame boundaries, from `start` (inclusive) to `end` (inclusive).
#' Defines the frame boundaries, from \code{start} (inclusive) to \code{end} (inclusive).
#'
#' Both \code{start} and \code{end} are relative from the current row. For example, "0" means
#' "current row", while "-1" means one off before the current row, and "5" means the five off
#' after the current row.
#'
#' Both `start` and `end` are relative from the current row. For example, "0" means "current row",
#' while "-1" means one off before the current row, and "5" means the five off after the
#' current row.

#' @param x a WindowSpec
#' @param start boundary start, inclusive.
#' The frame is unbounded if this is the minimum long value.
Expand Down Expand Up @@ -195,8 +195,8 @@ setMethod("rangeBetween",
#' Define a windowing column.
#'
#' @param x a Column, usually one returned by window function(s).
#' @param window a WindowSpec object. Can be created by `windowPartitionBy` or
#' `windowOrderBy` and configured by other WindowSpec methods.
#' @param window a WindowSpec object. Can be created by \code{windowPartitionBy} or
#' \code{windowOrderBy} and configured by other WindowSpec methods.
#' @rdname over
#' @name over
#' @aliases over,Column,WindowSpec-method
Expand Down
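
To tie the WindowSpec pieces together, a hedged sketch that assumes a SparkDataFrame df with "dept" and "salary" columns:

ws <- orderBy(windowPartitionBy("dept"), "salary")
# running sum over the current row and the two rows before it
runningSum <- over(sum(df$salary), rowsBetween(ws, -2, 0))
result <- select(df, df$dept, df$salary, alias(runningSum, "running_sum"))
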
2 changes: 1 addition & 1 deletion R/pkg/R/column.R
Expand Up @@ -284,7 +284,7 @@ setMethod("%in%",
#' otherwise
#'
#' If values in the specified column are null, returns the value.
#' Can be used in conjunction with `when` to specify a default value for expressions.
#' Can be used in conjunction with \code{when} to specify a default value for expressions.
#'
#' @param x a Column.
#' @param value value to replace when the corresponding entry in \code{x} is NA.
Expand Down
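
A minimal sketch of when combined with otherwise (the toy data is illustrative):

df <- createDataFrame(data.frame(age = c(10, NA, 25)))
# when() yields NULL for rows that do not match (including NA inputs); otherwise() supplies the fallback
label <- otherwise(when(df$age > 18, "adult"), "minor or unknown")
head(select(df, df$age, alias(label, "label")))
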