Skip to content
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -29,15 +29,18 @@ import org.apache.spark.sql.types._
* a single partition, and we use a single reducer to do the aggregation.).
*/
@ExpressionDescription(
usage = "_FUNC_(expr,isIgnoreNull) - Returns the last value of `child` for a group of rows.")
usage = """_FUNC_(expr,isIgnoreNull) - Returns the last value of `child` for a group of rows.
_FUNC_(expr,isIgnoreNull=false) - Returns the last value of `child` for a group of rows.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why repeat this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are added could you also add a line about this being a non-deterministic function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just happened to take a look at this PR. I manually built and tested. It seems printing the message as below:

spark-sql> DESCRIBE FUNCTION last;
Function: last
Class: org.apache.spark.sql.catalyst.expressions.aggregate.Last
Usage: last(expr,isIgnoreNull) - Returns the last value of `child` for a group of rows.
    last(expr,isIgnoreNull=false) - Returns the last value of `child` for a group of rows.
      If isIgnoreNull is true, returns only non-null values.

Maybe, it'd be nicer if the indentation is pretty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvanhovell @HyukjinKwon Thanks for the review. I'm simply following the usage string of other functions, e.g:

spark-sql> describe function first;
Function: first
Class: org.apache.spark.sql.catalyst.expressions.aggregate.First
Usage: first(expr) - Returns the first value of `child` for a group of rows.
    first(expr,isIgnoreNull=false) - Returns the first value of `child` for a group of rows.
      If isIgnoreNull is true, returns only non-null values.

spark-sql> describe function approx_count_distinct;
Function: approx_count_distinct
Class: org.apache.spark.sql.catalyst.expressions.aggregate.HyperLogLogPlusPlus
Usage: approx_count_distinct(expr) - Returns the estimated cardinality by HyperLogLog++.
    approx_count_distinct(expr, relativeSD=0.05) - Returns the estimated cardinality by HyperLogLog++
      with relativeSD, the maximum estimation error allowed.

So it seems the current convention is that: the first line is a short one-line description, followed by a detail description. Do we have any explicit "usage string style" to follow?

@hvanhovell I'll add the note about nondeterministic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I thought it should be as below if it dose not affect anything but only this:

spark-sql> DESCRIBE FUNCTION last;
Function: last
Class: org.apache.spark.sql.catalyst.expressions.aggregate.Last
Usage: last(expr,isIgnoreNull) - Returns the last value of `child` for a group of rows.
       last(expr,isIgnoreNull=false) - Returns the last value of `child` for a group of rows.
         If isIgnoreNull is true, returns only non-null values.

This was just my personal opinion.

If isIgnoreNull is true, returns only non-null values.
""")
case class Last(child: Expression, ignoreNullsExpr: Expression) extends DeclarativeAggregate {

def this(child: Expression) = this(child, Literal.create(false, BooleanType))

private val ignoreNulls: Boolean = ignoreNullsExpr match {
case Literal(b: Boolean, BooleanType) => b
case _ =>
throw new AnalysisException("The second argument of First should be a boolean literal.")
throw new AnalysisException("The second argument of Last should be a boolean literal.")
}

override def children: Seq[Expression] = child :: ignoreNullsExpr :: Nil
Expand Down