@@ -60,14 +60,44 @@ if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
fi

IGNORE_DEFAULT_DRIVER_JVM_OPTIONS=${IGNORE_DEFAULT_DRIVER_JVM_OPTIONS:-false}
Member:

Can we have DEFAULT_DRIVER_JVM_OPTIONS instead of this flag?

  1. By default, DEFAULT_DRIVER_JVM_OPTIONS=-XX:OnOutOfMemoryError="kill -9 %p", and it is added before spark.driver.extraJavaOptions.
  2. If users unset DEFAULT_DRIVER_JVM_OPTIONS, then only spark.driver.extraJavaOptions works.

At (1), if spark.driver.extraJavaOptions has -XX:OnOutOfMemoryError=, it will supersede DEFAULT_DRIVER_JVM_OPTIONS because the last one is used.
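A minimal sketch of this scheme, assuming the entrypoint assembles the driver options before invoking spark-submit (the user value below is hypothetical, not part of the PR):

```bash
#!/usr/bin/env bash
# Sketch only: illustrates the suggested DEFAULT_DRIVER_JVM_OPTIONS handling.

# (1) Use the OOM default unless the user already exported their own value
#     (an unset or empty variable falls back to the default because of :-).
DEFAULT_DRIVER_JVM_OPTIONS=${DEFAULT_DRIVER_JVM_OPTIONS:-'-XX:OnOutOfMemoryError="kill -9 %p"'}

# Hypothetical user-supplied spark.driver.extraJavaOptions value.
USER_EXTRA_JAVA_OPTIONS='-XX:OnOutOfMemoryError="kill -3 %p" -Dspark.test.foo=spark.test.bar'

# Defaults first, user options last: if the user also passes
# -XX:OnOutOfMemoryError=, their value comes later and supersedes the default.
echo "spark.driver.extraJavaOptions=$DEFAULT_DRIVER_JVM_OPTIONS $USER_EXTRA_JAVA_OPTIONS"
```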

Contributor Author (@skonto, Jul 30, 2019):

@dongjoon-hyun How can they unset DEFAULT_DRIVER_JVM_OPTIONS at runtime? I'm probably missing something here. Do you mean by setting the container env var, e.g. spark.kubernetes.driverEnv.DEFAULT_DRIVER_JVM_OPTIONS=""?

Contributor Author (@skonto, Jul 31, 2019):

@dongjoon-hyun The only option I see is setting DEFAULT_DRIVER_JVM_OPTIONS to an empty string to signal the script to skip the default, which is similar to setting that flag to true. I will rename the flag, but I'm not sure we are aligned.

> If users unset DEFAULT_DRIVER_JVM_OPTIONS

How should the user do that? Could you elaborate a bit more?

Contributor Author (@skonto):

@dongjoon-hyun gentle ping :)

Member:

Oops. Sorry for being late, @skonto. I'll take a look at this PR again tomorrow.

Member (@dongjoon-hyun, Aug 6, 2019):

Initially, what I had in mind was something like the following:

DEFAULT_DRIVER_JVM_OPTIONS=${DEFAULT_DRIVER_JVM_OPTIONS:--XX:OnOutOfMemoryError="kill -9 %p"}

And users would do export DEFAULT_DRIVER_JVM_OPTIONS=' ' to unset it. Yes, my earlier comment was unclear.
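For example, a user could blank the variable in the driver container through the spark.kubernetes.driverEnv. prefix. This is a hypothetical submission; the API server address, image name, and example jar path are placeholders:

```bash
# Blank out DEFAULT_DRIVER_JVM_OPTIONS in the driver container with a single
# space; an empty string would still fall back to the default because of the
# ${VAR:-default} expansion.
"$SPARK_HOME"/bin/spark-submit \
  --master k8s://https://kubernetes.example.com:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=spark:latest \
  --conf spark.kubernetes.driverEnv.DEFAULT_DRIVER_JVM_OPTIONS=' ' \
  local:///opt/spark/examples/jars/spark-examples.jar
```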

Member:

@skonto, what do you think about the above? To me, it looks more direct.

Contributor Author (@skonto, Aug 6, 2019):

@dongjoon-hyun OK, so that is what I also concluded :) I can do that, and it looks fine to me. The only issue is that this way we open the door to adding arbitrary options there. Initially that was what I didn't want to support, which is why I used the flag to make the defaults unchangeable (defaults usually have fixed values).
But we can be flexible here for simplicity, so I will change it to what you suggest.

DRIVER_VERBOSE=${DRIVER_VERBOSE:-false}

function get_verbose_flag()
{
  if [[ $DRIVER_VERBOSE == "true" ]]; then
    echo "--verbose"
  else
    echo ""
  fi
}

function get_args_with_defaults()
{
  if [[ $IGNORE_DEFAULT_DRIVER_JVM_OPTIONS == "true" ]]; then
    echo "$@"
  else
    if grep -q "spark.driver.extraJavaOptions" "/opt/spark/conf/spark.properties"; then
      sed 's/spark.driver.extraJavaOptions=/&-XX:OnOutOfMemoryError="kill -9 %p" /g' /opt/spark/conf/spark.properties > /tmp/spark.properties
Contributor Author (@skonto, Jul 30, 2019):

/opt/spark/conf/spark.properties cannot be written in place, as it's a mounted ConfigMap.

    else
      cp /opt/spark/conf/spark.properties /tmp/spark.properties
      echo 'spark.driver.extraJavaOptions=-XX:OnOutOfMemoryError="kill -9 %p"' >> /tmp/spark.properties
    fi
    echo "$@" | sed 's|/opt/spark/conf/spark.properties |/tmp/spark.properties |g'
Contributor Author (@skonto, Jul 30, 2019):

We cannot avoid sed if we want to make sure the default is always added. Otherwise, if we just add spark.driver.extraJavaOptions to CMD, a user-provided spark.driver.extraJavaOptions will override it, since Spark conf has precedence rules when the same property is set more than once (the last value is the one chosen).
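A small standalone illustration of that precedence point and of what the sed substitution produces (the properties file contents here are made up):

```bash
# Stand-in for the mounted, read-only spark.properties (contents made up).
cat > /tmp/demo.properties <<'EOF'
spark.driver.extraJavaOptions=-Dspark.test.foo=spark.test.bar
spark.driver.bindAddress=0.0.0.0
EOF

# Same substitution as above: '&' re-inserts the matched key, so the default
# becomes part of the single spark.driver.extraJavaOptions value rather than
# a duplicate property that a later user value would override.
sed 's/spark.driver.extraJavaOptions=/&-XX:OnOutOfMemoryError="kill -9 %p" /g' /tmp/demo.properties
# Prints:
#   spark.driver.extraJavaOptions=-XX:OnOutOfMemoryError="kill -9 %p" -Dspark.test.foo=spark.test.bar
#   spark.driver.bindAddress=0.0.0.0
```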

Member:

If you follow https://github.com/apache/spark/pull/25229/files#r310866032, the suggested behavior is the following. So, it seems that we don't need this complexity.

> At (1), if spark.driver.extraJavaOptions has -XX:OnOutOfMemoryError=, it will supersede DEFAULT_DRIVER_JVM_OPTIONS because the last one is used.

Contributor Author (@skonto):

OK, so we are saying that spark.driver.extraJavaOptions overrides whatever the default is.

  fi
}

case "$1" in
  driver)
    shift 1
    DRIVER_ARGS=$(get_args_with_defaults "$@")
    VERBOSE_FLAG=$(get_verbose_flag)
Contributor Author (@skonto, Jul 30, 2019):

We use this flag so that in the tests we can detect which properties we get. It is also useful in general when users want to debug Spark.

    CMD=(
      "$SPARK_HOME/bin/spark-submit"
      $VERBOSE_FLAG
      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
      --deploy-mode client
      "$@"
      $DRIVER_ARGS
    )
    ;;
  executor)

@@ -98,6 +98,28 @@ private[spark] trait BasicTestsSuite { k8sSuite: KubernetesSuite =>
      expectedJVMValue = Seq("(spark.test.foo,spark.test.bar)"))
  }

  test("Run SparkPi without the default exit on OOM error flag", k8sTestTag) {
    sparkAppConf
      .set("spark.driver.extraJavaOptions", "-Dspark.test.foo=spark.test.bar")
      .set("spark.kubernetes.driverEnv.DRIVER_VERBOSE", "true")
      .set("spark.kubernetes.driverEnv.IGNORE_DEFAULT_DRIVER_JVM_OPTIONS", "true")

    val output = Seq("Pi is roughly 3",
      "(spark.driver.extraJavaOptions,-Dspark.test.foo=spark.test.bar)")

    runSparkPiAndVerifyCompletion(expectedLogOnCompletion = output)
  }

  test("Run SparkPi with the default exit on OOM error flag set", k8sTestTag) {
    sparkAppConf
      .set("spark.driver.extraJavaOptions", "-Dspark.test.foo=spark.test.bar")
      .set("spark.kubernetes.driverEnv.DRIVER_VERBOSE", "true")
    val output = Seq("Pi is roughly 3",
      "(spark.driver.extraJavaOptions,-XX:OnOutOfMemoryError=\"kill -9 %p\" " +
        "-Dspark.test.foo=spark.test.bar)")
    runSparkPiAndVerifyCompletion(expectedLogOnCompletion = output)
  }
Member:

Thank you for adding these test cases.


  test("Run SparkRemoteFileTest using a remote data file", k8sTestTag) {
    sparkAppConf
      .set("spark.files", REMOTE_PAGE_RANK_DATA_FILE)

@@ -161,13 +161,14 @@ class KubernetesSuite extends SparkFunSuite
      appResource: String = containerLocalSparkDistroExamplesJar,
      driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
      executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
      expectedLogOnCompletion: Seq[String] = Seq("Pi is roughly 3"),
      appArgs: Array[String] = Array.empty[String],
      appLocator: String = appLocator,
      isJVM: Boolean = true ): Unit = {
    runSparkApplicationAndVerifyCompletion(
      appResource,
      SPARK_PI_MAIN_CLASS,
      Seq("Pi is roughly 3"),
      expectedLogOnCompletion,
      appArgs,
      driverPodChecker,
      executorPodChecker,