Commit 57ed4df

suqilong authored and Mridul Muralidharan committed
[SPARK-33669] Wrong error message from YARN application state monitor when sc.stop in yarn client mode
### What changes were proposed in this pull request?

This change makes InterruptedIOException be treated the same as InterruptedException when closing YarnClientSchedulerBackend, so that stopping no longer logs the error "YARN application has exited unexpectedly xxx".

### Why are the changes needed?

In YARN client mode, stopping YarnClientSchedulerBackend first interrupts the YARN application monitor thread. MonitorThread.run() catches InterruptedException so that it can respond gracefully to the stop request. However, client.monitorApplication can also throw InterruptedIOException when the interrupt arrives while a Hadoop RPC call is in progress. In that case MonitorThread does not know it was interrupted: a failed YARN application report is returned, and "Failed to contact YARN for application xxxxx; YARN application has exited unexpectedly with state xxxxx" is logged at error level, which confuses users.

### Does this PR introduce _any_ user-facing change?

Yes

### How was this patch tested?

No tests were added; the patch is very simple and did not seem to need any.

Closes #30617 from sqlwindspeaker/yarn-client-interrupt-monitor.

Authored-by: suqilong <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit 48f93af)
Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
1 parent fa50fa1 commit 57ed4df

2 files changed

Lines changed: 5 additions & 2 deletions

resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

Lines changed: 1 addition & 1 deletion
@@ -1069,7 +1069,7 @@ private[spark] class Client(
             logError(s"Application $appId not found.")
             cleanupStagingDir()
             return YarnAppReport(YarnApplicationState.KILLED, FinalApplicationStatus.KILLED, None)
-          case NonFatal(e) =>
+          case NonFatal(e) if !e.isInstanceOf[InterruptedIOException] =>
             val msg = s"Failed to contact YARN for application $appId."
             logError(msg, e)
             // Don't necessarily clean up staging dir because status is unknown
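
The guard in this hunk matters because `scala.util.control.NonFatal` excludes `InterruptedException` but still matches `InterruptedIOException`, which is an `IOException`. The following is a minimal standalone sketch of that behavior, not code from Spark; `NonFatalGuardSketch` and `classify` are hypothetical names used only for illustration.

```scala
import java.io.InterruptedIOException
import scala.util.control.NonFatal

object NonFatalGuardSketch {
  // Hypothetical stand-in for a report-fetching call; not part of Spark.
  def classify(fetch: () => String): String =
    try {
      fetch()
    } catch {
      // Same shape as the patched case: the guard lets
      // InterruptedIOException escape to the caller.
      case NonFatal(e) if !e.isInstanceOf[InterruptedIOException] =>
        s"FAILED: ${e.getMessage}"
    }

  // True when the InterruptedIOException reaches the caller
  // instead of being converted into a FAILED result.
  def interruptPropagates(): Boolean =
    try {
      classify(() => throw new InterruptedIOException("interrupted"))
      false
    } catch {
      case _: InterruptedIOException => true
    }

  def main(args: Array[String]): Unit = {
    // An ordinary failure is still turned into a FAILED result.
    println(classify(() => throw new RuntimeException("rpc error")))
    // NonFatal on its own does match InterruptedIOException.
    println(NonFatal.unapply(new InterruptedIOException("interrupted")).isDefined)
    println(interruptPropagates())
  }
}
```

With the guard in place the exception propagates out of the `catch`, so whoever triggered the interrupt can handle it, as the second file in this commit does.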

resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala

Lines changed: 4 additions & 1 deletion
@@ -17,6 +17,8 @@
 
 package org.apache.spark.scheduler.cluster
 
+import java.io.InterruptedIOException
+
 import scala.collection.mutable.ArrayBuffer
 
 import org.apache.hadoop.yarn.api.records.YarnApplicationState
@@ -121,7 +123,8 @@ private[spark] class YarnClientSchedulerBackend(
         allowInterrupt = false
         sc.stop()
       } catch {
-        case e: InterruptedException => logInfo("Interrupting monitor thread")
+        case _: InterruptedException | _: InterruptedIOException =>
+          logInfo("Interrupting monitor thread")
       }
     }
 
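
The effect of the widened `catch` can be illustrated with a small standalone sketch. It assumes a blocking call that reports an interrupt as `InterruptedIOException`, which is how the commit message describes the Hadoop RPC path; `MonitorInterruptSketch`, `blockingRpc`, and `runAndInterrupt` are hypothetical names, not Spark or Hadoop APIs.

```scala
import java.io.InterruptedIOException
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicReference

object MonitorInterruptSketch {
  // Hypothetical blocking call: like an I/O layer, it reports an interrupt
  // as InterruptedIOException rather than InterruptedException.
  def blockingRpc(started: CountDownLatch): Unit =
    try {
      started.countDown()
      Thread.sleep(60000)
    } catch {
      case _: InterruptedException =>
        throw new InterruptedIOException("rpc interrupted")
    }

  // Starts a monitor thread, interrupts it, and returns what it recorded.
  def runAndInterrupt(): String = {
    val outcome = new AtomicReference[String]("unset")
    val started = new CountDownLatch(1)
    val monitor = new Thread(new Runnable {
      override def run(): Unit =
        try {
          blockingRpc(started)
          outcome.set("exited unexpectedly")
        } catch {
          // Both exception types are treated as a normal stop request.
          case _: InterruptedException | _: InterruptedIOException =>
            outcome.set("interrupted")
        }
    })
    monitor.start()
    started.await()     // wait until the monitor has entered the blocking call
    monitor.interrupt() // plays the role of the stop request
    monitor.join()
    outcome.get()
  }

  def main(args: Array[String]): Unit =
    println(runAndInterrupt())
}
```

In this sketch, if the `catch` listed only `InterruptedException`, the `InterruptedIOException` would escape `run()` as an uncaught exception instead of being handled as a stop request.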
