@WeichenXu123 WeichenXu123 commented Jul 31, 2019

What changes were proposed in this pull request?

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

How was this patch tested?

Run test

python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"

Before
The test "test_termination_sigterm" fails, and the daemon process does not exit.
After
The test passes.
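
For illustration only, the ordering described above (connect, pause, then request shutdown) might be sketched as follows. This is not the code from the PySpark test: the idle child process, `CHILD_CODE`, `start_and_terminate`, and `settle_seconds` are hypothetical stand-ins, and the length of the pause is an assumption.

```python
import signal
import subprocess
import sys
import time

# Hypothetical stand-in for the daemon: a child that idles until it is signalled.
CHILD_CODE = "import time; time.sleep(60)"


def start_and_terminate(settle_seconds=1.0):
    """Start the child, pause, send SIGTERM, and return the child's exit code."""
    child = subprocess.Popen([sys.executable, "-c", CHILD_CODE])
    # In the real test, this is where the test connection to the daemon happens.
    # Pause before requesting shutdown (the pause length is an assumption).
    time.sleep(settle_seconds)
    child.send_signal(signal.SIGTERM)
    # wait() without a timeout works on both Python 2.7 and Python 3.
    return child.wait()
```

On POSIX systems, the exit code returned for a child killed by a signal is the negated signal number, so a nonzero result here indicates that the child did exit in response to SIGTERM.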

SparkQA commented Jul 31, 2019

Test build #108472 has finished for PR 25315 at commit 545736f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-28582][PYSPARK] Fix pyspark daemon exit failed when receive SIGTERM on py3.7" to "[SPARK-28582][PYSPARK] Fix pyspark daemon exit failed when receive SIGTERM on Python 3.7" on Jul 31, 2019
@HyukjinKwon
Member

cc @JoshRosen as well.

@HyukjinKwon
Member

cc @shivaram and @felixcheung too, with whom I previously discussed a similar symptom in the R worker, though it had a different cause.

WeichenXu123 commented Aug 2, 2019

@HyukjinKwon The issue is specific to Python 3.7 and can only be triggered in the Python test case. I added a sleep in the test, which avoids the issue without changing the daemon code.

@HyukjinKwon HyukjinKwon left a comment

Let's fix the PR description and title. LGTM if the test passes.

@WeichenXu123 WeichenXu123 changed the title from "[SPARK-28582][PYSPARK] Fix pyspark daemon exit failed when receive SIGTERM on Python 3.7" to "[SPARK-28582][PYSPARK] Fix flaky test DaemonTests.do_termination_test which fail on Python 3.7" on Aug 2, 2019
@HyukjinKwon
Member

Tested in Python 3.7. LGTM

SparkQA commented Aug 2, 2019

Test build #108566 has finished for PR 25315 at commit 688b660.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 2, 2019

Test build #108565 has finished for PR 25315 at commit a334251.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

HyukjinKwon commented Aug 2, 2019

Merged to master, branch-2.4 and branch-2.3.

Thanks for investigating this @WeichenXu123.

HyukjinKwon pushed a commit that referenced this pull request Aug 2, 2019
… which fail on Python 3.7

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes #25315 from WeichenXu123/fix_py37_daemon.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit fbeee0c)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
```
# request shutdown
terminator(daemon)
time.sleep(1)
daemon.wait(5)
```

Hi, guys.

The timeout argument is a Python 3-only feature.

Since this will break the Python 2.7 tests in all branches, I'll revert this commit from all the branches.
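
As a rough sketch of a portable alternative, assuming polling is acceptable in a test: `Popen.wait(timeout=...)` was added in Python 3.3, while `Popen.poll()` is available on both Python 2.7 and Python 3. The helper name `wait_with_timeout` and its parameters are hypothetical and are not taken from the patch.

```python
import subprocess
import sys
import time


def wait_with_timeout(proc, timeout, interval=0.05):
    """Return the exit code, or None if proc is still running after `timeout` seconds."""
    deadline = time.time() + timeout
    # poll() returns None while the process is running and never blocks.
    while proc.poll() is None and time.time() < deadline:
        time.sleep(interval)
    return proc.poll()


if __name__ == "__main__":
    quick = subprocess.Popen([sys.executable, "-c", "pass"])
    print(wait_with_timeout(quick, 10))
```

Unlike `wait(timeout=...)`, which raises `subprocess.TimeoutExpired` on Python 3, this helper signals a timeout by returning `None`, so the caller decides whether to kill the process or fail the test.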

dongjoon-hyun commented Aug 2, 2019

@HyukjinKwon
Member

Oops, sorry it's my bad. Let me reopen the PR.

@dongjoon-hyun
Member

Thanks~ BTW, Jenkins will be shut down starting tonight.

HyukjinKwon pushed a commit to HyukjinKwon/spark that referenced this pull request Aug 3, 2019
… which fail on Python 3.7

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes apache#25315 from WeichenXu123/fix_py37_daemon.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Aug 3, 2019
…which fail on Python 3.7

This PR picks #25315 back up after removing the `Popen.wait` call, whose timeout argument exists in Python 3 only. I misread the last test results and thought they had passed.

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes #25343 from HyukjinKwon/SPARK-28582.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b3394db)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@WeichenXu123 WeichenXu123 deleted the fix_py37_daemon branch August 3, 2019 03:00
rluta pushed a commit to rluta/spark that referenced this pull request Sep 17, 2019
… which fail on Python 3.7

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes apache#25315 from WeichenXu123/fix_py37_daemon.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit fbeee0c)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
rluta pushed a commit to rluta/spark that referenced this pull request Sep 17, 2019
…which fail on Python 3.7

This PR picks apache#25315 back up after removing the `Popen.wait` call, whose timeout argument exists in Python 3 only. I misread the last test results and thought they had passed.

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes apache#25343 from HyukjinKwon/SPARK-28582.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b3394db)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Sep 26, 2019
… which fail on Python 3.7

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes apache#25315 from WeichenXu123/fix_py37_daemon.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit fbeee0c)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Sep 26, 2019
…which fail on Python 3.7

This PR picks apache#25315 back up after removing the `Popen.wait` call, whose timeout argument exists in Python 3 only. I misread the last test results and thought they had passed.

Fix the flaky test DaemonTests.do_termination_test, which fails on Python 3.7, by adding a sleep after the test connection to the daemon.

Run test
```
python/run-tests --python-executables=python3.7 --testname "pyspark.tests.test_daemon DaemonTests"
```
**Before**
The test "test_termination_sigterm" fails, and the daemon process does not exit.
**After**
The test passes.

Closes apache#25343 from HyukjinKwon/SPARK-28582.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b3394db)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>