Prevent JVM Segfault #294

jdye64 · 2021-11-03T16:42:29Z

Under certain circumstances, importing dask_cuda after any dask_sql import triggers a segfault in the JVM. In tests so far, adjusting the import order in context.py prevents this. Subsequent PRs may be required after further testing.
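The commit messages in this PR ("try import for dask_cuda", "noqa F401") point to a guarded import. A minimal sketch of such a guard is below. Its placement above the other imports in dask_sql/context.py and the `DASK_CUDA_IMPORTED` flag are assumptions made for illustration, not the merged code.

```python
# Sketch of an import-order guard. Assumed placement: above every other
# dask_sql import in dask_sql/context.py.
try:
    # Imported only for its side effects, so that dask_cuda is loaded
    # before the remaining dask_sql imports run.
    import dask_cuda  # noqa: F401

    DASK_CUDA_IMPORTED = True  # hypothetical flag, for illustration only
except ImportError:
    # CPU-only environments do not install dask_cuda; carry on without it.
    DASK_CUDA_IMPORTED = False
```

The `noqa: F401` marker keeps flake8 from flagging the import as unused, since nothing in the module refers to `dask_cuda` by name.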

codecov-commenter · 2021-11-03T17:22:48Z

Codecov Report

Merging #294 (1ba9a26) into main (e48d9c1) will decrease coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #294      +/-   ##
==========================================
- Coverage   95.99%   95.89%   -0.11%     
==========================================
  Files          64       65       +1     
  Lines        2797     2800       +3     
  Branches      421      418       -3     
==========================================
  Hits         2685     2685              
- Misses         71       73       +2     
- Partials       41       42       +1
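The percentages in the table can be checked from the raw counts, assuming coverage is computed as hits divided by total lines (hits + misses + partials). How Codecov rounds the displayed figures is not stated in the report, so the sketch below only reproduces the unrounded values.

```python
def coverage_percent(hits, misses, partials):
    """Coverage as a percentage of all tracked lines."""
    lines = hits + misses + partials
    return 100.0 * hits / lines

# Counts taken from the table above.
base = coverage_percent(hits=2685, misses=71, partials=41)  # main, 2797 lines
head = coverage_percent(hits=2685, misses=73, partials=42)  # #294, 2800 lines
delta = head - base  # negative: hits are unchanged while lines grew by 3

print(f"main: {base:.4f}%  #294: {head:.4f}%  delta: {delta:.4f}")
```

The drop comes entirely from the three added lines that are not fully covered; the number of hits does not change.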

| Impacted Files | Coverage Δ | |
|---|---|---|
| dask_sql/context.py | `99.09% <ø> (+<0.01%)` | ⬆️ |
| dask_sql/physical/utils/groupby.py | `100.00% <100.00%> (ø)` | |
| dask_sql/physical/utils/sort.py | `83.33% <0.00%> (-7.06%)` | ⬇️ |
| dask_sql/physical/rel/custom/__init__.py | `100.00% <0.00%> (ø)` | |
| dask_sql/physical/rel/custom/distributeby.py | `86.36% <0.00%> (ø)` | |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 77f1d87...1ba9a26.

jdye64 · 2021-11-03T18:19:56Z

I plan to add a pytest covering some of the common cases where we saw the segfault, so we can guard against them in the future. I'm compiling that list offline and will post the test in the coming days.
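One way such a guard could be written is sketched below. It is not the test from this PR: the helper names and the list of import orders are hypothetical. The idea is to run each import sequence in a fresh interpreter, so that a segfault kills only the child process and shows up as an exit status rather than taking the test runner down.

```python
import subprocess
import sys


def run_in_fresh_interpreter(code):
    """Run `code` in a new Python process and return its exit status.

    On POSIX, a negative status -N means the child was terminated by
    signal N, so a segfault (SIGSEGV) appears as a negative number.
    """
    completed = subprocess.run([sys.executable, "-c", code])
    return completed.returncode


# Hypothetical import orders to guard against; the real list was still
# being compiled at the time of this comment.
IMPORT_ORDERS = [
    "from dask_sql import Context; import dask_cuda",
    "import dask_cuda; from dask_sql import Context",
]


def crashed_orders(orders):
    """Return the snippets whose child process was killed by a signal."""
    return [code for code in orders if run_in_fresh_interpreter(code) < 0]
```

A test could then assert that `crashed_orders(IMPORT_ORDERS)` is empty. A missing package makes the child exit with a positive status, which this check deliberately does not count as a crash.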

jdye64 · 2021-11-04T18:51:01Z

rerun tests

charlesbluca · 2021-11-04T18:58:51Z

Cool, looks like this triggers the segfault! Is there any way we can wrap this up in a GPU-only test?

jdye64 · 2021-11-04T19:00:34Z

> Cool, looks like this triggers the segfault! Is there any way we can wrap this up in a GPU-only test?

I think it is actually still failing to find dask_cuda.

charlesbluca · 2021-11-04T19:05:15Z

Checking the gpuCI run, it looks like dask-cuda is there in the conda list step, so I'm assuming that importing it after dask_sql.Context is what's causing the segfault here. However, the host tests fail to find the package, since we don't install any GPU requirements there.

Would we need to add dask-cuda to the host test requirements, or is there a way to contain the conditions for this segfault (i.e. the dask-cuda import) in a test that can be disabled on host?
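The second option could look roughly like the sketch below, which skips the test when dask_cuda is not installed. The function names are hypothetical. Availability is checked with `importlib.util.find_spec` rather than `pytest.importorskip`, because the latter imports the module and would therefore change the import order the test is trying to exercise.

```python
import importlib.util


def module_available(name):
    """Report whether a top-level module is installed, without importing it."""
    return importlib.util.find_spec(name) is not None


def test_dask_cuda_import_after_dask_sql():
    # Hypothetical GPU-only test. pytest is imported lazily so this sketch
    # can be loaded even where pytest itself is absent.
    import pytest

    if not module_available("dask_cuda"):
        pytest.skip("dask_cuda is not installed (host / CPU-only run)")

    # The order under test: dask_sql first, dask_cuda afterwards.
    from dask_sql import Context  # noqa: F401
    import dask_cuda  # noqa: F401
```

On the host runs the test is reported as skipped, so dask-cuda would not need to be added to the host requirements.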

jdye64 · 2021-11-04T20:26:39Z

OK, good news. I commented out the fix in one commit and CI failed, which is actually good: it shows the test reproduces the segfault. I then restored the fix in a later commit and CI passes. I am therefore satisfied that this at least solves the segfault issue we are interested in.

jdye64 added 2 commits November 3, 2021 11:56

testing import order of java

f703601

import dask_cuda before any dask_sql import

7d57639

jdye64 marked this pull request as draft November 3, 2021 16:42

jdye64 added 4 commits November 3, 2021 16:50

try import for dask_cuda

8324c56

try import and noqa for unused import

369d261

fix simple syntax mistake

74a763a

formatting issues

67d4609

charlesbluca mentioned this pull request Nov 4, 2021

[FEA] 0.4 Release ? #257

Closed

testing for segfaults

ee5162f

more testing

e9783b1

randerzander assigned jdye64 Nov 4, 2021

jdye64 added 2 commits November 4, 2021 19:58

replace fix

5f93659

noqa F401

1ba9a26

jdye64 marked this pull request as ready for review November 8, 2021 17:14

ayushdg requested changes Nov 8, 2021

dask_sql/context.py (outdated, resolved)

charlesbluca reviewed Nov 8, 2021

tests/integration/fixtures.py (resolved)

address review comments

0dc2aa3

ayushdg requested changes Nov 8, 2021

dask_sql/context.py (outdated, resolved)

jdye64 added 2 commits November 8, 2021 15:26

change ordering

b104dff

Merge remote-tracking branch 'upstream/main' into segfault_fix

0793b27

charlesbluca merged commit b5766b7 into dask-contrib:main Nov 9, 2021

jdye64 deleted the segfault_fix branch November 10, 2021 15:14

charlesbluca mentioned this pull request Nov 10, 2021

[BUG] Occasional segfaults during query parsing/execution with ucx environments #297

Closed

pentschev mentioned this pull request Mar 7, 2022

[BUG] Segfaults on "select count(*) from test" with tables on top of cuDF DataFrames #415

Closed

ksonj mentioned this pull request Jun 17, 2022

[BUG] Segfault when running with pytest #586

Closed
