-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-19571][R] Fix SparkR test break on Windows via AppVeyor #16927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @felixcheung and @shivaram |
|
(I think I should cc @steveloughran too just FYI) |
|
Thanks @HyukjinKwon for investigating ! LGTM |
|
Thanks! Lets cc Sean on this too.
The error was very hard to read - as a separate fix, is there something in core or sql we could add to show better error when winutil is missing? Or maybe I have just missed the message?
|
|
Yea, I agree that the error message was hard to read. Maybe let me try to raise this issue in a JIRA. |
|
cc @srowen too. |
|
Test build #72876 has finished for PR 16927 at commit
|
|
merged to master, thanks for the fix! |
|
Thanks! |
## What changes were proposed in this pull request? It seems wintuils for Hadoop 2.6.5 not exiting for now in https://github.com/steveloughran/winutils This breaks the tests in SparkR on Windows so this PR proposes to use winutils built by Hadoop 2.6.4 for now. ## How was this patch tested? Manually via AppVeyor **Before** https://ci.appveyor.com/project/spark-test/spark/build/627-r-test-break **After** https://ci.appveyor.com/project/spark-test/spark/build/629-r-test-break Author: hyukjinkwon <[email protected]> Closes apache#16927 from HyukjinKwon/spark-r-windows-break.
|
Hm.. @felixcheung, the origin seems from Hadoop's issue. Probably it is https://issues.apache.org/jira/browse/HADOOP-10775. I will file an issue there after testing this with Hadoop 2.8+ when I am able to test. |
|
I could add the 2.6.5 binaries if you want, though the 2.6.4 ones should be compatible. I think I just lifted the 2.6.x artifacts out of an HDP build; its only the 2.7.x ones where I built things locally, and I don't think things were compiling the last time I tried to do a build on that VM. Regarding your comment about the new IOE, yes, that is coming from the HADOOP-10775 changes. Before: meaningless stack trace on startup irrespective of actual use, different meaningless stack trace on failures. Doesn't look like the message you've seen meets that criteria of being informative. Following the stack trace, it seems to get triggered in an This implies that something has gone wrong in the invocation, but why I don't know, and the error message isn't doing anything to help. At the very least it should indicate which arg is invalid, and join the string with some form of quotations so that you can see what's up in more detail. Now, what was the problem? Just that winutil.exe was missing? Because that's something we should be detecting and reporting with something including a link to a wiki page Improvements to the hadoop logging/diagnostics welcome here. I'd also personally like the ability to tell have the file:// client to try to set file permissions, but there you go., |
|
I'm worrying about this now: have my attempts to fix the messages gone horribly wrong. Admittedly, it was sitting in a Budapest airport with a post-ApacheCon hangover, but @afs was giving real time code reviews and I was trying to eliminate that problem "bad reporting if env is a mess". Please open a Hadoop issue, assign to me, with as much stack trace and env diagnostics as you have. We're a couple of days away from Hadoop 2.8.0-alpha; its not too late to change the reporting |
…n info in AppVeyor tests ## What changes were proposed in this pull request? This PR proposes three things as below: **Install packages per documentation** - this does not affect the tests itself (but CRAN which we are not doing via AppVeyor) up to my knowledge. This adds `knitr` and `rmarkdown` per https://github.com/apache/spark/blob/45824fb608930eb461e7df53bb678c9534c183a9/R/WINDOWS.md#unit-tests (please see 45824fb) **Improve logs/shorten logs** - actually, long logs can be a problem on AppVeyor (e.g., see #17873) `R -e ...` repeats printing R information for each invocation as below: ``` R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: i386-w64-mingw32/i386 (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. ``` It looks reducing the call might be slightly better and print out the versions together looks more readable. Before: ``` # R information ... > packageVersion('testthat') [1] '1.0.2' > > # R information ... > packageVersion('e1071') [1] '1.6.8' > > ... 3 more times ``` After: ``` # R information ... > packageVersion('knitr'); packageVersion('rmarkdown'); packageVersion('testthat'); packageVersion('e1071'); packageVersion('survival') [1] ‘1.16’ [1] ‘1.6’ [1] ‘1.0.2’ [1] ‘1.6.8’ [1] ‘2.41.3’ ``` **Add`appveyor.yml`/`dev/appveyor-install-dependencies.ps1` for triggering the test** Changing this file might break the test, e.g., #16927 ## How was this patch tested? Before (please see https://ci.appveyor.com/project/HyukjinKwon/spark/build/169-master) After (please see the AppVeyor build in this PR): Author: hyukjinkwon <[email protected]> Closes #18336 from HyukjinKwon/minor-add-knitr-and-rmarkdown. (cherry picked from commit 75a6d05) Signed-off-by: Sean Owen <[email protected]>
…n info in AppVeyor tests ## What changes were proposed in this pull request? This PR proposes three things as below: **Install packages per documentation** - this does not affect the tests itself (but CRAN which we are not doing via AppVeyor) up to my knowledge. This adds `knitr` and `rmarkdown` per https://github.com/apache/spark/blob/45824fb608930eb461e7df53bb678c9534c183a9/R/WINDOWS.md#unit-tests (please see 45824fb) **Improve logs/shorten logs** - actually, long logs can be a problem on AppVeyor (e.g., see #17873) `R -e ...` repeats printing R information for each invocation as below: ``` R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: i386-w64-mingw32/i386 (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. ``` It looks reducing the call might be slightly better and print out the versions together looks more readable. Before: ``` # R information ... > packageVersion('testthat') [1] '1.0.2' > > # R information ... > packageVersion('e1071') [1] '1.6.8' > > ... 3 more times ``` After: ``` # R information ... > packageVersion('knitr'); packageVersion('rmarkdown'); packageVersion('testthat'); packageVersion('e1071'); packageVersion('survival') [1] ‘1.16’ [1] ‘1.6’ [1] ‘1.0.2’ [1] ‘1.6.8’ [1] ‘2.41.3’ ``` **Add`appveyor.yml`/`dev/appveyor-install-dependencies.ps1` for triggering the test** Changing this file might break the test, e.g., #16927 ## How was this patch tested? Before (please see https://ci.appveyor.com/project/HyukjinKwon/spark/build/169-master) After (please see the AppVeyor build in this PR): Author: hyukjinkwon <[email protected]> Closes #18336 from HyukjinKwon/minor-add-knitr-and-rmarkdown.
What changes were proposed in this pull request?
It seems wintuils for Hadoop 2.6.5 not exiting for now in https://github.com/steveloughran/winutils
This breaks the tests in SparkR on Windows so this PR proposes to use winutils built by Hadoop 2.6.4 for now.
How was this patch tested?
Manually via AppVeyor
Before
https://ci.appveyor.com/project/spark-test/spark/build/627-r-test-break
After
https://ci.appveyor.com/project/spark-test/spark/build/629-r-test-break