Skip to content

Conversation

@HyukjinKwon
Copy link
Member

What changes were proposed in this pull request?

It seems wintuils for Hadoop 2.6.5 not exiting for now in https://github.com/steveloughran/winutils

This breaks the tests in SparkR on Windows so this PR proposes to use winutils built by Hadoop 2.6.4 for now.

How was this patch tested?

Manually via AppVeyor

Before

https://ci.appveyor.com/project/spark-test/spark/build/627-r-test-break

After

https://ci.appveyor.com/project/spark-test/spark/build/629-r-test-break

@HyukjinKwon HyukjinKwon changed the title [WIP][SPARK-19571][R] Fix SparkR test break on Windows via AppVeyor [SPARK-19571][R] Fix SparkR test break on Windows via AppVeyor Feb 14, 2017
@HyukjinKwon
Copy link
Member Author

cc @felixcheung and @shivaram

@HyukjinKwon
Copy link
Member Author

(I think I should cc @steveloughran too just FYI)

@shivaram
Copy link
Contributor

Thanks @HyukjinKwon for investigating ! LGTM

@felixcheung
Copy link
Member

felixcheung commented Feb 14, 2017 via email

@HyukjinKwon
Copy link
Member Author

Yea, I agree that the error message was hard to read. Maybe let me try to raise this issue in a JIRA.

@HyukjinKwon
Copy link
Member Author

cc @srowen too.

@SparkQA
Copy link

SparkQA commented Feb 14, 2017

Test build #72876 has finished for PR 16927 at commit 58af8fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

merged to master, thanks for the fix!

@asfgit asfgit closed this in f776e3b Feb 14, 2017
@HyukjinKwon
Copy link
Member Author

Thanks!

cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
## What changes were proposed in this pull request?

It seems wintuils for Hadoop 2.6.5 not exiting for now in https://github.com/steveloughran/winutils

This breaks the tests in SparkR on Windows so this PR proposes to use winutils built by Hadoop 2.6.4 for now.

## How was this patch tested?

Manually via AppVeyor

**Before**

https://ci.appveyor.com/project/spark-test/spark/build/627-r-test-break

**After**

https://ci.appveyor.com/project/spark-test/spark/build/629-r-test-break

Author: hyukjinkwon <[email protected]>

Closes apache#16927 from HyukjinKwon/spark-r-windows-break.
@HyukjinKwon
Copy link
Member Author

Hm.. @felixcheung, the origin seems from Hadoop's issue.

java.io.IOException: (null) entry in command string: null chmod 0644 C:\Users\appveyor\AppData\Local\Temp\1\RtmpCuLueF\spark-mlpb486948329c.tmp\rMetadata\_temporary\0\_temporary\attempt_20170214140636_0069_m_000000_70\part-00000
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:859)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:842)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:661)
        at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:501)
        at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:482)
        at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:498)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:467)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:801)

Probably it is https://issues.apache.org/jira/browse/HADOOP-10775. I will file an issue there after testing this with Hadoop 2.8+ when I am able to test.

@steveloughran
Copy link
Contributor

steveloughran commented Feb 24, 2017

I could add the 2.6.5 binaries if you want, though the 2.6.4 ones should be compatible.

I think I just lifted the 2.6.x artifacts out of an HDP build; its only the 2.7.x ones where I built things locally, and I don't think things were compiling the last time I tried to do a build on that VM.

Regarding your comment about the new IOE, yes, that is coming from the HADOOP-10775 changes.

Before: meaningless stack trace on startup irrespective of actual use, different meaningless stack trace on failures.
After: startup failures tucked away until later, actual uses intended to provide enough information that somebody competent can track down why things are broken.

Doesn't look like the message you've seen meets that criteria of being informative.

Following the stack trace, it seems to get triggered in an exec(String[] args) if one of the arguments is null Shell.java#L1192.

This implies that something has gone wrong in the invocation, but why I don't know, and the error message isn't doing anything to help. At the very least it should indicate which arg is invalid, and join the string with some form of quotations so that you can see what's up in more detail.

Now, what was the problem? Just that winutil.exe was missing? Because that's something we should be detecting and reporting with something including a link to a wiki page

Improvements to the hadoop logging/diagnostics welcome here. I'd also personally like the ability to tell have the file:// client to try to set file permissions, but there you go.,

@steveloughran
Copy link
Contributor

I'm worrying about this now: have my attempts to fix the messages gone horribly wrong. Admittedly, it was sitting in a Budapest airport with a post-ApacheCon hangover, but @afs was giving real time code reviews and I was trying to eliminate that problem "bad reporting if env is a mess".

Please open a Hadoop issue, assign to me, with as much stack trace and env diagnostics as you have. We're a couple of days away from Hadoop 2.8.0-alpha; its not too late to change the reporting

asfgit pushed a commit that referenced this pull request Jun 18, 2017
…n info in AppVeyor tests

## What changes were proposed in this pull request?

This PR proposes three things as below:

**Install packages per documentation** - this does not affect the tests itself (but CRAN which we are not doing via AppVeyor) up to my knowledge.

This adds `knitr` and `rmarkdown` per https://github.com/apache/spark/blob/45824fb608930eb461e7df53bb678c9534c183a9/R/WINDOWS.md#unit-tests (please see 45824fb)

**Improve logs/shorten logs** - actually, long logs can be a problem on AppVeyor (e.g., see #17873)

`R -e ...` repeats printing R information for each invocation as below:

```
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
```

It looks reducing the call might be slightly better and print out the versions together looks more readable.

Before:

```
# R information ...
> packageVersion('testthat')
[1] '1.0.2'
>
>

# R information ...
> packageVersion('e1071')
[1] '1.6.8'
>
>
... 3 more times
```

After:

```
# R information ...
> packageVersion('knitr'); packageVersion('rmarkdown'); packageVersion('testthat'); packageVersion('e1071'); packageVersion('survival')
[1] ‘1.16’
[1] ‘1.6’
[1] ‘1.0.2’
[1] ‘1.6.8’
[1] ‘2.41.3’
```

**Add`appveyor.yml`/`dev/appveyor-install-dependencies.ps1` for triggering the test**

Changing this file might break the test, e.g., #16927

## How was this patch tested?

Before (please see https://ci.appveyor.com/project/HyukjinKwon/spark/build/169-master)
After (please see the AppVeyor build in this PR):

Author: hyukjinkwon <[email protected]>

Closes #18336 from HyukjinKwon/minor-add-knitr-and-rmarkdown.

(cherry picked from commit 75a6d05)
Signed-off-by: Sean Owen <[email protected]>
asfgit pushed a commit that referenced this pull request Jun 18, 2017
…n info in AppVeyor tests

## What changes were proposed in this pull request?

This PR proposes three things as below:

**Install packages per documentation** - this does not affect the tests itself (but CRAN which we are not doing via AppVeyor) up to my knowledge.

This adds `knitr` and `rmarkdown` per https://github.com/apache/spark/blob/45824fb608930eb461e7df53bb678c9534c183a9/R/WINDOWS.md#unit-tests (please see 45824fb)

**Improve logs/shorten logs** - actually, long logs can be a problem on AppVeyor (e.g., see #17873)

`R -e ...` repeats printing R information for each invocation as below:

```
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
```

It looks reducing the call might be slightly better and print out the versions together looks more readable.

Before:

```
# R information ...
> packageVersion('testthat')
[1] '1.0.2'
>
>

# R information ...
> packageVersion('e1071')
[1] '1.6.8'
>
>
... 3 more times
```

After:

```
# R information ...
> packageVersion('knitr'); packageVersion('rmarkdown'); packageVersion('testthat'); packageVersion('e1071'); packageVersion('survival')
[1] ‘1.16’
[1] ‘1.6’
[1] ‘1.0.2’
[1] ‘1.6.8’
[1] ‘2.41.3’
```

**Add`appveyor.yml`/`dev/appveyor-install-dependencies.ps1` for triggering the test**

Changing this file might break the test, e.g., #16927

## How was this patch tested?

Before (please see https://ci.appveyor.com/project/HyukjinKwon/spark/build/169-master)
After (please see the AppVeyor build in this PR):

Author: hyukjinkwon <[email protected]>

Closes #18336 from HyukjinKwon/minor-add-knitr-and-rmarkdown.
@HyukjinKwon HyukjinKwon deleted the spark-r-windows-break branch January 2, 2018 03:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants