
Conversation

@warrenzhu25
Contributor

@warrenzhu25 warrenzhu25 commented Jul 8, 2020

What changes were proposed in this pull request?

Fix regression bug in load-spark-env.cmd with Spark 3.0.0

Why are the changes needed?

In cmd batch scripts, variables inside a parenthesized block are expanded when the block is parsed, not when each line runs, so setting a variable and then reading it within the same block doesn't work. As a result, `set SPARK_ENV_CMD=%SPARK_CONF_DIR%\%SPARK_ENV_CMD%` doesn't take effect, which caused the regression.
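A minimal sketch of the cmd behavior behind this bug (illustrative only; the variable names mirror load-spark-env.cmd, but this is not the actual script):

```bat
@echo off
rem Illustration of cmd parse-time expansion (not the real load-spark-env.cmd).
set SPARK_ENV_CMD=spark-env.cmd
if "%SPARK_ENV_LOADED%"=="" (
  set SPARK_CONF_DIR=%~dp0..\conf
  rem %SPARK_CONF_DIR% below was expanded when the whole block was parsed,
  rem i.e. before the line above ran, so it is still empty here.
  set SPARK_ENV_CMD=%SPARK_CONF_DIR%\%SPARK_ENV_CMD%
  rem Likewise, this "if exist" still sees the value SPARK_ENV_CMD had
  rem before the block started, so the reassignment is never observed.
  if exist %SPARK_ENV_CMD% call %SPARK_ENV_CMD%
)
rem Workarounds include "setlocal enabledelayedexpansion" with !VAR! syntax,
rem or restructuring so no variable is set and read in the same block.
```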

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually tested.

  1. Create a spark-env.cmd under the conf folder containing `echo spark-env.cmd`.
  2. Run the old load-spark-env.cmd: nothing is printed in the output.
  3. Run the fixed load-spark-env.cmd: `spark-env.cmd` is shown in the output.
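The manual test above could be scripted roughly as follows (a hypothetical sketch; `SPARK_HOME` pointing at an unpacked Spark distribution is an assumption, not from the PR):

```bat
rem Hypothetical repro sketch; assumes SPARK_HOME points at an unpacked
rem Spark distribution.
echo echo spark-env.cmd> "%SPARK_HOME%\conf\spark-env.cmd"
call "%SPARK_HOME%\bin\load-spark-env.cmd"
rem Broken 3.0.0 script: prints nothing.
rem Fixed script: prints "spark-env.cmd".
```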

@dongjoon-hyun
Member

cc @srowen

@dongjoon-hyun
Member

@warrenzhu25, could you elaborate on the regression you are mentioning here?

Fix regression bug in load-spark-env.cmd with Spark 3.0.0

@HyukjinKwon
Member

ok to test

@HyukjinKwon
Member

HyukjinKwon commented Jul 9, 2020

@warrenzhu25 it would be best to show the exact command you tried and its output, before and after the fix, explicitly in the PR description, especially since many developers here arguably don't have a Windows environment.

@SparkQA

SparkQA commented Jul 9, 2020

Test build #125441 has finished for PR 29044 at commit bb018bb.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

@warrenzhu25 As you see, we have a Windows OS test job in AppVeyor.
This patch consistently fails there. Could you take a look at that?

Member

@dongjoon-hyun dongjoon-hyun left a comment


Currently, master branch is healthy on Windows.

To be considered as a valid patch, this PR should pass AppVeyor test job on Windows.

@warrenzhu25
Contributor Author

@warrenzhu25 As you see, we have a Windows OS test job in AppVeyor.
This patch consistently fails there. Could you take a look at that?

As I see it, the failure seems unrelated to this change. It seems the test couldn't find the correct version of hadoop.dll.

@dongjoon-hyun
Member

It's directly relevant to this PR because your patch changes environment variables.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32227] Fix regression bug in load-spark-env.cmd with Spark 3.0.0 [WIP][SPARK-32227] Fix regression bug in load-spark-env.cmd with Spark 3.0.0 Jul 10, 2020
@dongjoon-hyun
Member

Please remove [WIP] from the title when AppVeyor passes on Windows. Thanks.

@warrenzhu25
Contributor Author

warrenzhu25 commented Jul 10, 2020

It's directly relevant to this PR because your patch is changing environment variable.

winutils is only affected by PATH and HADOOP_HOME, and I don't touch either. Also, my change simply reverts to the 2.4.4 version. Could you help rerun the tests?

@warrenzhu25
Contributor Author

@dongjoon-hyun Could you help retest this, as the failing tests might be unrelated?

@HyukjinKwon
Member

HyukjinKwon commented Jul 27, 2020

@warrenzhu25 you can push an empty commit to retrigger the tests, or rebase to sync with the master branch.
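For reference, a sketch of the empty-commit trick (demonstrated here in a throwaway repository; on a real PR branch you would commit and then push to the branch):

```shell
# Create a throwaway repo just to demonstrate the empty commit.
tmpdir="$(mktemp -d)"
cd "$tmpdir"
git init -q
# An empty commit changes no files but still moves HEAD, which is enough
# to retrigger CI once it is pushed.
git -c user.name=demo -c user.email=demo@example.com \
    commit --allow-empty -m "Empty commit to retrigger tests" -q
git log --oneline -1
# On the actual PR branch: git commit --allow-empty -m "..." && git push
```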

@dongjoon-hyun
Member

+1 for @HyukjinKwon 's comment.

@HyukjinKwon
Member

Okay, I am debugging a related issue and happened to take another look at this fix. The fix looks good. AppVeyor currently fails for a different issue, which I'll probably be able to fix soon.

@SparkQA

SparkQA commented Jul 27, 2020

Test build #126617 has finished for PR 29044 at commit d6219d8.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 27, 2020

Test build #126659 has finished for PR 29044 at commit d169de3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@warrenzhu25 warrenzhu25 changed the title [WIP][SPARK-32227] Fix regression bug in load-spark-env.cmd with Spark 3.0.0 [SPARK-32227] Fix regression bug in load-spark-env.cmd with Spark 3.0.0 Jul 27, 2020
@SparkQA

SparkQA commented Jul 28, 2020

Test build #126670 has finished for PR 29044 at commit 24af13c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 28, 2020

Test build #126680 has finished for PR 29044 at commit 4589e1b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@warrenzhu25
Contributor Author

@dongjoon-hyun Could you help merge this?

@HyukjinKwon
Member

Merged to master and branch-3.0.

@HyukjinKwon
Member

Thanks for working on this @warrenzhu25.

HyukjinKwon pushed a commit that referenced this pull request Jul 30, 2020
Fix regression bug in load-spark-env.cmd with Spark 3.0.0

In cmd batch scripts, variables inside a parenthesized block are expanded when the block is parsed, not when each line runs, so setting a variable and then reading it within the same block doesn't work. As a result, `set SPARK_ENV_CMD=%SPARK_CONF_DIR%\%SPARK_ENV_CMD%` doesn't take effect, which caused the regression.

No

Manually tested.
1. Create a spark-env.cmd under the conf folder containing `echo spark-env.cmd`.
2. Run the old load-spark-env.cmd: nothing is printed in the output.
3. Run the fixed load-spark-env.cmd: `spark-env.cmd` is shown in the output.

Closes #29044 from warrenzhu25/32227.

Lead-authored-by: Warren Zhu <[email protected]>
Co-authored-by: Warren Zhu <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 7437720)
Signed-off-by: HyukjinKwon <[email protected]>
@dongjoon-hyun
Member

Thank you all. I see that this passed the GitHub Action at 4589e1b.

@warrenzhu25 warrenzhu25 deleted the 32227 branch June 11, 2022 17:14