Skip to content

Added Java v8 reference and install code for MacOS#1209

Merged
miguelgfierro merged 3 commits intorecommenders-team:stagingfrom
yegorkryukov:yegorkryukov/update_setup_guide
Nov 3, 2020
Merged

Added Java v8 reference and install code for MacOS#1209
miguelgfierro merged 3 commits intorecommenders-team:stagingfrom
yegorkryukov:yegorkryukov/update_setup_guide

Conversation

@yegorkryukov
Copy link
Contributor

Description

Added info about Java v8 needed for PySpark to work. Added MacOS install instructions.

Related Issues

#1207

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging and not master.

SETUP.md Outdated
<details>
<summary><strong><em>Java version for PySpark</em></strong></summary>

**Note.** By default we use pyspark v2.4.3. It doesn't work on Java versions >8.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yegorkryukov thanks for the contribution. In linux we are using 2.4.5, see this. Does 2.4.5 work with Java 8 on Mac?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miguelgfierro I'm not sure. My reco_pyspark environment uses Spark 2.4.3 and it works with Java 8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just installed 2.4.5 and now pyspark won't work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, not very stable :-)

would 2.4.5 work if you install java 9?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Didn't work with Java 9 either.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yegorkryukov I just tested on my mac (Catalina 10.15.7) with Spark 2.4.5 + Java 8 and I could run als_movielens notebook w/o any issues.
Few things we can improve the installation steps in this PR are:

  • JAVA8 installation on MacOS is relevant to readers only when they install pyspark. We can move the section under PySpark environment installation section. See Set PySpark environment variables on Linux or MacOS section e.g.. We can move JAVA8 installation section right before Set PySpark environment variables on Linux or MacOS.
  • At the end of JAVA8 installation instruction, let readers verify installation via java -version. Also, one may still use bash. We can simply add a comment: # in bash, run: . ~/.asdf/plugins/java/set-java-home.bash at the end of the instruction script.
  • @miguelgfierro I think we should remove SPARK_HOME (maybe we backup the path at env-activation and recover when deactivate env. I recall we had that codes before). If we set SPARK_HOME to spark installation path, pyspark uses that installed version instead of the version we installed (2.4.5).

Comment on lines +71 to +75
brew install asdf
asdf plugin add Java
asdf install java adoptopenjdk-8.0.265+1
asdf global java adoptopenjdk-8.0.265+1
. ~/.asdf/plugins/java/set-java-home.zsh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used a mac in a bit, but curious if asdf is needed here vs more direct use of adoptopenjdk?

brew tap AdoptOpenJDK/openjdk
brew cask install adoptopenjdk8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried this one more time rn and when I call java -version I get this:

openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment (build 14.0.1+14)
OpenJDK 64-Bit Server VM (build 14.0.1+14, mixed mode, sharing)

Which stops
So I chose asdf the first time because it set environmental variables correctly (or so I think).

I'm happy to use this solution though. What do you think should I do to fix this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think typically you would run something like this (and add it in .bashrc or .bash_profile)
export JAVA_HOME=$(/usr/libexec/java_home -v1.8);

but if asdf handles installation without extra setup it's fine. Does this require use of z-shell?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@loomlike can you test this out on mac and the 2.4.3 / 2.4.5 versioning issues?

Copy link
Contributor Author

@yegorkryukov yegorkryukov Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this require use of z-shell?

zsh is the default shell on Catalina now I believe

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wow, i'm really falling behind the times in os/x world =)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gramhagen I tested and everything looks good. Please see my comment in the previous thread.

SETUP.md Outdated

You can specify the environment name as well with the flag `-n`.

<details>
Copy link
Collaborator

@gramhagen gramhagen Sep 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are your thoughts about putting this in the section for PySpark setup? do you think that hides it too much? If so perhaps keeping the note here, but referring to Pyspark Environment setup for details on installing Java dependencies would be more organized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about this one for some time. Initially I've placed this under PySpark environment setup. But then I thought that I would need to duplicate this piece and place it under Full (PySpark & Python GPU) environment. Hence I've placed it above those two. If you think having this piece twice under those two section makes more sense I'll move it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to put the note that using PySpark 2.4.x requires Java 8 underneath the note about xlearn so it's visible at the top level. We can add a reference there to PySpark Environment section for more details.

we should add a section in PySpark environment about Installing Java for Mac.

The Full environment section already references PySpark, so no need to repeat it there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work?

Copy link
Collaborator

@miguelgfierro miguelgfierro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@loomlike loomlike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution and we are really sorry for delayed response! The PR looks great. I left few comments. Please check them, try to address them if possible, and make sure you merge the latest staging branch first before complete this PR.

Copy link
Collaborator

@gramhagen gramhagen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, we can tweak it later if we decide to move anything around. Thanks a lot for testing this out!

@miguelgfierro miguelgfierro merged commit d15f17f into recommenders-team:staging Nov 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants