Skip to content

Conversation

@Micket
Copy link
Contributor

@Micket Micket commented Mar 15, 2023

(created using eb --new-pr)

Fixes: #17489

@Micket Micket added the new label Mar 15, 2023
@Micket
Copy link
Contributor Author

Micket commented Mar 15, 2023

From todays zoom meeting, i whipped out what this might look like.

I didn't check if these versions are really compatible with 1.12 or not.

@boegel boegel added this to the 4.x milestone Mar 15, 2023
'sources': [{'download_filename': 'v0.14.1.tar.gz', 'filename': '%(name)s-%(version)s.tar.gz'}],
'checksums': ['ced67e1cf1f97e168cdf271851a4d0b6d382ab7936e7bcbb39aaa87239c324b6'],
}),
('torchtext', '0.14.1', {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work without additional effort I think, at least not if we want to be sure that the provided SentencePiece is used.

@PetrKralCZ has been fighting with this, and is about to open a PR for a torchtext easyconfig...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, i forgot about that sed thing while i was fighting the download filenames

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the old sed command that was used in earlier versions, completely untested. @PetrKralCZ if you have any feedback on this part i would appreciate it.

@lexming
Copy link
Contributor

lexming commented Mar 16, 2023

This approach looks fine to me 👍

(linked the issue to this PR)

Copy link
Contributor

@VRehnberg VRehnberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to look over the compatibility of the versions and the only clash I found was for torchdata.

boegel
boegel previously requested changes Mar 28, 2023
@@ -0,0 +1,59 @@
easyblock = 'PythonBundle'

name = 'PyTorch-bundle'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't PyTorch-extensions be a better name?

I think PyTorch-bundle was chosen in analogy to SciPy-bundle, but that's a collection of Python packages from different contexts, while here we're only bundling "official" PyTorch extensions (since they're all obtained via https://github.com/pytorch)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lexming @Micket Can we make up our minds here, so we can get this merged?

Copy link
Contributor

@VRehnberg VRehnberg Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Micket didn't have a strong opinion (see below) I'd say that you can probably decide @boegel. To add some nuance I'd like to note that it is a PythonBundle of PyTorch domain libraries under https://github.com/pytorch, though excluding PyTorch it self, and as it is a PythonBundle these libraries are added in an extension list. So, I think PyTorch-extensions and PyTorch-bundle can both make sense, or even PyTorch-extensions-bundle if you'd like to be even more specific.

I looked through the names of other bundles and these where the only other relevant ones I found (i.e. they didn't just have names corresponding to one of the dependencies or extensions):

  • ESL-Bundle
  • R-bundle-Bioconductor
  • SciPy-bundle

As this is the only thing remaining I'd appreciate if it could be resolved soon.

Copy link
Contributor

@lexming lexming Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vote goes for PyTorch-bundle

@boegel boegel modified the milestones: 4.x, next release (4.7.2?) Mar 28, 2023
@VRehnberg
Copy link
Contributor

I wouldn't mind having PyTorch TensorBoard Profiler https://github.com/pytorch/kineto/tree/main/tb_plugin in this bundle as well, but it depends on TensorBoard so I don't know if that would make it too problematic.

@boegel
Copy link
Member

boegel commented Apr 13, 2023

I wouldn't mind having PyTorch TensorBoard Profiler https://github.com/pytorch/kineto/tree/main/tb_plugin in this bundle as well, but it depends on TensorBoard so I don't know if that would make it too problematic.

The scope for this is PyTorch extensions that are developed under the PyTorch flag (i.e. in https://github.com/pytorch).
I'm not sure if pulling in 3rd-party stuff like tensorboard is wise...

@VRehnberg
Copy link
Contributor

VRehnberg commented Apr 14, 2023

I wouldn't mind having PyTorch TensorBoard Profiler https://github.com/pytorch/kineto/tree/main/tb_plugin in this bundle as well, but it depends on TensorBoard so I don't know if that would make it too problematic.

The scope for this is PyTorch extensions that are developed under the PyTorch flag (i.e. in https://github.com/pytorch). I'm not sure if pulling in 3rd-party stuff like tensorboard is wise...

Well PyTorch TensorBoard Profiler is in https://github.com/pytorch as you can see from the link I included, so that wasn't the issue here. It's rather a question of whether it would be problematic to pull in too much stuff into this bundle. It wouldn't be the only extension with dependencies either not only the only extension with dependencies in the extension list as portalocker will have to be added for torchdata.

I'm also unsure what failure modes to be aware of when there are several packages with the same python package as extension. Would this mean that PyTorch-bundle and TensorFlow couldn't be loaded at the same time for example if both have TensorBoard as an extension.

Edit: Might be that TensorBoard isn't technically a dependency just needed to visualize the results so might be a non-issue. I'll try. Edit 2: That seems to have been false at least.

@VRehnberg
Copy link
Contributor

VRehnberg commented Apr 14, 2023

I also ran into some incompatibilities with versions. I'm not sure whether we want to follow the compatibility matrices or not as that doesn't seem to have been the case historically (see e.g. recent torchtext vs compatibility matrix).

Other matrices:
https://github.com/pytorch/data/#version-compatibility
https://github.com/pytorch/text#installation
https://github.com/pytorch/vision#installation
https://pytorch.org/audio/main/installation.html
ignite doesn't have one.

Copy link
Contributor

@VRehnberg VRehnberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had problems with torchtext and torchaudio because of their submodules but otherwise these are the changes for which I've gotten the rest to work.

@Micket
Copy link
Contributor Author

Micket commented Apr 17, 2023

@VRehnberg and I decided that it was to much work to get the torchaudio thing working, it fetches and git-submodules a bunch of thirdparty sources during build step, and our SoX easyconfig uses an conflicting FFmpeg dep, so since it's not a super popular package we just dropped it for now.

So all that's left to decide if we should rename this PyTorch-extensions instead? I have no strong opinions.

@VRehnberg
Copy link
Contributor

Test report by @VRehnberg
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
alvis5-09 - Linux Rocky Linux 8.6, x86_64, Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz, 4 x NVIDIA NVIDIA A40, 520.61.05, Python 3.6.8
See https://gist.github.com/VRehnberg/8ffc9e0e55c3e9829a57c307e18783c5 for a full test report.

@VRehnberg
Copy link
Contributor

Test report by @VRehnberg
SUCCESS
Build succeeded for 2 out of 2 (1 easyconfigs in total)
alvis1-13 - Linux Rocky Linux 8.6, x86_64, Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz, 2 x NVIDIA Tesla V100-SXM2-32GB, 520.61.05, Python 3.6.8
See https://gist.github.com/VRehnberg/db9d6944068cf975aa82835f56326c4c for a full test report.

Copy link
Contributor

@lexming lexming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lexming
Copy link
Contributor

lexming commented Apr 18, 2023

@boegelbot: please test @ generoso

@boegelbot
Copy link
Collaborator

@lexming: Request for testing this PR well received on login1

PR test command 'EB_PR=17540 EB_ARGS= EB_CONTAINER= /opt/software/slurm/bin/sbatch --job-name test_PR_17540 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10670

Test results coming soon (I hope)...

Details

- notification for comment with ID 1512556697 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 2 (1 easyconfigs in total)
cns1 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/6cac8644dc63e4c9f13f3e96d232d63d for a full test report.

@Micket
Copy link
Contributor Author

Micket commented Apr 18, 2023

no GPUs on generoso, won't be able to build any CUDA stuff that has tests

@lexming
Copy link
Contributor

lexming commented Apr 18, 2023

oopsie...

@lexming
Copy link
Contributor

lexming commented Apr 18, 2023

@boegelbot please test @ jsc-zen2

@boegelbot
Copy link
Collaborator

@lexming: Request for testing this PR well received on jsczen2l1.int.jsc-zen2.easybuild-test.cluster

PR test command 'EB_PR=17540 EB_ARGS= /opt/software/slurm/bin/sbatch --mem-per-cpu=4000M --job-name test_PR_17540 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen2.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 2533

Test results coming soon (I hope)...

Details

- notification for comment with ID 1513031826 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (1 easyconfigs in total)
jsczen2c1.int.jsc-zen2.easybuild-test.cluster - Linux Rocky Linux 8.5, x86_64, AMD EPYC 7742 64-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/boegelbot/e02b06d33610719bfbecdf04e3829a1c for a full test report.

@lexming lexming dismissed boegel’s stale review April 20, 2023 16:32

use PyTorch-bundle for consistency with the name of other bundle easyconfigs

@lexming
Copy link
Contributor

lexming commented Apr 20, 2023

Going in, thanks @Micket and @VRehnberg !

@lexming lexming merged commit 8efb9fa into easybuilders:develop Apr 20, 2023
@Micket Micket deleted the 20230315175515_new_pr_PyTorch-bundle1121 branch April 21, 2023 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PyTorch packages collected into single easyconfig, e.g. torchvision as extension?

5 participants