Conversation

@boegel
Contributor

@boegel boegel commented Aug 21, 2025

fixes #70

@boegel boegel force-pushed the fix_parallel_limit branch from 7f8299a to 439749a Compare August 22, 2025 07:47
@trz42
Contributor

trz42 commented Aug 22, 2025

Testing new bot release v0.9.0 on Deucalion
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-deucalion arch:aarch64/a64fx

Edit: no response from the bot, but the bot log has an entry about the missing for: argument, so 👍 with respect to the test of bot v0.9.0.

@trz42
Contributor

trz42 commented Aug 22, 2025

Trying new bot build command syntax...
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-deucalion for:CPU=aarch64/a64fx

@boegel boegel force-pushed the fix_parallel_limit branch from 439749a to 123e17b Compare August 22, 2025 09:52

@boegel
Contributor Author

boegel commented Aug 22, 2025

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx
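
The build commands in this thread are a flat list of key:value arguments, and the first attempt without for: got no reply from the bot. A minimal sketch of how such a line could be split into its parts follows; the function name `parse_bot_build_command` and the error handling are assumptions made for illustration, not the bot's actual parser.

```python
def parse_bot_build_command(line: str) -> dict:
    """Split a 'bot: build ...' comment line into its key:value arguments.

    Hypothetical helper for illustration; not taken from the EESSI bot code.
    """
    prefix = "bot: build"
    if not line.startswith(prefix):
        raise ValueError(f"not a bot build command: {line!r}")
    args = {}
    for token in line[len(prefix):].split():
        # Each argument has the form key:value; split on the first colon only.
        key, sep, value = token.partition(":")
        if not sep:
            raise ValueError(f"malformed argument: {token!r}")
        args[key] = value
    # Assumption: treat a missing 'for:' argument as an error, mirroring the
    # bot log entry mentioned earlier in the thread.
    if "for" not in args:
        raise ValueError("missing 'for:' argument")
    # The 'for:' value is itself a name=value pair, e.g. arch=aarch64/a64fx.
    kind, _, target = args["for"].partition("=")
    args["for"] = {kind: target}
    return args


cmd = parse_bot_build_command(
    "bot: build repo:eessi.io-2025.06-software "
    "instance:eessi-bot-deucalion for:arch=aarch64/a64fx"
)
print(cmd["repo"], cmd["instance"], cmd["for"])
```

Splitting on the first colon only keeps values that contain further punctuation (such as the `/` and `=` in the target) intact.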

@eessi-bot-deucalion

eessi-bot-deucalion bot commented Aug 22, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2025.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.08/pr_71/521065

date job status comment
Aug 22 16:25:57 UTC 2025 submitted job id 521065 awaits release by job manager
Aug 22 16:26:20 UTC 2025 released job awaits launch by Slurm scheduler
Aug 22 16:27:33 UTC 2025 running job 521065 is running
Aug 22 21:05:19 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-521065.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-aarch64-a64fx-17558955850.tar.gz
size: 1437 MiB (1507592270 bytes)
entries: 2685
modules under 2025.06/software/linux/aarch64/a64fx/modules/all
GCC/14.3.0.lua
GCCcore/14.3.0.lua
software under 2025.06/software/linux/aarch64/a64fx/software
GCC/14.3.0
GCCcore/14.3.0
reprod directories under 2025.06/software/linux/aarch64/a64fx/reprod
GCC/14.3.0/20250822_204557UTC
GCCcore/14.3.0/
GCCcore/14.3.0/20250822_204549UTC
other under 2025.06/software/linux/aarch64/a64fx
2025.06/init/easybuild/eb_hooks.py
Aug 22 21:05:19 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-521065.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 25 16:34:07 UTC 2025 uploaded transfer of eessi-2025.06-software-linux-aarch64-a64fx-17558955850.tar.gz to S3 bucket succeeded
Aug 26 10:58:53 UTC 2025 uploaded transfer of eessi-2025.06-software-linux-aarch64-a64fx-17558955850.tar.gz to S3 bucket succeeded
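
The Details lists in the report above amount to a set of patterns that must or must not occur in the job output file. A rough sketch of that kind of check is shown here; the pattern strings are copied from the report, while the function name `check_build_log`, the use of regular expressions, and the sample log text are assumptions for illustration and do not come from the bot's source.

```python
import re

# Patterns copied from the bot's "Details" lists. Whether the bot treats them
# as regular expressions or plain substrings is an assumption of this sketch.
MUST_NOT_MATCH = [r"FATAL:", r"ERROR:", r"FAILED:", r"required modules missing:"]
MUST_MATCH = [r"No missing installations", r"\.tar\.gz created!"]


def check_build_log(text: str) -> dict:
    """Return {pattern: passed} for each check, in the spirit of the bot report."""
    results = {}
    for pattern in MUST_NOT_MATCH:
        # These checks pass only when the pattern is absent from the log.
        results[pattern] = re.search(pattern, text) is None
    for pattern in MUST_MATCH:
        # These checks pass only when the pattern is present in the log.
        results[pattern] = re.search(pattern, text) is not None
    return results


# Hypothetical log excerpt, made up for illustration only.
sample_log = "No missing installations\n/tmp/example.tar.gz created!\n"
report = check_build_log(sample_log)
print(all(report.values()))
```

Under this reading, a single line containing ERROR: anywhere in the output is enough to flip a check to ❌, which is consistent with the test step being reported as failed even though the build step succeeded.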

@boegel boegel force-pushed the fix_parallel_limit branch from caff884 to 8132905 Compare August 25, 2025 07:47
@boegel
Contributor Author

boegel commented Aug 25, 2025

4.5 hours is a lot more reasonable than 10 hours, so the test build of GCC 14.3.0 shows that the fix is working as intended.

I'll get rid of the test easystack file and trigger another build so we can deploy the updated hooks...
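
As a rough check of the 4.5 hours quoted above, the elapsed time can be recomputed from the running and finished timestamps the bot reported for job 521065. This is only a sketch: it measures the whole job rather than the GCC build alone, and the format string is an assumption matched to how the timestamps appear in this thread.

```python
from datetime import datetime

# Timestamps copied from the bot's status report for job 521065.
# The literal "UTC" in the format string is an assumption about the layout.
FMT = "%b %d %H:%M:%S UTC %Y"
started = datetime.strptime("Aug 22 16:27:33 UTC 2025", FMT)
finished = datetime.strptime("Aug 22 21:05:19 UTC 2025", FMT)

elapsed = finished - started
hours = elapsed.total_seconds() / 3600
print(f"{elapsed} ({hours:.2f} hours)")  # 4:37:46 (4.63 hours)
```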

@boegel
Contributor Author

boegel commented Aug 25, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2

@boegel boegel marked this pull request as ready for review August 25, 2025 07:48
@eessi-bot-aws

eessi-bot-aws bot commented Aug 25, 2025

New job on instance eessi-bot-mc-aws for repository eessi.io-2023.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2025.08/pr_71/85364

date job status comment
Aug 25 07:48:08 UTC 2025 submitted job id 85364 awaits release by job manager
Aug 25 07:48:38 UTC 2025 released job awaits launch by Slurm scheduler
Aug 25 07:53:41 UTC 2025 running job 85364 is running
Aug 25 07:57:46 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-85364.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-17561084470.tar.gz
size: 0 MiB (19834 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
reprod directories under 2023.06/software/linux/x86_64/amd/zen2/reprod
no reprod directories in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/init/easybuild/eb_hooks.py
Aug 25 07:57:46 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:x86_64_amd_zen2+default
P: perf: 440.795 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:x86_64_amd_zen2+default
P: perf: 432.192 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /775175bf @BotBuildTests:x86_64_amd_zen2+default
P: latency: 1.77 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /52707c40 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 1.94 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /b1aacda9 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 3.93 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /c6bad193 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 4.18 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 0.56 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:x86_64_amd_zen2+default
P: latency: 0.62 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:x86_64_amd_zen2+default
P: bandwidth: 7381.4 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:x86_64_amd_zen2+default
P: bandwidth: 7409.59 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-85364.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 25 16:33:44 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen2-17561084470.tar.gz to S3 bucket succeeded
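
The test check in the report above keys off the final ReFrame summary line. A small sketch of pulling the counts out of that line follows; the regular expression and the name `SUMMARY_RE` are written for this illustration and are not the pattern the bot itself uses.

```python
import re

# Final summary line copied from the ReFrame output above.
summary = ("[ PASSED ] Ran 10/10 test case(s) from 10 check(s) "
           "(0 failure(s), 0 skipped, 0 aborted)")

# Illustrative pattern (an assumption, not the bot's own expression).
SUMMARY_RE = re.compile(
    r"\[\s*(?P<status>PASSED|FAILED)\s*\]\s+"
    r"Ran (?P<ran>\d+)/(?P<total>\d+) test case"
    r".*\((?P<failures>\d+) failure\(s\), "
    r"(?P<skipped>\d+) skipped, (?P<aborted>\d+) aborted\)"
)

match = SUMMARY_RE.search(summary)
# Keep the status as text and convert every count to an integer.
counts = {
    key: (value if key == "status" else int(value))
    for key, value in match.groupdict().items()
}
print(counts)
```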

@eessi-bot-aws

eessi-bot-aws bot commented Aug 25, 2025

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2025.08/pr_71/85365

date job status comment
Aug 25 07:48:13 UTC 2025 submitted job id 85365 awaits release by job manager
Aug 25 07:48:36 UTC 2025 released job awaits launch by Slurm scheduler
Aug 25 07:53:41 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-85365.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen2-17561083920.tar.gz
size: 0 MiB (19838 bytes)
entries: 1
modules under 2025.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2025.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
reprod directories under 2025.06/software/linux/x86_64/amd/zen2/reprod
no reprod directories in tarball
other under 2025.06/software/linux/x86_64/amd/zen2
2025.06/init/easybuild/eb_hooks.py
Aug 25 07:53:41 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-85365.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 25 16:33:52 UTC 2025 uploaded transfer of eessi-2025.06-software-linux-x86_64-amd-zen2-17561083920.tar.gz to S3 bucket succeeded

Contributor

@trz42 trz42 left a comment

lgtm

@trz42
Contributor

trz42 commented Aug 25, 2025

bot:status

@trz42 trz42 added the 2025.06-software.eessi.io (2025.06 version of software.eessi.io), 2023.06-software.eessi.io (2023.06 version of software.eessi.io), and bot:deploy labels Aug 25, 2025
@boegel
Contributor Author

boegel commented Aug 26, 2025

The staging tarballs also included the test build of GCC/14.3.0 on A64FX, which was not the intention, so I closed the staging PR.

Triggering another build to try to get only the updated eb_hooks.py deployed...

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2

@eessi-bot-aws

eessi-bot-aws bot commented Aug 26, 2025

New job on instance eessi-bot-mc-aws for repository eessi.io-2023.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2025.08/pr_71/85576

date job status comment
Aug 26 07:55:36 UTC 2025 submitted job id 85576 awaits release by job manager
Aug 26 07:55:51 UTC 2025 released job awaits launch by Slurm scheduler
Aug 26 08:00:55 UTC 2025 running job 85576 is running
Aug 26 08:04:00 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-85576.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-17561952790.tar.gz
size: 0 MiB (19831 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
reprod directories under 2023.06/software/linux/x86_64/amd/zen2/reprod
no reprod directories in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/init/easybuild/eb_hooks.py
Aug 26 08:04:00 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:x86_64_amd_zen2+default
P: perf: 441.664 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:x86_64_amd_zen2+default
P: perf: 443.675 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /775175bf @BotBuildTests:x86_64_amd_zen2+default
P: latency: 1.8 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /52707c40 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 1.72 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /b1aacda9 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 4.01 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /c6bad193 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 4.23 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 0.58 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:x86_64_amd_zen2+default
P: latency: 0.55 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:x86_64_amd_zen2+default
P: bandwidth: 7335.06 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:x86_64_amd_zen2+default
P: bandwidth: 7403.57 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-85576.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 26 10:58:32 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen2-17561952790.tar.gz to S3 bucket succeeded

@eessi-bot-aws

eessi-bot-aws bot commented Aug 26, 2025

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2025.08/pr_71/85577

date job status comment
Aug 26 07:55:40 UTC 2025 submitted job id 85577 awaits release by job manager
Aug 26 07:55:49 UTC 2025 released job awaits launch by Slurm scheduler
Aug 26 08:00:53 UTC 2025 running job 85577 is running
Aug 26 08:01:57 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-85577.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen2-17561952520.tar.gz
size: 0 MiB (19836 bytes)
entries: 1
modules under 2025.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2025.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
reprod directories under 2025.06/software/linux/x86_64/amd/zen2/reprod
no reprod directories in tarball
other under 2025.06/software/linux/x86_64/amd/zen2
2025.06/init/easybuild/eb_hooks.py
Aug 26 08:01:57 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-85577.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 26 10:58:40 UTC 2025 uploaded transfer of eessi-2025.06-software-linux-x86_64-amd-zen2-17561952520.tar.gz to S3 bucket succeeded

Member

@ocaisa ocaisa left a comment

LGTM

@ocaisa ocaisa merged commit 3bb84bc into EESSI:main Aug 26, 2025
71 of 76 checks passed
@boegel boegel deleted the fix_parallel_limit branch August 26, 2025 11:57

Labels

2023.06-software.eessi.io (2023.06 version of software.eessi.io), 2025.06-software.eessi.io (2025.06 version of software.eessi.io), bot:deploy

Development

Successfully merging this pull request may close these issues.

incorrect parallelism on A64FX with EasyBuild 5.x

3 participants