
Conversation

@SanderBorgmans

@jan-janssen, thank you for the modular implementation. I had to fiddle around with the queues.yaml file before I got everything working, but if I am correct, an extra cluster property is added to the queues dict and an extra list object with all the clusters should be added. I fixed some minor errors and everything should work now. The commands are now passed as strings instead of lists, since the swap command requires the shell flag, which meant that the format flag of the queue command had to be adapted.
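The string-versus-list point can be sketched as follows. This is a minimal illustration, not pysqa's actual code: `echo` stands in for the real swap and queue commands, and the key observation is that `shell=True` takes a single command string and enables shell features such as `&&` that chaining a swap command needs.

```python
import subprocess

# With shell=True the command must be one string rather than a list of
# arguments, which allows shell constructs like "&&" to chain commands.
# "echo" is a stand-in for the real cluster-swap and queue commands.
cmd_string = "echo first && echo second"
out = subprocess.check_output(cmd_string, shell=True, universal_newlines=True)
print(out.splitlines())  # both chained commands ran in one shell
```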

@SanderBorgmans mentioned this pull request Aug 28, 2019
@jan-janssen
Member

Hi @SanderBorgmans, I am sorry for the bug. I have now added at least a very basic test to see that it is working, but testing the functionality with the queuing system is always a bit difficult. I just have one more question: is there any reason why the SLURM adapter does not work for you? I modified the tests to use slurm in the modular adapter, so maybe you can take a look at the current master. It still needs your changes to use subprocess with shell=True, but it should not be necessary to have another adapter gent.py, and there should be no need to modify basic.py.

@SanderBorgmans
Author

SanderBorgmans commented Aug 28, 2019

@jan-janssen I thought an extra wrapper was required because of the config["queue_type"] in ["GENT"] line in queueadapter.py. Aside from that, the gent wrapper handles convert_queue_status and get_job_id_from_output slightly differently, as our HPC environment returns a slightly different output for squeue, and adds the cluster column to the queue table. Moreover, it checks whether the queue_status_output is empty, since it is possible for some users to not have a single job on some clusters.

@jan-janssen
Member

@SanderBorgmans I would like to minimise the duplicated code, which is why I am a bit reluctant to add a second parser to communicate with slurm. In the current master branch I added the line https://github.com/pyiron/pysqa/blob/master/pysqa/modular.py#L10 to use slurm in the modular adapter, and I added the option shell=True, passing the commands as strings instead of lists.

@SanderBorgmans
Author

@jan-janssen This is fine with me; I will then just overwrite the slurm wrapper with my own wrapper and everything should be OK.

@jan-janssen
Member

@SanderBorgmans I still have trouble understanding why we need two different adapters for slurm. But if it is necessary, that is fine with me; maybe you could at least derive it from the SlurmCommands class, so it is obvious which parts changed.

@SanderBorgmans
Author

@jan-janssen That's a good idea. The different adapter is necessary because of https://github.com/pyiron/pysqa/blob/master/pysqa/wrapper/slurm.py#L39 and https://github.com/pyiron/pysqa/blob/master/pysqa/wrapper/slurm.py#L43 . When submitting a job with sbatch, the returned output looks like jobid;cluster, so an extra split is necessary. Furthermore, the squeue command prints an extra line "CLUSTER: clustername", which is not dropped when specifying --no-header, so again an extra operation is required to drop this line. Aside from this, the two adapters are identical, up to the cluster column that I introduced before.
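The two extra parsing steps described above could look roughly like this. This is a hypothetical simplification: the function names mirror the wrapper interface mentioned in the linked lines, but the bodies and the example cluster name are illustrative, not pysqa's actual implementation.

```python
def get_job_id_from_output(output):
    # sbatch on this setup appends the cluster name, e.g.
    # "Submitted batch job 123;mycluster", so the trailing ";cluster"
    # part has to be split off before converting to int.
    return int(output.splitlines()[-1].rstrip().split()[-1].split(";")[0])

def drop_cluster_header(lines):
    # squeue prints an extra "CLUSTER: clustername" line that is not
    # removed by the no-header flag; drop it before parsing the table.
    return [line for line in lines if not line.startswith("CLUSTER:")]

print(get_job_id_from_output("Submitted batch job 123;mycluster"))  # 123
print(drop_cluster_header(["CLUSTER: mycluster", "123 user R job1"]))
```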

I also got errors from a possibly empty queue_status_output when splitlines is called. This occurred for clusters which had no jobs in the queue. Does this happen for you?
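A minimal guard along the lines discussed could look like this. This is a simplified sketch, not the real convert_queue_status (which builds a richer table): the point is only to return an empty result instead of parsing when a cluster has no jobs queued.

```python
def convert_queue_status(queue_status_output):
    # Guard against an empty squeue output before calling splitlines();
    # some users have no jobs at all on a given cluster.
    if not queue_status_output.strip():
        return []  # no jobs on this cluster
    return [line.split() for line in queue_status_output.splitlines()]

print(convert_queue_status(""))            # empty queue handled gracefully
print(convert_queue_status("1 a R\n2 b R"))
```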

@jan-janssen
Member

> @jan-janssen That's a good idea. The different adapter is necessary because of https://github.com/pyiron/pysqa/blob/master/pysqa/wrapper/slurm.py#L39 and https://github.com/pyiron/pysqa/blob/master/pysqa/wrapper/slurm.py#L43 . When submitting a job with sbatch, the returned output looks like jobid;cluster, so an extra split is necessary. Furthermore, the squeue command prints an extra line "CLUSTER: clustername", which is not dropped when specifying --no-header, so again an extra operation is required to drop this line. Aside from this, the two adapters are identical, up to the cluster column that I introduced before.

OK, then it makes sense to have a different wrapper.

> I also got errors from a possibly empty queue_status_output when splitlines is called. This occurred for clusters which had no jobs in the queue. Does this happen for you?

I have to admit our cluster still runs SGE, so @dnoeger implemented the slurm adapter for their cluster. But it definitely makes sense to check the length of the output.

Can you modify the pull request accordingly, then we can merge it?

@jan-janssen
Member

Does this work for you? #19

@SanderBorgmans
Author

@jan-janssen Perfect, I was just about to push, but I had some git issues.

@jan-janssen
Member

Nice, then I guess we can close this pull request and I am going to push a new release to pip and conda-forge.
