@youkaichao
Member

#3686 introduced the local_rank argument, but that API change is breaking. This PR makes the API backward compatible by moving local_rank to the last argument with a default value, and adds a test for the local_rank argument.
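To illustrate why argument position matters here, the following is a minimal sketch and not vLLM's actual code. The function names, parameter lists, and the example init method string are all hypothetical; only the idea (a new parameter appended last with a default value, here -1) comes from this PR.

```python
# Hypothetical signatures for illustration only; not vLLM's real API.

def init_env_breaking(world_size, rank, local_rank, init_method):
    """New required parameter inserted in the middle: old positional calls break."""
    return {"world_size": world_size, "rank": rank,
            "local_rank": local_rank, "init_method": init_method}


def init_env_compatible(world_size, rank, init_method, local_rank=-1):
    """New parameter appended last with a default: old positional calls still work."""
    return {"world_size": world_size, "rank": rank,
            "local_rank": local_rank, "init_method": init_method}


# A call site written before local_rank existed (values are made up).
legacy_args = (2, 0, "tcp://127.0.0.1:29500")

# The compatible signature accepts the legacy call and fills in the default.
compatible = init_env_compatible(*legacy_args)

# The breaking signature rejects the same call with a TypeError.
try:
    init_env_breaking(*legacy_args)
    breaking_call_failed = False
except TypeError:
    breaking_call_failed = True
```

Callers that do know their local rank can still pass it explicitly, for example `init_env_compatible(*legacy_args, local_rank=1)`.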

This is a kind of smoke test, since we only exercise the multi-node code path on a single node. That should be fine, because a single-node program is a special case of a multi-node program.

TODO: once we find a better way to test multi-node setups, we can run this test in an actual multi-node environment.
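The single-node smoke test described above can be sketched roughly as follows. This is an assumption-laden illustration, not the test added by this PR: the helper `launch_workers` and the child program are hypothetical, and the child only echoes its environment instead of initializing a distributed backend. Only the three environment variable names are taken from the PR.

```python
import os
import subprocess
import sys

# Hypothetical child program: it just reports the rank-related variables it sees.
CHILD_CODE = (
    "import os; "
    "print(os.environ['LOCAL_RANK'], os.environ['WORLD_SIZE'], "
    "os.environ['LOCAL_WORLD_SIZE'])"
)


def launch_workers(number_of_processes):
    """Start one child per rank on this machine, each with its own environment."""
    procs = []
    for i in range(number_of_processes):
        env = os.environ.copy()
        env['LOCAL_RANK'] = str(i)
        env['WORLD_SIZE'] = str(number_of_processes)
        env['LOCAL_WORLD_SIZE'] = str(number_of_processes)
        procs.append(subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            env=env, stdout=subprocess.PIPE, text=True))

    # Collect output in launch order so results line up with ranks.
    outputs = []
    for proc in procs:
        out, _ = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError(f"worker exited with code {proc.returncode}")
        outputs.append(out.strip())
    return outputs


outputs = launch_workers(2)
```

Because every process runs on the same machine, WORLD_SIZE and LOCAL_WORLD_SIZE are equal here; on a real multi-node run they would differ.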

cc @esmeetu @cadedaniel @simon-mo

Collaborator

@cadedaniel cadedaniel left a comment

Thanks for the quick turnaround

Comment on lines +17 to +19
env['LOCAL_RANK'] = str(i)
env['WORLD_SIZE'] = str(number_of_processes)
env['LOCAL_WORLD_SIZE'] = str(number_of_processes)
Collaborator

Should we also test the case where LOCAL_* is not present in env?

Member Author

The Ray-based tests already cover this case, i.e. the default value local_rank=-1.
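One plausible way a -1 default can be handled is sketched below. This is an assumption for illustration and not confirmed by this thread: the helper `resolve_local_rank` is hypothetical, and the fallback order (explicit value, then the LOCAL_RANK environment variable, then the global rank) is a guess at reasonable behavior, not vLLM's documented logic.

```python
import os


def resolve_local_rank(rank, local_rank=-1, environ=None):
    """Hypothetical resolution of a local rank when -1 means 'not provided'."""
    environ = os.environ if environ is None else environ
    if local_rank != -1:
        # An explicitly passed value always wins.
        return local_rank
    if "LOCAL_RANK" in environ:
        # Launchers such as torchrun export LOCAL_RANK for each process.
        return int(environ["LOCAL_RANK"])
    # Assumed fallback: on a single node, the global rank doubles as the local rank.
    return rank
```

Passing `environ` explicitly keeps the helper easy to test for both cases raised in the review: LOCAL_* present and LOCAL_* absent.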

Collaborator

Got it, thanks!

@youkaichao youkaichao merged commit 756b30a into vllm-project:main Mar 29, 2024
@youkaichao youkaichao deleted the test_local_rank branch March 29, 2024 04:19
@rkooo567
Collaborator

I can try bringing a Docker container multi-node test next week!

xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 31, 2024
[Core][Test] move local_rank to the last arg with default value to keep api compatible (vllm-project#3711)