
supports multiple NCCL communicators preserved in NCCLCommContext #19407

Merged
gavin1332 merged 2 commits into PaddlePaddle:develop from gavin1332:nccl
Aug 27, 2019

Conversation

@gavin1332
Collaborator

@gavin1332 gavin1332 commented Aug 25, 2019

Supports multiple NCCL communicators preserved in NCCLCommContext.
This PR adapts Paddle pipeline training to single-process, multi-threaded training.
test=develop
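
To illustrate the idea in the description, here is a minimal sketch of a context object that keeps several communicators at once, keyed by ring id and device. It is not the code from this PR: `FakeComm`, `CommContext`, the integer device id standing in for a place, and the mutex are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a communicator handle (not Paddle's NCCLComm).
struct FakeComm {
  int ring_id;
  int dev_id;
};

// Sketch of a context that owns one communicator per (ring id, device id).
class CommContext {
 public:
  // Create (or replace) the communicator for a ring/device pair and
  // return a non-owning pointer to it.
  FakeComm* Create(int ring_id, int dev_id) {
    assert(ring_id >= 0 && dev_id >= 0);
    std::lock_guard<std::mutex> guard(mutex_);
    std::unique_ptr<FakeComm>& slot = comm_map_[ring_id][dev_id];
    slot.reset(new FakeComm{ring_id, dev_id});
    return slot.get();
  }

  // Number of communicators registered under a ring.
  std::size_t Count(int ring_id) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = comm_map_.find(ring_id);
    return it == comm_map_.end() ? 0 : it->second.size();
  }

  // Retrieve a communicator by ring id and device id; nullptr if absent.
  FakeComm* Get(int ring_id, int dev_id) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto ring_it = comm_map_.find(ring_id);
    if (ring_it == comm_map_.end()) return nullptr;
    auto dev_it = ring_it->second.find(dev_id);
    return dev_it == ring_it->second.end() ? nullptr : dev_it->second.get();
  }

 private:
  // Assumption: threads of one process share the context, so guard the map.
  mutable std::mutex mutex_;
  // ring id -> device id -> communicator
  std::map<int, std::map<int, std::unique_ptr<FakeComm>>> comm_map_;
};
```

With a layout like this, each thread in a single process can look up the communicator for its own device within a given ring, which is the kind of access pattern single-process, multi-threaded training needs.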

@gavin1332 gavin1332 force-pushed the nccl branch 3 times, most recently from 3b4298b to a802018 August 27, 2019 01:30
Contributor

@hutuxian hutuxian left a comment

LGTM

ring_id);
return comm_map_.at(ring_id).get();
PADDLE_ENFORCE_GT(comm_map_.count(ring_id), 0,
"comunicator in ring id %d has not been initialized",
Contributor

comunicator -> communicator

Collaborator Author

I'll leave it on the TODO list.
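
For readers unfamiliar with the pattern in the quoted diff, the sketch below shows the general shape of a checked lookup: verify that the ring id is present, fail with a descriptive message if it is not, and only then dereference the map entry. It uses a plain C++ exception rather than Paddle's `PADDLE_ENFORCE_GT` macro; `FakeComm` and `GetChecked` are hypothetical names.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a communicator handle.
struct FakeComm {
  int ring_id;
};

// Return the communicator registered for ring_id, or throw with a message
// that names the missing ring. Checking count() first gives a clearer error
// than the generic one std::map::at would raise on its own.
FakeComm* GetChecked(const std::map<int, std::unique_ptr<FakeComm>>& comm_map,
                     int ring_id) {
  if (comm_map.count(ring_id) == 0) {
    throw std::out_of_range("communicator in ring id " +
                            std::to_string(ring_id) +
                            " has not been initialized");
  }
  return comm_map.at(ring_id).get();
}
```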

}

// retrieve a communicator by the ring id and place
NCCLComm* Get(int ring_id, Place place) const {
Contributor

const Place& place?

Collaborator Author

Place is a POD, so a const reference is not necessary.
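
The trade-off in this exchange can be sketched with a hypothetical small place type. For a trivially copyable struct the size of an `int`, passing by value copies a few bytes and avoids an indirection, so `const&` gains little. `DevicePlace` below is an assumption for illustration only; whether Paddle's actual `Place` type satisfies these traits is not confirmed here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical place type: a thin wrapper around a device index.
struct DevicePlace {
  int device;
};

// Compile-time checks that the type is cheap and safe to copy bitwise.
static_assert(std::is_trivially_copyable<DevicePlace>::value,
              "DevicePlace should be trivially copyable");
static_assert(std::is_standard_layout<DevicePlace>::value,
              "DevicePlace should have standard layout");

// Pass by value: the argument is copied, which is cheap for a small POD.
int DeviceByValue(DevicePlace place) {
  assert(place.device >= 0);
  return place.device;
}

// Pass by const reference: same result, but the callee reads through a
// reference, which matters mainly for large or non-trivial types.
int DeviceByConstRef(const DevicePlace& place) {
  assert(place.device >= 0);
  return place.device;
}
```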

@gavin1332 gavin1332 requested a review from luotao1 August 27, 2019 13:37
Contributor

@luotao1 luotao1 left a comment

LGTM for paddle_enforce

@gavin1332 gavin1332 merged commit efb05ba into PaddlePaddle:develop Aug 27, 2019
@gavin1332 gavin1332 deleted the nccl branch August 27, 2019 14:01