Implement pserver RPC part, and simple parameter partition. #2243
helinwang merged 5 commits into PaddlePaddle:develop
paddle/go/cclient/cclient.go
Would a type switch be clearer here?
I thought this was clear because the underlying type of selector is bool, but I could be wrong.
I don't fully understand how to do a type switch here. Could you paste the code?
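For reference, a type switch in Go generally looks like the sketch below. The `Selector` interface and the `selector` and `always` types are assumptions made for illustration only; they are not copied from this PR, and the reviewer may have had a different shape in mind.

```go
package main

import "fmt"

// Selector is a hypothetical interface, assumed for illustration.
type Selector interface {
	Select() bool
}

// selector has bool as its underlying type, so a plain conversion
// bool(s) is enough to read its value.
type selector bool

func (s selector) Select() bool { return bool(s) }

// always is a second hypothetical implementation, added only so the
// type switch has more than one case to distinguish.
type always struct{}

func (always) Select() bool { return true }

// describe branches on the concrete type stored in the interface value.
func describe(s Selector) string {
	switch v := s.(type) {
	case selector:
		return fmt.Sprintf("selector(%t)", bool(v))
	case always:
		return "always"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(selector(true)))
	fmt.Println(describe(always{}))
}
```

When there is only one concrete type and its underlying type is bool, the direct conversion is arguably just as readable as the switch.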
paddle/go/cclient/cclient.go
Maybe we should implement this interface based on etcd? For example, list the pserver names from an etcd value to support fault-tolerant pservers?
I will write a module that implements this interface and talks to etcd.
However, this PR only includes a module with static pserver addresses, to be used for MPI clusters.
I will send a separate PR for the etcd module.
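A minimal sketch of what a static-address module behind such an interface could look like. The names `Server`, `Lister`, `staticLister`, and `newStaticLister`, as well as the example addresses, are hypothetical and chosen for illustration; the actual types in this PR may differ.

```go
package main

import "fmt"

// Server describes one pserver. Hypothetical type, assumed for illustration.
type Server struct {
	Index int
	Addr  string
}

// Lister is the assumed discovery interface: anything that can list
// pservers. A later etcd-backed module could satisfy the same interface.
type Lister interface {
	List() []Server
}

// staticLister serves a fixed list of addresses, e.g. for an MPI cluster
// where the pserver hosts are known up front.
type staticLister []Server

func (l staticLister) List() []Server { return l }

// newStaticLister assigns each address an index in the order given.
func newStaticLister(addrs []string) staticLister {
	l := make(staticLister, len(addrs))
	for i, a := range addrs {
		l[i] = Server{Index: i, Addr: a}
	}
	return l
}

func main() {
	var l Lister = newStaticLister([]string{"10.0.0.1:8000", "10.0.0.2:8000"})
	for _, s := range l.List() {
		fmt.Printf("pserver %d at %s\n", s.Index, s.Addr)
	}
}
```

Keeping discovery behind an interface means the client code does not need to change when the etcd-backed implementation arrives.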
paddle/go/pserver/client.go
Maybe a name containing "param" would be better here?
paddle/go/pserver/client.go
Do we need to check pserverNum == len(list) here to ensure that the number of alive pserver instances equals the desired number?
It's OK if the number of pservers has not yet reached the desired number. We only connect to pservers that are alive. We do not connect to pservers that are not alive yet, and all calls to them block until they are connected. (See: https://github.com/PaddlePaddle/Paddle/pull/2243/files#diff-b1a8f38187aa603e0efc16791f282d87R61)
Thanks for the explanation. I got it. :)
Easier to view using GitHub's split diff :)
Please use the following command to build, because there is a problem with the Go cmake setup: cmake checks out a fresh copy of Paddle from GitHub for the Go dependency, so changes in the local filesystem are not reflected. The fix is in another PR: #2294.
Open a new terminal and run:
./test/main # run the example program we just built