Merged
Commits
119 commits
92a6c7a
init async ssa executor
jacquesqiao Jan 16, 2019
afda840
init communicator
jacquesqiao Jan 16, 2019
ea66979
can run
jacquesqiao Jan 17, 2019
88d71fa
support num_iteration_per_run
jacquesqiao Jan 17, 2019
69484f7
remote communicator
jacquesqiao Jan 18, 2019
7021979
init communicator
jacquesqiao Jan 18, 2019
9958775
add NewTmpScope to scope
jacquesqiao Jan 18, 2019
b5aefc8
fix compile problem
jacquesqiao Jan 18, 2019
f3210b6
fix copy_memory and share_memory
jacquesqiao Jan 18, 2019
be72940
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Jan 24, 2019
ca5d96b
complete send lod tensor
jacquesqiao Jan 24, 2019
1866d2d
parameter send support selected_rows
jacquesqiao Jan 24, 2019
74040cb
code clean
jacquesqiao Jan 24, 2019
1edc042
update send_op
jacquesqiao Jan 24, 2019
ada43e8
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Jan 25, 2019
fab8457
code optimize
jacquesqiao Jan 26, 2019
a66115b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Jan 26, 2019
62549e0
add GenParentScopeTreeDebugInfo
jacquesqiao Jan 27, 2019
be738a6
add some debug infor
jacquesqiao Jan 27, 2019
9da96ab
clean code of test_async_ssa_graph_executor_mnist
jacquesqiao Jan 27, 2019
7e145b7
optimize test_async_ssa_graph_executor_mnist
jacquesqiao Jan 28, 2019
02dab46
add some debug info
jacquesqiao Jan 28, 2019
4a17261
complete test_async_ssa_graph_executor_mnist test=develop
jacquesqiao Jan 28, 2019
c7e3868
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Jan 28, 2019
657a4f9
code can compile
jacquesqiao Jan 28, 2019
249f48e
update test test=develop
jacquesqiao Jan 28, 2019
d6c0dca
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Jan 29, 2019
381f383
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 3, 2019
16af1db
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 3, 2019
b1fe8d4
add a check for async_ssa_graph_exe test=develop
jacquesqiao Feb 4, 2019
741b7cf
fix compile test=develop
jacquesqiao Feb 4, 2019
4356f18
complete parameter_send
jacquesqiao Feb 6, 2019
5c36eb8
fix build
jacquesqiao Feb 6, 2019
5cf0092
add more log and fix test_dist_base in multi_batch_merge_pass
jacquesqiao Feb 7, 2019
a0585d0
init parameter recv
jacquesqiao Feb 7, 2019
a804a2a
complete parameter recv
jacquesqiao Feb 8, 2019
a715261
Merge branch 'fix-cpu-broadcast' of ssh://github.com/jacquesqiao/Padd…
jacquesqiao Feb 8, 2019
fbd186b
complete recv op
jacquesqiao Feb 8, 2019
8bda4ab
parameter recv can run
jacquesqiao Feb 8, 2019
e72637d
ThreadedSSAGraphExecutor support num_iteration_per_run test=develop
jacquesqiao Feb 9, 2019
84367cf
support async mode in dist mode parallel executor
jacquesqiao Feb 10, 2019
c4ded17
async mode support dist train
jacquesqiao Feb 11, 2019
2171aa7
async ssa exe only support local mode
jacquesqiao Feb 11, 2019
cc71e89
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 21, 2019
31a05d3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 21, 2019
9465c3d
fix compile problem
jacquesqiao Feb 21, 2019
7f3be09
fix multi graph test=develop
jacquesqiao Feb 21, 2019
12f6b8c
change the include of ThreadPool.h test=develop
jacquesqiao Feb 21, 2019
f4f4816
fix gpu error test=develop
jacquesqiao Feb 22, 2019
ecedd53
fix code bug test=develop
jacquesqiao Feb 22, 2019
b5b8e6c
revert the change of scope test=develop
jacquesqiao Feb 23, 2019
10393dd
add some check test=develop
jacquesqiao Feb 25, 2019
b8491bf
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 25, 2019
43c8237
use one graph
jacquesqiao Feb 25, 2019
cf0511f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 25, 2019
dab7f36
optimize code test=develop
jacquesqiao Feb 25, 2019
ff01d70
fix style
jacquesqiao Feb 25, 2019
f768fbf
support multi graph
jacquesqiao Feb 26, 2019
49f2f4f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Feb 26, 2019
02425b2
fix compile
jacquesqiao Feb 27, 2019
847e4f4
pure async mode train
jacquesqiao Mar 1, 2019
3691a46
improve communicator
jacquesqiao Mar 4, 2019
9573d61
use rpc common in parameter send and recv
jacquesqiao Mar 4, 2019
3c6b733
remove exe context
jacquesqiao Mar 4, 2019
c2cce6b
simplify parameter send and recv
jacquesqiao Mar 4, 2019
5060150
improve communicator
jacquesqiao Mar 4, 2019
13e8b5b
clear gradient before merge
jacquesqiao Mar 4, 2019
e70b172
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Mar 4, 2019
8744f9a
fix parallel executor async mode
jacquesqiao Mar 4, 2019
fab1b54
Merge branch 'add-communicator' of ssh://github.com/jacquesqiao/Paddl…
jacquesqiao Mar 4, 2019
8c38aca
tmp commit
jacquesqiao Mar 5, 2019
b2c082c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Mar 5, 2019
e92ad8a
optimize test_async_ssa_graph_executor_mnist test=develop
jacquesqiao Mar 5, 2019
f28c258
code clean test=develop
jacquesqiao Mar 5, 2019
c09477b
revert change
jacquesqiao Mar 5, 2019
4e218da
code format test=develop
jacquesqiao Mar 5, 2019
5e8de51
code format test=develop
jacquesqiao Mar 6, 2019
255b36d
can run
jacquesqiao Mar 6, 2019
7d5dc4e
fix cmake list
jacquesqiao Mar 6, 2019
a0bb18b
Merge branch 'add-async-ssa-graph-executor' of ssh://github.com/jacqu…
jacquesqiao Mar 6, 2019
a23f1ee
optimize code
jacquesqiao Mar 7, 2019
446fdf9
fix compile problem
jacquesqiao Mar 7, 2019
fe6a840
fix delete recv ops
jacquesqiao Mar 7, 2019
3225e19
fix remove recv op
jacquesqiao Mar 7, 2019
ff8054c
can run
jacquesqiao Mar 8, 2019
c0e5941
add commnet for recv do_not_run
jacquesqiao Mar 8, 2019
63cd70a
fix blocking problem
jacquesqiao Mar 8, 2019
0a828fe
add some flags for communicator
jacquesqiao Mar 10, 2019
eb6af30
change embedding interface addnremote_prefetch
jacquesqiao Mar 10, 2019
ad5a2b3
add some debug flags for communicator
jacquesqiao Mar 11, 2019
43378ad
add flags to init
jacquesqiao Mar 11, 2019
d3a1437
add fake rpc to send
jacquesqiao Mar 11, 2019
23d3929
optimize merge vars
jacquesqiao Mar 12, 2019
9b74707
fix compile problem
jacquesqiao Mar 12, 2019
0fcdae8
add communicator_test
jacquesqiao Mar 12, 2019
c567deb
optimize log
jacquesqiao Mar 13, 2019
347178b
fix pserver memory leak
jacquesqiao Mar 14, 2019
065b68b
clean code
jacquesqiao Mar 14, 2019
ea0df4e
add some check
jacquesqiao Mar 16, 2019
039d783
change communicator_recv_wait_ms to communicator_max_send_grad_num_be…
jacquesqiao Mar 18, 2019
3061840
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Mar 27, 2019
37f6b9a
fix build test=develop
jacquesqiao Mar 27, 2019
d640c6c
fix pylint
jacquesqiao Mar 27, 2019
392e97a
fix cpplint test=develop
jacquesqiao Mar 27, 2019
b542639
code clean test=develop
jacquesqiao Mar 27, 2019
33be014
fix distribute compile problem test=develop
jacquesqiao Mar 27, 2019
b68f840
fix test_split_selected_rows_op test=develop
jacquesqiao Mar 27, 2019
34890fd
fix gpu build for lookup_table_op test=develop
jacquesqiao Mar 28, 2019
d8974e6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Mar 29, 2019
61912e8
test_dist_base set runtime_split_send_recv to false test=develop
jacquesqiao Mar 29, 2019
a1821a0
remote remote_prefetch in embedding layer test=develop
jacquesqiao Mar 30, 2019
df45c8c
update nce and hierarchical_sigmoid remote_prefetch
jacquesqiao Mar 30, 2019
8342f12
fix set remote_prefetch test=develop
jacquesqiao Mar 31, 2019
9db1a9e
change log level test=develop
jacquesqiao Mar 31, 2019
baf0232
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Mar 31, 2019
adf272b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
jacquesqiao Mar 31, 2019
fb6cc3a
follow commnet, optimize code and add comment test=develop
jacquesqiao Apr 1, 2019
9861a92
change the return type of NewTempScope to unique ptr test=develop
jacquesqiao Apr 1, 2019
4031c1a
fix ci build test=develop
jacquesqiao Apr 1, 2019
2 changes: 1 addition & 1 deletion paddle/fluid/framework/CMakeLists.txt
@@ -196,7 +196,7 @@ endif()
target_link_libraries(executor while_op_helper executor_gc_helper)

cc_library(parallel_executor SRCS parallel_executor.cc DEPS
threaded_ssa_graph_executor scope_buffered_ssa_graph_executor parallel_ssa_graph_executor
threaded_ssa_graph_executor scope_buffered_ssa_graph_executor parallel_ssa_graph_executor async_ssa_graph_executor
graph build_strategy
fast_threaded_ssa_graph_executor variable_helper)

6 changes: 6 additions & 0 deletions paddle/fluid/framework/details/CMakeLists.txt
@@ -96,6 +96,12 @@ cc_library(threaded_ssa_graph_executor SRCS threaded_ssa_graph_executor.cc DEPS

cc_library(parallel_ssa_graph_executor SRCS parallel_ssa_graph_executor.cc DEPS threaded_ssa_graph_executor)

set(ASYNC_SSA_GRAPH_EXECUTOR_DEPS threaded_ssa_graph_executor)
if(WITH_DISTRIBUTE)
list(APPEND ASYNC_SSA_GRAPH_EXECUTOR_DEPS communicator)
endif()
cc_library(async_ssa_graph_executor SRCS async_ssa_graph_executor.cc DEPS ${ASYNC_SSA_GRAPH_EXECUTOR_DEPS})

cc_test(broadcast_op_test SRCS broadcast_op_handle_test.cc DEPS var_handle op_handle_base scope ddim memory
device_context broadcast_op_handle)
cc_test(gather_op_test SRCS gather_op_handle_test.cc DEPS var_handle op_handle_base scope ddim memory
203 changes: 203 additions & 0 deletions paddle/fluid/framework/details/async_ssa_graph_executor.cc
@@ -0,0 +1,203 @@
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
Contributor:
2018 -> 2019

//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "paddle/fluid/framework/details/async_ssa_graph_executor.h"

#include "paddle/fluid/framework/variable_helper.h"

#ifdef PADDLE_WITH_DISTRIBUTE
#include "paddle/fluid/operators/distributed/communicator.h"
#endif

namespace paddle {
namespace framework {
namespace details {

inline void NewTempScopeAndInitVars(const std::vector<VarInfo> &var_infos,
Scope *scope) {
VLOG(3) << "NewTempScopeAndInitVars";
Scope &local_scope = scope->NewScope();
*scope->Var(details::kLocalExecScopeName)->GetMutable<Scope *>() =
&local_scope;

for (auto &info : var_infos) {
if (scope->FindVar(info.name_) != nullptr) {
continue;
}

if (info.persistable_) { // Persistable
InitializeVariable(scope->Var(info.name_), info.type_);
} else {
InitializeVariable(local_scope.Var(info.name_), info.type_);
}
}
}

// get the RpcContext of the send/recv ops, and remove the recv ops
void ProcessGraph(std::vector<ir::Graph *> graphs, Scope *scope) {
#ifdef PADDLE_WITH_DISTRIBUTE
using RpcCtxMap = operators::distributed::RpcCtxMap;
VLOG(3) << "ProcessGraph";
RpcCtxMap send_varname_to_ctx;
RpcCtxMap recv_varname_to_ctx;
for (size_t i = 0; i < graphs.size(); ++i) {
std::vector<ir::Node *> nodes_to_delete;
for (auto &node : graphs[i]->Nodes()) {
VLOG(3) << "node name " << node->Name();
if (node && node->IsOp()) {
Contributor:
Why might the node be nullptr here?

Member Author:
I have run into this problem before, so I added an extra check to make sure it works correctly.

if (node->Name() == "send") {
auto send_var_name = node->Op()->Input("X")[0];
auto send_varnames = boost::get<std::vector<std::string>>(
node->Op()->GetNullableAttr("send_varnames"));
auto epmap = boost::get<std::vector<std::string>>(
node->Op()->GetNullableAttr("epmap"));
auto height_section = boost::get<std::vector<int64_t>>(
node->Op()->GetNullableAttr("sections"));
send_varname_to_ctx[send_var_name] =
operators::distributed::RpcContext(send_var_name, send_varnames,
epmap, height_section);
VLOG(3) << "find and init a send op: "
<< send_varname_to_ctx[send_var_name];
} else if (node->Name() == "recv") {
auto recv_var_name = node->Op()->Output("Out")[0];
auto recv_varnames = boost::get<std::vector<std::string>>(
node->Op()->GetNullableAttr("recv_varnames"));
auto epmap = boost::get<std::vector<std::string>>(
node->Op()->GetNullableAttr("epmap"));
recv_varname_to_ctx[recv_var_name] =
operators::distributed::RpcContext(recv_var_name, recv_varnames,
epmap, {});
nodes_to_delete.push_back(node);
VLOG(3) << "find and remove a recv op: "
<< recv_varname_to_ctx[recv_var_name];
}
}
}
}
// init communicator here
if (send_varname_to_ctx.size() > 0) {
VLOG(3) << "this is distributed mode, will use communicator";
operators::distributed::Communicator::Init(send_varname_to_ctx,
recv_varname_to_ctx, scope);
operators::distributed::Communicator::GetInstance()->Start();
}
#endif
}

AsyncSSAGraphExecutor::AsyncSSAGraphExecutor(
const ExecutionStrategy &strategy, const std::vector<Scope *> &local_scopes,
const std::vector<platform::Place> &places, std::vector<ir::Graph *> graphs)
: strategy_(std::move(strategy)),
local_scopes_(std::move(local_scopes)),
pool_(places.size() >= 2 ? new ::ThreadPool(places.size()) : nullptr),
places_(std::move(places)),
graphs_(std::move(graphs)) {
VLOG(3) << "build AsyncSSAGraphExecutor";
PADDLE_ENFORCE_EQ(places_.size(), local_scopes_.size());

// set the correct size of thread pool to each device.
strategy_.num_threads_ = strategy_.num_threads_ < places_.size()
? 1UL
: strategy_.num_threads_ / places_.size();
VLOG(1) << "set num_threads: " << strategy_.num_threads_
<< " to run the operators of the graph on each device.";
for (size_t i = 0; i < places.size(); ++i) {
executors_.emplace_back(new details::ThreadedSSAGraphExecutor(
strategy_, {local_scopes_[i]}, {places_[i]}, graphs_[i]));
}

for (auto &node : graphs_[0]->Nodes()) {
if (node->IsVar() && !node->IsCtrlVar() && node->Var()) {
var_infos_.emplace_back();
var_infos_.back().name_ = node->Var()->Name();
var_infos_.back().type_ = node->Var()->GetType();
var_infos_.back().persistable_ = node->Var()->Persistable();
}
}
for (auto *scope : local_scopes_) {
NewTempScopeAndInitVars(var_infos_, scope);
}
ProcessGraph(graphs_, local_scopes_[0]);
}

void AsyncSSAGraphExecutor::StartOffPythonTrainLoop() {
VLOG(3) << "StartOffPythonTrainLoop size = " << places_.size();
for (size_t i = 1; i < places_.size(); ++i) {
auto call = [this, i]() -> void {
VLOG(3) << "start off python thread " << i;
try {
while (true) {
executors_[i]->Run({});
}
} catch (...) {
exception_holder_.Catch(std::current_exception());
VLOG(3) << "get exception type = " << exception_holder_.Type();
}
VLOG(3) << "thread " << i << " exited!";
};
run_futures_.emplace_back(pool_->enqueue(std::move(call)));
}
}

void AsyncSSAGraphExecutor::HandleException() {
if (exception_holder_.IsCaught()) {
for (auto &f : run_futures_) {
VLOG(3) << "wait future";
f.wait();
}
VLOG(3) << "caught exception " << exception_holder_.Type()
<< ", rethrow it";
run_futures_.clear();
exception_holder_.ReThrow();
}
}

FeedFetchList AsyncSSAGraphExecutor::Run(
const std::vector<std::string> &fetch_tensors) {
// init once
if (run_futures_.size() == 0 && places_.size() > 1) {
exception_holder_.Clear();
StartOffPythonTrainLoop();
}

if (places_.size() == 1) {
exception_holder_.Clear();
} else {
HandleException();
}

FeedFetchList fetch_data;
fetch_data.reserve(fetch_tensors.size());

try {
fetch_data = executors_[0]->Run(fetch_tensors);
} catch (...) {
exception_holder_.Catch(std::current_exception());
}

HandleException();

FeedFetchList ret;
for (size_t fetch_idx = 0; fetch_idx < fetch_tensors.size(); ++fetch_idx) {
std::vector<const LoDTensor *> lodtensor_ptrs;
lodtensor_ptrs.push_back(&fetch_data.at(fetch_idx));
ret.emplace_back();
ret.back().MergeLoDTensor(lodtensor_ptrs, platform::CPUPlace());
}
return ret;
}

} // namespace details
} // namespace framework
} // namespace paddle
65 changes: 65 additions & 0 deletions paddle/fluid/framework/details/async_ssa_graph_executor.h
@@ -0,0 +1,65 @@
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#pragma once

#include <memory>
#include <string>
#include <utility>
#include <vector>

#include "ThreadPool.h"
#include "paddle/fluid/framework/details/threaded_ssa_graph_executor.h"

namespace paddle {
namespace framework {
namespace details {

struct VarInfo {
std::string name_;
proto::VarType::Type type_;
bool persistable_;
};
Contributor:
Duplicate with here.


class AsyncSSAGraphExecutor : public SSAGraphExecutor {
public:
AsyncSSAGraphExecutor(const ExecutionStrategy &strategy,
const std::vector<Scope *> &local_scopes,
const std::vector<platform::Place> &places,
std::vector<ir::Graph *> graphs);
~AsyncSSAGraphExecutor() final = default;
const ir::Graph &Graph() const override { return *graphs_[0]; }

FeedFetchList Run(const std::vector<std::string> &fetch_tensors) override;

private:
void StartOffPythonTrainLoop();
void HandleException();

private:
ExecutionStrategy strategy_;
std::vector<Scope *> local_scopes_;
std::unique_ptr<::ThreadPool> pool_{nullptr};
std::vector<platform::Place> places_;
std::vector<ir::Graph *> graphs_;

std::vector<std::unique_ptr<details::ThreadedSSAGraphExecutor>> executors_;
ExceptionHolder exception_holder_;
std::vector<std::future<void>> run_futures_;
std::vector<VarInfo> var_infos_;
};

} // namespace details
} // namespace framework
} // namespace paddle
11 changes: 9 additions & 2 deletions paddle/fluid/framework/details/build_strategy.cc
@@ -184,8 +184,12 @@ class ParallelExecutorPassBuilder : public ir::PassBuilder {
// Convert graph to run on multi-devices.
void AppendMultiDevPass(const BuildStrategy &strategy) {
ir::Pass *multi_devices_pass = nullptr;
if (strategy.is_distribution_) {
VLOG(10) << "Add dist_multi_devices_pass";

if (strategy_.async_mode_) {
multi_devices_pass = AppendPass("async_multi_devices_pass").get();
} else if (strategy_.is_distribution_) {
VLOG(10)
<< "Add dist_multi_devices_pass, multi device parameter server mode";
multi_devices_pass = AppendPass("dist_multi_devices_pass").get();
} else {
if (strategy.reduce_ == BuildStrategy::ReduceStrategy::kAllReduce) {
@@ -234,10 +238,12 @@ ir::Graph *BuildStrategy::Apply(ir::Graph *graph,
#else
const bool use_cuda) const {
#endif
VLOG(3) << "apply all passes";
// Create a default one if not finalized by user.
CreatePassesFromStrategy(false);

for (std::shared_ptr<ir::Pass> &pass : pass_builder_->AllPasses()) {
VLOG(3) << "apply " << pass->Type();
if (IsMultiDevPass(pass->Type())) {
pass->Erase(kPlaces);
pass->SetNotOwned<const std::vector<platform::Place>>(kPlaces, &places);
@@ -293,6 +299,7 @@ ir::Graph *BuildStrategy::Apply(ir::Graph *graph,
graph = pass->Apply(graph);
VLOG(3) << "Finish Apply Pass " << pass->Type();
}
VLOG(3) << "All Passes Applied";
return graph;
}

1 change: 1 addition & 0 deletions paddle/fluid/framework/details/build_strategy.h
@@ -97,6 +97,7 @@ struct BuildStrategy {
// num_trainers is 1, so the current fields of build_strategy doesn't tell if
// it's distributed model.
bool is_distribution_{false};
bool async_mode_{false};
int num_trainers_{1};
int trainer_id_{0};
std::vector<std::string> trainers_endpoints_;
18 changes: 18 additions & 0 deletions paddle/fluid/framework/details/exception_holder.h
@@ -14,6 +14,9 @@

#pragma once

#include <memory>
#include <string>

#include "glog/logging.h"
#include "paddle/fluid/platform/enforce.h"

@@ -64,6 +67,21 @@ class ExceptionHolder {
ClearImpl();
}

std::string Type() {
std::lock_guard<std::mutex> lock(mu_);
switch (type_) {
case kNone:
return "None";
case kEnforceNotMet: {
return "EnforceNotMet";
}
case kEOF: {
return "EOF";
}
}
return "unknown";
}

private:
void ClearImpl() {
exception_.reset();
2 changes: 2 additions & 0 deletions paddle/fluid/framework/details/execution_strategy.h
@@ -31,6 +31,8 @@ struct ExecutionStrategy {
size_t num_iteration_per_drop_scope_{1};
ExecutorType type_{kDefault};
bool dry_run_{false};
size_t num_iteration_per_run_{1}; // only used with async_ssa_graph_executor
Contributor:
What does this mean?

Member Author:
https://github.com/PaddlePaddle/Paddle/pull/16172/files#diff-bcb7058cf667aba60603c4448e6180c8R131
It is used here: a single call to exe.run will run multiple steps, to improve performance.

// and pyreader with data queue
};

} // namespace details