-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Disable thread pool creation when enabled OpenMP #2485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
8f53449
f362218
43f5176
3d9d892
cba7e2b
1686db3
c713bf3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -102,12 +102,14 @@ InferenceSession::InferenceSession(const SessionOptions& session_options, | |
| : session_options_(session_options), | ||
| graph_transformation_mgr_(session_options.max_num_graph_transformation_steps), | ||
| logging_manager_(logging_manager), | ||
| #ifdef USE_OPENMP | ||
| thread_pool_(concurrency::CreateThreadPool("intra_op_thread_pool", | ||
| session_options.intra_op_num_threads)), | ||
| inter_op_thread_pool_(session_options.execution_mode == ExecutionMode::ORT_PARALLEL | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Seems this would disable parallel executor when building OpenMP? Say, if running on a 8-HT machine, with OMP_NUM_THREADS=4 and -y 2 in onnxruntime_perf_test, is that a valid setting? #Resolved
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a good point. Do you think it's better to keep the previous behavior for inter threadpool (-y) ? #Resolved
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Parallel executor is off by default so it should not cause any perf difference in your case. By keeping this logic, we can try to enable it together with OpenMP in case some models might benefit from the combination. In reply to: 350977021 [](ancestors = 350977021)
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If USE_OPENMP is not defined, please initialize the var to nullptr here. Because the next few lines are referencing this variable. |
||
| ? concurrency::CreateThreadPool("inter_op_thread_pool", | ||
| session_options.inter_op_num_threads) | ||
| : nullptr), | ||
| #endif | ||
| session_state_(execution_providers_, | ||
| session_options.enable_mem_pattern && session_options.execution_mode == ExecutionMode::ORT_SEQUENTIAL, | ||
| thread_pool_.get(), | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -41,8 +41,8 @@ namespace perftest { | |
| "\t-p [profile_file]: Specifies the profile name to enable profiling and dump the profile data to the file.\n" | ||
| "\t-s: Show statistics result, like P75, P90.\n" | ||
| "\t-v: Show verbose information.\n" | ||
| "\t-x [intra_op_num_threads]: Sets the number of threads used to parallelize the execution within nodes, A value of 0 means ORT will pick a default. Must >=0.\n" | ||
| "\t-y [inter_op_num_threads]: Sets the number of threads used to parallelize the execution of the graph (across nodes), A value of 0 means ORT will pick a default. Must >=0.\n" | ||
| "\t-x [intra_op_num_threads]: Sets the number of threads used to parallelize the execution within nodes, A value of 0 means ORT will pick a default. Must >=0. If OpenMP is enabled, this configuration will be ignored.\n" | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can we change -x to call omp_set_num_threads() when building with OpenMP?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. currently this is controlled by env var
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe set_omp_num_threads would override the env then, which is what user would expect if OpenMP is mainly used for intra node or data parallelism. Otherwise, we need to ask user to use different ways to specify data parallelism #threads. In reply to: 350977758 [](ancestors = 350977758)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is one tricky part of the current behavior: if OpenMP is not enabled, if you set this cause the problem: if we want to extend
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unless we unify the definition of "numThreads" ( number of worker threads? or number of threads in total? ), I would prefer to keep the current status
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Even if -x means different things when using OpenMP vs. not using OpenMP, I'd prefer to have -x to set OMP thread count, because it's easier to control than setting env and prevents accidentally picking up previously set env, especially in Windows. In this case, we are not changing any existing behavior, and only made it easy to control omp threads. In reply to: 351000042 [](ancestors = 351000042)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You cannot expect that everyone fully understand this tricky behavior, and it's not necessarily have to understand. It can't be easier to make mistakes -- create threadpool with the wrong number of threads without being aware of it. I would even prefer to add a new flag rather than
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good to me, so we just disable -x in OpenMP build. In reply to: 351019314 [](ancestors = 351019314) |
||
| "\t-y [inter_op_num_threads]: Sets the number of threads used to parallelize the execution of the graph (across nodes), A value of 0 means ORT will pick a default. Must >=0. If OpenMP is enabled, this configuration will be ignored.\n" | ||
| "\t-P: Use parallel executor instead of sequential executor.\n" | ||
| "\t-o [optimization level]: Default is 1. Valid values are 0 (disable), 1 (basic), 2 (extended), 99 (all).\n" | ||
| "\t\tPlease see onnxruntime_c_api.h (enum GraphOptimizationLevel) for the full list of all optimization levels. \n" | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.