add parallel build script to ci …#16901
Merged
wopeizl merged 9 commits into PaddlePaddle:develop on Apr 22, 2019
2. run test case according to the run type test=develop
luotao1
reviewed
Apr 22, 2019
    wait; # wait for all subshells to finish
}

function aggresive_test() {
    test_cases=$(ctest -N -V)
    exclusive_tests=''
    single_card_tests=''
    multiple_card_tests=''
Contributor
Please explain what each variable means.
exclusive_tests='': I don't understand what kind of tests these are. Tests that run exclusively?
single_card_tests='': single-card tests
multiple_card_tests='': multi-card tests
if [[ "$matchstr" == "" ]]; then
    # Any test case with LABELS property would be parse here
    # RUN_TYPE=EXCLUSIVE mean the case would run exclusively
    # RUN_TYPE=DIST mean the case would take two graph cards during runtime
Contributor
Does the comment on line 706 mean that every distributed unit test must use exactly 2 cards, and that even 4 cards would not work?
    # RUN_TYPE=EXCLUSIVE mean the case would run exclusively
    # RUN_TYPE=DIST mean the case would take two graph cards during runtime
    read is_exclusive <<< $(echo "$line"|grep -oEi "RUN_TYPE=EXCLUSIVE")
    read is_multicard <<< $(echo "$line"|grep -oEi "RUN_TYPE=DIST")
Contributor
RUN_TYPE=EXCLUSIVE duplicates SERIAL.
Contributor
Author
The meaning is somewhat different now. With the current handling, SERIAL should no longer have any effect.
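To make the three buckets discussed in this thread concrete, here is a minimal sketch of sorting a line by its RUN_TYPE label, in the spirit of the grep calls in the hunk above. The function name `classify_run_type` and the sample input lines are hypothetical and are not taken from the PR; the actual script may parse the `ctest -N -V` output differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): decide which bucket a line belongs to,
# based on the RUN_TYPE label it carries.
classify_run_type() {
    local line="$1"
    if echo "$line" | grep -qEi "RUN_TYPE=EXCLUSIVE"; then
        echo "exclusive"        # must run alone
    elif echo "$line" | grep -qEi "RUN_TYPE=DIST"; then
        echo "multiple_card"    # needs more than one GPU
    else
        echo "single_card"      # default: no RUN_TYPE label found
    fi
}

# Sample lines are made up for illustration only.
classify_run_type "Labels: RUN_TYPE=EXCLUSIVE"
classify_run_type "Labels: RUN_TYPE=DIST"
classify_run_type "Labels: some_other_label"
```

Checking EXCLUSIVE first means a line carrying both labels would be treated as exclusive; that ordering is an assumption of this sketch.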
if [ ${WITH_TESTING:-ON} == "ON" ] ; then
cat <<EOF
    ========================================
    Running unit tests ...
Contributor
"Running unit tests ..." does not convey that the tests run in parallel.
    wait
}

function parallel_test() {
Contributor
I don't see parallel_test being used anywhere. Can it be removed?
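For readers unfamiliar with the background-job pattern these hunks rely on (`&` followed by `wait`), here is a minimal sketch. The function `run_in_parallel` is hypothetical and not part of the PR. One relevant detail: in bash, a bare `wait` with no arguments returns 0 regardless of how the jobs exited, so this sketch waits on each PID to pick up failures.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): run each argument as a command in the
# background, then wait for every job and report whether any of them failed.
run_in_parallel() {
    local pids=() pid cmd failed=0
    for cmd in "$@"; do
        bash -c "$cmd" &        # launch in the background
        pids+=("$!")            # remember the PID of the job just started
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=1 # wait <pid> returns that job's exit status
    done
    return "$failed"
}

run_in_parallel "true" "sleep 0.1" && echo "all passed"
run_in_parallel "true" "false" || echo "at least one failed"
```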
    if [[ $cardnumber == $CUDA_DEVICE_COUNT ]]; then
        ctest -I $i,,$NUM_PROC -R "($testcases)" --output-on-failure &
    else
        # echo "env CUDA_VISIBLE_DEVICES=$cuda_list ctest -I $i,,$NUM_PROC -R \"($testcases)\" --output-on-failure &"
            cuda_list="$cuda_list,$[i*cardnumber+j]"
        fi
    done
    # echo $cuda_list
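The `cuda_list` expression above assigns each parallel job a contiguous block of device IDs. A minimal sketch of that index arithmetic follows, assuming job `i` takes devices `i*cardnumber` through `i*cardnumber + cardnumber - 1`. The function name `build_cuda_list` is hypothetical, and the loop bounds are an assumption because the hunk does not show them. The sketch uses `$(( ))` rather than the older `$[ ]` arithmetic form, which bash still accepts but documents as deprecated.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): comma-separated device IDs for the
# i-th job when every job takes `cardnumber` cards.
build_cuda_list() {
    local i="$1" cardnumber="$2" cuda_list="" j
    for (( j = 0; j < cardnumber; j++ )); do
        if [[ -z "$cuda_list" ]]; then
            cuda_list="$(( i * cardnumber + j ))"             # first ID, no leading comma
        else
            cuda_list="$cuda_list,$(( i * cardnumber + j ))"  # append subsequent IDs
        fi
    done
    echo "$cuda_list"
}

build_cuda_list 0 2   # prints 0,1
build_cuda_list 1 2   # prints 2,3
build_cuda_list 1 4   # prints 4,5,6,7
```

The result could then be passed to a job as `env CUDA_VISIBLE_DEVICES=<list>`, as the commented-out echo in the hunk suggests.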
… test=develop
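CI unit test time on April 1 (617 unit tests): 56 minutes on the 4-card machine, 53 minutes on the 8-card machine
CI unit test time on April 19 (617 unit tests): 33 minutes on the 4-card machine, 27 minutes on the 8-card machine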
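As a quick check on what these timings imply, the relative saving can be computed with shell integer arithmetic. The helper `percent_saved` is hypothetical, and integer division truncates the result.

```shell
#!/usr/bin/env bash

# Hypothetical helper: percentage of wall time saved, truncated to an integer.
percent_saved() {
    local before="$1" after="$2"
    echo $(( (before - after) * 100 / before ))
}

percent_saved 56 33   # 4-card machine: prints 41
percent_saved 53 27   # 8-card machine: prints 49
```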
Some of the cases that run exclusively are listed below:
1/24 Test #146: test_inference_label_semantic_roles .......... Passed 11.74 sec
2/24 Test #142: test_api_impl ................................ Passed 15.35 sec
3/24 Test #144: test_inference_image_classification_vgg ...... Passed 17.29 sec
4/24 Test #145: test_inference_image_classification_resnet ... Passed 17.39 sec
5/24 Test #147: test_inference_recognize_digits_mlp .......... Passed 8.66 sec
6/24 Test #151: test_inference_nlp ........................... Passed 2.11 sec
7/24 Test #149: test_inference_recommender_system ............ Passed 8.33 sec
8/24 Test #174: test_train_recognize_digits_mlp .............. Passed 2.88 sec
9/24 Test #150: test_inference_word2vec ...................... Passed 8.58 sec
10/24 Test #148: test_inference_recognize_digits_conv ......... Passed 14.64 sec
11/24 Test #307: test_hsigmoid_remote_table_op ................ Passed 3.21 sec
12/24 Test #252: test_parallel_executor_test_while_train ...... Passed 7.83 sec
13/24 Test #365: test_nce_remote_table_op ..................... Passed 3.63 sec
14/24 Test #175: test_train_recognize_digits_conv ............. Passed 16.14 sec
15/24 Test #384: test_listen_and_serv_op ...................... Passed 4.85 sec
16/24 Test #438: test_alloc_continuous_space_op ............... Passed 5.71 sec
17/24 Test #337: test_parallel_executor_mnist ................. Passed 21.12 sec
18/24 Test #445: test_conv_shift_op ........................... Passed 5.49 sec
19/24 Test #541: test_adam_op_multi_thread .................... Passed 5.99 sec
20/24 Test #205: test_weight_decay ............................ Passed 62.28 sec
21/24 Test #427: test_recordio_reader ......................... Passed 56.72 sec
22/24 Test #452: test_parallel_executor_seresnext ............. Passed 428.13 sec
23/24 Test #544: test_nearest_interp_op ....................... Passed 58.07 sec
24/24 Test #552: test_parallel_executor_crf ................... Passed 81.98 sec
Some of the two-card cases:
Test #255: test_dist_base ................... Passed 2.23 sec
Test #192: test_distribute_fpn_proposals_op ... Passed 6.51 sec
Test #267: test_dist_save_load .............. Passed 24.87 sec
Test #410: test_dist_allreduce_op ............. Passed 25.29 sec
Test #232: test_dist_mnist_pg ............... Passed 25.48 sec
Test #278: test_dist_mnist_batch_merge ...... Passed 19.90 sec
Test #614: test_distillation_strategy ....... Passed 62.71 sec
Test #275: test_dist_simnet_bow ............... Passed 47.75 sec
Test #319: test_dist_ctr .................... Passed 26.57 sec
Test #492: test_dist_word2vec ............... Passed 66.63 sec
Test #531: test_dist_text_classification .... Passed 31.28 sec
Test #358: test_dist_mnist .................. Passed 125.71 sec
Test #548: test_dist_train .................. Passed 2.59 sec
Test #551: test_dist_se_resnext_nccl ........ Passed 60.43 sec
Test #550: test_dist_se_resnext ............... Passed 233.77 sec