Fix test_weight_decay #17109

Merged
chengduoZH merged 1 commit into PaddlePaddle:develop from chengduoZH:fix_test_weight_decay
Apr 26, 2019

Conversation


@chengduoZH chengduoZH commented Apr 26, 2019

http://ci.paddlepaddle.org/viewLog.html?tab=buildLog&buildTypeId=Paddle_PrCi&buildId=89562

[22:51:27]	test_weight_decay failed
[22:51:27]	 F
[22:51:27]	======================================================================
[22:51:27]	FAIL: test_weight_decay (test_weight_decay.TestWeightDecay)
[22:51:27]	----------------------------------------------------------------------
[22:51:27]	Traceback (most recent call last):
[22:51:27]	  File "/paddle/build/python/paddle/fluid/tests/unittests/test_weight_decay.py", line 186, in test_weight_decay
[22:51:27]	    + " in class " + self.__class__.__name__)
[22:51:27]	AssertionError: Expect 0.00045114197
[22:51:27]	But Got0.00045114197 in class TestWeightDecay
[22:51:27]	
[22:51:27]	----------------------------------------------------------------------
[22:51:27]	Ran 1 test in 51.645s
[22:51:27]	
[22:51:27]	FAILED (failures=1)

Disable it temporarily; I will fix it later.

test=develop
@chengduoZH chengduoZH requested a review from luotao1 April 26, 2019 02:05

@luotao1 luotao1 left a comment


LGTM

@chengduoZH chengduoZH merged commit 9ccce57 into PaddlePaddle:develop Apr 26, 2019
sneaxiy pushed a commit to sneaxiy/Paddle that referenced this pull request Apr 28, 2019
# The first commit's message is:
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (PaddlePaddle#17066)

# This is the 2nd commit message:

Fleet unify distributed training (PaddlePaddle#16791)

* implement distributed transpiler with fleet
# This is the 3rd commit message:

ParallelDyGraph with GPU collective mode (PaddlePaddle#16827)

implement dygraph.parallel.DataParallel to hook reduce op.

# This is the 4th commit message:

Init mixed precision training interface (PaddlePaddle#16856)

* Init mixed precision training interface

* Add fp16 test script

test=develop

* All initializers support float16

test=develop

* Code cleanup & add more code annotations

test=develop

* Update API spec

test=develop

* Add usage example in doc

test=develop

# This is the 5th commit message:

fix reference_count_pass,test=develop (PaddlePaddle#17060)

test=develop
# This is the 6th commit message:

Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (PaddlePaddle#17090)

* Cache the information of linear interpolation in forward and use it in backward.
test=develop

* Fix cuda kernel.
test=develop

# This is the 7th commit message:

remove unnecessary prepare_data (PaddlePaddle#17080)

test=develop
# This is the 8th commit message:

fix interpolate cu. test=develop (PaddlePaddle#17101)

# This is the 9th commit message:

test=develop, double backward leaky_relu (PaddlePaddle#17067)

backward of backward: leaky_relu
# This is the 10th commit message:

fix fuse optimizer ops (PaddlePaddle#17102)

test=develop
# This is the 11th commit message:

truncated_gaussian_random supported in distributed training, test=develop (PaddlePaddle#17091)

# This is the 12th commit message:

 Detailed coordinate description for yolov3 loss (PaddlePaddle#17007)

* Detailed coordinate description for yolov3 loss

test=develop

* modified api.spec

test=develop

* modified loss name

* fix api.spec

test=develop

* polish description

test=develop

* modified api.spec

test=develop

# This is the 13th commit message:

fix test_weight_decay (PaddlePaddle#17109)

test=develop
# This is the 14th commit message:

Path flag (PaddlePaddle#17105)

* fix python/paddle/fluid/__init__.py detecting problems
sneaxiy added a commit that referenced this pull request Apr 28, 2019
* refine_dropout_mem,test=develop

* # This is a combination of the same 14 commits listed above.
            np.isclose(
                a=loss[i], b=loss3[i], rtol=5e-5),
            "Expect " + str(loss[i]) + "\n" + "But Got" + str(loss2[i])
            + " in class " + self.__class__.__name__)
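Note that the quoted assertion compares loss against loss3 but builds its failure message from loss2, which explains why the CI failure above prints two visually identical numbers ("Expect 0.00045114197 / But Got0.00045114197"). A minimal sketch of that mismatch (the concrete values below are hypothetical):

```python
import numpy as np

# The assertion compares loss vs loss3, but the message prints loss2,
# so the message can show two identical numbers even though the check
# itself failed. All values here are made up for illustration.
loss = [np.float32(0.00045114197)]
loss2 = [np.float32(0.00045114197)]  # what the message prints as "But Got"
loss3 = [np.float32(0.00046)]        # what np.isclose actually checks

ok = bool(np.isclose(a=loss[0], b=loss3[0], rtol=5e-5))
msg = "Expect " + str(loss[0]) + "\n" + "But Got" + str(loss2[0])
print(ok)    # False: loss vs loss3 exceeds the tolerance
print(msg)   # yet both printed numbers come from loss and loss2
```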
@chengduoZH (author) commented:

After analyzing this unit test, I found that Reduce mode doesn't support this usage:

            param_list = [(var, var * self.learning_rate)
                          for var in main_prog.block(0).all_parameters()]

            optimizer = fluid.optimizer.Adagrad(
                learning_rate=self.learning_rate)
            optimizer.minimize(avg_cost)

            for params in param_list:
                updated_p = fluid.layers.elementwise_sub(
                    x=params[0], y=params[1])
                fluid.layers.assign(input=updated_p, output=params[0])
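The loop above subtracts param * learning_rate from each parameter, i.e. it scales every parameter by (1 - lr). A NumPy sketch of the same arithmetic (the values and the decay helper name are hypothetical, not from the test):

```python
import numpy as np

# NumPy equivalent of the manual weight decay in the loop above:
# updated_p = p - p * lr, i.e. scaling each parameter by (1 - lr).
learning_rate = 0.1
params = [np.array([1.0, 2.0, 4.0]), np.array([8.0])]

def decay(p, lr):
    # mirrors elementwise_sub(x=p, y=p * lr)
    return p - p * lr

params = [decay(p, learning_rate) for p in params]
print(params[0])  # [0.9 1.8 3.6]
```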

In Reduce mode, the optimizer ops are scattered evenly across the cards according to each op's role (opt_role), so that each card updates a different subset of the parameters; afterwards, the updated parameters are broadcast to the other cards. However, the op_role of the ops above (those added after optimizer.minimize(avg_cost)) is forward, so they are not scattered and the results are wrong.
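The scattering just described can be sketched as round-robin placement keyed on op_role. Everything below (card count, op names, the role strings) is a hypothetical illustration, not Paddle's actual scheduler:

```python
# Sketch: ops with role "optimize" are scattered round-robin across
# cards (each card updates its own parameters, then broadcasts them),
# while the manually added decay ops keep the "forward" role and are
# replicated on every card, operating on possibly stale parameters.
ops = [
    ("adagrad_w0", "optimize"),
    ("adagrad_w1", "optimize"),
    ("adagrad_w2", "optimize"),
    ("sub_w0", "forward"),     # fluid.layers.elementwise_sub
    ("assign_w0", "forward"),  # fluid.layers.assign
]
num_cards = 2
placement = {}
next_card = 0
for name, role in ops:
    if role == "optimize":
        placement[name] = next_card % num_cards
        next_card += 1
    else:
        placement[name] = "all cards"
print(placement)
```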
