Optimizer Design by jacquesqiao · Pull Request #4656 · PaddlePaddle/Paddle

jacquesqiao · 2017-10-10T00:47:56Z

dzhwinter · 2017-10-10T18:00:31Z

doc/design/optimizer.md

+		op related.
+		"""
+		...
+		return update_op


When a user wants to update twice, the second update_op needs to trace the first update_op and all the related update and backward ops. Maybe we should write a guide that points this out.

OK, after discussing with @dzhwinter, this can be done in the current design, but it is not the most important thing to consider now.

We have already made the backward interface public; users can call it directly multiple times to create more gradient operators in the graph.
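As a rough illustration of calling a public backward interface more than once, here is a minimal sketch. The `graph` list, the `backward` function, and the tuple-shaped operators are hypothetical stand-ins, not the actual PaddlePaddle API.

```python
# Hypothetical stand-in for the operator graph: a plain list of op tuples.
graph = []


def backward(graph, loss, parameters):
    """Append one gradient operator per parameter for the given loss.

    This is an illustrative toy, not PaddlePaddle's backward interface.
    """
    new_ops = [("grad", loss, p) for p in parameters]
    graph.extend(new_ops)
    return new_ops


# Each call adds more gradient operators to the same graph.
first = backward(graph, "cost_a", ["w1", "w2"])
second = backward(graph, "cost_b", ["w1"])
print(len(graph))
```

Under these assumptions, every call only appends operators, so a later update step can trace all gradient operators created before it.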

dzhwinter · 2017-10-10T18:01:24Z

doc/design/optimizer.md

@@ -0,0 +1,85 @@
+## Optimizer Design
+In deeplearning system, `Optimizer` is used to optimize(minimize) loss thow updating a list of parameters. 


thow is a typo?

reyoung · 2017-10-10T18:05:11Z

doc/design/optimizer.md

+
+### A typical training process:
+
+1. run forward to calculate activation using data and parameter.


I do not think this typical training process fits our current design.

Currently, we put every operator into one ProgramDesc. There are no three explicit running stages.

This is a general, abstract training process; no matter how complex a training process is, it is composed of these stages.

In our design, we also have functions like backward and optimize that put the related operators into the ProgramDesc. Here we just put the interface into Optimizer as a high-level API.

reyoung · 2017-10-10T18:34:35Z

doc/design/optimizer.md

+
+```python
+class Optimizer(object):
+	def _backward(loss):


backward and update should be public.

reyoung · 2017-10-10T18:39:57Z

doc/design/optimizer.md

+3. User use the optimizer to `minimize` a certain `cost` thow updating parameters in parameter_list.
+
+```python
+opt = optimizer.minimize(cost, parameter_list=[w1, ...])


`opt` should be a list.
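One way to read this suggestion: with several parameters there is one update operator per parameter, so a list is the natural return type. The sketch below is only an assumption about the intent; `minimize` here is a hypothetical free function and the operators are plain tuples.

```python
def minimize(cost, parameter_list):
    """Return one hypothetical update operator per parameter."""
    return [("sgd_update", param, cost) for param in parameter_list]


# The result is a list, even when only one parameter is updated.
opts = minimize("cost", parameter_list=["w1", "w2"])
single = minimize("cost", parameter_list=["w1"])
print(len(opts), len(single))
```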

wangkuiyi · 2017-10-10T18:41:37Z

doc/design/optimizer.md

@@ -0,0 +1,85 @@
+## Optimizer Design
+In deeplearning system, `Optimizer` is used to optimize(minimize) loss thow updating a list of parameters. 


This design doc doesn't explain the challenge.

It looks to me that the challenge is the following.

## The Problem

A PaddlePaddle program, or a block, is a sequence of operators operating on variables. A training program needs to do three kinds of work:

1. the forward pass, which computes intermediate results and the cost(s),
2. the backward pass, which derives gradients from intermediate results and costs, and
3. the optimization pass, which updates model parameters.

This work relies on three kinds of operators:

1. forward operators,
2. gradient operators, and
3. optimization operators.

It's true that users should be able to create all these operators manually by calling some low-level API, but it would be much more convenient if they only had to describe the forward pass and let PaddlePaddle create the backward and optimization operators automatically.

In this design, we propose a high-level API that automatically derives the optimization pass and operators from the forward pass.
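To make the three kinds of operators concrete, here is a minimal sketch of deriving gradient and optimization operators from a forward-only block. Everything in it is an assumption for illustration: the `Block` class, `append_op`, the op type strings, and the `@GRAD` naming are not taken from the design doc.

```python
class Block:
    """Toy stand-in for a block: an ordered list of operator descriptions."""

    def __init__(self):
        self.ops = []

    def append_op(self, op_type, inputs, outputs):
        op = {"type": op_type, "inputs": list(inputs), "outputs": list(outputs)}
        self.ops.append(op)
        return op


def append_backward_ops(block, parameters):
    """Append one gradient operator per forward operator, in reverse order."""
    forward_ops = list(block.ops)  # snapshot, since we append while iterating
    for op in reversed(forward_ops):
        block.append_op(op["type"] + "_grad",
                        inputs=op["outputs"],
                        outputs=[name + "@GRAD" for name in op["inputs"]])
    return [(p, p + "@GRAD") for p in parameters]


def append_optimization_ops(block, params_grads):
    """Append one (hypothetical) SGD operator per parameter/gradient pair."""
    return [block.append_op("sgd", inputs=[p, g], outputs=[p])
            for p, g in params_grads]


# The user describes only the forward pass.
block = Block()
block.append_op("mul", inputs=["x", "w1"], outputs=["hidden"])
block.append_op("mse", inputs=["hidden", "labels"], outputs=["cost"])

# The other two kinds of operators are derived automatically.
params_grads = append_backward_ops(block, parameters=["w1"])
append_optimization_ops(block, params_grads)
op_types = [op["type"] for op in block.ops]
print(op_types)
```

The point of the sketch is only the ordering: forward operators first, then their gradient operators in reverse, then the optimization operators.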

wangkuiyi · 2017-10-10T18:42:08Z

doc/design/optimizer.md

+## Optimizer Design
+In deeplearning system, `Optimizer` is used to optimize(minimize) loss thow updating a list of parameters. 
+
+### A typical training process:


If the above proposed section ## The Problem is accepted, this paragraph of three bullets can be removed.

wangkuiyi · 2017-10-10T18:42:51Z

doc/design/optimizer.md

+
+1. User write code to describe the network:
+
+```python


This Python program needs to be properly indented -- to the right of 1. in the above line.

wangkuiyi · 2017-10-10T18:43:07Z

doc/design/optimizer.md

+cost = layer.mse(hidden, labels)
+```
+
+the code above will generate forward operators in [block](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md).


the => The
the code above => The above code snippet
will generate => creates

wangkuiyi · 2017-10-10T18:43:59Z

doc/design/optimizer.md

+the code above will generate forward operators in [block](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md).
+
+
+2. User create a Optimizer and set parameter list that it need to update.


Either The user creates or Users create

wangkuiyi · 2017-10-10T18:44:24Z

doc/design/optimizer.md

+
+2. User create a Optimizer and set parameter list that it need to update.
+
+```python


Correct code snippet indentation in the Markdown doc.

wangkuiyi · 2017-10-10T18:45:08Z

doc/design/optimizer.md

+
+### What does optimizer do:
+
+In PaddlePaddle, we use block of operators to describe computation. From the Python Interface we described above, we can see that `Optimizer` should add some operators to the computation block:


block of operators => blocks of operators
we use => PaddlePaddle uses

done, removed

wangkuiyi · 2017-10-10T18:46:42Z

doc/design/optimizer.md

+
+```python
+class Optimizer(object):
+	def _backward(loss):


_backward => create_backward_pass

wangkuiyi · 2017-10-10T18:46:52Z

doc/design/optimizer.md

+		...
+		return variables
+
+	def _update(var_list):


_update => create_optimization_pass

Superjomn · 2017-10-10T22:10:35Z

doc/design/optimizer.md

+
+1. User write code to describe the network:
+
+	```python


The pseudocode here is not formatted well.

tonyyang-svail · 2017-10-10T22:19:06Z

doc/design/optimizer.md

+
+```python
+class Optimizer(object):
+	def create_backward_pass(loss, parameter_list=None):


Parameter and variable look interchangeable in the Python API. I am not sure whether they refer to the same concept.

tonyyang-svail · 2017-10-10T22:21:15Z

doc/design/optimizer.md

+		This method simply combines calls `create_backward_pass()` and
+		`create_optimization_pass()`.
+		"""
+		vars_grads = create_backward_pass(loss)


typo create_backward_pass(loss) => create_backward_pass(loss, parameter_list)
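A minimal sketch of why forwarding `parameter_list` matters, assuming the method names from the diff above. The bodies are hypothetical: operators are plain tuples, and the `all_parameters` attribute is invented for this illustration.

```python
class Optimizer(object):
    """Hypothetical sketch; ops are plain tuples instead of real operators."""

    def __init__(self, all_parameters):
        # Invented attribute: the parameters used when no list is given.
        self.all_parameters = list(all_parameters)

    def create_backward_pass(self, loss, parameter_list=None):
        params = self.all_parameters if parameter_list is None else parameter_list
        return [(p, "grad(%s, %s)" % (loss, p)) for p in params]

    def create_optimization_pass(self, vars_grads):
        return [("update", var, grad) for var, grad in vars_grads]

    def minimize(self, loss, parameter_list=None):
        # Forward parameter_list; otherwise every parameter would be updated.
        vars_grads = self.create_backward_pass(loss, parameter_list)
        return self.create_optimization_pass(vars_grads)


opt = Optimizer(["w1", "w2", "b"])
update_ops = opt.minimize("cost", parameter_list=["w1"])
all_update_ops = opt.minimize("cost")
print(update_ops)
```

If `minimize` called `create_backward_pass(loss)` without the second argument, the restricted call would behave like the unrestricted one in this sketch.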

jacquesqiao · 2017-10-18T02:22:40Z

The reason for using a uniform interface for Optimizer:

when parameters are shared and different optimizers are used,
i.e. one network with different optimizers.

jacquesqiao added 2 commits October 9, 2017 17:46

init optimizer design

1077256

fix index

94f559b

jacquesqiao changed the title ~~Optimizer on block~~ Optimizer Design Oct 10, 2017

jacquesqiao requested review from Superjomn, abhinavarora, reyoung, tonyyang-svail and wangkuiyi October 10, 2017 17:06

dzhwinter reviewed Oct 10, 2017

jacquesqiao mentioned this pull request Oct 10, 2017

Optimizer Design and Implementation #4679

Closed

23 tasks

reyoung reviewed Oct 10, 2017

wangkuiyi reviewed Oct 10, 2017

jacquesqiao added 2 commits October 10, 2017 14:00

optimize the interface

a30e4b4

add a link to python_api.md

9ba35a4

Superjomn reviewed Oct 10, 2017

tonyyang-svail reviewed Oct 10, 2017

optimize the code of Optimizer

49318e8

wangkuiyi approved these changes Oct 11, 2017

wangkuiyi merged commit 696874a into PaddlePaddle:develop Oct 11, 2017

jacquesqiao mentioned this pull request Oct 21, 2017

optimizer design #3711

Closed
