- PAPER: Trust Region Policy Optimization (TRPO) Pt 2
  - Theoretical lower bound proof
  - Estimating State Values
  - Sampling action values using Monte Carlo Search
  - Single Path Sampling (trajectory based)
- Examples of Real Robot Transition Functions
  - Mapping out Actuator space -> Joint space
  - Mapping out Joint space -> Cartesian space

