Day 10

  • PAPER: Trust Region Policy Optimization (TRPO) Pt 2
    • Theoretical lower bound proof
    • Estimating State Values
    • Sampling action values using Monte Carlo Search
    • Single Path Sampling (trajectory based)
  • Examples of Real Robot Transition Functions
    • Mapping out Actuator space -> Joint space
    • Mapping out Joint space -> Cartesian space
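
As a rough illustration of the "Single Path Sampling (trajectory based)" item above, the sketch below computes Monte Carlo return estimates from the rewards of one sampled trajectory. It is a minimal sketch, not the paper's code: the function name `discounted_returns`, the default `gamma`, and the example rewards are all assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = sum_k gamma**k * r_{t+k} for every step t of one trajectory.

    Hypothetical helper: `rewards` is the list of rewards collected along a
    single sampled path, and `gamma` is the discount factor.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the trajectory backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Made-up rewards from one rollout, purely for illustration.
rewards = [1.0, 0.0, 2.0]
print(discounted_returns(rewards, gamma=0.5))  # [1.5, 1.0, 2.0]
```

Each entry is a single-sample estimate of the action value at that step, so in practice the estimates would be averaged over many trajectories to reduce variance.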

Notes

  • Paper notes 1
  • Paper notes 2