Merged
Changes from 11 commits
Commits
109 commits
f0e03de
Added working her version, Online sampling is missing.
megan-klaiber Jul 20, 2020
f2b0645
Updated test_her.
megan-klaiber Jul 21, 2020
f7d5f88
Added first version of online her sampling. Still problems with tenso…
megan-klaiber Jul 23, 2020
88771b8
Reformat
araffin Jul 23, 2020
2e436a2
Fixed tests
araffin Jul 23, 2020
c0a82fc
Added some comments.
megan-klaiber Jul 23, 2020
e6263b2
Updated changelog.
megan-klaiber Jul 23, 2020
257b8fc
Add missing init file
araffin Jul 23, 2020
90f6e2c
Fixed some small bugs.
megan-klaiber Jul 23, 2020
b864f34
Merge branch 'master' into her
araffin Jul 24, 2020
fb4351c
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin Jul 24, 2020
7b22e68
Reduced arguments for HER, small changes.
megan-klaiber Jul 29, 2020
501b1c4
Added getattr. Fixed bug for online sampling.
megan-klaiber Aug 3, 2020
15d5511
Merge branch 'master' into her
araffin Aug 4, 2020
7411668
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin Aug 4, 2020
5d09619
Updated save/load functions. Small changes.
megan-klaiber Aug 6, 2020
cb6a650
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber Aug 6, 2020
cb9026f
Added her to init.
megan-klaiber Aug 6, 2020
e30f730
Updated save method.
megan-klaiber Aug 7, 2020
7d1eb24
Updated her ratio.
megan-klaiber Aug 7, 2020
afca86a
Merge branch 'master' into her
araffin Aug 7, 2020
21bd1a4
Move obs_wrapper
megan-klaiber Aug 11, 2020
12bd35c
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber Aug 11, 2020
e647d36
Added DQN test.
megan-klaiber Aug 11, 2020
fc2b181
Fix potential bug
araffin Aug 11, 2020
3f3bd49
Offline and online her share same sample_goal function.
megan-klaiber Aug 19, 2020
cce063f
Changed lists into arrays.
megan-klaiber Aug 24, 2020
6393816
Merge branch 'master' into her
araffin Aug 24, 2020
0c0d742
Updated her test.
megan-klaiber Aug 24, 2020
474e434
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber Aug 24, 2020
bbf9d6d
Fix online sampling
araffin Aug 24, 2020
eefea13
Fixed action bug. Updated time limit for episodes.
megan-klaiber Aug 24, 2020
b5b00db
Updated convert_dict method to take keys as arguments.
megan-klaiber Aug 24, 2020
fb229b7
Renamed obs dict wrapper.
megan-klaiber Aug 25, 2020
8a93ac9
Seed bit flipping env
araffin Aug 25, 2020
66ab30c
Remove get_episode_dict
araffin Aug 25, 2020
a4b7cf9
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin Aug 25, 2020
d6a5524
Add fast online sampling version
araffin Aug 25, 2020
a3c08de
Added documentation.
megan-klaiber Aug 25, 2020
bbf5a93
Vectorized reward computation
araffin Aug 25, 2020
41ab0a9
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin Aug 25, 2020
2525eb0
Vectorized goal sampling
araffin Aug 25, 2020
c57c6ef
Update time limit for episodes in online her sampling.
megan-klaiber Aug 25, 2020
902267c
Fix max episode length inference
araffin Aug 26, 2020
fc7f647
Bug fix for Fetch envs
araffin Aug 26, 2020
0757a73
Fix for HER + gSDE
araffin Aug 27, 2020
eb89099
Reformat (new black version)
araffin Aug 27, 2020
d1adff6
Added info dict to compute new reward. Check her_replay_buffer again.
megan-klaiber Aug 27, 2020
01162df
Fix info buffer
araffin Aug 27, 2020
68b91b0
Merge branch 'master' into her
araffin Aug 27, 2020
656a1a6
Updated done flag.
megan-klaiber Aug 28, 2020
59bbe80
Fixes for gSDE
araffin Aug 28, 2020
bf5eb69
Merge branch 'master' into her
araffin Aug 29, 2020
02d5c6b
Merge branch 'master' into her
araffin Sep 1, 2020
90dafc4
Offline her version uses now HerReplayBuffer as episode storage.
megan-klaiber Sep 16, 2020
45ac099
Merge branch 'master' into her
araffin Sep 16, 2020
655e4c3
Fix num_timesteps computation
araffin Sep 17, 2020
d1df715
Merge branch 'master' into her
araffin Sep 24, 2020
046088b
Fix get torch params
araffin Sep 24, 2020
22b9d5a
Merge branch 'master' into her
araffin Sep 26, 2020
ba34a29
Merge branch 'master' into her
araffin Sep 30, 2020
281bd94
Merge branch 'master' into her
araffin Oct 4, 2020
a68cc32
Vectorized version for offline sampling.
megan-klaiber Oct 5, 2020
8a25457
Modified offline her sampling to use sample method of her_replay_buffer
megan-klaiber Oct 6, 2020
c125d08
Updated HER tests.
megan-klaiber Oct 6, 2020
a70b47b
Updated documentation
megan-klaiber Oct 6, 2020
74e5d14
Merge branch 'master' into her
araffin Oct 7, 2020
aaa80c8
Cleanup docstrings
araffin Oct 7, 2020
362ea5c
Updated to review comments
megan-klaiber Oct 8, 2020
aafe326
Merge branch 'master' into her
araffin Oct 12, 2020
7f8b636
Fix pytype
araffin Oct 12, 2020
39a63b8
Update according to review comments.
megan-klaiber Oct 13, 2020
417559e
Merge branch 'master' into her
megan-klaiber Oct 14, 2020
258deff
Removed random goal strategy. Updated sample transitions.
megan-klaiber Oct 14, 2020
381d927
Updated migration. Removed time signal removal.
megan-klaiber Oct 14, 2020
c10b26a
Update doc
araffin Oct 14, 2020
9d5c83e
Fix potential load issue
araffin Oct 14, 2020
86a25bf
Merge branch 'master' into her
araffin Oct 16, 2020
46c6d29
Add VecNormalize support for dict obs
araffin Oct 16, 2020
1cfc790
Updated saving/loading replay buffer for HER.
megan-klaiber Oct 16, 2020
f738f32
Fix test memory usage
araffin Oct 16, 2020
d0a3f46
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin Oct 16, 2020
fe42e1f
Merge branch 'master' into her
araffin Oct 18, 2020
d7a787f
Fixed save/load replay buffer.
megan-klaiber Oct 19, 2020
f36589f
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber Oct 19, 2020
c8ebaa9
Fixed save/load replay buffer
megan-klaiber Oct 19, 2020
11f0fa2
Fixed transition index after loading replay buffer in online sampling
megan-klaiber Oct 19, 2020
dd07f7c
Merge branch 'master' into her
araffin Oct 20, 2020
ee39e38
Better error handling
araffin Oct 20, 2020
3821e4d
Add tests for get_time_limit
araffin Oct 20, 2020
dca9582
More tests for VecNormalize with dict obs
araffin Oct 20, 2020
631cc9c
Update doc
araffin Oct 20, 2020
ba0a7e4
Improve HER description
araffin Oct 20, 2020
907bcff
Add test for sde support
araffin Oct 20, 2020
f650934
Add comments
araffin Oct 20, 2020
03c4104
Add comments
araffin Oct 20, 2020
6c18e4c
Remove check that was always valid
araffin Oct 20, 2020
28b281d
Fix for terminal observation
araffin Oct 21, 2020
d196aa2
Updated buffer size in offline version and reset of HER buffer
megan-klaiber Oct 21, 2020
1f7ab9f
Reformat
araffin Oct 21, 2020
7da274f
Update doc
araffin Oct 21, 2020
8bb5c7c
Remove np.empty + add doc
araffin Oct 21, 2020
d884f9c
Fix loading
araffin Oct 21, 2020
0ba1272
Updated loading replay buffer
megan-klaiber Oct 21, 2020
4034217
Separate online and offline sampling + bug fixes
araffin Oct 22, 2020
aacd936
Update tensorboard log name
araffin Oct 22, 2020
940ee2c
Version bump
araffin Oct 22, 2020
3bb19a7
Bug fix for special case
araffin Oct 22, 2020
fb92b22
Merge branch 'master' into her
araffin Oct 22, 2020
3 changes: 2 additions & 1 deletion docs/misc/changelog.rst
@@ -25,6 +25,7 @@ New Features:
- Refactored opening paths for saving and loading to use strings, pathlib or io.BufferedIOBase (@PartiallyTyped)
- Added ``DDPG`` algorithm as a special case of ``TD3``.
- Introduced ``BaseModel`` abstract parent for ``BasePolicy``, which critics inherit from.
- Added Hindsight Experience Replay ``HER``. (@megan-klaiber)
Review comment (Member):
You will also need to update the documentation: add HER to the module and to the examples (you can mostly copy-paste what was done in the SB2 documentation ;))


Bug Fixes:
^^^^^^^^^^
@@ -356,4 +357,4 @@ And all the contributors:
@Miffyli @dwiel @miguelrass @qxcv @jaberkow @eavelardev @ruifeng96150 @pedrohbtp @srivatsankrishnan @evilsocket
@MarvineGothic @jdossgollin @SyllogismRXS @rusu24edward @jbulow @Antymon @seheevic @justinkterry @edbeeching
@flodorner @KuKuXia @NeoExtended @PartiallyTyped @mmcenta @richardwu @kinalmehta @rolandgvc @tkelestemur @mloo3
- @tirafesi @blurLake @koulakis @joeljosephjin @shwang
+ @tirafesi @blurLake @koulakis @joeljosephjin @shwang @megan-klaiber
Empty file.
31 changes: 31 additions & 0 deletions stable_baselines3/her/goal_selection_strategy.py
@@ -0,0 +1,31 @@
from enum import Enum


class GoalSelectionStrategy(Enum):
    """
    The strategies for selecting new goals when
    creating artificial transitions.
    """

    # Select a goal that was achieved
    # after the current step, in the same episode
    FUTURE = 0
    # Select the goal that was achieved
    # at the end of the episode
    FINAL = 1
    # Select a goal that was achieved in the episode
    EPISODE = 2
    # Select a goal that was achieved
    # at some point in the training procedure
    # (and that is present in the replay buffer)
    RANDOM = 3


# For convenience
# that way, we can use string to select a strategy
KEY_TO_GOAL_STRATEGY = {
    "future": GoalSelectionStrategy.FUTURE,
    "final": GoalSelectionStrategy.FINAL,
    "episode": GoalSelectionStrategy.EPISODE,
    "random": GoalSelectionStrategy.RANDOM,
}
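
Illustrative sketch (not part of this diff): the snippet below shows one way the string-to-enum mapping above might be used, and how the three per-episode strategies could translate into the index of the transition whose achieved goal becomes the new goal. The helpers `to_strategy` and `sample_goal_index` are hypothetical names introduced only for this example; they are not taken from the PR, and the actual sampling code in the HER replay buffer may well differ. `RANDOM` is left out of the sampling helper because it needs access to the whole replay buffer rather than a single episode.

```python
import random
from enum import Enum


class GoalSelectionStrategy(Enum):
    FUTURE = 0
    FINAL = 1
    EPISODE = 2
    RANDOM = 3


KEY_TO_GOAL_STRATEGY = {
    "future": GoalSelectionStrategy.FUTURE,
    "final": GoalSelectionStrategy.FINAL,
    "episode": GoalSelectionStrategy.EPISODE,
    "random": GoalSelectionStrategy.RANDOM,
}


def to_strategy(strategy):
    """Hypothetical helper: accept an enum member or its string key."""
    if isinstance(strategy, str):
        return KEY_TO_GOAL_STRATEGY[strategy.lower()]
    return strategy


def sample_goal_index(strategy, step, episode_length, rng):
    """Hypothetical helper: pick the index (within one episode) of the
    transition whose achieved goal is relabelled as the new goal."""
    strategy = to_strategy(strategy)
    last = episode_length - 1
    if strategy == GoalSelectionStrategy.FINAL:
        # Goal achieved at the end of the episode
        return last
    if strategy == GoalSelectionStrategy.FUTURE:
        # Goal achieved after the current step; at the last step
        # there is no later transition, so fall back to the final one
        return rng.randrange(min(step + 1, last), episode_length)
    if strategy == GoalSelectionStrategy.EPISODE:
        # Any goal achieved during the episode
        return rng.randrange(0, episode_length)
    # RANDOM would sample from the whole replay buffer (assumption: not shown here)
    raise NotImplementedError(f"{strategy} needs the full replay buffer")


rng = random.Random(0)
print(to_strategy("future"))
print(sample_goal_index("final", step=2, episode_length=10, rng=rng))
```

Accepting both the string and the enum member keeps call sites short while still giving the sampling code a closed set of values to branch on.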