Implement HER #120
Merged
109 commits
f0e03de
Added working her version, Online sampling is missing.
megan-klaiber f2b0645
Updated test_her.
megan-klaiber f7d5f88
Added first version of online her sampling. Still problems with tenso…
megan-klaiber 88771b8
Reformat
araffin 2e436a2
Fixed tests
araffin c0a82fc
Added some comments.
megan-klaiber e6263b2
Updated changelog.
megan-klaiber 257b8fc
Add missing init file
araffin 90f6e2c
Fixed some small bugs.
megan-klaiber b864f34
Merge branch 'master' into her
araffin fb4351c
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin 7b22e68
Reduced arguments for HER, small changes.
megan-klaiber 501b1c4
Added getattr. Fixed bug for online sampling.
megan-klaiber 15d5511
Merge branch 'master' into her
araffin 7411668
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin 5d09619
Updated save/load functions. Small changes.
megan-klaiber cb6a650
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber cb9026f
Added her to init.
megan-klaiber e30f730
Updated save method.
megan-klaiber 7d1eb24
Updated her ratio.
megan-klaiber afca86a
Merge branch 'master' into her
araffin 21bd1a4
Move obs_wrapper
megan-klaiber 12bd35c
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber e647d36
Added DQN test.
megan-klaiber fc2b181
Fix potential bug
araffin 3f3bd49
Offline and online her share same sample_goal function.
megan-klaiber cce063f
Changed lists into arrays.
megan-klaiber 6393816
Merge branch 'master' into her
araffin 0c0d742
Updated her test.
megan-klaiber 474e434
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber bbf9d6d
Fix online sampling
araffin eefea13
Fixed action bug. Updated time limit for episodes.
megan-klaiber b5b00db
Updated convert_dict method to take keys as arguments.
megan-klaiber fb229b7
Renamed obs dict wrapper.
megan-klaiber 8a93ac9
Seed bit flipping env
araffin 66ab30c
Remove get_episode_dict
araffin a4b7cf9
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin d6a5524
Add fast online sampling version
araffin a3c08de
Added documentation.
megan-klaiber bbf5a93
Vectorized reward computation
araffin 41ab0a9
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin 2525eb0
Vectorized goal sampling
araffin c57c6ef
Update time limit for episodes in online her sampling.
megan-klaiber 902267c
Fix max episode length inference
araffin fc7f647
Bug fix for Fetch envs
araffin 0757a73
Fix for HER + gSDE
araffin eb89099
Reformat (new black version)
araffin d1adff6
Added info dict to compute new reward. Check her_replay_buffer again.
megan-klaiber 01162df
Fix info buffer
araffin 68b91b0
Merge branch 'master' into her
araffin 656a1a6
Updated done flag.
megan-klaiber 59bbe80
Fixes for gSDE
araffin bf5eb69
Merge branch 'master' into her
araffin 02d5c6b
Merge branch 'master' into her
araffin 90dafc4
Offline her version uses now HerReplayBuffer as episode storage.
megan-klaiber 45ac099
Merge branch 'master' into her
araffin 655e4c3
Fix num_timesteps computation
araffin d1df715
Merge branch 'master' into her
araffin 046088b
Fix get torch params
araffin 22b9d5a
Merge branch 'master' into her
araffin ba34a29
Merge branch 'master' into her
araffin 281bd94
Merge branch 'master' into her
araffin a68cc32
Vectorized version for offline sampling.
megan-klaiber 8a25457
Modified offline her sampling to use sample method of her_replay_buffer
megan-klaiber c125d08
Updated HER tests.
megan-klaiber a70b47b
Updated documentation
megan-klaiber 74e5d14
Merge branch 'master' into her
araffin aaa80c8
Cleanup docstrings
araffin 362ea5c
Updated to review comments
megan-klaiber aafe326
Merge branch 'master' into her
araffin 7f8b636
Fix pytype
araffin 39a63b8
Update according to review comments.
megan-klaiber 417559e
Merge branch 'master' into her
megan-klaiber 258deff
Removed random goal strategy. Updated sample transitions.
megan-klaiber 381d927
Updated migration. Removed time signal removal.
megan-klaiber c10b26a
Update doc
araffin 9d5c83e
Fix potential load issue
araffin 86a25bf
Merge branch 'master' into her
araffin 46c6d29
Add VecNormalize support for dict obs
araffin 1cfc790
Updated saving/loading replay buffer for HER.
megan-klaiber f738f32
Fix test memory usage
araffin d0a3f46
Merge branch 'her' of github.com:DLR-RM/stable-baselines3 into her
araffin fe42e1f
Merge branch 'master' into her
araffin d7a787f
Fixed save/load replay buffer.
megan-klaiber f36589f
Merge branch 'her' of https://github.com/DLR-RM/stable-baselines3 int…
megan-klaiber c8ebaa9
Fixed save/load replay buffer
megan-klaiber 11f0fa2
Fixed transition index after loading replay buffer in online sampling
megan-klaiber dd07f7c
Merge branch 'master' into her
araffin ee39e38
Better error handling
araffin 3821e4d
Add tests for get_time_limit
araffin dca9582
More tests for VecNormalize with dict obs
araffin 631cc9c
Update doc
araffin ba0a7e4
Improve HER description
araffin 907bcff
Add test for sde support
araffin f650934
Add comments
araffin 03c4104
Add comments
araffin 6c18e4c
Remove check that was always valid
araffin 28b281d
Fix for terminal observation
araffin d196aa2
Updated buffer size in offline version and reset of HER buffer
megan-klaiber 1f7ab9f
Reformat
araffin 7da274f
Update doc
araffin 8bb5c7c
Remove np.empty + add doc
araffin d884f9c
Fix loading
araffin 0ba1272
Updated loading replay buffer
megan-klaiber 4034217
Separate online and offline sampling + bug fixes
araffin aacd936
Update tensorboard log name
araffin 940ee2c
Version bump
araffin 3bb19a7
Bug fix for special case
araffin fb92b22
Merge branch 'master' into her
araffin
Empty file.
@@ -0,0 +1,31 @@
from enum import Enum


class GoalSelectionStrategy(Enum):
    """
    The strategies for selecting new goals when
    creating artificial transitions.
    """

    # Select a goal that was achieved
    # after the current step, in the same episode
    FUTURE = 0
    # Select the goal that was achieved
    # at the end of the episode
    FINAL = 1
    # Select a goal that was achieved in the episode
    EPISODE = 2
    # Select a goal that was achieved
    # at some point in the training procedure
    # (and that is present in the replay buffer)
    RANDOM = 3


# For convenience
# that way, we can use string to select a strategy
KEY_TO_GOAL_STRATEGY = {
    "future": GoalSelectionStrategy.FUTURE,
    "final": GoalSelectionStrategy.FINAL,
    "episode": GoalSelectionStrategy.EPISODE,
    "random": GoalSelectionStrategy.RANDOM,
}
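The comments in this hunk describe which achieved goal each strategy picks. A minimal sketch of that selection logic follows; it is not the code from this PR. The enum and the mapping are copied from the hunk so the block runs on its own, while `resolve_goal_strategy` and `sample_goal_index` are hypothetical helpers introduced here for illustration. The sketch assumes a single episode stored as a sequence of transitions, and it leaves out `RANDOM` because that strategy needs access to the whole replay buffer.

```python
import random
from enum import Enum
from typing import Optional, Union


class GoalSelectionStrategy(Enum):
    FUTURE = 0
    FINAL = 1
    EPISODE = 2
    RANDOM = 3


KEY_TO_GOAL_STRATEGY = {
    "future": GoalSelectionStrategy.FUTURE,
    "final": GoalSelectionStrategy.FINAL,
    "episode": GoalSelectionStrategy.EPISODE,
    "random": GoalSelectionStrategy.RANDOM,
}


def resolve_goal_strategy(
    strategy: Union[str, GoalSelectionStrategy],
) -> GoalSelectionStrategy:
    """Hypothetical helper: accept either a string key or an enum member."""
    if isinstance(strategy, GoalSelectionStrategy):
        return strategy
    try:
        return KEY_TO_GOAL_STRATEGY[strategy.lower()]
    except KeyError:
        raise ValueError(f"Unknown goal selection strategy: {strategy!r}") from None


def sample_goal_index(
    strategy: Union[str, GoalSelectionStrategy],
    step: int,
    episode_length: int,
    rng: Optional[random.Random] = None,
) -> int:
    """
    Hypothetical helper: return the index of the transition (within one
    episode) whose achieved goal would replace the desired goal.
    """
    rng = rng if rng is not None else random.Random()
    strategy = resolve_goal_strategy(strategy)
    if not 0 <= step < episode_length:
        raise ValueError("step must lie inside the episode")

    if strategy is GoalSelectionStrategy.FINAL:
        # Goal achieved at the end of the episode
        return episode_length - 1
    if strategy is GoalSelectionStrategy.FUTURE:
        # Goal achieved strictly after the current step.
        # Assumption of this sketch: the last step has no future goal.
        if step == episode_length - 1:
            raise ValueError("no future step after the last transition")
        return rng.randrange(step + 1, episode_length)
    if strategy is GoalSelectionStrategy.EPISODE:
        # Any goal achieved during the episode
        return rng.randrange(episode_length)
    # RANDOM draws from the whole replay buffer, which is out of scope here
    raise NotImplementedError("RANDOM needs access to the full replay buffer")


if __name__ == "__main__":
    print(resolve_goal_strategy("future"))
    print(sample_goal_index("final", step=2, episode_length=10))
```

Accepting both strings and enum members in one place keeps the string lookup that `KEY_TO_GOAL_STRATEGY` is meant to provide, and turns a misspelled key into an explicit `ValueError`.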
You will also need to update the documentation: add HER to the module and to the examples (you can mostly copy-paste what was done in the SB2 documentation ;))