-
Notifications
You must be signed in to change notification settings - Fork 2k
Refactor observation stacking #1238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ble-baselines3 into new-stack-observation
|
will try to review this one this week ;) |
|
Funnily, this is the 2nd time this piece of code is refactored ^^ first time was #243 (currently doing the review ;)) |
| return channels_first, stack_dimension, tuple(stacked_shape), repeat_axis | ||
|
|
||
| def stack_observation_space(self, observation_space: spaces.Box) -> spaces.Box: | ||
| def stack_observation_space(self, observation_space: Union[spaces.Box, spaces.Dict]) -> Union[spaces.Box, spaces.Dict]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about removing it but after a quick check, it is a bit used: https://github.com/search?o=desc&q=stable_baselines3+stack_observation_space&s=indexed&type=Code
so, let's remove it only in 2.x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the contrary, it seems that most of the matches on this search match def stack_observation_space that come from the copies of sb3. I can't find any code where this function is actually used.
| Given an observation space, returns a new observation space with stacked observations | ||
| This function is deprecated. | ||
| As an alternative, use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.stacked_observation_space is not a better alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is supposed to work by passing an observation space as input. This makes it a bit strange, as it is a bit of a cross between a class method and an instance method, in that it depends on self.n_stack and self.repeat_axis, but not on self.observation_space. Anyway, I think it should be removed, and if there is someone who actually uses it, then the correspodance that works in all cases (potentially with a different observation_space than the one of the instance) is the code that is proposed in the docstring.
| observation_space = self.stackedobs.stack_observation_space(wrapped_obs_space) | ||
| VecEnvWrapper.__init__(self, venv, observation_space=observation_space) | ||
| @property | ||
| def n_stack(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was it used somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in stack_observation_space; it's not used elsewhere in sb3, nor in sb3-contrib or rl-zoo3. It's hard to know if this arg is used by codes that depend on sb3.
araffin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the stacked dict obs and simplify some code that was a bit duplicated.
I'm also not sure why you added all the # pytype: disable=attribute-error, I tried removing them and it does work locally.
And pytype doesn't use the type stubs if I recall, in comparison to mypy (so if mypy passes, should be fine).
Apart from that comment, the rest look good to me =)
Usually, we wonder why a bug appears when we have not changed anything. Here, I wonder why a bug disappears when we have not changed anything 🤔. So be it! LGTM too! |
|
i think i know what happened, pytype allow multiline skip, and that's what your comment did. |
Description
Summary of changes:
StackedObservationsnow supports Dict as an observation space. It handles this by nesting instances of StackedObservation.DictStackedObservationhas therefore been deprecated.The changes I need feedback on:
stack_observation_spacedeprecationI deprecated the
stack_observation_spacemethod for the following reasons:self.repeat_axis, but takesobservation_spaceas parameter).compute_stackingnew params and returnsstatic method
compute_stackingno longer takesnum_envsand return thestacked_shapeinstead of a zero-filled stacked obs of shape(num_env, *stacked_shape)Motivation and Context
Types of changes
Checklist
make format(required)make check-codestyleandmake lint(required)make pytestandmake typeboth pass. (required)make doc(required)Note: You can run most of the checks using
make commit-checks.Note: we are using a maximum length of 127 characters per line