Closed
Labels: question (Further information is requested)
❓ Question
Hello!
I have a question about `n_steps` and how episodes relate to the advantage computation in episodic tasks. My task is episodic, and every episode ends after the same number of steps. I am using PPO.
If `n_steps` is greater than the episode length, I believe the advantage computation will also take the next episode into account. Is that actually the case?
If so, I would prefer to set `n_steps` equal to the episode length so that no other episodes are included.
This is the piece of code I am referring to:
https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/buffers.py#L402
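To make the boundary question concrete, here is a minimal pure-Python sketch of generalized advantage estimation (GAE) with an episode-start mask, which is how I understand the linked function to handle episode ends. The function name `gae_with_episode_masks`, its arguments, and the toy numbers are assumptions made for illustration; this is not the library's code or API.

```python
def gae_with_episode_masks(rewards, values, episode_starts, last_value,
                           last_done, gamma=0.99, gae_lambda=0.95):
    """Hypothetical helper: GAE over one rollout that may span several episodes.

    episode_starts[t] is True when step t is the first step of an episode.
    last_value / last_done describe the state reached after the final step.
    """
    n = len(rewards)
    advantages = [0.0] * n
    last_gae = 0.0
    for t in reversed(range(n)):
        if t == n - 1:
            # End of the rollout: bootstrap from last_value unless the episode ended.
            next_non_terminal = 1.0 - float(last_done)
            next_value = last_value
        else:
            # If step t + 1 starts a new episode, step t was terminal:
            # the mask drops both the bootstrap value and the running GAE sum.
            next_non_terminal = 1.0 - float(episode_starts[t + 1])
            next_value = values[t + 1]
        delta = rewards[t] + gamma * next_value * next_non_terminal - values[t]
        last_gae = delta + gamma * gae_lambda * next_non_terminal * last_gae
        advantages[t] = last_gae
    return advantages


# Toy rollout: two episodes of length 3 stored in one buffer of 6 steps.
rewards = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
values = [0.5] * 6
episode_starts = [True, False, False, True, False, False]

adv_full = gae_with_episode_masks(rewards, values, episode_starts,
                                  last_value=0.5, last_done=True)

# The same first episode on its own, as if the buffer held exactly one episode.
adv_first_only = gae_with_episode_masks(rewards[:3], values[:3], episode_starts[:3],
                                        last_value=0.0, last_done=True)

same = all(abs(a - b) < 1e-12 for a, b in zip(adv_full[:3], adv_first_only))
print("first-episode advantages unchanged by the second episode:", same)
```

In this toy rollout the mask zeroes the bootstrap value and the accumulated GAE term at each episode start, so the first episode's advantages come out the same whether or not a second episode follows it in the buffer.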
On the other hand, other issues seem to suggest that including other episodes is a good idea:
#560
Checklist
- I have checked that there is no similar issue in the repo
- I have read the documentation
- If code there is, it is minimal and working
- If code there is, it is formatted using the markdown code blocks for both code and stack traces.