[Question] Relationship between n_step, episode, and advantage in episodic tasks #1938

@d505

❓ Question

Hello!

I have a question about how n_step (PPO's `n_steps` parameter) relates to episodes and to advantage estimation in episodic tasks. I am using PPO on an episodic task in which every episode ends after the same number of steps.
If n_step is greater than the episode length, I believe the advantage computation will also take steps from the next episode into account. Is that actually the case?
If so, I would prefer to set n_step equal to the episode length so that other episodes are not included.

This is the piece of code I am referring to:
https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/buffers.py#L402
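
To make the question concrete, here is a minimal NumPy sketch of GAE over a rollout that spans two episodes. It is not the SB3 implementation: the function name `gae_with_episode_masks`, its arguments, and the toy numbers are all assumptions for illustration. It assumes the bootstrap term is masked with an episode-start flag, which is what the linked buffer code appears to do. Under that assumption, the advantages of the first episode should not depend on anything in the second one.

```python
import numpy as np


def gae_with_episode_masks(rewards, values, episode_starts, last_value,
                           last_done, gamma=0.99, gae_lambda=0.95):
    """Hypothetical GAE sketch (not SB3 API).

    episode_starts[t] is 1.0 if step t is the first step of an episode.
    last_value / last_done describe the state reached after the final step.
    """
    n = len(rewards)
    advantages = np.zeros(n, dtype=np.float64)
    last_gae = 0.0
    for t in reversed(range(n)):
        if t == n - 1:
            next_non_terminal = 1.0 - float(last_done)
            next_value = last_value
        else:
            # If step t + 1 starts a new episode, step t was terminal:
            # neither the next value nor the running GAE sum carries over.
            next_non_terminal = 1.0 - float(episode_starts[t + 1])
            next_value = values[t + 1]
        delta = rewards[t] + gamma * next_value * next_non_terminal - values[t]
        last_gae = delta + gamma * gae_lambda * next_non_terminal * last_gae
        advantages[t] = last_gae
    return advantages


# Toy rollout: n_step = 6, episode length = 3, so two full episodes.
rewards = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
values = np.full(6, 0.5)
episode_starts = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0])

both = gae_with_episode_masks(rewards, values, episode_starts,
                              last_value=0.5, last_done=True)
first_only = gae_with_episode_masks(rewards[:3], values[:3], episode_starts[:3],
                                    last_value=0.0, last_done=True)

# The first episode's advantages are the same with or without the second one.
print(np.allclose(both[:3], first_only))
```

If this sketch matches what the buffer does, then a larger n_step would only collect more episodes per update and would not mix episodes inside the advantage estimate.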

On the other hand, other issues seem to suggest that including steps from other episodes in a rollout is a good idea, for example:
#560
