Added E3B and validated - SuperMarioBros environment - Fixed Pretraining Mode by roger-creus · Pull Request #41 · RLE-Foundation/rllte

roger-creus · 2023-11-28T17:08:13Z

Description

I have implemented the E3B intrinsic reward proposed here. I have added the SuperMarioBros environment, which I have used to validate the E3B implementation. I have also fixed the pretraining mode for on-policy agents:

Before: the intrinsic rewards were only added to the extrinsic returns and advantages.
Now: in pretraining mode, the agent computes intrinsic returns and intrinsic advantages. When using intrinsic + extrinsic rewards, the behavior is the same as before.
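
To make the change concrete, here is a minimal sketch of the idea, not the actual rllte code. It assumes a single rollout with no episode boundaries and a plain GAE estimator; the names `gae` and `select_rewards` are hypothetical and exist only for this illustration. In pretraining mode the returns and advantages are estimated from the intrinsic rewards alone; otherwise they are estimated from the sum of both signals.

```python
import numpy as np


def gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one rollout (no episode boundaries)."""
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    advantages = np.zeros_like(rewards)
    next_value, running = float(last_value), 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error for the chosen reward signal.
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    returns = advantages + values
    return returns, advantages


def select_rewards(extrinsic, intrinsic, pretraining):
    """Hypothetical helper: pick the reward signal that is optimized."""
    extrinsic = np.asarray(extrinsic, dtype=float)
    intrinsic = np.asarray(intrinsic, dtype=float)
    # Pretraining: intrinsic only. Otherwise: extrinsic + intrinsic.
    return intrinsic if pretraining else extrinsic + intrinsic


extrinsic = [1.0, 0.0, 0.0]
intrinsic = [0.5, 0.5, 0.5]
values = np.zeros(3)

pre_returns, pre_adv = gae(
    select_rewards(extrinsic, intrinsic, pretraining=True),
    values, last_value=0.0, gamma=1.0, lam=1.0,
)
mix_returns, mix_adv = gae(
    select_rewards(extrinsic, intrinsic, pretraining=False),
    values, last_value=0.0, gamma=1.0, lam=1.0,
)
```

With `gamma = lam = 1` and a zero value function, the returns reduce to reward-to-go sums, which makes it easy to check that the pretraining branch ignores the extrinsic reward entirely.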

This has significantly increased the performance of intrinsic reward algorithms in pretraining mode.

This is the performance of PPO+E3B during pretraining mode in the SuperMarioBros-1-1-v3 environment (i.e. without access to task rewards!)
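
For readers unfamiliar with E3B, the sketch below illustrates the elliptical episodic bonus it is built on: the bonus for a state embedding `phi` is `phi^T C^{-1} phi`, where `C` is a ridge-regularized sum of outer products of the embeddings seen so far in the current episode. This is an illustration under assumptions, not the implementation in this PR: the class name `EllipticalBonus` and its parameters are hypothetical, and the embeddings are passed in directly, whereas E3B learns them with an inverse dynamics model.

```python
import numpy as np


class EllipticalBonus:
    """Hypothetical per-episode elliptical bonus tracker (illustration only)."""

    def __init__(self, dim, ridge=0.1):
        self.dim = dim
        self.ridge = ridge
        self.reset()

    def reset(self):
        # At the start of an episode C = ridge * I, so C^{-1} = I / ridge.
        self.cov_inv = np.eye(self.dim) / self.ridge

    def step(self, phi):
        phi = np.asarray(phi, dtype=float)
        u = self.cov_inv @ phi
        # Bonus is computed against the inverse covariance *before* adding phi.
        bonus = float(phi @ u)
        # Sherman-Morrison rank-one update of the inverse:
        # (C + phi phi^T)^{-1} = C^{-1} - (u u^T) / (1 + phi^T u)
        self.cov_inv -= np.outer(u, u) / (1.0 + bonus)
        return bonus


tracker = EllipticalBonus(dim=2, ridge=1.0)
first = tracker.step([1.0, 0.0])   # novel direction
repeat = tracker.step([1.0, 0.0])  # same direction again, smaller bonus
other = tracker.step([0.0, 1.0])   # unseen direction, full bonus
tracker.reset()                    # new episode: memory is cleared
after_reset = tracker.step([1.0, 0.0])
```

The rank-one update avoids inverting the matrix at every step, and calling `reset()` at episode boundaries is what makes the bonus episodic.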

Motivation and Context

E3B is a recent algorithm that achieves SOTA results in complex environments, so it's a valuable contribution.
During the pretraining phase, the intrinsic rewards were not being optimized properly.
Added the SuperMarioBros environment because it is cool and helps evaluate the performance of exploration algorithms: in Mario, good exploratory agents achieve high task rewards.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist

roger-creus added 2 commits November 28, 2023 09:59

Added E3B algorithm

d3f9afc

Added Mario env. Fixed pretraining mode and validated E3B on Mario

87f53ad

roger-creus requested a review from myismyname November 28, 2023 17:08

roger-creus wants to merge 2 commits into main from e3b