thu-rllab/SCAS

Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression

Code for the NeurIPS 2024 paper "Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression".

Environment

Paper results were collected with MuJoCo 2.1.0 (mujoco210, via mujoco-py 2.1.2.14) in OpenAI Gym 0.23.1 on the D4RL datasets. Networks were trained with PyTorch 1.11.0 and Python 3.7.
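Before running anything, it may help to check that the local environment matches the versions listed above. The following is a minimal sketch, not part of this repository: the helper names (`installed_version`, `matches`, `check_environment`) are hypothetical, and the distribution names `torch`, `gym`, and `mujoco-py` are assumed to be the PyPI names of the packages mentioned.

```python
import sys

# Versions quoted in the Environment section (assumed PyPI distribution names).
EXPECTED = {
    "python": "3.7",
    "torch": "1.11.0",
    "gym": "0.23.1",
    "mujoco-py": "2.1.2.14",
}


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        # Available in the standard library from Python 3.8 onwards.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python 3.7 fallback via setuptools' pkg_resources, if present.
        try:
            import pkg_resources
        except ImportError:
            return None
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def matches(expected, found):
    """True if `found` equals `expected` or refines it (e.g. 3.7 vs 3.7.12)."""
    if found is None:
        return False
    found = found.split("+")[0]  # drop local build tags such as "+cu113"
    return found == expected or found.startswith(expected + ".")


def check_environment():
    """Map each name to a tuple (expected, found, ok)."""
    report = {}
    for name, expected in EXPECTED.items():
        if name == "python":
            found = ".".join(str(part) for part in sys.version_info[:3])
        else:
            found = installed_version(name)
        report[name] = (expected, found, matches(expected, found))
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:10s} expected {expected:10s} found {found} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only indicates a departure from the setup used for the paper results.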

Usage

Pretrained Models

Pretrained dynamics models are provided in SCAS_dynamics/ to make the experiments easier to reproduce.

You can also pretrain dynamics models by running:

./run_pretrain.sh

Offline RL

The SCAS algorithm can be trained by running:

./run_experiments.sh

Logging

This codebase logs to TensorBoard. You can view saved runs with:

tensorboard --logdir <run_dir>
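If it is unclear which directory to pass as `<run_dir>`, one option is to search for TensorBoard event files, which are conventionally named `events.out.tfevents.*`. The sketch below is an illustration only: `find_run_dirs` is a hypothetical helper, and the root directory you search is an assumption, since this README does not state where the training scripts write their logs.

```python
import os


def find_run_dirs(root):
    """Return sorted directories under `root` that contain TensorBoard event files."""
    run_dirs = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        # TensorBoard writers name their files "events.out.tfevents.<suffix>".
        if any(name.startswith("events.out.tfevents.") for name in filenames):
            run_dirs.add(dirpath)
    return sorted(run_dirs)


if __name__ == "__main__":
    # "." is a placeholder; point this at wherever your runs were saved.
    for run_dir in find_run_dirs("."):
        print(run_dir)
```

Any directory printed this way, or a parent of several of them, can be passed to `tensorboard --logdir`.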

Citation

If you find this work useful, please consider citing:

@article{mao2024offline,
  title={Offline reinforcement learning with ood state correction and ood action suppression},
  author={Mao, Yixiu and Wang, Qi and Chen, Chen and Qu, Yun and Ji, Xiangyang},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={93568--93601},
  year={2024}
}
