Code for the NeurIPS 2024 paper "Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression".
Paper results were collected with MuJoCo 210 (and mujoco-py 2.1.2.14) in OpenAI Gym 0.23.1 on the D4RL datasets. Networks were trained using PyTorch 1.11.0 and Python 3.7.
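Before running anything, it can help to confirm that the local environment matches the versions listed above. The following is a minimal sketch and not part of this repository: the helper names (`installed_version`, `find_mismatches`) are hypothetical, and the pip distribution names `torch`, `gym`, and `mujoco-py` are assumptions about how the packages were installed. On Python 3.7 it relies on the `importlib_metadata` backport being available.

```python
import sys

try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7: assumes the importlib_metadata backport is installed.
    from importlib_metadata import version, PackageNotFoundError

# Versions stated in this README; the distribution names are assumptions.
EXPECTED = {"torch": "1.11.0", "gym": "0.23.1", "mujoco-py": "2.1.2.14"}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (wanted, found)} for packages that differ from `expected`."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu113" on PyTorch wheels.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "(paper used 3.7)")
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

A mismatch does not necessarily mean the code will fail, only that results may differ from those reported in the paper.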
We have uploaded pretrained dynamics models in SCAS_dynamics/ to facilitate experiment reproduction.
You can also pretrain dynamics models by running:
./run_pretrain.sh
The SCAS algorithm can be trained by running:
./run_experiments.sh
This codebase logs training runs with TensorBoard. You can view saved runs with:
tensorboard --logdir <run_dir>
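If it is unclear which directory to pass as <run_dir>, the sketch below lists every directory under a given root that contains TensorBoard event files. It depends only on the standard TensorBoard file naming (`events.out.tfevents.*`); the function name `find_run_dirs` is hypothetical, and where this codebase actually writes its logs is not assumed here.

```python
import os


def find_run_dirs(root):
    """Return sorted directories under `root` that contain TensorBoard event files."""
    runs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # TensorBoard writers name their files "events.out.tfevents.<timestamp>...".
        if any(f.startswith("events.out.tfevents.") for f in filenames):
            runs.append(dirpath)
    return sorted(runs)


if __name__ == "__main__":
    # Search from the current directory; adjust the root as needed.
    for run_dir in find_run_dirs("."):
        print(run_dir)
```

Any directory it prints can be passed to the tensorboard command above, and passing a common parent directory shows all runs beneath it together.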
If you find this work useful, please consider citing:
@article{mao2024offline,
  title={Offline Reinforcement Learning with {OOD} State Correction and {OOD} Action Suppression},
  author={Mao, Yixiu and Wang, Qi and Chen, Chen and Qu, Yun and Ji, Xiangyang},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={93568--93601},
  year={2024}
}