This repository is the official TensorFlow implementation of "Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation".
Jiaxuan You*, Bowen Liu*, Rex Ying, Vijay Pande, Jure Leskovec, Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
- Install rdkit. Please refer to the official website for further details; using Anaconda is recommended:
conda create -c rdkit -n my-rdkit-env rdkit
- Install mpi4py, networkx:
conda install mpi4py
pip install networkx==1.11
- Install OpenAI baseline dependencies:
cd rl-baselines
pip install -e .
- Install customized molecule gym environment:
cd gym-molecule
pip install -e .

There are 4 important files:
- run_molecule.py is the main code for running the program. You may tune all kinds of hyper-parameters there.
- The molecule environment code is in gym-molecule/gym_molecule/envs/molecule.py.
- RL related code is in the rl-baselines/baselines/ppo1 folder: gcn_policy.py is the GCN policy network; pposgd_simple_gcn.py is the PPO algorithm specifically tuned for the GCN policy.
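Before launching a run, it may help to confirm that the packages from the installation steps are importable. The following is a minimal sketch and not part of the repository: the import names `baselines` and `gym_molecule` are assumptions inferred from the folder paths above, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import names assumed from the install steps and folder paths in this README.
REQUIRED = ["rdkit", "mpi4py", "networkx", "baselines", "gym_molecule"]


def missing_modules(names):
    """Return the names that cannot be located as importable top-level modules."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for modules in an inconsistent state
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the conda environment created above; any name it reports points back to the corresponding install step.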
- single process run
python run_molecule.py
- multiple processes run
mpirun -np 8 python run_molecule.py 2>/dev/null
2>/dev/null will hide the warning info provided by the rdkit package.
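If a run is launched from a Python script instead of the shell, the same redirection can be reproduced with the standard `subprocess` module. This is only a sketch of what `2>/dev/null` does: `run_quietly` is a hypothetical helper, and the demo command is a stand-in child process, not the training script.

```python
import subprocess
import sys


def run_quietly(command):
    """Run a command, keep its standard output, and discard standard error."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null in the shell
        universal_newlines=True,    # decode stdout as text
        check=False,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Stand-in child process that writes one line to each stream.
    demo = [
        sys.executable,
        "-c",
        "import sys; print('kept'); print('hidden', file=sys.stderr)",
    ]
    code, out = run_quietly(demo)
    print(code, out.strip())
```

To apply this to training, the command list would be `["mpirun", "-np", "8", "python", "run_molecule.py"]`. Note that discarding standard error also hides genuine error messages, so it is worth dropping the redirection when debugging.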
We highly recommend using tensorboard to monitor the training process. To do this, you may run
tensorboard --logdir runs
All the molecules generated during training are stored in the molecule_gen folder; each run configuration is stored in a different CSV file. Molecules are stored as SMILES strings, along with the scores of the desired properties.
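The generated CSV files can be inspected with a few lines of Python. The sketch below makes an assumption that this README does not confirm: each row holds the SMILES string in the first column, followed by numeric property scores. Check an actual file in molecule_gen and adjust the column handling if the layout differs. The function names are hypothetical.

```python
import csv


def load_molecules(csv_path):
    """Read rows as (smiles, scores), assuming SMILES first and numbers after."""
    molecules = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            smiles = row[0].strip()
            scores = []
            for cell in row[1:]:
                try:
                    scores.append(float(cell))
                except ValueError:
                    pass  # ignore cells that are not numeric
            molecules.append((smiles, scores))
    return molecules


def top_by_score(molecules, score_index=0, k=5):
    """Return the k molecules with the highest value in the chosen score column."""
    scored = [m for m in molecules if len(m[1]) > score_index]
    return sorted(scored, key=lambda m: m[1][score_index], reverse=True)[:k]
```

For example, `top_by_score(load_molecules(path), score_index=0, k=10)` would list the ten best molecules by the first score column, where `path` points to one of the CSV files in molecule_gen.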