This project addresses the problems of return time prediction and event type prediction. To solve them, we propose a continuous-time convolutional model (COTIC).
All experiments were conducted on a single Nvidia A40 GPU.
The model has up to 6M parameters in the default setup. Training time per epoch ranged from 1 second for a small dataset (MIMIC, 285 sequences of length 9) to 900 seconds for large long-sequence datasets (Transactions, 30k sequences of length 900).
- COTIC
- Retweet
- Amazon
- SO
- Transactions
- MIMIC
- MemeTrack
The datasets are taken from a cloud drive.
Install dependencies
# clone project
git clone git@github.com:VladislavZh/COTIC.git
cd COTIC
python -m venv .venv
source .venv/bin/activate
# install pytorch according to instructions
# https://pytorch.org/get-started/
# install requirements
pip install -r requirements.txt
Train model with default configuration
# train on CPU
python train.py name=[name] dataset=[dataset] num_types=[num_types]
# train on GPU
python train.py trainer=gpu name=[name] dataset=[dataset] num_types=[num_types]
The data path should be as follows: data/[dataset]/[train/val/test], with CSV sequence files.
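Before launching training, it can help to confirm that the data directory matches the layout described above. The following is a minimal sketch, not part of the repository: the helper name `check_dataset_layout` and the dataset name `my_dataset` are hypothetical, and the sketch assumes only that each split directory holds files with a `.csv` extension.

```python
from pathlib import Path
import tempfile


def check_dataset_layout(data_root, dataset, splits=("train", "val", "test")):
    """Count CSV files per split under data_root/dataset/split.

    Hypothetical helper: raises FileNotFoundError if a split directory
    is missing, otherwise returns {split_name: number_of_csv_files}.
    """
    counts = {}
    for split in splits:
        split_dir = Path(data_root) / dataset / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Only the file extension is checked; file contents are not parsed.
        counts[split] = len(list(split_dir.glob("*.csv")))
    return counts


if __name__ == "__main__":
    # Build a throwaway layout to demonstrate the check.
    with tempfile.TemporaryDirectory() as tmp:
        for split in ("train", "val", "test"):
            split_dir = Path(tmp) / "my_dataset" / split
            split_dir.mkdir(parents=True)
            (split_dir / "0.csv").write_text("placeholder\n")
        print(check_dataset_layout(tmp, "my_dataset"))
```

For a real run, the call would presumably be `check_dataset_layout("data", "[dataset]")` with the same dataset name that is passed to `train.py`.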
You can override any parameter from the command line like this:
python train.py name=[name] dataset=[dataset] num_types=[num_types] trainer.max_epochs=20 datamodule.batch_size=64
The directory structure of the project looks like this:
├── configs <- Hydra configuration files
│ ├── callbacks <- Callbacks configs
│ ├── datamodule <- Datamodule configs
│ ├── debug <- Debugging configs
│ ├── experiment <- Experiment configs
│ ├── hparams_search <- Hyperparameter search configs
│ ├── local <- Local configs
│ ├── log_dir <- Logging directory configs
│ ├── logger <- Logger configs
│ ├── model <- Model configs
│ ├── trainer <- Trainer configs
│ │
│ ├── test.yaml <- Main config for testing
│ ├── train.yaml <- Main config for training
│ ├── ...
├── data <- Project data
│
├── logs <- Logs generated by Hydra and PyTorch Lightning loggers
│
├── src <- Source code
│ ├── datamodules <- Lightning datamodules
│ ├── models <- Lightning models
│ ├── utils <- Utility scripts
│ │
│ ├── testing_pipeline.py
│ └── training_pipeline.py
│
├── tests <- Tests of any kind
│ ├── helpers <- A couple of testing utilities
│ ├── shell <- Shell/command based tests
│ └── unit <- Unit tests
│
├── test.py <- Run testing
├── train.py <- Run training
│
├── .env.example <- Template of the file for storing private environment variables
├── .gitignore <- List of files/folders ignored by git
├── .pre-commit-config.yaml <- Configuration of pre-commit hooks for code formatting
├── requirements.txt <- File for installing python dependencies
├── setup.cfg <- Configuration of linters and pytest
└── README.md
