Implementation of 'Variational Attention for Sequence to Sequence Models' in TensorFlow.
This package consists of three models, each organized into a separate folder:
- Deterministic encoder-decoder with deterministic attention (`ded_detAttn`)
- Variational encoder-decoder with deterministic attention (`ved_detAttn`)
- Variational encoder-decoder with variational attention (`ved_varAttn`)
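The variational attention model differs from deterministic attention in that the attention context vector is itself sampled from a learned Gaussian. A minimal TensorFlow 1.x sketch of that reparameterized sampling step (the function and argument names here are illustrative, not the repo's actual variables):

```python
import tensorflow as tf

def sample_context(context_mean, context_logvar):
    # Reparameterization trick: c = mu + sigma * eps with eps ~ N(0, I),
    # which keeps the sampling step differentiable w.r.t. mu and sigma.
    eps = tf.random_normal(tf.shape(context_mean))
    return context_mean + tf.exp(0.5 * context_logvar) * eps
```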
The proposed model and baselines have been evaluated on two experiments:
- Neural Question Generation with the SQuAD dataset
- Conversation Systems with the Cornell Movie Dialogue dataset
The data has been preprocessed, and the train-val-test split is provided in the `data/` directory.

The following packages are required:
- tensorflow-gpu==1.3.0
- Keras==2.0.8
- numpy==1.12.1
- pandas==0.22.0
- gensim==3.1.2
- nltk==3.2.3
- tqdm==4.19.1
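All of the above can be installed with pip in one shot, for example (assuming a CUDA setup compatible with tensorflow-gpu 1.3):

```
pip install tensorflow-gpu==1.3.0 Keras==2.0.8 numpy==1.12.1 pandas==0.22.0 gensim==3.1.2 nltk==3.2.3 tqdm==4.19.1
```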
- Generate the word2vec vectors, required for initializing the word embeddings, specifying the dataset:
```
python w2v_generator.py --dataset qgen
```
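Under the hood, this amounts to training gensim word vectors on the chosen corpus; a rough sketch of the equivalent gensim call (the corpus contents and the save path are placeholders, not the script's actual interface):

```python
from gensim.models import Word2Vec

# `sentences` stands in for the tokenized training corpus of the chosen dataset
sentences = [['where', 'is', 'the', 'eiffel', 'tower', '?']]
w2v_model = Word2Vec(sentences, size=300, min_count=1, workers=4)
w2v_model.save('w2v_qgen.model')  # hypothetical output path
```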
- Train the desired model after setting the configurations in its `model_config.py` file. For example:
```
cd ved_varAttn
vim model_config.py  # Make necessary edits
python train.py
```
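The config file exposes the usual hyperparameters; an illustrative (not verbatim) excerpt of the kinds of settings you would edit:

```python
# Illustrative values only -- consult the actual model_config.py for the real keys
config = {
    'lstm_hidden_units': 100,  # encoder/decoder hidden state size
    'embedding_size': 300,     # must match the word2vec dimensionality
    'latent_dim': 100,         # dimensionality of the latent variable
    'batch_size': 100,
    'n_epochs': 20,
}
```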
- The model checkpoints are stored in the `models/` directory, and the summaries for TensorBoard are stored in the `summary_logs/` directory. As training progresses, the metrics on the validation set are dumped into `log.txt` and the `bleu/` directory.
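Training can be monitored by pointing TensorBoard at the summary directory:

```
tensorboard --logdir summary_logs/
```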
- Evaluate the performance of the trained model. Refer to `predict.ipynb` to load the desired checkpoint, calculate performance metrics (BLEU and diversity score) on the test set, and generate sample outputs.
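For reference, corpus-level BLEU can be computed with the bundled nltk version along these lines (the variable contents are illustrative; the notebook's own helpers may differ):

```python
from nltk.translate.bleu_score import corpus_bleu

# One list of reference token lists per hypothesis
references = [[['what', 'is', 'the', 'capital', 'of', 'france', '?']]]
hypotheses = [['what', 'is', 'france', "'s", 'capital', '?']]
print(corpus_bleu(references, hypotheses))
```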