dsl-char-cnn

Character-level CNN model for the DSL 2016 shared task.

Requirements

Keras
TensorFlow. The code has not been tested with Theano, but it should work except for some imports at the beginning of the code.
A CUDA-capable GPU will make the code much faster, especially with cuDNN.

Usage

To train a model, go to the src dir and run:

python cnn_multifilter.py

This will train a model, save it to disk, and report some scores.

To test a model on raw texts, go to the src dir and run:

python predict_test_data_with_trained_model.py

This will create a file with predictions under data. It will also create a file with the posterior probabilities. See example files under data.
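The relationship between the two output files can be sketched as follows: each prediction is the label with the highest posterior probability. This is a minimal sketch, not the script's actual code. It assumes one line per test example with whitespace-separated probabilities; the function names and the labels list are hypothetical. Check the example files under data for the real format.

```python
def parse_posteriors(lines):
    """Parse one row of floats per non-empty line (assumed file format)."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:
            rows.append([float(x) for x in line.split()])
    return rows


def argmax_labels(rows, labels):
    """Return, for each row, the label with the highest probability."""
    predictions = []
    for row in rows:
        if len(row) != len(labels):
            raise ValueError(
                "expected %d probabilities, got %d" % (len(labels), len(row))
            )
        best = max(range(len(row)), key=row.__getitem__)
        predictions.append(labels[best])
    return predictions


# Hypothetical labels and probabilities, for illustration only.
labels = ["bs", "hr", "sr"]
example_lines = ["0.1 0.7 0.2", "0.6 0.3 0.1"]
print(argmax_labels(parse_posteriors(example_lines), labels))
```

The length check is there because a mismatch between the number of columns and the number of labels would otherwise silently produce wrong predictions.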

Note: file names are currently hard-coded in several places (e.g. model files in the train and test scripts, and data files in data.py).
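One possible way to lift the hard-coded file names into command-line options is sketched below with argparse from the standard library. This is not part of the current code: the option names and default paths are hypothetical placeholders, and the actual names used in the scripts would need to be substituted.

```python
import argparse


def build_parser():
    """Build a parser whose defaults stand in for the hard-coded file names."""
    parser = argparse.ArgumentParser(description="Predict with a trained model.")
    # All option names and default paths below are hypothetical placeholders.
    parser.add_argument("--model-file", default="model.h5",
                        help="path to the saved model")
    parser.add_argument("--test-file", default="../data/test.txt",
                        help="path to the raw test texts")
    parser.add_argument("--pred-file", default="../data/test.pred",
                        help="where to write the predictions")
    return parser


# Parse an explicit argument list here so the example does not depend on sys.argv.
args = build_parser().parse_args(["--model-file", "my_model.h5"])
print(args.model_file, args.test_file, args.pred_file)
```

Keeping the current file names as defaults would leave the existing commands working unchanged while allowing overrides.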

Citing

If you use this code in your work, please consider citing our paper: "A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects", Yonatan Belinkov and James Glass, VarDial 2016.

@InProceedings{belinkov-glass:2016:VarDial,
  author    = {Belinkov, Yonatan  and  Glass, James},
  title     = {A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects},
  booktitle = {Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)},
  month     = {December},
  year      = {2016},
  address   = {Osaka, Japan}
}

TODO

Clean code

Acknowledgements

This implementation uses code from François Chollet's IMDB CNN example.
