While PyData Amsterdam 2024 was not that interesting, I wrote a kind-of word2vec (CBOW) as I understood it. I didn't care about anything except killing some time and training a small NN on an M2 Pro.
- Run `data.py` to preprocess `hp.txt` into a vocabulary plus word-to-index and index-to-word mappings
- Run `train.py` to start a far-from-optimal training loop
- Run `run.py` like `python run.py 'harry+ron-hermione'` to get the top-5 words that are closest in the learned space
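
For orientation, here is a rough sketch of what the first and last steps boil down to: building the two lookup tables, then turning a query like `'harry+ron-hermione'` into a vector sum and ranking words by cosine similarity. This is not the code from `data.py` or `run.py`. The function names (`build_vocab`, `parse_query`, `top_k`), the tokenization regex, the choice to exclude the query words from the results, and the tiny hand-made embeddings are all assumptions made for illustration, and it uses plain Python lists instead of whatever tensor library the real scripts use.

```python
import math
import re


def build_vocab(text):
    """Map each distinct lowercased word to an index and back (assumed tokenization)."""
    words = re.findall(r"[a-z']+", text.lower())
    index_to_word = sorted(set(words))
    word_to_index = {w: i for i, w in enumerate(index_to_word)}
    return word_to_index, index_to_word


def parse_query(query):
    """Split 'harry+ron-hermione' into signed terms: +harry, +ron, -hermione."""
    terms = []
    for sign, word in re.findall(r"([+-]?)\s*([^+\-\s]+)", query):
        terms.append((-1.0 if sign == "-" else 1.0, word.lower()))
    return terms


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, embeddings, word_to_index, index_to_word, k=5):
    """Return the k words whose embeddings are closest to the signed sum of the query."""
    terms = parse_query(query)
    target = [0.0] * len(embeddings[0])
    for sign, word in terms:
        vec = embeddings[word_to_index[word]]  # raises KeyError for unknown words
        target = [t + sign * v for t, v in zip(target, vec)]
    query_words = {word for _, word in terms}
    scored = [
        (cosine(target, embeddings[i]), word)
        for i, word in enumerate(index_to_word)
        if word not in query_words  # assumption: do not return the query words themselves
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


if __name__ == "__main__":
    word_to_index, index_to_word = build_vocab("Harry met Ron. Ron met Hermione.")
    # Hand-made 2-d vectors standing in for trained embeddings, one per vocabulary index.
    toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.1]]
    print(top_k("harry-hermione", toy_embeddings, word_to_index, index_to_word, k=1))
```

With real trained embeddings, the only thing that changes is where `embeddings` comes from: one row per vocabulary index, in the same order as the index-to-word mapping.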