Closed
Labels: usage (General spaCy usage)
Description
Thanks for an amazing library!
I have run into a small pain point. My workflow is the following:
- Parse a sentence
- Remove stop words
- Compute distance between two sentences
In this workflow I have run into a couple of problems:
- `Lemma` does not have a `text` property like `Token` does. It does have a `lower_` property which is equivalent, so maybe it could just be aliased.
- There is no simple way to obtain the vector of a lemma given a `Token`. I used `nlp.vocab[token.lemma].vector` as a proxy, but I think it could be cleaner. I think it makes a lot of sense for people to use the lemma's vector rather than the token's.
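To make the workflow concrete, here is a minimal sketch of what I am doing. The only spaCy-style lookups it relies on are `token.is_stop`, `token.lemma` and `vocab[token.lemma].vector`. The helper names (`lemma_vector`, `mean_lemma_vector`, `cosine_distance`) are my own and not part of spaCy, and averaging the lemma vectors before taking a cosine distance is just one assumed way to compare two sentences. The toy vocab and tokens at the bottom are made-up stand-ins so the snippet runs without a model installed.

```python
import math
from types import SimpleNamespace


def lemma_vector(token, vocab):
    """Look up the vector stored in the vocab for the token's lemma.

    This is the proxy from the issue: vocab[token.lemma].vector
    """
    return vocab[token.lemma].vector


def mean_lemma_vector(tokens, vocab):
    """Average the lemma vectors of all tokens that are not stop words."""
    vectors = [list(lemma_vector(t, vocab)) for t in tokens if not t.is_stop]
    if not vectors:
        raise ValueError("no tokens left after stop-word removal")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


# Toy stand-ins (hypothetical data, not real spaCy objects or vectors).
toy_vocab = {
    "the": SimpleNamespace(vector=[0.5, 0.5]),
    "dog": SimpleNamespace(vector=[0.0, 1.0]),
    "run": SimpleNamespace(vector=[1.0, 0.0]),
}


def toy_token(lemma, is_stop=False):
    """Build a stand-in object exposing only .lemma and .is_stop."""
    return SimpleNamespace(lemma=lemma, is_stop=is_stop)


sentence_a = [toy_token("the", is_stop=True), toy_token("dog"), toy_token("run")]
sentence_b = [toy_token("dog"), toy_token("run")]

distance = cosine_distance(
    mean_lemma_vector(sentence_a, toy_vocab),
    mean_lemma_vector(sentence_b, toy_vocab),
)
print(distance)
```

With a real model, `tokens` would be the tokens of a parsed sentence and `vocab` would be `nlp.vocab`. A shortcut on `Token` for the lemma's vector would make `lemma_vector` unnecessary.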
I hope you find this useful in improving the future API. I would be willing to contribute if there is interest.
## Info about spaCy
* **spaCy version:** 1.7.3
* **Platform:** Darwin-16.5.0-x86_64-i386-64bit
* **Python version:** 3.6.1
* **Installed models:** en, en_core_web_md
## Info about model en_core_web_md
* **lang:** en
* **name:** core_web_md
* **license:** CC BY-SA 3.0
* **author:** Explosion AI
* **url:** https://explosion.ai
* **version:** 1.2.1
* **spacy_version:** >=1.7.0,<2.0.0
* **email:** [email protected]
* **description:** General-purpose English model, with tagging, parsing, entities and word vectors
* **source:** /Users/miroslav.zoricak/.local/share/virtualenvs/intent-classification-tNT6Vqf2/lib/python3.6/site-packages/en_core_web_md/en_core_web_md-1.2.1