v1.3.0: Improve API consistency
✨ Major features and improvements
- Add `Span.sentiment` attribute.
- #658: Add `Span.noun_chunks` iterator (thanks @pokey).
- #642: Let `--data-path` be specified when running download.py scripts (thanks @ExplodingCabbage).
- #638: Add German stopwords (thanks @souravsingh).
- #614: Fix `PhraseMatcher` to work with new `Matcher` (thanks @sadovnychyi).
🔴 Bug fixes
- Fix issue #605: `accept` argument to `Matcher` now rejects matches as expected.
- Fix issue #617: `Vocab.load()` now works with string paths, as well as `Path` objects.
- Fix issue #639: Stop words in `Language` class now used as expected.
- Fix issues #656, #624: `Tokenizer` special-case rules now support arbitrary token attributes.
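
To illustrate the kind of change behind the #617 fix, here is a minimal sketch of how a loader can accept both string paths and `Path` objects. This is not spaCy's actual implementation: the helper names `ensure_path` and `lexemes_location`, and the `vocab/lexemes.bin` layout, are hypothetical and chosen only for illustration.

```python
from pathlib import Path
from typing import Union


def ensure_path(path: Union[str, Path]) -> Path:
    """Hypothetical helper: return a Path whether given a str or a Path."""
    return path if isinstance(path, Path) else Path(path)


def lexemes_location(data_dir: Union[str, Path]) -> Path:
    """Hypothetical loader step: build a file location under data_dir.

    Normalizing first means the "/" operator works for both input types;
    the directory and file names here are assumptions, not spaCy's layout.
    """
    return ensure_path(data_dir) / "vocab" / "lexemes.bin"


# Both call styles resolve to the same location.
from_str = lexemes_location("data/en")
from_path = lexemes_location(Path("data/en"))
```

Normalizing once at the entry point keeps the rest of the loading code free of type checks.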
📖 Documentation and examples
- Add "Customizing the tokenizer" workflow.
- Add "Training the tagger, parser and entity recognizer" workflow.
- Add "Entity recognition" workflow.
- Fix various typos and inconsistencies.
👥 Contributors
Thanks to @pokey, @ExplodingCabbage, @souravsingh, @sadovnychyi, @manojsakhwar, @TiagoMRodrigues, @savkov, @pspiegelhalter, @chenb67, @kylepjohnson, @YanhaoYang, @tjrileywisc, @dechov, @wjt, @jsmootiv and @blarghmatey for the pull requests!