enable argument for spacy.load()
#10784
svlandeg
left a comment
I'd love to hear thoughts and reviews from other colleagues as well, but in principle I think this will be really useful. It makes the code & docs slightly more complex, but it will have a huge impact on the many use cases where you really just want one or two components and want to disable everything else. It feels like such a common use case that we might as well cater for it...
…del with added pipes for tests.
…ecification of component activity. Test refactoring. Added to default config.
e6acdd2 to 9ae4189
The Python 3.8 mac build includes some additional CLI tests that aren't run in the other builds (just to avoid excessive downloads/time), so these failures do matter, and aren't particular to Python 3.8 / macOS.
I didn't realize that the plan was to add this setting to the config? I thought this was just adding new keyword arguments to
Edited to add: I don't think it makes sense to have two conflicting features in the config like
Adding it to the config isn't necessary. It was my assumption we'd want to handle
Agreed with Adriane's points!
Removed
…ation_status() to allow non-standard pipes.
…t_activation_status() to allow non-standard pipes. Extended tests.
Removed
Co-authored-by: Adriane Boyd <[email protected]>
Co-authored-by: Adriane Boyd <[email protected]>
Co-authored-by: Adriane Boyd <[email protected]>
…tatus() at 80 chars.
I'm kind of late here, but I agree this would be a great feature to have - for short bits of code this can be clearer and more succinct than disable.
svlandeg
left a comment
Looks good to me! Just had a few more minor comments but should otherwise be good to merge :-)
Co-authored-by: Sofie Van Landeghem <[email protected]>
…spaCy into feat/inclusive-spacy-load-flags
Nice work, Raphael!
* Add "Aim-spaCy" to spaCy Universe (#10943) * Add Aim-spaCy to spaCy universe * Update Aim thumbnail * Fix author links Co-authored-by: Paul O'Leary McCann <[email protected]> * Auto-format code with black (#10945) Co-authored-by: explosion-bot <[email protected]> * precomputable_biaffine: avoid concatenation (#10911) The `forward` of `precomputable_biaffine` performs matrix multiplication and then `vstack`s the result with padding. This creates a temporary array used for the output of matrix concatenation. This change avoids the temporary by pre-allocating an array that is large enough for the output of matrix multiplication plus padding and fills the array in-place. This gave me a small speedup (a bit over 100 WPS) on de_core_news_lg on M1 Max (after changing thinc-apple-ops to support in-place gemm as BLIS does). * Add failing test: `test_matcher_extension_in_set_predicate` (#10948) * vectors: remove use of float as row number (#10955) The float -1 was returned rather than the integer -1 as the row for unknown keys. This doesn't introduce a realy bug, since such floats cast (without issues) to int in the conversion to NumPy arrays. Still, it's nice to to do the correct thing :). * Update for CBlas changes in Thinc 8.1.0.dev2 (#10970) * Workaround for Typer optional default values with Python calls (#10788) * Workaround for Typer optional default values with Python calls: added test and workaround. * @rmitsch Workaround for Typer optional default values with Python calls: reverting some black formatting changes. Co-authored-by: Sofie Van Landeghem <[email protected]> * @rmitsch Workaround for Typer optional default values with Python calls: removing return type hint. Co-authored-by: Sofie Van Landeghem <[email protected]> * Workaround for Typer optional default values with Python calls: fixed imports, added GitHub issue marker. 
* Workaround for Typer optional default values with Python calls: removed forcing of default values for optional arguments in init_config_cli(). Added default values for init_config(). Synchronized default values for init_config_cli() and init_config(). * Workaround for Typer optional default values with Python calls: removed unused import. * Workaround for Typer optional default values with Python calls: fixed usage of optimize in init_config_cli(). * Workaround for Typer optional default values with Pythhon calls: remove output_file from InitDefaultValues. * Workaround for Typer optional default values with Python calls: rename class for default init values. * Workaround for Typer optional default values with Python calls: remove newline. * remove introduced newlines * Remove test_init_config_from_python_without_optional_args(). * remove leftover import * reformat import * remove duplicate Co-authored-by: Sofie Van Landeghem <[email protected]> * Made _initialize_X() methods private. (#10978) * Auto-format code with black (#10977) Co-authored-by: explosion-bot <[email protected]> * account for NER labels with a hyphen in the name (#10960) * account for NER labels with a hyphen in the name * cleanup * fix docstring * add return type to helper method * shorter method and few more occurrences * user helper method across repo * fix circular import * partial revert to avoid circular import * `enable` argument for spacy.load() (#10784) * Enable flag on spacy.load: foundation for include, enable arguments. * Enable flag on spacy.load: fixed tests. * Enable flag on spacy.load: switched from pretrained model to empty model with added pipes for tests. * Enable flag on spacy.load: switched to more consistent error on misspecification of component activity. Test refactoring. Added to default config. * Enable flag on spacy.load: added support for fields not in pipeline. * Enable flag on spacy.load: removed serialization fields from supported fields. 
* Enable flag on spacy.load: removed 'enable' from config again. * Enable flag on spacy.load: relaxed checks in _resolve_component_activation_status() to allow non-standard pipes. * Enable flag on spacy.load: fixed relaxed checks for _resolve_component_activation_status() to allow non-standard pipes. Extended tests. * Enable flag on spacy.load: comments w.r.t. resolution workarounds. * Enable flag on spacy.load: remove include fields. Update website docs. * Enable flag on spacy.load: updates w.r.t. changes in master. * Implement Doc.from_json(): update docstrings. Co-authored-by: Adriane Boyd <[email protected]> * Implement Doc.from_json(): remove newline. Co-authored-by: Adriane Boyd <[email protected]> * Implement Doc.from_json(): change error message for E1038. Co-authored-by: Adriane Boyd <[email protected]> * Enable flag on spacy.load: wrapped docstring for _resolve_component_status() at 80 chars. * Enable flag on spacy.load: changed exmples for enable flag. * Remove newline. Co-authored-by: Sofie Van Landeghem <[email protected]> * Fix docstring for Language._resolve_component_status(). * Rename E1038 to E1042. Co-authored-by: Adriane Boyd <[email protected]> Co-authored-by: Sofie Van Landeghem <[email protected]> * add counts to verbose list of NER labels (#10957) * Update linguistic-features.md (#10993) Change link for downloading fasttext word vectors * Use thinc-apple-ops>=0.1.0.dev0 with `apple` extras (#10904) * Use thinc-apple-ops>=0.1.0.dev0 with `apple` extras Also test with thinc-apple-ops that is at least 0.1.0.dev0. 
* Check thinc-apple-ops on macOS with Python 3.10 Co-authored-by: Adriane Boyd <[email protected]> * Use `pip install --pre` for installing thinc-apple-ops in CI Co-authored-by: Adriane Boyd <[email protected]> Co-authored-by: Gor Arakelyan <[email protected]> Co-authored-by: Paul O'Leary McCann <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: explosion-bot <[email protected]> Co-authored-by: Madeesh Kannan <[email protected]> Co-authored-by: Raphael Mitsch <[email protected]> Co-authored-by: Sofie Van Landeghem <[email protected]> Co-authored-by: Adriane Boyd <[email protected]> Co-authored-by: Victoria <[email protected]>
Goals
Introduce an `enable` argument to `spacy.load()`, behaving analogously to `disable`.
Description
`enable` takes precedence over its negative counterpart, `disable`. I.e., if `enable` is set, `disable` has to be consistent with it; otherwise an error is raised. Internally, `enable` is resolved to `disable` by disabling those components in the pipeline that are not part of `enable`.
Example:
This disables all components other than the tagger and the parser.
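The resolution described above can be sketched in plain Python. This is only an illustration, not the code from this PR: the helper name `resolve_enable_to_disable`, the component names in `pipeline`, and the reading of "consistent" as "`disable` names exactly the components that `enable` leaves out" are all assumptions.

```python
from typing import Iterable, List, Optional


def resolve_enable_to_disable(
    pipe_names: Iterable[str],
    enable: Optional[Iterable[str]] = None,
    disable: Optional[Iterable[str]] = None,
) -> List[str]:
    """Hypothetical helper: return the names of the components to disable."""
    pipe_names = list(pipe_names)
    disable = list(disable) if disable is not None else []

    if enable is None:
        # Without `enable`, `disable` is used exactly as given.
        return disable

    enable_set = set(enable)
    # Every pipeline component that is not explicitly enabled gets disabled.
    to_disable = [name for name in pipe_names if name not in enable_set]

    # Assumed consistency rule: if `disable` is also given, it must name
    # the same components that `enable` implies.
    if disable and set(disable) != set(to_disable):
        raise ValueError(
            f"Inconsistent values: enable={sorted(enable_set)} implies "
            f"disable={to_disable}, but disable={disable} was passed."
        )
    return to_disable


# Hypothetical pipeline; the component names are illustrative only.
pipeline = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]

# Keep only the tagger and the parser active.
disabled = resolve_enable_to_disable(pipeline, enable=["tagger", "parser"])
print(disabled)
```

With spaCy itself, the corresponding call would presumably look like `spacy.load("en_core_web_sm", enable=["tagger", "parser"])`, where the model name is an assumption and not taken from this PR.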
Types of change
New feature, new tests.
Checklist