Commit 3c76d52
* Update `TextCatBOW` to use the fixed `SparseLinear` layer
A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754
This change updates `TextCatBOW` to `v3`, which uses the new
`SparseLinear_v2` layer. In testing, this gave a sizeable improvement
on a text categorization task.
While at it, `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`, but the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.
* Replace the `TextCatBOW` `length_exponent` parameter with `length`
We now round up the length to the next power of two if it isn't
a power of two.
* Remove some tests for TextCatBOW.v2
* Fix missing import
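
The rounding described for `length` above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `next_power_of_two` is a hypothetical helper name, and the actual implementation may differ.

```python
def next_power_of_two(length: int) -> int:
    """Round `length` up to the nearest power of two.

    Hypothetical helper for illustration; not taken from the commit.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    # (length - 1).bit_length() is the number of bits needed to represent
    # length - 1, so shifting 1 left by that amount yields the smallest
    # power of two that is greater than or equal to length.
    return 1 << (length - 1).bit_length()


# A value that is already a power of two is returned unchanged;
# anything else is rounded up.
for requested in (1, 3, 262144, 262145):
    print(requested, "->", next_power_of_two(requested))
```

With this approach a requested length that is already a power of two is left as-is, so only other values are changed.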
1 parent 8b35824 commit 3c76d52
File tree
3 files changed (+44, -7)
- spacy
- tests/pipeline
- website/docs/api