Dependencies not deprojectivized in spaCy 1.7 #898

@adam-ra

I've noticed a suspiciously large number of evident parser errors after migrating from spaCy 1.6.0 with the generic ‘en’ model to 1.7.2 with ‘en_depent_web_md’.

Environment: Python 3.4.3 / 3.5.1 on 64-bit Linux (Ubuntu).

Some examples are below (note the abundance of proper noun tags in the 1.7.2 output).

1.6.0: pain in lower back

 0) ROOT    pain                    NOUN  NN    ROOT
 1) prep   ---- in                  ADP   IN    prep
 3) pobj   -------- back            NOUN  NN    pobj
 2) amod   ------------ low         ADJ   JJR   amod

1.7.2: pain in lower back

 3) ROOT    back                    PROPN NNP   ROOT
 0) nsubj  ---- pain                NOUN  NN    nsubj
 1) nmod   ---- in                  X     XX    nmod
 2) compound ---- lower             PROPN NNP   compound

1.6.0: I feel pain in lower back

 1) ROOT    feel                    VERB  VBP   ROOT
 0) nsubj  ---- i                   PRON  PRP   nsubj
 2) dobj   ---- pain                NOUN  NN    dobj
 3) prep   -------- in              ADP   IN    prep
 5) pobj   ------------ back        NOUN  NN    pobj
 4) amod   ---------------- low     ADJ   JJR   amod

1.7.2: I feel pain in lower back

 1) ROOT    feel                    VERB  VBP   ROOT
 0) nsubj  ---- i                   NOUN  NN    nsubj
 2) dobj   ---- pain                NOUN  NN    dobj
 3) prep   -------- in              ADP   IN    prep
 5) pobj   ------------ back        PROPN NNP   pobj
 4) compound ---------------- lower PROPN NNP   compound

1.6.0: sores on my dick

 0) ROOT    sore                    NOUN  NNS   ROOT
 1) prep   ---- on                  ADP   IN    prep
 3) pobj   -------- dick            NOUN  NN    pobj
 2) poss   ------------ my          ADJ   PRP$  poss

1.7.2: sores on my dick

 0) ROOT    sore                    NOUN  NN    ROOT
 1) prep   ---- on                  ADP   IN    prep
 3) pobj   -------- dick            PROPN NNP   pobj
 2) compound ------------ my        PROPN NNP   compound
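
For anyone who wants to reproduce listings like the ones above, here is a minimal sketch of a depth-first dump of a dependency tree. It is not the script used for this report. The `Tok` record and `format_tree` are hypothetical names, and the head indices in `EXAMPLE` are inferred from the indentation of the 1.6.0 listing for "pain in lower back". With spaCy, the fields would presumably be filled from `token.i`, `token.head.i`, `token.dep_`, `token.lemma_`, `token.pos_` and `token.tag_`. The column alignment is simplified and the repeated dependency label in the last column is omitted.

```python
from collections import namedtuple

# Hypothetical record type (an assumption, not part of spaCy).
# head is the index of the governing token; the root points at itself,
# which mirrors how spaCy represents the root (token.head is token).
Tok = namedtuple("Tok", "i head dep text pos tag")


def format_tree(tokens):
    """Return the lines of a depth-first dump of a dependency tree."""
    children = {}
    root = None
    for tok in tokens:
        if tok.head == tok.i:
            root = tok
        else:
            children.setdefault(tok.head, []).append(tok)

    lines = []

    def visit(tok, depth):
        # Four dashes per level of depth, none for the root.
        indent = "-" * (4 * depth)
        fields = [tok.dep, indent, tok.text, tok.pos, tok.tag]
        lines.append("{:2d}) ".format(tok.i) + " ".join(f for f in fields if f))
        for child in children.get(tok.i, []):
            visit(child, depth + 1)

    if root is not None:
        visit(root, 0)
    return lines


# The 1.6.0 parse of "pain in lower back"; head indices inferred from the indentation.
EXAMPLE = [
    Tok(0, 0, "ROOT", "pain", "NOUN", "NN"),
    Tok(1, 0, "prep", "in", "ADP", "IN"),
    Tok(2, 3, "amod", "low", "ADJ", "JJR"),
    Tok(3, 1, "pobj", "back", "NOUN", "NN"),
]

if __name__ == "__main__":
    print("\n".join(format_tree(EXAMPLE)))
```

Running the same dump under both versions and diffing the output should make the tag and attachment changes easy to spot.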

Labels: bug
