[2.0] KeyError when attempting to train the dependency parser

I have encountered inconsistencies when parsing certain questions, and wanted to try updating the model to fix them. I referred to the NER-training sample and the 2.0 docs, and put together the following:

```
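import random
from functools import partial

import spacy
from spacy.gold import GoldParse

# Assumption: `nlp` was not defined in the original snippet; the 'en' model
# listed under "Info about spaCy" is loaded here.
nlp = spacy.load('en')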

def reformat_train_data(tokenizer, examples):
    """Reformat data to match JSON format"""
    output = []
    for text, deps, heads in examples:
        doc = tokenizer(text)
        ner_tags = [''] * len(doc)
        words = [w.text for w in doc]
        tags = ['-'] * len(doc)
        sentence = (range(len(doc)), words, tags, heads, deps, ner_tags)
        output.append((text, [(sentence, [])]))
    return output

training_data = [('How long does it run?',
                  ['advmod', 'advmod', 'aux', 'det', 'nsubj', 'ROOT'],
                  [1, 4, 4, 4, 4, 4]),
                 ('How high does it reach?',
                  ['advmod', 'advmod', 'aux', 'det', 'nsubj', 'ROOT'],
                  [1, 4, 4, 4, 4, 4])]

get_training_data = partial(reformat_train_data, nlp.tokenizer, training_data)

optimizer = nlp.begin_training(get_training_data)
for iteration in range(100):
    random.shuffle(training_data)
    for raw_text, deps, heads in training_data:
        doc = nlp.make_doc(raw_text)
        gold = GoldParse(deps=deps, heads=heads)
        nlp.update([doc], [gold], sgd=optimizer)
```
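
Separately from the crash, the annotations can be checked for internal consistency before training. The following is a minimal sketch, not part of spaCy: `check_annotation` is a hypothetical helper, the token list is written out by hand (an assumption about what the tokenizer produces), and it assumes the convention that the `ROOT` token is its own head, as `Token.head` behaves in spaCy.

```python
def check_annotation(words, deps, heads):
    """Return a list of human-readable problems found in one training example.

    Hypothetical helper: assumes heads are absolute token indices and that
    the ROOT token points to itself.
    """
    problems = []
    n = len(words)
    if len(deps) != n or len(heads) != n:
        problems.append(
            "length mismatch: %d words, %d deps, %d heads"
            % (n, len(deps), len(heads))
        )
        return problems
    for i, (dep, head) in enumerate(zip(deps, heads)):
        if not 0 <= head < n:
            problems.append(
                "token %d (%r): head %d is out of range" % (i, words[i], head)
            )
        elif dep == 'ROOT' and head != i:
            problems.append(
                "token %d (%r): labelled ROOT but head is %d" % (i, words[i], head)
            )
        elif dep != 'ROOT' and head == i:
            problems.append(
                "token %d (%r): is its own head but labelled %r" % (i, words[i], dep)
            )
    return problems


# Tokens written out by hand; an assumption about the tokenizer output.
words = ['How', 'long', 'does', 'it', 'run', '?']
deps = ['advmod', 'advmod', 'aux', 'det', 'nsubj', 'ROOT']
heads = [1, 4, 4, 4, 4, 4]

problems = check_annotation(words, deps, heads)
for problem in problems:
    print(problem)
```

Under these assumptions the check flags the last two tokens of the first example: `run` is its own head but is labelled `nsubj`, and `?` is labelled `ROOT` while pointing at token 4. This may or may not be related to the `KeyError`; it only indicates that the labels and heads might be shifted by one position.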

However, running this crashes with `KeyError: 13656873538139661788`. The error looks similar to https://github.com/explosion/spaCy/issues/1052, but the specific bug from that issue appears to have been fixed.

Is there something I'm doing wrong? Or does spaCy 2.0 currently not support training the dependency parser?

### Update

## Info about spaCy

* **spaCy version:** 2.0.0a9
* **Platform:** Linux-4.4.0-43-Microsoft-x86_64-with-debian-stretch-sid
* **Python version:** 3.6.1
* **Models:** en, en_default

## Full stack trace

```
KeyError                                  Traceback (most recent call last)
<ipython-input-2-d6926b2fc85f> in <module>()
     30 
     31 
---> 32 optimizer = nlp.begin_training(get_training_data)
     33 for iteration in range(100):
     34     random.shuffle(training_data)

/mnt/c/Users/notnami/projects/.../.spacy-nightly/lib/python3.6/site-packages/spacy/language.py in begin_training(self, get_gold_tuples, **cfg)
    367             if hasattr(proc, 'begin_training'):
    368                 context = proc.begin_training(get_gold_tuples(),
--> 369                                               pipeline=self.pipeline)
    370                 contexts.append(context)
    371         learn_rate = util.env_opt('learn_rate', 0.001)


/mnt/c/Users/notnami/projects/..../.spacy-nightly/lib/python3.6/site-packages/spacy/pipeline.pyx in spacy.pipeline.NeuralTagger.begin_training (spacy/pipeline.cpp:16293)()

/mnt/c/Users/notnami/projects/.../.spacy-nightly/lib/python3.6/site-packages/spacy/morphology.pyx in spacy.morphology.Morphology.__init__ (spacy/morphology.cpp:4655)()

/mnt/c/Users/notnami/projects/.../.spacy-nightly/lib/python3.6/site-packages/spacy/morphology.pyx in spacy.morphology.Morphology.add_special_case (spacy/morphology.cpp:5625)()

KeyError: 13656873538139661788
```

[2.0] KeyError when attempting to train the dependency parser #1268