Description
Hello! I have trained a NER model for the Armenian language using the ArmTDP dataset and the xlm-roberta-base model.
After that, I attempted to test the model using stanza.Pipeline:
import stanza
config = {
'processors': 'tokenize, ner',
'lang': 'hy',
'ner_model_path': '/Lab/Projects/ner/models/hy_armtdp_nertagger_bert_18.pt',
}
nlp = stanza.Pipeline(**config)
nlp("some text in Armenian")
With the same input text, I observed that the output was different each time I loaded the model.
There was no such problem when evaluating the model with Stanza's own training script. Whenever I run the following command, I get the same output:
python3 -m stanza.utils.training.run_ner hy_armtdp --score_test
What could be the cause of this problem?
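To make the symptom easier to reproduce, here is a minimal sketch of the check I have in mind: build the pipeline several times, run the same text through each fresh instance, and compare the recognized entities. The helper names (`extract_entities`, `compare_fresh_loads`, `is_stable`) are hypothetical and not part of Stanza; the sketch only assumes that the returned document exposes `ents` with `text` and `type` attributes, as Stanza documents do.

```python
from typing import Callable, List, Tuple

Entity = Tuple[str, str]


def extract_entities(doc) -> List[Entity]:
    # Reduce a processed document to comparable (text, type) pairs.
    return [(ent.text, ent.type) for ent in doc.ents]


def compare_fresh_loads(make_pipeline: Callable[[], Callable],
                        text: str,
                        runs: int = 3) -> List[List[Entity]]:
    # Build a new pipeline on every iteration so each run
    # goes through model loading again.
    results = []
    for _ in range(runs):
        nlp = make_pipeline()
        results.append(extract_entities(nlp(text)))
    return results


def is_stable(results: List[List[Entity]]) -> bool:
    # True when every run produced exactly the same entities.
    return all(r == results[0] for r in results)


# Usage with the config above (requires Stanza and the trained model):
# import stanza
# results = compare_fresh_loads(lambda: stanza.Pipeline(**config),
#                               "some text in Armenian")
# print(is_stable(results))
```

If `is_stable` returns False here but the `run_ner` evaluation stays deterministic, the difference would seem to come from how the pipeline loads the model rather than from the tagger itself, though that is only a guess on my part.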
Additionally, I have added data conversion and BERT code for Armenian in this pull request (trained model can be downloaded from this drive).
If this problem can be resolved, it would be great to integrate an NER model for Armenian into the main package.
Thanks!