The bug is similar to #2202.
I am trying to evaluate MLM perplexity (without training/finetuning) using RoBERTa with run_language_modeling.py (from the official examples). However, some weights seem to be reinitialized instead of being loaded from the pretrained RoBERTa checkpoint.
To Reproduce (with master branch):
import logging
logging.basicConfig(level=logging.INFO)  # surface transformers' weight-loading warnings

from transformers import RobertaForMaskedLM

# Loading the pretrained checkpoint triggers the warning below
_ = RobertaForMaskedLM.from_pretrained('roberta-base')
It gives the following warning message:
WARNING:transformers.modeling_utils:Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.embeddings.position_ids', 'lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
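For reference, the exact keys involved can be inspected by passing output_loading_info=True to from_pretrained, which also returns a dict of missing and unexpected keys; a minimal sketch:

from transformers import RobertaForMaskedLM

# loading_info lists checkpoint keys the model lacked and model keys the
# checkpoint lacked ('missing_keys' are the ones that get reinitialized)
model, loading_info = RobertaForMaskedLM.from_pretrained(
    'roberta-base', output_loading_info=True
)
print(loading_info['missing_keys'])     # e.g. ['lm_head.decoder.bias', ...]
print(loading_info['unexpected_keys'])  # checkpoint keys the model did not use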
The perplexities I get from direct evaluation on the WikiText-2/103 datasets are also much higher than those from the official RoBERTa implementation in fairseq. I suspect the reinitialized weights could be the reason.
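For context, here is a minimal sketch of the kind of masked-LM perplexity computation at issue (not the exact procedure of run_language_modeling.py): mask a token, compute the cross-entropy loss at the masked position, and exponentiate. The example sentence and masked position are arbitrary; only the model and tokenizer names come from the snippet above.

import torch
from transformers import RobertaTokenizer, RobertaForMaskedLM

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaForMaskedLM.from_pretrained('roberta-base')
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog.",
                   return_tensors='pt')

# Labels are -100 everywhere except the masked position, so the loss is
# the cross-entropy at that single position only.
labels = inputs['input_ids'].clone()
mask_pos = 4  # arbitrary token position for illustration
labels[0, :] = -100
labels[0, mask_pos] = inputs['input_ids'][0, mask_pos]
inputs['input_ids'][0, mask_pos] = tokenizer.mask_token_id

with torch.no_grad():
    outputs = model(**inputs, labels=labels)

# Perplexity is exp of the average masked-token loss
loss = outputs[0]
print(torch.exp(loss).item())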