Some weights not initialized in pre-trained RobertaForMaskedLM #6193

@HarshTrivedi

Description

The bug is similar to #2202.

I am trying to evaluate MLM perplexity (without any training/fine-tuning) with RoBERTa using run_language_modeling.py (from the official examples). However, some weights seem to be reinitialized instead of being loaded from the pretrained RoBERTa checkpoint.
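
For concreteness, the evaluation-only invocation looks roughly like the following. The data file path and output directory are placeholders, and the flags reflect my reading of the current example script rather than an exact copy of my command:

python examples/language-modeling/run_language_modeling.py \
    --model_type roberta \
    --model_name_or_path roberta-base \
    --do_eval \
    --eval_data_file path/to/wiki.test.tokens \
    --mlm \
    --output_dir /tmp/roberta-mlm-eval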

To Reproduce (with master branch):

import logging
logging.basicConfig(level=logging.INFO)  # show the library's logs about which weights were loaded
from transformers import RobertaForMaskedLM
_ = RobertaForMaskedLM.from_pretrained('roberta-base')

It gives the following warning message:

WARNING:transformers.modeling_utils:Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.embeddings.position_ids', 'lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
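
In case it helps with triage: my (possibly wrong) understanding is that roberta.embeddings.position_ids is just a registered buffer rather than a learned weight, and that lm_head.decoder.bias is meant to end up tied to lm_head.bias after loading. A quick check along those lines, assuming the attribute names implied by the warning above:

from transformers import RobertaForMaskedLM

model = RobertaForMaskedLM.from_pretrained('roberta-base')

# position_ids should be a (non-learned) buffer on the embeddings module
print('position_ids' in dict(model.roberta.embeddings.named_buffers()))

# if the decoder bias is tied to lm_head.bias, the "newly initialized" entry is harmless
print(model.lm_head.decoder.bias is model.lm_head.bias)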

The perplexities I get from direct evaluation on the WikiText-2/103 datasets are also much higher than those from the official RoBERTa implementation in fairseq. I suspect this could be the reason.
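
To clarify what I mean by MLM perplexity, here is a minimal sketch that scores a single masked position by hand (the sentence and masked position are arbitrary; run_language_modeling.py instead masks tokens at random over the whole evaluation set):

import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaForMaskedLM.from_pretrained('roberta-base')
model.eval()

encoding = tokenizer("Paris is the capital of France.", return_tensors='pt')
input_ids = encoding['input_ids']

# mask one token and score only that position (label -100 is ignored by the loss)
target_pos = 6  # arbitrary position inside the sentence
labels = torch.full_like(input_ids, -100)
labels[0, target_pos] = input_ids[0, target_pos]
masked_ids = input_ids.clone()
masked_ids[0, target_pos] = tokenizer.mask_token_id

with torch.no_grad():
    loss = model(input_ids=masked_ids, attention_mask=encoding['attention_mask'], labels=labels)[0]

print(torch.exp(loss).item())  # exp(cross-entropy) at the masked position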
