Is it expected the same sentence gives different features? #115

@towr

Description

I'm a bit puzzled by something I encountered while trying to encode sentences as embeddings: when I run the sentences through the model one at a time, I get slightly different results than when I run them in batches.

I've reduced an example down to:

from transformers import pipeline
import numpy as np

p = pipeline('feature-extraction', model='allenai/scibert_scivocab_uncased')
s = 'the scurvy dog walked home alone'.split()

for l in range(1, len(s) + 1):
    txt = ' '.join(s[:l])

    res1 = p(txt)            # single sentence, first run
    res2 = p(txt)            # single sentence, second run
    res1_2 = p([txt, txt])   # the same sentence twice, as one batch
    print(l, txt, len(res1[0]))
    # Compare run-to-run, single-vs-batch, and the two items within the batch.
    print(all(np.allclose(i, j) for i, j in zip(res1[0], res2[0])),
          all(np.allclose(i, j) for i, j in zip(res2[0], res1_2[0])),
          all(np.allclose(i, j) for i, j in zip(res1_2[0], res1_2[1])))

The output I get is:

1 the 3
True False True
2 the scurvy 6
True True True
3 the scurvy dog 7
True False False
4 the scurvy dog walked 9
True False True
5 the scurvy dog walked home 10
True True False
6 the scurvy dog walked home alone 11
True True True

So running a single sentence through the model seems to give the same output each time, but if I run a batch containing the same sentence twice, the results are sometimes different, both between the two outputs within the batch and compared to the single-sentence case.
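
The same comparison could presumably be run directly on the model, bypassing the pipeline's post-processing, to see whether the forward pass itself already produces the discrepancy. A minimal sketch, assuming AutoTokenizer/AutoModel with the same checkpoint (not something from the original snippet above):

import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
model = AutoModel.from_pretrained('allenai/scibert_scivocab_uncased')
model.eval()

txt = 'the scurvy dog'
with torch.no_grad():
    # One forward pass with a single sequence, one with the same sequence twice in a batch.
    single = model(**tok(txt, return_tensors='pt')).last_hidden_state
    batched = model(**tok([txt, txt], return_tensors='pt')).last_hidden_state

print(torch.allclose(single[0], batched[0]),   # single run vs first batch item
      torch.allclose(batched[0], batched[1]))  # first vs second batch item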

Is this expected/explainable?

For further context: I'm running this on CPU (a laptop), with Python 3.8.9 in a freshly installed venv.
The difference usually shows up in only a few indices of the embeddings and can be up to 1e-3. It is negligible when comparing the embeddings by cosine distance, but I'd like to understand where it comes from before dismissing it.
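
For reference, this is roughly how I compare the two outputs (a sketch reusing res1 and res1_2 from the snippet above; cosine_distance is just an inline numpy helper, not a library call):

a = np.array(res1[0])     # token embeddings from the single-sentence run
b = np.array(res1_2[0])   # token embeddings from the first item of the batched run

def cosine_distance(x, y):
    return 1.0 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

print(np.abs(a - b).max())                                # worst per-element difference, up to ~1e-3
print(max(cosine_distance(x, y) for x, y in zip(a, b)))   # per-token cosine distance, effectively 0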
