Commit 0a80959

Author: Amine Abdaoui

Add cards for all Geotrend models (#8617)

* docs(bert-base-15lang-cased): add model card
* add cards for all Geotrend models
* [model cards] fix language tag for all Geotrend models
1 parent dcc9c64 commit 0a80959

30 files changed: 1209 additions & 0 deletions

---
language: multilingual
datasets: wikipedia
license: apache-2.0
---

# bert-base-15lang-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

The measurements below have been computed on a [Google Cloud n1-standard-1 machine (1 vCPU, 3.75 GB)](https://cloud.google.com/compute/docs/machine-types#n1_machine_type):

| Model                           | Num parameters | Size   | Memory  | Loading time |
| ------------------------------- | -------------- | ------ | ------- | ------------ |
| bert-base-multilingual-cased    | 178 million    | 714 MB | 1400 MB | 4.2 sec      |
| Geotrend/bert-base-15lang-cased | 141 million    | 564 MB | 1098 MB | 3.1 sec      |

Handled languages: en, fr, es, de, zh, ar, ru, vi, el, bg, th, tr, hi, ur and sw.
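
As a quick sanity check on the table above, the relative savings can be computed directly from the reported figures (the numbers below are just arithmetic on the table, not new measurements):

```python
# Figures taken from the table above (mBERT vs. Geotrend/bert-base-15lang-cased).
mbert = {"params_millions": 178, "size_mb": 714, "memory_mb": 1400, "load_s": 4.2}
small = {"params_millions": 141, "size_mb": 564, "memory_mb": 1098, "load_s": 3.1}

for key in mbert:
    saving = 100 * (mbert[key] - small[key]) / mbert[key]
    print(f"{key}: {saving:.1f}% smaller")
```

In other words, dropping unused vocabulary entries shrinks parameters, disk size and memory each by roughly 21%, and loading time by about 26%.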

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-15lang-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-15lang-cased")
```
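
The snippet above only loads the model; to turn its token-level output (`last_hidden_state`) into a single sentence vector, a common recipe is mean pooling over the attention mask. The sketch below illustrates that pooling step on plain Python lists standing in for the model's output, so it runs without downloading any weights (`mean_pool` is an illustrative helper, not part of the transformers API):

```python
def mean_pool(hidden_states, attention_mask):
    """Average token vectors, ignoring padded positions.

    hidden_states: list of token vectors (one per position)
    attention_mask: list of 1s (real tokens) and 0s (padding), same length
    """
    dim = len(hidden_states[0])
    totals = [0.0] * dim
    count = 0
    for vec, keep in zip(hidden_states, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                totals[i] += value
    return [t / count for t in totals]

# Toy stand-in for model output: 4 token vectors (last one is padding), hidden size 3.
hidden = [[1.0, 2.0, 3.0],
          [3.0, 2.0, 1.0],
          [2.0, 2.0, 2.0],
          [9.0, 9.0, 9.0]]   # padding position, must be ignored
mask = [1, 1, 1, 0]
print(mean_pool(hidden, mask))  # [2.0, 2.0, 2.0]
```

With the real model, the same pooling would be applied to `model(**tokenizer(text, return_tensors="pt")).last_hidden_state` using the tokenizer's attention mask.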

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: ar
datasets: wikipedia
license: apache-2.0
---

# bert-base-ar-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-ar-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-ar-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: bg
datasets: wikipedia
license: apache-2.0
---

# bert-base-bg-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-bg-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-bg-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: de
datasets: wikipedia
license: apache-2.0
---

# bert-base-de-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-de-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-de-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: el
datasets: wikipedia
license: apache-2.0
---

# bert-base-el-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-el-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-el-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: multilingual
datasets: wikipedia
license: apache-2.0
---

# bert-base-en-ar-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-ar-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-en-ar-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: multilingual
datasets: wikipedia
license: apache-2.0
---

# bert-base-en-bg-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-bg-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-en-bg-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: en
datasets: wikipedia
license: apache-2.0
---

# bert-base-en-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-en-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.

---
language: multilingual
datasets: wikipedia
license: apache-2.0
---

# bert-base-en-de-cased

We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.

Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions produce exactly the same representations as the original model, which preserves the original accuracy.

For more information, please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-de-cased")
model = AutoModel.from_pretrained("Geotrend/bert-base-en-de-cased")
```

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact amine@geotrend.fr for any question, feedback or request.
