Commit 5d7c433

Merge pull request #1 from huggingface/master
resolve conflicts
2 parents (5cd7086 + 68d9251) · commit 5d7c433

89 files changed: 4,792 additions and 361 deletions

.circleci/config.yml

Lines changed: 11 additions & 11 deletions

@@ -99,7 +99,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_torch_and_tf_all:
 working_directory: ~/transformers
 docker:
@@ -169,7 +169,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_torch_and_flax_all:
 working_directory: ~/transformers
 docker:
@@ -237,7 +237,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_torch_all:
 working_directory: ~/transformers
 docker:
@@ -304,7 +304,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_tf_all:
 working_directory: ~/transformers
 docker:
@@ -370,7 +370,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_flax_all:
 working_directory: ~/transformers
 docker:
@@ -437,7 +437,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_pipelines_torch_all:
 working_directory: ~/transformers
 docker:
@@ -549,15 +549,15 @@ jobs:
 - v0.4-custom_tokenizers-{{ checksum "setup.py" }}
 - v0.4-{{ checksum "setup.py" }}
 - run: pip install --upgrade pip
-- run: pip install .[ja,testing,sentencepiece,jieba]
+- run: pip install .[ja,testing,sentencepiece,jieba,spacy,ftfy]
 - run: python -m unidic download
 - save_cache:
 key: v0.4-custom_tokenizers-{{ checksum "setup.py" }}
 paths:
 - '~/.cache/pip'
 - run: |
 if [ -f test_list.txt ]; then
-python -m pytest -s --make-reports=tests_custom_tokenizers ./tests/test_tokenization_bert_japanese.py | tee tests_output.txt
+python -m pytest -s --make-reports=tests_custom_tokenizers ./tests/test_tokenization_bert_japanese.py ./tests/test_tokenization_openai.py | tee tests_output.txt
 fi
 - store_artifacts:
 path: ~/transformers/tests_output.txt
@@ -662,7 +662,7 @@ jobs:
 path: ~/transformers/flax_examples_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_examples_flax_all:
 working_directory: ~/transformers
 docker:
@@ -729,7 +729,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_hub_all:
 working_directory: ~/transformers
 docker:
@@ -795,7 +795,7 @@ jobs:
 path: ~/transformers/tests_output.txt
 - store_artifacts:
 path: ~/transformers/reports
-
+
 run_tests_onnxruntime_all:
 working_directory: ~/transformers
 docker:
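
The two changes in this file go together: `./tests/test_tokenization_openai.py` is added to the custom tokenizers job, and the `spacy` and `ftfy` extras it relies on are added to the install step (the OpenAI GPT tokenizer uses spaCy and ftfy for the original paper's preprocessing when they are installed, and otherwise falls back to a BERT-style basic tokenizer). A minimal sketch of exercising that tokenizer, assuming the standard `openai-gpt` checkpoint (not named in this diff):

```python
from transformers import OpenAIGPTTokenizer

# With spacy and ftfy installed, tokenization follows the original GPT
# preprocessing; without them, a BERT-style BasicTokenizer is used instead.
tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
print(tokenizer.tokenize("Nyströmformer approximates self-attention with the Nyström method."))
```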

.github/ISSUE_TEMPLATE/bug-report.md

Lines changed: 1 addition & 1 deletion

@@ -49,7 +49,7 @@ Library:
 - Deepspeed: @stas00
 - Ray/raytune: @richardliaw, @amogkam
 - Text generation: @patrickvonplaten @narsil
-- Tokenizers: @LysandreJik
+- Tokenizers: @SaulLu
 - Trainer: @sgugger
 - Pipelines: @Narsil
 - Speech: @patrickvonplaten, @anton-l

.github/workflows/self-scheduled.yml

Lines changed: 52 additions & 0 deletions

@@ -51,6 +51,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_torch_gpu_durations.txt
+
 - name: Run examples tests on GPU
 if: ${{ always() }}
 env:
@@ -67,6 +71,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/examples_torch_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/examples_torch_gpu_durations.txt
+
 - name: Run all pipeline tests on GPU
 if: ${{ always() }}
 env:
@@ -78,6 +86,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_pipeline_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_torch_pipeline_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -119,6 +131,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_flax_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_flax_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -163,6 +179,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_tf_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_tf_gpu_durations.txt
+
 - name: Run all pipeline tests on GPU
 if: ${{ always() }}
 env:
@@ -176,6 +196,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_tf_pipeline_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_tf_pipeline_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -215,6 +239,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_xla_tpu_failures_short.txt

+- name: Tests durations
+if: ${{ always() }}
+run: cat reports/tests_torch_xla_tpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -258,6 +286,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_multi_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_torch_multi_gpu_durations.txt
+
 - name: Run all pipeline tests on GPU
 if: ${{ always() }}
 env:
@@ -269,6 +301,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_pipeline_multi_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_torch_pipeline_multi_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -313,6 +349,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_tf_multi_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_tf_multi_gpu_durations.txt
+
 - name: Run all pipeline tests on GPU
 if: ${{ always() }}
 env:
@@ -326,6 +366,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_tf_pipeline_multi_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_tf_pipeline_multi_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -403,6 +447,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_cuda_extensions_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_torch_cuda_extensions_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -443,6 +491,10 @@ jobs:
 if: ${{ always() }}
 run: cat reports/tests_torch_cuda_extensions_multi_gpu_failures_short.txt

+- name: Test durations
+if: ${{ always() }}
+run: cat reports/tests_torch_cuda_extensions_multi_gpu_durations.txt
+
 - name: Test suite reports artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2

README.md

Lines changed: 1 addition & 0 deletions

@@ -285,6 +285,7 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
 1. **[Megatron-GPT2](https://huggingface.co/docs/transformers/model_doc/megatron_gpt2)** (from NVIDIA) released with the paper [Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism](https://arxiv.org/abs/1909.08053) by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **[Nyströmformer](https://huggingface.co/docs/transformers/master/model_doc/nystromformer)** (from the University of Wisconsin - Madison) released with the paper [Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention](https://arxiv.org/abs/2102.03902) by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh.
 1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
 1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
 1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
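
The Nyströmformer entry added above is the model this merge pulls in. A minimal sketch of loading it through the Auto classes; the `uw-madison/nystromformer-512` checkpoint name is an assumption based on the released checkpoints and is not stated anywhere in this diff:

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Checkpoint name is assumed; substitute whichever Nystromformer
# checkpoint is published on the Hub.
checkpoint = "uw-madison/nystromformer-512"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

text = f"Paris is the {tokenizer.mask_token} of France."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```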

README_ko.md

Lines changed: 1 addition & 0 deletions

@@ -264,6 +264,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
 1. **[mLUKE](https://huggingface.co/docs/transformers/model_doc/mluke)** (from Studio Ousia) released with the paper [mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models](https://arxiv.org/abs/2110.08151) by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka.
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **[Nyströmformer](https://huggingface.co/docs/transformers/master/model_doc/nystromformer)** (from the University of Wisconsin - Madison) released with the paper [Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention](https://arxiv.org/abs/2102.03902) by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh.
 1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
 1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
 1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.

README_zh-hans.md

Lines changed: 1 addition & 0 deletions

@@ -288,6 +288,7 @@ conda install -c huggingface transformers
 1. **[mLUKE](https://huggingface.co/docs/transformers/model_doc/mluke)** (来自 Studio Ousia) 伴随论文 [mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models](https://arxiv.org/abs/2110.08151) 由 Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka 发布。
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (来自 Microsoft Research) 伴随论文 [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) 由 Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu 发布。
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (来自 Google AI) 伴随论文 [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) 由 Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel 发布。
+1. **[Nyströmformer](https://huggingface.co/docs/transformers/master/model_doc/nystromformer)** (来自 the University of Wisconsin - Madison) 伴随论文 [Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention](https://arxiv.org/abs/2102.03902) 由 Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh 发布。
 1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (来自 Google) 伴随论文 [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) 由 Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu 发布。
 1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (来自 Deepmind) 伴随论文 [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) 由 Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira 发布。
 1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (来自 VinAI Research) 伴随论文 [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) 由 Dat Quoc Nguyen and Anh Tuan Nguyen 发布。

README_zh-hant.md

Lines changed: 1 addition & 0 deletions

@@ -300,6 +300,7 @@ conda install -c huggingface transformers
 1. **[mLUKE](https://huggingface.co/docs/transformers/model_doc/mluke)** (from Studio Ousia) released with the paper [mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models](https://arxiv.org/abs/2110.08151) by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka.
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **[Nyströmformer](https://huggingface.co/docs/transformers/master/model_doc/nystromformer)** (from the University of Wisconsin - Madison) released with the paper [Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention](https://arxiv.org/abs/2102.03902) by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh.
 1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
 1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
 1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.

docs/source/_toctree.yml

Lines changed: 2 additions & 0 deletions

@@ -214,6 +214,8 @@
 title: MPNet
 - local: model_doc/mt5
 title: MT5
+- local: model_doc/nystromformer
+title: Nyströmformer
 - local: model_doc/openai-gpt
 title: OpenAI GPT
 - local: model_doc/gpt2

docs/source/benchmarks.mdx

Lines changed: 4 additions & 4 deletions

@@ -14,13 +14,13 @@ specific language governing permissions and limitations under the License.

 [[open-in-colab]]

-Let's take a look at how 🤗 Transformer models can be benchmarked, best practices, and already available benchmarks.
+Let's take a look at how 🤗 Transformers models can be benchmarked, best practices, and already available benchmarks.

-A notebook explaining in more detail how to benchmark 🤗 Transformer models can be found [here](https://github.com/huggingface/notebooks/tree/master/examples/benchmark.ipynb).
+A notebook explaining in more detail how to benchmark 🤗 Transformers models can be found [here](https://github.com/huggingface/notebooks/tree/master/examples/benchmark.ipynb).

-## How to benchmark 🤗 Transformer models
+## How to benchmark 🤗 Transformers models

-The classes [`PyTorchBenchmark`] and [`TensorFlowBenchmark`] allow to flexibly benchmark 🤗 Transformer models. The benchmark classes allow us to measure the _peak memory usage_ and _required time_ for both _inference_ and _training_.
+The classes [`PyTorchBenchmark`] and [`TensorFlowBenchmark`] allow to flexibly benchmark 🤗 Transformers models. The benchmark classes allow us to measure the _peak memory usage_ and _required time_ for both _inference_ and _training_.

 <Tip>
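
Since the benchmarks page touched above documents the benchmark classes, here is a minimal sketch of how `PyTorchBenchmark` is typically driven; the model name, batch sizes, and sequence lengths are illustrative choices, not values taken from this commit:

```python
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

# Illustrative configuration: benchmark one model at a single batch size
# and two sequence lengths, measuring inference time and peak memory.
args = PyTorchBenchmarkArguments(
    models=["bert-base-uncased"],
    batch_sizes=[8],
    sequence_lengths=[32, 128],
)
benchmark = PyTorchBenchmark(args)
results = benchmark.run()  # reports time and memory per (model, batch size, sequence length)
```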
