
Commit 204ebc2

Update installation page and add contributing to the doc (#5084)
* Update installation page and add contributing to the doc
* Remove mention of symlinks
1 parent 043f9f5 commit 204ebc2

6 files changed: +106 −56 lines changed


CONTRIBUTING.md

Lines changed: 20 additions & 10 deletions
@@ -65,7 +65,8 @@ Awesome! Please provide the following information:
 If you are willing to contribute the model yourself, let us know so we can best
 guide you.
 
-We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them in the [`templates`](./templates) folder.
+We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them
+in the [`templates`](https://github.com/huggingface/transformers/templates) folder.
 
 ### Do you want a new feature (that is not a model)?
 
@@ -86,7 +87,9 @@ A world-class feature request addresses the following points:
 If your issue is well written we're already 80% of the way there by the time you
 post it.
 
-We have added **templates** to guide you in the process of adding a new example script for training or testing the models in the library. You can find them in the [`templates`](./templates) folder.
+We have added **templates** to guide you in the process of adding a new example script for training or testing the
+models in the library. You can find them in the [`templates`](https://github.com/huggingface/transformers/templates)
+folder.
 
 ## Start contributing! (Pull Requests)
 
@@ -206,15 +209,21 @@ Follow these steps to start contributing:
    to be merged;
 4. Make sure existing tests pass;
 5. Add high-coverage tests. No quality testing = no merge.
-   - If you are adding a new model, make sure that you use `ModelTester.all_model_classes = (MyModel, MyModelWithLMHead,...)`, which triggers the common tests.
-   - If you are adding new `@slow` tests, make sure they pass using `RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`.
-   - If you are adding a new tokenizer, write tests, and make sure `RUN_SLOW=1 python -m pytest tests/test_tokenization_{your_model_name}.py` passes.
-   CircleCI does not run them.
-6. All public methods must have informative docstrings that work nicely with sphinx. See `modeling_ctrl.py` for an example.
+   - If you are adding a new model, make sure that you use
+     `ModelTester.all_model_classes = (MyModel, MyModelWithLMHead,...)`, which triggers the common tests.
+   - If you are adding new `@slow` tests, make sure they pass using
+     `RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`.
+   - If you are adding a new tokenizer, write tests, and make sure
+     `RUN_SLOW=1 python -m pytest tests/test_tokenization_{your_model_name}.py` passes.
+   CircleCI does not run the slow tests.
+6. All public methods must have informative docstrings that work nicely with sphinx. See `modeling_ctrl.py` for an
+   example.
 
 ### Tests
 
-You can run 🤗 Transformers tests with `unittest` or `pytest`.
+An extensive test suite is included to test the library behavior and several examples. Library tests can be found in
+the [tests folder](https://github.com/huggingface/transformers/tree/master/tests) and examples tests in the
+[examples folder](https://github.com/huggingface/transformers/tree/master/examples).
 
 We like `pytest` and `pytest-xdist` because it's faster. From the root of the
 repository, here's how to run tests with `pytest` for the library:
@@ -261,7 +270,8 @@ $ python -m unittest discover -s examples -t examples -v
 
 ### Style guide
 
-For documentation strings, `transformers` follows the [google
-style](https://google.github.io/styleguide/pyguide.html).
+For documentation strings, `transformers` follows the [google style](https://google.github.io/styleguide/pyguide.html).
+Check our [documentation writing guide](https://github.com/huggingface/transformers/tree/master/docs#writing-documentation---specification)
+for more information.
 
 #### This guide was heavily inspired by the awesome [scikit-learn guide to contributing](https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md)
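For illustration, here is a minimal sketch of the test wiring that point 5 in the diff above refers to. `MyModel` and `MyModelWithLMHead` are hypothetical placeholder classes, and the `ModelTesterMixin` import path is an assumption based on the `tests/` layout, so treat this as a pattern rather than exact code:

```python
# Sketch only: `MyModel` / `MyModelWithLMHead` are hypothetical model classes,
# and the mixin import path is an assumption about the tests/ package layout.
import unittest

from transformers import MyModel, MyModelWithLMHead  # hypothetical models
from .test_modeling_common import ModelTesterMixin


class MyModelTest(ModelTesterMixin, unittest.TestCase):
    # Listing the classes here is what triggers the shared "common" tests
    # (forward pass, save/load round-trips, etc.) for each of them.
    all_model_classes = (MyModel, MyModelWithLMHead)
```

As the checklist notes, the slow variants would then be run with `RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`, since CircleCI skips them.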

docs/README.md

Lines changed: 6 additions & 12 deletions
@@ -42,20 +42,14 @@ pip install recommonmark
 
 ## Building the documentation
 
-Make sure that there is a symlink from the `example` file (in /examples) inside the source folder. Run the following
-command to generate it:
-
-```bash
-ln -s ../../examples/README.md examples.md
-```
-
 Once you have setup `sphinx`, you can build the documentation by running the following command in the `/docs` folder:
 
 ```bash
 make html
 ```
 
-A folder called ``_build/html`` should have been created. You can now open the file ``_build/html/index.html`` in your browser.
+A folder called ``_build/html`` should have been created. You can now open the file ``_build/html/index.html`` in your
+browser.
 
 ---
 **NOTE**
@@ -132,8 +126,8 @@ XXXConfig
     :members:
 ```
 
-This will include every public method of the configuration. If for some reason you wish for a method not to be displayed
-in the documentation, you can do so by specifying which methods should be in the docs:
+This will include every public method of the configuration. If for some reason you wish for a method not to be
+displayed in the documentation, you can do so by specifying which methods should be in the docs:
 
 ```
 XXXTokenizer
@@ -147,8 +141,8 @@ XXXTokenizer
 
 ### Writing source documentation
 
-Values that should be put in `code` should either be surrounded by double backticks: \`\`like so\`\` or be written as an object
-using the :obj: syntax: :obj:\`like so\`.
+Values that should be put in `code` should either be surrounded by double backticks: \`\`like so\`\` or be written as
+an object using the :obj: syntax: :obj:\`like so\`.
 
 When mentionning a class, it is recommended to use the :class: syntax as the mentioned class will be automatically
 linked by Sphinx: :class:\`transformers.XXXClass\`
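As a concrete illustration of the conventions in the hunk above, here is a made-up docstring combining the double-backtick, `:obj:` and `:class:` syntaxes; the function, its arguments, and `XXXTokenizer` are invented for the example:

```python
# Illustrative only: the function and its argument types are invented to show
# the ``double backtick``, :obj: and :class: docstring conventions.
def encode_batch(tokenizer, texts, max_length=512):
    """Encodes ``texts`` with the given tokenizer.

    Args:
        tokenizer (:class:`transformers.XXXTokenizer`): the tokenizer used to
            turn each string into token ids.
        texts (:obj:`list` of :obj:`str`): the inputs to encode.
        max_length (:obj:`int`, `optional`, defaults to :obj:`512`): sequences
            longer than this are truncated.

    Returns:
        :obj:`list`: one list of token ids per input text.
    """
    return [tokenizer.encode(t, max_length=max_length) for t in texts]
```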

docs/source/contributing.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../CONTRIBUTING.md

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -142,6 +142,7 @@ conversion utilities for the following models:
    converting_tensorflow_models
    migration
    torchscript
+   contributing
 
 .. toctree::
    :maxdepth: 2

docs/source/installation.md

Lines changed: 67 additions & 34 deletions
@@ -1,69 +1,102 @@
 # Installation
 
-Transformers is tested on Python 3.6+ and PyTorch 1.1.0
+🤗 Transformers is tested on Python 3.6+, and PyTorch 1.1.0+ or TensorFlow 2.0+.
 
-## With pip
+You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're
+unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/). Create a virtual environment with the version of Python you're going
+to use and activate it.
 
-PyTorch Transformers can be installed using pip as follows:
+Now, if you want to use 🤗 Transformers, you can install it with pip. If you'd like to play with the examples, you
+must install it from source.
 
-``` bash
+## Installation with pip
+
+First you need to install one of, or both, TensorFlow 2.0 and PyTorch.
+Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/pip#tensorflow-2.0-rc-is-available)
+and/or [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) regarding the specific
+install command for your platform.
+
+When TensorFlow 2.0 and/or PyTorch has been installed, 🤗 Transformers can be installed using pip as follows:
+
+```bash
 pip install transformers
 ```
 
-## From source
+Alternatively, for CPU-support only, you can install 🤗 Transformers and PyTorch in one line with
+
+```bash
+pip install transformers[torch]
+```
+
+or 🤗 Transformers and TensorFlow 2.0 in one line with
+
+```bash
+pip install transformers[tf-cpu]
+```
+
+To check 🤗 Transformers is properly installed, run the following command:
+
+```bash
+python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
+```
+
+It should download a pretrained model then print something like
+
+```bash
+[{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
+```
+
+(Note that TensorFlow will print additional stuff before that last statement.)
+
+## Installing from source
 
-To install from source, clone the repository and install with:
+To install from source, clone the repository and install with the following commands:
 
 ``` bash
 git clone https://github.com/huggingface/transformers.git
 cd transformers
-pip install .
+pip install -e .
+```
+
+Again, you can run
+
+```bash
+python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
 ```
 
+to check 🤗 Transformers is properly installed.
+
 ## Caching models
 
 This library provides pretrained models that will be downloaded and cached locally. Unless you specify a location with
-`cache_dir=...` when you use the `from_pretrained` method, these models will automatically be downloaded in the
-folder given by the shell environment variable ``TRANSFORMERS_CACHE``. The default value for it will be the PyTorch
+`cache_dir=...` when you use methods like `from_pretrained`, these models will automatically be downloaded in the
+folder given by the shell environment variable ``TRANSFORMERS_CACHE``. The default value for it will be the PyTorch
 cache home followed by ``/transformers/`` (even if you don't have PyTorch installed). This is (by order of priority):
 
 * shell environment variable ``ENV_TORCH_HOME``
 * shell environment variable ``ENV_XDG_CACHE_HOME`` + ``/torch/``
 * default: ``~/.cache/torch/``
 
-So if you don't have any specific environment variable set, the cache directory will be at
+So if you don't have any specific environment variable set, the cache directory will be at
 ``~/.cache/torch/transformers/``.
 
-**Note:** If you have set a shell enviromnent variable for one of the predecessors of this library
-(``PYTORCH_TRANSFORMERS_CACHE`` or ``PYTORCH_PRETRAINED_BERT_CACHE``), those will be used if there is no shell
+**Note:** If you have set a shell enviromnent variable for one of the predecessors of this library
+(``PYTORCH_TRANSFORMERS_CACHE`` or ``PYTORCH_PRETRAINED_BERT_CACHE``), those will be used if there is no shell
 enviromnent variable for ``TRANSFORMERS_CACHE``.
 
-## Tests
-
-An extensive test suite is included to test the library behavior and several examples. Library tests can be found in the [tests folder](https://github.com/huggingface/transformers/tree/master/tests) and examples tests in the [examples folder](https://github.com/huggingface/transformers/tree/master/examples).
-
-Refer to the [contributing guide](https://github.com/huggingface/transformers/blob/master/CONTRIBUTING.md#tests) for details about running tests.
-
-## OpenAI GPT original tokenization workflow
-
-If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install `ftfy` and `SpaCy`:
-
-``` bash
-pip install spacy ftfy==4.4.3
-python -m spacy download en
-```
-
-If you don't install `ftfy` and `SpaCy`, the `OpenAI GPT` tokenizer will default to tokenize using BERT's `BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't worry).
-
-## Note on model downloads (Continuous Integration or large-scale deployments)
+### Note on model downloads (Continuous Integration or large-scale deployments)
 
-If you expect to be downloading large volumes of models (more than 1,000) from our hosted bucket (for instance through your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way faster, and cheaper. Feel free to contact us privately if you need any help.
+If you expect to be downloading large volumes of models (more than 1,000) from our hosted bucket (for instance through
+your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way
+faster, and cheaper. Feel free to contact us privately if you need any help.
 
 ## Do you want to run a Transformer model on a mobile device?
 
 You should check out our [swift-coreml-transformers](https://github.com/huggingface/swift-coreml-transformers) repo.
 
-It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently contains `GPT-2`, `DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.
+It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently contains `GPT-2`,
+`DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.
 
-At some point in the future, you'll be able to seamlessly move from pre-training or fine-tuning models in PyTorch to productizing them in CoreML,
-or prototype a model or an app in CoreML then research its hyperparameters or architecture from PyTorch. Super exciting!
+At some point in the future, you'll be able to seamlessly move from pre-training or fine-tuning models in PyTorch or
+TensorFlow 2.0 to productizing them in CoreML, or prototype a model or an app in CoreML then research its
+hyperparameters or architecture from PyTorch or TensorFlow 2.0. Super exciting!
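As a short sketch of the caching behavior described in the hunk above, assuming `bert-base-uncased` purely as a familiar example checkpoint:

```python
# Minimal sketch of the caching rules above. "bert-base-uncased" is just a
# familiar example checkpoint; any model id behaves the same way.
from transformers import AutoModel, AutoTokenizer

# Cached under TRANSFORMERS_CACHE (default ~/.cache/torch/transformers/):
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# cache_dir overrides that default location for this call:
model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="/tmp/transformers_cache")
```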

docs/source/model_doc/gpt.rst

Lines changed: 11 additions & 0 deletions
@@ -38,6 +38,17 @@ Hugging Face showcasing the generative capabilities of several models. GPT is on
 
 The original code can be found `here <https://github.com/openai/finetune-transformer-lm>`_.
 
+Note:
+
+If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install
+``ftfy`` and ``SpaCy``::
+
+    pip install spacy ftfy==4.4.3
+    python -m spacy download en
+
+If you don't install ``ftfy`` and ``SpaCy``, the :class:`transformers.OpenAIGPTTokenizer` will default to tokenize using
+BERT's :obj:`BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't
+worry).
 
 OpenAIGPTConfig
 ~~~~~~~~~~~~~~~~~~~~~
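A quick sketch of the fallback the new note describes; the sample sentence is arbitrary, and the printed tokens may differ slightly depending on whether `ftfy` and `SpaCy` are installed:

```python
# Sketch: with ftfy + SpaCy installed, OpenAIGPTTokenizer reproduces the
# original GPT tokenization; without them it silently falls back to BERT's
# BasicTokenizer followed by BPE, as the note above explains.
from transformers import OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
print(tokenizer.tokenize("Don't worry, this should be fine."))
```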
