
Commit 204ebc2

Update installation page and add contributing to the doc (#5084)
* Update installation page and add contributing to the doc
* Remove mention of symlinks
1 parent 043f9f5 commit 204ebc2

6 files changed: +106 −56 lines changed


CONTRIBUTING.md

Lines changed: 20 additions & 10 deletions
@@ -65,7 +65,8 @@ Awesome! Please provide the following information:
 If you are willing to contribute the model yourself, let us know so we can best
 guide you.
 
-We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them in the [`templates`](./templates) folder.
+We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them
+in the [`templates`](https://github.com/huggingface/transformers/templates) folder.
 
 ### Do you want a new feature (that is not a model)?
 
@@ -86,7 +87,9 @@ A world-class feature request addresses the following points:
 If your issue is well written we're already 80% of the way there by the time you
 post it.
 
-We have added **templates** to guide you in the process of adding a new example script for training or testing the models in the library. You can find them in the [`templates`](./templates) folder.
+We have added **templates** to guide you in the process of adding a new example script for training or testing the
+models in the library. You can find them in the [`templates`](https://github.com/huggingface/transformers/templates)
+folder.
 
 ## Start contributing! (Pull Requests)
 
@@ -206,15 +209,21 @@ Follow these steps to start contributing:
    to be merged;
 4. Make sure existing tests pass;
 5. Add high-coverage tests. No quality testing = no merge.
-   - If you are adding a new model, make sure that you use `ModelTester.all_model_classes = (MyModel, MyModelWithLMHead,...)`, which triggers the common tests.
-   - If you are adding new `@slow` tests, make sure they pass using `RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`.
-   - If you are adding a new tokenizer, write tests, and make sure `RUN_SLOW=1 python -m pytest tests/test_tokenization_{your_model_name}.py` passes.
-   CircleCI does not run them.
-6. All public methods must have informative docstrings that work nicely with sphinx. See `modeling_ctrl.py` for an example.
+   - If you are adding a new model, make sure that you use
+     `ModelTester.all_model_classes = (MyModel, MyModelWithLMHead,...)`, which triggers the common tests.
+   - If you are adding new `@slow` tests, make sure they pass using
+     `RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`.
+   - If you are adding a new tokenizer, write tests, and make sure
+     `RUN_SLOW=1 python -m pytest tests/test_tokenization_{your_model_name}.py` passes.
+   CircleCI does not run the slow tests.
+6. All public methods must have informative docstrings that work nicely with sphinx. See `modeling_ctrl.py` for an
+   example.
 
 ### Tests
 
-You can run 🤗 Transformers tests with `unittest` or `pytest`.
+An extensive test suite is included to test the library behavior and several examples. Library tests can be found in
+the [tests folder](https://github.com/huggingface/transformers/tree/master/tests) and examples tests in the
+[examples folder](https://github.com/huggingface/transformers/tree/master/examples).
 
 We like `pytest` and `pytest-xdist` because it's faster. From the root of the
 repository, here's how to run tests with `pytest` for the library:
@@ -261,7 +270,8 @@ $ python -m unittest discover -s examples -t examples -v
 
 ### Style guide
 
-For documentation strings, `transformers` follows the [google
-style](https://google.github.io/styleguide/pyguide.html).
+For documentation strings, `transformers` follows the [google style](https://google.github.io/styleguide/pyguide.html).
+Check our [documentation writing guide](https://github.com/huggingface/transformers/tree/master/docs#writing-documentation---specification)
+for more information.
 
 #### This guide was heavily inspired by the awesome [scikit-learn guide to contributing](https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md)
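For illustration, here is a minimal sketch of the test wiring that point 5 in the diff above refers to. `MyModel` and `MyModelWithLMHead` are hypothetical placeholder classes, and the `ModelTesterMixin` import path is an assumption based on the `tests/` layout, so treat this as a pattern rather than exact code:

```python
# Sketch only: `MyModel` / `MyModelWithLMHead` are hypothetical model classes,
# and the mixin import path is an assumption about the tests/ package layout.
import unittest

from transformers import MyModel, MyModelWithLMHead  # hypothetical models
from .test_modeling_common import ModelTesterMixin


class MyModelTest(ModelTesterMixin, unittest.TestCase):
    # Listing the classes here is what triggers the shared "common" tests
    # (forward pass, save/load round-trips, etc.) for each of them.
    all_model_classes = (MyModel, MyModelWithLMHead)
```

As the checklist notes, the slow variants would then be run with `RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`, since CircleCI skips them.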

docs/README.md

Lines changed: 6 additions & 12 deletions
@@ -42,20 +42,14 @@ pip install recommonmark
 
 ## Building the documentation
 
-Make sure that there is a symlink from the `example` file (in /examples) inside the source folder. Run the following
-command to generate it:
-
-```bash
-ln -s ../../examples/README.md examples.md
-```
-
 Once you have setup `sphinx`, you can build the documentation by running the following command in the `/docs` folder:
 
 ```bash
 make html
 ```
 
-A folder called ``_build/html`` should have been created. You can now open the file ``_build/html/index.html`` in your browser.
+A folder called ``_build/html`` should have been created. You can now open the file ``_build/html/index.html`` in your
+browser.
 
 ---
 **NOTE**
@@ -132,8 +126,8 @@ XXXConfig
     :members:
 ```
 
-This will include every public method of the configuration. If for some reason you wish for a method not to be displayed
-in the documentation, you can do so by specifying which methods should be in the docs:
+This will include every public method of the configuration. If for some reason you wish for a method not to be
+displayed in the documentation, you can do so by specifying which methods should be in the docs:
 
 ```
 XXXTokenizer
@@ -147,8 +141,8 @@ XXXTokenizer
 
 ### Writing source documentation
 
-Values that should be put in `code` should either be surrounded by double backticks: \`\`like so\`\` or be written as an object
-using the :obj: syntax: :obj:\`like so\`.
+Values that should be put in `code` should either be surrounded by double backticks: \`\`like so\`\` or be written as
+an object using the :obj: syntax: :obj:\`like so\`.
 
 When mentionning a class, it is recommended to use the :class: syntax as the mentioned class will be automatically
 linked by Sphinx: :class:\`transformers.XXXClass\`
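As a concrete illustration of the conventions in the hunk above, here is a made-up docstring combining the double-backtick, `:obj:` and `:class:` syntaxes; the function, its arguments, and `XXXTokenizer` are invented for the example:

```python
# Illustrative only: the function and its argument types are invented to show
# the ``double backtick``, :obj: and :class: docstring conventions.
def encode_batch(tokenizer, texts, max_length=512):
    """Encodes ``texts`` with the given tokenizer.

    Args:
        tokenizer (:class:`transformers.XXXTokenizer`): the tokenizer used to
            turn each string into token ids.
        texts (:obj:`list` of :obj:`str`): the inputs to encode.
        max_length (:obj:`int`, `optional`, defaults to :obj:`512`): sequences
            longer than this are truncated.

    Returns:
        :obj:`list`: one list of token ids per input text.
    """
    return [tokenizer.encode(t, max_length=max_length) for t in texts]
```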

docs/source/contributing.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../CONTRIBUTING.md

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -142,6 +142,7 @@ conversion utilities for the following models:
    converting_tensorflow_models
    migration
    torchscript
+   contributing
 
 .. toctree::
    :maxdepth: 2

docs/source/installation.md

Lines changed: 67 additions & 34 deletions
@@ -1,69 +1,102 @@
 # Installation
 
-Transformers is tested on Python 3.6+ and PyTorch 1.1.0
+🤗 Transformers is tested on Python 3.6+, and PyTorch 1.1.0+ or TensorFlow 2.0+.
 
-## With pip
+You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're
+unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/). Create a virtual environment with the version of Python you're going
+to use and activate it.
 
-PyTorch Transformers can be installed using pip as follows:
+Now, if you want to use 🤗 Transformers, you can install it with pip. If you'd like to play with the examples, you
+must install it from source.
 
-``` bash
+## Installation with pip
+
+First you need to install one of, or both, TensorFlow 2.0 and PyTorch.
+Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/pip#tensorflow-2.0-rc-is-available)
+and/or [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) regarding the specific
+install command for your platform.
+
+When TensorFlow 2.0 and/or PyTorch has been installed, 🤗 Transformers can be installed using pip as follows:
+
+```bash
 pip install transformers
 ```
 
-## From source
+Alternatively, for CPU-support only, you can install 🤗 Transformers and PyTorch in one line with
+
+```bash
+pip install transformers[torch]
+```
+
+or 🤗 Transformers and TensorFlow 2.0 in one line with
+
+```bash
+pip install transformers[tf-cpu]
+```
+
+To check 🤗 Transformers is properly installed, run the following command:
+
+```bash
+python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
+```
+
+It should download a pretrained model then print something like
+
+```bash
+[{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
+```
+
+(Note that TensorFlow will print additional stuff before that last statement.)
+
+## Installing from source
 
-To install from source, clone the repository and install with:
+To install from source, clone the repository and install with the following commands:
 
 ``` bash
 git clone https://github.com/huggingface/transformers.git
 cd transformers
-pip install .
+pip install -e .
+```
+
+Again, you can run
+
+```bash
+python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
 ```
 
+to check 🤗 Transformers is properly installed.
+
 ## Caching models
 
 This library provides pretrained models that will be downloaded and cached locally. Unless you specify a location with
-`cache_dir=...` when you use the `from_pretrained` method, these models will automatically be downloaded in the
-folder given by the shell environment variable ``TRANSFORMERS_CACHE``. The default value for it will be the PyTorch
+`cache_dir=...` when you use methods like `from_pretrained`, these models will automatically be downloaded in the
+folder given by the shell environment variable ``TRANSFORMERS_CACHE``. The default value for it will be the PyTorch
 cache home followed by ``/transformers/`` (even if you don't have PyTorch installed). This is (by order of priority):
 
 * shell environment variable ``ENV_TORCH_HOME``
 * shell environment variable ``ENV_XDG_CACHE_HOME`` + ``/torch/``
 * default: ``~/.cache/torch/``
 
-So if you don't have any specific environment variable set, the cache directory will be at
+So if you don't have any specific environment variable set, the cache directory will be at
 ``~/.cache/torch/transformers/``.
 
-**Note:** If you have set a shell enviromnent variable for one of the predecessors of this library
-(``PYTORCH_TRANSFORMERS_CACHE`` or ``PYTORCH_PRETRAINED_BERT_CACHE``), those will be used if there is no shell
+**Note:** If you have set a shell enviromnent variable for one of the predecessors of this library
+(``PYTORCH_TRANSFORMERS_CACHE`` or ``PYTORCH_PRETRAINED_BERT_CACHE``), those will be used if there is no shell
 enviromnent variable for ``TRANSFORMERS_CACHE``.
 
-## Tests
-
-An extensive test suite is included to test the library behavior and several examples. Library tests can be found in the [tests folder](https://github.com/huggingface/transformers/tree/master/tests) and examples tests in the [examples folder](https://github.com/huggingface/transformers/tree/master/examples).
-
-Refer to the [contributing guide](https://github.com/huggingface/transformers/blob/master/CONTRIBUTING.md#tests) for details about running tests.
-
-## OpenAI GPT original tokenization workflow
-
-If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install `ftfy` and `SpaCy`:
-
-``` bash
-pip install spacy ftfy==4.4.3
-python -m spacy download en
-```
-
-If you don't install `ftfy` and `SpaCy`, the `OpenAI GPT` tokenizer will default to tokenize using BERT's `BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't worry).
-
-## Note on model downloads (Continuous Integration or large-scale deployments)
+### Note on model downloads (Continuous Integration or large-scale deployments)
 
-If you expect to be downloading large volumes of models (more than 1,000) from our hosted bucket (for instance through your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way faster, and cheaper. Feel free to contact us privately if you need any help.
+If you expect to be downloading large volumes of models (more than 1,000) from our hosted bucket (for instance through
+your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way
+faster, and cheaper. Feel free to contact us privately if you need any help.
 
 ## Do you want to run a Transformer model on a mobile device?
 
 You should check out our [swift-coreml-transformers](https://github.com/huggingface/swift-coreml-transformers) repo.
 
-It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently contains `GPT-2`, `DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.
+It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently contains `GPT-2`,
+`DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.
 
-At some point in the future, you'll be able to seamlessly move from pre-training or fine-tuning models in PyTorch to productizing them in CoreML,
-or prototype a model or an app in CoreML then research its hyperparameters or architecture from PyTorch. Super exciting!
+At some point in the future, you'll be able to seamlessly move from pre-training or fine-tuning models in PyTorch or
+TensorFlow 2.0 to productizing them in CoreML, or prototype a model or an app in CoreML then research its
+hyperparameters or architecture from PyTorch or TensorFlow 2.0. Super exciting!
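As a short sketch of the caching behavior described in the hunk above, assuming `bert-base-uncased` purely as a familiar example checkpoint:

```python
# Minimal sketch of the caching rules above. "bert-base-uncased" is just a
# familiar example checkpoint; any model id behaves the same way.
from transformers import AutoModel, AutoTokenizer

# Cached under TRANSFORMERS_CACHE (default ~/.cache/torch/transformers/):
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# cache_dir overrides that default location for this call:
model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="/tmp/transformers_cache")
```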

docs/source/model_doc/gpt.rst

Lines changed: 11 additions & 0 deletions
@@ -38,6 +38,17 @@ Hugging Face showcasing the generative capabilities of several models. GPT is on
 
 The original code can be found `here <https://github.com/openai/finetune-transformer-lm>`_.
 
+Note:
+
+If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install
+``ftfy`` and ``SpaCy``::
+
+    pip install spacy ftfy==4.4.3
+    python -m spacy download en
+
+If you don't install ``ftfy`` and ``SpaCy``, the :class:`transformers.OpenAIGPTTokenizer` will default to tokenize using
+BERT's :obj:`BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't
+worry).
 
 OpenAIGPTConfig
 ~~~~~~~~~~~~~~~~~~~~~
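A quick sketch of the fallback the new note describes; the sample sentence is arbitrary, and the printed tokens may differ slightly depending on whether `ftfy` and `SpaCy` are installed:

```python
# Sketch: with ftfy + SpaCy installed, OpenAIGPTTokenizer reproduces the
# original GPT tokenization; without them it silently falls back to BERT's
# BasicTokenizer followed by BPE, as the note above explains.
from transformers import OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
print(tokenizer.tokenize("Don't worry, this should be fine."))
```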
