This repository was archived by the owner on Jan 15, 2024. It is now read-only.

Commit 210dd0c

[Numpy] [Fix] Update README.md (#1306)
Squashed commit message:

* Update README.md (multiple revisions)
* Update ubuntu18.04-devel-gpu.Dockerfile (multiple revisions)
* Update benchmark_utils.py
* Use `python3 -m` to invoke command-line tools
1 parent d93356f commit 210dd0c

8 files changed

Lines changed: 82 additions & 17 deletions


README.md

Lines changed: 35 additions & 9 deletions
````diff
@@ -1,15 +1,29 @@
-# GluonNLP + Numpy
+<h3 align="center">
+GluonNLP: Your Choice of Deep Learning for NLP
+</h3>
 
-Implementing NLP algorithms using the new numpy-like interface of MXNet. It's also a testbed for the next-generation release of GluonNLP.
-
-This is a work-in-progress.
+<p align="center">
+<a href="https://github.com/dmlc/gluon-nlp/actions"><img src="https://github.com/dmlc/gluon-nlp/workflows/continuous%20build/badge.svg"></a>
+<a href="https://codecov.io/gh/dmlc/gluon-nlp"><img src="https://codecov.io/gh/dmlc/gluon-nlp/branch/master/graph/badge.svg"></a>
+<a href="https://github.com/dmlc/gluonnlp/actions"><img src="https://img.shields.io/badge/python-3.6%2C3.8-blue.svg"></a>
+<a href="https://pypi.org/project/gluonnlp/#history"><img src="https://img.shields.io/pypi/v/gluonnlp.svg"></a>
+</p>
 
+GluonNLP is a toolkit that enables easy text preprocessing, datasets
+loading and neural models building to help you speed up your Natural
+Language Processing (NLP) research.
 
 # Features
 
-- Data Pipeline for NLP
-- AutoML support (TODO)
+For NLP Practitioners
+- Easy-to-use Data Pipeline
+- Automatically Train Models via AutoNLP (TODO)
+
+For Researchers
 - Pretrained Model Zoo
+- Programming with numpy-like API
+
+For Engineers
 - Fast Deployment
 - [TVM](https://tvm.apache.org/) (TODO)
 - AWS Integration
@@ -70,6 +84,18 @@ python3 -m gluonnlp.cli.preprocess help
 
 ```
 
+### Frequently Asked Questions
+- **Question**: I cannot access the command-line toolkits. Running `nlp_data` reports `nlp_data: command not found`.
+
+This is sometimes because you have installed gluonnlp to the user folder and
+the executables are installed to `~/.local/bin`. You can change the `PATH` variable to
+also include `~/.local/bin`:
+
+```
+export PATH=${PATH}:~/.local/bin
+```
+
+
 # Run Unittests
 You may go to [tests](tests) to see how to run the unittests.
 
@@ -78,8 +104,8 @@ You may go to [tests](tests) to see how to run the unittests.
 You can use Docker to launch a JupyterLab development environment with GluonNLP installed.
 
 ```
-docker pull gluonai/gluon-nlp:v1.0.0
-docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 gluonai/gluon-nlp:v1.0.0
+docker pull gluonai/gluon-nlp:gpu-latest
+docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 --shm-size=4g gluonai/gluon-nlp:gpu-latest
 ```
 
-For more details, you can refer to the guidance in [tools/docker].
+For more details, you can refer to the guidance in [tools/docker](tools/docker).
````
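The FAQ entry above boils down to checking whether pip's per-user scripts directory is on `PATH`. A minimal sketch of that check (assuming the Linux/macOS user-install layout, where scripts land under `site.USER_BASE/bin`):

```python
import os
import site


def user_scripts_on_path() -> bool:
    """Return True if the per-user scripts directory (where a
    `pip install --user` places executables such as nlp_data)
    is currently on PATH. Assumes the Linux/macOS layout."""
    user_bin = os.path.join(
        site.USER_BASE or os.path.expanduser("~/.local"), "bin"
    )
    return user_bin in os.environ.get("PATH", "").split(os.pathsep)


print(user_scripts_on_path())
```

If this prints `False` after a user install, the `export PATH=${PATH}:~/.local/bin` line from the FAQ is the fix.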

scripts/benchmarks/benchmark_utils.py

Lines changed: 1 addition & 1 deletion
````diff
@@ -91,7 +91,7 @@ def is_mxnet_available():
 
 
 logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
-logging_config(logger=logger)
+logging_config(folder='gluonnlp_benchmark', name='benchmark', logger=logger)
 
 
 _is_memory_tracing_enabled = False
````
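The change above directs benchmark logs into a dedicated `gluonnlp_benchmark` folder instead of the default location. The real `logging_config` lives in GluonNLP's utilities; as a rough, hypothetical sketch of what a helper with that signature typically does (the function body here is an assumption, not the library's code):

```python
import logging
import os


def logging_config(folder="gluonnlp_benchmark", name="benchmark",
                   level=logging.INFO, logger=None):
    """Sketch: route records from `logger` to <folder>/<name>.log
    and to the console, returning the log file path."""
    logger = logger if logger is not None else logging.getLogger()
    os.makedirs(folder, exist_ok=True)
    log_path = os.path.join(folder, name + ".log")
    logger.setLevel(level)
    logger.addHandler(logging.FileHandler(log_path))   # persist to disk
    logger.addHandler(logging.StreamHandler())         # echo to console
    return log_path
```

Passing `folder` and `name` explicitly, as the patched call does, keeps benchmark output from colliding with other scripts' logs.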

scripts/machine_translation/train_transformer.py

Lines changed: 0 additions & 1 deletion
````diff
@@ -526,7 +526,6 @@ def train(args):
 
 if __name__ == '__main__':
     os.environ['MXNET_GPU_MEM_POOL_TYPE'] = 'Round'
-    os.environ['MXNET_USE_FUSION'] = '0' # Manually disable pointwise fusion
     args = parse_args()
     np.random.seed(args.seed)
     mx.random.seed(args.seed)
````
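The context lines show the usual reproducibility pattern: seed every RNG the script touches before training starts. A minimal stdlib-only sketch of that pattern (the `seed_everything` name is illustrative; frameworks add their own calls on top, e.g. `np.random.seed(seed)` and `mx.random.seed(seed)` as above):

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed the RNGs a training script typically touches.
    Framework-specific seeds (numpy, MXNet, ...) must be set
    separately, exactly as the script above does."""
    os.environ["PYTHONHASHSEED"] = str(seed)  # stabilize hash randomization
    random.seed(seed)                         # stdlib RNG
```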

setup.py

Lines changed: 1 addition & 0 deletions
````diff
@@ -39,6 +39,7 @@ def find_version(*file_paths):
     'protobuf',
     'pandas',
     'tokenizers>=0.7.0',
+    'click>=7.0',  # Dependency of youtokentome
     'youtokentome>=1.0.6',
     'fasttext>=0.9.2'
 ]
````
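This hunk pins `click>=7.0` explicitly because `youtokentome` needs it at runtime. When debugging such transitive-dependency issues, it helps to check which version of a distribution is actually installed; a small stdlib sketch (Python 3.8+, `installed_version` is an illustrative helper name):

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version string of a distribution,
    or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```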

src/gluonnlp/data/tokenizers.py

Lines changed: 0 additions & 1 deletion
````diff
@@ -30,7 +30,6 @@
 from typing import List, Tuple, Union, NewType, Optional
 from collections import OrderedDict
 
-import jieba
 import sacremoses
 
 from .vocab import Vocab
````
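Dropping the module-level `import jieba` means users without that optional dependency can still import the tokenizers module. The common replacement is a lazy import that fails with an actionable message only when the feature is used; a generic sketch (the `load_optional` name is illustrative, not GluonNLP's API):

```python
import importlib


def load_optional(module_name: str, hint: str = ""):
    """Import an optional dependency lazily, raising a clear
    error only when the dependency is actually needed."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            + (hint or f"Install it with: pip install {module_name}")
        ) from err
```

A tokenizer class would then call `load_optional("jieba")` inside its constructor instead of importing at the top of the file.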

tests/README.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -3,13 +3,13 @@
 To run the unittests, use the following command
 
 ```bash
-pytest .
+python3 -m pytest .
 ```
 
 To test a certain file, e.g., `test_models_transformer.py`, use the following command
 
 ```bash
-pytest test_models_transformer
+python3 -m pytest test_models_transformer
 ```
 
 Refer to the [official guide of pytest](https://docs.pytest.org/en/latest/) for more details.
````
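The switch from `pytest` to `python3 -m pytest` is not cosmetic: `python -m <module>` prepends the current working directory to `sys.path`, so the package under test is always importable from the repo root, whereas the bare `pytest` console script does not. A small self-contained demonstration (the probe module is created on the fly):

```python
import os
import subprocess
import sys
import tempfile


def minus_m_prepends_cwd() -> bool:
    """Run a tiny module via `python -m` and check that sys.path[0]
    is the working directory it was launched from."""
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "probe.py"), "w") as f:
            f.write("import sys; print(sys.path[0])\n")
        first_entry = subprocess.run(
            [sys.executable, "-m", "probe"],
            cwd=workdir, capture_output=True, text=True, check=True,
        ).stdout.strip()
        # realpath() normalizes symlinked temp dirs (e.g. /var on macOS)
        return os.path.realpath(first_entry) == os.path.realpath(workdir)


print(minus_m_prepends_cwd())
```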

tools/docker/README.md

Lines changed: 23 additions & 2 deletions
````diff
@@ -9,14 +9,35 @@ You can run the docker with the following command.
 
 ```
 docker pull gluonai/gluon-nlp:gpu-latest
-docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 --shm-size=4g gluonai/gluon-nlp:gpu-latest
+docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 --shm-size=2g gluonai/gluon-nlp:gpu-latest
 ```
 
 Here, we open the ports 8888, 8787, 8786, which are used for connecting to JupyterLab.
-Also, we set `--shm-size` to `4g`. This sets the shared memory storage to 4GB. Since NCCL will
+Also, we set `--shm-size` to `2g`. This sets the shared memory storage to 2GB. Since NCCL will
 create shared memory segments, this argument is essential for the JupyterNotebook to work with NCCL.
 (See also https://github.com/NVIDIA/nccl/issues/290).
 
+The folder structure of the docker image will be
+```
+/workspace/
+├── gluonnlp
+├── horovod
+├── mxnet
+├── notebooks
+├── data
+```
+
+If you have a multi-GPU instance, e.g., [g4dn.12xlarge](https://aws.amazon.com/ec2/instance-types/g4/),
+[p2.8xlarge](https://aws.amazon.com/ec2/instance-types/p2/),
+[p3.8xlarge](https://aws.amazon.com/ec2/instance-types/p3/), you can run the following
+command to verify the installation of Horovod + MXNet:
+
+```
+docker run --gpus all --rm -it --shm-size=4g gluonai/gluon-nlp:gpu-latest \
+    horovodrun -np 2 python3 -m pytest /workspace/horovod/horovod/test/test_mxnet.py
+```
+
+
 ## Build your own Docker Image
 To build a docker image from the dockerfile, you may use the following command:
 
````

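Since an undersized `--shm-size` surfaces as opaque NCCL failures, it is useful to verify from inside the container how much shared memory it actually received. A small sketch that inspects the `/dev/shm` filesystem (returns `None` where it does not exist, e.g. on non-Linux hosts):

```python
import os
from typing import Optional


def shm_size_gb(path: str = "/dev/shm") -> Optional[float]:
    """Return the size of the shared-memory filesystem in GB,
    or None if the path does not exist on this platform."""
    if not os.path.isdir(path):
        return None
    stats = os.statvfs(path)
    return stats.f_frsize * stats.f_blocks / 1024 ** 3


print(shm_size_gb())
```

Inside a container started with `--shm-size=2g`, this should report roughly 2.0.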
tools/docker/ubuntu18.04-devel-gpu.Dockerfile

Lines changed: 20 additions & 1 deletion
````diff
@@ -74,7 +74,7 @@ RUN echo "hwloc_base_binding_policy = none" >> /usr/local/etc/openmpi-mca-params
 ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
 ENV PATH=/usr/local/openmpi/bin/:/usr/local/bin:/root/.local/bin:$PATH
 
-RUN ln -s $(which ${PYTHON}) /usr/local/bin/python
+RUN ln -s $(which python3) /usr/local/bin/python
 
 RUN mkdir -p ${WORKDIR}
 
@@ -144,6 +144,25 @@ WORKDIR ${WORKDIR}
 # Debug horovod by default
 RUN echo NCCL_DEBUG=INFO >> /etc/nccl.conf
 
+# Install NodeJS + Tensorboard + TensorboardX
+RUN curl -sL https://deb.nodesource.com/setup_14.x | bash - \
+    && apt-get install -y nodejs
+
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends \
+    libsndfile1-dev
+
+RUN pip3 install --no-cache --upgrade \
+    soundfile==0.10.2 \
+    ipywidgets==7.5.1 \
+    jupyter_tensorboard==0.2.0 \
+    widgetsnbextension==3.5.1 \
+    tensorboard==2.1.1 \
+    tensorboardX==2.1
+RUN jupyter labextension install jupyterlab_tensorboard \
+    && jupyter nbextension enable --py widgetsnbextension \
+    && jupyter labextension install @jupyter-widgets/jupyterlab-manager
+
 # Revise default shell to /bin/bash
 RUN jupyter notebook --generate-config \
     && echo "c.NotebookApp.terminado_settings = { 'shell_command': ['/bin/bash'] }" >> /root/.jupyter/jupyter_notebook_config.py
````
