Commit b1e2b23

Merge pull request #91 from xinghai-sun/ds2_refactor_data
Refactor the data preprocessor part for DeepSpeech2.
2 parents 8d55151 + 22498ad commit b1e2b23

24 files changed, with 1280 additions and 512 deletions.

deep_speech_2/README.md

Lines changed: 24 additions & 10 deletions
@@ -16,34 +16,48 @@ For some machines, we also need to install libsndfile1. Details to be added.
1616
### Preparing Data
1717

1818
```
19-
cd data
20-
python librispeech.py
21-
cat manifest.libri.train-* > manifest.libri.train-all
19+
cd datasets
20+
sh run_all.sh
2221
cd ..
2322
```
2423

25-
After running librispeech.py, we have several "manifest" json files named with a prefix `manifest.libri.`. A manifest file summarizes a speech data set, with each line containing the meta data (i.e. audio filepath, transcription text, audio duration) of each audio file within the data set, in json format.
24+
`sh run_all.sh` prepares all ASR datasets (currently, only LibriSpeech is available). After it runs, we have several manifest files in JSON format.
2625

27-
By `cat manifest.libri.train-* > manifest.libri.train-all`, we simply merge the three seperate sample sets of LibriSpeech (train-clean-100, train-clean-360, train-other-500) into one training set. This is a simple way for merging different data sets.
26+
A manifest file summarizes a speech data set, with each line containing the metadata (i.e. audio filepath, transcript text, audio duration) of one audio file within the data set, in JSON format. The manifest file serves as an interface that tells our system where to find the speech samples and what to read from them.
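
As an illustration of the manifest layout described above, here is a minimal sketch that parses a manifest with one JSON object per line. The key names (`audio_filepath`, `duration`, `text`) and the sample entries are assumptions made for this example; the README only states that each line carries the audio filepath, transcript text, and audio duration, so check a generated manifest for the actual keys.

```python
import io
import json


def read_manifest(lines):
    """Parse manifest lines (one JSON object per line) into a list of dicts."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entries.append(json.loads(line))
    return entries


# Hypothetical manifest content; key names and values are illustrative only.
sample = io.StringIO(
    '{"audio_filepath": "a.flac", "duration": 2.5, "text": "hello world"}\n'
    '{"audio_filepath": "b.flac", "duration": 4.0, "text": "good morning"}\n'
)

entries = read_manifest(sample)
total_seconds = sum(entry["duration"] for entry in entries)
```

Because each line is an independent JSON object, concatenating two manifest files yields another valid manifest, which is why a plain `cat` was enough to merge data sets in the earlier version of this README.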
27+
28+
29+
More help for arguments:
30+
31+
```
32+
python datasets/librispeech/librispeech.py --help
33+
```
34+
35+
### Preparing for Training
36+
37+
```
38+
python compute_mean_std.py
39+
```
40+
41+
`python compute_mean_std.py` computes the mean and standard deviation of the audio features and saves them to a file with the default name `./mean_std.npz`. This file is used in both training and inference.
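
To show how such statistics are typically used, here is a minimal sketch of per-dimension mean and standard deviation normalization with NumPy. It is not the code in `compute_mean_std.py`: the array names `mean` and `std` inside the `.npz` archive, the random stand-in features, and the small `eps` term are all assumptions for this example, and an in-memory buffer replaces `./mean_std.npz` so the sketch touches no files.

```python
import io

import numpy as np

# Stand-in features: 100 frames x 3 feature dimensions (random, illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

# Per-dimension statistics over all frames.
mean = features.mean(axis=0)
std = features.std(axis=0)

# Save to an .npz archive (an in-memory buffer here instead of a file on disk).
buffer = io.BytesIO()
np.savez(buffer, mean=mean, std=std)
buffer.seek(0)

# Later, at training or inference time: load the statistics and normalize.
stats = np.load(buffer)
eps = 1e-20  # guards against division by zero for constant dimensions
normalized = (features - stats["mean"]) / (stats["std"] + eps)
```

Reusing the same saved statistics at training and inference time keeps the input distribution seen by the model consistent between the two.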
2842

2943
More help for arguments:
3044

3145
```
32-
python librispeech.py --help
46+
python compute_mean_std.py --help
3347
```
3448

35-
### Traininig
49+
### Training
3650

3751
For GPU Training:
3852

3953
```
40-
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --trainer_count 4 --train_manifest_path ./data/manifest.libri.train-all
54+
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --trainer_count 4
4155
```
4256

4357
For CPU Training:
4458

4559
```
46-
python train.py --trainer_count 8 --use_gpu False -- train_manifest_path ./data/manifest.libri.train-all
60+
python train.py --trainer_count 8 --use_gpu False
4761
```
4862

4963
More help for arguments:
@@ -55,7 +69,7 @@ python train.py --help
5569
### Inferencing
5670

5771
```
58-
python infer.py
72+
CUDA_VISIBLE_DEVICES=0 python infer.py
5973
```
6074

6175
More help for arguments:
