Refactor the whole data preprocessor part for DeepSpeech2. #91

xinghai-sun · 2017-06-13T15:52:35Z

resolve #90

Refactor the data preprocessor with newly added classes, e.g. AudioSegment, SpeechSegment, TextFeaturizer, AudioFeaturizer, SpeechFeaturizer etc.
Add data augmentation interfaces and classes e.g. AugmentorBase, AugmentationPipeline, VolumePerturbAugmentor etc., to make it easier to add more data augmentation models.
Separate the normalizer's mean-std computation from DataGenerator, and add FeatureNormalizer.
Add an independent tool, compute_mean_std.py, for users to create the mean_std file before training.
Re-organize data directory into datasets and data_utils.
Add module, class, function docs, and update README.md.
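
The augmentation interface described above might look roughly like the following. This is a minimal sketch, not the code from this PR: only the class names AugmentorBase, AugmentationPipeline, and VolumePerturbAugmentor come from the description. The method name `transform_audio`, the constructor parameters, and the plain-list audio representation are assumptions made for illustration.

```python
import random
from abc import ABC, abstractmethod


class AugmentorBase(ABC):
    """Interface every augmentor implements (method name is an assumption)."""

    @abstractmethod
    def transform_audio(self, samples):
        """Return augmented samples; `samples` is a list of floats here."""


class VolumePerturbAugmentor(AugmentorBase):
    """Applies a random gain drawn uniformly from [min_gain_db, max_gain_db]."""

    def __init__(self, rng, min_gain_db, max_gain_db):
        self._rng = rng
        self._min_gain_db = min_gain_db
        self._max_gain_db = max_gain_db

    def transform_audio(self, samples):
        gain_db = self._rng.uniform(self._min_gain_db, self._max_gain_db)
        scale = 10.0 ** (gain_db / 20.0)  # decibels -> linear amplitude
        return [s * scale for s in samples]


class AugmentationPipeline:
    """Runs a list of (augmentor, probability) pairs in order."""

    def __init__(self, augmentors_with_prob, rng):
        self._steps = list(augmentors_with_prob)
        self._rng = rng

    def transform_audio(self, samples):
        for augmentor, prob in self._steps:
            # Apply each augmentor with its own probability.
            if self._rng.random() < prob:
                samples = augmentor.transform_audio(samples)
        return samples


rng = random.Random(0)
pipeline = AugmentationPipeline(
    [(VolumePerturbAugmentor(rng, 6.0, 6.0), 1.0)], rng)
louder = pipeline.transform_audio([0.1, -0.2])
```

With this shape, adding a new augmentation means writing one more AugmentorBase subclass and listing it in the pipeline, which is the extensibility the description is aiming for.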

…ize dir, add augmentation interfaces etc.). 1. Refactor data preprocessor with newly added classes AudioSegment, SpeechSegment, TextFeaturizer, AudioFeaturizer, SpeechFeaturizer. 2. Add data augmentation interfaces and classes AugmentorBase, AugmentationPipeline, VolumePerturbAugmentor etc. 3. Separate normalizer's mean and std computing from training, by adding FeatureNormalizer and a separate tool compute_mean_std.py. 4. Re-organize directory.

qingqing01

I think we could add a doc on data processing later; the process is fairly complex.

qingqing01 · 2017-06-14T05:24:46Z

deep_speech_2/train.py

    "Otherwise, the training will resume from "
    "the existing model of this path. (default: %(default)s)")
+parser.add_argument(
+    "--augmentation_config",


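Does an actual run need an augmentation_config to be provided? I only see the JSON format in the code comments, not a JSON file. If it is needed at run time, could you provide a JSON file that users can edit when they need it?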

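Good suggestion. augmentation_config is currently a str. Since augmentation so far only has its interface in place, the default is augmentation_config='{}', i.e. augmentation has no effect. Configuring it through a JSON string is indeed inconvenient.
Because the model has many parameters, we can provide a single unified config file later.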
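
For reference, passing the configuration as a JSON string could be handled along these lines. This is a minimal sketch: only the `--augmentation_config` flag and its `'{}'` default come from the discussion above. The help text, the `parse_augmentation_config` helper, and the example config keys are hypothetical.

```python
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument(
    "--augmentation_config",
    default='{}',
    type=str,
    help="Augmentation configuration as a JSON string. "
    "(default: %(default)s)")


def parse_augmentation_config(config_str):
    """Decode the JSON string; an empty object means no augmentation."""
    config = json.loads(config_str)
    if not isinstance(config, (dict, list)):
        raise ValueError("augmentation_config must be a JSON object or array")
    return config


# Default: augmentation is disabled.
args = parser.parse_args([])
default_config = parse_augmentation_config(args.augmentation_config)

# Hypothetical non-empty config; the keys are illustrative only.
args = parser.parse_args(
    ["--augmentation_config", '{"type": "volume", "prob": 0.5}'])
custom_config = parse_augmentation_config(args.augmentation_config)
```

Quoting a JSON string on the command line gets awkward quickly, which is the inconvenience the reply acknowledges and the reason a config file is proposed as the follow-up.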

qingqing01

LGTM.

chrisxu2016

LGTM

chrisxu2016 · 2017-06-14T05:21:24Z

deep_speech_2/data_utils/audio.py

+        :rtype: AudioSegment
+        """
+        samples, sample_rate = soundfile.read(file, dtype='float32')
+        return cls(samples, sample_rate)


Does this only read .wav files by default?

chrisxu2016 · 2017-06-14T07:30:56Z

deep_speech_2/data_utils/audio.py

+        :param gain: Gain in decibels to apply to samples. 
+        :type gain: float
+        """
+        self._samples *= 10.**(gain / 20.)


I suggest returning a newly created audio object here, so that this method can be reused when add_noise is added later:
return type(self)(10.**(gain / 20.) * self._samples, self._sample_rate)
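
The difference between the in-place version in the diff and the suggested one can be sketched as follows. This is a stripped-down stand-in for AudioSegment, not the class from the PR: the method names `apply_gain_inplace` and `with_gain` are hypothetical, and only the gain formula and the `type(self)(...)` construction come from the lines above.

```python
import numpy as np


class AudioSegment:
    def __init__(self, samples, sample_rate):
        self._samples = np.asarray(samples, dtype='float32')
        self._sample_rate = sample_rate

    def apply_gain_inplace(self, gain):
        """Behavior in the diff: scale this segment's samples in place."""
        self._samples *= 10.**(gain / 20.)

    def with_gain(self, gain):
        """Suggested behavior: return a new segment, leave self untouched."""
        return type(self)(10.**(gain / 20.) * self._samples, self._sample_rate)


original = AudioSegment([0.5, -0.25], 16000)
louder = original.with_gain(20.)  # +20 dB is a 10x amplitude scale
```

Using `type(self)` rather than the class name means a subclass such as SpeechSegment would get back an instance of its own type, provided its constructor accepts the same arguments.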

chrisxu2016 · 2017-06-14T10:06:29Z

deep_speech_2/data_utils/audio.py

+        :return: Number of samples.
+        :rtype: int
+        """
+        return self._samples.shape(0)


This should be self._samples.shape[0]; change () to [].
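
A quick check of why the parentheses fail: `shape` on a NumPy array is a plain tuple, so it is indexed, not called. The sketch assumes `_samples` is a NumPy array, which the in-place arithmetic elsewhere in this file suggests but the diff excerpt does not show directly.

```python
import numpy as np

samples = np.zeros(16000, dtype='float32')

num_samples = samples.shape[0]  # index the shape tuple

try:
    samples.shape(0)  # calling a tuple raises TypeError
    call_failed = False
except TypeError:
    call_failed = True
```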

xinghai-sun added 3 commits June 12, 2017 23:19

Add function, class and module docs for data parts in DS2.

b4a9e41

Update README.md for DS2.

22498ad

xinghai-sun requested review from chrisxu2016, kuke, lcy-seso, pkuyym and qingqing01 June 13, 2017 15:52

qingqing01 reviewed Jun 14, 2017

qingqing01 approved these changes Jun 14, 2017

xinghai-sun merged commit b1e2b23 into PaddlePaddle:develop Jun 14, 2017

xinghai-sun deleted the ds2_refactor_data branch June 14, 2017 06:58

chrisxu2016 reviewed Jun 14, 2017

chrisxu2016 reviewed Jun 15, 2017
