Describe the bug
An error occurs during training (fine-tuning) when using a manifest in the format specified in the training README.
Steps/Code to reproduce bug
After creating the train and val dataset manifests in the format
{"audio_filepath": "/path/to/audio_file2", "offset": 0, "duration": 10000, "label": "0 0 0 1 1 1 1 0 0"}
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/vad_multilingual_frame_marblenet
when loading this model and fine-tuning it, an error occurs during data loading.
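For illustration, a minimal sketch of writing manifest lines in the per-frame label format above (the helper name and file path are hypothetical, not part of NeMo):

```python
import json

def make_manifest_line(audio_filepath, duration, frame_labels, offset=0):
    # One manifest entry per line: JSON with a space-separated
    # per-frame label string, matching the format shown above.
    return json.dumps({
        "audio_filepath": audio_filepath,
        "offset": offset,
        "duration": duration,
        "label": " ".join(str(l) for l in frame_labels),
    })

line = make_manifest_line("/path/to/audio_file2", 10000, [0, 0, 0, 1, 1, 1, 1, 0, 0])
print(line)
```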
Expected behavior
The error message is as follows:
~/nemo/collections/asr/data/audio_to_label.py", line 347, in __getitem__
    t = torch.tensor(self.label2id[sample.label]).long()
KeyError: '0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
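The failure mode can be reproduced in isolation: AudioToSpeechLabelDataset builds label2id from utterance-level class names, so looking up an entire per-frame label string as a key fails. A minimal sketch (the label2id contents here are illustrative, not taken from NeMo):

```python
# Utterance-level classification maps each class name to an index.
label2id = {"0": 0, "1": 1}

# A frame-level VAD manifest stores the whole per-frame sequence as one
# string, which is not a key in label2id, hence the KeyError in __getitem__.
frame_label = "0 0 0 1 1 1 1 0 0"
try:
    t = label2id[frame_label]
except KeyError as e:
    print("KeyError:", e)
```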
Comparing the previous framevad_add branch with the latest branch, EncDecClassificationModel in classification_models.py changed from inheriting the existing _EncDecBaseModel class to inheriting the EncDecSpeakerLabelModel class.
As a result, instead of using audio_to_label_dataset.get_audio_multi_label_dataset to load the manifest, the current branch uses audio_to_label_dataset.get_speech_label_dataset, so audio_to_label.AudioToSpeechLabelDataset is used instead of audio_to_label.AudioToMultiLabelDataset. That class inherits from _AudioLabelDataset and uses its __getitem__, which cannot read the per-frame VAD ground truth recorded in the manifest.
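By contrast, a multi-label dataset splits the frame-label string and maps each frame through label2id individually, producing one target per frame. A rough sketch of that parsing (not the exact NeMo implementation; the real dataset additionally wraps the result in torch.tensor(...).long()):

```python
label2id = {"0": 0, "1": 1}
frame_label = "0 0 0 1 1 1 1 0 0"

# Split the per-frame label string and look up each frame individually,
# yielding a sequence of targets rather than a single class index.
targets = [label2id[tok] for tok in frame_label.split()]
print(targets)
```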
If you copy the EncDecClassificationModel class from the previous branch, the training step runs, but learning does not proceed because of a variable-name mismatch (e.g., val_acc) in the eval step. Please review this error.