Data augmentation to include music, custom noise, etc.

This is sort of a two-part question.

We have a keyword recognition net that now seems to perform quite well. However, its Achilles' heel is background music. In the presence of any sort of background music, the success rate falls to zero. It would seem to make sense to add background music during training, but I don’t know if this is considered best practice or a realistic approach.

If it is, would it be possible at some point to add background music, or possibly just “custom” noise sources, to the data augmentation options?

Hi @jefffhaynes,

Yes, this is a good approach, particularly if you have some specific background noise.
We have an example transformation block here:
This feature is reserved for enterprise subscriptions, but you can check the shell script and apply a similar transformation locally using sox.


Hello @jefffhaynes,

You can also check this answer, which adds custom noise using a Python script: Poor performance of Light on/off implementation
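
For anyone who wants to try this locally, here is a minimal sketch of the general idea (mixing a background clip into a keyword sample at a chosen signal-to-noise ratio). It is not the script from the linked answer or the transformation block; the function name `mix_at_snr` and its parameters are hypothetical. It assumes both clips are mono, share a sample rate, and are already decoded to floats in the range -1.0 to 1.0. It uses plain lists to stay dependency-free; in practice you would probably use NumPy.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db, rng=None):
    """Return `speech` with `noise` added at roughly `snr_db` decibels.

    Hypothetical helper for illustration. Both inputs are sequences of
    float samples (mono, same sample rate, range -1.0..1.0).
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    rng = rng or random.Random()
    n = len(speech)

    # Loop the noise clip if it is shorter than the speech clip,
    # then take a random crop of the same length as the speech.
    repeats = -(-n // len(noise))  # ceiling division
    looped = list(noise) * repeats
    start = rng.randrange(len(looped) - n + 1)
    crop = looped[start:start + n]

    # Average power of each signal.
    speech_power = sum(s * s for s in speech) / n
    noise_power = sum(x * x for x in crop) / n
    if speech_power == 0 or noise_power == 0:
        return list(speech)  # nothing meaningful to mix

    # Scale the noise so that speech_power / scaled_noise_power
    # matches the requested SNR (10 ** (snr_db / 10) in linear terms).
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    mixed = [s + scale * x for s, x in zip(speech, crop)]

    # Rescale only if the sum would clip; this keeps the SNR unchanged.
    peak = max(abs(m) for m in mixed)
    if peak > 1.0:
        mixed = [m / peak for m in mixed]
    return mixed
```

A reasonable starting point might be to generate several copies of each training sample at different SNR values with different music clips, and keep the clean originals in the dataset as well. Which SNR range works best depends on your data, so treat any specific values as something to tune.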


Wow, I didn’t know you could define custom blocks. Thanks!

Also, I'm not sure how to upgrade my subscription. I emailed you guys about it and was told I didn’t need it :rofl:
