Recognition accuracy decays as source-to-device distance increases
I trained a sound event detector on clean WAV files, then recorded some WAV files in a real-world environment to test it.
The real-world files all have exactly the same content (each was recorded from the same WAV played back repeatedly through a PC speaker); only the speaker-to-device distance differs between recordings.
I found that accuracy decays as the source-to-device distance increases. I am wondering why this happens, and how to get the same performance regardless of distance.
Thank you all!
Speaker playback and the real world will sound different, so you will get the best results if you upload the audio directly or capture it from the actual event. You can read some more general tips on improving model accuracy here:
The problem is that as your source-to-microphone distance increases, your recordings contain more reverberation: the source signal reflects along multiple paths before arriving at the microphone. The human ear actually detects this, which is why things sound “far away.”
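To make this concrete, here is a toy sketch of the effect (everything here is an assumption for illustration: a synthetic impulse response, random reflection taps, 16 kHz rate, 1/r attenuation of the direct path; nothing is measured from a real room). It shows that the direct-to-reverberant energy ratio drops as the source moves away, which is exactly the cue that makes things sound distant:

```python
# Toy model: reverberation as multipath. Synthetic RIR, illustrative only.
import numpy as np

def synthetic_rir(distance_m, fs=16000, rt_base=0.05):
    """Toy room impulse response: a direct path plus exponentially
    decaying random reflections. Farther sources get a weaker direct path."""
    length = int(0.3 * fs)
    h = np.zeros(length)
    h[0] = 1.0 / max(distance_m, 1.0)  # direct path, ~1/r attenuation
    rng = np.random.default_rng(0)     # fixed seed: same reflections each call
    taps = rng.integers(int(0.005 * fs), length, size=50)
    t = taps / fs
    h[taps] += rng.normal(0, 1, size=50) * np.exp(-t / rt_base) * 0.3
    return h

def direct_to_reverb_ratio(h, fs=16000, early_ms=5):
    """Energy of the direct path vs. the reflections, in dB."""
    cut = int(early_ms / 1000 * fs)
    direct = np.sum(h[:cut] ** 2)
    reverb = np.sum(h[cut:] ** 2)
    return 10 * np.log10(direct / reverb)

near = direct_to_reverb_ratio(synthetic_rir(1.0))
far = direct_to_reverb_ratio(synthetic_rir(4.0))
# Farther source -> lower direct-to-reverberant ratio: the mic hears
# relatively more room and less source, so features drift from the clean set.
print(near, far)
```

A classifier trained only on the clean (near, high-DRR) condition sees increasingly mismatched features as that ratio falls, which is why accuracy degrades with distance.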
@Eoin Thank you for your reply!
@robyu Thank you for your reply! I think adding reverberated sound events (via data augmentation, or by recording in a real environment) to the training set might help. But is there any solution, such as neural network hyperparameter tuning, if I prefer not to modify the training set?
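For reference, the augmentation idea mentioned above could be sketched roughly like this (a minimal sketch under stated assumptions: 16 kHz mono clips as 1-D float arrays, and a crude white-noise-with-exponential-decay impulse response; `make_rir` and `augment_with_reverb` are hypothetical helper names, not from any particular library):

```python
# Sketch: generate reverberated copies of a clean clip for training.
import numpy as np
from scipy.signal import fftconvolve

def make_rir(rt60, fs=16000, seed=0):
    """Crude synthetic impulse response: white noise shaped by an
    exponential decay with reverberation time rt60 (seconds)."""
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    rng = np.random.default_rng(seed)
    h = rng.normal(size=n) * np.exp(-6.9 * t / rt60)  # ~60 dB decay over rt60
    h[0] = 1.0                                        # direct path
    return h / np.max(np.abs(h))

def augment_with_reverb(clip, rt60s=(0.2, 0.5, 0.9), fs=16000):
    """Return the clean clip plus one reverberated, peak-normalized
    copy per reverberation time."""
    out = [clip]
    for i, rt in enumerate(rt60s):
        wet = fftconvolve(clip, make_rir(rt, fs, seed=i))[: len(clip)]
        out.append(wet / np.max(np.abs(wet)))
    return out

# One clean 1 s tone becomes four training examples at varying "distances".
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000).astype(np.float32)
augmented = augment_with_reverb(clean)
print(len(augmented))  # 4
```

Recorded room impulse responses (or a proper room simulator) would be more realistic than this toy decay model, but the pipeline shape is the same: convolve, truncate, normalize, and add the copies to the training set.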