Hello,
I am playing with an audio recognition model that is supposed to recognize blowing sounds made by a human.
I have 3 classes: a “cold” blow, a “hot” blow, and non-blow.
The model is trained on a dataset of different people performing these blowing commands. The dataset consists of around 3 minutes of audio for each command. The non-blow class has more audio, consisting of background noise picked up by the microphone and some random clips from the ESC-50 dataset.
I am using an NN classifier that takes MFE features. The window size for the audio is 300 ms with a 150 ms window increase. The MFE block uses a 0.02 s frame length, a 0.01 s frame stride, 50 filters, and an FFT length of 1024.
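To make these settings concrete, here is a rough sketch of how the windowing arithmetic works out. It assumes no padding at the window edges, so the actual DSP block may round differently; the function names are my own and not part of any Edge Impulse API.

```python
# Rough windowing arithmetic for the settings above.
# Assumption: only full frames/windows are counted (no edge padding).
# All names here are illustrative, not an Edge Impulse API.

def frames_per_window(window_ms, frame_length_ms, frame_stride_ms):
    """Number of full MFE frames that fit in one classification window."""
    return (window_ms - frame_length_ms) // frame_stride_ms + 1

def window_starts(clip_ms, window_ms, window_increase_ms):
    """Start offsets (ms) of every full window that fits in a clip."""
    count = (clip_ms - window_ms) // window_increase_ms + 1
    return [i * window_increase_ms for i in range(count)]

WINDOW_MS = 300          # window size
WINDOW_INCREASE_MS = 150 # window increase
FRAME_LENGTH_MS = 20     # 0.02 s frame length
FRAME_STRIDE_MS = 10     # 0.01 s frame stride
NUM_FILTERS = 50

frames = frames_per_window(WINDOW_MS, FRAME_LENGTH_MS, FRAME_STRIDE_MS)
feature_count = frames * NUM_FILTERS

print(frames)         # 29 frames per window under the no-padding assumption
print(feature_count)  # 1450 features per window
print(window_starts(1000, WINDOW_MS, WINDOW_INCREASE_MS))  # [0, 150, 300, 450, 600]
```

With a 150 ms increase, consecutive windows overlap by half, so a single blow normally shows up in more than one window.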
After training for 100 cycles with data augmentation on, the model has 99.8% accuracy.
Retesting the model with various other clips via the model testing tab gives 93% accuracy.
So far so good.
But here is where the weird things happen.
When I record an audio clip with blowing commands, upload it as a test sample, and classify it via live classification of an existing test sample, the result is as expected: the classifier sees all the blowing commands.
But when I play the same clip on my computer, using Voicemeeter to route the audio to a virtual microphone, and classify it via the real-time live classification running in Chrome on my computer, it is always classified as nonBlow.
I can see that it reacts a little, in that the confidence level of the nonBlow class changes, but the effect is minimal: it only drops by about 0.1.
I have tried amplifying the signal in Voicemeeter. This improved things a bit, but it is still much less reliable than classifying an existing sample.
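Since amplification made a difference, one thing that could be checked is the level gap between the uploaded clip and what the live path actually captures. Below is a minimal sketch of such a comparison using RMS level in dBFS. It assumes both signals are available as float samples in the range [-1, 1]; the function name and the synthetic sine signals are only placeholders for real recordings.

```python
import math

def rms_dbfs(samples):
    """RMS level of float samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)

# Placeholder signals: the same 440 Hz tone at two different gains.
# In practice these would be the uploaded clip and a live capture.
SAMPLE_RATE = 16000  # assumed sample rate, for illustration only
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
uploaded = [0.5 * s for s in tone]
live = [0.05 * s for s in tone]

level_gap_db = rms_dbfs(uploaded) - rms_dbfs(live)
print(round(level_gap_db, 2))  # 20.0, since the second signal is 10x quieter
```

A large gap here would suggest the live input is simply much quieter than the training data, but this check says nothing about other differences such as resampling or browser audio processing.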
I have also tried multiple computers and a mobile phone, all showing the same result.
Even when using a real microphone instead of Voicemeeter, the result is still not comparable to classifying the uploaded sample.
Can someone shed some light on why I get different results from live classification than from classifying an existing test sample file? I can’t figure out why the results differ.
Thanks in advance.