Should you pre-process audio to filter out noise?

I’m recording audio using a basic microphone, which picks up some noise when recording environmental sounds. Should I filter out the noise in my DSP before uploading the recordings to Edge Impulse? Or should I leave the noise in, so that it is similar during both training and inference when the same microphone is used?

I’d keep it in: the more real-world your samples are, the better the model will perform. We’re also working on some ways to augment the audio (adding artificial noise, time shifting, etc.) in the processing step to harden the model automatically, but that’s a few weeks off. @dansitu can give some more background!
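
For anyone who wants to experiment in the meantime, here is a minimal sketch of the two augmentations mentioned above, using only the Python standard library. It is not Edge Impulse's implementation; the function names `add_noise` and `time_shift`, the white Gaussian noise model, the 16 kHz sample rate, and the circular (wrap-around) shift are all assumptions made for illustration.

```python
import math
import random


def add_noise(samples, snr_db, rng):
    """Mix white Gaussian noise into `samples` at roughly `snr_db` dB SNR."""
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR (dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def time_shift(samples, max_shift, rng):
    """Circularly shift by a random offset in [-max_shift, max_shift]."""
    shift = rng.randint(-max_shift, max_shift) % len(samples)
    # Samples pushed off the end wrap around to the start.
    return samples[-shift:] + samples[:-shift]


# Hypothetical input: one second of a 440 Hz tone at an assumed 16 kHz rate.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noisy = add_noise(clean, snr_db=20, rng=rng)
shifted = time_shift(clean, max_shift=1600, rng=rng)
```

A circular shift keeps the clip length fixed but moves audio from the end of the clip to the start; padding with silence instead is an equally reasonable choice if wrap-around would sound unnatural for your data.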

Thanks for the info! That’s what I thought was the right approach. And awesome news re: augmentation. Keep up the wonderful work; I’m very excited to see where this goes.

Thanks for the kind words, @oveddan! 🙂

As Jan mentions, we’re currently working on a few types of data augmentation—from adding noise through to implementing the SpecAugment approach.
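
To give a rough idea of what SpecAugment-style masking does, here is a small sketch that blanks out one random band of frequency bins and one random band of time frames in a spectrogram. It is only an illustration under stated assumptions: the spectrogram is a plain list of rows (one row per frequency bin), the function name `mask_spectrogram` and its parameters are hypothetical, the time-warping step of SpecAugment is left out, and none of this reflects how the feature will actually be implemented in the product.

```python
import random


def mask_spectrogram(spec, max_freq_width, max_time_width, rng, fill=0.0):
    """Return a copy of `spec` (rows = frequency bins, columns = time frames)
    with one random frequency band and one random time band set to `fill`."""
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input stays untouched

    # Frequency masking: blank a contiguous block of rows.
    f_width = rng.randint(0, min(max_freq_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    for f in range(f_start, f_start + f_width):
        out[f] = [fill] * n_time

    # Time masking: blank a contiguous block of columns in every row.
    t_width = rng.randint(0, min(max_time_width, n_time))
    t_start = rng.randint(0, n_time - t_width)
    for row in out:
        row[t_start:t_start + t_width] = [fill] * t_width
    return out


# Hypothetical 40-bin x 50-frame spectrogram filled with ones.
spec = [[1.0] * 50 for _ in range(40)]
masked = mask_spectrogram(spec, max_freq_width=8, max_time_width=10,
                          rng=random.Random(0))
```

Because the masks are drawn at random each time, the same recording yields a slightly different training example on every pass, which is what pushes the model to rely less on any single frequency band or moment in the clip.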

Regarding your original question, it’s good practice to grab audio from as many sources as possible, since the more diverse the data, the more robust the resulting model will be. Ideally, you would record with a couple of different devices (including whatever device your model will eventually be running on), and you might want to record in a few different settings. For example, for our basic audio classification tutorial I captured audio in two different rooms of my apartment.

It’s totally OK to have fewer samples/sources if you’re experimenting, but adding more will often increase your accuracy. At the very least, you should make sure your test set includes samples recorded in a variety of settings and on the same device you’ll be running inference on in production, so that you can have confidence your model will work well on production data.
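
One way to make sure the test set covers every recording condition is to split per group rather than across the whole dataset. The sketch below assumes each sample is a dict with hypothetical `device` and `setting` keys that you track yourself; the function name `stratified_split` is also made up for illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction, rng):
    """Split so every (device, setting) group contributes to the test set."""
    groups = defaultdict(list)
    for s in samples:
        groups[(s["device"], s["setting"])].append(s)

    train, test = [], []
    for members in groups.values():
        members = members[:]  # shuffle a copy, not the caller's list
        rng.shuffle(members)
        # At least one test sample per group, even for small groups.
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Hypothetical metadata: two devices, two rooms, ten clips per combination.
samples = [
    {"file": f"{device}_{setting}_{i}.wav", "device": device, "setting": setting}
    for device in ("phone", "target_board")
    for setting in ("kitchen", "bedroom")
    for i in range(10)
]
train, test = stratified_split(samples, test_fraction=0.2, rng=random.Random(0))
```

Note that a group with a single recording ends up entirely in the test set, so it is worth collecting at least a few clips per device and setting.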